Add observability and resilience as a service

Martin Hellspong

Half-day workshop - in English

What if it were possible to add excellent observability and resilience to your services without making any code changes? This is the alluring promise of the fabled Service Mesh, which allegedly provides "Observability and Resilience as a Service".

Microservices have been hyped for a while, but let's not underestimate the difficulties involved - distributed system development is complicated! You quickly learn the hard way that deploying a handful of related services is only the very first step, and you will inevitably need to troubleshoot and tweak their interactions for issues that only seem to occur under load. You might suffer from cloud-blindness, where your services run somewhere you cannot easily observe whether, or why, they are having issues, and you need redeploys to test different strategies and settings.

Do you try to fix reliability issues by mixing your business logic with ever more complicated communication, observability and resilience code (logging, metrics, tracing, retries, timeouts, fallbacks, circuit breakers etc.), or do you delegate these things to a library in your chosen language (that all services now need to use)?
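To make the problem concrete, here is a minimal sketch of what hand-rolled resilience code tends to look like once it is mixed into a service. Everything in it is an assumption made for illustration: the names `CircuitBreaker`, `call_with_retries` and `get_price`, the thresholds, and the 25% markup are all hypothetical and not taken from any particular library or from the workshop material.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses to call the downstream service."""


class CircuitBreaker:
    """Hypothetical, simplified circuit breaker (not thread-safe)."""

    def __init__(self, failure_threshold=3, reset_after=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open, skipping call")
            # Reset window has passed: allow one trial call (half-open).
            self.opened_at = None
            self.failures = self.failure_threshold - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retries(func, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry func with exponential backoff; never retry an open circuit."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def get_price(fetch_price, breaker, sleep=time.sleep):
    """Hypothetical business function: only the last line is business logic."""
    net = call_with_retries(lambda: breaker.call(fetch_price), sleep=sleep)
    return round(net * 1.25, 2)  # assumed 25% markup, for illustration only
```

Note the ratio: one line of business logic sits on top of roughly forty lines of plumbing, and every service (in every language) that calls another service needs its own copy or a shared library.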

After learning more about Service Meshes, you'll realize that dealing with this on an ad hoc, per-service, per-interaction basis is not an effective approach.

Primarily for: Developers, Testers/test leads, Architects, Others

Participant requirements: A laptop with a working Kubernetes development environment (minikube or Docker Desktop) preinstalled - or alternatively full access to a suitable Kubernetes installation in the cloud.