Observe20/20: A clear look at observability
April 6th, Observe20/20: https://observe2020.io/
The event was excellent. The virtual sessions drew lively audience participation and conversation. The speaker lineup consisted of individuals across disciplines sharing their findings on the observability ecosystem and its applications in their respective fields.
As I watched the speakers present, I noticed a common thread that brought us together. Observability allows users to understand and explain their systems. This thread is something that continues to drive my findings and thoughts around software delivery.
I shared my session on “Continuous Efficiency for Every Pipeline” towards the end of the day. The purpose of the session was to share how observability concepts and telemetry data could fit into our software delivery. Here are the highlights from my session:
Software delivery is the process of delivering software features to an end customer.
Traditional software delivery focuses on building, testing, and deploying artifacts. The challenge in this is that you haven’t reduced the risks involved with deployments.
We use software delivery pipelines to get from point A to point B in that process in a repeatable and reliable way. Across industries, fully automated software delivery operations are reasonably rare.
Consider how post-deployment processes such as operationalizing and monitoring can inform our next iteration of work.
Observability for Verification starts with exposing service and application-level details. If you can track endpoints through metrics, or failing service requests and latency through distributed tracing, then you can begin to react to those anomalies.
Continuous Verification is the process of using your telemetry data to detect abnormal behaviors so that you can take predefined actions based on the rules you’ve defined in your software delivery pipeline.
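To make the Continuous Verification idea concrete, here is a minimal sketch of a pipeline gate in Python. It is not taken from the talk: the `Rule` class, the metric names, the thresholds, and the action names are all hypothetical, and the telemetry samples are hard-coded where a real pipeline would query a metrics backend.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Rule:
    """Hypothetical verification rule: the metric's average must stay at or below a ceiling."""
    metric: str
    max_value: float
    action: str  # predefined action to take when the rule is violated


def verify(samples: dict[str, list[float]], rules: list[Rule]) -> str:
    """Return the action of the first violated rule, or "promote" if none fire."""
    for rule in rules:
        values = samples.get(rule.metric, [])
        # Assumption: missing telemetry counts as a violation rather than a pass.
        if not values or mean(values) > rule.max_value:
            return rule.action
    return "promote"


# Illustrative rules and post-deployment samples (all values are made up).
rules = [
    Rule("error_rate", 0.02, "rollback"),
    Rule("p95_latency_ms", 400.0, "pause"),
]
healthy = {"error_rate": [0.0, 0.01, 0.01], "p95_latency_ms": [210.0, 250.0, 230.0]}
degraded = {"error_rate": [0.01, 0.05, 0.09], "p95_latency_ms": [210.0, 250.0, 230.0]}

print(verify(healthy, rules))
print(verify(degraded, rules))
```

The point of the sketch is that the rules and their actions are defined ahead of time, so the pipeline can react to abnormal telemetry without a person watching a dashboard.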
Observability is not just an Operations responsibility or problem. We should consider that as organizations scale, manual processes do not.
You can build sustainable and efficient pipelines by tracking four key deployment metrics. This work was shared in the book Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim.
- Deployment Frequency
- Lead time for changes
- Mean Time to Recover
- Change Failure Rate
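As a rough illustration of how these four metrics could be derived from deployment records, here is a small Python sketch. The `Deployment` record, its field names, and the sample data are assumptions made for this example; a real pipeline would pull this information from its CI/CD and incident tooling, and may define each metric differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical deployment record."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # whether it caused a production failure
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def hours(delta: timedelta) -> float:
    return delta.total_seconds() / 3600


def key_metrics(deploys: list[Deployment], window_days: int) -> dict[str, float]:
    """Compute the four metrics over a reporting window of `window_days` days."""
    if not deploys or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deploys if d.failed]
    lead_times = [hours(d.deployed_at - d.committed_at) for d in deploys]
    # Only failures that have been restored contribute to recovery time.
    recoveries = [hours(d.restored_at - d.deployed_at) for d in failures if d.restored_at]
    return {
        "deployment_frequency_per_day": len(deploys) / window_days,
        "lead_time_hours": sum(lead_times) / len(lead_times),
        "mean_time_to_recover_hours": sum(recoveries) / len(recoveries) if recoveries else 0.0,
        "change_failure_rate": len(failures) / len(deploys),
    }


# Made-up sample: four deployments in a seven-day window, one of which failed.
t0 = datetime(2020, 4, 1, 9, 0)
day, hour = timedelta(days=1), timedelta(hours=1)
deploys = [
    Deployment(t0, t0 + 4 * hour),
    Deployment(t0 + day, t0 + day + 2 * hour, failed=True, restored_at=t0 + day + 3 * hour),
    Deployment(t0 + 2 * day, t0 + 2 * day + 6 * hour),
    Deployment(t0 + 3 * day, t0 + 3 * day + 4 * hour),
]
metrics = key_metrics(deploys, window_days=7)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```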
And consider shared visibility and responsibility around cloud costs from:
- idle dev environments
- services with overallocated resources
- differences in cost between versions of services
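The three cost signals above could be surfaced with something as small as the following Python sketch. Everything here is hypothetical: the `ServiceUsage` record, the utilization thresholds, the service names, and the prices are invented for illustration, and real numbers would come from your cloud provider's billing and metrics data.

```python
from dataclasses import dataclass


@dataclass
class ServiceUsage:
    """Hypothetical usage record for one deployed version of a service."""
    name: str
    environment: str      # e.g. "dev" or "prod"
    version: str
    cpu_requested: float  # cores reserved
    cpu_used: float       # average cores actually used
    hourly_cost: float    # in USD


def utilization(u: ServiceUsage) -> float:
    return u.cpu_used / u.cpu_requested if u.cpu_requested > 0 else 0.0


def idle_dev_environments(usages: list[ServiceUsage], threshold: float = 0.05) -> list[str]:
    """Dev services using less than `threshold` of what they reserve."""
    return [u.name for u in usages if u.environment == "dev" and utilization(u) < threshold]


def overallocated_services(usages: list[ServiceUsage], threshold: float = 0.3) -> list[str]:
    """Non-dev services using less than `threshold` of what they reserve."""
    return [u.name for u in usages if u.environment != "dev" and utilization(u) < threshold]


def cost_delta(usages: list[ServiceUsage], name: str, old: str, new: str) -> float:
    """Hourly cost difference between two versions of the same service."""
    cost = {u.version: u.hourly_cost for u in usages if u.name == name}
    return cost[new] - cost[old]


# Made-up usage data.
usages = [
    ServiceUsage("checkout", "prod", "v1", 4.0, 3.0, 0.40),
    ServiceUsage("checkout", "prod", "v2", 4.0, 3.2, 0.55),
    ServiceUsage("search", "prod", "v1", 8.0, 1.0, 0.80),
    ServiceUsage("reports", "dev", "v1", 2.0, 0.02, 0.10),
]

print("idle dev environments:", idle_dev_environments(usages))
print("overallocated services:", overallocated_services(usages))
print("checkout v1 -> v2 hourly delta:", round(cost_delta(usages, "checkout", "v1", "v2"), 2))
```

Putting a report like this next to the deployment metrics is one way to give developers and operations the same view of cost.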
I enjoy this talk because it shows that everyone with a stake in software delivery (developers, operations, DevOps, SREs, and tech leaders) can benefit from visibility into their systems and from reducing toil. You don’t have to be an expert in observability, OpenTelemetry, metrics, logs, or traces to start somewhere on this journey of better software delivery. But I believe the journey does have to start somewhere, and it can start with some of these ideas.
Shortly after my session, the conference came together for a panel session. I had a little bit of a high from my talk. I felt that I’d have some thoughts and experiences to offer, having put observability into practice in a relatively large and complex environment as a consultant last year. The panel was a short session, lasting about 30 minutes of air time. For those looking for answers around observability, here are my thoughts on the questions from the panel discussion:
- There’s always the need to drive decisions based on data, so becoming a user of observability is fairly straightforward. The OpenTelemetry project is a fairly new and large project where users who are looking to give back can grow that community.
- Open standards produce definitions that support communication around processes in practice. Take for example the OpenTracing API and how it helped define distributed tracing concepts like spans and traces.
- Every field has jargon that practitioners and users often need to navigate. I don’t see it as a disservice in the observability space to introduce patterns and concepts that apply to so many environments and systems.
- In terms of learning observability or putting it into practice, it’s important to start somewhere.
- The reality of implementing and introducing new technologies to an organization is that not every user needs to be an expert to use or benefit from a tool.
- The business value in observability is exploring the unknown unknowns. Sometimes this starts with introducing just one aspect of observability: logs, metrics, or traces. Even if you expose one metric or instrument one span and realize later you need more, you’re able to iterate on these benefits fairly quickly.
In the words of Paul Bruce, this event aims to “encourage growth in OpenTelemetry and other relevant open-source projects related to increasing visibility, transparency, and traceability of data across software teams.”
For anyone catching up on the details of the conference, I hope this is another dependable resource around observability.