From 2017 to 2020, I worked as a software developer for Agricool, an indoor farming company that builds IoT technologies to grow fruits and vegetables inside industrial containers. Below is a talk I gave at FOSDEM 2020 about how Agricool builds observability pipelines to make IoT farming systems more reliable.
One of the most interesting aspects of building observability for farming environments is that it requires a collaborative effort from many different technical experts, each with a role to play in ensuring the reliability of the overall system. In the case of Agricool, observability tools are designed and used by agronomists (both scientists and operators), chemists, industrial engineers, electrical engineers, IoT engineers and software developers.
From this experience, I learned that building observability pipelines can sometimes benefit from ideas borrowed from the Domain-Driven Design (DDD) community, which led me to describe our work as a case of Domain-Driven Observability, or DDO.
DDD and DDO intersect in their collaborative nature. Just as DDD encourages domain experts to participate in the design of domain models, DDO encourages domain experts to take part in both using and designing observability tools.
One DDD concept that is particularly relevant for DDO is the search for a ubiquitous language. The ubiquitous language is meant to be used to name graphs, metrics, metric units, alerts, domain events, etc. Most importantly, it is meant to be used in all team discussions, so that all stakeholders are aligned on what the observability assets do, what their representations mean in the context of the application domain, and how they ought to evolve in the future.
When developers and devops engineers enforce the use of the ubiquitous language within their infrastructure-as-code assets, they are bound to accelerate the evolution of these assets over time. Instead of translating from a mental model used with domain experts to another mental model used within the code base, we make a conscious effort to align both worlds behind a common language and reach greater speed when implementing changes.
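As a rough illustration of what enforcing the ubiquitous language in code could look like, here is a minimal Python sketch. Everything in it is hypothetical: the glossary terms, the `MetricDefinition` type, the dot-separated naming convention and the `unknown_terms` helper are assumptions made for the example, not Agricool's actual tooling. The idea is simply that metric names are built only from terms the whole team has agreed on, and that a check can flag names that drift away from that vocabulary.

```python
from dataclasses import dataclass

# Hypothetical glossary agreed on with domain experts (illustrative terms only).
GLOSSARY = frozenset({
    "nutrient_solution",
    "air",
    "ph",
    "electrical_conductivity",
    "temperature",
    "humidity",
})


@dataclass(frozen=True)
class MetricDefinition:
    """A metric described in domain terms: name, unit and meaning."""
    name: str         # assumed convention: glossary terms joined by dots
    unit: str         # unit as domain experts would write it
    description: str  # what the metric means in the application domain


def unknown_terms(metric, glossary=GLOSSARY):
    """Return the parts of the metric name that are not in the glossary."""
    return [term for term in metric.name.split(".") if term not in glossary]


# A name built from the shared vocabulary passes the check.
solution_ph = MetricDefinition(
    name="nutrient_solution.ph",
    unit="pH",
    description="Acidity of the nutrient solution fed to the plants",
)

# A name built from implementation jargon gets flagged.
jargon_metric = MetricDefinition(
    name="tank.sensor_3",
    unit="raw",
    description="Reading from the third probe",
)

if __name__ == "__main__":
    for metric in (solution_ph, jargon_metric):
        print(metric.name, "->", unknown_terms(metric) or "ok")
```

A check like this could run in continuous integration, so that a new term has to be discussed and added to the glossary before it shows up in a dashboard or an alert.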
This collective alignment then feeds an accelerating learning curve that helps everybody. On the one hand, domain experts learn about the role observability plays in improving the reliability of systems. On the other hand, devops people acquire the domain knowledge they need to build the most helpful observability tools.
The reliability of an advanced indoor farming system depends on many properties from many different abstraction layers. Each abstraction layer belongs to a rich discipline: plant physiology, chemistry, indoor climate regulation, electrical engineering, IoT, and software infrastructure. I believe that the only way to make steady observability improvements in such a rich context is for everyone to learn from everyone else about their respective disciplines.
Collective learning is fun and generates two very important things. First, a shared clarity of thinking about system reliability. Second, shared accountability: every team that contributes to the system is made accountable for its observability.