How to Trace Event-driven Kafka Services
- Josef Mayrhofer
- May 8
Kafka is the leading messaging platform in cloud-native environments. It offers many benefits compared to classic messaging systems, which makes it a strong fit for modern microservice landscapes.
A few months ago, one of our customers contacted me about a problem with tracing event-driven Kafka services. After debugging it for several months, they had classified the issue as unsolvable. For some reason, no observability insights were created when they initiated Kafka events using a testing tool. Surprisingly, tracing worked fine when the requests came through the web-based front end.
Our customer used Dynatrace for observability across their entire microservice environment on AKS. OneAgents instrumented their Spring Boot-based microservices and created insights from the web-based front end down to the Kafka and Postgres backend layers.
Instead of running automated tests against the front-end layer, our customer pushed Kafka messages directly to topics, which triggered the Kafka consumers and producers. This setup was efficient because test execution was much faster than with the sometimes cumbersome UI automation approach.
Synopsis
When tracing is not working, there are several ways to force the creation of observability insights. Some require manual effort; others can be implemented within a few minutes.
Low effort: Custom Service to force tracing of these events
Medium effort: Use SDK to start the trace
Workaround: Ingest OTel trace
Solution
We started with the most obvious recommendation: a custom service that forces Dynatrace to start tracing for all Kafka services involved, no matter how they are initiated. The beauty of this setup is that no code change is required, and we applied the configuration within 5 minutes. A custom service definition names the Java class (and its method) that processes the event. Once it is defined in the Dynatrace UI, restart the services to activate the configuration.
Immediately after the restart, we initiated Kafka messages using the Kafka testing tool. The good news was that this simple configuration fixed the tracing problem. All initiated Kafka events were traced, and our customer was delighted that we had solved a problem they had been debugging for several months.
Our lesson learned: knowledge is only power if we share it with others and use it to remedy such seemingly unsolvable problems :)
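To make this more concrete, here is a minimal sketch of the kind of class a custom service definition could point at. The class name `OrderEventHandler`, the method `handleEvent`, and the payload format are hypothetical and not taken from the customer's code. In a real Spring Boot service, a method like this would typically be called from the Kafka listener or consumer loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event handler, for illustration only.
// A custom service definition would reference the fully qualified
// class name and the handleEvent method as the entry point.
class OrderEventHandler {

    // Keys of the events handled so far, kept so the sketch has observable state.
    private final List<String> processedKeys = new ArrayList<>();

    // Assumed entry point: called once per consumed Kafka record,
    // regardless of whether the UI or a testing tool produced it.
    String handleEvent(String key, String payload) {
        if (payload == null || payload.trim().isEmpty()) {
            throw new IllegalArgumentException("payload must not be empty");
        }
        processedKeys.add(key);
        return "processed:" + key;
    }

    // Returns a copy so callers cannot modify the internal list.
    List<String> getProcessedKeys() {
        return new ArrayList<>(processedKeys);
    }
}
```

With a class like this, the custom service would name `OrderEventHandler` and `handleEvent`, so that a trace can start whenever the method runs, whether the message came from the front end or from the testing tool.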
Happy Performance Engineering!