How to use OpenTelemetry to Trace and Monitor Apache Kafka Systems

  • Published: Aug 3, 2024
  • cnfl.io/podcast-episode-255 | How can you use OpenTelemetry to gain insight into your Apache Kafka® event systems? Roman Kolesnev, Staff Customer Innovation Engineer at Confluent, is a member of the Customer Solutions & Innovation Division Labs team, which builds business-critical OpenTelemetry applications so companies can see what’s happening inside their data pipelines. In this episode, Roman joins Kris to discuss tracing and monitoring in distributed systems using OpenTelemetry. He talks about how monitoring each step of the process individually is critical to discovering potential delays or bottlenecks before they happen, including keeping track of timestamps, latency information, exceptions, and other data points that could help with troubleshooting.
    Tracing each request and its journey to completion in Kafka gives companies access to invaluable data that provides insight into system performance and reliability. Furthermore, using this data allows engineers to quickly identify errors or anticipate potential issues before they become significant problems. With greater visibility comes better control over application health - all made possible by OpenTelemetry's unified APIs and services.
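The "journey to completion" described above works because the trace context rides along with each Kafka record, typically as a W3C `traceparent` header that the producer injects and the consumer extracts. Below is a minimal, stdlib-only Python sketch of that propagation step. It is an illustration of the mechanism, not OpenTelemetry's actual API: the helper names (`make_traceparent`, `inject`, `extract`) are invented for this example, and in real applications the OpenTelemetry SDK or Java agent performs this injection and extraction automatically.

```python
import secrets

def make_traceparent() -> str:
    # W3C Trace Context format: version-traceid-spanid-flags
    trace_id = secrets.token_hex(16)  # 32 hex chars
    span_id = secrets.token_hex(8)    # 16 hex chars
    return f"00-{trace_id}-{span_id}-01"

def inject(headers: list, traceparent: str) -> list:
    # Kafka record headers are (key, bytes) pairs
    headers.append(("traceparent", traceparent.encode("utf-8")))
    return headers

def extract(headers: list):
    # Consumer side: recover the trace so its span joins the same trace
    for key, value in headers:
        if key == "traceparent":
            _version, trace_id, span_id, _flags = value.decode().split("-")
            return trace_id, span_id
    return None, None

# Producer side: start a trace and attach it to the outgoing record
tp = make_traceparent()
record_headers = inject([], tp)

# Consumer side: the extracted IDs parent the consumer's processing span
trace_id, parent_span_id = extract(record_headers)
```

Because the header travels inside the record itself, the trace survives broker hops, consumer group rebalances, and asynchronous handoffs — exactly what makes end-to-end latency attribution possible.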
    As described on the OpenTelemetry.io website, "OpenTelemetry is a Cloud Native Computing Foundation incubating project. Formed through a merger of the OpenTracing and OpenCensus projects." It provides a vendor-agnostic way for developers to instrument their applications across different platforms and programming languages while adhering to standard semantic conventions so the traces/information can be streamed to compatible systems following similar specs.
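The "standard semantic conventions" mentioned above are agreed-upon attribute names that every compatible backend understands. As a hedged sketch (the exact attribute keys shown match the OpenTelemetry messaging semantic conventions as published at the time of writing, but they have evolved between versions; the function itself is illustrative, not an SDK API), a Kafka producer span might carry attributes like these:

```python
def kafka_producer_span_attributes(topic: str, partition: int, key: str) -> dict:
    # Attribute names follow the OpenTelemetry messaging semantic
    # conventions, so any compliant backend can interpret them.
    return {
        "messaging.system": "kafka",
        "messaging.operation": "publish",
        "messaging.destination.name": topic,
        "messaging.kafka.message.key": key,
        "messaging.destination.partition.id": str(partition),
    }

attrs = kafka_producer_span_attributes("orders", 0, "order-123")
```

Because producers, consumers, and stream processors all label spans with the same vocabulary, a backend can stitch together a cross-service picture without vendor-specific glue.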
    By leveraging OpenTelemetry, organizations can ensure their applications and systems are secure and perform optimally. It will quickly become an essential tool for large-scale organizations that need to efficiently process massive amounts of real-time data. With its ability to scale independently, robust analytics capabilities, and powerful monitoring tools, OpenTelemetry is set to become the go-to platform for stream processing in the future.
    Roman explains that the OpenTelemetry APIs for Kafka are still in development and not yet available as open source. The code is complete and tested but has never run in production. If you want to learn more about the nuts and bolts, he invites you to connect with him on the Confluent Community Slack channel. You can also check out "Monitoring Kafka without instrumentation with eBPF" by Antón Rodríguez to learn more about a similar approach to domain monitoring.
    EPISODE LINKS
    ► OpenTelemetry java instrumentation: github.com/open-telemetry/ope...
    ► OpenTelemetry collector: github.com/open-telemetry/ope...
    ► Distributed Tracing for Kafka with OpenTelemetry-Kafka London 2022: cnfl.io/distributed-tracing-f...
    ► Monitoring Kafka without instrumentation with eBPF: cnfl.io/monitoring-extreme-sc...
    ► Join the Confluent Community: cnfl.io/confluent-community-e...
    ► Learn more with Kafka tutorials, resources, and guides at Confluent Developer: cnfl.io/confluent-developer-e...
    ► Use PODCAST100 to get an additional $100 of free Confluent Cloud usage: cnfl.io/try-cloud-episode-255
    ► Promo code details: cnfl.io/podcast100-details-ep...
    TIMESTAMPS
    0:00 - Intro
    4:14 - What is OpenTelemetry?
    7:52 - Tracing vs. Logs
    11:26 - Three ways to do application-level tracing with OpenTelemetry
    15:47 - What can you do if OpenTelemetry's agent doesn't support a specific API?
    17:57 - What's missing in OpenTelemetry's native Kafka support?
    32:29 - What can you see when using OpenTelemetry?
    36:10 - Getting started with OpenTelemetry for event-level tracing
    39:14 - Synchronous vs. Asynchronous processes
    48:13 - It's a wrap!
    ABOUT CONFLUENT
    Confluent is pioneering a fundamentally new category of data infrastructure focused on data in motion. Confluent’s cloud-native offering is the foundational platform for data in motion - designed to be the intelligent connective tissue enabling real-time data, from multiple sources, to constantly stream across the organization. With Confluent, organizations can meet the new business imperative of delivering rich, digital front-end customer experiences and transitioning to sophisticated, real-time, software-driven backend operations. To learn more, please visit www.confluent.io.
    #streamprocessing #apachekafka #kafka #confluent

Comments • 4

  • @brijeshjaggi4579 · 1 year ago +4

    Wish it had a demo to go with this talk.

  • @francksgenlecroyant · 1 year ago +2

    I loved everything: the background music, the content, and how the Kafka ecosystem is growing bigger!

  • @javadsaljooghi2407 · 6 months ago

    Wish we had this tracing for Kafka Streams and Connect sooner. In our company we use them, and we need to trace messages and monitor latency and throughput per message. Can you also make a podcast with a demo of instrumentation with the Prometheus exporter, JMX, or JFR for Kafka Streams and Connect? There is not enough information on how to do business-logic (app) instrumentation in Kafka, or on which approach has less overhead and is preferred.

  • @bjego-the-dev · 2 months ago

    Well, I would have thought that Confluent Cloud could, on the server side, send metrics and logs directly to an OpenTelemetry collector or a backend like Jaeger or Application Insights. It's pretty poor that they only support Prometheus and proprietary platforms like Splunk and Datadog.
    With this approach, each development team needs to integrate OpenTelemetry for their client on their own.