Recently, I had the opportunity to attend the Apache Kafka Meetup hosted at Microsoft, organized by the GitHub User Group in collaboration with Confluent.
The sessions were incredibly insightful, covering distributed systems, real-time streaming architectures, microservices communication, and the evolving landscape of Gen AI agents.
Here's a detailed recap of the key takeaways from the event.
One of the most engaging talks was delivered by Sasi Teja K, who walked us through practical distributed systems concepts in modern data architectures.
His talk beautifully refreshed core system design principles.
In real-time systems, the distinction between competing goals such as throughput and latency comes up again and again, and it is crucial when designing scalable event-driven architectures.
The discussion went deeper into modern analytical storage patterns.
ClickHouse is a powerful column-oriented database optimized for OLAP workloads. It excels in:
- Fast aggregations over very large datasets
- Real-time analytical queries
- Efficient columnar compression and storage
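To make the OLAP angle concrete, here is a minimal sketch of an aggregation query against ClickHouse using the clickhouse-connect Python client; the `events` table, its columns, and the connection settings are placeholders I've made up, not anything shown at the talk.

```python
# Minimal ClickHouse aggregation sketch (hypothetical `events` table, local server).
import clickhouse_connect

# Connect to a locally running ClickHouse instance over HTTP.
client = clickhouse_connect.get_client(host="localhost", port=8123)

# A typical OLAP-style query: event counts per type over the last day.
result = client.query(
    """
    SELECT event_type, count() AS events
    FROM events
    WHERE event_time >= now() - INTERVAL 1 DAY
    GROUP BY event_type
    ORDER BY events DESC
    """
)

for event_type, events in result.result_rows:
    print(event_type, events)
```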
Apache Iceberg is a modern table format designed for:
- Large analytic tables on data lakes
- ACID transactions and snapshot isolation
- Schema and partition evolution without rewriting data
- Time travel across table snapshots
When combined with cloud storage, it enables scalable and reliable data lake architectures.
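As a small illustration of the table-format idea, a read through the PyIceberg client might look like the sketch below; the REST catalog settings and the `analytics.events` table name are assumptions made up for the example.

```python
# Minimal PyIceberg read sketch; catalog settings and table name are hypothetical.
from pyiceberg.catalog import load_catalog

# Point at a catalog service (here an assumed REST catalog) that tracks table metadata.
catalog = load_catalog(
    "default",
    **{
        "type": "rest",
        "uri": "http://localhost:8181",
        "warehouse": "s3://my-warehouse/",
    },
)

# Load an Iceberg table and materialize a scan of it as an Arrow table.
table = catalog.load_table("analytics.events")
arrow_table = table.scan().to_arrow()

print(arrow_table.num_rows)
```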
A very important distributed systems concept revisited was the Split Brain Problem.
It occurs when a network partition splits a cluster into groups that can no longer see each other, and more than one group believes it is the authoritative leader, so both sides keep accepting writes that later conflict.
Handling this requires quorum-based leader election, fencing of stale leaders, and accepting writes only on the side that still holds a strict majority of nodes.
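A toy sketch of the majority (quorum) rule that underpins most split-brain defenses; the cluster size and node counts are invented for illustration.

```python
# Toy illustration of quorum-based split-brain protection.
# A partition may only elect a leader if it still holds a strict majority of the cluster.

def can_elect_leader(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True only if this side of the partition holds a strict majority."""
    return reachable_nodes > cluster_size // 2

CLUSTER_SIZE = 5

# A 3/2 split: only the side with three nodes may keep serving writes.
print(can_elect_leader(3, CLUSTER_SIZE))  # True  -> this partition elects a leader
print(can_elect_leader(2, CLUSTER_SIZE))  # False -> this partition must stop accepting writes
```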
In event-driven streaming systems, duplicate events are common.
Watermarking techniques help by tracking how far event time has progressed, so the system knows when a window can safely be closed and how late or replayed events should be handled instead of being silently double-counted.
This becomes critical in high-throughput systems where idempotency and correctness matter.
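Here is a back-of-the-napkin sketch that combines a simple watermark (latest event time minus an allowed-lateness bound) with key-based de-duplication; the event ids, timestamps, and lateness bound are invented for illustration.

```python
# Toy watermark + de-duplication sketch for an out-of-order event stream.
# Event ids, timestamps, and the lateness bound are made up for illustration.

ALLOWED_LATENESS = 5  # seconds of out-of-orderness we tolerate

seen_ids: set[str] = set()  # idempotency: remember which event ids were already processed
watermark = 0               # "event time has progressed at least this far"

def process(event_id: str, event_time: int, payload: str) -> None:
    global watermark
    # Advance the watermark based on the newest event time observed so far.
    watermark = max(watermark, event_time - ALLOWED_LATENESS)

    if event_id in seen_ids:
        print(f"duplicate {event_id} -> dropped")
        return
    if event_time < watermark:
        print(f"late event {event_id} (t={event_time} < watermark={watermark}) -> special handling")
        return

    seen_ids.add(event_id)
    print(f"processed {event_id}: {payload}")

process("e1", 10, "order created")
process("e1", 10, "order created")  # duplicate, dropped
process("e2", 20, "order paid")
process("e3", 9, "stale update")    # behind the watermark, handled as late
```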
While Kafka dominates event streaming, the talk also highlighted alternative communication patterns.
NATS is a lightweight messaging system often used for:
- Low-latency service-to-service messaging
- Request-reply communication between microservices
- Edge and IoT scenarios where a small footprint matters
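For flavour, a minimal publish/subscribe round trip with the nats-py client might look like this; the local server URL and the `orders.created` subject are assumptions for the sketch.

```python
# Minimal NATS publish/subscribe sketch using the nats-py client.
# Assumes a NATS server running locally; the subject name is purely illustrative.
import asyncio
import nats

async def main() -> None:
    nc = await nats.connect("nats://127.0.0.1:4222")

    async def handler(msg):
        print(f"received on {msg.subject}: {msg.data.decode()}")

    # Subscribe first, then publish a message to the same subject.
    await nc.subscribe("orders.created", cb=handler)
    await nc.publish("orders.created", b"order 42 created")

    await nc.flush()
    await asyncio.sleep(0.1)  # give the callback a moment to fire
    await nc.drain()

asyncio.run(main())
```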
GlassFlow was introduced as an alternative to Kafka Streams and Apache Flink.
It aims to make real-time stream processing simpler to build and run than a full Kafka Streams or Flink deployment.
This opens up interesting possibilities for teams that want stream processing without operational complexity.
Apache Avro plays a crucial role in:
- Compact, binary serialization of event payloads
- Schema evolution as producers and consumers change independently
- Kafka pipelines, where it typically works hand in hand with a schema registry
Avro significantly improves performance and compatibility in distributed data pipelines.
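A tiny serialization round trip with the fastavro library shows the idea; the schema and record below are made up for the example.

```python
# Minimal Avro round trip using fastavro; schema and record are illustrative.
import io
from fastavro import parse_schema, schemaless_reader, schemaless_writer

schema = parse_schema({
    "type": "record",
    "name": "OrderCreated",
    "fields": [
        {"name": "order_id", "type": "string"},
        {"name": "amount", "type": "double"},
    ],
})

record = {"order_id": "42", "amount": 19.99}

# Serialize to a compact binary buffer (what would travel through Kafka).
buffer = io.BytesIO()
schemaless_writer(buffer, schema, record)
payload = buffer.getvalue()

# Deserialize on the consumer side with the same (or a compatible) schema.
decoded = schemaless_reader(io.BytesIO(payload), schema)
print(len(payload), decoded)
```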
A subtle but powerful concept discussed was linger time.
In Kafka producers, linger.ms determines how long the producer waits for more records before sending a batch, trading a little latency for better batching and throughput. This perfectly ties into the broader tradeoffs in distributed system design.
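As a rough sketch with the confluent-kafka Python client (broker address, topic name, and the specific values are placeholders), raising linger.ms lets the producer accumulate bigger batches at the cost of a few milliseconds of latency.

```python
# Producer batching sketch with the confluent-kafka Python client.
# Broker address, topic name, and the tuning values are placeholders.
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "linger.ms": 20,         # wait up to 20 ms for more records before sending a batch
    "batch.size": 65536,     # upper bound on a batch, in bytes
    "compression.type": "lz4",
})

def on_delivery(err, msg):
    if err is not None:
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()} [{msg.partition()}]")

for i in range(100):
    producer.produce("orders", key=str(i), value=f"order {i}", callback=on_delivery)

producer.flush()
```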
Another fascinating talk was on Gen AI Agents and how they are reshaping automation.
Agent Skills are well-defined, reusable capabilities, such as calling an API, querying a database, or generating a report, that an AI agent can invoke as building blocks.
Instead of simple prompt-response systems, AI agents with skills can break a goal into steps, pick the right skill for each step, and chain the results into a complete workflow.
This drastically improves reliability, reusability, and the range of tasks that can be automated end to end.
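A deliberately simplified sketch of the idea, with a hypothetical skill registry and a hard-coded plan standing in for the reasoning a real agent would do on its own.

```python
# Toy sketch of composable agent skills.
# The skills, the registry, and the fixed "plan" are hypothetical stand-ins
# for what a real agent would decide dynamically.
from typing import Callable, Dict

SKILLS: Dict[str, Callable[[str], str]] = {}

def skill(name: str):
    """Register a function as a named, reusable agent skill."""
    def register(fn: Callable[[str], str]) -> Callable[[str], str]:
        SKILLS[name] = fn
        return fn
    return register

@skill("fetch_sales")
def fetch_sales(region: str) -> str:
    return f"raw sales data for {region}"

@skill("summarize")
def summarize(text: str) -> str:
    return f"summary of: {text}"

def run_agent(goal: str) -> str:
    # A real agent would plan this chain itself; here the plan is hard-coded.
    plan = ["fetch_sales", "summarize"]
    result = goal
    for step in plan:
        result = SKILLS[step](result)
    return result

print(run_agent("EMEA"))
```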
The future clearly lies in composable AI systems powered by well-defined agent skills.
One of the most informative sessions was about the Publisher-Subscriber (Pub-Sub) Model.
Publishers write messages to a topic without knowing who will read them, and subscribers consume from that topic at their own pace. This decouples producers and consumers, enabling scalability and resilience.
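A minimal in-memory sketch of the pattern; the topic name and handlers are illustrative, and a real broker such as Kafka adds persistence, partitioning, and consumer groups on top of this basic decoupling.

```python
# Minimal in-memory publish/subscribe sketch; topic and handlers are illustrative.
from collections import defaultdict
from typing import Callable, DefaultDict, List

class Broker:
    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> None:
        # The publisher never knows who (or how many) will consume the message.
        for handler in self._subscribers[topic]:
            handler(message)

broker = Broker()
broker.subscribe("orders", lambda m: print(f"billing saw: {m}"))
broker.subscribe("orders", lambda m: print(f"shipping saw: {m}"))
broker.publish("orders", "order 42 created")
```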
The session also emphasized architectural tradeoffs:
| Goal | What It Optimizes | Tradeoff |
|---|---|---|
| Throughput | Maximum data processed | May increase latency |
| Latency | Fast response time | May reduce batching efficiency |
| Durability | Data safety | Higher storage & replication cost |
| Availability | System uptime | May sacrifice consistency |
This aligns closely with distributed systems theory and real-world engineering constraints.
This meetup was more than just a Kafka session; it was a masterclass in distributed systems thinking, streaming architecture, and the tradeoffs that shape real-world engineering.
Events like these reinforce one important lesson:
Great system design is not about choosing tools; it's about understanding tradeoffs.
Looking forward to attending more such community-driven tech events!