Distributed Stream Processing

In simple terms, distributed stream processing means consuming events from a source, transforming them, aggregating them (usually while maintaining some internal or external state), and then writing the results to a sink. More importantly, all of this happens in real time and at high scale. Examples: anomaly detection, analytics, alerting, etc.
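The consume, transform, aggregate, and sink steps can be sketched in a few lines. This is only a single-process illustration, not a distributed implementation: the `run_pipeline` function, the `(user, amount)` event shape, and the use of plain lists as source and sink are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event shape for this sketch: (user_id, amount).
Event = Tuple[str, float]


def run_pipeline(source: Iterable[Event], sink: List[Event]) -> Dict[str, float]:
    """Consume events, filter them, keep a running total per user, emit to a sink."""
    totals: Dict[str, float] = defaultdict(float)  # internal state of the aggregation

    for user, amount in source:          # consume from the source
        if amount <= 0:                  # transform/filter: drop invalid events
            continue
        totals[user] += amount           # aggregate: update per-key state
        sink.append((user, totals[user]))  # write the updated result to the sink

    return dict(totals)


if __name__ == "__main__":
    out: List[Event] = []
    final_state = run_pipeline(
        [("alice", 5.0), ("bob", 2.0), ("alice", -1.0), ("alice", 3.0)], out
    )
    print(out)
    print(final_state)
```

In a real deployment the source and sink would be external systems and the state would have to survive restarts, which is where most of the difficulty of stream processing lies.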

Scaling Kafka Processing

Scaling Problem
🧐 How do we scale Kafka processing? Horizontally, by adding consumers to a consumer group.
🧐 So can we have an unlimited number of consumers? No: number_of_consumers <= number_of_partitions. Within a group, each partition is read by at most one consumer, so any extra consumers sit idle.
🧐 So can we have a very large number of partitions? Yes, but it is recommended to choose a calculated number of partitions. Each partition has overhead...
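The consumer limit is easier to see with a toy assignment. The sketch below hands out partitions round-robin, which is a simplification chosen for illustration and not Kafka's actual assignor logic; `assign_partitions` and the consumer names are hypothetical.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one consumer in the group, round-robin."""
    if not consumers:
        return {}
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


if __name__ == "__main__":
    # Fewer consumers than partitions: every consumer gets work.
    print(assign_partitions(4, ["c0", "c1"]))
    # More consumers than partitions: the extra consumer gets nothing.
    print(assign_partitions(3, ["c0", "c1", "c2", "c3"]))
```

With 3 partitions and 4 consumers, the fourth consumer ends up with an empty list, which is why adding consumers beyond the partition count does not increase throughput.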
