On the importance of load testing Kafka

Socrates preached, “To know thyself is the beginning of wisdom.” This ancient Greek anecdote applies to your modern Apache Kafka project: developers, go forth and load test your real-time application to understand the capacity and limitations of your project before deployment. Failure to do so will cost you time and money (e.g. Robinhood’s outage on a historic trading day). Load testing your real-time applications has three main objectives.


Get your GitOps for real-time apps on Apache Kafka & Kubernetes

Infrastructure as code has been an important practice of DevOps for years. Anyone running an Apache Kafka data infrastructure and running on Kubernetes, the chances are you’ve probably nailed defining your infrastructure this way. If you’re running on Kubernetes, you’re likely using operators as part of your CI/CD toolchain to automate your deployments.


What can the pandemic teach us about building data products?

Most people I’m talking to in the data world have to achieve more with less - and faster. Normally we don’t really feel the weight of this day-to-day. But COVID-19 has forced many of us to adapt our approach to new problems and even rethink existing business models. Often, those changes leave us better off. What better example than what’s currently happening in healthcare and life science?


Jaeger Essentials: Jaeger Persistent Storage With Elasticsearch, Cassandra & Kafka

Running systems in production involves requirements for high availability, resilience and recovery from failure. When running cloud native applications this becomes even more critical, as the base assumption in such environments is that compute nodes will suffer outages, Kubernetes nodes will go down and microservices instances are likely to fail, yet the service is expected to remain up and running.

Archive data from to S3 with the new Kafka Connect connector

The new open-source #ApacheKafka Connect sink connector for #S3 gives you full control on how to sink data to S3 and save money on long term storage costs in #Kafka. The connector has the ability to flush data out in a number of different formats including #AVRO, #JSON, #Parquet and #Binary as well as ability to create S3 buckets based on partitions, metadata fields and value fields.