
How to harness the potential of data in real time with Kafka?

Understanding the tools and technologies that enable effective data management is crucial for any organization or individual looking to harness the power of data-driven decision-making. One such tool is Kafka, an open-source streaming data platform. What is Kafka and how does it work? Why do we use it here at Medius? What skills does someone need to use this platform in their own processes? Our colleague David Šenica will help us find the answers to all these questions.

With his extensive experience in machine learning and data processing at Medius, David is responsible for developing and implementing models and applications that simplify data storage, processing and analysis.
His work involves not only the technical aspect of creating and deploying machine learning models, but also the architectural design of systems that enable efficient processing of large amounts of data.

Unrivalled flexibility

One of Kafka's main features is its ability to scale to meet business needs. David points out that Kafka can process growing volumes of data without compromising performance.
This is essential in data-driven environments, where being able to process and analyze data in real time translates directly into a competitive edge.

"Once the system is in place and the customer wants to process the data for other purposes after a certain period of time, e.g. they want additional predictions, we simply connect the new application to the existing Kafka and start or continue processing the data from there."

David points to the reliability that data replication brings as another major advantage.

"Kafka provides high reliability and fault tolerance. The replication of data across multiple nodes means that even in the event of a failure, the system can continue to run smoothly, giving us continuous processing and analysis of data without interruption."

When to use Kafka?

"The decision to use the Kafka platform depends on the project or the client's needs," says David. It's essential to determine the amount of data involved and where it will be stored.
For small amounts of data, a simple application connected directly to a database is enough; Kafka is not needed there.
However, once a system starts emitting, say, a few hundred log messages per second, a single server will most likely no longer provide sufficient throughput, and that is when Kafka comes into play.
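
As a rough sketch of that scenario (topic name, broker address and payload are hypothetical), an application could ship each log event to Kafka instead of writing it straight to a single database server:

```python
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

# Hundreds of events per second are buffered, batched and spread
# across the topic's partitions rather than hitting one server directly.
producer.send("app-logs", value={"level": "INFO", "msg": "request handled"})
producer.flush()
```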

A telling example is a telecommunications project in which Kafka was instrumental in processing application logs and monitoring the flow of a large number of requests and responses between services.
This not only improved the efficiency of data processing, but also the ability to analyse and extract insights from the data.

"Every project we take on is first analysed and then we decide on the next steps. When working with the Kafka platform, depending on the scope and requirements of the project, it is necessary to prepare all the necessary code and configuration with Kafka, in which the data will then be processed."

Our conversation also touched on similar tools and on the reasons for choosing Kafka over them.
Kafka differs in its approach to message retention: unlike tools that delete messages as soon as they are consumed, Kafka keeps them for a configurable period of time.
This is crucial in cases that require data retention, as it allows messages to be reprocessed whenever needed.
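
Retention itself is just topic configuration. A small sketch, assuming an existing topic named events, that sets a seven-day retention window with kafka-python's admin client:

```python
from kafka.admin import KafkaAdminClient, ConfigResource, ConfigResourceType

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

# Keep messages on disk for 7 days after they are written; any consumer
# can come back and re-read them within that window.
admin.alter_configs([
    ConfigResource(
        ConfigResourceType.TOPIC,
        "events",  # hypothetical topic name
        configs={"retention.ms": str(7 * 24 * 60 * 60 * 1000)},
    )
])
```
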
Apache Pulsar, on the other hand, attracts attention with its cloud-native design, easy scalability and its own approach to data retention, which makes it an attractive option for cloud deployments.

There are several solutions, but it is reliability and, above all, the repeatedly mentioned scalability that tip the scales in Kafka's direction. As data processing needs increase, different applications can be connected to Kafka to listen and process data at the same time.
If we didn't use Kafka, we would have to send data to each application separately, which would complicate the process and increase the chances of data loss.
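
The fan-out David describes follows from Kafka's consumer-group model. A brief sketch, with made-up topic and group names, of how each application gets its own independent view of the same stream:

```python
from kafka import KafkaConsumer

# Each downstream application runs its own consumer with its own
# group_id; every group independently receives every message, so the
# producer never has to send the data to each application separately.
def run_app(group_id):
    consumer = KafkaConsumer(
        "events",
        bootstrap_servers="localhost:9092",
        group_id=group_id,
    )
    for message in consumer:
        ...  # application-specific processing

# e.g. run_app("analytics-app") and run_app("alerting-app"),
# each in its own process, each seeing the full stream.
```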

Understanding Kafka

And how is Kafka received by decision-makers? Some of our clients were already familiar with Kafka, while others were introduced to its possibilities through our recommendations. The feedback has been positive as they have recognised the potential, especially the scalability and data replication features.

David and his team have been instrumental in facilitating the transition to Kafka. They highlight how simple the setup is thanks to prepared scripts, which allow even users without prior technical knowledge to use Kafka effectively and make such solutions all the more accessible.
