
Understanding Kafka for Database Developers

Dear Friends,

As I transition from Database Developer to Data Scientist, I would like to take this opportunity to help friends who are on a similar path.

What is Kafka?

Kafka is an open-source project created at LinkedIn and released on GitHub in 2010 by Jay Kreps, Neha Narkhede and Jun Rao, engineers at LinkedIn who later co-founded Confluent. It was built to be a scalable messaging system that could meet the needs of both monitoring and tracking systems.

The name ‘Kafka’ comes from the author Franz Kafka, whose work Jay Kreps read in literature classes in college. He felt a writer's name suited a system optimized for writing.

Why Kafka?

It is described as a “distributed commit log” or, more recently, as a “distributed streaming platform”.

Terminology

| Kafka Term | Meaning | Similar term in Database |
|---|---|---|
| Message | Unit of data | Row or record |
| Schema | Structure for message content | JSON, XMLTYPE or XSD |
| Topic | Category into which messages are grouped | Table, or folder in a filesystem |
| Partition | Subdivision of a topic | Table partition/sub-partition |
| Stream | Data from a single topic as it moves from producers to consumers | Similar to Oracle Streams in Change Data Capture, used for replication |
| Producer/Writer/Source/Publisher | Publisher or creator of messages in a topic | Think of the source in an ETL job or the source in UTL_FILE |
| Consumer/Reader/Sink/Subscriber | Subscriber or reader of messages in a topic | Think of the target in an ETL job or the target in UTL_FILE |
| Broker | A single Kafka server that receives messages from producers, stores them, and serves them to consumers | Instance (set of memory/processes) in Oracle, which acts as intermediary between datafiles/control files/logfiles and users/clients; one instance in Oracle RAC (Oracle Real Application Clusters) |
| Cluster | Group of brokers. One broker acts as the cluster controller; for each partition, one broker is the leader and the brokers holding its replicas are followers | Similar to Oracle RAC with multiple instances |
| ZooKeeper | Stores configuration information about the cluster and consumer client details | Similar to global dictionary views (gv$) such as gv$session, which contain connection metadata, or to the interconnect in Oracle RAC that communicates/syncs between instances |
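
To make the terms above concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Kafka client API: the class names `Topic` and `Consumer`, the method names, and the CRC32 hash are all assumptions made for illustration (Kafka's own default partitioner uses a different hash). It only shows how a message key picks a partition and how a consumer remembers an offset per partition.

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy stand-in for a Kafka topic: a fixed number of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Same key -> same partition. CRC32 is an illustrative choice only.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key, value):
        # "Producer" side: add the message to the end of one partition.
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1  # (partition, offset)


class Consumer:
    """Toy consumer that tracks the next offset to read in each partition."""

    def __init__(self, topic):
        self.topic = topic
        self.offsets = defaultdict(int)

    def poll(self):
        records = []
        for p, log in enumerate(self.topic.partitions):
            records.extend(log[self.offsets[p]:])
            self.offsets[p] = len(log)  # remember how far we have read
        return records


orders = Topic("orders", num_partitions=3)
for customer, item in [("alice", "book"), ("bob", "pen"), ("alice", "lamp")]:
    orders.append(customer, item)

reader = Consumer(orders)
print(reader.poll())  # all three messages, grouped by partition
print(reader.poll())  # [] because nothing new has been written
```

For a database developer, the closest analogy is a hash-partitioned table where rows are only ever appended and each reader keeps its own "last row seen" bookmark.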

What makes Kafka Popular?

Multiple Producers: Kafka seamlessly handles many producers, whether they write to the same topic or to different topics

Multiple Consumers: Multiple consumers can read simultaneously without interfering with each other

Disk-based retention: Messages are written to disk and kept according to retention rules, so they are durable and need not be consumed in real time

Scalable: Kafka can scale up and scale out, which makes it easy to handle huge amounts of data.

High Performance: Together, the features above give Kafka excellent performance.

Open Source: It offers all the benefits of Open-Source software – cost benefit, transparency, flexibility, and security.
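
The "multiple consumers" and "disk-based retention" points can be sketched with a toy log in Python. This is an assumption-laden illustration, not Kafka code: `RetainedLog`, `Reader` and `retention_seconds` are hypothetical names, and the log lives in memory rather than on disk. In real Kafka, retention is configured per topic (for example with `retention.ms`) and applied to log segments rather than to individual messages.

```python
import time


class RetainedLog:
    """Toy append-only log that keeps records until they exceed a retention age."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self.records = []  # (offset, timestamp, value)
        self.next_offset = 0

    def append(self, value, now=None):
        ts = time.time() if now is None else now
        self.records.append((self.next_offset, ts, value))
        self.next_offset += 1

    def expire(self, now=None):
        # Records are dropped by age, not because somebody has read them.
        now = time.time() if now is None else now
        cutoff = now - self.retention_seconds
        self.records = [r for r in self.records if r[1] >= cutoff]

    def read_from(self, offset):
        return [(off, value) for off, _, value in self.records if off >= offset]


class Reader:
    """Each reader keeps its own position, so readers do not affect each other."""

    def __init__(self, log):
        self.log = log
        self.position = 0

    def poll(self):
        batch = self.log.read_from(self.position)
        if batch:
            self.position = batch[-1][0] + 1
        return [value for _, value in batch]


log = RetainedLog(retention_seconds=60)
log.append("a", now=0)
log.append("b", now=30)

fast, slow = Reader(log), Reader(log)
print(fast.poll())  # ['a', 'b']
log.append("c", now=50)
print(fast.poll())  # ['c']
print(slow.poll())  # ['a', 'b', 'c'] (reading by 'fast' removed nothing)

log.expire(now=80)  # 'a' is now older than 60 seconds
print(Reader(log).poll())  # ['b', 'c']
```

This is the main difference from a classic message queue, where reading a message usually removes it: here the log keeps the data and each reader only moves its own bookmark.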

What are the popular Event Streaming Platforms?

Amazon Kinesis, Apache Spark, Apache Flink, Apache Kafka, Apache Storm

Conclusion:

This was just a brief introduction to the amazing world of Kafka and the rich features it offers in a world where data is growing exponentially and data formats are becoming more complex.

Cheers!

 

Posted on 13 May, 2024 in Cloud, Data Science, DBA, Oracle, PL/SQL

 


Top 10 features in Oracle Database for Developers


 

Posted on 30 April, 2024 in Data Science, Oracle, PL/SQL

 


How to explain Zero-Sum game to a toddler?

In a zero-sum game, one person's gain is always offset by another person's loss, so the total in the financial world stays in equilibrium.

The image below is an attempt to explain this complex game-theory concept to a toddler 😊

Happy Friday

Cheers!

 

Posted on 26 April, 2024 in Entertainment, Finance, General

 


Fibonacci Sequence in Life

The Fibonacci sequence has always amazed financial analysts and traders alike. But we seldom notice that this sequence is present throughout nature and has been around for millions of years 🐌🌸🌼🪷🌀

Below are some impressive AI-generated images of the Fibonacci sequence in life ❤️

Source: https://www.bing.com/images/create

 

Posted on 21 April, 2024 in Data Science, General

 


Remembrance on Good Friday and Best Wishes for New Life

Hi All,

Remembrance on Good Friday and best wishes for new life to all my friends who celebrate and follow the Christian faith 💐

As we embark on a new age and new life 🐣 with AI (Artificial Intelligence), below is a beautiful depiction of Good Friday created with Image Creator from Microsoft Copilot, powered by DALL-E 😎

When I search for Good Friday on Microsoft Bing, below is the list of similar AI-generated images I get.

This post is inspired by a session on AI and the future with AI.

PS: If you are one of the readers who can relate to the below, please drop a note in the comments section 😉

Generated using Microsoft Bing powered by DALL-E

Cheers 🙏

 

Posted on 29 March, 2024 in Data Science, Entertainment

 
