Distributed Queue Systems (Kafka's Architecture)

Start by thinking about which operations need to be performed synchronously and which can be handled asynchronously.

For example, emailing an invoice can be async, but a notification for, say, a direct message on Instagram probably needs to be near-instant.

Let’s say we have 2 servers at amazon.com: an order management server and a delivery server. There would obviously be several servers each for orders and for delivery, but at a high level we’re assuming 2 servers. The order management server confirms the order being placed, and it has to tell the delivery server to initiate a delivery.

Now suppose multiple order servers send requests to one delivery server, and the delivery server has to create an invoice, update the database, etc. The requests are then sent at a higher speed than the delivery server can process them.

We should then introduce a queue between the two.

A queue is a FIFO (first-in, first-out) data structure. Whenever a request is added to the queue, a worker takes the request from the queue to the delivery server, which can now process requests at its own pace.

We need to first add the request to the queue and only then respond with an OK or not-OK to the client. We don’t do the opposite, because if we sent back an OK and the server then crashed before enqueuing, the client would believe the order was placed when nothing had actually happened. But even in this solution, the queue itself could go down before the delivery server gets the information. Hence, we use a distributed queue.
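To make that ordering concrete, here is a minimal single-process sketch using Python’s standard library; the names (order_queue, place_order, delivery_worker) are illustrative, not part of any real order system:

```python
import queue
import threading

order_queue = queue.Queue()  # FIFO buffer between order servers and delivery

def delivery_worker():
    """Drains the queue at the delivery server's own pace."""
    while True:
        order = order_queue.get()               # blocks until a request arrives
        print(f"delivery: processing {order}")  # create invoice, update DB, ...
        order_queue.task_done()

threading.Thread(target=delivery_worker, daemon=True).start()

def place_order(order_id):
    # Enqueue FIRST, then acknowledge: if we crashed right after put(),
    # the order would still be safely in the queue.
    order_queue.put(order_id)
    return "OK"

print(place_order("order-42"))
order_queue.join()  # wait until the worker has processed everything
```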

Messaging Queues -

  • Async Communication

  • A form of load balancing: controlling throughput (the number of requests we can process at one time)

  • Fault Tolerance

  • Reliability (Persist Messages)

  • Decoupling components (different components are not dependent on, or ‘tightly coupled’ to, each other): the delivery server doesn’t know anything about the order management server and vice versa; both just need to know where the queue is.

  • Scaling is easy (If one queue is full, just add another queue)

  • Buffering and Throttling: throttling means putting a limit on the number of requests processed per unit of time; buffering means temporarily storing messages in the queue until the consumer application is ready to process them. (A rough throttling sketch follows this list.)
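As a rough illustration of the buffering/throttling idea (plain Python, no real queue involved; the limit of 5 requests per second is made up):

```python
import time

MAX_PER_SECOND = 5                        # made-up throttle limit
buffer = [f"req-{i}" for i in range(12)]  # requests piling up in the buffer

while buffer:
    # Take at most MAX_PER_SECOND requests from the front of the buffer...
    batch = [buffer.pop(0) for _ in range(min(MAX_PER_SECOND, len(buffer)))]
    for req in batch:
        print("processing", req)
    if buffer:
        time.sleep(1.0)                   # ...then wait out the rest of the second
```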

Normal (single-node) messaging queues, as described above: Redis-based queues, IBM MQ

Distributed Messaging Queues -

Distributed system - multiple components connected over a network.

  • Higher Availability

  • Message Persistence

  • Easily Scalable

  • Retry Mechanisms

  • Geographical Distribution - increases availability

  • Reliability

  • Can handle Big Data

One-to-many communication uses a pub-sub pattern (publishers are the producers, subscribers are the consumers). There is a higher level of decoupling in pub-sub than in a point-to-point queue.

Common Examples: Kafka, RabbitMQ

For example, after I place an order, I could have 1) the delivery service, 2) the email notification service for order confirmation, and more, all listening to/consuming the ‘order placed’ message once it is added to the queue.
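As a minimal in-process sketch of that fan-out (toy code, not a real broker; the topic name and handlers are made up):

```python
# topic -> list of subscriber callbacks
subscribers = {}

def subscribe(topic, handler):
    subscribers.setdefault(topic, []).append(handler)

def publish(topic, event):
    for handler in subscribers.get(topic, []):
        handler(event)  # every subscriber gets its own copy of the event

subscribe("order_placed", lambda e: print("delivery service: schedule", e))
subscribe("order_placed", lambda e: print("email service: confirm", e))
publish("order_placed", "order-42")
```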

Kafka:

Producer (a server that adds a message to the queue), Consumer(s) (servers that read messages from the queue).

Which message goes to which consumer is handled by tagging each message with a ‘topic’.

Since this is a distributed messaging queue system, we essentially run multiple brokers (servers) inside of Kafka.

  • Problem 1: If we put a message on one server and that message only exists there, the message is lost when that server goes down.

  • Problem 2: Too many requests for one topic. This is solved by load balancing (partitions, in this case): for each topic we have multiple partitions, and each request for the topic goes to a specific partition.

  • Not all partitions live on just one server. The partitions for the same topic are spread across different servers.

Two things are compulsory when sending a message: a topic and a value.
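As a sketch with the kafka-python client, assuming a broker running at localhost:9092 and an illustrative topic named "orders":

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

# Topic and value are the only compulsory parts of a message.
producer.send("orders", value=b"order-42 placed")
producer.flush()  # block until the message actually reaches the broker
```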

consumer groups: groups of consumers subscribed to the same topic.

(Key terms: Partition; Replica; Key: a value that is hashed to a partition number; Offset: a unique identifier that represents the position of a message within a partition of a topic.)

Parallelism: each consumer within a consumer group is assigned its own partition(s), so the partitions are consumed in parallel.
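A sketch of a consumer joining a group, again with kafka-python under the same assumed broker and topic; starting a second copy of this script makes Kafka split the topic’s partitions between the two group members:

```python
from kafka import KafkaConsumer

# Every consumer started with the same group_id joins the same consumer group;
# Kafka assigns each member a disjoint subset of the topic's partitions.
consumer = KafkaConsumer(
    "orders",
    group_id="delivery-service",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",  # start from the beginning if no saved offset
)

for msg in consumer:
    print(f"partition={msg.partition} offset={msg.offset} value={msg.value}")
```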

The offset tells you where a message sits within its partition.

Internally, a partition works like a queue: an append-only log that is read in order.
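As a toy model (plain Python, not Kafka code): a partition is an append-only log, and the offset is simply the index in that log:

```python
partition_log = []  # toy partition: an append-only list

def append(value):
    partition_log.append(value)
    return len(partition_log) - 1  # the offset assigned to this message

offset = append("order-42 placed")
print(offset, partition_log[offset])  # a consumer reads forward from a saved offset
```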

A message sent consists of a Topic and a Value.

It also optionally consists of a Key (which gets hashed to a partition number) and an explicit Partition.

Each message also gets an Offset, the number which tells the position of the message within its partition; it is assigned by the broker on append, and consumers use it to track how far they have read.
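In kafka-python terms (same assumed broker and topic as before), key and partition are optional arguments to send():

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

# Key: hashed to pick the partition, so all messages for user-7 land together.
producer.send("orders", key=b"user-7", value=b"order-42 placed")

# Explicit partition: overrides the key hashing entirely.
producer.send("orders", partition=0, value=b"order-43 placed")

producer.flush()
```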

You also have replicas of partitions in case one broker goes down:

The replicas are there for failover: they serve reads/writes only if the primary partition goes down.

Several consumers subscribed to the same topic form a consumer group and share the messages for that topic.

If I add an explicit partition parameter to every message I send, so that every message goes to the same partition, this decreases parallelism, and availability too.

Partitions get used in parallel, but the replicas don’t: every topic has multiple partitions which are consumed in parallel, but each partition’s replicas are only used when needed.

Where Kafka could be used:

  • Order confirmed updates

  • Promotional Emails

  • Log aggregation

  • Real-time data/Analytics

Kafka generally scales to far higher throughput than RabbitMQ.

Both Kafka and RabbitMQ use TCP as the transport-layer protocol.

  • Kafka clusters consist of brokers.

  • Each broker manages partitions, which are sequences of messages within a topic.

  • Partitions can be replicated across brokers for fault tolerance (see the topic-creation sketch after this list).

  • Multiple consumers in a consumer group can consume messages from different partitions, enabling parallel processing.

  • Each consumer within a consumer group is assigned to a specific partition to ensure parallel consumption.
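As a closing sketch, this is how those knobs map onto topic creation with kafka-python’s admin client; the broker address, topic name, and the 3-partition / 2-replica numbers are assumptions for illustration:

```python
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

# 3 partitions -> up to 3 consumers in one group can read in parallel;
# replication_factor=2 -> each partition has a copy on a second broker.
admin.create_topics([
    NewTopic(name="orders", num_partitions=3, replication_factor=2)
])
admin.close()
```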