• Apr 2

More threads didn’t increase throughput

    The added complexity wasn't justified.

    Billing and audit services publish files to a Kafka topic receiving about 25 million messages per day. The messages contain files such as invoices, statements, and logs that must eventually be stored in Google Cloud Storage for long-term retention.

    The archive service

    We built a service that consumes these messages and uploads each file to Google Cloud Storage. It was deployed to an on-premises Kubernetes cluster as one pod running 20 consumer threads.

    Complexity not justified

    The assumption was that if each pod, running 20 threads, pushed many uploads concurrently, then adding more pods would increase overall throughput.

    After deploying to the test environment, we saw that the behavior was not what we expected. Even as we added pods, throughput did not grow.

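    Digging deeper, we realized how Kafka works. The topic had 20 partitions, and within a consumer group each partition is assigned to only one consumer at a time. The group can therefore process at most 20 messages in parallel, regardless of how many consumers you run; any consumer beyond the twentieth sits idle. That's how Kafka distributes the work across partitions.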
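
    The partition limit can be illustrated with a small simulation. This is a minimal sketch, not Kafka's actual assignor: it hands partitions out round-robin, and the function names `assign_partitions` and `active_consumers` are hypothetical. Kafka's real assignment strategies differ in detail, but they share the rule modeled here, that each partition has a single owner within the group.

```python
def assign_partitions(num_partitions, num_consumers):
    """Give each partition exactly one owner, round-robin over the consumers."""
    assignment = {consumer: [] for consumer in range(num_consumers)}
    for partition in range(num_partitions):
        assignment[partition % num_consumers].append(partition)
    return assignment


def active_consumers(assignment):
    """Count the consumers that own at least one partition."""
    return sum(1 for partitions in assignment.values() if partitions)


if __name__ == "__main__":
    for consumers in (1, 20, 400):
        assignment = assign_partitions(20, consumers)
        print(consumers, "consumers ->", active_consumers(assignment), "active")
```

    With 20 partitions, going from 20 to 400 consumers leaves the number of active consumers at 20; the other 380 own nothing.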

    So even if we ran 20 pods with 20 threads each, 400 consumers in total, the system would still process the same number of messages as 20 pods with a single consumer each.
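
    The same comparison as arithmetic, in a short sketch. It assumes every thread runs its own consumer in the same group, and that messages are spread evenly across the 20 partitions and across the day; both are simplifications for illustration, and `effective_parallelism` is a hypothetical helper.

```python
PARTITIONS = 20
MESSAGES_PER_DAY = 25_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def effective_parallelism(pods, threads_per_pod, partitions=PARTITIONS):
    """Consumers beyond the partition count are idle, so parallelism is capped."""
    return min(pods * threads_per_pod, partitions)


# Average load, assuming an even spread over the day and over the partitions.
avg_rate = MESSAGES_PER_DAY / SECONDS_PER_DAY
per_partition_rate = avg_rate / PARTITIONS

if __name__ == "__main__":
    shapes = {
        "1 pod x 20 threads": (1, 20),
        "20 pods x 20 threads": (20, 20),
        "20 pods x 1 consumer": (20, 1),
    }
    for name, (pods, threads) in shapes.items():
        print(name, "->", effective_parallelism(pods, threads), "parallel consumers")
    print(f"average: {avg_rate:.0f} msg/s, per partition: {per_partition_rate:.1f} msg/s")
```

    All three shapes end up with 20 parallel consumers. Under the even-spread assumption, 25 million messages per day averages out to roughly 289 messages per second, or about 14.5 per second for each partition's consumer.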

    Competing Consumers Pattern

    Instead of running many threads inside a single pod, we embraced the competing consumers pattern.

    We ran one consumer per pod and deployed 20 pods. Each consumer reads a message and uploads the file to Google Cloud Storage. The throughput remained the same, but the system became simpler. The design choice is clearer to everyone involved. That clarity matters to me.
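
    The one-consumer-per-pod shape can be sketched in a single process. This is an in-memory stand-in, not the service's actual code: `queue.Queue` objects play the role of the 20 Kafka partitions, a plain dict plays the Cloud Storage bucket, and threads stand in for pods only so the demo runs in one process. The names `run_pod` and `archive` are hypothetical.

```python
import queue
import threading


def run_pod(pod_id, partition, bucket, lock):
    """One 'pod': a single loop that reads a message and 'uploads' the file."""
    while True:
        message = partition.get()
        if message is None:  # sentinel: no more messages in this demo
            break
        name, payload = message
        with lock:
            bucket[name] = (pod_id, payload)  # stand-in for a Cloud Storage upload


def archive(messages, num_partitions=20):
    """Spread messages over partitions, then run one consumer per partition."""
    partitions = [queue.Queue() for _ in range(num_partitions)]
    for index, message in enumerate(messages):
        partitions[index % num_partitions].put(message)
    for partition in partitions:
        partition.put(None)

    bucket = {}
    lock = threading.Lock()
    pods = [
        threading.Thread(target=run_pod, args=(pod_id, partitions[pod_id], bucket, lock))
        for pod_id in range(num_partitions)
    ]
    for pod in pods:
        pod.start()
    for pod in pods:
        pod.join()
    return bucket


if __name__ == "__main__":
    files = [(f"invoice-{i}.pdf", b"data") for i in range(100)]
    print(len(archive(files)), "files archived")
```

    Each consumer is a plain sequential loop with no thread pool to size or tune. In a real deployment, scaling would be a matter of the pod count, up to the number of partitions.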

    The takeaway

    The mistake I often see is implementing the first idea that comes to mind: more threads to increase throughput. In our case, the added complexity wasn't justified, and the team lived with it sprint after sprint.

    A better move is to pause and look for the right pattern. Here, that pattern was competing consumers.
