Monday, February 3, 2025

How I Made Bulk Ordering 1200% Faster with Multithreading!

Design Pattern
Multithreading
Akash Aman

Updated: February 2025


Case Study

Handling bulk orders is no joke. When I started optimizing the system for a client, the bulk order process took 4 hours ⏳ to complete. That’s insane, right? Imagine running a store and waiting that long for bulk orders to process.

But why was processing bulk orders taking so long? Well, the system handles orders for printing on all sorts of objects, and each order involves some image processing.

Well, I decided to fix it. And after a lot of research, trial, and error, I built a multithreaded scheduler with a queueing system that keeps all CPU cores busy and queues orders efficiently when every core is occupied. The result? A 1200% performance boost! πŸš€

Here’s how I did it.

[Diagram: bulk-order processing architecture. A GET to the /bulk-order endpoint sends work to a bulk-order task queue; CSV tasks are split into a CSV sub-task queue and fed to a worker thread pool of n threads (count depending on CPU configuration), where each thread is either idle, working, or finished; completed sub-tasks land in a completed-task queue, and a processing handler POSTs the processed bulk order to the /Save-order endpoint.]

πŸ” The Problem: A Messy, Slow Bulk Order System

When I first examined the existing code, I was met with 100,000+ lines of unoptimized, tangled logic written 10–12 years ago. Refactoring it meant understanding nearly every lineβ€”if not all, then at least a significant portionβ€”which would have taken months.

Instead of a full rewrite, I focused on fixing the core issue.

What I Found:

  • πŸŒ‹ Thread Management Was a Disaster

    • The system had a fixed (hardcoded) number of "threads", though, to be precise, they were actually processesβ€”because, of course, cron was just brute-force launching scripts at regular intervals. Scaling? Yeah, that wasn’t really a thing.
    • Processing was often painfully sequential, turning what should’ve been a sprint into a sluggish, unnecessary waiting game.
  • πŸͺˆ Bulk Orders Were Stuck in a Single Pipeline

    • Each process had a fixed cap on how many products it could processβ€”because, clearly, flexibility is overrated.
    • If the product count was low, the system ran sequentiallyβ€”pretty much the worst possible way to handle speed.
    • And when CPU cores were maxed out, new jobs weren’t queued properly, because efficient queuing? Nah, who needs that?

[Diagram: before vs. after. Before: the thread count is fixed, and each thread handles a fixed number of products. After: the thread count is dynamic according to the CPU, the number of threads to spawn is calculated from the products-per-thread figure, and products are served to all threads through a queue mechanism.]
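The dynamic sizing idea, in miniature: derive the thread count from the product count and a per-thread capacity, capped at the number of CPUs. This is a minimal sketch of that calculation, assuming a simple ceiling-division rule; `productsPerThread` and the capping logic are my illustration, not the exact production code.

```go
package main

import (
	"fmt"
	"runtime"
)

// threadCount returns how many worker threads to spawn: enough to cover all
// products at productsPerThread each, but never more than the CPU count.
func threadCount(products, productsPerThread, cpus int) int {
	if products <= 0 || productsPerThread <= 0 {
		return 0
	}
	n := (products + productsPerThread - 1) / productsPerThread // ceiling division
	if n > cpus {
		n = cpus // don't oversubscribe the CPU
	}
	return n
}

func main() {
	cpus := runtime.NumCPU()
	fmt.Println(threadCount(1000, 50, cpus))
}
```

With a low product count this naturally spawns fewer threads instead of falling back to a single sequential pipeline.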


Clearly, I needed a better approach.

⚑ The Experiment

  • I ran multiple experiments to see what would work best:

    I quickly realized this was all about scheduling and processing. If I could schedule processing in parallel and spawn n (CPU count) processes, each handling a separate product, that might just solve the problem, at least to some extent.

  • ❌ PHP-Based Thread Management

    • Since the codebase was already written in PHP, I tried to manage processes directly in PHP. It was a nightmareβ€”PHP just isn’t built for concurrency, making it an inefficient and problematic solution.
    • Yeah, this was definitely not the way forward.
  • βœ… Golang-Powered Multithreaded Scheduler

    Then, I decided to get a bit daring and offload thread management to Golang.

    • I built a lightweight Golang script that retrieves all products, spawns multiple threads that execute the order-processing PHP script via shell execution, and collects the responsesβ€”leveraging a fan-out, fan-in multithreading pattern.
    • When I ran the script, the results were game-changingβ€”processing time dropped from 4 hours to just 20 minutes! πŸš€
    • πŸ’ͺ The system now scales seamlessly up to a point with server resources (though, of course, PHP limits scalability beyond a certain extent).
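The fan-out, fan-in shape of that Golang script can be sketched as below. In the real script each worker shelled out to the PHP processor (something like `exec.Command("php", ...)`); here `processProduct` just simulates that call so the sketch stays self-contained and runnable. Function and variable names are illustrative, not the actual production code.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processProduct stands in for the shell call that ran the PHP order
// processor; the real script collected each command's output.
func processProduct(id int) string {
	return fmt.Sprintf("product %d processed", id)
}

// fanOutFanIn spawns `workers` goroutines that pull product IDs from a
// shared channel (fan-out) and funnels every result back into a single
// channel (fan-in).
func fanOutFanIn(products []int, workers int) []string {
	jobs := make(chan int)
	results := make(chan string)

	// Fan-out: a pool of workers drains the jobs channel.
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				results <- processProduct(id)
			}
		}()
	}

	// Feed all products into the pool, then signal there is no more work.
	go func() {
		for _, id := range products {
			jobs <- id
		}
		close(jobs)
	}()

	// Fan-in: close the results channel once every worker is done.
	go func() {
		wg.Wait()
		close(results)
	}()

	var out []string
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	products := []int{1, 2, 3, 4, 5}
	out := fanOutFanIn(products, runtime.NumCPU())
	fmt.Println(len(out)) // prints 5: one result per product
}
```

Because workers pull from a shared channel, a slow product never blocks the others, which is exactly where the sequential cron setup was losing time.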

🎯 The Final Plan

Seeing the massive improvement, I built a long-term solution:

  • A Golang microservice with a multithreaded scheduler to handle bulk order processing efficiently while maximizing CPU utilization.
  • A queueing system to ensure no job gets lost, even when all cores are busy.
  • Real-time status updates right on the dashboard.
  • On-demand order processing, both manual and automated, with the ability to cancel bulk processing mid-way if needed.


Multithreaded Scheduler & Queuing System

  • I noticed multiple types of CSV processing happening in the system, all relying on the same cron-based schedulingβ€”where timing was critical.
  • To streamline this, I extracted the scheduler and queuing logic, making it adaptable through an interface that could handle any task.
  • I then built a task factory that implements the task interface, allowing us to easily add new tasks without modifying the core logic.

[Diagram: a generic Queue<Task> and Scheduler<Task> sit behind the typed API request handlers (endpoint β†’ handler β†’ response DTO, with repository DB calls, event-creation calls, and other business logic); the concrete task types A and B each implement their own process, status-update, and DB-update methods.]

πŸ’‘ Why Not Just Refactor the Old Code?

  • Refactoring 100,000+ lines could take months (or even a year), and we still wouldn’t know if the performance would improve enough.
  • Instead, we fixed the core issue first (bulk order speed), and can now refactor the rest piece by piece alongside other improvements.

πŸš€ Wrapping It Up

  • This project proves one thing: big improvements don’t always require massive rewrites.
  • Instead of refactoring blindly, I focused on one key performance bottleneck (thread management), built a multithreaded scheduler with an efficient queue, and delivered a 1200% speed boost in the process.

What's Next?

  • This was just a glimpse of how multithreading and queuing can drive significant performance improvements! If you're interested in exploring the code implementation, check out the worker pool implementation here: GitHub - akash-aman/threadpool. I will also be writing a blog on its implementation soon, so stay tuned πŸŽ‰.