Monday, February 3, 2025
How I Made Bulk Ordering 1200% Faster with Multithreading!

Updated: February 2025

Case Study
Handling bulk orders is no joke. When I started optimizing the system for a client, the bulk order process took 4 hours to complete. That's insane, right? Imagine running a store and waiting that long for bulk orders to process.
Well, I decided to fix it. After a lot of research, trial, and error, I built a multithreaded scheduler with a queueing system that keeps all CPU cores busy and queues incoming orders efficiently when every core is full. The result? A 1200% performance boost!
Here's how I did it.
The Problem: A Messy, Slow Bulk Order System
When I first examined the existing code, I was met with 100,000+ lines of unoptimized, tangled logic written 10–12 years ago. Refactoring it meant understanding nearly every line (if not all, then at least a significant portion), which would have taken months.
Instead of a full rewrite, I focused on fixing the core issue.
What I Found:
Thread Management Was a Disaster
- The system had a fixed (hardcoded) number of "threads", though to be precise they were actually processes, because cron was just brute-force launching scripts at regular intervals. Scaling? Yeah, that wasn't really a thing.
- Processing was often painfully sequential, turning what should've been a sprint into a sluggish, unnecessary waiting game.
Bulk Orders Were Stuck in a Single Pipeline
- Each process had a fixed cap on how many products it could process, because clearly flexibility is overrated.
- If the product count was low, the system ran sequentially, pretty much the worst possible way to handle speed.
- And when CPU cores were maxed out, new jobs weren't queued properly. Efficient queuing? Nah, who needs that?
Clearly, I needed a better approach.
The Experiment
I ran multiple experiments to see what would work best:
I quickly realized this was all about scheduling and processing. If I could schedule the processing in parallel and spawn n (CPU count) processes, each handling a separate product, that might just solve the problem, at least to some extent.
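To make that first intuition concrete, here is a minimal Go sketch: one worker per CPU core pulling product IDs off a channel. The product list and the processProduct function are placeholders for illustration, not the client's actual code.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processProduct stands in for the real per-product order processing.
func processProduct(id int) {
	fmt.Printf("processed product %d\n", id)
}

func main() {
	products := []int{1, 2, 3, 4, 5, 6, 7, 8}

	jobs := make(chan int)
	var wg sync.WaitGroup

	// Spawn one worker per CPU core so every core stays busy.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				processProduct(id)
			}
		}()
	}

	// Feed products to the workers and wait for the queue to drain.
	for _, id := range products {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
}
```

As long as there are products left in the channel, every core has something to do.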
PHP-Based Thread Management
- Since everything was already written in PHP, I first tried to manage the processes directly in PHP. It was a nightmare: PHP just isn't built for concurrency, which made it an inefficient and problematic solution.
- Yeah, this was definitely not the way forward.
Golang-Powered Multithreaded Scheduler
Then, I decided to get a bit daring and offload thread management to Golang.
- I built a lightweight Golang script that retrieves all products, spawns multiple workers to execute the order processing (the existing PHP script) via shell execution, and collects the responses, leveraging a fan-out, fan-in concurrency pattern (sketched after this list).
- When I ran the script, the results were game-changing: processing time dropped from 4 hours to just 20 minutes!
- The system now scales with server resources up to a point (though, of course, PHP limits scalability beyond a certain extent).
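Here is a rough sketch of that fan-out, fan-in pattern, assuming the existing order logic lives in a PHP script invoked once per product. The script name process_order.php and the product IDs are made up for illustration; the real script and its arguments differ.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"sync"
)

// result carries the output (or error) of one PHP invocation back to the collector.
type result struct {
	productID string
	output    []byte
	err       error
}

func main() {
	// In the real system the product list comes from the database; these are placeholders.
	products := []string{"101", "102", "103", "104"}

	jobs := make(chan string)
	results := make(chan result)
	var wg sync.WaitGroup

	// Fan-out: one worker per core, each shelling out to the existing PHP processor.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				out, err := exec.Command("php", "process_order.php", id).CombinedOutput()
				results <- result{productID: id, output: out, err: err}
			}
		}()
	}

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Feed the queue.
	go func() {
		for _, id := range products {
			jobs <- id
		}
		close(jobs)
	}()

	// Fan-in: collect every response in the main goroutine.
	for r := range results {
		if r.err != nil {
			fmt.Printf("product %s failed: %v\n", r.productID, r.err)
			continue
		}
		fmt.Printf("product %s: %s", r.productID, r.output)
	}
}
```

The nice part of this shape is that the old PHP code stays untouched; Go only decides how many copies of it run at once and gathers their output.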
The Final Plan
Seeing the massive improvement, I built a long-term solution:
- A Golang microservice with a multithreaded scheduler to handle bulk order processing efficiently while maximizing CPU utilization.
- A queueing system to ensure no job gets lost, even when all cores are busy.
- Real-time status updates right on the dashboard.
- On-demand order processing, both manual and automated, with the ability to cancel bulk processing mid-way if needed (a sketch of the queue and cancellation follows this list).
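Here is a small sketch of how the queue and mid-way cancellation can fit together in Go, using a buffered channel as the queue and a context.Context as the cancel signal. The queue size, timings, and processOrder function are illustrative assumptions, not the production values.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"time"
)

// processOrder is a stand-in for a single bulk-order job.
func processOrder(id int) {
	time.Sleep(100 * time.Millisecond)
	fmt.Printf("order %d done\n", id)
}

func main() {
	// ctx lets the dashboard (or an operator) cancel a bulk run mid-way.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// The buffered channel acts as the queue: jobs wait here when all cores
	// are busy instead of being dropped.
	queue := make(chan int, 100)
	var wg sync.WaitGroup

	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // cancellation requested: stop picking up new jobs
				case id, ok := <-queue:
					if !ok {
						return // queue drained
					}
					processOrder(id)
				}
			}
		}()
	}

	for id := 1; id <= 20; id++ {
		queue <- id
	}
	close(queue)

	// Simulate an operator cancelling the run part-way through.
	time.AfterFunc(300*time.Millisecond, cancel)

	wg.Wait()
}
```

Jobs already picked up finish normally; everything still sitting in the queue is simply abandoned once the context is cancelled.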
Multithreaded Scheduler & Queuing System
- I noticed multiple types of CSV processing happening in the system, all relying on the same cron-based scheduling, where timing was critical.
- To streamline this, I extracted the scheduler and queuing logic, making it adaptable through an interface that could handle any task.
- I then built a task factory that produces tasks implementing that interface, allowing us to add new task types without modifying the core scheduling logic (see the sketch below).
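A stripped-down sketch of that task interface, factory, and scheduler might look like the following. The names (Task, NewTask, Schedule) and the two example task types are invented for illustration; they are not the identifiers used in the actual codebase.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Task is the interface every schedulable job implements.
type Task interface {
	Name() string
	Run() error
}

// bulkOrderTask and csvImportTask are two example task types produced by the factory.
type bulkOrderTask struct{ batch int }

func (t bulkOrderTask) Name() string { return fmt.Sprintf("bulk-order-%d", t.batch) }
func (t bulkOrderTask) Run() error   { fmt.Println("processing", t.Name()); return nil }

type csvImportTask struct{ file string }

func (t csvImportTask) Name() string { return "csv-import:" + t.file }
func (t csvImportTask) Run() error   { fmt.Println("importing", t.file); return nil }

// NewTask is a simple factory: new task kinds are added here without
// touching the scheduler below.
func NewTask(kind, arg string) Task {
	switch kind {
	case "csv":
		return csvImportTask{file: arg}
	default:
		return bulkOrderTask{batch: 1}
	}
}

// Schedule runs any mix of tasks on a fixed-size worker pool.
func Schedule(tasks []Task) {
	queue := make(chan Task)
	var wg sync.WaitGroup

	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				if err := t.Run(); err != nil {
					fmt.Println(t.Name(), "failed:", err)
				}
			}
		}()
	}

	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait()
}

func main() {
	Schedule([]Task{NewTask("csv", "orders.csv"), NewTask("bulk", "")})
}
```

The point of the factory is that a new CSV or order type only needs a new Task implementation and a case in NewTask; the worker-pool scheduler itself never changes.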
Why Not Just Refactor the Old Code?
- Refactoring 100,000+ lines could take months (or even a year), and we still wouldn't know if the performance would improve enough.
- Instead, we fix the core issue first (bulk order speed), then refactor the rest piece by piece alongside other improvements.
Wrapping It Up
- This project proves one thing: big improvements don't always require massive rewrites.
- Instead of refactoring blindly, I focused on one key performance bottleneck (thread management), built a multithreaded scheduler with an efficient queue, and delivered a 1200% speed boost in the process.
What's Next?
- This was just a glimpse of how multithreading and queuing can drive significant performance improvements! If you're interested in exploring the code, check out the worker pool implementation here: GitHub - akash-aman/threadpool. I will also be writing a blog post on its implementation soon, so stay tuned.