Monday, February 3, 2025
How I Made Bulk Ordering 1200% Faster with Multithreading!

Updated: February 2025

Case Study
Handling bulk orders is no joke. When I started optimizing the system for a client, the bulk order process took 4 hours to complete. That's insane, right? Imagine running a store and waiting that long for bulk orders to process.
Well, I decided to fix it. After a lot of research, trial, and error, I built a multithreaded scheduler with a queueing system that keeps all CPU cores busy and queues incoming orders efficiently when every core is full. The result? A 1200% performance boost!
Here's how I did it.
The Problem: A Messy, Slow Bulk Order System
When I first examined the existing code, I was met with 100,000+ lines of unoptimized, tangled logic written 10–12 years ago. Refactoring it meant understanding nearly every line (if not all, then at least a significant portion), which would have taken months.
Instead of a full rewrite, I focused on fixing the core issue.
What I Found:
Thread Management Was a Disaster
- The system had a fixed (hardcoded) number of "threads", though to be precise they were actually processes, because cron was just brute-force launching scripts at regular intervals. Scaling? Yeah, that wasn't really a thing.
- Processing was often painfully sequential, turning what should've been a sprint into a sluggish, unnecessary waiting game.
Bulk Orders Were Stuck in a Single Pipeline
- Each process had a fixed cap on how many products it could process, because clearly flexibility is overrated.
- If the product count was low, the system ran sequentially, pretty much the worst possible way to handle speed.
- And when CPU cores were maxed out, new jobs weren't queued properly. Efficient queuing? Nah, who needs that?
Clearly, I needed a better approach.
The Experiment
I ran multiple experiments to see what would work best:
I quickly realized this was all about scheduling and processing. If I could schedule the processing in parallel and spawn n (CPU count) processes, each handling a separate product, that might just solve the problem, at least to some extent.
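To make that first intuition concrete, here is a minimal Go sketch: one worker per CPU core pulling product IDs off a channel. The product list and the processProduct function are placeholders for illustration, not the client's actual code.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processProduct stands in for the real per-product order processing.
func processProduct(id int) {
	fmt.Printf("processed product %d\n", id)
}

func main() {
	products := []int{1, 2, 3, 4, 5, 6, 7, 8}

	jobs := make(chan int)
	var wg sync.WaitGroup

	// Spawn one worker per CPU core so every core stays busy.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				processProduct(id)
			}
		}()
	}

	// Feed products to the workers and wait for the queue to drain.
	for _, id := range products {
		jobs <- id
	}
	close(jobs)
	wg.Wait()
}
```

As long as there are products left in the channel, every core has something to do.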
PHP-Based Thread Management
- Since everything was already written in PHP, I first tried to manage the processes directly in PHP. It was a nightmare: PHP just isn't built for concurrency, which made it an inefficient and problematic solution.
- Yeah, this was definitely not the way forward.
Golang-Powered Multithreaded Scheduler
Then, I decided to get a bit daring and offload thread management to Golang.
- I built a lightweight Golang script that retrieves all products, spawns multiple workers to execute the order processing (the existing PHP script) via shell execution, and collects the responses, leveraging a fan-out, fan-in concurrency pattern (sketched after this list).
- When I ran the script, the results were game-changing: processing time dropped from 4 hours to just 20 minutes!
- The system now scales with server resources up to a point (though, of course, PHP limits scalability beyond a certain extent).
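Here is a rough sketch of that fan-out, fan-in pattern, assuming the existing order logic lives in a PHP script invoked once per product. The script name process_order.php and the product IDs are made up for illustration; the real script and its arguments differ.

```go
package main

import (
	"fmt"
	"os/exec"
	"runtime"
	"sync"
)

// result carries the output (or error) of one PHP invocation back to the collector.
type result struct {
	productID string
	output    []byte
	err       error
}

func main() {
	// In the real system the product list comes from the database; these are placeholders.
	products := []string{"101", "102", "103", "104"}

	jobs := make(chan string)
	results := make(chan result)
	var wg sync.WaitGroup

	// Fan-out: one worker per core, each shelling out to the existing PHP processor.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for id := range jobs {
				out, err := exec.Command("php", "process_order.php", id).CombinedOutput()
				results <- result{productID: id, output: out, err: err}
			}
		}()
	}

	// Close the results channel once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	// Feed the queue.
	go func() {
		for _, id := range products {
			jobs <- id
		}
		close(jobs)
	}()

	// Fan-in: collect every response in the main goroutine.
	for r := range results {
		if r.err != nil {
			fmt.Printf("product %s failed: %v\n", r.productID, r.err)
			continue
		}
		fmt.Printf("product %s: %s", r.productID, r.output)
	}
}
```

The nice part of this shape is that the old PHP code stays untouched; Go only decides how many copies of it run at once and gathers their output.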
The Final Plan
Seeing the massive improvement, I built a long-term solution:
- A Golang microservice with a multithreaded scheduler to handle bulk order processing efficiently while maximizing CPU utilization.
- A queueing system to ensure no job gets lost, even when all cores are busy.
- Real-time status updates right on the dashboard.
- On-demand order processing, both manual and automated, with the ability to cancel bulk processing mid-way if needed (a sketch of the queue and cancellation follows this list).
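Here is a small sketch of how the queue and mid-way cancellation can fit together in Go, using a buffered channel as the queue and a context.Context as the cancel signal. The queue size, timings, and processOrder function are illustrative assumptions, not the production values.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"sync"
	"time"
)

// processOrder is a stand-in for a single bulk-order job.
func processOrder(id int) {
	time.Sleep(100 * time.Millisecond)
	fmt.Printf("order %d done\n", id)
}

func main() {
	// ctx lets the dashboard (or an operator) cancel a bulk run mid-way.
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	// The buffered channel acts as the queue: jobs wait here when all cores
	// are busy instead of being dropped.
	queue := make(chan int, 100)
	var wg sync.WaitGroup

	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				select {
				case <-ctx.Done():
					return // cancellation requested: stop picking up new jobs
				case id, ok := <-queue:
					if !ok {
						return // queue drained
					}
					processOrder(id)
				}
			}
		}()
	}

	for id := 1; id <= 20; id++ {
		queue <- id
	}
	close(queue)

	// Simulate an operator cancelling the run part-way through.
	time.AfterFunc(300*time.Millisecond, cancel)

	wg.Wait()
}
```

Jobs already picked up finish normally; everything still sitting in the queue is simply abandoned once the context is cancelled.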
Multithreaded Scheduler & Queuing System
- I noticed multiple types of CSV processing happening in the system, all relying on the same cron-based scheduling, where timing was critical.
- To streamline this, I extracted the scheduler and queuing logic, making it adaptable through an interface that could handle any task.
- I then built a task factory that produces tasks implementing that interface, allowing us to add new task types without modifying the core scheduling logic (see the sketch below).
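A stripped-down sketch of that task interface, factory, and scheduler might look like the following. The names (Task, NewTask, Schedule) and the two example task types are invented for illustration; they are not the identifiers used in the actual codebase.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Task is the interface every schedulable job implements.
type Task interface {
	Name() string
	Run() error
}

// bulkOrderTask and csvImportTask are two example task types produced by the factory.
type bulkOrderTask struct{ batch int }

func (t bulkOrderTask) Name() string { return fmt.Sprintf("bulk-order-%d", t.batch) }
func (t bulkOrderTask) Run() error   { fmt.Println("processing", t.Name()); return nil }

type csvImportTask struct{ file string }

func (t csvImportTask) Name() string { return "csv-import:" + t.file }
func (t csvImportTask) Run() error   { fmt.Println("importing", t.file); return nil }

// NewTask is a simple factory: new task kinds are added here without
// touching the scheduler below.
func NewTask(kind, arg string) Task {
	switch kind {
	case "csv":
		return csvImportTask{file: arg}
	default:
		return bulkOrderTask{batch: 1}
	}
}

// Schedule runs any mix of tasks on a fixed-size worker pool.
func Schedule(tasks []Task) {
	queue := make(chan Task)
	var wg sync.WaitGroup

	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for t := range queue {
				if err := t.Run(); err != nil {
					fmt.Println(t.Name(), "failed:", err)
				}
			}
		}()
	}

	for _, t := range tasks {
		queue <- t
	}
	close(queue)
	wg.Wait()
}

func main() {
	Schedule([]Task{NewTask("csv", "orders.csv"), NewTask("bulk", "")})
}
```

The point of the factory is that a new CSV or order type only needs a new Task implementation and a case in NewTask; the worker-pool scheduler itself never changes.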
Why Not Just Refactor the Old Code?
- Refactoring 100,000+ lines could take months (or even a year), and we still wouldn't know if the performance would improve enough.
- Instead, we fix the core issue first (bulk order speed), then refactor the rest piece by piece alongside other improvements.
Wrapping It Up
- This project proves one thing: big improvements don't always require massive rewrites.
- Instead of refactoring blindly, I focused on one key performance bottleneck (thread management), built a multithreaded scheduler with an efficient queue, and delivered a 1200% speed boost in the process.
What's Next?
- This was just a glimpse of how multithreading and queuing can drive significant performance improvements! If you're interested in exploring the code, check out the worker pool implementation here: GitHub - akash-aman/threadpool. I will also be writing a blog post on its implementation soon, so stay tuned.