Boosting Data Center Efficiency Without New Hardware
- New software system 'Sandook' increases data center storage throughput by up to 94%
- Intelligently balances workloads across SSDs to eliminate performance bottlenecks and hardware waste
- Achieves 95% of theoretical maximum SSD performance without requiring specialized new hardware
Data centers serve as the backbone of our digital world, yet they often struggle with a fundamental inefficiency: their storage hardware is frequently underutilized. AI is usually framed as a problem of raw computing power, but the solid-state drives (SSDs) that hold the massive datasets required for training AI models often act as a bottleneck. Because these drives vary in age, health, and workload, they do not perform uniformly, and when work is spread evenly across them the slowest device drags down the entire system.
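To make the bottleneck concrete, here is a rough back-of-the-envelope model in Python. It is an illustration rather than anything taken from the Sandook work: the drive speeds are invented, and the two functions (`static_stripe_throughput` and `balanced_throughput`) are hypothetical names for two idealized policies, splitting work evenly versus splitting it in proportion to each drive's speed.

```python
def static_stripe_throughput(drive_mbps):
    """Equal split: every drive gets the same amount of work,
    so the job only finishes when the slowest drive does."""
    return len(drive_mbps) * min(drive_mbps)


def balanced_throughput(drive_mbps):
    """Idealized load-aware split: each drive gets work in
    proportion to its speed, so all drives finish together."""
    return sum(drive_mbps)


# Hypothetical pool: three healthy drives and one worn or busy drive (MB/s).
drives = [3000, 2800, 1500, 2900]

static = static_stripe_throughput(drives)  # 6000
ideal = balanced_throughput(drives)        # 10200
print(f"static: {static} MB/s, balanced: {ideal} MB/s")
```

In this made-up pool, one slow drive costs the evenly split system about 40% of what the hardware could deliver in aggregate. Real drives fluctuate from moment to moment, which is why a fixed split cannot simply be tuned once and left alone.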
MIT researchers have introduced a software-based solution called Sandook, which means 'box' in Urdu. Instead of buying newer, faster hardware to compensate for performance dips, Sandook uses a two-tier management architecture to intelligently redistribute tasks in real time. A global controller monitors the overall state of the storage pool, while local controllers on individual machines make immediate adjustments if a specific drive runs into trouble. One common cause is garbage collection, a background process in which the drive cleans up stale data and which often slows reads and writes while it runs.
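The division of labor between the two tiers can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Sandook's actual code: the class names, the `measured_mbps` and `in_gc` fields, and the policy of temporarily steering all traffic away from a drive in garbage collection are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    name: str
    measured_mbps: float  # recently observed throughput (assumed metric)
    in_gc: bool = False   # True while garbage collection is running


class GlobalController:
    """Slow loop: looks at the whole pool and sets each drive's share."""

    def compute_weights(self, drives):
        total = sum(d.measured_mbps for d in drives)
        return {d.name: d.measured_mbps / total for d in drives}


class LocalController:
    """Fast loop: reacts at once when a drive on this machine stalls."""

    def adjust(self, drives, weights):
        healthy = {d.name: weights[d.name] for d in drives if not d.in_gc}
        if not healthy:
            return dict(weights)  # nowhere to divert; keep the global plan
        scale = sum(healthy.values())
        adjusted = {d.name: 0.0 for d in drives}  # stalled drives get no new work
        for name, weight in healthy.items():
            adjusted[name] = weight / scale  # renormalize so shares sum to 1
        return adjusted


pool = [Drive("ssd0", 3000), Drive("ssd1", 1000, in_gc=True), Drive("ssd2", 2000)]
plan = GlobalController().compute_weights(pool)
live_plan = LocalController().adjust(pool, plan)
print(live_plan)
```

The point of the split is timing. The global view can afford to be slow and thorough, while the local adjustment has to be cheap enough to apply the moment a drive stalls, without waiting for the next global update.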
This approach addresses three distinct sources of performance variability at once: uneven wear and tear on drives, interference between reads and writes, and unpredictable background maintenance processes. By managing these variables dynamically, the system keeps drives running at near-peak capacity. When tested on practical, demanding applications such as training machine learning models and image compression, the researchers found that their software could nearly double performance compared with traditional static methods.
Beyond the technical speed improvements, this research highlights a critical sustainability angle in the age of AI. The constant manufacturing and disposal of high-end hardware carries a massive environmental cost. By squeezing more utility out of existing infrastructure, organizations can significantly extend the lifespan of their equipment. The work shows that computational efficiency can improve not just by building faster hardware, but by writing smarter, more adaptive software to manage the assets we already have.