HYDRAstor Creates a Decoupled Storage Grid
The ability to create a single logical pool of storage using off-the-shelf server and storage hardware is one of the key drivers behind the current revolution in data storage that is spawning new grid storage architectures. However, delivering on the promise of grid storage creates new challenges, especially when deduplication is added to the equation.
Scaling a storage grid to manage and deduplicate hundreds of TBs or even PBs of data, while still protecting the integrity of that data and without creating performance bottlenecks, is a different animal from merely delivering grid storage. This past week I caught up with Dr. Christian Toelg, NEC's Director of Business Development, to discuss how NEC architected HYDRAstor to meet these specific challenges.
Dr. Toelg believes that the future of disk-based data protection clearly lies with deduplication. Unfortunately, deduplication does not play well with traditional clustered storage architectures as the amount of data under management continues to grow. Traditional clustered storage systems have a finite number of controllers and a finite storage capacity, and they keep centralized file system information and mapping tables on each node. Together, these factors create hard upper limits on scalability. Once those limits are reached, end users are forced to deploy and manage additional systems that function as isolated data silos. Because no deduplication occurs across those systems, the same data may be stored more than once and management complexity increases.
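To make the silo problem concrete, here is a minimal sketch that compares how many chunks get stored when each system deduplicates only its own data versus when one shared pool deduplicates everything. The chunk names, the `chunks_stored` helper, and the two-silo setup are all hypothetical and are not taken from HYDRAstor or any NEC documentation.

```python
def chunks_stored(systems: list[set[str]]) -> tuple[int, int]:
    """Return (chunks kept with per-system dedup, chunks kept with one global pool)."""
    # Each silo only removes duplicates within itself, so shared chunks are kept once per silo.
    per_silo = sum(len(s) for s in systems)
    # A single pool keeps exactly one copy of every distinct chunk.
    global_pool = len(set().union(*systems))
    return per_silo, global_pool


# Hypothetical chunk identifiers; c2 and c3 exist on both systems.
silo_a = {"c1", "c2", "c3"}
silo_b = {"c2", "c3", "c4"}

per_silo, global_pool = chunks_stored([silo_a, silo_b])
print(per_silo, global_pool)  # prints "6 4"
```

In this toy case the two silos together hold six chunks where a single pool would hold four, and the gap widens as more systems with overlapping data are added.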
HYDRAstor's grid storage architecture, by contrast, can start small and then scale performance, capacity, or both, linearly and independently. HYDRAstor decouples performance nodes (called Accelerator Nodes) from capacity nodes (called Storage Nodes), creating two separate layers that act independently of one another and each possess their own intelligence.
In traditional clustered storage architectures, the storage controllers hold all of the information about the data while the attached disks remain dumb. HYDRAstor changes this paradigm by putting intelligence in both layers and making the Storage Nodes "self-aware". These Storage Nodes behave as one large pool of capacity that is fully self-managed: it balances capacity and performance across all nodes and rebalances the system when new capacity is added or when it needs to recover from failures.
Creating a separate storage layer introduces new data protection and disaster recovery options. Dr. Toelg says that NEC made a specific decision early in HYDRAstor's design not to use RAID, since losing any chunk of deduplicated data can affect tens, hundreds, or thousands of files and make all of them unrecoverable. Because the storage layer possesses its own intelligence, HYDRAstor can create higher levels of data protection and redundancy than RAID can deliver with comparable storage overhead. Deduplicated data can also be shared between storage pools in different geographical locations, using policies to replicate data, so companies can automate their disaster recovery configuration.
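The point about one lost chunk affecting many files can be illustrated with a small sketch. It assumes fixed-size chunking and SHA-256 fingerprints purely for illustration; the article does not say how HYDRAstor chunks or fingerprints data, and the file names and helper functions here are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small, chosen only to keep the example readable


def chunk_ids(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-256 digest of each chunk."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def build_index(files: dict[str, bytes]) -> dict[str, set[str]]:
    """Map every unique chunk digest to the set of files that reference it."""
    index: dict[str, set[str]] = {}
    for name, data in files.items():
        for digest in chunk_ids(data):
            index.setdefault(digest, set()).add(name)
    return index


def files_hit_by_loss(index: dict[str, set[str]], lost_digest: str) -> set[str]:
    """Return the files that can no longer be rebuilt if this one chunk is lost."""
    return index.get(lost_digest, set())


# Three hypothetical files that share the same 4-byte header chunk.
files = {
    "a.txt": b"HEADaaaa",
    "b.txt": b"HEADbbbb",
    "c.txt": b"HEADcccc",
}

index = build_index(files)
shared = hashlib.sha256(b"HEAD").hexdigest()
unique = hashlib.sha256(b"aaaa").hexdigest()

print(len(index))                                # 4 unique chunks stored instead of 6
print(sorted(files_hit_by_loss(index, shared)))  # ['a.txt', 'b.txt', 'c.txt']
print(sorted(files_hit_by_loss(index, unique)))  # ['a.txt']
```

The shared chunk is stored once yet referenced by every file, so its loss damages all three. That is why the redundancy applied to each stored chunk matters more in a deduplicated system than in one that keeps full copies.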
By creating two distinct layers in its grid storage architecture, HYDRAstor addresses key concerns about data protection and scalability that deduplication typically introduces. HYDRAstor eliminates the capacity, cost, and performance trade-offs that companies typically need to make, while laying the foundation for companies to bring a disaster recovery solution into their environment.