<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>NEC HYDRAstor</title>
        <link>http://necam.dciginc.com/</link>
        <description>HYDRAstor is a grid storage platform that addresses today&apos;s storage challenges through its &quot;community of smart nodes.&quot; Comprised of self-aware, self-healing industry-standard servers with no single point of failure and no central resource bottleneck, HYDRAstor greatly enhances the flexibility of the storage environment while reducing infrastructure complexity and management overhead.</description>
        <language>en</language>
        <copyright>Copyright 2008</copyright>
        <lastBuildDate>Fri, 27 Jun 2008 07:30:00 -0600</lastBuildDate>
        <generator>http://www.sixapart.com/movabletype/</generator>
        <docs>http://www.rssboard.org/rss-specification</docs>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>The Juxtaposition of Deduplication and Replication Needed for Global Data Protection</title>
            <description><![CDATA[<p>The juxtaposition of deduplication and replication in disk-based backup appliances is a powerful combination that companies can use to protect backed-up data across data centers as well as data backed up at remote and branch offices (ROBOs). Yet where deduplication ends and replication starts can get a little confusing in grid storage architectures with global deduplication capabilities, such as the one used by the <a href="http://www.necam.com/">NEC</a> <a href="http://www.necam.com/Storage/GridStorage.cfm">HYDRAstor</a>. </p>
<p>To understand how this works, let's first take a look at this from the perspective of a company that has data centers and ROBOs in different geographic locations. In this scenario, the company may want the flexibility to leverage its primary data center to recover the data backed up at any of its secondary data centers or ROBOs. </p>
<p>To deliver on this ideal, the company would first need to install a HYDRAstor grid at each of its data centers or ROBOs that acts independently of the HYDRAstors at the other locations. Each local HYDRAstor would then deduplicate all data at that site to address that site's need to shorten backup windows and provide fast recoveries of data at that site.</p>
<p>
<span class="mt-enclosure mt-enclosure-image" style="DISPLAY: inline"><img class="mt-image-center" style="DISPLAY: block; MARGIN: 0px auto 20px; TEXT-ALIGN: center" height="304" alt="RepliGrid2.JPG" src="http://necam.dciginc.com/RepliGrid2.JPG" width="522" /></span></p>
<p>HYDRAstor's optional <a href="http://www.necam.com/Storage/RepliGrid.cfm">RepliGrid</a> feature then enables the company to protect data from any of its secondary data centers and ROBOs at the primary site. Using this feature, each HYDRAstor at a remote site would replicate its deduplicated data asynchronously back to the HYDRAstor at the main data center on a regularly scheduled basis. The data replication process occurs as follows:</p>
<ul>
<li>The HYDRAstor at the remote location provides a list of hash keys for its changed or new data to the HYDRAstor at the primary site. </li>
<li>The HYDRAstor at the primary site receives the list of hash keys and removes from it the keys it already has.</li>
<li>The reduced list of hash keys, which details what deduplicated data is needed at the primary site, is transmitted back to the HYDRAstor at the remote site.</li>
<li>The HYDRAstor at the remote site sends the deduplicated data associated with the hash keys in the reduced list back to the HYDRAstor at the main site</li></ul>
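<p>The four steps above can be sketched in a few lines of Python. This is only an illustration of the exchange as described here, not NEC's implementation: the <code>Grid</code> class, the function names and the use of SHA-256 as the hash key are all assumptions made for the example.</p>

```python
import hashlib


def chunk_key(chunk: bytes) -> str:
    """Hash key for a chunk. SHA-256 is an assumption, not HYDRAstor's documented hash."""
    return hashlib.sha256(chunk).hexdigest()


class Grid:
    """Hypothetical stand-in for a HYDRAstor grid: a map from hash key to chunk."""

    def __init__(self):
        self.store = {}

    def write(self, chunk: bytes) -> str:
        key = chunk_key(chunk)
        self.store.setdefault(key, chunk)  # local deduplication: keep one copy per key
        return key

    def missing_keys(self, offered):
        """Step 2: drop the keys this grid already holds."""
        return [key for key in offered if key not in self.store]


def replicate(remote: Grid, primary: Grid, changed_keys):
    # Step 1: the remote grid offers the hash keys of its new or changed data.
    # Steps 2 and 3: the primary grid answers with the reduced list.
    needed = primary.missing_keys(changed_keys)
    # Step 4: the remote grid sends only the chunks behind the reduced list.
    sent_bytes = 0
    for key in needed:
        chunk = remote.store[key]
        primary.store[key] = chunk
        sent_bytes += len(chunk)
    return needed, sent_bytes


primary, remote = Grid(), Grid()
primary.write(b"os-image-block")  # already at the main site
keys = [remote.write(b"os-image-block"), remote.write(b"branch-report")]
needed, sent_bytes = replicate(remote, primary, keys)
print(len(needed), sent_bytes)
```

<p>In this toy run the remote site offers two keys but only one chunk crosses the wire, because the primary site already holds the other.</p>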
<p>The distinct business benefits that the HYDRAstor RepliGrid feature offers are two-fold:</p>
<ul>
<li><strong>Reduces Storage Costs.</strong> The amount of deduplicated data stored on the HYDRAstor at the main data center is minimized. By performing global deduplication across all local data and remote data, only net new unique deduplicated data from each remote site is stored on the main HYDRAstor.</li>
<li><strong>Reduces Network Bandwidth Costs. </strong>Since the main HYDRAstor aggregates data across all of the sites, there is a good possibility that the deduplicated data already exists at the main site. Transmitting only net new unique deduplicated data minimizes the amount of data sent and hence the size of the network pipes needed to send it.</li></ul>
<p>However, offering deduplication and replication is only part of the enterprise data protection picture. As companies look to deduplicate data across their entire enterprise and replicate between multiple sites, spikes in performance and capacity demands may require them to scale these components of the HYDRAstor architecture easily and cost-effectively. I'll examine how the HYDRAstor accomplishes those tasks in a forthcoming blog entry. </p>]]></description>
            <link>http://necam.dciginc.com/2008/06/the-juxtaposition-of-deduplica.html</link>
            <guid>http://necam.dciginc.com/2008/06/the-juxtaposition-of-deduplica.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Replication</category>
            
            <pubDate>Fri, 27 Jun 2008 07:30:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Disk-Based Backup Brings Replication Into End User Data Protection Conversations</title>
            <description><![CDATA[<p>At recent storage conferences (<a href="http://storagedecisions.techtarget.com/">Storage Decisions</a>, <a href="http://www.snwusa.com/">Storage Networking World</a>, etc.) replication has emerged as a hot topic of discussion among end-users. In talking with these users and listening in on a number of end-user panel discussions, I heard them attribute their increased interest in using replication as part of their company's overall disk-based data protection strategy to a number of factors:</p>
<ul>
<li><strong>The need to move data offsite after their data is backed up to disk. </strong>Should some type of disaster strike their production site (floods, hurricanes, tornados, power outages, etc.) they need to have some means to recover their data offsite. This generally means using either removable media (tape, optical, etc.) or replication to move it to another location.</li>
<li><strong>Companies want to use removable media as a last resort to recover data as they view it as problematic to manage and recover from.</strong> If companies do store data on removable media, they prefer to view it as data's final resting place and not their primary or even secondary means of performing a data recovery.</li>
<li><a href="http://www.broadbandinfo.com/internet-access/dsl/t1-t3-compare.html"><strong>Broadband network connections</strong></a><strong> remain affordable. </strong>The capacity and cost of a broadband connection are determined by the amount of data that a location needs to transmit. Remote and branch offices can obtain DSL (digital subscriber line) connections that provide speeds of up to 512 Kbps for under $100/month, while T1 lines (1.5 Mbps) run around $1,000/month and T3 lines (45 Mbps) about $10,000/month.</li>
<li><strong>Using deduplication makes replication more affordable.</strong> Rather than replicating all of the data created by each backup, companies that use deduplication need only transmit net new blocks of data. This reduces the total amount of data that companies need to send, which translates into smaller, lower-cost network pipes for replication.</li></ul>
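<p>To put the line speeds above in perspective, here is a rough calculation of how long a backup would take to replicate over each link, with and without deduplication. The 50 GB backup size and the 20:1 reduction are hypothetical figures chosen for illustration, and the sketch ignores protocol overhead and assumes the link is fully available.</p>

```python
# Line rates quoted above, in bits per second.
LINKS_BPS = {"DSL": 512_000, "T1": 1_500_000, "T3": 45_000_000}


def transfer_hours(backup_gb: float, link_bps: float, dedup_ratio: float = 1.0) -> float:
    """Hours needed to send a backup once it has been reduced by dedup_ratio."""
    bits_to_send = backup_gb * 8e9 / dedup_ratio  # decimal gigabytes to bits
    return bits_to_send / link_bps / 3600


# Hypothetical workload: a 50 GB backup, sent raw and at an assumed 20:1 reduction.
for name, bps in LINKS_BPS.items():
    raw = transfer_hours(50, bps)
    reduced = transfer_hours(50, bps, dedup_ratio=20)
    print(f"{name}: {raw:.1f} h raw, {reduced:.1f} h at 20:1")
```

<p>Under these assumptions a T1 line needs about 74 hours to move the raw backup but under 4 hours to move the deduplicated data, which is the difference between needing a bigger pipe and keeping the one already in place.</p>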
<p>Of course, implementing replication in large enterprises as part of their overall disk-based data protection plan becomes more complicated. While large enterprises are always looking for solutions that are easy to deploy and implement into their environments, these are only some of their considerations when selecting a solution. Large enterprises may have offices and data centers that span the globe, multiple backup software products, different data retention policies and numerous network links of varying capacities. As a result, they will want a solution that they can install that matches the size and cost constraints of these different enterprise environments.</p>
<p>Enterprises also need to identify products that provide architectures that meet current and future needs. Features such as global deduplication and more granular control over replication, such as what data is replicated and when it is replicated, are almost prerequisites in enterprise shops. These companies also need to think outside the box to take into account new architectures such as grid storage that provide new enterprise data recovery options. For instance, using grid storage, companies can start to think about creating storage grids that replicate data across geographic distances such that companies can recover data from any of the sites.</p>
<p>End-users now readily recognize that replication is becoming as integral to their corporate data protection plan as data protection software and hard disk storage systems. However, enterprises need to think more broadly about how they implement replication as part of their disk-based data protection strategy.</p>
<p>Deduplication and replication alone are not enough, as large enterprises require disk-based backup solutions that scale in a manner that is very different from what is generally available on the market. These new solutions require new architectures such as grid storage that scale in such a way that they remain cost-effective in the short and long term. The grid storage architecture that makes up <a href="http://www.necam.com/">NEC</a> <a href="http://www.necam.com/Storage/GridStorage.cfm">HYDRAstor</a> provides the type of scalability and cost effectiveness to complement the deduplication and replication features that large enterprise companies are seeking. In an upcoming blog entry, I'll take a closer look at how the HYDRAstor delivers on these features.</p>]]></description>
            <link>http://necam.dciginc.com/2008/06/diskbased-backup-brings-replic.html</link>
            <guid>http://necam.dciginc.com/2008/06/diskbased-backup-brings-replic.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Data Protection</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Replication</category>
            
            <pubDate>Fri, 13 Jun 2008 13:40:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Anderson Center for Autism Drives Storage Costs Down to 70¢/GB Using the NEC HYDRAstor</title>
            <description><![CDATA[<p>One of DCIG's objectives in blogging is to document over time how companies are using different vendors' products, the ways in which they are using the product, successes they are having and specific challenges that they are beginning to face. Greg Paulk, the IT Director for the <a href="http://www.andersonschool.org/">Anderson Center for Autism</a>, represents the first individual that DCIG has had the opportunity to do this with.</p>
<p>I first met Mr. Paulk at the Fall 2007 <a href="http://storagedecisions.techtarget.com/">Storage Decisions</a> conference in New York City and interviewed him shortly thereafter for a <a href="http://necam.dciginc.com/2007/10/one-of-those-days.html">blog entry</a> that appeared back in October 2007. Six months have passed since that interview, and because Paulk was still using a beta version of the NEC HYDRAstor software when we last spoke, I followed up with him to get an update on how his installation of the <a href="http://www.necam.com/Storage/GridStorage.cfm">NEC HYDRAstor</a> was performing.</p>
<p>Paulk revealed that he is now in full production with the production code loaded on the NEC HYDRAstor. However, because of the high deduplication ratio that he is achieving with the HYDRAstor, he is still using the same hardware configuration (two Accelerator Nodes and four Storage Nodes) that he started out with. </p>
<p>Last fall he was achieving a 17:1 deduplication ratio and hoped to eventually achieve a 35:1 ratio. Six months later, his deduplication ratio is now approximately 39:1, which has mitigated his need to buy additional capacity and has driven his cost/GB down to approximately 70¢/GB. "It's like getting 390 TB for the price of 10 TBs," says Paulk.</p>
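<p>Paulk's comparison is straightforward arithmetic, sketched below. The function names are mine, and the relative cost figure assumes the hardware spend stays the same while the deduplication ratio improves, so that cost per logical GB scales inversely with the ratio.</p>

```python
def effective_tb(physical_tb: float, dedup_ratio: float) -> float:
    """Logical data that fits on the given physical capacity at a given ratio."""
    return physical_tb * dedup_ratio


def relative_cost_per_gb(old_ratio: float, new_ratio: float) -> float:
    """Cost per logical GB at new_ratio as a fraction of the cost at old_ratio,
    assuming the hardware spend is unchanged."""
    return old_ratio / new_ratio


print(effective_tb(10, 39))                   # 10 TB of disk at 39:1
print(round(relative_cost_per_gb(17, 39), 2))  # moving from 17:1 to 39:1
```

<p>At 39:1, 10 TB of disk holds 390 TB of backup data, and on the same hardware each logical GB costs roughly 44% of what it did at 17:1.</p>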
<p>He also has not found it necessary to add more Accelerator Nodes into his HYDRAstor configuration. Though he has nearly doubled the number of servers he is backing up on a nightly basis (from 13 to 21 servers), he is achieving about 3 Gbps of throughput across his two Accelerator Nodes.</p>
<p>I then asked him, "What are the biggest benefits that you have experienced since you started using the HYDRAstor?" There were four benefits he cited:</p>
<ul>
<li>First, it worked as advertised. The installation was easy (it took 68 minutes), and it has done everything he has needed it to do.</li>
<li>Second, it requires very little management overhead. He has one individual assigned to manage the HYDRAstor and, since it functions as one logical configuration, it takes very little time to manage.</li>
<li>Third, no backups have failed since he introduced the HYDRAstor, and it works 90% faster than when he was using tape.</li>
<li>Fourth, he has found it has reduced his stress level. Aside from alleviating his backup concerns, the HYDRAstor provides him a solid foundation that he can use to build for the future. He no longer has the traditional worries of how he will manage, upgrade or migrate data to new storage systems, because the HYDRAstor accounts for all of these concerns with its grid storage architecture.</li></ul>
<p>In the next few months, Paulk plans to archive about one million documents to the HYDRAstor, which will consume about another 1.4 TB of storage. These documents are currently on paper and need to be scanned, so he is curious to see how they will affect the level of deduplication that he is achieving with his backup data once they are stored on the HYDRAstor. My plan is to catch up with Mr. Paulk again this fall so DCIG can share more of his story and experiences at that time.</p>]]></description>
            <link>http://necam.dciginc.com/2008/05/anderson-center-for-autism-dri.html</link>
            <guid>http://necam.dciginc.com/2008/05/anderson-center-for-autism-dri.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Archiving</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Records Management</category>
            
            <pubDate>Thu, 08 May 2008 12:44:52 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>NEC HYDRAstor Restores Focus on Optimized Data Management</title>
            <description><![CDATA[<p>NEC's Vice President of Advanced Storage Products, Karen Dutch, recently brought out some salient points about storage management in her Spring 2008 SNW presentation, "<a href="http://www.snworlando.com/agendaS08/mon440c.html">Defining Storage Solutions in the Data Center 2.0</a>". Specifically, she described the features that new storage architectures should deliver in order to keep storage management manageable as storage growth in organizations continues. Of course, the not-so-subtle message is that NEC's <a href="http://www.hydrastor.com/">HYDRAstor</a> delivers on these new features. Here's how I see the HYDRAstor doing so.</p>
<ul>
<li><strong><em>Self managing.</em></strong> NEC's HYDRAstor architecture supports self-management through the dynamic addition of nodes (servers) that offer either more capacity or performance. As more <a href="http://www.necam.com/storage/HYDRAFAQ.cfm#2">Accelerator or Storage Nodes</a> are added to the HYDRAstor grid architecture, it non-disruptively redistributes data across old and new nodes to optimize performance and maximize data resiliency. This eliminates the normal processes of provisioning, sizing and data migrations that administrators have to perform, while alleviating the management overhead and costs associated with archive and backup processes. </li>
<li><strong><em>Data mobility. </em></strong>Data mobility comes more prominently into play when new Storage Nodes are added into the HYDRAstor grid storage architecture as well as during technology refreshes of Storage Nodes. As new Storage Nodes are added, the HYDRAstor re-balances data across existing and new Storage Nodes to simplify data management. As existing nodes are retired, data from an existing node is automatically migrated to a new node and, if companies have multiple sites, HYDRAstor supports the movement of data to alternative sites to create an enhanced level of data resiliency.</li>
<li><strong><em>Non-disruptive evolution. </em></strong>The HYDRAstor grid storage architecture addresses one of the most problematic aspects of storage management today: technology refreshes. As current storage systems age, usually the only option companies have is to purchase an entirely new storage system and then use either host or network based data migration tools to move to new storage controller architectures. Since HYDRAstor is based on a grid architecture, it can transparently evolve to newer technology simply through the introduction of new Accelerator or Storage Nodes based on the latest and greatest hardware technology. The HYDRAstor adds these new nodes into the grid while older nodes are marked for decommissioning and non-disruptively taken out of service.</li>
<li><strong><em>Scalability without trade-offs</em></strong>. A key problem with current storage system architectures is that you generally have to pick between performance, capacity and cost when scaling the architecture. Since HYDRAstor's Accelerator and Storage Nodes are based on industry-standard, off-the-shelf hardware, the typical hardware costs associated with proprietary storage hardware architectures are avoided. Since HYDRAstor also decouples performance (Accelerator Nodes) and capacity (Storage Nodes), users can scale one or both to accommodate whichever direction their storage environment grows.</li>
<li><strong><em>Enhanced, flexible resiliency.</em></strong> The HYDRAstor accounts for the growing possibility that today's RAID data protection architectures are insufficient when deduplication is used across 100s or 1000s of TBs of capacity. Administrators can define the level of data redundancy that is appropriate to their site and the HYDRAstor will dynamically distribute the data across the nodes to deliver the desired level of resiliency.</li>
<li><strong><em>Integrated data management services.</em></strong> The HYDRAstor comes with two sets of integrated data management services. The base level of services includes the automated management of the data, more efficient storage (deduplication) and enhanced data resiliency. Advanced services like replication, WORM, security, classification and search are other features that users can optionally license from NEC.</li>
<li><strong><em>Industry standard interface support.</em></strong> The HYDRAstor presents an industry-standard NFS and/or CIFS interface, so any Linux, Windows or UNIX server can archive data to it or, alternatively, any backup software can treat it as a disk cache and store data on it. </li></ul>
<p>The most compelling benefit of the NEC HYDRAstor grid storage architecture is that companies that adopt this architecture can start to take their focus off managing the storage infrastructure and focus more squarely on the data they are entrusted with protecting and managing. Today's businesses live and die by how well they manage their data, and the management of data is still tied too closely to how well the hardware is managed. By meeting these new storage solution features of the Data Center 2.0, NEC HYDRAstor restores the focus of administrators back to optimal data management and protection. </p>]]></description>
            <link>http://necam.dciginc.com/2008/04/nec-hydrastor-restores-focus-o.html</link>
            <guid>http://necam.dciginc.com/2008/04/nec-hydrastor-restores-focus-o.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Wed, 30 Apr 2008 06:00:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Scaling to a Zettabyte; NEC&apos;s Karen Dutch Defines Storage Solutions in the Data Center 2.0</title>
            <description><![CDATA[<p>A <a href="http://www.newswire.ca/en/releases/archive/March2007/06/c7336.html">well-known study</a> released by IDC in 2007 forecast that by 2010 the amount of information created and copied in the global digital universe will climb to nearly 1 zettabyte (that's 1 million petabytes). That number was based on the assumption that there were approximately 160 exabytes of information in existence in 2006 and that global data will continue to grow at a year-over-year rate of 57%. Assuming that forecast holds true, this puts the total global store of information at roughly 400 exabytes by the end of this year.</p>
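<p>The forecast is compound growth applied to the 2006 baseline, and it is easy to check. The short sketch below uses only the two figures cited above (160 exabytes in 2006 and 57% annual growth); the function name is mine.</p>

```python
def projected_exabytes(base_eb: float, annual_growth: float, years: int) -> float:
    """Compound growth: base * (1 + rate) ** years."""
    return base_eb * (1 + annual_growth) ** years


# 160 EB in 2006, growing 57% year over year.
for year in range(2006, 2011):
    print(year, round(projected_exabytes(160, 0.57, year - 2006)), "EB")
```

<p>The 2008 value works out to about 394 exabytes and the 2010 value to about 972 exabytes, or just under 1 zettabyte.</p>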
<p>Compelling statistics, no doubt, and while that report translates into a lot of good news and healthy revenues for storage vendors, storage managers grappling with this growth in information are probably less than enthused by this report. As most storage managers know, storage growth no longer automatically equates to more personnel to help in the task of managing this data and the underlying storage systems. </p>
<p>It wasn't that long ago that most enterprises followed a general rule of thumb for how many TBs of storage a storage administrator could manage. Whether that ratio was 1:1, 10:1 or 20:1, as more TBs of storage were added, storage managers had some assurance that they would receive more storage administrators or more robust storage management tools to help them manage this additional data.</p>
<p>This traditional assumption of "more storage=more people" is no longer realistic. Storage systems with inexpensive, high capacity SATA disk drives are now routinely used in the storage of archived and backup data. While these forms of data typically do not require the same level of day-to-day oversight and management that&nbsp;production data requires, managing the data on these storage systems takes on a different nature. </p>
<p>Responsibilities such as managing capacity and performance growth, replicating data offsite for DR, copying or backing up data to removable media such as optical or tape and migrating data during technology refreshes can grow in magnitude as the amount of storage grows. Toss in the time required to stay up-to-date and best manage new technologies like deduplication and it becomes obvious that bringing in these new storage systems is not necessarily a plug-n-play proposition.</p>
<p>However, there is an added twist to the ongoing management of these storage systems. They are viewed as operational expenses and not critical to daily business operations, so no new staff is added for their ongoing management. As a result, the responsibility of managing these storage systems typically falls to existing individuals who try to manage these systems as they have time. </p>
<p>These problems are creating the need for a new type of storage system architecture that is specifically designed to address these needs. NEC's Vice President of Advanced Storage Products, Karen Dutch, addressed some of these concerns in her recent Spring 2008 SNW presentation, "<a href="http://www.snworlando.com/agendaS08/mon440c.html">Defining Storage Solutions in the Data Center 2.0</a>". In this presentation, she provided a number of pointers as to what features these new storage architectures should deliver to keep storage management in this environment manageable including:</p>
<ul>
<li>Self managing</li>
<li>Data mobility</li>
<li>Non-disruptive evolution</li>
<li>Scalability without trade-offs</li>
<li>Enhanced, flexible resiliency</li>
<li>Integrated data management services</li>
<li>Industry standard interface support</li></ul>
<p>In the next blog entry, I'll take a closer look at these points and how NEC's <a href="http://www.necam.com/Storage/GridStorage.cfm">HYDRAstor</a> is delivering on these new objectives.</p>]]></description>
            <link>http://necam.dciginc.com/2008/04/scaling-to-a-zettabyte-necs-ka.html</link>
            <guid>http://necam.dciginc.com/2008/04/scaling-to-a-zettabyte-necs-ka.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Thu, 24 Apr 2008 05:00:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>NEC HYDRAstor Keeps Footprint of Deduplication Appliances to a Minimum; Part 2 of 2</title>
            <description><![CDATA[<p>In <a href="http://www.dciginc.com/redirect.php?site=http://necam.dciginc.com/2008/03/nec-hydrastor-uses-a-two-step.html" target="_blank">part one</a> of this two-part series, NEC's Director of Business Development, Dr. Christian Toelg, answered some specific technical questions about how Accelerator Nodes and Storage Nodes differ from one another. This second part takes a look at what specific advantages NEC's <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/storage/Advantage.cfm" target="_blank">HYDRAstor</a> <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/Storage/GridStorage.cfm" target="_blank">grid storage</a> architecture has over siloed, two-controller storage system architectures when performing deduplication.</p>
<p>The benefits that either a grid storage or a siloed, two-controller storage system will provide in terms of deduplication will hinge on the amount of data that a company plans to store on the system. When there is only a small amount of data (under 10 TB), Dr. Toelg says there is probably not a big difference in the deduplication benefits that a company will realize when using one approach over another. In these circumstances, the other benefits of using a grid storage architecture, such as its ease of upgradeability and "future proofing" against technology obsolescence, become the main drivers for its adoption.</p>
<p>It is in enterprises that need to store tens, hundreds or even thousands of terabytes of data that the drawbacks of siloed, two-controller architectures become more apparent. These architectures force enterprises to deploy multiple appliances that create data silos. Multiple appliances negate the benefits of deduplication since the second, third and later appliances store data similar to that on earlier appliances and do not deduplicate across one another. "As you add more appliances, the effective deduplication rate drops from 18:1 - 20:1 to 8:1 - 9:1 which thereby cuts your total effective deduplication capacity," says Dr. Toelg.</p>
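<p>A toy model shows why silos hurt the ratio. In the sketch below, three backup streams that share most of their chunks are each backed up five nights in a row, first to three separate appliances and then to one global pool. The chunk counts, the number of nights and the use of SHA-256 are all made up for illustration and are not measurements from any product.</p>

```python
import hashlib


def unique_chunks(chunks):
    """Set of hash keys for the chunks, i.e. what a deduplicating store would keep."""
    return {hashlib.sha256(chunk).hexdigest() for chunk in chunks}


# Hypothetical workload: each stream has 90 chunks in common with the others
# and 10 chunks of its own, and is backed up in full for 5 nights.
NIGHTS = 5
shared = [f"common-{i}".encode() for i in range(90)]
streams = [shared + [f"site{s}-{i}".encode() for i in range(10)] for s in range(3)]
nightly = [stream * NIGHTS for stream in streams]

logical = sum(len(backups) for backups in nightly)

# Siloed: one appliance per stream, each deduplicating only its own data.
siloed_stored = sum(len(unique_chunks(backups)) for backups in nightly)

# Global: one logical pool deduplicating across every stream.
global_stored = len(unique_chunks(chunk for backups in nightly for chunk in backups))

print(f"siloed {logical / siloed_stored:.1f}:1, global {logical / global_stored:.1f}:1")
```

<p>With these made-up numbers the silos store the 90 common chunks three times over and end up at 5:1, while the single pool stores them once and reaches 12.5:1.</p>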
<p>NEC HYDRAstor's grid storage architecture can deliver a higher deduplication ratio since its <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/storage/HYDRAFAQ.cfm" target="_blank">Storage Nodes</a> create one logical pool of storage capacity and globally deduplicate data across that entire pool. The actual deduplication ratio any enterprise gets will vary according to how long data is retained, the data or application type (backups will likely get better deduplication ratios than archive data), the type of backups performed (full, incremental, or differential) and the uniqueness of the data stored on the appliance. However, the odds of achieving higher deduplication ratios are improved using HYDRAstor since companies can store all data in one logical pool.</p>
<p>The grid storage architecture also gives enterprises a couple of other advantages. Since all Accelerator Nodes can access data on any Storage Nodes, companies that plan to continue their use of tape can dedicate certain Accelerator Nodes to transfer data from disk to tape. This configuration prevents the performance overhead that data migrations incur from overlapping and impacting production backups and restores. Companies that run backups and restores 24x7 are the most apt to want to take advantage of this option. </p>
<p>The other notable advantage that a grid storage architecture provides is that it keeps the footprint of the system to a minimum. Since HYDRAstor can scale Accelerator Nodes and Storage Nodes independently, enterprises only need to bring in as many nodes as they need and upgrade or add either capacity or performance at any time. This flexibility minimizes power consumption, data center floor space and the cost of HYDRAstor since companies only need to purchase as many Accelerator or Storage Nodes as they need, when they need them. HYDRAstor's global deduplication feature further contributes to keeping the hardware footprint to a minimum, by improving the overall deduplication ratio and reducing the number of nodes required to hold and serve the data.</p>
<p>NEC HYDRAstor's grid storage architecture simplifies deduplication while maximizing its benefits. Companies can start with the HYDRAstor in configurations that match their initial requirements. They can then scale it according to how their environment evolves without sacrificing higher deduplication ratios or ease of management. HYDRAstor's grid storage architecture sets the stage for companies to confidently deduplicate their data while avoiding the problems that siloed, two-controller architectures introduce. </p>]]></description>
            <link>http://necam.dciginc.com/2008/04/nec-hydrastor-keeps-footprint.html</link>
            <guid>http://necam.dciginc.com/2008/04/nec-hydrastor-keeps-footprint.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Mon, 14 Apr 2008 05:30:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>NEC HYDRAstor Uses a Two Step Inline Deduplication Process Part 1</title>
            <description><![CDATA[<p>The grid architecture upon which <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/" target="_blank">NEC</a>'s <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/storage/Advantage.cfm" target="_blank">HYDRAstor</a> is designed is unique. By using a <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/Storage/GridStorage.cfm" target="_blank">grid storage</a> architecture, HYDRAstor is architected to avoid some of the scaling issues associated with performance and capacity that other deduplication appliances intended for the enterprise may encounter. To do this, HYDRAstor uses two types of servers, or nodes, <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/storage/HYDRAFAQ.cfm" target="_blank">Accelerator Nodes</a> and Storage Nodes, that are dedicated to managing these specific tasks. To better understand HYDRAstor's under-the-covers configuration, I recently had a conversation with NEC's Director of Business Development, Dr. Christian Toelg, to discuss this topic.</p>
<p><strong>Jerome</strong>: How do the HYDRAstor Accelerator and Storage Nodes differ from one another in their <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/Storage/HYDRAstorHS.cfm" target="_blank">hardware</a>?</p>
<p><strong>Dr. Toelg</strong>: For the most part, they use off-the-shelf hardware, though there is one big difference between them. Accelerator Nodes do not need to store large amounts of data like the Storage Nodes do. Since the Accelerator Nodes only need enough storage for the operating system, they just use standard disk drives that are mirrored. Conversely, we want to put as much storage as possible into the Storage Nodes. Currently these nodes use 500 GB SATA disk drives, though 750 GB and 1 TB SATA disk drives for the Storage Nodes will be available in the near future.</p>
<p><strong>Jerome</strong>: Since their hardware is similar, how do the nodes differ in the software that operates on them?</p>
<p><strong>Dr. Toelg</strong>: The Accelerator Nodes use file systems to present NAS interfaces to the connecting servers; backup clients see the NAS interface presented by the Accelerator Nodes and the Accelerator Nodes see the NAS interface presented by the Storage Nodes. However, what the software on these nodes does under the covers is very different. The Accelerator Nodes pre-process incoming data by chunking the data and then doing de-duplication of the data. The primary function of the Storage Nodes is to protect the data and then assign the data to the nodes where it is most efficiently and effectively protected. </p>
<p><strong>Jerome</strong>: So both the Accelerator and the Storage Nodes deduplicate data?</p>
<p><strong>Dr. Toelg</strong>: That is correct. HYDRAstor uses a two-step inline process to deduplicate data. Two or more Accelerator Nodes may see the same file at the same time. However, Accelerator Nodes only have a part of the information required to do deduplication and do not maintain the entire global deduplication index. So the Accelerator Nodes chunk up each file into small chunks, eliminate as many duplicates as possible and then send the remaining chunks to the Storage Nodes. The Storage Nodes receive these chunks of data and then make the final determination as to which chunks are unique and should be stored to minimize storage requirements. Data is protected by breaking the unique chunks of data up into fragments and distributing the fragments across the Storage Nodes.</p>
<p><strong>Jerome</strong>: How do the Storage Nodes protect the data without impacting performance?</p>
<p><strong>Dr. Toelg</strong>: HYDRAstor's two-tier grid architecture has been designed so that many read and write processes can run in parallel by splitting tasks between Accelerator and Storage Nodes and distributing the workload across many nodes. This makes the HYDRAstor unique when compared to monolithic appliances as it allows you to scale performance and capacity in parallel. Furthermore, the HYDRAstor system is constantly monitored for the discovery of new nodes, component failures or the removal of specific nodes from the cluster of Storage Nodes. Background tasks ensure the system is balanced with respect to storage capacity and performance at all times without manual interaction. Tasks are distributed so that read and write performance is maximized and maintenance of stored data, such as migration of data to optimize utilization, data deletion or recovery of lost data, is ensured.</p><font style="FONT-SIZE: 0.8em" size="2">
<p><a href="http://necam.dciginc.com/2008/04/nec-hydrastor-keeps-footprint.html">Part 2</a> of this analysis of how NEC HYDRAstor's Accelerator and Storage Nodes are configured and deduplicate data will appear in the next few weeks. The next part will examine what benefits global deduplication provides and under what circumstances users might expect to reach those ratios.</p>
<p>Note: This blog entry was updated on 4/14/08 at 11:32 am CST to reflect some new information on how NEC's HYDRAstor deduplication process works.</p></font>]]></description>
            <link>http://necam.dciginc.com/2008/03/nec-hydrastor-uses-a-two-step.html</link>
            <guid>http://necam.dciginc.com/2008/03/nec-hydrastor-uses-a-two-step.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Fri, 28 Mar 2008 05:00:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>NEC HYDRAstor&apos;s Grid Storage Architecture Can Eliminate Siloed Data Stores</title>
            <description><![CDATA[<p>In previous blog entries I have made reference to silos of deduplicated backup data stores but have not gone into a great deal of detail as to what specific problems data silos create. So in this entry I take a closer look at:</p>
<ul>
<li>Why data silos are created</li>
<li>The problems that data silos can create</li></ul>
<p>The primary reason that many deduplicating appliances create data silos is that they are based on the traditional dual-controller storage system architecture. Dual-controller storage systems typically use two clustered servers that sit in front of a fixed pool of storage. These two servers provide high availability, improved performance and access by either server to the data stored on the backend pool of storage.</p>
<p>Traditional storage systems serve consumers well when storage growth is limited. But this model starts to break down when used in conjunction with deduplicated backup data stores. Anecdotal evidence suggests that data in most companies continues to grow at rates of 50% or more every year. Using deduplication in the backup process does significantly reduce the amount of data that companies need to store. However, over time, even deduplicating backup appliances need more performance and storage capacity, and the underlying architecture of traditional dual-controller storage systems does not permit them to scale.</p>
<p>Companies then need to bring in more deduplicating backup appliances with isolated backend storage in an attempt to keep up with data growth. Yet when that occurs, the new appliance is unaware of the first appliance and is unable to take advantage of any of the indexed deduplicated data stores that it has created. As a result, the second appliance must start from scratch and build its own deduplicated data store which creates the new data silo.</p>
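<p>To make the cost of starting from scratch concrete, here is a minimal Python sketch. It is an illustration only and not how any vendor's appliance is implemented: the <code>unique_chunks</code> helper, the chunk contents and the use of SHA-256 as the chunk fingerprint are all assumptions made for this example.</p>
<pre>
```python
import hashlib


def unique_chunks(streams):
    """Count the distinct chunks one appliance would store for these streams."""
    index = set()  # hypothetical deduplication index: one entry per unique chunk
    for stream in streams:
        for chunk in stream:
            index.add(hashlib.sha256(chunk).hexdigest())
    return len(index)


# Two backup jobs that share most of their chunks (made-up data).
job_a = [b"os-image", b"app-binaries", b"db-page-1"]
job_b = [b"os-image", b"app-binaries", b"db-page-2"]

# One appliance with one shared index sees both jobs.
global_pool = unique_chunks([job_a, job_b])

# Two isolated appliances each build their own index from scratch.
siloed = unique_chunks([job_a]) + unique_chunks([job_b])

print(global_pool, siloed)
```
</pre>
<p>With this made-up data, the single shared index stores four chunks while the two isolated appliances store six between them, because each silo keeps its own copy of the two shared chunks.</p>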
<p>Data silos create problems on a number of fronts. More appliances create more points of management. Backups become more difficult to manage since administrators need to determine what backup jobs they need to send where. Send too many backup jobs to the new appliance and it may encounter the same performance or capacity problems as the existing appliance. Send too few jobs and the new appliance will not deliver the full benefits of deduplication. In addition, separating the backup jobs to isolated deduplicating appliances with different data stores inhibits the ability to deduplicate data globally across all backup jobs.</p>
<p>These types of issues create a demand for a new type of storage architecture with global deduplication capabilities that new products based on grid storage architectures can address. <a href="http://www.necam.com/storage/Advantage.cfm">NEC HYDRAstor</a> is one such product that uses a grid architecture to allow it to scale performance and capacity independently. Companies can start with a configuration that meets their immediate data storage needs and budget restrictions. They can then scale it without needing to create additional points of management or new data silos.</p>
<p>Because the NEC HYDRAstor functions as one logical storage system regardless of how much performance or capacity a company adds, it can globally deduplicate all company archive or backup data stored on it. This reduces and optimizes corporate data stores while eliminating the headaches associated with managing multiple appliances and backup jobs.</p>
<p>The adoption of disk-based backup and deduplication is accelerating in more companies. As it does, new storage system architectures are needed to help companies solve the type of problems that deduplication and today's data growth create without creating data silos. Products based on grid storage architectures such as NEC HYDRAstor can help eliminate data silos while giving companies newfound flexibility to manage their growing volumes of data while minimizing the amount of data they need to store.</p>]]></description>
            <link>http://necam.dciginc.com/2008/03/nec-hydrastors-grid-storage-ar.html</link>
            <guid>http://necam.dciginc.com/2008/03/nec-hydrastors-grid-storage-ar.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Wed, 19 Mar 2008 06:05:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Global Deduplication Can Offset Enterprise Storage Growth Rates</title>
            <description><![CDATA[<p>It's almost impossible to pick up a trade rag or read an article on the web on data protection without some mention of deduplication. Some of that is deliberate as editors know that users are googling for the word "deduplication" and that putting DEDUPLICATION in big bold letters in the title or text of the article helps draw readers to their site. Yet these discussions, while relevant, overlook mid to long term data management requirements. </p>
<p>In fact, there was a great debate last fall at SNW between some of the deduplication vendors as to which way was the best way to deduplicate data. While entertaining, this debate solved little. Most users only care about deduplication insomuch as it affects their ability to successfully back up the data in their environment. However, most of these discussions focus on solving users' short-term requirements and don't take into consideration some of the longer term problems. Unfortunately, hidden long term data management costs lurk for those who unwittingly adopt the wrong deduplication appliances.</p>
<p>Deduplicating appliances have gained mindshare with users because they make disk as cheap as, or cheaper than, tape by delivering data reduction ratios of 15:1 or more while expediting backups, which solves users' short term backup problems. However, when selecting a deduplication product, companies also need to consider how well it will serve them in the long term. </p>
<p>For example, does the deduplication appliance extend the benefits of deduplication beyond the current appliance and, if it does, how does it do it? The capability to globally deduplicate data is very powerful, but most deduplicating storage appliances are limited in scope to just that one appliance. If the ceiling for performance or storage capacity is reached, one must bring in a new appliance. Thus, deduplication starts from scratch, even if the data on the initial appliance(s) has been deduplicated. The impact is silos of deduplicated data.</p>
<p>Is there an adverse impact of deduplicating silos of data annually? That really depends, but companies I talk to are experiencing year-over-year data growth rates of 50% or more. Recently, I spoke to a colleague at my previous employer and he said the amount of data stored there has continued to double yearly. While we might expect data growth rates to slow down, anecdotal evidence suggests that they are not. </p>
<p>I bring this example with my previous employer up because as disk prices continue to drop, companies are bringing in more storage to store more data, often to make copies of existing data for other purposes such as testing, development, data mining and eDiscovery. But as companies back this repurposed data up to new appliances, they cannot take advantage of the deduplication benefits that existing appliances provide.</p>
<p>Siloed deduplication offers some benefits during initial use, but forward looking companies realize global deduplication fully capitalizes on deduplication technology to provide the lowest TCO solution that meets business data requirements. Few products are available to support global deduplication, but new products like the <a href="http://www.dciginc.com/redirect.php?site=http://www.necam.com/Storage/GridStorage.cfm" target="_blank">NEC HYDRAstor</a> can deliver this based on its storage architecture. Since HYDRAstor is based on a grid storage architecture, it can scale performance and capacity independently, allowing IT to deploy one system instead of many. This architecture enables it to globally deduplicate data without running up against the performance and capacity limitations that current appliance-based products encounter and that force IT to deploy multiple instances, which can easily create a management nightmare. Global deduplication also lowers total storage costs since the amount of data stored is minimized.</p>
<p>Deduplication is a cost-effective justification for introducing disk into the backup process. However, companies must recognize that in order to fully leverage the benefits of deduplication and keep it from becoming just another management pain point a few years down the road, they need to look beyond just the problems deduplication solves to the new problems it creates when they have many silos of deduplicated data. As companies consider their mid-to-long term business issues, emerging features like global deduplication and grid storage architectures merit special consideration in the data deduplication management process.</p>]]></description>
            <link>http://necam.dciginc.com/2008/03/global-deduplication-can-offse.html</link>
            <guid>http://necam.dciginc.com/2008/03/global-deduplication-can-offse.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Wed, 05 Mar 2008 12:00:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>NEC HYDRAstor Helps to Eliminate the Need for Backup Data Migration Management</title>
            <description><![CDATA[<p>Data migrations are a painful part of storage management in most enterprise shops today. Driven by storage technology refreshes, storage upgrades, or optimizing data placement on storage systems to improve application performance, data migrations are an ongoing and laborious part of enterprise data management. Yet for most companies the pain of data migrations has been largely restricted to moving production data from one production storage system to another. </p>
<p>Introducing disk as a target into the backup process changes this scenario. Companies must now begin to account for the difficulties of migrating data from disk to tape or other disk systems once the initial backup completes. The magnitude of the problem varies according to how the disk is configured. For instance:</p>
<ul>
<li>Using disk-as-disk may sound like the simple solution but if users are backing up vast amounts of data, the storage system may continually run out of storage capacity. This results in failed backup jobs and users needing to continually procure more storage space.</li>
<li>Virtual tape libraries (VTLs) are becoming popular in enterprises because they look and act like physical tape libraries to backup software while providing the backup and recovery speeds of disk. However migrating data from virtual tape cartridges to physical tape cartridges is fraught with problems. Companies need to keep backup catalogs in sync, account for compression and encryption on physical tape drives and account for differences in size between virtual and physical tape cartridges.</li>
<li>Deduplicating backup appliances appear to be the perfect technology for backing up to disk since they eliminate redundant data, can increase storage capacities by 10 fold or more and give the appearance of infinite capacity. There's the rub. Deduplicating backup appliances only give the appearance of infinite capacity so they may require upgrades or migrating data to tape. In either case, companies need to plan and manage the migration.</li></ul>
<p>So while the initial benefits that companies derive from using disk in any of its different formats in the backup process are usually substantial, the effort associated with managing and migrating backup data to support a technology refresh or disk system expansion, or to migrate from disk to tape, can become problematic over time. </p>
<p>That's what makes the NEC HYDRAstor truly unique among disk-based data protection options. Even though it falls under the general category of a "deduplicating backup appliance", because it can scale to manage petabytes of data while concurrently scaling performance, companies do not need to migrate backup data from disk to tape or upgrade to a new appliance when they are running short on capacity. The NEC HYDRAstor eliminates these problems by decoupling the scaling of performance and capacity servers, or nodes, so companies can scale either to meet the particular needs of their environment without replacing existing equipment. The HYDRAstor also helps companies avoid the typical need to do data migrations during technology refreshes since data on existing nodes can be non-disruptively migrated to new nodes without downtime.</p>
<p>The introduction of disk into the backup process has fundamentally changed that process, but with that change companies also need to change how they think about managing backup data once it is stored to disk. Most disk-based products used in the backup process have not fully accounted for this fundamental change in the management of backup data after it is stored on disk. The NEC HYDRAstor is one product that has taken the management of backup data stored on disk into account in its design from the outset and prevents this from becoming a problem later on in the management of the data.</p>]]></description>
            <link>http://necam.dciginc.com/2008/02/nec-hydrastor-helps-to-elimina.html</link>
            <guid>http://necam.dciginc.com/2008/02/nec-hydrastor-helps-to-elimina.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Deduplication</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Disk Based Backup</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Tue, 26 Feb 2008 12:00:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Backup is Deduplication&apos;s First Stop</title>
            <description><![CDATA[<p>During 2007 the terms "deduplication" and "backup" became almost inextricably linked, with products like NEC's HYDRAstor helping to contribute to this association. However, companies looking to introduce HYDRAstor into their backup environment should take into account that data deduplication is a destination and backup only the first stop in that journey.</p>
<p>Though HYDRAstor is intended as a possible replacement for tape, it does not need to replace all tape immediately... or ever. While replacing tape with disk is probably the objective for most companies - and one that may occur over time - this will not occur overnight. In the meantime, companies need to keep data offsite as part of their comprehensive data protection plan, which for most companies means that they will still need to copy data from disk to tape. </p>
<p>This configuration plays to HYDRAstor's strength. It uses NAS to present a large disk target to the backup software which gives it a distinct advantage over using Virtual Tape Libraries (VTLs). In order for the backup software to appropriately track the data when copying data from a virtual tape cartridge to a physical tape cartridge, the amount of data on a physical tape cartridge needs to match the amount of data on a virtual tape cartridge. However if the virtual tape cartridge is not filled to capacity then the physical tape cartridge is not filled either.</p>
<p>Using HYDRAstor as a large disk pool or cache changes this dynamic. When data is copied from disk to tape, the backup software copies data from the HYDRAstor and fills up each physical tape cartridge. Even though HYDRAstor stores the data in a deduplicated state, because HYDRAstor decouples Accelerator nodes from Storage nodes in its grid architecture, companies can increase performance if needed.</p>
<p>Backup is not the only way that companies can use HYDRAstor. While using it as a backup target is certainly the most popular option right now, using it to store archived data or as a network file server are other ways that companies can also look to use HYDRAstor.</p>
<p>Data deduplication is changing the backup game but it has the potential longer term to change how companies store much of their data. It's when companies start to view deduplication from this larger viewpoint that they can begin to understand the significance of HYDRAstor's underlying grid architecture. It can adapt to meet the different ways companies need to use it in the backup process and also gives companies the option to use it to meet multiple, different requirements.</p>]]></description>
            <link>http://necam.dciginc.com/2008/01/backup-is-deduplications-first.html</link>
            <guid>http://necam.dciginc.com/2008/01/backup-is-deduplications-first.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Fri, 11 Jan 2008 10:33:09 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>HYDRAstor Creates a Decoupled Storage Grid</title>
            <description><![CDATA[<p>The ability to create a single logical pool of storage using off-the-shelf server and storage hardware is one of the key drivers behind the current revolution in data storage that is spawning new grid storage architectures. However delivering on the promise of grid storage creates new challenges - especially when one adds deduplication to the equation. </p>
<p>Scaling a storage grid to manage and deduplicate hundreds of TBs or even PBs of data while still protecting the integrity of the data and without creating performance bottlenecks is a different animal than merely delivering grid storage. This last week I hooked up with Dr. Christian Toelg, NEC's Director of Business Development, to discuss how NEC architected HYDRAstor to meet these specific challenges.</p>
<p>Dr. Toelg believes that the future of disk-based data protection clearly lies with deduplication. Unfortunately deduplication does not play well using traditional clustered storage system architectures as the amount of data under management continues to grow. Traditional clustered storage systems have a finite number of controllers and storage capacity as well as centralized file system information and mapping tables on each node. Together, these factors create hard upper limits to scalability forcing end users to deploy and manage additional systems that function as isolated data silos with no deduplication across systems increasing management complexity.</p>
<p>HYDRAstor's grid storage architecture conversely can start small but linearly and independently scale performance or capacity or both. HYDRAstor decouples performance nodes (called Accelerator Nodes) from capacity nodes (called Storage Nodes) creating two separate layers that act independently of one another and possess their own levels of intelligence. </p>
<p>In traditional clustered storage architectures, the storage controllers possess all of the information about the data while its attached disk remains stupid. HYDRAstor changes this paradigm by putting intelligence in both layers and makes the Storage Nodes "self-aware". These Storage Nodes behave as one large pool of capacity that is fully self-managed with respect to balancing capacity and performance across all nodes as well as rebalancing the system as new capacity is added or to recover from failures. </p>
<p>Creating a separate storage layer introduces new data protection and disaster recovery options. Dr. Toelg says that NEC made a specific decision early on in HYDRAstor's design not to use RAID since losing any chunk of deduplicated data can impact tens, hundreds or thousands of files and make all of them unrecoverable. Because the storage layer possesses its own level of intelligence, HYDRAstor can create higher levels of data protection and redundancy than what RAID can deliver with a comparable storage overhead. Also, deduplicated data can be shared between storage pools in different geographical locations using policies to replicate data so companies can automate their disaster recovery configuration.</p>
<p>By creating two distinct layers in its grid storage architecture, HYDRAstor addresses key concerns about data protection and scalability that deduplication typically introduces. HYDRAstor eliminates the capacity, cost and performance trade-offs that companies typically need to make while laying the foundation for companies to bring a disaster recovery solution into their environment.</p>]]></description>
            <link>http://necam.dciginc.com/2007/12/hydrastor-creates-a-decoupled.html</link>
            <guid>http://necam.dciginc.com/2007/12/hydrastor-creates-a-decoupled.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Fri, 14 Dec 2007 12:01:06 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Disaster Recovery Time Objective; HYDRAstor has an Edge</title>
            <description><![CDATA[<p>Data recovery is where the rubber meets the road. However I am not sure if deduplication vendors always practice what they preach because vendors almost immediately divert the focus of users off of data recovery speeds and onto how fast they do backups and what size data reduction ratios they can potentially deliver. While high data reduction ratios are good and wonderful, they may not do a company much good if it can't recover the data as fast as it needs or wants.</p>
<p>The biggest benefit of deduplication - high data reduction ratios - can also become its biggest drawback over time when recoveries are performed. As deduplication breaks incoming backup streams apart into smaller chunks of data, it compares these incoming chunks of data to chunks of data it already has on its storage subsystems, storing and indexing net new chunks of data while only indexing duplicate chunks. </p>
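<p>The store-and-index step described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a description of any particular product: the <code>ChunkStore</code> class, the fixed 4-byte chunk size and the SHA-256 fingerprints are hypothetical choices made purely for illustration.</p>
<pre>
```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


class ChunkStore:
    """Toy deduplicating store: each unique chunk is kept once, keyed by its hash."""

    def __init__(self):
        self.chunks = {}   # maps chunk hash to chunk bytes (each stored once)
        self.recipes = {}  # maps file name to its ordered list of chunk hashes

    def write(self, name, data):
        recipe = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if it is new; a duplicate is only indexed.
            self.chunks.setdefault(digest, chunk)
            recipe.append(digest)
        self.recipes[name] = recipe

    def read(self, name):
        # Recovery: look up every referenced chunk and reassemble the file.
        return b"".join(self.chunks[digest] for digest in self.recipes[name])


store = ChunkStore()
store.write("monday.bak", b"AAAABBBBCCCC")
store.write("tuesday.bak", b"AAAABBBBDDDD")
print(len(store.chunks), store.read("tuesday.bak"))
```
</pre>
<p>After both writes the store holds four unique chunks instead of six, and <code>store.read("tuesday.bak")</code> reassembles the original bytes by following that file's list of chunk references. Every read has to chase those references, which is the recovery cost the rest of this entry is concerned with.</p>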
<p>The problem that begins to emerge over time from a recovery perspective is inadequate processing power to reconstruct the files. Tens, hundreds or even thousands of files may reference the same chunks of data that are spread throughout the deduplicating appliance so it takes the system longer to reconstruct specific files. This scenario becomes especially problematic if the deduplicating appliance is called upon to reconstruct and recover multiple files for tens or hundreds of servers at the same time; such as may be required during a Disaster Recovery (DR) scenario. </p>
<p>NEC's HYDRAstor has a decided edge over its competitors. HYDRAstor uses a grid architecture that supports both accelerator and capacity nodes so users can scale performance and capacity independently of one another. This architecture is critical in recoveries since companies can add more accelerator nodes to meet specific application or corporate Recovery Time Objectives (RTOs). </p>
<p>For instance, it is more likely that a company will only need to recover files from a single server at one time, so fewer accelerator nodes may be needed since the performance overhead is not as great. However, if data is moved off to tape or they need to recover files at a disaster recovery site, the company may need to concurrently recover tens or hundreds of servers in a short, fixed period of time. Using HYDRAstor, they can add additional accelerator nodes to meet these specific corporate RTOs. </p>
<p>Data recoveries should be first and foremost on the minds of administrators when purchasing disk-based data protection. If companies cannot recover data in time to satisfy specific application or corporate DR RTOs, the product's value proposition becomes questionable since data protection is ultimately about data recoveries, not faster backups or data reduction. NEC's HYDRAstor remembered to deliver on that objective. </p>]]></description>
            <link>http://necam.dciginc.com/2007/12/disaster-recovery-time-objecti.html</link>
            <guid>http://necam.dciginc.com/2007/12/disaster-recovery-time-objecti.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Fri, 07 Dec 2007 07:40:00 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>Speed Still Matters</title>
            <description><![CDATA[<font size="1">
<p><font style="FONT-SIZE: 1.25em">As I was going through the job interview process for a storage administrator position nearly seven years ago, my prospective employer took me on a tour of the data center in which I was to eventually work. Having always worked in smaller data centers, this tour took me aback. Entire rooms were filled with tape libraries while other rooms were filled with racks of tapes staged for offsite delivery or prepped for someone to put them back in the tape libraries.</font></p>
<p><font style="FONT-SIZE: 1.25em">Yet by the time I left that company about 18 months ago, most of that tape infrastructure was gone and replaced by disk as disk became the predominant method for backup and recovery. What is of note is that my former employer financially justified the changeover from tape to disk prior to the advent of deduplication.</font></p>
<p><font style="FONT-SIZE: 1.25em">This is an interesting point to ponder: if deduplication had been as widely available on disk subsystems three or four years ago as it is now, would my company have bought disk with deduplication or without? That may seem like a stupid question, but companies should not assume that a disk library is a better choice than buying raw disk and backing up all of their data on it just because it supports deduplication. </font></p>
<p><font style="FONT-SIZE: 1.25em">Before adopting deduplication, one needs to consider all of the intangibles. The overhead associated with deduplicating the data, the additional time needed to recover the data and the risks associated with storing data in a deduplicated state contribute to the argument that deduplication is not a slam dunk in every circumstance.</font></p>
<p><font style="FONT-SIZE: 1.25em">So am I saying that one should stop reading this blog about HYDRAstor's deduplication features and start shopping for disk drives at Wal-Mart? Not exactly. But what I am saying is that storage administrators, engineers and architects need to take a step back and look at the larger picture of what problems they are really trying to solve before implementing deduplication. </font></p>
<p><font style="FONT-SIZE: 1.25em">While reducing their data stores is one of the problems they are trying to solve, it is not the only one. The real problem that most companies are trying to solve with disk is improving their backup and recovery windows, which, at the enterprise level, is a much more complex architectural challenge than just buying a storage system that supports deduplication.</font></p>
<p><font style="FONT-SIZE: 1.25em">This is what makes HYDRAstor an extremely compelling story. Yes, it does deduplication, and arguably does it better than other products. But more importantly, its architecture addresses the reason companies introduced disk-based backup in the first place - to improve the speed of their backups and recoveries. By delivering an architecture that can independently scale performance and capacity, HYDRAstor lets companies take advantage of the benefits of deduplication while still getting fast backup and recovery times.</font></p>
<p><font style="FONT-SIZE: 1.25em">Deduplication will store data more efficiently in a smaller footprint - no one disputes that. Yet companies that prioritize data reduction ratios above faster backup and recovery times will quickly be reminded that it was faster backup and recovery times, not data reduction, that was their primary motivation for switching to disk in the first place.</font></p></font>]]></description>
            <link>http://necam.dciginc.com/2007/11/speed-still-matters.html</link>
            <guid>http://necam.dciginc.com/2007/11/speed-still-matters.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Wed, 28 Nov 2007 10:17:53 -0600</pubDate>
        </item>
        
        <item>
    	    <author>
	        <name>Jerome M. Wendt</name>
        	<uri>http://www.dciginc.com/about/jeromemwendt</uri>
	    </author>
            <title>The Enterprise Game Changer</title>
            <description><![CDATA[The rapid emergence and acceptance of deduplication into the mainstream of enterprise storage in the last 18 to 24 months has been nothing short of phenomenal. Enterprise storage is a segment of the computer industry that typically measures change in years, not months. While changes in specific media formats occur, the idea that a totally new technology can gain such widespread acceptance and adoption in so short a time is as revolutionary as the technology itself.<br /><br />The problem that emerges when a technology goes from the drawing board to corporate acceptance so quickly is that it is easy for individuals to assume that every product that supports deduplication implements it the same way or delivers the same level of benefits. While it might be true that different products implement deduplication using similar underlying algorithms for data reduction, they do not all scale in the same way.<br /><br />Most VTLs or disk-based backup appliances that embed deduplication technology are limited in two ways: the number of front end controllers that can deduplicate the data and the amount of back end storage. These limitations affect how fast they can deduplicate backed up data and how much data they can store. This inability to scale forces companies to purchase another VTL or disk-based backup appliance: either a larger unit, to which they must migrate data from the old unit, or a second unit.<br /><br />Neither option is particularly desirable since in both cases companies still face the same inherent performance and capacity limitations as with the initial disk library. Moreover, a second deduplicating VTL or disk-based backup appliance has to start the deduplication process from scratch, so much of the new performance and capacity in the second unit is wasted re-deduplicating data that the first unit previously reduced. 
This pattern holds true for each succeeding deduplicating VTL or disk-based backup appliance that companies buy - none of the benefits are shared between units, so companies end up with lots of silos of deduplicated data.<br /><br />I bring out these points not to denigrate deduplication. Deduplication is better than just using dumb disk libraries with no deduplication at all. But there is more to consider when bringing deduplication in house than just buying any product that supports it. Companies need to buy a product with a scalable architecture that supports the technology of deduplication.<br /><br />Deduplication is a game-changing technology as it changes how enterprises manage their data. But it does more than just change the game for how data is stored; it changes the architecture of the products that companies can and should use to store it. The sooner companies understand this architectural change and the need for <a target="_blank" href="http://www.necam.com/storage/contacts/?ItemID=148&amp;wp=10">grid storage</a> in enterprise deduplication, the sooner they can stop spending as much time managing their backup and storage infrastructure and start spending more time managing their business. <br />]]></description>
            <link>http://necam.dciginc.com/2007/11/the-enterprise-game-changer.html</link>
            <guid>http://necam.dciginc.com/2007/11/the-enterprise-game-changer.html</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">Grid Storage</category>
            
            <pubDate>Tue, 13 Nov 2007 06:21:27 -0600</pubDate>
        </item>
        
    </channel>
</rss>