The Impact of Disruptive Technologies on the Professional Storage Marketby Johan De Gelas on August 5, 2013 9:00 AM EST
CloudFounders: No More RAID
CloudFounders, a startup of former Terremark and SUN execs, also leverages a flash cache, but the building blocks are very different. Just like Nutanix, there is a VSA (Virtual Storage Appliance) that tries to make the best out of a local flash cache. The cool thing is that the backend, the second tier of storage, is not a traditional RAID based volume. The backend is either an object storage initiator that links to Amazon S3 or a storage device based upon erasure encoding called Distribute Storage System (DSS). Let's start with the DSS backend.
DSS is an object oriented storage system that uses “Bitspread”, an advanced and flexible erasure encoding system developed by the people of Amplidata. Amplidata is a startup with a mix of Belgian and US based infrastructure experts. Some of the directors are working for both CloudFounders and Amplidata. But there is a solid technical reason why CloudFounders chose to go with the Amplidata storage system. Bitspread is meant to be the “big storage alternative” to RAID.
As you probably know or have experienced hands-on, the current RAID implementations—RAID 5 and RAID 6—have reached their limitations now that we have terabyte disks. A few terabyte disks in RAID 5 can take days to rebuild. The result is that the RAID array performance and reliability is heavily degraded. RAID 6 is more reliable (although hardly 100%) but is not exactly a good performer for writes, which is another reason why VDI does not work well on a low end or midrange SAN.
“Bitspread” erasure encoding, also called Forward Error Correction Code (FEC), encodes data in “check-blocks”. The beauty is that you can configure the durability policy. In other words you can choose over how many disks these check-blocks should be spread and how many check-blocks you can lose before it becomes a problem. For example you could ask it to spread the datablock over 18 drives and tell the codec to make sure you can recover the original datablock from 12 check-blocks. So it's only if you lose more than 6 drives at once that you lose your data. As the codec requires only 12 of the check blocks to rebuild the original data object, a failure of two drives does not mean the rebuild should happen urgently. The rebuild can be done in the background at a very slow pace while the reliability stays high. You can also have the check-blocks spread over several storage modules, ensuring that you even survive a failure of a complete disk enclosure.
Bitspread: original data (yellow) is split up, encoded with high redundancy (green) and then spread over many disks and enclosures.
For those who are not convinced that the small startup Amplidata is onto something: Intel and Dr. Sam Siewert of Trellis Logic explain in this paper why it can even be mathematically proven that the Reed-Solomon based erasure codes of RAID 6 are a dead end road for large storage systems. The paper concludes:
"Amplida's Bitspread is an efficient, scalable and practical alternative to the stop-gap of combined RAID levels like 6+1."
And that is exactly the reason why CloudFounders chose to build their storage system on the Amplidata backend.
The DSS based on “Bitspread” works with objects and is thus not a block device. A volume driver must be installed that converts the DSS into a block device. This way the hypervisor can connect to an iSCSI target that is running on top of the volume driver, as an iSCSI target requires a block device and does not recognize the format of the DSS.
Bitspread is a lot more CPU intensive and needs more storage room than traditional RAID algorithms. To reduce the CPU impact, Amplidata leverages the SSE 4.2 capabilities of the latest Xeons. As Bitspread copes so well with disk failures, it is natural to use relatively slow SATA disks, which negates the capacity disadvantage compared to RAID 6. Decent media transfer can still be achieved as the DSS typically spreads the check-blocks over many disks.