Save Space: The Final Frontier - Erasure Codes With OpenStack Swift
Today we’re really excited to announce an initiative to introduce erasure codes in OpenStack Swift. Swift currently uses replicas, but a question has come up – could we save space by using erasure codes?
This initiative enables deployers to store data with erasure coding instead of, or in addition to, Swift’s 3-replica model. Though using 3 replicas provides excellent performance and availability, it comes at a price, incurred in both the acquisition and operating cost of storage hardware. Swift has already enabled many companies to radically lower their storage costs with commodity hardware, and the introduction of erasure coding within Swift will enable costs to drop even further.
The development of this feature will proceed with the same open mindset that has guided the OpenStack project from its inception. Just like all projects within OpenStack, Swift has many contributors. The companies who are heavily involved with Swift include SwiftStack, Rackspace, Red Hat, IBM and HP.
For erasure coding specifically, multiple companies are committing effort to this project: Intel, SwiftStack, Box and EVault.
“Intel is excited to support the development of an erasure code solution for OpenStack Swift with the Swift development community. Helping our customers reduce the size of data on disk by up to half versus regular triple replication, helps decrease their costs by more than 50%. Erasure code solutions reduce both hardware requirement costs as well as the power and cooling required to run that hardware,” says Bev Crair, Intel Storage Division GM. “Erasure code is a technology that is long overdue and Intel is pleased to be supporting efforts to promote and use it in cloud environments like OpenStack Swift.”
“EVault is excited to work with SwiftStack and the broader OpenStack Object Storage community to add erasure codes,” says George Hoenig, Vice President, Products & Services at EVault. “Erasure codes, particularly for write-intensive workloads, will enable users to deploy systems using less storage and bandwidth than replicated systems of similar durability.”
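To make the "up to half" figure concrete, here is a minimal sketch of the storage arithmetic. The 6 data + 3 parity fragment scheme is a hypothetical parameter choice for illustration only, not a layout this project has committed to, and the function names are ours.

```python
from fractions import Fraction


def replication_overhead(replicas: int) -> Fraction:
    """Raw bytes stored per byte of user data when keeping full copies."""
    return Fraction(replicas)


def erasure_overhead(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw bytes stored per byte of user data for a k + m erasure code.

    The object is split into k data fragments and m parity fragments of
    equal size, so the footprint is (k + m) / k times the original.
    """
    return Fraction(data_fragments + parity_fragments, data_fragments)


def savings(baseline: Fraction, candidate: Fraction) -> Fraction:
    """Fraction of raw capacity saved by the candidate scheme."""
    return 1 - candidate / baseline


triple = replication_overhead(3)
ec = erasure_overhead(6, 3)  # hypothetical 6 + 3 scheme

print(f"3 replicas:      {float(triple):.2f}x raw storage")
print(f"6+3 erasure code: {float(ec):.2f}x raw storage")
print(f"capacity saved:   {float(savings(triple, ec)):.0%}")
```

Under these assumed parameters the erasure-coded object occupies 1.5 times its size on disk, half of what three full replicas need. Other fragment counts shift the balance between space saved and the number of lost fragments the scheme can tolerate.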
Starting from a production-grade system
By using Swift as a starting point, we stand on the shoulders of the battle-hardened mechanisms that Swift already has.
We are also enlisting some of the thought leaders in information theory and erasure coding who are contributing code and guidance for this project.
The design goal is to be able to have erasure-coded storage plus replicas coexisting in a single Swift cluster. This will allow a choice in how to store data and will allow applications to make the right tradeoffs based on their use case.
There are already proposals and code on the table for this effort, and we will be collaborating on these designs over the coming months to build a solution that best meets the needs of Swift deployers.
Development as a community
We have a big project ahead of us, but we have rallied as a community before and pulled off some big efforts. For example, region support, which allows a cluster to span distant data centers, is now included in the latest version of Swift.
This effort continues to demonstrate the focus of the Swift project: to grow an already great object storage system into new areas where it hasn’t gone before. With continued efforts such as this, Swift is well on its way to ensuring your data can “live long and prosper”.