SwiftStack
Blog

Come Join SwiftStack at the Red Hat Summit 2014 This Week!

We’re super excited to be a part of the buzz at the tenth annual Red Hat Summit this week in San Francisco (Booth #722 & in Red Hat’s Storage Pod #605E). SwiftStack has been working closely with Red Hat, right from the start. Here’s how…

Our Community

Since February 2012, Red Hat has been a strong contributor to Swift, OpenStack’s Object Storage project, with which we are closely aligned. We’re currently working together on defining Storage Policies for OpenStack Swift, and have sponsored a few joint hackathons for Swift over the past year.

And just yesterday, we participated in the Gluster Community-sponsored Red Hat open source hackathon. Good times.

Technology Collaboration

SwiftStack is a Red Hat Ready partner — we’re the standard solution for OpenStack installations providing a simple and scalable Object Storage solution, easily deployed by Red Hat users. SwiftStack is certified on Red Hat Enterprise Linux (RHEL) 4, 5, 6 and soon 7, and we have a large and growing base of RHEL customers.

And… Coming Soon

  • Even further alignment of Swift and Gluster as co-habitable storage solutions at an enterprise level.
  • Continued trunk OpenStack Swift development work beyond storage policy.
  • More joint hackathons to improve the cross functionality of Swift, Red Hat, and Gluster.
  • SwiftStack certification on the latest version of RHEL, RHEL 7.

We hope to see you at the Red Hat Summit 2014! Stop by our booths, check out a demo, and be sure to get your Partner Passport stamped so you’re eligible for the raffle to win an awesome silver Misfit Shine with one of each of the wearable accessories!

When: Monday, April 14 - Wednesday, April 16

Where: Moscone Center, San Francisco, CA

Conference Location: Booth 722 and Storage Pod 605E in the Partner Pavilion

Can’t get to the summit?

For anyone who can’t make it to the conference, SwiftStack holds weekly live demos where you can see SwiftStack in action and get answers to questions from experts. Our next live demo is Wednesday, April 23 at 9AM. Join us to see the latest features in action.

VP Marketing, SwiftStack

What I Learned at the NAB Show…Media and Entertainment Need the Cloud

I learned a lot at the National Association of Broadcasters (NAB) Show this week in Las Vegas. The entertainment and media space is undergoing radical transformations. And from all indications from the customers who are driving this change (shout out to Guillaume and Ramy with DigitalFilm Tree), there are many more transformations underway at a fast and furious pace (pardon the pun).

Houston, We Have A Problem…

Production and distribution are moving to “Internet Protocol” (IP) workflows, which has increased the need for agile software and commodity hardware or components. Here’s why…

Bandwidth is the Bottleneck

Bandwidth is one of the biggest capacity constraints in the production process. The size of raw data is substantially larger than a produced and compressed product (often 20:1), and each pixel must be preserved through the entire production process until the final compressed video format.
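To put rough numbers on that ratio, here is a small back-of-the-envelope sketch. The 1 TB footage size and the 100 Mbit/s link are illustrative assumptions, not figures from any production; only the 20:1 ratio comes from the paragraph above.

```python
def transfer_hours(size_gb: float, link_mbps: float) -> float:
    """Hours needed to move size_gb gigabytes over a link_mbps megabit/s link."""
    size_megabits = size_gb * 1000 * 8  # decimal gigabytes -> megabits
    return size_megabits / link_mbps / 3600


RAW_GB = 1000           # assumed: 1 TB of raw footage
LINK_MBPS = 100         # assumed: 100 Mbit/s link between sites
COMPRESSION_RATIO = 20  # the 20:1 ratio mentioned above

raw_hours = transfer_hours(RAW_GB, LINK_MBPS)
compressed_hours = transfer_hours(RAW_GB / COMPRESSION_RATIO, LINK_MBPS)
print(f"raw: {raw_hours:.1f} h, compressed: {compressed_hours:.1f} h")
```

Under these assumptions the raw footage needs roughly 22 hours on the wire, while the compressed product needs a little over one.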

Data Gravity

A post-production house has specialized equipment (and people) for color correction, editing, sound, etc. The data should be near where they are working. The challenge happens when the director who is doing the shooting isn’t nearby and wants to participate in the editing process. It’s impractical to have both working on the project at the same time because of bandwidth constraints. It’s also impractical to send raw video back and forth whenever changes are made.

Petabyte Productions

Resolution is effectively quadrupling in the near term. The current HD standard is about 700-1000 pixels, compared to the new 4K HD format (4,000 pixels of resolution). These new formats allow more flexibility with the raw data. Frame rates are increasing, as is the ability to record richer color detail. And finally, since video isn’t film, directors tend to “let it roll”, with a “sort it out in post” mentality.

Metadata Collaboration

Finally, one of the biggest issues facing the industry is the proprietary nature of the tools and systems being used throughout the process. Very few applications are compatible with each other. Storing this metadata as part of the file (like what can be done with object storage) is a part of the solution.

Straight to Cloud

Traditional enterprise IT will be skipped over during this transition to internet workflows (content creation, production, distribution). VMware will not be adopted; instead, standard servers and OpenStack will be the next-generation platforms for more and more phases of media production.

DigitalFilm Tree To The Rescue

DigitalFilm Tree (DFT) is a cutting edge post-production house. It is their job to transform raw footage into a final product. This involves an intense amount of collaboration with all involved with the production process. They are pulling it off…

First is their ability to pull down all the raw data and metadata used during shooting. (This is where SwiftStack comes in.) There have been two pains in their business: 1) acquiring the raw data and getting it where it needs to be, and 2) high-fidelity, two-way collaboration.

A multi-region SwiftStack cluster solves the data gravity problem. As data is ingested, it can reside in multiple locations. This allows for fast, low-latency access at each location.

Next is solving the problem of collaboration. For high-fidelity collaboration, each site can run identical software on workstations, as they now have identical sets of these large, raw data files. Then the applications (editing, color correction) can exchange just the small changes that are being made to the raw data (the “project files”). This data can be shared in real time, much in the same way that online multiplayer games transmit coordinates and player actions.

And that, my friends, is my NAB take-away. We’re excited about the future of this burgeoning industry! Oh, and thanks for the shout out, DFT (SwiftStack was plugged on stage)!

Joe Arnold

CEO, SwiftStack

Great Swift Sessions to Vote-up for OpenStack Summit Atlanta 2014

The polls are now open for you and your friends to exercise the power of the vote. The Icehouse release in May ’14 will be the biggest ever in terms of new capabilities for OpenStack Object Storage. So it is only fitting that such a large number of sessions are in the running for inclusion in the OpenStack Summit in Atlanta this May. The number of sessions is proportional to the popularity and buzz of OpenStack Swift.

More than a dozen SwiftStack experts will be in Atlanta, along with several SwiftStack customers. We ask that you do your part to help us stay very busy at the Summit with session presentations. Thank you for your vote and for giving us your 3-Star “Would Love To See This!” rating for the two dozen sessions on Swift:

More OpenStack Swift from the Community

We hope that you are not too tired for some additional voting, as there are eight more great sessions that could use your vote. We said there was a lot of popularity and buzz!

Mario Blandini

VP Marketing, SwiftStack

Extending OpenStack Swift

As application storage needs grow, you quickly run into the limitations of traditional storage. Start with a hard drive; run out of space. Buy a bigger hard drive; run out of space again. Get a set of hard drives and use RAID; run out of space. Use a bigger RAID set, and now you have painfully long RAID rebuild times and you still run into scale limits. You’ve built a storage system that is a set of siloed storage domains, and you’ve got to figure out where to place your data, how to expand the storage capacity, and how to handle hardware failures. These are exactly the problems that OpenStack Swift was designed to solve.

With the growing popularity of the OpenStack project and Swift in particular, the question sometimes comes up whether a different object storage system can be used to implement Swift. This question, however, is based on a fundamental misunderstanding of what Swift does. Swift is not an abstraction layer on top of an object storage system. OpenStack Swift provides clients a unified namespace that abstracts away the underlying storage volumes.

With this simple concept, Swift gives operators seamless capacity management and provides storage for applications that just works, even when hardware fails. More importantly, it gives operators a simple storage service that can be tuned and extended to their exact needs–functionally, financially, and over the lifetime of their applications.

In this blog post, I’ll cover the two main ways Swift can be extended: the DiskFile abstraction and middleware.

Extending Swift with Storage Volumes

Storage volumes are the fundamental unit that Swift uses for data placement and failure handling. In most deployments, storage volumes are mapped one-to-one to hard drives. This makes a lot of sense when managing hardware failure domains, improving performance, and reducing costs. Still, there is a ton of functionality that can be implemented by the storage volume itself.

In the Swift code, objects are represented by a class called DiskFile. Although there are more parts to a volume abstraction than just this class, we’ve taken to calling a particular volume abstraction a DiskFile. This volume abstraction is a fundamental part of storage policies in Swift. Swift’s out-of-the-box DiskFile simply assumes locally-mounted volumes with a standard filesystem on them. This implementation works very well for most use cases, especially when deployers are looking for simple storage at the lowest cost per gigabyte.

More functionality in storage volumes

Although Swift’s default DiskFile implementation works very well, there are some great examples and ideas for using different DiskFile implementations to provide different functionality to both deployers and end-users.

The main advances in the DiskFile abstraction in the last year are the result of Red Hat’s work to use OpenStack Swift as the object storage interface to GlusterFS storage volumes. Instead of reimplementing the Swift API, Red Hat is fully participating in the OpenStack Swift community to ensure that Gluster can take full advantage of the latest Swift code and features. This is absolutely the right way to pair Swift with another storage system: use the existing functionality in Swift and contribute back to community where additional functionality is missing.

Other functionality can be implemented in a Swift storage cluster by using the DiskFile abstraction. Data-at-rest encryption can be provided by simply using encrypted storage volumes, but by making a new EncryptingDiskFile, advanced functionality, such as key management, could be included. Similarly, compression or de-duplication could be implemented by a DiskFile. More advanced functionality could be added, as ZeroVM is doing, by adding compute functionality at the storage location.
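To make the idea of a pluggable volume abstraction concrete, here is a minimal Python sketch. It is not Swift’s actual DiskFile interface: the names `VolumeBackend`, `InMemoryVolume`, and `CompressingVolume` are hypothetical, and the in-memory dict only stands in for a real storage volume. It shows how compression could be layered in at the volume level without the caller noticing.

```python
import zlib
from abc import ABC, abstractmethod


class VolumeBackend(ABC):
    """Hypothetical volume abstraction (not Swift's real DiskFile API)."""

    @abstractmethod
    def write(self, name: str, data: bytes) -> None:
        ...

    @abstractmethod
    def read(self, name: str) -> bytes:
        ...


class InMemoryVolume(VolumeBackend):
    """Simplest possible backend: keeps object bytes in a dict."""

    def __init__(self) -> None:
        self.stored = {}

    def write(self, name: str, data: bytes) -> None:
        self.stored[name] = data

    def read(self, name: str) -> bytes:
        return self.stored[name]


class CompressingVolume(VolumeBackend):
    """Wraps another backend and compresses data transparently."""

    def __init__(self, inner: VolumeBackend) -> None:
        self.inner = inner

    def write(self, name: str, data: bytes) -> None:
        self.inner.write(name, zlib.compress(data))

    def read(self, name: str) -> bytes:
        return zlib.decompress(self.inner.read(name))


raw = InMemoryVolume()
volume = CompressingVolume(raw)
payload = b"backup " * 1000
volume.write("backups/monday.tar", payload)
restored = volume.read("backups/monday.tar")
```

The caller reads back exactly what it wrote, while the underlying volume holds fewer bytes. An encrypting variant could follow the same wrapper pattern, with key management living inside the wrapper.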

Better knowledge sharing between Swift and storage volumes

Additional functionality is not the only benefit possible with DiskFiles. When applications and the storage volume can share information, tremendous efficiency can be built into the system as a whole. For example, if the storage volume knows the logical structure of the data being written, then the volume itself can help in data integrity and efficient storage. This is demonstrated in Seagate’s Kinetic drive platform. By implementing a key/value API, the drive itself knows how data will be accessed and can make efficient decisions on how the data is written to disk. Also, since the drive now knows the logical keys associated with a piece of data, the drive itself can report to Swift actual objects that are in danger of being lost due to hardware issues.

No matter your preferred hardware, the DiskFile abstraction within Swift can be used to provide more efficiency in the system. Today, the default DiskFile simply assumes a storage volume is a POSIX-compliant filesystem that supports extended attributes. While we recommend XFS as a deployment choice, there is little in Swift that requires that particular filesystem. It is possible to write volume interfaces that are for a particular filesystem and thus use low-level semantics to improve Swift’s mean time to error detection and mean time to error recovery. One can also imagine DiskFile implementations that take advantage of the particular read and write characteristics of SSDs.

Extending Swift with Middleware

But DiskFile abstractions aren’t the only way to extend Swift’s functionality. Swift supports middleware that can intercept and modify requests and responses to the server. Many of Swift’s core features are implemented using middleware, including large objects, auth integration, and static website hosting.

Middleware is one of the best ways to add functionality to Swift. The OpenStack community has written middleware to add an S3 API translation layer, integrate with the OpenStack Ceilometer project, add third-party search, and provide CDN integration. Others have created middleware to automatically generate thumbnails when an image is requested, generate profiling metrics, and facilitate transparent data migration from other storage systems.

If new extensions to the API need to be made, Swift’s middleware capability is the best place to start.
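As a rough illustration of the pattern, here is a minimal WSGI middleware sketch in Python. Swift’s servers are WSGI applications, so middleware takes the usual wrapper shape; the class name `HeaderMiddleware`, the `X-Example-Middleware` header, and the `header_value` option are all made up for this example, and the `filter_factory` function follows the general Paste Deploy convention rather than any specific Swift middleware.

```python
class HeaderMiddleware:
    """Hypothetical WSGI middleware that adds one header to every response."""

    def __init__(self, app, header_value="example"):
        self.app = app
        self.header_value = header_value

    def __call__(self, environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Append our header, then hand off to the real start_response.
            headers = list(headers) + [("X-Example-Middleware", self.header_value)]
            return start_response(status, headers, exc_info)

        return self.app(environ, custom_start_response)


def filter_factory(global_conf, **local_conf):
    """Paste Deploy-style factory: returns a callable that wraps an app."""
    value = local_conf.get("header_value", "example")

    def factory(app):
        return HeaderMiddleware(app, value)

    return factory


def demo_app(environ, start_response):
    """Stand-in for the wrapped server; always returns a tiny 200 response."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


captured = {}


def record_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers


wrapped = filter_factory({}, header_value="demo")(demo_app)
environ = {"REQUEST_METHOD": "GET", "PATH_INFO": "/v1/AUTH_test/c/o"}
body = b"".join(wrapped(environ, record_start_response))
```

The same shape works for intercepting requests: inspect or rewrite `environ` before calling the wrapped app, or return a response directly without calling it at all.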

OpenStack Swift, the Extensible Object Storage System

OpenStack Swift powers some of the world’s largest storage clouds, and its flexible, modular design allows new functionality to be added easily. Volume abstractions and middleware both give deployers and storage vendors the ability to integrate with Swift and work in the community to build a storage system used by everyone, every day.

John Dickinson

Director of Technology, SwiftStack
OpenStack Swift Project Technical Lead

Coming Soon – Storage Policies in OpenStack Swift

Recently a few members of the Swift community gathered in Oakland to talk about the ongoing storage policy and erasure code (EC) work. We had a good time and made some good progress. I want to take this opportunity to give the community an update on storage policies and erasure code work in Swift.

What are we working on?

OpenStack Swift provides very high durability, availability, and concurrency across the entire data set. These benefits are perfect for modern web and mobile use cases. But one other place where Swift is also commonly used is for backups. Backups are typically large, compressed objects and they are infrequently read once they have been written to the storage system. (At least you hope they are infrequently read!) Although Swift can be used for very cost-effective storage today, technology like erasure codes enables even lower storage costs for those users looking to store larger objects.
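The cost difference comes down to simple arithmetic on raw capacity. The sketch below compares triple replication with an erasure code that splits each object into data fragments plus parity fragments. The 10+4 scheme and the 100 TB data set are illustrative assumptions, not parameters chosen by the Swift community.

```python
def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per logical byte with full replicas."""
    return float(replicas)


def erasure_code_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw bytes stored per logical byte with a data+parity erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


LOGICAL_TB = 100  # assumed size of the backup data set

triple_replica_tb = LOGICAL_TB * replication_overhead(3)
ec_10_4_tb = LOGICAL_TB * erasure_code_overhead(10, 4)
print(f"3 replicas: {triple_replica_tb:.0f} TB raw, 10+4 EC: {ec_10_4_tb:.0f} TB raw")
```

With these numbers, three replicas need 300 TB of raw capacity while the 10+4 code needs 140 TB. The trade-off is extra computation and more fragments to gather on reads, which is why large, rarely read objects such as backups are the natural fit.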

When the Swift community started this erasure code work, SwiftStack blogged about it, and I recently presented on this topic at Linux Conf Australia.

In order to build support for erasure codes into Swift, we realized that we needed a way to support general storage policies in a single, logical Swift cluster. A storage policy allows deployers and users to choose three things: what hardware data is on, how the data is stored across that hardware, and how Swift actually talks to the storage volume.

Let’s take each of those three parts in turn.

First, given the global set of hardware available in a single Swift cluster, choose which subset of hardware on which to store data. This can be done by geography (e.g. US-East vs EU vs APAC vs global) or by hardware properties (e.g. SATA vs SSDs). And obviously, the combination can give a lot of flexibility.

Second, given the subset of hardware being used to store the data, choose how to encode the data across that set of hardware. For example, perhaps you have 2-replica, 3-replica, or erasure code policies. Combining this with the hardware possibilities, you get e.g. US-East reduced redundancy, global triple replicas, and EU erasure coded.

Third, given the subset of hardware and how to store the data across that hardware, control how Swift talks to a particular storage volume. This may be optimized local file systems. This may be Gluster volumes. This may be non-POSIX volumes like Seagate’s new Kinetic drives. This may even be a volume driver paired with additional functionality, as ZeroVM is doing.
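One way to picture the three choices is as a small record per policy. The sketch below is only a mental model in Python: `StoragePolicy`, its field names, the policy names, and the `policy_for` helper are hypothetical, and this is not how Swift actually configures or stores policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical model of a storage policy's three choices."""
    name: str
    hardware: str       # which subset of hardware (region or device class)
    encoding: str       # how data is stored across that hardware
    volume_driver: str  # how Swift talks to each storage volume


POLICIES = {
    policy.name: policy
    for policy in [
        StoragePolicy("us-east-reduced", "US-East", "2-replica", "local-filesystem"),
        StoragePolicy("global-triple", "global", "3-replica", "local-filesystem"),
        StoragePolicy("eu-erasure-coded", "EU", "erasure-code", "local-filesystem"),
    ]
}


def policy_for(name: str, default: str = "global-triple") -> StoragePolicy:
    """Look up a policy by name, falling back to a default policy."""
    return POLICIES.get(name, POLICIES[default])


chosen = policy_for("eu-erasure-coded")
fallback = policy_for("does-not-exist")
```

The three example policies mirror the combinations named above: US-East reduced redundancy, global triple replicas, and EU erasure coded.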

As a community, we’ve been working on storage policies (with a goal of supporting erasure codes as an option for deployment) for many months. SwiftStack, Intel, Box, and Red Hat have all participated, and in order to accelerate the work, we met up in Oakland for a couple of days of hacking and design discussion.

How’s progress?

So how’s progress going on this set of work? I’m glad you asked.

First, we’ve already released the cleaned-up DiskFile abstraction in Swift 1.11. This allows deployers and vendors to implement custom functionality, and we’ve already seen this in use with GlusterFS and Seagate’s Kinetic platform. Work is underway on providing a similar abstraction for Swift’s account and container on-disk representation.

Second, Kevin (at Box) and Tushar (at Intel) have been working on PyECLib and ensuring that all necessary functionality and interfaces are there to support any erasure code algorithm that is desired. This library provides the standard interface for EC data in Swift. Intel has also released their own library to help accelerate erasure code operations on Intel hardware.

Finally, we’re nearly done with getting full multi-ring support into mainline Swift. We’ve been doing all of the multi-ring work on a feature branch in the Swift source repo. You can take this code today and run a test Swift cluster with multiple, replicated storage policies. We’ve got one last component to include in the multi-ring feature before it can be merged into mainline and used in production, but expect to see rapid development on this soon.

We’ve been using a couple of different tools to track the storage policy and erasure code work in Swift. First, our primary task tracker has been a Trello board we set up for this feature. We also have several high-level Launchpad blueprints to track this in the wider OpenStack release process.

What’s next?

Onward and upward, of course. The first task is to get the multi-ring feature into Swift. This will allow deployers to create Swift clusters with multiple replication policies, and in and of itself will enable many new use cases for Swift. We’re targeting OpenStack’s Icehouse release for this feature. When the multi-ring support is done, we’ll be able to add support for erasure code policies into Swift clusters. I’m expecting to see production-ready erasure code support in Swift a few months after the OpenStack Juno summit.

My vision for Swift is that everyone will use it, every day. The new use cases enabled by storage policies and erasure codes help us fulfill that vision. I’m excited by what’s coming. If you’d like to get involved in this work, we’d love to have your help. We’re in #openstack-swift on Freenode IRC every day. Stop by and get involved!

John Dickinson

Director of Technology, SwiftStack
OpenStack Swift Project Technical Lead

ZeroVM Design Summit-Swift Integration

Yesterday I had the pleasure of attending the first ZeroVM design summit in San Antonio. The ZeroVM team (recently acquired by Rackspace) is working on building a multi-tenant, distributed compute system built on top of OpenStack Swift. It’s a very interesting project, and I’ll be continuing to keep track of their work.

Yesterday was the second day of their design summit, and we spent a majority of the day focusing on Swift integration work. A couple of key feature additions to Swift were discussed.

First, we spent a lot of time discussing adding object append support into Swift. It’s possible to append to objects today by using large object manifests, but when using that method, all of the object’s constituent parts get stored throughout the cluster. However, if Swift supported adding to an existing object by updating the on-disk data, then a process executor can more efficiently work with the resulting object.

Adding object append support to Swift seems simple at first glance, but it gets rather complicated when considering different edge cases. We had an active discussion on the best way forward. The two basic options are to either add some sort of consistency semantics to this part of Swift or to allow appends to be added out-of-order. Both methods have strengths and weaknesses, and I’m looking forward to seeing what the end design proposal will be. I’m currently supportive of the out-of-order append resolution strategy.

The other major feature discussion was to add triggers or webhooks to Swift. This feature would allow a Swift client to set an endpoint and a filter on a particular container and have that endpoint called when filtered actions happen. For example, “send a request to foo when any object is deleted from container A” or “send a request to bar when any object with content-type text/plain is added to container B”.
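Since this feature was only a proposal at the summit, the following is purely a sketch of how the filter matching might look, not a design anyone agreed on. The `Trigger` class, its fields, the use of HTTP method names for actions, and the example endpoint URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Trigger:
    """Hypothetical container trigger: call endpoint when the filter matches."""
    container: str
    action: str                         # e.g. "PUT" or "DELETE"
    endpoint: str
    content_type: Optional[str] = None  # None matches any content type

    def matches(self, container: str, action: str,
                content_type: Optional[str] = None) -> bool:
        if container != self.container or action != self.action:
            return False
        return self.content_type is None or self.content_type == content_type


def endpoints_to_notify(triggers: List[Trigger], container: str, action: str,
                        content_type: Optional[str] = None) -> List[str]:
    """Return the endpoints whose filters match this object event."""
    return [t.endpoint for t in triggers
            if t.matches(container, action, content_type)]


TRIGGERS = [
    # "send a request to foo when any object is deleted from container A"
    Trigger("A", "DELETE", "http://foo.example/hook"),
    # "send a request to bar when any text/plain object is added to container B"
    Trigger("B", "PUT", "http://bar.example/hook", content_type="text/plain"),
]

deleted_from_a = endpoints_to_notify(TRIGGERS, "A", "DELETE")
text_added_to_b = endpoints_to_notify(TRIGGERS, "B", "PUT", "text/plain")
image_added_to_b = endpoints_to_notify(TRIGGERS, "B", "PUT", "image/png")
```

Actually delivering the request, retrying on failure, and authenticating to the endpoint are the harder parts of such a feature and are left out here.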

I’m excited about the idea of adding hooks into Swift. It will add a great piece of functionality that will enable new use cases that are difficult to support today.

Overall, we had some great discussions yesterday at the design summit. ZeroVM has some great potential, but I’ll be really excited when we see some demonstrated production use cases with it. In early 2015, I think it would be great to see some public examples of ZeroVM in use at scale inside of Rackspace.

I hope the discussions we had yesterday continue within the larger Swift community. These big features, along with a couple of smaller ones discussed, allow Swift to support some very interesting new use cases. The future is bright.

John Dickinson

Director of Technology, SwiftStack
OpenStack Swift Project Technical Lead

OpenStack Swift at LCA 2014

A couple of weeks ago I had the privilege of attending Linux Conf Australia for the second time. This year, it was held on the campus of the University of Western Australia in Perth. Like always, the conference content was great. Knowledgeable presenters speaking on the topics of their expertise, great keynote presentations each morning, and a casual atmosphere make LCA one of my favorite tech conferences each year.

I had the opportunity to present twice this year. My first talk of the week was during the OpenStack miniconf on Tuesday, and I spoke on the ongoing storage policy and erasure code work in the Swift community. My second talk was on Thursday during the main conference track. That talk was on Swift’s global clusters feature: why we built it, how it works, and a little demo of the pieces working together.

Storage Policies

One of our top priorities in the Swift community right now is implementing storage policies. This feature gives operators and users enormous flexibility in how to match storage usage with storage needs. We’re specifically working on storage policies with the goal of supporting erasure coded content for more efficient storage of large content. In my talk, I cover the goal for the work and give a little demo of two different storage policies in the same Swift cluster.

One note about the talk: I reference “conversations earlier in the day” and take a little detour to talk about Swift’s extensibility. Swift is a highly-extensible storage system, and I’ll be writing a more detailed blog post soon on this topic. Several earlier talks that day claimed a lack of extensibility in the system, so I took some time to specifically address those issues in my talk.

Global Clusters

On Thursday I presented on Swift’s global clusters feature, covering its use cases and real-world deployments, with plenty of Q&A at the end.

Can’t get to a Conference?

For anyone who can’t make it to conferences, SwiftStack holds weekly group live demos where you can see Swift and SwiftStack in action and get answers to questions from experts. Our next live demo is Tuesday, Jan 21 at 9AM. Join us to see the latest features like Global Clusters in action.

John Dickinson

Director of Technology, SwiftStack
OpenStack Swift Project Technical Lead

The Remarkable Growth of Swift - and What It Means in 2014 (and Beyond)

In 2013, we had the privilege of working with hundreds of architects, operators, and deployers of private cloud storage infrastructure. Understanding their business requirements - and how they plan to evolve their infrastructure - is critical to what we do here at SwiftStack. This past year we observed several trends worth noting: (1) a transition to object based storage from traditional file system technologies to support new application initiatives, (2) a strong desire to shift to open source software away from proprietary and vendor-specific technologies, (3) a preference for using technologies that best address a specific use-case, and (4) the adoption of OpenStack Swift as the standard for object storage.

In this post, I’d like to describe these observations in a bit more detail - and discuss what the trends mean for the storage market in 2014 and beyond.

The Transition to Object Storage

In 2013, adoption of public object storage services continued at an ever-accelerating rate. In April, Amazon announced that 1 trillion objects had been uploaded to S3 in the prior 10 months. For a bit of context, it took 6 years for the first 1 trillion objects to be uploaded. By some estimates, S3 generated several hundred million dollars of revenue for Amazon in 2013 - and grew over 100% annually. HP, IBM, Google, Microsoft, Rackspace, and Oracle also grew or launched their public object storage offerings.

Much of this amazing growth is fueled by the numerous Web, mobile, and software as a service (SaaS) applications that depend on object storage. Some of the most popular applications on the Web are powered by object storage services that store and serve data to millions of users every day. SwiftStack users like Disney Interactive and Concur are among the enterprises that use Swift as the storage engine for their high-growth Web and SaaS applications. Wikipedia.org, the 6th most popular site on the Web, uses Swift to store data.

Companies like MercadoLibre, the e-commerce leader in Latin America, are storing more than 1.4 billion images on Swift as part of their most critical business application - their marketplace. For these applications, the performance, scale, and durability of object storage are essential. Traditional file system technologies cannot provide the dramatic increase of scale that these applications require. Arrays do not meet the price per capacity needs either. Object storage is quite simply the natural storage platform for applications being built today.

“Implementing Swift in MercadoLibre was really easy for us and our developers, mainly because the use of APIs to manage the solution make it developer friendly and provide cost savings, and also because of the big community behind it that gives us support, where we can also contribute with our user experience. We have been using OpenStack for three years and really trust the community involved in this project. We are always ready to contribute,” said Maximiliano Venesio, Technical Leader from MercadoLibre.com

The Swift solution delivers low latency, enabling the team to display its billions of stored pictures much more rapidly.

“We are an e-commerce platform and we sell our products through pictures so being able to quickly and reliably display pictures is crucial to our business. With Swift, we are able to display pictures 10x faster than before—allowing us to serve our customers better than ever before,” said Alejandro Comisario, Manager from the MercadoLibre CloudBuilders team

“OpenStack Swift in particular has gained a lot of traction both in the enterprise and in the service provider space. All these trends support IDC’s assertion that 2013 marked a turning point for the OBS (object-based storage) market. IDC estimates that in 2013, OBS solutions accounted for nearly 37% share of the file-and-OBS (FOBS) market in revenue and is forecast to be a $21.7 billion market in 2017.” - IDC, October 2013

Companies looking to store vast amounts of archival data are also benefiting from the cost advantages of object storage. EVault, Seagate’s cloud backup division, took the wraps off their Glacier-like storage service, LTS2, that is based on Swift. With LTS2, EVault is also offering a Swift-based storage service at Glacier-like prices, which is made possible by Swift’s highly flexible architecture. All of the major tape vendors are now also promoting object-based technologies to meet the needs of increased scale and lower the cost of long-term archives.

IDC has recognized the effect that OpenStack Swift is having on the file and object storage market.

OpenStack Swift is re-shaping the object storage landscape in a pretty fundamental way - and is a major force in a market that IDC forecasts to grow to more than $21 billion within the next three years.

Why Being Open Matters

OpenStack Swift is open source software. To understand why this matters for storage architects and operators, let’s look back at what happened to the rest of the infrastructure stack. Once dismissed by their proprietary competitors as “immature,” open source operating systems, middleware, application frameworks, and databases are now standards in enterprise and Web infrastructure. In fact, the open source model has so completely and fundamentally transformed the infrastructure tier in the data center that not many proprietary infrastructure platform technologies have a sustainable advantage any longer.

That same transformation is now changing the storage tier, one organization at a time. Vendors of proprietary storage technology will claim that open source cannot produce enterprise-grade storage solutions. We have all seen this movie before - when done the right way (open code, open design, open community), open source platforms can and do win. This change is led by storage architects, operators, and deployers at enterprises, Web companies and service providers alike. Today, using open source platforms is an irreversible change in their enterprise architectures.

Using an open source storage engine does not preclude these organizations from working with commercial vendors in the OpenStack ecosystem. In fact, most organizations adopting Swift use complementary products, such as the SwiftStack Controller, which makes it easy to scale and manage their Swift clusters. Jonathan Bryce, Executive Director of the OpenStack Foundation, explained it this way at the last OpenStack Summit:

“Cutting edge technology from an open source project is being used by some of the biggest companies in the world. How do they get a comfort level deploying this technology? Partners and support from the community, like SwiftStack and others in the ecosystem that have helped Enterprises meet their business needs.”

All the Wood Behind one Arrow

To succeed as a project or as a company, focusing on one thing and doing that thing really well is generally a good strategy. In highly competitive markets, it is often the only strategy that can lead to success. Conversely, there is a long list of defunct companies that attempted to do too many things, too soon. David Packard, co-founder of HP, famously said that more companies die of indigestion than starvation. The same can be true for both products and open source projects.

The message we heard from storage architects and operators in 2013 was that they prefer the best technologies to meet specific requirements and use cases. All-in-one solutions, like a printer/copier/fax/scanner, are by design not optimized to deliver the best results for each use case. Swift has this focus: it was designed from the ground up to do one thing really well, object storage. Swift is not block storage, so it is not for running VMs or databases. It is also not a file system and does not try to emulate those interfaces. Rather, Swift was designed specifically for unstructured data, the type of data that is growing the fastest with today’s applications. A big reason it serves unstructured data so well is this specialization, together with an eventually consistent design that supports massive scalability and geographic distribution of data.

What makes this focus even more powerful for Swift is the immense support behind it. Over 145 developers have contributed to the OpenStack Swift codebase. The Havana release of Swift, which included support for global clusters, had top contributors (by patch count) from 6 different companies: SwiftStack, Rackspace, Red Hat, eNovance, IBM, and United Stack. In the upcoming Icehouse release, Intel, Red Hat, and SwiftStack are all working on storage policies for Swift. Other contributors like Rackspace are making major improvements to replication. Intel, Box, and SwiftStack are working on erasure codes. In short, the who’s-who of technology that has formed around Swift gives the project strong and deep technical backing, i.e. a lot of wood behind one arrow. This ecosystem will also be the reason why Swift will continue to succeed. IDC, in their October 2013 Vendor Assessment of the Worldwide Object-Based Storage Market, agrees:

“In all likelihood, the only survivors in this market may be vendors with robust partner ecosystems and/or vendors with commercial variants of open source platforms.”

With this deep and active community, the Swift project has great development velocity as well as real diversity. Swift is much larger than what any single storage company could build. While storage architects and operators acknowledge the value of open source, not all open source projects are equal. Any open source community where more than 90% of the contributions come from one entity is not much of a community at all. The Swift community, however, gains momentum and strength from its many contributors, distributed across many companies and countries. It’s this community, in large part, that is driving enterprises to adopt Swift for their production object storage needs.

Swift, the Object Storage Standard

While Amazon, Google, and Microsoft run very large storage clouds, each using its own proprietary technologies, other big names at the top of the list, such as HP Cloud, IBM SoftLayer, Internap, and Rackspace, all run Swift. Rackspace has more than 85PB of raw disk deployed in 6 data centers, with sustained throughput to a single cluster of 60Gbit/sec. In fact, Swift’s adoption is so robust that by several important measures it already is the object storage technology of choice for large clouds. Deployments like these show that Swift is mature, proven in production, and already runs some of the world’s largest storage clouds.

For companies that need to operate private object storage services in their own data centers, whether for control, cost, or performance reasons, choices have been limited. Single-vendor object storage technologies have not been a viable option, as they were designed for a different use case and locked customers into a closed-source storage platform. With Swift, storage operators have access to a production-ready object storage platform designed for standard hardware to store and manage massive amounts of unstructured data at scale. Recent features like global clusters and upcoming features such as storage policies and erasure codes will enable a new set of use cases and further transform the object storage landscape.

Swift’s momentum is also part of the reason it is the ideal choice as an engine for software-defined storage solutions. Swift has been used successfully in production by large enterprises, and the Swift API is increasingly becoming the API of choice. Naturally, the biggest cloud operators have APIs that developers use so applications can write to those storage clouds. EMC, Oracle, and Red Hat are a few of the largest vendors who announced support for the Swift API in 2013. This allows their users to integrate with public clouds that run Swift and to take advantage of Swift’s growing popularity in private cloud deployments.

The adoption of the Swift API as an industry standard has been noted by industry analyst firm ESG:

“With current and expected unstructured data growth rates, organizations developing new applications want and need to leverage the benefits of object storage scalability and manageability. But they need standards-based object interfaces so their investment in development is protected,” said Terri McClure, Senior Analyst, Enterprise Strategy Group.

“Storage vendors recognize this fact and are responding. The Swift API has been adopted by or is on the roadmap for most vendors as an object storage interface, enabling those vendors to serve public, private, and hybrid cloud deployments without concerns about users making proprietary API investments.”

What this means in 2014

With continued growth of Swift clusters already in production, compounded by the rapid adoption of Swift for new deployments, 2014 will further cement Swift as the standard for object storage. The technology will continue to evolve as well, driven by its exceptional developer community and focus. Industry-changing features like storage policies and erasure codes will enable many more use cases and accelerate the adoption of Swift.

At SwiftStack, we’ve also been developing the management layer for Swift, including features like rolling upgrades, LDAP integration, and management of geographic clusters. Coupled with Swift’s large and growing ecosystem, these advances enable a new wave of innovation in the application and infrastructure tier by enterprises and service providers.

Users prefer technologies that best meet their needs, and we based our product here at SwiftStack on Swift for the same reasons. Swift is the engine that already runs the world’s largest storage clouds, its API has been adopted as standard by vendors both established and emerging, and it is the open source storage technology with the most vibrant contributor community. When deciding on the best object storage technology for your cloud in 2014, consider the proven, highly supported and standard choice: Swift.

Anders Tjernlund

COO, SwiftStack

Ushering in the New Year With Predictions

After a great holiday season, most of us are back at work, refreshed and eager to work toward ambitious goals for the new year. SwiftStack’s views on 2014 can be found in a few outlets that published different articles by CEO and co-founder Joe Arnold.

Companies will compete with open cloud technologies

Public cloud storage offerings have disrupted both the consumption model and the cost equation for IT organizations, which must now compete with the big operators to serve their internal customers. The result will be increased focus on public and private cloud storage, increased adoption of open source, and more companies building their clouds with the best parts for them.

Read the entire Virtual Strategy Magazine executive viewpoint 2014 prediction.

Cloud storage technologies become even more open in 2014

To say more business data will go to the cloud is not so much a prediction as it is a certainty. We will, though, see new open options feeding that growth of cloud storage in businesses. The result will be the adoption of the Swift API for object storage, new abstraction with storage policies, and use of mainstream storage with standard server hardware.

Read the entire VMBlog 2014 prediction.

SwiftStack co-founder Anders Tjernlund goes further into predictions for 2014 by describing observations on Swift and the storage marketplace from the past year in detail. Stay tuned for that blog post coming next week.

Mario Blandini

VP Marketing, SwiftStack

How to Upgrade an OpenStack Swift Cluster With No Downtime

OpenStack Swift deployers can upgrade from one version of Swift to the next with zero downtime for end users. This has been supported since the initial release of OpenStack Swift back in 2010.

An HA Swift Cluster

Swift has a modular design that allows you to match your cluster exactly to your use case. Client requests go through Swift’s proxy servers to Swift’s storage nodes. The proxy abstraction means that the storage nodes are naturally HA in Swift—the proxy server detects and automatically works around storage node failure.

This leaves the proxy nodes. For this post, I’m assuming the Swift proxy nodes are behind a load balancer that is checking the /healthcheck Swift endpoint.

Time to Upgrade!

So it’s time to upgrade your production Swift cluster. You’ve got users actively connected to it, and you can’t have any downtime. What’s the process? There are three easy steps to upgrade any Swift cluster.

Step 0: Take a Look at the Swift Release Notes

Every Swift release includes curated release notes in the CHANGELOG file. This file includes major changes, including explicit references to any change in default config options or addition of new functionality that will affect existing clusters.

Before starting your upgrade, be sure to look at the CHANGELOG to see if there are any changes that may affect your upgrade process. Although very rare, we do sometimes need to add or change things that affect existing clusters. However, if new things are added, sane defaults are chosen. If existing defaults change, we provide a migration path.

Step 1: Upgrade a single Storage Node

First, upgrade a single storage node as a canary node. Upgrade one server, monitor it for problems, and then move on if everything is OK.

To upgrade a Swift storage node, perform the following steps:

  • Stop all background Swift jobs with swift-init rest stop
  • Shut down all Swift storage processes with swift-init {account|container|object} shutdown. This will do a graceful stop, allowing current requests to complete.
  • Upgrade all system packages and new Swift code
  • Update the Swift configs with any needed changes
  • If necessary (e.g., for kernel upgrades), reboot the server
  • Start the storage services with swift-init {account|container|object} start
  • Start the background processes with swift-init rest start
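
As a rough sketch of the sequence above, the Python below only assembles the commands in order so they can be reviewed or handed to whatever tooling you already use. The swift-init invocations are the ones listed above; the function name, the install_cmd parameter, and the example package command are illustrative assumptions and not part of Swift.

```python
STORAGE_SERVICES = ("account", "container", "object")


def storage_node_upgrade_plan(install_cmd, reboot=False):
    """Return (description, command) pairs for upgrading one storage node.

    install_cmd is a placeholder for however you install new packages and
    Swift code on your distribution; Swift itself does not define it.
    A command of None marks a step that has no single command.
    """
    steps = [("stop background jobs", "swift-init rest stop")]
    # Graceful stop: requests already in flight are allowed to complete.
    steps += [(f"shut down {svc} server", f"swift-init {svc} shutdown")
              for svc in STORAGE_SERVICES]
    steps.append(("upgrade system packages and Swift code", install_cmd))
    steps.append(("update Swift configs with any needed changes", None))
    if reboot:
        steps.append(("reboot, e.g. for a kernel upgrade", "reboot"))
    steps += [(f"start {svc} server", f"swift-init {svc} start")
              for svc in STORAGE_SERVICES]
    steps.append(("start background jobs", "swift-init rest start"))
    return steps


# Example: print the plan for a node that uses yum (an assumed package manager).
for description, command in storage_node_upgrade_plan("yum -y update"):
    print(f"{description}: {command or '(manual step)'}")
```

Keeping the plan as data rather than executing it directly makes it easy to review the order before touching a production node.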

After you’ve performed these steps, monitor the Swift logs for any errors or other anomalous behavior. If everything looks OK, let’s move on!

Step 2: Upgrade all of the other Storage Nodes

Once you’ve upgraded one storage node successfully, you’re ready to upgrade all of the other ones. Going zone-by-zone, upgrade the storage nodes by performing the same tasks as above. Doing one zone at a time will allow you to take advantage of Swift’s ability to work around an entire zone of data disappearing during the upgrade. Since Swift places data across all of your zones, this means that you’ll still have both high availability and high durability for your data during this process.

If you have a smaller Swift cluster with just one zone, then you can still upgrade seamlessly. Go server-by-server instead of zone-by-zone.
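
One way to picture the ordering described in Steps 1 and 2 is the small Python sketch below. It assumes you can supply a mapping of hostname to zone (the hostnames and the function name are hypothetical) and returns batches: the canary alone, then one batch per zone, or one server per batch when the cluster has a single zone.

```python
from collections import defaultdict


def upgrade_batches(node_zones, canary):
    """Order storage nodes for a rolling upgrade.

    node_zones maps hostname -> zone. The canary is upgraded first and on
    its own; the remaining nodes follow zone by zone, or server by server
    when the whole cluster lives in one zone.
    """
    if canary not in node_zones:
        raise ValueError(f"unknown canary node: {canary}")
    by_zone = defaultdict(list)
    for host in sorted(node_zones):
        if host != canary:
            by_zone[node_zones[host]].append(host)
    batches = [[canary]]
    if len(set(node_zones.values())) == 1:
        # Single-zone cluster: one server at a time.
        batches += [[host] for host in by_zone[node_zones[canary]]]
    else:
        # Multi-zone cluster: one whole zone at a time.
        batches += [by_zone[zone] for zone in sorted(by_zone)]
    return batches


cluster = {"stor1": 1, "stor2": 1, "stor3": 2, "stor4": 2, "stor5": 3}
for batch in upgrade_batches(cluster, canary="stor1"):
    print(batch)
```

Each batch should be finished and checked in the logs before the next one starts, so that no more than one zone is ever out of service at a time.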

Step 3: Upgrade your Proxy Servers

The Swift proxy servers support a /healthcheck endpoint. By monitoring this endpoint, a load balancer can know when a proxy is available and automatically add and remove it from the load balancer pool.

One nice feature of the /healthcheck endpoint is that a server admin can drop a file onto the local drive that will cause the /healthcheck endpoint to return with a 503 response code. You can find documentation of how to configure this feature in the sample proxy config file provided with Swift.
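
To make the load balancer’s view concrete, here is a minimal Python sketch of the kind of check it performs. Real load balancers do this natively; the function names are hypothetical, and treating only a 200 response as healthy is an assumption about how you would configure the check.

```python
import urllib.error
import urllib.request


def healthcheck_status(base_url, timeout=2.0):
    """Return the HTTP status of GET <base_url>/healthcheck, or None if unreachable."""
    url = base_url.rstrip("/") + "/healthcheck"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Error responses, such as the 503 sent while the disable file exists.
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure: the proxy is not serving.
        return None


def should_be_in_pool(status):
    """Assumed policy: keep a proxy in rotation only when it answers 200."""
    return status == 200
```

With this policy, a proxy that is shut down (no response) and a proxy that is running with the disable file present (503) are both kept out of the pool.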

Like the storage nodes, first upgrade one proxy server, monitor it, and then upgrade the rest. Here are the steps to upgrade a proxy server.

  • Shut down the proxy server with swift-init proxy shutdown. This will gracefully stop the process so that existing connections can finish.
  • Create the disable_path file to cause the /healthcheck endpoint to return errors to the load balancer. This will cause the load balancer to remove this proxy server from the load balancer pool and prevent new client requests from going to it.
  • Upgrade any system packages and Swift code
  • Update the proxy configs with any needed changes
  • If necessary, reboot the server
  • Start the proxy with swift-init proxy start
  • Remove the disable_path file so that the load balancer can add the proxy back into the pool.
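
The proxy steps can be sketched in Python as follows. This is only an illustration under stated assumptions: run is a caller-supplied callable that executes one shell command and raises on failure, disable_path must match the path set in your proxy config, and install_cmd stands in for your own package upgrade command. Config updates and an optional reboot are left out, since a reboot cannot be driven from a script running on the same server.

```python
from pathlib import Path


def upgrade_proxy(run, disable_path, install_cmd):
    """Upgrade one proxy server while keeping it out of the load balancer pool.

    If any command raises, the disable file is left in place on purpose, so
    a half-upgraded proxy does not rejoin the pool.
    """
    run("swift-init proxy shutdown")   # graceful: existing connections finish
    marker = Path(disable_path)
    marker.touch()                     # /healthcheck returns errors once the proxy is back up
    run(install_cmd)                   # system packages and Swift code
    run("swift-init proxy start")
    marker.unlink()                    # the load balancer can add the proxy back
```

A simple run could wrap subprocess.run with check=True. Note that if the disable file lives on a filesystem that is cleared at boot, a reboot would remove it early, so choose the location with that in mind.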

Enjoy!

And that’s it! With those three steps, you can upgrade your existing, production Swift cluster with zero downtime for your end users.

Taking it a step further

The steps described above are available in the open-source Swift codebase. SwiftStack has automated this entire process down to a single “Upgrade” button-click for your entire cluster. Check out the SwiftStack documentation for rolling OpenStack Swift upgrades or watch Joe’s video below:

John Dickinson

Director of Technology, SwiftStack
OpenStack Swift Project Technical Lead

© 2014 SwiftStack Inc.        San Francisco, CA         contact@swiftstack.com