Havana Design Summit: Extending ACLs and Metadata
Every year the OpenStack conference gets bigger. Every conference is at a bigger venue with more people and more things going on. But for OpenStack developers, hashing out the plan for the next set of work is the most fun part of the collaborative environment at the conference. And I’m having a blast.
This year the conference opened with a full day of Swift design sessions on Monday. It was great to get down to brass tacks with Operators and Deployers using Swift as well as a good number of other Active Technical Contributors. There is a TON of focus right now around some specific core topics, but then on Tuesday Swift almost overran the unconference sessions. With so many people at the conference using Swift there was just too much to fit into one packed room for one day. The unconference sessions tend to be where a bunch of smaller ideas can come together. And some of these ideas can have a big impact.
In particular David Hadas (another ATC for Swift who’s been contributing since last year and is currently working at IBM) led a session at the end of the day on extending ACLs and Metadata.
He introduced a couple of simple ideas to incrementally enhance Swift in small ways.
Swift supports metadata at every level: on objects, containers, and even directly on the account. Users can set metadata through the Swift API. Middleware and other core features also set metadata, retrieve it later, and change the behavior of the API when acting on the entity based on that metadata. A straightforward example is the X-Delete-At metadata that is central to the expiring object feature in Swift. Once the X-Delete-At timestamp has passed, the object server will not serve the object, even before it’s been reaped by the consistency processes. This is metadata that is set by the user, but changes the way the storage system behaves.
Adding new capabilities often necessitates creating new metadata. But today there are a number of places in the Swift code that have to distinguish between metadata that users can format to their own needs and metadata that must be validated as recognized and in the correct format to be consumed by the system. As more features and metadata are added, it becomes harder to reason about whether each piece of metadata is getting the proper validation in all cases.
Every time new metadata is added to support a feature you have to write validators. That’s not going to change. We have to protect the system from invalid input (garbage in garbage out). But, deciding where to apply the validation in Swift can require some relatively arcane knowledge about Swift internals. And still, most of the time you just piggyback on the processing for a “similar type” of metadata anyway!
By classifying metadata we can make it easier to add new metadata (and therefore new features that depend on metadata!).
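To make the X-Delete-At behavior concrete, here is a minimal sketch of the kind of check a server could make before serving an object. This is not Swift's actual object server code; the function names and the plain-dict metadata are assumptions for illustration only.

```python
import time


def is_expired(metadata, now=None):
    """Return True if the X-Delete-At timestamp in `metadata` has passed.

    `metadata` is a plain dict of header name -> string value
    (an assumption for this sketch, not Swift's internal format).
    """
    now = time.time() if now is None else now
    delete_at = metadata.get("X-Delete-At")
    if delete_at is None:
        return False  # no expiry was set, so the object never expires
    return int(delete_at) <= now


def serve_object(metadata, body, now=None):
    """Hypothetical handler returning (status, body).

    An expired object is treated as missing even though its data
    may still be on disk waiting to be reaped.
    """
    if is_expired(metadata, now):
        return 404, b""
    return 200, body
```

The point is that the user supplies the value, but the storage system is the one that acts on it.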
To get us started David highlighted some high-level metadata classes:
- Storage System MD: Created by the storage system and consumed by the user (e.g. counters)
- System MD: Created by the user and consumed by the storage system (e.g. ACLs, Quota)
- User MD: Created by the user and consumed by the user (e.g. any regular user MD)
He gave a nod to the CDMI standards definition for influencing these classes, and got some good laughs poking fun that it might still be a good idea anyway. Ha! I totally agree, this is a good idea!
The work to do now is to identify and consolidate metadata validation and build a system that will simplify the introduction of new metadata. Working out the details will require identifying all of the metadata that Swift currently supports, plus the known use cases where we want to extend it, and verifying that they fit into these groups. Then we need to internally align the places where Swift processes and validates metadata under these buckets.
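One way to picture the consolidation is a single registry that maps each piece of metadata to its class and validator. The sketch below is purely hypothetical: the registry, the `classify` and `validate_write` helpers, and the choice of entries are my own assumptions, not anything proposed in the session or present in Swift. Only the header names and the `X-*-Meta-` user prefixes come from Swift's API.

```python
from enum import Enum


class MetadataClass(Enum):
    STORAGE_SYSTEM = "storage-system"  # created by the system, read by the user
    SYSTEM = "system"                  # created by the user, read by the system
    USER = "user"                      # created and read by the user


def _is_non_negative_int(value):
    """Validator for timestamp-like values."""
    try:
        return int(value) >= 0
    except ValueError:
        return False


# Hypothetical registry: lowercase header name -> (class, validator or None).
REGISTRY = {
    "x-container-object-count": (MetadataClass.STORAGE_SYSTEM, None),
    "x-delete-at": (MetadataClass.SYSTEM, _is_non_negative_int),
}

# Prefixes Swift uses for free-form user metadata.
USER_PREFIXES = ("x-object-meta-", "x-container-meta-", "x-account-meta-")


def classify(name):
    """Return the MetadataClass for a header name, or None if unknown."""
    key = name.lower()
    if key in REGISTRY:
        return REGISTRY[key][0]
    if key.startswith(USER_PREFIXES):
        return MetadataClass.USER
    return None


def validate_write(name, value):
    """Decide whether a user-supplied header should be accepted."""
    cls = classify(name)
    if cls is MetadataClass.USER:
        return True  # free-form, the user owns the format
    if cls is MetadataClass.SYSTEM:
        validator = REGISTRY[name.lower()][1]
        return validator(value)
    return False  # system-generated or unrecognized: users cannot set it
```

With something like this, adding a new feature would mean adding one registry entry and one validator, rather than hunting for the right spot in the request path.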
Access Control Lists on Accounts
I think Swift’s container ACLs do a great job of balancing simplicity and functionality. On a container you can individually grant (or revoke) read, write or listing access based on individual users or groups identified by your auth system or the Referer (sic).
This is very awesome.
Whether your use case is simply sharing some static content in your container with the world, or something more complex like temporarily granting another authorized user the ability to upload data under your account, Swift ACLs allow YOU to describe access to your data.
However, at the account level access is granted by the auth system. If you want a user to create containers in a Swift account you would typically grant them the admin role for that account in your auth system.
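For anyone who hasn't looked at a container ACL up close: it is a comma-separated string set through headers such as X-Container-Read, where `.r:` entries describe Referer rules, `.rlistings` allows listings, and everything else is a user or group name from the auth system. Below is a deliberately simplified sketch of parsing and checking such a string. It is not Swift's own ACL parser; the function names are made up, and it ignores details like negated or wildcard-domain referrers.

```python
def parse_container_acl(acl_string):
    """Split an ACL string into (referrers, groups, allow_listings)."""
    referrers, groups = [], []
    allow_listings = False
    for raw in acl_string.split(","):
        item = raw.strip()
        if not item:
            continue
        if item == ".rlistings":
            allow_listings = True
        elif item.startswith(".r:"):
            referrers.append(item[len(".r:"):])
        else:
            groups.append(item)  # a user or group known to the auth system
    return referrers, groups, allow_listings


def can_read(acl_string, user_groups=(), referer_host=None):
    """Simplified read check: group match, or '*' / exact Referer host match."""
    referrers, groups, _ = parse_container_acl(acl_string)
    if set(user_groups) & set(groups):
        return True
    for pattern in referrers:
        if pattern == "*" or pattern == referer_host:
            return True
    return False
```

So a value of `.r:*,.rlistings` covers the "share static content with the world" case, while a value naming a specific user (here I'm assuming names shaped like `acct:alice`) covers the "let one other person in" case.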
This works well with most of the auth systems that were built with cloud systems like Swift in mind, certainly Keystone and Swauth.
But as Swift integrates with more businesses and existing auth systems it becomes apparent that it may not always be easy to update the structure in the pre-existing auth system for every new account in a highly scalable storage system that’s separating projects into thousands or even hundreds of thousands of Swift accounts!
However, in the approach outlined by David, we can add the ability to describe within Swift itself which pre-existing users or groups in the auth system have access. This puts the control of access to your data completely in the hands of the account owner.
There are still tons of issues to work through. Can I remove my own access to an account? Can I transfer complete ownership? As a service provider do I want to allow users to have public accounts? But I’m really excited about that work.
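Nothing about the format was settled in the session, but to give a feel for the idea, here is a hypothetical sketch of an account-level ACL kept as data alongside the account. The access level names, the dict layout, and the `ldap:` style identities are all assumptions of mine, not part of David's proposal.

```python
# Hypothetical access levels, ordered from weakest to strongest.
LEVELS = ["read-only", "read-write", "admin"]


def grant(acl, identity, level):
    """Return a new ACL dict with `identity` holding exactly `level`."""
    if level not in LEVELS:
        raise ValueError("unknown access level: %s" % level)
    # Drop any previous grant for this identity, then add the new one.
    updated = {
        name: [member for member in members if member != identity]
        for name, members in acl.items()
    }
    updated.setdefault(level, []).append(identity)
    return updated


def has_access(acl, identities, required):
    """True if any of `identities` holds `required` or a stronger level."""
    needed = LEVELS.index(required)
    for level in LEVELS[needed:]:
        if set(identities) & set(acl.get(level, [])):
            return True
    return False
```

The identities here would be whatever users or groups the existing auth system already reports for a request, so nothing has to change on the auth side when a new account shows up.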
Elegance is Powerful
I think both of these ideas are taking existing concepts that are already in Swift and expanding them incrementally. But, personally, I’m blown away by the implications. Swift has always taken the approach of solving problems in the simplest way that solves the broadest use cases. I might even go as far as to say that style of simplification is a tenet of elegance. And who doesn’t need a little more elegance in their software defined storage system?