SwiftStack
Admin Guide

Introduction

When installing SwiftStack, the [SwiftStack Install Guide][install-guide] will walk you through the process of installing SwiftStack on your hardware and configuring a basic SwiftStack cluster. The [SwiftStack Nodes][download] software includes all components required for the installation and is provided as a package installer. In addition to OpenStack Swift, it includes all components required for a production deployment, such as a built-in load balancer, SSL termination and extensive monitoring instrumentation. After the SwiftStack Nodes software has been installed on your hardware, the storage nodes will connect to the SwiftStack Controller via a secure VPN connection, where you will configure the overall cluster.

With the SwiftStack Controller console, the drives will be formatted and mounted on the nodes, builder files (which contain information about all the devices in the cluster) will be created and provisioned, and rings will be created and validated, all based on the information you provide about your cluster settings. While some information about your environment needs to be provided by the operator, such as networking information and user accounts, the SwiftStack Controller will allow an operator to configure over 70 settings. These settings include:

  • Number of replicas
  • Zone layout
  • Worker, auditor and replicator settings
  • Proxy server settings
  • Object server settings

This guide provides additional information and best practices on how to configure your SwiftStack cluster and how to conduct day-to-day operational tasks, such as adding capacity, handling drive failures, monitoring and getting technical support.


Configuring your Cluster

Networking

A SwiftStack cluster consists of three basic tiers: a load balancing tier, a proxy tier and a storage tier. Proxy servers co-reside on the storage nodes, which simplifies management of your SwiftStack cluster. Each proxy/storage node has its own IP address, which is configured during the SwiftStack Node install process. For each SwiftStack cluster, there will also be a Cluster API IP address, which will be used by the load balancer. The Cluster API IP address should be on the same network as each node’s outward-facing IP but be different from all nodes’ outward-facing IPs. If you are using the built-in SwiftStack load balancer, this IP address will be set up automatically and you will not need to do any additional bindings to this IP on the nodes. If you are using your own load balancer, the client-facing IP address of your load balancer will need to be entered in the SwiftStack Controller console.
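The addressing rule above (same network as the nodes, but not equal to any node's address) can be checked before the value is entered in the console. The following is a minimal sketch using Python's standard ipaddress module; the function name, the example addresses and the /24 network are assumptions for illustration and are not part of SwiftStack.

```python
import ipaddress


def check_cluster_api_ip(cluster_api_ip, node_ips, network):
    """Return a list of problems; an empty list means the address looks usable.

    cluster_api_ip: candidate Cluster API IP address (string)
    node_ips:       outward-facing IPs of the nodes (strings)
    network:        the outward-facing network in CIDR form (string)
    """
    net = ipaddress.ip_network(network)
    api = ipaddress.ip_address(cluster_api_ip)
    nodes = [ipaddress.ip_address(ip) for ip in node_ips]

    problems = []
    # The Cluster API IP must be on the same network as the nodes.
    if api not in net:
        problems.append(f"{api} is not inside {net}")
    for node in nodes:
        if node not in net:
            problems.append(f"node {node} is not inside {net}")
    # The Cluster API IP must differ from every node's own IP.
    if api in nodes:
        problems.append(f"{api} is already used by a node")
    return problems


if __name__ == "__main__":
    print(check_cluster_api_ip(
        "192.168.22.100",
        ["192.168.22.11", "192.168.22.12"],
        "192.168.22.0/24",
    ))
```

An empty list means the candidate address satisfies both conditions; each string in a non-empty list names one violated condition.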

Storage Accounts and Users

A SwiftStack cluster has many storage accounts. Storage accounts have containers, and containers store objects. Containers logically reside in storage accounts, so containers named “Documents” in two different storage accounts are two distinct containers within the cluster.

Users are associated with storage accounts. URLs in the API contain the storage account, container name, and object name. For example, in the URL https://192.168.22.100/v1/AUTH_accounting/Documents/invoice001.pdf, the storage account is “accounting”, the container name is “Documents”, and the object name is “invoice001.pdf”.
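As an illustration of that URL layout, the sketch below splits a URL into its API version, storage account, container and object. It assumes the "AUTH_" prefix seen in the example URL precedes the storage account name; the function name is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def split_swift_url(url, account_prefix="AUTH_"):
    """Split a Swift-style URL into version, account, container and object.

    Object names may themselves contain slashes, so the path is split
    at most three times.
    """
    path = urlsplit(url).path.lstrip("/")
    parts = [unquote(p) for p in path.split("/", 3)]
    if len(parts) < 2 or not parts[1]:
        raise ValueError("URL does not contain a storage account: %r" % url)

    account = parts[1]
    # Drop the assumed prefix to recover the bare storage account name.
    if account.startswith(account_prefix):
        account = account[len(account_prefix):]

    return {
        "version": parts[0],
        "account": account,
        "container": parts[2] if len(parts) > 2 and parts[2] else None,
        "object": parts[3] if len(parts) > 3 and parts[3] else None,
    }


if __name__ == "__main__":
    print(split_swift_url(
        "https://192.168.22.100/v1/AUTH_accounting/Documents/invoice001.pdf"
    ))
```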

There are three types of users in a SwiftStack cluster:

  • Super user – may do anything to any account, container, or object in any storage account.
  • Admin user – may do anything to the containers and objects within their associated storage account.
  • Normal user – may only access their own account.

In the SwiftStack Controller console, you will need to set up a minimum of one storage account and one user. For each user, you define whether it is a normal, admin or super user, and set its password key.

An ACL for a container may reference the string “account:user” to allow that user, or the string “account” to allow any user associated with the account. Users associated with other storage accounts may also be referenced in ACLs, allowing them to access containers or objects in a different storage account. For read-only access, the “X-Container-Read” ACL is used. For read-write access, both the “X-Container-Read” and “X-Container-Write” ACLs must include the normal user or the normal user’s storage account name.
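The two ACL headers can be assembled as plain strings before they are sent with a container request. The sketch below only builds the header values; it assumes grantees are written either as account:user or as a bare account name, joined by commas, and the helper names and example grantees are hypothetical.

```python
def container_acl_headers(readers=(), writers=()):
    """Build the ACL headers for a container update request.

    readers / writers: grantees, each written as "account:user"
    or as a bare "account" (assumed form).
    """
    headers = {}
    if readers:
        headers["X-Container-Read"] = ",".join(readers)
    if writers:
        headers["X-Container-Write"] = ",".join(writers)
    return headers


def read_write_acl(grantee):
    """Read-write access needs the grantee in both ACL headers."""
    return container_acl_headers(readers=[grantee], writers=[grantee])


if __name__ == "__main__":
    # Read-only access for every user of the "sales" account.
    print(container_acl_headers(readers=["sales"]))
    # Read-write access for one user of the "accounting" account.
    print(read_write_acl("accounting:bob"))
```

The resulting dictionary would be passed as request headers by whichever HTTP client is used to update the container.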

SSL

SwiftStack includes a built-in SSL feature, which handles SSL termination for your SwiftStack cluster. To enable SSL for your cluster, check the “Use SSL” box in the Configure Cluster section of the SwiftStack Controller console and upload the certificate and private key (in PEM format). If you do not upload an SSL certificate, a self-signed one will be automatically generated for you. You can replace the self-signed certificate with an uploaded certificate at any time.

The SwiftStack Controller will validate the SSL certificates as they are uploaded. Once uploaded, push that config to the cluster for the SSL feature to take effect. Once pushed, each SwiftStack node will run a pound process, which handles SSL termination and talks to the local proxy server on that node.

If you are using an external SSL termination load balancer, it can be configured in the SwiftStack Controller console. Contact SwiftStack support for assistance.

Load Balancing

SwiftStack includes a built-in HTTP load balancer, which is recommended for small to mid-size deployments. To enable the built-in SwiftStack HTTP load balancer, check the box in the Create New Cluster section of the SwiftStack Controller console and enter the IP address that you would like to use for the Cluster API Address.

There is also an option to disable the built-in SwiftStack load balancer and use a separate, external load balancing tier, such as round-robin DNS or a commercial load balancer (F5s, NetScalers etc.). This is especially recommended for larger and/or high-performance environments. When a commercial or other external load balancer is used, SwiftStack presents a healthcheck URL for each of the proxy nodes, which can be used by the external load balancer.
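To illustrate how an external load balancing tier might use the per-node healthcheck URLs, the sketch below polls a list of URLs and keeps the nodes that answer with HTTP 200. The node addresses and the “/healthcheck” path are assumptions for illustration; use the healthcheck URL that SwiftStack presents for your nodes. The fetch function is injectable so the selection logic can be exercised without a network.

```python
import urllib.error
import urllib.request


def default_fetch(url, timeout=2.0):
    """GET a URL and return (status, body); status is None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return err.code, ""
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return None, ""


def healthy_nodes(healthcheck_urls, fetch=default_fetch):
    """Return the healthcheck URLs that responded with HTTP 200."""
    healthy = []
    for url in healthcheck_urls:
        status, _body = fetch(url)
        if status == 200:
            healthy.append(url)
    return healthy


if __name__ == "__main__":
    # Hypothetical node addresses.
    urls = [
        "http://192.168.22.11/healthcheck",
        "http://192.168.22.12/healthcheck",
    ]
    print(healthy_nodes(urls))
```

A real load balancer would run such a probe on an interval and stop routing to nodes that drop out of the healthy list.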


Configuring your Cluster

Adding Capacity to a Cluster

When adding new capacity, nodes or drives, to a SwiftStack cluster, data will be re-distributed evenly across cluster resources. For a large cluster, this may not be noticeable but for a small cluster, this additional replication traffic needs to be managed to ensure it does not impact performance of the cluster.

SwiftStack makes it easy to add capacity. Additional drives can be added on existing nodes and new nodes can be added through the SwiftStack Controller. To fold these resources into your existing clusters, you have two options: Add Gradually and Add Immediately.

If you choose “Add Gradually” and click “Change”, SwiftStack will add a bit of weight to that drive once an hour until it is a full member of the cluster. You can see how it is progressing…

…and a (long) while later, your new drives have been added:

This process ensures that any new capacity that is added to the cluster is folded in gracefully without an impact to performance. There is also an option to “Add Immediately”, which folds in new capacity right away.
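The gradual option can be pictured as a simple hourly schedule of weight increases. The sketch below is an illustration only: the target weight and the size of each hourly step are hypothetical values, and the actual increments used by the SwiftStack Controller are not stated in this guide.

```python
def gradual_weights(target_weight, step, start_weight=0.0):
    """Yield the drive weight after each hourly increase.

    target_weight: weight of a full cluster member (hypothetical value)
    step:          weight added once an hour (hypothetical value)
    """
    if step <= 0:
        raise ValueError("step must be positive")
    weight = start_weight
    while weight < target_weight:
        # Never overshoot the final weight on the last step.
        weight = min(weight + step, target_weight)
        yield weight


if __name__ == "__main__":
    schedule = list(gradual_weights(target_weight=100, step=25))
    print(schedule)
    print("hours until full member:", len(schedule))
```

A smaller step spreads the resulting replication traffic over more hours, which is the trade-off between “Add Gradually” and “Add Immediately”.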

When adding new nodes, just set all (or a portion) of the drives to either “Add Gradually” or “Add Immediately”.

Removing Capacity from a Cluster

Conversely, SwiftStack makes it easy to remove drives gradually, which will be required when you want to upgrade to larger drives or swap out older drives. It works just like gradual capacity addition: choose “Remove Gradually” and click “Change”, then wait until all data has been moved off the disk(s) before you remove them from the cluster.


Monitoring and Alerting

Monitoring

The SwiftStack Controller collects and stores monitoring data for over 500 metrics for each node in your SwiftStack cluster. A subset of these is reported in the monitoring section of the SwiftStack Controller console so you can get a bird’s-eye view of your cluster performance, with options to drill down into specific metrics for tuning and troubleshooting.

To ensure that you can quickly measure the overall health of your SwiftStack environment, the SwiftStack platform highlights the following top level metrics in the Controller console for each cluster:

  • Cluster CPU Utilization
  • Cluster Proxy Throughput
  • Average Node Memory Utilization
  • Total Cluster Disk I/O
  • Top 5 Least Free Disks

Cluster CPU Utilization – Measures the percentage of time that a node is using the CPU or performing disk I/O. CPU utilization is provided both for the overall cluster and for individual nodes. High numbers are generally bad. If your CPU utilization is high, processes may have a harder time getting resources. This means that processes on this node will start to slow down and that it may be time to add nodes to the cluster.

Cluster Proxy Throughput – Displayed in bytes/second. This is the aggregate throughput for all inbound and outbound traffic. This graph will indicate network bottlenecks if any exist.

Average Node Memory Utilization – This is the memory usage for all nodes in your cluster. Memory utilized is displayed as either used, buffered, cached or free.

Total Cluster Disk I/O – Measures the total number of Input/Output Operations Per Second (IOPS) on disks - for the overall cluster and for individual nodes. Disk I/O is shown for both Read IOPS and Write IOPS. Note that since Swift constantly guards against bitrot, the cluster will continuously read some amount of data.

Top 5 Least Free Disks – Displays the fullest disks in the cluster. Because Swift distributes data ‘evenly’ across the cluster, the fullest disks indicate how full the overall cluster is and when you should consider adding capacity.
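The “Top 5 Least Free Disks” view amounts to ranking disks by the share of their capacity in use. A minimal sketch of that ranking follows; the input format, disk names and numbers are hypothetical, and this is not how the Controller itself gathers the data.

```python
import heapq


def least_free_disks(disks, count=5):
    """Return the `count` fullest disks as (name, percent_used) pairs.

    disks: iterable of (name, used_bytes, total_bytes) tuples.
    """
    def percent_used(disk):
        _name, used, total = disk
        return 100.0 * used / total

    fullest = heapq.nlargest(count, disks, key=percent_used)
    return [(disk[0], round(percent_used(disk), 1)) for disk in fullest]


if __name__ == "__main__":
    # Hypothetical usage figures, in arbitrary units.
    sample = [
        ("node1/d1", 50, 100),
        ("node1/d2", 90, 100),
        ("node2/d1", 10, 100),
        ("node2/d2", 70, 100),
        ("node3/d1", 80, 100),
        ("node3/d2", 60, 100),
    ]
    for name, pct in least_free_disks(sample):
        print(f"{name}: {pct}% used")
```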

The SwiftStack Controller also makes available several other monitoring graphs for your SwiftStack cluster, which are used when tuning or troubleshooting your SwiftStack cluster. For the overall cluster, these graphs include:

  • Total Cluster Interface Bandwidth
  • StatsD Statistics Per Node
  • Avg OpenVPN Traffic
  • Top 4 Avg Node Process Groups by RSS
  • object-updater sweep Timing and Count
  • object-replicator.partition.update Req Timing and Count
  • Proxy Req Timing and Count
  • account-replicator replication Timing and Count

For each node, the following monitoring graphs are available:

  • Total Node Disk I/O
  • Per-Disk Read Throughput
  • Per-Disk Write Throughput
  • Per-Disk Read IOPs
  • Per-Disk Write IOPs
  • Node Proxy Server Throughput
  • CPU Utilization (All CPUs)
  • Per Processes Group CPU Usage
  • Account Processes CPU Usage
  • Container Processes CPU Usage
  • Object Processes CPU Usage
  • Memory Utilization
  • Node Interface Bandwidth
  • StatsD Statistics
  • OpenVPN Traffic
  • Top 4 Node Process Groups by RSS
  • Object Replicator Operations
  • object-updater sweep Timing and Count
  • object-replicator.partition.update Req Timing and Count
  • Proxy Req Timing and Count
  • account-replicator replication Timing and Count

These metrics and graphs are viewable under the “View all graphs for this cluster” menu.

Learn more about SwiftStack monitoring here:

Alerting

The SwiftStack Controller provides alerts for two events:

  • “Node is unreachable” - The SwiftStack Controller can’t reach a node. The node could be down, or something could be interfering with the connection between the node and the Controller.
  • “Missing device” - A disk that should be mounted is not.

How to manage alerts:

  • To acknowledge & archive an alert, click the “Acknowledge Alert” button. That alert will be removed from the count of alerts that appear on the top of the page.
  • To acknowledge & archive all alerts, click the “Archive All Alerts” button.
  • To navigate to the Archived Alerts, click the “Visit Your Alert Archive” button.
  • To delete all archived alerts, click on the “Clear all archived alerts” button.

Handling Hardware Failures

Swift, by default, stores 3 copies of all objects. These objects are distributed across as-unique-as-possible zones, which, depending on the size of your cluster, can be a group of drives, a node, a PDU, a rack etc. In a SwiftStack cluster, the following scenarios need to be considered:

Drive failure – Should an individual drive fail, the SwiftStack Controller console will alert the operator of the failure and the cluster will begin replicating the data that was stored on the failed drive to various handoff locations in the cluster, always ensuring that the number of replicas remains constant.

There are two options to handle a failed drive:

  1. Replace – The failed drive can normally be replaced during a regularly scheduled maintenance period. A drive failure in Swift is not a critical event, because the data on the drive has been re-created in handoff locations elsewhere in the cluster, ensuring that the defined number of replicas always exists. Unlike RAID systems, Swift does not require any rebuild time when a failed drive is replaced. The replacement drive will be folded into the cluster (see the section on adding capacity) and the cluster will re-distribute data evenly across cluster resources.
  2. Remove – Through the SwiftStack Controller, you can remove the disk entirely from the cluster. This will operationally remove the disk from the cluster. Use this option if you will not be getting to the data center in a reasonable timeframe. If the disk is ‘dying’, use the Remove Gradually option. If the disk has failed and is not responsive, use the Remove Immediately option.

Node failure or power failure to an entire rack – Should an entire node or even a whole rack become unreachable, e.g. because of a power failure, networking issue or motherboard failure, the SwiftStack Controller console will alert that the “Node is unreachable”. Swift will assume that the data is still durably stored on the drives there and will not begin to replicate the data elsewhere in the cluster. The rationale for this is that the data is likely still available on the drives of the “failed” node, and it may be simpler to make the repair and re-connect the node to the cluster than to immediately replicate the data. New writes will continue to write 3 replicas.

If the node needs to be replaced or repaired, the operator has the following options:

  1. Do not make any configuration changes in the SwiftStack Controller and promptly repair the failed equipment.
  2. If removing the equipment for an extended period of time:
    • In the SwiftStack Controller, select to remove all drives for the failed node “immediately”. This will ensure that the cluster re-creates the data as quickly as possible so that 3 replicas of all data are restored. This option will, however, result in an increase in replication traffic on your cluster, which may impact performance while the replication occurs.
    • In the SwiftStack Controller, select to remove all drives for the failed node “gradually”. This will minimize any spikes in traffic resulting from the increase in replication traffic in your cluster, but will take a longer time to replicate the data from the failed node(s) to the handoff locations in the cluster.

When a node fails, it is more critical for the operator to act quickly than when a single drive fails, to ensure that the desired availability level for the cluster is maintained.

Learn more about how Swift handles hardware failures here:


Managing Middleware

SwiftStack provides several middleware options, which can be enabled in the SwiftStack Controller console. Once enabled, the selected middleware will be installed and set up in your SwiftStack cluster next time you “Push Config to Cluster”.

The following middleware options are currently available:

  • Static Web - Web Server Gateway Interface (WSGI) middleware which serves container data as a static web site with index file and error file resolution and optional file listings. Static Web can be used both for anonymous and authenticated requests.
  • TempUrl - Allows URLs to be created to provide temporary access to objects. This may be handy when you need to provide a download link for large objects from a Swift account that does not have public access.
  • FormPost - Translates a browser form post into a regular Swift object PUT.
  • Name Check - A filter that disallows any paths that contain defined forbidden characters or that exceed a defined length.
  • Swift Web Console - SwiftStack Add-on which provides a web-based interface for end-users to upload, download and manage their containers and files.
  • Active Directory (AD) Proxy Authentication - SwiftStack Add-on which enables organizations to join their SwiftStack cluster to an AD domain, authenticate users against AD and get metadata for users from AD.
  • LDAP-based Proxy Authorization - SwiftStack Add-on which enables organizations to authorize users against LDAP.
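As an illustration of the TempUrl option, the sketch below builds a temporary URL using the HMAC-SHA1 signing scheme described in the OpenStack Swift TempURL documentation: the signature covers the HTTP method, the expiry time and the object path, and is passed along with the expiry as query parameters. It assumes a temp URL key has already been set on the storage account; the key, host and path shown are hypothetical, and the digest accepted by a given Swift release should be confirmed against its documentation.

```python
import hmac
from hashlib import sha1
from time import time


def temp_url(base_url, path, key, seconds_valid, method="GET", now=None):
    """Return a temporary URL for `path`, valid for `seconds_valid` seconds.

    base_url: scheme and host of the cluster, e.g. "https://192.168.22.100"
    path:     object path, e.g. "/v1/AUTH_accounting/Documents/invoice001.pdf"
    key:      the account's temp URL key (assumed to be set already)
    now:      current Unix time; defaults to time(), injectable for testing
    """
    expires = int((time() if now is None else now) + seconds_valid)
    # Signed body: method, expiry and path separated by newlines.
    hmac_body = f"{method}\n{expires}\n{path}"
    signature = hmac.new(key.encode(), hmac_body.encode(), sha1).hexdigest()
    return f"{base_url}{path}?temp_url_sig={signature}&temp_url_expires={expires}"


if __name__ == "__main__":
    print(temp_url(
        "https://192.168.22.100",
        "/v1/AUTH_accounting/Documents/invoice001.pdf",
        key="hypothetical-secret-key",
        seconds_valid=3600,
    ))
```

Anyone holding the resulting URL can fetch the object until the expiry time passes, without needing credentials for the storage account.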

Technical Support

SwiftStack is committed to providing a great product and service experience for our customers. As an essential part of this commitment, SwiftStack provides technical support to all customers with a valid subscription agreement, 9 hours a day, 5 days a week. SwiftStack also offers optional 24x7 support coverage, which provides emergency support assistance for critical issues during weekends and nights.

Technical support is provided for the SwiftStack storage system software. Support is provided for the following areas:

  • Installation and configuration issues
  • Troubleshooting issues related to the usage of the SwiftStack storage system and SwiftStack management and monitoring
  • Assistance interpreting monitoring and alerts
  • Providing workarounds or resolutions for known problems
  • Answering general how-to questions, and providing pointers to documentation
  • Troubleshooting software packages in the SwiftStack storage system showing erratic or faulty behavior, independent of the user’s application code

Customers with a valid subscription agreement can report support issues to SwiftStack through the following official support channels:

Via the SwiftStack technical support site: https://swiftstack.zendesk.com

Via email: support@swiftstack.com

Via phone: +1 (415) 630-2955

SwiftStack provides support for all ticket priorities between 8:00 a.m. and 5:00 p.m. Pacific Time, Mon-Fri, excluding US holidays. For customers with the optional 24x7 support coverage, SwiftStack responds to Urgent Priority support tickets on a 24x7 basis.

© 2012 SwiftStack Inc., San Francisco, CA