Swift Part-Power and Performance
We work a lot with smaller-scale Swift clusters. Today we were asked a question about the ring’s part_power setting, which controls how many partitions are created in the ring data structure. Data lives in partitions, and it’s up to the operator to decide how many of them to create.
I will take this opportunity to ask you a question (if you don’t mind and have time) regarding the ring size. I would like to know whether you have any experience with a very large ring file (something like 2^22 or 2^23) on a small cluster (between 4 and 20 servers), and whether you saw any impact on performance, CPU-wise or otherwise. We have good performance with 2^21 right now, but I was wondering if I should change it before launching, in case we grow fast.
Basically, you’re not going to chew any more CPU on the Swift nodes by using a bigger ring. You will chew more memory, but not CPU.
Internally, the ring has 2 interesting data structures: a list of arrays called “_replica2part2dev”, and a list of devices called “devs”. To look up the devices for a partition, the (pseudo)code is something like this:
    devices = []
    for replica2part2dev in ring._replica2part2dev:   # one table per replica
        dev_id = replica2part2dev[partition]          # array index: O(1)
        devices.append(ring.devs[dev_id])             # list index: O(1)
You can see that all those operations are O(1), so increasing your ring size will not materially affect the CPU consumed on your nodes.
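If you want to poke at this yourself, the ring exposes the lookup through its public get_nodes() method. Here’s a rough REPL sketch; the ring path, account, container, and object names are just examples:

    from swift.common.ring import Ring

    ring = Ring('/etc/swift/object.ring.gz')   # example path

    # get_nodes() hashes the object name to a partition number, then does
    # the same O(1) index operations shown above to find that partition's devices
    part, nodes = ring.get_nodes('AUTH_test', 'photos', 'kitten.jpg')
    print(part, [d['device'] for d in nodes])

The lookup is the same handful of index operations whether the ring has 2^18 partitions or 2^23.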
If you want to get an idea of how much extra memory would be consumed, grab a python repl (I like ipython) and load up a ring file:
    from swift.common.ring import Ring

    ring = Ring('/etc/swift/object.ring.gz')   # point this at whichever ring you like
Now check your process’s memory usage. Then load the same ring file a second time: adding 1 to your part_power doubles the number of partitions, so a second copy of the current partition table is roughly the size of the increase you’d see.
Now check your memory usage again and take the difference. That’ll be a decent approximation of how much extra memory would be consumed (per Swift daemon!) by adding 1 to your part_power.
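If it helps, here’s the whole measurement as one rough REPL sketch. It assumes Linux (where resource.getrusage() reports peak RSS in kilobytes) and an object ring at /etc/swift/object.ring.gz; adjust both to taste:

    import resource
    from swift.common.ring import Ring

    def peak_rss_kb():
        # peak resident set size of this process; kilobytes on Linux
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

    ring_path = '/etc/swift/object.ring.gz'   # example path

    first = Ring(ring_path)
    before = peak_rss_kb()

    second = Ring(ring_path)   # stands in for the extra partitions at part_power + 1
    after = peak_rss_kb()

    print('approx extra memory per daemon: %d KB' % (after - before))

It’s only an approximation (the second copy also duplicates the small devs list, and peak RSS is a blunt instrument), but it’s plenty good for capacity planning.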
Note that Swift daemons don’t do any sort of data-sharing with this big data structure, so every proxy, account/container/object server, replicator, and auditor will each have its own copy of the ring. Remember to multiply the memory-usage delta by the number of daemons you’ve got running.
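As a made-up example: if the measured delta is around 25 MB and a storage node runs ten Swift daemons, that node needs roughly 250 MB of extra RAM after the part_power bump. A back-of-envelope check is also possible: each entry in the partition table is a two-byte device id, so a 3-replica ring at part_power 22 carries about 3 × 2^22 × 2 bytes ≈ 25 MB of table, double what the same ring holds at part_power 21.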
Hope this helps.