The challenges of QoS in the Cloud
Current approaches to achieving QoS
Brute-force
Historically, a number of approaches to achieving QoS have existed in the marketplace. The brute-force approach, often practiced in the Enterprise, is to purchase multiple storage systems and assign each to a department or application. Separation is achieved physically, with the downsides being the high cost of multiple systems, the inefficiency of maintaining spare capacity in each system, the low utilization that results from the inability to adapt quickly to changing needs, and the need to manage each system separately.
All-SSD
Another way QoS has been addressed is with solid-state storage. Solid-state drives (SSDs) have a very helpful characteristic: unlike rotating drives, SSD access time (i.e., latency) does not increase when reads and writes arrive in random (non-sequential) order. This means that multiple tenants or applications can access the same drive at the same time without adversely impacting its performance. The most important downside is cost. Solid-state drives are still several to tens of times more expensive than rotating drives on a per-gigabyte basis. An additional, major pitfall is controller performance. Even if the storage media themselves are able to provide the necessary level of service, all-flash arrays are served through shared storage controllers, which become a bottleneck if their peak performance is not as high as the combined peak performance of the solid-state drives. This means that a few heavy users can impact the performance of all other users, which, in turn, means poor QoS.
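To see how the shared controller, rather than the flash media, becomes the limiting factor, here is a minimal back-of-the-envelope sketch in Python. All of the figures (drive count, per-drive IOPS, controller limit, tenant demand) are hypothetical assumptions for illustration, not measurements of any particular array:

# Hypothetical illustration of the shared-controller bottleneck in an all-flash array.
SSD_COUNT = 24
IOPS_PER_SSD = 75_000        # assumed random-read IOPS per SSD
CONTROLLER_LIMIT = 500_000   # assumed peak IOPS the shared controller can deliver

media_capability = SSD_COUNT * IOPS_PER_SSD             # 1,800,000 IOPS of raw media
deliverable = min(media_capability, CONTROLLER_LIMIT)   # capped by the controller

# If a few heavy tenants consume most of the controller's budget,
# every other user shares what remains, no matter how fast the media are.
heavy_tenant_demand = 400_000
left_for_others = deliverable - heavy_tenant_demand

print(f"Media capability: {media_capability:,} IOPS")
print(f"Controller cap:   {deliverable:,} IOPS")
print(f"Left for everyone else after heavy tenants: {left_for_others:,} IOPS")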
Dedicated drives
One last approach, one that is sometimes used in Enterprise storage systems, is dedicating drives on a per-user basis. This means that each user has exclusive use of a set of drives (rotating and/or SSDs), and as a result their latency is unaffected by what other users are doing. This approach is far more flexible and cost-effective than the all-SSD approach, because users can match the most cost-effective media with each application. The downside is that the minimum unit of storage is a drive group, which means that users who need QoS for very small capacities (e.g., 10 gigabytes) will not find this approach compelling. Also, controller performance is again a bottleneck, just as in the all-SSD approach described above.
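The allocation-granularity problem is easy to quantify. The sketch below assumes, purely for illustration, that the smallest dedicated unit is a mirrored pair of 4 TB drives; the sizes are hypothetical:

# Toy illustration of the minimum-allocation-unit problem with dedicated drive groups.
DRIVE_SIZE_GB = 4_000             # assumed 4 TB rotating drives
GROUP_USABLE_GB = DRIVE_SIZE_GB   # assumed smallest group: a RAID-1 pair (~one drive usable)

requested_gb = 10                 # the tenant only needs 10 GB with guaranteed QoS
allocated_gb = GROUP_USABLE_GB    # but must take a whole dedicated drive group
utilization = requested_gb / allocated_gb

print(f"Requested {requested_gb} GB, dedicated {allocated_gb:,} GB "
      f"-> {utilization:.2%} of the dedicated capacity is actually used.")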
Here at Zadara Storage we looked at all of these approaches and thought we could do better by combining them into a best-of-all-worlds offering. We decided to start with dedicated drives, since that is the most flexible and cost-effective of the above approaches. Then, using storage virtualization techniques, we added what we think is a clever variation on the multiple-storage-system approach. Rather than brute-forcing multiple physical systems, we create multiple, software-defined virtual controllers. These controllers behave like storage systems but are simply software. And, through the use of dedicated compute, network and memory resources and virtualization wizardry, we can guarantee that these multiple controllers cannot interfere with one another. To top it off, we then added an element of the all-SSD approach: SSD caching to accelerate random reads and writes, the Achilles' heel of rotating drives.
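To make the SSD-caching idea concrete, here is a minimal sketch of an LRU read cache sitting in front of rotating drives. The class name, capacity, and behavior here are hypothetical; this is a simplified illustration of the general technique, not Zadara's actual caching logic:

# Minimal sketch of an SSD read cache in front of rotating drives (illustrative only).
from collections import OrderedDict

class SsdReadCache:
    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.cache = OrderedDict()          # block_id -> data, kept in LRU order

    def read(self, block_id, read_from_hdd):
        if block_id in self.cache:          # cache hit: served at SSD latency
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = read_from_hdd(block_id)      # cache miss: pay the rotating-drive seek
        self.cache[block_id] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return data

# Hypothetical usage: repeat random reads of hot blocks are absorbed by the SSD tier.
cache = SsdReadCache(capacity_blocks=1024)
block = cache.read(42, read_from_hdd=lambda b: f"data-for-block-{b}")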
The cloud has brought about new QoS challenges, and with them a whole new set of solutions and approaches to storage QoS, each with its respective advantages and drawbacks. Solutions like our own allow customers to achieve QoS in an affordable, reliable, flexible storage system while taking advantage of the cloud’s foremost benefits, bringing costs down and consistency up.