researchHQ’s Key Takeaways:
- Scalability in cloud computing is the ability to easily add or subtract computing or storage resources.
- While the cloud offers a means to easily scale resources, an effective scaling strategy is still required.
- Cloud vertical scaling involves increasing an organisation’s server power by adding resources to a server or replacing it with a more powerful one. Horizontal scaling involves provisioning additional servers to meet company needs.
- Scalability is key to managing cost in enterprise cloud environments, allowing instances to be right-sized to meet company needs and avoid excessive cloud expenditure.
The Power of Scalability
There are many reasons to move to the cloud, but one of the most common is scalability. What is scalability in cloud computing? Scalability is the ability to easily add or subtract compute or storage resources. In ‘the old days’ of on-premises data centers, scaling was costly, slow, and difficult to manage. Back then, scaling up meant buying new server hardware and disk arrays. Even after the purchase was approved in the budget and the order placed, it could take months before the equipment arrived. Then some of the company’s highest-paid engineers would spend hours unpacking servers and storage from cardboard boxes, plugging them in, and hooking them up to the system.
That’s how resources were added to a traditional IT infrastructure. But what if you needed fewer resources? Scalability is sometimes erroneously used as a synonym for growth, but in a real-world IT environment, demand isn’t steady. Even a thriving business encounters periods of higher and lower demand: it changes seasonally, weekly, and hourly. In a data center world, reducing capacity was almost never practical, so companies were left provisioning enough resources to cover their expected peak demand, plus some. In other words, an eCommerce site would need enough computing resources to handle Black Friday traffic, every single day. As a result, utilization rates were very low on most days.
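To make the utilization point concrete, here is a minimal sketch of the arithmetic. The hourly demand figures, the 20% headroom, and the function name are all made up for illustration; they are not measurements from any real system.

```python
import math

# Hypothetical demand for one day, in "servers needed" per hour (illustrative only).
hourly_demand = [2, 2, 1, 1, 1, 2, 3, 5, 7, 8, 8, 9,
                 10, 9, 8, 8, 7, 8, 9, 7, 5, 4, 3, 2]


def average_utilization(demand, provisioned):
    """Fraction of provisioned capacity actually used, averaged over all periods."""
    used = sum(min(d, provisioned) for d in demand)
    return used / (provisioned * len(demand))


# Fixed provisioning: size for the expected peak, "plus some".
HEADROOM_PERCENT = 20  # assumed safety buffer
peak = max(hourly_demand)
fixed_capacity = math.ceil(peak * (100 + HEADROOM_PERCENT) / 100)

utilization = average_utilization(hourly_demand, fixed_capacity)
print(f"peak demand: {peak} servers, provisioned: {fixed_capacity} servers")
print(f"average utilization: {utilization:.0%}")
```

With these invented numbers, capacity sized for the peak sits idle more than half the time, which is the pattern described above.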
Simplifying Scaling Problems
The alternative is to provision just enough resources for daily use and not for peak traffic. Consequences of not having enough compute or storage resources are dire: First come performance issues, then users start getting error messages and getting locked out of the application. In a business setting, that equals lost revenue. Conversely, resources are not free. Over-provisioning can lead to ballooning IT costs.
The cloud has dramatically simplified scaling problems by making it easier to scale up and out while also making it possible to scale down and in. However, scaling continues to be a challenge, even in cloud environments. It’s also important to remember that all parts of your application need to scale, from the compute resources to database and storage resources. Neglecting any piece of the scaling puzzle can lead to unplanned downtime or worse.
Cloud Scaling Strategies
There are two ways to scale: vertically or horizontally. When you scale vertically, it’s often called scaling up or down. When you scale horizontally, you are scaling out or in.
- Cloud Vertical Scaling refers to adding more CPU, memory, or I/O resources to an existing server, or replacing one server with a more powerful one. In a data center, that means purchasing a new, more powerful appliance and discarding the old one. On Amazon Web Services (AWS) and Microsoft Azure, vertical scaling is accomplished by changing instance sizes. Both services offer many different instance sizes, so scaling vertically is possible for everything from virtual machines (such as AWS EC2 instances) to managed databases (such as AWS RDS).
- Cloud Horizontal Scaling refers to provisioning additional servers to meet your needs, often splitting workloads between servers to limit the number of requests any individual server is getting. In a cloud-based environment, this would mean adding additional instances instead of moving to a larger instance size.
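The difference between the two strategies can be sketched with a toy model. The `Fleet` class and both helper functions below are hypothetical names invented for this example; they do not correspond to any AWS or Azure API.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fleet:
    """A toy model of a group of identical servers (hypothetical, for illustration)."""
    instances: int
    vcpus_per_instance: int

    @property
    def total_vcpus(self) -> int:
        return self.instances * self.vcpus_per_instance

    def requests_per_instance(self, total_requests: int) -> float:
        # Assumes a load balancer splits requests evenly across instances.
        return total_requests / self.instances


def scale_vertically(fleet: Fleet, new_vcpus_per_instance: int) -> Fleet:
    """Scale up or down: same number of servers, different server size."""
    return replace(fleet, vcpus_per_instance=new_vcpus_per_instance)


def scale_horizontally(fleet: Fleet, delta_instances: int) -> Fleet:
    """Scale out (positive delta) or in (negative delta): same size, different count."""
    new_count = fleet.instances + delta_instances
    if new_count < 1:
        raise ValueError("a fleet needs at least one instance")
    return replace(fleet, instances=new_count)


start = Fleet(instances=2, vcpus_per_instance=4)
scaled_up = scale_vertically(start, new_vcpus_per_instance=8)
scaled_out = scale_horizontally(start, delta_instances=2)

# Both strategies reach the same total capacity, but only scaling out
# reduces the number of requests each individual server has to handle.
for name, fleet in [("start", start), ("up", scaled_up), ("out", scaled_out)]:
    print(name, fleet.total_vcpus, fleet.requests_per_instance(1000))
```

In this model both approaches double total capacity, but only the scaled-out fleet halves the per-server request load.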
In practice, scaling horizontally (out and in) is usually the better choice. It’s much easier to accomplish without downtime: even in a cloud environment, scaling vertically usually requires making the application unavailable for some amount of time. Horizontal scaling is also easier to manage automatically, and limiting the number of requests any instance gets at one time is good for performance, no matter how large the instance.
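One reason horizontal scaling is easy to automate is that the decision reduces to a single number: how many instances to run. The sketch below shows one simple proportional rule for picking that number. The function name, parameters, and thresholds are assumptions made for illustration, not the behavior of any particular provider's autoscaler.

```python
def desired_instances(current_instances: int,
                      observed_cpu_percent: int,
                      target_cpu_percent: int,
                      min_instances: int,
                      max_instances: int) -> int:
    """Pick an instance count that would bring average CPU back toward the target.

    Uses integer percentages and ceiling division so that rounding
    always errs on the side of having enough capacity.
    """
    if target_cpu_percent <= 0:
        raise ValueError("target_cpu_percent must be positive")
    numerator = current_instances * observed_cpu_percent
    # Ceiling division without floating point: -(-a // b) == ceil(a / b)
    wanted = -(-numerator // target_cpu_percent)
    # Clamp so the fleet never scales in to nothing or out without limit.
    return max(min_instances, min(max_instances, wanted))


# Busy period: 4 instances at 90% CPU against a 60% target -> scale out.
print(desired_instances(4, 90, 60, min_instances=2, max_instances=10))
# Quiet period: 4 instances at 30% CPU against a 60% target -> scale in.
print(desired_instances(4, 30, 60, min_instances=2, max_instances=10))
```

The minimum and maximum bounds are the design choice worth noting: the floor protects availability during quiet periods, and the ceiling caps spending if demand (or a faulty metric) spikes.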