All you need to know about cloud-native storage
The CNCF Storage Special Interest Group published a detailed whitepaper that explains how storage systems are structured for cloud-native applications. The document is a must-read if you want to understand and compare the different kinds of storage and how they are used in production today. This post is an excerpt from the paper, which was presented at KubeCon Shanghai; the PDF is available on the CNCF Storage SIG GitHub repository.
Cloud-native applications are defined by the way we architect, develop and deploy them, not necessarily by where they “live.” Cloud native means microservices, containers, orchestrators, and more; the underlying idea is to build truly platform-agnostic, flexible apps that can be run by any authorized end user.
Here are some key attributes of cloud-native applications, as defined by the Cloud Native Computing Foundation (CNCF) via a post on The New Stack:
- Packaged in containers
- Designed as loosely-coupled microservices
- Isolated from server and OS dependencies
- Policy-driven resource allocation
- Managed through agile DevOps processes
- Centered around APIs for interaction
If you don’t go cloud native, you risk missing out on important, cutting-edge technologies and spending more of your time patching and maintaining your application. There are exceptions, of course.
A need for storage
Let’s assume for argument’s sake you have decided to deploy a stateful application on Kubernetes. Stateful means that the app needs to remember things when a client interacts with it. Saving things like this requires storage. What is important for your application?
The first section of the whitepaper introduces the basic attributes of storage interfaces and systems: the characteristics a storage system usually has or is expected to have. Since every application has its own functionality and objectives, you may need all of these attributes, only some of them, or all of them with more weight on a subset. Key attributes of a storage system:
- Instantiation & Deployment
Another way to describe storage systems is by the layers they are made of. These layers help determine how your data is stored and retrieved, how it’s protected, and how it interacts with your applications, operating system, and orchestrator. They are tightly connected with the attributes mentioned above.
Storage Topology – how the different parts of a system (such as storage devices, compute nodes, and data) are interrelated and connected. Topology matters because it influences many attributes of the storage system built on it. Storage topology can be centralized, distributed, sharded, or hyperconverged.
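To make the sharded option concrete, here is a minimal sketch of hash-based placement, where each key is owned by exactly one node. The node names, the sample keys, and the `shard_for` helper are hypothetical, and simple modulo placement is an assumption chosen for brevity; the whitepaper does not prescribe a placement scheme, and production systems often use more elaborate ones.

```python
import hashlib


def shard_for(key: str, nodes: list) -> str:
    """Return the node that owns `key`, using hash-modulo placement."""
    # Hash the key so that placement is deterministic and evenly spread.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and map it onto a node index.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# Hypothetical cluster of three storage nodes.
nodes = ["node-a", "node-b", "node-c"]

for key in ["user:1", "user:2", "order:17", "order:18"]:
    print(f"{key} -> {shard_for(key, nodes)}")
```

The same key always maps to the same node, which is what lets a client find its data without a central lookup. The trade-off of plain modulo placement is that adding or removing a node remaps most keys.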
Data Protection – an important service that makes sure your data persists even in the event of a disaster. There are several common ways to do this:
- RAID (redundant array of independent disks) – techniques used to distribute data across multiple disks with redundancy in mind.
- Erasure coding – a method of protecting data in which the data is split into fragments that are encoded and stored together with a number of redundant parity sets.
- Replica – a full copy of a dataset distributed across multiple servers.
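The idea behind parity-based protection can be sketched with a single XOR parity block, which is a deliberate simplification: it survives the loss of one block only, whereas real erasure codes use more general encodings and can tolerate more failures. The block contents and the helper names `xor_blocks` and `raw_per_usable` are illustrative assumptions, not part of the whitepaper.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block computed from them.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the second block, then rebuild it from the survivors.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])


def raw_per_usable(data_units, redundancy_units):
    """Raw capacity consumed per unit of usable data."""
    return (data_units + redundancy_units) / data_units


# Three full replicas (one original plus two copies) versus 3 data + 1 parity.
print(raw_per_usable(1, 2))
print(raw_per_usable(3, 1))
```

The last two lines show the capacity trade-off: full replicas cost a multiple of the raw capacity, while parity-based schemes protect the same data with less extra space at the cost of computation during writes and rebuilds.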
Data services – additional features implemented on top of the core storage system functions. They vary from system to system, but common ones include replication, version control, and management features.
Encryption – storage systems can protect data by encrypting it. Encryption affects performance because of its computing overhead, although acceleration options are available on modern systems.
Physical layer – the actual hardware where the data is stored. The choice of physical storage affects the overall performance of the system and the durability of the data stored. Traditional systems use magnetic disks and flash-based SSD/NVM.