Ready or not, it’s time to rethink storage

There are probably exabytes of just internet cat pictures.

It’s no secret that we’re in the midst of a data explosion. Digital assets have grown enormously thanks to web-scale services like Facebook, YouTube, and Netflix.

Half of households use video-on-demand services, and the medical industry has vast imaging needs. These are only some of the causes.

Today’s data challenges

In the “Why software-defined storage matters” session, Ross Turk, director of storage product marketing at Red Hat, said that data is now an asset companies cherish, which means storage matters more than ever. But all of the data we create brings challenges for enterprises:

  • Increased pressure on capacity, scalability, and cost
  • The need to access data from anywhere, anytime, on any device, which requires unprecedented agility that you can’t get from traditional storage
  • The need for flexibility to store data on premise or in the cloud, as modern services require
  • Advanced data protection that ensures integrity and high availability at very large scale

Data growth is outpacing IT storage budgets

Traditional storage appliances compound the problem because most data is stored in a proprietary storage system with a layer of proprietary software that hides complexity, and with it flexibility, from users. It also leads to price premiums due to vendor lock-in.

Then there’s public cloud storage. It’s convenient, and a lot of data goes there today. But this approach also hides complexity, and to some extent flexibility, from users.

Organizations need to figure out what’s next, and many of them are spending a lot of time rethinking storage.

The datacenter is evolving

Innovation is happening in every dimension of the datacenter. The waterfall development model led to agile, which has led to DevOps. Bare-metal deployment led to virtual machines and then to microservices. In storage, the scale-up model led to scale-out and then to software-defined storage (SDS).

When organizations transform their datacenters, they usually leave storage for last. Then they realize that everything they’ve modernized still runs on old storage, which doesn’t work with the rest of their datacenter.

What is software-defined storage?

Turk asked who in the audience knew what SDS is. Quite a few people raised their hands, but many went down when he asked who could define it. Red Hat defines it as a combination of server-based storage and storage orchestration:

Server-based storage is the use of software and standard hardware to provide services traditionally provided by single-purpose storage systems. This lets you add storage nodes at the same time you add compute nodes, scaling in much smaller increments. You can even converge them so your storage node and compute node are the same node, something you can’t do with a storage appliance.

Enterprise storage has been growing at 40%+ per year¹, and the share of storage deployed in servers grew 20%+ between 2010 and 2016.²

Storage orchestration is the ability to provision, grow, shrink, and decommission storage resources on demand and programmatically. Instead of doing this manually, you can spin up new resources programmatically and make them immediately available.
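
To make “programmatically” concrete, here is a minimal sketch using Ceph’s Python bindings (the rados and rbd modules). The pool name, image name, and config path are assumptions for the example, not details from the session:

    # Provision, grow, and decommission a block device programmatically
    # using Ceph's Python bindings (python-rados / python-rbd).
    import rados
    import rbd

    # Assumed: a reachable cluster config and a pool named "rbd"
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')
    try:
        store = rbd.RBD()

        # Provision: create a 10 GiB image on demand
        store.create(ioctx, 'app-volume', 10 * 1024**3)

        # Grow: resize it to 20 GiB while it stays in service
        image = rbd.Image(ioctx, 'app-volume')
        image.resize(20 * 1024**3)
        image.close()

        # Decommission: remove it when the workload goes away
        store.remove(ioctx, 'app-volume')
    finally:
        ioctx.close()
        cluster.shutdown()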

Why SDS matters

  1. Proprietary hardware is giving way to common off-the-shelf hardware. Instead of using specialized servers, we’re using standardized servers. Costs are lower. The supply chain is standardized. And standardization makes storage more convenient. Rather than having 15-20 appliance models to choose from, you can build a cluster that’s perfect for your workload using standard hardware. Clusters can be performance-optimized, capacity-optimized, or throughput-optimized. Need capacity? Add more disks. Too slow? Add more servers. Clusters can become larger or smaller with no downtime.
  2. Scale-up architecture is now scale-out architecture. Instead of buying larger machines, you buy more machines. This can increase operational flexibility. Generally, a storage appliance gets slower the more you fill it with disks because it has only so many processors. SDS clusters get faster as you make them larger because you’re adding servers while you’re adding disks. That leads to performance that scales with capacity, even at multi-petabyte scale.
  3. Intelligence that used to be locked in hardware is now in software. Software is more flexible than hardware. Deploy SDS on bare metal, inside containers, inside virtual machines, or in a public cloud. Deploy it on a single server, or thousands. Upgrade and reconfigure it on the fly. It can grow and shrink programmatically to meet changing demands.
  4. Open development leads to better technology than closed development. Innovation that used to happen in conference rooms or on internal mailing lists now happens on GitHub, on public mailing lists, and in bug trackers. Open source leads to more flexible, well-integrated technology.

What can you use SDS for?

The ability to massively distribute data makes Ceph uniquely suited for OpenStack. It:

  • Allows for the instantaneous parallel creation of virtual machines at massive scale.
  • Integrates easily and tightly with OpenStack Cinder, Glance, Nova, Keystone, and Manila (see the sketch after this list).
  • Offers instant backup capabilities.
  • Provides persistent object, file, and database storage for apps.

Ceph can grow to petabyte scale, which makes it great for object storage. It:

  • Stores unstructured data at web scale, using standard hardware.
  • Works with industry-standard APIs, such as S3 and Swift, for a wide range of application compatibility (see the example after this list).
  • Spans multiple geographical regions with no single point of failure.
  • Matches the distributed architecture of SDS.
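
As an example of the industry-standard API point, here is a minimal sketch that writes an object to Ceph through its S3-compatible RADOS Gateway using boto3. The endpoint URL, credentials, and object names are placeholders:

    # Store an object through Ceph's S3-compatible gateway (RADOS Gateway).
    import boto3

    s3 = boto3.client(
        's3',
        endpoint_url='http://rgw.example.com:8080',   # placeholder RGW endpoint
        aws_access_key_id='ACCESS_KEY',               # placeholder credentials
        aws_secret_access_key='SECRET_KEY',
    )

    s3.create_bucket(Bucket='cat-pictures')
    s3.put_object(Bucket='cat-pictures', Key='cat-001.jpg', Body=b'...image bytes...')

    # The same code works against any S3-compatible store, so applications
    # need no Ceph-specific client library.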

Gluster works in many places. Because it’s flexible, it can be deployed inside containers (where large storage appliances can’t fit). It:

  • Offers persistent storage to apps running in containers (see the sketch after this list).
  • Lets apps and storage co-exist on the same hardware.
  • Allows for higher server usage and lowers operational costs.
  • Generates only 3%–10% overhead on converged servers.
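
As a sketch of how an app in a container gets persistent Gluster-backed storage, here is an example using the official Kubernetes Python client. The image name, Endpoints object, and Gluster volume name are assumptions for illustration:

    # Run a containerized app with a GlusterFS-backed volume mounted at /data.
    from kubernetes import client, config

    config.load_kube_config()   # assumes a working kubeconfig

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name='app-with-gluster'),
        spec=client.V1PodSpec(
            containers=[client.V1Container(
                name='app',
                image='registry.example.com/app:latest',   # hypothetical image
                volume_mounts=[client.V1VolumeMount(name='shared-data',
                                                    mount_path='/data')],
            )],
            volumes=[client.V1Volume(
                name='shared-data',
                glusterfs=client.V1GlusterfsVolumeSource(
                    endpoints='glusterfs-cluster',   # hypothetical Endpoints object
                    path='gv0',                      # hypothetical Gluster volume
                ),
            )],
        ),
    )

    client.CoreV1Api().create_namespaced_pod(namespace='default', body=pod)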


Notes:

  1. “Choosing a Dynamic Storage Foundation for OpenStack.” 451 Research. May 2016.
  2. Q1 2015 Quarterly Storage Forecast. IDC. June 2015.
