Live session blog from Cloud Connect – Pondering Infinity with Ceph

Posted by Gina Rosenthal in ceph

These are my live notes from the session Ross Turk of Inktank presented today at Cloud Connect Chicago.

Example of how things scale bigger and bigger: hotels (e.g., hotels in Vegas). What if a hotel had a million rooms? It's always under construction and self-maintaining.

At some point scaling becomes impractical.

Cool yet crazy elevator example to explain the Ceph project.

As Ceph was being designed, these were the considerations:

  • Philosophy – open source, community-focused
  • Design – scalable, no single point of failure, software based (not appliance based), self-managing

Architecture:

  • Object store (RADOS) is the base
    • OSDs run on top of a file system (btrfs, xfs), one per disk (recommended); at least 3 are required to make a cluster. They serve objects to clients and intelligently peer with each other to perform replication tasks.
    • Monitors – members of the cluster that maintain the cluster map; they do not serve data
  • Librados – programming library that allows you to interact with the object store directly (no HTTP overhead)
  • RADOSGW – REST gateway that is compatible with S3 and Swift. Northbound it speaks REST; southbound it speaks native RADOS. Supports buckets and accounting.
  • RBD – RADOS Block Device. Assembles blocks from the cluster into an image that can be mounted as a block device or used as a virtual machine disk. Enables interesting ways to live migrate VMs.
  • CephFS – POSIX-compliant distributed file system. Metadata needs to be managed, so there is a metadata server, needed only if you are running CephFS.
    • Metadata servers are clustered; they don’t share data
  • CRUSH – the algorithm that determines where data is stored and retrieved
    • hashes object name, puts it into placement groups
    • the placement group gets passed to CRUSH, along with the cluster state and rule set
    • all work happens on the client
    • placement is pseudo-random
    • Configuration is rule-based
    • If a node gets lost, Ceph recalculates CRUSH, figures out where the duplicate should be, and moves it there, so it's ready for the client when the client does its own calculation
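
To make the placement idea concrete, here is a rough sketch of the flow described in the CRUSH notes: hash the object name into a placement group, then compute which OSDs hold that group, entirely on the client. This is not the real CRUSH algorithm. It swaps in rendezvous (highest-random-weight) hashing as a stand-in and ignores the cluster hierarchy and rule set. The function names, the OSD names, and the defaults of 64 placement groups and 2 replicas are all made up for illustration.

```python
import hashlib


def stable_hash(text: str) -> int:
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def object_to_pg(object_name: str, pg_count: int) -> int:
    """Step 1: hash the object name into one of pg_count placement groups."""
    return stable_hash(object_name) % pg_count


def pg_to_osds(pg: int, osds: list, replicas: int) -> list:
    """Step 2: pick OSDs for a placement group.

    Stand-in for CRUSH: rank every OSD by a hash of the (pg, osd) pair and
    take the top `replicas`. Any client holding the same OSD list computes
    the same answer, so no lookup table or central server is needed.
    """
    ranked = sorted(osds, key=lambda osd: stable_hash(f"{pg}:{osd}"), reverse=True)
    return ranked[:replicas]


def locate(object_name: str, osds: list, pg_count: int = 64, replicas: int = 2) -> list:
    """Client-side calculation: object name -> placement group -> OSDs."""
    return pg_to_osds(object_to_pg(object_name, pg_count), osds, replicas)


if __name__ == "__main__":
    cluster = ["osd.0", "osd.1", "osd.2", "osd.3"]  # hypothetical OSD names
    before = locate("my-object", cluster)
    print("placement with all OSDs up:", before)

    # Simulate losing the primary: recompute against the surviving OSDs.
    survivors = [osd for osd in cluster if osd != before[0]]
    after = locate("my-object", survivors)
    print("placement after losing", before[0], ":", after)
```

In this sketch, losing an OSD only changes the placement for groups that lived on it: the surviving replica stays where it was and a new OSD is chosen for the second copy. That loosely mirrors the "recalculate and move the duplicate" behavior in the last bullet, though real CRUSH also weighs the rule set and the physical layout of the cluster.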

 
