Ceph/Installation

All necessary Ceph software is available through the sys-cluster/ceph package. It contains all services as well as the basic administration utilities for managing a Ceph cluster.

Design
Before embarking on a Ceph deployment scenario, take the time to make a basic Ceph cluster design.

What is the purpose of the Ceph cluster? Is it to play around and experiment with Ceph? Is it to host all critical data in the form of RBD devices? Is it to create a highly available file server?

What features are needed on the Ceph cluster? How many monitors are likely to be needed? How much storage will be used, and how will this storage be represented (as in, how many OSDs will be available and where will they run)? Will the cluster provide S3- or Swift-like APIs to the outside world?

What are the IP addresses that will be used by the cluster? Ceph requires a static IP environment, so a well-designed network infrastructure is important for Ceph to function properly.

How will the servers be distributed across the environment? Ceph has a number of buckets that it can use to differentiate servers and make well-thought-through distribution and replication decisions. The default is an OSD on a host in a rack in a row in a room inside a data center.

There are a number of best practices to account for though:


 * Most clusters require 3 monitor servers, perhaps 5. Clusters generally do not need more than 5 monitor servers to function in even the harshest environments.
 * Distribute the monitor servers across the environment. If the cluster is over a couple of racks, make sure that the monitor servers are distributed across the racks as well.
 * There is usually no need for RAID on the file system that an OSD uses. Instead, rely on the Ceph availability and distribution.
 * OSD services do not need a lot of CPU or RAM. A metadata server, however, does benefit from a high-speed CPU and lots of memory.

Hardware layout
The hardware specification of this example consists of three machines: host1, host2, and host3. Each has three hard disks: the first drive (/dev/sda) holds the OS installation, while the second and third (/dev/sdb, /dev/sdc) are used by the OSD services. A Ceph monitor will be deployed on each machine, while the metadata service will be deployed only on host1.

System configuration
The first configuration to decide on is which Ceph version to deploy. At the time of writing, Ceph version 0.87 ("Giant") is available in the tree as ~arch while version 0.80 ("Firefly") is available as the stable release. To use the ~arch version, add the package to /etc/portage/package.accept_keywords:
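For example, the following line in that file accepts the ~arch keyword for the package (narrow the atom to a specific version if desired):

  sys-cluster/ceph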

Next, validate that the Linux kernel is configured to support Ceph.
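The exact options depend on how the cluster will be consumed from this host. As a sketch, the in-kernel RBD and CephFS clients live under the following options; OSD-only nodes mainly need a file system with extended attribute support (such as XFS) for the OSD data directories:

  Device Drivers  --->
    Block devices  --->
      <M> Rados block device (RBD)            [CONFIG_BLK_DEV_RBD]
  File systems  --->
    Network File Systems  --->
      <M> Ceph distributed file system        [CONFIG_CEPH_FS]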

Installation
With the system configuration done, install the Ceph software.

A number of USE flags are available for fine-tuning the installation.
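The exact set of flags differs between Ceph versions; it can be inspected before installation with, for example:

  emerge --pretend --verbose sys-cluster/ceph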

With the USE flags defined, install the software:
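For example:

  emerge --ask sys-cluster/ceph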

Cluster creation
Use uuidgen to generate a cluster id.
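For example (the value shown here is only illustrative; use the locally generated one in the steps below):

  uuidgen
  a7f64266-0894-4f1e-a635-d0aeaca0e993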

Create the basic skeleton for the /etc/ceph/ceph.conf file, and use the generated id for the fsid parameter.

In this example, the cluster uses a replication factor of 2 (meaning the data is replicated once, so there are two instances of each block) and a minimum of 1 (i.e. as long as one copy of the data is available, the cluster keeps serving I/O).
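A minimal /etc/ceph/ceph.conf sketch along these lines could look as follows; the fsid is the generated id from above, and the network ranges are assumptions that must match the local setup:

  [global]
  fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
  public network = 192.168.1.0/24
  cluster network = 192.168.1.0/24
  auth cluster required = cephx
  auth service required = cephx
  auth client required = cephx
  osd pool default size = 2
  osd pool default min size = 1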

Next create the administrative key. The default administrative key is called client.admin:
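For example, using ceph-authtool (a sketch; the capabilities can be tightened later):

  ceph-authtool --create-keyring /etc/ceph/ceph.client.admin.keyring \
    --gen-key -n client.admin \
    --cap mon 'allow *' --cap osd 'allow *' --cap mds 'allow *'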

Monitors
To create the monitors, first add the monitor information to /etc/ceph/ceph.conf:
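For example, for the first monitor on host1 (the address is an assumption and must match the host's static IP); repeat with [mon.1] on host2 and [mon.2] on host3:

  [mon.0]
  host = host1
  mon addr = 192.168.1.1:6789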

Next create the keyring for the monitor (so that the Ceph monitors can integrate and interact with the Ceph cluster) and add the administrative keyring to it:
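A sketch using a temporary keyring file:

  ceph-authtool --create-keyring /tmp/ceph.mon.keyring --gen-key -n mon. --cap mon 'allow *'
  ceph-authtool /tmp/ceph.mon.keyring --import-keyring /etc/ceph/ceph.client.admin.keyring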

Now create the initial monitor map (which is a binary file that the Ceph monitors use to find the default, initial monitor list).
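For example, for the three monitors of the hardware layout above (the IP addresses are assumptions):

  monmaptool --create --fsid a7f64266-0894-4f1e-a635-d0aeaca0e993 \
    --add 0 192.168.1.1 --add 1 192.168.1.2 --add 2 192.168.1.3 /tmp/monmap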

Create the file system that the monitors will use to keep their information in.
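For monitor mon.0 on host1, a sketch would be:

  mkdir -p /var/lib/ceph/mon/ceph-0
  ceph-mon --mkfs -i 0 --monmap /tmp/monmap --keyring /tmp/ceph.mon.keyring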

Repeat this step on each system with the right id (0 becomes 1 on the second monitor, and so on).

Finally, create the init script to launch the monitor at boot:
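Assuming the Gentoo-provided OpenRC scripts follow the daemon.id naming convention (the exact script names can differ between Ceph versions, so check /etc/init.d/ after installation), a sketch for mon.0 looks like this:

  ln -s ceph-mon /etc/init.d/ceph-mon.0
  rc-update add ceph-mon.0 default
  /etc/init.d/ceph-mon.0 start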

Also repeat this on each system for the right id.

Adding OSDs manually
Get a UUID for the intended OSD:
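A sketch that keeps the value in a shell variable for the following steps:

  OSD_UUID=$(uuidgen)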

Create a new OSD in the cluster:
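Using the UUID captured above:

  ceph osd create ${OSD_UUID}

The command prints the numeric id allocated to the new OSD; the examples below assume it returned 0.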

Create the mountpoint on which the data of the OSD will be stored:
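For OSD id 0:

  mkdir -p /var/lib/ceph/osd/ceph-0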

Make the filesystem for storing data and mount it (assuming you plan to store data on /dev/{partition}):
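For example, with XFS on the first data disk of the example layout (/dev/sdb1 is an assumption; substitute the actual partition):

  mkfs.xfs -f /dev/sdb1
  mount /dev/sdb1 /var/lib/ceph/osd/ceph-0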

Also, consider adding that filesystem to fstab like so:
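A matching entry could look like this (device and mount options are examples; adjust to the local partitioning and file system):

  /dev/sdb1   /var/lib/ceph/osd/ceph-0   xfs   rw,noatime,inode64   0 0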

Then, create the OSD files on it:
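A sketch for OSD 0, reusing the UUID generated earlier:

  ceph-osd -i 0 --mkfs --mkkey --osd-uuid ${OSD_UUID}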

Change the owner and group of these files to ceph:ceph, otherwise the OSD will not be able to write anything:
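For OSD 0:

  chown -R ceph:ceph /var/lib/ceph/osd/ceph-0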

Add the OSD keyring to the cluster's authentication database:
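A sketch for OSD 0, using the keyring that the --mkkey step generated in the data directory:

  ceph auth add osd.0 osd 'allow *' mon 'allow rwx' -i /var/lib/ceph/osd/ceph-0/keyring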

Adding OSDs via ceph-volume
Ceph-volume is the OSD deployment tool currently recommended by upstream; it deploys OSDs on top of LVM, so make sure LVM support is enabled before proceeding. To create a new OSD:
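For example, turning a whole disk into an OSD (ceph-volume creates the required LVM volume group and logical volume itself; /dev/sdb matches the example hardware layout):

  ceph-volume lvm create --data /dev/sdb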

Mind the id and uuid of the OSD that was created (look for the osd id and osd fsid keys in the output of ceph-volume lvm list). At the moment the OSD files are created on a tmpfs that is mounted over the OSD's data directory (/var/lib/ceph/osd/ceph-{id}). At boot these files have to be recreated from the LVM metadata; to help the boot scripts do so:
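What has to run at boot is the activation step; a sketch that recreates the tmpfs contents for every OSD on the host (a single OSD can also be activated by passing its id and fsid):

  ceph-volume lvm activate --all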

Finalizing
Add the current host to the CRUSH map if this is the first OSD on the host to participate in the cluster:
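A sketch for host1:

  ceph osd crush add-bucket host1 host
  ceph osd crush move host1 root=default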

Add each OSD to the map with a default weight value:
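For example, adding osd.0 with a weight of 1.0 under host1 (weights are commonly scaled to the OSD's capacity):

  ceph osd crush add osd.0 1.0 host=host1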

Create the init script for the OSD and have it start at boot:
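Following the same assumed naming convention as for the monitors:

  ln -s ceph-osd /etc/init.d/ceph-osd.0
  rc-update add ceph-osd.0 default
  /etc/init.d/ceph-osd.0 start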

Metadata server
Update the MDS information in /etc/ceph/ceph.conf:
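For example, for an MDS running on host1:

  [mds.host1]
  host = host1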

Create two pools - one for data and one for metadata. The number 128 in the example below is the number of placement groups to assign inside the pool. Tune this correctly depending on the size of the cluster (see Ceph's placement groups information).
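A sketch using 128 placement groups for each pool; the pool names are free to choose and are reused by the file system created below:

  ceph osd pool create cephfs_data 128
  ceph osd pool create cephfs_metadata 128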

Now create a file system that uses these pools. The name of the file system can be chosen freely - the example uses cephfs:
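Note that the metadata pool is passed first:

  ceph fs new cephfs cephfs_metadata cephfs_data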

Create the keyring for the MDS service:
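A sketch for mds.host1 (the capability set shown is a common minimal one; adjust to local policy):

  mkdir -p /var/lib/ceph/mds/ceph-host1
  ceph auth get-or-create mds.host1 mon 'allow profile mds' osd 'allow rwx' mds 'allow *' \
    -o /var/lib/ceph/mds/ceph-host1/keyring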

Create the init script and have it start at boot:
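Again following the assumed naming convention:

  ln -s ceph-mds /etc/init.d/ceph-mds.host1
  rc-update add ceph-mds.host1 default
  /etc/init.d/ceph-mds.host1 start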