
Connecting and configuring Ceph RBD using a Linux client

Ceph RBD (RADOS Block Device) provides users with a network block device that looks like a local disk on the system where it is connected. The block device is fully managed by the user: a user can create a file system on it and use it according to their needs.

Advantages of RBD

  • Possibility to enlarge the image of the block device.
  • Import / export of the block device image.
  • Striping and replication within the cluster.
  • Possibility to create read-only snapshots and restore them (if you need snapshots on the RBD level, you must contact us).
  • Possibility to connect using a Linux or QEMU/KVM client.

Tutorial for connecting RBD using a Linux client

Preparing for RBD connection

To connect RBD, it is recommended to have a newer kernel version on your system. Older kernels ship deprecated RBD modules, so not all advanced features are supported. The developers recommend kernel version 5.0 or higher; however, some functionality has been backported to the CentOS 7 kernel.
For proper functioning, it is highly desirable to use the same version of the Ceph tools as the version currently operated on our clusters, which is version 18, code-named Reef. If you use a very new operating system, you may need to use the newest repository packages of the Ceph tools.

The following instructions are valid for the CentOS / RHEL distributions. Instructions for Ubuntu / Debian are at the end of this section.

First, install the release.asc key for the Ceph repository.

sudo rpm --import 'https://download.ceph.com/keys/release.asc'

In the directory /etc/yum.repos.d/ create a text file ceph.repo and fill in the record for the Ceph tools. For CentOS 9, modify the baseurl line from "rpm-nautilus/el7" to "rpm-reef/el9".
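A minimal ceph.repo for CentOS 9 might look like the following sketch, which follows the upstream Ceph repository layout (adjust the release and distribution path if you run a different version):

[ceph]
name=Ceph packages for $basearch
baseurl=https://download.ceph.com/rpm-reef/el9/$basearch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-noarch]
name=Ceph noarch packages
baseurl=https://download.ceph.com/rpm-reef/el9/noarch
enabled=1
priority=2
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc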

Some packages from the Ceph repository also require third-party libraries for proper functioning, so add the EPEL repository.

CentOS 9

sudo dnf install -y epel-release

RedHat 9

sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-9.noarch.rpm

Finally, install the basic tools for Ceph which also include RBD support.

sudo yum install ceph-common
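To verify that the installed tools match the cluster release (Reef is version 18), you can check the version of the installed client:

ceph --version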

Installing Ceph tools in Debian / Ubuntu

First, it is necessary to add the appropriate repositories to the package manager: add the following lines to /etc/apt/sources.list.

deb https://eu.ceph.com/debian-reef/ bionic main
deb http://cz.archive.ubuntu.com/ubuntu/ bionic main   # needed for one package during installation

Install the necessary dependency packages.

sudo apt install x11-common libevent-core-2.1-7 libevent-pthreads-2.1-7

Add Ubuntu PGP keys.

sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E84AC2C0460F3994
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 3B4FE6ACC0B21F32

Install the Ceph package.

sudo apt install ceph


RBD configuration and its mapping

Use the credentials you received from the system administrator to configure and connect the RBD. They are the following:

  • pool name: rbd_vo_poolname
  • image name: vo_name_username
  • keyring: [client.rbd_user] key = key_hash ==

In the directory /etc/ceph/ create the text file ceph.conf with the following content.

In the case of the Jihlava data storage with the code designation CL1:

[global]
fsid = 19f6785a-70e1-45e8-a23a-5cff0c39aa54
mon_initial_members = mon001-cl1-aba-jihl1,mon002-cl1-aba-jihl1,mon003-cl1-aba-jihl1
mon_host = [v2:78.128.244.33:3300,v1:78.128.244.33:6789],[v2:78.128.244.37:3300,v1:78.128.244.37:6789],[v2:78.128.244.41:3300,v1:78.128.244.41:6789]
auth_client_required = cephx

In the case of the Pilsen data storage with the code designation CL2:

[global]
fsid = 3ea58563-c8b9-4e63-84b0-a504a5c71f76
mon_initial_members = mon001-cl2-aba-plz1,mon005-cl2-aba-plz1,mon007-cl2-aba-plz1
mon_host = [v2:78.128.244.65:3300/0,v1:78.128.244.65:6789/0],[v2:78.128.244.69:3300/0,v1:78.128.244.69:6789/0],[v2:78.128.244.71:3300/0,v1:78.128.244.71:6789/0]
auth_client_required = cephx

In the case of the Ostrava data storage with the code designation CL3:

[global]
fsid = b16aa2d2-fbe7-4f35-bc2f-3de29100e958
mon_initial_members = mon001-cl3,mon002-cl3,mon003-cl3
mon_host = [v2:78.128.244.240:3300/0,v1:78.128.244.240:6789/0],[v2:78.128.244.241:3300/0,v1:78.128.244.241:6789/0],[v2:78.128.244.242:3300/0,v1:78.128.244.242:6789/0]
auth_client_required = cephx

In the case of the Brno data storage with the code designation CL4:

[global]
fsid = c4ad8c6f-7ef3-4b0e-873c-b16b00b5aac4
mon_initial_members = mon001-cl4,mon002-cl4,mon003-cl4,mon004-cl4,mon005-cl4
mon_host = [v2:78.128.245.29:3300/0,v1:78.128.245.29:6789/0] [v2:78.128.245.30:3300/0,v1:78.128.245.30:6789/0] [v2:78.128.245.31:3300/0,v1:78.128.245.31:6789/0]
auth_client_required = cephx

In the case of the Prague data storage with the code designation CL5:

[global]
fsid = c581dace-40ff-4519-878b-c0ffeec0ffee
mon_initial_members = mon001-cl5,mon002-cl5,mon003-cl5,mon004-cl5,mon005-cl5
mon_host = [v2:78.128.245.157:3300/0,v1:78.128.245.157:6789/0] [v2:78.128.245.158:3300/0,v1:78.128.245.158:6789/0] [v2:78.128.245.159:3300/0,v1:78.128.245.159:6789/0]
auth_client_required = cephx

Next, in the directory /etc/ceph/ create the text file ceph.keyring and save the keyring in it, see the example below.

[client.rbd_user]
	key = sdsaetdfrterp+sfsdM3iKY5teisfsdXoZ5==

We strongly recommend using the --exclusive option while mapping the RBD image. That option prevents mapping the image on multiple machines or multiple times locally. Such multiple mapping can cause data corruption! So if you foresee any risk of multiple mapping, use the --exclusive option.

On the other hand, do not use the --exclusive option if you need to mount the RBD image on multiple machines, e.g. for a clustered file system.

Now RBD mapping can be performed (rbd_user is the string originating from the keyring after stripping the client. prefix).

sudo rbd --id rbd_user --exclusive device map name_pool/name_image
If the location of the files ceph.conf and username.keyring differs from the default directory /etc/ceph/, the corresponding paths must be specified during mapping. See below.
sudo rbd -c /home/username/ceph/ceph.conf -k /home/username/ceph/username.keyring --id rbd_user device map name_pool/name_image

Then check the connection in kernel messages.

dmesg
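For example, to show only the RBD-related kernel messages:

dmesg | grep -i rbd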

Now check the status of RBD.

sudo rbd device list | grep "name_image"

Encrypting and creating a file system

The next step is to encrypt the mapped image. Use cryptsetup-luks for encryption.

sudo yum install cryptsetup-luks

Then encrypt the device.

sudo cryptsetup -s 512 luksFormat --type luks2 /dev/rbdX

Finally, check the settings.

sudo cryptsetup luksDump /dev/rbdX

In order to perform further actions on an encrypted device, it must be decrypted first.

sudo cryptsetup luksOpen /dev/rbdX luks_rbdX

We recommend using XFS instead of EXT4 for larger images, or those that will need to be enlarged to more than 200 TB over time, because EXT4 has a limit on the number of inodes.

Now create a file system on the device; here is an XFS example.

sudo mkfs.xfs /dev/mapper/luks_rbdX
If you use XFS, do not use the nobarrier option while mounting; it could cause data loss!

Once the file system is ready, we can mount the device in a pre-created folder in /mnt/.
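If the target folder does not exist yet, create it first (here /mnt/rbd, matching the mount command below):

sudo mkdir -p /mnt/rbd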

sudo mount /dev/mapper/luks_rbdX /mnt/rbd

Ending work with RBD

Unmount the volume.

sudo umount /mnt/rbd/

Lock the volume.

sudo cryptsetup luksClose /dev/mapper/luks_rbdX

Unmap the volume.

sudo rbd --id rbd_user device unmap /dev/rbdX
To get better performance, choose an appropriate size of the read_ahead cache depending on your amount of memory.

Example for 8GB:
echo 8388608 > /sys/block/rbd0/queue/read_ahead_kb

Example for 512MB:
echo 524288 > /sys/block/rbd0/queue/read_ahead_kb

To apply the changes, you have to unmap the image and map it again.
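After remapping, the currently applied value can be checked:

cat /sys/block/rbd0/queue/read_ahead_kb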

The approach described above is not persistent (it won't survive a reboot). To make it persistent, add the following line to the "/etc/udev/rules.d/50-read-ahead-kb.rules" file.

# Setting specific kernel parameters for a subset of block devices (Ceph RBD)
KERNEL=="rbd[0-9]*", ENV{DEVTYPE}=="disk", ACTION=="add|change", ATTR{bdi/read_ahead_kb}="524288"

Permanent mapping of RBD

The following settings provide automatic RBD connection, including LUKS decryption and file system mounting, plus proper disconnection (in reverse order) when the machine is shut down in a controlled manner.

Add the following lines to the configuration files:

rbdmap tool
ATTENTION the rbdmap.service must be enabled using systemctl enable rbdmap.service.

/etc/ceph/rbdmap

# RbdDevice             Parameters
#poolname/imagename     id=client,keyring=/etc/ceph/ceph.client.keyring
rbd_pool_name/image_name id=rbd_user,keyring=/etc/ceph/ceph.keyring,exclusive
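As noted above, enable the service so the mapping is performed at boot:

sudo systemctl enable rbdmap.service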

LUKS encryption
ATTENTION crypttab will create a corresponding service called systemd-cryptsetup@rbd_luks.service.
/etc/crypttab

# <target name> <source device>         <key file>      <options>
rbd_luks /dev/rbd/rbd_pool_name/image_name  /etc/ceph/luks.keyfile luks,_netdev

/etc/ceph/luks.keyfile is the LUKS key.

The path to the block device ("<source device>") is generally /dev/rbd/$POOL/$IMAGE.
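The key file can be created and registered with the LUKS container, for example, as follows (a sketch; the 4 KiB key size is an arbitrary choice, and the paths follow the crypttab example above):

sudo dd if=/dev/urandom of=/etc/ceph/luks.keyfile bs=1024 count=4
sudo chmod 600 /etc/ceph/luks.keyfile
sudo cryptsetup luksAddKey /dev/rbd/rbd_pool_name/image_name /etc/ceph/luks.keyfile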

fstab
ATTENTION fstab will create the service dev-mapper-rbd_luks.device.
ATTENTION fstab will also create the service mnt-rbd_luks.mount, which will be used for manual connecting and disconnecting of the RBD image.
/etc/fstab

# <file system> <mount point>   <type>  <options>       <dump>  <pass>
/dev/mapper/rbd_luks /mnt/rbd_luks xfs defaults,noatime,auto,_netdev 0 0
The path to the LUKS container ("<file system>") is generally /dev/mapper/$LUKS_NAME,
where $LUKS_NAME is defined in /etc/crypttab (as "<target name>").

systemd
We strongly recommend doing the editing via the command systemctl edit systemd-cryptsetup@rbd_luks.service and then saving the changes as the 10-deps.conf file.

/etc/systemd/system/systemd-cryptsetup@rbd_luks.service.d/10-deps.conf

[Unit]
After=rbdmap.service
Requires=rbdmap.service
Before=mnt-rbd_luks.mount
In one case, on Debian 10, the systemd unit was for some reason named
ceph-rbdmap.service instead of rbdmap.service
(the After= and Requires= lines must be adjusted accordingly).
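To verify that the drop-in is in effect, you can display the resulting unit definition (an optional check; the unit name follows the crypttab example above):

systemctl cat systemd-cryptsetup@rbd_luks.service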


manual connection / disconnection

connection
systemctl start mnt-rbd_luks.mount

- If the dependencies of the systemd units are correct, this performs the RBD map, unlocks LUKS, and mounts all the automatic file systems dependent on rbdmap that the specified .mount unit needs (⇒ it mounts both images in the described configuration).

disconnection
systemctl stop rbdmap.service

(or systemctl stop ceph-rbdmap.service)
- if the dependencies are set correctly, this command performs the umount, the LUKS close, and the RBD unmap.

resize image with LUKS container

When resizing an encrypted image, you need to follow the order of the steps and use the right resizing tool at each layer.

rbd -c ceph_conf -k ceph_keyring --id ceph_user resize rbd_pool_name/image_name --size 200T
rbd -c ceph_conf -k ceph_keyring --id ceph_user device map rbd_pool_name/image_name
cryptsetup open --key-file luks_key_file /dev/rbd/rbd_pool_name/image_name rbd_luks
cryptsetup resize --key-file luks_key_file --verbose rbd_luks
mount /dev/mapper/rbd_luks /mnt/mount_point
xfs_growfs /mnt/mount_point
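Afterwards, the new size can be verified, for example (an optional check using the same credentials and mount point as above):

rbd -c ceph_conf -k ceph_keyring --id ceph_user info rbd_pool_name/image_name
df -h /mnt/mount_point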

Frequently asked questions

Connection timeout

Problem description: unable to map RBD; the connection times out.

Solution:

Most likely, your firewall is blocking the initiation of communication to the internet. It is necessary to allow the address range of the given cluster on port 3300 and the port range 6789-7568.

1. Set “related/established” on the firewall.

2. Configure the firewall for the address range of the given cluster clX and allow port 3300/tcp and the port range 6789-7568/tcp; see the sketch after the list of ranges below.

cl1 - 78.128.244.32/27
cl2 - 78.128.244.64/26
cl3 - 78.128.244.128/25
cl4 - 78.128.245.0/25
cl5 - 78.128.245.128/25
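For example, with iptables the rules might look like this (an illustrative sketch for the cl1 range; adapt the addresses and your firewall tooling as needed):

# accept return traffic for established connections
sudo iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# allow outgoing connections to the cl1 cluster range on the Ceph ports
sudo iptables -A OUTPUT -d 78.128.244.32/27 -p tcp --dport 3300 -j ACCEPT
sudo iptables -A OUTPUT -d 78.128.244.32/27 -p tcp --dport 6789:7568 -j ACCEPT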

3. Activate jumbo frames (support for large frames). These must be correctly configured throughout the entire path up to the CESNET backbone network, i.e., they must be set on all your active network elements towards the CESNET network, as well as on the server where you are trying to connect the RBD image. We recommend setting 9000 bytes on the server; for active network elements, it depends on several factors, which you should discuss with your network administrator.
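For example, to set the MTU on the server temporarily (eth0 is a placeholder for your actual interface name; make the setting persistent via your distribution's network configuration):

sudo ip link set dev eth0 mtu 9000
ip link show dev eth0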
