19 Using Rook to Build a Production-Ready Storage Environment - Practical Implementation #
Rook is a storage orchestration framework built on top of Kubernetes. It supports deploying and managing several underlying storage systems, such as Ceph and NFS, and helps administrators automate the entire storage lifecycle: deployment, bootstrapping, configuration, provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource management. That is a long list of tasks, but Rook's goal is to reduce operational complexity by letting its Kubernetes operators handle them for you.
Managing Ceph Cluster with Rook #
Ceph distributed storage is the first storage engine officially marked Stable in Rook. While verifying Rook's Ceph support, I found the community documents and scripts mixed together, which makes it hard for beginners to walk through setting up Ceph with Rook step by step. This reflects the long, iterative struggle with the technical difficulty and compatibility issues of distributed storage. Rook's original goal was to lower the barrier to deploying and managing Ceph clusters, but it has not fully lived up to that: the initial experience is unfriendly, and the official documentation leaves many issues unresolved.
Before installing Ceph, note that the latest Ceph releases support only the BlueStore backend, which works directly on raw devices and cannot create its storage blocks on top of a local file system. Because the Rook documentation is confusing, we first need to locate the installation script directory ourselves, which is at:
https://github.com/rook/rook/tree/master/cluster/examples/kubernetes/ceph
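As a quick check of the raw-device requirement above, the output of `lsblk -n -o NAME,FSTYPE` can be filtered for devices that report no filesystem. The `raw_candidates` helper below is my own illustrative sketch, not part of Rook:

```shell
# Print device names whose FSTYPE column is empty, i.e. candidates for
# BlueStore OSDs. Expects `lsblk -n -o NAME,FSTYPE` style input on stdin.
# Note: parent disks that merely contain partitions also show an empty
# FSTYPE, so treat this as a first pass, not a definitive answer.
raw_candidates() {
  awk 'NF == 1 { print $1 }'
}

# On a real node you would run:  lsblk -n -o NAME,FSTYPE | raw_candidates
# Sample input below: sda1 carries ext4, sdb has no filesystem at all,
# so only sdb is printed.
printf 'sda1 ext4\nsdb\n' | raw_candidates
```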
$ git clone https://github.com/rook/rook.git
$ cd rook
$ git checkout release-1.4
$ cd cluster/examples/kubernetes/ceph
$ kubectl create -f common.yaml
# Check if the namespace "rook-ceph" exists
$ kubectl get namespace
$ kubectl create -f operator.yaml
# Make sure the pods from the steps above are in the Running or Completed state before proceeding, otherwise the next stage is likely to fail. This may take some time.
$ kubectl create -f cluster.yaml
# Wait for the Ceph cluster to be created
$ kubectl -n rook-ceph get pods
# Expected pods: 1 mgr, 3 mon,
# rook-ceph-crashcollector (one per node),
# rook-ceph-osd (one per disk, indexed from 0)
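The "make sure pods are Running before proceeding" step can be scripted with a small retry loop. This is an illustrative sketch (the `wait_for` helper is my own, not a Rook tool); in practice the check would be a `kubectl get pod` invocation like the one in the comment:

```shell
# Retry a check command until it succeeds, or give up after N attempts,
# sleeping one second between attempts.
# An example check in this context (not run here) would be:
#   kubectl -n rook-ceph get pod -l app=rook-ceph-operator \
#     -o jsonpath='{.items[0].status.phase}' | grep -q Running
wait_for() {
  tries=$1; shift
  i=0
  while ! "$@"; do
    i=$((i + 1))
    [ "$i" -ge "$tries" ] && return 1
    sleep 1
  done
  return 0
}
```

For instance, `wait_for 300 some_check` retries for roughly five minutes before giving up.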
Troubleshooting Ceph often requires the toolbox to inspect the cluster's state. Follow the steps below to deploy the toolbox:
$ kubectl create -f toolbox.yaml
$ kubectl -n rook-ceph get pods | grep ceph-tools
rook-ceph-tools-649c4dd574-gw8tx 1/1 Running 0 3m20s
$ kubectl -n rook-ceph exec -it rook-ceph-tools-649c4dd574-gw8tx -- bash
$ ceph -s
  cluster:
    id:     9ca03dd5-05bc-467f-89a8-d3dfce3b9430
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum a,d,e (age 12m)
    mgr: a (active, since 8m)
    osd: 44 osds: 44 up (since 13m), 44 in (since 13m)

  data:
    pools:   1 pools, 1 pgs
    objects: 0 objects, 0 B
    usage:   45 GiB used, 19 TiB / 19 TiB avail
    pgs:     1 active+clean
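For scripting, the health field can be pulled out of the `ceph -s` text output. A rough sketch follows; the `ceph_health` helper is illustrative (parsing `ceph -s --format json` would be more robust in production):

```shell
# Extract the value after "health:" from `ceph -s` style output on stdin.
ceph_health() {
  awk '/health:/ { print $2 }'
}

# With status output like the one shown above, this prints HEALTH_OK:
printf '  cluster:\n    health: HEALTH_OK\n' | ceph_health
```

A script can then gate on the result, e.g. `[ "$(ceph -s | ceph_health)" = HEALTH_OK ]`.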
# Available capacity in the Ceph cluster
$ ceph df
# Distribution of Ceph OSDs across nodes
$ ceph osd tree
# Delete the Ceph toolbox
$ kubectl delete -f toolbox.yaml
Using the Dashboard to view the status of Ceph:
$ vim dashboard-external-https.yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-dashboard-external-https
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
spec:
  ports:
    - name: dashboard
      port: 8443
      protocol: TCP
      targetPort: 8443
  selector:
    app: rook-ceph-mgr
    rook_cluster: rook-ceph
  sessionAffinity: None
  type: NodePort
$ kubectl create -f dashboard-external-https.yaml
$ kubectl -n rook-ceph get service
rook-ceph-mgr-dashboard-external-https NodePort 10.107.117.151 <none> 8443:31955/TCP 8m23s
The NodePort is 31955, so the dashboard can be reached at https://master_ip:31955. The username is admin, and the password can be read from the dashboard secret:
$ kubectl -n rook-ceph get secret rook-ceph-dashboard-password -o jsonpath="{['data']['password']}" | base64 --decode && echo
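What that pipeline does can be seen with a stand-in value; the encoded string below is a made-up example, not the real dashboard password:

```shell
# The secret stores the password base64-encoded; piping it through
# `base64 --decode` recovers the plain text, as in the command above.
# "cGFzc3dvcmQxMjM=" is an illustrative value that decodes to "password123".
encoded='cGFzc3dvcmQxMjM='
printf '%s' "$encoded" | base64 --decode && echo
```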
Clean up Ceph:
$ cd rook/cluster/examples/kubernetes/ceph
# Delete the Ceph cluster custom resource (the object created from cluster.yaml)
$ kubectl -n rook-ceph delete cephcluster rook-ceph
# Confirm that the CephCluster resource has been removed before continuing
$ kubectl -n rook-ceph get cephcluster
# Then remove the operator and the common resources
$ kubectl delete -f operator.yaml
$ kubectl delete -f common.yaml
Managing NFS File System with Rook #
NFS is still a common storage solution in many enterprises, and using Rook to manage NFS file systems can greatly simplify the storage environment for developers. Before installing Rook, you need to install the NFS client package on every node: nfs-utils on CentOS nodes and nfs-common on Ubuntu nodes. Then you can install Rook. The steps are as follows:
git clone --single-branch --branch v1.4.6 https://github.com/rook/rook.git
cd rook/cluster/examples/kubernetes/nfs
kubectl create -f common.yaml
kubectl create -f provisioner.yaml
kubectl create -f operator.yaml
# Check the status
[root@dev-mng-temp ~]# kubectl -n rook-nfs-system get pod
NAME READY STATUS RESTARTS AGE
rook-nfs-operator-59fb455d77-2cxn4 1/1 Running 0 75m
rook-nfs-provisioner-b4bbf4cc4-qrzqd 1/1 Running 1 75m
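The CentOS/Ubuntu client-package split mentioned above can be captured in a tiny helper. The `nfs_client_pkg` function is illustrative, not part of Rook:

```shell
# Map a package manager to the NFS client package it should install:
# nfs-utils on yum/dnf (CentOS/RHEL), nfs-common on apt (Ubuntu/Debian).
nfs_client_pkg() {
  case "$1" in
    yum|dnf)     echo nfs-utils ;;
    apt|apt-get) echo nfs-common ;;
    *)           return 1 ;;
  esac
}

# e.g. on a CentOS node:  sudo yum install -y "$(nfs_client_pkg yum)"
```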
Create the RBAC permissions; rbac.yaml has the following content:
---
apiVersion: v1
kind: Namespace
metadata:
  name: rook-nfs
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-nfs-server
  namespace: rook-nfs
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-nfs-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get"]
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames: ["rook-nfs-policy"]
    verbs: ["use"]
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups:
      - nfs.rook.io
    resources:
      - "*"
    verbs:
      - "*"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-nfs-provisioner-runner
subjects:
  - kind: ServiceAccount
    name: rook-nfs-server
    namespace: rook-nfs
roleRef:
  kind: ClusterRole
  name: rook-nfs-provisioner-runner
  apiGroup: rbac.authorization.k8s.io
Apply the YAML to create the permissions:
kubectl create -f rbac.yaml
The current mainstream practice is to create the NFSServer on dynamically provisioned storage. The steps are as follows:
kubectl create -f nfs.yaml
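The nfs.yaml applied above defines the NFSServer custom resource. A sketch along the lines of the Rook v1.4 NFS documentation, with names matching the StorageClass below; the backing claim name nfs-default-claim follows the upstream example and assumes the cluster has a default StorageClass:

```yaml
apiVersion: nfs.rook.io/v1alpha1
kind: NFSServer
metadata:
  name: rook-nfs
  namespace: rook-nfs
spec:
  replicas: 1
  exports:
    - name: share1
      server:
        accessMode: ReadWrite
        squash: "none"
      # The export is backed by a PVC; in the upstream example this claim
      # (nfs-default-claim) is bound via the cluster's default StorageClass.
      persistentVolumeClaim:
        claimName: nfs-default-claim
```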
# sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  labels:
    app: rook-nfs
  name: rook-nfs-share1
parameters:
  exportName: share1
  nfsServerName: rook-nfs
  nfsServerNamespace: rook-nfs
provisioner: rook.io/nfs-provisioner
reclaimPolicy: Delete
volumeBindingMode: Immediate
`kubectl create -f sc.yaml` will create a StorageClass, and then resources can be requested:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rook-nfs-pv-claim
spec:
  storageClassName: "rook-nfs-share1"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
`kubectl create -f pvc.yaml` will create a file volume. Verify the result:
[root@dev-mng-temp nfs]# kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
rook-nfs-pv-claim Bound pvc-504eb26d-1b6f-4ad8-9318-75e637ab50c7 1Mi RWX rook-nfs-share1 7m5s
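To sanity-check the bound claim, a minimal pod can mount it and write to the share. The pod name, container command, and mount path below are illustrative, not from the Rook examples:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nfs-demo-writer   # illustrative name
spec:
  containers:
    - name: writer
      image: busybox
      # Write a file onto the NFS share, then stay alive for inspection.
      command: ["sh", "-c", "echo hello > /mnt/nfs/hello.txt && sleep 3600"]
      volumeMounts:
        - name: nfs-vol
          mountPath: /mnt/nfs
  volumes:
    - name: nfs-vol
      persistentVolumeClaim:
        claimName: rook-nfs-pv-claim
```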
Run a test scenario:
> kubectl create -f busybox-rc.yaml
> kubectl create -f web-rc.yaml
> kubectl get pod -l app=nfs-demo
> kubectl create -f web-service.yaml
> echo; kubectl exec $(kubectl get pod -l app=nfs-demo,role=busybox -o jsonpath='{.items[0].metadata.name}') -- wget -qO- http://$(kubectl get services nfs-web -o jsonpath='{.spec.clusterIP}'); echo
Thu Oct 22 19:28:55 UTC 2015
nfs-busybox-w3s4t
If you find that the NFS server is not running, inspect the operator logs to locate the problem:
kubectl -n rook-nfs-system logs -l app=rook-nfs-operator
Summary #
I have followed the Rook project since its early days. Its positioning remains accurate, and it genuinely addresses the pain of installing and configuring Ceph. Building on that Ceph experience, it has begun to add more storage drivers, such as the NFS driver. Using Rook is not complicated, but its documentation is poor: few people in the community maintain it, many descriptions are out of date, and it is easy to get the configuration wrong. So before installing, read through the YAML manifests carefully and understand what each one does; you will get much better results. This imperfection is also an opportunity for open-source enthusiasts, who can contribute to Rook by improving the documentation. After sorting out the steps for the latest version, you can deploy a distributed storage environment in minutes. Rook is genuinely efficient, and it is worth recommending and practicing extensively.
References #
- <https://draveness.me/papers-ceph/>
- <https://rook.io/docs/rook/v1.4/nfs.html>