How to Cleanup Ceph Filesystem for Deleted Kubernetes Persistent Volume

Posted on 04 Nov 2023, tagged Kubernetes, Ceph, Distributed file system

Ceph is a distributed file system. Rook is a project that deploys it on Kubernetes. I recently replaced GlusterFS in my Kubernetes cluster with Ceph. I will write a blog post (or a series of posts) about the migration. In this article, though, I will only cover one problem I ran into, so that I don't forget it.

Once Rook is deployed in Kubernetes, you can create a Ceph Filesystem and use it to back persistent volumes (PVs). Each PV’s data is stored in a folder in the filesystem. If the PV’s reclaim policy is set to Retain, the data is not deleted after the persistent volume is manually deleted. It’s safer this way. But what can you do if you want to clean up the data? Normally you should change the PV’s reclaim policy before you delete the PV, and Rook’s operator will then automatically reclaim the storage in Ceph. But what if you forgot or didn’t know that (like me), and want to clean up the data afterwards?
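
For reference, the normal path mentioned above (changing the reclaim policy before deleting the PV) could look like the minimal sketch below. The helper names reclaim_patch and set_pv_reclaim_policy are hypothetical and only for illustration; the patched field is persistentVolumeReclaimPolicy in the PV spec.

```shell
#!/bin/sh
# Build the JSON patch that sets a PV's reclaim policy.
# $1: the policy to set, for example Delete or Retain.
reclaim_patch() {
    printf '{"spec":{"persistentVolumeReclaimPolicy":"%s"}}' "$1"
}

# Apply the patch to one PV (hypothetical helper, needs cluster access).
# $1: PV name, $2: policy.
set_pv_reclaim_policy() {
    kubectl patch pv "$1" -p "$(reclaim_patch "$2")"
}
```

With these defined, `set_pv_reclaim_policy <pv-name> Delete` would switch a PV from Retain to Delete.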

First, we need to find the folder/subvolume names in Ceph that store each PV’s data. For an existing PV, we can get that by running kubectl describe pv <pv-name> and looking for the field subvolumeName. But since the PV has already been deleted, we instead list the mappings for the PVs that still exist and compare them with the folders/subvolumes in Ceph. This command shows all of the existing ones:

kubectl get pv -o yaml | grep subvolumeName  | sort
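
The grep output above still carries the YAML key and indentation. A small filter can reduce it to bare names, which are easier to compare later. This is a minimal sketch that assumes every matching line has the form `subvolumeName: <name>`; the function name k8s_subvolume_names is hypothetical.

```shell
#!/bin/sh
# Read `kubectl get pv -o yaml` output on stdin and print one bare
# subvolume name per line, sorted and de-duplicated.
# Assumes lines of the form "  subvolumeName: <name>".
k8s_subvolume_names() {
    sed -n 's/^.*subvolumeName:[[:space:]]*//p' | sort -u
}
```

For example, `kubectl get pv -o yaml | k8s_subvolume_names > k8s-subvolumes.txt` would save the names to a file (the file name is just an example).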

Then we need to find all the existing folders/subvolumes in Ceph’s filesystem. Start a Ceph toolbox pod following the Rook documentation, then go into the pod and find the filesystem’s name first:

ceph fs ls

After getting the filesystem’s name, list all the subvolumes in its csi subvolume group:

ceph fs subvolume ls <fs-name> csi | grep 'name' | sort
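
The Ceph side can be reduced to bare names in the same way, and the two lists can then be diffed mechanically instead of by eye. The sketch below assumes the ceph output is pretty-printed JSON with one `"name": "<subvolume>"` entry per line, and that the Kubernetes names have already been saved one per line in a file. Both function names are hypothetical.

```shell
#!/bin/sh
# Read `ceph fs subvolume ls <fs-name> csi` output on stdin and print one
# bare subvolume name per line, sorted and de-duplicated.
# Assumes each entry sits on its own line as: "name": "<subvolume>"
ceph_subvolume_names() {
    sed -n 's/.*"name":[[:space:]]*"\([^"]*\)".*/\1/p' | sort -u
}

# Print the names that are in the Ceph list but not in the Kubernetes list.
# $1: file with names still referenced by Kubernetes PVs, one per line.
# $2: file with names that exist in Ceph, one per line.
orphaned_subvolumes() {
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns from a file.
    # grep exits with 1 when nothing is printed; treat that as success.
    grep -Fxv -f "$1" "$2" || [ "$?" -eq 1 ]
}
```

Every name printed by orphaned_subvolumes is only a candidate. It is still worth inspecting each one before deleting anything.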

Compare this list with the one above. You should be able to find a subvolume that exists in Ceph but does not appear in Kubernetes’ PV mapping. Use this command to check its info:

ceph fs subvolume info <fs-name> <subvolume-name> csi

If you are sure this is the folder you want to delete, use this command to delete it:

ceph fs subvolume rm <fs-name> <subvolume-name> csi