Overview
The Ceph MDS standby-replay daemon has a memory leak, since fixed, that causes its memory consumption to grow gradually over time [1].
The fix is included in Ceph versions 17.2.8 [2] and 18.2.4 [3].
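As a rough illustration, a version string reported by a daemon can be compared against the two fixed releases named above. This is only a sketch: the helper names `parse_version` and `has_fix` are hypothetical, and the check deliberately returns `None` for any release series other than 17.x and 18.x, because this page only names fixes for those two.

```python
import re

# Fixed releases named on this page, keyed by major release series.
FIXED_IN = {17: (17, 2, 8), 18: (18, 2, 4)}


def parse_version(text):
    """Extract the first 'X.Y.Z' triple from a string such as 'ceph version 17.2.7 (...)'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no X.Y.Z version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(text):
    """Return True/False for 17.x and 18.x, or None when the series is not covered here."""
    version = parse_version(text)
    fixed = FIXED_IN.get(version[0])
    if fixed is None:
        return None  # series not mentioned on this page; consult the tracker issue
    return version >= fixed
```

A `None` result means the page gives no information for that series, not that the release is safe.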
Workarounds
- Memory-Based Restart:
  - Monitor MDS memory usage
  - Restart the MDS when it reaches a specific memory threshold
  - Can be automated using earlyoom
- Disable Standby-Replay:
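The two workarounds can be sketched as follows. This is a minimal sketch under stated assumptions, not a tested operational script: it assumes a Linux host where the MDS process exposes `VmRSS` in `/proc/<pid>/status`, the function names are hypothetical, and the threshold is whatever value suits the cluster. How the MDS is actually restarted depends on the deployment and is left out. The second workaround uses the `allow_standby_replay` file system setting; the function only builds the command so it can be reviewed before being run.

```python
def rss_kib(status_text):
    """Return the resident set size in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:    123456 kB"; the second field is the value.
            return int(line.split()[1])
    raise ValueError("VmRSS not found in status text")


def should_restart(status_text, threshold_kib):
    """Decide whether the MDS has reached the chosen memory threshold."""
    return rss_kib(status_text) >= threshold_kib


def disable_standby_replay_cmd(fs_name):
    """Build the argv for turning off standby-replay on one file system."""
    return ["ceph", "fs", "set", fs_name, "allow_standby_replay", "false"]
```

In practice `status_text` would be read from `/proc/<pid>/status` for the MDS process on a timer, and the command list could be passed to `subprocess.run` once the file system name has been confirmed.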
Bug Reference
This issue is tracked in Ceph's issue tracker [1]. The bug affects multiple versions of Ceph. Clusters running an affected version need either one of the workarounds above or an upgrade to a version that contains the fix.