Actions
Bug #64729
openmon.a (mon.0) 1281 : cluster 3 [WRN] MDS_SLOW_METADATA_IO: 3 MDSs report slow metadata IOs" in cluster log
Status:
Triaged
Priority:
Normal
Assignee:
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
Backport:
quincy,reef,squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
/a/vshankar-2024-03-04_08:26:39-fs-wip-vshankar-testing-20240304.042522-testing-default-smithi/7580913
Description: fs/workload/{0-centos_9.stream begin/{0-install 1-cephadm 2-logrotate} clusters/1a11s-mds-1c-client-3node conf/{client mds mon osd} mount/kclient/{base/{mount-syntax/{v1} mount overrides/{distro/stock/{centos_9.stream k-stock} ms-die-on-skipped}} ms_mode/crc wsync/no} objectstore-ec/bluestore-ec-root omap_limit/10000 overrides/{cephsqlite-timeout frag ignorelist_health ignorelist_wrongly_marked_down osd-asserts session_timeout} ranks/multi/{balancer/random export-check n/5 replication/default} standby-replay tasks/{0-subvolume/{with-quota} 1-check-counter 2-scrub/no 3-snaps/no 4-flush/yes 5-workunit/suites/ffsb}}
I haven't debug deeper, but the warnings are coming from both active (multimds) and standby-replay MDS daemons. The warnings were eventually cleared though, but I think this needs RCA as to why the slow ops were observed. Another thing worth mentioning here is that osd.9 too reported slow ops.
Updated by Venky Shankar about 2 months ago
Discussed a bit in standby - this could be a fallout from the recently introduced mdlog trim decay counter.
Updated by Venky Shankar about 2 months ago
- Related to Feature #61908: mds: provide configuration for trim rate of the journal added
Actions