Bug #64729: mon.a (mon.0) 1281 : cluster 3 [WRN] MDS_SLOW_METADATA_IO: 3 MDSs report slow metadata IOs" in cluster log - CephFS - Ceph

open

Added by Venky Shankar 2 months ago. Updated about 2 months ago.

Status:

Triaged

Priority:

Normal

Assignee:

Patrick Donnelly

Category:

Correctness/Safety

Target version:

Ceph - v20.0.0

% Done:

Source:

Tags:

Backport:

quincy,reef,squid

Regression:

Severity:

3 - minor

Reviewed:

Affected Versions:

ceph-qa-suite:

Component(FS):

MDS

Labels (FS):

multimds

Pull request ID:

Crash signature (v1):

Crash signature (v2):

Description

/a/vshankar-2024-03-04_08:26:39-fs-wip-vshankar-testing-20240304.042522-testing-default-smithi/7580913

Description: fs/workload/{0-centos_9.stream begin/{0-install 1-cephadm 2-logrotate} clusters/1a11s-mds-1c-client-3node conf/{client mds mon osd} mount/kclient/{base/{mount-syntax/{v1} mount overrides/{distro/stock/{centos_9.stream k-stock} ms-die-on-skipped}} ms_mode/crc wsync/no} objectstore-ec/bluestore-ec-root omap_limit/10000 overrides/{cephsqlite-timeout frag ignorelist_health ignorelist_wrongly_marked_down osd-asserts session_timeout} ranks/multi/{balancer/random export-check n/5 replication/default} standby-replay tasks/{0-subvolume/{with-quota} 1-check-counter 2-scrub/no 3-snaps/no 4-flush/yes 5-workunit/suites/ffsb}}

I haven't debugged this further, but the warnings come from both the active (multimds) and standby-replay MDS daemons. The warnings eventually cleared, but I think this needs an RCA to determine why the slow ops were observed. It is also worth noting that osd.9 reported slow ops as well.
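As a starting point for the RCA, a rough sketch like the following could tally which health warning codes show up in the cluster log. The line layout is an assumption taken only from the single log line quoted in this report's title; `parse_line` and `count_warning_codes` are hypothetical helper names, not part of Ceph or teuthology.

```python
import re
from collections import Counter

# Layout assumed from the one line quoted in this report:
#   mon.a (mon.0) 1281 : cluster 3 [WRN] MDS_SLOW_METADATA_IO: 3 MDSs report ...
LINE_RE = re.compile(
    r"(?P<name>\S+) \((?P<rank>\S+)\) (?P<seq>\d+) : cluster \S+ "
    r"\[(?P<level>[A-Z]+)\] (?P<message>.*)"
)
# A health code is assumed to be an upper-case token followed by ": ".
HEALTH_RE = re.compile(r"(?P<code>[A-Z][A-Z0-9_]+): (?P<summary>.*)")


def parse_line(line):
    """Return a dict of fields for a cluster log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    entry = m.groupdict()
    entry["seq"] = int(entry["seq"])
    h = HEALTH_RE.match(entry["message"])
    entry["code"] = h.group("code") if h else None
    return entry


def count_warning_codes(lines):
    """Count [WRN] entries per health code across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry and entry["level"] == "WRN" and entry["code"]:
            counts[entry["code"]] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "mon.a (mon.0) 1281 : cluster 3 [WRN] "
        "MDS_SLOW_METADATA_IO: 3 MDSs report slow metadata IOs"
    )
    print(parse_line(sample))
    print(count_warning_codes([sample]))
```

Feeding the lines of the run's cluster log through `count_warning_codes` would show whether MDS_SLOW_METADATA_IO appears alongside other warning codes, which may help correlate the MDS warnings with the slow ops on osd.9.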

Related issues 1 (0 open — 1 closed)