Actions
Bug #22428
closedmds: don't report slow request for blocked filelock request
Status:
Resolved
Priority:
High
Assignee:
Category:
Administration/Usability
Target version:
% Done:
0%
Source:
other
Tags:
Backport:
luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS, MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Updated by Patrick Donnelly over 6 years ago
- Assignee set to Zheng Yan
Perhaps reclassify the slow requests blocked by locks as "clients not releasing file locks" or similar to be different from "slow requestS".
Updated by John Spray over 6 years ago
Here's a recent example from someone in #ceph:
2018-01-16 17:47:05.910681 mon.0 [INF] mon.5 192.168.193.73:6789/0 2018-01-16 17:49:14.095764 mon.0 [WRN] MDS health message (mds.1): 1 slow requests are blocked > 30 sec 2018-01-16 17:49:14.095812 mon.0 [INF] daemon mds.manager02 is now active in filesystem cephfs as rank 1 2018-01-16 17:49:14.276275 mon.0 [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST) 2018-01-16 17:49:10.248891 mds.1 [WRN] 1 slow requests, 1 included below; oldest blocked for > 31.623102 secs 2018-01-16 17:49:10.248906 mds.1 [WRN] slow request 31.623102 seconds old, received at 2018-01-16 17:48:38.625681: client_request(client.434194:1157564 setfilelockrule 2, type 2, owner 10101539253148314382, pid 14917, start 0, length 0, wait 1 #0x20000001809 2018-01-16 17:48:38.513624 caller_uid=101, caller_gid=102{}) currently acquired locks 2018-01-16 17:49:40.249814 mds.1 [WRN] 1 slow requests, 1 included below; oldest blocked for > 61.624025 secs 2018-01-16 17:49:40.249825 mds.1 [WRN] slow request 61.624025 seconds old, received at 2018-01-16 17:48:38.625681: client_request(client.434194:1157564 setfilelockrule 2, type 2, owner 10101539253148314382, pid 14917, start 0, length 0, wait 1 #0x20000001809 2018-01-16 17:48:38.513624 caller_uid=101, caller_gid=102{}) currently acquired locks 2018-01-16 17:49:58.098323 mon.0 [INF] MDS health message cleared (mds.1): 1 slow requests are blocked > 30 sec 2018-01-16 17:49:58.098366 mon.0 [INF] daemon mds.manager02 is now active in filesystem cephfs as rank 1 2018-01-16 17:49:58.336035 mon.0 [INF] Health check cleared: MDS_SLOW_REQUEST (was: 1 MDSs report slow requests) 2018-01-16 17:49:58.336158 mon.0 [INF] Cluster is now healthy
Updated by Dan van der Ster about 6 years ago
In case you need more examples, we're seeing this recently on 12.2.4:
2018-04-25 10:23:29.964959 mds.cephdwightmds1 [WRN] 1 slow requests, 1 included below; oldest blocked for > 30.144366 secs 2018-04-25 10:23:29.964964 mds.cephdwightmds1 [WRN] slow request 30.144366 seconds old, received at 2018-04-25 10:22:59.820554: client_request(client.575739618:75133749 setfilelockrule 1, type 2, owner 17575622701090626199, pid 27476, start 0, length 0, wait 1 #0x2000309a73f 2018-04-25 10:22:59.819251 caller_uid=0, caller_gid=0{}) currently acquired locks 2018-04-25 10:23:30.965128 mds.cephdwightmds1 [WRN] 3 slow requests, 2 included below; oldest blocked for > 31.144517 secs 2018-04-25 10:23:30.965133 mds.cephdwightmds1 [WRN] slow request 30.891629 seconds old, received at 2018-04-25 10:23:00.073442: client_request(client.575739618:75134492 setfilelockrule 1, type 2, owner 11650441161165249951, pid 9990, start 0, length 0, wait 1 #0x2000309a73f 2018-04-25 10:23:00.072054 caller_uid=0, caller_gid=0{}) currently acquired locks 2018-04-25 10:23:30.965136 mds.cephdwightmds1 [WRN] slow request 30.829593 seconds old, received at 2018-04-25 10:23:00.135478: client_request(client.575739618:75134657 setfilelockrule 1, type 2, owner 11876786896114809271, pid 39172, start 0, length 0, wait 1 #0x2000309a73f 2018-04-25 10:23:00.134047 caller_uid=0, caller_gid=0{}) currently acquired locks 2018-04-25 10:23:33.396905 mon.cephdwight-mon-1633994557 [WRN] MDS health message (mds.1): 3 slow requests are blocked > 30 sec 2018-04-25 10:23:33.752354 mon.cephdwight-mon-1633994557 [WRN] Health check failed: 1 MDSs report slow requests (MDS_SLOW_REQUEST) 2018-04-25 10:23:37.396863 mon.cephdwight-mon-1633994557 [INF] MDS health message cleared (mds.1): 3 slow requests are blocked > 30 sec 2018-04-25 10:23:37.805466 mon.cephdwight-mon-1633994557 [INF] Health check cleared: MDS_SLOW_REQUEST (was: 1 MDSs report slow requests) 2018-04-25 10:23:37.805522 mon.cephdwight-mon-1633994557 [INF] Cluster is now healthy
Updated by Patrick Donnelly about 6 years ago
- Category set to Administration/Usability
- Priority changed from Normal to High
- Target version set to v13.0.0
- Source set to other
- Backport set to luminous
- Component(FS) MDS, MDSMonitor added
Updated by Zheng Yan about 6 years ago
- Status changed from New to Fix Under Review
Updated by Patrick Donnelly about 6 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler about 6 years ago
- Copied to Backport #23989: luminous: mds: don't report slow request for blocked filelock request added
Updated by Nathan Cutler almost 6 years ago
- Status changed from Pending Backport to Resolved
Actions