Bug #50821
openqa: untar_snap_rm failure during mds thrashing
Description
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33/arch/microblaze: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33/arch: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: linux-2.6.33: Cannot stat: Permission denied
2021-05-14T22:51:46.078 INFO:tasks.workunit.client.0.smithi094.stderr:tar: Error is not recoverable: exiting now
2021-05-14T22:51:46.079 DEBUG:teuthology.orchestra.run:got remote process result: 2
2021-05-14T22:51:46.080 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2021-05-14T22:51:46.080 DEBUG:teuthology.orchestra.run.smithi094:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2021-05-14T22:51:46.264 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/run_tasks.py", line 91, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/run_tasks.py", line 70, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 147, in task
    cleanup=cleanup)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 297, in _spawn_on_all_clients
    timeout=timeout)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 84, in __exit__
    for result in self:
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 98, in __next__
    resurrect_traceback(result)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 30, in resurrect_traceback
    raise exc.exc_info[1]
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/parallel.py", line 23, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/github.com_batrick_ceph_e78e41c7f45263bfc3d22dafa953b7e485aac84d/qa/tasks/workunit.py", line 425, in _run_tests
    label="workunit test {workunit}".format(workunit=workunit)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/remote.py", line 509, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 455, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 161, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_19220a3bd6e252c6e8260827019668a766d85490/teuthology/orchestra/run.py", line 183, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed (workunit test fs/snaps/untar_snap_rm.sh) on smithi094 with status 2: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=e78e41c7f45263bfc3d22dafa953b7e485aac84d TESTDIR="/home/ubuntu/cephtest" CEPH_ARGS="--cluster ceph" CEPH_ID="0" PATH=$PATH:/usr/sbin CEPH_BASE=/home/ubuntu/cephtest/clone.client.0 CEPH_ROOT=/home/ubuntu/cephtest/clone.client.0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/clone.client.0/qa/workunits/fs/snaps/untar_snap_rm.sh'
From: /ceph/teuthology-archive/pdonnell-2021-05-14_21:45:42-fs-master-distro-basic-smithi/6115751/teuthology.log
This was with the stock RHEL kernel. It might be related to some other issues I've suddenly been seeing with the stock RHEL kernel.
Updated by Patrick Donnelly almost 3 years ago
I don't think this is related to #50281, but it may be.
Updated by Patrick Donnelly almost 3 years ago
- Related to Bug #50823: qa: RuntimeError: timeout waiting for cluster to stabilize added
Updated by Patrick Donnelly almost 3 years ago
- Related to Bug #50824: qa: snaptest-git-ceph bus error added
Updated by Patrick Donnelly almost 3 years ago
- Related to Bug #51278: mds: "FAILED ceph_assert(!segments.empty())" added
Updated by Venky Shankar about 2 years ago
Similar failure here: https://pulpito.ceph.com/vshankar-2022-04-11_12:24:06-fs-wip-vshankar-testing1-20220411-144044-testing-default-smithi/6786336/
although in this instance, we see ESTALE/EIO.
2022-04-11T15:56:23.599 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.590+0000 7f3cba9ff700 1 -- 172.21.15.141:0/3624046670 --> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] -- command(tid 11: {"prefix": "get_command_descriptions"}) v1 -- 0x7f3c90018dc0 con 0x7f3c90011730
2022-04-11T15:56:23.599 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.590+0000 7f3cb37fe700 1 --2- 172.21.15.141:0/3624046670 >> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] conn(0x7f3c90011730 0x7f3c90011b60 unknown :-1 s=BANNER_CONNECTING pgs=0 cs=0 l=1 rev1=0 crypto rx=0 tx=0 comp rx=0 tx=0)._handle_peer_banner_payload supported=3 required=0
2022-04-11T15:56:23.628 INFO:tasks.ceph.osd.7.smithi153.stderr:2022-04-11T15:56:23.619+0000 7f22a0340700 -1 received signal: Hangup from /usr/bin/python3 /bin/daemon-helper kill ceph-osd -f --cluster ceph -i 7 (PID: 27672) UID: 0
2022-04-11T15:56:23.644 INFO:tasks.workunit.client.0.smithi141.stdout:'.snap/k' -> './k'
2022-04-11T15:56:23.644 INFO:tasks.workunit.client.0.smithi141.stdout:'.snap/k/linux-2.6.33.tar.bz2' -> './k/linux-2.6.33.tar.bz2'
2022-04-11T15:56:23.645 INFO:tasks.workunit.client.0.smithi141.stderr:cp: error writing './k/linux-2.6.33.tar.bz2': Stale file handle
2022-04-11T15:56:23.645 INFO:teuthology.orchestra.run.smithi141.stderr:umount: /home/ubuntu/cephtest/mnt.0: target is busy.
2022-04-11T15:56:23.646 INFO:tasks.workunit.client.0.smithi141.stderr:cp: cannot stat '.snap/k/linux-2.6.33': Input/output error
2022-04-11T15:56:23.646 INFO:tasks.workunit.client.0.smithi141.stderr:cp: preserving times for './k': Input/output error
2022-04-11T15:56:23.647 INFO:teuthology.orchestra.run.smithi141.stderr:2022-04-11T15:56:23.639+0000 7f3cb37fe700 1 --2- 172.21.15.141:0/3624046670 >> [v2:172.21.15.153:6808/205989,v1:172.21.15.153:6809/205989] conn(0x7f3c90011730 0x7f3c90011b60 crc :-1 s=READY pgs=222 cs=0 l=1 rev1=1 crypto rx=0 tx=0 comp rx=0 tx=0).ready entity=osd.5 client_cookie=0 server_cookie=0 in_seq=0 out_seq=0
2022-04-11T15:56:23.647 DEBUG:teuthology.orchestra.run:got remote process result: 1
2022-04-11T15:56:23.648 INFO:tasks.workunit:Stopping ['fs/snaps'] on client.0...
2022-04-11T15:56:23.648 DEBUG:teuthology.orchestra.run.smithi141:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2022-04-11T15:56:23.658 DEBUG:teuthology.orchestra.run:got remote process result: 32
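As a quick aside for triage (not part of the test logs): the two error strings in the excerpt map to the Linux errno values ESTALE and EIO. ESTALE means the file handle no longer refers to a live inode, while EIO is the generic error the client falls back to once the mount/session has been torn down. A small Python check of the values:

```python
import errno
import os

# The two errno values behind the failures above (Linux values):
#   EIO    = 5    -> "cp: cannot stat ...: Input/output error"
#   ESTALE = 116  -> "cp: error writing ...: Stale file handle"
for code in (errno.EIO, errno.ESTALE):
    # errno.errorcode maps the numeric value back to its symbolic name.
    print(errno.errorcode[code], code, os.strerror(code))
```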
Updated by Venky Shankar 9 months ago
This popped up again with centos 9.stream, but I don't think it has anything to do with the distro. ref: /a/yuriw-2023-07-26_14:28:57-fs-wip-yuri-testing-2023-07-25-0833-reef-distro-default-smithi/7353025
The failures are the usual -EIO errno:
2023-07-26T21:51:59.533 INFO:tasks.workunit.client.0.smithi043.stdout:'.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_dtmf.c' -> './k/linux-2.6.33/drivers/isdn/mISDN/dsp_dtmf.c'
2023-07-26T21:51:59.534 INFO:tasks.workunit.client.0.smithi043.stdout:'.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_ecdis.h' -> './k/linux-2.6.33/drivers/isdn/mISDN/dsp_ecdis.h'
2023-07-26T21:51:59.534 INFO:tasks.workunit.client.0.smithi043.stdout:'.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_hwec.c' -> './k/linux-2.6.33/drivers/isdn/mISDN/dsp_hwec.c'
2023-07-26T21:51:59.534 DEBUG:teuthology.orchestra.run:got remote process result: 1
2023-07-26T21:51:59.535 INFO:tasks.workunit.client.0.smithi043.stderr:cp: cannot stat '.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_hwec.h': Input/output error
2023-07-26T21:51:59.535 INFO:tasks.workunit.client.0.smithi043.stderr:cp: cannot stat '.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_pipeline.c': Input/output error
2023-07-26T21:51:59.535 INFO:tasks.workunit.client.0.smithi043.stderr:cp: cannot stat '.snap/k/linux-2.6.33/drivers/isdn/mISDN/dsp_tones.c': Input/output error
2023-07-26T21:51:59.535 INFO:tasks.workunit.client.0.smithi043.stderr:cp: cannot stat '.snap/k/linux-2.6.33/drivers/isdn/mISDN/fsm.c': Input/output error
2023-07-26T21:51:59.535 INFO:tasks.workunit.client.0.smithi043.stderr:cp: cannot stat '.snap/k/linux-2.6.33/drivers/isdn/mISDN/fsm.h': Input/output error
No MDS core dumps and/or anything in the kernel ring buffer.
Updated by Venky Shankar about 2 months ago
- Category set to Correctness/Safety
- Assignee set to Xiubo Li
- Target version set to v20.0.0
This is the latest instance - https://pulpito.ceph.com/pdonnell-2024-03-20_18:16:52-fs-wip-batrick-testing-20240320.145742-distro-default-smithi/7612983/
Nothing in the kernel ring buffer.
Updated by Venky Shankar about 2 months ago
- Related to Bug #64707: suites/fsstress.sh hangs on one client - test times out added
Updated by Patrick Donnelly 13 days ago
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381041
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381043
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381045
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381047
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381049
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381051
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381053
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381055
Apr 21 02:55:25 smithi043 kernel: ceph: dropping unsafe request 381057
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000b5c4.fffffffffffffffe is shut down
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000bc24.fffffffffffffffe is shut down
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000bd57.fffffffffffffffe is shut down
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000c4dd.fffffffffffffffe is shut down
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000c4e7.fffffffffffffffe is shut down
Apr 21 02:55:25 smithi043 kernel: ceph: ceph_do_invalidate_pages: inode 1000000c4e6.fffffffffffffffe is shut down
Apr 21 02:55:26 smithi043 sudo[71057]: ubuntu : PWD=/home/ubuntu ; USER=root ; COMMAND=/bin/rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
Apr 21 02:55:26 smithi043 sudo[71057]: pam_unix(sudo:session): session opened for user root(uid=0) by ubuntu(uid=1000)
Apr 21 02:55:26 smithi043 sudo[71089]: ubuntu : PWD=/home/ubuntu ; USER=root ; ENV=PATH=/usr/sbin:/home/ubuntu/.local/bin:/home/ubuntu/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin ; COMMAND=/bin/lsof
Apr 21 02:55:26 smithi043 sudo[71089]: pam_unix(sudo:session): session opened for user root(uid=0) by ubuntu(uid=1000)
Apr 21 02:55:26 smithi043 sudo[71057]: pam_unix(sudo:session): session closed for user root
Apr 21 02:55:27 smithi043 kernel: libceph: mds0 (1)172.21.15.73:6839 socket closed (con state V1_BANNER)
From: /teuthology/pdonnell-2024-04-20_23:33:17-fs-wip-pdonnell-testing-20240420.180737-debug-distro-default-smithi/7665863/remote/smithi043/syslog/journalctl-b0.gz
Updated by Xiubo Li 8 days ago
Patrick Donnelly wrote in #note-10:
[...]
From: /teuthology/pdonnell-2024-04-20_23:33:17-fs-wip-pdonnell-testing-20240420.180737-debug-distro-default-smithi/7665863/remote/smithi043/syslog/journalctl-b0.gz
Client client.4607 closed its session at 2024-04-21T02:55:25.991:
2024-04-21T02:55:25.773+0000 7f1135472640 10 mds.2.log trim 2 / 128 segments, 8 / -1 events, 0 (0) expiring, 0 (0) expired
2024-04-21T02:55:25.773+0000 7f1135472640 10 mds.2.log trim: new_expiring_segments=0, num_remaining_segments=2, max_segments=128
2024-04-21T02:55:25.773+0000 7f1135472640 10 mds.2.log trim: breaking out of trim loop - segments/events fell below ceiling max_segments/max_ev
2024-04-21T02:55:25.773+0000 7f1135472640 20 mds.2.log _trim_expired_segments: examining LogSegment(1/0x400000 events=1)
2024-04-21T02:55:25.773+0000 7f1135472640 10 mds.2.log _trim_expired_segments waiting for expiry LogSegment(1/0x400000 events=1)
2024-04-21T02:55:25.991+0000 7f1139c7b640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] <== client.4607 v1:172.21.15.43:0/3153587376 12 ==== client_session(request_close seq 2) ==== 28+0+0 (unknown 95654502 0 0) 0x556cf7d41200 con 0x556cf7f8bb00
2024-04-21T02:55:25.991+0000 7f1139c7b640 20 mds.2.177 get_session have 0x556cf7f5e800 client.4607 v1:172.21.15.43:0/3153587376 state open
2024-04-21T02:55:25.991+0000 7f1139c7b640 3 mds.2.server handle_client_session client_session(request_close seq 2) from client.4607
2024-04-21T02:55:25.991+0000 7f1139c7b640 10 mds.2.server journal_close_session : client.4607 v1:172.21.15.43:0/3153587376 pending_prealloc_inos [] free_prealloc_inos [] delegated_inos []
2024-04-21T02:55:25.991+0000 7f1139c7b640 20 mds.2.sessionmap mark_projected s=0x556cf7f5e800 name=client.4607 pv=6 -> 7
2024-04-21T02:55:25.991+0000 7f1139c7b640 20 mds.2.log _submit_entry ESession client.4607 v1:172.21.15.43:0/3153587376 close cmapv 7
2024-04-21T02:55:25.991+0000 7f113246c640 5 mds.2.log _submit_thread 4195728~121 : ESession client.4607 v1:172.21.15.43:0/3153587376 close cmapv 7
2024-04-21T02:55:25.991+0000 7f113246c640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] --> [v2:172.21.15.73:6800/1594527196,v1:172.21.15.73:6801/1594527196] -- osd_op(unknown.0.177:43 2.11 2:8975f766:::202.00000001:head [write 1424~141 [fadvise_dontneed] in=141b] snapc 0=[] ondisk+write+known_if_redirected+full_force+supports_pool_eio e132) -- 0x556cf7f57800 con 0x556cf7db7680
2024-04-21T02:55:25.991+0000 7f113246c640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] --> [v2:172.21.15.43:6808/2211093047,v1:172.21.15.43:6810/2211093047] -- osd_op(unknown.0.177:44 2.1 2:85bbe569:::202.00000000:head [writefull 0~90 [fadvise_dontneed] in=90b] snapc 0=[] ondisk+write+known_if_redirected+full_force+supports_pool_eio e132) -- 0x556cf80ccc00 con 0x556cf7db7f80
2024-04-21T02:55:25.995+0000 7f113c480640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] <== osd.2 v2:172.21.15.43:6808/2211093047 13 ==== osd_op_reply(44 202.00000000 [writefull 0~90 [fadvise_dontneed]] v132'2411 uv2411 ondisk = 0) ==== 156+0+0 (crc 0 0 0) 0x556cf708ca00 con 0x556cf7db7f80
2024-04-21T02:55:25.995+0000 7f113cc81640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] <== osd.6 v2:172.21.15.73:6800/1594527196 9 ==== osd_op_reply(43 202.00000001 [write 1424~141 [fadvise_dontneed]] v132'1035 uv1035 ondisk = 0) ==== 156+0+0 (crc 0 0 0) 0x556cf708c280 con 0x556cf7db7680
2024-04-21T02:55:25.995+0000 7f113346e640 10 MDSIOContextBase::complete: 20C_MDS_session_finish
2024-04-21T02:55:25.995+0000 7f113346e640 10 MDSContext::complete: 20C_MDS_session_finish
2024-04-21T02:55:25.995+0000 7f113346e640 10 mds.2.server _session_logged client.4607 v1:172.21.15.43:0/3153587376 state_seq 2 close 7 inos_to_free [] inotablev 0 inos_to_purge []
2024-04-21T02:55:25.995+0000 7f113346e640 20 mds.2.sessionmap mark_dirty s=0x556cf7f5e800 name=client.4607 v=6
2024-04-21T02:55:25.995+0000 7f113346e640 10 mds.2.177 send_message_client client.4607 v1:172.21.15.43:0/3153587376 client_session(close)
2024-04-21T02:55:25.995+0000 7f113346e640 1 -- [v2:172.21.15.43:6832/3227946235,v1:172.21.15.43:6834/3227946235] --> v1:172.21.15.43:0/3153587376 -- client_session(close) -- 0x556cf8156000 con 0x556cf7f8bb00
2024-04-21T02:55:25.995+0000 7f113346e640 10 remove_session: mds.metrics: session=0x556cf7f5e800, client=client.4607 v1:172.21.15.43:0/3153587376
Meanwhile, the unmount appears to have been issued just before the files were to be removed:
2024-04-21T02:55:25.942 DEBUG:tasks.cephfs.kernel_mount:Unmounting client client.0...
2024-04-21T02:55:25.942 INFO:teuthology.orchestra.run:Running command with timeout 300
2024-04-21T02:55:25.942 DEBUG:teuthology.orchestra.run.smithi043:> sudo umount /home/ubuntu/cephtest/mnt.0 -f
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/platform.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/irqs.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/param.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/i2c.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/entry-macro.S'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/memory.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/system.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/timex.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/io.h'
2024-04-21T02:55:25.991 INFO:tasks.workunit.client.0.smithi043.stdout:removed 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach/vmalloc.h'
2024-04-21T02:55:25.992 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/include/mach': Input/output error
2024-04-21T02:55:25.992 INFO:teuthology.orchestra.run.smithi043.stderr:umount: /home/ubuntu/cephtest/mnt.0: target is busy.
2024-04-21T02:55:25.994 DEBUG:teuthology.orchestra.run:got remote process result: 1
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/Makefile.boot': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/Makefile': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/gpio.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/serial.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/dma.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/i2c.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/core.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-pnx4008/time.c': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-ebsa110': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/plat-stmp3xxx': Input/output error
2024-04-21T02:55:25.995 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-davinci': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-s3c6400': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-at91': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/boot': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-s3c2410': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-mx25': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/plat-omap': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/plat-orion': Input/output error
2024-04-21T02:55:25.996 INFO:tasks.workunit.client.0.smithi043.stderr:rm: cannot remove 'linux-2.6.33/arch/arm/mach-clps711x': Input/output error
That means some files were still being removed just after the mountpoint had been unmounted.
Updated by Xiubo Li 8 days ago
It seems the mds.b daemon wasn't brought back up within 300s, so the watchdog barked, killed all the daemons, and unmounted all the mountpoints while the test was still running:
2024-04-21T02:55:23.499 INFO:tasks.mds_thrash.fs.[cephfs]:no change
2024-04-21T02:55:24.142 INFO:tasks.daemonwatchdog.daemon_watchdog:daemon ceph.mds.b is failed for ~304s
2024-04-21T02:55:24.142 INFO:tasks.daemonwatchdog.daemon_watchdog:BARK! unmounting mounts and killing all daemons
2024-04-21T02:55:24.142 DEBUG:teuthology.orchestra.run.smithi043:> set -ex
2024-04-21T02:55:24.142 DEBUG:teuthology.orchestra.run.smithi043:> dd if=/proc/self/mounts of=/dev/stdout
2024-04-21T02:55:24.172 DEBUG:teuthology.orchestra.run.smithi043:> set -ex
2024-04-21T02:55:24.173 DEBUG:teuthology.orchestra.run.smithi043:> dd if=/proc/self/mounts of=/dev/stdout
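The watchdog behavior in that snippet (grace period elapses, then "BARK!" and teardown) can be sketched as follows. This is an illustrative model only, not teuthology's actual DaemonWatchdog class; the 300 s grace period and the daemon name are taken from the log above:

```python
class DaemonWatchdog:
    """Illustrative sketch -- not teuthology's real DaemonWatchdog.

    A daemon that stays "failed" longer than the grace period makes the
    watchdog bark, after which the harness unmounts all mounts and kills
    all daemons.
    """

    def __init__(self, grace=300.0):
        self.grace = grace
        self.failed_since = {}  # daemon name -> timestamp it entered "failed"

    def mark_failed(self, name, now):
        # Record only the first time we saw the daemon fail.
        self.failed_since.setdefault(name, now)

    def mark_ok(self, name):
        self.failed_since.pop(name, None)

    def barking(self, now):
        # Daemons failed for longer than the grace period trigger the bark.
        return [n for n, t in self.failed_since.items() if now - t > self.grace]

wd = DaemonWatchdog(grace=300.0)
wd.mark_failed("ceph.mds.b", now=0.0)
print(wd.barking(now=304.0))  # -> ['ceph.mds.b'], i.e. "failed for ~304s ... BARK!"
```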
Updated by Xiubo Li 8 days ago
Okay, it finally turns out that mds.b crashed, which is why it wasn't brought back up:
-7> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x608 [...301,head] ~mds0/stray8/ rep@0.2 fragtree_t(*^3) v146838 f(v4 m2024-04-21T02:49:06.802174+0000 303=189+114) n(v8 rc2024-04-21T02:49:06.802174+0000 115=0+115) old_inodes=5 (inest mix r) (ifile mix) 0x55cb01dc0c00]
-6> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x604 [...301,head] ~mds0/stray4/ rep@0.2 v147574 f(v4 m2024-04-21T02:49:25.648924+0000 225=153+72) n(v9 rc2024-04-21T02:49:25.648924+0000 73=0+73) old_inodes=11 (inest mix r) (ifile mix) 0x55cb01f7e580]
-5> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x10000000000 [...301,head] /client.0/ rep@0.2 v3184 f(v0 m2024-04-21T02:40:21.081153+0000 1=0+1) n(v276 rc2024-04-21T02:49:28.824882+0000 b216805083 rs1 18657=17439+1218) old_inodes=226 (inest mix r) | dirfrag=1 0x55cb01eac580]
-4> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x1000000a2ed [...301,head] /client.0/tmp/ rep@0.2 v6175 snaprealm=0x55cb01e69440 f(v0 m2024-04-21T02:42:45.574234+0000 2=1+1) n(v35 rc2024-04-21T02:49:28.824882+0000 b216805083 rs1 18656=17439+1217) old_inodes=2 (inest mix r) 0x55cb01eac000]
-3> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x1 [...301,head] / rep@0.2 v1162 snaprealm=0x55cb01e686c0 f(v0 m2024-04-21T01:57:20.110195+0000 1=0+1) n(v119 rc2024-04-21T02:49:28.467887+0000 b217004861 rs1 18688=17466+1222)/n(v0 rc2024-04-21T01:56:41.512966+0000 1=0+1) old_inodes=168 (inest mix r) | dirfrag=1 discoverbase=0 0x55cb01e70c00]
-2> 2024-04-21T02:50:17.623+0000 7f32c63a7640 10 mds.4.cache got inode locks [inode 0x100 [...301,head] ~mds0/ rep@0.2 v1051 snaprealm=0x55cb01e68240 f(v0 10=0+10) n(v193 rc2024-04-21T02:49:30.705857+0000 b4699681 1138=339+799)/n(v0 rc2024-04-21T01:56:41.514395+0000 11=0+11) old_inodes=136 (inest mix r) | dirfrag=1 discoverbase=0 0x55cb01dc0680]
-1> 2024-04-21T02:50:17.624+0000 7f32c63a7640 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-3201-g0b60fd01/rpm/el9/BUILD/ceph-19.0.0-3201-g0b60fd01/src/mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_ack(ceph::cref_t<MMDSCacheRejoin>&)' thread 7f32c63a7640 time 2024-04-21T02:50:17.624573+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-3201-g0b60fd01/rpm/el9/BUILD/ceph-19.0.0-3201-g0b60fd01/src/mds/MDCache.cc: 5158: FAILED ceph_assert(isolated_inodes.empty())
ceph version 19.0.0-3201-g0b60fd01 (0b60fd01511511bc020e1a45638ede6ead9e38ec) squid (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x127) [0x7f32cc04aa17]
2: /usr/lib64/ceph/libceph-common.so.2(+0x24ac24) [0x7f32cc04ac24]
3: (MDCache::handle_cache_rejoin_ack(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x232f) [0x55cafbf9e7a3]
4: (MDCache::handle_cache_rejoin(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x395) [0x55cafbf95f75]
5: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0xec) [0x55cafbfbd400]
6: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x11d) [0x55cafbe09b8b]
7: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x28f) [0x55cafbe0e975]
8: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x94) [0x55cafbe0f29c]
9: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x2c4) [0x55cafbdf14aa]
10: /usr/lib64/ceph/libceph-common.so.2(+0x394187) [0x7f32cc194187]
11: (DispatchQueue::entry()+0x837) [0x7f32cc194cf1]
12: /usr/lib64/ceph/libceph-common.so.2(+0x474271) [0x7f32cc274271]
13: (Thread::entry_wrapper()+0x43) [0x7f32cc0247f5]
14: (Thread::_entry_func(void*)+0xd) [0x7f32cc024811]
15: /lib64/libc.so.6(+0x89c02) [0x7f32cb689c02]
16: /lib64/libc.so.6(+0x10ec40) [0x7f32cb70ec40]
0> 2024-04-21T02:50:17.625+0000 7f32c63a7640 -1 *** Caught signal (Aborted) ** in thread 7f32c63a7640 thread_name:ms_dispatch
ceph version 19.0.0-3201-g0b60fd01 (0b60fd01511511bc020e1a45638ede6ead9e38ec) squid (dev)
1: /lib64/libc.so.6(+0x3e6f0) [0x7f32cb63e6f0]
2: /lib64/libc.so.6(+0x8b94c) [0x7f32cb68b94c]
3: raise()
4: abort()
5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x262) [0x7f32cc04ab52]
6: /usr/lib64/ceph/libceph-common.so.2(+0x24ac24) [0x7f32cc04ac24]
7: (MDCache::handle_cache_rejoin_ack(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x232f) [0x55cafbf9e7a3]
8: (MDCache::handle_cache_rejoin(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x395) [0x55cafbf95f75]
9: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0xec) [0x55cafbfbd400]
10: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x11d) [0x55cafbe09b8b]
11: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x28f) [0x55cafbe0e975]
12: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x94) [0x55cafbe0f29c]
Updated by Xiubo Li 8 days ago
This is the same issue as https://tracker.ceph.com/issues/62036, which has already been fixed, yet it has hit again. Some other case, not the `subtree` one, must be able to trigger it.
I need to dig further to see what happened.
Updated by Xiubo Li 7 days ago
Xiubo Li wrote in #note-14:
This is the same issue as https://tracker.ceph.com/issues/62036, which has already been fixed, yet it has hit again. Some other case, not the `subtree` one, must be able to trigger it.
I need to dig further to see what happened.
This time, the rejoin had already completed successfully just before the crash:
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache handle_cache_rejoin cache_rejoin ack from mds.0 (130 bytes)
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache handle_cache_rejoin_ack from mds.0
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache exporting caps for client.4607 ino 0x10000000000
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.171 send_message_client_counted client.4607 seq 1 client_caps(export ino 0x10000000000 1 seq 0 caps=- dirty=- wanted=- follows 0 size 0/0 mtime 0.000000 ctime 0.000000 change_attr 0)
2024-04-21T02:41:46.654+0000 7f32c63a7640 1 -- [v2:172.21.15.73:6835/2782686517,v1:172.21.15.73:6837/2782686517] --> v1:172.21.15.43:0/3153587376 -- client_caps(export ino 0x10000000000 1 seq 0 caps=- dirty=- wanted=- follows 0 size 0/0 mtime 0.000000 ctime 0.000000 change_attr 0) -- 0x55cb01ddb500 con 0x55cb01e7ed80
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache open_snaprealms
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache send_snaps
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache.snaprealm(0x3 seq 1 0x55cb01e68480) build_snap_set on snaprealm(0x3 seq 1 lc 0 cr 0 cps 1 snaps={} last_modified 0.000000 change_attr 0 0x55cb01e68480)
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache.snaprealm(0x3 seq 1 0x55cb01e68480) build_snap_trace my_snaps [2fe]
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache.snaprealm(0x3 seq 1 0x55cb01e68480) check_cache rebuilt 2fe seq 2fe cached_seq 2fe cached_last_created 2fe cached_last_destroyed 2fd)
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache finish_snaprealm_reconnect client.4607 up to date on snaprealm(0x3 seq 1 lc 0 cr 0 cps 1 snaps={} last_modified 0.000000 change_attr 0 0x55cb01e68480)
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache send_snaps
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.171 send_message_client_counted client.4607 seq 2 client_snap(update split=0x3 tracelen=56)
2024-04-21T02:41:46.654+0000 7f32c63a7640 1 -- [v2:172.21.15.73:6835/2782686517,v1:172.21.15.73:6837/2782686517] --> v1:172.21.15.43:0/3153587376 -- client_snap(update split=0x3 tracelen=56) -- 0x55cb01c410e0 con 0x55cb01e7ed80
2024-04-21T02:41:46.654+0000 7f32c63a7640 5 mds.4.cache open_snaprealms has unconnected snaprealm:
2024-04-21T02:41:46.654+0000 7f32c63a7640 5 mds.4.cache 0x1 {client.4607/1}
2024-04-21T02:41:46.654+0000 7f32c63a7640 5 mds.4.cache 0x1000000a2ed {client.4607/2fe}
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache open_snaprealms - all open
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache do_delayed_cap_imports
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 MDSContext::complete: 12C_MDS_VoidFn
2024-04-21T02:41:46.654+0000 7f32c63a7640 1 mds.4.171 rejoin_done
2024-04-21T02:41:46.654+0000 7f32c63a7640 15 mds.4.cache show_subtrees
2024-04-21T02:41:46.654+0000 7f32c63a7640 10 mds.4.cache |__ 4 auth [dir 0x104 ~mds4/ [2,head] auth v=1 cv=0/0 dir_auth=4 state=1073741824 f(v0 10=0+10) n(v0 rc2024-04-21T01:57:04.622273+0000 10=0+10) hs=0+0,ss=0+0 | subtree=1 subtreetemp=0 0x55cb01c4fa80]
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache show_cache
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache unlinked [inode 0x3 [...2,head] #3/ auth v1 snaprealm=0x55cb01e68480 f() n(v0 rc2024-04-21T02:41:31.515383+0000 1=0+1) 0x55cb01e6e000]
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache unlinked [inode 0x104 [...2,head] ~mds4/ auth v1 snaprealm=0x55cb01c2dd40 f(v0 10=0+10) n(v0 rc2024-04-21T01:57:04.622273+0000 11=0+11) | dirfrag=1 openingsnapparents=0 0x55cb00ff3180]
2024-04-21T02:41:46.654+0000 7f32c63a7640 7 mds.4.cache dirfrag [dir 0x104 ~mds4/ [2,head] auth v=1 cv=0/0 dir_auth=4 state=1073741824 f(v0 10=0+10) n(v0 rc2024-04-21T01:57:04.622273+0000 10=0+10) hs=0+0,ss=0+0 | subtree=1 subtreetemp=0 0x55cb01c4fa80]
2024-04-21T02:41:46.654+0000 7f32c63a7640 3 mds.4.171 request_state up:active
2024-04-21T02:41:46.654+0000 7f32c63a7640 5 mds.beacon.b set_want_state: up:rejoin -> up:active
2024-04-21T02:41:46.654+0000 7f32c63a7640 5 mds.beacon.b Sending beacon up:active seq 32
...
-1> 2024-04-21T02:50:17.624+0000 7f32c63a7640 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-3201-g0b60fd01/rpm/el9/BUILD/ceph-19.0.0-3201-g0b60fd01/src/mds/MDCache.cc: In function 'void MDCache::handle_cache_rejoin_ack(ceph::cref_t<MMDSCacheRejoin>&)' thread 7f32c63a7640 time 2024-04-21T02:50:17.624573+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos9/DIST/centos9/MACHINE_SIZE/gigantic/release/19.0.0-3201-g0b60fd01/rpm/el9/BUILD/ceph-19.0.0-3201-g0b60fd01/src/mds/MDCache.cc: 5158: FAILED ceph_assert(isolated_inodes.empty())
ceph version 19.0.0-3201-g0b60fd01 (0b60fd01511511bc020e1a45638ede6ead9e38ec) squid (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x127) [0x7f32cc04aa17]
2: /usr/lib64/ceph/libceph-common.so.2(+0x24ac24) [0x7f32cc04ac24]
3: (MDCache::handle_cache_rejoin_ack(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x232f) [0x55cafbf9e7a3]
4: (MDCache::handle_cache_rejoin(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x395) [0x55cafbf95f75]
5: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0xec) [0x55cafbfbd400]
6: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x11d) [0x55cafbe09b8b]
Updated by Xiubo Li 7 days ago
Yeah, this time we really did hit another case. The local MDS was in the up:active state but the other ranks were not, so in this case the local MDS needs to start a rejoin too:
2024-04-21T02:50:11.680+0000 7f32c4ba4640 20 mds.4.171 updating export targets, currently 0 ranks are targets
2024-04-21T02:50:11.688+0000 7f32c63a7640 1 -- [v2:172.21.15.73:6835/2782686517,v1:172.21.15.73:6837/2782686517] <== mds.1 v2:172.21.15.43:6836/3580610477 10 ==== ==== 50+0+0 (crc 0 0 0) 0x55cb02051200 con 0x55cb01f78d00
2024-04-21T02:50:11.688+0000 7f32c63a7640 10 quiesce.mds.4 <quiesce_dispatch> got q-db[v:(198:0) sets:0/0] from 9054
2024-04-21T02:50:11.688+0000 7f32c63a7640 3 quiesce.mds.4 <quiesce_dispatch> error (-116) submitting q-db[v:(198:0) sets:0/0] from 9054
2024-04-21T02:50:11.690+0000 7f32c63a7640 1 -- [v2:172.21.15.73:6835/2782686517,v1:172.21.15.73:6837/2782686517] <== mon.1 v2:172.21.15.73:3300/0 205 ==== mdsmap(e 198) ==== 3684+0+0 (secure 0 0 0) 0x55cb01e05e00 con 0x55cb01019180
2024-04-21T02:50:11.690+0000 7f32c63a7640 1 mds.b Updating MDS map to version 198 from mon.1
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b my compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b mdsmap compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses versioned encoding,6=dirfrag is stored in omap,7=mds uses inline data,8=no anchor table,9=file layout v2,10=snaprealm v2,11=minor log segments,12=quiesce subvolumes}
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b my gid is 8158
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b map says I am mds.4.171 state up:active
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b msgr says I am [v2:172.21.15.73:6835/2782686517,v1:172.21.15.73:6837/2782686517]
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.b handle_mds_map: handling map as rank 4
2024-04-21T02:50:11.691+0000 7f32c63a7640 1 mds.4.171 rejoin_joint_start
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.4.cache rejoin_send_rejoins with recovery_set 0,1,2,3
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.4.cache disambiguate_other_imports
2024-04-21T02:50:11.691+0000 7f32c63a7640 10 mds.4.cache rejoin_walk [dir 0x1 / [2,head] rep@0.1 dir_auth=0 state=0 f(v0 m2024-04-21T01:57:20.110195+0000 1=0+1) n(v95 rc2024-04-21T02:41:15.972424+0000 b19129474 rs1 438=425+13) hs=1+0,ss=0+0 | dnwaiter=0 child=1 subtree=1 0x55cb01c4d200]
2024-04-21T02:50:11.691+0000 7f32c63a7640 15 mds.4.cache add_strong_dirfrag [dir 0x1 / [2,head] rep@0.1 dir_auth=0 state=0 f(v0 m2024-04-21T01:57:20.110195+0000 1=0+1) n(v95 rc2024-04-21T02:41:15.972424+0000 b19129474 rs1 438=425+13) hs=1+0,ss=0+0 | dnwaiter=0 child=1 subtree=1 0x55cb01c4d200]
We should fix this case too.
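To make the condition concrete, here is a minimal standalone sketch of the check being described: a rank that is already up:active must also re-enter rejoin when any peer rank in its recovery set is back in up:rejoin. The names here (`needs_rejoin`, the `MDSState` enum) are illustrative only, not Ceph's actual MDSMap types or the proposed patch:

```cpp
#include <vector>

// Hypothetical model of the two daemon states that matter in this report.
enum class MDSState { Rejoin, Active };

// If the local rank stays up:active while a peer restarts its rejoin, the
// peer's later rejoin_ack can carry inodes the active rank never resolved,
// which is what trips ceph_assert(isolated_inodes.empty()) in
// MDCache::handle_cache_rejoin_ack. So an active rank must join the rejoin
// round whenever any peer in its recovery set is in up:rejoin.
bool needs_rejoin(MDSState local, const std::vector<MDSState>& peers) {
    if (local != MDSState::Active)
        return false;  // a recovering rank is already part of the rejoin
    for (MDSState p : peers)
        if (p == MDSState::Rejoin)
            return true;  // active rank must start a rejoin too
    return false;
}
```

In the log above, mds.4 is exactly this case: the map says it is up:active while other ranks are rejoining, and it correctly logs rejoin_joint_start; the crash comes later in handle_cache_rejoin_ack.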