Project

General

Profile

Actions

Bug #2218

closed

CephFS "mismatch between child accounted_rstats and my rstats!"

Added by Matthew Roy about 12 years ago. Updated over 10 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The mismatch is detected at 2012-03-26 18:39:54.306661

2012-03-26 18:39:54.306317 7f43b6c4b700 mds.0.locker  got rdlock on (ipolicy sync r=1) [inode 10000000001 [...2,head] /home/matthew/ auth v27952 ap=1+1 f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226463=210629+15834) (iauth sync r=1) (isnap sync r=1) (inest lock w=1) (ipolicy sync r=1) (ifile excl w=1) (iversion lock) caps={4707=pAsLsXsFsx/-@6},l=4707 | lock dirfrag caps dirty authpin 0x124d920]
2012-03-26 18:39:54.306329 7f43b6c4b700 mds.0.inotable: project_alloc_id 0 to [1000003ade1~1fffffc521e]/[1000003ade1~1fffffc521e]
2012-03-26 18:39:54.306334 7f43b6c4b700 mds.0.server prepare_new_inode alloc 1000003ade1
2012-03-26 18:39:54.306340 7f43b6c4b700 mds.0.inotable: project_alloc_ids [1000003ade2~3e8] to [1000003b1ca~1fffffc4e35]/[1000003ade1~1fffffc521e]
2012-03-26 18:39:54.306345 7f43b6c4b700 mds.0.server prepare_new_inode prealloc [1000003ade2~3e8]
2012-03-26 18:39:54.306348 7f43b6c4b700 mds.0.server  dir mode 040700 new mode 0100644
2012-03-26 18:39:54.306355 7f43b6c4b700 mds.0.server prepare_new_inode [inode 1000003ade1 [2,head] #1000003ade1 auth v1 s=0 n() (iversion lock) 0x1996a1e0]
2012-03-26 18:39:54.306360 7f43b6c4b700 mds.0.cache.dir(10000000001) pre_dirty 40971
2012-03-26 18:39:54.306368 7f43b6c4b700 mds.0.cache.den(10000000001 .dmrc.FS6WBW)  pre_dirty [dentry #1/home/matthew/.dmrc.FS6WBW [2,head] auth NULL (dn xlock x=1 by 0x4862c800) (dversion lock w=1 last_client=4707) pv=40971 v=40970 ap=2+0 inode=0 | request lock authpin 0x12a6308]
2012-03-26 18:39:54.306375 7f43b6c4b700 mds.0.server journal_allocated_inos sessionmapv 482240 inotablev 241135
2012-03-26 18:39:54.306387 7f43b6c4b700 mds.0.cache predirty_journal_parents do_parent_mtime linkunlink=1 primary_dn follows head [inode 1000003ade1 [2,head] {#1000003ade1 /home/matthew/.dmrc.FS6WBW} auth v40971 s=0 n(v0 1=1+0)/n() (iversion lock) cr={4707=0-4194304@1} 0x1996a1e0]
2012-03-26 18:39:54.306400 7f43b6c4b700 mds.0.cache.dir(10000000001) auth_pin by 0x4862c800 on [dir 10000000001 /home/matthew/ [2,head] auth v=40970 cv=40821/40821 ap=1+2+2 state=1610612738|complete f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833) hs=124+4,ss=0+0 dirty=10 | child dirty authpin 0x125ebe8] count now 1 + 2
2012-03-26 18:39:54.306405 7f43b6c4b700 mds.0.cache.dir(10000000001) project_fnode 0x24e79c0
2012-03-26 18:39:54.306408 7f43b6c4b700 mds.0.cache.dir(10000000001) pre_dirty 40972
2012-03-26 18:39:54.306419 7f43b6c4b700 mds.0.cache predirty_journal_parents updating mtime on [dir 10000000001 /home/matthew/ [2,head] auth pv=40972 v=40970 cv=40821/40821 ap=1+2+2 state=1610612738|complete f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833) hs=124+4,ss=0+0 dirty=10 | child dirty authpin 0x125ebe8]
2012-03-26 18:39:54.306430 7f43b6c4b700 mds.0.cache predirty_journal_parents updating size on [dir 10000000001 /home/matthew/ [2,head] auth pv=40972 v=40970 cv=40821/40821 ap=1+2+2 state=1610612738|complete f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833) hs=124+4,ss=0+0 dirty=10 | child dirty authpin 0x125ebe8]
2012-03-26 18:39:54.306440 7f43b6c4b700 mds.0.cache projected_rstat_inode_to_frag first 2 linkunlink 1 [inode 1000003ade1 [2,head] {#1000003ade1 /home/matthew/.dmrc.FS6WBW} auth v40971 s=0 n(v0 1=1+0)/n() (iversion lock) cr={4707=0-4194304@1} 0x1996a1e0]
2012-03-26 18:39:54.306444 7f43b6c4b700 mds.0.cache     frag head is [2,head] 
2012-03-26 18:39:54.306447 7f43b6c4b700 mds.0.cache  inode update is [2,head]
2012-03-26 18:39:54.306455 7f43b6c4b700 mds.0.cache  floor of 2 from parent dn [dentry #1/home/matthew/.dmrc.FS6WBW [2,head] auth NULL (dn xlock x=1 by 0x4862c800) (dversion lock w=1 last_client=4707) pv=40971 v=40970 ap=2+0 inode=0 | request lock authpin 0x12a6308]
2012-03-26 18:39:54.306459 7f43b6c4b700 mds.0.cache _project_rstat_inode_to_frag [2,head]
2012-03-26 18:39:54.306468 7f43b6c4b700 mds.0.cache   inode           rstat n(v0 1=1+0)
2012-03-26 18:39:54.306471 7f43b6c4b700 mds.0.cache   inode accounted_rstat n()
2012-03-26 18:39:54.306475 7f43b6c4b700 mds.0.cache                   delta n(v0 1=1+0)
2012-03-26 18:39:54.306481 7f43b6c4b700 mds.0.cache  projecting to head [2,head] n(v2508 rc2012-03-26 18:39:54.306322 b44020242481 226462=210629+15833)
2012-03-26 18:39:54.306486 7f43b6c4b700 mds.0.cache   project to [2,head] n(v2508 rc2012-03-26 18:39:54.306322 b44020242481 226462=210629+15833)
2012-03-26 18:39:54.306499 7f43b6c4b700 mds.0.cache       result [2,head] n(v2508 rc2012-03-26 18:39:54.306322 b44020242481 226463=210630+15833) [dir 10000000001 /home/matthew/ [2,head] auth pv=40972 v=40970 cv=40821/40821 ap=1+2+2 state=1610612738|complete f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833) hs=124+4,ss=0+0 dirty=10 | child dirty authpin 0x125ebe8]
2012-03-26 18:39:54.306516 7f43b6c4b700 mds.0.locker local_wrlock_grab  on (iversion lock) on [inode 10000000001 [...2,head] /home/matthew/ auth v27952 ap=1+1 f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226463=210629+15834) (iauth sync r=1) (isnap sync r=1) (inest lock w=1) (ipolicy sync r=1) (ifile excl w=1) (iversion lock) caps={4707=pAsLsXsFsx/-@6},l=4707 | lock dirfrag caps dirty authpin 0x124d920]
2012-03-26 18:39:54.306523 7f43b6c4b700 mds.0.cache.ino(10000000001) project_inode 0x49e15000
2012-03-26 18:39:54.306527 7f43b6c4b700 mds.0.cache.dir(10000000000) pre_dirty 27954
2012-03-26 18:39:54.306534 7f43b6c4b700 mds.0.cache.den(10000000000 matthew)  pre_dirty [dentry #1/home/matthew [2,head] auth (dversion lock) pv=27954 v=27952 ap=0+2 inode=0x124d920 | inodepin dirty 0x12681d8]
2012-03-26 18:39:54.306539 7f43b6c4b700 mds.0.cache.ino(10000000001) pre_dirty 27954 (current v 27952)
2012-03-26 18:39:54.306543 7f43b6c4b700 mds.0.cache predirty_journal_parents add_delta f(v2 m2012-03-26 18:39:54.306322 125=56+69)
2012-03-26 18:39:54.306548 7f43b6c4b700 mds.0.cache predirty_journal_parents         - f(v2 m2012-03-26 18:11:51.885054 124=55+69)
2012-03-26 18:39:54.306564 7f43b6c4b700 mds.0.cache predirty_journal_parents     gives f(v2 m2012-03-26 18:39:54.306322 125=56+69) on [inode 10000000001 [...2,head] /home/matthew/ auth v27952 pv27954 ap=1+1 f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226463=210629+15834) (iauth sync r=1) (isnap sync r=1) (inest lock w=1) (ipolicy sync r=1) (ifile excl w=1) (iversion lock w=1 last_client=4707) caps={4707=pAsLsXsFsx/-@6},l=4707 | lock dirfrag caps dirty authpin 0x124d920]
2012-03-26 18:39:54.306576 7f43b6c4b700 mds.0.cache predirty_journal_parents frag->inode on [dir 10000000001 /home/matthew/ [2,head] auth pv=40972 v=40970 cv=40821/40821 ap=1+2+2 state=1610612738|complete f(v2 m2012-03-26 18:11:51.885054 124=55+69) n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833) hs=124+4,ss=0+0 dirty=10 | child dirty authpin 0x125ebe8]
2012-03-26 18:39:54.306580 7f43b6c4b700 mds.0.cache project_rstat_frag_to_inode [2,head]
2012-03-26 18:39:54.306585 7f43b6c4b700 mds.0.cache   frag           rstat n(v2508 rc2012-03-26 18:39:54.306322 b44020242481 226463=210630+15833)
2012-03-26 18:39:54.306590 7f43b6c4b700 mds.0.cache   frag accounted_rstat n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833)
2012-03-26 18:39:54.306595 7f43b6c4b700 mds.0.cache                  delta n(v2508 rc2012-03-26 18:39:54.306322 1=1+0)
2012-03-26 18:39:54.306600 7f43b6c4b700 mds.0.cache  projecting to [2,head] n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226463=210629+15834)
2012-03-26 18:39:54.306606 7f43b6c4b700 mds.0.cache         result [2,head] n(v2508 rc2012-03-26 18:39:54.306322 b44020242481 226464=210630+15834)
2012-03-26 18:39:54.306610 7f43b6c4b700 mds.0.cache.dir(10000000001) get_num_head_items() = 124; fnode.fragstat.nfiles=55 fnode.fragstat.nsubdirs=69
2012-03-26 18:39:54.306661 7f43b6c4b700 mds.0.cache.dir(10000000001) mismatch between child accounted_rstats and my rstats!
2012-03-26 18:39:54.306667 7f43b6c4b700 mds.0.cache.dir(10000000001) total of child dentrys: n(v0 rc2012-03-26 18:22:30.417126 b44020242237 226462=210629+15833)
2012-03-26 18:39:54.306672 7f43b6c4b700 mds.0.cache.dir(10000000001) my rstats:              n(v2508 rc2012-03-26 18:22:30.417126 b44020242481 226462=210629+15833)

These logs are not from a perfectly clean cluster, the directory /home/matthew/.config/chromium/Default/ also had rstat issues that started before logging was elevated. Several cluster installs have developed this issue over time (1-3 days after install), most often in a user's home directory. The attached log is at debug=20 for the second during which the mismatch was first detected. I have additional MDS logs from before and after if needed.

This is coming from a 3 mon, 2 MDS (1 active), 2 OSD cluster.


Files

matthewhome183954.mds.grep (641 KB) matthewhome183954.mds.grep MDS log from 2012-03-26 18:39:54 Matthew Roy, 03/26/2012 06:24 PM
Actions

Also available in: Atom PDF