Bug #43825: osd stuck down
Status: Closed
Description
osd stuck at epoch 99 while the cluster is at roughly epoch 2000.
monc fails to reconnect to the mon.
/a/sage-2020-01-24_23:29:53-rados-wip-sage2-testing-2020-01-24-1408-distro-basic-smithi/4703135
Updated by Sage Weil over 4 years ago
huh, also /a/sage-2020-01-24_23:29:53-rados-wip-sage2-testing-2020-01-24-1408-distro-basic-smithi/4703159 osd.7
Updated by Sage Weil over 4 years ago
- Status changed from New to Need More Info
Updated by Sage Weil over 4 years ago
- Subject changed from monc fails to reconnect, osd stuck down to osd stuck down
2020-01-28T15:07:56.366+0000 7fd3bc46f700 10 osd.6 400 handle_osd_map waiting for pgs to consume 300 (shard 0 min 0, map cache is 50, max_lag_factor 2)
osd is stuck at an old epoch; cluster marks it down and it doesn't come back up.
/a/sage-2020-01-28_03:52:05-rados-wip-sage2-testing-2020-01-27-1839-distro-basic-smithi/4713213
Thread 38 (Thread 0x7fd3bc46f700 (LWP 40649)):
#0  0x00007fd3d20884dc in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007fd3d1749780 in std::condition_variable::wait(std::unique_lock<std::mutex>&) () from /lib64/libstdc++.so.6
#2  0x000055b84445f8bf in std::condition_variable::wait<OSDShard::wait_min_pg_epoch(epoch_t)::<lambda()> > (__p=..., __lock=..., this=0x55b85090e858) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/log/Entry.h:35
#3  OSDShard::wait_min_pg_epoch (this=0x55b85090e6c0, need=300) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/osd/OSD.cc:10007
#4  0x000055b84445ffd8 in OSD::handle_osd_map (this=0x55b84fce8000, m=0x55b853c5b440) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/osd/OSD.cc:7770
#5  0x000055b844474b8b in OSD::_dispatch (this=0x55b84fce8000, m=0x55b853c5b440) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/osd/OSD.cc:7180
#6  0x000055b844474e16 in OSD::ms_dispatch (this=0x55b84fce8000, m=0x55b853c5b440) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/osd/OSD.cc:6917
#7  0x000055b844db4228 in Dispatcher::ms_dispatch2 (m=..., this=0x55b84fce8000) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/msg/Dispatcher.h:124
#8  Messenger::ms_deliver_dispatch (m=..., this=0x55b84fb7a800) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/msg/Messenger.h:693
#9  DispatchQueue::entry (this=0x55b84fb7ab28) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/msg/DispatchQueue.cc:199
#10 0x000055b844c0c5d1 in DispatchQueue::DispatchThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.0.0-9869.g4b944a6.el8.x86_64/src/msg/DispatchQueue.h:101
#11 0x00007fd3d20822de in start_thread () from /lib64/libpthread.so.0
#12 0x00007fd3d0e2c4b3 in clone () from /lib64/libc.so.6
it's stuck here:
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 400 advance_pg 1.5 is merge target, sources are 1.25
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 400 advance_pg have 0/1
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 400 advance_pg not ready to merge yet
2020-01-28T15:07:56.366+0000 7fd3ac071700 10 log is not dirty
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 400 advance_pg kicking source 1.25
2020-01-28T15:07:56.366+0000 7fd3ac071700 15 osd.6 400 enqueue_peering_evt 1.25 epoch_sent: 282 epoch_requested: 282 NullEvt
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _enqueue OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process 1.25 to_process <> waiting <> waiting_peering {282=<OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282),OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282),OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)>}
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282) queued
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process 1.25 to_process <OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)> waiting <> waiting_peering {282=<OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282),OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282),OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)>}
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282) pg 0
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process 1.25 splitting 268
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq _add_slot_waiter 1.25 peering, item epoch is 282, will wait on OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)
2020-01-28T15:07:56.366+0000 7fd3ac071700 20 osd.6 op_wq(5) _process empty q, waiting
Updated by Sage Weil over 4 years ago
2020-01-28T14:56:26.155+0000 7fd3ba08d700 20 osd.6 285 identify_splits_and_merges 1.5 e245 to e285 pg_nums {76=28,89=38,116=37,119=36,123=35,127=34,131=33,135=32,139=31,143=30,147=29,151=28,179=27,189=28,248=27,251=28,268=38,282=37}
epoch 268: 1.25 logically created from 1.5
epoch 282: 1.25 logically merges back into 1.5
Updated by Sage Weil over 4 years ago
- Status changed from Need More Info to In Progress
- Assignee set to Sage Weil
we are splitting:
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 285 split_pgs splitting pg[1.5( empty local-lis/les=195/196 n=0 ec=15/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5]/[5,4] r=-1 lpr=268 pi=[195,268)/3 crt=0'0 mlcod 0'0 remapped NOTIFY mbc={}] into 1.25
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 285 _make_pg 1.25
2020-01-28T14:56:26.159+0000 7fd3ac071700 5 osd.6 pg_epoch: 268 pg[1.25(unlocked)] enter Initial
2020-01-28T14:56:26.159+0000 7fd3ac071700 20 osd.6 pg_epoch: 268 pg[1.25(unlocked)] enter NotTrimming
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 bluestore(/var/lib/ceph/osd/ceph-6) _osr_attach 1.25_head fresh osr 0x55b850a28280
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 285 pg_num is 38, m_seed 37, split_bits is 6
2020-01-28T14:56:26.159+0000 7fd3ac071700 20 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] on_new_interval
2020-01-28T14:56:26.159+0000 7fd3ac071700 20 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] on_new_interval upacting_features 0x3f01cfb8ffadffff from [5]+[5]
2020-01-28T14:56:26.159+0000 7fd3ac071700 20 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] on_new_interval checking missing set deletes flag. missing = missing(0 may_include_deletes = 1)
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] init_hb_stamps now []
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] on_new_interval prior_readable_until_ub 0s (mnow 0.072469917s + 0s)
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] cancel_recovery
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 pg_epoch: 268 pg[1.25( empty local-lis/les=195/196 n=0 ec=268/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5] r=-1 lpr=0 pi=[195,268)/3 crt=0'0 mlcod 0'0 unknown mbc={}] clear_recovery_state
2020-01-28T14:56:26.159+0000 7fd3ac071700 10 osd.6 pg_epoch: 268 pg[1.5( empty local-lis/les=195/196 n=0 ec=15/15 lis/c=195/195 les/c/f=196/196/0 sis=268) [5]/[5,4] r=-1 lpr=268 pi=[195,268)/3 crt=0'0 mlcod 0'0 remapped NOTIFY mbc={}] release_backoffs [MIN,MAX)
...
and then we bail out of advance_pg because there is a merge coming up,
2020-01-28T14:56:26.162+0000 7fd3ac071700 20 osd.6 285 advance_pg not ready to merge yet
2020-01-28T14:56:26.162+0000 7fd3ac071700 10 log is not dirty
2020-01-28T14:56:26.162+0000 7fd3ac071700 20 osd.6 285 advance_pg kicking source 1.25
2020-01-28T14:56:26.162+0000 7fd3ac071700 15 osd.6 285 enqueue_peering_evt 1.25 epoch_sent: 282 epoch_requested: 282 NullEvt
2020-01-28T14:56:26.162+0000 7fd3ac071700 20 osd.6 op_wq(5) _enqueue OpSchedulerItem(1.25 PGPeeringEvent(epoch_sent: 282 epoch_requested: 282 NullEvt) prio 255 cost 10 e282)
...
2020-01-28T14:56:26.162+0000 7fd3ac071700 20 osd.6 op_wq(5) _process 1.25 to_process <> waiting <> waiting_peering {}
but we do not submit the rctx transaction (and thus do not queue the split finisher) before dropping the pg lock.
Updated by Sage Weil over 4 years ago
- Status changed from In Progress to Fix Under Review
- Backport set to nautilus
- Pull request ID set to 32942
Updated by Sage Weil over 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #43919: nautilus: osd stuck down added
Updated by Nathan Cutler about 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".