Bug #2234 (closed)

Sometimes 'ceph -s' is unable to show pg data and crashes

Added by Szymon Szypulski about 12 years ago. Updated almost 12 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)

Description

ceph -s / ceph -w sometimes gives me the output below:

2012-04-04 10:18:06.961917   mds e46461: 1/1/1 up {0=backup1=up:replay}
2012-04-04 10:18:06.962392   osd e9588: 3 osds: 3 up, 3 in
2012-04-04 10:18:06.962572   mon e3: 3 mons at {aux1=46.4.71.236:6789/0,backup1=176.9.28.253:6789/0,backup2=176.9.28.148:6789/0}
2012-04-04 10:18:26.084629 7efefd305700 monclient: hunting for new mon
2012-04-04 10:18:30.239197   log 2012-04-04 10:17:46.929687 mon.1 176.9.28.148:6789/0 99 : [INF] mon.backup2 calling new monitor election
2012-04-04 10:18:30.239197   log 2012-04-04 10:17:52.137145 mon.1 176.9.28.148:6789/0 100 : [INF] mon.backup2@1 won leader election with quorum 1,2
2012-04-04 10:18:30.239197   log 2012-04-04 10:18:02.282710 mon.1 176.9.28.148:6789/0 101 : [INF] mon.backup2 calling new monitor election
2012-04-04 10:18:30.239197   log 2012-04-04 10:18:17.229971 mon.0 46.4.71.236:6789/0 150 : [INF] mon.aux1 calling new monitor election
2012-04-04 10:18:30.239197   log 2012-04-04 10:18:27.373042 mon.0 46.4.71.236:6789/0 151 : [INF] mon.aux1@0 won leader election with quorum 0,1
ceph: mon/PGMap.cc:131: void PGMap::apply_incremental(const PGMap::Incremental&): Assertion `inc.version == version+1' failed.
*** Caught signal (Aborted) **
 in thread 7efefd305700
 ceph version 0.44.1 (commit:c89b7f22c8599eb974e75a2f7a5f855358199dee)
 1: ceph() [0x48744f]
 2: (()+0xfc60) [0x7eff001cbc60]
 3: (gsignal()+0x35) [0x7efefe98dd05]
 4: (abort()+0x186) [0x7efefe991ab6]
 5: (__assert_fail()+0xf5) [0x7efefe9867c5]
 6: (PGMap::apply_incremental(PGMap::Incremental const&)+0xfc5) [0x47b9f5]
 7: ceph() [0x46a074]
 8: (Admin::ms_dispatch(Message*)+0x9f1) [0x476ed1]
 9: (SimpleMessenger::dispatch_entry()+0x7db) [0x5164bb]
 10: (SimpleMessenger::DispatchThread::entry()+0xd) [0x46a3dd]
 11: (()+0x6d8c) [0x7eff001c2d8c]
 12: (clone()+0x6d) [0x7efefea40c2d]
Aborted

Background:
I don't know whether this issue was present earlier. Today I upgraded the whole cluster from 0.39 to 0.44.1, and the upgrade went smoothly. I then added one new OSD node and noticed that ceph -s / ceph -w gives the output above.

If you need more data, please let me know.
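
For reference, the assertion message indicates the client-side PGMap only accepts incrementals whose version is exactly one greater than the version it already holds; the "monclient: hunting for new mon" line right before the crash suggests the tool re-synced with a different monitor during the election and then received an incremental that skipped a version. Below is a minimal sketch of that kind of continuity check, using hypothetical names (PGMapSketch, PGIncrementalSketch); it is not the actual Ceph code.

// Minimal sketch (not the Ceph source) of the version-continuity check
// named in the assertion above. Names and fields are illustrative only.
#include <cassert>
#include <cstdint>

struct PGIncrementalSketch {
  uint64_t version;          // version this incremental advances the map to
  // ... per-PG stat deltas would live here in a real map ...
};

struct PGMapSketch {
  uint64_t version = 0;      // version of the map currently held

  void apply_incremental(const PGIncrementalSketch &inc) {
    // Incrementals must arrive in strict sequence; a skipped version
    // (e.g. after re-syncing with a different monitor mid-election)
    // trips this assert and aborts the client, as in the log above.
    assert(inc.version == version + 1);
    version = inc.version;
    // ... apply deltas ...
  }
};

int main() {
  PGMapSketch map;
  map.apply_incremental({1});   // ok: 0 -> 1
  map.apply_incremental({2});   // ok: 1 -> 2
  map.apply_incremental({4});   // gap: asserts, mirroring the reported crash
}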

#1

Updated by Sage Weil almost 12 years ago

  • Status changed from New to Resolved

This code has all been replaced!

