Bug #2110
osdc/Journaler.cc: 360: FAILED assert(r >= 0)
Description
Assert in MDS. This cluster was running a CephFS home directory workload with one active MDS and one MDS in standby replay. mds.b is designated as the standby, but may have been active since the MDSes had both been individually restarted recently.
./mds.a.log-osdc/Journaler.cc: In function 'void Journaler::_finish_write_head(int, Journaler::Header&, Context*)' thread 7f299df90700 time 2012-02-26 14:39:40.295642
./mds.a.log:osdc/Journaler.cc: 360: FAILED assert(r >= 0)
./mds.a.log- ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322)
./mds.a.log- 1: (Journaler::_finish_write_head(int, Journaler::Header&, Context*)+0x1e1) [0x6a2271]
./mds.a.log- 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x11b7) [0x688587]
./mds.a.log- 3: (MDS::handle_core_message(Message*)+0x987) [0x4c49a7]
./mds.a.log- 4: (MDS::_dispatch(Message*)+0x2f) [0x4c4b3f]
./mds.a.log- 5: (MDS::ms_dispatch(Message*)+0x70) [0x4c6280]
./mds.a.log- 6: (SimpleMessenger::dispatch_entry()+0x783) [0x720d83]
./mds.a.log- 7: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4a409c]
./mds.a.log- 8: (()+0x7efc) [0x7f29a1a5aefc]
./mds.a.log- 9: (clone()+0x6d) [0x7f29a028f89d]
./mds.a.log- ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322)
./mds.a.log- 1: (Journaler::_finish_write_head(int, Journaler::Header&, Context*)+0x1e1) [0x6a2271]
./mds.a.log- 2: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x11b7) [0x688587]
./mds.a.log- 3: (MDS::handle_core_message(Message*)+0x987) [0x4c49a7]
./mds.a.log- 4: (MDS::_dispatch(Message*)+0x2f) [0x4c4b3f]
./mds.a.log- 5: (MDS::ms_dispatch(Message*)+0x70) [0x4c6280]
./mds.a.log- 6: (SimpleMessenger::dispatch_entry()+0x783) [0x720d83]
./mds.a.log- 7: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4a409c]
./mds.a.log- 8: (()+0x7efc) [0x7f29a1a5aefc]
./mds.a.log- 9: (clone()+0x6d) [0x7f29a028f89d]
./mds.a.log-*** Caught signal (Aborted) **
./mds.a.log- in thread 7f299df90700
./mds.a.log- ceph version 0.42.2 (commit:732f3ec94e39d458230b7728b2a936d431e19322)
./mds.a.log- 1: /usr/bin/ceph-mds() [0x79b0d6]
./mds.a.log- 2: (()+0x10060) [0x7f29a1a63060]
./mds.a.log- 3: (gsignal()+0x35) [0x7f29a01e43a5]
./mds.a.log- 4: (abort()+0x17b) [0x7f29a01e7b0b]
./mds.a.log- 5: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f29a0aa2d7d]
./mds.a.log- 6: (()+0xb9f26) [0x7f29a0aa0f26]
./mds.a.log- 7: (()+0xb9f53) [0x7f29a0aa0f53]
./mds.a.log- 8: (()+0xba04e) [0x7f29a0aa104e]
./mds.a.log: 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x200) [0x7384f0]
./mds.a.log- 10: (Journaler::_finish_write_head(int, Journaler::Header&, Context*)+0x1e1) [0x6a2271]
./mds.a.log- 11: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0x11b7) [0x688587]
./mds.a.log- 12: (MDS::handle_core_message(Message*)+0x987) [0x4c49a7]
./mds.a.log- 13: (MDS::_dispatch(Message*)+0x2f) [0x4c4b3f]
./mds.a.log- 14: (MDS::ms_dispatch(Message*)+0x70) [0x4c6280]
./mds.a.log- 15: (SimpleMessenger::dispatch_entry()+0x783) [0x720d83]
./mds.a.log- 16: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x4a409c]
./mds.a.log- 17: (()+0x7efc) [0x7f29a1a5aefc]
./mds.a.log- 18: (clone()+0x6d) [0x7f29a028f89d]
History
#1 Updated by Sage Weil about 12 years ago
Do you have a core file? I'm curious what the value of 'r' is.
#2 Updated by Matthew Roy about 12 years ago
- File core.mdsAssert1439.gz added
Sage Weil wrote:
Do you have a core file? I'm curious what the value of 'r' is.
Attached. Probably. (datetime matches, I didn't make the naming change suggested on the wiki yet)
#3 Updated by Sage Weil about 12 years ago
Can you attach the ceph-mds binary too? Or better yet, fire up gdb ceph-mds core and print out the value of r from that frame. (I've had poor luck making gdb give me anything useful in a mismatched environment.) We can help with that in the #ceph IRC channel...
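Once r has been read out of the core, it still needs interpreting. The sketch below is not Ceph code; it assumes, as is conventional for completion callbacks of this kind, that a failure is reported as a negated errno value. The names describe_result and write_succeeded are hypothetical helpers made up for illustration.

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper, not part of Ceph: turn a completion result code into
// a readable string. Assumes failures are reported as negated errno values.
std::string describe_result(int r) {
    if (r >= 0) {
        return "ok (" + std::to_string(r) + ")";
    }
    // Negate to recover the positive errno before looking up its message.
    return "errno " + std::to_string(-r) + ": " + std::strerror(-r);
}

// Hypothetical stand-in for a write-completion callback that reports a
// failure to the caller instead of asserting on it.
bool write_succeeded(int r, std::string* why) {
    if (r < 0) {
        if (why != nullptr) {
            *why = describe_result(r);
        }
        return false;
    }
    return true;
}
```

Under that assumption, passing the value printed by gdb to describe_result would name the underlying error, which is the information the assert(r >= 0) failure hides.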
#5 Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.