Bug #1759

Updated by Sage Weil over 12 years ago


 I'm running a minor variant of Ceph 0.38 on ext4 with ceph-fuse. It looks like my fs has been corrupted somehow. I've seen this assert failure on two of my OSDs, and each one hits the same assertion again when it is restarted. The EINVAL appears to come from a truncate of an object, because the size passed to truncate is extremely large (18446744073709551615). Is there any way to debug or correct this? 

 Log from one of the failed osds: 

 Nov 29 18:52:29 sug-chifj21 osd.59[19711]: 7f1dfa92b700 filestore(/srv/ceph/osd.59)    error error 22: Invalid argument not handled 
 Nov 29 18:52:29 sug-chifj21 osd.59[19711]: ../../src/os/FileStore.cc: In function 'unsigned int FileStore::_do_transaction(ObjectStore::Transaction&)', in thread '7f1dfa92b700'#012../../src/os/FileStore.cc: 2407: FAILED assert(0 == "unexpected error") 
 Nov 29 18:52:29 sug-chifj21 osd.59[19711]:    ceph version    (commit:)#012 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x9142d9]#012 2: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 4: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 5: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 6: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 7: (ThreadPool::worker()+0x42c) [0x914b54]#012 8: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 9: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 10: (()+0x6d8c) [0x7f1e05a10d8c]#012 11: (clone()+0x6d) [0x7f1e0425204d] 
 Nov 29 18:52:29 sug-chifj21 osd.59[19711]:    ceph version    (commit:)#012 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x89) [0x9142d9]#012 2: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 3: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 4: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 5: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 6: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 7: (ThreadPool::worker()+0x42c) [0x914b54]#012 8: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 9: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 10: (()+0x6d8c) [0x7f1e05a10d8c]#012 11: (clone()+0x6d) [0x7f1e0425204d] 
 Nov 29 18:52:29 sug-chifj21 osd.59[19711]: *** Caught signal (Aborted) **#012 in thread 7f1dfa92b700    
 Nov 29 18:52:29 sug-chifj21 osd.59[19711]:    ceph version    (commit:)#012 1: (ceph::BackTrace::BackTrace(int)+0x2d) [0x914655]#012 2: /usr/bin/ceph-osd() [0xa650ff]#012 3: (()+0xfc60) [0x7f1e05a19c60]#012 4: (gsignal()+0x35) [0x7f1e0419fd05]#012 5: (abort()+0x186) [0x7f1e041a3ab6]#012 6: (__gnu_cxx::__verbose_terminate_handler()+0x11d) [0x7f1e04a566dd]#012 7: (()+0xb9926) [0x7f1e04a54926]#012 8: (()+0xb9953) [0x7f1e04a54953]#012 9: (()+0xb9a5e) [0x7f1e04a54a5e]#012 10: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1f3) [0x914443]#012 11: (FileStore::_do_transaction(ObjectStore::Transaction&)+0x199c) [0xa7921e]#012 12: (FileStore::do_transactions(std::list<ObjectStore::Transaction*, std::allocator<ObjectStore::Transaction*> >&, unsigned long)+0x105) [0xa76cc1]#012 13: (FileStore::_do_op(FileStore::OpSequencer*)+0x1b9) [0xa7545f]#012 14: (FileStore::OpWQ::_process(FileStore::OpSequencer*)+0x27) [0xa8d445]#012 15: (ThreadPool::WorkQueue<FileStore::OpSequencer>::_void_process(void*)+0x2e) [0xa9c5a6]#012 16: (ThreadPool::worker()+0x42c) [0x914b54]#012 17: (ThreadPool::WorkThread::entry()+0x1c) [0x8ae79e]#012 18: (Thread::_entry_func(void*)+0x23) [0x97cd09]#012 19: (()+0x6d8c) [0x7f1e05a10d8c]#012 20: (clone()+0x6d) [0x7f1e0425204d] 
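 The backtraces above are hard to read because each frame is joined by `#012`. Assuming these lines were written by a syslog daemon that escapes control characters as `#` plus a three-digit octal code (012 octal is a line feed), a small helper can split a line back into one frame per row. The function name here is hypothetical and is not part of Ceph or syslog:

```python
def expand_syslog_escapes(line: str) -> str:
    """Replace the '#012' escape (octal 012 = LF) with a real newline.

    Assumption: the escape was produced by the syslog daemon, so no
    literal '#012' text appears in the original message.
    """
    return line.replace("#012", "\n")


# Sample taken from the "Caught signal" log line in this report.
sample = ("osd.59[19711]: *** Caught signal (Aborted) **"
          "#012 in thread 7f1dfa92b700")
print(expand_syslog_escapes(sample))
```

 Piping the saved log through this helper prints each numbered frame on its own line, which makes the `FileStore::_do_transaction` frame easier to spot.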


 strace truncate error for osd.59: 

 truncate("/srv/ceph/osd.59/current/0.274_head/1000000001a.00000002__head_9A0B7274", 18446744073709551615) = -1 EINVAL (Invalid argument) 

 strace truncate error for osd.65: 

 truncate("/srv/ceph/osd.65/current/0.274_head/1000000001a.00000002__head_9A0B7274", 18446744073709551615) = -1 EINVAL (Invalid argument) 
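 The length in both strace lines, 18446744073709551615, is 2^64 - 1. A minimal sketch of why the kernel rejects it, assuming `off_t` is a 64-bit signed integer (as on 64-bit Linux): the same bit pattern read as signed is -1, and truncate(2) returns EINVAL for a negative length. The helper names are hypothetical, and the sketch does not show where the value came from inside the OSD:

```python
import errno
import os
import tempfile


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit value as a signed 64-bit integer."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


def truncate_errno(length: int):
    """Try to truncate a scratch file; return the errno, or None on success."""
    with tempfile.NamedTemporaryFile() as scratch:
        try:
            os.truncate(scratch.name, length)
        except OSError as exc:
            return exc.errno
    return None


size = 18446744073709551615          # value seen in the strace output
print(size == 2**64 - 1)             # True
print(as_signed64(size))             # -1
# A negative length should be rejected with EINVAL (errno 22).
print(truncate_errno(as_signed64(size)) == errno.EINVAL)
```

 This suggests the filesystem call itself is behaving as documented, and that the thing to look for is a transaction carrying a truncate size of -1 (or an uninitialized 64-bit value).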
