Bug #45574
subinterpreters: ceph/mgr/rook RuntimeError on import of RookOrchestrator - ceph cluster does not start
Description
The 'devicehealth' plugin's dependency on rook code (package ceph-mgr-rook) prevents the cluster from booting after an upgrade to Octopus v15.2.1 on Debian Buster.
This appears to be due to an interaction between the Rook code and python3-numpy (version 1:1.16.2-1), and it is not unique to Rook ( https://github.com/numpy/numpy/issues/14384 ).
No fix is available upstream (it seems to be "WON'T IMPLEMENT"), so a fix is required in Rook.
May 17 21:01:42 davinci ceph-mgr[82022]: 2020-05-17T21:01:42.587+0200 7f262360ff40 -1 mgr[py] Module not found: 'rook'
May 17 21:01:42 davinci ceph-mgr[82022]: 2020-05-17T21:01:42.587+0200 7f262360ff40 -1 mgr[py] Traceback (most recent call last):
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/share/ceph/mgr/rook/__init__.py", line 2, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from .module import RookOrchestrator
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/share/ceph/mgr/rook/module.py", line 16, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from kubernetes import client, config
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/kubernetes/__init__.py", line 22, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: import kubernetes.stream
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/kubernetes/stream/__init__.py", line 15, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from .stream import stream
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/kubernetes/stream/stream.py", line 13, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from . import ws_client
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/kubernetes/stream/ws_client.py", line 19, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from websocket import WebSocket, ABNF, enableTrace
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/websocket/__init__.py", line 22, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from ._abnf import *
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/websocket/_abnf.py", line 34, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: import numpy
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/numpy/__init__.py", line 142, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from . import core
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/numpy/core/__init__.py", line 40, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from . import multiarray
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/numpy/core/multiarray.py", line 12, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: from . import overrides
May 17 21:01:42 davinci ceph-mgr[82022]: File "/usr/lib/python3/dist-packages/numpy/core/overrides.py", line 46, in <module>
May 17 21:01:42 davinci ceph-mgr[82022]: """)
May 17 21:01:42 davinci ceph-mgr[82022]: RuntimeError: implement_array_function method already has a docstring
May 17 21:01:42 davinci ceph-mgr[82022]: 2020-05-17T21:01:42.591+0200 7f262360ff40 -1 mgr[py] Class not found in module 'rook'
May 17 21:01:42 davinci ceph-mgr[82022]: 2020-05-17T21:01:42.591+0200 7f262360ff40 -1 mgr[py] Error loading module 'rook': (2) No such file or directory
May 17 21:01:43 davinci ceph-mgr[82022]: 2020-05-17T21:01:43.099+0200 7f262360ff40 -1 log_channel(cluster) log [ERR] : Failed to load ceph-mgr modules: rook
May 17 21:01:46 davinci ceph-mgr[82022]: 2020-05-17T21:01:46.211+0200 7f260ae73700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'devicehealth' while running on mgr.davinci.lund.millnert.se:
May 17 21:01:46 davinci ceph-mgr[82022]: 2020-05-17T21:01:46.211+0200 7f260ae73700 -1 devicehealth.serve:
May 17 21:01:46 davinci ceph-mgr[82022]: 2020-05-17T21:01:46.211+0200 7f260ae73700 -1 Traceback (most recent call last):
May 17 21:01:46 davinci ceph-mgr[82022]: File "/usr/share/ceph/mgr/devicehealth/module.py", line 260, in serve
May 17 21:01:46 davinci ceph-mgr[82022]: self.scrape_all()
May 17 21:01:46 davinci ceph-mgr[82022]: File "/usr/share/ceph/mgr/devicehealth/module.py", line 327, in scrape_all
May 17 21:01:46 davinci ceph-mgr[82022]: ioctx = self.open_connection()
May 17 21:01:46 davinci ceph-mgr[82022]: File "/usr/share/ceph/mgr/devicehealth/module.py", line 297, in open_connection
May 17 21:01:46 davinci ceph-mgr[82022]: assert r == 0
May 17 21:01:46 davinci ceph-mgr[82022]: AssertionError
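For readers unfamiliar with this class of failure, here is a toy model of the RuntimeError in the traceback above. It is only a sketch and does not use numpy: it assumes the error comes from some process-wide object being given a docstring a second time when a second interpreter runs the same import-time code. All names (`SharedCFunction`, `add_docstring_once`, `simulate_import`) are hypothetical stand-ins, not numpy or ceph-mgr APIs.

```python
class SharedCFunction:
    """Hypothetical stand-in for a C-level object that outlives one interpreter."""

    def __init__(self):
        self.doc = None


def add_docstring_once(obj, doc):
    # Refuse to overwrite an existing docstring, mimicking the message
    # seen in the traceback.
    if obj.doc is not None:
        raise RuntimeError(
            "implement_array_function method already has a docstring")
    obj.doc = doc


def simulate_import(shared):
    """Model the import-time code that one interpreter would execute."""
    add_docstring_once(shared, "docstring attached at import time")
    return "imported"


def demo():
    shared = SharedCFunction()            # assumed to be shared process-wide
    results = [simulate_import(shared)]   # first interpreter: succeeds
    try:
        simulate_import(shared)           # second interpreter: same code again
    except RuntimeError as exc:
        results.append(str(exc))
    return results
```

Calling `demo()` runs the import-time step twice against the same shared object: the first call succeeds and the second raises, which is the pattern the log shows when the mgr loads the module.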
Updated by Martin Millnert about 4 years ago
Update: it was a red herring that this bug prevented the cluster from starting.
I was upgrading from Luminous through Nautilus to Octopus and had blocked the OSDs from starting by leaving ceph osd require-osd-release set to luminous, which didn't let the n-2 version start. The cluster is up and happy, but the error still shows up nonetheless.
Clearly the severity can be reduced.
Updated by Sebastian Wagner about 4 years ago
https://github.com/numpy/numpy/issues/14384#issuecomment-626832460
And we have sub-interpreters again.
Updated by Sebastian Wagner about 4 years ago
- Status changed from New to Need More Info
I think we're going to need the full mgr log here.
Updated by Tim Serong about 4 years ago
Sebastian Wagner wrote:
https://github.com/numpy/numpy/issues/14384#issuecomment-626832460
And we have sub-interpreters again.
There are a couple of subsequent comments on that bug from rgommers saying that, even though they're not going to attempt to fix subinterpreter issues in numpy, there should be a fix for this particular docstring error "soon". So we may have a brief reprieve...
Updated by Sebastian Wagner about 4 years ago
But that in turn means we're not able to load numpy from two sub-interpreters. Which means no k8sevents module?
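As an illustration of how a module could at least report this condition instead of failing with a bare traceback, here is a minimal sketch of a guarded import. It is not how ceph-mgr loads modules. `try_import` and `ImportOutcome` are hypothetical names, and the only thing taken from this ticket is that the failure surfaces as a RuntimeError rather than an ImportError, so both are caught.

```python
import importlib
from types import ModuleType
from typing import NamedTuple, Optional


class ImportOutcome(NamedTuple):
    """Result of an import attempt: exactly one of the two fields is set."""
    module: Optional[ModuleType]
    error: Optional[str]


def try_import(name: str) -> ImportOutcome:
    """Import `name`, turning import-time failures into a readable message."""
    try:
        return ImportOutcome(importlib.import_module(name), None)
    except (ImportError, RuntimeError) as exc:
        # RuntimeError is included because the numpy failure in this
        # ticket is raised as RuntimeError, not ImportError.
        return ImportOutcome(None, f"{type(exc).__name__}: {exc}")
```

A caller could then check `try_import("kubernetes").error` and disable the dependent functionality with a clear message. This only makes the failure visible; it does not make numpy loadable in a second sub-interpreter.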
Updated by Sebastian Wagner almost 4 years ago
- Project changed from Orchestrator to mgr
- Subject changed from ceph/mgr/rook RuntimeError on import of RookOrchestrator - ceph cluster does not start to subinterpreters: ceph/mgr/rook RuntimeError on import of RookOrchestrator - ceph cluster does not start
- Category changed from mgr/rook to ceph-mgr
Updated by Sebastian Wagner about 3 years ago
- Has duplicate Bug #50979: rook: implement_array_function method already has a docstring added
Updated by Sebastian Wagner about 3 years ago
- Status changed from Need More Info to New
- Priority changed from Normal to High
Updated by Sebastian Wagner about 3 years ago
2021-05-24T19:40:05.822+0000 7f56cd302040 -1 mgr[py] Module not found: 'rook'
2021-05-24T19:40:05.826+0000 7f8800ece040 -1 mgr[py] Module not found: 'rook'
2021-05-24T19:40:05.826+0000 7f8800ece040 -1 mgr[py] Traceback (most recent call last):
  File "/usr/share/ceph/mgr/rook/__init__.py", line 2, in <module>
    from .module import RookOrchestrator
  File "/usr/share/ceph/mgr/rook/module.py", line 17, in <module>
    from kubernetes import client, config
  File "/usr/lib/python3/dist-packages/kubernetes/__init__.py", line 22, in <module>
    import kubernetes.stream
  File "/usr/lib/python3/dist-packages/kubernetes/stream/__init__.py", line 15, in <module>
    from .stream import stream
  File "/usr/lib/python3/dist-packages/kubernetes/stream/stream.py", line 13, in <module>
    from . import ws_client
  File "/usr/lib/python3/dist-packages/kubernetes/stream/ws_client.py", line 19, in <module>
    from websocket import WebSocket, ABNF, enableTrace
  File "/usr/lib/python3/dist-packages/websocket/__init__.py", line 22, in <module>
    from ._abnf import *
  File "/usr/lib/python3/dist-packages/websocket/_abnf.py", line 34, in <module>
    import numpy
  File "/usr/lib/python3/dist-packages/numpy/__init__.py", line 142, in <module>
    from . import core
  File "/usr/lib/python3/dist-packages/numpy/core/__init__.py", line 17, in <module>
    from . import multiarray
  File "/usr/lib/python3/dist-packages/numpy/core/multiarray.py", line 14, in <module>
    from . import overrides
  File "/usr/lib/python3/dist-packages/numpy/core/overrides.py", line 16, in <module>
    add_docstring(
RuntimeError: implement_array_function method already has a docstring
Updated by Sebastian Wagner about 3 years ago
- Related to Bug #38407: Funny issues with python sub-interpreters added
Updated by Sebastian Wagner about 3 years ago
- Related to Bug #48787: ceph-mgr segfault added
Updated by Deepika Upadhyay about 3 years ago
I haven't dug deep, but this might be helpful: I was taking a look at the issue yesterday, and it has a workaround in numpy <= 1.19 [0]. I checked on Ubuntu 20.04, and pip can provide `numpy==1.19`. Since we have seen it only on focal so far in Octopus, can we just pip install instead of using the distro package?
[0] https://github.com/numpy/numpy/issues/14384#issuecomment-641340591
Updated by Sebastian Wagner almost 3 years ago
Deepika Upadhyay wrote:
I haven't dug deep, but this might be helpful: I was taking a look at the issue yesterday, and it has a workaround in numpy <= 1.19 [0]. I checked on Ubuntu 20.04, and pip can provide `numpy==1.19`. Since we have seen it only on focal so far in Octopus, can we just pip install instead of using the distro package?
[0] https://github.com/numpy/numpy/issues/14384#issuecomment-641340591
I think numpy is installed via APT instead of pip:
https://github.com/ceph/ceph/blob/26fbbefa827cfab0837296df2c8f5d1cc88331ae/debian/control#L319
We can avoid that for now if we don't install ceph-mgr-rook on focal, but that only works until other distributions upgrade numpy to > 1.19.
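To make the version threshold in this discussion concrete, here is a small sketch that compares a Debian-style package version string (such as the 1:1.16.2-1 quoted in the description) against 1.19. The helper names are hypothetical, the parsing is deliberately simplistic, and the code takes no position on which side of the threshold is affected; it only does the comparison.

```python
import re


def upstream_version(deb_version: str) -> tuple:
    """Drop the Debian epoch and revision and return the numeric parts.

    Simplified parsing for illustration only; real Debian version
    comparison has more rules than this.
    """
    no_epoch = deb_version.split(":", 1)[-1]    # '1:1.16.2-1' -> '1.16.2-1'
    upstream = no_epoch.rsplit("-", 1)[0]       # '1.16.2-1'   -> '1.16.2'
    return tuple(int(p) for p in re.findall(r"\d+", upstream)[:3])


def newer_than_1_19(deb_version: str) -> bool:
    """True if the major.minor part is strictly greater than 1.19."""
    return upstream_version(deb_version)[:2] > (1, 19)
```

For the Buster package from the description, `upstream_version("1:1.16.2-1")` gives `(1, 16, 2)`, so `newer_than_1_19` returns `False`.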
Updated by Kefu Chai almost 3 years ago
https://github.com/ceph/ceph/pull/41688 was created so that ceph-mgr-rook is not installed merely because it is Recommended by ceph-mgr-modules-core.
Updated by Deepika Upadhyay almost 3 years ago
- Copied to Bug #51240: mgr module fails in focal, due to ceph-mgr-rook module added
Updated by Ernesto Puerta 7 months ago
- Blocked by Cleanup #63294: mgr: enable per-subinterpreter GIL (Python >= 3.12) added
Updated by Laura Flores 4 months ago
- Related to Bug #64054: test failure due to HEALTH_ERR: 2 mgr modules have failed added