Bug #49727

lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

Added by David Zafman about 3 years ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 100%
Source:
Tags: backport_processed
Backport: pacific,quincy,reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This has been seen in cases where all of pool 1's PGs are scrubbed and none of pool 2's. I suggest this is because the mgr that handles the scrub request doesn't have an up-to-date pgmap. The test could add a short delay anywhere before issuing the scrub request.

2021-01-23T13:53:18.420 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the 350005e6-6ddd-44a6-950d-db89fed4a6c2 object
2021-01-23T13:53:18.433 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the b949555d-45df-48f4-ab5c-feb8f41221cd object
2021-01-23T13:53:18.434 INFO:teuthology.orchestra.run.gibba021.stdout:Scrubbing
2021-01-24T01:40:30.258 DEBUG:teuthology.exit:Got signal 15; running 2 handlers...
2021-01-24T01:40:30.304 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.exit:Finished running handlers

/a/teuthology-2021-01-23_07:01:02-rados-master-distro-basic-gibba/5819506

/a/ideepika-2021-01-22_07:01:14-rados-wip-deepika-testing-master-2021-01-22-0047-distro-basic-smithi/5814891
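
A minimal sketch of the settling step suggested above, shown with the CLI rather than the test's C++ code; the wildcard tell and the fixed five-second pause are illustrative assumptions, not what the test currently does:

    # Hypothetical pre-scrub settling step: ask every OSD to publish its pg stats
    # so the mgr's pgmap covers both test pools before the scrub is requested.
    ceph tell 'osd.*' flush_pg_stats   # each OSD replies with the stat seq it just published
    sleep 5                            # crude delay; see comment #4 for a proper wait
    ceph osd deep-scrub all            # the request the test currently issues immediately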


Related issues 3 (0 open, 3 closed)

Copied from RADOS - Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved, David Zafman)

Copied to RADOS - Backport #57208: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved, Brad Hubbard)
Copied to RADOS - Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved, Radoslaw Zarzynski)
Actions #1

Updated by David Zafman about 3 years ago

  • Copied from Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
Actions #2

Updated by David Zafman about 3 years ago

  • Description updated (diff)
Actions #3

Updated by David Zafman about 3 years ago

  • Pull request ID deleted (39535)
Actions #4

Updated by David Zafman about 3 years ago

Note that instead of a delay you can tell the OSDs to flush their pg stats. I wonder whether that only flushes to the mon, with the mgr catching up eventually, or whether it guarantees that the mgr is up to date too.

See the qa/standalone/ceph-helpers.sh function flush_pg_stats for the bash version that waits for all the flushes.
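
For reference, a condensed sketch of that helper's approach (paraphrased, not a verbatim copy of ceph-helpers.sh; the timeout and error handling are omitted):

    # Flush pg stats on every OSD, then wait until the cluster has recorded each
    # OSD's reported stat sequence number, so the mon/mgr pgmap is up to date.
    flush_pg_stats() {
        local seqs="" s osd seq
        for osd in $(ceph osd ls); do
            # "flush_pg_stats" returns the stat sequence the OSD just published
            seq=$(ceph tell osd.$osd flush_pg_stats)
            seqs="$seqs $osd-$seq"
        done
        for s in $seqs; do
            osd=${s%-*}
            seq=${s#*-}
            # Block until the reported sequence has been seen cluster-wide
            while [ "$(ceph osd last-stat-seq osd.$osd)" -lt "$seq" ]; do
                sleep 1
            done
        done
    }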

Actions #5

Updated by Brad Hubbard about 3 years ago

  • Pull request ID set to 39980
Actions #6

Updated by Neha Ojha almost 3 years ago

  • Priority changed from Urgent to Normal

Haven't seen this recently.

Actions #7

Updated by Laura Flores almost 2 years ago

  • Status changed from New to Pending Backport
  • Backport changed from pacific to pacific,quincy

/a/yuriw-2022-08-11_16:46:00-rados-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968195

2022-08-11T22:32:05.016 INFO:teuthology.orchestra.run.smithi138.stdout:{"status":"HEALTH_OK","checks":{},"mutes":[]}
2022-08-11T22:32:05.017 INFO:tasks.ceph.ceph_manager.ceph:wait_until_healthy done
2022-08-11T22:32:05.017 INFO:teuthology.run_tasks:Running task exec...
2022-08-11T22:32:05.029 INFO:teuthology.task.exec:Executing custom commands...
2022-08-11T22:32:05.030 INFO:teuthology.task.exec:Running commands on role client.0 host ubuntu@smithi138.front.sepia.ceph.com
2022-08-11T22:32:05.030 DEBUG:teuthology.orchestra.run.smithi138:> sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph_test_lazy_omap_stats
2022-08-11T22:32:05.500 INFO:teuthology.orchestra.run.smithi138.stdout:pool 'lazy_omap_test_pool' created
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Created payload with 2000 keys of 445 bytes each. Total size in bytes = 890000
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Waiting for active+clean
2022-08-11T22:32:05.757 INFO:teuthology.orchestra.run.smithi138.stdout:.
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Wrote 2000 omap keys of 445 bytes to the f7c525bd-bd86-48cb-8ed1-4e673df56515 object
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Scrubbing
2022-08-12T10:22:16.704 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2022-08-12T10:22:16.743 DEBUG:teuthology.task.console_log:Killing console logger for smithi138
2022-08-12T10:22:16.745 DEBUG:teuthology.exit:Finished running handlers

Actions #8

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #57208: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
Actions #9

Updated by Backport Bot almost 2 years ago

  • Copied to Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
Actions #10

Updated by Backport Bot almost 2 years ago

  • Tags set to backport_processed
Actions #11

Updated by Laura Flores about 1 year ago

  • Tags set to test-failure
  • Backport changed from pacific,quincy to pacific,quincy,reef
Actions #12

Updated by Laura Flores about 1 year ago

/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287

Actions #13

Updated by Brad Hubbard about 1 year ago

Laura Flores wrote:

/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287

This one is different and I'm looking at it in https://tracker.ceph.com/issues/59058

Also working on submitting a backport for pacific so this issue won't affect that branch.

Actions #14

Updated by Laura Flores 9 months ago

/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375579

Actions #16

Updated by Konstantin Shalygin 5 months ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100