Bug #22756

closed

RGW will not list contents of older buckets at all: reshard makes it show up again

Added by Robin Johnson over 6 years ago. Updated about 6 years ago.

Status: Won't Fix
Priority: Normal
Assignee:
Target version:
% Done: 0%
Spent time:
Source:
Tags: rgw, omap
Backport: luminous
Regression: Yes
Severity: 2 - major
Reviewed:
ceph-qa-suite: rgw
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is very similar to bug 17372 on the surface, but different underneath. Luminous already contains the fix for bug 17372.

Old buckets appear to be entirely empty, both via the S3 API & radosgw-admin bi list. They are also reported as empty by radosgw-admin bucket stats.

Initially I thought this was specific to buckets with has_bucket_info=true (and old_bucket_info populated), but I have found at least one affected bucket where that is NOT true.

DHO upgraded from Jewel to Luminous early in January 2018.

Some of the affected buckets date back to Bobtail & Argonaut, but others are Firefly-era.

Further comments follow with output of bucket metadata & bucket instance metadata.

Actions #1

Updated by Robin Johnson over 6 years ago

# radosgw-admin metadata get bucket:slim
{
    "key": "bucket:slim",
    "ver": {
        "tag": "_M3XT0TRU1x93OpWQRQ81BO2",
        "ver": 1
    },
    "mtime": "2013-12-18 05:45:02.000000Z",
    "data": {
        "bucket": {
            "name": "slim",
            "marker": "default.42527465.160",
            "bucket_id": "default.42527465.160",
            "tenant": "",
            "explicit_placement": {
                "data_pool": ".rgw.data.1",
                "data_extra_pool": "",
                "index_pool": ".rgw.data.1" 
            }
        },
        "owner": "coderpetebackups",
        "creation_time": "2013-12-18 05:44:27.000000Z",
        "linked": "true",
        "has_bucket_info": "false" 
    }
}

# radosgw-admin metadata get bucket.instance:slim:default.42527465.160
{
    "key": "bucket.instance:slim:default.42527465.160",
    "ver": {
        "tag": "_8nmhy2zl-eiLg3WQb0yUSCR",
        "ver": 1
    },
    "mtime": "2013-12-18 05:44:27.000000Z",
    "data": {
        "bucket_info": {
            "bucket": {
                "name": "slim",
                "marker": "default.42527465.160",
                "bucket_id": "default.42527465.160",
                "tenant": "",
                "explicit_placement": {
                    "data_pool": ".rgw.data.1",
                    "data_extra_pool": "",
                    "index_pool": ".rgw.data.1" 
                }
            },
            "creation_time": "2013-12-18 05:44:27.000000Z",
            "owner": "coderpetebackups",
            "flags": 0,
            "zonegroup": "default",
            "placement_rule": "",
            "has_instance_obj": "true",
            "quota": {
                "enabled": false,
                "check_on_raw": false,
                "max_size": -1,
                "max_size_kb": 0,
                "max_objects": -1
            },
            "num_shards": 0,
            "bi_shard_hash_type": 0,
            "requester_pays": "false",
            "has_website": "false",
            "swift_versioning": "false",
            "swift_ver_location": "",
            "index_type": 0,
            "mdsearch_config": [],
            "reshard_status": 0,
            "new_bucket_instance_id": "" 
        },
        "attrs": [
            {
                "key": "user.rgw.acl",
                "val": "AgK7AAAAAgIoAAAAEAAAAGNvZGVycGV0ZWJhY2t1cHMQAAAAY29kZXJwZXRlYmFja3VwcwMDhwAAAAEBAAAAEAAAAGNvZGVycGV0ZWJhY2t1cHMPAAAAAQAAABAAAABjb2RlcnBldGViYWNrdXBzAwNIAAAAAgIEAAAAAAAAABAAAABjb2RlcnBldGViYWNrdXBzAAAAAAAAAAACAgQAAAAPAAAAEAAAAGNvZGVycGV0ZWJhY2t1cHMAAAAAAAAAAA==" 
            },
            {
                "key": "user.rgw.idtag",
                "val": "" 
            }
        ]
    }
}

# radosgw-admin --bucket slim bi list
[]

# radosgw-admin --bucket slim bucket stats
{
    "bucket": "slim",
    "zonegroup": "",
    "placement_rule": "",
    "explicit_placement": {
        "data_pool": ".rgw.data.1",
        "data_extra_pool": "",
        "index_pool": ".rgw.data.1" 
    },
    "id": "default.42527465.160",
    "marker": "default.42527465.160",
    "index_type": "Normal",
    "owner": "coderpetebackups",
    "ver": "0#0",
    "master_ver": "0#0",
    "mtime": "0.000000",
    "max_marker": "0#",
    "usage": {},
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

# radosgw-admin --bucket slim bucket reshard --num-shards=1 --yes-i-really-mean-it
*** NOTICE: operation will not remove old bucket index objects ***
***         these will need to be removed manually             ***
tenant: 
bucket name: slim
old bucket instance id: default.42527465.160
new bucket instance id: default.497541194.1
total entries: 1000 2000 3000 4000 5000 6000 7000 7544

# radosgw-admin --bucket slim bucket stats
{
    "bucket": "slim",
    "zonegroup": "default",
    "placement_rule": "",
    "explicit_placement": {
        "data_pool": ".rgw.data.1",
        "data_extra_pool": "",
        "index_pool": ".rgw.data.1" 
    },
    "id": "default.497541194.1",
    "marker": "default.42527465.160",
    "index_type": "Normal",
    "owner": "coderpetebackups",
    "ver": "0#119",
    "master_ver": "0#0",
    "mtime": "2018-01-23 00:42:50.472315",
    "max_marker": "0#",
    "usage": {
        "rgw.main": {
            "size": 198745513198,
            "size_actual": 198760845312,
            "size_utilized": 0,
            "size_kb": 194087416,
            "size_kb_actual": 194102388,
            "size_kb_utilized": 0,
            "num_objects": 7544
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

# radosgw-admin --bucket slim bi list
[
    {
        "type": "plain",
        "idx": "REDACTED-FILENAME",
        "entry": {
            "name": "REDACTED-FILENAME",
            "instance": "",
            "ver": {
                "pool": 5,
                "epoch": 23248
            },
            "locator": "REDACTED-FILENAME",
            "exists": "true",
            "meta": {
                "category": 1,
                "size": 149535251,
                "mtime": "2014-01-16 10:29:20.000000Z",
                "etag": "dfe750683dd66f67b163baa5344361b4",
                "owner": "coderpetebackups",
                "owner_display_name": "coderpetebackups",
                "content_type": "application/octet-stream",
                "accounted_size": 149535251,
                "user_data": "" 
            },
            "tag": "default.45239442.2806223",
            "flags": 0,
            "pending_map": [],
            "versioned_epoch": 0
        }
    },
    ...

Actions #2

Updated by Robin Johnson over 6 years ago

As shown, resharding the bucket makes the content show up again, but you have to know that it is missing first. Offline reshard also presently breaks ACLs, but there is a fix in flight for that.
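
One rough way to spot candidate buckets is to look for an empty "usage" map in the bucket stats output, as in the pre-reshard output above. This is only a sketch: the function name usage_is_empty is hypothetical, it relies on the JSON layout shown in this ticket, and a bucket that really is empty may report the same thing, so a match is a hint rather than proof.

```shell
# Hypothetical check: reads `radosgw-admin bucket stats` JSON on stdin and
# succeeds if the "usage" map is empty ("usage": {}), as seen before the
# reshard in the output above. A genuinely empty bucket may also match.
usage_is_empty() {
    grep -Eq '"usage":[[:space:]]*\{[[:space:]]*\}'
}
```

For example: radosgw-admin --bucket slim bucket stats | usage_is_empty && echo "slim reports no usage"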

Actions #3

Updated by Robin Johnson over 6 years ago

  • Target version set to v12.2.3
  • Tags set to rgw, omap
  • Affected Versions v12.0.0, v12.1.0, v12.2.0, v12.2.1, v12.2.2 added
  • ceph-qa-suite rgw added

Workaround fix for this:

dir_obj=.dir.$bucket_id
KEY=.fixing.bucket.index VAL=foobar

$ rados -p $POOL getomapheader $dir_obj
# This will return a 0-byte header

# Set a throwaway key in the OMAP
$ rados -p $POOL setomapval $dir_obj $KEY $VAL
# Remove it again
$ rados -p $POOL rmomapkey $dir_obj $KEY

# Removing it is CRITICAL, otherwise the listing operation will fail.
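
The steps above could be wrapped in a small shell function. This is a minimal sketch, not a tested tool: the function name touch_omap and the RADOS override variable are hypothetical, and it uses only the rados subcommands already shown (setomapval, rmomapkey). It assumes the index object is named .dir.<bucket_id> and lives in the bucket's index pool.

```shell
#!/bin/sh
# Hypothetical helper: sets and then removes a throwaway omap key on a
# bucket index object, following the workaround steps above.
# RADOS can be overridden (e.g. for a dry run); it defaults to the real CLI.
RADOS="${RADOS:-rados}"

touch_omap() {
    pool="$1"
    bucket_id="$2"
    dir_obj=".dir.${bucket_id}"
    key=".fixing.bucket.index"

    # Write the throwaway key.
    $RADOS -p "$pool" setomapval "$dir_obj" "$key" foobar || return 1

    # Always remove it again; leaving it behind makes listing fail.
    $RADOS -p "$pool" rmomapkey "$dir_obj" "$key" || {
        echo "failed to remove $key from $dir_obj; remove it manually" >&2
        return 1
    }
}
```

For the bucket in comment #1 this would be called as: touch_omap .rgw.data.1 default.42527465.160. Setting RADOS=echo first prints the commands instead of running them.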

Actions #4

Updated by Robin Johnson over 6 years ago

  • Backport set to luminous

Actions #5

Updated by Casey Bodley over 6 years ago

  • Status changed from New to 12

Actions #6

Updated by Casey Bodley about 6 years ago

  • Priority changed from High to Normal

There's not much that radosgw can do here; a workaround is necessary.

Actions #7

Updated by Yehuda Sadeh about 6 years ago

The bug is actually in rados. The omap flag was never set on the index objects because of an old rados bug, and there is now an optimization that returns an empty result when fetching omap entries if the omap flag is not set on the object. The workaround sets that flag.

Actions #8

Updated by Yehuda Sadeh about 6 years ago

  • Status changed from 12 to Won't Fix
  • Assignee set to Josh Durgin

We don't really have any good way to fix it. It's a rados bug, so I'm assigning it to Josh, but closing it for now.

Actions #9

Updated by Yehuda Sadeh about 6 years ago

Just to be clear, note the workaround in comment #3.
