Bug #40700
Closed
memory usage of: radosgw-admin bucket rm
Description
Cluster is Nautilus 14.2.1, 500 OSDs with BlueStore. Both of the RadosGW pools that are involved here (for data and for index) are replicated and without SSDs.
Steps that led to the problem:
1. There is a bucket $BIG_BUCKET with about 60 M objects, with 1024 shards.
2. radosgw-admin bucket rm --bucket=$BIG_BUCKET --bypass-gc --purge-objects
3. After several hours, the removal command was killed by the out-of-memory killer. The graphs show the memory usage of this process increasing continuously, by about 24 GB per day. The removal rate is about 3 M objects per day.
So for this bucket with 60 M objects (about 20 days of removal at that rate), we would need about 480 GB of RAM to get through.
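The extrapolation above can be written out as a small calculation. This is only a sketch: it assumes the memory growth stays linear for the whole run and takes 1 GB as 1e9 bytes; the function names are made up for illustration.

```cpp
#include <cassert>

// Days needed to remove all objects at a constant removal rate.
constexpr double days_to_finish(double objects, double objects_per_day) {
  return objects / objects_per_day;
}

// Projected peak memory in GB, assuming growth stays linear until the end.
constexpr double projected_peak_gb(double objects, double objects_per_day,
                                   double gb_per_day) {
  return days_to_finish(objects, objects_per_day) * gb_per_day;
}

// Memory retained per removed object, in bytes (1 GB taken as 1e9 bytes).
constexpr double bytes_per_object(double objects_per_day, double gb_per_day) {
  return gb_per_day * 1e9 / objects_per_day;
}
```

With the reported figures (60 M objects, 3 M objects per day, 24 GB per day) this gives 20 days and 480 GB, and roughly 8 KB retained per removed object, which would be consistent with some per-object state never being freed.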
Expected behaviour:
Bucket removal with radosgw-admin should work within a bounded amount of memory, even for buckets with a large number of objects.
Some additional information:
The killed remove command can simply be run again, but it will be killed again before it finishes. Also, it has to run for some time before it resumes actually removing objects. This "wait time" keeps increasing: the last time, with about 16 M objects already removed, it was nearly 9 hours. Memory usage also ramps up during this time, though not as steeply.
Harry
Updated by Paul Emmerich almost 5 years ago
I've also got two clusters here with this problem, one is running 14.2.1 (50M objects in a bucket) and one 13.2.5 (450M objects in a bucket).
Looks like radosgw-admin uses libc malloc, so it's hard to say what the memory is being used for
Updated by Casey Bodley almost 5 years ago
- Status changed from New to 12
- Assignee set to J. Eric Ivancich
Updated by Casey Bodley almost 5 years ago
- Assignee changed from J. Eric Ivancich to Mark Kogan
Updated by Mark Kogan almost 5 years ago
Investigating this issue,
it is possible to alleviate the increase of the "wait time" after each iteration of
radosgw-admin bucket rm --bucket=$BIG_BUCKET --bypass-gc --purge-objects
by running
radosgw-admin bucket check --bucket=$BIG_BUCKET --fix
between iterations of the radosgw-admin bucket rm operation.
Updated by J. Eric Ivancich almost 5 years ago
That's interesting, Mark!
So the bucket index is left in an unsynchronized state (i.e., original state) when bucket removal is terminated part-way through. And then when bucket removal is restarted, it begins by trying to re-remove those same objects at the head of the bucket index all over again, causing a delay before forward progress is made.
Since the bucket removal is generally expected to complete, there "should" be no need to update the bucket index at "check-points" during the bucket removal process.
If terminating bucket removal is semi-expected (possibly through manual admin intervention), it seems that updating the index after every 100,000 to 1,000,000 objects are removed would mitigate this, without creating a lot of overhead.
And would there be any benefit to removing the objects from back to front in the bucket index? In other words, is there an easy way to truncate the index of its tail members, making the update of the bucket index quick?
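The check-point idea mentioned above could look roughly like the following. This is a sketch only: remove_with_checkpoints, remove_one and checkpoint are hypothetical names, not radosgw-admin or RGW APIs, and the callbacks stand in for the real object removal and bucket index update.

```cpp
#include <cstddef>
#include <functional>

// Removes `total` objects and calls `checkpoint` after every `interval`
// removals, plus once at the end if a partial batch remains.
// With interval == 0 only the final checkpoint is taken.
// Returns the number of checkpoints taken.
std::size_t remove_with_checkpoints(
    std::size_t total, std::size_t interval,
    const std::function<void(std::size_t)>& remove_one,
    const std::function<void(std::size_t)>& checkpoint) {
  std::size_t checkpoints = 0;
  std::size_t since_last = 0;
  for (std::size_t i = 0; i < total; ++i) {
    remove_one(i);  // hypothetical: delete object number i
    if (++since_last == interval) {
      checkpoint(i + 1);  // hypothetical: bring the bucket index up to date
      ++checkpoints;
      since_last = 0;
    }
  }
  if (since_last > 0) {
    checkpoint(total);
    ++checkpoints;
  }
  return checkpoints;
}
```

For a bucket with 60 M objects and an interval of 1,000,000, this would take 60 checkpoints, so a restarted removal would have to re-walk at most one interval of already-removed entries.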
Updated by Mark Kogan over 4 years ago
Update -
found the source of the memory growth:
src/rgw/rgw_rados.cc
RGWObjState *RGWObjectCtx::get_state(const rgw_obj& obj) {
  RGWObjState *result;
  typename std::map<rgw_obj, RGWObjState>::iterator iter;
  lock.lock_shared();
  assert (!obj.empty());
  iter = objs_state.find(obj);
  if (iter != objs_state.end()) {
    result = &iter->second;
    lock.unlock_shared();
  } else {
    lock.unlock_shared();
    lock.lock();
    result = &objs_state[obj];  // <-------------- memory growth: a new map entry is inserted here
    lock.unlock();
  }
  return result;
}
Submitted a proposed fix as a PR.
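To illustrate why the marked line makes memory grow with the number of objects, here is a minimal stand-alone sketch. It does not use the real Ceph types: ObjState and ObjectCtx are simplified stand-ins, locking is omitted because the growth does not depend on it, and the invalidate() call shows just one possible way to bound the map, not necessarily what the proposed fix does.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for RGWObjState; not the real Ceph type.
struct ObjState {
  bool exists = false;
  std::size_t size = 0;
};

// Simplified stand-in for RGWObjectCtx, keyed by object name.
class ObjectCtx {
 public:
  // Same shape as get_state() above: a miss inserts a new entry via
  // operator[], and nothing in this function ever removes it again.
  ObjState* get_state(const std::string& key) {
    auto it = states_.find(key);
    if (it != states_.end()) return &it->second;
    return &states_[key];
  }

  // Drops the cached entry once the caller is done with the object.
  void invalidate(const std::string& key) { states_.erase(key); }

  std::size_t cached() const { return states_.size(); }

 private:
  std::map<std::string, ObjState> states_;
};

// Simulates removing `n` objects through one long-lived context and
// returns how many entries are still cached afterwards.
std::size_t simulate_removal(std::size_t n, bool invalidate_after_delete) {
  ObjectCtx ctx;
  for (std::size_t i = 0; i < n; ++i) {
    const std::string key = "obj-" + std::to_string(i);
    ctx.get_state(key)->exists = false;  // stand-in for the delete path
    if (invalidate_after_delete) ctx.invalidate(key);
  }
  return ctx.cached();
}
```

Without invalidation, the map ends up with one entry per removed object, so a single context that lives for the whole bucket removal grows linearly with the object count. With invalidation after each delete, the map stays empty between objects.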
Updated by J. Eric Ivancich over 4 years ago
- Status changed from 12 to 17
- Target version set to v15.0.0
- Backport set to nautilus,mimic,luminous
Updated by J. Eric Ivancich over 4 years ago
- Status changed from 7 to Pending Backport
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41858: nautilus: memory usage of: radosgw-admin bucket rm added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41859: mimic: memory usage of: radosgw-admin bucket rm added
Updated by Nathan Cutler over 4 years ago
- Copied to Backport #41860: luminous: memory usage of: radosgw-admin bucket rm added
Updated by Nathan Cutler over 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".