Bug #65746

open

rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum)

Added by Matt Benjamin 16 days ago. Updated 11 days ago.

Status: Pending Backport
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: multipart backport_processed
Backport: reef squid
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

+ # XXXX re-trying the complete is failing in RGW due to an internal error that appears not caused by
+ # checksums;
+ # 2024-04-25T17:47:47.991-0400 7f78e3a006c0 0 req 4931907640780566174 0.011000143s s3:complete_multipart check_previously_completed() ERROR: get_obj_attrs() returned ret=-2
+ # 2024-04-25T17:47:47.991-0400 7f78e3a006c0 2 req 4931907640780566174 0.011000143s s3:complete_multipart completing
+ # 2024-04-25T17:47:47.991-0400 7f78e3a006c0 1 req 4931907640780566174 0.011000143s s3:complete_multipart ERROR: either op_ret is negative (execute failed) or target_obj is null, op_ret: -2200
+ # -2200 turns into 500, InternalError

It's not clear to me what the state of the attempted upload is after this error, e.g., if it could leak space.


Related issues 3 (2 open, 1 closed)

Blocks rgw - Backport #63857: quincy: notification: etag is missing in CompleteMultipartUpload event - New - Ali Masarwa
Copied to rgw - Backport #65821: squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum) - Resolved - Casey Bodley
Copied to rgw - Backport #65822: reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum) - In Progress - Casey Bodley
#1

Updated by Casey Bodley 14 days ago

  • Status changed from New to Triaged
  • Tags set to multipart
  • Backport set to reef squid
  • Regression changed from No to Yes

I think this was a regression from https://github.com/ceph/ceph/pull/54569, which moved meta_obj->delete_object() from RGWCompleteMultipart::execute() to RGWCompleteMultipart::complete(). Before that change, it was only called on success. After the move, it runs on failures too. Deleting meta_obj on failure prevents a retry from succeeding.

Adding the reef backport because that regression was already backported for https://tracker.ceph.com/issues/63532
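
The failure mode described in this comment can be illustrated with a toy Python model. This is only a sketch and not RGW code: `ToyMultipartStore`, `retry_after_bad_complete`, the `delete_meta_on_failure` flag, and the error strings are all hypothetical names chosen for illustration. It assumes only what the comment states, namely that the upload's meta object is needed to complete the upload and that the regression deletes it on failure as well as on success.

```python
class ToyMultipartStore:
    """Toy stand-in for multipart upload bookkeeping (hypothetical, not RGW)."""

    def __init__(self):
        self.meta = {}     # upload_id -> expected part etags (the "meta object")
        self.objects = {}  # key -> etags of the assembled object

    def init_upload(self, upload_id, part_etags):
        self.meta[upload_id] = list(part_etags)

    def complete(self, upload_id, key, claimed_etags, delete_meta_on_failure):
        parts = self.meta.get(upload_id)
        if parts is None:
            # Nothing left to complete against; a retry ends up here
            # if an earlier failed attempt removed the meta object.
            return "error: upload metadata missing"

        ok = list(claimed_etags) == parts
        if ok:
            self.objects[key] = list(claimed_etags)

        # Cleanup step. With delete_meta_on_failure=True this models the
        # regressed behavior (cleanup runs regardless of the outcome);
        # with False it models cleanup on success only.
        if ok or delete_meta_on_failure:
            del self.meta[upload_id]

        return "ok" if ok else "error: bad checksum"


def retry_after_bad_complete(delete_meta_on_failure):
    """Attempt a bad complete, then retry with correct part etags."""
    store = ToyMultipartStore()
    store.init_upload("u1", ["etag-1", "etag-2"])
    first = store.complete("u1", "obj", ["etag-1", "WRONG"], delete_meta_on_failure)
    second = store.complete("u1", "obj", ["etag-1", "etag-2"], delete_meta_on_failure)
    return first, second
```

In this model, `retry_after_bad_complete(True)` fails on the retry because the metadata is already gone, while `retry_after_bad_complete(False)` lets the retry succeed. The model says nothing about whether uploaded part data is leaked in the real system.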

#2

Updated by Casey Bodley 14 days ago

  • Assignee set to Casey Bodley
#3

Updated by Casey Bodley 14 days ago

  • Status changed from Triaged to Fix Under Review
  • Pull request ID set to 57257
#4

Updated by Casey Bodley 14 days ago

  • Blocks Backport #63857: quincy: notification: etag is missing in CompleteMultipartUpload event added
#5

Updated by Casey Bodley 11 days ago

  • Status changed from Fix Under Review to Pending Backport
#6

Updated by Casey Bodley 11 days ago

  • Copied to Backport #65821: squid: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum) added
#7

Updated by Casey Bodley 11 days ago

  • Copied to Backport #65822: reef: rgw: multipart upload: complete multipart upload complete cannot be retried after some errors (e.g., after complete was attempted with an invalid checksum) added
#8

Updated by Casey Bodley 11 days ago

  • Tags changed from multipart to multipart backport_processed