Using multipart uploads with directory buckets
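You can use the multipart upload process to upload a single object as a set of parts. Each part is a contiguous portion of the object's data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting the other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation.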
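Using multipart upload provides the following advantages:
- Improved throughput: You can upload parts in parallel to improve throughput.
- Quick recovery from any network issues: Smaller part sizes minimize the impact of restarting an upload that failed because of a network error.
- Pause and resume object uploads: You can upload object parts over time. After you initiate a multipart upload, there is no expiration date. You must explicitly complete or abort the multipart upload.
- Begin an upload before you know the final object size: You can upload an object as you are creating it.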
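The throughput advantage of uploading parts in parallel can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `upload_parts_in_parallel`, the in-memory `chunks` list, and the `max_workers` default are assumptions made for the example, and `s3_client` is assumed to be a boto3 S3 client (or any object with a compatible `upload_part` method) that is safe to share across threads.

```python
from concurrent.futures import ThreadPoolExecutor


def upload_parts_in_parallel(s3_client, bucket_name, key_name, upload_id, chunks, max_workers=4):
    """Upload each chunk as one part, using a thread pool (hypothetical helper).

    chunks is assumed to be a list of bytes objects, already split to the desired part size.
    Returns a list of {'PartNumber', 'ETag'} dictionaries ordered by part number.
    """
    def upload_one(numbered_chunk):
        part_number, data = numbered_chunk
        response = s3_client.upload_part(
            Bucket=bucket_name,
            Key=key_name,
            UploadId=upload_id,
            PartNumber=part_number,
            Body=data,
        )
        return {'PartNumber': part_number, 'ETag': response['ETag']}

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so the list stays sorted by
        # part number even if the uploads finish out of order.
        return list(pool.map(upload_one, enumerate(chunks, start=1)))
```

Because the chunks are numbered before they are handed to the pool, the part numbers stay consecutive regardless of which upload finishes first.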
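We recommend that you use multipart uploads in the following ways:
When you use multipart uploads to upload objects to the Amazon S3 Express One Zone storage class in directory buckets, the process is similar to using multipart upload to upload objects to general purpose buckets. However, there are some notable differences.
For more information about how to use multipart uploads to upload objects to S3 Express One Zone, see the following topics.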
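Multipart upload process
A multipart upload is a three-step process:
- You initiate the upload.
- You upload the object parts.
- After you have uploaded all of the parts, you complete the multipart upload.
When Amazon S3 receives the complete multipart upload request, it constructs the object from the uploaded parts. You can then access the object as you would any other object in your bucket.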
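Multipart upload initiation
When you send a request to initiate a multipart upload, Amazon S3 returns a response with an upload ID, which is a unique identifier for your multipart upload. You must include this upload ID whenever you upload parts, list the parts, complete an upload, or abort an upload.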
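Parts upload
When you upload a part, you must specify a part number in addition to the upload ID. When you use a multipart upload with S3 Express One Zone, the part numbers must be consecutive. If you try to complete a multipart upload request with nonconsecutive part numbers, an HTTP 400 Bad Request (Invalid Part Order) error is generated.
A part number uniquely identifies a part and its position in the object that you are uploading. If you upload a new part by using the same part number as a previously uploaded part, the previously uploaded part is overwritten.
Whenever you upload a part, Amazon S3 returns an entity tag (ETag) header in its response. For each part upload, you must record the part number and the ETag value. The ETag values for all object part uploads will remain the same, but each part will be assigned a different part number. You must include these values in the subsequent request to complete the multipart upload.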
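One way to keep the required record of part numbers and ETag values is a dictionary keyed by part number, so that re-uploading a part replaces its earlier entry. The sketch below is illustrative only: `build_parts_list` is a hypothetical helper, the ETag strings are placeholders, and the check assumes that consecutive numbering starts at 1.

```python
def build_parts_list(etags_by_part_number):
    """Convert {part_number: etag} into the 'Parts' list used to complete an upload.

    Raises ValueError if the part numbers are not consecutive. Starting the
    sequence at 1 is an assumption made for this sketch.
    """
    part_numbers = sorted(etags_by_part_number)
    expected = list(range(1, len(part_numbers) + 1))
    if part_numbers != expected:
        raise ValueError(f'Part numbers must be consecutive, got {part_numbers}')
    return [{'PartNumber': n, 'ETag': etags_by_part_number[n]} for n in part_numbers]


# Record the ETag after each part upload (placeholder values shown here).
recorded = {}
recorded[1] = '"etag-a"'
recorded[2] = '"etag-b"'
recorded[2] = '"etag-c"'  # Part 2 was uploaded again, so its entry is replaced.
parts = build_parts_list(recorded)
```

Validating the sequence locally surfaces a gap before the complete request is sent, rather than as an HTTP 400 Bad Request response.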
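Amazon S3 automatically encrypts all new objects that are uploaded to an S3 bucket. When you perform a multipart upload, if you don't specify encryption information in your request, the encryption setting of the uploaded parts is set to the default encryption configuration of the destination bucket. The default encryption configuration of an Amazon S3 bucket is always enabled and is at a minimum set to server-side encryption with Amazon S3 managed keys (SSE-S3). For directory buckets, SSE-S3 and server-side encryption with AWS KMS keys (SSE-KMS) are supported. For more information, see Data protection and encryption.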
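Multipart upload completion
When you complete a multipart upload, Amazon S3 creates the object by concatenating the parts in ascending order based on the part number. After a successful complete request, the parts no longer exist.
Your complete multipart upload request must include the upload ID and a list of part numbers and their corresponding ETag values. The Amazon S3 response includes an ETag that uniquely identifies the combined object data. This ETag is not an MD5 hash of the object data.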
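Multipart upload listings
You can list the parts of a specific multipart upload or all in-progress multipart uploads. The list parts operation returns information about the parts that you have uploaded for a specific multipart upload. For each list parts request, Amazon S3 returns the parts information for the specified multipart upload, up to a maximum of 1,000 parts. If there are more than 1,000 parts in the multipart upload, you must use pagination to retrieve all the parts.
The returned list of parts doesn't include parts that haven't finished uploading. By using the list multipart uploads operation, you can obtain a list of multipart uploads that are in progress.
An in-progress multipart upload is an upload that you have initiated but have not yet completed or aborted. Each request returns at most 1,000 multipart uploads. If there are more than 1,000 multipart uploads in progress, you must send additional requests to retrieve the remaining multipart uploads. Use the returned listing only for verification. Do not use the result of this listing when you send a complete multipart upload request. Instead, maintain your own list of the part numbers that you specified when uploading parts and the corresponding ETag values that Amazon S3 returned.
For more information about multipart upload listings, see ListParts in the Amazon Simple Storage Service API Reference.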
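Checksums with multipart upload operations
When you upload an object, you can specify a checksum algorithm to check object integrity. MD5 is not supported for directory buckets. You can specify one of the following Secure Hash Algorithm (SHA) or Cyclic Redundancy Check (CRC) data-integrity algorithms:
- CRC32
- CRC32C
- SHA-1
- SHA-256
You can use the Amazon S3 REST API or the AWS SDKs to retrieve the checksum value for individual parts by using GetObject or HeadObject. If you want to retrieve the checksum values for individual parts of a multipart upload that is still in progress, you can use ListParts.
When you use the preceding checksum algorithms, the part numbers of the multipart upload must be consecutive. If you try to complete a multipart upload request with nonconsecutive part numbers, Amazon S3 generates an HTTP 400 Bad Request (Invalid Part Order) error.
For more information about how checksums work with multipart upload objects, see Checking object integrity in Amazon S3.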
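As an illustration of two of the algorithms listed above, the following sketch computes a CRC32 and a SHA-256 checksum for the bytes of a single part by using only the Python standard library. The helper names are hypothetical. The sketch assumes that checksum values are exchanged as base64-encoded strings, with the CRC32 value encoded as 4 big-endian bytes. CRC32C is omitted because the standard library doesn't provide it.

```python
import base64
import hashlib
import zlib


def crc32_checksum(data: bytes) -> str:
    """Return the CRC32 of data as a base64 string (4 big-endian bytes)."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, 'big')).decode('ascii')


def sha256_checksum(data: bytes) -> str:
    """Return the SHA-256 digest of data as a base64 string."""
    return base64.b64encode(hashlib.sha256(data).digest()).decode('ascii')


part_data = b'example part contents'
print('CRC32  :', crc32_checksum(part_data))
print('SHA-256:', sha256_checksum(part_data))
```

With boto3, a value computed this way would typically be passed through a parameter such as ChecksumCRC32 or ChecksumSHA256 on the upload_part call. Check the SDK reference for the version that you use before relying on this.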
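Concurrent multipart upload operations
In a distributed development environment, your application can initiate several updates on the same object at the same time. For example, your application might initiate several multipart uploads by using the same object key. For each of these uploads, your application can then upload parts and send a complete upload request to Amazon S3 to create the object. For S3 Express One Zone, the object creation time is the completion date of the multipart upload.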
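Multipart uploads and pricing
After you initiate a multipart upload, Amazon S3 retains all the parts until you either complete or abort the upload. Throughout its lifetime, you are billed for all storage, bandwidth, and requests for this multipart upload and its associated parts. If you abort the multipart upload, Amazon S3 deletes the upload artifacts and any parts that you have uploaded, and you are no longer billed for them. There are no early delete charges for deleting incomplete multipart uploads, regardless of the storage class specified. For more information about pricing, see Amazon S3 pricing.
If the complete multipart upload request isn't sent successfully, the object parts aren't assembled and an object isn't created. You are billed for all storage associated with the uploaded parts. Make sure that you either complete the multipart upload to have the object created or abort the multipart upload to remove any uploaded parts.
Before you can delete a directory bucket, you must complete or abort all in-progress multipart uploads. Directory buckets don't support S3 Lifecycle configurations. If needed, you can list your active multipart uploads, abort them, and then delete your bucket.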
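The guidance to either complete or abort every multipart upload can be sketched as a single function that aborts the upload whenever any step fails. This is a minimal sketch under stated assumptions: `upload_or_abort` is a hypothetical helper, `chunks` is assumed to be an iterable of bytes objects, and `s3_client` is assumed to be a boto3 S3 client or a compatible object.

```python
def upload_or_abort(s3_client, bucket_name, key_name, chunks):
    """Run a whole multipart upload and abort it if any step fails (hypothetical helper)."""
    upload_id = s3_client.create_multipart_upload(Bucket=bucket_name, Key=key_name)['UploadId']
    try:
        parts = []
        for part_number, data in enumerate(chunks, start=1):
            response = s3_client.upload_part(
                Bucket=bucket_name,
                Key=key_name,
                UploadId=upload_id,
                PartNumber=part_number,
                Body=data,
            )
            parts.append({'PartNumber': part_number, 'ETag': response['ETag']})
        s3_client.complete_multipart_upload(
            Bucket=bucket_name,
            Key=key_name,
            UploadId=upload_id,
            MultipartUpload={'Parts': parts},
        )
    except Exception:
        # Abort so that the parts uploaded so far are removed and stop
        # accruing storage charges, then re-raise the original error.
        s3_client.abort_multipart_upload(Bucket=bucket_name, Key=key_name, UploadId=upload_id)
        raise
    return upload_id
```

Re-raising after the abort keeps the original failure visible to the caller, while the abort removes the parts that were already stored.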
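Multipart upload API operations and permissions
To allow access to object management API operations on a directory bucket, you grant the s3express:CreateSession permission in a bucket policy or an AWS Identity and Access Management (IAM) identity-based policy.
You must have the necessary permissions to use the multipart upload operations. You can use bucket policies or IAM identity-based policies to grant IAM principals permissions to perform these operations. The following table lists the required permissions for the various multipart upload operations.
You can identify the initiator of a multipart upload through the Initiator element. If the initiator is an AWS account, this element provides the same information as the Owner element. If the initiator is an IAM user, this element provides the user ARN and display name.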
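Action | Required permissions
Create a multipart upload | To create a multipart upload, you must be allowed to perform the s3express:CreateSession action on the directory bucket.
Initiate a multipart upload | To initiate a multipart upload, you must be allowed to perform the s3express:CreateSession action on the directory bucket.
Upload a part | To upload a part, you must be allowed to perform the s3express:CreateSession action on the directory bucket. For the initiator to upload a part, the bucket owner must allow the initiator to perform the s3express:CreateSession action on the directory bucket.
Upload a part (copy) | To upload a part, you must be allowed to perform the s3express:CreateSession action on the directory bucket. For the initiator to upload a part for the object, the bucket owner must allow the initiator to perform the s3express:CreateSession action on the object.
Complete a multipart upload | To complete a multipart upload, you must be allowed to perform the s3express:CreateSession action on the directory bucket. For the initiator to complete a multipart upload, the bucket owner must allow the initiator to perform the s3express:CreateSession action on the object.
Abort a multipart upload | To abort a multipart upload, you must be allowed to perform the s3express:CreateSession action. For the initiator to abort a multipart upload, the initiator must be granted explicit access to perform the s3express:CreateSession action.
List parts | To list the parts in a multipart upload, you must be allowed to perform the s3express:CreateSession action on the directory bucket.
List in-progress multipart uploads | To list the in-progress multipart uploads to a bucket, you must be allowed to perform the s3:ListBucketMultipartUploads action on that bucket.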
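The s3express:CreateSession permission described in this section could be granted with an identity-based policy along the following lines. This is a sketch, not a vetted policy: the helper name, the statement ID, the account ID, and the bucket name are placeholders, and the resource ARN layout shown is an assumption that you should confirm against the IAM documentation for directory buckets.

```python
import json


def create_session_policy(region, account_id, bucket_name):
    """Build a policy document that allows s3express:CreateSession on one directory bucket.

    The ARN layout below is an assumption made for this sketch.
    """
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Sid': 'AllowCreateSession',
                'Effect': 'Allow',
                'Action': 's3express:CreateSession',
                'Resource': f'arn:aws:s3express:{region}:{account_id}:bucket/{bucket_name}',
            }
        ],
    }


policy = create_session_policy('us-west-2', '111122223333', 'bucket-base-name--usw2-az1--x-s3')
print(json.dumps(policy, indent=2))
```

Generating the document from code keeps the Region, account ID, and bucket name in one place when the same statement is needed for several buckets.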
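API operation support for multipart uploads
The following sections in the Amazon Simple Storage Service API Reference describe the Amazon S3 REST API operations for multipart uploads.
Examples
To use a multipart upload to upload an object to S3 Express One Zone in a directory bucket, see the following examples.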
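Creating a multipart upload
For directory buckets, when you perform a CreateMultipartUpload operation and an UploadPartCopy operation, the bucket's default encryption must use the desired encryption configuration, and the request headers that you provide in the CreateMultipartUpload request must match the default encryption configuration of the destination bucket.
The following examples show how to create a multipart upload.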
- SDK for Java 2.x
    /**
     * This method creates a multipart upload request that generates a unique upload ID that is used to track
     * all the upload parts
     *
     * @param s3
     * @param bucketName - for example, 'doc-example-bucket--use1-az4--x-s3'
     * @param key
     * @return
     */
    private static String createMultipartUpload(S3Client s3, String bucketName, String key) {
        CreateMultipartUploadRequest createMultipartUploadRequest = CreateMultipartUploadRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build();
        String uploadId = null;
        try {
            CreateMultipartUploadResponse response = s3.createMultipartUpload(createMultipartUploadRequest);
            uploadId = response.uploadId();
        }
        catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return uploadId;
    }
- SDK for Python
    import logging
    import boto3
    from botocore.exceptions import ClientError

    def create_multipart_upload(s3_client, bucket_name, key_name):
        '''
        Create a multipart upload to a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: The destination bucket for the multipart upload
        :param key_name: The key name for the object to be uploaded
        :return: The UploadId for the multipart upload if created successfully, else None
        '''
        try:
            mpu = s3_client.create_multipart_upload(Bucket = bucket_name, Key = key_name)
            return mpu['UploadId']
        except ClientError as e:
            logging.error(e)
            return None
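- AWS CLI
This example shows how to create a multipart upload to a directory bucket by using the AWS CLI. This command starts a multipart upload of the object KEY_NAME to the directory bucket bucket-base-name--zone-id--x-s3. To use the command, replace the user input placeholders with your own information.
    aws s3api create-multipart-upload --bucket bucket-base-name--zone-id--x-s3 --key KEY_NAME
For more information, see create-multipart-upload in the AWS Command Line Interface.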
Uploading the parts of a multipart upload
The following examples show how to upload the parts of a multipart upload.
- SDK for Java 2.x
The following example shows how to break a single object into parts and then upload those parts to a directory bucket by using the SDK for Java 2.x.
    /**
     * This method creates part requests and uploads individual parts to S3 and then returns all the completed parts
     *
     * @param s3
     * @param bucketName
     * @param key
     * @param uploadId
     * @param filePath
     * @throws IOException
     */
    private static List<CompletedPart> multipartUpload(S3Client s3, String bucketName, String key, String uploadId, String filePath) throws IOException {
        int partNumber = 1;
        List<CompletedPart> completedParts = new ArrayList<>();
        ByteBuffer bb = ByteBuffer.allocate(1024 * 1024 * 5); // 5 MB byte buffer
        // Read the local file, break it down into chunks, and upload each chunk as a part.
        try (RandomAccessFile file = new RandomAccessFile(filePath, "r")) {
            long fileSize = file.length();
            long position = 0; // long, so that files larger than 2 GB don't overflow
            while (position < fileSize) {
                file.seek(position);
                int read = file.getChannel().read(bb);
                bb.flip(); // Swap position and limit before reading from the buffer.
                UploadPartRequest uploadPartRequest = UploadPartRequest.builder()
                        .bucket(bucketName)
                        .key(key)
                        .uploadId(uploadId)
                        .partNumber(partNumber)
                        .build();
                UploadPartResponse partResponse = s3.uploadPart(
                        uploadPartRequest,
                        RequestBody.fromByteBuffer(bb));
                // Record the part number and ETag; both are needed to complete the upload.
                CompletedPart part = CompletedPart.builder()
                        .partNumber(partNumber)
                        .eTag(partResponse.eTag())
                        .build();
                completedParts.add(part);
                bb.clear();
                position += read;
                partNumber++;
            }
        }
        return completedParts;
    }
- SDK for Python
The following example shows how to break a single object into parts and then upload those parts to a directory bucket by using the SDK for Python.
    def multipart_upload(s3_client, bucket_name, key_name, mpu_id, part_size):
        '''
        Break up a file into multiple parts and upload those parts to a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: Destination bucket for the multipart upload
        :param key_name: Key name for object to be uploaded and for the local file that's being uploaded
        :param mpu_id: The UploadId returned from the create_multipart_upload call
        :param part_size: The size parts that the object will be broken into, in bytes.
                          Minimum 5 MiB, Maximum 5 GiB. There is no minimum size for the last part of your multipart upload.
        :return: part_list for the multipart upload if all parts are uploaded successfully, else None
        '''
        part_list = []
        try:
            with open(key_name, 'rb') as file:
                part_counter = 1
                while True:
                    file_part = file.read(part_size)
                    if not len(file_part):
                        break
                    upload_part = s3_client.upload_part(
                        Bucket = bucket_name,
                        Key = key_name,
                        UploadId = mpu_id,
                        Body = file_part,
                        PartNumber = part_counter
                    )
                    # Record the part number and ETag; both are needed to complete the upload.
                    part_list.append({'PartNumber': part_counter, 'ETag': upload_part['ETag']})
                    part_counter += 1
        except ClientError as e:
            logging.error(e)
            return None
        return part_list
- AWS CLI
This example shows how to upload one part of a multipart upload to a directory bucket by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api upload-part --bucket bucket-base-name--zone-id--x-s3 --key KEY_NAME \
        --part-number 1 \
        --body LOCAL_FILE_NAME \
        --upload-id "AS_mgt9RaQE9GEaifATue15dAAAAAAAAAAEMAAAAAAAAADQwNzI4MDU0MjUyMBYAAAAAAAAAAA0AAAAAAAAAAAH2AfYAAAAAAAAEBSD0WBKMAQAAAABneY9yBVsK89iFkvWdQhRCcXohE8RbYtc9QvBOG8tNpA"
For more information, see upload-part in the AWS Command Line Interface.
Completing a multipart upload
The following examples show how to complete a multipart upload.
- SDK for Java 2.x
The following example shows how to complete a multipart upload by using the SDK for Java 2.x.
    /**
     * This method completes the multipart upload request by collating all the upload parts
     * @param s3
     * @param bucketName - for example, 'doc-example-bucket--usw2-az1--x-s3'
     * @param key
     * @param uploadId
     * @param uploadParts
     */
    private static void completeMultipartUpload(S3Client s3, String bucketName, String key, String uploadId, List<CompletedPart> uploadParts) {
        CompletedMultipartUpload completedMultipartUpload = CompletedMultipartUpload.builder()
                .parts(uploadParts)
                .build();
        CompleteMultipartUploadRequest completeMultipartUploadRequest =
                CompleteMultipartUploadRequest.builder()
                        .bucket(bucketName)
                        .key(key)
                        .uploadId(uploadId)
                        .multipartUpload(completedMultipartUpload)
                        .build();
        s3.completeMultipartUpload(completeMultipartUploadRequest);
    }

    public static void multipartUploadTest(S3Client s3, String bucketName, String key, String localFilePath) {
        System.out.println("Starting multipart upload for: " + key);
        try {
            String uploadId = createMultipartUpload(s3, bucketName, key);
            System.out.println(uploadId);
            List<CompletedPart> parts = multipartUpload(s3, bucketName, key, uploadId, localFilePath);
            completeMultipartUpload(s3, bucketName, key, uploadId, parts);
            System.out.println("Multipart upload completed for: " + key);
        }
        catch (Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
- SDK for Python
The following example shows how to complete a multipart upload by using the SDK for Python.
    def complete_multipart_upload(s3_client, bucket_name, key_name, mpu_id, part_list):
        '''
        Completes a multipart upload to a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: The destination bucket for the multipart upload
        :param key_name: The key name for the object to be uploaded
        :param mpu_id: The UploadId returned from the create_multipart_upload call
        :param part_list: The list of uploaded part numbers with their associated ETags
        :return: True if the multipart upload was completed successfully, else False
        '''
        try:
            s3_client.complete_multipart_upload(
                Bucket = bucket_name,
                Key = key_name,
                UploadId = mpu_id,
                MultipartUpload = {
                    'Parts': part_list
                }
            )
        except ClientError as e:
            logging.error(e)
            return False
        return True

    if __name__ == '__main__':
        MB = 1024 ** 2
        region = 'us-west-2'
        bucket_name = 'BUCKET_NAME'
        key_name = 'OBJECT_NAME'
        part_size = 10 * MB
        s3_client = boto3.client('s3', region_name = region)
        mpu_id = create_multipart_upload(s3_client, bucket_name, key_name)
        if mpu_id is not None:
            part_list = multipart_upload(s3_client, bucket_name, key_name, mpu_id, part_size)
            if part_list is not None:
                if complete_multipart_upload(s3_client, bucket_name, key_name, mpu_id, part_list):
                    print (f'{key_name} successfully uploaded through a multipart upload to {bucket_name}')
                else:
                    print (f'Could not upload {key_name} through a multipart upload to {bucket_name}')
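- AWS CLI
This example shows how to complete a multipart upload for a directory bucket by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api complete-multipart-upload --bucket bucket-base-name--zone-id--x-s3 --key KEY_NAME \
        --upload-id "AS_mgt9RaQE9GEaifATue15dAAAAAAAAAAEMAAAAAAAAADQwNzI4MDU0MjUyMBYAAAAAAAAAAA0AAAAAAAAAAAH2AfYAAAAAAAAEBSD0WBKMAQAAAABneY9yBVsK89iFkvWdQhRCcXohE8RbYtc9QvBOG8tNpA" \
        --multipart-upload file://parts.json
This example takes a JSON structure that describes the parts of the multipart upload that should be reassembled into the complete file. In this example, the file:// prefix is used to load the JSON structure from a file named parts.json in the local folder.
parts.json:
    {
        "Parts": [
            {
                "ETag": "6b78c4a64dd641a58dac8d9258b88147",
                "PartNumber": 1
            }
        ]
    }
For more information, see complete-multipart-upload in the AWS Command Line Interface.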
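If the parts were uploaded from a script, the parts.json file used by the preceding command can be generated from the recorded part numbers and ETag values instead of being written by hand. The following is a small sketch: `write_parts_file` is a hypothetical helper, and the ETag shown is the placeholder value from the example above.

```python
import json


def write_parts_file(part_list, path='parts.json'):
    """Write the {'Parts': [...]} structure to a JSON file, ordered by part number."""
    ordered = sorted(part_list, key=lambda part: part['PartNumber'])
    with open(path, 'w') as parts_file:
        json.dump({'Parts': ordered}, parts_file, indent=4)
    return path


if __name__ == '__main__':
    recorded_parts = [
        {'PartNumber': 1, 'ETag': '6b78c4a64dd641a58dac8d9258b88147'},
    ]
    write_parts_file(recorded_parts)
```

Sorting by part number before writing keeps the file in ascending order, which matches how the parts are concatenated when the upload is completed.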
Aborting a multipart upload
The following examples show how to abort a multipart upload.
- SDK for Java 2.x
The following example shows how to abort a multipart upload by using the SDK for Java 2.x.
    public static void abortMultiPartUploads(S3Client s3, String bucketName) {
        try {
            // List the in-progress multipart uploads for the bucket.
            ListMultipartUploadsRequest listMultipartUploadsRequest = ListMultipartUploadsRequest.builder()
                    .bucket(bucketName)
                    .build();
            ListMultipartUploadsResponse response = s3.listMultipartUploads(listMultipartUploadsRequest);
            List<MultipartUpload> uploads = response.uploads();
            AbortMultipartUploadRequest abortMultipartUploadRequest;
            // Abort each upload that was returned by the listing.
            for (MultipartUpload upload: uploads) {
                abortMultipartUploadRequest = AbortMultipartUploadRequest.builder()
                        .bucket(bucketName)
                        .key(upload.key())
                        .uploadId(upload.uploadId())
                        .build();
                s3.abortMultipartUpload(abortMultipartUploadRequest);
            }
        }
        catch (S3Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
- SDK for Python
The following example shows how to abort a multipart upload by using the SDK for Python.
    import logging
    import boto3
    from botocore.exceptions import ClientError

    def abort_multipart_upload(s3_client, bucket_name, key_name, upload_id):
        '''
        Aborts a partial multipart upload in a directory bucket.
        :param s3_client: boto3 S3 client
        :param bucket_name: Bucket where the multipart upload was initiated - for example, 'doc-example-bucket--usw2-az1--x-s3'
        :param key_name: Name of the object for which the multipart upload needs to be aborted
        :param upload_id: Multipart upload ID for the multipart upload to be aborted
        :return: True if the multipart upload was successfully aborted, False if not
        '''
        try:
            s3_client.abort_multipart_upload(
                Bucket = bucket_name,
                Key = key_name,
                UploadId = upload_id
            )
        except ClientError as e:
            logging.error(e)
            return False
        return True

    if __name__ == '__main__':
        region = 'us-west-2'
        bucket_name = 'BUCKET_NAME'
        key_name = 'KEY_NAME'
        upload_id = 'UPLOAD_ID'
        s3_client = boto3.client('s3', region_name = region)
        if abort_multipart_upload(s3_client, bucket_name, key_name, upload_id):
            print (f'Multipart upload for object {key_name} in {bucket_name} bucket has been aborted')
        else:
            print (f'Unable to abort multipart upload for object {key_name} in {bucket_name} bucket')
- AWS CLI
The following example shows how to abort a multipart upload by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api abort-multipart-upload --bucket bucket-base-name--zone-id--x-s3 --key KEY_NAME \
        --upload-id "AS_mgt9RaQE9GEaifATue15dAAAAAAAAAAEMAAAAAAAAADQwNzI4MDU0MjUyMBYAAAAAAAAAAA0AAAAAAAAAAAH2AfYAAAAAAAAEAX5hFw-MAQAAAAB0OxUFeA7LTbWWFS8WYwhrxDxTIDN-pdEEq_agIHqsbg"
For more information, see abort-multipart-upload in the AWS Command Line Interface.
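Creating a multipart upload copy operation
To encrypt new object part copies in a directory bucket with SSE-KMS, you must specify SSE-KMS as the directory bucket's default encryption configuration with a KMS key (specifically, a customer managed key). The AWS managed key (aws/s3) isn't supported. Your SSE-KMS configuration can support only 1 customer managed key per directory bucket for the lifetime of the bucket. After you specify a customer managed key for SSE-KMS, you can't override the customer managed key for the bucket's SSE-KMS configuration. You can't specify server-side encryption settings for new object part copies with SSE-KMS in the UploadPartCopy request headers. Also, the request headers that you provide in the CreateMultipartUpload request must match the default encryption configuration of the destination bucket.
S3 Bucket Keys aren't supported when you use UploadPartCopy to copy SSE-KMS encrypted objects from general purpose buckets to directory buckets, from directory buckets to general purpose buckets, or between directory buckets. In this case, Amazon S3 makes a call to AWS KMS every time a copy request is made for a KMS-encrypted object.
The following examples show how to copy objects from one bucket to another by using a multipart upload.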
- SDK for Java 2.x
The following example shows how to use a multipart upload to programmatically copy an object from one bucket to another by using the SDK for Java 2.x.
    /**
     * This method creates a multipart upload request that generates a unique upload ID that is used to track
     * all the upload parts.
     *
     * @param s3
     * @param bucketName
     * @param key
     * @return
     */
    private static String createMultipartUpload(S3Client s3, String bucketName, String key) {
        CreateMultipartUploadRequest createMultipartUploadRequest = CreateMultipartUploadRequest.builder()
                .bucket(bucketName)
                .key(key)
                .build();
        String uploadId = null;
        try {
            CreateMultipartUploadResponse response = s3.createMultipartUpload(createMultipartUploadRequest);
            uploadId = response.uploadId();
        } catch (S3Exception e) {
            System.err.println(e.awsErrorDetails().errorMessage());
            System.exit(1);
        }
        return uploadId;
    }

    /**
     * Creates copy parts based on source object size and copies over individual parts
     *
     * @param s3
     * @param sourceBucket
     * @param sourceKey
     * @param destnBucket
     * @param destnKey
     * @param uploadId
     * @return
     * @throws IOException
     */
    public static List<CompletedPart> multipartUploadCopy(S3Client s3, String sourceBucket, String sourceKey, String destnBucket, String destnKey, String uploadId) throws IOException {
        // Get the object size to track the end of the copy operation.
        HeadObjectRequest headObjectRequest = HeadObjectRequest
                .builder()
                .bucket(sourceBucket)
                .key(sourceKey)
                .build();
        HeadObjectResponse response = s3.headObject(headObjectRequest);
        Long objectSize = response.contentLength();
        System.out.println("Source Object size: " + objectSize);

        // Copy the object using 20 MB parts.
        long partSize = 20 * 1024 * 1024;
        long bytePosition = 0;
        int partNum = 1;
        List<CompletedPart> completedParts = new ArrayList<>();
        while (bytePosition < objectSize) {
            // The last part might be smaller than partSize, so check to make sure
            // that lastByte isn't beyond the end of the object.
            long lastByte = Math.min(bytePosition + partSize - 1, objectSize - 1);
            System.out.println("part no: " + partNum + ", bytePosition: " + bytePosition + ", lastByte: " + lastByte);

            // Copy this part.
            UploadPartCopyRequest req = UploadPartCopyRequest.builder()
                    .uploadId(uploadId)
                    .sourceBucket(sourceBucket)
                    .sourceKey(sourceKey)
                    .destinationBucket(destnBucket)
                    .destinationKey(destnKey)
                    .copySourceRange("bytes=" + bytePosition + "-" + lastByte)
                    .partNumber(partNum)
                    .build();
            UploadPartCopyResponse res = s3.uploadPartCopy(req);
            CompletedPart part = CompletedPart.builder()
                    .partNumber(partNum)
                    .eTag(res.copyPartResult().eTag())
                    .build();
            completedParts.add(part);
            partNum++;
            bytePosition += partSize;
        }
        return completedParts;
    }

    public static void multipartCopyUploadTest(S3Client s3, String srcBucket, String srcKey, String destnBucket, String destnKey) {
        System.out.println("Starting multipart copy for: " + srcKey);
        try {
            String uploadId = createMultipartUpload(s3, destnBucket, destnKey);
            System.out.println(uploadId);
            List<CompletedPart> parts = multipartUploadCopy(s3, srcBucket, srcKey, destnBucket, destnKey, uploadId);
            completeMultipartUpload(s3, destnBucket, destnKey, uploadId, parts);
            System.out.println("Multipart copy completed for: " + srcKey);
        } catch (Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
- SDK for Python
The following example shows how to use a multipart upload to programmatically copy an object from one bucket to another by using the SDK for Python.
    import logging
    import boto3
    from botocore.exceptions import ClientError

    def head_object(s3_client, bucket_name, key_name):
        '''
        Returns metadata for an object in a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: Bucket that contains the object to query for metadata
        :param key_name: Key name to query for metadata
        :return: Metadata for the specified object if successful, else None
        '''
        try:
            response = s3_client.head_object(
                Bucket = bucket_name,
                Key = key_name
            )
            return response
        except ClientError as e:
            logging.error(e)
            return None

    def create_multipart_upload(s3_client, bucket_name, key_name):
        '''
        Create a multipart upload to a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: Destination bucket for the multipart upload
        :param key_name: Key name of the object to be uploaded
        :return: UploadId for the multipart upload if created successfully, else None
        '''
        try:
            mpu = s3_client.create_multipart_upload(Bucket = bucket_name, Key = key_name)
            return mpu['UploadId']
        except ClientError as e:
            logging.error(e)
            return None

    def multipart_copy_upload(s3_client, source_bucket_name, key_name, target_bucket_name, mpu_id, part_size):
        '''
        Copy an object in a directory bucket to another bucket in multiple parts of a specified size
        :param s3_client: boto3 S3 client
        :param source_bucket_name: Bucket where the source object exists
        :param key_name: Key name of the object to be copied
        :param target_bucket_name: Destination bucket for copied object
        :param mpu_id: The UploadId returned from the create_multipart_upload call
        :param part_size: The size parts that the object will be broken into, in bytes.
                          Minimum 5 MiB, Maximum 5 GiB. There is no minimum size for the last part of your multipart upload.
        :return: part_list for the multipart copy if all parts are copied successfully, else None
        '''
        part_list = []
        copy_source = {
            'Bucket': source_bucket_name,
            'Key': key_name
        }
        try:
            part_counter = 1
            object_metadata = head_object(s3_client, source_bucket_name, key_name)
            if object_metadata is None:
                # Without the size of the source object, the copy ranges can't be computed.
                return None
            object_size = object_metadata['ContentLength']
            while (part_counter - 1) * part_size < object_size:
                bytes_start = (part_counter - 1) * part_size
                # Clamp the last range so that it doesn't extend past the end of the source object.
                bytes_end = min(part_counter * part_size, object_size) - 1
                upload_copy_part = s3_client.upload_part_copy(
                    Bucket = target_bucket_name,
                    CopySource = copy_source,
                    CopySourceRange = f'bytes={bytes_start}-{bytes_end}',
                    Key = key_name,
                    PartNumber = part_counter,
                    UploadId = mpu_id
                )
                part_list.append({'PartNumber': part_counter, 'ETag': upload_copy_part['CopyPartResult']['ETag']})
                part_counter += 1
        except ClientError as e:
            logging.error(e)
            return None
        return part_list

    def complete_multipart_upload(s3_client, bucket_name, key_name, mpu_id, part_list):
        '''
        Completes a multipart upload to a directory bucket
        :param s3_client: boto3 S3 client
        :param bucket_name: Destination bucket for the multipart upload
        :param key_name: Key name of the object to be uploaded
        :param mpu_id: The UploadId returned from the create_multipart_upload call
        :param part_list: List of uploaded part numbers with associated ETags
        :return: True if the multipart upload was completed successfully, else False
        '''
        try:
            s3_client.complete_multipart_upload(
                Bucket = bucket_name,
                Key = key_name,
                UploadId = mpu_id,
                MultipartUpload = {
                    'Parts': part_list
                }
            )
        except ClientError as e:
            logging.error(e)
            return False
        return True

    if __name__ == '__main__':
        MB = 1024 ** 2
        region = 'us-west-2'
        source_bucket_name = 'SOURCE_BUCKET_NAME'
        target_bucket_name = 'TARGET_BUCKET_NAME'
        key_name = 'KEY_NAME'
        part_size = 10 * MB
        s3_client = boto3.client('s3', region_name = region)
        mpu_id = create_multipart_upload(s3_client, target_bucket_name, key_name)
        if mpu_id is not None:
            part_list = multipart_copy_upload(s3_client, source_bucket_name, key_name, target_bucket_name, mpu_id, part_size)
            if part_list is not None:
                if complete_multipart_upload(s3_client, target_bucket_name, key_name, mpu_id, part_list):
                    print (f'{key_name} successfully copied through multipart copy from {source_bucket_name} to {target_bucket_name}')
                else:
                    print (f'Could not copy {key_name} through multipart copy from {source_bucket_name} to {target_bucket_name}')
- AWS CLI
The following example shows how to use a multipart upload to copy an object from one bucket to a directory bucket by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api upload-part-copy --bucket bucket-base-name--zone-id--x-s3 --key TARGET_KEY_NAME \
        --copy-source SOURCE_BUCKET_NAME/SOURCE_KEY_NAME \
        --part-number 1 \
        --upload-id "AS_mgt9RaQE9GEaifATue15dAAAAAAAAAAEMAAAAAAAAADQwNzI4MDU0MjUyMBYAAAAAAAAAAA0AAAAAAAAAAAH2AfYAAAAAAAAEBnJ4cxKMAQAAAABiNXpOFVZJ1tZcKWib9YKE1C565_hCkDJ_4AfCap2svg"
For more information, see upload-part-copy in the AWS Command Line Interface.
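The byte-range arithmetic behind a multipart copy can be isolated into a small function, which makes the handling of the last (possibly shorter) part easy to check. This is an illustrative sketch: `copy_source_ranges` is a hypothetical helper, and the tiny sizes in the usage example are chosen only to keep the output readable.

```python
def copy_source_ranges(object_size, part_size):
    """Return (part_number, range_string) pairs that cover object_size bytes.

    Ranges are inclusive, in the 'bytes=first-last' form, and the last range is
    clamped so that it ends at the final byte of the source object.
    """
    ranges = []
    start = 0
    part_number = 1
    while start < object_size:
        end = min(start + part_size, object_size) - 1
        ranges.append((part_number, f'bytes={start}-{end}'))
        start += part_size
        part_number += 1
    return ranges


# A 25-byte object copied in 10-byte parts needs three parts.
print(copy_source_ranges(25, 10))
# [(1, 'bytes=0-9'), (2, 'bytes=10-19'), (3, 'bytes=20-24')]
```

Each range string is in the form used for the CopySourceRange parameter in the SDK examples, with consecutive part numbers starting at 1.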
Listing in-progress multipart uploads
To list the in-progress multipart uploads to a directory bucket, you can use the AWS SDKs or the AWS CLI.
- SDK for Java 2.x
The following example shows how to list in-progress (incomplete) multipart uploads by using the SDK for Java 2.x.
    public static void listMultiPartUploads(S3Client s3, String bucketName) {
        try {
            ListMultipartUploadsRequest listMultipartUploadsRequest = ListMultipartUploadsRequest.builder()
                    .bucket(bucketName)
                    .build();
            ListMultipartUploadsResponse response = s3.listMultipartUploads(listMultipartUploadsRequest);
            List<MultipartUpload> uploads = response.uploads();
            for (MultipartUpload upload: uploads) {
                System.out.println("Upload in progress: Key = \"" + upload.key() + "\", id = " + upload.uploadId());
            }
        }
        catch (S3Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
- SDK for Python
The following example shows how to list in-progress (incomplete) multipart uploads by using the SDK for Python.
    import logging
    import boto3
    from botocore.exceptions import ClientError

    def list_multipart_uploads(s3_client, bucket_name):
        '''
        List any incomplete multipart uploads in a directory bucket in the specified Region
        :param s3_client: boto3 S3 client
        :param bucket_name: Bucket to check for incomplete multipart uploads
        :return: List of incomplete multipart uploads if there are any, None if not
        '''
        try:
            response = s3_client.list_multipart_uploads(Bucket = bucket_name)
            if 'Uploads' in response.keys():
                return response['Uploads']
            else:
                return None
        except ClientError as e:
            logging.error(e)
            return None

    if __name__ == '__main__':
        bucket_name = 'BUCKET_NAME'
        region = 'us-west-2'
        s3_client = boto3.client('s3', region_name = region)
        multipart_uploads = list_multipart_uploads(s3_client, bucket_name)
        if multipart_uploads is not None:
            print (f'There are {len(multipart_uploads)} incomplete multipart uploads for {bucket_name}')
        else:
            print (f'There are no incomplete multipart uploads for {bucket_name}')
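- AWS CLI
The following example shows how to list in-progress (incomplete) multipart uploads by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api list-multipart-uploads --bucket bucket-base-name--zone-id--x-s3
For more information, see list-multipart-uploads in the AWS Command Line Interface.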
Listing the parts of a multipart upload
The following examples show how to list the parts of a multipart upload to a directory bucket.
- SDK for Java 2.x
The following example shows how to list the parts of a multipart upload to a directory bucket by using the SDK for Java 2.x.
    public static void listMultiPartUploadsParts(S3Client s3, String bucketName, String objKey, String uploadID) {
        try {
            ListPartsRequest listPartsRequest = ListPartsRequest.builder()
                    .bucket(bucketName)
                    .uploadId(uploadID)
                    .key(objKey)
                    .build();
            ListPartsResponse response = s3.listParts(listPartsRequest);
            List<Part> parts = response.parts();
            for (Part part: parts) {
                System.out.println("Upload in progress: Part number = \"" + part.partNumber() + "\", etag = " + part.eTag());
            }
        }
        catch (S3Exception e) {
            System.err.println(e.getMessage());
            System.exit(1);
        }
    }
- SDK for Python
The following example shows how to list the parts of a multipart upload to a directory bucket by using the SDK for Python.
    import logging
    import boto3
    from botocore.exceptions import ClientError

    def list_parts(s3_client, bucket_name, key_name, upload_id):
        '''
        Lists the parts that have been uploaded for a specific multipart upload to a directory bucket.
        :param s3_client: boto3 S3 client
        :param bucket_name: Bucket that multipart uploads parts have been uploaded to
        :param key_name: Name of the object that has parts uploaded
        :param upload_id: Multipart upload ID that the parts are associated with
        :return: List of parts associated with the specified multipart upload, None if there are no parts
        '''
        parts_list = []
        next_part_marker = ''
        continuation_flag = True
        try:
            while continuation_flag:
                if next_part_marker == '':
                    response = s3_client.list_parts(
                        Bucket = bucket_name,
                        Key = key_name,
                        UploadId = upload_id
                    )
                else:
                    # Request the next page, starting after the last part that was returned.
                    response = s3_client.list_parts(
                        Bucket = bucket_name,
                        Key = key_name,
                        UploadId = upload_id,
                        PartNumberMarker = next_part_marker
                    )
                if 'Parts' in response:
                    for part in response['Parts']:
                        parts_list.append(part)
                    if response['IsTruncated']:
                        next_part_marker = response['NextPartNumberMarker']
                    else:
                        continuation_flag = False
                else:
                    continuation_flag = False
            return parts_list
        except ClientError as e:
            logging.error(e)
            return None

    if __name__ == '__main__':
        region = 'us-west-2'
        bucket_name = 'BUCKET_NAME'
        key_name = 'KEY_NAME'
        upload_id = 'UPLOAD_ID'
        s3_client = boto3.client('s3', region_name = region)
        parts_list = list_parts(s3_client, bucket_name, key_name, upload_id)
        if parts_list is not None:
            print (f'{key_name} has {len(parts_list)} parts uploaded to {bucket_name}')
        else:
            print (f'There are no multipart uploads with that upload ID for {bucket_name} bucket')
- AWS CLI
The following example shows how to list the parts of a multipart upload to a directory bucket by using the AWS CLI. To use the command, replace the user input placeholders with your own information.
    aws s3api list-parts --bucket bucket-base-name--zone-id--x-s3 \
        --key KEY_NAME \
        --upload-id "AS_mgt9RaQE9GEaifATue15dAAAAAAAAAAEMAAAAAAAAADQwNzI4MDU0MjUyMBYAAAAAAAAAAA0AAAAAAAAAAAH2AfYAAAAAAAAEBSD0WBKMAQAAAABneY9yBVsK89iFkvWdQhRCcXohE8RbYtc9QvBOG8tNpA"
For more information, see list-parts in the AWS Command Line Interface.