This guide covers issues relating specifically to S3 Deep Archive buckets. For more general information on using S3 buckets (connecting to a bucket, uploading files), please refer to How do I use a Spinup S3 bucket?

Objective
Upload files, create retrieval jobs, track a job's status, manage expiry, and download the retrieved objects contained within an S3 Deep Archive bucket.

Introduction

S3 Deep Archive is different from other S3 storage classes primarily in terms of retrieval times and costs. Unlike the immediate access you get with standard S3 buckets, S3 Deep Archive is designed for long-term data storage where access is infrequent and retrieval can afford to take hours. This translates to highly reduced storage costs, making it an attractive option for archival purposes.

This guide will equip you with knowledge on creating retrieval jobs, tracking their status, downloading the retrieved data, and managing object expiry.

S3 Deep Archive should only be used for storage where files need to be retrieved no more than 1-2 times a year. If you are actively working with files (uploading / downloading / deleting), you should consider other storage options, such as S3 (via Spinup) or Wasabi. More information on comparing storage options can be found on Storage Finder @ Yale.

Retrieving a file (via the Amazon Command Line Interface)

Amazon’s Documentation on S3 Deep Archive Retrieval.

  1. Initiate a Restore Request:

    aws s3api restore-object --bucket YOUR_BUCKET_NAME --key YOUR_FILE_KEY --restore-request 'Days=NUMBER_OF_DAYS_TO_KEEP_RESTORED,GlacierJobParameters={Tier=Standard}'

    Quoting the --restore-request value keeps your shell from interpreting the braces. Deep Archive supports the Standard and Bulk retrieval tiers (Expedited is not available for this storage class).
  2. Check the Restoration Status:
    A Standard-tier restore from Deep Archive typically takes up to 12 hours to complete. To check the status:

    aws s3api head-object --bucket YOUR_BUCKET_NAME --key YOUR_FILE_KEY

    In the output, look for the "Restore" field. While the restore is in progress it will read ongoing-request="true"; when the file is ready, it will state ongoing-request="false" along with an expiry-date for the temporary copy.

  3. Retrieve the Restored File:
    After restoration completes:

    aws s3 cp s3://YOUR_BUCKET_NAME/YOUR_FILE_KEY DESTINATION_PATH_ON_YOUR_LOCAL_MACHINE

    Replace DESTINATION_PATH_ON_YOUR_LOCAL_MACHINE with where you want to download the file to.
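
The CLI steps above can be scripted into a simple readiness check. The sketch below parses a Restore header value to decide whether an object is ready to download; the sample value is hard-coded for illustration (in practice you would fetch it with aws s3api head-object --bucket YOUR_BUCKET_NAME --key YOUR_FILE_KEY --query Restore --output text).

```shell
# Sample "Restore" value as returned by head-object (hard-coded here
# for illustration; while a restore is still in progress it reads
# ongoing-request="true" with no expiry-date).
restore='ongoing-request="false", expiry-date="Fri, 21 Dec 2025 00:00:00 GMT"'

case "$restore" in
  *'ongoing-request="false"'*)
    # Restore finished: pull out when the temporary copy expires.
    expiry=$(printf '%s' "$restore" | sed 's/.*expiry-date="\([^"]*\)".*/\1/')
    echo "restored until: $expiry"
    ;;
  *)
    echo "restore still in progress"
    ;;
esac
```

Wrapping this check in a loop with sleep, or running it from cron, avoids polling by hand during the multi-hour restore window.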

You are only billed for the restored copy of the file for the number of days you specified in the restore request. After those days, the temporary restored copy will be deleted, but the original in the Glacier Deep Archive will remain intact.

Retrieving a file (via Cyberduck)

Cyberduck’s Documentation on S3 Archive Retrieval.

  1. Initiate a Restore Request:
    In the Cyberduck browser, select the archived file and choose File → Restore.

  2. Check the Restoration Status:
    As with the CLI, the restore can take up to 12 hours. A file that has not finished restoring cannot be downloaded; attempting to do so will result in an error.

  3. Retrieve the Restored File:
    Once the restore completes, download the file as you would from any other S3 bucket.

By default, Cyberduck retrievals are set to expire after two days. After two days, the file will no longer be available, and another retrieval job will need to be requested. If you need to modify the expiry time of a restore request, you will need to modify hidden configuration options within Cyberduck. For more information, please refer to their documentation.
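
As one illustration, hidden Cyberduck options on macOS can be set with defaults write. The property name below, s3.glacier.restore.expiration.days, is an assumption based on Cyberduck's default properties; verify it against their documentation before relying on it.

```shell
# Assumption: Cyberduck reads the hidden preference
# s3.glacier.restore.expiration.days (default: 2) for the restore window.
# On macOS, hidden preferences are written to Cyberduck's defaults domain:
defaults write ch.sudo.cyberduck s3.glacier.restore.expiration.days 7
```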