In this video, we demonstrate how to delete Amazon S3 Glacier vaults that contain archives you no longer need. This how-to guide includes a series of commands that are run using the AWS Command Line Interface (AWS CLI), as well as a batch file. For a copy of the commands and the sample batch file, click on the link below.

Amazon S3 Glacier offers secure, durable, and extremely low-cost cloud storage classes for data archiving and long-term backup.

  • Data is stored in Amazon S3 Glacier in “archives.” You can upload a single file as an archive or aggregate multiple files in tar/zip format
  • A single archive can be up to 40 TB
  • An archive is immutable, meaning that after it is created it cannot be updated
  • Amazon S3 Glacier uses “vaults” as containers to store archives
  • Under a single AWS account, you can have up to 1,000 vaults per AWS Region
  • Vault Lock allows you to enforce compliance controls

 

How do you delete old vaults you no longer need if they still contain archives?

Before you can delete an Amazon S3 Glacier vault, you first need to delete any archives stored in that vault. While the vault itself can be deleted from the AWS Management Console, deleting archives requires you to either write code or use the AWS Command Line Interface (AWS CLI). In the following steps, we go through the process of retrieving an inventory of your Glacier vault, identifying all the archives, and then deleting them. The video walk-through above shows each step in detail, and below are the commands you need to run.

You can access all the commands from our GitHub pages: https://github.com/iaasacademy/aws-how-to-guide/tree/main/delete-glacier-vaults

Step 1

Run an inventory job on your Amazon S3 Glacier vault. This is the first step in working out how many archives you have in the vault.

aws glacier initiate-job --vault-name awsexamplevault --account-id 111122223333 --job-parameters '{"Type": "inventory-retrieval"}'
Important note: if you are running the command on a Microsoft Windows machine using the Command Prompt, use this command instead:
aws glacier initiate-job --account-id 111122223333 --vault-name awsexamplevault --job-parameters "{\"Type\": \"inventory-retrieval\"}"
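
The job ID returned by this command is needed in Steps 2 and 3. As a minimal sketch for a Bash shell (it is not part of the original command set), the function below starts the inventory job and prints only the job ID, using the AWS CLI's global --query and --output options. The function name start_inventory_job is hypothetical, and the vault name and account ID are the placeholder values used above.

```shell
#!/usr/bin/env bash
# Sketch: start an inventory-retrieval job and print only its job ID.
# start_inventory_job is a hypothetical helper name, not part of the AWS CLI.
start_inventory_job() {
  local vault_name="$1"
  local account_id="$2"

  # --query jobId picks the jobId field out of the JSON response;
  # --output text prints it without surrounding quotes.
  aws glacier initiate-job \
    --vault-name "$vault_name" \
    --account-id "$account_id" \
    --job-parameters '{"Type": "inventory-retrieval"}' \
    --query jobId \
    --output text
}

# Example usage (uncomment to run against your own account):
# job_id="$(start_inventory_job awsexamplevault 111122223333)"
# echo "Inventory job started: $job_id"
```

Saving the ID in a shell variable avoids copying it by hand into the later commands.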

Step 2

Use the describe-job command to check the status of the inventory retrieval job. Each job may take some time, so you will need to rerun the following command until it shows that the inventory task is complete. In most cases this takes 3.5 to 4.5 hours.

aws glacier describe-job --vault-name awsexamplevault --account-id 111122223333 --job-id *** jobid ***
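
Rather than rerunning describe-job by hand, the check can be wrapped in a loop. The following is a rough sketch for a Bash shell, assuming you want to poll on a fixed interval. The function name wait_for_job and the 15-minute default interval are assumptions, not part of the original guide. It reads the StatusCode field of the describe-job response, which is InProgress, Succeeded, or Failed.

```shell
#!/usr/bin/env bash
# Sketch: poll describe-job until the job succeeds or fails.
# wait_for_job is a hypothetical helper; the 900-second default is an assumption.
wait_for_job() {
  local vault_name="$1"
  local account_id="$2"
  local job_id="$3"
  local interval_seconds="${4:-900}"
  local status

  while true; do
    status="$(aws glacier describe-job \
      --vault-name "$vault_name" \
      --account-id "$account_id" \
      --job-id "$job_id" \
      --query StatusCode \
      --output text)" || return 1

    case "$status" in
      Succeeded) echo "Job $job_id succeeded."; return 0 ;;
      Failed)    echo "Job $job_id failed." >&2; return 1 ;;
    esac

    echo "Job status is $status; checking again in ${interval_seconds}s."
    sleep "$interval_seconds"
  done
}

# Example usage (uncomment and supply your own job ID):
# wait_for_job awsexamplevault 111122223333 "$job_id"
```

Given the multi-hour wait, a long polling interval is usually enough.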

 

Step 3

Use the get-job-output command to download the output of the retrieval job to the file output.json. Because we need the archive IDs in order to delete the archives, this command exports the archive details into a JSON document that we can read them from.

aws glacier get-job-output --vault-name awsexamplevault --account-id 111122223333 --job-id *** jobid *** output.json

 

Step 4

Extract the archive IDs from the output.json file using jq. The jq tool lets you slice, filter, map, and transform structured data with the same ease that sed, awk, grep, and friends let you work with text. We can use it to extract just the archive IDs from the JSON file and export them into another text file, which a script can then parse to read each archive ID line by line.

jq '.ArchiveList[] | .ArchiveId' output.json > archiveIds.txt

If you are using a Windows operating system and the Command Prompt (CMD), use double quotes instead:
jq ".ArchiveList[] | .ArchiveId" output.json > archiveIds.txt
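
Before deleting anything, it can be worth confirming that the text file contains one ID for every archive in the inventory. The sketch below, for a Bash shell, compares the length of ArchiveList in output.json with the number of non-empty lines in archiveIds.txt. The function name check_archive_count is hypothetical, and the default file names are the ones used in the steps above.

```shell
#!/usr/bin/env bash
# Sketch: check that archiveIds.txt has one line per archive in the inventory.
# check_archive_count is a hypothetical helper name.
check_archive_count() {
  local inventory_file="${1:-output.json}"
  local ids_file="${2:-archiveIds.txt}"
  local expected actual

  # Number of entries in the inventory's ArchiveList array.
  expected="$(jq '.ArchiveList | length' "$inventory_file")" || return 1
  # Number of non-empty lines in the extracted ID file.
  actual="$(grep -c . "$ids_file" || true)"

  echo "Inventory lists $expected archives; $ids_file contains $actual IDs."
  [ "$expected" -eq "$actual" ]
}

# Example usage (uncomment to run in the directory holding both files):
# check_archive_count output.json archiveIds.txt
```

The function returns a non-zero status when the two counts differ, so it can guard a deletion script.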

Step 5

In this step, we delete archives by referencing the archive IDs in the archiveIds.txt file. Note that this command needs to be run once for each archive ID.

aws glacier delete-archive --vault-name awsexamplevault --account-id 111122223333 --archive-id *** archiveid ***

 

Step 6

At this stage, you can appreciate that there may be hundreds, maybe thousands, of archives to delete. Instead of deleting each one manually, you can create a script to automate the process. A script written for the Windows Command Prompt is available on our GitHub pages as a standard batch file.
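
For readers on Linux or macOS, the same idea can be sketched in a Bash shell. This is not the batch file from the repository, only an illustration of the loop under stated assumptions: the function name delete_all_archives is hypothetical, and the input is assumed to be the archiveIds.txt file from Step 4, with one archive ID per line, still wrapped in the double quotes that jq prints.

```shell
#!/usr/bin/env bash
# Sketch: delete every archive listed in a file of archive IDs (one per line).
# delete_all_archives is a hypothetical helper, not the repository's batch file.
delete_all_archives() {
  local vault_name="$1"
  local account_id="$2"
  local ids_file="${3:-archiveIds.txt}"
  local archive_id deleted=0

  # The "|| [ -n ... ]" part also handles a last line with no trailing newline.
  while IFS= read -r archive_id || [ -n "$archive_id" ]; do
    # Strip the double quotes written by jq and any Windows carriage return.
    archive_id="$(printf '%s' "$archive_id" | tr -d '"\r')"
    [ -n "$archive_id" ] || continue   # skip blank lines

    # The --archive-id=VALUE form keeps an ID that begins with a hyphen
    # from being mistaken for another option.
    aws glacier delete-archive \
      --vault-name "$vault_name" \
      --account-id "$account_id" \
      --archive-id="$archive_id" || return 1
    deleted=$((deleted + 1))
  done < "$ids_file"

  echo "Deleted $deleted archives from $vault_name."
}

# Example usage (uncomment to run against your own vault):
# delete_all_archives awsexamplevault 111122223333 archiveIds.txt
```

The loop stops at the first failed deletion so that an error, such as a mistyped vault name, is not repeated for every archive.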