Tutorial
Overview
The AWS S3 file compression (zip) program is a tool that compresses one or more files stored in S3 into a single zip file, entirely in memory, without downloading anything to your hard disk. This can be useful for reducing file sizes and saving storage costs.
Requirements
Before using the AWS S3 file compression (zip) program, you'll need to have the following:
- Bash
- An AWS account
- Access to the AWS S3 service
- Basic knowledge of command line and AWS
Installation
How to install the project
To install the project's CLI, we recommend using Poetry:
This is only a recommendation. You can also install the project with your preferred package manager, such as pip:
How to use the program
To use the AWS S3 file compression (zip) program, follow the steps below:
- First, we need to import our package
- Then, instantiate the class
- Call the `credentials` method only if needed, which depends on whether `~/.aws/credentials` already exists
- Finally, we just need to call the `zipping_in_s3` method
Code Example
| example.py | |
|---|---|
Here's a Python code example that implements the AWS S3 file compression (zip) program's functionality 100% in memory.
Going deeper into each function
credentials():
The credentials method provides the AWS credentials that are used to verify who you are and whether you have permission to access the resources you are requesting. You may not need to call it if you already have a configured ~/.aws/credentials file or if the environment variables AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and AWS_SESSION_TOKEN are defined.
Parameters:
- ACCESS_KEY - The access key for your AWS account.
- SECRET_KEY - The secret key for your AWS account.
- SESSION_TOKEN - The session token for your AWS account. This is only needed when you are using temporary credentials. The AWS_SECURITY_TOKEN environment variable can also be used, but is only supported for backwards compatibility purposes. AWS_SESSION_TOKEN is supported by multiple AWS SDKs besides Python.
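Whether `credentials` needs to be called can be checked up front. The following is a minimal sketch, assuming only the two sources described above (the shared credentials file and the environment variables). The helper name `has_ambient_credentials` is hypothetical and not part of the project, and it does not reproduce the full boto3 credential lookup.

```python
import os
from pathlib import Path


def has_ambient_credentials(env=None, credentials_file=None):
    """Return True if credentials seem to be available without calling credentials().

    Hypothetical helper: it checks only the two sources named in the text,
    not every place boto3 may look for credentials.
    """
    env = os.environ if env is None else env
    if credentials_file is None:
        credentials_file = Path.home() / ".aws" / "credentials"

    # Access key and secret key are both required. The session token is only
    # needed for temporary credentials, so it is not checked here.
    has_env_keys = bool(env.get("AWS_ACCESS_KEY_ID")) and bool(
        env.get("AWS_SECRET_ACCESS_KEY")
    )
    return has_env_keys or Path(credentials_file).is_file()
```

If the helper returns False, passing ACCESS_KEY and SECRET_KEY (and SESSION_TOKEN for temporary credentials) to `credentials` is the likely next step.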
Tip
You can find more information in the boto3 documentation.
| credentials.py | |
|---|---|
s3_download_in_memory():
s3_download_in_memory is a method that downloads files directly into RAM, without saving them to permanent storage. This saves disk space and can speed up file transfers.
Parameters:
- bucket_name - The name of the bucket.
- prefix - The prefix used to find matching paths/files.
Return:
list[tuple[str, io.BytesIO]]
The method returns a list of tuples, where each tuple contains a string (the file name) and an io.BytesIO object (the file's binary content). The binary data is held in memory at runtime, so there is no need to create temporary physical files, and it can be manipulated without taking up disk space.
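To make the return shape concrete, here is a minimal sketch of an in-memory download. It is not the project's implementation: the function name `download_prefix_in_memory` is hypothetical, `client` is assumed to behave like a boto3 S3 client (`list_objects_v2` and `download_fileobj`), and using the object key as the name in each tuple is an assumption.

```python
import io


def download_prefix_in_memory(client, bucket_name, prefix):
    """Download every object under `prefix` into RAM.

    `client` is assumed to behave like a boto3 S3 client. Returns a list of
    (key, io.BytesIO) tuples; nothing is written to disk.
    """
    files = []
    # list_objects_v2 returns at most 1000 keys per call; pagination is
    # left out to keep the sketch short.
    response = client.list_objects_v2(Bucket=bucket_name, Prefix=prefix)
    for obj in response.get("Contents", []):
        buffer = io.BytesIO()
        client.download_fileobj(bucket_name, obj["Key"], buffer)
        buffer.seek(0)  # rewind so the caller reads from the start
        files.append((obj["Key"], buffer))
    return files
```

With a real client this would be called as `download_prefix_in_memory(boto3.client("s3"), "my-bucket", "reports/")`, where the bucket and prefix are placeholders.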
| s3_download_in_memory.py | |
|---|---|
zipping_in_s3():
zipping_in_s3 is a method that compresses one or more files into a single ZIP file and stores it directly in AWS S3. This saves storage space and reduces file transfer costs, because only one compressed file is sent instead of several uncompressed files.
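The compression step can be illustrated with the standard library alone. This is a minimal sketch, not the project's code: the function name `zip_in_memory` is hypothetical, and the choice of DEFLATE compression is an assumption.

```python
import io
import zipfile


def zip_in_memory(files):
    """Pack (name, io.BytesIO) tuples into one ZIP archive held in RAM."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, mode="w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files:
            # getvalue() returns the whole buffer regardless of its position
            zf.writestr(name, data.getvalue())
    archive.seek(0)  # rewind so the archive can be read or uploaded
    return archive
```

The returned buffer is a complete ZIP archive, so it can be handed to an upload call without ever touching the disk.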
Parameters:
- bucket_name - The name of the bucket.
- prefix - The prefix used to find matching paths/files.
- zip_name - The name given to the ZIP file generated from compressing one or more files.
- files (not required) - A list of tuples, where each tuple contains a string and an io.BytesIO object. When this parameter is used, the s3_download_in_memory() method is not executed, which means that no file is downloaded from AWS S3. This makes it possible to send a ZIP file directly from the local machine to S3 without first downloading anything from the cloud.
- extra_args (not required) - Additional arguments that Boto3 passes along with the upload or download operation in AWS S3. It allows specifying options such as metadata or storage settings that are sent to the S3 service during the file transfer.
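The optional `files` and `extra_args` parameters can be sketched as follows. The helper names `read_local_files` and `upload_zip` are hypothetical, `client` is assumed to behave like a boto3 S3 client, and using `zip_name` as the object key is an assumption. `ExtraArgs` is the keyword boto3 uses for optional upload settings.

```python
import io
from pathlib import Path


def read_local_files(paths):
    """Build the (name, io.BytesIO) tuples expected by a `files` argument."""
    return [(Path(p).name, io.BytesIO(Path(p).read_bytes())) for p in paths]


def upload_zip(client, bucket_name, zip_name, archive, extra_args=None):
    """Upload an in-memory ZIP; `client` is assumed to be a boto3 S3 client."""
    archive.seek(0)  # make sure the upload starts at the first byte
    client.upload_fileobj(archive, bucket_name, zip_name, ExtraArgs=extra_args)


# Example extra_args: a storage class and user metadata for the upload.
EXAMPLE_EXTRA_ARGS = {
    "StorageClass": "STANDARD_IA",
    "Metadata": {"source": "local-machine"},
}
```

Passing `None` for `extra_args` uploads the archive with the default settings.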