
Connecting and configuring CESNET S3

The instructions refer to the access_key, the secret_key, and the URL of the S3 endpoint. You should have received all of this information when your S3 account was created. The URL of the S3 endpoint has the form s3.clX.du.cesnet.cz, where X is a natural number (1 or greater).

Linux client AWS CLI

The AWS CLI client is a standard tool for working with the S3 interface. The client is written in Python, so Python 3.3 or later must be installed. According to the official documentation, Python 2.6.5+ is also supported; however, you may encounter compatibility issues with the version of Ceph operated at the CESNET Data Centers.

If you run CentOS 7, the default Python version is 2.7.5. In that case we recommend either upgrading to CentOS 8, which already ships with Python 3, or using SCL (Software Collections).

AWS CLI installation and configuration

First install the AWS CLI client using pip. The option --upgrade upgrades any components that are already installed; --user installs into the home directory of the user running pip, so the system libraries are not modified.

$ pip3.6 install awscli --upgrade --user

Next, check the installed version and upgrade the AWS CLI to the latest release:

$ aws --version
aws-cli/1.16.239 Python/3.6.3 Linux/3.10.0-957.27.2.el7.x86_64 botocore/1.12.229
$ pip3.6 install --upgrade --user awscli
If you need to install the AWS CLI in a virtual environment, you can use this guide.

We recommend configuring the AWS CLI with the --profile option. It allows you to define multiple user profiles, for example one for yourself and one for a service identity. You can also use the default settings (without the --profile option); all commands are then identical, and whenever a command does not contain --profile, the default profile is used.
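
For illustration, working with the default profile versus a named profile might look like this (the profile name pepa_servis is just an example):

$ aws configure                            # configures the default profile
$ aws configure --profile pepa_servis      # configures a named profile
$ aws s3 ls                                # uses the default profile
$ aws s3 ls --profile pepa_servis          # uses the named profile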

Next configure the AWS CLI. The following examples use the configuration with the option --profile.

$ aws configure --profile pepa_servis
AWS Access Key ID [None]: AKIAI44QH8DHBEXAMPLE
AWS Secret Access Key [None]: je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
Default region name [None]:
Default output format [None]: text

AWS Access Key ID - the access key you received when your S3 account was created
Secret Access Key - the secret key you received when your S3 account was created
Default region name - leave empty; only if a particular piece of software (e.g. Veeam) requires a region, fill in storage
Default output format - output data format (json, text, table), see the example below
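
The output format can also be overridden per command with the global --output option; for example, a table-formatted bucket listing at the api level might look like this (the profile name is illustrative):

$ aws s3api --profile pepa_servis list-buckets --output table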

For proper functioning, it is necessary to point the endpoint-url at the CESNET servers. To do that, you need to install the awscli-plugin-endpoint tool, see the instructions below.

Installing awscli-plugin-endpoint

First, perform the installation using pip.

pip3.6 install awscli-plugin-endpoint --user

Alternatively, you can install the latest version from the GitHub repository:

pip3.6 install git+https://github.com/wbingli/awscli-plugin-endpoint.git --user

To use the installed plugin, you need to turn it on first.

aws configure set plugins.endpoint awscli_plugin_endpoint

Then add the address of the CESNET S3 endpoint, here for the Pilsen storage with the code designation CL2:

aws configure --profile pepa_servis set s3.endpoint_url https://s3.cl2.du.cesnet.cz

The above commands should insert the proper configuration lines into your config file, see below.

[user@distro ~]$ cat .aws/config 

[profile pepa_servis]
output = text
s3 =
    endpoint_url = https://s3.cl2.du.cesnet.cz
[plugins]
endpoint = awscli_plugin_endpoint

Now we can verify the connection, for example by listing the existing buckets.

$ aws s3 --profile pepa_servis ls
2019-09-11 17:06:53 test-win
2019-09-11 14:45:45 large-files
2019-09-11 14:48:21 small-files

AWS CLI control - high-level (s3)

To view the full help (available commands) you can use help

$ aws s3 help

Working with buckets

The bucket name must be unique and may contain only lowercase and uppercase letters, numbers, dashes, and periods. The name must begin with a letter or a number and must not contain a period next to a dash or two consecutive periods.
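
A few illustrative names (our own examples, not part of the official rules) may help:

my-bucket.backup1     # valid: letters, numbers, dash, period
1backup               # valid: begins with a number
_bucket               # invalid: does not begin with a letter or number
my..bucket            # invalid: two periods in a row
my.-bucket            # invalid: a period next to a dash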

Bucket creation

$ aws s3 --profile pepa_servis mb s3://test1

Listing buckets

$ aws s3 --profile pepa_servis ls
2019-09-18 13:30:17 test1

Bucket removal

$ aws s3 --profile pepa_servis rb s3://test1
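
If the bucket is not empty, the removal fails. The AWS CLI offers the --force option, which first deletes all objects and then the bucket itself; use it with care:

$ aws s3 --profile pepa_servis rb s3://test1 --force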


Working with files and directories

Files
Upload the file

$ aws s3 --profile pepa_servis cp file_1.tar s3://test1
upload: ./file_1.tar to s3://test1/file_1.tar

Download the file

$ aws s3 --profile pepa_servis cp s3://test1/file_1.tar downloads/
download: s3://test1/file_1.tar to downloads/file_1.tar

Delete the file

$ aws s3 --profile pepa_servis rm s3://test1/file_1.tar 
delete: s3://test1/file_1.tar

Directories
Upload the directory

If you specify a slash “/” at the end of the source directory, the command applies only to the contents of the source directory. Without a slash, the command applies to the directory itself.
$ aws s3 --profile pepa_servis cp my_dir s3://test1/test_dir1 --recursive

Download the directory

$ aws s3 --profile pepa_servis cp s3://test1/test_dir1 downloads/ --recursive

Delete the directory

$ aws s3 --profile pepa_servis rm s3://test1/test_dir1 --recursive

Directory synchronization
Directory synchronization from the local machine to the object storage over S3

$ aws s3 --profile pepa_servis sync downloads s3://test1/my_sync/

Directory synchronization from the storage over S3 to the local machine

$ aws s3 --profile pepa_servis sync s3://test1/my_sync/ ./restored/
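
If the destination should exactly mirror the source, sync supports the --delete option, which also removes files that no longer exist on the source side; use it with care:

$ aws s3 --profile pepa_servis sync downloads s3://test1/my_sync/ --delete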


AWS CLI control - api-level (s3api)

To view the full help (available commands) you can use help

$ aws s3api help
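
For example, an api-level equivalent of listing a bucket's contents might look like this (the bucket name is illustrative):

$ aws s3api --profile pepa_servis list-objects-v2 --bucket test1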

S3 extension functions

- S3 object versioning: instructions here.

- Sharing an S3 object using a (presigned) URL: instructions here.



Linux client s3cmd

S3cmd is a free command line tool and client for uploading, retrieving and managing data in S3 cloud storage. S3cmd is written in Python. It is an open-source project available under the GNU General Public License v2 (GPLv2) and is free for both commercial and private use.

As the preferred tool, we recommend the AWS CLI.
We have had problems with s3cmd in some cases; for example, bucket names cannot begin with a number or contain capital letters.

Installing s3cmd

s3cmd is available in the default package repositories for CentOS, RHEL, and Ubuntu. Install it by simply running the following commands on your system.

On CentOS/RHEL:

$ sudo yum install s3cmd 

On Ubuntu/Debian:

$ sudo apt install s3cmd 

Configuration of s3cmd

To configure s3cmd, you need the Access Key and Secret Key that you received from the data center administrators. After obtaining the keys, insert them into the configuration file /home/user/.s3cfg.

[default]
host_base = https://s3.cl2.du.cesnet.cz
use_https = True
access_key = xxxxxxxxxxxxxxxxxxxxxx
secret_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
host_bucket = s3.cl2.du.cesnet.cz
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
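
If you prefer not to edit the file by hand, s3cmd can also generate the configuration interactively; you will be prompted for the keys and connection details:

$ s3cmd --configure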

Using s3cmd commands

s3cmd supports common bucket operations, such as creating, listing, and deleting.

Working with buckets

Listing of all s3 buckets

$ s3cmd ls 

Creation of a new s3 bucket

$ s3cmd mb s3://newbucket 

Remove s3 bucket

$ s3cmd rb s3://newbucket 

A bucket can be removed only if it is empty!
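
If you need to get rid of a non-empty bucket, one option is to delete its contents first, for example:

$ s3cmd del --recursive s3://newbucket/
$ s3cmd rb s3://newbucket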


Working with files and directories

Listing the contents of the s3 bucket

$ s3cmd ls s3://newbucket/ 

Uploading data to s3 bucket

Upload files

$ s3cmd put file.txt s3://newbucket/ 

Upload encrypted files

$ s3cmd put -e file.txt s3://newbucket/

Upload folders

$ s3cmd put -r directory s3://newbucket/ 

Make sure you do not add a trailing slash to the directory (e.g. directory/); otherwise, only the contents of the directory will be uploaded.

Download file from s3 bucket

$ s3cmd get s3://newbucket/file.txt 

Delete data from s3 bucket

$ s3cmd del s3://newbucket/file.txt 
$ s3cmd del s3://newbucket/directory 

Data synchronization to s3 bucket

$ s3cmd sync /local/path/ s3://newbucket/backup/ 

Synchronization of data from s3 bucket

$ s3cmd sync s3://newbucket/backup/ ~/restore/ 


s5cmd for very fast transfers

If you have a connection faster than 1-2 Gbps and you want to optimize transfers for maximum speed, use the s5cmd tool. The tool is available as precompiled binaries for Linux and macOS, as well as source code and a Docker image. The choice depends on your system and the intended use. You can find a complete overview in the GitHub project.
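
For example, with the Go toolchain installed, one way to install s5cmd is directly from the module path listed in the project README (verify the path and version against the current repository):

go install github.com/peak/s5cmd/v2@latest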

Add the following lines to .aws/credentials.

[default]
aws_access_key_id = xxxxxxxxxxxxxxxxxxxxxx
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
max_concurrent_requests = 2000
max_queue_size = 20000
multipart_threshold = 128MB
multipart_chunksize = 32MB

You received the Access Key and Secret Key when your S3 account was created.

To list all available buckets, use:

s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz ls

Easy file upload:

s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz cp myfile s3://bucket

To achieve higher speeds for larger volumes of data, you need to adjust the parameters, specifically the number of concurrent parts (-c) and the part size (-p), for example:

s5cmd  --endpoint-url=https://s3.clX.du.cesnet.cz cp -c=8 -p=5000 /directory/big_file s3://bucket
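
Besides the per-command options, s5cmd also accepts a global --numworkers option that controls the size of its worker pool; a combined invocation might look like this (the values are illustrative, tune them to your CPU and network):

s5cmd --numworkers 64 --endpoint-url=https://s3.clX.du.cesnet.cz cp -c=8 -p=5000 /directory/big_file s3://bucket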

VEEAM

Instructions to connect S3 storage via VEEAM software.

Windows clients

There are many options to choose from; here we offer a selection of several tested free S3 clients.

CloudBerry Explorer for Amazon S3

CloudBerry Explorer is an intuitive file explorer that helps you manage your S3 account as if it were just another folder on your local drive. The program has a dual-pane interface and acts like an FTP client, with each pane dedicated to a single folder. These locations are not fixed and can be switched to suit your current task: a local computer and a remote S3 server, two local folders, or even two S3 accounts. Guide to CloudBerry

Advanced S3 features

Sharing S3 objects

S3 Object Versioning

With S3 versioning, you can keep multiple versions of an object in one bucket. This will allow you to recover objects that have been accidentally deleted or overwritten. S3 object versioning instructions
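
Versioning is switched on per bucket at the api level; enabling it might look like this (the profile and bucket names are illustrative):

$ aws s3api --profile pepa_servis put-bucket-versioning --bucket test1 --versioning-configuration Status=Enabled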

Frequently Asked Questions for S3

Multipart

Multipart upload allows you to upload a single object as a set of related parts. After all parts of the object have been uploaded, Ceph presents the data as a single object. With this feature, you can upload parts in parallel, pause and resume the upload of an object, and start an upload before you know the total size of the object.

Individual objects can range in size from 0 B up to typically 5 GB, although this limit may vary depending on the client used. The largest object that can be uploaded in a single PUT operation is typically 5 GB. In general, for objects larger than 100 MB, you should consider using the multipart upload feature.
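
With the AWS CLI, multipart upload is used automatically once a file exceeds the configured threshold; the threshold and part size can be tuned per profile, for example (the values are illustrative):

$ aws configure --profile pepa_servis set s3.multipart_threshold 100MB
$ aws configure --profile pepa_servis set s3.multipart_chunksize 32MB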
