The AWS CLI client is a standard tool for working with the S3 interface. The client is written in Python, so Python 3.3+ must be installed. According to the official documentation, Python 2.6.5+ is also supported; however, you may encounter compatibility issues with the version of Ceph operated at CESNET Data Centers.
First, install the AWS CLI client using pip. The option --upgrade upgrades all already installed components; --user installs into the home directory of the user running pip, so the system libraries are not modified.
$ pip3.6 install awscli --upgrade --user
Verify the installation and check the installed version:
$ aws --version
aws-cli/1.16.239 Python/3.6.3 Linux/3.10.0-957.27.2.el7.x86_64 botocore/1.12.229
To upgrade to the latest version of AWS CLI later, run:
$ pip3.6 install --upgrade --user awscli
Next, configure the AWS CLI. The following examples use the configuration with the option --profile.
$ aws configure --profile pepa_servis
AWS Access Key ID [None]: AKIAI44QH8DHBEXAMPLE
AWS Secret Access Key [None]: je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY
Default region name [None]:
Default output format [None]: text
AWS Access Key ID - the access key you received during the creation of your S3 account
Secret Access Key - the secret key you received during the creation of your S3 account
Default region name - a prefix for the servers to which you will send your requests; leave it empty. If some software (e.g. Veeam) requires a region, fill in storage.
Default output format - the output data format (json, text, table)
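For reference, aws configure stores the entered keys in the file .aws/credentials under the chosen profile name; a sketch of the resulting file, reusing the example values above:
$ cat ~/.aws/credentials
[pepa_servis]
aws_access_key_id = AKIAI44QH8DHBEXAMPLE
aws_secret_access_key = je7MtGbClwBF/2Zp9Utk/h3yCo8nvbEXAMPLEKEY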
The endpoint_url setting used below requires the awscli-plugin-endpoint plugin. First, perform the installation using pip.
pip3.6 install awscli-plugin-endpoint --user
Alternatively, you can install the latest version from the GitHub repository:
pip3.6 install git+https://github.com/wbingli/awscli-plugin-endpoint.git --user
To use the installed plugin, you need to enable it first:
aws configure set plugins.endpoint awscli_plugin_endpoint
Then add the address of the CESNET S3 endpoint for the Pilsen storage with the code designation CL2:
aws configure --profile pepa_servis set s3.endpoint_url https://s3.cl2.du.cesnet.cz
The above commands should insert the proper configuration lines into your config file, see below.
[user@distro ~]$ cat .aws/config
[profile pepa_servis]
output = text
s3 =
    endpoint_url = https://s3.cl2.du.cesnet.cz
[plugins]
endpoint = awscli_plugin_endpoint
Now we can verify the connection, for example by listing the existing buckets.
$ aws s3 --profile pepa_servis ls
2019-09-11 17:06:53 test-win
2019-09-11 14:45:45 large-files
2019-09-11 14:48:21 small-files
To view the full help (available commands), use:
$ aws s3 help
$ aws s3api help
- S3 object versioning instructions here.
- Sharing an S3 object using a (presigned) URL instructions here.
S3cmd is a free command line tool and client for uploading, retrieving and managing data in S3 cloud storage. S3cmd is written in Python. It is an open source project available under the GNU General Public License v2 (GPLv2) and is free for both commercial and private use.
s3cmd is available in the default package repositories for CentOS, RHEL and Ubuntu. Install it by running the appropriate command for your system.
$ sudo yum install s3cmd
$ sudo apt install s3cmd
To configure s3cmd, you need the Access Key and Secret Key that you received from the data center admins. After obtaining the keys, insert them into the configuration file /home/user/.s3cfg.
[default]
host_base = https://s3.cl2.du.cesnet.cz
use_https = True
access_key = xxxxxxxxxxxxxxxxxxxxxx
secret_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
host_bucket = s3.cl2.du.cesnet.cz
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
s3cmd supports common bucket operations, such as creating, listing and deleting buckets, as shown below.
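For illustration, a few common operations; the bucket name my-bucket is only an example:
$ s3cmd mb s3://my-bucket
$ s3cmd ls
$ s3cmd put local_file s3://my-bucket
$ s3cmd get s3://my-bucket/local_file
$ s3cmd rb s3://my-bucket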
If you have a connection faster than 1-2 Gbps and want to optimize transfers for maximum speed, use the s5cmd tool. It is available as precompiled binaries for Linux and macOS, as source code, and as a Docker image; the choice depends on your system and the intended use. You can find a complete overview on the project's GitHub page.
Add the following lines to .aws/credentials.
[default]
aws_access_key_id = xxxxxxxxxxxxxxxxxxxxxx
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
max_concurrent_requests = 2000
max_queue_size = 20000
multipart_threshold = 128MB
multipart_chunksize = 32MB
The Access Key and Secret Key are those you received during the creation of your S3 account.
To list all available buckets, use:
s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz ls
Easy file upload:
s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz cp myfile s3://bucket
To achieve higher speeds for larger volumes of data, you need to adjust the parameters, specifically to use more CPU cores and workers, for example:
s5cmd --endpoint-url=https://s3.clX.du.cesnet.cz cp -c=8 -p=5000 /directory/big_file s3://bucket
Instructions for connecting the S3 storage via Veeam software.
There are more options to choose from; below is a selection of several tested free S3 clients.
CloudBerry Explorer is an intuitive file explorer that helps you manage your S3 account as if it were another folder on your local drive. The program has a dual-pane interface and acts like an FTP client, with each pane dedicated to a single folder. These locations are not fixed and can be switched to suit your current task: a local computer and a remote S3 server, two local folders, or even two S3 accounts. Guide to CloudBerry
With S3 versioning, you can keep multiple versions of an object in one bucket. This will allow you to recover objects that have been accidentally deleted or overwritten. S3 object versioning instructions
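As a sketch, versioning can be switched on per bucket with the AWS CLI configured above; the bucket name my-bucket is illustrative:
$ aws s3api put-bucket-versioning --profile pepa_servis --bucket my-bucket --versioning-configuration Status=Enabled
$ aws s3api get-bucket-versioning --profile pepa_servis --bucket my-bucket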
Multipart upload allows you to upload a single object as a set of related parts. After all parts of the object are uploaded, Ceph presents the data as a single object. With this feature, you can upload parts in parallel, pause and resume the upload of an object, and begin an upload before you know the total size of the object.
Individual objects can range from a minimum of 0 B up to typically 5 GB, although this value may vary depending on the client used. The largest object that can be uploaded in a single PUT operation is typically 5 GB. In general, for objects larger than 100 MB, you should consider using the multipart upload feature.
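The high-level aws s3 commands (e.g. aws s3 cp) switch to multipart upload automatically above the configured multipart_threshold. For illustration, the low-level s3api interface exposes the individual steps; the bucket name, object key, UploadId placeholder, and the parts.json file (listing the ETag and PartNumber of each uploaded part) are illustrative:
$ aws s3api create-multipart-upload --profile pepa_servis --bucket my-bucket --key big_file
$ aws s3api upload-part --profile pepa_servis --bucket my-bucket --key big_file --part-number 1 --body big_file.part1 --upload-id <UploadId>
$ aws s3api complete-multipart-upload --profile pepa_servis --bucket my-bucket --key big_file --upload-id <UploadId> --multipart-upload file://parts.json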