
Connection and configuration of CESNET S3 service

The CESNET S3 service uses the naming convention domain.cz/tenant:bucket. The tenant is the short name of the VO, and the domain is s3.clX.du.cesnet.cz. This convention differs from AWS S3, which uses the format “bucket.domain.com”. If you do not specify the tenant explicitly, it should be recognized automatically from the access key and secret key. It should therefore be sufficient to use the following format: s3.clX.du.cesnet.cz/bucket
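The difference between the two conventions can be illustrated with a minimal sketch. The helper names `path_style_url` and `virtual_hosted_url`, the bucket `my-bucket`, and the tenant `myvo` are hypothetical and serve only as an illustration; this is not part of any CESNET tooling.

```python
from urllib.parse import quote


def path_style_url(endpoint, bucket, key="", tenant=None):
    """Build a URL in the domain.cz/tenant:bucket form described above.

    The tenant is optional, because the service should recognize it
    from the access key and secret key.
    """
    container = f"{tenant}:{bucket}" if tenant else bucket
    url = f"{endpoint.rstrip('/')}/{quote(container, safe=':')}"
    if key:
        url += "/" + quote(key, safe="/")
    return url


def virtual_hosted_url(domain, bucket, key=""):
    """Build a URL in the AWS-like bucket.domain.com form, for comparison."""
    url = f"https://{bucket}.{domain}"
    if key:
        url += "/" + quote(key, safe="/")
    return url


if __name__ == "__main__":
    # Hypothetical bucket and tenant names.
    print(path_style_url("https://s3.cl2.du.cesnet.cz", "my-bucket"))
    print(path_style_url("https://s3.cl2.du.cesnet.cz", "my-bucket", tenant="myvo"))
    print(virtual_hosted_url("domain.com", "bucket"))
```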

If your client treats the endpoint as native AWS, you have to switch it to an S3-compatible endpoint. Most clients can process both formats automatically. However, in some cases it is necessary to specify the format explicitly.

The guides refer to the access_key, the secret_key, and the S3 endpoint URL. You should obtain all of this information when your S3 account is created. The S3 endpoint URL has the format “https://s3.clX.du.cesnet.cz”, where clX is the name of the S3 data storage:
 cl1 - https://s3.cl1.du.cesnet.cz 
 cl2 - https://s3.cl2.du.cesnet.cz 
 cl3 - https://s3.cl3.du.cesnet.cz 
 cl4 - https://s3.cl4.du.cesnet.cz 
 cl5 - https://s3.cl5.du.cesnet.cz 
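As an illustration of how the three values fit together, the following sketch creates a client with the Python library boto3. Using boto3 is an assumption of this example (the guides linked on this page cover other clients), and the helper names `endpoint_for` and `make_client` are hypothetical. The `addressing_style` setting is one way to tell a client explicitly to use the domain/bucket format instead of the AWS bucket.domain format.

```python
CLUSTERS = ("cl1", "cl2", "cl3", "cl4", "cl5")


def endpoint_for(cluster):
    """Return the S3 endpoint URL for one of the data storages listed above."""
    if cluster not in CLUSTERS:
        raise ValueError(f"unknown data storage: {cluster!r}")
    return f"https://s3.{cluster}.du.cesnet.cz"


def make_client(cluster, access_key, secret_key):
    """Create a boto3 S3 client for the given data storage.

    boto3 is imported lazily, so endpoint_for() stays usable without it.
    """
    import boto3
    from botocore.config import Config

    return boto3.client(
        "s3",
        endpoint_url=endpoint_for(cluster),
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        # Request URLs of the form endpoint/bucket instead of bucket.endpoint.
        config=Config(s3={"addressing_style": "path"}),
    )
```

A call such as `make_client("cl2", access_key, secret_key)` would use the keys obtained during account creation; the cluster name here is only an example.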

Personal S3 account

A personal S3 account is an elementary S3 service. It is suited for personal working or research data that you do not need to share with other users or groups. To obtain a personal S3 account just follow this guide.

Linux clients

AWS-CLI

AWS CLI (Amazon Web Services Command Line Interface) is a standardized tool that supports the S3 interface. With this tool you can handle your data and set up your S3 data storage. You can use it interactively from the command line or incorporate it into your automated scripts. Tutorial for AWS CLI
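For use in automated scripts, a minimal sketch of a Python wrapper around the `aws` command might look as follows. It assumes that AWS CLI is installed and that a profile with your keys already exists; the function names, the profile name `cesnet`, and the bucket `my-bucket` are hypothetical. The options `--endpoint-url` and `--profile` are standard AWS CLI global options.

```python
import subprocess


def aws_s3_command(endpoint_url, *args, profile=None):
    """Build the argument list for an `aws s3 ...` call against a custom endpoint."""
    cmd = ["aws", "--endpoint-url", endpoint_url]
    if profile:
        cmd += ["--profile", profile]
    cmd += ["s3", *args]
    return cmd


def run_aws_s3(endpoint_url, *args, profile=None):
    """Run the command and return its standard output; raises on a non-zero exit code."""
    cmd = aws_s3_command(endpoint_url, *args, profile=profile)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command line; nothing is executed here.
    print(aws_s3_command("https://s3.cl2.du.cesnet.cz", "ls", "s3://my-bucket", profile="cesnet"))
```

Passing the arguments as a list avoids shell quoting problems with bucket or file names that contain spaces.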


s3cmd

S3cmd is a free command-line tool for uploading and downloading your data. You can also control the setup of your S3 storage with this tool. S3cmd is written in Python. It is an open-source project available under the GNU Public License v2 (GPLv2) for both personal and commercial use. s3cmd guide.


Rclone - data synchronization

Rclone is suitable for data synchronization and data migration between multiple endpoints (even between different data storage providers). Rclone preserves timestamps and verifies checksums. It is written in Go and is available for multiple platforms (GNU/Linux, Windows, macOS, BSD and Solaris). In the following guide, we demonstrate its usage on Linux and Windows systems. Rclone guide.

s5cmd for very fast transfers

If you have a connection of 1-2 Gbps and wish to optimize the transfer throughput, you can use the s5cmd tool. S5cmd is available as precompiled binaries for Windows, Linux and macOS, and also as source code or Docker images. The best option depends on the system where you wish to use s5cmd. A complete overview can be found at the GitHub project. The guide for s5cmd can be found here.

VEEAM

VEEAM is a tool for backup, recovery, replication, etc.

The guide to connect the S3 storage via VEEAM

Windows clients

Below are listed several Windows clients with corresponding guides.

WinSCP

WinSCP is a popular SFTP and FTP client for Microsoft Windows. It transfers files between your local computer and remote servers using the FTP, FTPS, SCP, SFTP, WebDAV or S3 file transfer protocols. The guide for WinSCP

CloudBerry Explorer for Amazon S3

CloudBerry Explorer is an intuitive file browser for your S3 storage. It has two panes: one shows the local disk and the other shows the remote S3 storage. You can drag and drop files between them. The guide for CloudBerry Explorer.

S3 Browser

S3 Browser is a freeware tool for Windows to manage your S3 storage, upload and download data. The Guide for S3 Browser.

Cyberduck

Cyberduck is a multifunctional tool for various types of data storage (FTP, SFTP, WebDAV, OpenStack, OneDrive, Google Drive, Dropbox, etc.). Cyberduck provides only elementary functionality; most of the advanced functions are paid. Guide for Cyberduck

Mountain Duck

Mountain Duck is an extension of Cyberduck that mounts remote storage as a drive in your file browser, so you can work with your remote data as if it were on your local disk. Mountain Duck can connect cloud data storage as a disk in Finder (macOS) or in the file browser (Windows).

Advanced S3 functionalities

Sharing S3 objects

S3 objects versioning

In your S3 storage, you can keep multiple versions of your objects in one bucket. This allows you to recover objects that have been removed or overwritten. The guide for S3 object versioning
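As a rough illustration, versioning can be switched on and inspected through the standard S3 API. The sketch below assumes a boto3 S3 client object is passed in as `client`; boto3 itself is an assumption of this example, and the helper names are hypothetical. The calls `put_bucket_versioning` and `list_object_versions` are standard boto3 S3 client methods.

```python
def enable_versioning(client, bucket):
    """Turn on object versioning for the given bucket."""
    client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def list_versions(client, bucket, key):
    """Return (version_id, is_latest) pairs for one object.

    Only the first page of results is read; a bucket with many versions
    would need pagination.
    """
    response = client.list_object_versions(Bucket=bucket, Prefix=key)
    return [
        (version["VersionId"], version["IsLatest"])
        for version in response.get("Versions", [])
        if version["Key"] == key  # Prefix also matches longer keys, so filter exactly
    ]
```

An older version listed this way can then be downloaded by passing its version ID to the client, which is how a removed or overwritten object is recovered.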

Sharing the objects within tenant

Frequently asked questions

Multipart upload

Problem description: I cannot upload a file exceeding 5 GB. I get the error “Your proposed upload exceeds the maximum allowed object size”.

Problem solution: You have to use a so-called multipart upload. A multipart upload lets you upload one object as a set of smaller parts. Once the upload is done, the parts are again represented as one object. Multipart upload can also improve the throughput of your upload. An object uploaded in a single request should not exceed 5 GB. In most clients, you can set the threshold above which multipart upload is used; a typical value is 100 MB.
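The arithmetic behind multipart upload can be sketched as follows. The function names are hypothetical, and the constants only restate the 5 GB limit and the 100 MB threshold quoted above, interpreted here in binary units, which is an assumption. Real clients do this splitting internally once the threshold is configured.

```python
MB = 1024 ** 2
GB = 1024 ** 3

MAX_SINGLE_UPLOAD = 5 * GB    # single-request limit quoted above (binary units assumed)
DEFAULT_THRESHOLD = 100 * MB  # typical multipart threshold quoted above


def needs_multipart(size_bytes, threshold=DEFAULT_THRESHOLD):
    """Return True if an object of this size would be uploaded in parts."""
    return size_bytes > threshold


def plan_parts(size_bytes, part_size=DEFAULT_THRESHOLD):
    """Split an object into (offset, length) pairs, one per part."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    if part_size > MAX_SINGLE_UPLOAD:
        raise ValueError("a single part must not exceed the single-request limit")
    return [
        (offset, min(part_size, size_bytes - offset))
        for offset in range(0, size_bytes, part_size)
    ]


if __name__ == "__main__":
    parts = plan_parts(6 * GB)
    print(f"a 6 GB file is split into {len(parts)} parts")
```

With these values, a file above 5 GB is sent as many parts that each stay far below the limit, which is why the upload no longer fails.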

Last modified: 27.03.2024 19:42