How to Transfer Data from Yandex Object Storage to Amazon S3 Using rclone

This article walks through the process of transferring data from Yandex Object Storage to Amazon S3 using the rclone utility. I will show how to install rclone, how to configure the parameters used to access both storages, and which commands can be used to synchronize files and check their integrity.

By:

Igor Remsha


Creating API Keys and Obtaining Storage Parameters

Before we start working with rclone, we need to create or receive access keys and acquire additional information (name, region, endpoint, etc.) about the storage.

Generating an API Key for Yandex Object Storage

If you do not yet have a Yandex Cloud API key with S3 management permissions, generate one now. Select the needed cloud in the Yandex Cloud management console:

Choose the cloud

Select the Service accounts section and click Create service account:

Creating service account

Come up with a name and pick the storage.editor role:

Picking up the name and role

Then click Create new key and select Create static access key:

Creating key

On the final page, don’t forget to save the Access key ID and the Secret access key. We will need them later to configure rclone.

New key
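If you prefer the command line, the same static access key can be created with the yc CLI. A sketch, assuming the yc CLI is installed and the service account from above is named rclone-sa (a placeholder):

```shell
# Create a static access key for the service account;
# the output contains the access key ID and the secret (shown only once).
yc iam access-key create --service-account-name rclone-sa
```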

Generating an API Key for Amazon S3

If you do not yet have an AWS access key with S3 permissions, generate one now. Select My Security Credentials in the AWS management console:

Setting the account

Then select Users and click on Add user:

Add user

Fill in the Username and choose Programmatic access:

Getting the access

Select Attach existing policies directly and pick a policy that allows writing to the destination bucket, such as AmazonS3FullAccess (rclone needs write access to copy objects into S3), which can be found by entering s3full in the search section:

Attaching existing policies

Click on Create user:

Creating user

On the final page, save the Access key ID and the Secret access key. They will be needed later to configure rclone. You can save this data by clicking Download .csv:

Saving the access key and key ID

Amazon S3 Storage Location Region

Now we need to find out in which region our buckets are located. To do this, select S3 in Services:

Setting the region

This way we get a list of existing buckets with their regions listed as well.

To map a region name to its code, we will use the AWS service endpoints reference. For example, the Europe (London) region has the code eu-west-2, which we will be using further on.
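If you have the AWS CLI installed and configured, a bucket's region can also be queried directly. A sketch, where some-bucket is a placeholder name; note that a null LocationConstraint in the output means us-east-1:

```shell
# Query the region (LocationConstraint) of an existing bucket
aws s3api get-bucket-location --bucket some-bucket
```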

Installing rclone on a Local Machine

The installation will be performed on a Linux system.

Install from apt

sudo apt update
sudo apt install rclone

OR

Fetch and unpack

curl -O https://downloads.rclone.org/rclone-current-linux-amd64.zip
unzip rclone-current-linux-amd64.zip
cd rclone-*-linux-amd64

Copy binary file

sudo cp rclone /usr/bin/
sudo chown root:root /usr/bin/rclone
sudo chmod 755 /usr/bin/rclone

Install manpage

sudo mkdir -p /usr/local/share/man/man1
sudo cp rclone.1 /usr/local/share/man/man1/
sudo mandb

Run rclone config

rclone config

Object Storage and S3 Accounts Configuration

There are several ways of working with rclone configurations:

Setting the configuration file by yourself

mkdir -p ~/.config/rclone
nano ~/.config/rclone/rclone.conf

 

[s3]
type = s3
env_auth = false
access_key_id = aws_access_key
secret_access_key = aws_secret_key
region = aws_region
location_constraint = aws_location_constraint
acl = public-read
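The file above only defines the Amazon remote; the Yandex side needs a similar section. A sketch, using the standard Yandex Object Storage endpoint and region (verify them for your setup), with placeholder credentials:

```
[yandex-object-storage]
type = s3
provider = Other
env_auth = false
access_key_id = yandex_access_key
secret_access_key = yandex_secret_key
region = ru-central1
endpoint = storage.yandexcloud.net
```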

Setting the configuration file via the config command

rclone config
No remotes found - make a new one
n) New remote
s) Set configuration password
q) Quit config
n/s/q> n
name> amazon-s3
Type of storage to configure.
Choose a number from below, or type in your own value
[snip]
XX / Amazon S3 Compliant Storage Providers including AWS, Ceph, Dreamhost, IBM COS, Minio, and Tencent COS
   \ "s3"
[snip]
Storage> s3
Choose your S3 provider.
Choose a number from below, or type in your own value
 1 / Amazon Web Services (AWS) S3
   \ "AWS"
 2 / Ceph Object Storage
   \ "Ceph"
 3 / Digital Ocean Spaces
   \ "DigitalOcean"
provider> 1
Get AWS credentials from runtime (environment variables or EC2/ECS meta data if no env vars). Only applies if access_key_id and secret_access_key is blank.
Choose a number from below, or type in your own value
 1 / Enter AWS credentials in the next step
   \ "false"
 2 / Get AWS credentials from the environment (env vars or IAM)
   \ "true"
env_auth> 1
AWS Access Key ID - leave blank for anonymous access or runtime credentials.
access_key_id> XXX
AWS Secret Access Key (password) - leave blank for anonymous access or runtime credentials.
secret_access_key> YYY

We will specify here the previously obtained access_key_id and secret_access_key.

Region to connect to.
...

Let's move on to the final result (a full list of all configuration options can be found at https://rclone.org/s3/#amazon-s3):

Remote config
--------------------
[amazon-s3]
type = s3
provider = AWS
env_auth = false
access_key_id = XXX
secret_access_key = YYY
region = eu-west-2
endpoint = 
location_constraint = 
acl = public-read
--------------------
y) Yes this is OK
e) Edit this remote
d) Delete this remote
y/e/d>

Copying Objects from Object Storage to S3

Now we are ready to start transferring our files. Let's start by checking remote connections:

rclone listremotes
Output
amazon-s3:
yandex-object-storage:

We can list the buckets available to us:

rclone lsd amazon-s3:
Output          
-1 2021-07-10 11:42:21        -1 some-bucket

The file structure inside the bucket can also be accessed:

rclone tree amazon-s3:some-bucket
Output
/
├── README.txt
├── src
│   ├── media.pdf
│   └── textfile.txt
└── config
    ├── rust.ppt
    ├── Launcher.jar

2 directories, 5 files

When you are ready to copy files from Yandex Object Storage to Amazon S3, run the following command:

rclone sync yandex-object-storage:some-bucket amazon-s3:some-bucket
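Keep in mind that sync makes the destination match the source, including deleting destination files that no longer exist in the source. To preview what would change without transferring anything, a dry run can be done first (same remote and bucket names as above):

```shell
# Show what sync would copy or delete, without actually doing it
rclone sync --dry-run yandex-object-storage:some-bucket amazon-s3:some-bucket
```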

For greater reliability, use the check command to compare the objects in both storages:

rclone check yandex-object-storage:some-bucket amazon-s3:some-bucket
Output
2021/07/11 14:51:36 NOTICE: S3 bucket some-bucket: 0 differences found
2021/07/11 14:51:36 NOTICE: S3 bucket some-bucket: 1 hashes could not be checked

This will compare the hash value of each object in both storages. However, some objects cannot be compared this way. In this case, you can re-run the command with the --size-only flag (the comparison will be made based on file sizes) or the --download flag (which downloads each object from both storages for local comparison) to check the integrity of the transfer.
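For intuition, here is a minimal Python sketch of what hash-based comparison amounts to. This is purely illustrative: rclone itself asks each backend for its stored MD5/ETag rather than downloading and hashing objects:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest; for simple (non-multipart) uploads,
    S3 exposes this value as the object's ETag."""
    return hashlib.md5(data).hexdigest()

# Two "objects", one per storage: identical content yields identical hashes.
source_obj = b"hello from yandex object storage"
dest_obj = b"hello from yandex object storage"

print("identical" if md5_hex(source_obj) == md5_hex(dest_obj) else "different")
```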

Conclusion

In this guide, we analysed the process of transferring objects from Yandex Object Storage to Amazon S3. We created API credentials for both services, installed and configured the rclone utility on our local machine, and then copied all the objects from the Object Storage bucket to S3.

The rclone client can be used for many other storage management tasks, including uploading and downloading files, mounting buckets in the local file system, and creating or deleting buckets. Check out the man page (man rclone) to learn more about the features the tool provides.
