Migrating a huge amount of data using S3 Batch Replication


I want to migrate 200 TB of data from an S3 bucket in account A to an S3 bucket in account B. Both buckets are in the same region (Tokyo). I am planning to use an S3 Batch Replication job for this. Four replication rules with different paths will be configured on the source bucket.
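For context, four prefix-scoped rules like the ones described could be sketched as a boto3-style replication configuration. All bucket names, account IDs, the IAM role ARN, and prefixes below are placeholders, not the actual values:

```python
# Sketch of a replication configuration with four prefix-scoped rules.
# Bucket names, account IDs, role ARN, and prefixes are placeholders.
folders = [
    ("folder-a", "data/team1/FolderA/"),
    ("folder-b", "data/team1/FolderB/"),
    ("folder-c", "data/team2/FolderC/"),
    ("folder-d", "data/team2/FolderD/"),
]

replication_config = {
    "Role": "arn:aws:iam::111111111111:role/s3-replication-role",
    "Rules": [
        {
            "ID": f"replicate-{name}",
            "Priority": priority,          # priorities must be unique per rule
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},  # scopes the rule to one folder
            "DeleteMarkerReplication": {"Status": "Disabled"},
            "Destination": {
                "Bucket": "arn:aws:s3:::account-b-destination-bucket",
                "Account": "222222222222",  # cross-account destination owner
                "AccessControlTranslation": {"Owner": "Destination"},
            },
        }
        for priority, (name, prefix) in enumerate(folders, start=1)
    ],
}

# With boto3 this would be applied via (not run here):
# boto3.client("s3").put_bucket_replication(
#     Bucket="account-a-source-bucket",
#     ReplicationConfiguration=replication_config,
# )
```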

My questions are:

  1. Is it possible to use S3 Batch Replication for the migration?
  2. If I create one replication job to replicate the existing data, will all the replication configurations be used? If so, how?
  3. How secure will the replication of data be?
  4. What will be the bandwidth/throughput of the data transfer?
  5. How will the charges be generated?
Irfan
asked 10 months ago · 503 views
2 Answers
Accepted Answer
  1. Is it possible to use S3 Batch Replication for the migration?

    Yes, you can use S3 Batch Replication for this amount of data.

  2. If I create one replication job to replicate the existing data, will all the replication configurations be used? If so, how?

    It's not clear what you mean by "all the replication configuration".

  3. How secure will the replication of data be?

    Data in transit is encrypted; encryption at rest falls under your responsibility.

  4. What will be the bandwidth/throughput of the data transfer?

    Your network bandwidth won't come into the picture, since the transfer happens entirely within AWS. Most objects replicate within 15 minutes, but replication can sometimes take a couple of hours or more. Refer to this re:Post.

  5. How will the charges be generated?

    Refer to the S3 Replication pricing page.

Additionally, I'd suggest going through this storage blog, which covers all the copy options and their pros/cons.

Batch Replication Document

AWS
EXPERT
answered 10 months ago
    1. The replication rules are configured in the source bucket as below:

       1. path = xxxxx/xxxxxxx/Folder A
       2. path = xxxxx/xxxxxxx/Folder B
       3. path = xxxxx/xxxxx/Folder C
       4. path = xxxxx/xxxxx/Folder D

       If I create a batch replication job, will the existing data in all 4 folders be replicated to the destination bucket with the same key/path?

    2. The service quota for replication transfer rate in the source account is 1 Gbps. Does it have any relation to the transfer rate?
    1. Yes, the key path in the target bucket would remain the same as in the source.
    2. Yes, it is related to the data transfer rate, but to get that quota increased you would need to contact AWS Support, and you may be asked why it is required, what the use case is, etc. Refer to this RTC document.

    Lastly, just FYI: the replication rule basically creates the manifest; it's the replication job that copies the data. Since you are trying to do this for existing data, the first replication rule will ask whether you want to apply it to existing objects, and if you choose yes, it will create a batch job to replicate the data for the prefix given in that first rule. The catch is that after the first rule, you won't be asked again whether you want to include existing data. So I'd suggest creating a manifest for each location first and then creating a replication job using those manifest files.
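The per-location manifest approach can be sketched as follows: an S3 Batch Operations CSV manifest is a headerless file with one `bucket,key` pair per line, so you can build one manifest per prefix. Bucket and key names below are placeholders:

```python
import csv
import io

def build_manifest(bucket: str, keys: list[str]) -> str:
    """Build a CSV manifest (bucket,key per line) for an S3 Batch Operations job.

    The manifest has no header row; keys containing special characters
    should be URL-encoded (omitted here for brevity).
    """
    buf = io.StringIO()
    writer = csv.writer(buf)
    for key in keys:
        writer.writerow([bucket, key])
    return buf.getvalue()

# Example: one manifest per prefix, as suggested above (placeholder names).
manifest = build_manifest(
    "account-a-source-bucket",
    [
        "data/team1/FolderA/part-0001.parquet",
        "data/team1/FolderA/part-0002.parquet",
    ],
)
print(manifest)
```

The resulting file would then be uploaded to S3 and referenced when creating the batch job.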

    Hope this helps. If you have any additional questions, feel free to post them here; otherwise, please click "Accept Answer" and upvote it.


If you have not seen this before, I encourage you to look at the best practices for Batch Replication: https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-batch-replication-batch.html To answer your questions:

1- Yes, it is the method proposed by AWS to replicate existing objects.

2- Can you rephrase the question and elaborate?

3- All replication data in transit is encrypted between AWS services. You need to ensure encryption of files at rest by choosing the right encryption option.

4- As both source and destination are in the same region, this replication will occur on the AWS internal network. To observe the throughput, I'd propose running a test, as the network is just one element in the equation and you need to consider the end-to-end transfer.

5- The cost elements will include the number of GET requests at the source and the number of PUT requests at the destination (so you need to know the number of objects). As you are in the same region, there is no network charge for the transfer. For more details: https://aws.amazon.com/s3/pricing/
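As a rough sketch of how those elements add up (the per-request and per-job rates below are illustrative placeholders, not current AWS prices; check the pricing page for your region):

```python
# Rough cost sketch for same-region batch replication of existing objects.
# All rates are assumed placeholder values for illustration only.
GET_PER_1000 = 0.0004             # assumed GET request rate (USD per 1,000)
PUT_PER_1000 = 0.005              # assumed PUT request rate (USD per 1,000)
BATCH_PER_JOB = 0.25              # assumed S3 Batch Operations per-job charge (USD)
BATCH_PER_MILLION_OBJECTS = 1.00  # assumed per-million-objects charge (USD)

def estimate_request_cost(num_objects: int, num_jobs: int = 1) -> float:
    """Estimate request + batch charges; excludes storage and any KMS costs."""
    gets = num_objects / 1000 * GET_PER_1000
    puts = num_objects / 1000 * PUT_PER_1000
    batch = (num_jobs * BATCH_PER_JOB
             + num_objects / 1_000_000 * BATCH_PER_MILLION_OBJECTS)
    return round(gets + puts + batch, 2)

# Example: 10 million objects replicated in one batch job.
print(estimate_request_cost(10_000_000))
```

Note that the dominant variable is the object count, not the total size, which is why the answer stresses knowing the number of objects.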

AWS
answered 10 months ago
  • Are the current replication rules pointing to the same destination bucket that you will use for the migration? In that case, what is the use case for replicating objects "again" to the same bucket? In a general sense, batch replication replicates all existing objects.
