DataSync with dynamic source location


I am looking at possible use cases for DataSync in my organization, but I have a question about an edge case. Is it possible to configure a dynamic source location and still get the benefits of parallel processing? In other words, the workflow would be:

  1. A user drops a file on a given file server, but in a path/folder that is unknown to DataSync.
  2. DataSync notices that file and triggers a task to start the transfer.
  3. Other files might have been modified at the same source location, but I only want to trigger the transfer of the one file mentioned in step 1.
  4. Multiple agents would theoretically be possible, but that would mean hundreds of agents, each pointed at one of the possible file prefixes, which is not sustainable.

Anyone have any ideas?

Keith P
asked 3 months ago · 172 views
1 Answer

Yes, it is possible to have a dynamic source location with DataSync and still get parallel transfer benefits, though it requires a bit of additional orchestration. Here is one way to accomplish it:

  • Set up a base DataSync task that syncs a root folder, but exclude all sub-folders. This will act as the orchestrator task.

  • When a new file is dropped in a new location, use SNS/SQS, EventBridge, or a Lambda function to detect the change.

  • Trigger a Lambda function that will:

    • Create a new DataSync task to sync just the new sub-folder containing the dropped file.

    • Start an execution of the new sub-folder DataSync task.

    • Add the new task to a DynamoDB table to keep track of active tasks.

  • The base DataSync task remains running to detect any future sub-folder additions.

  • Once a sub-folder task has finished transferring its files, delete its entry from the DynamoDB table.
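The Lambda steps above (create a task, start it, record it) could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function names, the `sync-` naming scheme, the table layout (keyed on `task_arn`), and the exact filter pattern are hypothetical, and the clients are passed in so the logic can be exercised without AWS access.

```python
import re


def subfolder_filter(folder: str) -> list:
    """Build a DataSync include filter for one sub-folder.

    Assumption: this pattern is illustrative only; check the DataSync
    filtering documentation for the matching rules you actually need.
    """
    return [{"FilterType": "SIMPLE_PATTERN", "Value": "/" + folder.strip("/")}]


def start_subfolder_sync(datasync, dynamodb, *, source_arn, dest_arn,
                         folder, table_name):
    """Create a task for one sub-folder, start it, and record it as active."""
    # Keep the task name to a conservative character set (hypothetical scheme).
    name = "sync-" + re.sub(r"[^A-Za-z0-9_-]", "-", folder.strip("/"))

    # 1. Create a task scoped to the new sub-folder.
    task_arn = datasync.create_task(
        SourceLocationArn=source_arn,
        DestinationLocationArn=dest_arn,
        Name=name,
        Includes=subfolder_filter(folder),
    )["TaskArn"]

    # 2. Start an execution of that task.
    execution_arn = datasync.start_task_execution(
        TaskArn=task_arn
    )["TaskExecutionArn"]

    # 3. Track the task in DynamoDB (hypothetical table keyed on "task_arn").
    dynamodb.put_item(
        TableName=table_name,
        Item={
            "task_arn": {"S": task_arn},
            "execution_arn": {"S": execution_arn},
            "folder": {"S": folder},
        },
    )
    return task_arn, execution_arn
```

Inside a Lambda handler, `datasync` and `dynamodb` would typically be `boto3.client("datasync")` and `boto3.client("dynamodb")`, and `folder` would be derived from whatever event reports the new file.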

This allows you to dynamically spin up folder-specific DataSync tasks as needed while still getting the benefits of parallelism across all active tasks and agents. The base task acts as the orchestrator.
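To keep the list of active tasks accurate, something has to remove finished tasks from the tracking table. A minimal sketch of that cleanup is below. The function name and the table key (`task_arn`) are hypothetical, and deleting the DataSync task itself (not only the DynamoDB record) is an assumption about what you would want; the clients are passed in as parameters.

```python
def cleanup_finished_task(datasync, dynamodb, *, task_arn, execution_arn,
                          table_name):
    """Remove a sub-folder task and its tracking record after a successful run.

    Returns True if cleanup happened. Returns False if the execution is
    still in progress or ended in an error, in which case the record is
    left in place for inspection.
    """
    status = datasync.describe_task_execution(
        TaskExecutionArn=execution_arn
    )["Status"]
    if status != "SUCCESS":
        return False

    # Assumption: the per-folder task is disposable once it has succeeded.
    datasync.delete_task(TaskArn=task_arn)

    # Drop the record from the (hypothetical) tracking table.
    dynamodb.delete_item(
        TableName=table_name,
        Key={"task_arn": {"S": task_arn}},
    )
    return True
```

This could run on a schedule, or in response to a task-execution state change event, for each record still present in the table.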

Let me know if you have any other questions!

Can EC2 run on Debian 12 yet? If not, what are the alternatives?

I'm looking to spin up some EC2 instances running Debian 12 for testing and development purposes. However, I noticed that the AWS Marketplace doesn't seem to have an official Debian 12 AMI yet.

My questions:

  1. Is it possible to run Debian 12 on EC2 currently?

  2. If not, what would be the closest alternative in terms of having a similar experience to Debian 12?

  3. Is there any expected timeline for when Debian 12 will be available as an option on EC2?

  4. Are there any major technical limitations or incompatibilities that need to be overcome to get Debian 12 running well on EC2?

Any insight you can provide on the state of Debian 12 support on EC2 would be super helpful! I want to make sure I choose the right OS for my needs but also leverage Debian 12 if it's viable.

AWS
Saad
answered 3 months ago
