Questions tagged with AWS Data Pipeline

AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals.


86 results
I want to add Confluent Cloud Apache Kafka as a data source in an AWS ETL job to read a data stream from a Kafka topic. I created a cluster, topic, AWS SQS source connector and AWS S3 sink connector in Con...
1 answer · 0 votes · 802 views · asked a year ago
I am using DMS and a Kinesis Data Stream and Delivery Stream to migrate existing data and changes from MySQL to an S3 bucket, but I don't see the data arriving in S3. I specified the schema name and table name...
0 answers · 0 votes · 203 views · asked a year ago
Every day a new EMR cluster is spun up and terminated after completing its step job. Checking CloudTrail, it seems a Data Pipeline created it. I am not sure how to get more details, like who created it, what...
2 answers · 1 vote · 465 views · asked a year ago
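One way to start investigating a question like this is to query CloudTrail for `RunJobFlow` events, the EMR API call that launches a cluster. A minimal sketch with boto3 (the helper name is hypothetical; it assumes CloudTrail is enabled and credentials are configured):

```python
# Hypothetical helper: list recent CloudTrail "RunJobFlow" events to see
# which principal (e.g. a Data Pipeline service role) launched each EMR
# cluster, and when.
def find_emr_cluster_creators(cloudtrail_client, max_results=50):
    response = cloudtrail_client.lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventName", "AttributeValue": "RunJobFlow"}
        ],
        MaxResults=max_results,
    )
    # Each event record carries the event time and the calling identity.
    return [(e["EventTime"], e.get("Username")) for e in response["Events"]]

# Usage (requires AWS credentials):
# import boto3
# for when, who in find_emr_cluster_creators(boto3.client("cloudtrail")):
#     print(when, who)
```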
I have multiple CSVs about a single patient and I would like to know how to combine them, because the columns across the CSVs together make up all the information for one patient. The CSVs are...
1 answer · 0 votes · 712 views · asked 2 years ago
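For merging per-patient CSVs like this, here is a minimal stdlib sketch. It assumes each file shares a `patient_id` key column, which is a guess — the original question doesn't name the join key:

```python
import csv

def combine_patient_csvs(paths, key="patient_id"):
    """Merge rows from several CSV files into one record per patient.

    Each file may contribute different columns; rows are joined on the
    shared key column ("patient_id" here is an assumed name). A column
    appearing in more than one file is overwritten by the later file.
    """
    records = {}  # key value -> merged row dict
    for path in paths:
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                records.setdefault(row[key], {}).update(row)
    return list(records.values())
```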
I am getting the following error in my data pipeline: `We currently do not have sufficient t1.micro capacity in the Availability Zone you requested (ap-southeast-2a). Our system will be working on provisi...
2 answers · 0 votes · 405 views · asked 2 years ago
I'm trying to delete all the data pipelines that show up in ``` aws datapipeline list-pipelines ``` in one go. How do I do that using the AWS CLI?
1 answer · 0 votes · 496 views · asked 2 years ago
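One possible approach to the bulk-deletion question above, sketched with boto3 rather than the raw CLI (untested against a live account; assumes credentials with `datapipeline:DeletePipeline` permission):

```python
# Hypothetical sketch: delete every AWS Data Pipeline visible to the
# caller, paging through list_pipelines and deleting each id in turn.
def delete_all_pipelines(client):
    deleted = []
    marker = None
    while True:
        kwargs = {"marker": marker} if marker else {}
        page = client.list_pipelines(**kwargs)
        for pipeline in page.get("pipelineIdList", []):
            client.delete_pipeline(pipelineId=pipeline["id"])
            deleted.append(pipeline["id"])
        if not page.get("hasMoreResults"):
            return deleted  # ids that were deleted, in order
        marker = page.get("marker")

# Usage (requires AWS credentials):
# import boto3
# delete_all_pipelines(boto3.client("datapipeline"))
```

Deletion is irreversible, so printing the id list before calling `delete_pipeline` is a sensible safeguard.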
I want to get a notification through EventBridge only when a Data Pipeline fails. I used the following event pattern, but it didn't work. It can be done through the Data Pipeline SNS s...
2 answers · 0 votes · 575 views · asked 2 years ago
We have one API Gateway that receives gzipped meter data 24x7, and the data comes in concurrently (sometimes 5,000 POSTs per second, sometimes not much). We are sure the compressed data won't excc...
1 answer · 1 vote · 545 views · asked 2 years ago
I'm trying to set up an AWS Data Pipeline so I can clone large Hugging Face repos to S3. I'm encountering issues when creating the permissions policy to use with a role for my data pipeline. [I'm a...
2 answers · 0 votes · 481 views · asked 2 years ago
Hello. I'm running SageMaker training jobs through a library called ZenML. The library is just there as an abstraction layer, so that the artifacts I return get automatically saved to S3. The li...
0 answers · 0 votes · 159 views · asked 2 years ago
Hi, I'm running a data pipeline from a legacy DB (Oracle) to Redshift using AWS Glue. I want to test the connection to the legacy DB before executing the ETL, without a test query in the working Python script. as-...
1 answer · 0 votes · 394 views · asked 2 years ago
Hi Team, I have set up an **AWS DataPipeline** to run my EMR jobs on `On-Demand` instances. However, I now want to switch to using `Spot` Instances to reduce costs. I have configured the `spotBidPric...
2 answers · 0 votes · 460 views · asked 2 years ago