AWS Glue Connection with Customer managed Apache Kafka Data Source throws Status Code: 400; Error Code: InvalidInputException


I want to add Confluent Cloud Apache Kafka as a data source in an AWS Glue ETL job to read a data stream from a Kafka topic.

I created a cluster, a topic, an AWS SQS source connector, and an AWS S3 sink connector in the Confluent Cloud Kafka console. I was able to post messages from AWS SQS to a Confluent Kafka topic and export data from Confluent Kafka to an AWS S3 bucket, so the Confluent Cloud Kafka integration with AWS SQS and S3 is working.

Now I want to stream the data from the Confluent Kafka cluster to AWS Glue for ETL transformation and save the output to a target S3 bucket.

I created a data connection in AWS Glue. I chose the connection type "Kafka" and the "Customer managed Apache Kafka" option. Under Kafka bootstrap URLs, I provided the bootstrap server shown under cluster settings in the Confluent portal (host:9092). Since this is for testing purposes, I unselected "Require SSL", set Authentication to None, and did not provide any values under Network Options.
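For reference, the same connection can also be defined programmatically. Below is a sketch of the `ConnectionInput` payload that boto3's `glue_client.create_connection` expects; the connection name and bootstrap host are placeholders, and the boto3 call itself is only shown in a comment:

```python
# Sketch of the Glue connection definition, as it would be passed to
# boto3: glue_client.create_connection(ConnectionInput=connection_input)
# The connection name and bootstrap host below are placeholders.
connection_input = {
    "Name": "confluent-kafka-conn",
    "ConnectionType": "KAFKA",
    "ConnectionProperties": {
        # Bootstrap endpoint copied from the Confluent cluster settings page
        "KAFKA_BOOTSTRAP_SERVERS": "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092",
        # Mirrors the console choices in this question: no SSL, no auth
        "KAFKA_SSL_ENABLED": "false",
    },
    # Left empty here, mirroring the "no Network Options" setup above;
    # a Glue job may still need a valid VPC/subnet/SG to reach the broker.
    "PhysicalConnectionRequirements": {},
}
```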

I then proceeded to create a job from the data connection screen, selected the Kafka connection I had created, and set the topic name. However, when I run the job I see the error below:
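For context, a Glue streaming job typically reads from a Kafka connection via `glueContext.create_data_frame.from_options(connection_type="kafka", connection_options=...)`. Below is a sketch of just the connection options dict, built in plain Python since the Glue libraries only exist inside a Glue job; `confluent-kafka-conn` and `my-topic` are placeholder names:

```python
# Sketch of the options a Glue streaming job would pass to
# glueContext.create_data_frame.from_options(connection_type="kafka", ...).
# "confluent-kafka-conn" and "my-topic" are placeholder names.
kafka_options = {
    "connectionName": "confluent-kafka-conn",  # the Glue connection created above
    "topicName": "my-topic",
    "startingOffsets": "earliest",
    "classification": "json",  # format of the messages on the topic
    "inferSchema": "true",
}
```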

jobname: KafkaJob and JobRunID:XXXX failed to execute with exception Unable to resolve any valid service connection (Service AWSGlueJobExecutor, Status Code 400, Error Code InvalidInputException)

Am I missing any steps to set up the Kafka data connection?

1 Answer

That error usually means there is something wrong with the "Network options" that prevents Glue from using the connection: either the VPC/subnet/security group is not valid, or the subnet does not have enough free IP addresses.
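The checks the answer suggests can be sketched as a small helper. This is a hypothetical function, not a Glue API: `phys_reqs` stands for the `PhysicalConnectionRequirements` dict that boto3's `glue_client.get_connection` returns, `available_ips` for the subnet's `AvailableIpAddressCount` from `ec2.describe_subnets`, and the `min_ips` threshold is an illustrative placeholder, since the exact number of ENIs Glue needs depends on the job:

```python
# Hypothetical helper illustrating the checks in the answer above:
# the connection's network settings must name a subnet and security
# group, and the subnet must have enough free IPs for Glue's ENIs.
def network_options_look_valid(phys_reqs, available_ips, min_ips=16):
    """phys_reqs: PhysicalConnectionRequirements dict from
    glue_client.get_connection; available_ips: AvailableIpAddressCount
    from ec2.describe_subnets. min_ips is a placeholder threshold."""
    return bool(
        phys_reqs.get("SubnetId")
        and phys_reqs.get("SecurityGroupIdList")
        and available_ips >= min_ips
    )
```

A connection created with empty Network Options, as in the question, would fail this check immediately.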

AWS
answered 6 months ago
