Sophisticated Triggering of Glue Jobs

Is there documentation on ways to trigger Glue jobs that go beyond the static schedules and simple conditions explained in https://docs.aws.amazon.com/glue/latest/dg/about-triggers.html? I have heard that Glue jobs can be triggered from Lambda functions, but everything I can find on that is not much more sophisticated than static schedules and simple conditions.

I have a pipeline of several Glue jobs that normally run in sequence once per week. The last Glue job in this pipeline writes a table containing a flag that marks records which need to be processed more frequently. I am looking for a mechanism that checks this output table regularly and, if the flag is set, triggers the first Glue job of the pipeline again at that higher frequency. I need to avoid processing the entire dataset at this high frequency. How can this be done?

1 Answer

AWS Step Functions is a good way to orchestrate multiple Glue jobs into a coherent workflow. Here is a workshop that shows an example of how to build a workflow using Step Functions. Step Functions provides a visual interface, and workflows can also be defined programmatically using the Amazon States Language.
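As a rough sketch of the programmatic route, the Python snippet below builds an Amazon States Language definition that first invokes a flag-checking Lambda function, branches on the result, and then runs the Glue jobs one after another. The state names, the `flag_set` field, the job names, and the Lambda ARN are assumptions made for illustration; they are not part of the original question.

```python
import json


def build_definition(job_names, flag_check_lambda_arn):
    """Build an ASL definition: check a flag, then run Glue jobs in order.

    Assumes the (hypothetical) Lambda function returns {"flag_set": true/false}.
    """
    states = {
        "CheckFlag": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": flag_check_lambda_arn},
            # Keep only the flag from the Lambda response payload.
            "ResultSelector": {"flag_set.$": "$.Payload.flag_set"},
            "Next": "FlagSet?",
        },
        "FlagSet?": {
            "Type": "Choice",
            "Choices": [
                {
                    "Variable": "$.flag_set",
                    "BooleanEquals": True,
                    "Next": f"Run {job_names[0]}",
                }
            ],
            "Default": "NothingToDo",
        },
        "NothingToDo": {"Type": "Succeed"},
    }
    for i, name in enumerate(job_names):
        state = {
            "Type": "Task",
            # The .sync suffix makes the state wait until the job run finishes.
            "Resource": "arn:aws:states:::glue:startJobRun.sync",
            "Parameters": {"JobName": name},
        }
        if i + 1 < len(job_names):
            state["Next"] = f"Run {job_names[i + 1]}"
        else:
            state["End"] = True
        states[f"Run {name}"] = state
    return {"StartAt": "CheckFlag", "States": states}


if __name__ == "__main__":
    definition = build_definition(
        ["job-a", "job-b"],  # hypothetical Glue job names
        "arn:aws:lambda:us-east-1:123456789012:function:check-flag",  # hypothetical
    )
    print(json.dumps(definition, indent=2))
```

The printed JSON could be used as the definition of a state machine that is started on a frequent schedule, so the Glue jobs only run when the flag is set.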
You can store the state of the different Glue jobs (as flags) in DynamoDB, which lets you build a fully serverless data pipeline (Glue, Step Functions, and DynamoDB are all serverless). You can also consider event-driven orchestration of the different workflows using Amazon EventBridge.
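To make the flag idea concrete, here is a minimal sketch of a Lambda function that could be invoked on a frequent EventBridge schedule. It reads a flag from DynamoDB and starts the first Glue job only when the flag is set. The table name, key, attribute name, job name, and the `--run_mode` job argument are all assumptions for illustration, and the Glue job script would have to read that argument itself to restrict processing to the flagged records.

```python
FLAG_TABLE = "pipeline-flags"      # hypothetical DynamoDB table
FIRST_JOB = "pipeline-first-job"   # hypothetical Glue job name


def maybe_start_pipeline(dynamodb, glue):
    """Start the first Glue job if the flag is set; return the run id or None."""
    item = dynamodb.get_item(
        TableName=FLAG_TABLE,
        Key={"pipeline": {"S": "weekly"}},  # hypothetical key
    ).get("Item", {})
    # Assumed attribute: a boolean written by the last job in the pipeline.
    if not item.get("needs_fast_run", {}).get("BOOL", False):
        return None
    run = glue.start_job_run(
        JobName=FIRST_JOB,
        # Assumed job argument telling the job to process flagged records only.
        Arguments={"--run_mode": "flagged_only"},
    )
    return run["JobRunId"]


def lambda_handler(event, context):
    # Imported here so the logic above can be exercised without AWS access.
    import boto3

    run_id = maybe_start_pipeline(boto3.client("dynamodb"), boto3.client("glue"))
    return {"job_run_id": run_id}
```

Passing the clients into `maybe_start_pipeline` keeps the decision logic separate from the AWS calls, so it can be tested with stand-in objects.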

AWS
answered 2 years ago
AWS
EXPERT
reviewed 2 years ago
