Just to add on to Gonzolo's response: Glue itself doesn't provide direct functionality for retrieving the timestamp of when a file was saved in an S3 bucket. To retrieve that information, you can use Boto client methods. Specifically, you can use the:
- head_object() method [1]
- list_objects_v2() method [2]
Please see the external resource below to see example code of how this can be achieved.
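As a rough illustration of the two methods listed above, here is a minimal sketch. The helper names (`default_s3_client`, `get_last_modified`, `list_last_modified`) are made up for this example, and the bucket, key and prefix are whatever your job uses. It assumes boto3 is available in the Glue job and that the job role is allowed to read the bucket.

```python
from datetime import datetime
from typing import Dict


def default_s3_client():
    """Create a boto3 S3 client (imported lazily so the helpers can be tried with a stub)."""
    import boto3
    return boto3.client("s3")


def get_last_modified(s3_client, bucket: str, key: str) -> datetime:
    """Return the LastModified timestamp of a single object via head_object()."""
    response = s3_client.head_object(Bucket=bucket, Key=key)
    return response["LastModified"]


def list_last_modified(s3_client, bucket: str, prefix: str = "") -> Dict[str, datetime]:
    """Map every key under a prefix to its LastModified via list_objects_v2()."""
    timestamps: Dict[str, datetime] = {}
    kwargs = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3_client.list_objects_v2(**kwargs)
        # "Contents" is absent when nothing matches the prefix.
        for obj in page.get("Contents", []):
            timestamps[obj["Key"]] = obj["LastModified"]
        # Each call returns a limited page of keys, so follow the continuation token.
        if not page.get("IsTruncated"):
            break
        kwargs["ContinuationToken"] = page["NextContinuationToken"]
    return timestamps
```

head_object() suits a single known key, while list_objects_v2() is usually the better fit when you need the timestamps for every file under a prefix, since it avoids one request per object.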
Alternatively, you can save your Athena query and execute the saved query through a Glue job. This AWS Blog post [3] has example code for doing so.
References:
It would be better if the predictions had a timestamp of their own, instead of relying on the file modification date, which could be affected by other things.
Otherwise, I don't think there is a way to do this in Glue/Spark. You could invoke Athena from Glue, though it's a bit wasteful since the cluster would be waiting while Athena is running, and then have Glue read the Athena query results and continue from there.
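For what it's worth, invoking Athena from the job and waiting for it could look roughly like the sketch below. The function name `run_athena_query` and its parameters are made up for this example. It assumes a boto3 Athena client (`boto3.client("athena")`) and an S3 output location the job role can write to.

```python
import time


def run_athena_query(athena_client, query: str, database: str,
                     output_location: str, poll_seconds: float = 2.0) -> str:
    """Start an Athena query, block until it finishes, and return its result location."""
    response = athena_client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = response["QueryExecutionId"]

    # The Glue job just waits here while Athena does the work.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] == "SUCCEEDED":
            return execution["QueryExecution"]["ResultConfiguration"]["OutputLocation"]
        if status["State"] in ("FAILED", "CANCELLED"):
            reason = status.get("StateChangeReason", "unknown reason")
            raise RuntimeError(f"Athena query {query_id} ended as {status['State']}: {reason}")
        time.sleep(poll_seconds)
```

The returned location should point at the result file Athena wrote to S3, which the Glue job can then load (for example with Spark's CSV reader) and carry on from there.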