The sink being JDBC (by the way, for Redshift it would be better to use the Redshift connector) has no impact on what you are trying to do. You could show() the DataFrame before writing, and if the filename column is there, it will be written like any other column.
You could use the attachFilename option on the source (see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format.html#aws-glue-programming-etl-format-shared-reference), or convert to a DataFrame, call the function, and then convert back to a DynamicFrame (input_file_name() is not an "option", it's a SQL function).
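For the DataFrame route, a minimal sketch (the source_dyf frame and the source_file column name below are placeholders for illustration):

from awsglue.dynamicframe import DynamicFrame
from pyspark.sql.functions import input_file_name

# Convert DynamicFrame -> DataFrame, tag each row with its source file, convert back
df = source_dyf.toDF()
df = df.withColumn("source_file", input_file_name())  # full s3:// path of the row's file
result_dyf = DynamicFrame.fromDF(df, glueContext, "result_dyf")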
Just a small tweak is needed to get the filename:
yfs_category_item_df = glueContext.create_dynamic_frame.from_options(
    format_options={"attachFilename": "your_filename_column_name"},
    connection_type="s3",
    format="parquet",
    connection_options={
        "path": [
            # "s3://" + args['s3_bucket'] + "/" + args['s3_key']
            "s3://abc/oms/YFS_CATEGORY_ITEM/"
        ],
        "recurse": True,
    },
    transformation_ctx="yfs_category_item_df",
)
Change the value of attachFilename in format_options to the column name you desire.
I tried this also, but it did not work; I got a blank value. I read somewhere that this works only when using a Crawler/Glue table. Anyway, I found a workaround. In the code you mentioned above, I am using args['s3_key']. This value is coming from a Lambda for me. I pass it to a variable and use that variable in my post query when doing write_dynamic_frame.from_jdbc_conf. Thanks for your inputs.
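A rough sketch of that workaround (the table, column, and connection names are placeholders, and the post-query SQL is just an example of stamping the filename after the load):

s3_key = args['s3_key']  # passed in from the Lambda as a job argument
post_query = f"UPDATE my_table SET source_file = '{s3_key}' WHERE source_file IS NULL;"

glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=yfs_category_item_df,
    catalog_connection="my-redshift-connection",  # placeholder Glue connection name
    connection_options={
        "dbtable": "my_table",       # placeholder target table
        "database": "dev",           # placeholder database
        "postactions": post_query,   # runs on Redshift after the write completes
    },
    redshift_tmp_dir=args["TempDir"],
    transformation_ctx="write_redshift",
)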
Thanks for your input. attachFilename cannot be added when we are using the JDBC method to write to Redshift; it threw an error. You are right, input_file_name() is a SQL function only. The way I worded my statement was wrong: when I said "I tried using the input_file_name() option", I meant that this function was also tried and it did not work.