AWS Glue Error - File already Exists


I am running an AWS Glue job that reads from Redshift (schema_1) and writes back to Redshift (schema_2). This is done using the code below:

# Read from Redshift (schema_1) with a SQL query, staged through S3
Redshift_read = glueContext.create_dynamic_frame.from_options(
    connection_type="redshift",
    connection_options={
        "sampleQuery": sample_query,
        "redshiftTmpDir": tmp_dir,
        "useConnectionProperties": "true",
        "connectionName": "dodsprd_connection",
        "sse_kms_key": "abc-fhrt-2345-8663",
    },
    transformation_ctx="Redshift_read",
)

# Write the result back to Redshift (schema_2), also staged through S3
Redshift_write = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=Redshift_read,
    catalog_connection="dodsprd_connection",
    connection_options={
        "database": "dodsprod",
        "dbtable": "dw_replatform_stage.rt_order_line_cancellations_1",
        "preactions": pre_query,
        "postactions": post_query,
    },
    redshift_tmp_dir=tmp_dir,
    transformation_ctx="Redshift_write",
)

The "sample_query" is a normal SQL query with some business logic. When i run this glue job, I am getting below error:

An error occurred while calling o106.pyWriteDynamicFrame. File already exists:

When I run the same SQL query manually in SQL Workbench, it returns proper output. Can anyone please help me with this?

Joe
asked 3 months ago · 430 views
1 Answer

Since you are using Redshift, I suspect the error comes from the temporary files used for COPY/UNLOAD; check the stack trace to confirm.
A quick solution could be to give the read and the write different temporary subpaths, as sketched below.
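For example, here is a minimal sketch of that idea (assuming sample_query, tmp_dir, pre_query and post_query are defined as in the question; the subpath names are placeholders). The only change from the original code is that the UNLOAD (read) and COPY (write) staging files go to separate S3 prefixes:

# Separate S3 prefixes for the read (UNLOAD) and write (COPY) staging data
read_tmp_dir = tmp_dir + "read/"
write_tmp_dir = tmp_dir + "write/"

Redshift_read = glueContext.create_dynamic_frame.from_options(
    connection_type="redshift",
    connection_options={
        "sampleQuery": sample_query,
        "redshiftTmpDir": read_tmp_dir,
        "useConnectionProperties": "true",
        "connectionName": "dodsprd_connection",
        "sse_kms_key": "abc-fhrt-2345-8663",
    },
    transformation_ctx="Redshift_read",
)

Redshift_write = glueContext.write_dynamic_frame.from_jdbc_conf(
    frame=Redshift_read,
    catalog_connection="dodsprd_connection",
    connection_options={
        "database": "dodsprod",
        "dbtable": "dw_replatform_stage.rt_order_line_cancellations_1",
        "preactions": pre_query,
        "postactions": post_query,
    },
    redshift_tmp_dir=write_tmp_dir,  # different prefix from the read step
    transformation_ctx="Redshift_write",
)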

AWS
EXPERT
answered 3 months ago
  • Can you please tell me how to check the stack trace?

    I also tried giving different temp subpaths, as below: Old path: s3://bjs-digital-dods-data-lake-processed-prod/temp/fact_order_line_returns/ New path: s3://bjs-digital-dods-data-lake-processed-prod/temp/temp_test/fact_order_line_returns/ Still, I get the same error.

  • The full error stack trace (including both Python and Scala) will be in the error log.

  • I contacted AWS Support and we worked together on this issue, but we were not able to fix it. The query I am using in this logic is very large, and for some reason, when the query is very large, I get this error. The support team told me that internally a node failure is happening when the data is COPYed from S3 to Redshift, and that failure surfaces as the error I mentioned. I still have not found a solution, but I found a workaround: I write the output to an S3 file, and a Lambda function loads it into Redshift through an event trigger (sketched after this thread). This is working smoothly.
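A minimal sketch of the S3-write part of that workaround, reusing the Redshift_read frame from the question (the bucket, prefix, and output format are hypothetical placeholders; the S3 event notification, the Lambda function, and its COPY into Redshift are not shown):

# Write the result to S3 instead of directly to Redshift.
# An S3 event notification would then invoke a Lambda that issues a COPY into Redshift.
glueContext.write_dynamic_frame.from_options(
    frame=Redshift_read,
    connection_type="s3",
    connection_options={"path": "s3://my-example-bucket/exports/rt_order_line_cancellations/"},
    format="csv",
    transformation_ctx="S3_write",
)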
