- Newest
- Most votes
- Most comments
Following is feedback from AWS support. I used this feedback and was able to successfully solve the issue:
I believe the issue here is not the job bookmarks but the query being passed to the PG instance. Using the from_catalog method to query the DB will not work here. When we use the from_catalog method to create the DynamicFrame, Glue will first consult the catalog for the column and then check with the PG instance. Glue will fold the column name to lowercase and pass it to the PG instance. PG, being case sensitive, will try to search for a column by name id, which it will not find and so the error is thrown.
Instead, we need to read directly from the PG instance using the from_options method. I request you to add the query, but escape the case-sensitive column from it to use the exact column name so that Glue will not fold it to lowercase and the PG engine will be able to find the required column in the table. Below is how you can do it.
Sample code snippet:
=====================
###Read from PG connection_pg_options = { "url": "jdbc:postgresql://<jdbc-host-name>:5432/db", "dbtable": "public.test", "user": "admin", "password": "pwd", "sampleQuery": "select "ID" from public.test", }
PGNode = glueContext.create_dynamic_frame.from_options(connection_type="postgresql", connection_options=connection_pg_options)
PGNode.show()
=====================
I have tested this at my end with a PG schema like below.
===================== create table with_case ("ID" int PRIMARY KEY, "firstName" varchar(10), "lastName" varchar(10));
Glue code snippet to read:
connection_pg_options = { "url": "jdbc:postgresql://.mydatabase.us-west-2.rds.amazonaws.com:5432/pilot", "dbtable": "public.with_case", "user": "postgres", "password": "mypassword", "sampleQuery": "select "ID" from public.with_case" }
PGNode = glueContext.create_dynamic_frame.from_options(connection_type="postgresql", connection_options=connection_pg_options)
PGNode.show()
Output:
{ "ID": 3 }
{ "ID": 2 }
{ "ID": 1 }
Hello,
Glue catalog only allows lowercase column/database names. This is the reason behind observing lower case column names although they are having camel case column names in source.
Even if you try to edit the column names manually in the glue table schema inside Glue catalog to Camelcase, when you save it you will observe it saving in lower case.
As postgresql is case sensitive, so you can bypass the reading by putting the source table name and column name in quotes. I have used below code to read the data from Postgresql table which has columns and table name in uppercase and it i working fine.
connection_psql_options = {
"url": "jdbc:postgresql://database-us-east-1.rds.amazonaws.com:5432/testdb",
"dbtable": "public.\"COMPANY1\"",
"user": "postgres",
"password": "Postgres123",
"sampleQuery" : "select \"ID\" from public.\"COMPANY1\"",
}
dyf_psql = glueContext.create_dynamic_frame.from_options(connection_type="postgresql",
connection_options=connection_psql_options)
dyf_psql.toDF().show(5)
Relevant content
- asked a year ago
- AWS OFFICIALUpdated a year ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 2 years ago
- AWS OFFICIALUpdated 2 years ago