Pass glue job parameters dynamically to a redshift table query

0

Problem Overview

It's not clear from the documentation how to dynamically pass parameters to query a Amazon Redshift database with the create_dynamic_frame.from_options method. When using the AWS SDK For pandas, we can use the `awswrangler.redshift.read_sql_query() and use the "params" method to pass these values. I am looking for something similar. Every time my glue job runs, I need to dynamically pass in a filter on a specific column on my redshift table. In the example below "dynamic_value" is the value I want to pass to my where statement to my redshift table:

Sample query to redshift table

def get_queried_results(dynamic_value):
   """gets queried results from an Amazon Redshift database"""
   dyf= glueContext.create_dynamic_frame.from_options(
   connection_type="redshift",
   connection_options={
   "sampleQuery": """select * from public.test
   where column1= dynamic_value""",
   "redshiftTmpDir": f"s3://{s3_bucket}/processed/temp/redshift/",
   "useConnectionProperties": "true",
   "dbtable": "public.test",
   "connectionName": "redshift1"})
   return dyf

dyf= get_queried_results(dynamic_value='2023-02')

Question I know the above example is no SQL injection safe but just using it as an example. Is there a parameter on this method that we should be passing our values to in pyspark so it can filter our table based on a value of a column?

Adriano
已提问 1 年前610 查看次数
1 回答
0

In this case, just put that inside the query, you can use the Python string interpolation to that easily:

dynamic_value = "value"
....
"sampleQuery": f"""select * from public.test  where column1= {dynamic_value}"""
profile pictureAWS
专家
已回答 1 年前
  • Hi Gonzalo,

    When attempting this answer using a common table expression in the "sampleQuery" for example

    with cte_exp as
    (
    select column
    from table1
    where
    column = {parameter}
    )
    select * from table2
    left join cte_exp
    on cte_exp.field1 = table2.field2
    where table2.column = parameter
    
    

    The first usage of the parameter is not being detected, while the second usage is. I have tried adding and removing the {} for both uses, but to no effect. It's not detecting the parameters correctly. Do you have any suggetions?

  • Notice this is Python string interpolation (notice the f at the beginning), it has to be translated before it's passed on (print it to verify it has the correct value)

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则