Redshift data API

0

Hi, I am using the Golang SDK for querying the data from the Redshift database.  The function  GetStatementResultOutput  () returns the response GetStatementResultOutput{} which contains the list of records for the SQL query.  How do i fetch the data from the records, which are of type "Records [][]*Field" . I want the result in JSON format so that i can unmarshal  the data into the result structure. The records will return the value field for the column. It's difficult to determine which type's value field is which. Is there any utility function available to decode the records struct.

asked 2 years ago515 views
1 Answer
0

Greetings from AWS!

Please note that there is no built in way to retrieve the cached result of an SQL statement in a JSON format, please refer to our documentation on GetStatementResult [1].

Having that said, one can process the data to format the result if you want to display it in a user-friendly format. By using the Data API with Python you can format the results with the Pandas framework, you can use the Data API with other programming languages supported by the AWS SDK, see example below and a detailed article on Using the Amazon Redshift Data API [2]. Please note that the below code is for example purposes only and you can use python in a similar way to format the records in json format.

import pandas as pd
def post_process(meta, records):
    columns = [k["name"] for k in meta]
    rows = []
    for r in records:
        tmp = []
        for c in r:
            tmp.append(c[list(c.keys())[0]])
        rows.append(tmp)
    return pd.DataFrame(rows, columns=columns)

def query(sql, cluster="<Cluster Name>", user="<User Name>", database="<Database Name>"):
    resp = rsd.execute_statement(
        Database=database,
        ClusterIdentifier=cluster,
        DbUser=user,
        Sql=sql
    )
    qid = resp["Id"]
    print(qid)
    desc = None
    while True:
        desc = rsd.describe_statement(Id=qid)
        if desc["Status"] == "FINISHED":
            break
            print(desc["ResultRows"])
    if desc and desc["ResultRows"]  > 0:
        result = rsd.get_statement_result(Id=qid)
        rows, meta = result["Records"], result["ColumnMetadata"]
        return post_process(meta, rows)

pf=query("select venueid,venuename from venue  limit 100;")
print(pf)

I hope this would be of value to you, have an AWSome day further.

References

[1] - https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_GetStatementResult.html

[2] - https://aws.amazon.com/blogs/big-data/using-the-amazon-redshift-data-api-to-interact-with-amazon-redshift-clusters/#:~:text=Fetch%[…]ormat%20results

profile pictureAWS
answered 2 years ago

You are not logged in. Log in to post an answer.

A good answer clearly answers the question and provides constructive feedback and encourages professional growth in the question asker.

Guidelines for Answering Questions