Redshift data API

0

Hi, I am using the Golang SDK for querying the data from the Redshift database.  The function  GetStatementResultOutput  () returns the response GetStatementResultOutput{} which contains the list of records for the SQL query.  How do i fetch the data from the records, which are of type "Records [][]*Field" . I want the result in JSON format so that i can unmarshal  the data into the result structure. The records will return the value field for the column. It's difficult to determine which type's value field is which. Is there any utility function available to decode the records struct.

已提问 1 年前316 查看次数
1 回答
0

Greetings from AWS!

Please note that there is no built in way to retrieve the cached result of an SQL statement in a JSON format, please refer to our documentation on GetStatementResult [1].

Having that said, one can process the data to format the result if you want to display it in a user-friendly format. By using the Data API with Python you can format the results with the Pandas framework, you can use the Data API with other programming languages supported by the AWS SDK, see example below and a detailed article on Using the Amazon Redshift Data API [2]. Please note that the below code is for example purposes only and you can use python in a similar way to format the records in json format.

import pandas as pd
def post_process(meta, records):
    columns = [k["name"] for k in meta]
    rows = []
    for r in records:
        tmp = []
        for c in r:
            tmp.append(c[list(c.keys())[0]])
        rows.append(tmp)
    return pd.DataFrame(rows, columns=columns)

def query(sql, cluster="<Cluster Name>", user="<User Name>", database="<Database Name>"):
    resp = rsd.execute_statement(
        Database=database,
        ClusterIdentifier=cluster,
        DbUser=user,
        Sql=sql
    )
    qid = resp["Id"]
    print(qid)
    desc = None
    while True:
        desc = rsd.describe_statement(Id=qid)
        if desc["Status"] == "FINISHED":
            break
            print(desc["ResultRows"])
    if desc and desc["ResultRows"]  > 0:
        result = rsd.get_statement_result(Id=qid)
        rows, meta = result["Records"], result["ColumnMetadata"]
        return post_process(meta, rows)

pf=query("select venueid,venuename from venue  limit 100;")
print(pf)

I hope this would be of value to you, have an AWSome day further.

References

[1] - https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_GetStatementResult.html

[2] - https://aws.amazon.com/blogs/big-data/using-the-amazon-redshift-data-api-to-interact-with-amazon-redshift-clusters/#:~:text=Fetch%[…]ormat%20results

profile pictureAWS
已回答 1 年前

您未登录。 登录 发布回答。

一个好的回答可以清楚地解答问题和提供建设性反馈,并能促进提问者的职业发展。

回答问题的准则