Redshift data API

0

Hi, I am using the Golang SDK for querying the data from the Redshift database.  The function  GetStatementResultOutput  () returns the response GetStatementResultOutput{} which contains the list of records for the SQL query.  How do i fetch the data from the records, which are of type "Records [][]*Field" . I want the result in JSON format so that i can unmarshal  the data into the result structure. The records will return the value field for the column. It's difficult to determine which type's value field is which. Is there any utility function available to decode the records struct.

質問済み 2年前533ビュー
1回答
0

Greetings from AWS!

Please note that there is no built in way to retrieve the cached result of an SQL statement in a JSON format, please refer to our documentation on GetStatementResult [1].

Having that said, one can process the data to format the result if you want to display it in a user-friendly format. By using the Data API with Python you can format the results with the Pandas framework, you can use the Data API with other programming languages supported by the AWS SDK, see example below and a detailed article on Using the Amazon Redshift Data API [2]. Please note that the below code is for example purposes only and you can use python in a similar way to format the records in json format.

import pandas as pd
def post_process(meta, records):
    columns = [k["name"] for k in meta]
    rows = []
    for r in records:
        tmp = []
        for c in r:
            tmp.append(c[list(c.keys())[0]])
        rows.append(tmp)
    return pd.DataFrame(rows, columns=columns)

def query(sql, cluster="<Cluster Name>", user="<User Name>", database="<Database Name>"):
    resp = rsd.execute_statement(
        Database=database,
        ClusterIdentifier=cluster,
        DbUser=user,
        Sql=sql
    )
    qid = resp["Id"]
    print(qid)
    desc = None
    while True:
        desc = rsd.describe_statement(Id=qid)
        if desc["Status"] == "FINISHED":
            break
            print(desc["ResultRows"])
    if desc and desc["ResultRows"]  > 0:
        result = rsd.get_statement_result(Id=qid)
        rows, meta = result["Records"], result["ColumnMetadata"]
        return post_process(meta, rows)

pf=query("select venueid,venuename from venue  limit 100;")
print(pf)

I hope this would be of value to you, have an AWSome day further.

References

[1] - https://docs.aws.amazon.com/redshift-data/latest/APIReference/API_GetStatementResult.html

[2] - https://aws.amazon.com/blogs/big-data/using-the-amazon-redshift-data-api-to-interact-with-amazon-redshift-clusters/#:~:text=Fetch%[…]ormat%20results

profile pictureAWS
回答済み 2年前

ログインしていません。 ログイン 回答を投稿する。

優れた回答とは、質問に明確に答え、建設的なフィードバックを提供し、質問者の専門分野におけるスキルの向上を促すものです。

質問に答えるためのガイドライン

関連するコンテンツ