Querying array of nested objects with nested array of objects in Redshift

0

Let's say I have the following JSON

{
  "id": 1,
  "sets": [
    {
      "values": [
        {
          "value": 1
        },
        {
          "value": 2
        }
      ]
    },
    {
      "values": [
        {
          "value": 5
        },
        {
          "value": 6
        }
      ]
    }
  ]
}

If the table name is X I expect the query

 SELECT x.id, v.value 
 FROM X as x,
   x.sets as sets,
   sets.values as v

to give me

id, value
1, 1
1, 2
2, 5
2, 6

and it does work if both sets and values has one object each. When there's more the query fails with column 'id' had 0 remaining values but expected 2. Seems to me I'm not iterating over "sets" properly?

So my question is: what's the proper way to query data structured like my example above in Redshift (using PartiQL)?

gefragt vor 2 Jahren253 Aufrufe
2 Antworten
0

Hi rakibansary,

Unless I'm wrong, I cannot see the name of the SUPER datatype column in your query. In the following example, my SUPER datatype column, where the JSON record is stored, is called json_text.

 SELECT x.json_text.id, v.value 
 FROM X as x,
   x.json_text.sets as sets,
   sets.values as v

Thanks,

AWS
EXPERTE
Ziad
beantwortet vor 2 Jahren
  • Some more context - I actually have data stored as Parquet and not JSON - I guess SUPER doesn't apply to parquet.

    x.json_text.sets fails for me with

    Hint: Split into more than one FROM clause elements or use a JOIN operation to extract ARRAY elements

    The problem here seems to be, json_text is an array.

0

Hi rakibansary,

The json_text column is just an example to show you how usually PartiQL is used on SUPER datatype columns.

SUPER datatype is used for semi-structured data such as JSON. The following link contains several examples on querying semi-structured data https://docs.aws.amazon.com/redshift/latest/dg/query-super.html#navigation.

If you are having a Parquet file with several columns and a json column, then you can use the SERIEALIZETOJSON option while copying the data to Redshift. You can find an example on the following link (the Copying data from columnar-format Parquet and ORC section) https://docs.aws.amazon.com/redshift/latest/dg/ingest-super.html#copy_json.

Thanks,

AWS
EXPERTE
Ziad
beantwortet vor 2 Jahren

Du bist nicht angemeldet. Anmelden um eine Antwort zu veröffentlichen.

Eine gute Antwort beantwortet die Frage klar, gibt konstruktives Feedback und fördert die berufliche Weiterentwicklung des Fragenstellers.

Richtlinien für die Beantwortung von Fragen