Redshift Spectrum: no way to glance at tables with nested data

0

Hi,
I'm enjoying working with Redshift Spectrum, but I find it difficult to quickly scan a table with nested fields. By that I mean something along the lines of "select * from table limit 100"

This is impossible if the table has nested fields, because Spectrum doesn't allow "*" queries on these.

"Query 1 ERROR: ERROR: Nested tables do not support '*' in the SELECT clause."

But there are few alternatives for me - say I have dozens of columns in my table, do I need to enumerate them while somehow deducing which ones are nested and thus to be avoided?

There is a related issue that I can't do "select nested_field from table" either, presumably because Spectrum can't serialize the field in any way.

These two issues require me to know the schema of all my tables before writing my queries, which is quite user unfriendly. It would be really nice if Spectrum supported displaying string representations of nested fields, if only for these simple selects, they are crucial for exploratory work. (I also use Presto, Hive, and Spark, neither of those has a problem with this.)

Thanks!

ondrejj
posta 5 anni fa2543 visualizzazioni
3 Risposte
0

Hi ondrejj,

If you're sure Presto readily allows you to query this external data in an exploratory way, then I respectfully suggest that you use Athena, that uses Presto as it's query engine, to do your exploratory SQL and Redshift Spectrum for actual analysis.

Assuming you are using the Glue data catalog for your external table catalog then Athena and Redshift can directly share the same table definitions. If you've already done the DDL in Redshift Spectrum then should be able to use it in Athena as is with little to no setup, nothing to provision or spin up, and the only cost would be the Athena scan costs.

I hope this helps,
-Kurt

klarson
con risposta 5 anni fa
0

We'd love to use Athena, but we're not using Glue and we're not currently planning on doing so.

We also like to query Redshift at the same time as S3 (which both Spectrum and Presto allow), so that's another deal breaker.

Last but not least, we don't want to use multiple tools (like Athena + Spectrum), rather one that supports all our workloads - currently Presto leads the way, but we like Spectrum's simplicity in terms of ops.

ondrejj
con risposta 5 anni fa
0

Thank you for for this feature suggestion. We have definitely heard requests for this feature and it will be considered for our roadmap.

We do not comment on the timing of new features until they are announced but new feature releases are noted in our regular maintenance announcements at the top of this forum and on our What's New page. https://aws.amazon.com/redshift/whats-new/

con risposta 5 anni fa

Accesso non effettuato. Accedi per postare una risposta.

Una buona risposta soddisfa chiaramente la domanda, fornisce un feedback costruttivo e incoraggia la crescita professionale del richiedente.

Linee guida per rispondere alle domande