Redshift Spectrum: no way to glance at tables with nested data

0

Hi,
I'm enjoying working with Redshift Spectrum, but I find it difficult to quickly scan a table with nested fields. By that I mean something along the lines of "select * from table limit 100"

This is impossible if the table has nested fields, because Spectrum doesn't allow "*" queries on these.

"Query 1 ERROR: ERROR: Nested tables do not support '*' in the SELECT clause."

But there are few alternatives for me - say I have dozens of columns in my table, do I need to enumerate them while somehow deducing which ones are nested and thus to be avoided?

There is a related issue that I can't do "select nested_field from table" either, presumably because Spectrum can't serialize the field in any way.

These two issues require me to know the schema of all my tables before writing my queries, which is quite user unfriendly. It would be really nice if Spectrum supported displaying string representations of nested fields, if only for these simple selects, they are crucial for exploratory work. (I also use Presto, Hive, and Spark, neither of those has a problem with this.)

Thanks!

ondrejj
preguntada hace 5 años2544 visualizaciones
3 Respuestas
0

Hi ondrejj,

If you're sure Presto readily allows you to query this external data in an exploratory way, then I respectfully suggest that you use Athena, that uses Presto as it's query engine, to do your exploratory SQL and Redshift Spectrum for actual analysis.

Assuming you are using the Glue data catalog for your external table catalog then Athena and Redshift can directly share the same table definitions. If you've already done the DDL in Redshift Spectrum then should be able to use it in Athena as is with little to no setup, nothing to provision or spin up, and the only cost would be the Athena scan costs.

I hope this helps,
-Kurt

klarson
respondido hace 5 años
0

We'd love to use Athena, but we're not using Glue and we're not currently planning on doing so.

We also like to query Redshift at the same time as S3 (which both Spectrum and Presto allow), so that's another deal breaker.

Last but not least, we don't want to use multiple tools (like Athena + Spectrum), rather one that supports all our workloads - currently Presto leads the way, but we like Spectrum's simplicity in terms of ops.

ondrejj
respondido hace 5 años
0

Thank you for for this feature suggestion. We have definitely heard requests for this feature and it will be considered for our roadmap.

We do not comment on the timing of new features until they are announced but new feature releases are noted in our regular maintenance announcements at the top of this forum and on our What's New page. https://aws.amazon.com/redshift/whats-new/

respondido hace 5 años

No has iniciado sesión. Iniciar sesión para publicar una respuesta.

Una buena respuesta responde claramente a la pregunta, proporciona comentarios constructivos y fomenta el crecimiento profesional en la persona que hace la pregunta.

Pautas para responder preguntas