Auto-detect schema for parquet data load

0

Hi, I'm trying to load a Parquet file into Redshift; I've tried both a local file and one in S3. I've been using the Load Data tool in the Redshift query editor v2 with the "Load new table" option (Create table with detected schema), but Redshift seems unable to detect the Parquet schema: no columns are inferred automatically.

Is there a way to create a table from a file (Parquet or CSV) without having to specify the table schema manually?

Thanks

RobinF
Asked 9 months ago · 325 views
1 Answer
1

Hi there,

Another option for inferring schemas of files that reside in S3 is to use an AWS Glue Crawler.

Once the S3-based files have been crawled, table entries will appear in the AWS Glue Data Catalog; these can be made visible in Redshift by creating an EXTERNAL SCHEMA with the 'DATA CATALOG' keyword.
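A minimal sketch of that external schema creation, assuming a Glue database named `my_glue_db` and an IAM role that can read the Data Catalog and the underlying S3 files (both names are placeholders, not from the original post):

```sql
-- Expose a Glue Data Catalog database in Redshift as an external schema.
-- 'spectrum_schema', 'my_glue_db', and the IAM role ARN are hypothetical;
-- substitute your own database name and role.
CREATE EXTERNAL SCHEMA spectrum_schema
FROM DATA CATALOG
DATABASE 'my_glue_db'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
```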

Once the external schema is created, you can begin querying the crawled tables inside Redshift. To create a physical copy of the external tables in Redshift, you can run a CTAS statement.
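For example, a CTAS along these lines would copy a crawled table into Redshift-managed storage (the schema and table names below are placeholders):

```sql
-- Materialize the crawled external table as a local Redshift table,
-- letting Redshift derive the column definitions from the query.
-- 'spectrum_schema' and 'my_crawled_table' are hypothetical names.
CREATE TABLE public.my_local_copy AS
SELECT * FROM spectrum_schema.my_crawled_table;
```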

Any subsequent tables crawled will appear within Redshift for querying (as long as they are mapped to the same Glue Database).

I hope this helps!

AWS
EXPERT
Answered 9 months ago
