Auto-detect schema for parquet data load


Hi, I'm trying to load a Parquet file into Redshift, and I've tried both a local file and one in S3. I've been using the Load Data tool in the Redshift query editor v2 with the "Load new table" option (Create table with detected schema), but Redshift seems unable to detect the Parquet schema: no columns are automatically inferred.

Is there a way to create a table from a file (Parquet or CSV) without having to specify the table schema manually?

Thanks

RobinF
Asked 9 months ago · 325 views
1 Answer

Hi there,

Another option for inferring schemas of files that reside in S3 is to use an AWS Glue Crawler.

Once the S3-based files have been crawled, table entries will appear in the AWS Glue Data Catalog, which can be made visible in Redshift through creation of an EXTERNAL SCHEMA using the 'DATA CATALOG' keyword.
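As an illustrative sketch, the external schema creation could look like the following; the schema name, Glue database name, and IAM role ARN are placeholders you would replace with your own:

```sql
-- Expose the Glue Data Catalog database that the crawler populated.
-- 'spectrum_schema', 'my_glue_database', and the IAM role ARN are
-- assumed names; substitute your own values.
CREATE EXTERNAL SCHEMA spectrum_schema
FROM DATA CATALOG
DATABASE 'my_glue_database'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftSpectrumRole'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
```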

Once the external schema is created, you can begin querying the crawled tables inside Redshift. To create a physical copy of an external table in Redshift, you can run a CTAS (CREATE TABLE AS SELECT) statement, as sketched below.
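A rough sketch of such a CTAS, assuming a crawled table named my_parquet_table in the external schema from the previous example:

```sql
-- Materialize the external (S3-backed) table as a regular Redshift table.
-- Table and schema names are assumptions; replace them with the names
-- the crawler produced in your Glue database.
CREATE TABLE public.my_parquet_table AS
SELECT *
FROM spectrum_schema.my_parquet_table;
```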

Any subsequent tables crawled will appear within Redshift for querying (as long as they are mapped to the same Glue Database).

I hope this helps!

AWS
EXPERT
Answered 9 months ago
