How can we have Glue load CSV data as strings?


I have CSV data uploaded to an S3 bucket and let Glue set the files up as tables for later use. I want all the columns loaded as string, without having to list each column by name. How can we configure Glue to load every column as string instead of converting some of them to bigint or other types?

Asked 2 years ago · 2011 views
1 Answer
Accepted Answer

Hello,

In Glue, crawlers automatically detect the schema of a file and create a table in the Glue Data Catalog. For CSV files, the crawler reads either the first 100 records or the first 1 MB of data, whichever comes first, to detect the schema. [1]

That said, with this approach it is not possible to load all CSV columns as string into the Glue catalog directly. You can consider two approaches for your use case:

  1. Create a crawler and run it on the CSV data. Once it has created the table in the Glue catalog with the inferred data types, modify the table schema so that every column is string.

  2. Read the data directly from the CSV files in a Glue ETL job, change every column type to string in ApplyMapping, and write the table into the catalog with the enableUpdateCatalog option. [2]
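As a rough sketch of the first approach, the crawler-created table can be fetched with boto3, its column types rewritten to string, and the result written back with `update_table`. The helper names (`all_columns_as_string`, `string_mappings`, `retype_catalog_table`) are made up for illustration, and the database and table names are whatever your crawler produced. `string_mappings` only builds the tuple list that the second approach would pass to `ApplyMapping.apply(frame=..., mappings=...)`; the ETL job itself is not shown.

```python
import copy

# Keys that UpdateTable accepts inside TableInput. get_table also returns
# read-only fields (DatabaseName, CreateTime, ...) that have to be dropped.
TABLE_INPUT_KEYS = {
    "Name", "Description", "Owner", "LastAccessTime", "LastAnalyzedTime",
    "Retention", "StorageDescriptor", "PartitionKeys", "ViewOriginalText",
    "ViewExpandedText", "TableType", "Parameters", "TargetTable",
}


def all_columns_as_string(table):
    """Return a TableInput dict with every non-partition column typed as string."""
    table_input = {
        key: copy.deepcopy(value)
        for key, value in table.items()
        if key in TABLE_INPUT_KEYS
    }
    for column in table_input.get("StorageDescriptor", {}).get("Columns", []):
        column["Type"] = "string"
    return table_input


def string_mappings(columns):
    """Build ApplyMapping tuples: (source, source_type, target, "string")."""
    return [(c["Name"], c["Type"], c["Name"], "string") for c in columns]


def retype_catalog_table(database, table_name):
    """Approach 1: rewrite the crawler-created table in place (calls AWS)."""
    import boto3  # imported here so the pure helpers work without boto3

    glue = boto3.client("glue")
    table = glue.get_table(DatabaseName=database, Name=table_name)["Table"]
    glue.update_table(
        DatabaseName=database,
        TableInput=all_columns_as_string(table),
    )
```

Calling `retype_catalog_table("my_db", "my_table")` (placeholder names) applies the change. Note that a later crawler run may overwrite the manually set types, depending on the crawler's schema change policy.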

References:

[1] https://aws.amazon.com/premiumsupport/knowledge-center/glue-crawler-detect-schema/
[2] https://docs.aws.amazon.com/glue/latest/dg/update-from-job.html

AWS
Answered 2 years ago
