Best practices for bulk data loading in AWS Redshift - Glue or Copy


What are the pros and cons of using AWS Glue rather than Redshift's built-in commands (such as COPY and INSERT) for bulk data loading, in terms of cost, time, and adaptability? I would really appreciate some example use cases.

1 Answer
Accepted Answer

Hi, AWS Glue is an ETL service, and the T (transform) is the key letter. If you need to transform the source data before loading it into Redshift, Glue will be highly useful.
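For the opposite case, where the files in S3 already match the target table and need no transformation, a plain COPY is usually the simpler route. The sketch below only builds the SQL text for such a load; the function name, table, bucket, and IAM role ARN are all hypothetical placeholders, and you would still need to run the statement through your own Redshift connection.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, file_format="PARQUET"):
    """Build the text of a Redshift COPY statement for a load from S3.

    Illustrative helper (not part of any AWS SDK). Only two formats are
    handled here to keep the sketch small.
    """
    fmt = file_format.upper()
    if fmt not in {"PARQUET", "CSV"}:
        raise ValueError(f"unsupported format in this sketch: {file_format}")
    # Reject quotes so the values can be embedded in the SQL literals safely.
    for value in (s3_uri, iam_role_arn):
        if "'" in value:
            raise ValueError("single quotes are not allowed in URI or ARN")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS {fmt};"
    )


# Placeholder values, assumed for illustration only.
statement = build_copy_statement(
    table="sales.orders",
    s3_uri="s3://example-bucket/orders/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

If the data needs reshaping first, that is where a Glue job in front of the load starts to pay off.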

For example, Glue provides many built-in transformations, both simple and advanced, that you can integrate into your Glue-based ETL pipeline: see https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-python-transforms.html
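To give a feel for what such a transformation does, here is a plain-Python illustration of a rename-and-cast step in the spirit of Glue's ApplyMapping transform. It does not use the awsglue library and is not how Glue implements it; the function, the cast table, and the field names are assumptions made for this sketch. The mapping tuples follow the (source field, source type, target field, target type) shape that ApplyMapping uses.

```python
# Casts supported by this sketch; real Glue jobs support many more types.
CASTS = {"string": str, "int": int, "double": float}


def apply_mapping(records, mappings):
    """Rename and cast fields of dict records; unmapped fields are dropped."""
    result = []
    for record in records:
        row = {}
        for source, _source_type, target, target_type in mappings:
            value = record.get(source)
            # Keep missing values as None instead of failing the cast.
            row[target] = None if value is None else CASTS[target_type](value)
        result.append(row)
    return result


# Hypothetical raw rows, e.g. parsed from a CSV export where everything is text.
raw_rows = [
    {"id": "1", "amt": "19.90", "note": "gift"},
    {"id": "2", "amt": None, "note": ""},
]
mappings = [
    ("id", "string", "order_id", "int"),
    ("amt", "string", "amount", "double"),
]
clean_rows = apply_mapping(raw_rows, mappings)
print(clean_rows)
```

In an actual Glue job the same mapping list would be passed to the built-in transform on a DynamicFrame, and the result written to Redshift.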

You may also want to measure the quality of your data before loading it, to ensure consistent quality. In that case, AWS Glue Data Quality may be very helpful: see https://aws.amazon.com/blogs/big-data/getting-started-with-aws-glue-data-quality-from-the-aws-glue-data-catalog/
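As a rough idea of what a pre-load quality gate checks, the following plain-Python sketch computes column completeness and blocks the load if a threshold is missed. It is not the Glue Data Quality API, which expresses such checks as declarative rules rather than code; the function names, columns, and thresholds here are hypothetical.

```python
def completeness(records, column):
    """Fraction of records with a non-empty value in `column` (0.0 if no records)."""
    if not records:
        return 0.0
    present = sum(1 for r in records if r.get(column) not in (None, ""))
    return present / len(records)


def passes_quality_gate(records, thresholds):
    """True only if every column meets its minimum completeness."""
    return all(
        completeness(records, column) >= minimum
        for column, minimum in thresholds.items()
    )


# Hypothetical sample batch and thresholds, for illustration only.
batch = [
    {"order_id": 1, "amount": 19.9},
    {"order_id": 2, "amount": None},
]
if passes_quality_gate(batch, {"order_id": 1.0, "amount": 0.9}):
    print("quality gate passed, proceed with load")
else:
    print("quality gate failed, hold the load")
```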

Hope it helps,

Didier

AWS
Expert
answered 10 months ago
