How do I combine multiple CSVs into one?

I have multiple CSVs about a single patient, and I would like to know how to combine them, because the columns across the CSVs together make up all the information for one patient. The CSVs are stored in an S3 bucket, in different folders. I have tried using a join, but because we have many patients the job is taking forever. TIA

CYN
asked 7 months ago, 405 views
1 Answer

Hello,

You can create an Athena table whose location is the S3 prefix that contains all the CSVs. Something like this (refer to the Athena documentation on CREATE TABLE):

CREATE EXTERNAL TABLE `test_table`(
...
)
ROW FORMAT ...
STORED AS INPUTFORMAT ...
OUTPUTFORMAT ...
LOCATION 's3://bucketname/folder/'

Once the table is created, use CTAS (CREATE TABLE AS SELECT) to create another table that consolidates all the CSVs into a single output location, as below (refer to the Athena documentation on CTAS):

CREATE TABLE ctas_csv_unpartitioned 
WITH (
     format = 'TEXTFILE',
     field_delimiter = ',',
     external_location = 's3://xxxxxxxxxxxx/ctas_csv_unpartitioned/') 
AS SELECT key1, name1, comment1
FROM test_table;
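
If each folder holds different columns for the same patients, a single table over all prefixes stacks rows rather than lining up columns. A possible alternative, assuming every CSV carries a shared key (here a hypothetical `patient_id`), is to define one table per folder and put the join inside the CTAS SELECT so the combined result is materialized once. The sketch below uses Python's built-in `sqlite3` only to show the query shape locally; the table and column names are made up, and the Athena-specific `WITH (...)` clause is omitted.

```python
import sqlite3

# Hypothetical stand-ins for two per-folder tables; in Athena each would be
# an external table pointing at its own S3 prefix.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demographics (patient_id TEXT, name TEXT);
    CREATE TABLE labs (patient_id TEXT, glucose REAL);
    INSERT INTO demographics VALUES ('p1', 'Ann'), ('p2', 'Bob');
    INSERT INTO labs VALUES ('p1', 5.4), ('p2', 6.1);
""")

# Same shape as a CTAS: write the joined result once as a new table,
# instead of re-running the join for every patient lookup.
conn.execute("""
    CREATE TABLE patient_combined AS
    SELECT d.patient_id, d.name, l.glucose
    FROM demographics AS d
    LEFT JOIN labs AS l ON l.patient_id = d.patient_id
""")

rows = conn.execute(
    "SELECT patient_id, name, glucose FROM patient_combined ORDER BY patient_id"
).fetchall()
print(rows)
```

In Athena, the same SELECT would replace `SELECT key1, name1, comment1 FROM test_table` in the CTAS statement above, with one external table defined per folder.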
AWS
Support Engineer
answered 7 months ago
