
Could not escape double quote from CSV file


I need to import a CSV file from an S3 bucket into my Aurora database table. However, some fields may contain a double quote character, which breaks the import.

32533|Lease changed from "12 month" to "month 

is a sample row in the CSV file, with '|' as the delimiter. I looked through the documentation and understand that I should use the ESCAPE attribute to escape the character. However, it does not work as expected.

SELECT aws_s3.table_import_from_s3 (
    'my_table',
    ' ',
    '(format csv, ESCAPE ''"'', NULL ''NULL'', DELIMITER ''|'' )',
    (SELECT aws_commons.create_s3_uri('my_bucket','my_file.csv','us-east-1')),
    aws_commons.create_aws_credentials('xxxxx','xxxxx','')
);

It returns this error message:

ERROR: unterminated CSV quoted field
CONTEXT:  COPY my_table, line 1: "32533|Lease changed from "12 month" to "month"
SQL statement "copy my_table  from '/rdsdbdata/extensions/aws_s3/amazon-s3-fifo-21982-20230
816T200710Z-0' with (format csv, ESCAPE '"', NULL 'NULL', DELIMITER '|' )"
SQL function "table_import_from_s3" statement 1
asked a year ago · 1.4K views
1 Answer
Accepted Answer

The issue is that, by default, the ESCAPE character needs to appear before every occurrence of the character you want to escape. So, in your case, you would need to modify the CSV file itself to put a double quote before any internal double quotes:

32533|Lease changed from ""12 month"" to ""month""

Then your ESCAPE syntax should work: ESCAPE '"'

However, an easier approach is to use the QUOTE parameter instead of ESCAPE. This allows you to specify a quoting character, and any instances of that character inside a field will be treated as part of the value instead of ending the field. For example: QUOTE '"'

With this, you don't need to modify the CSV file at all. Any double quotes inside fields will be preserved as part of the value when imported. So, the full SQL would be:

SELECT aws_s3.table_import_from_s3 (
    'my_table',
    ' ',
    '(format csv, QUOTE ''"'', NULL ''NULL'', DELIMITER ''|'' )',
    (SELECT aws_commons.create_s3_uri('my_bucket','my_file.csv','us-east-1')),
    aws_commons.create_aws_credentials('xxxxx','xxxxx','')
);

Let me know if that helps resolve the issue! The QUOTE option is usually the easiest way to handle quoting characters when importing CSV.
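As a quick local check of the doubled-quote idea, independent of the S3 extension, a row can be parsed with plain COPY first. This is only a sketch: the scratch table is hypothetical, and the data line is the sample row rewritten in standard CSV form (the field wrapped in quotes, with internal quotes doubled).

-- Hypothetical scratch table matching the two columns of the sample row
CREATE TABLE quote_test (id int, note text);

-- In psql, paste the data line after the COPY command and end with \.
COPY quote_test FROM STDIN WITH (format csv, DELIMITER '|', NULL 'NULL');
32533|"Lease changed from ""12 month"" to ""month"""
\.

-- note should now contain: Lease changed from "12 month" to "month"
SELECT * FROM quote_test;

If that parses cleanly, the same doubled-quote form of the file should also go through aws_s3.table_import_from_s3.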

answered a year ago
  • Thank you for your comment. QUOTE ''"'' still does not work. However, I noticed that QUOTE E''\b'' does work. Do you know why that is?
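A likely explanation, offered as an assumption rather than something confirmed in this thread: E'\b' is the backspace character, which never occurs in the data, so making it the QUOTE character means no field is ever treated as quoted and embedded double quotes pass through as plain text. A sketch of that variant, reusing the placeholder names and credentials from the question:

-- Same call as above, but with QUOTE set to backspace (E'\b'), a character
-- assumed never to appear in the file, so '"' is imported literally.
SELECT aws_s3.table_import_from_s3 (
    'my_table',
    ' ',
    '(format csv, QUOTE E''\b'', NULL ''NULL'', DELIMITER ''|'' )',
    (SELECT aws_commons.create_s3_uri('my_bucket','my_file.csv','us-east-1')),
    aws_commons.create_aws_credentials('xxxxx','xxxxx','')
);

Note that with quoting effectively disabled this way, a literal '|' inside a field can no longer be protected by quoting.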
