How can a Glue crawler correctly handle CSV fields that contain commas?


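[The following question has been translated.] I have a CSV file that contains a text field enclosed in double quotes, and the text itself contains commas. By default, the Glue crawler splits the content into columns at every comma. Is there a way to make the Glue crawler treat the text inside the double quotes as a single field?

Here is a sample of the data. The third field, named "description", contains commas.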

id,country,description
0,Italy,"Aromas include tropical fruit, broom, brimstone and dried herb. The palate isn't overly expressive, offering unripened apple, citrus and dried sage alongside brisk acidity."
1,Portugal,"This is ripe and fruity, a wine that is smooth while still structured. Firm tannins are filled out with juicy red berry fruits and freshened with acidity. It's  already drinkable, although it will certainly be better from 2016."
2,US,"Tart and snappy, the flavors of lime flesh and rind dominate. Some green pineapple pokes through, with crisp acidity underscoring the flavors. The wine was all stainless-steel fermented."
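To make the desired behavior concrete, here is a minimal sketch using Python's standard `csv` module (not Glue itself) on shortened versions of the rows above. It contrasts a naive split on every comma with quote-aware parsing that keeps the quoted description as one field. The names `SAMPLE`, `naive_split`, and `quote_aware_rows` are made up for this illustration.

```python
import csv
import io

# Shortened versions of the sample rows from the question (illustration only).
SAMPLE = '''id,country,description
0,Italy,"Aromas include tropical fruit, broom, brimstone and dried herb."
2,US,"Tart and snappy, the flavors of lime flesh and rind dominate."
'''


def naive_split(line):
    """Split on every comma, ignoring quotes (the unwanted behavior)."""
    return line.split(",")


def quote_aware_rows(text):
    """Parse CSV text, treating double-quoted text as a single field."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    return list(reader)


rows = quote_aware_rows(SAMPLE)
first_data_line = SAMPLE.splitlines()[1]

print(len(naive_split(first_data_line)))  # more than 3 pieces
print(len(rows[1]))                       # exactly 3 fields
print(rows[1][2])                         # description with its commas intact
```

The goal is for the crawler to infer the same three columns that the quote-aware parse produces here.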
Expert
Asked 5 months ago · 35 views
1 Answer

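[The following answer has been translated.] You will probably need a custom CSV classifier for the crawler so that you can specify the quote character. See: https://docs.aws.amazon.com/glue/latest/dg/custom-classifier.html#custom-classifier-csv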
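As a rough sketch of what that could look like with boto3, the snippet below builds a `CsvClassifier` definition with `QuoteSymbol` set to a double quote and attaches it to a new crawler. The classifier name, crawler name, IAM role, database, and S3 path are all placeholders assumed for illustration, and the header list is taken from the sample data in the question. Adjust them to your environment and check the linked documentation before relying on it.

```python
def build_csv_classifier_request(name):
    """Build the arguments for glue.create_classifier (custom CSV classifier)."""
    return {
        "CsvClassifier": {
            "Name": name,
            "Delimiter": ",",
            "QuoteSymbol": '"',          # treat double-quoted text as one field
            "ContainsHeader": "PRESENT",  # the sample file has a header row
            "Header": ["id", "country", "description"],
        }
    }


def create_classifier_and_crawler(classifier_name, crawler_name,
                                  role_arn, database_name, s3_path):
    """Create the classifier and a crawler that uses it.

    All argument values are supplied by the caller; nothing here is
    specific to the original poster's account.
    """
    import boto3  # imported here so the request builder works without boto3

    glue = boto3.client("glue")
    glue.create_classifier(**build_csv_classifier_request(classifier_name))
    glue.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database_name,
        Targets={"S3Targets": [{"Path": s3_path}]},
        Classifiers=[classifier_name],  # custom classifiers run before built-in ones
    )


if __name__ == "__main__":
    # Placeholder name; inspect the request before sending anything to AWS.
    print(build_csv_classifier_request("quoted-csv-classifier"))
```

If the table already exists from an earlier crawl, it may need to be deleted or the crawler re-run so that the new classifier is actually applied.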

Expert
Answered 5 months ago
