Is there a way to get raw files from GitHub using the Amazon AppFlow integration?


And if so, which object and subobject should I use? When I select Repository (repos) as the object and ystoneman (my GitHub username) as the subobject, the contents of my CSV file do not show up. Instead, every column contains only metadata.

The GitHub REST API itself seems to support this via the Repository Contents API. For example, I'm able to get the contents of an 18 MB file with the following cURL command:

curl \
  -H "Accept: application/vnd.github.raw+json" \
  -H "Authorization: Bearer TOKEN" \
  -H "X-GitHub-Api-Version: 2022-11-28" \
  https://api.github.com/repos/ystoneman/hotel-bookings/contents/hotel_bookings.csv

And here's an example of the output (data from Kaggle):

City Hotel,0,34,2017,August,35,31,2,5,2,0,0,BB,DEU,Online TA,TA/TO,0,0,0,D,D,0,No Deposit,9,NULL,0,Transient,157.71,0,4,Check-Out,2017-09-07
City Hotel,0,109,2017,August,35,31,2,5,2,0,0,BB,GBR,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,89,NULL,0,Transient,104.4,0,0,Check-Out,2017-09-07
City Hotel,0,205,2017,August,35,29,2,7,2,0,0,HB,DEU,Online TA,TA/TO,0,0,0,A,A,0,No Deposit,9,NULL,0,Transient,151.2,0,2,Check-Out,2017-09-07
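
For anyone who would rather script this than use cURL, here is a minimal Python sketch of the same request using only the standard library. The function names `build_raw_request` and `fetch_raw_file` are made up for illustration, and the token is assumed to be one with read access to the repository. The URL and headers are copied from the cURL command above.

```python
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def build_raw_request(owner, repo, path, token):
    """Build (but do not send) a Repository Contents API request for raw file bytes."""
    # Percent-encode the path so names with spaces or special characters stay valid.
    quoted_path = urllib.parse.quote(path)
    url = f"{API_ROOT}/repos/{owner}/{repo}/contents/{quoted_path}"
    return urllib.request.Request(
        url,
        headers={
            # Same media type as the cURL call: ask for the raw file, not JSON metadata.
            "Accept": "application/vnd.github.raw+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


def fetch_raw_file(owner, repo, path, token):
    """Send the request and return the file body as bytes (needs network access)."""
    request = build_raw_request(owner, repo, path, token)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling `fetch_raw_file("ystoneman", "hotel-bookings", "hotel_bookings.csv", token)` should return the same CSV text as the cURL command, as bytes.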

None of the Repository, Branch, and Commit source objects yields this data, even when I use an auth token with all read actions allowed on the repository, set S3 as the destination, and choose "Map all fields directly".

AWS
Asked 1 year ago, viewed 392 times
1 Answer
Accepted Answer

I got the answer from the Amazon AppFlow service team. Currently, only the Amazon S3 source in AppFlow supports unstructured data. So no, getting a CSV file's contents from GitHub via AppFlow does not work at this time: GitHub's API treats a CSV file in a repository as a raw blob, not as structured data.
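
Since the S3 source is the one that handles unstructured data, one possible workaround is to copy the raw file into S3 yourself. This did not come from the AppFlow team and is untested against AppFlow; it is only a rough sketch. It assumes the file bytes have already been downloaded (for example with the cURL command in the question), that boto3 is installed with credentials configured, and that the bucket exists. The helper names `s3_key_for` and `stage_in_s3` and the `github-raw` prefix are made up for illustration.

```python
def s3_key_for(owner, repo, path, prefix="github-raw"):
    """Derive an S3 object key that mirrors the file's location in the repository."""
    return "/".join([prefix, owner, repo, path.lstrip("/")])


def stage_in_s3(body, bucket, key, s3_client=None):
    """Upload already-downloaded file bytes to S3 and return the object's s3:// URI."""
    if s3_client is None:
        # boto3 is a third-party package; imported lazily so the helper above
        # can be used without it.
        import boto3

        s3_client = boto3.client("s3")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return f"s3://{bucket}/{key}"
```

For the file in the question, `s3_key_for("ystoneman", "hotel-bookings", "hotel_bookings.csv")` gives `github-raw/ystoneman/hotel-bookings/hotel_bookings.csv`, which can be passed to `stage_in_s3` together with the bytes and a bucket name.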

AWS
Answered 1 year ago
