Glue Crawlers and MongoDB Atlas


Is it possible to wildcard the include path for a MongoDB crawler? I've tried a number of options similar to those available for JDBC and other relational database connections, but they don't seem to work (example include path: "mydb/%"). I get the following error: "Crawler Error: Conversion = '/'"

Asked 2 months ago · 123 views
1 Answer
Accepted Answer

It is not possible to use a wildcard in the include path when crawling MongoDB data using AWS Glue crawlers. The include path for MongoDB must specify the exact database and collection names.

However, there are a few options to consider:

  • Create multiple crawlers - one for each database/collection combination. This allows crawling specific subsets of data.
  • Use an exclude pattern to skip certain collections or documents based on patterns. For example, to exclude any collection starting with "temp_", you can use the exclude pattern "temp_*".
  • If the databases and collections have a consistent naming pattern, you can programmatically generate the crawler configuration including all the combinations. Then update this configuration periodically as needed.
  • Another alternative is to copy the MongoDB data into Amazon S3 using tools like AWS Database Migration Service. You can then crawl the data in S3 using wildcards in the include path.
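
As a rough illustration of the "programmatically generate the crawler configuration" option above, here is a minimal Python sketch using boto3. The function names, the connection name, the role ARN, and the "temp_" prefix filter are all hypothetical placeholders, and the list of collection names is assumed to come from somewhere else (for example, a MongoDB driver call). It builds one MongoDB target per collection, each with an exact database/collection path, and passes them to a single crawler.

```python
def build_mongodb_targets(connection_name, database, collections, exclude_prefixes=()):
    """Build one Glue MongoDB target per collection, using exact include paths.

    Collections whose names start with any of exclude_prefixes are skipped
    here, on the client side, before the crawler is configured.
    """
    targets = []
    for name in sorted(collections):
        if any(name.startswith(prefix) for prefix in exclude_prefixes):
            continue
        targets.append({
            "ConnectionName": connection_name,  # assumed: an existing Glue connection
            "Path": f"{database}/{name}",       # exact path, no wildcard
        })
    return targets


def create_or_update_crawler(crawler_name, role_arn, catalog_database, targets):
    """Create the crawler, or update it if it already exists."""
    import boto3  # imported here so the helper above works without AWS access

    glue = boto3.client("glue")
    params = {
        "Name": crawler_name,
        "Role": role_arn,
        "DatabaseName": catalog_database,
        "Targets": {"MongoDBTargets": targets},
    }
    try:
        glue.update_crawler(**params)
    except glue.exceptions.EntityNotFoundException:
        glue.create_crawler(**params)


if __name__ == "__main__":
    # Hypothetical collection names; in practice, list them from MongoDB.
    example = build_mongodb_targets(
        "my-atlas-connection", "mydb",
        ["orders", "users", "temp_import"],
        exclude_prefixes=("temp_",),
    )
    for target in example:
        print(target["Path"])
```

Re-running a script like this on a schedule would keep the crawler's target list in step with the collections that exist, which is the "update this configuration periodically" part of the suggestion.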
Expert, answered 2 months ago
Expert, reviewed 2 months ago
