Crawler creating separate tables for some sources and one table for some sources


I am crawling data from S3. The data is stored as CSV files. This is what the directory structure looks like:

S3 Bucket

  • logs
    • north_america
      • year=2024/
    • europe
      • year=2024/
    • cog_processors
      • year=2024/
    • spot
      • year=2024/

Each year=2024 directory contains CSV files named like north_america_2024-06-01.csv or cog_processor_2024-06-03.csv. I have set the crawler's data sources like this:

  • s3://bucket_name/logs/europe
  • s3://ushr-glue-athena-logs/logs/north_america
  • s3://ushr-glue-athena-logs/logs/cog_processors
  • s3://ushr-glue-athena-logs/logs/spot

When I run the crawler, a single table is created for europe and a single table for cog_processors, but for north_america and spot it creates a separate table for every CSV file: north_america_2024_06_07_csv, north_america_2024_06_06_csv, north_america_2024_06_05_csv, and so on, and the same for spot. The europe and cog_processors tables each contain the combined data of all their CSV files. The per-file tables for north_america and spot, however, are all empty.

I want one table for north_america and one for spot, each containing the combined data of all the CSV files in that directory. How can I do that?
asked 4 months ago, 151 views
1 Answer

Is there a difference in schema between the cog_processors data and the continent data? See https://repost.aws/knowledge-center/glue-crawler-multiple-tables

The AWS Glue crawler creates multiple tables when your source data files don't use these same configurations:

  • Format (such as CSV, Parquet, or JSON)
  • Compression type (such as SNAPPY, gzip, or bzip2)
  • Schema
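One way to check the schema point before rerunning the crawler is to compare the header rows of the CSV files in a prefix. The following is a minimal sketch, assuming each file's first line is a header row. The function name `group_by_header` and the sample column names are hypothetical, since the real columns are not shown in the question. It works on text you have already read (for example, the first line of each object) and does not talk to S3 itself.

```python
import csv
import io
from collections import defaultdict


def group_by_header(samples):
    """Group file names by their normalized CSV header row.

    `samples` maps a file name to the leading text of that file
    (at least its first line).
    """
    groups = defaultdict(list)
    for name, text in samples.items():
        reader = csv.reader(io.StringIO(text))
        header = next(reader, [])  # empty file -> empty header
        # Compare case-insensitively and ignore stray whitespace.
        key = tuple(col.strip().lower() for col in header)
        groups[key].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    samples = {
        "north_america_2024-06-01.csv": "timestamp,host,status\n",
        "north_america_2024-06-02.csv": "Timestamp, Host, Status\n",
        "north_america_2024-06-03.csv": "timestamp,host,status,region\n",
    }
    for header, names in group_by_header(samples).items():
        print(len(header), "columns:", header, "->", names)
```

If a prefix yields more than one group, the files under it do not share a header, which is one plausible reason for the crawler to split them into separate tables.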

AWS
Raj_R
answered 4 months ago
  • No. They are all CSV files. Spot has different column names and a different number of columns, but north_america, europe and cog_processors have the same columns.
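If the files under one prefix really do share a schema, one setting that may be worth trying is the crawler's table grouping policy, which asks Glue to combine compatible schemas under each S3 path into a single table. The sketch below builds that configuration string and, optionally, applies it with boto3. The helper names are hypothetical, the crawler name is whatever you pass in, and whether this resolves the empty per-file tables in this case is an assumption, not something confirmed in the thread.

```python
import json


def grouping_configuration():
    """Build a crawler Configuration JSON string that requests
    one combined schema per S3 include path."""
    return json.dumps(
        {
            "Version": 1.0,
            "Grouping": {"TableGroupingPolicy": "CombineCompatibleSchemas"},
        }
    )


def apply_grouping(crawler_name):
    """Apply the grouping policy to an existing crawler.

    Needs boto3 and AWS credentials. This sets the crawler's whole
    Configuration string, so merge in any settings it already has.
    """
    import boto3  # local import so the builder works without boto3 installed

    glue = boto3.client("glue")
    glue.update_crawler(Name=crawler_name, Configuration=grouping_configuration())


if __name__ == "__main__":
    # Print the configuration only; call apply_grouping(...) yourself
    # with your crawler's name once you are happy with it.
    print(grouping_configuration())
```

After changing the setting, delete the per-file tables that were already created and rerun the crawler, so that the old tables do not linger in the Data Catalog.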
