Trying to surface daily CSVs from S3 in Redshift via AWS Glue Studio, but databases aren't showing up


I am trying to use AWS Glue Studio to build a simple ETL workflow. I have a bunch of CSV files in different directories in S3. I want those CSVs to be accessible via a database and have chosen Redshift for the job. The directories will be updated every day with new CSV files. The file structure is:

YYYY-MM-DD (e.g. 2023-03-07)
|---- groupName1
|     |---- groupName1.csv
|---- groupName2
|     |---- groupName2.csv
...
|---- groupNameN
      |---- groupNameN.csv

We will be keeping historical data, so every day I will have a new date-based directory.
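To make the layout concrete, here is a minimal sketch of how an object key in this structure could be split into its date and group. The names `DailyCsv` and `parse_key` are hypothetical, and the sketch assumes the date directory sits at the root of the bucket and that each file is named after its group directory.

```python
from datetime import date
from typing import NamedTuple


class DailyCsv(NamedTuple):
    """Date and group extracted from one object key (hypothetical helper type)."""
    day: date
    group: str


def parse_key(key: str) -> DailyCsv:
    """Split 'YYYY-MM-DD/groupName/groupName.csv' into its date and group.

    Assumes the date directory is at the bucket root and that the file
    name matches its group directory, as in the layout described above.
    """
    parts = key.strip("/").split("/")
    if len(parts) != 3:
        raise ValueError(f"unexpected key layout: {key!r}")
    day_str, group, filename = parts
    if filename != f"{group}.csv":
        raise ValueError(f"file name does not match group directory: {key!r}")
    # date.fromisoformat raises ValueError if the directory is not YYYY-MM-DD
    return DailyCsv(date.fromisoformat(day_str), group)


if __name__ == "__main__":
    print(parse_key("2023-03-07/groupName1/groupName1.csv"))
```

A helper like this could be used to decide which table a given day's file belongs to, since the group name is stable across the date-based directories.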

I've read that AWS Glue can automatically copy data on a schedule, but I can't see my Redshift databases or tables in Glue Studio (screenshot below). I'm using my AWS admin account, and I do have the AWSGlueConsoleFullAccess permission (screenshot below).

[Screenshot: Glue Studio database and table list]

[Screenshot: account permissions]

1 Answer

Those databases and tables come from the Glue Data Catalog, not from Redshift.
The intended workflow is to have a crawler map the Redshift tables to Catalog tables; once it has run, they will be listed there for you to use.
Sorry for the inconvenience; the team is aware that this is something to improve.
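As a rough illustration of the crawler step, the sketch below uses boto3's Glue client to define and start a crawler with a JDBC target. Every concrete value (crawler name, role ARN, Catalog database, connection name, include path) is a placeholder, and the sketch assumes a Glue connection to the Redshift cluster already exists. The helper names `build_crawler_params` and `create_and_start_crawler` are hypothetical.

```python
def build_crawler_params(name, role_arn, catalog_database, connection_name, include_path):
    """Build the keyword arguments for glue.create_crawler with one JDBC target.

    include_path follows the 'database/schema/table' form used for JDBC
    targets; '%' acts as a wildcard, e.g. 'dev/public/%'.
    """
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": catalog_database,  # Glue Data Catalog database to write into
        "Targets": {
            "JdbcTargets": [
                {"ConnectionName": connection_name, "Path": include_path}
            ]
        },
    }


def create_and_start_crawler(params):
    """Create the crawler and start one run (requires AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**params)
    glue.start_crawler(Name=params["Name"])


if __name__ == "__main__":
    # All values below are placeholders, not taken from the question.
    params = build_crawler_params(
        name="redshift-tables-crawler",
        role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
        catalog_database="redshift_catalog",
        connection_name="example-redshift-connection",
        include_path="dev/public/%",
    )
    print(params)
    # create_and_start_crawler(params)  # uncomment to run against AWS
```

Once the crawler run finishes, the Redshift tables it found should appear as Catalog tables in the chosen Catalog database.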

AWS
answered a year ago
  • So if I have hundreds of new .csv files every day in new directories in S3, what is a recommended approach to scalably load that data into Redshift tables? Also, what is the best way of creating those hundreds of Redshift tables to begin with?
