What is considered an AWS Glue Data Catalog object for pricing?

The AWS Glue Data Catalog pricing page says:

With the AWS Glue Data Catalog, you can store up to a million objects for free. If you store more than a million objects, you will be charged $1.00 per 100,000 objects over a million, per month. An object in the AWS Glue Data Catalog is a table, table version, partition, partition indexes, or database.

I have created a Glue table and run Lambda jobs to create partitions, following method 4 in: https://medium.com/@bv_subhash/demystifying-the-ways-of-creating-partitions-in-glue-catalog-on-partitioned-s3-data-for-faster-e25671e65574

If my partition schema looks like:

year=yyyy/month=mm/day=dd/

And the files stored in the S3 bucket look like:

year=2023/month=5/day=19/file1.parquet
year=2023/month=5/day=19/file2.parquet
…
year=2023/month=5/day=19/file288.parquet

Is an object in the Glue Data Catalog just the partition, or are all the files within the partition considered Glue Data Catalog objects too? We are planning to emit ~300 files within a single bucket, so this question is very important: if each file counts as an object, then it's 300x the price.

I've asked a sales representative, and they think a file might count, but they weren't sure. They told me I should talk to tech support and linked me to re:Post to confirm.

asked a year ago · 1655 views
1 Answer
Accepted Answer

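As the documentation says, an object is a "table, table version, partition, partition indexes, or database". Individual files never count, since files are not part of the catalog metadata. In your example, the 288 files under year=2023/month=5/day=19/ all belong to a single partition, which is one catalog object.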
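The counting rule above can be sketched in a few lines of Python. This is only an illustration, not an official calculator: the function names (`partition_of`, `count_catalog_objects`, `monthly_cost`) are hypothetical, the default counts for tables, databases, table versions, and partition indexes are placeholders to replace with your own catalog's values, and the pro-rated charge per object is an assumption, since the quoted pricing text does not say how partial blocks of 100,000 objects are rounded.

```python
FREE_OBJECTS = 1_000_000   # free tier quoted on the pricing page
BLOCK_SIZE = 100_000       # objects per billing block
PRICE_PER_BLOCK = 1.00     # USD per block over the free tier, per month


def partition_of(key: str) -> str:
    """Return the partition prefix of an S3 key (everything before the file name)."""
    return key.rsplit("/", 1)[0]


def count_catalog_objects(keys, databases=1, tables=1,
                          table_versions=0, partition_indexes=0):
    """Count catalog objects: files collapse into their distinct partitions.

    The non-partition counts are placeholders (assumptions), not values
    read from a real catalog.
    """
    partitions = {partition_of(k) for k in keys}
    return databases + tables + table_versions + partition_indexes + len(partitions)


def monthly_cost(objects: int) -> float:
    """Estimate the monthly charge, assuming pro-rated billing per object."""
    billable = max(0, objects - FREE_OBJECTS)
    return billable / BLOCK_SIZE * PRICE_PER_BLOCK


# The 288 files from the question, all in one day's partition.
keys = [f"year=2023/month=5/day=19/file{i}.parquet" for i in range(1, 289)]
objects = count_catalog_objects(keys)
print(objects)                  # 3: one database, one table, one partition
print(monthly_cost(objects))    # 0.0, well inside the free tier
print(monthly_cost(1_250_000))  # 2.5 under the pro-rating assumption
```

With daily partitions, one table adds roughly 365 partition objects per year, regardless of how many files land in each partition.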

AWS
EXPERT
answered a year ago
