What is considered a Glue Data Catalog object for pricing?


The Glue Data Catalog pricing says:

With the AWS Glue Data Catalog, you can store up to a million objects for free. If you store more than a million objects, you will be charged $1.00 per 100,000 objects over a million, per month. An object in the AWS Glue Data Catalog is a table, table version, partition, partition indexes, or database.

I have created a Glue table and run Lambda jobs to create partitions, following method 4 in: https://medium.com/@bv_subhash/demystifying-the-ways-of-creating-partitions-in-glue-catalog-on-partitioned-s3-data-for-faster-e25671e65574

If my partition schema looks like:

year=yyyy/month=mm/day=dd/

And the files stored in the S3 bucket look like:

year=2023/month=5/day=19/file1.parquet
year=2023/month=5/day=19/file2.parquet
…
year=2023/month=5/day=19/file288.parquet

Is an object in the Glue Data Catalog just the partition, or are all the files within the partition considered Glue Data Catalog objects too? We were planning to emit ~300 files within a single bucket, so this question is super important: if each file counts as an object, then it's 300x the price.

I've asked a sales representative, and they think a file might count, but they told me I should talk to tech support and linked me to re:Post to confirm, since they weren't sure.

Asked 1 year ago, 1665 views
1 answer
Accepted answer

As the documentation says, an object is a "table, table version, partition, partition indexes, or database". Individual files never count, since files are not part of the catalog metadata.
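To make this concrete, here is a minimal sketch in Python that collapses the file keys from the question into distinct partition prefixes and estimates the monthly charge from the quoted pricing. The helper names (`partition_of`, `count_partitions`, `monthly_cost`) are hypothetical and are not part of any AWS SDK. The sketch assumes the charge scales linearly with the number of objects over the free tier (the quoted text does not say how partial blocks of 100,000 are billed), and it ignores table versions and partition indexes, which also count as objects.

```python
from typing import Iterable

FREE_OBJECTS = 1_000_000   # free tier from the quoted pricing text
PRICE_PER_100K = 1.00      # USD per 100,000 objects over the free tier, per month


def partition_of(key: str) -> str:
    """Return the partition prefix of an S3 key, e.g. 'year=2023/month=5/day=19'."""
    return key.rsplit("/", 1)[0]


def count_partitions(keys: Iterable[str]) -> int:
    """Count distinct partition prefixes; files sharing a prefix collapse into one."""
    return len({partition_of(k) for k in keys})


def monthly_cost(total_objects: int) -> float:
    """Estimate the monthly charge, ASSUMING linear pro-rating above the free tier."""
    billable = max(0, total_objects - FREE_OBJECTS)
    return billable / 100_000 * PRICE_PER_100K


# The 288 files from the question all live under one day's prefix.
keys = [f"year=2023/month=5/day=19/file{i}.parquet" for i in range(1, 289)]
partitions = count_partitions(keys)

# One partition plus one table and one database (versions and indexes ignored here).
catalog_objects = partitions + 1 + 1
print(partitions, catalog_objects, monthly_cost(catalog_objects))
```

Under these assumptions, the 288 files contribute a single partition object, and a daily partition scheme adds roughly 365 partition objects per year regardless of how many files each day holds.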

AWS, Expert, answered 1 year ago
