Best way to set up a bucket with access points?

Hello. As part of a SaaS solution, I'm currently setting up the structure for an S3 bucket which will contain multiple clients' data. The idea is to use one access point per client, in order to isolate each client's data. To be clear, the data is not made accessible to the client (not directly, at least). The bucket is only used to ingest data for processing and analysis purposes.

This data is saved into different folders depending on the source type, so for example in a given access point one could have

/images/

/logs/

etc.

However, I'm unsure whether I should add extra partitioning to that, for a few reasons.

For example, one is file collision. Suppose access point A has a file /images/tree.png, and then access point B tries to add a file with the same path: how is the collision handled? That could be solved with something like a hash suffix, but I'd still like to know what would happen.

Then there is the question of scalability. This is not an issue per se, but I'm trying to think about what could happen in the future. It seems to me that having an extra partition on top of the access point would make things easier later, if any migration or refactoring is needed.

My solution would be to add the organisation ID as a prefix. Each access point would only have access (through its policy) to files in a specific subdirectory, like /12345/*. However, this means that callers of the access point need to add that prefix too, which adds an extra step for every input pushing data to the access point, instead of using the access point as if it were a bucket directly.

I'm not sure which way to go, if I'm complicating things or if there is a simpler solution, hence my question. Any advice would be greatly appreciated!

1 Answer

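Really good question. A lot to unpack. On file collisions: access points do not give each client a separate namespace; they all address the same keys in the underlying bucket. Without versioning, a second write to /images/tree.png through access point B simply overwrites the object written through access point A. Turning on object versioning for the entire bucket is the right way to go. A second write then creates a new version of /images/tree.png and the earlier version is retained. Additionally, you can use S3 Lifecycle rules to control how long the older versions are kept.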
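As a rough sketch of the versioning and lifecycle setup described above, the request bodies could be built like this. The bucket name, the rule ID and the 30-day retention are assumptions for illustration only, and `s3_client` is assumed to be a boto3 S3 client (`boto3.client("s3")`) that you create yourself.

```python
def versioning_config():
    """Body for put_bucket_versioning: keep every overwrite as a new version."""
    return {"Status": "Enabled"}


def noncurrent_expiry_lifecycle(days=30):
    """Body for put_bucket_lifecycle_configuration.

    Expires noncurrent (older) versions after `days` days.
    The rule ID and the default of 30 days are illustrative assumptions.
    """
    return {
        "Rules": [
            {
                "ID": "expire-noncurrent-versions",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to all objects
                "NoncurrentVersionExpiration": {"NoncurrentDays": days},
            }
        ]
    }


def apply_to_bucket(s3_client, bucket, days=30):
    """Apply both settings; `s3_client` is assumed to be boto3.client("s3")."""
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration=versioning_config(),
    )
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=noncurrent_expiry_lifecycle(days),
    )
```

Note that versioning keeps overwritten data recoverable, but a plain read of /images/tree.png still returns only the latest version, so it does not by itself separate one client's file from another's.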

I don't think you will need an additional suffix on your access points. Each access point's ARN already includes your 12-digit AWS account ID, but it's fine if you add one. The S3 access point policy is your vehicle for controlling which resources (object, prefix, tag) you want to grant or deny access to. You can absolutely have separate access points that point to the same objects, each with its own access point policy controlling what can be done to those objects through that access point.
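To make the prefix-scoped policy from the question concrete, here is a minimal sketch that builds an access point policy limited to one organisation's prefix. The region, account ID, access point name, role ARN and the choice of actions are all placeholder assumptions; the helper names are hypothetical and not part of any AWS SDK.

```python
import json


def access_point_policy(region, account_id, access_point_name, org_id, role_arn):
    """Build an access point policy that only allows objects under `<org_id>/`.

    Objects reached through an access point are addressed in a policy as
    <access-point-arn>/object/<key>.
    """
    ap_arn = f"arn:aws:s3:{region}:{account_id}:accesspoint/{access_point_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOrgPrefixOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],  # assumed action set
                "Resource": f"{ap_arn}/object/{org_id}/*",
            }
        ],
    }


def prefixed_key(org_id, key):
    """Key a caller would write to, e.g. '/images/tree.png' under org 12345."""
    return f"{org_id}/{key.lstrip('/')}"


if __name__ == "__main__":
    policy = access_point_policy(
        region="eu-west-1",                # placeholder
        account_id="111122223333",         # placeholder
        access_point_name="client-12345",  # placeholder
        org_id="12345",
        role_arn="arn:aws:iam::111122223333:role/ingest-12345",  # placeholder
    )
    print(json.dumps(policy, indent=2))
    print(prefixed_key("12345", "/images/tree.png"))
```

The resulting JSON string is what you would attach to the access point, for example with the S3 Control `put_access_point_policy` call. With a layout like this, two clients writing /images/tree.png end up at different keys, so the collision from the question does not arise.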

AWS
answered 3 years ago
