Lake Formation Data locations vs Data lake locations


I am trying to figure out the difference between two Lake Formation components: Data locations and Data lake locations. Data lake locations is in the Administration section of Lake Formation and asks for an S3 path and an IAM role. Data locations is in the Permissions section and asks for pretty much the same.

I was also not able to find a way to provision a Data lake location through CloudFormation or CDK.

If anyone can provide a good definition of both, or point to documentation that explains when to use each, it would be awesome! Thank you.

asked 10 months ago, 992 views
1 Answer
Accepted Answer

'Data lake locations' under 'Administration' is used for registering a data lake location [1][2]. A data lake location here is an S3 location. When registering a location, you associate an IAM role with it [3]. This means any access to tables pointing to this registered location (including all sub-directories/paths) will use the associated role to access the data. Users running a query do not need IAM S3 permissions of their own to access the data; Lake Formation (LF) vends credentials for the registered role instead.
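
As a rough illustration of the registration step described above, here is a minimal Python sketch using the boto3 `lakeformation` client's `register_resource` call. The helper names (`build_register_request`, `register_location`), the bucket path, and the role ARN are hypothetical placeholders, not something from the console or the docs; the request builder is kept separate so it can be inspected without AWS credentials.

```python
def build_register_request(bucket_path: str, role_arn: str) -> dict:
    """Build keyword arguments for Lake Formation's RegisterResource API.

    bucket_path is a hypothetical "bucket/prefix" string; Lake Formation
    expects the location as an S3 ARN.
    """
    return {
        "ResourceArn": f"arn:aws:s3:::{bucket_path}",
        # The role Lake Formation assumes when vending credentials
        # for data under this location.
        "RoleArn": role_arn,
        # Use the custom role above instead of the service-linked role.
        "UseServiceLinkedRole": False,
    }


def register_location(bucket_path: str, role_arn: str) -> dict:
    """Register the S3 path as a data lake location (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("lakeformation")
    return client.register_resource(**build_register_request(bucket_path, role_arn))
```

Calling `register_location("my-bucket/raw", "arn:aws:iam::111122223333:role/MyLfRole")` (both values are made up) would be the programmatic equivalent of adding an entry under 'Administration' > 'Data lake locations'.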

Once the location is registered, suppose a user (other than a data lake admin) tries to create a table pointing to this S3 path (or a subpath). The request fails with 'Insufficient permission on s3 path', because the user needs the LF DATA_LOCATION_ACCESS permission on the S3 path. This permission is granted by a data lake administrator under 'Permissions' > 'Data locations'. A principal with this LF permission can create/alter a catalog resource (database/table/partition) that points to the registered location. Please note that this permission (DATA_LOCATION_ACCESS) is not required in order to access the data in S3.
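
The grant described above could look roughly like this in Python, using the boto3 `lakeformation` client's `grant_permissions` call. The helper names and the ARNs in the usage note are hypothetical; this is a sketch of one way to issue the grant, assuming the caller is a data lake administrator.

```python
def build_grant_request(principal_arn: str, location_arn: str) -> dict:
    """Build keyword arguments for Lake Formation's GrantPermissions API."""
    return {
        # The IAM user or role that should be allowed to create/alter
        # catalog resources pointing at the location.
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        # The already-registered S3 location, given as an ARN.
        "Resource": {"DataLocation": {"ResourceArn": location_arn}},
        "Permissions": ["DATA_LOCATION_ACCESS"],
    }


def grant_data_location_access(principal_arn: str, location_arn: str) -> dict:
    """Grant DATA_LOCATION_ACCESS (needs AWS credentials of an LF admin)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    client = boto3.client("lakeformation")
    return client.grant_permissions(**build_grant_request(principal_arn, location_arn))
```

For example, `grant_data_location_access("arn:aws:iam::111122223333:role/EtlRole", "arn:aws:s3:::my-bucket/raw")` (made-up values) corresponds to adding a grant under 'Permissions' > 'Data locations'.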

In CloudFormation, 'AWS::LakeFormation::Resource' [4] is used for registering a data lake location, and 'AWS::LakeFormation::PrincipalPermissions' [5] is used for granting the LF 'DATA_LOCATION_ACCESS' permission to a principal on a registered location.
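
To show how the two resource types fit together, here is a small Python sketch that assembles a CloudFormation template as a dictionary and prints it as JSON. The logical IDs (`RegisteredLocation`, `LocationGrant`), the function name, and all ARNs are hypothetical; treat it as a starting point under those assumptions rather than a tested deployment.

```python
import json


def build_template(location_arn: str, role_arn: str, principal_arn: str) -> dict:
    """Assemble a CloudFormation template with both Lake Formation resources."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            # Equivalent of 'Administration' > 'Data lake locations'.
            "RegisteredLocation": {
                "Type": "AWS::LakeFormation::Resource",
                "Properties": {
                    "ResourceArn": location_arn,
                    "RoleArn": role_arn,
                    "UseServiceLinkedRole": False,
                },
            },
            # Equivalent of 'Permissions' > 'Data locations'.
            "LocationGrant": {
                "Type": "AWS::LakeFormation::PrincipalPermissions",
                # Grant only after the location has been registered.
                "DependsOn": "RegisteredLocation",
                "Properties": {
                    "Principal": {"DataLakePrincipalIdentifier": principal_arn},
                    "Resource": {
                        "DataLocation": {
                            "CatalogId": {"Ref": "AWS::AccountId"},
                            "ResourceArn": location_arn,
                        }
                    },
                    "Permissions": ["DATA_LOCATION_ACCESS"],
                    "PermissionsWithGrantOption": [],
                },
            },
        },
    }


if __name__ == "__main__":
    template = build_template(
        "arn:aws:s3:::my-bucket/raw",                # hypothetical location
        "arn:aws:iam::111122223333:role/MyLfRole",   # hypothetical registered role
        "arn:aws:iam::111122223333:role/EtlRole",    # hypothetical grantee
    )
    print(json.dumps(template, indent=2))
```

For CDK, the L1 constructs generated from these same resource types should accept the same properties, though that mapping is not covered in the answer above.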

In short, 'Administration' > 'Data lake locations' is for registering a data lake location, whereas 'Permissions' > 'Data locations' is for granting a principal permission to create/alter catalog resources pointing to this registered location.

answered 9 months ago