How do I share AWS Glue Data Catalog databases and tables cross-account using AWS Lake Formation?

I want to share AWS Glue Data Catalog databases and tables cross-account using AWS Lake Formation.

Resolution

With the Lake Formation cross-account feature, you can grant other AWS accounts permissions to read from and write to the data lake. You can share resources using either tag-based access control or the named resource method. This article focuses on granting cross-account access to Data Catalog resources using the named resource method.

Be sure that the prerequisites are met

Keep in mind the following prerequisites before you share your Data Catalog resources with another account or access resources shared from another account:

Revoke Lake Formation permissions

Revoke all Lake Formation permissions from the IAMAllowedPrincipals group for the Data Catalog resource.
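
The revoke can also be scripted. The following is a minimal sketch, assuming Python with boto3 and a Lake Formation client that you create yourself. The function names and the sales_db and orders names are hypothetical. In the Lake Formation API, the IAMAllowedPrincipals group is addressed as IAM_ALLOWED_PRINCIPALS, and Super corresponds to the ALL permission.

```python
def build_revoke_request(database_name, table_name=None):
    """Build keyword arguments for lakeformation.revoke_permissions.

    Targets the database itself when table_name is None, otherwise one table.
    """
    if table_name is None:
        resource = {"Database": {"Name": database_name}}
    else:
        resource = {"Table": {"DatabaseName": database_name, "Name": table_name}}
    return {
        "Principal": {"DataLakePrincipalIdentifier": "IAM_ALLOWED_PRINCIPALS"},
        "Resource": resource,
        "Permissions": ["ALL"],  # "ALL" is the API name for Super
    }


def revoke_iam_allowed_principals(lakeformation, database_name, table_names=()):
    """Revoke Super on the database and on each listed table."""
    lakeformation.revoke_permissions(**build_revoke_request(database_name))
    for table_name in table_names:
        lakeformation.revoke_permissions(
            **build_revoke_request(database_name, table_name)
        )
```

For example, revoke_iam_allowed_principals(boto3.client("lakeformation"), "sales_db", ["orders"]) would issue one revoke for the database and one for the orders table. Error handling is omitted from this sketch.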

Prevent new tables from having Super permissions

For Data Catalog databases that contain tables that you might share, prevent new tables from having a default grant of Super to IAMAllowedPrincipals:

  1. Open the Lake Formation console.
  2. In the navigation pane, under Data Catalog, choose Databases.
  3. Select the database that you want to update.
  4. Choose Actions, and then choose Edit.
  5. Under Default permissions for newly created tables, clear Use only IAM access control for new tables in this database.
  6. Choose Save.

For more information, see Super.
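
The console steps above can be approximated with the AWS Glue API. This is a sketch under the assumption that the check box corresponds to the database's CreateTableDefaultPermissions setting and that you pass in a boto3 Glue client. The function name is hypothetical. Because update_database replaces the database definition, the sketch first reads the current definition and carries over a few common fields; extend the list if your database uses others.

```python
# Fields copied from the existing definition into DatabaseInput.
# This list is an assumption; add other fields that your database relies on.
_CARRIED_KEYS = ("Name", "Description", "LocationUri", "Parameters")


def clear_default_table_permissions(glue, database_name):
    """Set CreateTableDefaultPermissions to an empty list for one database."""
    current = glue.get_database(Name=database_name)["Database"]
    database_input = {key: current[key] for key in _CARRIED_KEYS if key in current}
    # An empty list means new tables get no default grant to IAMAllowedPrincipals.
    database_input["CreateTableDefaultPermissions"] = []
    glue.update_database(Name=database_name, DatabaseInput=database_input)
    return database_input
```

You would call it as clear_default_table_permissions(boto3.client("glue"), "sales_db"), where sales_db is a placeholder database name.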

Add permissions required for cross-account access

If an AWS Glue Data Catalog resource policy is already in place in the account, then either remove the policy or add the permissions that cross-account grants require. For more information, see Granting cross-account access.

The following example resource policy provides cross-account AWS Glue access to account 5555666677778888 from account 1111222233334444:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ram.amazonaws.com"
      },
      "Action": "glue:ShareResource",
      "Resource": [
        "arn:aws:glue:us-east-1:1111222233334444:table/*/*",
        "arn:aws:glue:us-east-1:1111222233334444:database/*",
        "arn:aws:glue:us-east-1:1111222233334444:catalog"
      ]
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::5555666677778888:root"
      },
      "Action": "glue:*",
      "Resource": [
        "arn:aws:glue:us-east-1:1111222233334444:table/*/*",
        "arn:aws:glue:us-east-1:1111222233334444:database/*",
        "arn:aws:glue:us-east-1:1111222233334444:catalog"
      ]
    }
  ]
}
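
If you maintain this policy for several accounts or Regions, then it can help to generate it. The following sketch rebuilds the example policy from its three variable parts and applies it with the AWS Glue put_resource_policy call. The function names are hypothetical, and the Glue client is one that you create with boto3. The EnableHybrid setting is included on the assumption that the account also uses Lake Formation cross-account grants; check the API reference for your situation.

```python
import json


def build_catalog_policy(region, source_account, target_account):
    """Return the example Data Catalog resource policy as a dict."""
    resources = [
        f"arn:aws:glue:{region}:{source_account}:{suffix}"
        for suffix in ("table/*/*", "database/*", "catalog")
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Lets AWS RAM share the catalog resources
                "Effect": "Allow",
                "Principal": {"Service": "ram.amazonaws.com"},
                "Action": "glue:ShareResource",
                "Resource": resources,
            },
            {   # Gives the target account AWS Glue access
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{target_account}:root"},
                "Action": "glue:*",
                "Resource": resources,
            },
        ],
    }


def put_catalog_policy(glue, policy, hybrid=True):
    """Apply the policy to the Data Catalog of the calling account."""
    glue.put_resource_policy(
        PolicyInJson=json.dumps(policy),
        EnableHybrid="TRUE" if hybrid else "FALSE",
    )
```

Note that put_resource_policy replaces the existing policy, so merge any statements that you want to keep before you call it.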

Enable sharing with organizations

If you share Data Catalog resources with an organization or organizational unit, then enable sharing with AWS Organizations in the AWS RAM console. The AWS Identity and Access Management (IAM) user or role that enables this option must have the ram:EnableSharingWithAwsOrganization IAM permission.

For more information, see Cross-account access prerequisites.

Grant the required IAM permissions

Source account: To use the named resources method to grant cross-account permissions, you must have the required IAM permissions for AWS Glue and AWS Resource Access Manager (AWS RAM). You can choose the AWS managed policy AWSLakeFormationCrossAccountManager that grants these permissions or create a new policy based on this policy.

Target account: Data lake administrators in target accounts must have the following additional policy. This policy allows the administrator to accept the AWS RAM resource share invitations and enable resource sharing with organizations:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ram:AcceptResourceShareInvitation",
        "ram:RejectResourceShareInvitation",
        "ec2:DescribeAvailabilityZones",
        "ram:EnableSharingWithAwsOrganization"
      ],
      "Resource": "*"
    }
  ]
}

Note: The IAM user or role that receives the resource share invitation in AWS RAM must also have the glue:PutResourcePolicy IAM permission.

Share a database and its tables with the target account

To share a database and all of its tables with a target account that isn't part of the organization, complete the steps in the source account and then in the target account.

Note: If you share all the tables in a database in the source account, then any new table created in the source account is automatically shared with the target account.

In the source account, do the following:

  1. Open the Lake Formation console, and sign in as a data lake administrator.
  2. In the navigation pane, choose Databases.
  3. Select the database that you want to share.
  4. Choose Actions, and then choose Grant.
  5. Select External account.
  6. For AWS account ID or AWS organization ID, enter the account ID of the target account.
  7. For Table, be sure that All tables is selected.
  8. For Table permissions and Grantable permissions, select the access permissions that you want to grant.
  9. Choose Grant.
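
The source-account grant can also be expressed through the Lake Formation API. The following is a sketch, assuming a boto3 Lake Formation client run by a data lake administrator in the source account. The function name and the sales_db database are hypothetical. It builds the arguments for grant_permissions, using the target account ID as the principal and a table wildcard to stand for All tables.

```python
def build_share_request(target_account_id, database_name, permissions,
                        grantable=(), table_name=None):
    """Build keyword arguments for lakeformation.grant_permissions.

    With table_name=None the grant covers all tables in the database.
    """
    table = {"DatabaseName": database_name}
    if table_name is None:
        table["TableWildcard"] = {}  # equivalent of selecting "All tables"
    else:
        table["Name"] = table_name
    return {
        "Principal": {"DataLakePrincipalIdentifier": target_account_id},
        "Resource": {"Table": table},
        "Permissions": list(permissions),
        # Permissions that the target account may grant onward
        "PermissionsWithGrantOption": list(grantable),
    }
```

For example, lakeformation.grant_permissions(**build_share_request("5555666677778888", "sales_db", ["SELECT", "DESCRIBE"], ["SELECT", "DESCRIBE"])) would share all tables in sales_db with the target account. Passing table_name narrows the share to a single table.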

In the target account, do the following:

  1. Open the AWS RAM console.
  2. In the navigation pane, under Shared with me, choose Resource shares.
  3. Review the list of resource shares to which you've been granted access.
  4. To accept the invitation for the resource shared from the source account, select the resource share ID, and choose Accept resource share.
  5. Open the Lake Formation console.
  6. In the navigation pane, choose Databases.
    You can view the shared database in the list. The Owner account ID for this database shows the account ID of the source account.
  7. Select the shared database, and then choose Actions.
  8. Choose Create resource link.
  9. On the Create resource link page, do the following:
    For Resource link name, enter the name of the resource link.
    For Shared database, be sure that the name of the shared database is selected.
    For Shared database's owner ID, enter the account ID of the source account.
  10. Choose Create.
    The resource link is created.
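
The target-account steps above can be sketched with the AWS RAM and AWS Glue APIs. This example assumes boto3 clients for RAM and Glue that you create in the target account. The function names are hypothetical, and pagination of the invitation list is ignored for brevity. A database resource link is created as a database whose TargetDatabase points at the shared database in the source account.

```python
def accept_invitations_from(ram, source_account_id):
    """Accept every pending invitation sent by the source account.

    Returns the ARNs of the invitations that were accepted.
    """
    response = ram.get_resource_share_invitations()
    accepted = []
    for invitation in response.get("resourceShareInvitations", []):
        if (invitation.get("senderAccountId") == source_account_id
                and invitation.get("status") == "PENDING"):
            arn = invitation["resourceShareInvitationArn"]
            ram.accept_resource_share_invitation(resourceShareInvitationArn=arn)
            accepted.append(arn)
    return accepted


def create_resource_link(glue, link_name, shared_database, source_account_id):
    """Create a resource link that points at the shared database."""
    glue.create_database(
        DatabaseInput={
            "Name": link_name,
            "TargetDatabase": {
                "CatalogId": source_account_id,   # owner of the shared database
                "DatabaseName": shared_database,
            },
        }
    )
```

A typical call sequence would be accept_invitations_from(ram, "1111222233334444") followed by create_resource_link(glue, "sales_db_link", "sales_db", "1111222233334444"), where the link and database names are placeholders.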

Resource links are Data Catalog objects that link to metadata databases and tables, typically to databases and tables that are shared from other AWS accounts. They help to enable cross-account access to data in the data lake. After the resource link is created, you can query the tables in the shared database as a data lake administrator.

To give IAM users or roles in the target account access to the shared database, grant the required permissions on both the resource link and the shared database. The IAM principals can then view the shared database and the resource link in the Lake Formation console, and in Amazon Athena or Amazon Redshift Spectrum.

To grant access to IAM users for the resource link, do the following:

  1. Open the Lake Formation console, and sign in as a data lake administrator.
  2. In the navigation pane, choose Databases.
  3. Select the resource link that you created.
  4. Choose Actions, and then choose Grant.
  5. Under Principals, select IAM users and roles.
  6. For IAM users and roles, select the IAM user or principal for which you need to grant access.
  7. Under Resource link permissions, select Describe.
  8. Choose Grant.

To grant access to the IAM users for the shared databases, do the following:

  1. Open the Lake Formation console, and sign in as a data lake administrator.
  2. In the navigation pane, choose Databases.
  3. Select the shared database.
  4. Choose Actions, and then choose Grant.
  5. Under Principals, select IAM users and roles.
  6. For IAM users and roles, select the IAM user or principal for which you need to grant access.
  7. Under Database permissions, select Describe.
    Note: This step provides the minimum permissions for the users to view the shared database.
  8. Choose Grant.
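
The two Describe grants above, one on the resource link and one on the shared database, can be sketched as a pair of grant_permissions calls. This example assumes a boto3 Lake Formation client in the target account. The function names and the role ARN in the usage note are hypothetical. The shared database is addressed with the source account ID as its catalog ID, while the resource link lives in the target account's own catalog.

```python
def build_describe_grant(principal_arn, database_name, catalog_id=None):
    """Build grant_permissions arguments for Describe on one database."""
    database = {"Name": database_name}
    if catalog_id is not None:
        database["CatalogId"] = catalog_id  # set for databases owned elsewhere
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"Database": database},
        "Permissions": ["DESCRIBE"],
    }


def grant_describe_on_link_and_target(lakeformation, principal_arn, link_name,
                                      shared_database, source_account_id):
    """Grant Describe on the resource link, then on the shared database."""
    lakeformation.grant_permissions(
        **build_describe_grant(principal_arn, link_name)
    )
    lakeformation.grant_permissions(
        **build_describe_grant(principal_arn, shared_database, source_account_id)
    )
```

For example, grant_describe_on_link_and_target(lakeformation, "arn:aws:iam::5555666677778888:role/analyst", "sales_db_link", "sales_db", "1111222233334444") would let the analyst role see both objects.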

To grant access to all tables or to specific tables in the shared database, do the following:

  1. Select the resource link.
  2. Choose Actions, and then choose Grant.
  3. Select IAM users and roles.
  4. For IAM users and roles, select the user/principal for which you want to grant access.
  5. Under LF-Tags or catalog resources, do the following:
    To grant access to all tables in the database, for Tables - optional, select All tables.
    To grant access to only specific tables in the database, for Tables - optional, select the tables.
  6. For Table permissions and Grantable permissions, select Select and Describe.
  7. Choose Grant.

Note: You can grant only those permissions that you selected for Grantable permissions in the source account.
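
As an illustration of the table grant, the following sketch builds grant_permissions arguments for either all tables or a list of specific tables. It assumes that the grants are made in the target account against the tables in the shared database, which are identified by the source account ID as the catalog ID. The function name and the example names are hypothetical.

```python
def build_table_grants(principal_arn, source_account_id, shared_database,
                       table_names=None):
    """Return a list of grant_permissions argument sets.

    table_names=None yields a single grant that covers all tables;
    otherwise one grant is produced per named table.
    """
    if table_names is None:
        targets = [{"TableWildcard": {}}]
    else:
        targets = [{"Name": name} for name in table_names]

    grants = []
    for target in targets:
        table = {"CatalogId": source_account_id, "DatabaseName": shared_database}
        table.update(target)
        grants.append({
            "Principal": {"DataLakePrincipalIdentifier": principal_arn},
            "Resource": {"Table": table},
            "Permissions": ["SELECT", "DESCRIBE"],
        })
    return grants
```

Each element of the returned list can be passed to a boto3 Lake Formation client as lakeformation.grant_permissions(**grant). The calls succeed only for permissions that the source account made grantable.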

After granting the required permissions, you can successfully query the table in Athena from the target account.

Share only tables with the target account

To share individual tables with the target account, follow the instructions from the previous section with the following changes.

Source account:

To grant access to the target account from the Lake Formation console, select the individual tables instead of selecting the database.

Target account:

  • Accept the resource share in the AWS RAM console to access the shared table in the Lake Formation console.
  • Create a resource link for the shared table. After the resource link is created, you can query the shared table as a data lake administrator.
  • To grant access to the IAM users/principals for the shared table, you must grant permissions for the resource link.

Review additional considerations

  • When you grant permissions on a table, you can restrict access to particular columns in the table. If you do so, then the target account can view only those columns in the shared table.
  • Be sure that the IAM users/principals from the target account have access to the Amazon Simple Storage Service (Amazon S3) path in the source account.
  • If you revoke the permissions that were granted earlier from the source account, then the target account can't access the shared database/table. However, the resource link that you created in the target account isn't automatically deleted. You must manually delete the resource link.
  • When you delete a database/table, the resource shares in AWS RAM aren't deleted automatically. Therefore, you must manually revoke the cross-account permissions before deleting a shared database or table.
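
To illustrate the column restriction mentioned in the first consideration, the following sketch builds a source-account grant that shares only selected columns of one table. It assumes the Lake Formation grant_permissions API with a TableWithColumns resource. The function name and the table and column names in the usage note are hypothetical.

```python
def build_column_share_request(target_account_id, database_name, table_name,
                               column_names):
    """Build grant_permissions arguments that share only the listed columns."""
    if not column_names:
        raise ValueError("column_names must list at least one column")
    return {
        "Principal": {"DataLakePrincipalIdentifier": target_account_id},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database_name,
                "Name": table_name,
                "ColumnNames": list(column_names),  # only these are visible
            }
        },
        # Column-level grants apply to Select
        "Permissions": ["SELECT"],
    }
```

For example, build_column_share_request("5555666677778888", "sales_db", "orders", ["order_id", "order_date"]) produces arguments that a data lake administrator in the source account could pass to lakeformation.grant_permissions.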

Related information

Viewing shared Data Catalog tables and databases

Creating resource links

Granting data location permissions (external account)

Granting and revoking permissions on Data Catalog resources

How AWS Lake Formation cross-account feature works

AWS OFFICIAL
Updated 3 years ago