How do I create Amazon Redshift Spectrum cross-account access to AWS Glue and Amazon S3?

5 minute read
0

I want to use Amazon Redshift Spectrum to access AWS Glue and Amazon Simple Storage Service (Amazon S3) in a different AWS account.

Resolution

Create an IAM role that trusts Amazon Redshift

To access cross-account AWS resources with Amazon Redshift Spectrum, you must create an AWS Identity and Access Management (IAM) role that trusts Amazon Redshift. Then, attach the role to the Amazon Redshift cluster.

Complete the following steps:

  1. Open the IAM console.
  2. In the navigation pane, choose Roles.
  3. Choose Create role.
  4. For Select type of trusted entity, choose AWS service.
  5. Choose Redshift.
  6. For Select your use case, choose Redshift - Customizable.
  7. Choose Next: Permissions, Next: Tags, and then Next: Review.
    Note: You don't need to add policies or tags.
  8. Enter a Role name, and then choose Create role. In the following examples, the role is redshift_role1.
  9. Associate redshift_role1 with the Amazon Redshift cluster. This association allows your cluster to assume the newly created role to access Amazon S3, Amazon Athena, and AWS Glue.

Create a new IAM role with access to AWS Glue and Amazon S3

In a different account, create a new IAM role with access to both AWS Glue and Amazon S3. Amazon Redshift assumes the role to access AWS Glue and S3 resources through the trust relationship.

Complete the following steps:

  1. Open the IAM console.

  2. Choose Policies, and then choose Create policy.

  3. Choose the JSON tab, and then enter an IAM policy similar to the following example:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "VisualEditor0",
          "Effect": "Allow",
          "Action": [
            "glue:BatchCreatePartition",
            "glue:UpdateDatabase",
            "glue:CreateTable",
            "glue:DeleteDatabase",
            "glue:GetTables",
            "glue:GetPartitions",
            "glue:BatchDeletePartition",
            "glue:UpdateTable",
            "glue:BatchGetPartition",
            "glue:DeleteTable",
            "glue:GetDatabases",
            "glue:GetTable",
            "glue:GetDatabase",
            "glue:GetPartition",
            "glue:CreateDatabase",
            "glue:BatchDeleteTable",
            "glue:CreatePartition",
            "glue:DeletePartition",
            "glue:UpdatePartition"
          ],
          "Resource": "*"
        },
        {
          "Sid": "VisualEditor1",
          "Effect": "Allow",
          "Action": [
            "s3:GetObject",
            "s3:ListBucketMultipartUploads",
            "s3:ListBucket",
            "s3:GetBucketLocation",
            "s3:ListMultipartUploadParts"
          ],
          "Resource": [
            "arn:aws:s3:::your_bucket",
            "arn:aws:s3:::your_bucket/*"
          ]
        },
        {
          "Sid": "VisualEditor2",
          "Effect": "Allow",
          "Action": [
            "kms:Decrypt"
          ],
          "Resource": [
            "<KMS_KEY_ARN>"
          ]
        }
      ]
    }

    Note: Replace your_bucket with the name of the S3 bucket that you want to access with Amazon Redshift Spectrum. Replace KMS_KEY_ARN with the ARN of the KMS key that encrypts your S3 bucket. If the files in your S3 bucket are encrypted, then grant the proper permissions to Amazon Redshift.

  4. Choose Review policy.

  5. Enter a Name for the policy, and then choose Create policy.

  6. In the navigation pane, choose Roles, and then choose Create role.

  7. For Select type of trusted entity, choose Another AWS account.

  8. For Account ID, enter the ID of the account that uses Amazon Redshift. For more information, see Using an alias for your AWS account ID.

  9. Choose Next: Permissions.

  10. In the list of policies, select the name of the policy that you created.

  11. Choose Next: Tags and then choose Next: Review.
    Note: You don't need to add tags.

  12. Enter a Role name, and then choose Create role. In the following examples, the role is glue_s3_role2.

  13. In the navigation pane, choose Roles.

  14. Select the name of the role that you created, and then choose the Trust Relationships tab.

  15. Choose Edit Trust Relationship.

  16. Delete the existing policy, and then replace it with a policy similar to the following example:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::redshift_account1:role/redshift_role1"
          },
          "Action": "sts:AssumeRole"
        }
      ]
    }

    Note: Replace redshift_account1 with the ID of the account that uses Amazon Redshift. Replace redshift_role1 with the name of the first role that you created.

  17. Choose Update Trust Policy.

Update permissions for the Amazon Redshift IAM role

Update the permissions for the first IAM role that you created. Complete the following steps:

  1. Open the IAM console.

  2. Choose Roles from the navigation pane, and then choose the name of the first role (redshift_role1) that you created earlier.

  3. Under Permissions, choose Add inline policy.

  4. Choose the JSON tab. Delete the existing policy, and then enter an IAM policy similar to the following example:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "Stmt1487639602000",
          "Effect": "Allow",
          "Action": [
            "sts:AssumeRole"
          ],
          "Resource": "arn:aws:iam::glue_s3_account2:role/glue_s3_role2"
        }
      ]
    }

    Note: Replace glue_s3_account2 with the ID of the account that uses AWS Glue and Amazon S3. Replace glue_s3_role2 with the name of the second role that you created.

  5. Choose Review policy.

  6. Enter a name for the policy, and then choose Create policy.

  7. Connect to the Amazon Redshift cluster.

  8. Create an external schema that uses the roles that you created in both accounts. See the following example:

    create external schema spectrum_schema
    from data catalog
    database 'your_db'
    iam_role 'arn:aws:iam::redshift_account1:role/redshift_role1,arn:aws:iam::glue_s3_account2:role/glue_s3_role2'

    Note: Replace the following values:

    your_db: the name of your database in AWS Glue.
    redshift_account1: the ID of the Amazon Redshift account.
    redshift_role1: the name of the role that you created in the Amazon Redshift account.
    glue_s3_account2: the ID of the AWS Glue and Amazon S3 account.
    glue_s3_role2: the name of the role that you created in the AWS Glue and Amazon S3 account.

    In the Create External Schema DDL statement, list the IAM role ARNs in the following order: redshift_cluster_role, destination_account_role. Don't include spaces between the roles.

You can now use Amazon Redshift Spectrum from your Amazon Redshift cluster in redshift_account1 to query AWS Glue tables in glue_s3_account2. All resources must be in the same AWS Region. You don't need to recreate your external tables because Amazon Redshift Spectrum can access your existing AWS Glue tables.

The following query returns the number of rows in an AWS Glue table created in the external schema:

select count(*) from spectrum_schema.glue_table;

Note: Replace spectrum_schema and glue_table with the name of your schema and AWS Glue table.

Related information

Authorizing Amazon Redshift to access AWS services on your behalf

IAM policies for Amazon Redshift Spectrum

AWS OFFICIAL
AWS OFFICIALUpdated a month ago