Crawler cannot be started. Verify the permissions in the policies attached to the IAM role defined in the crawler.


Hi there, I am trying to run a Glue crawler against a JDBC data source, which is an RDS PostgreSQL instance in the same account as the crawler.

My crawler won't start - it fails with the following error reported in CloudWatch:

Crawler cannot be started. Verify the permissions in the policies attached to the IAM role defined in the crawler.

I have given the crawler the IAM role of AWSGlueServiceRoleDefault, which has the following permission policies:

AmazonRDSFullAccess
AmazonS3FullAccess
AWSGlueServiceRole
AdministratorAccess
AWSGlueConsoleFullAccess

The role's trusted entities are as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "glue.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

I fully expected the crawler to work with just the AmazonS3FullAccess and AWSGlueServiceRole policies. Adding AdministratorAccess was obviously a desperate measure, but even that does not work, which makes me think the error the crawler reports in CloudWatch is not the actual problem.

I've already set up the Glue connection. The RDS instance it points to is publicly accessible and is known to work via other access methods.

Has anyone seen this problem before, or know what else I might check?

Asked 1 year ago · 508 views
1 Answer

The resolution for this problem was to configure the VPC, subnet, and security groups for the Glue connection, which I had not done.
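For anyone who prefers to set this up outside the console, here is a minimal sketch of the same fix with boto3. It assumes the connection is a JDBC connection; the helper names (`build_jdbc_connection_input`, `apply_connection`) and every argument value are hypothetical placeholders, not details from my setup. Note that the API takes a subnet ID, a list of security group IDs, and an availability zone rather than a VPC ID.

```python
def build_jdbc_connection_input(name, jdbc_url, username, password,
                                subnet_id, security_group_ids, availability_zone):
    """Build the ConnectionInput dict accepted by Glue's create/update connection calls."""
    return {
        "Name": name,
        "ConnectionType": "JDBC",
        "ConnectionProperties": {
            "JDBC_CONNECTION_URL": jdbc_url,
            "USERNAME": username,
            "PASSWORD": password,
        },
        # The network settings that the console wizard presents as optional.
        "PhysicalConnectionRequirements": {
            "SubnetId": subnet_id,
            "SecurityGroupIdList": list(security_group_ids),
            "AvailabilityZone": availability_zone,
        },
    }


def apply_connection(connection_input, region_name=None):
    """Overwrite an existing Glue connection with the given definition."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    glue = boto3.client("glue", region_name=region_name)
    glue.update_connection(Name=connection_input["Name"],
                           ConnectionInput=connection_input)
```

Calling `apply_connection(build_jdbc_connection_input(...))` with your own subnet and security group IDs replaces the whole connection definition, so pass the JDBC URL and credentials again as well.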

I was misled by two things:

  1. The 'Create connection' wizard which describes Network options as optional ("If your AWS Glue job needs to run on Amazon Elastic Compute Cloud (EC2) instances in a virtual private cloud (VPC) subnet").
  2. The error message above which suggests the problem is with IAM role permissions.

The page that unlocked things for me was this one:

Testing an AWS Glue connection
https://docs.aws.amazon.com/glue/latest/dg/console-test-connections.html

To others who encounter this problem, I would suggest testing your AWS Glue connection and making sure that works before worrying about what the Glue crawler is telling you.
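As a quick pre-check before running the console test, something like the sketch below can tell you whether a connection has any network settings at all. It is not a substitute for the connection test itself. It only reads the connection definition via `get_connection` and reports which fields are empty. The function names are hypothetical, and the connection name you pass in is your own.

```python
def missing_network_settings(connection):
    """Return the network fields that are absent or empty on a Glue connection dict."""
    reqs = connection.get("PhysicalConnectionRequirements") or {}
    missing = []
    if not reqs.get("SubnetId"):
        missing.append("SubnetId")
    if not reqs.get("SecurityGroupIdList"):
        missing.append("SecurityGroupIdList")
    return missing


def check_connection(name, region_name=None):
    """Fetch a Glue connection by name and list its missing network fields."""
    import boto3  # imported lazily so the check above can be used without boto3

    glue = boto3.client("glue", region_name=region_name)
    connection = glue.get_connection(Name=name, HidePassword=True)["Connection"]
    return missing_network_settings(connection)
```

An empty list only means the fields are populated; whether the subnet and security groups actually allow traffic to the database is what the connection test verifies.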

Answered 1 year ago
