Crawler cannot be started. Verify the permissions in the policies attached to the IAM role defined in the crawler.

Hi there, I am trying to run a Glue crawler against a JDBC data source, which is an RDS PostgreSQL instance in the same account as the crawler.

My crawler won't start - it fails with the following error reported in CloudWatch:

Crawler cannot be started. Verify the permissions in the policies attached to the IAM role defined in the crawler.

I have given the crawler the IAM role of AWSGlueServiceRoleDefault, which has the following permission policies:

AmazonRDSFullAccess
AmazonS3FullAccess
AWSGlueServiceRole
AdministratorAccess
AWSGlueConsoleFullAccess

The role's trusted entities are as follows:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "glue.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

I fully expected the crawler to work with just the AmazonS3FullAccess and AWSGlueServiceRole policies. Adding AdministratorAccess was obviously a desperate measure, but even that did not help, which makes me think the error the crawler reports in CloudWatch is not the actual problem.

I've already set up the Glue connection. The RDS instance it points to is publicly accessible and is known to work via other access methods.

Has anyone seen this problem before, or know what else I might check?

1 Answer

The resolution was to configure the VPC, subnet and security groups for the Glue connection, which I had not done.
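For anyone who wants to check this programmatically, here is a minimal sketch, assuming boto3 is installed and credentials are configured. It reads the connection with the Glue `get_connection` call and reports which network fields under `PhysicalConnectionRequirements` are empty. The helper names `missing_network_settings` and `check_connection`, and the connection name in the usage note, are hypothetical and only for illustration.

```python
def missing_network_settings(connection):
    """Return the network fields that are empty in a Glue connection dict.

    `connection` is expected to look like the "Connection" entry returned
    by the Glue get_connection API.
    """
    reqs = connection.get("PhysicalConnectionRequirements") or {}
    missing = []
    if not reqs.get("SubnetId"):
        missing.append("SubnetId")
    if not reqs.get("SecurityGroupIdList"):
        missing.append("SecurityGroupIdList")
    return missing


def check_connection(name):
    """Fetch a Glue connection by name and list its missing network fields."""
    import boto3  # imported lazily so the helper above works without boto3

    glue = boto3.client("glue")
    connection = glue.get_connection(Name=name)["Connection"]
    return missing_network_settings(connection)
```

Calling `check_connection("my-rds-connection")` (a made-up name) and getting back a non-empty list would suggest the connection has no subnet or security groups set, which matches the situation I was in.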

I was misled by two things:

  1. The 'Create connection' wizard which describes Network options as optional ("If your AWS Glue job needs to run on Amazon Elastic Compute Cloud (EC2) instances in a virtual private cloud (VPC) subnet").
  2. The error message above which suggests the problem is with IAM role permissions.

The page that unlocked things for me was "Testing an AWS Glue connection":
https://docs.aws.amazon.com/glue/latest/dg/console-test-connections.html

To others who encounter this problem, I would suggest testing your AWS Glue connection and making sure it works before worrying about what the crawler's error message is telling you.

answered a year ago
