Errors given when specifying permissions for a data pipeline

I'm trying to set up an AWS Data Pipeline so I can clone large Hugging Face repos to S3.

I'm encountering issues when creating the permissions policy to use with a role for my data pipeline. I'm attempting to follow the example policy, but I get these errors:

Ln 16, Col 16: Invalid ARN Account: The resource ARN account ID 111122223333 is not valid. Provide a 12-digit account ID.
Ln 70, Col 16: Invalid Action: The action elasticmapreduce:GetCluster does not exist.
Ln 97, Col 16: Invalid Action: The action s3:ListMultipartUploads does not exist.
Ln 111, Col 16: Invalid Action: The action s3:GetObjectMetadata does not exist.

  • I don't think the error on line 16 is anything to worry about. I use the same ARN as line 15 (no error raised), and I've copied it directly from my account credentials, so I don't see why this would be an issue.

  • For line 70 I have just removed the line, as I couldn't find a similar action in the Edit statement pane on the right-hand side. I'm not currently using clusters, so I can go without it for now. However, I will be using them in the near future, so it would be beneficial to have this action.

  • In line 97 I have replaced Action "s3:ListMultipartUploads" with "s3:ListMultipartUploadParts". Is this an equivalent?

  • For line 111, I replaced "s3:GetObjectMetadata" with "s3:GetObjectAttributes". Again, is this equivalent?

It looks to me as though the names of these actions have changed but the example policy in the documentation has not been updated to reflect this. Am I correct in my assumption, or am I not creating the policy correctly?

2 Answers
Accepted Answer

Short Description:

You are setting up an AWS Data Pipeline using the reference policy from the AWS documentation [1] “Example Permissions Policies for AWS Data Pipeline Roles” and are getting errors for invalid actions.

Resolution:

  • There is an extraneous space on line 16 in the referenced documentation, so the account ID in the resource ARN fails validation. To resolve this error, update the account ID in the resource ARN and make sure there are no extraneous spaces. Account IDs are 12-digit integers [1].

  • The API elasticmapreduce:GetCluster does not exist, but there are other supported APIs, such as elasticmapreduce:DescribeCluster and elasticmapreduce:ListClusters, that can be used instead; see document [2] for reference.

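  • The API s3:ListMultipartUploads is a valid API operation, but that name cannot be specified as an action in a policy. Document [3] lists s3:ListMultipartUploadParts, which covers listing the parts of a single upload, and s3:ListBucketMultipartUploads, which covers listing the in-progress multipart uploads in a bucket and is the closer match to the original action.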

  • The API s3:GetObjectMetadata does not exist, but you may use s3:GetObjectAttributes, as that API retrieves all the metadata from an object without returning the object itself. This action is useful if you're interested only in an object's metadata [4].
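
The fixes above could be applied to a policy document programmatically. What follows is only a minimal sketch, not an official tool: the helper names `invalid_account_ids` and `clean_policy` are hypothetical, the replacement table simply mirrors the substitutions discussed in this thread (check each one against the Service Authorization Reference for your use case), and it assumes that no ARN in the policy legitimately contains whitespace.

```python
import copy
import re

# Substitutions taken from this thread. These are assumptions, not an
# authoritative mapping; verify each against the Service Authorization
# Reference before relying on it.
ACTION_REPLACEMENTS = {
    "elasticmapreduce:GetCluster": "elasticmapreduce:DescribeCluster",
    "s3:ListMultipartUploads": "s3:ListMultipartUploadParts",
    "s3:GetObjectMetadata": "s3:GetObjectAttributes",
}


def _as_list(value):
    """IAM allows a single string or a list for Action/Resource."""
    return [value] if isinstance(value, str) else list(value)


def _statements(policy):
    """Return the policy's statements as a list."""
    statements = policy.get("Statement", [])
    return [statements] if isinstance(statements, dict) else statements


def invalid_account_ids(policy):
    """Return resource ARNs whose account field is not exactly 12 digits."""
    bad = []
    for stmt in _statements(policy):
        for arn in _as_list(stmt.get("Resource", [])):
            # arn:partition:service:region:account-id:resource
            parts = arn.split(":", 5)
            if len(parts) != 6:
                continue  # e.g. "*", which has no account field
            account = parts[4]
            # Empty (S3 buckets) and "aws" (managed policies) are skipped.
            if account in ("", "aws"):
                continue
            if not re.fullmatch(r"\d{12}", account):
                bad.append(arn)
    return bad


def clean_policy(policy, replacements=ACTION_REPLACEMENTS):
    """Return a copy with stray ARN whitespace removed and actions renamed."""
    cleaned = copy.deepcopy(policy)
    for stmt in _statements(cleaned):
        if "Action" in stmt:
            renamed = [replacements.get(a, a) for a in _as_list(stmt["Action"])]
            # Drop duplicates while keeping the original order.
            stmt["Action"] = list(dict.fromkeys(renamed))
        if "Resource" in stmt:
            stmt["Resource"] = ["".join(r.split()) for r in _as_list(stmt["Resource"])]
    return cleaned


if __name__ == "__main__":
    # Illustrative policy; the role name and account ID are placeholders.
    example = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["elasticmapreduce:GetCluster", "s3:GetObjectMetadata"],
            "Resource": ["arn:aws:iam::111122223333 :role/ExampleRole"],
        }],
    }
    print(invalid_account_ids(example))
    print(clean_policy(example))
```

Running `invalid_account_ids` before and after `clean_policy` gives a quick check that the stray space was the only problem with the ARN. The policy editor's own validation remains the final authority on whether the result is accepted.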

Lastly, you are correct in your assumption, and I have notified the internal team of the invalid APIs in the "AWS Data Pipeline" documentation.

If you need further assistance troubleshooting a specific error, I recommend opening a ticket with AWS Support.

References:

[1] https://docs.aws.amazon.com/IAM/latest/UserGuide/access-analyzer-reference-policy-checks.html

[2] https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonelasticmapreduce.html

[3] https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazons3.html

[4] https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectAttributes.html

AWS
answered 9 months ago
1
  • Line 16: The example has a space after the account number; remove the space.
  • Line 70: I would try DescribeCluster instead of GetCluster.
  • Line 97: Your substitution looks reasonable.
  • Line 111: Your substitution looks reasonable.
AWS
EXPERT
kentrad
answered 9 months ago