Did AWS break Bedrock Batch execution service role check?

I've had Bedrock Batch execution running fine with a role whose trust policy is similar to the one specified here: https://docs.aws.amazon.com/bedrock/latest/userguide/batch-iam-sr.html

The only difference is that the region in aws:SourceArn is a wildcard (*):

{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "bedrock.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "123456789012"
                },
                "ArnEquals": {
                    "aws:SourceArn": "arn:aws:bedrock:*:123456789012:model-invocation-job/*"
                }
            }
        }
    ]
}

A couple of days ago this just stopped working, and Batch Inference in the Bedrock console now shows the job status as "Failed" with the error: "Failed to start batch job due to invalid configuration: Bedrock cannot retrieve credential with the provided role. Ensure the role's trust policy grants access to Bedrock."

When I remove both conditions from the policy, the batch job starts just fine. If I leave either condition in, the execution fails with the error above.

This seems like quite a significant security issue. I don't want to leave aws:SourceAccount unspecified, as that opens up potential cross-account attack paths.

What's going on here?

UPDATE:

It really seems that AWS changed something on Wed/Thu this week that broke my Bedrock batch execution. I didn't change any code or configuration, yet automated processes suddenly started to fail with the error above.

After fiddling with the setup, it seems to be connected to the VPC configuration in Batch execution (https://docs.aws.amazon.com/bedrock/latest/userguide/batch-vpc.html):

  1. WORKS: Running batch without VPC configuration. Without VPC configuration, every trust policy variant seems to work fine (one condition, multiple conditions, or none).

  2. WORKS: Running batch with VPC configuration and a trust policy that has no conditions.

  3. DOES NOT WORK: Running batch with VPC configuration when the trust policy has any condition(s). And I mean any, including the following:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Principal": {
				"Service": "bedrock.amazonaws.com"
			},
			"Action": "sts:AssumeRole",
			"Condition": {
				"StringLike": {
					"aws:SourceAccount": "*"
				}
			}
		}
	]
}

When that wildcard condition is removed, batch execution works fine.
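
The three cases above can be written down as a small test matrix, which makes it easier to re-run the same combinations later or to attach them to a support case. The following is a minimal sketch using only the Python standard library; the variant labels and the `build_matrix` helper are hypothetical names chosen for illustration, and the "reported" column simply encodes the outcomes described in this post rather than anything verified against AWS.

```python
import json
from itertools import product

ACCOUNT_ID = "123456789012"  # placeholder account ID used throughout the post


def trust_policy(condition=None):
    """Build a Bedrock service-role trust policy, optionally with a Condition block."""
    statement = {
        "Effect": "Allow",
        "Principal": {"Service": "bedrock.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }
    if condition:
        statement["Condition"] = condition
    return {"Version": "2012-10-17", "Statement": [statement]}


# Hypothetical labels for the condition variants tried in the post.
CONDITION_VARIANTS = {
    "none": None,
    "wildcard-account": {"StringLike": {"aws:SourceAccount": "*"}},
    "account-only": {"StringEquals": {"aws:SourceAccount": ACCOUNT_ID}},
    "account-and-arn": {
        "StringEquals": {"aws:SourceAccount": ACCOUNT_ID},
        "ArnEquals": {
            "aws:SourceArn": f"arn:aws:bedrock:*:{ACCOUNT_ID}:model-invocation-job/*"
        },
    },
}


def build_matrix():
    """Return every (variant, VPC on/off) combination with the outcome reported in the post."""
    rows = []
    for (name, condition), use_vpc in product(CONDITION_VARIANTS.items(), (False, True)):
        # Reported behaviour: only "VPC config + any condition" fails.
        reported = "FAILS" if (use_vpc and condition) else "WORKS"
        rows.append(
            {"variant": name, "vpc": use_vpc, "reported": reported,
             "policy": trust_policy(condition)}
        )
    return rows


if __name__ == "__main__":
    for row in build_matrix():
        print(f"{row['variant']:<18} vpc={row['vpc']!s:<5} {row['reported']}")
    # Print one policy document to paste into the IAM console.
    print(json.dumps(trust_policy(CONDITION_VARIANTS["account-only"]), indent=4))
```

Each row carries the full policy document, so a failing combination can be reproduced exactly as it was tested.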

So what to do? I feel like I don't want to pay for AWS support to just tell them they f'd something up.

2 Answers
2

To maintain security without triggering the "invalid configuration" error, make two specific changes:

  1. Specify the Region: Replace the * with the actual region where you are running the jobs (e.g., us-east-1).

  2. Use ArnLike for Wildcards: If you must use wildcards (for the region or the job ID), you should use the ArnLike condition operator instead of ArnEquals.

Updated Trust Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "bedrock.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "123456789012"
                },
                "ArnLike": {
                    "aws:SourceArn": "arn:aws:bedrock:us-east-1:123456789012:model-invocation-job/*"
                }
            }
        }
    ]
}

My understanding of why it stopped working

AWS services occasionally update their "Confused Deputy" prevention logic. Note that the IAM documentation describes ArnEquals and ArnLike as behaving identically, and both accept wildcards, so ArnEquals with a wildcard is valid on paper. Using ArnLike for patterns is a readability convention, and pinning the region narrows the trust as a precaution.

Security Note: Keeping aws:SourceAccount is correct and necessary. Switching to ArnLike and a specific region is intended to resolve the validation error while keeping the cross-account protection fully intact.
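
If jobs run in more than one region, the region-specific policy above can be generated instead of hand-edited. The following is a minimal sketch, assuming that one aws:SourceArn pattern per region is acceptable; `regional_trust_policy` is a hypothetical helper name, and whether this form avoids the failure described in the question is not verified here.

```python
import json
import re


def regional_trust_policy(account_id, regions):
    """Build the trust policy above with one aws:SourceArn pattern per region."""
    if not re.fullmatch(r"\d{12}", account_id):
        raise ValueError("account_id must be a 12-digit string")
    if not regions:
        raise ValueError("at least one region is required")
    source_arns = [
        f"arn:aws:bedrock:{region}:{account_id}:model-invocation-job/*"
        for region in regions
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    # Multiple values for one condition key are evaluated as a logical OR.
                    "ArnLike": {"aws:SourceArn": source_arns},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Region names here are examples only.
    print(json.dumps(regional_trust_policy("123456789012", ["us-east-1", "eu-west-1"]), indent=4))
```

The account ID check is there only to catch copy-paste mistakes before the document reaches IAM.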

EXPERT
answered a month ago
  • This didn't help. I've tried quite a few combinations, and the only policy change that works is removing the conditions completely.

    I dug a bit deeper and noticed that this is related to the VPC configuration. I'm running batch execution with a VPC config (https://docs.aws.amazon.com/bedrock/latest/userguide/batch-vpc.html), and this seems to make a difference:

    1. The policy proposed in the documentation, run with the VPC config, results in status "Failed" with the error message from the original post.
    2. The policy proposed in the documentation, run without the VPC config, results in successful batch execution.
    3. A policy without any conditions, run with the VPC config, results in successful batch execution.

    When I said that this just stopped working a couple of days ago, I meant that I didn't make any code or configuration changes. Automated jobs just started to fail, which indicates that something was changed by AWS (somewhere around Wed/Thu this week). It's a bit frustrating, as I can't find any information about changes related to VPC, VPC endpoints or Bedrock execution!

    I'm also using VPC endpoints within the VPC (for the bedrock-runtime and bedrock services), but I don't know if that has anything to do with this (at least it shouldn't). I'm not sure whether this is such a rare scenario that it has been ongoing for multiple days without any fixes or health notifications.

1

This appears to be a condition key propagation issue specific to VPC endpoint routing.

What's happening

When Bedrock Batch runs with VPC configuration, the sts:AssumeRole call is likely routed through the VPC endpoint. In this path, the aws:SourceArn condition context key is either not populated or arrives with a different value, causing the condition check to fail. This would explain why removing the condition entirely, or running without VPC, works fine.

Ref: IAM condition context keys reference

Workaround that maintains cross-account protection

Remove aws:SourceArn but keep aws:SourceAccount:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "bedrock.amazonaws.com"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringEquals": {
                    "aws:SourceAccount": "123456789012"
                }
            }
        }
    ]
}

This protects against the confused deputy cross-account attack vector. You lose job-level ARN scoping temporarily, but the account-level guard remains intact.
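
While conditions are being added and removed during troubleshooting, it is easy to leave a role with no guard at all. The following is a minimal sketch of a local check that reports which of the two confused deputy keys a trust policy still carries; `guard_keys_per_statement` is a hypothetical helper name, and the check only looks at which keys are present, not whether their values are restrictive (a value of "*" would still be reported as present).

```python
# Condition key names are not case-sensitive in IAM, so compare in lowercase.
GUARD_KEYS = {"aws:sourceaccount", "aws:sourcearn"}


def guard_keys_per_statement(policy):
    """For each Allow statement that trusts a service principal, list the guard keys it uses."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    report = []
    for stmt in statements:
        principal = stmt.get("Principal", {})
        if stmt.get("Effect") != "Allow":
            continue
        if not isinstance(principal, dict) or "Service" not in principal:
            continue
        found = set()
        # Condition maps operator -> {key: value}; collect keys across all operators.
        for operator_block in stmt.get("Condition", {}).values():
            for key in operator_block:
                if key.lower() in GUARD_KEYS:
                    found.add(key.lower())
        report.append({"service": principal["Service"], "guards": sorted(found)})
    return report


if __name__ == "__main__":
    workaround = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"aws:SourceAccount": "123456789012"}},
            }
        ],
    }
    print(guard_keys_per_statement(workaround))
```

An empty "guards" list for a service principal means the role can currently be assumed by that service on behalf of any account.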

Recommended next step

Open an AWS Support case under "Account and billing" (free) and reference this thread. A working configuration breaking with no changes on your side, specifically tied to VPC batch config, is a regression worth reporting.

answered a month ago
EXPERT
reviewed a month ago
  • This workaround didn't work either. I tried multiple conditions for aws:SourceAccount and aws:SourceArn with StringEquals, StringLike, ArnEquals and ArnLike, but none of them work when VPC config is enabled. That included this exact version too.
