AWS Step Functions Local does not handle a failure in a nested step function correctly. What is the best place to report a bug? Is this forum a good place for this - or is an issue tracker available somewhere?
The issue occurs with nested step functions, i.e. an inner step function that is invoked by an outer one. If the inner step function fails (e.g. via a Fail state), the outer step function incorrectly tries to parse the inner execution's Output (which is null because of the failure) and itself fails with a States.Runtime error (cause: argument "content" is null).
To reproduce, use the latest amazon/aws-stepfunctions-local:1.10.1 Docker image. Launch the container with the following command (note that STEP_FUNCTIONS_ENDPOINT points to the container itself to enable nested step function execution):
docker run -p 8083:8083 -e STEP_FUNCTIONS_ENDPOINT=http://localhost:8083 amazon/aws-stepfunctions-local
Then create a simple HelloWorld inner step function in the Step Functions Local container with a single Fail state:
aws stepfunctions --endpoint-url http://localhost:8083 create-state-machine --definition "{\
\"StartAt\": \"HelloFail\",\
\"States\": {\
\"HelloFail\": {\
\"Type\": \"Fail\",\
\"Error\": \"TestFailure\",\
\"Cause\": \"Test cause\"\
}\
}}" --name "HelloWorld" --role-arn "arn:aws:iam::012345678901:role/DummyRole"
Add a simple outer step function that executes the HelloWorld one:
aws stepfunctions --endpoint-url http://localhost:8083 create-state-machine --definition "{\
\"StartAt\": \"InnerInvoke\",\
\"States\": {\
\"InnerInvoke\": {\
\"Type\": \"Task\",\
\"Resource\": \"arn:aws:states:::states:startExecution.sync:2\",\
\"Parameters\": {\
\"StateMachineArn\": \"arn:aws:states:us-east-1:123456789012:stateMachine:HelloWorld\"\
},\
\"End\": true\
}\
}}" --name "HelloWorldOuter" --role-arn "arn:aws:iam::012345678901:role/DummyRole"
Finally, start execution of the outer Step Function:
aws stepfunctions --endpoint-url http://localhost:8083 start-execution --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:HelloWorldOuter
The execution fails with the argument "content" is null error in the logs:
arn:aws:states:us-east-1:123456789012:execution:HelloWorldOuter:720b9e13-efdc-4718-9a70-be3eab15f416 : {"Type":"ExecutionFailed","PreviousEventId":2,"ExecutionFailedEventDetails":{"Error":"States.Runtime","Cause":"argument \"content\" is null"}}
Debugging the Step Functions Local Java application, I was able to narrow the error down to JSON parsing failing in the DescribeExecutionJson class:
resultNode.set("Output", this.mapper.readTree(executionResult.getOutput()));
The problem is that getOutput() is null in the DescribeExecution response for a FAILED step function execution, so the parse call is handed a null argument. This seems to be a bug, and I would like to get it to the Step Functions Local developers to fix.
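For illustration, here is a minimal sketch of the kind of null guard that would avoid the failure. It is not the actual Step Functions Local code: the class DescribeExecutionSketch, the parseJson stand-in, and both describe methods are hypothetical names, and parseJson only imitates a parser that rejects a null argument with the same message seen in the logs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DescribeExecutionSketch {

    // Hypothetical stand-in for the JSON parser call: like the failing call,
    // it rejects a null argument instead of returning an empty result.
    static Object parseJson(String content) {
        if (content == null) {
            throw new IllegalArgumentException("argument \"content\" is null");
        }
        return content; // a real parser would return a JSON tree here
    }

    // Unguarded variant: mirrors the quoted line and throws when output is null.
    static Map<String, Object> describeUnguarded(String status, String output) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("Status", status);
        result.put("Output", parseJson(output));
        return result;
    }

    // Guarded variant: parses the output only when the execution produced one.
    static Map<String, Object> describeGuarded(String status, String output) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("Status", status);
        if (output != null) {
            result.put("Output", parseJson(output));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(describeGuarded("FAILED", null));
        System.out.println(describeGuarded("SUCCEEDED", "{\"ok\":true}"));
        try {
            describeUnguarded("FAILED", null);
        } catch (IllegalArgumentException e) {
            System.out.println("unguarded: " + e.getMessage());
        }
    }
}
```

With a guard like this, a FAILED inner execution would simply have no Output field, and the outer step function could then surface the inner failure instead of a States.Runtime error.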
I think your problem might be related to https://repost.aws/questions/QU4oG5WM_HQaW4dNyonsVjZA#ANSUb1RDWhQXyjc41CRE3gQQ - try setting the STEP_FUNCTIONS_ENDPOINT to http://localhost:8083
I have this in my aws-stepfunctions-local-credentials.txt, but I am still getting the following error on my local Docker machine: