As per this documentation, in the `run` lifecycle step I can have a long-running script, and the component will remain in the `RUNNING` state as long as the script runs. If the script exits with an error code, the component enters the `ERRORED` state. I am trying to understand how to write the script in the `recover` step to get the component back to `RUNNING`. The documentation says that the recover script times out after 60 seconds by default. Does the component go back into the `RUNNING` state if the recover script exits with a success code before it times out? If so, how does Greengrass know which process to monitor for error/finish events in the future?
I have been experimenting with a dummy component, but I don't fully understand the behavior I am seeing. What I tried:
In the `run` step, I have a long-running Python script, `python3 -u my_script.py`. It starts a server and listens on a host port. In the `recover` step, I set things up so that if nothing is listening on the port, it restarts the script. Then I kill the process started by the `run` step to trigger a recover.
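For reference, the "is anything listening on the port" check used in the recover step can be sketched roughly as below. This is only a minimal illustration, not my exact script: the function name `is_port_listening` and the one-second timeout are placeholders I picked for the example.

```python
import socket


def is_port_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection completes a TCP handshake, so success means
        # some process is accepting connections on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

A recover script could call this with the host and port the server is expected to use and only take action when it returns `False`.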
What I noticed is that the component remains in the `ERRORED` state for 60 seconds and then goes into the `RUNNING` state. The timeout doesn't stop my Python script, though; it keeps running, and Greengrass somehow knows to monitor the new process created by the recover step (if I kill it, the recover step is triggered again and creates a new process).
What is surprising is that even if I start the Python script in the background using `python3 -u my_script.py &`, the component still remains in the `ERRORED` state for the full 60 seconds before going back into the `RUNNING` state, even though the recover script would have finished much earlier. Greengrass somehow still knows to monitor the new script started by the `recover` step for the component state. But what if I start multiple processes in the background in the `recover` step? Which one will it monitor?
Ah I see, this makes complete sense! So basically the run script is re-run after the recover script finishes executing.