Greengrass component dependencies on active network


When starting Greengrass 2.10.3, I regularly find that some components end up in a 'BROKEN' state. One in particular is 'SystemsManagerAgent'. This is a mobile device, and sometimes the network is not available immediately when the system boots. I suspect the component fails because the network is not up when it is initialised. In other cases, a component may have an install step that runs 'apt-get' or 'pip install', which also needs the network. Is there any way to make a component depend on the network being available, or a better way to deal with this?

ManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. Applying config override from /etc/amazon/ssm/amazon-ssm-agent.json.. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. 2023-07-12 23:03:56 ERROR Registration failed due to error registering the instance with AWS SSM. CredentialsEndpointError: failed to load credentials. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. caused by: SerializationError: failed to decode error message. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. status code: 500, request id:. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. caused by: UnmarshalError: failed decoding error message. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. 00000000  46 61 69 6c 65 64 20 74  6f 20 67 65 74 20 63 6f  |Failed to get co|. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. 00000010  6e 6e 65 63 74 69 6f 6e                           |nnection|. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.076Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: stdout. caused by: invalid character 'F' looking for beginning of value. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Startup.Script, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.081Z [INFO] (Copier) aws.greengrass.SystemsManagerAgent: Startup script exited. {exitCode=1, serviceName=aws.greengrass.SystemsManagerAgent, currentState=STARTING}
2023-07-12T23:03:56.101Z [INFO] (pool-2-thread-33) aws.greengrass.SystemsManagerAgent: shell-runner-start. {scriptName=services.aws.greengrass.SystemsManagerAgent.lifecycle.Shutdown.Script, serviceName=aws.greengrass.
Asked 1 year ago, 350 views
1 Answer

Hi,

Thanks for reaching out. Yes, a lack of internet connectivity can cause components that depend on it, such as SystemsManagerAgent, to become Broken after repeated failures to start. As of today, we do not have a component that can detect internet availability and manage other components accordingly.

Because these failures occur shortly after boot, one possible workaround is an intermediate shell script or process that controls Greengrass startup. The script waits for a fixed delay (the average time it takes for the internet to become available, plus a buffer) and then runs the system command that starts the Greengrass process itself. This delays every component, so it assumes that all components depend on the internet to operate.
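As a rough illustration of that wrapper, here is a minimal Python sketch. It swaps the fixed delay for a bounded polling loop, which is a variation on the suggestion above rather than the same thing. The probe target (`aws.amazon.com:443`), the attempt count and interval, and the systemd unit name `greengrass.service` are all assumptions; adjust them to match how Greengrass is installed on your device.

```python
import socket
import subprocess
import time
from typing import Callable


def tcp_probe(host: str = "aws.amazon.com", port: int = 443,
              timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection succeeds (DNS and routing both work).

    The host and port are placeholders chosen for illustration.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, timeouts, and unreachable networks.
        return False


def wait_for_network(probe: Callable[[], bool] = tcp_probe,
                     max_attempts: int = 60,
                     interval_s: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(max_attempts):
        if probe():
            return True
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return False


def main() -> int:
    if not wait_for_network():
        print("Network still unavailable; starting Greengrass anyway.")
    # Assumption: Greengrass runs as a systemd service with this unit name.
    result = subprocess.run(["systemctl", "start", "greengrass.service"],
                            check=False)
    return result.returncode
```

To use something like this, the script would call `main()` from a boot-time unit or init script, with Greengrass itself not set to start automatically, so that this wrapper is the only thing that launches it.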

If some components on the core device are critical, can operate without the internet, and therefore need to be running as soon as possible, you can instead build your own custom component using one of the supported SDKs [1]. The component monitors network availability and, once it detects that the internet is available, calls the RestartComponent IPC operation (the counterpart of the Greengrass CLI's component restart command) to restart the affected component. You can read more about this at [2]. Please take note of the requirements for using this operation that are listed there.
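The monitoring component described above could look roughly like the following Python sketch. It assumes the AWS IoT Device SDK v2 for Python (`awsiot`) is installed on the device and uses its `GreengrassCoreIPCClientV2.restart_component` call. The probe target, the poll interval, and the helper names (`tcp_probe`, `make_ipc_restarter`, `restart_on_reconnect`) are hypothetical choices for illustration. The component's recipe would also need to authorize the operation as described in [2]. Whether a restart request recovers a component that is already BROKEN is worth verifying on your own device.

```python
import socket
import time
from typing import Callable, Optional, Sequence


def tcp_probe(host: str = "aws.amazon.com", port: int = 443,
              timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection succeeds. Host and port are placeholders."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def make_ipc_restarter() -> Callable[[str], None]:
    """Build a restart function backed by the Greengrass IPC client.

    The SDK is imported lazily so the polling logic below can be exercised
    on a machine that does not have it installed.
    """
    from awsiot.greengrasscoreipc.clientv2 import GreengrassCoreIPCClientV2

    client = GreengrassCoreIPCClientV2()

    def restart(component_name: str) -> None:
        client.restart_component(component_name=component_name)

    return restart


def restart_on_reconnect(probe: Callable[[], bool],
                         restart: Callable[[str], None],
                         component_names: Sequence[str],
                         poll_s: float = 10.0,
                         sleep: Callable[[float], None] = time.sleep,
                         max_polls: Optional[int] = None) -> int:
    """Restart the named components on each offline-to-online transition.

    Returns the number of restart requests issued. `max_polls=None` loops
    forever, which is what a long-running component would use.
    """
    was_online: Optional[bool] = None  # unknown until the first probe
    restarts = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        online = probe()
        # Only act if the network was observed down and is now up.
        if online and was_online is False:
            for name in component_names:
                restart(name)
                restarts += 1
        was_online = online
        polls += 1
        sleep(poll_s)
    return restarts
```

Inside the component's run script, this would be started with something like `restart_on_reconnect(tcp_probe, make_ipc_restarter(), ["aws.greengrass.SystemsManagerAgent"])`. Restarting only on an offline-to-online transition avoids a needless restart when the network is already up at boot.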

   

References

[1] Use the AWS IoT Device SDK to communicate with the Greengrass nucleus, other components, and AWS IoT Core - Supported SDKs for interprocess communication - https://docs.aws.amazon.com/greengrass/v2/developerguide/interprocess-communication.html#ipc-requirements

[2] Manage local deployments and components - https://docs.aws.amazon.com/greengrass/v2/developerguide/ipc-local-deployments-components.html

   

Honorable Mentions

PauseComponent, ResumeComponent - Interact with component lifecycle - https://docs.aws.amazon.com/greengrass/v2/developerguide/ipc-component-lifecycle.html

AWS
Support Engineer
Answered 1 year ago
