Many of the resources you've mentioned are virtual and not built using a single "thing", which makes performing health checks difficult. I will attempt to answer the question though.
1. VPC and subnets: If you can connect from an instance in that VPC to another instance in the VPC then the VPC is working. This is true for both private and public subnets. I would suggest setting up some sort of network probe (Lambda would be ideal for this) that emits a metric to your monitoring tool (CloudWatch Metrics would suit here).
2. Internet Gateway: Similarly, if you can connect to an external (public) IP address from within your VPC, or you can connect to an Elastic (or Public) IP attached to your VPC from an external source, then the Internet Gateway is working.
3. NAT Gateway: Same as (1) and (2), except this time ensure that the network probe is on a private subnet and that it is probing an external resource that must be reached via the NAT Gateway.
4. Security groups: These aren't a physical firewall; they are (more or less) a property of the instance. You can check that a security group is operating correctly by probing ports that should be open (to confirm that they are open) and probing ports that should be closed (to confirm that they are closed). You might also use VPC Reachability Analyzer for this, although it doesn't send probes - instead it uses automated reasoning to determine if your security groups are configured correctly.
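To make the probe idea above a little more concrete, here is a minimal sketch of what such a Lambda function might look like in Python. It is only an illustration under stated assumptions: the target hosts and ports, the `Custom/NetworkProbe` namespace, and the `ProbeSuccess` metric name are all hypothetical, and the function is assumed to run inside the subnet you want to test with permission to call `cloudwatch:PutMetricData`. Each target says whether the port is expected to be open or closed, so the same code covers the security group check.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_expectations(targets, timeout=3.0):
    """targets: iterable of (name, host, port, should_be_open).

    Returns a list of (name, ok), where ok means the observed state
    matched the expected state (open or closed).
    """
    results = []
    for name, host, port, should_be_open in targets:
        reachable = probe_tcp(host, port, timeout=timeout)
        results.append((name, reachable == should_be_open))
    return results


def to_metric_data(results):
    """Convert probe results into the MetricData list for PutMetricData."""
    return [
        {
            "MetricName": "ProbeSuccess",  # hypothetical metric name
            "Dimensions": [{"Name": "Probe", "Value": name}],
            "Value": 1.0 if ok else 0.0,
            "Unit": "Count",
        }
        for name, ok in results
    ]


def publish(results, namespace="Custom/NetworkProbe"):
    """Send results to CloudWatch Metrics (namespace is an assumption)."""
    import boto3  # imported lazily so the probe logic runs without boto3

    boto3.client("cloudwatch").put_metric_data(
        Namespace=namespace, MetricData=to_metric_data(results)
    )


def lambda_handler(event, context):
    # Hypothetical targets: replace with addresses from your own environment.
    targets = [
        ("vpc-internal", "10.0.1.10", 443, True),     # instance in the same VPC
        ("internet-egress", "example.com", 443, True),  # via IGW or NAT Gateway
        ("sg-closed-port", "10.0.1.10", 23, False),   # port that should be blocked
    ]
    results = check_expectations(targets)
    publish(results)
    return dict(results)
```

A CloudWatch alarm on `ProbeSuccess` dropping below 1 (with missing data treated as breaching) could then notify an SNS topic that has an email subscription. Note that a port blocked by a security group usually times out rather than being refused, so the timeout adds directly to the run time of each "closed" check.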
With all of that said: In my opinion, performing these health checks is not a good way to spend your time (or money). Instead, I would build monitoring that looks at the health and response times of the applications that are hosted in the VPC. This has the side effect of confirming that all the other components are working correctly, and it also gives you greater visibility into what your end users are experiencing. You might also consider CloudWatch Real-User Monitoring for this task.
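As a rough illustration of application-level monitoring, the sketch below checks one HTTP endpoint and records both whether it is healthy and how long it took to respond. The function name `check_endpoint`, the health rule (any 2xx or 3xx status), and the example URL are assumptions made for this example, not anything prescribed by AWS.

```python
import time
import urllib.error
import urllib.request


def check_endpoint(url, timeout=5.0):
    """Request url once and report health, HTTP status and latency in ms."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx or 5xx).
        status = exc.code
    except (urllib.error.URLError, OSError):
        # No usable answer at all: DNS failure, refused connection, timeout.
        status = None
    latency_ms = (time.monotonic() - start) * 1000.0
    healthy = status is not None and 200 <= status < 400
    return {"healthy": healthy, "status": status, "latency_ms": latency_ms}


if __name__ == "__main__":
    # Hypothetical endpoint: point this at your own application's health URL.
    print(check_endpoint("https://example.com/health"))
```

If this check runs from outside the VPC and succeeds, the path through the Internet Gateway, subnets, and security groups has been exercised along the way, which is the side effect described above. The `healthy` and `latency_ms` values could be published as custom metrics and alarmed on in the same way as any other CloudWatch metric.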
Thank you so much Gurus, for your reply.
Please note that if an outage occurs in any AWS service in any Region, we do not receive an email about it. We only find out when a client or end user complains that their app is not working (irrespective of which services they are using). We then have to check the application first, open https://status.aws.amazon.com/govcloud, and manually check whether the service is working in the Region the client reported the complaint from.
So the main focus was to configure alerts for the services below.
- Virtual Private Cloud
  - Public Subnet
  - Private Subnet
- Internet Gateway
- NAT Gateway
- Security Groups
That way, if any of these services has an outage, we get an alert by email.
Since you didn't mention which metrics you would like to cover, I will just provide the documentation, and we may be able to help more once you provide additional information. Do you only want to check that each service is available, or something else?
VPC - https://docs.aws.amazon.com/vpc/latest/userguide/vpc-cloudwatch.html
NAT Gateway - https://docs.aws.amazon.com/vpc/latest/userguide/vpc-nat-gateway-cloudwatch.html
Security groups - https://aws.amazon.com/premiumsupport/knowledge-center/monitor-security-group-changes-ec2/
What is it you want to monitor with the Internet Gateway? You can use AWS Config, for example - https://docs.aws.amazon.com/config/latest/developerguide/internet-gateway-authorized-vpc-only.html
Not sure if this is helpful, but this lab may give you additional insight: https://wellarchitectedlabs.com/reliability/300_labs/300_health_checks_and_dependencies/
Let us know what your use case is so we can dive a little deeper.
As per my answer: Some of those things you can't monitor directly. Doing end-to-end monitoring will ensure that your entire service is working as well as (by proxy) ensuring that the services you want to monitor are working correctly. You can't check a VPC or subnet to see if it is working - it isn't a valid question for that type of construct. Even on premises you can't check to see that a subnet is working - you have to check connectivity to something on that subnet. Same applies here - check to see that your application is working correctly and you've proven the other things.