Now that the right metric has been identified to monitor the dependency, it is time to create an alarm that watches the metric and sends notifications based on defined thresholds. CloudWatch Alarms can be used to automatically initiate actions on your behalf. An alarm watches a single metric over a specified time period and performs one or more specified actions based on the value of the metric relative to a threshold over time. The action is a notification sent to an Amazon SNS topic or an Auto Scaling policy.
An alarm needs to be created that checks the Lambda function invocations every minute to ensure that the function has been invoked at least once. The alarm should treat missing data as an indication that the function has not been invoked and, as such, that the dependency being monitored has failed. If the alarm is triggered, a notification should be sent to an SNS topic so that someone can be notified and respond, or so that an automatic remediation activity can be triggered.
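The rule described above can be sketched in a few lines of Python. This is a simplified model for illustration only, not CloudWatch's actual evaluation algorithm. The names `THRESHOLD`, `evaluate_period`, and `evaluate_series` are hypothetical, and the threshold of 1 is an assumption derived from the "at least once per minute" requirement.

```python
from typing import List, Optional, Sequence

# Assumption: the function must be invoked at least once per one-minute period.
THRESHOLD = 1


def evaluate_period(invocations: Optional[float], threshold: float = THRESHOLD) -> str:
    """Return the modelled alarm state for a single one-minute period.

    `None` stands for a period with no datapoint. Missing data is treated
    as breaching, so it produces the same result as zero invocations.
    """
    if invocations is None:
        return "ALARM"
    return "ALARM" if invocations < threshold else "OK"


def evaluate_series(series: Sequence[Optional[float]]) -> List[str]:
    """Evaluate each period independently (one evaluation period per decision)."""
    return [evaluate_period(value) for value in series]


if __name__ == "__main__":
    # Invocation counts per minute; None marks a minute with no datapoint.
    print(evaluate_series([1, 3, None, 0, 2]))
    # prints ['OK', 'OK', 'ALARM', 'ALARM', 'OK']
```

Treating a missing datapoint the same as zero invocations is what lets the alarm fire when the function stops running entirely, since a function that is never invoked may publish no datapoints at all.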
Go to the Amazon CloudWatch console at https://console.aws.amazon.com/cloudwatch, click on Alarms, and then Create alarm
Click on Select metric
In the search bar under All metrics, enter the name of the data read function, WA-Lab-DataReadFunction, and press Enter
In the metric breakdown, select Lambda > By Resource and you will see a list of Lambda metrics available
Check the box for the metric Invocations and click Select metric
On the Specify metric and conditions page, make the following changes for the Metric:
Scroll down to the Conditions section and configure it as follows:
Click on the arrow next to Additional Configuration to expand that section and make the following configuration changes:
Click Next to go to the Configure actions page
Under Notification, make the following changes:
Click Next to go to the Add name and description page
Under Name and description, specify the following:
Click Next to go to the Preview and create page
Click Create alarm
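For readers who prefer to script the console steps, the following is a minimal sketch of a roughly equivalent alarm definition using the boto3 `put_metric_alarm` call. The specific values (the `Sum` statistic, a 60-second period, one evaluation period, a threshold of 1) are assumptions inferred from the requirement stated earlier, not settings taken from the console pages. The SNS topic ARN is a hypothetical placeholder, and `build_alarm_params` and `create_alarm` are illustrative helper names.

```python
from typing import Any, Dict


def build_alarm_params(topic_arn: str) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    All numeric values here are assumptions based on the stated requirement:
    at least one invocation per minute, with missing data counted as a failure.
    """
    return {
        "AlarmName": "WA-Lab-Dependency-Alarm",
        "Namespace": "AWS/Lambda",
        "MetricName": "Invocations",
        "Dimensions": [
            {"Name": "FunctionName", "Value": "WA-Lab-DataReadFunction"}
        ],
        "Statistic": "Sum",              # assumed: total invocations per period
        "Period": 60,                    # seconds; check once per minute
        "EvaluationPeriods": 1,          # assumed: a single breaching period fires
        "Threshold": 1.0,
        "ComparisonOperator": "LessThanThreshold",
        "TreatMissingData": "breaching",  # no datapoint counts as a failure
        "AlarmActions": [topic_arn],      # SNS topic notified on ALARM
    }


def create_alarm(topic_arn: str) -> None:
    """Create the alarm. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_alarm(**build_alarm_params(topic_arn))


if __name__ == "__main__":
    # Hypothetical ARN for illustration; substitute the topic used in the lab.
    params = build_alarm_params("arn:aws:sns:us-east-1:123456789012:example-topic")
    print(params["ComparisonOperator"], params["TreatMissingData"])
```

Keeping the parameter builder separate from the API call makes it possible to review or unit test the alarm definition without touching an AWS account.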
Once the alarm has been created, you will be returned to the Alarms page on the CloudWatch console. In the search bar, enter the name of the alarm that was just created, WA-Lab-Dependency-Alarm. The alarm will be listed with a state of Insufficient data. This is because CloudWatch is still evaluating the underlying metric to determine the current state. In a few minutes, you will see the alarm transition from Insufficient data to OK. Click on the alarm name to go to the alarm details page. Review the contents of the page to understand the configuration of the alarm, such as the metric used, the threshold set, and the evaluation interval. The red line on the graph indicates the threshold that has been set for the alarm. Based on the alarm conditions, the alarm will go into an In alarm state if the metric graph falls below this threshold.
Under the History section, you will be able to view all state changes with respect to the alarm. Once the alarm is in an OK state, you will see a State update event under History.
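The same state transition can also be observed programmatically. The sketch below polls for a target state through a caller-supplied reader function. The API state values `INSUFFICIENT_DATA`, `OK`, and `ALARM` correspond to the console labels Insufficient data, OK, and In alarm. The helper names `wait_for_state` and `boto3_state_reader`, and the default polling interval, are hypothetical choices for illustration.

```python
import time
from typing import Callable

# Mapping from CloudWatch API state values to the labels shown in the console.
CONSOLE_LABELS = {
    "INSUFFICIENT_DATA": "Insufficient data",
    "OK": "OK",
    "ALARM": "In alarm",
}


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "OK",
    attempts: int = 10,
    delay_seconds: float = 30.0,
) -> bool:
    """Poll get_state() until it returns `target` or attempts run out."""
    for attempt in range(attempts):
        state = get_state()
        print(f"attempt {attempt + 1}: {CONSOLE_LABELS.get(state, state)}")
        if state == target:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


def boto3_state_reader(alarm_name: str) -> Callable[[], str]:
    """Return a reader backed by describe_alarms. Requires boto3 and credentials."""
    import boto3  # imported lazily so wait_for_state works without boto3

    cloudwatch = boto3.client("cloudwatch")

    def read() -> str:
        response = cloudwatch.describe_alarms(AlarmNames=[alarm_name])
        return response["MetricAlarms"][0]["StateValue"]

    return read


if __name__ == "__main__":
    # Simulated sequence of states, so the example runs without AWS access.
    simulated = iter(["INSUFFICIENT_DATA", "INSUFFICIENT_DATA", "OK"])
    print(wait_for_state(lambda: next(simulated), delay_seconds=0))
```

Against a real account, the call would look like `wait_for_state(boto3_state_reader("WA-Lab-Dependency-Alarm"))`.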
An alarm has now been created on a suitable metric to identify whether the external service that the workload depends on is experiencing an outage. Additionally, notifications have been configured to alert relevant stakeholders when the workload outcome is at risk due to failure or unavailability of the external service.