The error_handler program is an AWS Lambda function that processes AWS Batch job failures.
It logs the job failure to a CloudWatch log and publishes the job failure to an SNS Topic.
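The log-and-publish flow described above can be sketched as follows. This is a minimal illustration, not the repository's actual handler: `TOPIC_ARN`, `build_message`, and `handle_failure` are hypothetical names, and the sketch assumes the incoming event is a standard Batch job state-change event delivered by EventBridge.

```python
import json
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Hypothetical placeholder; the real topic ARN is not given in this README.
TOPIC_ARN = "arn:aws:sns:us-west-2:123456789012:example-batch-job-failure"


def build_message(event):
    """Summarize a Batch job state-change event as (subject, body)."""
    detail = event.get("detail", {})
    job_name = detail.get("jobName", "unknown")
    # SNS subjects are limited to 100 characters.
    subject = f"Batch job failure: {job_name}"[:100]
    body = json.dumps(
        {
            "jobName": job_name,
            "jobId": detail.get("jobId", "unknown"),
            "status": detail.get("status", "unknown"),
            "statusReason": detail.get("statusReason", "not provided"),
        },
        indent=2,
    )
    return subject, body


def handle_failure(event, publish):
    """Log the failure, then publish it through the supplied callable."""
    subject, body = build_message(event)
    # Inside Lambda, log records end up in the function's CloudWatch log group.
    logger.error("%s\n%s", subject, body)
    publish(TopicArn=TOPIC_ARN, Subject=subject, Message=body)
    return {"subject": subject}


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    return handle_failure(event, boto3.client("sns").publish)
```

Passing the publish callable into `handle_failure` keeps the logic testable locally; only `lambda_handler` touches AWS.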
Top-level Generate repo: https://github.com/podaac/generate
The error_handler program includes the following AWS services:
- Lambda function to execute code deployed via zip file.
- Permissions that allow EventBridge to invoke the Lambda function.
- IAM role and policy for Lambda function execution.
- EventBridge rule to catch Batch job failures and target Lambda function.
- SNS Topic for Batch job failure with a topic policy and an email subscription.
- SNS Topic for Lambda function failure with a topic policy and an email subscription.
- CloudWatch metric alarm for Lambda function errors.
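To illustrate how the EventBridge rule listed above might select failed jobs, here is a sketch of an event pattern for Batch job state changes, written as a Python dictionary. The pattern in the repository's Terraform may differ, and `matches` is a hypothetical, simplified local check that handles only exact-value matching, not EventBridge's full pattern syntax.

```python
import json

# Assumed event pattern: Batch emits "Batch Job State Change" events from
# source "aws.batch", with the job status under detail.status.
FAILED_JOB_PATTERN = {
    "source": ["aws.batch"],
    "detail-type": ["Batch Job State Change"],
    "detail": {"status": ["FAILED"]},
}


def matches(pattern, event):
    """Simplified matcher: every pattern key must be present in the event,
    and the event value must be one of the listed literal values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


if __name__ == "__main__":
    # JSON form of the pattern, as it would appear in a rule definition.
    print(json.dumps(FAILED_JOB_PATTERN, indent=2))
```

Only events whose status is `FAILED` pass this filter, so the Lambda function is not invoked for jobs that succeed or are still running.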
Terraform deploys the AWS infrastructure and stores its state in an S3 backend, using a DynamoDB table for state locking.
To deploy:
- Edit `terraform.tfvars` for the environment to deploy to.
- Edit `terraform_conf/backend-{prefix}.conf` for the environment to deploy to.
- Initialize terraform: `terraform init -backend-config=terraform_conf/backend-{prefix}.conf`
- Plan terraform modifications: `terraform plan -out=tfplan`
- Apply terraform modifications: `terraform apply tfplan`

`{prefix}` is the account or environment name.