helm-janitor is a tool for cleaning up Helm releases in your k8s cluster. It scans for Helm 3 releases and runs a `helm uninstall|delete` against any release whose janitor-ttl annotation has expired.
This first cut of the code is intended to run in the Lendi k8s clusters to clean up Helm releases in our development environment. The goal of the project is very much to support the Lendi development workflow, and it has been built around the infrastructure here.
Our setup
- EKS (version >= 1.19)
- Helm releases stored as k8s secrets
- Bitbucket webhooks which fire when PRs are merged.
helm-janitor supports two modes of running (`delete` | `scan`):
```
./helm-janitor [command] [options]

[command]
  delete <selector>   # helm-janitor=true
  scan <selector>     # BRANCH=feat/test-something,REPO=cool-repo

[options]
  --namespace <namespace>
  --all-namespaces (default)
  --include-namespace <expression match>
  --exclude-namespace <expression match>
```
`scan` will scan all namespaces if no specific namespace or matching expressions are used.
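For example (the selector values are illustrative, matching the forms above):

```sh
# Scan every namespace for releases deployed from a given repo/branch
./helm-janitor scan BRANCH=feat/test-something,REPO=cool-repo

# Delete eligible releases in a single namespace only
./helm-janitor delete helm-janitor=true --namespace dev
```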
When we run our k8s deployments via Helm, our tooling also tags and labels the Helm releases (secrets) with the repository and the branch that the release was deployed from.
Teams have a webhook configured on their repo which fires when a PR is merged / a branch is closed, and helm-janitor will clean up the leftover running containers.
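To make that concrete, the opted-in release secrets can be listed with a label selector; `owner=helm` is the label Helm 3 itself puts on release secrets, while the exact repository/branch label keys our tooling uses are not shown here:

```sh
# List Helm 3 release secrets that have been opted in for cleanup
kubectl get secrets --all-namespaces -l 'owner=helm,helm-janitor=true'
```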
Like the kube-janitor project, we wish to expire Helm releases that exceed the TTL value. During our `helm install` step, we tag the release secret afterwards with a `helm-janitor: true` label if we wish to clean up the release. We then read the release values' `janitorAnnotations` config for the `janitor/ttl` or `janitor/expires` value and check it against the creationTime to see if we should delete the release.
```yaml
janitorAnnotations:
  janitor/expires: "2021-07-03T07:06:45Z"
```
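A minimal sketch of that decision, assuming `janitor/expires` is an RFC 3339 timestamp (as above) and `janitor/ttl` is a Go-style duration relative to the creationTime; the function and field names are illustrative, not the project's actual API:

```go
package main

import (
	"fmt"
	"time"
)

// expired reports whether a release should be deleted, given its
// janitorAnnotations and creation time. janitor/expires wins if both are set.
func expired(annotations map[string]string, created, now time.Time) (bool, error) {
	if v, ok := annotations["janitor/expires"]; ok {
		expires, err := time.Parse(time.RFC3339, v)
		if err != nil {
			return false, fmt.Errorf("bad janitor/expires: %w", err)
		}
		return now.After(expires), nil
	}
	if v, ok := annotations["janitor/ttl"]; ok {
		// Assumed format: a Go duration string such as "168h" for 7 days.
		ttl, err := time.ParseDuration(v)
		if err != nil {
			return false, fmt.Errorf("bad janitor/ttl: %w", err)
		}
		return now.After(created.Add(ttl)), nil
	}
	return false, nil // no janitor annotations: leave the release alone
}

func main() {
	created := time.Now().Add(-8 * 24 * time.Hour) // release is 8 days old
	ok, _ := expired(map[string]string{"janitor/ttl": "168h"}, created, time.Now())
	fmt.Println(ok) // true: the release outlived its 7-day TTL
}
```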
- Run this when teams want to manually clean up releases via slack-ops
- Integrate with CI/CD systems via a webhook call (sketched below)
- Custom cleanup schedules on certain environments
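The webhook call from CI might look something like this; the endpoint and payload shape are hypothetical, and only the repo/branch pairing comes from the scan selector above:

```sh
# Hypothetical endpoint and payload; adjust to the actual deployment
curl -X POST "https://helm-janitor.example.com/scan" \
  -H 'Content-Type: application/json' \
  -d '{"repository": "cool-repo", "branch": "feat/test-something"}'
```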
We can use an AWS Lambda to periodically run the app (via Serverless) in our AWS environment.
This needs an IAM role that maps to a k8s RBAC cluster role with enough permissions to clean up the Helm releases.
Use the `ROLE_ARN` environment variable if the lambda needs to assume a role to access the k8s cluster.
We map the AWS IAM role to a cluster user which has sufficient RBAC permissions to remove a release.
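Wired through a Serverless config, that might look like the sketch below; the schedule, role ARN, and function name are placeholders, and only `ROLE_ARN` and the `provided.al2023` runtime come from this document:

```yaml
# serverless.yml (sketch)
service: helm-janitor

provider:
  name: aws
  runtime: provided.al2023     # see the runtime migration note below

functions:
  janitor:
    handler: bootstrap         # provided.al2023 executes the packaged "bootstrap" binary
    events:
      - schedule: rate(1 hour) # assumed cadence
    environment:
      ROLE_ARN: arn:aws:iam::123456789012:role/helm-janitor   # placeholder ARN
```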
The other k8s-native option is to run this as a k8s CronJob that removes the expired Helm releases. It needs an RBAC cluster role binding with enough cluster permissions to remove a Helm release.
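A sketch of that RBAC, assuming releases are stored as secrets (per the setup above); the names are illustrative, and a real deployment also needs delete rights on whatever resources the charts themselves create:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: helm-janitor            # illustrative name
rules:
  - apiGroups: [""]
    resources: ["secrets"]      # Helm 3 release storage
    verbs: ["get", "list", "delete"]
  - apiGroups: [""]
    resources: ["namespaces"]   # needed to scan across namespaces
    verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: helm-janitor
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: helm-janitor
subjects:
  - kind: ServiceAccount
    name: helm-janitor          # the CronJob's service account (illustrative)
    namespace: default
```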
We may reject PRs that break compatibility with our k8s setup.
As per the AWS go1.x runtime deprecation, the lambda code has been migrated to `provided.al2023`.
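In practice that migration mostly means packaging the Go binary as `bootstrap`; a typical build step (these flags are the usual ones for custom runtimes, not project-specific):

```sh
# provided.al2023 expects a binary named "bootstrap" in the deployment package;
# set GOARCH to match the lambda's architecture
GOOS=linux GOARCH=arm64 CGO_ENABLED=0 go build -tags lambda.norpc -o bootstrap .
zip lambda.zip bootstrap
```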
- Support SQL backends?
- Support configmap-backed Helm releases?