Skip to content

Backup & recovery

What to back up - Compliance reports and audit logs - Model checkpoints and run metadata - Config files and certificates (never commit private keys)

Strategy - Daily encrypted backups with retention policy - Test restore quarterly - Store CA root offline; rotate leaf certs regularly

Disaster recovery - Document RPO/RTO targets - Automate compose/k8s redeploy from backups

Copy‑paste examples

# Local backup (compose volumes)
tar czf backups/aegis-$(date +%F).tgz \
    deploy/prometheus/prometheus.yml \
    deploy/grafana/aegis_dashboard.json \
    deploy/grafana/credentials/admin_password.txt || true

# Kubernetes: export manifests and secrets (namespaced)
kubectl get all,cm,secret -n aegis -o yaml > backups/aegis-k8s-$(date +%F).yaml

# Restore (compose): redeploy
docker compose -f deploy/docker-compose.yml down -v
docker compose -f deploy/docker-compose.yml up -d

# Offsite copy (placeholder: replace with your S3 bucket)
aws s3 cp backups/aegis-$(date +%F).tgz s3://YOUR-BUCKET/aegis/

Tools - Install AWS CLI and jq if needed:

brew install awscli jq