DevOps

🔄 CI/CD Pipelines & Automation

Continuous Integration (CI) and Continuous Delivery/Deployment (CD) are engineering practices that enable teams to deliver software faster, with greater confidence and repeatability. At Maxi's Computers, every production deployment is automated through pipelines.

CI/CD Principles

  • Build once, deploy everywhere: A single artifact (Docker image, binary) is promoted across environments. Never rebuild from source for staging or production.
  • Fast feedback: The CI pipeline must complete in under 10 minutes, so developers learn right away whether their change breaks the build.
  • Everything as code: Pipeline definitions, environment configs, and deployment manifests live in version control.
  • Immutable deployments: Never patch running instances. Replace them with new ones built from the artifact.
  • Automated quality gates: Tests, security scans, and linting must all pass before promotion is allowed.
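
The "build once, deploy everywhere" rule can be sketched as a small helper that derives one image reference from the commit SHA, which every environment then reuses. The function name `image_ref` and the 7-character short SHA are illustrative assumptions; the registry and image name are the ones used in the pipeline on this page.

```bash
#!/usr/bin/env bash
# Sketch: derive one immutable image reference per commit.
# image_ref is a hypothetical helper, not part of any tool.
image_ref() {
  registry="$1"; name="$2"; sha="$3"
  # Keep the first 7 characters of the commit SHA as the tag.
  short_sha=$(printf '%s' "$sha" | cut -c1-7)
  printf '%s/%s:%s\n' "$registry" "$name" "$short_sha"
}
```

Calling `image_ref "$REGISTRY" "$IMAGE_NAME" "$GITHUB_SHA"` once and passing the result to every deploy step keeps staging and production on the same artifact.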

Pipeline Stages

Our standard pipeline for a containerized service consists of 5 stages:

yaml
# .github/workflows/pipeline.yml
name: Production Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: 123456789.dkr.ecr.us-east-1.amazonaws.com
  IMAGE_NAME: mc-api

jobs:
  # ── Stage 1: Quality Gates ──────────────────────────────
  quality:
    name: Code Quality & Tests
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '20', cache: 'npm' }
      - run: npm ci
      - run: npm run lint
      - run: npm run typecheck
      - run: npm run test:unit -- --coverage
      - run: npm run test:integration

  # ── Stage 2: Security Scan ──────────────────────────────
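  security:
    name: Security Analysis
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # Required for CodeQL to upload results
    steps:
      - uses: actions/checkout@v4
      # CodeQL must be initialized before the analyze step
      - uses: github/codeql-action/init@v3
        with:
          languages: javascript-typescript
      # Prefer pinning to a release tag or commit SHA rather than master
      - uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'
      - uses: github/codeql-action/analyze@v3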

  # ── Stage 3: Build & Push Image ────────────────────────
  build:
    name: Build & Push Container
    needs: [quality, security]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      id-token: write  # Required for OIDC role assumption
      contents: read
    outputs:
      image-tag: ${{ steps.meta.outputs.version }}
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: us-east-1
      - name: Login to ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
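          # priority=300 outranks type=raw (default 200), so outputs.version is the short SHA rather than "latest"
          tags: |
            type=sha,prefix=,format=short,priority=300
            type=raw,value=latest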
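      - name: Set up Docker Buildx  # Required for the gha cache backend
        uses: docker/setup-buildx-action@v3
      - name: Build & push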
        uses: docker/build-push-action@v5
        with:
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # ── Stage 4: Deploy to Staging ─────────────────────────
  deploy-staging:
    name: Deploy to Staging
    needs: build
    runs-on: ubuntu-latest
    environment: staging
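    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials  # Assumes the deploy role can access the cluster
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: us-east-1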
      - name: Deploy to EKS staging
        run: |
          aws eks update-kubeconfig --name mc-staging-eks --region us-east-1
          helm upgrade --install mc-api ./charts/api \
            --namespace staging \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --values ./charts/api/values.staging.yaml \
            --atomic --timeout 5m

  # ── Stage 5: Deploy to Production ──────────────────────
  deploy-prod:
    name: Deploy to Production
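    needs: [build, deploy-staging]  # build is listed so needs.build.outputs resolves
    runs-on: ubuntu-latest
    environment: production  # Requires manual approval gate
    permissions:
      id-token: write
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS Credentials  # Assumes the deploy role can access the cluster
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DEPLOY_ROLE_ARN }}
          aws-region: us-east-1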
      - name: Deploy to EKS production
        run: |
          aws eks update-kubeconfig --name mc-prod-eks --region us-east-1
          helm upgrade --install mc-api ./charts/api \
            --namespace production \
            --set image.tag=${{ needs.build.outputs.image-tag }} \
            --values ./charts/api/values.production.yaml \
            --atomic --timeout 10m
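
Because both deploy stages rely on the image tag being a commit SHA, a guard like the following could run before `helm upgrade`. This is a minimal sketch: `require_immutable_tag` is a hypothetical helper, and the "7 to 40 lowercase hex characters" rule assumes tags are Git commit SHAs as produced by the build stage.

```bash
#!/usr/bin/env bash
# Sketch: refuse to deploy a mutable or empty image tag.
# require_immutable_tag is a hypothetical guard, not part of Helm or kubectl.
require_immutable_tag() {
  tag="$1"
  case "$tag" in
    ''|latest)
      echo "refusing mutable tag: '${tag}'" >&2
      return 1
      ;;
  esac
  # Accept only lowercase hex strings of 7 to 40 characters.
  if printf '%s' "$tag" | grep -Eq '^[0-9a-f]{7,40}$'; then
    return 0
  fi
  echo "tag '${tag}' does not look like a commit SHA" >&2
  return 1
}
```

In a workflow step this would be called as `require_immutable_tag "$IMAGE_TAG"` ahead of the Helm command, so an empty job output fails the deploy instead of silently rolling out the wrong image.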

Environment Promotion

Artifacts flow through environments using image tags. The same Docker image is deployed to each environment; only the configuration changes:

  • dev: Auto-deploys on every PR merge. Debug logging enabled. Mocked external services.
  • staging: Production-identical infrastructure at 20% scale. Integration tests run post-deploy.
  • production: Requires manual approval in GitHub. Canary or blue-green deployment.
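
The promotion model can be sketched as a function that builds the Helm command for a given environment: the image tag is passed through unchanged, and only the namespace, values file, and timeout differ. `helm_promote_cmd` is a hypothetical helper that prints the command rather than running it. It covers only staging and production, since those are the environments whose values files appear in the pipeline.

```bash
#!/usr/bin/env bash
# Sketch: print the Helm command used to promote one image tag to an
# environment. helm_promote_cmd is a hypothetical helper.
helm_promote_cmd() {
  env_name="$1"; tag="$2"
  case "$env_name" in
    staging)    helm_timeout=5m ;;
    production) helm_timeout=10m ;;
    *)
      echo "unknown environment: $env_name" >&2
      return 1
      ;;
  esac
  printf 'helm upgrade --install mc-api ./charts/api --namespace %s --set image.tag=%s --values ./charts/api/values.%s.yaml --atomic --timeout %s\n' \
    "$env_name" "$tag" "$env_name" "$helm_timeout"
}
```

Printing the command keeps the sketch free of side effects; the output can be reviewed in a dry run or piped to `sh` once it looks right.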

Deployment Strategies

  • Rolling Update: Default Kubernetes strategy. Gradually replaces old pods with new ones. Zero downtime. Risk: mixed versions temporarily serving traffic.
  • Blue-Green: Two identical environments (blue=active, green=new). Traffic switches at the load balancer or via DNS; a load balancer switch is near-instant, while a DNS change takes effect only as cached records expire. Rollback is equally fast because the old environment stays up.
  • Canary: Route a small percentage of traffic (1–10%) to the new version. Monitor metrics. Gradually increase if stable. Best for risk mitigation on critical paths.
  • Feature Flags: Deploy code without activating features. Enable for specific users or segments via LaunchDarkly or custom toggles.
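
The canary loop ("monitor metrics, gradually increase if stable") can be sketched as a function that picks the next traffic weight. The step ladder 1 → 5 → 10 → 50 → 100 and the integer error-rate inputs (for example, errors per 10,000 requests) are illustrative assumptions, not values from our tooling.

```bash
#!/usr/bin/env bash
# Sketch: decide the next canary traffic weight (percent).
# Usage: next_canary_weight <current_weight> <error_rate> <max_error_rate>
# Prints 0 to signal "roll back", otherwise the next weight to apply.
# The ladder below is a hypothetical example.
next_canary_weight() {
  current="$1"; errors="$2"; max_errors="$3"
  # Any breach of the error budget aborts the rollout.
  if [ "$errors" -gt "$max_errors" ]; then
    echo 0
    return 0
  fi
  if   [ "$current" -lt 1 ];  then echo 1
  elif [ "$current" -lt 5 ];  then echo 5
  elif [ "$current" -lt 10 ]; then echo 10
  elif [ "$current" -lt 50 ]; then echo 50
  else echo 100
  fi
}
```

A controller would call this after each observation window and apply the returned weight through whatever traffic-splitting mechanism is in use.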

GitOps with ArgoCD

yaml
# argocd-application.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mc-api
  namespace: argocd
spec:
  project: production
  source:
    repoURL: https://github.com/maxiscomputers/k8s-manifests
    targetRevision: main
    path: apps/api/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # Remove resources deleted from Git
      selfHeal: true  # Revert manual changes in the cluster
    syncOptions:
      - CreateNamespace=true
      - PruneLast=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
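
To make the `retry` block concrete, the following sketch prints the delays implied by an exponential backoff policy. It assumes each delay is the previous one multiplied by `factor`, capped at `maxDuration`, which is the usual definition of exponential backoff; the exact timing in a given ArgoCD version may differ. `backoff_schedule` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: list retry delays in seconds for an exponential backoff policy.
# Usage: backoff_schedule <duration_s> <factor> <max_duration_s> <limit>
backoff_schedule() {
  delay="$1"; factor="$2"; max="$3"; limit="$4"
  out=""
  i=0
  while [ "$i" -lt "$limit" ]; do
    # Cap the delay at the configured maximum.
    if [ "$delay" -gt "$max" ]; then delay="$max"; fi
    out="${out}${out:+ }${delay}"
    delay=$((delay * factor))
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

# duration: 5s, factor: 2, maxDuration: 3m (180 s), limit: 5
backoff_schedule 5 2 180 5   # prints: 5 10 20 40 80
```

Under these assumptions the five retries span roughly two and a half minutes in total, and the 3-minute cap is never reached.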

Rollback Strategies

ℹ️

Rollback SLO: At Maxi's Computers, we target a maximum rollback time of 5 minutes for any production incident caused by a bad deployment.

bash
# Kubernetes rolling rollback
kubectl rollout undo deployment/mc-api -n production

# Rollback to a specific revision
kubectl rollout history deployment/mc-api -n production
kubectl rollout undo deployment/mc-api --to-revision=3 -n production

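# Helm rollback (--wait makes --timeout apply to the rollout)
helm history mc-api -n production
helm rollback mc-api 5 -n production --wait --timeout 5m

# ArgoCD rollback via UI or CLI (42 is a history ID from `argocd app history`)
# Rollback is rejected while automated sync is enabled, so disable it first
argocd app history mc-api
argocd app set mc-api --sync-policy none
argocd app rollback mc-api 42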
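
To check a rollback against the 5-minute SLO, any of the commands above can be wrapped in a timer. This is a minimal sketch: `timed_rollback` and the `ROLLBACK_SLO_SECONDS` variable are hypothetical names, and the wrapper measures only how long the command itself takes.

```bash
#!/usr/bin/env bash
# Sketch: run a rollback command and compare its duration to the SLO.
# Returns the command's own exit code on failure, 2 if it succeeded but
# exceeded the budget, and 0 otherwise.
timed_rollback() {
  budget="${ROLLBACK_SLO_SECONDS:-300}"
  start=$(date +%s)
  status=0
  "$@" || status=$?
  elapsed=$(( $(date +%s) - start ))
  if [ "$status" -ne 0 ]; then
    echo "rollback command failed with exit code $status" >&2
    return "$status"
  fi
  if [ "$elapsed" -gt "$budget" ]; then
    echo "rollback took ${elapsed}s, over the ${budget}s SLO" >&2
    return 2
  fi
  echo "rollback finished in ${elapsed}s (SLO ${budget}s)"
}

# Example (not run here):
#   timed_rollback helm rollback mc-api 5 -n production --wait --timeout 5m
```

Logging the result of each rollback this way gives a simple record of whether the SLO is being met in practice.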