Deploy Web App: Production Guide

Why Traditional Deployment Approaches Fail in 2025

The deployment landscape has fundamentally shifted. Modern web applications are no longer single codebases deployed to dedicated servers. They're distributed systems composed of containerized microservices, serverless functions, edge computing nodes, and managed cloud services—all requiring coordinated deployment across multiple availability zones and regions. Traditional approaches break down for specific, measurable reasons.

Manual deployment scripts that worked for single-server PHP applications cannot handle the orchestration complexity of deploying 50+ microservices with interdependencies. A typical e-commerce platform now includes separate services for authentication, product catalog, inventory management, payment processing, recommendation engines, and real-time analytics—each requiring specific deployment ordering, health checks, and rollback procedures.

Configuration management through environment-specific files creates security vulnerabilities and drift. Hardcoded database credentials in .env files get committed to repositories, while configuration differences between staging and production cause failures that pass in staging and only surface under production load. Compliance audits such as SOC 2 Type II do not prescribe a specific tool, but auditors commonly expect evidence of controlled secret access, audit trails, and credential rotation, which is hard to demonstrate with files scattered across repositories.

Sequential deployments cause extended downtime windows incompatible with global user bases. When your application serves users across 24 time zones, there's no "maintenance window" that doesn't impact significant traffic. A 15-minute deployment affecting 5% of your user base translates to roughly 500,000 affected users for an application with 10 million daily active users.

Modern Production Deployment Architecture

A production-grade deployment system in 2025 requires several integrated components working in concert. The architecture centers on immutable infrastructure, declarative configuration, automated testing gates, and progressive delivery mechanisms.

Container Orchestration Foundation

Kubernetes has become the de facto standard for container orchestration, but successful production deployments require specific configuration patterns. Here's a production-ready deployment manifest that addresses real operational requirements:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
  labels:
    app: web-app
    version: v2.5.0
spec:
  replicas: 5
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 2
      maxUnavailable: 0
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
        version: v2.5.0
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9090"
    spec:
      serviceAccountName: web-app-sa
      securityContext:
        runAsNonRoot: true
        runAsUser: 1000
        fsGroup: 1000
      containers:
      - name: web-app
        image: registry.company.com/web-app:v2.5.0-sha256-abc123
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 9090
          name: metrics
          protocol: TCP
        env:
        - name: NODE_ENV
          value: "production"
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: database-credentials
              key: connection-string
        - name: REDIS_URL
          valueFrom:
            secretKeyRef:
              name: redis-credentials
              key: connection-string
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health/live
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 2
        startupProbe:
          httpGet:
            path: /health/startup
            port: 8080
          initialDelaySeconds: 0
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 30
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - web-app
              topologyKey: kubernetes.io/hostname

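This configuration implements several critical production patterns. Setting maxUnavailable: 0 with maxSurge: 2 keeps all five replicas serving while up to two new pods start, so a rollout never drops below full capacity. The three probes have distinct jobs: the startup probe gives a slow-initializing application up to about 150 seconds (30 failures × 5-second period) to boot before liveness checks can restart it, the readiness probe keeps a pod out of Service endpoints until it can handle traffic, and the liveness probe restarts a container that has stopped responding. The pod anti-affinity rule asks the scheduler to spread replicas across nodes so that a single node failure does not take out every pod.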
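To make the rollout numbers concrete, here is a minimal sketch that derives the capacity bounds and the startup-probe time budget from the manifest values. The function names are illustrative, not part of any Kubernetes client library. The rounding follows the documented Deployment behavior: a percentage maxSurge rounds up and a percentage maxUnavailable rounds down.

```typescript
// Sketch: capacity bounds of a rolling update and the startup-probe budget.
// Function names are hypothetical helpers, not a Kubernetes API.

type IntOrPercent = number | string; // e.g. 2 or "25%"

// Resolve an absolute count or a percentage of `replicas`.
function resolveCount(value: IntOrPercent, replicas: number, roundUp: boolean): number {
  if (typeof value === "number") return value;
  const fraction = Number(value.replace("%", "")) / 100;
  const raw = replicas * fraction;
  return roundUp ? Math.ceil(raw) : Math.floor(raw);
}

function rolloutBounds(
  replicas: number,
  maxSurge: IntOrPercent,
  maxUnavailable: IntOrPercent,
): { maxPods: number; minAvailable: number } {
  const surge = resolveCount(maxSurge, replicas, true); // rounds up
  const unavailable = resolveCount(maxUnavailable, replicas, false); // rounds down
  return {
    maxPods: replicas + surge, // peak pod count during the rollout
    minAvailable: replicas - unavailable, // floor on serving capacity
  };
}

// Approximate time a container has to pass its startup probe.
function startupBudgetSeconds(
  initialDelaySeconds: number,
  periodSeconds: number,
  failureThreshold: number,
): number {
  return initialDelaySeconds + periodSeconds * failureThreshold;
}

// Values from the manifest: replicas 5, maxSurge 2, maxUnavailable 0.
const bounds = rolloutBounds(5, 2, 0);
console.log(bounds); // { maxPods: 7, minAvailable: 5 }
console.log(startupBudgetSeconds(0, 5, 30)); // 150
```

With these settings the cluster needs headroom for seven pods (140% of the steady-state count) while a rollout is in flight.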

CI/CD Pipeline Implementation

Modern deployment pipelines must enforce quality gates while maintaining deployment velocity. Here's a production-grade GitHub Actions workflow implementing progressive deployment with automated rollback:

name: Production Deployment

on:
  push:
    branches:
      - main

env:
  REGISTRY: registry.company.com
  IMAGE_NAME: web-app
  KUBE_NAMESPACE: production

jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run unit tests
        run: npm run test:unit

      - name: Run integration tests
        run: npm run test:integration
        env:
          DATABASE_URL: ${{ secrets.TEST_DATABASE_URL }}

      - name: Build application
        run: npm run build

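      - name: Run security scan
        # Pin third-party actions to a release tag or commit SHA rather than a branch.
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'  # fail the job on findings; the default only reports them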

      - name: Build container image
        run: |
          docker build \
            --build-arg NODE_ENV=production \
            --build-arg BUILD_DATE=$(date -u +'%Y-%m-%dT%H:%M:%SZ') \
            --build-arg VCS_REF=${{ github.sha }} \
            -t ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            -t ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest \
            .

      - name: Scan container image
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          severity: 'CRITICAL,HIGH'
          exit-code: '1'

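      - name: Push container image
        env:
          REGISTRY_USERNAME: ${{ secrets.REGISTRY_USERNAME }}
          REGISTRY_PASSWORD: ${{ secrets.REGISTRY_PASSWORD }}
        run: |
          # Pass secrets through env vars so they are not interpolated into the script text.
          echo "$REGISTRY_PASSWORD" | docker login ${{ env.REGISTRY }} -u "$REGISTRY_USERNAME" --password-stdin
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          docker push ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest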

  deploy-canary:
    needs: build-and-test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy canary
        run: |
          kubectl set image deployment/web-app-canary \
            web-app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            -n ${{ env.KUBE_NAMESPACE }}
          kubectl rollout status deployment/web-app-canary -n ${{ env.KUBE_NAMESPACE }} --timeout=5m

      - name: Wait for canary analysis
        run: sleep 300

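      - name: Check canary metrics
        id: canary-check
        run: |
          # Ratio of 5xx responses to all responses over the last 5 minutes.
          # Assumes the runner can reach the in-cluster Prometheus (e.g. a self-hosted runner).
          QUERY='sum(rate(http_requests_total{job="web-app-canary",status_code=~"5.."}[5m])) / sum(rate(http_requests_total{job="web-app-canary"}[5m]))'
          ERROR_RATE=$(curl -sG "http://prometheus.monitoring.svc.cluster.local:9090/api/v1/query" \
            --data-urlencode "query=$QUERY" | jq -r '.data.result[0].value[1] // "0"')
          if (( $(echo "$ERROR_RATE > 0.01" | bc -l) )); then
            echo "Canary error rate too high: $ERROR_RATE"
            exit 1
          fi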

  deploy-production:
    needs: deploy-canary
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          method: kubeconfig
          kubeconfig: ${{ secrets.KUBE_CONFIG }}

      - name: Deploy to production
        run: |
          kubectl set image deployment/web-app \
            web-app=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }} \
            -n ${{ env.KUBE_NAMESPACE }}
          kubectl rollout status deployment/web-app -n ${{ env.KUBE_NAMESPACE }} --timeout=10m

      - name: Verify deployment
        run: |
          kubectl get deployment web-app -n ${{ env.KUBE_NAMESPACE }}
          kubectl get pods -l app=web-app -n ${{ env.KUBE_NAMESPACE }}

      - name: Run smoke tests
        run: |
          npm run test:smoke -- --endpoint=https://api.production.company.com

      - name: Rollback on failure
        if: failure()
        run: |
          kubectl rollout undo deployment/web-app -n ${{ env.KUBE_NAMESPACE }}
          kubectl rollout status deployment/web-app -n ${{ env.KUBE_NAMESPACE }} --timeout=5m

This pipeline implements several critical safety mechanisms. The canary stage updates a separate web-app-canary Deployment and checks its 5xx error rate after a five-minute soak before the full rollout proceeds. The share of production traffic the canary receives depends on how the Service or ingress splits requests between the two Deployments, which this workflow does not configure. Automated security scanning with Trivy checks both application dependencies and the container image before anything is pushed. The rollback step runs whenever an earlier step in the deploy-production job fails, including the smoke tests, which keeps mean time to recovery low.
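The canary gate boils down to a small decision rule. The sketch below shows one way to express it, assuming you already have request and error counts for the analysis window. The 1% threshold mirrors the workflow above. The minimum-sample guard and all names are assumptions added for illustration, since an error ratio computed from a handful of requests says very little.

```typescript
// Sketch of a canary promotion rule. Names and the minRequests default are
// illustrative assumptions; only the 1% threshold comes from the workflow.

interface CanarySample {
  totalRequests: number; // all requests served by the canary in the window
  serverErrors: number; // responses with a 5xx status in the same window
}

type Verdict = "promote" | "rollback" | "inconclusive";

function evaluateCanary(
  sample: CanarySample,
  maxErrorRatio: number = 0.01,
  minRequests: number = 100,
): Verdict {
  // Too little traffic: refuse to decide rather than promote blindly.
  if (sample.totalRequests < minRequests) return "inconclusive";
  const ratio = sample.serverErrors / sample.totalRequests;
  return ratio > maxErrorRatio ? "rollback" : "promote";
}

console.log(evaluateCanary({ totalRequests: 10000, serverErrors: 50 })); // promote
console.log(evaluateCanary({ totalRequests: 10000, serverErrors: 250 })); // rollback
console.log(evaluateCanary({ totalRequests: 20, serverErrors: 0 })); // inconclusive
```

Treating "inconclusive" as a distinct outcome lets the pipeline extend the soak period instead of promoting a canary that never saw real traffic.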

Infrastructure as Code and Configuration Management

Modern deployments require declarative infrastructure management. Using Terraform for cloud resources and Kubernetes manifests for application configuration creates reproducible, version-controlled infrastructure:

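# infrastructure/main.tf
# The IAM roles, private subnets, and KMS key referenced below are assumed to be
# defined elsewhere in the same module.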
terraform {
  required_version = ">= 1.6"

  backend "s3" {
    bucket         = "company-terraform-state"
    key            = "production/web-app/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock"
  }

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "~> 2.24"
    }
  }
}

resource "aws_eks_cluster" "production" {
  name     = "production-cluster"
  role_arn = aws_iam_role.cluster.arn
  version  = "1.28"

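  vpc_config {
    subnet_ids              = aws_subnet.private[*].id
    endpoint_private_access = true
    endpoint_public_access  = true
    # Allow-list for the public endpoint. Private ranges such as 10.0.0.0/8 never match
    # internet-sourced traffic, so list public egress CIDRs here (placeholder shown).
    public_access_cidrs     = ["203.0.113.0/24"]
  }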

  enabled_cluster_log_types = ["api", "audit", "authenticator", "controllerManager", "scheduler"]

  encryption_config {
    provider {
      key_arn = aws_kms_key.cluster.arn
    }
    resources = ["secrets"]
  }
}

resource "aws_eks_node_group" "production" {
  cluster_name    = aws_eks_cluster.production.name
  node_group_name = "production-nodes"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  scaling_config {
    desired_size = 5
    max_size     = 20
    min_size     = 3
  }

  instance_types = ["t3.xlarge"]

  update_config {
    max_unavailable_percentage = 25
  }

  labels = {
    Environment = "production"
    ManagedBy   = "terraform"
  }
}

Secrets Management and Security

Production deployments require robust secrets management. Never store credentials in code repositories or environment files. Modern solutions use external secrets operators that sync credentials from cloud provider secret managers:

apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: database-credentials
  namespace: production
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager
    kind: SecretStore
  target:
    name: database-credentials
    creationPolicy: Owner
  data:
  - secretKey: connection-string
    remoteRef:
      key: production/database/connection-string
  - secretKey: read-replica-url
    remoteRef:
      key: production/database/read-replica-url

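This approach keeps credentials out of configuration files, centralizes audit logging in the cloud secret manager, and picks up rotated values on each refreshInterval (one hour here). Rotation itself happens in the secret manager, and pods that read the secret through environment variables, as the Deployment above does, only see a new value after they restart.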

Observability and Monitoring

Production deployments are incomplete without comprehensive observability. Implement structured logging, distributed tracing, and metrics collection from day one:

// src/observability/logger.ts
import pino from 'pino';

export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level: (label) => {
      return { level: label.toUpperCase() };
    },
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  base: {
    service: 'web-app',
    environment: process.env.NODE_ENV,
    version: process.env.APP_VERSION,
  },
  redact: {
    paths: ['req.headers.authorization', 'req.headers.cookie', 'password', 'token'],
    remove: true,
  },
});

// src/observability/metrics.ts
import { Registry, Counter, Histogram } from 'prom-client';

export const register = new Registry();

export const httpRequestDuration = new Histogram({
  name: 'http_request_duration_seconds',
  help: 'Duration of HTTP requests in seconds',
  labelNames: ['method', 'route', 'status_code'],
  buckets: [0.001, 0.005, 0.015, 0.05, 0.1, 0.5, 1, 5],
  registers: [register],
});

export const httpRequestTotal = new Counter({
  name: 'http_requests_total',
  help: 'Total number of HTTP requests',
  labelNames: ['method', 'route', 'status_code'],
  registers: [register],
});

// src/middleware/observability.ts
import { Request, Response, NextFunction } from 'express';
import { logger } from '../observability/logger';
import { httpRequestDuration, httpRequestTotal } from '../observability/metrics';

export function observabilityMiddleware(req: Request, res: Response, next: NextFunction) {
  const start = Date.now();

  res.on('finish', () => {
    const duration = (Date.now() - start) / 1000;
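    // Use a fixed label for unmatched routes; raw paths (with IDs in them) would
    // create unbounded label cardinality in Prometheus.
    const route = req.route?.path || 'unmatched';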

    httpRequestDuration.observe(
      { method: req.method, route, status_code: res.statusCode },
      duration
    );

    httpRequestTotal.inc({
      method: req.method,
      route,
      status_code: res.statusCode,
    });

    logger.info({
      msg: 'HTTP request completed',
      method: req.method,
      path: req.path,
      status: res.statusCode,
      duration,
      userAgent: req.get('user-agent'),
      ip: req.ip,
    });
  });

  next();
}
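The bucket boundaries in the histogram determine which latency questions you can answer later, so it helps to see how an observation is recorded. The following is a simplified model of cumulative Prometheus-style buckets, not prom-client's actual implementation. Each observation increments every bucket whose upper bound is at least the observed value, plus the implicit +Inf bucket.

```typescript
// Simplified model of cumulative histogram buckets (illustrative only).
// Boundaries are the ones used for http_request_duration_seconds above.
const BUCKETS: number[] = [0.001, 0.005, 0.015, 0.05, 0.1, 0.5, 1, 5];

function emptyCounts(): Record<string, number> {
  const counts: Record<string, number> = { "+Inf": 0 };
  for (const le of BUCKETS) counts[String(le)] = 0;
  return counts;
}

// Record one observation: bump every bucket with upper bound >= value.
function observe(counts: Record<string, number>, seconds: number): void {
  for (const le of BUCKETS) {
    if (seconds <= le) counts[String(le)] = (counts[String(le)] ?? 0) + 1;
  }
  counts["+Inf"] = (counts["+Inf"] ?? 0) + 1;
}

// Requests slower than a bucket boundary = total minus that bucket's count.
function slowerThan(counts: Record<string, number>, le: number): number {
  return (counts["+Inf"] ?? 0) - (counts[String(le)] ?? 0);
}

const counts = emptyCounts();
for (const seconds of [0.03, 0.7, 7]) observe(counts, seconds);
console.log(slowerThan(counts, 0.5)); // 2
```

Because counts are only kept per boundary, a latency objective such as "under 250 ms" cannot be measured exactly with these buckets; adding a boundary at the objective is the usual remedy.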

Common Pitfalls and Failure Modes

Even well-designed deployment systems encounter specific failure modes that require proactive mitigation.

Resource exhaustion during deployment: Rolling updates temporarily increase resource consumption as new pods start before old pods terminate. Set maxSurge appropriately and ensure cluster capacity can handle 120-150% of normal pod count during deployments. Monitor node CPU and memory during rollouts.

Database migration failures: Schema migrations that run during deployment can fail halfway through, leaving the database in an inconsistent state. Wrap migrations in a transaction where the database supports transactional DDL (PostgreSQL does; in MySQL, DDL statements commit implicitly), and test migrations against production-sized datasets in staging. Prefer backward-compatible migrations rolled out in three steps: first deploy code that works with both the old and new schemas, then migrate the data, and finally remove support for the old schema.
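As a sketch of what "code that works with both old and new schemas" can look like, assume a hypothetical users table that is moving from first_name and last_name columns to a single full_name column. The column names and helper functions are invented for illustration. Reads prefer the new column and fall back to the old ones, and writes populate both until the old columns are dropped.

```typescript
// Sketch of dual-schema access during a phased migration.
// The users table and its columns are hypothetical.

interface UserRow {
  id: number;
  first_name?: string | null; // old schema
  last_name?: string | null; // old schema
  full_name?: string | null; // new schema
}

// Read path: prefer the new column, fall back to the old pair.
function readFullName(row: UserRow): string {
  if (row.full_name != null && row.full_name !== "") return row.full_name;
  return [row.first_name, row.last_name]
    .filter((part): part is string => typeof part === "string" && part !== "")
    .join(" ");
}

// Write path: populate old and new columns so either code version can read the row.
function buildNameUpdate(firstName: string, lastName: string): Omit<UserRow, "id"> {
  return {
    first_name: firstName,
    last_name: lastName,
    full_name: `${firstName} ${lastName}`.trim(),
  };
}

console.log(readFullName({ id: 1, first_name: "Ada", last_name: "Lovelace" })); // Ada Lovelace
console.log(readFullName({ id: 2, full_name: "Grace Hopper" })); // Grace Hopper
```

Once every row has full_name set and no deployed version reads the old columns, the fallback branch and the dual write can be removed together with the columns.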

Configuration drift between environments: Staging environments that don't match production topology cause deployment failures that only surface in production. Use identical Kubernetes manifests across environments, varying only through Kustomize overlays or Helm values. Regularly refresh staging data from sanitized production snapshots so that data volume and shape stay representative.

Insufficient health check coverage: Applications that pass liveness probes but fail to handle traffic correctly cause silent failures. Implement comprehensive readiness checks that verify database connectivity, cache availability, and external API reachability. Include startup probes for applications with long initialization times.
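A readiness endpoint along these lines can be sketched as a set of named dependency checks that run in parallel, each with its own timeout. Everything here is an assumption for illustration: the check names, the 2-second default (chosen to stay under the 3-second probe timeout in the manifest), and the response shape. Wiring it to the /health/ready route is left to your framework.

```typescript
// Sketch of readiness aggregation. Check names, timeout, and response shape
// are illustrative assumptions.

type CheckResult = { name: string; ok: boolean; detail?: string };
type Check = { name: string; run: () => Promise<void> }; // rejects when unhealthy

// Run one check, treating a timeout the same as a failure.
async function runWithTimeout(check: Check, timeoutMs: number): Promise<CheckResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${timeoutMs}ms`)), timeoutMs);
  });
  try {
    await Promise.race([check.run(), timeout]);
    return { name: check.name, ok: true };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { name: check.name, ok: false, detail };
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Map individual results to an HTTP status: 200 only if every check passed.
function toReadinessResponse(results: CheckResult[]): {
  status: number;
  body: { ready: boolean; checks: CheckResult[] };
} {
  const ready = results.every((r) => r.ok);
  return { status: ready ? 200 : 503, body: { ready, checks: results } };
}

// Run all checks in parallel and build the probe response.
async function checkReadiness(checks: Check[], timeoutMs: number = 2000) {
  const results = await Promise.all(checks.map((c) => runWithTimeout(c, timeoutMs)));
  return toReadinessResponse(results);
}
```

Returning the per-check detail in the body makes a 503 from the probe much easier to diagnose from kubectl describe output or logs.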

Cascading failures during rollout: New application versions that introduce performance regressions can overwhelm downstream services during deployment. Implement circuit breakers, rate limiting, and bulkheads. Use progressive delivery with automated rollback based on error rate thresholds.
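Of the three protections mentioned, the circuit breaker is the easiest to show in a few lines. This is a minimal, single-process sketch with an injectable clock so it can be tested. The class name, thresholds, and API are assumptions, and production services would more likely use an established library or a service mesh policy.

```typescript
// Minimal circuit breaker sketch (illustrative; not a library API).

type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // Whether a call to the downstream service may proceed right now.
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.resetTimeoutMs) {
      this.state = "half-open"; // cool-down elapsed: allow trial calls again
    }
    return this.state !== "open";
  }

  // A success closes the breaker and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  // A failure during a trial, or too many consecutive failures, opens the breaker.
  recordFailure(): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = this.now();
    }
  }

  getState(): BreakerState {
    return this.state;
  }
}
```

A caller checks allowRequest() before each downstream call, then reports the outcome with recordSuccess() or recordFailure(). While the breaker is open, the caller fails fast or serves a fallback instead of adding load to a struggling dependency.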

Secret rotation during deployment: Rotating secrets while deployments are in progress can cause authentication failures. Implement secret versioning that allows both old and new secrets to work during rotation windows. Use grace periods of at least 2x your