API Gateway Authentication: Zero Trust Security

JWT validation and rate limiting at the edge with Kong and Tyk

The traditional perimeter-based security model is dead. When your microservices architecture spans multiple clouds, edge locations, and third-party integrations, the assumption that anything inside your network is trustworthy becomes a critical vulnerability. API gateway authentication has evolved from a simple credential check to a comprehensive zero trust enforcement point that validates every request, regardless of origin.

The problem isn't just about blocking unauthorized access anymore. Modern API gateways must handle distributed identity verification, enforce fine-grained authorization policies, prevent abuse through intelligent rate limiting, and do all of this with sub-10ms latency overhead. Traditional approaches that rely on session cookies, network-level trust, or centralized authentication servers create bottlenecks and single points of failure that don't scale with cloud-native architectures.

Why Traditional API Authentication Fails at Scale

Legacy authentication patterns break down in distributed systems for several concrete reasons. Session-based authentication requires sticky sessions or distributed session stores, adding latency and complexity. Network-level security assumes internal traffic is safe, which ignores lateral movement attacks and compromised services. Centralized authentication servers become bottlenecks when handling thousands of requests per second across geographic regions.

The shift to microservices exacerbates these issues. When a single user request triggers 15-20 internal service calls, each requiring authentication, the overhead compounds. Services need to verify identity independently without creating a cascade of authentication requests back to a central authority. This is where zero trust principles and edge authentication become essential.

Zero Trust Architecture for API Gateways

Zero trust security operates on a simple principle: never trust, always verify. Every request must prove its identity and authorization, regardless of where it originates. For API gateways, this means implementing authentication and authorization at the edge before traffic reaches your internal services.

The architecture consists of several layers. The API gateway sits at the perimeter, validating JWT tokens, enforcing rate limits, and checking authorization policies. Behind it, services can optionally perform additional validation but primarily trust the gateway's verification. Identity providers (like Auth0, Keycloak, or custom OAuth2 servers) issue tokens but don't participate in every request validation.

This approach decentralizes authentication while maintaining security. The gateway caches public keys for JWT verification, validates token signatures cryptographically, and enforces policies without round-trips to external services. Services receive pre-validated requests with identity context, reducing latency and complexity.
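As a rough illustration of how a gateway might pass identity context to services, the sketch below builds the upstream header set from already-verified claims. The header names `x-user-id` and `x-user-scopes`, the `VerifiedIdentity` shape, and the `buildUpstreamHeaders` helper are assumptions made for this example, not a Kong or Tyk convention.

```typescript
// Header names are illustrative assumptions, not a Kong or Tyk convention.
const IDENTITY_HEADERS: string[] = ['x-user-id', 'x-user-scopes'];

interface VerifiedIdentity {
  sub: string;
  scopes: string[];
}

// Returns the headers to forward upstream (names are lower-cased).
function buildUpstreamHeaders(
  incoming: Record<string, string>,
  identity: VerifiedIdentity | null
): Record<string, string> {
  const outgoing: Record<string, string> = {};
  for (const name of Object.keys(incoming)) {
    const lower = name.toLowerCase();
    // Drop any identity header the client sent so it cannot be spoofed.
    if (IDENTITY_HEADERS.indexOf(lower) === -1) {
      outgoing[lower] = incoming[name];
    }
  }
  if (identity !== null) {
    outgoing['x-user-id'] = identity.sub;
    outgoing['x-user-scopes'] = identity.scopes.join(' ');
  }
  return outgoing;
}
```

Stripping client-supplied copies before setting the verified values matters here: services that trust these headers would otherwise accept whatever a caller chose to send.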

Implementing JWT Validation at the Edge

JWTs provide a self-contained authentication mechanism well suited to distributed systems. The token carries identity claims, an expiration time, and a signature that gateways can verify independently. Here's a validator built on the jose library that can be called from a Kong Gateway plugin written in TypeScript:

// kong-jwt-validator.ts
import { createRemoteJWKSet, jwtVerify } from 'jose';
import type { JWTPayload } from 'jose';

interface ValidationConfig {
  jwksUri: string;
  issuer: string;
  audience: string;
  clockTolerance: number; // seconds
}

class JWTValidator {
  private jwks: ReturnType<typeof createRemoteJWKSet>;
  private config: ValidationConfig;

  constructor(config: ValidationConfig) {
    this.config = config;
    this.jwks = createRemoteJWKSet(new URL(config.jwksUri), {
      cacheMaxAge: 3600000, // 1 hour cache
      cooldownDuration: 30000, // 30 second cooldown on refresh
    });
  }

  async validate(token: string): Promise<JWTPayload> {
    try {
      const { payload } = await jwtVerify(token, this.jwks, {
        issuer: this.config.issuer,
        audience: this.config.audience,
        clockTolerance: this.config.clockTolerance,
      });

      // Additional custom validations
      if (!payload.sub) {
        throw new Error('Missing subject claim');
      }

      if (typeof payload.scope === 'string') {
        // ''.split(' ') yields [''], so filter out empty entries before counting
        const scopes = payload.scope.split(' ').filter((s) => s.length > 0);
        if (scopes.length === 0) {
          throw new Error('No scopes present');
        }
      }

      return payload;
    } catch (error) {
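      // `error` is `unknown` under strict TypeScript, so narrow it before reading `message`
      const reason = error instanceof Error ? error.message : String(error);
      throw new Error(`JWT validation failed: ${reason}`);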
    }
  }

  extractScopes(payload: JWTPayload): string[] {
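    if (typeof payload.scope === 'string') {
      // Drop empty entries produced by repeated or trailing spaces
      return payload.scope.split(' ').filter((s) => s.length > 0);
    }
    if (Array.isArray(payload.scopes)) {
      // Keep only string entries so the declared return type holds at runtime
      return payload.scopes.filter((s): s is string => typeof s === 'string');
    }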
    return [];
  }
}

export default JWTValidator;

This implementation handles several critical aspects. The JWKS (JSON Web Key Set) is cached to avoid fetching public keys on every request. Clock tolerance accounts for time drift between services. The validator extracts scopes for fine-grained authorization decisions downstream.
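To make the downstream authorization step concrete, here is a minimal sketch of a scope check that consumes the array returned by `extractScopes`. The `routePolicies` table, its route keys, and the scope names are hypothetical; a real gateway would load them from its own configuration.

```typescript
// Returns true only if every required scope was granted.
function hasRequiredScopes(granted: string[], required: string[]): boolean {
  return required.every((scope) => granted.indexOf(scope) !== -1);
}

// Hypothetical policy table: route key -> scopes the caller must hold.
const routePolicies: Record<string, string[]> = {
  'GET /orders': ['orders:read'],
  'POST /orders': ['orders:read', 'orders:write'],
};

function isAuthorized(routeKey: string, granted: string[]): boolean {
  // Deny by default when a route has no policy entry of its own.
  if (!Object.prototype.hasOwnProperty.call(routePolicies, routeKey)) {
    return false;
  }
  return hasRequiredScopes(granted, routePolicies[routeKey]);
}
```

Denying routes with no policy entry keeps a forgotten configuration from silently opening an endpoint.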

Rate Limiting with Token Bucket Algorithm

Rate limiting prevents abuse and ensures fair resource allocation. The token bucket algorithm provides smooth rate limiting with burst capacity, which suits API gateways well. Here's a Redis-backed implementation that can be called from custom Tyk Gateway middleware:

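// tyk-rate-limiter.ts
// Assumes the ioredis client, whose eval() and hmget() signatures match the calls below
import type { Redis as RedisClient } from 'ioredis';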
interface RateLimitConfig {
  capacity: number;
  refillRate: number; // tokens per second
  keyPrefix: string;
}

interface TokenBucket {
  tokens: number;
  lastRefill: number;
}

class DistributedRateLimiter {
  private redis: RedisClient;
  private config: RateLimitConfig;

  constructor(redis: RedisClient, config: RateLimitConfig) {
    this.redis = redis;
    this.config = config;
  }

  async checkLimit(identifier: string): Promise<boolean> {
    const key = `${this.config.keyPrefix}:${identifier}`;
    const now = Date.now();

    // Lua script for atomic token bucket operations
    const script = `
      local key = KEYS[1]
      local capacity = tonumber(ARGV[1])
      local refill_rate = tonumber(ARGV[2])
      local now = tonumber(ARGV[3])
      local requested = tonumber(ARGV[4])

      local bucket = redis.call('HMGET', key, 'tokens', 'last_refill')
      local tokens = tonumber(bucket[1]) or capacity
      local last_refill = tonumber(bucket[2]) or now

      -- Calculate tokens to add based on time elapsed
      local elapsed = (now - last_refill) / 1000
      local tokens_to_add = elapsed * refill_rate
      tokens = math.min(capacity, tokens + tokens_to_add)

      if tokens >= requested then
        tokens = tokens - requested
        -- HSET accepts multiple field/value pairs since Redis 4.0; HMSET is deprecated
        redis.call('HSET', key, 'tokens', tokens, 'last_refill', now)
        redis.call('EXPIRE', key, 3600)
        return 1
      else
        return 0
      end
    `;

    const result = await this.redis.eval(
      script,
      1,
      key,
      this.config.capacity,
      this.config.refillRate,
      now,
      1 // requesting 1 token
    );

    return result === 1;
  }

  async getRemainingTokens(identifier: string): Promise<number> {
    const key = `${this.config.keyPrefix}:${identifier}`;
    const bucket = await this.redis.hmget(key, 'tokens', 'last_refill');

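    const [storedTokens, storedRefill] = bucket;
    // No stored bucket means the caller still has a full allowance
    if (!storedTokens || !storedRefill) return this.config.capacity;

    const tokens = parseFloat(storedTokens);
    const lastRefill = parseFloat(storedRefill);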
    const now = Date.now();
    const elapsed = (now - lastRefill) / 1000;
    const tokensToAdd = elapsed * this.config.refillRate;

    return Math.min(this.config.capacity, tokens + tokensToAdd);
  }
}

export default DistributedRateLimiter;

The Lua script runs atomically in Redis, preventing race conditions when several gateway nodes update the same bucket. The bucket refills continuously, allowing bursts up to its capacity while holding the long-run average to the refill rate. This approach scales horizontally since Redis holds the shared state. Because each node passes its own timestamp to the script, keep gateway clocks synchronized.
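The refill arithmetic is easier to follow outside Redis. The sketch below mirrors the same calculation as a pure in-memory function with an injected clock. It is a single-process illustration only, and `takeToken` and `BucketState` are names invented for this example.

```typescript
interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

// Decide whether one request is allowed and return the next bucket state.
function takeToken(
  state: BucketState | undefined,
  capacity: number,
  refillPerSecond: number,
  nowMs: number
): { allowed: boolean; state: BucketState } {
  // A caller with no stored bucket starts with a full allowance.
  const prev = state ?? { tokens: capacity, lastRefillMs: nowMs };
  const elapsedSeconds = Math.max(0, nowMs - prev.lastRefillMs) / 1000;
  const tokens = Math.min(capacity, prev.tokens + elapsedSeconds * refillPerSecond);

  if (tokens >= 1) {
    return { allowed: true, state: { tokens: tokens - 1, lastRefillMs: nowMs } };
  }
  // Denied: keep the old state so the refill keeps accruing from lastRefillMs.
  return { allowed: false, state: prev };
}
```

With a capacity of 2 and a refill rate of 1 token per second, two back-to-back requests pass, a third is denied, and the next request is allowed once a full second has elapsed since the last successful one.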

Combining Authentication and Rate Limiting

The real power comes from combining these mechanisms. Rate limits should vary with authentication context, so authenticated users get higher limits than anonymous requests and premium tiers get more capacity than free tiers:

// gateway-middleware.ts
import JWTValidator from './kong-jwt-validator';
import DistributedRateLimiter from './tyk-rate-limiter';

interface RequestContext {
  userId?: string;
  tier: 'anonymous' | 'free' | 'premium' | 'enterprise';
  scopes: string[];
}

class GatewayMiddleware {
  private jwtValidator: JWTValidator;
  private rateLimiters: Map<string, DistributedRateLimiter>;

  constructor(
    jwtValidator: JWTValidator,
    rateLimiters: Map<string, DistributedRateLimiter>
  ) {
    this.jwtValidator = jwtValidator;
    this.rateLimiters = rateLimiters;
  }

  async processRequest(
    authHeader: string | undefined,
    clientIp: string
  ): Promise<RequestContext> {
    let context: RequestContext = {
      tier: 'anonymous',
      scopes: [],
    };

    // Authenticate if token present
    if (authHeader?.startsWith('Bearer ')) {
      const token = authHeader.substring(7);
      try {
        const payload = await this.jwtValidator.validate(token);
        context.userId = payload.sub;
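        // Only accept tiers we know about; anything else falls back to 'free'
        const tier = payload.tier;
        context.tier =
          tier === 'free' || tier === 'premium' || tier === 'enterprise' ? tier : 'free';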
        context.scopes = this.jwtValidator.extractScopes(payload);
      } catch (error) {
        throw new Error('Invalid authentication token');
      }
    }

    // Apply rate limiting based on context
    const rateLimiter = this.rateLimiters.get(context.tier);
    if (!rateLimiter) {
      throw new Error('Rate limiter not configured');
    }

    const identifier = context.userId || clientIp;
    const allowed = await rateLimiter.checkLimit(identifier);

    if (!allowed) {
      const remaining = await rateLimiter.getRemainingTokens(identifier);
      throw new Error(
        `Rate limit exceeded. Tokens remaining: ${remaining.toFixed(2)}`
      );
    }

    return context;
  }
}

export default GatewayMiddleware;

This middleware authenticates requests, extracts user context, and applies appropriate rate limits in a single pass. The gateway enriches requests with validated identity information before forwarding to backend services.
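The middleware signals rejections by throwing, so something still has to turn them into HTTP responses. One possible mapping is sketched below, assuming the caller can classify the failure. The `Rejection` and `GatewayResponse` types and both helper functions are hypothetical and not part of Kong or Tyk.

```typescript
type Rejection =
  | { kind: 'unauthenticated' }
  | { kind: 'rate_limited'; tokensRemaining: number; refillPerSecond: number };

interface GatewayResponse {
  status: number;
  headers: Record<string, string>;
  body: string;
}

// Whole seconds until at least one token is available again.
// Assumes refillPerSecond is greater than zero.
function retryAfterSeconds(tokensRemaining: number, refillPerSecond: number): number {
  if (tokensRemaining >= 1) return 0;
  return Math.ceil((1 - tokensRemaining) / refillPerSecond);
}

function toHttpResponse(rejection: Rejection): GatewayResponse {
  if (rejection.kind === 'unauthenticated') {
    return {
      status: 401,
      headers: { 'WWW-Authenticate': 'Bearer error="invalid_token"' },
      body: 'Invalid authentication token',
    };
  }
  const wait = retryAfterSeconds(rejection.tokensRemaining, rejection.refillPerSecond);
  return {
    status: 429,
    headers: { 'Retry-After': String(wait) },
    body: 'Rate limit exceeded',
  };
}
```

Returning `Retry-After` gives well-behaved clients a concrete backoff interval instead of leaving them to retry immediately.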

Common Pitfalls in API Gateway Authentication

Token validation without caching: Fetching the JWKS on every request puts a network round-trip in the hot path and can add 50-100ms of latency. Always cache public keys with appropriate TTLs and implement background refresh.

Ignoring token expiration edge cases: Clock skew between services causes valid tokens to be rejected. Implement clock tolerance (typically 30-60 seconds) in JWT validation.

Rate limiting by IP only: NAT and proxies cause multiple users to share IPs. Use authenticated user IDs when available, falling back to IP for anonymous requests.

Synchronous external calls in the hot path: Every external dependency in request validation adds latency and failure points. Validate tokens cryptographically without calling identity providers.

Insufficient monitoring: Authentication failures, rate limit hits, and token validation errors provide security insights. Implement comprehensive metrics and alerting.

Missing token revocation strategy: A signed JWT remains valid until it expires, so the gateway cannot revoke it without extra state. Use short access token lifetimes (5-15 minutes) with refresh tokens, or maintain a revocation list for critical scenarios.
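For the revocation list mentioned in the last pitfall, a minimal sketch might key entries by the token's `jti` claim and drop them once the token would have expired anyway. The `RevocationList` class is hypothetical and in-memory. A multi-node gateway would need shared storage such as the Redis instance already used for rate limiting.

```typescript
class RevocationList {
  // jti -> token expiry, in seconds since the epoch
  private revoked = new Map<string, number>();

  revoke(jti: string, expSeconds: number): void {
    this.revoked.set(jti, expSeconds);
  }

  isRevoked(jti: string, nowSeconds: number): boolean {
    this.prune(nowSeconds);
    return this.revoked.has(jti);
  }

  size(): number {
    return this.revoked.size;
  }

  // Expired tokens are rejected by normal validation, so their entries can go.
  private prune(nowSeconds: number): void {
    this.revoked.forEach((expSeconds, jti) => {
      if (expSeconds <= nowSeconds) {
        this.revoked.delete(jti);
      }
    });
  }
}
```

Because entries only live as long as the tokens they block, short access token lifetimes also keep this list small.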

Best Practices Checklist

Implement JWT validation at the edge with cached JWKS and cryptographic verification
Use token bucket rate limiting with Redis for distributed state management
Apply tiered rate limits based on authentication context and user roles
Set appropriate token expiration (5-15 minutes for access tokens, longer for refresh tokens)
Monitor authentication metrics including validation failures, rate limit hits, and latency
Implement circuit breakers for external dependencies like JWKS endpoints
Use mutual TLS for service-to-service communication behind the gateway
Rotate signing keys regularly and support multiple active keys during rotation
Log security events to SIEM systems for threat detection and compliance
Test rate limiting under load to ensure Redis performance scales with traffic
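The circuit breaker item in the checklist can be sketched as a small state machine. This is a simplified illustration with an injectable clock, not a substitute for a hardened library, and the `CircuitBreaker` class and its parameters are invented for this example.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures = 0;
  private openedAtMs = 0;
  private state: BreakerState = 'closed';

  constructor(
    private readonly failureThreshold: number,
    private readonly resetTimeoutMs: number,
    private readonly now: () => number = Date.now
  ) {}

  // True if a call to the dependency may be attempted right now.
  canAttempt(): boolean {
    if (this.state === 'open' && this.now() - this.openedAtMs >= this.resetTimeoutMs) {
      // Timeout elapsed: allow trial calls until a result is recorded.
      this.state = 'half-open';
    }
    return this.state !== 'open';
  }

  recordSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }

  recordFailure(): void {
    this.failures += 1;
    // A failed trial call, or too many failures in a row, opens the breaker.
    if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
      this.state = 'open';
      this.openedAtMs = this.now();
    }
  }

  current(): BreakerState {
    return this.state;
  }
}
```

Wrapped around a JWKS fetch, an open breaker would let the gateway keep verifying against its cached keys instead of stalling requests on an unreachable identity provider.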

Frequently Asked Questions

How do I handle token refresh without disrupting user experience?

Implement a refresh token flow where clients request new access tokens before expiration. The gateway validates short-lived access tokens (5-15 minutes) while clients use longer-lived refresh tokens (days to weeks) to obtain new access tokens. This balances security with usability: compromised access tokens expire quickly, while users don't need to re-authenticate constantly.

Should I validate JWTs in both the gateway and backend services?

The gateway should perform primary validation, but critical services should verify token signatures independently. This defense-in-depth approach protects against gateway compromise or misconfiguration. Backend services don't need to fetch the JWKS on every request; they can verify signatures against locally cached keys, which keeps the extra check cheap.

How do I implement rate limiting for GraphQL APIs where a single request can be expensive?

Use query complexity analysis to assign cost values to different operations. Instead of rate limiting by request count, limit by accumulated query cost. Calculate complexity based on field depth, list sizes, and resolver costs. This prevents abuse through expensive queries while allowing legitimate simple queries.
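As a rough sketch of cost assignment, the function below walks a simplified field tree and multiplies each list field's cost by its expected size. The `FieldNode` shape is a stand-in for a parsed GraphQL selection set, and the weighting (1 per field, multiplied by list size) is just one possible scheme.

```typescript
// Simplified stand-in for a parsed GraphQL selection set.
interface FieldNode {
  name: string;
  listSize?: number; // expected item count when the field returns a list
  children?: FieldNode[];
}

// Cost = (1 for the field itself + cost of its children) * list size.
function fieldCost(field: FieldNode): number {
  let childCost = 0;
  for (const child of field.children ?? []) {
    childCost += fieldCost(child);
  }
  return (1 + childCost) * (field.listSize ?? 1);
}

// Roughly: { user { name posts(first: 10) { title comments(first: 5) { body } } } }
const exampleQuery: FieldNode = {
  name: 'user',
  children: [
    { name: 'name' },
    {
      name: 'posts',
      listSize: 10,
      children: [
        { name: 'title' },
        { name: 'comments', listSize: 5, children: [{ name: 'body' }] },
      ],
    },
  ],
};

const exampleCost = fieldCost(exampleQuery); // 122 under this weighting
```

The Lua script in the rate limiter already takes a `requested` argument, so a cost-based limit could pass the query cost there instead of the fixed value of 1.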

What's the best way to handle rate limiting in multi-region deployments?

Keep rate limit state in a Redis cluster per region and enforce limits regionally. A client that spreads traffic across regions can then slightly exceed the global limit (for example, around 105% of it), which most applications accept in exchange for low latency. For strict global limits, route every check to a single global Redis deployment and accept the added cross-region latency. Most applications benefit from regional enforcement combined with monitoring to detect abuse patterns.

How do I migrate from session-based authentication to JWT without breaking existing clients?

Run both systems in parallel during migration. The gateway checks for JWT tokens first, falling back to session validation. Issue JWTs alongside sessions for authenticated users. Gradually migrate clients to JWT-based authentication. Monitor usage metrics to determine when session support can be deprecated. This phased approach minimizes disruption.
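The JWT-first, session-fallback order could be expressed as a small selector like the one below. It is a sketch only. The `session_id` cookie name and the `selectAuthMethod` helper are assumptions, and the actual JWT and session validation would happen after this step.

```typescript
type AuthMethod = 'jwt' | 'session' | 'none';

interface IncomingCredentials {
  authorization?: string;
  cookie?: string;
}

// Hypothetical cookie name used by the legacy session system.
const SESSION_COOKIE = 'session_id';

function selectAuthMethod(creds: IncomingCredentials): AuthMethod {
  // Prefer a bearer token whenever the client sends one.
  if (creds.authorization && creds.authorization.indexOf('Bearer ') === 0) {
    return 'jwt';
  }
  // Otherwise fall back to the legacy session cookie, if present.
  if (creds.cookie) {
    const hasSession = creds.cookie
      .split(';')
      .some((part) => part.trim().indexOf(SESSION_COOKIE + '=') === 0);
    if (hasSession) return 'session';
  }
  return 'none';
}
```

Counting how often each branch is taken gives the usage metric needed to decide when session support can be retired.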

Should I encrypt JWT payloads or are signatures sufficient?

Signatures prevent tampering but don't hide payload contents. If tokens contain sensitive data, use JWE (JSON Web Encryption) instead of JWS (JSON Web Signature). However, best practice is to keep tokens minimal and include only the user ID, expiration, and scopes. Store sensitive data server-side and reference it by user ID. This keeps tokens small and avoids encryption overhead.

How do I handle authentication for WebSocket connections through the gateway?

Authenticate during the initial WebSocket handshake by sending a JWT with the upgrade request. Browser WebSocket APIs cannot set an Authorization header, so browser clients usually pass the token in a query parameter or subprotocol value instead. The gateway validates the token and establishes the connection. For long-lived connections, implement periodic token refresh where clients send new tokens over the WebSocket channel. The gateway validates these refresh messages and updates connection context without dropping the connection.

API gateway authentication with zero trust principles transforms security from a perimeter defense to a comprehensive verification system. By implementing JWT validation and intelligent rate limiting at the edge, you create a scalable, secure foundation for modern microservices architectures. The key is treating every request as untrusted, validating cryptographically, and enforcing policies before traffic reaches your services.
