API Gateway Authentication: Zero Trust Security

JWT validation and rate limiting at the edge with Kong and Tyk

The traditional perimeter-based security model is dead. When your microservices architecture spans multiple clouds, edge locations, and third-party integrations, the concept of a "trusted internal network" becomes meaningless. Every API request—regardless of origin—must be authenticated, authorized, and validated before reaching your services.

API gateway authentication has evolved from a simple credential check into a zero trust enforcement point. Modern gateways must validate JWTs locally in microseconds, enforce granular rate limits per client, and make authorization decisions without introducing latency that degrades user experience. The challenge isn't just implementing authentication; it's doing so at scale while maintaining sub-10ms p99 latency.

Why Traditional Authentication Patterns Fail at the Edge

Legacy authentication architectures typically rely on session-based authentication with sticky sessions or centralized authentication services that become bottlenecks. These patterns break down in distributed systems for several reasons:

Session affinity creates operational complexity. When authentication state lives in memory on specific gateway instances, you need complex load balancer configurations and lose the ability to scale horizontally without disruption.

Synchronous authentication calls add latency. A network call to an authentication service on every request typically adds 20-50ms, often more under load. The cost compounds across microservice chains: five services that each make a 20ms check add 100ms to a single request.

Insufficient context at the gateway. Traditional gateways only check if a token is valid, not whether the specific client should access the specific resource at this moment given current rate limits, quotas, and dynamic policies.

The zero trust model requires that authentication and authorization happen at the edge, with full context, before requests enter your infrastructure. This means your gateway must be stateful enough to enforce policies but stateless enough to scale horizontally.

Modern API Gateway Authentication Architecture

A production-grade API gateway authentication system requires three core components working in concert:

JWT validation with public key caching ensures tokens are cryptographically verified without calling external services on each request. The gateway maintains a hot cache of public keys from your identity provider's JWKS endpoint, selects the key named by the token's kid (key ID) header, and refreshes the cache when it sees an unknown kid or when the cached copy expires.

Distributed rate limiting tracks request counts across gateway instances using a shared state store like Redis or a gossip protocol. This prevents clients from bypassing limits by distributing requests across multiple gateway instances.

Policy-based authorization evaluates fine-grained access control rules at the gateway using attributes from the JWT claims, request metadata, and real-time context like current rate limit consumption.

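Here's how the first two components, JWT validation and distributed rate limiting, fit together in a Kong-based implementation (policy evaluation is covered in the next section):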

// kong-jwt-validator.ts
import { createRemoteJWKSet, jwtVerify } from 'jose';
import { createClient } from 'redis';

interface JWTValidationConfig {
  jwksUri: string;
  audience: string;
  issuer: string;
  redisUrl: string;
}

class EdgeAuthenticator {
  private jwks: ReturnType<typeof createRemoteJWKSet>;
  private redis: ReturnType<typeof createClient>;
  private rateLimitWindow = 60; // seconds

  constructor(private config: JWTValidationConfig) {
    this.jwks = createRemoteJWKSet(new URL(config.jwksUri), {
      cacheMaxAge: 3600000, // 1 hour
      cooldownDuration: 30000, // 30 seconds
    });

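    this.redis = createClient({ url: config.redisUrl });
    // connect() returns a promise; report failures instead of leaving them unhandled
    this.redis.on('error', (err) => console.error('Redis client error', err));
    this.redis.connect().catch((err) => console.error('Redis connect failed', err));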
  }

  async validateRequest(token: string, clientId: string, endpoint: string): Promise<{
    valid: boolean;
    claims?: any;
    rateLimitRemaining?: number;
    error?: string;
  }> {
    try {
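      // JWT validation; the remote JWKS picks the key by kid and refetches on rotation
      const { payload } = await jwtVerify(token, this.jwks, {
        audience: this.config.audience,
        issuer: this.config.issuer,
        maxTokenAge: '1h', // requires an iat claim in the token
        clockTolerance: 60, // seconds of clock skew tolerated on time-based claims
      });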

      // Extract rate limit tier from JWT claims
      const rateLimitTier = payload.tier as string || 'default';
      const maxRequests = this.getRateLimitForTier(rateLimitTier);

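      // Distributed rate limiting check: fixed-window counter shared through Redis
      const windowId = Math.floor(Date.now() / 1000 / this.rateLimitWindow);
      const rateLimitKey = `ratelimit:${clientId}:${endpoint}:${windowId}`;
      const currentCount = await this.redis.incr(rateLimitKey);

      // First hit in this window: set a TTL so the counter is cleaned up afterwards
      if (currentCount === 1) {
        await this.redis.expire(rateLimitKey, this.rateLimitWindow);
      }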

      if (currentCount > maxRequests) {
        return {
          valid: false,
          error: 'rate_limit_exceeded',
          rateLimitRemaining: 0,
        };
      }

      return {
        valid: true,
        claims: payload,
        rateLimitRemaining: maxRequests - currentCount,
      };
    } catch (error) {
      return {
        valid: false,
        error: error instanceof Error ? error.message : 'validation_failed',
      };
    }
  }

  private getRateLimitForTier(tier: string): number {
    const limits: Record<string, number> = {
      'free': 100,
      'pro': 1000,
      'enterprise': 10000,
    };
    return limits[tier] || limits['free'];
  }
}

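For Tyk, the same logic can run as a custom authentication plugin in its built-in middleware chain. The sketch assumes a validateJWT helper (not shown) that verifies the token and returns its claims: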

// tyk-custom-auth-plugin.ts
interface TykRequest {
  Headers: Record<string, string>;
  Body: string;
  URL: string;
}

interface TykResponse {
  Body: string;
  Headers: Record<string, string>;
  Code: number;
}

interface TykSessionState {
  rate: number;
  per: number;
  quota_max: number;
  quota_remaining: number;
  metadata: Record<string, any>;
}

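// Assumed helper (not shown): verifies signature and standard claims, then resolves to the claims.
declare function validateJWT(token: string): Promise<Record<string, any>>;

async function validateAndEnrich(request: TykRequest, session: TykSessionState): Promise<TykResponse | null> {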
  const authHeader = request.Headers['Authorization'];

  if (!authHeader?.startsWith('Bearer ')) {
    return {
      Body: JSON.stringify({ error: 'missing_token' }),
      Headers: { 'Content-Type': 'application/json' },
      Code: 401,
    };
  }

  const token = authHeader.substring(7);

  try {
    // Validate JWT and extract claims
    const claims = await validateJWT(token);

    // Enrich session with JWT claims for downstream services
    session.metadata = {
      user_id: claims.sub,
      tenant_id: claims.tenant_id,
      scopes: claims.scope?.split(' ') || [],
      tier: claims.tier,
    };

    // Dynamic rate limiting based on JWT claims
    const tierLimits = {
      'free': { rate: 10, per: 60 },
      'pro': { rate: 100, per: 60 },
      'enterprise': { rate: 1000, per: 60 },
    };

    const limits = tierLimits[claims.tier as keyof typeof tierLimits] || tierLimits.free;
    session.rate = limits.rate;
    session.per = limits.per;

    // Allow request to proceed
    return null;
  } catch (error) {
    return {
      Body: JSON.stringify({ error: 'invalid_token' }),
      Headers: { 'Content-Type': 'application/json' },
      Code: 401,
    };
  }
}

Implementing Zero Trust Policies at the Gateway

Zero trust requires continuous verification, not just authentication. Your gateway must evaluate multiple signals for every request:

Token freshness and rotation. Set maximum token ages (typically 1 hour for access tokens) and require refresh token rotation. Reject tokens that are valid but stale.

Scope-based authorization. JWT scopes should map to specific API operations. A token with read:users scope should not be able to call POST /users.

Contextual access control. Consider request metadata like IP geolocation, device fingerprint, and time of day. A request from an unusual location should trigger step-up authentication.

// zero-trust-policy-engine.ts
interface PolicyContext {
  claims: any;
  request: {
    method: string;
    path: string;
    ip: string;
    userAgent: string;
  };
  metadata: {
    geoCountry?: string;
    deviceFingerprint?: string;
    riskScore?: number;
  };
}

class ZeroTrustPolicyEngine {
  evaluateAccess(context: PolicyContext): {
    allowed: boolean;
    reason?: string;
    requiresStepUp?: boolean;
  } {
    // Scope validation
    const requiredScope = this.getScopeForEndpoint(context.request.method, context.request.path);
    const userScopes = context.claims.scope?.split(' ') || [];

    if (!userScopes.includes(requiredScope)) {
      return { allowed: false, reason: 'insufficient_scope' };
    }

    // Geo-fencing for sensitive operations
    if (this.isSensitiveOperation(context.request.path)) {
      const allowedCountries = context.claims.allowed_countries || [];
      if (!allowedCountries.includes(context.metadata.geoCountry)) {
        return { allowed: false, reason: 'geo_restricted' };
      }
    }

    // Risk-based step-up authentication
    if (context.metadata.riskScore && context.metadata.riskScore > 0.7) {
      return { 
        allowed: false, 
        requiresStepUp: true,
        reason: 'high_risk_detected' 
      };
    }

    return { allowed: true };
  }

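  private getScopeForEndpoint(method: string, path: string): string {
    // Map HTTP method + path pattern to the required OAuth scope.
    // A trailing "/*" matches any sub-path, e.g. DELETE /api/users/42.
    const scopeMap: Record<string, string> = {
      'GET:/api/users': 'read:users',
      'POST:/api/users': 'write:users',
      'DELETE:/api/users/*': 'admin:users',
    };

    const key = `${method}:${path}`;
    for (const [pattern, scope] of Object.entries(scopeMap)) {
      const matches = pattern.endsWith('/*')
        ? key.startsWith(pattern.slice(0, -1))
        : key === pattern;
      if (matches) return scope;
    }
    // Unmapped endpoints fall back to the generic scope
    return 'api:access';
  }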

  private isSensitiveOperation(path: string): boolean {
    return path.includes('/admin') || path.includes('/payment');
  }
}

Common Pitfalls in Gateway Authentication

Caching JWT validation results without considering token revocation. Even with short-lived tokens, you need a mechanism to invalidate compromised tokens immediately. Implement a Redis-based revocation list that your gateway checks before accepting cached validation results.
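
A minimal sketch of that check, assuming tokens carry a jti claim. An in-memory Map stands in for Redis so the example runs on its own, and the names RevocationList and acceptCachedValidation are illustrative rather than part of any gateway API.

```typescript
// Hypothetical revocation list keyed by the token's jti claim.
// An in-memory Map stands in for Redis in this sketch.
class RevocationList {
  // jti -> token expiry (seconds since epoch); entries past expiry can be dropped
  private revoked = new Map<string, number>();

  revoke(jti: string, tokenExp: number): void {
    this.revoked.set(jti, tokenExp);
  }

  isRevoked(jti: string, nowSeconds: number): boolean {
    const exp = this.revoked.get(jti);
    if (exp === undefined) return false;
    if (exp <= nowSeconds) {
      // The token has expired on its own, so the entry is no longer needed.
      this.revoked.delete(jti);
      return false;
    }
    return true;
  }
}

// Consult the list before trusting a cached validation result.
function acceptCachedValidation(
  claims: { jti?: string; exp: number },
  list: RevocationList,
  nowSeconds: number,
): boolean {
  if (!claims.jti) return false; // no jti means the token cannot be checked: fail closed
  if (claims.exp <= nowSeconds) return false; // expired tokens are never accepted
  return !list.isRevoked(claims.jti, nowSeconds);
}
```

With Redis, the same idea is commonly a key per jti whose TTL equals the token's remaining lifetime, so the list cleans itself up.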

Ignoring clock skew in JWT validation. Distributed systems have clock drift. Always configure a clock skew tolerance (typically 60 seconds) in your JWT validation library to prevent false rejections.
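
To make the effect of the tolerance concrete, here is a small sketch of the time-claim checks a validation library performs. The function name checkTimeClaims and the 60-second default are assumptions for illustration; in practice you would set the equivalent option in your JWT library (clockTolerance in jose).

```typescript
// Sketch of exp/nbf checks with a clock skew allowance.
// All values are seconds since the Unix epoch, as in the JWT claims.
interface TimeClaims {
  exp: number;  // expiration time
  nbf?: number; // not-before time (optional)
}

function checkTimeClaims(
  claims: TimeClaims,
  nowSeconds: number,
  toleranceSeconds = 60,
): 'ok' | 'expired' | 'not_yet_valid' {
  // Accept tokens that expired less than `toleranceSeconds` ago.
  if (nowSeconds >= claims.exp + toleranceSeconds) return 'expired';
  // Accept tokens whose nbf lies at most `toleranceSeconds` in the future.
  if (claims.nbf !== undefined && nowSeconds < claims.nbf - toleranceSeconds) {
    return 'not_yet_valid';
  }
  return 'ok';
}
```

A token that expired 30 seconds ago by the gateway's clock is still accepted with the default tolerance, while the same call with a tolerance of 0 rejects it.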

Rate limiting by IP address alone. NAT and proxy servers mean multiple users share IPs. Rate limit by authenticated client ID or API key, not source IP.

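Logging sensitive data in authentication failures. Never log full JWTs or API keys. A raw prefix is a poor substitute for JWTs, because the first characters encode the header and are nearly identical across tokens. Log the error type together with a non-sensitive identifier, such as the jti claim or a truncated hash of the token.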
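
A truncated hash is one way to produce a loggable identifier that reveals nothing about the token itself while still letting you correlate failures for the same credential. The sketch below uses Node's built-in crypto module; the function names and log field names are illustrative assumptions.

```typescript
import { createHash } from 'node:crypto';

// Returns a short, non-reversible fingerprint that can be logged in place of the token.
function tokenFingerprint(token: string): string {
  return createHash('sha256').update(token).digest('hex').slice(0, 12);
}

// Builds a log entry for an authentication failure without including the token.
function authFailureLogEntry(token: string, errorType: string): Record<string, string> {
  return {
    event: 'auth_failure',
    error: errorType,
    token_fp: tokenFingerprint(token),
  };
}
```

The same token always yields the same fingerprint, so repeated failures from one client can be grouped in log queries.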

Not monitoring authentication latency separately. Authentication adds latency to every request. Track p50, p95, and p99 latency specifically for authentication operations to identify performance degradation before it impacts users.
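
As a rough illustration of tracking these percentiles separately, the sketch below records only the duration of the authentication step and reports nearest-rank percentiles. AuthLatencyTracker is a hypothetical name; a production gateway would more likely export a histogram to its metrics system.

```typescript
// Nearest-rank percentile over a list of latency samples (milliseconds).
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples recorded');
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical tracker fed only with the duration of the authentication step,
// not the proxied upstream call.
class AuthLatencyTracker {
  private samples: number[] = [];

  record(durationMs: number): void {
    this.samples.push(durationMs);
  }

  summary(): { p50: number; p95: number; p99: number } {
    return {
      p50: percentile(this.samples, 50),
      p95: percentile(this.samples, 95),
      p99: percentile(this.samples, 99),
    };
  }
}
```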

Best Practices Checklist

[ ] JWT validation happens at the gateway, not in backend services
[ ] Public keys are cached with automatic rotation based on JWKS endpoint
[ ] Rate limiting uses distributed state (Redis) or gossip protocol
[ ] Token maximum age is enforced (1 hour for access tokens)
[ ] Scope-based authorization is implemented for all endpoints
[ ] Authentication metrics are tracked separately (latency, failure rate)
[ ] Token revocation list is checked for high-value operations
[ ] Clock skew tolerance is configured (typically 60 seconds)
[ ] Failed authentication attempts are logged with sanitized data
[ ] Step-up authentication is triggered for high-risk requests

Frequently Asked Questions

How do I handle JWT key rotation without downtime?

Use the kid (key ID) header in JWTs to identify which public key to use for validation. Your gateway should cache multiple public keys simultaneously and automatically fetch new keys when it encounters an unknown kid. The jose library handles this automatically with createRemoteJWKSet.

Should I validate JWTs at the gateway and in backend services?

Yes, implement defense in depth. The gateway performs primary validation and rate limiting, but backend services should verify the JWT signature and claims they care about. This protects against gateway bypass attacks and misconfigurations.

How do I implement rate limiting across multiple gateway instances?

Use Redis with atomic increment operations or a gossip protocol like memberlist. Redis provides strong consistency but requires network calls. Gossip protocols offer eventual consistency with lower latency but may allow brief rate limit violations during synchronization.

What's the right JWT expiration time for API access tokens?

For machine-to-machine APIs, 1 hour is standard. For user-facing APIs, 15-30 minutes with refresh token rotation provides better security. Never use expiration times longer than 24 hours for access tokens.

How do I handle authentication for WebSocket connections?

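Authenticate during the WebSocket handshake, passing the JWT in the Sec-WebSocket-Protocol header or, less preferably, as a query parameter (URLs tend to end up in access logs, so use a short-lived token there). Once authenticated, maintain the connection without re-authenticating each message, but implement heartbeat checks to detect stale connections and close connections whose token has expired.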
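
A sketch of the header-based variant follows. It assumes a convention, not a standard, in which the client offers two subprotocols: the marker bearer followed by the token itself, for example new WebSocket(url, ['bearer', token]). The function name is illustrative.

```typescript
// Extracts a bearer token from the Sec-WebSocket-Protocol request header.
// Assumed convention: the offered subprotocols are "bearer" followed by the JWT.
function tokenFromSubprotocols(headerValue: string | undefined): string | null {
  if (!headerValue) return null;
  const offered = headerValue.split(',').map((part) => part.trim());
  const marker = offered.indexOf('bearer');
  if (marker === -1 || marker + 1 >= offered.length) return null;
  const token = offered[marker + 1];
  // A compact JWT has exactly three dot-separated base64url segments.
  return /^[\w-]+\.[\w-]+\.[\w-]+$/.test(token) ? token : null;
}
```

The extracted token still has to go through the same signature, claim, and rate limit checks as any HTTP request before the upgrade is accepted.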

Should I use API keys or JWTs for service-to-service authentication?

JWTs are preferable because they're self-contained and include expiration. API keys require database lookups and don't expire automatically. If you must use API keys, implement automatic rotation and store them in a secrets manager like HashiCorp Vault.

How do I test gateway authentication policies in CI/CD?

Use contract testing with tools like Pact or write integration tests that spin up your gateway in a container. Test both positive cases (valid tokens) and negative cases (expired tokens, insufficient scopes, rate limit violations). Mock your JWKS endpoint to control key rotation scenarios.

Modern API gateway authentication is the foundation of zero trust architecture. By validating JWTs at the edge, implementing distributed rate limiting, and enforcing contextual policies, you create a security boundary that scales with your infrastructure while maintaining the performance users expect.
