Why Traditional Real-Time Approaches Fall Short

WebSockets became the default choice for real-time communication over the past decade, but the pattern took hold before HTTP/2 multiplexing, modern edge networks, and serverless architectures reshaped how we build distributed systems. A WebSocket upgrades an HTTP request into a persistent TCP connection that speaks its own framing protocol, which conflicts with the stateless, horizontally scalable architectures that dominate cloud-native infrastructure in 2025.

Polling mechanisms, whether short or long, create predictable but wasteful traffic patterns. A dashboard polling every 5 seconds generates 17,280 requests daily per client, even when no data changes. This approach fails under modern cost models where egress bandwidth, function invocations, and database queries directly impact operational expenses. More critically, fixed-interval polling puts a floor under latency: with a 5-second interval, an update can sit on the server for up to 5 seconds before a client sees it, which rules out truly responsive interfaces.
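The request count and the latency bound for fixed-interval polling follow from simple arithmetic. The sketch below is illustrative only: pollingCost and its field names are hypothetical, and it ignores network round-trip time and server processing.

```typescript
// Hypothetical helper: request volume and staleness for fixed-interval polling.
interface PollingCost {
  requestsPerDay: number;
  worstCaseDelayMs: number;
  averageDelayMs: number;
}

const SECONDS_PER_DAY = 24 * 60 * 60;

function pollingCost(intervalSeconds: number): PollingCost {
  if (intervalSeconds <= 0) {
    throw new RangeError('intervalSeconds must be positive');
  }
  return {
    // One request per interval, whether or not anything changed.
    requestsPerDay: SECONDS_PER_DAY / intervalSeconds,
    // A change that lands right after a poll waits a full interval.
    worstCaseDelayMs: intervalSeconds * 1000,
    // Changes arriving uniformly at random wait half an interval on average.
    averageDelayMs: (intervalSeconds * 1000) / 2,
  };
}

console.log(pollingCost(5));
// { requestsPerDay: 17280, worstCaseDelayMs: 5000, averageDelayMs: 2500 }
```

Halving the interval halves the delay but doubles the request count, which is the trade-off a push-based stream avoids.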

The shift toward edge computing and globally distributed architectures exposes another weakness: WebSocket connections don't traverse CDN layers effectively. Most CDN providers in 2025 support HTTP streaming natively but require special configuration or dedicated infrastructure for WebSocket proxying. This architectural friction increases deployment complexity and reduces the effectiveness of edge caching strategies.

Server-Sent Events (SSE) solve these problems by using a standard HTTP response with a specific content type (text/event-stream) that stays open for server-initiated messages. This approach works with existing HTTP infrastructure, supports automatic reconnection with Last-Event-ID tracking, and integrates with observability tools that understand HTTP semantics.
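To make the text/event-stream format concrete, here is a minimal sketch of how one event can be serialized. The formatSSE helper and the SSEMessage shape are hypothetical names for illustration; the field names (id, event, retry, data) and the blank-line terminator come from the event stream format itself.

```typescript
// Hypothetical helper: serialize one event into text/event-stream framing.
interface SSEMessage {
  data: string;
  event?: string;   // event name; clients see "message" when omitted
  id?: string;      // becomes the client's last event ID (must not contain newlines)
  retryMs?: number; // reconnection delay hint for the client
}

function formatSSE(msg: SSEMessage): string {
  const lines: string[] = [];
  if (msg.id !== undefined) lines.push(`id: ${msg.id}`);
  if (msg.event !== undefined) lines.push(`event: ${msg.event}`);
  if (msg.retryMs !== undefined) lines.push(`retry: ${msg.retryMs}`);
  // Each line of a multi-line payload needs its own "data:" field.
  for (const line of msg.data.split(/\r\n|\r|\n/)) {
    lines.push(`data: ${line}`);
  }
  // A blank line terminates the event and triggers dispatch on the client.
  return lines.join('\n') + '\n\n';
}

const sampleFrame = formatSSE({
  id: '42',
  event: 'notification',
  data: 'first line\nsecond line',
});
console.log(JSON.stringify(sampleFrame));
```

Serializing the payload with JSON.stringify first, as the implementation in the next section does, keeps the data on a single line because JSON escapes embedded newlines.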

Modern Server-Sent Events Implementation

A production-grade SSE implementation requires careful attention to connection lifecycle management, error handling, and scalability patterns. Here's a realistic TypeScript implementation using Fastify on Node.js with connection tracking and graceful shutdown:

import { FastifyInstance, FastifyRequest, FastifyReply } from 'fastify';
import { EventEmitter } from 'events';

interface SSEClient {
  id: string;
  reply: FastifyReply;
  lastEventId: string;
  channels: Set<string>;
  heartbeatInterval: NodeJS.Timeout;
}

class SSEConnectionManager {
  private clients: Map<string, SSEClient> = new Map();
  private eventBus: EventEmitter = new EventEmitter();
  private readonly HEARTBEAT_INTERVAL = 30000;
  private readonly MAX_CLIENTS_PER_INSTANCE = 10000;

  async registerClient(
    request: FastifyRequest,
    reply: FastifyReply,
    channels: string[]
  ): Promise<void> {
    if (this.clients.size >= this.MAX_CLIENTS_PER_INSTANCE) {
      reply.code(503).send({ error: 'Server at capacity' });
      return;
    }

    const clientId = this.generateClientId();
    const lastEventId = request.headers['last-event-id'] as string || '0';

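    // Take over the raw response so Fastify does not also try to send a reply
    // once the async route handler returns
    reply.hijack();

    reply.raw.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache, no-transform',
      'Connection': 'keep-alive',
      'X-Accel-Buffering': 'no', // Disable nginx buffering
    });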

    const client: SSEClient = {
      id: clientId,
      reply,
      lastEventId,
      channels: new Set(channels),
      heartbeatInterval: setInterval(() => {
        this.sendHeartbeat(clientId);
      }, this.HEARTBEAT_INTERVAL),
    };

    this.clients.set(clientId, client);

    // Set up cleanup before any await: if the client disconnects while missed
    // events are being replayed, the 'close' event must still find a listener
    request.raw.on('close', () => {
      this.removeClient(clientId);
    });

    // Send missed events if reconnecting
    if (lastEventId !== '0') {
      await this.replayMissedEvents(client, lastEventId);
    }

    // Send initial connection confirmation
    this.sendEvent(clientId, 'connected', { clientId, timestamp: Date.now() });
  }

  private sendEvent(clientId: string, eventType: string, data: any, eventId?: string): boolean {
    const client = this.clients.get(clientId);
    if (!client) return false;

    try {
      const id = eventId || Date.now().toString();
      client.reply.raw.write(`id: ${id}\n`);
      client.reply.raw.write(`event: ${eventType}\n`);
      client.reply.raw.write(`data: ${JSON.stringify(data)}\n\n`);
      client.lastEventId = id;
      return true;
    } catch (error) {
      this.removeClient(clientId);
      return false;
    }
  }

  private sendHeartbeat(clientId: string): void {
    const client = this.clients.get(clientId);
    if (!client) return;

    try {
      client.reply.raw.write(': heartbeat\n\n');
    } catch (error) {
      this.removeClient(clientId);
    }
  }

  broadcast(channel: string, eventType: string, data: any, eventId?: string): void {
    for (const [clientId, client] of this.clients.entries()) {
      if (client.channels.has(channel)) {
        this.sendEvent(clientId, eventType, data, eventId);
      }
    }
  }

  private async replayMissedEvents(client: SSEClient, lastEventId: string): Promise<void> {
    // Implementation depends on your event store
    // This example assumes Redis Streams or similar
    const missedEvents = await this.fetchEventsSince(lastEventId, Array.from(client.channels));

    for (const event of missedEvents) {
      this.sendEvent(client.id, event.type, event.data, event.id);
    }
  }

  private async fetchEventsSince(eventId: string, channels: string[]): Promise<any[]> {
    // Integrate with your event persistence layer
    // Redis Streams, Kafka, or custom event store
    return [];
  }

  private removeClient(clientId: string): void {
    const client = this.clients.get(clientId);
    if (client) {
      clearInterval(client.heartbeatInterval);
      this.clients.delete(clientId);
    }
  }

  private generateClientId(): string {
    // slice(2, 11) keeps the same 9 characters as the deprecated substr(2, 9)
    return `${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;
  }

  getConnectionCount(): number {
    return this.clients.size;
  }

  async shutdown(): Promise<void> {
    for (const [clientId, client] of this.clients.entries()) {
      this.sendEvent(clientId, 'shutdown', { message: 'Server shutting down' });
      clearInterval(client.heartbeatInterval);
      client.reply.raw.end();
    }
    this.clients.clear();
  }
}

// Fastify route registration
export function registerSSERoutes(fastify: FastifyInstance, manager: SSEConnectionManager) {
  fastify.get('/events/stream', async (request, reply) => {
    const channels = (request.query as any).channels?.split(',') || ['default'];
    await manager.registerClient(request, reply, channels);
  });

  fastify.post('/events/publish', async (request, reply) => {
    const { channel, eventType, data } = request.body as any;
    // Date.now() can repeat within the same millisecond; in production,
    // prefer an ID assigned by the event store
    const eventId = Date.now().toString();

    // Persist first so every event a client sees can also be replayed
    await persistEvent(channel, eventType, data, eventId);

    manager.broadcast(channel, eventType, data, eventId);

    reply.send({ success: true, eventId });
  });
}

async function persistEvent(channel: string, eventType: string, data: any, eventId: string): Promise<void> {
  // Implement event persistence for reconnection replay
  // Use Redis Streams, Kafka, or database with TTL
}

The client-side implementation requires proper reconnection logic and event handling:

class SSEClient {
  private eventSource: EventSource | null = null;
  private reconnectAttempts = 0;
  private readonly MAX_RECONNECT_ATTEMPTS = 10;
  private readonly BASE_RECONNECT_DELAY = 1000;
  private handlers: Map<string, Set<(data: any) => void>> = new Map();

  constructor(
    private url: string,
    private channels: string[]
  ) {}

  connect(): void {
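    // Encode the list so channel names with reserved characters survive the query string
    const channelParam = encodeURIComponent(this.channels.join(','));
    const urlWithChannels = `${this.url}?channels=${channelParam}`;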

    this.eventSource = new EventSource(urlWithChannels);

    this.eventSource.onopen = () => {
      console.log('SSE connection established');
      this.reconnectAttempts = 0;
    };

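    this.eventSource.onerror = (error) => {
      console.error('SSE connection error:', error);
      // While readyState is CONNECTING the browser is already retrying on its
      // own and will resend Last-Event-ID. Fall back to manual reconnection
      // only when the browser has given up (readyState CLOSED), because a
      // fresh EventSource starts without a Last-Event-ID.
      if (this.eventSource?.readyState === EventSource.CLOSED) {
        this.handleReconnect();
      }
    };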

    this.eventSource.addEventListener('connected', (event) => {
      const data = JSON.parse(event.data);
      console.log('Connected with client ID:', data.clientId);
    });

    // Register custom event handlers
    for (const [eventType, callbacks] of this.handlers.entries()) {
      this.eventSource.addEventListener(eventType, (event: MessageEvent) => {
        const data = JSON.parse(event.data);
        callbacks.forEach(callback => callback(data));
      });
    }
  }

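  on(eventType: string, callback: (data: any) => void): void {
    const isNewEventType = !this.handlers.has(eventType);
    if (isNewEventType) {
      this.handlers.set(eventType, new Set());
    }
    const callbacks = this.handlers.get(eventType)!;
    callbacks.add(callback);

    // If already connected, attach one listener per event type. Listeners
    // iterate the shared callback set, so attaching another listener for an
    // event type that already has one would invoke every callback twice.
    if (this.eventSource && isNewEventType) {
      this.eventSource.addEventListener(eventType, (event: MessageEvent) => {
        const data = JSON.parse(event.data);
        callbacks.forEach(cb => cb(data));
      });
    }
  }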

  private handleReconnect(): void {
    if (this.reconnectAttempts >= this.MAX_RECONNECT_ATTEMPTS) {
      console.error('Max reconnection attempts reached');
      return;
    }

    const delay = Math.min(
      this.BASE_RECONNECT_DELAY * Math.pow(2, this.reconnectAttempts),
      30000
    );

    this.reconnectAttempts++;
    console.log(`Reconnecting in ${delay}ms (attempt ${this.reconnectAttempts})`);

    setTimeout(() => {
      this.connect();
    }, delay);
  }

  disconnect(): void {
    this.eventSource?.close();
    this.eventSource = null;
  }
}

// Usage example
const client = new SSEClient('/events/stream', ['notifications', 'updates']);

client.on('notification', (data) => {
  console.log('Received notification:', data);
  updateUI(data);
});

client.on('update', (data) => {
  console.log('Received update:', data);
  refreshData(data);
});

client.connect();

Scaling Server-Sent Events Architecture

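Horizontal scaling requires addressing connection distribution and event broadcasting across multiple server instances. Each instance only knows about its own connections, so an event published on one instance has to be relayed to the others. Routing every event through a heavyweight shared broker works but adds latency and operational complexity; a lightweight pub/sub layer such as Redis Pub/Sub or NATS is the more common choice for inter-instance communication: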

import Redis from 'ioredis';

class DistributedSSEManager extends SSEConnectionManager {
  private redis: Redis;
  private subscriber: Redis;
  private instanceId: string;

  constructor() {
    super();
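    // The localhost fallback is an assumption for local development only
    const redisUrl = process.env.REDIS_URL ?? 'redis://localhost:6379';
    // A connection in subscriber mode cannot publish, hence two connections
    this.redis = new Redis(redisUrl);
    this.subscriber = new Redis(redisUrl);
    this.instanceId = `instance-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`;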

    this.setupSubscriptions();
  }

  private setupSubscriptions(): void {
    this.subscriber.psubscribe('sse:channel:*', (err, count) => {
      if (err) {
        console.error('Failed to subscribe:', err);
        return;
      }
      console.log(`Subscribed to ${count} channels`);
    });

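    this.subscriber.on('pmessage', (pattern, channel, message) => {
      const channelName = channel.replace('sse:channel:', '');
      const event = JSON.parse(message);

      // Skip events this instance published itself: broadcast() has already
      // delivered them to local clients, so relaying them would duplicate them
      if (event.instanceId === this.instanceId) return;

      // Only broadcast to local clients
      super.broadcast(channelName, event.type, event.data, event.id);
    });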
  }

  broadcast(channel: string, eventType: string, data: any, eventId?: string): void {
    const event = {
      type: eventType,
      data,
      id: eventId || Date.now().toString(),
      instanceId: this.instanceId,
    };

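    // Publish to Redis for other instances; publish() returns a promise,
    // so handle rejection instead of leaving it unobserved
    this.redis.publish(
      `sse:channel:${channel}`,
      JSON.stringify(event)
    ).catch((err) => {
      console.error('Failed to publish event:', err);
    });

    // Also broadcast to local clients immediately, reusing event.id so local
    // and remote clients see the same ID for the same event
    super.broadcast(channel, eventType, data, event.id);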
  }
}

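For truly massive scale (millions of concurrent connections), consider moving parts of this stack to managed infrastructure. Managed event buses such as AWS EventBridge or Google Cloud Pub/Sub can take over message routing between services and regions, but they do not hold browser connections themselves; a dedicated SSE service or your own fan-out tier still has to terminate the streams. Dedicated SSE services go further and handle connection state, message routing, and geographic distribution without requiring custom scaling logic.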

Common Pitfalls and Edge Cases

Proxy and Load Balancer Buffering: Many reverse proxies buffer responses by default, which breaks SSE streaming. With Nginx, have the upstream send the X-Accel-Buffering: no response header (or turn off proxy_buffering for the route). Apache setups commonly use SetEnv proxy-nokeepalive 1, and Cloudflare's streaming behavior depends on plan and configuration, so verify both against current documentation. Always test your full infrastructure stack with actual SSE traffic before production deployment.

Connection Limits: Browsers limit concurrent HTTP/1.1 connections per domain (typically 6), and the limit is shared across all tabs open to that domain. This affects applications with multiple SSE streams. Use HTTP/2, where streams are multiplexed over a single connection, or consolidate streams into fewer connections with channel-based filtering server-side.

Mobile Network Behavior: Mobile carriers and WiFi networks often terminate idle connections aggressively. Implement heartbeat messages every 15-30 seconds to keep connections alive. The heartbeat should be a comment line (: heartbeat\n\n) to avoid triggering client-side event handlers.
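To see why a comment-line heartbeat never reaches event handlers, it helps to look at how a client reads the stream. The parser below is a deliberately simplified sketch (parseEventStream is a hypothetical name, and it handles a complete chunk of text rather than an incremental stream); browsers implement the full algorithm inside EventSource.

```typescript
interface ParsedEvent {
  event: string;
  data: string;
  id?: string;
}

// Hypothetical, simplified parser for a complete text/event-stream chunk.
function parseEventStream(text: string): ParsedEvent[] {
  const events: ParsedEvent[] = [];
  let eventName = '';
  let dataLines: string[] = [];
  let lastId: string | undefined;

  for (const line of text.split(/\r\n|\r|\n/)) {
    if (line === '') {
      // Blank line: dispatch only if some data was collected.
      if (dataLines.length > 0) {
        events.push({ event: eventName || 'message', data: dataLines.join('\n'), id: lastId });
      }
      eventName = '';
      dataLines = [];
      continue;
    }
    if (line.startsWith(':')) continue; // comment line, e.g. a heartbeat

    const colon = line.indexOf(':');
    const field = colon === -1 ? line : line.slice(0, colon);
    let value = colon === -1 ? '' : line.slice(colon + 1);
    if (value.startsWith(' ')) value = value.slice(1); // strip one leading space

    if (field === 'event') eventName = value;
    else if (field === 'data') dataLines.push(value);
    else if (field === 'id') lastId = value;
  }
  return events;
}

const parsedSample = parseEventStream(
  ': heartbeat\n\nid: 7\nevent: update\ndata: {"ok":true}\n\n: heartbeat\n\n'
);
console.log(parsedSample.length); // 1: both heartbeats were ignored
```

The heartbeat still produces bytes on the wire, which is what keeps intermediaries from treating the connection as idle.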

Event Replay Complexity: Implementing reliable event replay requires persistent event storage with TTL policies. Redis Streams provides a good balance of performance and durability, but requires careful memory management. Set appropriate MAXLEN limits and implement event compaction for high-volume streams.
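The replay contract can be sketched without any external store. The class below is an in-memory stand-in, not a Redis client: ReplayBuffer, StoredEvent, and the numeric IDs are assumptions made for illustration, with the maxLen cap playing the role that MAXLEN plays for a Redis Stream.

```typescript
interface StoredEvent {
  id: number; // monotonically increasing sequence number
  channel: string;
  type: string;
  data: unknown;
}

// Hypothetical in-memory stand-in for a capped stream.
class ReplayBuffer {
  private events: StoredEvent[] = [];
  private nextId = 1;

  constructor(private readonly maxLen: number) {}

  append(channel: string, type: string, data: unknown): StoredEvent {
    const stored: StoredEvent = { id: this.nextId++, channel, type, data };
    this.events.push(stored);
    // Trim the oldest entries once the cap is exceeded.
    if (this.events.length > this.maxLen) {
      this.events.splice(0, this.events.length - this.maxLen);
    }
    return stored;
  }

  // Events newer than lastEventId on the given channels, oldest first.
  since(lastEventId: number, channels: string[]): StoredEvent[] {
    const wanted = new Set(channels);
    return this.events.filter(e => e.id > lastEventId && wanted.has(e.channel));
  }

  // True when events after lastEventId were already trimmed, so replay has a gap.
  hasGap(lastEventId: number): boolean {
    const oldest = this.events[0];
    return oldest !== undefined && oldest.id > lastEventId + 1;
  }
}

const replayBuffer = new ReplayBuffer(3);
for (let i = 1; i <= 5; i++) replayBuffer.append('updates', 'tick', { i });
console.log(replayBuffer.since(3, ['updates']).map(e => e.id)); // [ 4, 5 ]
console.log(replayBuffer.hasGap(1)); // true: event 2 was trimmed
```

When hasGap returns true the client has missed events that can no longer be replayed, so a full state refresh is usually the safer response than a partial replay.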

Memory Leaks in Long-Running Connections: Each SSE connection holds references to response objects and buffers. Implement connection limits per instance, monitor memory usage, and implement graceful connection rotation for clients that stay connected for extended periods (days or weeks).

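CORS and Authentication: The browser EventSource API cannot set custom request headers, so an Authorization header is not an option. Pass authentication tokens via cookies or query parameters (query strings tend to end up in access logs, so keep such tokens short-lived), and send proper CORS headers; cross-origin cookie authentication additionally requires creating the EventSource with withCredentials: true. Remember that credentials are sent with every reconnection attempt.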

Best Practices for Production Deployments

Implement Connection Limits: Set per-instance connection limits based on available memory and CPU. A typical 2GB instance can handle 5,000-10,000 concurrent SSE connections depending on message frequency and payload size.

Use Event IDs Consistently: Every event should have a unique, monotonically increasing ID. This enables reliable reconnection and event replay. Use timestamp-based IDs with sequence numbers for distributed systems.
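One way to get IDs that are both timestamp-based and strictly increasing is to pair the millisecond timestamp with a sequence counter. The sketch below is an assumption-laden illustration: EventIdGenerator and compareEventIds are hypothetical names, and the guarantee only holds within a single process, so a multi-instance deployment still needs one authority (for example the event store) to assign IDs.

```typescript
// Hypothetical generator: "<ms timestamp>-<sequence>" IDs that never repeat
// or go backwards within one process.
class EventIdGenerator {
  private lastMs = 0;
  private seq = 0;

  // The clock is injectable so the behavior can be tested deterministically.
  constructor(private readonly now: () => number = Date.now) {}

  next(): string {
    const ms = this.now();
    if (ms > this.lastMs) {
      this.lastMs = ms;
      this.seq = 0;
    } else {
      // Same millisecond, or the clock stepped backwards:
      // stay on lastMs and bump the sequence instead.
      this.seq++;
    }
    return `${this.lastMs}-${this.seq}`;
  }
}

// Orders two IDs numerically: negative if a < b, zero if equal, positive if a > b.
function compareEventIds(a: string, b: string): number {
  const [aMs, aSeq] = a.split('-').map(Number);
  const [bMs, bSeq] = b.split('-').map(Number);
  return aMs !== bMs ? aMs - bMs : aSeq - bSeq;
}

const fakeTimes = [1000, 1000, 999, 1001];
const idGenerator = new EventIdGenerator(() => fakeTimes.shift() ?? 1001);
console.log([idGenerator.next(), idGenerator.next(), idGenerator.next(), idGenerator.next()]);
// [ '1000-0', '1000-1', '1000-2', '1001-0' ]
```

Comparing IDs numerically matters: a plain string comparison would sort '1000-10' before '1000-9'.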

Monitor Connection Health: Track connection duration, reconnection rates, message delivery latency, and failed delivery attempts. High reconnection rates indicate network issues or server instability.

Implement Graceful Shutdown: Send a shutdown event before terminating connections during deployments. This allows clients to reconnect immediately to healthy instances rather than waiting for timeout.

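Compress Event Payloads: Use gzip compression for the event stream when payload sizes exceed 1KB. Modern browsers and HTTP clients support transparent decompression. Make sure the compression layer flushes after each event; a compressor that waits to fill its buffer delays delivery just like a buffering proxy does.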

Design for Idempotency: Clients may receive duplicate events during reconnection windows. Include event IDs in your application logic to deduplicate messages client-side.
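Client-side deduplication can be as small as a bounded set of recently seen event IDs. This is a minimal sketch under assumptions: EventDeduplicator is a hypothetical name, and the default capacity of 1000 is arbitrary and should cover at least as many events as the server may replay.

```typescript
// Hypothetical bounded de-duplicator keyed by event ID.
class EventDeduplicator {
  private seen = new Set<string>();

  constructor(private readonly capacity: number = 1000) {}

  // Returns true the first time an ID is seen, false for a duplicate.
  accept(eventId: string): boolean {
    if (this.seen.has(eventId)) return false;
    this.seen.add(eventId);
    if (this.seen.size > this.capacity) {
      // Sets iterate in insertion order, so the first value is the oldest ID.
      const oldest = this.seen.values().next().value;
      if (oldest !== undefined) this.seen.delete(oldest);
    }
    return true;
  }
}

const dedup = new EventDeduplicator(2);
console.log(dedup.accept('1000-0')); // true: first delivery
console.log(dedup.accept('1000-0')); // false: duplicate from a replay
```

In a browser, the ID to pass in is available as event.lastEventId inside an EventSource listener, so a handler can wrap its work in a check on dedup.accept(event.lastEventId).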

Set Appropriate Timeouts: Configure server-side connection timeouts (typically 5-10 minutes) to clean up abandoned connections. Implement client-side reconnection logic that respects exponential backoff.

Frequently Asked Questions

What is the difference between Server-Sent Events and WebSockets in 2025?

Server-sent events provide unidirectional server-to-client communication over standard HTTP, while WebSockets offer bidirectional communication over a custom protocol. SSE works seamlessly with HTTP/2, CDNs, and existing infrastructure. Choose SSE when you only need server-to-client push; use WebSockets when you need true bidirectional real-time communication like collaborative editing or gaming.

How does Server-Sent Events architecture handle connection failures?

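SSE includes built-in automatic reconnection. The browser's EventSource API reconnects on its own after a short delay, which the server can tune with the retry field, and sends the Last-Event-ID header so servers can replay missed events. Exponential backoff is not guaranteed (the specification leaves it optional for browsers), so add it in application code if you need it. This makes SSE more resilient than custom WebSocket implementations that require manual reconnection logic.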

What is the best way to scale Server-Sent Events to millions of users?

Use a distributed architecture with Redis Pub/Sub or NATS for inter-instance messaging, implement connection limits per instance, and leverage HTTP/2 multiplexing. For massive scale, consider managed messaging such as AWS EventBridge for event routing, or dedicated SSE infrastructure for connection handling. Implement geographic distribution with edge servers to reduce latency.

When should you avoid using Server-Sent Events?

Avoid SSE when you need client-to-server real-time communication (use WebSockets), when you need binary data streaming (use WebSockets or gRPC), or when you must support Internet Explorer (use polling fallback). SSE also isn't ideal for extremely high-frequency updates (>100 messages/second per client) where UDP-based protocols perform better.

How do you implement authentication for Server-Sent Events connections?

Use HTTP-only cookies for session-based authentication or pass JWT tokens as query parameters. Validate authentication before establishing the SSE connection and implement token refresh logic for long-lived connections. Consider using short-lived connection tokens separate from your main authentication tokens.
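A short-lived connection token can be a signed pair of user ID and expiry time. The sketch below uses Node's built-in crypto module; the token layout, the function names, and the rule that user IDs contain no dots are all assumptions made for illustration, not an established format.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Hypothetical token format: "<userId>.<expiresAtMs>.<hex HMAC-SHA256 signature>".
// Assumes user IDs never contain a dot.
function signPayload(payload: string, secret: string): string {
  return createHmac('sha256', secret).update(payload).digest('hex');
}

function issueStreamToken(
  userId: string,
  secret: string,
  ttlMs: number,
  nowMs: number = Date.now()
): string {
  const payload = `${userId}.${nowMs + ttlMs}`;
  return `${payload}.${signPayload(payload, secret)}`;
}

// Returns the user ID for a valid, unexpired token, otherwise null.
function verifyStreamToken(
  token: string,
  secret: string,
  nowMs: number = Date.now()
): string | null {
  const parts = token.split('.');
  if (parts.length !== 3) return null;
  const [userId, expiresAt, signature] = parts;

  const expected = Buffer.from(signPayload(`${userId}.${expiresAt}`, secret), 'hex');
  const actual = Buffer.from(signature, 'hex');
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;

  const expiresAtMs = Number(expiresAt);
  if (!Number.isFinite(expiresAtMs) || nowMs >= expiresAtMs) return null;
  return userId;
}

const streamToken = issueStreamToken('user-1', 'dev-only-secret', 60000);
console.log(verifyStreamToken(streamToken, 'dev-only-secret')); // user-1
```

An authenticated endpoint would issue the token and the client would append it to the stream URL. Because the browser reuses the same URL when it reconnects automatically, the TTL has to outlast the reconnection window, or the client has to fetch a fresh token and open a new EventSource.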

What are the bandwidth implications of Server-Sent Events compared to polling?

SSE dramatically reduces bandwidth compared to polling by eliminating per-request overhead. Assuming roughly 1 KB of combined request and response headers per poll, a client polling every 5 seconds (17,280 requests a day) transfers about 17 MB of headers daily. SSE maintains one connection with minimal heartbeat overhead: a 13-byte comment every 30 seconds adds up to about 37 KB daily. For 100,000 clients, that is roughly 1.7 TB of header traffic avoided per day, or about 50 TB per month.
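The comparison is easy to rerun with your own numbers. In the sketch below every input is an assumption: 1,000 bytes of combined request and response headers per poll, a 30-second heartbeat, and 100,000 clients. It counts application-level bytes only, ignoring TCP and TLS overhead, and dailyOverheadBytes is a hypothetical helper.

```typescript
// Hypothetical back-of-the-envelope comparison; every input is an assumption.
interface OverheadInputs {
  pollIntervalSeconds: number;
  headerBytesPerPoll: number; // request + response headers combined
  heartbeatIntervalSeconds: number;
  heartbeatBytes: number; // size of the heartbeat comment
}

const DAY_SECONDS = 86400;

function dailyOverheadBytes(inputs: OverheadInputs): { polling: number; sse: number } {
  const polls = DAY_SECONDS / inputs.pollIntervalSeconds;
  const heartbeats = DAY_SECONDS / inputs.heartbeatIntervalSeconds;
  return {
    polling: polls * inputs.headerBytesPerPoll,
    sse: heartbeats * inputs.heartbeatBytes,
  };
}

const perClient = dailyOverheadBytes({
  pollIntervalSeconds: 5,
  headerBytesPerPoll: 1000,
  heartbeatIntervalSeconds: 30,
  heartbeatBytes: ': heartbeat\n\n'.length, // 13 bytes
});

const clientCount = 100000;
const savedPerDayBytes = (perClient.polling - perClient.sse) * clientCount;

console.log(perClient); // { polling: 17280000, sse: 37440 }
console.log((savedPerDayBytes / 1e12).toFixed(2), 'TB saved per day'); // 1.72 TB saved per day
```

Real header sizes vary widely with cookies and HTTP/2 header compression, so treat the result as an order-of-magnitude estimate.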

How do you handle Server-Sent Events in serverless environments?

Traditional function-as-a-service platforms (AWS Lambda, Cloud Functions) are a poor fit for the long-lived connections SSE requires: invocations have execution time limits and are billed for as long as the connection stays open. Use container-based serverless (Cloud Run, Fargate) or dedicated SSE services. Alternatively, implement a hybrid architecture where serverless functions publish events to a managed message broker consumed by long-running SSE servers.
