API Composition: Backend for Micro-Frontends
Modern micro-frontend architectures promise autonomous teams, independent deployments, and technology flexibility. Yet many implementations break down at the data layer. When each micro-frontend independently calls multiple backend services, you face cascading latency, inconsistent error handling, authentication token sprawl, and frontend code bloated with orchestration logic. API composition for micro-frontends isn't just about aggregating data. It's about maintaining performance boundaries while preserving team autonomy in distributed systems where a single page view might require data from eight different services owned by five different teams.
The consequences of getting this wrong are measurable and expensive. A major e-commerce platform discovered their product detail page made 23 separate API calls from the browser, resulting in a 4.2-second time-to-interactive on 4G networks and a 12% conversion rate drop. Their micro-frontends were architecturally independent but operationally coupled through shared data dependencies, creating a distributed monolith worse than the original system they decomposed.
Why Traditional API Patterns Fail for Micro-Frontends
The conventional approach of letting each micro-frontend directly consume backend microservices creates fundamental problems that worsen as systems scale. Direct service-to-frontend communication means every frontend team must understand authentication protocols, rate limiting, retry logic, circuit breaking, and data transformation for every service they consume. This knowledge distribution violates the bounded context principle that makes microservices valuable.
Network waterfalls become unavoidable when frontends orchestrate calls sequentially. If your checkout micro-frontend needs user data, cart contents, inventory status, and shipping options, making four sequential round trips from the browser adds 400-800ms of latency (at roughly 100-200ms per round trip) before rendering even begins. Parallel requests help but push race-condition handling and complex state management into the browser.
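To make the waterfall cost concrete, here is a minimal sketch that compares four sequential calls with the same four calls issued in parallel. The `fakeCall` helper and the fixed per-call latency are assumptions standing in for real network requests, not part of any real service.

```typescript
// Hypothetical stand-in for a network call: resolves with `value` after `ms` milliseconds.
function fakeCall<T>(value: T, ms: number): Promise<T> {
  return new Promise<T>((resolve) => setTimeout(() => resolve(value), ms));
}

// Four independent calls awaited one after another: total time is roughly the sum.
async function loadSequential(latencyMs: number): Promise<string[]> {
  const user = await fakeCall('user', latencyMs);
  const cart = await fakeCall('cart', latencyMs);
  const inventory = await fakeCall('inventory', latencyMs);
  const shipping = await fakeCall('shipping', latencyMs);
  return [user, cart, inventory, shipping];
}

// The same four calls started together: total time is roughly the slowest call.
function loadParallel(latencyMs: number): Promise<string[]> {
  return Promise.all([
    fakeCall('user', latencyMs),
    fakeCall('cart', latencyMs),
    fakeCall('inventory', latencyMs),
    fakeCall('shipping', latencyMs),
  ]);
}

// Small helper that measures wall-clock time for an async function.
async function timed<T>(fn: () => Promise<T>): Promise<{ result: T; elapsedMs: number }> {
  const start = Date.now();
  const result = await fn();
  return { result, elapsedMs: Date.now() - start };
}
```

Running `timed(() => loadSequential(150))` next to `timed(() => loadParallel(150))` shows the gap directly. The parallel version is the one a composition layer would run server-side, where round trips to backend services are typically much shorter than from a browser.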
The authentication problem compounds quickly. Each backend service requires valid tokens, but managing token refresh, scope validation, and security context propagation from browser JavaScript exposes attack surfaces and creates inconsistent security implementations across teams. When a security vulnerability emerges, you're patching authentication logic in twelve different frontend repositories.
Traditional API gateways solve some problems but create others. A centralized gateway becomes a deployment bottleneck, forcing coordination between teams that micro-frontends were supposed to decouple. Configuration sprawl makes the gateway itself a distributed monolith. Generic gateways lack the domain context needed for intelligent composition—they route and transform but don't understand that a product page needs inventory checked only for in-stock items.
The API Composition Pattern for Micro-Frontends
The API composition pattern introduces a dedicated backend layer—often called Backend for Frontend (BFF)—that sits between micro-frontends and backend microservices. Each micro-frontend gets a corresponding composition service that aggregates, transforms, and optimizes data specifically for that frontend's needs. This isn't a shared API gateway; it's a purpose-built composition layer owned by the same team that owns the frontend.
The architectural principle is clear: move orchestration complexity from the browser to the backend where you control latency, security, and failure handling. The composition layer makes parallel service calls, handles partial failures gracefully, implements caching strategies, and returns exactly the data shape the frontend needs—no over-fetching, no under-fetching.
Here's an illustrative implementation of a product detail micro-frontend's composition service. CircuitBreaker and CacheManager are project-local helpers that are not shown in full:
// product-composition-service/src/handlers/productDetail.ts
import { FastifyRequest, FastifyReply } from 'fastify';
import { ServiceOrchestrator } from '../orchestration/ServiceOrchestrator';
import { CircuitBreaker } from '../resilience/CircuitBreaker';
import { CacheManager } from '../cache/CacheManager';
// Assumed location of the shared domain types; adjust to your project layout.
import type { ProductData, InventoryStatus, PriceData, ReviewSummary, Product } from '../types';

interface ProductDetailQuery {
  userId?: string;
  includeRecommendations?: boolean;
}

interface ComposedProductDetail {
  product: ProductData;
  inventory: InventoryStatus;
  pricing?: PriceData; // omitted when the item is unavailable or pricing failed
  reviews: ReviewSummary;
  recommendations?: Product[];
}

export class ProductDetailHandler {
  constructor(
    private orchestrator: ServiceOrchestrator,
    private cache: CacheManager,
    private circuitBreaker: CircuitBreaker
  ) {}

  async handle(
    request: FastifyRequest<{ Params: { productId: string }; Querystring: ProductDetailQuery }>,
    reply: FastifyReply
  ): Promise<void> {
    const { productId } = request.params;
    const { userId, includeRecommendations = false } = request.query;

    // Check cache first - product data changes infrequently.
    // The key covers every input that changes the response shape.
    const cacheKey = `product:${productId}:${userId || 'anon'}:${includeRecommendations ? 'recs' : 'norecs'}`;
    const cached = await this.cache.get<ComposedProductDetail>(cacheKey);
    if (cached && this.isCacheValid(cached)) {
      reply.header('X-Cache', 'HIT');
      reply.send(cached);
      return;
    }

    try {
      // Parallel composition with circuit breaker protection.
      // allSettled never rejects, so each result is inspected individually below.
      const results = await Promise.allSettled([
        this.circuitBreaker.execute('product-service', () =>
          this.orchestrator.getProduct(productId)
        ),
        this.circuitBreaker.execute('inventory-service', () =>
          this.orchestrator.getInventory(productId)
        ),
        this.circuitBreaker.execute('pricing-service', () =>
          this.orchestrator.getPricing(productId, userId)
        ),
        this.circuitBreaker.execute('review-service', () =>
          this.orchestrator.getReviewSummary(productId)
        ),
        includeRecommendations
          ? this.circuitBreaker.execute('recommendation-service', () =>
              this.orchestrator.getRecommendations(productId, userId)
            )
          : Promise.resolve(undefined)
      ]);
      const [product, inventory, pricing, reviews, recommendations] = results;

      // The page cannot render without core product data, so treat its absence
      // as a total failure and fall through to the fallback path in `catch`.
      const productData = this.extractValue(product, 'product-service');
      if (!productData) {
        throw new Error('Product service unavailable');
      }

      // Handle partial failures gracefully
      const composed = this.composeResponse({
        product: productData,
        inventory: this.extractValue(inventory, 'inventory-service', { available: false, quantity: 0 }),
        pricing: this.extractValue(pricing, 'pricing-service'),
        reviews: this.extractValue(reviews, 'review-service', { rating: 0, count: 0 }),
        recommendations: this.extractValue(recommendations, 'recommendation-service', [])
      });

      // Cache with a TTL based on data volatility, but only complete responses:
      // caching a degraded response would pin the degradation for the whole TTL.
      const degraded = results.some((r) => r.status === 'rejected');
      if (!degraded) {
        await this.cache.set(cacheKey, composed, {
          ttl: 300, // 5 minutes for product data
          tags: [`product:${productId}`]
        });
      }

      reply.header('X-Cache', 'MISS');
      reply.send(composed);
    } catch (error) {
      // Fastify's default logger (pino) serializes errors passed under the `err` key
      request.log.error({ err: error, productId }, 'Product composition failed');

      // Return degraded response if possible
      const fallback = await this.getFallbackResponse(cacheKey);
      if (fallback) {
        reply.code(200).send(fallback);
      } else {
        reply.code(503).send({ error: 'Service temporarily unavailable' });
      }
    }
  }

  // Returns the fulfilled value, or records the failure and returns the fallback.
  private extractValue<T>(result: PromiseSettledResult<T>, serviceName: string): T | undefined;
  private extractValue<T>(result: PromiseSettledResult<T>, serviceName: string, fallback: T): T;
  private extractValue<T>(
    result: PromiseSettledResult<T>,
    serviceName: string,
    fallback?: T
  ): T | undefined {
    if (result.status === 'fulfilled') {
      return result.value;
    }
    this.orchestrator.recordFailure(serviceName, result.reason);
    return fallback;
  }

  private composeResponse(data: ComposedProductDetail): ComposedProductDetail {
    // Presentation rule for this page - e.g., hide pricing if inventory unavailable
    return data.inventory.available ? data : { ...data, pricing: undefined };
  }

  private isCacheValid(cached: ComposedProductDetail): boolean {
    // Custom validation logic - e.g., check if pricing is stale
    return true; // Simplified for example
  }

  private async getFallbackResponse(cacheKey: string): Promise<ComposedProductDetail | null> {
    // Attempt to serve a stale copy stored under the same key used for writes
    return this.cache.getStale<ComposedProductDetail>(cacheKey);
  }
}
The service orchestrator handles the actual service communication with proper timeout and retry logic:
// product-composition-service/src/orchestration/ServiceOrchestrator.ts
import { HttpClient } from '../http/HttpClient';
import { TracingContext } from '../observability/TracingContext';
// Assumed location of the shared domain types; adjust to your project layout.
import type { ProductData, InventoryStatus, PriceData } from '../types';

export class ServiceOrchestrator {
  constructor(
    private httpClient: HttpClient,
    private tracer: TracingContext
  ) {}

  async getProduct(productId: string): Promise<ProductData> {
    const span = this.tracer.startSpan('get-product');
    // Tag before the call so failed requests carry the product id too
    span.setTag('product.id', productId);
    try {
      const response = await this.httpClient.get<ProductData>(
        `${process.env.PRODUCT_SERVICE_URL}/products/${productId}`,
        {
          timeout: 2000,
          retry: { attempts: 2, backoff: 'exponential' },
          headers: this.getServiceHeaders()
        }
      );
      span.setTag('cache.hit', response.headers['x-cache'] === 'HIT');
      return response.data;
    } catch (error) {
      span.setTag('error', true);
      // `error` is `unknown` under strict TypeScript, so narrow before reading .message
      const message = error instanceof Error ? error.message : String(error);
      span.log({ event: 'error', message });
      throw error;
    } finally {
      span.finish();
    }
  }

  async getInventory(productId: string): Promise<InventoryStatus> {
    // Similar implementation with service-specific timeout and retry config
    return this.httpClient.get<InventoryStatus>(
      `${process.env.INVENTORY_SERVICE_URL}/inventory/${productId}`,
      { timeout: 1500, retry: { attempts: 3 }, headers: this.getServiceHeaders() }
    ).then(r => r.data);
  }

  async getPricing(productId: string, userId?: string): Promise<PriceData> {
    const url = `${process.env.PRICING_SERVICE_URL}/pricing/${productId}`;
    const params = userId ? { userId } : {};
    return this.httpClient.get<PriceData>(url, {
      params,
      timeout: 1000,
      retry: { attempts: 2 },
      headers: this.getServiceHeaders()
    }).then(r => r.data);
  }

  // getReviewSummary and getRecommendations, which the handler also calls,
  // follow the same pattern and are not shown in this listing.

  recordFailure(serviceName: string, reason: unknown): void {
    // Emit metrics for monitoring
    this.tracer.recordMetric(`service.failure.${serviceName}`, 1);
  }

  private getServiceHeaders(): Record<string, string> {
    return {
      'X-Request-ID': this.tracer.getRequestId(),
      'X-Correlation-ID': this.tracer.getCorrelationId(),
      'Authorization': `Bearer ${this.getServiceToken()}`
    };
  }

  private getServiceToken(): string {
    // Service-to-service authentication token
    const token = process.env.SERVICE_TOKEN;
    if (!token) {
      throw new Error('SERVICE_TOKEN is not configured');
    }
    return token;
  }
}
Implementing GraphQL Federation as an Alternative
For organizations already using GraphQL, federation provides a declarative approach to API composition. Each backend service exposes a GraphQL subgraph, and Apollo Federation or similar tools compose them into a unified graph. The composition layer becomes the federated gateway.
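// product-composition-gateway/src/gateway.ts
import { randomUUID } from 'node:crypto';
import { ApolloGateway, IntrospectAndCompose, RemoteGraphQLDataSource } from '@apollo/gateway';
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';

interface GatewayContext {
  token?: string;
  requestId: string;
  cacheable?: boolean; // assumed to be set by application code when a response is safe to cache
}

// Fail fast at startup if a subgraph URL is missing
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) throw new Error(`Missing environment variable: ${name}`);
  return value;
}

const gateway = new ApolloGateway({
  supergraphSdl: new IntrospectAndCompose({
    subgraphs: [
      { name: 'products', url: requireEnv('PRODUCT_SUBGRAPH_URL') },
      { name: 'inventory', url: requireEnv('INVENTORY_SUBGRAPH_URL') },
      { name: 'pricing', url: requireEnv('PRICING_SUBGRAPH_URL') },
      { name: 'reviews', url: requireEnv('REVIEW_SUBGRAPH_URL') }
    ],
    pollIntervalInMs: 30000 // Refresh schema every 30 seconds
  }),
  buildService({ url }) {
    return new RemoteGraphQLDataSource({
      url,
      willSendRequest({ request, context }) {
        // Propagate authentication and tracing context.
        // Note: this forwards the caller's token as-is; swap in service
        // credentials here if subgraphs should not see user tokens.
        if (context.token) {
          request.http?.headers.set('authorization', context.token);
        }
        request.http?.headers.set('x-request-id', context.requestId);
      }
    });
  }
});

const server = new ApolloServer<GatewayContext>({
  gateway,
  plugins: [
    // Custom plugin for caching and monitoring
    {
      async requestDidStart() {
        return {
          async willSendResponse({ response, contextValue }) {
            // Add cache headers for responses flagged as cacheable.
            // Apollo Server's header map expects lowercase header names.
            if (contextValue.cacheable) {
              response.http.headers.set('cache-control', 'public, max-age=300');
            }
          }
        };
      }
    }
  ]
});

const { url } = await startStandaloneServer(server, {
  context: async ({ req }) => {
    const header = req.headers['x-request-id'];
    return {
      token: req.headers.authorization,
      requestId: (Array.isArray(header) ? header[0] : header) || randomUUID()
    };
  },
  listen: { port: 4000 }
});
console.log(`Gateway ready at ${url}`);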
GraphQL federation works well when your backend services already expose GraphQL APIs and you need flexible client-driven queries. The trade-off is increased complexity in schema management and the risk of expensive queries when clients request fields they don't need.
Common Pitfalls and Failure Modes
Shared composition services become bottlenecks. When multiple micro-frontends share a single composition service, you recreate the monolith problem. Each frontend should own its composition layer, even if that means some code duplication. The operational independence is worth it.
Insufficient timeout and circuit breaker configuration. Default HTTP client timeouts (often 30-60 seconds) are too long for user-facing requests. Set aggressive timeouts (1-3 seconds) and implement circuit breakers that fail fast when downstream services degrade. A slow service shouldn't cascade into a slow frontend.
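As a rough illustration of failing fast, the sketch below pairs a per-call timeout with a very small circuit breaker. `SimpleCircuitBreaker`, `withTimeout`, and the threshold and cool-down values are all hypothetical. The `execute(service, fn)` shape only mirrors how the handler earlier in this article calls its breaker, and a production breaker would usually track error rates over a window rather than a simple counter.

```typescript
// Rejects if `promise` has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

interface BreakerState {
  failures: number;
  openedAt: number | null; // timestamp when the circuit opened, or null if closed
}

class SimpleCircuitBreaker {
  private states = new Map<string, BreakerState>();

  constructor(
    private failureThreshold: number = 3,
    private resetAfterMs: number = 10000,
    private now: () => number = () => Date.now() // injectable clock for tests
  ) {}

  async execute<T>(service: string, fn: () => Promise<T>): Promise<T> {
    const state = this.states.get(service) ?? { failures: 0, openedAt: null };
    this.states.set(service, state);

    if (state.openedAt !== null) {
      if (this.now() - state.openedAt < this.resetAfterMs) {
        // Open: fail immediately without touching the downstream service
        throw new Error(`circuit open for ${service}`);
      }
      // Cool-down elapsed: allow one trial call (half-open).
      // A single further failure re-opens the circuit.
      state.openedAt = null;
      state.failures = this.failureThreshold - 1;
    }

    try {
      const result = await fn();
      state.failures = 0;
      return result;
    } catch (error) {
      state.failures += 1;
      if (state.failures >= this.failureThreshold) {
        state.openedAt = this.now();
      }
      throw error;
    }
  }
}
```

A call site would combine the two, for example `breaker.execute('inventory-service', () => withTimeout(fetchInventory(id), 1500))`, where `fetchInventory` is whatever client function your service uses.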
Caching without invalidation strategy. Aggressive caching improves performance but stale data damages user experience. Implement cache tagging and event-driven invalidation. When a product price changes, invalidate all cached responses containing that product.
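One way to picture tag-based invalidation is the in-memory sketch below. `TaggedCache`, `onPriceChanged`, and the event shape are hypothetical names for illustration. A real deployment would more likely keep the tag index in Redis or a similar store and receive the event from a message bus.

```typescript
interface CacheEntry {
  value: unknown;
  expiresAt: number;
}

class TaggedCache {
  private entries = new Map<string, CacheEntry>();
  private keysByTag = new Map<string, Set<string>>();

  constructor(private now: () => number = () => Date.now()) {}

  set<T>(key: string, value: T, opts: { ttlSeconds: number; tags: string[] }): void {
    this.entries.set(key, { value, expiresAt: this.now() + opts.ttlSeconds * 1000 });
    // Index the key under each tag so it can be found at invalidation time
    for (const tag of opts.tags) {
      const keys = this.keysByTag.get(tag) ?? new Set<string>();
      keys.add(key);
      this.keysByTag.set(tag, keys);
    }
  }

  get<T>(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= this.now()) return undefined;
    return entry.value as T;
  }

  // Removes every entry stored under `tag` and returns how many were removed.
  invalidateTag(tag: string): number {
    const keys = this.keysByTag.get(tag);
    if (!keys) return 0;
    let removed = 0;
    keys.forEach((key) => {
      if (this.entries.delete(key)) removed += 1;
    });
    this.keysByTag.delete(tag);
    return removed;
  }
}

// Hypothetical handler for a "price changed" event published by the pricing service.
function onPriceChanged(cache: TaggedCache, event: { productId: string }): number {
  return cache.invalidateTag(`product:${event.productId}`);
}
```

Because every composed response for a product is written with the same `product:<id>` tag, a single event clears the anonymous and per-user variants together instead of waiting for each TTL to expire.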
Ignoring partial failure scenarios. Your composition service will face partial failures—some backend services respond while others timeout. Design responses that gracefully degrade. A product page can render without recommendations but not without basic product data.
Authentication token leakage. Never pass user authentication tokens directly to backend services from the composition layer. Use service-to-service authentication and maintain a separate authorization context. The composition service authenticates the user, then makes authorized requests on their behalf using service credentials.
Inadequate observability. Distributed tracing is non-negotiable. Every request through the composition layer should generate trace spans for each backend call, allowing you to identify which service causes latency or failures. Without this, debugging production issues becomes guesswork.
Over-composition creating tight coupling. Don't let composition services become orchestration engines that implement complex business logic. They should aggregate and transform data, not make business decisions. Business logic belongs in backend services.
Best Practices for Production Deployments
Deploy composition services with the frontend. Use the same deployment pipeline and cadence. This maintains the autonomy that micro-frontends promise while keeping the composition layer aligned with frontend needs.
Implement request coalescing. When multiple users request the same data simultaneously, coalesce those requests into a single backend call. This prevents thundering herd problems during traffic spikes.
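Request coalescing can be sketched in a few lines: keep a map of in-flight promises keyed by the resource being fetched, and hand the same promise to every caller that arrives before it settles. `RequestCoalescer` is a hypothetical name, and this version only coalesces within a single process. Coalescing across instances would need a shared lock or cache.

```typescript
class RequestCoalescer {
  private inFlight = new Map<string, Promise<unknown>>();

  // Runs `fetcher` at most once per key at a time; concurrent callers share the result.
  run<T>(key: string, fetcher: () => Promise<T>): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) {
      return existing as Promise<T>;
    }

    const pending = fetcher();
    this.inFlight.set(key, pending);

    // Clear the entry once the call settles, whether it succeeded or failed,
    // so later requests trigger a fresh backend call.
    const clear = (): void => {
      this.inFlight.delete(key);
    };
    pending.then(clear, clear);

    return pending;
  }
}
```

In a composition handler, this would wrap the backend call, for example `coalescer.run('product:' + productId, () => orchestrator.getProduct(productId))`. Failures are shared too, so every waiting caller sees the same rejection.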
Use semantic versioning for composition APIs. Even though the composition service is owned by the frontend team, treat its API as a contract. Version it properly to support gradual rollouts and rollbacks.
Monitor composition latency separately. Track P50, P95, and P99 latency for composition requests distinct from backend service latency. This reveals whether your composition logic introduces overhead.
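For readers who want to see what those percentiles mean in code, here is a small sketch using the nearest-rank method. In practice a metrics library or your observability backend would compute these from histograms. The `compositionOverhead` helper is a hypothetical illustration that assumes the backend calls ran in parallel, so the slowest call sets the floor.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('no samples');
  }
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Time spent in the composition layer itself, assuming a parallel fan-out:
// total request time minus the slowest backend call.
function compositionOverhead(totalMs: number, backendMs: number[]): number {
  return totalMs - Math.max(...backendMs);
}

// Summarizes a batch of composition request latencies (in milliseconds).
function summarize(samples: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

If the overhead number grows while backend latencies stay flat, the composition logic (serialization, transformation, cache lookups) is the likely culprit rather than a downstream service.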
Implement request prioritization. Not all data is equally important. Fetch critical data first, then make secondary requests for enhancements like recommendations. Use HTTP/2 or HTTP/3 to multiplex requests efficiently.
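One possible shape for prioritization inside a composition service is sketched below: the critical call must succeed, while the secondary call gets a fixed time budget after the critical data arrives and is dropped if it misses it. `loadWithPriority` and its parameters are hypothetical. Both calls are started up front here so the secondary one does not wait on the critical one, which is a design choice rather than the only option.

```typescript
interface PrioritizedResult<C, S> {
  critical: C;
  secondary: S | null; // null when the secondary call failed or missed its budget
}

async function loadWithPriority<C, S>(
  loadCritical: () => Promise<C>,
  loadSecondary: () => Promise<S>,
  secondaryBudgetMs: number
): Promise<PrioritizedResult<C, S>> {
  // Start both calls immediately; only the critical one is allowed to fail the request.
  const criticalPromise = loadCritical();
  const secondaryPromise: Promise<S | null> = loadSecondary().catch(() => null);

  // Errors from the critical call propagate to the caller.
  const critical = await criticalPromise;

  // Give the secondary call a bounded extra wait once critical data is ready.
  const deadline = new Promise<null>((resolve) =>
    setTimeout(() => resolve(null), secondaryBudgetMs)
  );
  const secondary = await Promise.race([secondaryPromise, deadline]);

  return { critical, secondary };
}
```

For a product page, `loadCritical` would fetch the product itself and `loadSecondary` the recommendations, so a slow recommendation service costs at most the budget rather than the whole response.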
Design for regional deployment. Deploy composition services in the same regions as your frontends to minimize latency. Use service mesh or API gateway routing to direct requests to regional backend services.
Create composition service templates. While each micro-frontend needs its own composition service, standardize the infrastructure, observability, and resilience patterns. Provide templates or libraries that teams can customize.
Test failure scenarios explicitly. Write integration tests that simulate backend service failures, timeouts, and partial responses. Verify that your composition service handles these gracefully and returns appropriate fallback data.
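A failure-scenario test can be as simple as swapping in fake backends that throw. The sketch below uses a deliberately reduced composition function and hand-rolled checks. `composePage`, the `Backends` interface, and the fakes are hypothetical, and a real suite would use your test runner and the actual handler.

```typescript
interface ProductInfo { id: string; name: string }
interface ReviewStats { rating: number; count: number }

interface Backends {
  getProduct(id: string): Promise<ProductInfo>;
  getReviews(id: string): Promise<ReviewStats>;
}

interface ComposedPage { product: ProductInfo; reviews: ReviewStats; degraded: boolean }

// Reduced composition logic: product is required, reviews fall back to an empty summary.
async function composePage(backends: Backends, id: string): Promise<ComposedPage> {
  const [product, reviews] = await Promise.allSettled([
    backends.getProduct(id),
    backends.getReviews(id),
  ]);
  if (product.status === 'rejected') {
    throw new Error('product data unavailable');
  }
  return {
    product: product.value,
    reviews: reviews.status === 'fulfilled' ? reviews.value : { rating: 0, count: 0 },
    degraded: reviews.status === 'rejected',
  };
}

// Test doubles simulating healthy and failing backend services
const healthy: Backends = {
  getProduct: async (id) => ({ id, name: 'Widget' }),
  getReviews: async () => ({ rating: 4.5, count: 12 }),
};
const reviewsDown: Backends = {
  ...healthy,
  getReviews: async () => { throw new Error('reviews down'); },
};
const productDown: Backends = {
  ...healthy,
  getProduct: async () => { throw new Error('product down'); },
};

function expect(condition: boolean, message: string): void {
  if (!condition) throw new Error(`failed: ${message}`);
}

async function checkFailureScenarios(): Promise<void> {
  // Partial failure: page still renders, with fallback reviews and a degraded flag
  const partial = await composePage(reviewsDown, '42');
  expect(partial.degraded, 'response is marked degraded');
  expect(partial.reviews.count === 0, 'reviews fall back to empty summary');

  // Critical failure: composition must reject rather than return a broken page
  let rejected = false;
  try {
    await composePage(productDown, '42');
  } catch {
    rejected = true;
  }
  expect(rejected, 'missing product data fails the request');
}
```

The same structure extends to timeouts by having a fake backend return a promise that never resolves and asserting the composition still responds within its deadline.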
Frequently Asked Questions
What is the API composition pattern for micro-frontends?
The API composition pattern introduces a dedicated backend layer that aggregates data from multiple microservices and returns composed responses optimized for specific micro-frontend needs. This moves orchestration complexity from the browser to the backend, improving performance, security, and maintainability.
How does the Backend for Frontend pattern differ from an API gateway in 2025?
A BFF is purpose-built for a specific frontend and owned by the same team, while an API gateway is a shared infrastructure component. BFFs contain domain-specific composition logic and evolve with frontend requirements. Modern API gateways handle cross-cutting concerns like rate limiting and authentication but shouldn't implement business logic.
What is the best way to handle authentication in API composition services?
Implement two-layer authentication: the composition service validates user tokens and establishes user context, then uses service-to-service credentials (mutual TLS or service tokens) to call backend services. Never forward user tokens directly to backend services. This separation improves security and simplifies token management.
When should you avoid using GraphQL federation for micro-frontends?
Avoid GraphQL federation when backend services don't naturally expose GraphQL APIs, when you need fine-grained caching control, or when query complexity makes performance unpredictable. REST-based composition services offer more explicit control over data fetching and caching strategies.
How do you scale API composition services under high traffic?
Scale horizontally with stateless composition service instances behind a load balancer. Implement aggressive caching with Redis or similar, use request coalescing to prevent duplicate backend calls, and deploy regionally to reduce latency. Monitor backend service capacity separately—composition services can scale independently but are limited by downstream service capacity.
What metrics should you track for composition service health?
Track composition request latency (P50, P95, P99), backend service call success rates, cache hit ratios, circuit breaker state changes, and partial failure rates. Set up alerts for latency degradation and increased error rates. Distributed tracing provides request-level visibility into which backend services cause problems.
How do you prevent composition services from becoming a distributed monolith?
Maintain strict ownership boundaries—each micro-frontend team owns their composition service. Avoid sharing composition logic between teams. Accept some code duplication as the cost of operational independence. Use shared libraries for infrastructure concerns (HTTP clients, circuit breakers) but not business logic.
Conclusion
The API composition pattern solves the fundamental data aggregation challenge in micro-frontend architectures by moving orchestration complexity from the browser to a dedicated backend layer. This approach delivers measurable improvements in performance, security, and maintainability while preserving the team autonomy that makes micro-frontends valuable.
Success requires treating composition services as first-class components owned by frontend teams, implementing robust resilience patterns, and designing for partial failures from the start. The architectural investment pays dividends through reduced frontend complexity, improved user experience, and operational independence between teams.
Start by identifying your most complex micro-frontend—typically a dashboard or detail page that aggregates data from multiple services. Implement a composition service for that frontend first, measure the performance improvement, and use that success to justify broader adoption. Focus on observability from day one; you can't optimize what you can't