Enterprise API Integrations: Design, Security, and Resilience

Integrations between systems are the connective tissue of a modern enterprise. They're also among the most frequent sources of hard-to-diagnose incidents: a third party that changes its API without notice, an integration that silently fails for hours, or a webhook that processes the same event twice with financial consequences. This guide documents the integration patterns that make the difference between systems that fail together and systems that fail gracefully.

API Gateway: the single entry point

An API Gateway centralizes cross-cutting concerns for all APIs: authentication, rate limiting, logging, request transformation, and routing. It eliminates the need to implement these functions in every individual microservice.

  • Kong (open-source): the most flexible self-hosted option. Plugins for JWT auth, OAuth2, rate limiting, CORS, and request transformation. Deployable on Kubernetes with Kong's ingress controller.
  • AWS API Gateway: native integration with Lambda, Cognito, and IAM. Ideal when the backend is primarily serverless or heavily AWS-dependent.
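  • Traefik: if you already run Traefik as your Kubernetes ingress controller, its built-in middlewares cover most cases: basic and forward auth, rate limiting, and circuit breaking. JWT validation is not a built-in middleware in the open-source proxy, so it is typically delegated to an external auth service through ForwardAuth or handled by a plugin.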
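
To make one of these cross-cutting concerns concrete, here is a minimal sketch of per-client rate limiting with a token bucket, the kind of check a gateway runs before a request reaches any service. It is illustrative only: `TokenBucket`, `allowRequest`, and the limits (burst of 5, then 1 request per second) are assumptions, not the API or defaults of Kong, AWS API Gateway, or Traefik.

```typescript
// Minimal token bucket: each request consumes one token, and tokens
// refill continuously up to a fixed capacity.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;        // maximum burst size
  private readonly refillPerSecond: number; // sustained request rate

  constructor(capacity: number, refillPerSecond: number, now: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request is allowed, false if it should be rejected (HTTP 429).
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = now;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// One bucket per client key (API key, client_id, or IP address).
const buckets = new Map<string, TokenBucket>();

function allowRequest(clientId: string, now: number = Date.now()): boolean {
  let bucket = buckets.get(clientId);
  if (!bucket) {
    bucket = new TokenBucket(5, 1, now); // assumed limits: burst of 5, then 1 request/second
    buckets.set(clientId, bucket);
  }
  return bucket.tryConsume(now);
}
```

Keeping this logic in the gateway means every service behind it gets the same limits without carrying its own copy of the counter.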

OAuth2 and JWT: correct authentication for enterprise APIs

The most common API authentication mistake: using static API keys without expiration or rotation. For enterprise APIs integrating internal systems and third parties, OAuth2 with short-lived JWT tokens is the correct standard.

typescript
// JWT validation with enterprise claim verification
import jwt, { type JwtPayload } from 'jsonwebtoken';
import jwksClient from 'jwks-rsa';

const client = jwksClient({
  jwksUri: 'https://auth.company.com/.well-known/jwks.json',
  cache: true,
  cacheMaxAge: 86400000, // 24h cache for public keys
});

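async function validateToken(token: string): Promise<JwtPayload> {
  // Decode without verifying, only to read the key ID (kid) from the header
  const decoded = jwt.decode(token, { complete: true });
  if (!decoded?.header?.kid) throw new Error('Invalid token: missing kid');

  // Fetch the matching public key from the JWKS endpoint (cached by jwks-rsa)
  const key = await client.getSigningKey(decoded.header.kid);
  // Verify signature, expiry, audience, and issuer. Pinning the algorithm
  // rejects tokens signed with anything other than RS256.
  return jwt.verify(token, key.getPublicKey(), {
    algorithms: ['RS256'],
    audience: 'api.company.com',
    issuer: 'https://auth.company.com',
  }) as JwtPayload;
}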
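
Short-lived tokens shift some work to the calling side: a client that requests a new token for every call adds latency and load on the authorization server. A common approach is to cache the token and refresh it shortly before it expires. The sketch below assumes an OAuth2 token response with `access_token` and `expires_in`; `createTokenCache`, `fetchToken`, and the 30-second refresh margin are hypothetical names and values chosen for illustration.

```typescript
// Subset of an OAuth2 access token response used by this sketch.
interface TokenResponse {
  access_token: string;
  expires_in: number; // lifetime in seconds
}

// Wraps any token-fetching function and caches its result until shortly
// before expiry. `now` is injectable so the logic can be tested without waiting.
function createTokenCache(
  fetchToken: () => Promise<TokenResponse>,
  refreshMarginMs = 30000,
  now: () => number = Date.now
): () => Promise<string> {
  let cached: { token: string; expiresAt: number } | null = null;
  let inFlight: Promise<string> | null = null;

  return async function getToken(): Promise<string> {
    // Reuse the cached token while it is comfortably inside its lifetime
    if (cached && now() < cached.expiresAt - refreshMarginMs) return cached.token;

    // Share a single refresh between concurrent callers
    if (!inFlight) {
      inFlight = fetchToken()
        .then((res) => {
          cached = { token: res.access_token, expiresAt: now() + res.expires_in * 1000 };
          return res.access_token;
        })
        .finally(() => {
          inFlight = null;
        });
    }
    return inFlight;
  };
}
```

In practice `fetchToken` would call the authorization server's token endpoint; it is injected here so the caching logic stays independent of any particular HTTP client.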

Circuit Breaker: isolating third-party failures

When an external service (payment provider, geolocation API, ERP) starts failing or responding slowly, that behavior propagates to your system unless a circuit breaker isolates it. Your API keeps waiting on the third party, open connections pile up, and eventually the connection or thread pool is exhausted and your service goes down too.

typescript
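import CircuitBreaker from 'opossum';

// `callPaymentProvider` (an async function) and `logger` are assumed to be defined elsewhere.
const breaker = new CircuitBreaker(callPaymentProvider, {
  timeout: 3000,                // fail any call that takes longer than 3 s
  errorThresholdPercentage: 50, // open the circuit when 50% of calls fail
  resetTimeout: 30000,          // after 30 s, let a trial call through (half-open)
  volumeThreshold: 5,           // require at least 5 calls in the window before opening
});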

breaker.fallback(() => ({
  status: 'pending',
  message: 'Payment provider unavailable, retrying automatically'
}));

breaker.on('open', () =>
  logger.warn('Circuit breaker open: payment provider degraded')
);

The circuit breaker fallback should never silently simulate success. If payment failed because the circuit is open, the user needs to know. The correct fallback communicates degradation and lets the user or system retry consciously.
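
To show what a breaker does underneath, the following is a deliberately simplified state machine. It is not opossum's implementation: it assumes a consecutive-failure threshold instead of an error percentage over a rolling window, it does not limit concurrent trial calls, and `SimpleBreaker` and its parameters are illustrative names.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class SimpleBreaker {
  state: BreakerState = 'closed';
  private failures = 0;
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      // While open, fail fast without touching the provider
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Circuit open: call rejected without contacting the provider');
      }
      this.state = 'half-open'; // reset timeout elapsed: allow a trial call
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed'; // a success closes the circuit again
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many consecutive failures, opens the circuit
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

The important property is the fast rejection in the open state: callers get an immediate error they can surface to the user, instead of holding a connection until a timeout fires.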

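Retry with Exponential Backoff and Jitter

Transient failures (timeouts, 5xx responses, dropped connections) often succeed on a later attempt, but immediate retries from many clients at once can overwhelm a provider that is already struggling. Exponential backoff spaces the attempts out, and random jitter keeps clients from retrying in lockstep.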

typescript
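// Minimal error type carrying the HTTP status; adapt it to whatever your HTTP client throws.
class ApiError extends Error {
  status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 4,
  baseDelay = 1000
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      if (attempt === maxAttempts) throw error;
      // Don't retry 4xx errors (client errors, not transient), except
      // 408 (request timeout) and 429 (rate limited), which can succeed later
      if (error instanceof ApiError && error.status >= 400 && error.status < 500
          && error.status !== 408 && error.status !== 429) throw error;
      // Exponential delay plus up to 1 s of random jitter, capped at 30 s
      const delay = Math.min(
        baseDelay * Math.pow(2, attempt - 1) + Math.random() * 1000,
        30000
      );
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('Unreachable');
}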
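
To make the schedule concrete: with the defaults above (`maxAttempts = 4`, `baseDelay = 1000`), the three waits between attempts are roughly 1 s, 2 s, and 4 s, each plus up to 1 s of jitter. The sketch below isolates that calculation into a pure function so it can be inspected. `backoffDelay` and `fullJitterDelay` are illustrative names, and the "full jitter" variant is a commonly used alternative, not something the retry function above implements.

```typescript
// Delay before the next retry, using the same formula as retryWithBackoff:
// exponential growth plus random jitter, capped at capMs.
// `random` is injectable so the schedule can be computed deterministically.
function backoffDelay(
  attempt: number, // 1-based number of the attempt that just failed
  baseDelayMs = 1000,
  capMs = 30000,
  maxJitterMs = 1000,
  random: () => number = Math.random
): number {
  const exponential = baseDelayMs * Math.pow(2, attempt - 1);
  return Math.min(exponential + random() * maxJitterMs, capMs);
}

// "Full jitter" alternative: pick a random delay between 0 and the capped
// exponential value, which spreads competing clients over the whole interval.
function fullJitterDelay(
  attempt: number,
  baseDelayMs = 1000,
  capMs = 30000,
  random: () => number = Math.random
): number {
  return random() * Math.min(capMs, baseDelayMs * Math.pow(2, attempt - 1));
}

// Schedule with jitter disabled for the first three failures: [1000, 2000, 4000]
const schedule = [1, 2, 3].map((attempt) => backoffDelay(attempt, 1000, 30000, 1000, () => 0));
```

The cap matters for long retry chains: without it, the sixth failure would already wait 32 s before jitter.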

Reliable webhooks: idempotency and signature verification

typescript
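import crypto from 'node:crypto';
import express from 'express';

// Assumes `app` is an Express app, `db` is a Prisma-style client, and no global
// JSON body parser has already consumed the body for this route.
app.post('/webhooks/payment', express.raw({ type: 'application/json' }), async (req, res) => {
  // 1. Verify HMAC signature from provider over the raw request bytes.
  //    Re-serializing parsed JSON can change whitespace or key order and break the match.
  const sig = Buffer.from(String(req.headers['x-webhook-signature'] ?? ''));
  const expected = Buffer.from(
    crypto.createHmac('sha256', process.env.WEBHOOK_SECRET!).update(req.body).digest('hex')
  );
  // timingSafeEqual throws if the lengths differ, so check the length first
  if (sig.length !== expected.length || !crypto.timingSafeEqual(sig, expected))
    return res.status(401).json({ error: 'Invalid signature' });

  const event = JSON.parse(req.body.toString('utf8'));

  // 2. Idempotency: skip if already processed.
  //    A unique constraint on event_id is still needed to cover concurrent deliveries.
  const existing = await db.processedEvents.findUnique({
    where: { event_id: event.event_id }
  });
  if (existing) return res.status(200).json({ status: 'already_processed' });

  await processPaymentEvent(event);
  await db.processedEvents.create({ data: { event_id: event.event_id, processed_at: new Date() } });
  res.status(200).json({ status: 'ok' });
});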
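
A lookup followed by a later insert leaves a window in which two deliveries of the same event can both pass the check. One way to close it is to claim the event ID first, atomically, and only then process. The sketch below uses an in-memory `Set` as a stand-in for a table with a unique constraint on the event ID; `claimEvent` and `handleOnce` are hypothetical names, and a real implementation would rely on the database rejecting the duplicate insert.

```typescript
// Stand-in for a table with a UNIQUE constraint on event_id.
const claimedEvents = new Set<string>();

// Records the event ID and returns true, or returns false if it was already recorded.
// The check and the insert happen in one synchronous step, so no other handler
// can interleave between them.
function claimEvent(eventId: string): boolean {
  if (claimedEvents.has(eventId)) return false;
  claimedEvents.add(eventId);
  return true;
}

async function handleOnce(
  eventId: string,
  handler: () => Promise<void>
): Promise<'ok' | 'already_processed'> {
  if (!claimEvent(eventId)) return 'already_processed';
  try {
    await handler();
    return 'ok';
  } catch (err) {
    // Release the claim so the provider's retry can be processed
    claimedEvents.delete(eventId);
    throw err;
  }
}
```

Releasing the claim on failure is a design choice: it favors reprocessing after an error over the risk of dropping an event, so the handler itself should tolerate being run again.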

Frequently Asked Questions

REST, GraphQL, or gRPC for internal enterprise APIs?
REST for public APIs and third-party integrations: it's the standard everyone understands. gRPC for internal service-to-service communication: more efficient serialization (Protocol Buffers), native streaming support, and strict typing with code generation. GraphQL when clients need flexibility over which fields to fetch; avoid it for simple internal APIs where the added complexity isn't justified.

How do I manage API versions without breaking existing integrations?
The most pragmatic approach is URL versioning (/api/v1/, /api/v2/). Previous versions stay active during an announced deprecation period (at least 6 months for enterprise APIs). Breaking changes always go in a new version. New endpoints and additional optional fields are not breaking changes and can ship in the existing version.

How do I detect when a third-party integration is silently failing?
Combine three signals: alerts on error rate and latency for each integration (in Prometheus or Datadog, for example), dead letter queues for messages that couldn't be processed, and alerts on received webhook volume (if volume drops 80% below its average, something is probably wrong on the provider side). Detecting silent failures requires active monitoring, not just reacting to explicit errors.
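
The volume check can be sketched as a comparison between the latest interval and the mean of recent ones. This is a minimal illustration, assuming webhook counts are already collected per fixed interval (hourly, for example); `isVolumeDrop` is a hypothetical helper, and in practice the same rule would more likely be written as an alert expression in the monitoring system.

```typescript
// Returns true when `current` has fallen by at least `dropFraction`
// relative to the mean of the preceding intervals in `history`.
function isVolumeDrop(history: number[], current: number, dropFraction = 0.8): boolean {
  if (history.length === 0) return false; // no baseline yet
  const mean = history.reduce((sum, n) => sum + n, 0) / history.length;
  if (mean === 0) return false; // nothing is normally received, so nothing to compare
  return (mean - current) / mean >= dropFraction;
}

// Baseline of about 100 events per interval:
const baseline = [100, 120, 80];
const quietHour = isVolumeDrop(baseline, 15);  // 85% below the mean: alert
const normalHour = isVolumeDrop(baseline, 50); // 50% below the mean: no alert
```

A plain mean is sensitive to daily and weekly patterns, so comparing against the same hour on previous days is a reasonable refinement when traffic is seasonal.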

How do I document internal APIs in a way that doesn't go stale?
Generate the OpenAPI (Swagger) spec automatically from code instead of writing it by hand. Frameworks like NestJS, FastAPI, and Spring Boot can generate the OpenAPI spec from code annotations, in some cases through an add-on module such as @nestjs/swagger or springdoc-openapi. This keeps the documentation in step with the current implementation. Tools like Stoplight or Redocly can publish documentation automatically from the CI pipeline.

What is an API contract test and when is it necessary?
A contract test verifies that an API (yours or a third party's) fulfills the contract your integrations expect. Pact is one of the most widely used tools for this. Contract tests are especially valuable when multiple teams consume the same internal API: they detect whether a provider change would break consumers before it is deployed to production.
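
The core idea can be illustrated without any tooling: the consumer declares the fields it depends on, and the check fails when a provider response no longer satisfies them. The sketch below is a hand-rolled simplification, not Pact's API; `Contract` and `contractViolations` are hypothetical names, and only top-level primitive fields are checked.

```typescript
type FieldType = 'string' | 'number' | 'boolean';

// The consumer's expectations: field name mapped to the type it relies on.
type Contract = Record<string, FieldType>;

// Returns a list of violations; an empty list means the response honors the contract.
// Extra fields in the response are ignored, since additive changes are not breaking.
function contractViolations(contract: Contract, response: Record<string, unknown>): string[] {
  const violations: string[] = [];
  for (const [field, expected] of Object.entries(contract)) {
    if (!(field in response)) {
      violations.push(`missing field "${field}"`);
      continue;
    }
    const actual = typeof response[field];
    if (actual !== expected) {
      violations.push(`field "${field}" is ${actual}, expected ${expected}`);
    }
  }
  return violations;
}

// What a billing consumer needs from a hypothetical payments API
const paymentContract: Contract = { id: 'string', amount: 'number' };

const compatible = contractViolations(paymentContract, { id: 'p1', amount: 10, currency: 'EUR' });
const breaking = contractViolations(paymentContract, { id: 'p1', amount: '10' });
```

Here `compatible` is empty because the extra `currency` field is additive, while `breaking` reports that `amount` changed from a number to a string. Run in the provider's CI pipeline against each consumer's contract, a check like this surfaces the break before deployment.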

Engineering Team — IQS

Software, cloud, and DevOps engineers with enterprise project experience.