Health endpoints

mcpgw exposes two health endpoints for orchestrators. They serve distinct purposes: do not point both your liveness and readiness probes at the same one.

| Endpoint | Probe type | Returns 503 when… |
| --- | --- | --- |
| `/healthz` | Liveness | Never. The only failure mode is no response at all (for example, a hung process) |
| `/readyz` | Readiness | License is invalid or expired beyond the grace window |

`/healthz`

GET /healthz

| Response | Meaning |
| --- | --- |
| `200 OK`, body `ok` | Process is responsive |
| No response | Process is dead or hung, or the network is broken; the orchestrator should restart the process |

Use as your liveness probe. Kubernetes example:

```yaml
livenessProbe:
  httpGet:
    path: /healthz
    port: 7332
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
```

`/healthz` deliberately does not check the license, the upstreams, or whether the audit file is writable. Its only job is to prove the process is responsive. Coupling liveness to license state would turn a slow license renewal into a thundering-herd restart, which only makes the incident worse.

`/readyz`

GET /readyz

| Response | Meaning |
| --- | --- |
| `200 OK`, body `ready` | License is valid, or expired but still within the grace window (`exp + grace_days`) |
| `503 Service Unavailable`, body `license_expired` | License is expired beyond the grace window |
| `503 Service Unavailable`, body `license_invalid` | Signature or claim validation failed |
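The response table can be read as a small decision function. The following is a minimal sketch, not mcpgw's actual implementation: the function name `readyz_response` and the `signature_ok` flag are hypothetical, and it assumes that `exp` is a UTC timestamp, that `grace_days` counts whole days, and that the end of the grace window is inclusive.

```python
from datetime import datetime, timedelta, timezone


def readyz_response(exp: datetime, grace_days: int, now: datetime,
                    signature_ok: bool = True) -> tuple[int, str]:
    """Return (status_code, body) following the /readyz response table.

    Hypothetical helper for illustration only.
    """
    # Assumption: signature_ok stands in for signature and claim validation.
    if not signature_ok:
        return 503, "license_invalid"
    # Assumption: the grace window ends at exp + grace_days, inclusive.
    if now <= exp + timedelta(days=grace_days):
        return 200, "ready"
    return 503, "license_expired"


exp = datetime(2025, 1, 1, tzinfo=timezone.utc)
# Four days after expiry with a 7-day grace window: still ready.
print(readyz_response(exp, 7, datetime(2025, 1, 5, tzinfo=timezone.utc)))
```

Under these assumptions, a license that expired four days ago with a 7-day grace window still yields `200 ready`; it flips to `503 license_expired` only once the window has passed.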

Use as your readiness probe. When /readyz returns 503, the orchestrator should remove the pod from the LB rotation but not restart it.

```yaml
readinessProbe:
  httpGet:
    path: /readyz
    port: 7332
  initialDelaySeconds: 2
  periodSeconds: 30
  failureThreshold: 2
```

/readyz does not check upstream health. An upstream MCP server being down does not make the gateway “not ready” — the gateway can still receive requests and return useful errors (502 upstream_unreachable). Coupling readiness to upstream health collapses both layers’ availability into one number.

Why split these?

The Kubernetes-style liveness/readiness split exists because the two probes correspond to different remediations.

Liveness fail = restart me. The process is broken and only a restart will fix it.
Readiness fail = remove me from rotation. The process is fine but currently can’t serve traffic correctly.

License expiry is a readiness failure: you do not want the orchestrator to restart-loop the binary while you are renewing the JWT. A hung process is a liveness failure: restart it immediately.
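The mapping from probe outcome to remediation can be sketched as a pure function. This is an illustration of the reasoning above, not code from mcpgw or any orchestrator; the name `remediation` and the returned strings are hypothetical, and `None` is assumed to mean "the probe got no response".

```python
from typing import Optional


def remediation(healthz_status: Optional[int],
                readyz_status: Optional[int]) -> str:
    """Map probe outcomes to the action an orchestrator would take.

    A status of None means the probe timed out or could not connect.
    """
    # Liveness failure: the process is not answering, so restart it.
    if healthz_status != 200:
        return "restart"
    # Readiness failure: the process is alive but should not get traffic.
    if readyz_status != 200:
        return "remove_from_rotation"
    return "serve"


print(remediation(200, 503))   # license problem  -> remove_from_rotation
print(remediation(None, None)) # hung process     -> restart
```

The liveness check comes first on purpose: a process that does not answer `/healthz` cannot be trusted to report readiness either.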

If your orchestrator only supports one probe type, use /readyz. The downside is slightly slower recovery from process hangs (one fewer fast-fail signal); the upside is correct behavior on license expiry.

Fronting `/readyz` with a load balancer

LB health checks typically behave like readiness probes: a failing backend is taken out of rotation, not restarted. Point them at /readyz. If you need the LB to detect process death faster, add a TCP-only check on port 7332.
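For load balancers whose checks are scripted, the two checks could look roughly like the sketch below. It uses only the Python standard library; the function names `tcp_alive` and `readyz_ok` are hypothetical, the timeouts are arbitrary, and plain HTTP on port 7332 is assumed. Treat it as a starting point, not as a reference health checker.

```python
import http.client
import socket
import urllib.request


def tcp_alive(host: str, port: int = 7332, timeout: float = 1.0) -> bool:
    """TCP-only check: True if something accepts connections on the port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


def readyz_ok(host: str, port: int = 7332, timeout: float = 2.0) -> bool:
    """HTTP check: True only if /readyz answers 200."""
    url = f"http://{host}:{port}/readyz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # urlopen raises HTTPError (an OSError subclass) on 503,
        # and URLError or a timeout when there is no answer.
        return False
```

A backend for which `tcp_alive` is true but `readyz_ok` is false is running but should not receive traffic, which is the license-expiry case; a backend for which `tcp_alive` is false is most likely dead.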

/readyz is excluded from rate-limit identity and policy evaluation. It does not appear in the audit log or in OTLP spans. This is intentional — health-check traffic should not pollute observability data.

Related

CLI reference: server (exit codes and signals)
License JWT reference (what makes a license invalid)
