How OAuth works in mcpgw
mcpgw is an OAuth 2.1 Resource Server. It does not issue tokens. It verifies tokens issued by your Authorization Server (Auth0, Okta, Keycloak, etc.) and gates access to the upstream MCP servers.
The triangle
MCP Client
    |
    |  1. Fetch token (client credentials, PKCE, etc.)
    v
Authorization Server ------------> JWKS endpoint
(Auth0 / Okta / etc.)              (public key material)
    |                                ^
    |  2. JWT access token           |  3. Verify signature
    v                                |
MCP Client ----[Bearer token]----> mcpgw ------> Upstream MCP Server
The MCP client gets a token from the AS, then presents it to mcpgw on every request. mcpgw never sees client secrets or issues tokens itself.
Token validation pipeline
Each inbound request at POST /mcp follows this sequence:
- Header extraction. mcpgw reads the Authorization header and strips the Bearer prefix.
- Token shape check (looksLikeJWT). The raw credential is tested: three dot-separated base64url segments, total length ≥ 100 characters, no characters outside [A-Za-z0-9._-]. Tokens that pass → OAuth verifier. Tokens that fail → API key store (if configured). If neither path has a matching verifier/store, the request is rejected with 401 invalid.
- JWKS fetch (cached). The verifier fetches the AS's JWKS from auth.oauth.jwks_url. The keyset is cached in memory for jwks_cache_ttl (default 5m) and refreshed in the background by a goroutine bound to the process's run context.
- Signature verification. jwt.Parse from github.com/lestrrat-go/jwx/v2 selects the JWK by kid header and verifies the signature. An unknown kid fails with invalid_token; this is the expected failure mode when JWKS rotation has published a new key before the cache has refreshed.
- Standard claims checked: iss must match auth.oauth.issuer, aud must contain auth.oauth.audience, exp must be in the future within leeway (default 30s).
- Scope enforcement. Each scope listed in auth.oauth.required_scopes must appear in the token's scope claim (space-separated string, RFC 9068) or scp claim (string or string array, used by Azure AD and Okta). Missing scope → 401 with error="insufficient_scope".
- Identity extraction. On success, client_id is read from the client_id claim (RFC 9068 §2.2). If absent, azp (authorized party) is tried as a fallback; Keycloak and some other AS emit this instead. The resolved value and scopes are stored for audit and telemetry.
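The scope and identity steps can be sketched in plain Go. This is a minimal illustration, not mcpgw's actual code: the function names (tokenScopes, missingScopes, resolveClientID) are hypothetical, claims are assumed to be already decoded into a map, and merging scope and scp into one set is an assumption about how the two claims are combined.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenScopes collects granted scopes from the "scope" claim (space-separated
// string) and the "scp" claim (string or array of strings).
// Assumption: both claims are merged into a single set.
func tokenScopes(claims map[string]any) map[string]bool {
	granted := make(map[string]bool)
	if s, ok := claims["scope"].(string); ok {
		for _, v := range strings.Fields(s) {
			granted[v] = true
		}
	}
	switch scp := claims["scp"].(type) {
	case string:
		for _, v := range strings.Fields(scp) {
			granted[v] = true
		}
	case []any: // JSON arrays decode to []any
		for _, v := range scp {
			if s, ok := v.(string); ok {
				granted[s] = true
			}
		}
	}
	return granted
}

// missingScopes returns every required scope the token does not carry.
func missingScopes(required []string, claims map[string]any) []string {
	granted := tokenScopes(claims)
	var missing []string
	for _, r := range required {
		if !granted[r] {
			missing = append(missing, r)
		}
	}
	return missing
}

// resolveClientID prefers the client_id claim and falls back to azp.
func resolveClientID(claims map[string]any) string {
	if id, ok := claims["client_id"].(string); ok && id != "" {
		return id
	}
	if azp, ok := claims["azp"].(string); ok {
		return azp
	}
	return ""
}

func main() {
	// Hypothetical decoded claims from a Keycloak-style token.
	claims := map[string]any{"scp": []any{"mcp:read"}, "azp": "keycloak-client"}
	fmt.Println(missingScopes([]string{"mcp:read", "mcp:write"}, claims))
	fmt.Println(resolveClientID(claims))
}
```

A non-empty result from missingScopes corresponds to the insufficient_scope rejection described in the scope enforcement step.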
Verification errors are matched against sentinel values with errors.Is and mapped as follows:
| errors.Is sentinel | Audit auth_result | WWW-Authenticate error |
|---|---|---|
| ErrExpired | expired | invalid_token |
| ErrAudience | bad_audience | invalid_token |
| ErrIssuer | bad_issuer | invalid_token |
| ErrInvalidToken | invalid | invalid_token |
| ErrInsufficient | insufficient_scope | insufficient_scope |
| ErrNoToken | missing | (no error param) |
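The table above maps naturally onto an errors.Is switch. The following is a sketch only: the sentinel names come from the table, but their definitions, the classify function, and the default branch for unrecognised errors are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel names are taken from the table; the messages are placeholders.
var (
	ErrExpired      = errors.New("token expired")
	ErrAudience     = errors.New("audience mismatch")
	ErrIssuer       = errors.New("issuer mismatch")
	ErrInvalidToken = errors.New("invalid token")
	ErrInsufficient = errors.New("insufficient scope")
	ErrNoToken      = errors.New("no token presented")
)

// classify returns the audit auth_result and the WWW-Authenticate error code.
// An empty wwwAuthError means the challenge carries no error parameter.
func classify(err error) (auditResult, wwwAuthError string) {
	switch {
	case errors.Is(err, ErrExpired):
		return "expired", "invalid_token"
	case errors.Is(err, ErrAudience):
		return "bad_audience", "invalid_token"
	case errors.Is(err, ErrIssuer):
		return "bad_issuer", "invalid_token"
	case errors.Is(err, ErrInsufficient):
		return "insufficient_scope", "insufficient_scope"
	case errors.Is(err, ErrNoToken):
		return "missing", ""
	case errors.Is(err, ErrInvalidToken):
		return "invalid", "invalid_token"
	default:
		// Assumption: anything unrecognised is treated as an invalid token.
		return "invalid", "invalid_token"
	}
}

func main() {
	// errors.Is sees through wrapping, so callers can add context freely.
	wrapped := fmt.Errorf("verify: %w", ErrExpired)
	audit, code := classify(wrapped)
	fmt.Println(audit, code)
}
```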
Discovery (RFC 9728)
When auth.oauth.enabled is true, mcpgw registers a GET handler at
/.well-known/oauth-protected-resource (overridable via metadata_path). The
document it serves:
{
"resource": "https://gw.acme.com/mcp",
"authorization_servers": ["https://idp.acme.com/"],
"bearer_methods_supported": ["header"],
"scopes_supported": ["mcp:read"]
}
scopes_supported is omitted when required_scopes is empty.
resource is derived from auth.oauth.public_url when set; otherwise from
listen + /mcp. The document is served without authentication and cached by
clients for up to 1 hour (Cache-Control: public, max-age=3600).
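A handler that serves a document of this shape might look like the sketch below. The struct and function names (resourceMetadata, newMetadataHandler, serveOnce) are hypothetical; only the JSON field names, the omission of scopes_supported when empty, and the Cache-Control value come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// resourceMetadata mirrors the RFC 9728 fields shown in the example document.
// omitempty drops scopes_supported when no scopes are required.
type resourceMetadata struct {
	Resource               string   `json:"resource"`
	AuthorizationServers   []string `json:"authorization_servers"`
	BearerMethodsSupported []string `json:"bearer_methods_supported"`
	ScopesSupported        []string `json:"scopes_supported,omitempty"`
}

// newMetadataHandler marshals the document once and serves it unauthenticated.
func newMetadataHandler(doc resourceMetadata) (http.Handler, error) {
	body, err := json.Marshal(doc)
	if err != nil {
		return nil, err
	}
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		w.Header().Set("Cache-Control", "public, max-age=3600")
		_, _ = w.Write(body)
	}), nil
}

// serveOnce runs a single in-memory request against the handler.
func serveOnce(h http.Handler, method string) *httptest.ResponseRecorder {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(method, "/.well-known/oauth-protected-resource", nil)
	h.ServeHTTP(rec, req)
	return rec
}

func main() {
	h, err := newMetadataHandler(resourceMetadata{
		Resource:               "https://gw.acme.com/mcp",
		AuthorizationServers:   []string{"https://idp.acme.com/"},
		BearerMethodsSupported: []string{"header"},
		ScopesSupported:        []string{"mcp:read"},
	})
	if err != nil {
		panic(err)
	}
	rec := serveOnce(h, http.MethodGet)
	fmt.Println(rec.Code, rec.Header().Get("Cache-Control"))
	fmt.Println(rec.Body.String())
}
```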
When a request fails authentication, mcpgw’s WWW-Authenticate response header
points at this document:
WWW-Authenticate: Bearer realm="mcpgw",
resource_metadata="https://gw.acme.com/.well-known/oauth-protected-resource",
error="invalid_token"
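A challenge in that shape could be assembled as in the sketch below. The challenge function is hypothetical and is not mcpgw's API. It relies on Go's %q quoting, which matches an HTTP quoted-string only for plain ASCII values without quotes or backslashes, as is the case for the values shown here.

```go
package main

import (
	"fmt"
	"strings"
)

// challenge builds a Bearer WWW-Authenticate value. The error parameter is
// left out when errCode is empty (the "missing token" case).
func challenge(realm, metadataURL, errCode string) string {
	params := []string{
		fmt.Sprintf("realm=%q", realm),
		fmt.Sprintf("resource_metadata=%q", metadataURL),
	}
	if errCode != "" {
		params = append(params, fmt.Sprintf("error=%q", errCode))
	}
	return "Bearer " + strings.Join(params, ", ")
}

func main() {
	fmt.Println(challenge(
		"mcpgw",
		"https://gw.acme.com/.well-known/oauth-protected-resource",
		"invalid_token",
	))
}
```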
A fresh client that has never seen this gateway can read resource_metadata,
fetch the document, find the AS URL under authorization_servers, and
start the authorization flow — without any out-of-band configuration.
The resource_metadata parameter in the challenge is the mechanism described
in Anthropic, “Building Agents That Reach Production Systems with MCP”
as “Standardized OAuth with CIMD.” It reduces surprise re-auth prompts on
first-time flows.
Why mcpgw is RS-only
mcpgw intentionally does not act as an Authorization Server. The reasoning:
- Operators already have an AS. Every company running Auth0, Okta, Entra ID, or Keycloak already has token issuance, refresh-token storage, client credential management, and MFA policies. Duplicating that inside mcpgw would require operators to manage two credential stores.
- Issuing tokens means storing secrets. Client secrets, private keys for signing — these require secure storage, rotation tooling, and audit trails that belong in dedicated identity infrastructure, not in a transparent proxy.
- RS-only keeps mcpgw deployable anywhere. Adding OAuth to an existing mcpgw deployment that fronts any MCP server requires only config changes. No changes to the upstream, no new infrastructure — just point mcpgw at your existing AS.
Coexistence with API keys
Both mechanisms can be active simultaneously. The dispatch key is token shape, not configuration order:
- Three dot-separated base64url segments, ≥ 100 characters total → OAuth verifier
- Everything else → API key store
This is useful for migrations: existing API-key clients keep working while new OAuth clients are onboarded. There is no config knob for precedence; shape is deterministic.
One edge case: an opaque API key that happens to look like a compact JWS (three
dot-separated base64url segments, ≥ 100 chars) will be routed to the OAuth
verifier and rejected. In practice, mcpgw-generated keys (mcpg_live_...) do
not match this shape.
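The shape test can be approximated in a few lines. This is a sketch under the rules stated above, not the real looksLikeJWT: the body is an assumption, and rejecting empty segments in particular is a guess about the implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// looksLikeJWT reports whether tok has the shape of a compact JWS:
// at least 100 characters, only [A-Za-z0-9._-], exactly three segments.
func looksLikeJWT(tok string) bool {
	if len(tok) < 100 {
		return false
	}
	for i := 0; i < len(tok); i++ {
		c := tok[i]
		switch {
		case c >= 'A' && c <= 'Z', c >= 'a' && c <= 'z', c >= '0' && c <= '9':
		case c == '.', c == '_', c == '-':
		default:
			return false
		}
	}
	parts := strings.Split(tok, ".")
	if len(parts) != 3 {
		return false
	}
	for _, p := range parts {
		if p == "" { // assumption: empty segments do not count as JWT-shaped
			return false
		}
	}
	return true
}

func main() {
	seg := strings.Repeat("a", 40)
	fmt.Println(looksLikeJWT(seg + "." + seg + "." + seg)) // JWT-shaped: OAuth verifier
	fmt.Println(looksLikeJWT("mcpg_live_abc123"))          // opaque key: API key store
}
```

Because the check looks only at shape, a JWT-shaped string that is not a valid token still goes to the OAuth verifier and fails there, which is the edge case described above.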
Hot reload
The OAuth verifier is held in an atomic.Pointer[oauth.Verifier]. On SIGHUP:
- mcpgw re-reads and validates mcpgw.yaml.
- If auth.oauth.enabled is true, a new Verifier is constructed, which primes the JWKS cache with one synchronous fetch.
- If the prime succeeds, SwapOAuthVerifier stores the new pointer atomically.
- If the prime fails (JWKS endpoint unreachable), the reload is aborted with slog.Error("oauth reload failed") and the existing verifier stays in place.
In-flight requests capture the verifier pointer at request entry and complete with the old verifier. New requests after the swap use the new one. There is no window where a request sees a half-loaded verifier.
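The swap-on-success pattern can be illustrated with sync/atomic (Go 1.19 or later for atomic.Pointer). The Verifier and Gateway types and the reload and current methods are stand-ins invented for this sketch; only the use of an atomic pointer and the rule that a failed build leaves the old verifier in place come from the description above.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// Verifier stands in for oauth.Verifier; the real type holds the JWKS cache.
type Verifier struct{ Issuer string }

// errJWKSUnreachable simulates a failed synchronous JWKS prime.
var errJWKSUnreachable = errors.New("jwks endpoint unreachable")

type Gateway struct {
	verifier atomic.Pointer[Verifier]
}

// reload constructs a new verifier and publishes it only if construction
// succeeded; on failure the current pointer is left untouched.
func (g *Gateway) reload(build func() (*Verifier, error)) error {
	v, err := build()
	if err != nil {
		return fmt.Errorf("oauth reload failed: %w", err)
	}
	g.verifier.Store(v)
	return nil
}

// current is what a request handler would call once, at request entry.
func (g *Gateway) current() *Verifier { return g.verifier.Load() }

func main() {
	g := &Gateway{}
	_ = g.reload(func() (*Verifier, error) {
		return &Verifier{Issuer: "https://idp.acme.com/"}, nil
	})
	inFlight := g.current() // captured by an in-flight request

	err := g.reload(func() (*Verifier, error) { return nil, errJWKSUnreachable })
	fmt.Println(err)
	fmt.Println(g.current() == inFlight) // old verifier still in place
}
```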
The JWKS cache refresh goroutine is bound to the verifier's context (the
process's run context on startup, or the reload invocation's context on
reload). When a verifier is replaced, its context is not cancelled, so the
goroutine keeps running until the context that was passed to NewVerifier
is done. In normal operation that is when the process shuts down.
public_url and metadata_path configure HTTP mux routes registered at
startup. Mux routes cannot be changed at runtime; changes to these two fields
require a full restart.
Failure modes operators should know
JWKS endpoint unreachable at startup
oauth.NewVerifier performs one synchronous JWKS fetch (“prime”) before
returning. If this fails, main logs slog.Error("oauth setup") and exits
with code 78. This is intentional fail-fast behaviour: a gateway that cannot
verify tokens must not serve traffic. The discovery document cannot be served
either, since the mux was never registered.
JWKS rotation
When the AS rotates keys, it publishes a new JWK with a new kid and (ideally)
keeps the old key live for a grace period. mcpgw’s cache refreshes at
jwks_cache_ttl intervals (default 5m). Tokens signed with the new key before
the next cache refresh fail with invalid_token. The window is bounded by
jwks_cache_ttl — operators with strict rotation requirements can set this to
1m or shorter, at the cost of more traffic to the JWKS endpoint.
Clock skew
leeway (default 30s) expands the exp and nbf acceptance windows in both
directions. A 30s leeway means a token 30 seconds past exp is still accepted.
Larger values improve tolerance for clock drift between AS and gateway; smaller
values tighten expiry guarantees. 30s is a safe default for deployments where
the AS and mcpgw share NTP synchronisation.
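The effect of leeway on the acceptance window can be made concrete with a small sketch. The withinValidity function and the at helper are hypothetical, and treating the boundaries as inclusive is an assumption; in mcpgw the check is delegated to the JWT library.

```go
package main

import (
	"fmt"
	"time"
)

// defaultLeeway matches the documented default of 30s.
const defaultLeeway = 30 * time.Second

// at converts Unix seconds (the NumericDate form used by exp and nbf)
// into a time.Time.
func at(sec int64) time.Time { return time.Unix(sec, 0) }

// withinValidity reports whether now falls inside [nbf-leeway, exp+leeway].
func withinValidity(now, nbf, exp time.Time, leeway time.Duration) bool {
	if now.After(exp.Add(leeway)) {
		return false // expired even after allowing for skew
	}
	if now.Before(nbf.Add(-leeway)) {
		return false // not yet valid even after allowing for skew
	}
	return true
}

func main() {
	nbf, exp := at(1000), at(2000)
	fmt.Println(withinValidity(at(2030), nbf, exp, defaultLeeway)) // 30s past exp
	fmt.Println(withinValidity(at(2031), nbf, exp, defaultLeeway)) // 31s past exp
	fmt.Println(withinValidity(at(969), nbf, exp, defaultLeeway))  // 31s before nbf
}
```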
0.0.0.0 listen + no public_url
When listen is 0.0.0.0:PORT or :PORT and public_url is not set,
publicResourceURL constructs the metadata resource field as
http://0.0.0.0:PORT/mcp. mcpgw logs slog.Warn("oauth metadata advertises listen address")
at startup. The gateway continues running, but MCP clients fetching the discovery
document will see an unresolvable address. Production deployments must set
auth.oauth.public_url.
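The derivation and the wildcard warning could be sketched as follows. This is not the real publicResourceURL: the resourceURL function, its signature, and the rule that /mcp is appended to public_url are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// resourceURL derives the advertised resource URL. wildcard is true when the
// URL was built from a listen address that clients cannot resolve, which is
// the condition under which a startup warning would be logged.
func resourceURL(publicURL, listen string) (resource string, wildcard bool, err error) {
	if publicURL != "" {
		// Assumption: public_url is the external base URL without the /mcp path.
		return strings.TrimRight(publicURL, "/") + "/mcp", false, nil
	}
	host, port, err := net.SplitHostPort(listen)
	if err != nil {
		return "", false, err
	}
	if host == "" { // ":PORT" binds all interfaces
		host = "0.0.0.0"
	}
	wildcard = host == "0.0.0.0" || host == "::"
	return "http://" + net.JoinHostPort(host, port) + "/mcp", wildcard, nil
}

func main() {
	for _, listen := range []string{":8080", "0.0.0.0:8080", "127.0.0.1:8080"} {
		res, wild, err := resourceURL("", listen)
		fmt.Println(res, wild, err)
	}
	res, wild, err := resourceURL("https://gw.acme.com", "0.0.0.0:8080")
	fmt.Println(res, wild, err)
}
```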
Reload with unreachable JWKS
If a SIGHUP is received while the JWKS endpoint is down, the reload is
skipped and the existing verifier stays in place. The operator sees
slog.Error("oauth reload failed") in stderr/journald. This is the conservative
choice: an unreachable IdP during reload should not take the gateway down.
What this is not
- Not an Authorization Server. mcpgw never sees client secrets or issues tokens.
- Not a token-introspection endpoint. RFC 7662 introspection is an AS-side feature; mcpgw validates signatures locally using the cached JWKS, not via a network call per request.
- Not DPoP or mTLS-bound token support. Token binding (RFC 9449, RFC 8705) is not implemented in Phase 1.
- Not OpenID Connect aware. mcpgw treats tokens as OAuth access tokens per RFC 9068. ID tokens (id_token) are not parsed or forwarded. The sub claim is extracted for Identity.Subject but not surfaced in audit or spans in Phase 1 (planned for Phase 2 CIMD work).
Related reading
- RFC 9728 — OAuth 2.0 Protected Resource Metadata
- RFC 6750 — The OAuth 2.0 Authorization Framework: Bearer Token Usage
- RFC 8725 — JSON Web Token Best Current Practices
- RFC 9068 — JSON Web Token Profile for OAuth 2.0 Access Tokens
- Anthropic, “Building Agents That Reach Production Systems with MCP”
- How-to: Enable OAuth 2.1 authentication
- Configuration reference
- Architecture and request lifecycle