Security · 15 min read · May 2, 2026

NHI with Keycloak and AgentGuard: Complete Guide

Learn how to implement AI agent governance with Keycloak and AgentGuard. Covers lifecycle management, permission ceilings, and offboarding in under 60 seconds.

Rahul Mashere

KeycloakPro Team

The Non-Human Identity (NHI) Problem

Every organization now runs AI agents: Claude Code, GitHub Copilot, Cursor, and dozens more. These agents run on employee laptops and cloud servers with no central inventory, hold live API keys, and are never offboarded when employees leave.

Your security team faces a critical question: "Who are all the AI agents in our company, what can they access, and who owns them?"

According to GCP's 2024 Security Intelligence Report, 47% of credential leaks come from unmanaged agent API keys. Most organizations have zero visibility into agent inventory, let alone governance.

Why Does Non-Human Identity Matter for Your Security Team?

Four concrete risks your security team faces:

  1. Shadow AI sprawl — Developers install agents without IT approval. Security has no inventory. Agents can exfiltrate source code or access production credentials.

  2. Credential rot — Long-lived API keys (Anthropic, OpenAI, GitHub tokens) keep working indefinitely. When employees leave, keys continue accessing company systems for months.

  3. Blast radius exposure — An agent with access to prod-secrets in git can reach your entire infrastructure. No organization maps this today.

  4. Compliance gaps — Auditors ask: "Who owns all AI systems? How do you control them?" Organizations have no documented answer.

What Is Non-Human Identity and How Does It Work?

Non-Human Identity is a managed identity in your identity provider (Keycloak, Okta, Azure AD) representing an AI agent—like a service account for agents.

Unlike unmanaged API keys:

  • NHI has a named owner (the employee running the agent)
  • Identity is rotatable and revocable without code changes
  • Permissions are bounded by the owner's role
  • Lifecycle is traceable (discovered, identified, classified, governed, offboarded)
  • Policies enforce what tools agents can use

In practice, when engineer Alice leaves your company, her agent's managed identity is disabled and its API calls start failing within seconds. There are no long-lived secrets to rotate manually.

Why Use Keycloak as the Identity Provider for AI Agents?

Keycloak is an open-source identity and access management server. For NHI, it provides:

  1. Service account provisioning — Automatically create Keycloak accounts when agents are discovered (AgentGuard provisioning)
  2. Credential management — Issue, rotate, and revoke JWT tokens
  3. Role-based access control — Assign roles to agents
  4. Multi-tenancy — Separate agent identities per organization via AgentGuard tenant scoping
  5. Audit trail — Log every identity change and credential rotation
  6. OIDC/OAuth 2.0 — Standard authentication protocols for agent authentication

How Does AgentGuard Enforce Custom Policy Governance?

Policies answer: "Is this agent allowed to do this action?"

You can't use blanket roles like agent:full-access. You need granular policies:

  • Claude Code can read code repos, but not write to main branch
  • GitHub Copilot can call OpenAI, but not Anthropic
  • Agents can only access resources their owner can access

AgentGuard with Open Policy Agent (OPA) and Rego provides the industry-standard enforcement layer for this. See NIST SP 800-207 (Zero Trust Architecture) for the underlying zero-trust principles.

Permission Ceiling: Agent ≤ Owner

The core concept: an agent's permissions never exceed its owner's permissions.

Example Rego rule:

package agentguard.authorization

import future.keywords.in

# Deny any access the agent's owner could not perform directly.
deny[msg] {
  input.action == "access_resource"
  owner_perms := data.owners[input.owner_id].permissions

  # Fires when the requested resource is absent from the owner's permissions.
  not input.resource in owner_perms

  msg := sprintf(
    "Agent cannot access %s: owner lacks permission",
    [input.resource]
  )
}

Real scenario: Alice is a junior developer with permissions to:

  • Read code repos
  • Deploy to staging only
  • NOT deploy to production
  • NOT access customer databases

Alice's Claude Code agent inherits exactly these permissions. If Alice gets promoted (gains prod deploy access), her agent's permissions auto-cascade within 60 seconds.

If Alice leaves the company, all her agents lose all permissions within seconds of her account being disabled.
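The ceiling can be read as a set intersection: an agent's effective permissions are whatever it requests, clipped to what its owner currently holds. The following is a minimal Python sketch of that idea, assuming permissions are plain strings. The function `effective_permissions` and the permission names are hypothetical and are not part of AgentGuard's API.

```python
def effective_permissions(requested: set[str], owner: set[str]) -> set[str]:
    """Clip an agent's requested permissions to its owner's current set."""
    return requested & owner


# Hypothetical permission names for the Alice scenario above.
alice = {"repo:read", "deploy:staging"}
agent_requested = {"repo:read", "deploy:staging", "deploy:prod"}

before_promotion = effective_permissions(agent_requested, alice)

# Promotion: Alice gains production deploy, and the agent's ceiling rises with her.
alice_promoted = alice | {"deploy:prod"}
after_promotion = effective_permissions(agent_requested, alice_promoted)

# Offboarding: an owner with no permissions leaves the agent with none.
after_offboarding = effective_permissions(agent_requested, set())
```

Because the result is recomputed from the owner's current set, promotion and offboarding both cascade without touching the agent's own configuration.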

What Are the Six Stages of an Agent's Lifecycle?

Every agent moves through a defined state machine:

  • DISCOVERED — Agent detected on endpoint, owner unknown
  • IDENTIFIED — Owner assigned (e.g., alice@company.com)
  • CLASSIFIED — Risk score assigned, Keycloak service account created
  • GOVERNED — Agent actively managed, policies enforced
  • ORPHANED — Owner departed or agent missing 30+ days
  • DECOMMISSIONED — Terminal state, all credentials revoked
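One way to make these stages enforceable is an explicit transition table. The sketch below is in Python and uses the six stage names from the list. The set of allowed transitions is an assumption made for illustration, since the list names the stages but does not spell out every edge.

```python
from enum import Enum


class Stage(Enum):
    DISCOVERED = "DISCOVERED"
    IDENTIFIED = "IDENTIFIED"
    CLASSIFIED = "CLASSIFIED"
    GOVERNED = "GOVERNED"
    ORPHANED = "ORPHANED"
    DECOMMISSIONED = "DECOMMISSIONED"


# Assumed transition table: the happy path runs top to bottom, any live stage
# can become ORPHANED, and an orphan can be re-assigned or decommissioned.
ALLOWED = {
    Stage.DISCOVERED: {Stage.IDENTIFIED, Stage.ORPHANED},
    Stage.IDENTIFIED: {Stage.CLASSIFIED, Stage.ORPHANED},
    Stage.CLASSIFIED: {Stage.GOVERNED, Stage.ORPHANED},
    Stage.GOVERNED: {Stage.ORPHANED, Stage.DECOMMISSIONED},
    Stage.ORPHANED: {Stage.IDENTIFIED, Stage.DECOMMISSIONED},
    Stage.DECOMMISSIONED: set(),  # terminal state, no way out
}


def transition(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Keeping the table as data makes it easy to review alongside policies and to reject any attempt to revive a decommissioned agent.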

What Happens When an Employee Leaves: Real-World Offboarding Scenario

When Alice leaves the company on Friday:

  1. HR marks Alice as "terminated" → SCIM event fires
  2. Permission Sync receives event → Finds 3 agents owned by Alice
  3. Keycloak disables all 3 service accounts
  4. Alice's agents try to authenticate → Get 403 Forbidden
  5. All API calls fail immediately, because every request is checked against policy rather than trusting a cached token
  6. Security team receives alert with offboarding options
  7. Team confirms offboarding → Status moves to DECOMMISSIONED

Total time: credentials stop working within seconds of the SCIM event, and the full flow, including the security team's confirmation, completes in less than 5 minutes.

Contrast: a manual process would take 1-2 weeks to find all keys, notify teams, and rotate credentials.
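The automated part of this flow can be sketched as an event handler over an agent registry. The Python below is an in-memory illustration only: the `Agent` and `Registry` classes are hypothetical, and setting `enabled = False` stands in for disabling the Keycloak service account.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    agent_id: str
    owner_id: str
    enabled: bool = True
    status: str = "GOVERNED"


@dataclass
class Registry:
    agents: list[Agent] = field(default_factory=list)
    alerts: list[str] = field(default_factory=list)

    def on_owner_terminated(self, owner_id: str) -> list[Agent]:
        """Steps 2, 3, and 6: find the owner's agents, disable them, alert security."""
        owned = [a for a in self.agents if a.owner_id == owner_id]
        for agent in owned:
            agent.enabled = False      # stands in for disabling the service account
            agent.status = "ORPHANED"  # the owner has departed
        if owned:
            self.alerts.append(f"{owner_id}: {len(owned)} agent(s) disabled, confirm offboarding")
        return owned

    def confirm_offboarding(self, owner_id: str) -> None:
        """Step 7: after the team confirms, the agents reach the terminal state."""
        for agent in self.agents:
            if agent.owner_id == owner_id and not agent.enabled:
                agent.status = "DECOMMISSIONED"


registry = Registry(agents=[
    Agent("agent-1", "alice@company.com"),
    Agent("agent-2", "alice@company.com"),
    Agent("agent-3", "alice@company.com"),
    Agent("agent-4", "bob@company.com"),
])
disabled = registry.on_owner_terminated("alice@company.com")
registry.confirm_offboarding("alice@company.com")
```

Disabling happens as soon as the event arrives, while the move to DECOMMISSIONED waits for a human decision, which mirrors the split between the automated and confirmed steps in the scenario.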

How to Implement

Phase 1: Agent Discovery

Deploy the endpoint agent (the discovery scanner, not an AI agent) to laptops. It scans every 30 seconds for:

  • Running processes (Claude Code, Copilot, Cursor)
  • Config files
  • Credential patterns
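As a rough illustration of the process and credential checks, the Python sketch below matches process names and scans text for key-shaped strings. Everything in it is an assumption for the example: the process-name substrings, the two regular expressions (based on the `sk-ant-` and `ghp_` prefixes), and the function names. A production scanner would need a much fuller rule set.

```python
import re

# Assumed process-name fragments; real binary names vary by platform and version.
KNOWN_AGENT_PROCESSES = {
    "claude": "Claude Code",
    "copilot": "GitHub Copilot",
    "cursor": "Cursor",
}

# Illustrative patterns only: "sk-ant-" for Anthropic keys and "ghp_" for
# GitHub personal access tokens.
CREDENTIAL_PATTERNS = {
    "anthropic_api_key": re.compile(r"sk-ant-[A-Za-z0-9_-]{20,}"),
    "github_pat": re.compile(r"ghp_[A-Za-z0-9]{36}"),
}


def match_processes(process_names: list[str]) -> set[str]:
    """Map running process names to the agent products they indicate."""
    found = set()
    for name in process_names:
        for needle, product in KNOWN_AGENT_PROCESSES.items():
            if needle in name.lower():
                found.add(product)
    return found


def find_credentials(text: str) -> list[str]:
    """Return the kinds of credential found in text, never the secret itself."""
    return sorted(kind for kind, rx in CREDENTIAL_PATTERNS.items() if rx.search(text))
```

Reporting only the kind of credential keeps the scanner from copying secrets into its own logs.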

Phase 2: Agent Identification

Security team assigns owners:

curl -X POST "https://api.company.com/v1/agents/$AGENT_ID/owner" \
  -H "Authorization: Bearer $ADMIN_JWT" \
  -H "Content-Type: application/json" \
  -d '{"owner_id": "alice@company.com"}'

Phase 3: Classification & Keycloak Provisioning

Automatic classifier evaluates risk based on:

  • Agent type
  • Credentials held
  • Config file locations
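A classifier over these three inputs could be as simple as a weighted score. The Python sketch below is purely illustrative: the weights, thresholds, path hints, and the LOW/MEDIUM/HIGH labels are all placeholders invented for the example, not AgentGuard's actual scoring model.

```python
from dataclasses import dataclass


@dataclass
class AgentFacts:
    agent_type: str               # e.g. "claude-code"
    credential_kinds: list[str]   # kinds of credential the agent holds
    config_paths: list[str]       # where its config files were found


# All weights and thresholds below are placeholders chosen for the example.
TYPE_WEIGHT = {"claude-code": 30, "cursor": 20, "copilot": 10}
PER_CREDENTIAL = 20
SENSITIVE_PATH_HINTS = (".ssh", ".aws", "prod")


def risk_score(facts: AgentFacts) -> int:
    """Combine agent type, held credentials, and config locations into 0-100."""
    score = TYPE_WEIGHT.get(facts.agent_type, 25)  # unknown types get a cautious default
    score += PER_CREDENTIAL * len(facts.credential_kinds)
    if any(hint in path for path in facts.config_paths for hint in SENSITIVE_PATH_HINTS):
        score += 30
    return min(score, 100)


def classify(facts: AgentFacts) -> str:
    """Bucket the score into a hypothetical three-level label."""
    score = risk_score(facts)
    if score >= 70:
        return "HIGH"
    if score >= 40:
        return "MEDIUM"
    return "LOW"
```

A transparent additive score like this is easy to audit, which matters when the classification later feeds policy decisions.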

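The Keycloak service account is then created. Keycloak answers 409 Conflict if the clientId already exists, so provisioning stays idempotent as long as that response is treated as success:

curl -X POST https://keycloak.company.com/admin/realms/agentguard/clients \
  -H "Authorization: Bearer $KEYCLOAK_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "clientId": "sa-claude-code-macpro117",
    "publicClient": false,
    "serviceAccountsEnabled": true,
    "attributes": {
      "nhi_id": "nhi-abc123",
      "owner_id": "alice@company.com"
    }
  }'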
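For provisioning from code rather than the shell, the same call can be wrapped so that a repeated run is harmless. The Python sketch below uses only the standard library and assumes Keycloak replies 201 on creation and 409 when the clientId is already taken. The helper names `build_client_request` and `ensure_client` are hypothetical, and the `opener` parameter exists so the logic can be exercised without a live server.

```python
import json
import urllib.error
import urllib.request


def build_client_request(base_url: str, realm: str, token: str,
                         client: dict) -> urllib.request.Request:
    """Build the POST request that creates a client in the given realm."""
    return urllib.request.Request(
        url=f"{base_url}/admin/realms/{realm}/clients",
        data=json.dumps(client).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def ensure_client(request: urllib.request.Request,
                  opener=urllib.request.urlopen) -> str:
    """Create the client; report 'exists' instead of failing on 409 Conflict."""
    try:
        with opener(request) as response:
            return "created" if response.status == 201 else f"unexpected:{response.status}"
    except urllib.error.HTTPError as err:
        if err.code == 409:  # the clientId is already registered
            return "exists"
        raise  # any other failure is a real error and should surface
```

Treating only 409 as benign keeps authentication and validation failures visible instead of silently swallowing them.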

Phase 4: Policy Definition

Define policies in OPA Rego:

package agentguard.authorization

import future.keywords.in

# Block shell-style MCP tools for every agent not classified as ADMIN.
deny[msg] {
  input.action == "call_mcp_tool"
  input.tool_name in ["bash_shell", "system_execute"]

  agent_classification := data.agents[input.agent_id].classification
  agent_classification != "ADMIN"

  msg := sprintf("Tool %s not allowed for %s", [input.tool_name, input.agent_id])
}

Phase 5: Monitoring

Deploy API Gateway with policy evaluation:

  • Every request validated against policies
  • JWT tokens refreshed hourly
  • All decisions logged
  • Orphan agents auto-decommissioned after SLA (24 hours, 72 hours, or 7 days)
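The orphan rule in the last bullet reduces to two time comparisons. The Python sketch below takes the 30-day "missing" threshold from the lifecycle stages and leaves the SLA window as a parameter, since it is configurable. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MISSING_AFTER = timedelta(days=30)  # "missing 30+ days" from the lifecycle stages


def is_orphaned(owner_active: bool, last_seen: datetime, now: datetime) -> bool:
    """An agent is orphaned if its owner departed or it has been missing 30+ days."""
    return (not owner_active) or (now - last_seen >= MISSING_AFTER)


def should_decommission(orphaned_at: datetime, sla: timedelta, now: datetime) -> bool:
    """True once an orphan has outlived the configured SLA window."""
    return now - orphaned_at >= sla


now = datetime(2026, 5, 2, tzinfo=timezone.utc)
stale = is_orphaned(owner_active=True, last_seen=now - timedelta(days=31), now=now)
expired = should_decommission(now - timedelta(hours=25), timedelta(hours=24), now)
```

Passing `now` explicitly, rather than reading the clock inside the functions, keeps the rule deterministic and easy to test.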

What Are the Top 8 NHI and AgentGuard Best Practices?

  1. Policy as Code — Store Rego policies in git, reviewed like application code
  2. Least Privilege — Start restrictive by default, explicitly allow actions
  3. Short-Lived Credentials — Keycloak JWTs expire in 1 hour, and the agent refreshes them automatically
  4. Immutable Audit Trail — Log every policy decision with full context
  5. SLA on Orphans — Auto-decommission after configured period (24 hours, 72 hours, or 7 days)
  6. Blast Radius Scoring — Periodically assess agent risk and access chains
  7. Credential Rotation — Agents holding API keys should rotate monthly via HashiCorp Vault
  8. Policy Testing — Test OPA policies offline before deployment to production

How Does AgentGuard Meet Compliance Requirements?

  • SOC 2 Type II — Full audit trail per agent, all changes logged and exportable
  • ISO 27001 — Permission ceiling enforced, offboarding within 60s, access control verified
  • HIPAA — Access logs, role-based control, compliance-ready deployment
  • PCI DSS — Unique agent IDs, auto-lockout on violations, immutable audit trail
  • NIST SP 800-207 (Zero Trust) — Zero trust evaluation, continuous authentication and authorization, least privilege enforcement

Frequently Asked Questions About NHI and AgentGuard

What if an agent's token expires mid-request?

The agent receives 401 Unauthorized and refreshes its token automatically. If the owner was just offboarded, refresh fails and the agent stops operating. This is intentional.
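On the agent side, this behavior amounts to a small token cache with a single retry. The Python sketch below is one possible shape for it, with hypothetical names throughout: `fetch` stands in for the call to the token endpoint, `send` for the API request, and the 30-second early-refresh margin is an arbitrary choice.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches one access token and refetches it shortly before expiry."""

    def __init__(self, fetch: Callable[[], tuple[str, float]],
                 skew: float = 30.0,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch    # returns (token, lifetime_in_seconds)
        self._skew = skew      # refresh this many seconds early
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            self.refresh()
        return self._token

    def refresh(self) -> None:
        token, lifetime = self._fetch()  # raises if the identity was disabled
        self._token = token
        self._expires_at = self._clock() + lifetime


def call_with_refresh(cache: TokenCache, send: Callable[[str], int]) -> int:
    """Send once; on 401, refresh the token and retry a single time."""
    status = send(cache.get())
    if status == 401:
        cache.refresh()
        status = send(cache.get())
    return status
```

If the owner has been offboarded, `fetch` fails inside `refresh`, the exception propagates, and the agent stops, which matches the intended behavior described above.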

Can we support both unmanaged and managed keys?

Yes, in Phase 1. Flag unmanaged keys as high-risk in the dashboard. Gradually migrate to Keycloak-issued JWTs.

How does multi-tenancy work?

Each customer gets a separate Keycloak realm. Agent identities are scoped per realm. An agent in customer A cannot impersonate an agent in customer B.

What if Keycloak is down?

Deploy Keycloak as a cluster behind a load balancer. Agents cache tokens locally (1-hour TTL). Brief outages don't block agents.

Can we audit every policy decision?

Yes. Every OPA evaluation is logged with timestamp, agent ID, owner ID, action, and decision reason.
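A decision log entry with those five fields might be serialized as one JSON line per evaluation. The Python sketch below shows one possible layout. The field names and the `audit_record` helper are assumptions for illustration, not AgentGuard's actual log schema.

```python
import json
from datetime import datetime, timezone


def audit_record(agent_id: str, owner_id: str, action: str,
                 allowed: bool, reason: str, now: datetime) -> str:
    """Serialize one policy decision as a single JSON line."""
    return json.dumps({
        "timestamp": now.astimezone(timezone.utc).isoformat(),
        "agent_id": agent_id,
        "owner_id": owner_id,
        "action": action,
        "decision": "allow" if allowed else "deny",
        "reason": reason,
    }, sort_keys=True)


line = audit_record(
    "nhi-abc123", "alice@company.com", "call_mcp_tool",
    allowed=False, reason="Tool bash_shell not allowed",
    now=datetime(2026, 5, 2, 12, 0, tzinfo=timezone.utc),
)
```

One self-contained line per decision is straightforward to ship to an append-only store and to export for auditors.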

Summary

Non-Human Identity governance combines Keycloak, AgentGuard, and automated lifecycle management to deliver:

  • Visibility — Complete agent inventory and discovery
  • Control — Enforce MCP governance and what each agent can access
  • Compliance — Full audit trail for regulators (SOC 2, ISO 27001, HIPAA)
  • Speed — Offboarding in under 60 seconds with automated credential revocation
  • Scalability — Automatic permission cascading and multi-tenancy

AgentGuard implementation roadmap:

  • Phase 1 — Agent discovery and NHI registry
  • Phase 2 — MCP governance and policy enforcement
  • Phase 3 — Advanced features (cloud detection, anomaly detection, blast radius)
  • Phase 4 — Enterprise scale (HA, eBPF enforcement, public API)

Start with a pilot on 5-10 machines, measure offboarding time, then expand organization-wide.

Keycloak + AgentGuard + OPA = Modern AI Agent Security
