
Jan 23, 2026

What We Learned Building Safe Agents for Enterprise Security

Real lessons from running AI agents in production enterprise environments

Geng Sng, CTO

The Problem: Agents in High-Stakes Environments

An agent analyzes 11,470,000 vulnerabilities across your infrastructure and decides which to prioritize. It queries production databases, generates remediation plans, and creates tickets for your engineering teams.

This is high-stakes automation. A single mistake, such as querying the wrong tenant's data, leaking credentials in a ticket, or hallucinating a critical severity, has real consequences.

Over the past year at Cogent, we've learned that building safe agents isn't about better prompts. It's about infrastructure: isolation boundaries, credential management, and failure containment.

This post shares what we learned—the patterns that work, the ones that don't, and what we're still figuring out.

What We Actually Built

Before diving into lessons, here's what we're running in production:

The Agent Environment

Our agents run in an isolated execution environment (E2B sandboxes) with explicit boundaries:

  • Network isolation: Default-deny with allow-list for approved domains

  • Credential injection: AWS STS tokens via environment variables, never in prompts

  • Context scoping: Each agent gets only the data it needs for its task

  • Multi-tenant isolation: Storage-level separation (dedicated PostgreSQL, Redis per tenant)

  • Failure containment: Sandbox crashes don't affect the parent session

What This Enables

  • Code execution: Agents can run Python notebooks with direct Athena/S3 access

  • Tool calling: 15+ MCP servers (KB, CVE data, Splunk, Slack, AWS CLI)

  • Cross-tenant queries: Internal users can query any customer's data while maintaining audit trails

  • Safe exploration: Agents can experiment without risking production systems

Lesson 1: Network Isolation Is Non-Negotiable

The Failure Mode

Early on, we tried prompt-based safety: "Only query approved endpoints. Don't send data to external services."

This failed spectacularly. Agents would:

  • Attempt to call unapproved logging services

  • Try to fetch data from arbitrary URLs in tool responses

  • Follow redirects to domains outside our control

What We Built Instead

Default-deny network policy with explicit allow-list:

# Simplified example
SANDBOX_ALLOWED_DOMAINS = [
    "cogent.security",
    "*.cogent.security",
    "*.other.allowed.domains"
]

SANDBOX_NETWORK_CONFIG = {
    "deny_out": [ALL_TRAFFIC],  # Block everything by default
    "allow_out": SANDBOX_ALLOWED_DOMAINS,
}

Why This Works

  • Can't be bypassed: No prompt injection can override network policy

  • Explicit boundaries: Every domain must be justified and approved

  • Audit trail: Network attempts to blocked domains are logged

  • Easy to reason about: List of 10 domains vs. "don't do bad things"
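As a rough sketch of how an allow-list like this can be checked at egress (the helper and its wildcard handling below are illustrative, not our actual proxy code):

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

# Hypothetical allow-list mirroring the config above
SANDBOX_ALLOWED_DOMAINS = [
    "cogent.security",
    "*.cogent.security",
]

def is_domain_allowed(url: str, allowed: list = SANDBOX_ALLOWED_DOMAINS) -> bool:
    """Default-deny: a URL passes only if its host matches an allow-list entry."""
    host = urlparse(url).hostname or ""
    return any(fnmatch(host, pattern) for pattern in allowed)

assert is_domain_allowed("https://api.cogent.security/v1/scan")
assert not is_domain_allowed("https://evil.example.com/exfil")
```

Note that matching on the parsed hostname (rather than substring-matching the URL) is what makes lookalikes such as `cogent.security.evil.com` fail the check.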

The Tradeoff

Adding domains is friction. Every new integration requires:

  1. Security review of the domain

  2. Update to allow-list

  3. Deployment of updated sandbox config

This is intentional. Friction forces us to think critically about each external dependency.

Lesson 2: Never Put Credentials in Prompts

The Failure Mode

One trap we were careful to avoid: putting AWS credentials in the system prompt.

# DON'T DO THIS
system_prompt = f"""
You have access to AWS with these credentials:
AWS_ACCESS_KEY_ID: {aws_key}
AWS_SECRET_ACCESS_KEY: {aws_secret}
"""

Problems:

  • Credentials will appear in LLM logs and tracing

  • Agents sometimes echoed credentials in responses

  • No automatic rotation — credentials lived in prompts indefinitely

What We Built Instead

Inject credentials via environment variables:

# Simplified example
async def prepare_sandbox_env_vars(
    tenant_id: str,
    user_id: str,
    ...
) -> Dict[str, str]:
    """
    Inject credentials securely via environment variables.
    NEVER pass credentials in prompts or tool arguments.
    """
    # Fetch short-lived AWS STS tokens (N seconds TTL)
    credentials = await get_aws_sandbox_credentials(
        tenant_identifier=tenant_id,
        user_uuid=user_id,
    )

    env_vars = {
        "AWS_ACCESS_KEY_ID": credentials.aws_access_key,
        "AWS_SECRET_ACCESS_KEY": credentials.aws_secret_key,
        "AWS_SESSION_TOKEN": credentials.aws_session_token,  # Short-lived
        "AWS_REGION": credentials.aws_region,
        # Tenant-specific config
        "ATHENA_DATABASE": tenant_config.athena_database,
        "ATHENA_RESULT_BUCKET": tenant_config.athena_result_bucket,
    }
    return env_vars

Key properties:

  • Never logged: Credentials don't appear in LLM traces

  • Short-lived: STS tokens expire in 1 hour, refreshed at 30 minutes

  • Tenant-scoped: Each tenant gets their own role-assumed credentials

  • Automatic rotation: Token refresh happens without agent involvement
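A background refresh along these lines keeps the sandbox ahead of token expiry without agent involvement (a sketch: `sandbox.set_env_vars` and `fetch_credentials` are hypothetical stand-ins for the real sandbox API and STS call; the 30-minute interval comes from the bullets above):

```python
import asyncio

TOKEN_TTL_SECONDS = 3600     # STS tokens expire in 1 hour
REFRESH_AT_SECONDS = 1800    # refresh at the 30-minute mark

async def refresh_credentials_loop(sandbox, fetch_credentials, stop: asyncio.Event):
    """Re-inject fresh STS tokens before the current ones expire."""
    while not stop.is_set():
        creds = await fetch_credentials()
        await sandbox.set_env_vars({
            "AWS_ACCESS_KEY_ID": creds["aws_access_key"],
            "AWS_SECRET_ACCESS_KEY": creds["aws_secret_key"],
            "AWS_SESSION_TOKEN": creds["aws_session_token"],
        })
        # Wake early if the session ends; otherwise refresh on schedule
        try:
            await asyncio.wait_for(stop.wait(), timeout=REFRESH_AT_SECONDS)
        except asyncio.TimeoutError:
            pass  # timeout means it is simply time to refresh again
```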

Why This Works

Credentials are infrastructure, not context. They belong in the execution environment, not the prompt.

Lesson 3: Context Delegation Over Full Access

The Failure Mode

Another potential issue was giving agents access to all customer data:

# Early approach: Load everything
context = {
    "assets": load_all_assets(),  # 50,000+ assets
    "vulnerabilities": load_all_vulns(),  # 100,000+ vulns
    "tenants": load_all_tenants(),  # All customers
}

Problems:

  • Attention dilution: Agents get distracted by irrelevant data

  • Context window exhaustion: Can't fit all data in prompts

  • Accidental oversharing: Agents mix data from multiple tenants

  • Blast radius: One compromised tool sees everything

What We Built Instead

Scoped context views with explicit boundaries:

# Simplified example
async def prepare_sandbox_env_vars(
    tenant_id: str,  # Explicit tenant scope
    allowed_athena_tables: Optional[List[str]] = None,  # Explicit table access
    ...
) -> Dict[str, str]:
    env_vars = {
        "TENANT_ID": tenant_id,  # Single tenant only
        "ALLOWED_ATHENA_TABLES": (
            ",".join(allowed_athena_tables) if allowed_athena_tables else ""
        ),
        "USER_ID": user_id,
        "SESSION_ID": session_id,
    }
    return env_vars

In practice:

  • Agents only see data for one tenant at a time

  • Athena queries are scoped to allowed tables (e.g., tenant_acme.assets)

  • Environment variables enforce boundaries at runtime
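Inside the sandbox, a thin guard can check a requested table against the injected scope before any query runs. A sketch, assuming only the `ALLOWED_ATHENA_TABLES` env var from the example above; the guard itself is illustrative:

```python
import os

class TableAccessError(PermissionError):
    pass

def assert_table_allowed(table: str) -> None:
    """Raise unless `table` appears in the ALLOWED_ATHENA_TABLES env var."""
    allowed = {t for t in os.environ.get("ALLOWED_ATHENA_TABLES", "").split(",") if t}
    if table not in allowed:
        raise TableAccessError(f"table {table!r} is outside this sandbox's scope")

# Example: a sandbox scoped to one tenant's tables
os.environ["ALLOWED_ATHENA_TABLES"] = "tenant_acme.assets,tenant_acme.vulns"
assert_table_allowed("tenant_acme.assets")        # in scope: passes
try:
    assert_table_allowed("tenant_globex.assets")  # different tenant: rejected
except TableAccessError:
    pass
```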

Why This Works

  • Reduced context: Agents focus on relevant data only

  • Prevents cross-tenant leakage: Can't accidentally query wrong tenant

  • Audit trail: Every query includes tenant_id for compliance

  • Failure containment: Compromise affects one tenant, not all

The Tradeoff

Cross-tenant analysis requires special handling. For example:

  • "Compare vulnerability trends across all customers" requires aggregated views

  • We use pre-computed metrics in ClickHouse (historical data warehouse)

  • Agents query aggregates, not raw tenant data

Lesson 4: Failure Isolation Saves You

The Failure Mode

Early sandboxes shared state across tool calls:

# DON'T DO THIS
sandbox = create_sandbox()  # Created once per session
for tool_call in agent_calls:
    result = sandbox.execute(tool_call)  # Reuses sandbox

Problems:

  • One bad tool call poisoned the sandbox for all subsequent calls

  • Memory leaks accumulated across calls

  • Credential expiry killed the entire session

What We Built Instead

Isolated execution per tool invocation with failure containment:

# Simplified example
try:
    async for event in agent.process():
        yield event  # Stream results to frontend
except asyncio.CancelledError:
    # Sandbox crash doesn't kill parent session
    logger.error("Sandbox execution cancelled")
    await chat_context.set_error("Chat cancelled by user")
    await chat_context.mark_completed()
    # Session persists, user can continue
except Exception as e:
    # Tool failure is contained
    logger.error(f"Tool execution failed: {e}")
    # Return error to agent, let it recover

Key patterns:

  • Shielded persistence: asyncio.shield() prevents cancellation from interrupting writes

  • Timeout enforcement: 5-minute sandbox timeout, 30-second tool timeout

  • Graceful degradation: One tool failure doesn't break the conversation

  • Execution history preserved: All tool calls logged even if they fail
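The timeout and shielding patterns above combine roughly like this (the timeouts come from the text; `tool_coro` and `persist_coro` are placeholder names for the tool invocation and the history write):

```python
import asyncio

SANDBOX_TIMEOUT = 300   # 5-minute sandbox timeout
TOOL_TIMEOUT = 30       # 30-second tool timeout

async def run_tool(tool_coro, persist_coro, timeout: float = TOOL_TIMEOUT):
    """Run one tool call with a hard timeout; always persist its record."""
    try:
        result = await asyncio.wait_for(tool_coro, timeout=timeout)
        record = {"status": "ok", "result": result}
    except asyncio.TimeoutError:
        record = {"status": "timeout", "result": None}  # failure is contained
    # shield() so a cancellation mid-write can't corrupt the execution history
    await asyncio.shield(persist_coro(record))
    return record
```

The key property is that the persistence step runs in both branches, so the execution history records failed calls as well as successful ones.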

Why This Works

Production environments are hostile:

  • Network can fail mid-request

  • Credentials can expire

  • Tools can return malformed data

  • Users can cancel requests

Isolation means failures are local, not catastrophic.

Lesson 5: Multi-Tenancy Is Storage-Level Isolation

The Failure Mode

Logical multi-tenancy (one database, tenant_id column):

-- DON'T DO THIS
SELECT * FROM vulnerabilities WHERE tenant_id = 'acme'


Problems:

  • One mistake = data breach: Forgot WHERE clause? You just leaked all tenants' data.

  • Performance interference: Heavy query from one tenant slows all tenants

  • Compliance complexity: Hard to prove data isolation to auditors

What We Built Instead

Physical multi-tenancy: Each tenant gets their own infrastructure

  • PostgreSQL: Dedicated Aurora cluster per tenant

  • Redis: Dedicated ElastiCache cluster per tenant

  • S3 buckets: Tenant-prefixed buckets (tenant-acme-artifacts/)

  • IAM roles: Tenant-specific role assumption

How agents access tenant data:

# Configuration is fetched per tenant at runtime
tenant_config = await get_tenant_configuration(tenant_identifier="acme")

# All database queries go to tenant-specific endpoint
db_connection = await get_tenant_database(tenant_id="acme")
# Points to: acme-production.cluster-xyz.us-east-1.rds.amazonaws.com

# All S3 operations use tenant-specific credentials
s3_client = await get_tenant_s3_client(tenant_id="acme")
# Has access to: s3://tenant-acme-artifacts/ only

Why This Works

  • Impossible to leak cross-tenant data: Can't query another tenant's database

  • Performance isolation: One tenant's heavy query doesn't affect others

  • Compliance proof: Auditors can verify physical isolation

  • Blast radius containment: Incident affects one tenant, not all

The Tradeoff

Operational complexity increases:

  • Deploying schema changes requires per-tenant migrations

  • Monitoring requires per-tenant dashboards

  • Costs are higher (can't share infrastructure)

But for enterprise security data, this is the only acceptable architecture.

Lesson 6: Observability Must Be Tenant-Aware

The Failure Mode

We were careful not to let LLM tracing ship everything to shared observability services:

# DON'T DO THIS
langsmith_client.log(
    prompt=system_prompt,  # Contains customer data!
    response=agent_response,  # Contains query results!
)

Problems:

  • Customer data leaked to third-party services (Braintrust, LangSmith)

  • No tenant isolation in traces

  • Hard to debug tenant-specific issues

What We Built Instead

Tenant-scoped tracing with PII filtering:

# Simplified example
async def log_llm_call(
    tenant_id: str,
    prompt: str,
    response: str,
    metadata: Dict,
):
    # Apply PII redaction before logging
    safe_prompt = redact_credentials(prompt)
    safe_response = redact_credentials(response)

    # Tag with tenant for filtering
    trace = {
        "tenant_id": tenant_id,
        "prompt": safe_prompt,
        "response": safe_response,
        "metadata": metadata,
    }

    # Send to tenant-scoped storage (if enabled)
    if tenant_config.enable_external_tracing:
        await external_tracer.log(trace)

    # Always send to internal audit log
    await internal_audit_log.record(trace)

Key patterns:

  • ✅ Credential redaction before any external logging

  • ✅ Tenant-scoped trace storage

  • ✅ Opt-in for external services (LangSmith, Braintrust)

  • ✅ Internal audit log always captures full context
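The `redact_credentials` step above can be approximated with pattern-based scrubbing. A sketch only: real AWS credential formats vary, and the patterns here cover just the common shapes (AKIA/ASIA access key IDs and values assigned to secret/session variables):

```python
import re

_PATTERNS = [
    re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),            # access key IDs
    re.compile(r"(AWS_SECRET_ACCESS_KEY\s*[:=]\s*)\S+"),     # secret values
    re.compile(r"(AWS_SESSION_TOKEN\s*[:=]\s*)\S+"),         # session tokens
]

def redact_credentials(text: str) -> str:
    """Replace credential-shaped substrings before text leaves our boundary."""
    for pattern in _PATTERNS:
        # Keep the variable name (group 1) when there is one, drop the value
        text = pattern.sub(
            lambda m: (m.group(1) if m.groups() else "") + "[REDACTED]", text
        )
    return text

assert redact_credentials("AWS_SESSION_TOKEN=abc.def") == "AWS_SESSION_TOKEN=[REDACTED]"
```

In practice this kind of redaction is a backstop: the environment-variable injection from Lesson 2 is what keeps credentials out of prompts and traces in the first place.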

Why This Works

You can't debug what you can't observe. But observation can't compromise security.

The balance: Internal audit logs (full fidelity) + External traces (redacted) + Tenant control (opt-in/out).

What We're Still Figuring Out

1. Approval UX for High-Risk Actions

The problem: Some agent actions require human approval:

  • Changing production firewall rules

  • Deleting customer data

  • Modifying IAM policies

Current approach: Agents pause and ask for approval via Slack.

Open questions:

  • How do we avoid approval fatigue?

  • Should approvals be sync (blocking) or async?

  • Who should approve?

2. Context Window Management

The problem: Agents work on long-running tasks (analyzing 10,000 vulns).

Current approach:

  • Compaction: Summarization at context window limits

  • Chunking large datasets

  • Delegation to sub-agents with scoped context
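For the chunking bullet, the shape is simple even though the tuning isn't. A sketch, with an arbitrary chunk size:

```python
from typing import Iterator, List

def chunk(items: List[dict], size: int = 500) -> Iterator[List[dict]]:
    """Yield fixed-size slices so each sub-task fits comfortably in context."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# e.g. fan 10,000 vulns out to sub-agents 500 at a time,
# then merge the per-chunk summaries in the parent context
vulns = [{"id": i} for i in range(10_000)]
batches = list(chunk(vulns))
assert len(batches) == 20 and len(batches[0]) == 500
```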

Open questions:

  • When should we summarize vs. paginate?

  • How do we preserve important details in summaries?

  • What's the right granularity for sub-agent delegation?

3. Tool Access Control

The problem: Different agents need different tool access:

  • Read-only analyst agent (knowledge base queries only)

  • Action agent (can create tickets, run remediations)

  • Admin agent (full system access)

Current approach: Agents declare tools in config, auth middleware validates.
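A minimal version of that middleware check might look like this (illustrative only: the roles and tool names are made up for the example):

```python
# Hypothetical tool grants per agent role, consulted by middleware
# before any tool call is dispatched
TOOL_GRANTS = {
    "analyst": {"kb_search", "cve_lookup"},                  # read-only
    "action": {"kb_search", "cve_lookup", "create_ticket"},  # can act
}

def authorize_tool_call(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and ungranted tools are both rejected."""
    return tool in TOOL_GRANTS.get(role, set())

assert authorize_tool_call("analyst", "kb_search")
assert not authorize_tool_call("analyst", "create_ticket")   # read-only agent
assert not authorize_tool_call("intern", "kb_search")        # unknown role
```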

Future work:

  • Policy-as-code for tool access decisions

  • Explicit allow/deny rules per tool

  • Audit trail for tool access attempts

This is where we're heading: a PDP/PEP gateway pattern in which each agent's code of conduct is enforceable at runtime.

4. Testing Agent Safety

The problem: How do you test that isolation actually works?

Current approach:

  • Adversarial test cases (try to leak credentials, query wrong tenant)

  • Boundary tests (try to access blocked domains)

  • Regression tests (verify redaction on known PII patterns)

Open questions:

  • How do we continuously test isolation boundaries?

  • Can we use LLMs to generate adversarial inputs?

  • What's the right balance between security and agent capabilities?

Key Takeaways

After a year of running agents in production, here's what we know for certain:

1. Infrastructure > Prompts

Safety must be enforced outside the agent:

  • Network boundaries (allow-lists)

  • Credential injection (env vars, not prompts)

  • Multi-tenancy (storage-level isolation)

Prompts can't provide hard guarantees. Infrastructure can.

2. Failure Isolation Is Critical

Production is hostile. Things will fail:

  • Tool timeouts

  • Network errors

  • Credential expiry

  • Malformed data

Isolation means failures are local, not catastrophic.

3. Observability ≠ Surveillance

You need to observe agents to debug them. But observation can't compromise security:

  • Redact credentials before logging

  • Tenant-scoped traces

  • Opt-in for external services

4. Multi-Tenancy Is Non-Negotiable

For enterprise security data, logical isolation (tenant_id column) isn't enough. You need physical isolation:

  • Dedicated databases per tenant

  • Tenant-specific credentials

  • Storage-level separation

One SQL mistake can't leak all customers' data.

5. Context Scoping Reduces Risk

Agents with access to everything are dangerous. Give them:

  • Only the tenant they need

  • Only the tables they need

  • Only the time window they need

Least-context access reduces blast radius.

What's Next

We're building toward more explicit tool access control:

The vision: Every tool call evaluated against policy (not just authentication). Policy-as-code determines:

  • Is this tool allowed for this agent?

  • Does this query match the tenant scope?

  • Should this action require approval?

This is hard. Policy can become unmanageable ("policy sprawl"). But the alternative—vibes-based access control—doesn't scale for enterprise security.

We'll share more as we build it.

Conclusion

Building safe agents for high-stakes environments isn't about prompt engineering. It's about infrastructure:

  • Network isolation (deny by default)

  • Credential management (env vars, not prompts)

  • Failure containment (sandbox crashes don't propagate)

  • Multi-tenancy (storage-level isolation)

  • Context scoping (least-context access)

These patterns work. They're running in production today at Cogent, handling millions of vulnerabilities across dozens of customers.

If you're building agents for enterprise environments, start here.

Join Us

If you're excited about building secure autonomous systems that can operate in mission-critical environments with provable, controlled autonomy—we're hiring.

We're looking for engineers who:

  • Care about both performance and correctness

  • Embrace incremental improvement over perfect designs

  • Want to work at the intersection of cybersecurity and agentic systems

Check out our careers page or reach out directly.