MCP server that analyzes codebases with tree-sitter and generates AGENTS.md files for AI coding agents.
Compatible with any MCP-capable client: Claude Code, Gemini CLI, Cursor, Windsurf, and others.
How it works: The server does all the heavy lifting locally — AST parsing, incremental change detection, environment variable scanning, entry point detection. It writes a compact structured payload to disk and returns step-by-step instructions to your AI client. The client reads the payload and writes AGENTS.md. No large data travels over the MCP wire.
Python · C# · TypeScript · JavaScript · Go
See INSTALLATION.md for the full guide including prerequisites and troubleshooting.
Requirements: Python 3.11+, uv, Git, and any MCP-compatible client.
claude mcp add agents-md uvx agents-md-generator
Or add it manually to ~/.claude.json (Linux/macOS) or %USERPROFILE%\.claude.json (Windows):
{
  "mcpServers": {
    "agents-md": {
      "command": "uvx",
      "args": ["agents-md-generator"]
    }
  }
}
Add it to ~/.gemini/settings.json:
{
  "mcpServers": {
    "agents-md": {
      "command": "uvx",
      "args": ["agents-md-generator"]
    }
  }
}
The server uses stdio transport. Add this entry to your client's MCP config under mcpServers:
"agents-md": {
  "command": "uvx",
  "args": ["agents-md-generator"]
}
Restart your client; uvx downloads the package automatically on first run.
Once registered, ask your AI client:
"Generate the AGENTS.md for this project"
The client will call generate_agents_md automatically.
| Parameter | Type | Default | Description |
|---|---|---|---|
| project_path | string | "." | Path to the project root |
| force_full_scan | boolean | false | Ignore cache and rescan everything from scratch |
Note on force_full_scan: Use this only when explicitly requested. When asking Claude to improve or update an existing AGENTS.md, leave it as false; the incremental scan already provides all the data needed.
The generated AGENTS.md follows the agents.md open standard. It is written as a README for AI agents, not as documentation for humans. Its sections are built from detected project data, including files such as .env.example, package.json, Makefile, etc. Sections with no detected data are omitted entirely.
For large codebases the analysis payload can be too big to return inline over the MCP wire. The server handles this transparently through a second tool: get_payload_chunk.
Flow:
1. generate_agents_md runs the full analysis, writes the payload to disk, and returns a small response with total_chunks and instructions.
2. The client calls get_payload_chunk(project_path, chunk_index=0), then increments chunk_index until the response contains has_more: false.
3. The client concatenates the data fields in order and parses the result as JSON.

This flow is pure MCP: no filesystem access is required from the client side. Any MCP-compatible client can follow it.
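
The chunked flow can be sketched from the client's point of view. This is a minimal illustration, not the server's or any client's actual code: `fetch_chunk` is a hypothetical callable standing in for the MCP call to get_payload_chunk, and the response is assumed to be a dict exposing the `data` and `has_more` fields named above.

```python
import json
from typing import Any, Callable

# Hypothetical stand-in for the MCP tool call: takes a chunk index and
# returns the tool response as a dict with "data" and "has_more" keys.
FetchChunk = Callable[[int], dict[str, Any]]


def assemble_payload(fetch_chunk: FetchChunk) -> Any:
    """Fetch chunks from index 0 until has_more is false, then parse JSON."""
    parts: list[str] = []
    chunk_index = 0
    while True:
        response = fetch_chunk(chunk_index)
        parts.append(response["data"])
        if not response.get("has_more", False):
            break
        chunk_index += 1
    # Chunks are only valid JSON once joined in order.
    return json.loads("".join(parts))


# Usage with a fake in-memory "server" that splits a JSON string in three.
raw = json.dumps({"files": ["a.py", "b.go"], "env": ["API_KEY"]})
fake_chunks = [raw[:10], raw[10:25], raw[25:]]


def fake_fetch(index: int) -> dict[str, Any]:
    return {"data": fake_chunks[index], "has_more": index < len(fake_chunks) - 1}


payload = assemble_payload(fake_fetch)
```

Joining before parsing matters because an individual chunk is an arbitrary slice of the serialized payload and is not expected to be valid JSON on its own.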
All runtime artifacts are stored outside your project, in the user cache directory:
~/.cache/agents-md-generator/<project-hash>/cache.json ← incremental scan cache
The <project-hash> is a SHA-256 of the project's absolute path — unique per project. Nothing is written to your repository.
Note: The server also writes a temporary payload.json to this directory during analysis, but it is managed entirely by the get_payload_chunk tool and deleted automatically after the last chunk is read. You never need to access it directly.
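
To locate the cache directory for a given project, the hashing step might look like the sketch below. Several details are assumptions not confirmed by this README: that the absolute path is encoded as UTF-8, that the full hex digest is used untruncated, and that the cache root is `~/.cache` on every platform. The helper name `cache_dir_for` is hypothetical.

```python
import hashlib
from pathlib import Path


def cache_dir_for(project_path: str) -> Path:
    """Guess the cache directory for a project (assumptions noted above)."""
    # Resolve to an absolute path so "." and the full path hash identically.
    absolute = str(Path(project_path).resolve())
    # Assumed: UTF-8 encoding and the full, untruncated hex digest.
    project_hash = hashlib.sha256(absolute.encode("utf-8")).hexdigest()
    return Path.home() / ".cache" / "agents-md-generator" / project_hash


print(cache_dir_for("."))
```

Because the hash depends only on the absolute path, moving or renaming the project directory would produce a different cache location under these assumptions.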
Create .agents-config.json at your project root to customize behavior. This file is optional — all fields have defaults.
{
  "impact_threshold": "medium",
  "exclude": [
    "**/node_modules/**",
    "**/bin/**",
    "**/obj/**",
    "**/.git/**",
    "**/dist/**",
    "**/build/**",
    "**/__pycache__/**",
    "**/*.min.js",
    "**/*.min.css",
    "**/*.bundle.js",
    "**/vendor/**",
    "**/packages/**",
    "**/.venv/**",
    "**/venv/**",
    "**/bower_components/**",
    "**/app/lib/**",
    "**/wwwroot/lib/**",
    "**/wwwroot/libs/**",
    "**/static/vendor/**",
    "**/public/vendor/**",
    "**/assets/vendor/**",
    "**/site-packages/**"
  ],
  "include": [],
  "languages": "auto",
  "agents_md_path": "./AGENTS.md",
  "max_file_size_bytes": 1048576,
  "dir_aggregation_threshold": 8
}
| Key | Default | Description |
|---|---|---|
| impact_threshold | "medium" | Minimum change impact to include in incremental payload (see Impact Threshold) |
| exclude | (see above) | Glob patterns to exclude from analysis |
| include | [] | If non-empty, only analyze files matching these patterns |
| languages | "auto" | "auto" detects all supported languages, or pass a list like ["typescript", "python"] |
| agents_md_path | "./AGENTS.md" | Output path for the generated file |
| max_file_size_bytes | 1048576 | Files larger than this are skipped (default: 1 MB) |
| dir_aggregation_threshold | 8 | Directories with this many or more files of the same language are collapsed into a single directory summary instead of per-file entries. Reduces payload size significantly on large codebases. Set to a high number to disable. |
You can commit .agents-config.json to share exclusion rules and thresholds with your team.
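
Since every field has a default, a partial config file only needs the keys you want to change. The sketch below shows one plausible way such a file could be merged over the defaults. It is an illustration only: `load_config` and `DEFAULTS` are hypothetical names, the shallow-merge behavior is an assumption, and the `exclude` default is left out for brevity.

```python
import json
from pathlib import Path
from typing import Any

# Defaults copied from the table above ("exclude" omitted for brevity).
DEFAULTS: dict[str, Any] = {
    "impact_threshold": "medium",
    "include": [],
    "languages": "auto",
    "agents_md_path": "./AGENTS.md",
    "max_file_size_bytes": 1048576,
    "dir_aggregation_threshold": 8,
}


def load_config(project_root: str) -> dict[str, Any]:
    """Return defaults overlaid with any keys found in .agents-config.json."""
    config_file = Path(project_root) / ".agents-config.json"
    if not config_file.is_file():
        # The file is optional, so a missing file just means defaults.
        return dict(DEFAULTS)
    overrides = json.loads(config_file.read_text(encoding="utf-8"))
    # Assumed shallow merge: keys in the file replace the default value.
    return {**DEFAULTS, **overrides}
```

With this shape, a file containing only `{"impact_threshold": "high"}` would leave every other setting at its default.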
The impact_threshold controls which symbol changes are included in incremental scan payloads. Changes below the threshold are silently ignored — AGENTS.md is not regenerated for them.
| Change type | Symbol kind | Extra condition | Impact |
|---|---|---|---|
| any | any | Has HTTP decorator (@HttpGet, @app.route, @Get, …) | high |
| added or removed | class, interface, struct | — | high |
| removed | method | public | high |
| modified | any | public | medium |
| added | function or method | public | medium |
| any | any | none of the above | low |
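
The table can be read as an ordered rule list. The sketch below encodes it that way, assuming the rows are evaluated top to bottom and the first match wins; the `SymbolChange` dataclass, its field names, and both function names are hypothetical and not taken from the server's code.

```python
from dataclasses import dataclass

IMPACT_ORDER = {"low": 0, "medium": 1, "high": 2}


@dataclass
class SymbolChange:
    change_type: str                 # "added", "removed", or "modified"
    symbol_kind: str                 # e.g. "class", "method", "function"
    is_public: bool = False
    has_http_decorator: bool = False


def classify_impact(change: SymbolChange) -> str:
    """Apply the table rows in order; the first matching row decides."""
    if change.has_http_decorator:
        return "high"
    if change.change_type in ("added", "removed") and change.symbol_kind in (
        "class", "interface", "struct"
    ):
        return "high"
    if change.change_type == "removed" and change.symbol_kind == "method" and change.is_public:
        return "high"
    if change.change_type == "modified" and change.is_public:
        return "medium"
    if change.change_type == "added" and change.symbol_kind in ("function", "method") and change.is_public:
        return "medium"
    return "low"


def passes_threshold(change: SymbolChange, threshold: str = "medium") -> bool:
    """True when the change is at or above the configured threshold."""
    return IMPACT_ORDER[classify_impact(change)] >= IMPACT_ORDER[threshold]
```

For example, under these assumptions a newly added public function classifies as medium, so it would be included at the default threshold but ignored when the threshold is "high".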
Choosing a threshold:
- "high": Only regenerate AGENTS.md for breaking or structural changes. Best for large, stable codebases where minor additions are frequent.
- "medium" (default): Regenerate when the public API surface grows or changes. Suitable for most projects.
- "low": Regenerate on any public symbol change. Best for early-stage projects where the architecture is still evolving.

The server scans all source files for environment variable references using language-specific patterns:
| Language | Pattern detected |
|---|---|
| JavaScript / TypeScript | process.env.VAR_NAME |
| Python | os.environ['VAR'], os.getenv('VAR') |
| Go | os.Getenv("VAR") |
| Ruby | ENV['VAR'] |
| Rust | env!("VAR"), var("VAR") |
It also parses .env.example, .env.template, and .env.sample files at the project root.
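
As a rough illustration of pattern-based scanning, the sketch below matches three of the forms from the table with regular expressions. The regexes themselves are assumptions written for this example; the server's actual patterns may be stricter or looser, and `find_env_vars` is a hypothetical helper.

```python
import re

# Assumed regexes approximating the patterns listed in the table.
ENV_PATTERNS: dict[str, list[re.Pattern[str]]] = {
    "javascript": [re.compile(r"process\.env\.([A-Za-z_][A-Za-z0-9_]*)")],
    "python": [
        re.compile(r"os\.environ\[\s*['\"]([^'\"]+)['\"]\s*\]"),
        re.compile(r"os\.getenv\(\s*['\"]([^'\"]+)['\"]"),
    ],
    "go": [re.compile(r"os\.Getenv\(\s*\"([^\"]+)\"")],
}


def find_env_vars(source: str, language: str) -> set[str]:
    """Collect variable names referenced in source for the given language."""
    found: set[str] = set()
    for pattern in ENV_PATTERNS.get(language, []):
        found.update(pattern.findall(source))
    return found


python_source = "port = os.getenv(\"PORT\")\nkey = os.environ['API_KEY']\n"
print(sorted(find_env_vars(python_source, "python")))
```

A regex scan like this only sees literal names; a variable read through a computed key would not be detected.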
Files named index, main, app, server, program, bootstrap, or startup (with any supported extension) are detected as entry points and annotated with their inferred role (e.g., "HTTP server bootstrap", "Electron main process").
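
The name-based check can be sketched as a lookup on the file stem. The extension set and the case-insensitive comparison below are assumptions made for illustration, and the role inference step is not covered; `is_entry_point` is a hypothetical helper.

```python
from pathlib import PurePosixPath

ENTRY_POINT_STEMS = {"index", "main", "app", "server", "program", "bootstrap", "startup"}
# Assumed extension list for the supported languages.
SUPPORTED_EXTENSIONS = {".py", ".cs", ".ts", ".js", ".go"}


def is_entry_point(path: str) -> bool:
    """True when the file stem is a known entry point name (case-insensitive)."""
    p = PurePosixPath(path)
    return p.suffix.lower() in SUPPORTED_EXTENSIONS and p.stem.lower() in ENTRY_POINT_STEMS


for candidate in ["src/main.go", "src/Program.cs", "src/utils.py"]:
    print(candidate, is_entry_point(candidate))
```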
Tree-sitter parses each source file and extracts public symbols — classes, functions, methods, interfaces — filtering out private/protected members and underscore-prefixed symbols. For classes and structs, constructors (when they have parameters) and public properties are also included, revealing dependency injection patterns and data shapes. Interface methods are always included as they define the public contract. These are used to detect naming conventions, DI patterns, and export contracts across layers.
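
The filtering rules can be illustrated on Python source alone. Note that the sketch below swaps in Python's built-in `ast` module as a stand-in for tree-sitter, so it shows the public-symbol filter and not the server's parser; `public_symbols` is a hypothetical helper and public properties are not handled.

```python
import ast


def public_symbols(source: str) -> list[str]:
    """List public top-level functions, classes, and public methods."""
    symbols: list[str] = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not node.name.startswith("_"):
                symbols.append(node.name)
        elif isinstance(node, ast.ClassDef) and not node.name.startswith("_"):
            symbols.append(node.name)
            for member in node.body:
                if not isinstance(member, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    continue
                if member.name == "__init__":
                    # Keep constructors only when they take positional
                    # parameters besides self.
                    if len(member.args.args) > 1:
                        symbols.append(f"{node.name}.__init__")
                elif not member.name.startswith("_"):
                    symbols.append(f"{node.name}.{member.name}")
    return symbols


example = '''
class Service:
    def __init__(self, repo):
        self.repo = repo
    def run(self): ...
    def _helper(self): ...

def _private(): ...
def build(): ...
'''
print(public_symbols(example))
```

Keeping the parameterized constructor is what surfaces the injected `repo` dependency in this example.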
The AGENTS.md format is based on the open agents.md standard.