
io.github.j7an/nexus-mcp

Invoke CLI agents (Gemini, Codex, Claude, OpenCode) as MCP tools with parallel execution

Developer Tools · Python · v0.9.2

Nexus MCP


An MCP server that enables AI models to invoke AI CLI agents (Gemini CLI, Codex, Claude Code, OpenCode) as tools. Provides parallel execution, automatic retries with exponential backoff, JSON-first response parsing, and structured output through five MCP tools.

Use Cases

Nexus MCP is useful whenever a task benefits from querying multiple AI agents in parallel rather than sequentially:

  • Research & summarization — fan out a topic to multiple agents, then synthesize their responses into a single summary with diverse perspectives
  • Code review — send different files or review angles (security, correctness, style) to separate agents simultaneously
  • Multi-model comparison — prompt the same question to different models and compare outputs side-by-side for quality or consistency
  • Bulk content generation — generate multiple test cases, translations, or documentation pages concurrently instead of one at a time
  • Second-opinion workflows — get independent answers from separate agents before making a decision, reducing single-model bias

Features

  • Parallel execution — batch_prompt fans out tasks with asyncio.gather and a configurable semaphore (default concurrency: 3)
  • Automatic retries — exponential backoff with full jitter for transient errors (HTTP 429/503)
  • Output handling — JSON-first parsing, brace-depth fallback for noisy stdout, temp-file spillover for outputs exceeding 50 KB
  • Execution modes — default (safe, no auto-approve), yolo (full auto-approve)
  • CLI detection — auto-detects binary path, version, and JSON output capability at startup
  • Session preferences — set defaults for execution mode, model, max retries, output limit, and timeout once per session; subsequent calls inherit them without repeating parameters
  • Tool timeouts — configurable safety timeout (default 15 min) cancels long-running tool calls to prevent the server from blocking indefinitely
  • Client-visible logging — runner events (retries, output truncation, error recovery) are sent to MCP clients via protocol notifications, not just server stderr
  • Elicitation — interactive parameter resolution via MCP elicitation; disambiguates missing CLI, offers model selection, confirms YOLO mode, and prompts for elaboration on vague prompts. Auto-detects client support and skips gracefully when unavailable. Suppression flags prevent repeat prompts within a session
  • Extensible — implement build_command + parse_output, register in RunnerFactory
| Agent | Status |
| --- | --- |
| Gemini CLI | Supported |
| Codex | Supported |
| Claude Code | Supported |
| OpenCode | Supported |

Installation

Run with uvx (recommended)

bash
uvx nexus-mcp

uvx installs the package in an ephemeral virtual environment and runs it — no cloning required.

To check the installed version:

bash
uvx nexus-mcp --version

To update to the latest version:

bash
uvx --reinstall nexus-mcp

MCP Client Configuration

Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):

json
{
  "mcpServers": {
    "nexus-mcp": {
      "command": "uvx",
      "args": ["nexus-mcp"],
      "env": {
        "NEXUS_GEMINI_MODEL": "gemini-3-flash-preview",
        "NEXUS_GEMINI_MODELS": "gemini-3.1-pro-preview,gemini-3-flash-preview,gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite",
        "NEXUS_CODEX_MODEL": "gpt-5.2",
        "NEXUS_CODEX_MODELS": "gpt-5.4,gpt-5.4-mini,gpt-5.3-codex,gpt-5.2-codex,gpt-5.2,gpt-5.1-codex-max,gpt-5.1-codex-mini",
        "NEXUS_OPENCODE_MODEL": "ollama-cloud/kimi-k2.5",
        "NEXUS_OPENCODE_MODELS": "ollama-cloud/glm-5,ollama-cloud/kimi-k2.5,ollama-cloud/qwen3-coder-next,ollama-cloud/minimax-m2.5,ollama/gemini-3-flash-preview"
      }
    }
  }
}

Cursor (.cursor/mcp.json in your project or ~/.cursor/mcp.json globally):

json
{
  "mcpServers": {
    "nexus-mcp": {
      "command": "uvx",
      "args": ["nexus-mcp"],
      "env": {
        "NEXUS_GEMINI_MODEL": "gemini-3-flash-preview",
        "NEXUS_GEMINI_MODELS": "gemini-3.1-pro-preview,gemini-3-flash-preview,gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite",
        "NEXUS_CODEX_MODEL": "gpt-5.2",
        "NEXUS_CODEX_MODELS": "gpt-5.4,gpt-5.4-mini,gpt-5.3-codex,gpt-5.2-codex,gpt-5.2,gpt-5.1-codex-max,gpt-5.1-codex-mini",
        "NEXUS_OPENCODE_MODEL": "ollama-cloud/kimi-k2.5",
        "NEXUS_OPENCODE_MODELS": "ollama-cloud/glm-5,ollama-cloud/kimi-k2.5,ollama-cloud/qwen3-coder-next,ollama-cloud/minimax-m2.5,ollama/gemini-3-flash-preview"
      }
    }
  }
}

Claude Code (CLI):

bash
claude mcp add nexus-mcp \
  -e NEXUS_GEMINI_MODEL=gemini-3-flash-preview \
  -e NEXUS_GEMINI_MODELS=gemini-3.1-pro-preview,gemini-3-flash-preview,gemini-2.5-pro,gemini-2.5-flash,gemini-2.5-flash-lite \
  -e NEXUS_CODEX_MODEL=gpt-5.2 \
  -e NEXUS_CODEX_MODELS=gpt-5.4,gpt-5.4-mini,gpt-5.3-codex,gpt-5.2-codex,gpt-5.2,gpt-5.1-codex-max,gpt-5.1-codex-mini \
  -e NEXUS_OPENCODE_MODEL=ollama-cloud/kimi-k2.5 \
  -e NEXUS_OPENCODE_MODELS=ollama-cloud/glm-5,ollama-cloud/kimi-k2.5,ollama-cloud/qwen3-coder-next,ollama-cloud/minimax-m2.5,ollama/gemini-3-flash-preview \
  -- uvx nexus-mcp

Generic stdio config (any MCP-compatible client):

json
{
  "command": "uvx",
  "args": ["nexus-mcp"],
  "transport": "stdio",
  "env": {
    "NEXUS_GEMINI_MODEL": "gemini-3-flash-preview",
    "NEXUS_CODEX_MODEL": "gpt-5.2",
    "NEXUS_OPENCODE_MODEL": "ollama-cloud/kimi-k2.5"
  }
}

All env keys are optional — see Configuration for the full list.

Setup for Development

Prerequisites:

  • Python 3.13+ (download)
  • uv dependency manager (install guide)
    bash
    curl -LsSf https://astral.sh/uv/install.sh | sh

Optional (for integration tests):

  • Gemini CLI v0.6.0+ — npm install -g @google/gemini-cli
  • Codex — check with codex --version
  • Claude Code — check with claude --version
  • OpenCode — check with opencode --version

Note: Integration tests are optional. Unit tests run without CLI dependencies via subprocess mocking.
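A minimal illustration of that mocking approach (hypothetical names, not the project's actual test code): the subprocess layer is patched so no real CLI binary is needed.

```python
import subprocess
from unittest import mock


def run_cli(binary: str, prompt: str) -> str:
    # Simplified stand-in for a runner's subprocess invocation.
    result = subprocess.run(
        [binary, "-p", prompt], capture_output=True, text=True, check=True
    )
    return result.stdout


def test_run_cli_without_real_binary() -> None:
    # Patch subprocess.run so the test never touches a real CLI.
    fake = mock.Mock(stdout='{"response": "ok"}')
    with mock.patch("subprocess.run", return_value=fake) as patched:
        out = run_cli("gemini", "hello")
    assert out == '{"response": "ok"}'
    patched.assert_called_once()
```

The same pattern extends to async runners by patching the async subprocess entry point instead.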

bash
# 1. Clone the repository
git clone <repository-url>
cd nexus-mcp

# 2. Install dependencies
uv sync

# 3. Install pre-commit hooks (runs linting/formatting on commit)
uv run pre-commit install

# 4. Verify installation
uv run pytest                    # Run tests
uv run mypy src/nexus_mcp        # Type checking
uv run ruff check .              # Linting

# 5. Run the MCP server
uv run python -m nexus_mcp

Usage

Once nexus-mcp is configured in your MCP client, your AI assistant automatically sees its tools. The reliable trigger is explicitly asking for output from an external AI agent (e.g. Gemini, Codex, Claude Code, OpenCode). Generic "do this in parallel" prompts may be handled by the host AI's own capabilities instead. The cli parameter is optional — if omitted and the client supports MCP elicitation, the server will ask which runner to use. The server provides runner metadata (names, models, availability, execution modes) in its connection instructions — no discovery call needed. The cli parameter includes a JSON schema enum listing valid runner names.

Fan out a research question (batch_prompt)

You say to your AI assistant:

"Get perspectives from Gemini, Codex, and OpenCode on transformer architectures — summary, limitations, and applications."

Your AI assistant calls batch_prompt with the discovered runners:

json
{
  "tasks": [
    { "cli": "gemini", "prompt": "Summarize the key findings of the Attention Is All You Need paper", "label": "gemini-summary" },
    { "cli": "codex", "prompt": "What are the main limitations of transformer architectures?", "label": "codex-limitations" },
    { "cli": "opencode", "prompt": "List 3 real-world applications of transformers beyond NLP", "label": "opencode-applications" }
  ]
}

Code review from multiple angles (batch_prompt)

You say to your AI assistant:

"Have Gemini, Codex, and OpenCode each review this diff in parallel — I want three independent perspectives."

Your AI assistant calls batch_prompt:

json
{
  "tasks": [
    { "cli": "gemini", "prompt": "Review this diff for security vulnerabilities and logic errors:\n\n<paste diff>", "label": "gemini-review" },
    { "cli": "codex", "prompt": "Review this diff for correctness and edge cases:\n\n<paste diff>", "label": "codex-review" },
    { "cli": "opencode", "prompt": "Review this diff for style and maintainability:\n\n<paste diff>", "label": "opencode-review" }
  ]
}

Single-agent prompt (prompt)

You say to your AI assistant:

"Ask Gemini Flash to explain the difference between TCP and UDP in simple terms."

Your AI assistant calls prompt:

json
{
  "cli": "gemini",
  "prompt": "Explain the difference between TCP and UDP in simple terms",
  "model": "gemini-3-flash-preview"
}

Or target Codex:

json
{
  "cli": "codex",
  "prompt": "Explain the difference between TCP and UDP in simple terms",
  "model": "gpt-5.2"
}

Or OpenCode:

json
{
  "cli": "opencode",
  "prompt": "Explain the difference between TCP and UDP in simple terms",
  "model": "ollama-cloud/kimi-k2.5"
}

Letting the server pick the runner (elicitation)

You say to your AI assistant:

"Explain the CAP theorem using one of the available agents."

Your AI assistant calls prompt without specifying cli:

json
{
  "prompt": "Explain the CAP theorem in simple terms"
}

If the MCP client supports elicitation, the server asks which runner to use. If elicitation is unavailable, the server returns an error asking for cli to be specified. To skip elicitation for a specific call, pass "elicit": false.

Session preferences (set_preferences)

You say to your AI assistant:

"For the rest of this session, use YOLO mode with Gemini Flash — I don't want to repeat those settings on every call."

Your AI assistant calls set_preferences once:

json
{
  "execution_mode": "yolo",
  "model": "gemini-3-flash-preview",
  "max_retries": 5
}

Response:

Preferences set: {"execution_mode": "yolo", "model": "gemini-3-flash-preview", "max_retries": 5, "output_limit": null, "timeout": null, "retry_base_delay": null, "retry_max_delay": null, "elicit": null, "confirm_yolo": null, "confirm_vague_prompt": null, "confirm_high_retries": null, "confirm_large_batch": null}

Subsequent prompt and batch_prompt calls omit those fields — they inherit from the session:

json
{
  "cli": "gemini",
  "prompt": "Summarize the latest developments in Rust's async ecosystem"
}

The fallback chain is: explicit parameter → session preference → per-runner env → global env → hardcoded default. To override for one call, pass the parameter directly — it takes precedence without changing the session. To clear a single preference, use set_preferences with the corresponding clear_* flag (e.g. clear_model: true). Other preferences are preserved.
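The chain can be expressed as a simple first-non-None resolution. This is a sketch, not the project's code, and the env variable names are simplified (the real global keys differ from the per-runner keys, e.g. NEXUS_TIMEOUT_SECONDS vs NEXUS_GEMINI_TIMEOUT):

```python
import os


def resolve(param, session_pref, agent: str, key: str, default):
    """Illustrative fallback chain: explicit parameter -> session
    preference -> per-runner env -> global env -> hardcoded default."""
    candidates = (
        param,                                          # explicit parameter
        session_pref,                                   # session preference
        os.environ.get(f"NEXUS_{agent.upper()}_{key}"), # per-runner env
        os.environ.get(f"NEXUS_{key}"),                 # global env (simplified name)
    )
    for value in candidates:
        if value is not None:
            return value
    return default
```

An explicit per-call value always wins, so overriding one call never mutates the session preference.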

MCP Tools

All prompt tools run as background tasks — they return a task ID immediately so the client can poll for results, preventing MCP timeouts for long operations (e.g. YOLO mode: 2–5 minutes).

| Tool | Task? | Description |
| --- | --- | --- |
| batch_prompt | Yes | Fan out prompts to multiple runners in parallel; returns MultiPromptResponse |
| prompt | Yes | Single-runner convenience wrapper; routes to batch_prompt |
| set_preferences | No | Set or selectively clear session defaults for execution mode, model, retries, timeouts, elicitation, and trigger suppression |
| get_preferences | No | Retrieve current session preferences |
| clear_preferences | No | Reset all session preferences |

batch_prompt

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| tasks | Yes | — | List of task objects (see below) |
| max_concurrency | No | 3 | Max parallel agent invocations |
| elicit | No | session pref or true | Enable/disable interactive elicitation for this call |

Task object fields:

| Field | Required | Default | Description |
| --- | --- | --- | --- |
| cli | No | — | Runner name (e.g. "gemini"); if omitted and elicitation is enabled, the server asks which runner to use |
| prompt | Yes | — | Prompt text |
| label | No | auto | Display label for results (auto-assigned from runner name if omitted) |
| context | No | {} | Optional context metadata dict |
| execution_mode | No | session pref or "default" | "default" or "yolo" |
| model | No | session pref or CLI default | Model name override |
| max_retries | No | session pref or env default | Max retry attempts for transient errors |
| output_limit | No | session pref or env default | Max output bytes before temp-file spillover |
| timeout | No | session pref or env default | Subprocess timeout in seconds |
| retry_base_delay | No | session pref or env default | Base delay seconds for exponential backoff |
| retry_max_delay | No | session pref or env default | Max delay cap for backoff in seconds |

Note: elicit is a batch-level parameter, not a per-task field. When enabled, the server runs a single upfront elicitation pass across all tasks (e.g., "3 of 5 tasks use YOLO mode — confirm?") rather than prompting per-task.

prompt

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| cli | No | — | Runner name; if omitted and elicitation is enabled, the server asks which runner to use |
| prompt | Yes | — | Prompt text |
| context | No | {} | Optional context metadata dict |
| execution_mode | No | session pref or "default" | "default" or "yolo" |
| model | No | session pref or CLI default | Model name override; if omitted and the runner has multiple models, elicitation may offer a choice |
| max_retries | No | session pref or env default | Max retry attempts for transient errors |
| output_limit | No | session pref or env default | Max output bytes before temp-file spillover |
| timeout | No | session pref or env default | Subprocess timeout in seconds |
| retry_base_delay | No | session pref or env default | Base delay seconds for exponential backoff |
| retry_max_delay | No | session pref or env default | Max delay cap for backoff in seconds |
| elicit | No | session pref or true | Enable/disable interactive elicitation for this call |

set_preferences

| Parameter | Required | Default | Description |
| --- | --- | --- | --- |
| execution_mode | No | — | "default" or "yolo" |
| model | No | — | Model name (e.g. "gemini-3-flash-preview") |
| max_retries | No | — | Max total attempts including the first (≥1; 1 means run once, no retries) |
| output_limit | No | — | Max output bytes before temp-file spillover (≥1) |
| timeout | No | — | Subprocess timeout in seconds (≥1) |
| retry_base_delay | No | — | Base delay seconds for exponential backoff (≥0) |
| retry_max_delay | No | — | Max delay cap for backoff in seconds (≥0) |
| clear_execution_mode | No | false | Clear execution mode (takes precedence if execution_mode is also provided) |
| clear_model | No | false | Clear model (takes precedence if model is also provided) |
| clear_max_retries | No | false | Clear max retries (takes precedence if max_retries is also provided) |
| clear_output_limit | No | false | Clear output limit (takes precedence if output_limit is also provided) |
| clear_timeout | No | false | Clear timeout (takes precedence if timeout is also provided) |
| clear_retry_base_delay | No | false | Clear retry base delay |
| clear_retry_max_delay | No | false | Clear retry max delay |
| elicit | No | true | Enable/disable elicitation for the session. When true (default), the server may ask the client for missing parameters or confirmations |
| confirm_yolo | No | true | Whether to prompt before YOLO mode. Set to false to skip the confirmation. Auto-set to false after first acceptance |
| confirm_vague_prompt | No | true | Whether to prompt when prompts are very short. Set to false to skip. Not auto-suppressed |
| confirm_high_retries | No | true | Whether to prompt when max_retries > 5. Set to false to skip. Auto-set to false after first acceptance |
| confirm_large_batch | No | true | Whether to prompt when batch has > 5 tasks. Set to false to skip. Auto-set to false after first acceptance |
| clear_elicit | No | false | Reset elicit to default (true) |
| clear_confirm_yolo | No | false | Reset YOLO suppression (re-enables confirmation prompt) |
| clear_confirm_vague_prompt | No | false | Reset vague prompt suppression |
| clear_confirm_high_retries | No | false | Reset high retry suppression |
| clear_confirm_large_batch | No | false | Reset large batch suppression |

get_preferences

No parameters. Returns a dict with all preference fields (null when unset):

| Key | Type | Description |
| --- | --- | --- |
| execution_mode | string \| null | "default" or "yolo" |
| model | string \| null | Model name |
| max_retries | int \| null | Max total attempts |
| output_limit | int \| null | Max output bytes |
| timeout | int \| null | Subprocess timeout seconds |
| retry_base_delay | float \| null | Backoff base delay |
| retry_max_delay | float \| null | Backoff max delay |
| elicit | bool \| null | Elicitation enabled |
| confirm_yolo | bool \| null | YOLO confirmation enabled |
| confirm_vague_prompt | bool \| null | Vague prompt check enabled |
| confirm_high_retries | bool \| null | High retry warning enabled |
| confirm_large_batch | bool \| null | Large batch confirmation enabled |

clear_preferences

No parameters. Resets all session preferences to null, including elicitation suppression flags (re-enables all confirmation prompts).

Managing Session Preferences

| Operation | Tool | Notes |
| --- | --- | --- |
| Set one or more fields | set_preferences | Pass only the fields you want to change |
| Read current values | get_preferences | Returns all preference fields with null for unset |
| Clear all fields | clear_preferences | Reverts to per-call defaults |
| Clear one preference | set_preferences with the corresponding clear_*: true flag | Other preferences are preserved |
| Suppress an elicitation prompt | set_preferences with confirm_*: false | Persists for the session; YOLO/batch/retry auto-suppress after first acceptance |
| Re-enable a suppressed prompt | set_preferences with clear_confirm_*: true | Resets to default (prompts again) |
| Disable all elicitation | set_preferences with elicit: false | Skips all interactive prompts for the session |

Configuration

Global Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| NEXUS_OUTPUT_LIMIT_BYTES | 50000 | Max output size in bytes before temp-file spillover |
| NEXUS_TIMEOUT_SECONDS | 600 | Subprocess timeout in seconds (10 minutes) |
| NEXUS_TOOL_TIMEOUT_SECONDS | 900 | Tool-level timeout in seconds (15 minutes); set to 0 to disable |
| NEXUS_RETRY_MAX_ATTEMPTS | 3 | Max attempts including the first (set to 1 to disable retries) |
| NEXUS_RETRY_BASE_DELAY | 2.0 | Base seconds for exponential backoff |
| NEXUS_RETRY_MAX_DELAY | 60.0 | Maximum seconds to wait between retries |
| NEXUS_CLI_DETECTION_TIMEOUT | 30 | Timeout in seconds for CLI binary version detection at startup |
| NEXUS_EXECUTION_MODE | default | Global execution mode (default or yolo) |
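Under these defaults, the retry delay follows the standard full-jitter formula: pick uniformly from zero up to an exponentially growing cap. A sketch of the computation (the project's exact code may differ):

```python
import random


def backoff_delay(attempt: int, base: float = 2.0, max_delay: float = 60.0) -> float:
    """Full-jitter backoff: uniform in [0, min(max_delay, base * 2**attempt)].

    attempt is 0-based: the wait before the second try uses attempt=0.
    """
    cap = min(max_delay, base * (2 ** attempt))
    return random.uniform(0.0, cap)
```

With base 2.0 the cap doubles each retry (2 s, 4 s, 8 s, ...) until it hits the 60 s ceiling; the random draw spreads concurrent retries apart so they don't hammer a rate-limited API in lockstep.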

Per-Runner Environment Variables

Pattern: NEXUS_{AGENT}_{KEY} (agent name uppercased). Per-runner values override global values.

Valid {AGENT} values: CLAUDE, CODEX, GEMINI, OPENCODE

| Variable pattern | Example | Description |
| --- | --- | --- |
| NEXUS_{AGENT}_MODEL | NEXUS_GEMINI_MODEL=gemini-3-flash-preview | Default model for this runner |
| NEXUS_{AGENT}_MODELS | NEXUS_GEMINI_MODELS=gemini-3-flash-preview,gemini-2.5-pro | Comma-separated model list (surfaced in server instructions) |
| NEXUS_{AGENT}_TIMEOUT | NEXUS_GEMINI_TIMEOUT=900 | Subprocess timeout override |
| NEXUS_{AGENT}_OUTPUT_LIMIT | NEXUS_CODEX_OUTPUT_LIMIT=100000 | Output limit override |
| NEXUS_{AGENT}_MAX_RETRIES | NEXUS_CLAUDE_MAX_RETRIES=5 | Max retry attempts override |
| NEXUS_{AGENT}_RETRY_BASE_DELAY | NEXUS_GEMINI_RETRY_BASE_DELAY=1.0 | Backoff base delay override |
| NEXUS_{AGENT}_RETRY_MAX_DELAY | NEXUS_GEMINI_RETRY_MAX_DELAY=30.0 | Backoff max delay override |
| NEXUS_{AGENT}_EXECUTION_MODE | NEXUS_GEMINI_EXECUTION_MODE=yolo | Execution mode override |

Invalid per-runner values are silently ignored (the global or hardcoded default is used instead).
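The silent-ignore behavior amounts to a guarded parse. This is an illustrative sketch, not the project's actual implementation:

```python
import os


def runner_int_setting(agent: str, key: str, global_default: int) -> int:
    """Read NEXUS_{AGENT}_{KEY}; fall back silently if unset or invalid."""
    raw = os.environ.get(f"NEXUS_{agent.upper()}_{key}")
    if raw is None:
        return global_default
    try:
        value = int(raw)
    except ValueError:
        return global_default  # invalid per-runner value: silently ignored
    if value < 1:
        return global_default  # out-of-range values are also ignored
    return value
```

For example, `NEXUS_CODEX_OUTPUT_LIMIT=banana` would simply leave Codex on the global output limit rather than failing at startup.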

Development

Testing

This project follows Test-Driven Development (TDD) with strict Red→Green→Refactor cycles.

bash
# Run all tests
uv run pytest

# Run with coverage report
uv run pytest --cov=nexus_mcp --cov-report=term-missing

# Run specific test types
uv run pytest -m integration           # Integration tests (requires CLIs)
uv run pytest -m "not integration"     # Unit tests only
uv run pytest -m "not slow"            # Skip slow tests

# Run specific test file
uv run pytest tests/unit/runners/test_gemini.py

Test markers:

  • @pytest.mark.integration — requires real CLI installations
  • @pytest.mark.slow — tests taking >1 second

Code Quality

All quality checks run automatically via pre-commit hooks. Run manually:

bash
# Lint and format
uv run ruff check .              # Check for issues
uv run ruff check --fix .        # Auto-fix issues
uv run ruff format .             # Format code

# Type checking (strict mode)
uv run mypy src/nexus_mcp

# Run all pre-commit hooks manually
uv run pre-commit run --all-files

Adding Dependencies

bash
uv add <package>              # Production dependency
uv add --dev <package>        # Development dependency
uv sync                       # Sync environment after changes

Tool Configuration

  • Ruff: line length 100, 17 rule sets (E/F/I/W + UP/FA/B/C4/SIM/RET/ICN/TID/TC/ISC/PTH/TD/NPY) — pyproject.toml → [tool.ruff]
  • Mypy: strict mode, all type annotations required — pyproject.toml → [tool.mypy]
  • Pytest: asyncio_mode = "auto", no @pytest.mark.asyncio needed — pyproject.toml → [tool.pytest.ini_options]
  • Pre-commit: ruff-check, ruff-format, mypy, trailing-whitespace, end-of-file-fixer — .pre-commit-config.yaml

Python 3.13+ Syntax

  • type keyword for type aliases: type AgentName = str
  • Union syntax: str | None (not Optional[str])
  • match statements for complex conditionals
  • NO from __future__ import annotations

Project Structure

text
nexus-mcp/
├── src/nexus_mcp/
│   ├── __main__.py         # Entry point
│   ├── server.py           # FastMCP server + tools
│   ├── types.py            # Pydantic models
│   ├── exceptions.py       # Exception hierarchy
│   ├── config.py           # Environment variable config
│   ├── elicitation.py      # ElicitationGuard — interactive parameter resolution
│   ├── process.py          # Subprocess wrapper
│   ├── parser.py           # JSON→text fallback parsing
│   ├── cli_detector.py     # CLI binary detection + version checks
│   └── runners/
│       ├── base.py         # Protocol + ABC
│       ├── factory.py      # RunnerFactory
│       ├── claude.py       # ClaudeRunner
│       ├── codex.py        # CodexRunner
│       ├── gemini.py       # GeminiRunner
│       └── opencode.py     # OpenCodeRunner
├── tests/
│   ├── unit/               # Fast, mocked tests
│   ├── e2e/                # End-to-end MCP protocol tests
│   ├── integration/        # Real CLI tests
│   └── fixtures.py         # Shared test utilities
├── .github/
│   └── workflows/          # CI, security, dependabot
├── pyproject.toml          # Dependencies + tool config
└── .pre-commit-config.yaml # Git hooks configuration

License

MIT
