# WET - Web Extended Toolkit MCP Server
mcp-name: io.github.n24q02m/wet-mcp
**Open-source MCP Server for web search, content extraction, library docs & multimodal analysis.**
<a href="https://glama.ai/mcp/servers/n24q02m/wet-mcp">
<img width="380" height="200" src="https://glama.ai/mcp/servers/n24q02m/wet-mcp/badge" alt="WET MCP server" />
</a>
## Features
- **Web Search** -- Embedded SearXNG metasearch (Google, Bing, DuckDuckGo, Brave) with filters, semantic reranking, query expansion, and snippet enrichment
- **Academic Research** -- Search Google Scholar, Semantic Scholar, arXiv, PubMed, CrossRef, BASE
- **Library Docs** -- Auto-discover and index documentation with FTS5 hybrid search, HyDE-enhanced retrieval, and version-specific docs
- **Content Extract** -- Clean content extraction (Markdown/Text), structured data extraction (LLM + JSON Schema), batch processing (up to 50 URLs), deep crawling, site mapping
- **Local File Conversion** -- Convert PDF, DOCX, XLSX, CSV, HTML, EPUB, PPTX to Markdown
- **Media** -- List, download, and analyze images, videos, audio files
- **Anti-bot** -- Stealth mode bypasses Cloudflare, Medium, LinkedIn, Twitter
- **Zero Config** -- Built-in local Qwen3 embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere)
- **Sync** -- Cross-machine sync of indexed docs via rclone (Google Drive, S3, Dropbox)
## Quick Start
### Claude Code Plugin (Recommended)
Via marketplace (includes skills: /fact-check, /compare):
```bash
/plugin marketplace add n24q02m/claude-plugins
/plugin install wet-mcp@claude-plugins
```
Or install this plugin only:
```bash
/plugin marketplace add n24q02m/wet-mcp
/plugin install wet-mcp
```
Configure env vars in `~/.claude/settings.local.json` or shell profile. See [Environment Variables](#environment-variables).
### MCP Server
> **Python 3.13 required** -- Python 3.14+ is **not** supported due to SearXNG incompatibility. You **must** specify `--python 3.13` when using `uvx`.
**On first run**, the server automatically installs SearXNG and Playwright Chromium, then starts the embedded search engine.
#### Option 1: uvx
```jsonc
{
"mcpServers": {
"wet": {
"command": "uvx",
"args": ["--python", "3.13", "wet-mcp@latest"]
}
}
}
```
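If your client already has an MCP config file with other servers in it, the entry above can be merged in rather than pasted by hand. The following is a minimal sketch, not part of wet-mcp: the `add_wet_server` helper is hypothetical, and it assumes the existing file is plain JSON without comments.

```python
import json
from pathlib import Path

# Server entry copied from the uvx snippet above.
WET_ENTRY = {
    "command": "uvx",
    "args": ["--python", "3.13", "wet-mcp@latest"],
}


def add_wet_server(config_path: Path) -> dict:
    """Merge the `wet` entry into an MCP client config, keeping other servers."""
    config = {}
    if config_path.exists():
        # Assumption: the existing file is plain JSON (no // comments).
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the "mcpServers" table if missing, then add or overwrite "wet".
    config.setdefault("mcpServers", {})["wet"] = WET_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Call it with the path your client reads, for example `add_wet_server(Path.home() / ".cursor" / "mcp.json")` for Cursor.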
<details>
<summary>Other MCP clients (Cursor, Codex, Gemini CLI)</summary>
```jsonc
// Cursor (~/.cursor/mcp.json), Windsurf, Cline, Amp, OpenCode
{
"mcpServers": {
"wet": {
"command": "uvx",
"args": ["--python", "3.13", "wet-mcp@latest"]
}
}
}
```
```toml
# Codex (~/.codex/config.toml)
[mcp_servers.wet]
command = "uvx"
args = ["--python", "3.13", "wet-mcp@latest"]
```
</details>
#### Option 2: Docker
```jsonc
{
"mcpServers": {
"wet": {
"command": "docker",
"args": [
"run", "-i", "--rm",
"--name", "mcp-wet",
"-v", "wet-data:/data",
"-e", "API_KEYS",
"-e", "GITHUB_TOKEN",
"-e", "SYNC_ENABLED",
"n24q02m/wet-mcp:latest"
]
}
}
}
```
Configure env vars in `~/.claude/settings.local.json` or your shell profile. See [Environment Variables](#environment-variables) below.
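In the Docker snippet each `-e NAME` flag has no value, so `docker run` forwards that variable from the environment of the process that launches it; a variable that is not set there is simply absent in the container. A small sketch for checking this before starting the client, with a hypothetical helper name and the variable list taken from the snippet:

```python
import os
from collections.abc import Mapping
from typing import Optional

# Variable names taken from the `-e` flags in the Docker snippet.
FORWARDED_VARS = ("API_KEYS", "GITHUB_TOKEN", "SYNC_ENABLED")


def unset_forwarded_vars(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the forwarded variable names that are not set in `env`.

    Defaults to the current process environment, which is what a client
    launched from this shell would normally inherit.
    """
    env = os.environ if env is None else env
    return [name for name in FORWARDED_VARS if name not in env]
```

Running `print(unset_forwarded_vars())` in the shell that starts your MCP client lists the variables the container would not receive. All of them may be optional for your setup, so an unset name is not necessarily an error.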
### Pre-install (optional)
Use the `setup` MCP tool to warm up models and install dependencies:
```
# Via MCP tool call (recommended):
setup(action="warmup")
# With cloud embedding configured, warmup validates API keys
# and skips local model download if cloud models are available.
```
The warmup action pre-downloads SearXNG, Playwright, and embedding/reranker models (~1.1GB total) so the first real connection does not time out.
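To trigger the warmup from a script instead of an interactive client, one option is a small stdio client. The sketch below assumes the official `mcp` Python SDK is installed (`pip install mcp`) and reuses the uvx launch command from Option 1; the tool name and argument come from the `setup(action="warmup")` call above, while the function and constant names are illustrative only.

```python
import asyncio

# Launch command copied from the uvx config; tool name and argument come
# from the `setup(action="warmup")` call shown above.
SERVER_COMMAND = "uvx"
SERVER_ARGS = ["--python", "3.13", "wet-mcp@latest"]
WARMUP_TOOL = "setup"
WARMUP_ARGS = {"action": "warmup"}


async def warm_up() -> list[str]:
    """Start the server over stdio, call the warmup tool, return its text output."""
    # Imported lazily so this module loads even where the SDK is missing.
    # Assumption: the official `mcp` Python SDK is installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command=SERVER_COMMAND, args=SERVER_ARGS)
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(WARMUP_TOOL, arguments=WARMUP_ARGS)
            # Keep only text content items; other content types are skipped.
            return [item.text for item in result.content if hasattr(item, "text")]


def main() -> None:
    """Run the warmup once and print whatever text the tool returned."""
    for line in asyncio.run(warm_up()):
        print(line)
```

Call `main()` once after installation. Since the download is large, the call may take several minutes on a slow connection.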
### Sync setup
Sync is fully automatic. Just set `SYNC_ENABLED=true` and the server handles everything:
1. **First sync**: rclone is