Universal data connector for CSV, Postgres, and REST APIs via DuckDB
One MCP server for all your data sources — with cross-source SQL joins and no external query service. DuckDB runs embedded in-process, so you can join a CSV file against a Postgres table against a REST API response in a single query, entirely on your machine. Agents work with your data without needing source-specific knowledge or multiple MCP server configs.
Tool reference | Configuration | Contributing | Troubleshooting
The common alternative is running one MCP server per data source — a postgres MCP server, a CSV MCP server, a REST MCP server. Each works fine in isolation, but they can't talk to each other.
| | mcp-data-pipeline-connector | Separate per-source servers |
|---|---|---|
| Cross-source joins | Native SQL via embedded DuckDB | Not possible — agent must fetch and join manually |
| Config complexity | One server entry in your MCP config | One entry per source type |
| Query engine | DuckDB in-process — no install, no service | Depends on each source's query capabilities |
| Schema unification | Normalizes all types to string/integer/number/datetime/boolean/json/unknown | Each source uses its own type system |
| Data residency | All queries run locally | Depends on each connector's implementation |
If you're asking questions that span multiple data sources — "join my sales CSV with the users table" — this is the right tool. If you only ever query one source type, a dedicated single-source server is simpler.
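For example, with a CSV source named `sales` and a Postgres source named `users` (as configured below), the agent can answer that question with a single cross-source join through the `query` tool. The column names here are illustrative, not part of any fixed schema:

```sql
-- Illustrative cross-source join: "sales" is a CSV file, "users" is a Postgres table.
-- Adjust column names to your own data.
SELECT u.email, SUM(s.amount) AS total_spent
FROM sales AS s
JOIN users AS u ON u.id = s.user_id
GROUP BY u.email
ORDER BY total_spent DESC
LIMIT 10;
```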
mcp-data-pipeline-connector connects to data sources you configure and executes queries against them on behalf of your agent. Ensure agents only have the database permissions they need. Connection strings are never logged or transmitted; keep them out of version-controlled config files. Use environment variables for credentials.
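One way to scope access for a Postgres source is to create a dedicated read-only role and use its credentials in the connection string. The role, password, database, and schema below are placeholders; adapt them to your setup:

```sql
-- Example only: a minimal read-only Postgres role for the agent's connection string.
-- Replace the role name, password, database, and schema with your own values.
CREATE ROLE mcp_readonly LOGIN PASSWORD 'change-me';
GRANT CONNECT ON DATABASE app TO mcp_readonly;
GRANT USAGE ON SCHEMA public TO mcp_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO mcp_readonly;
```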
Add the following config to your MCP client:
```json
{
  "mcpServers": {
    "data-connector": {
      "command": "npx",
      "args": ["-y", "mcp-data-pipeline-connector@latest"]
    }
  }
}
```

Define your data sources in `~/.mcp/data-sources.yaml`:
```yaml
sources:
  - name: sales
    type: csv
    path: ~/data/sales-2025.csv
  - name: users
    type: postgres
    connection_string: "${POSTGRES_URL}"
    tables: [users, subscriptions]
```

Store connection strings in environment variables, not directly in the YAML file.
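For example, export the variable in the shell that launches your MCP client. The URL below is a placeholder:

```bash
# Placeholder credentials; substitute your own host, user, and database.
export POSTGRES_URL="postgresql://mcp_readonly:change-me@localhost:5432/app"
```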
Amp · Claude Code · Cline · Cursor · VS Code · Windsurf · Zed
Place a CSV file at ~/data/sample.csv, add it as a source in your config, then enter:
What columns are in the sample table? Show me the first 5 rows.
Your client should return the schema and a preview of the data.
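The matching entry in `~/.mcp/data-sources.yaml` follows the same schema as the configuration shown above; the source name `sample` is only an example:

```yaml
sources:
  - name: sample
    type: csv
    path: ~/data/sample.csv
```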
Available tools: `connect_source` · `list_sources` · `list_tables` · `get_schema` · `query` · `transform` · `check_health`

The server accepts the following command-line flags:

- `--config` / `--sources-config`: Path to the YAML file defining data sources.
  - Type: string
  - Default: `~/.mcp/data-sources.yaml`
- `--rest-cache-ttl`: Time-to-live in seconds for cached REST API responses. Set to 0 to disable caching.
  - Type: number
  - Default: 300
- `--max-rows`: Maximum number of rows returned by a single `query` call. Prevents accidental large result sets.
  - Type: number
  - Default: 1000
- `--read-only`: Reject any SQL statement that is not a SELECT query. Enforces read-only access across all sources.
  - Type: boolean
  - Default: true
Pass flags via the `args` property in your JSON config:

```json
{
  "mcpServers": {
    "data-connector": {
      "command": "npx",
      "args": ["-y", "mcp-data-pipeline-connector@latest", "--max-rows=5000", "--rest-cache-ttl=60"]
    }
  }
}
```

Before publishing a new version, verify the server with MCP Inspector to confirm all tools are exposed correctly and the protocol handshake succeeds.
Interactive UI (opens browser):
```bash
npm run build && npm run inspect
```

CLI mode (scripted / CI-friendly):
```bash
# List all tools
npx @modelcontextprotocol/inspector --cli node dist/index.js --method tools/list

# List resources and prompts
npx @modelcontextprotocol/inspector --cli node dist/index.js --method resources/list
npx @modelcontextprotocol/inspector --cli node dist/index.js --method prompts/list

# Call a tool (example: replace with a relevant read-only tool for this plugin)
npx @modelcontextprotocol/inspector --cli node dist/index.js \
  --method tools/call --tool-name list_sources

# Call a tool with arguments
npx @modelcontextprotocol/inspector --cli node dist/index.js \
  --method tools/call --tool-name list_sources --tool-arg key=value
```

Run before publishing to catch regressions in tool registration and runtime startup.
Each connector lives in `src/connectors/` and must implement the `DataConnector` interface. Add fixture data files under `tests/fixtures/` for integration tests. Never log connection strings or credentials; sanitize before any output or error message.
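The actual `DataConnector` interface is defined in the source; as a rough orientation only, a connector needs to cover roughly these responsibilities (method names and types below are illustrative assumptions, not the real API):

```typescript
// Illustrative sketch only: the real DataConnector interface lives in src/connectors/.
// Method names and types here are assumptions for orientation, not the actual API.
interface ColumnSchema {
  name: string;
  // Types are normalized across sources (see the comparison table above).
  type: "string" | "integer" | "number" | "datetime" | "boolean" | "json" | "unknown";
}

interface DataConnector {
  /** Validate configuration and open any underlying connection. */
  connect(): Promise<void>;
  /** List the tables (or table-like entities) this source exposes. */
  listTables(): Promise<string[]>;
  /** Return the normalized schema for one table. */
  getSchema(table: string): Promise<ColumnSchema[]>;
  /** Report whether the source is currently reachable. */
  checkHealth(): Promise<{ ok: boolean; detail?: string }>;
}
```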
```bash
npm install && npm test
```

mcp-data-pipeline-connector is listed on MCP Registry and MCP Market.
Troubleshooting:

- A source that failed to connect at startup: call `check_health` to retest after startup.
- To target all configured sources at once, pass `source='_all'`.
- Results show `truncated: true`: increase `--max-rows` or add a `LIMIT` clause to your SQL.