
CLIO Hdf5

HDF5 FastMCP - Scientific Data Access for AI Agents | CLIO Kit MCP Server

Developer Tools · Python · v2.0.1

CLIO Kit

License: BSD-3-Clause

CLIO Kit is part of the IoWarp platform's tooling layer for AI agents: a collection of tools, skills, plugins, and extensions. It currently features 16 Model Context Protocol (MCP) servers for scientific computing, with plans to expand to additional agent capabilities, and lets AI agents interact with HPC resources, scientific data formats, and research datasets.

Website | IOWarp

Chat with us on Zulip or join us

Developed by <img src="https://grc.iit.edu/img/logo.png" alt="GRC Logo" width="18" height="18"> Gnosis Research Center


❌ Without CLIO Kit

Working with scientific data and HPC resources requires manual scripting and tool-specific knowledge:

  • ❌ Write custom scripts for every HDF5/Parquet file exploration
  • ❌ Manually craft Slurm job submission scripts
  • ❌ Switch between multiple tools for data analysis
  • ❌ No AI assistance for scientific workflows
  • ❌ Repetitive coding for common research tasks

✅ With CLIO Kit

AI agents handle scientific computing tasks through natural language:

  • "Analyze the temperature dataset in this HDF5 file" - HDF5 MCP does it
  • "Submit this simulation to Slurm with 32 cores" - Slurm MCP handles it
  • "Find papers on neural networks from ArXiv" - ArXiv MCP searches
  • "Plot the results from this CSV file" - Plot MCP visualizes
  • "Optimize memory usage for this pandas DataFrame" - Pandas MCP optimizes
  • "Find all documents where pressure exceeds 200 kPa" - Agentic Search retrieves

One unified interface. 16 MCP servers. Hybrid search engine. 150+ specialized tools. Built for research.

CLIO Kit is part of the IoWarp platform's comprehensive tooling ecosystem for AI agents. It brings AI assistance to your scientific computing workflow—whether you're analyzing terabytes of HDF5 data, managing Slurm jobs across clusters, or exploring research papers. Built by researchers, for researchers, at Illinois Institute of Technology with NSF support.

Part of IoWarp Platform: CLIO Kit is the tooling layer of the IoWarp platform, providing skills, plugins, and extensions for AI agents working in scientific computing environments.

One simple command. Production-ready, fully typed, BSD-3-Clause licensed, and beta-tested in real HPC environments.

🚀 Quick Installation

One Command for Any Server

```bash
# List all 16 available MCP servers
uvx clio-kit mcp-servers

# Run any server instantly
uvx clio-kit mcp-server hdf5
uvx clio-kit mcp-server pandas
uvx clio-kit mcp-server slurm

# Agentic search: hybrid retrieval for scientific corpora
uvx clio-kit search serve               # Start search API server
uvx clio-kit search query --namespace local_fs --q "pressure > 200 kPa"

# AI prompts also available
uvx clio-kit prompts                    # List all prompts
uvx clio-kit prompt code-coverage-prompt # Use a prompt
```

<details> <summary><b>Install in Cursor</b></summary>

Add to your Cursor ~/.cursor/mcp.json:

```json
{
  "mcpServers": {
    "hdf5-mcp": {
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "hdf5"]
    },
    "pandas-mcp": {
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "pandas"]
    },
    "slurm-mcp": {
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "slurm"]
    }
  }
}
```

See Cursor MCP docs for more info.

</details> <details> <summary><b>Install in Claude Code</b></summary>

```bash
# Add HDF5 MCP
claude mcp add hdf5-mcp -- uvx clio-kit mcp-server hdf5

# Add Pandas MCP
claude mcp add pandas-mcp -- uvx clio-kit mcp-server pandas

# Add Slurm MCP
claude mcp add slurm-mcp -- uvx clio-kit mcp-server slurm
```

See Claude Code MCP docs for more info.

</details> <details> <summary><b>Install in VS Code</b></summary>

Add to your VS Code MCP config:

```json
"mcp": {
  "servers": {
    "hdf5-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "hdf5"]
    },
    "pandas-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "pandas"]
    }
  }
}
```

See VS Code MCP docs for more info.

</details> <details> <summary><b>Install in Claude Desktop</b></summary>

Edit claude_desktop_config.json:

```json
{
  "mcpServers": {
    "hdf5-mcp": {
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "hdf5"]
    },
    "arxiv-mcp": {
      "command": "uvx",
      "args": ["clio-kit", "mcp-server", "arxiv"]
    }
  }
}
```

See Claude Desktop MCP docs for more info.

</details>
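
The Cursor and Claude Desktop snippets above share one shape: an `mcpServers` object whose keys are labels such as `hdf5-mcp` and whose `args` end in the bare server name (`hdf5`). As a minimal sketch, the file can be generated for any set of servers. `build_mcp_config` is a hypothetical helper, not part of CLIO Kit, and the `<name>-mcp` key pattern is only the convention used in the examples here. The VS Code config uses a different shape and is not covered.

```python
import json


def build_mcp_config(server_names):
    """Return an mcpServers mapping that launches each server through uvx."""
    return {
        "mcpServers": {
            # The config key is a label ("hdf5-mcp"); the uvx argument is the
            # bare server name ("hdf5"), as listed by `uvx clio-kit mcp-servers`.
            f"{name}-mcp": {
                "command": "uvx",
                "args": ["clio-kit", "mcp-server", name],
            }
            for name in server_names
        }
    }


if __name__ == "__main__":
    # Print a config for three servers; paste the result into mcp.json.
    print(json.dumps(build_mcp_config(["hdf5", "pandas", "slurm"]), indent=2))
```

Keeping the label and the server name separate mirrors the troubleshooting note further down: `hdf5-mcp` is only a config key, while `hdf5` is what `uvx clio-kit mcp-server` expects.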

Available Packages

<div align="center">

| 📦 Package | 📌 Ver | 🔧 System | 📋 Description | Install Command |
|---|---|---|---|---|
| adios | 2.0.1 | Data I/O | Read data using ADIOS2 engine | `uvx clio-kit mcp-server adios` |
| arxiv | 2.0.1 | Research | Fetch research papers from ArXiv | `uvx clio-kit mcp-server arxiv` |
| chronolog | 2.0.1 | Logging | Log and retrieve data from ChronoLog | `uvx clio-kit mcp-server chronolog` |
| compression | 2.0.1 | Utilities | File compression with gzip | `uvx clio-kit mcp-server compression` |
| darshan | 2.0.1 | Performance | I/O performance trace analysis | `uvx clio-kit mcp-server darshan` |
| hdf5 | 2.0.1 | Data I/O | HPC-optimized scientific data with 27 tools, AI insights, caching, streaming | `uvx clio-kit mcp-server hdf5` |
| jarvis | 2.0.1 | Workflow | Data pipeline lifecycle management | `uvx clio-kit mcp-server jarvis` |
| lmod | 2.0.1 | Environment | Environment module management | `uvx clio-kit mcp-server lmod` |
| ndp | 2.0.1 | Data Protocol | Search and discover datasets across CKAN instances | `uvx clio-kit mcp-server ndp` |
| node-hardware | 2.0.1 | System | System hardware information | `uvx clio-kit mcp-server node-hardware` |
| pandas | 2.0.1 | Data Analysis | CSV data loading and filtering | `uvx clio-kit mcp-server pandas` |
| parallel-sort | 2.0.1 | Computing | Large file sorting | `uvx clio-kit mcp-server parallel-sort` |
| paraview | 2.0.1 | Visualization | Scientific 3D visualization and analysis | `uvx clio-kit mcp-server paraview` |
| parquet | 2.0.1 | Data I/O | Read Parquet file columns | `uvx clio-kit mcp-server parquet` |
| plot | 2.0.1 | Visualization | Generate plots from CSV data | `uvx clio-kit mcp-server plot` |
| slurm | 2.0.1 | HPC | Job submission and management | `uvx clio-kit mcp-server slurm` |

</div>

Agentic Search

Hybrid retrieval engine for scientific corpora. It combines lexical (BM25), vector, graph, and scientific search (numeric range, unit matching, formula targeting) over namespaced document collections. It is built on DuckDB storage and FastAPI, with an async job queue, OpenTelemetry tracing, and Prometheus metrics.

```bash
# Start the search API server
uvx clio-kit search serve

# Index documents from a namespace
uvx clio-kit search index --namespace local_fs

# Query with scientific operators
uvx clio-kit search query --namespace local_fs --q "pressure between 190 and 360 kPa"

# List indexed documents
uvx clio-kit search list --namespace local_fs
```

API endpoints: /query, /jobs/index, /documents, /health, /metrics (see the full docs)
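
A hybrid engine has to merge the ranked lists produced by its separate retrievers into one result list. This README does not say how CLIO Kit does that, so the sketch below uses reciprocal rank fusion (RRF), a common general-purpose technique, purely as an illustration. The function name, the document IDs, and the constant `k=60` are assumptions for the example, not CLIO Kit internals.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so items ranked highly by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    lexical = ["doc_a", "doc_b", "doc_c"]   # hypothetical BM25 ranking
    vector = ["doc_b", "doc_c", "doc_a"]    # hypothetical embedding ranking
    numeric = ["doc_b", "doc_a"]            # hypothetical numeric-range matches
    print(reciprocal_rank_fusion([lexical, vector, numeric]))
```

RRF uses only rank positions, so it sidesteps the problem that BM25 scores and vector similarities live on different scales.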


📖 Usage Examples

HDF5: Scientific Data Analysis

"What datasets are in climate_simulation.h5? Show me the temperature field structure and read the first 100 timesteps."

Tools used: open_file, analyze_dataset_structure, read_partial_dataset, list_attributes

Slurm: HPC Job Management

"Submit simulation.py to Slurm with 32 cores, 64GB memory, 24-hour runtime. Monitor progress and retrieve output when complete."

Tools used: submit_slurm_job, check_job_status, get_job_output
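
For comparison, this is roughly the batch script a user would otherwise write by hand for that request. The sketch renders standard `#SBATCH` directives from the stated resources. `render_sbatch_script` is a hypothetical helper, not a Slurm MCP tool, and mapping "32 cores" to one task with `--cpus-per-task=32` is an assumption (a multi-process MPI job would use `--ntasks` instead).

```python
def render_sbatch_script(script, cores=32, memory_gb=64, hours=24,
                         job_name="simulation"):
    """Render a minimal Slurm batch script as a string."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        "#SBATCH --ntasks=1",
        # Assumption: "32 cores" means a single task with 32 CPUs.
        f"#SBATCH --cpus-per-task={cores}",
        f"#SBATCH --mem={memory_gb}G",
        # Wall-clock limit in hours:minutes:seconds.
        f"#SBATCH --time={hours:02d}:00:00",
        # %x expands to the job name and %j to the job ID.
        "#SBATCH --output=%x-%j.out",
        "",
        f"python {script}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Save the printed text to a file and submit it with `sbatch <file>`.
    print(render_sbatch_script("simulation.py"))
```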

ArXiv: Research Discovery

"Find the latest papers on diffusion models from ArXiv, get details on the top 3, and export citations to BibTeX."

Tools used: search_arxiv, get_paper_details, export_to_bibtex, download_paper_pdf
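
To give a feel for what a search like this involves, the sketch below builds a request URL for the public arXiv API, which returns an Atom feed. It only constructs the URL and makes no network call. Whether the ArXiv MCP server queries this endpoint in this way is an assumption, and `build_arxiv_query_url` is a hypothetical name.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query_url(phrase, max_results=3):
    """Build an arXiv API URL for the newest papers matching a phrase."""
    params = {
        # "all:" searches every field; quotes keep the phrase together.
        "search_query": f'all:"{phrase}"',
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_arxiv_query_url("diffusion models"))
```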

Pandas: Data Processing

"Load sales_data.csv, clean missing values, compute statistics by region, and save as Parquet with compression."

Tools used: load_data, handle_missing_data, groupby_operations, save_data

Plot: Data Visualization

"Create a line plot showing temperature trends over time from weather.csv with proper axis labels."

Tools used: line_plot, data_info

Agentic Search: Scientific Retrieval

"Find all chunks mentioning pressure above 200 kPa in the local_fs namespace."

CLI: uvx clio-kit search query --namespace local_fs --q "pressure > 200 kPa"
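
A query such as `pressure > 200 kPa` is more than a keyword match: the engine has to recognize numbers with units and compare them on a common scale. The sketch below illustrates that idea with a regular expression and a small conversion table. It is a simplified stand-in, not the Agentic Search implementation. The unit list, function names, and sample chunks are all assumptions, and it handles plain decimal numbers only.

```python
import re

# Conversion factors to kPa for the few units this sketch understands.
TO_KPA = {"Pa": 0.001, "kPa": 1.0, "MPa": 1000.0, "bar": 100.0}

# A plain decimal number followed by a known pressure unit. The lookbehind
# stops the pattern from matching the tail of a longer token such as "1.5e3".
PRESSURE = re.compile(r"(?<![\w.])(\d+(?:\.\d+)?)\s*(kPa|MPa|Pa|bar)\b")


def pressures_kpa(text):
    """Return every pressure mentioned in text, converted to kPa."""
    return [float(value) * TO_KPA[unit] for value, unit in PRESSURE.findall(text)]


def chunks_above(chunks, threshold_kpa):
    """Keep the chunks that mention at least one pressure above the threshold."""
    return [c for c in chunks
            if any(p > threshold_kpa for p in pressures_kpa(c))]


if __name__ == "__main__":
    chunks = [
        "Inlet pressure held at 250 kPa.",
        "Ambient pressure was 101.3 kPa.",
        "The vessel reached 0.3 MPa before venting.",
        "Temperature peaked at 350 K.",
    ]
    for chunk in chunks_above(chunks, 200):
        print(chunk)
```

Normalizing to one unit before comparing is what lets the `0.3 MPa` chunk match a threshold written in kPa.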


🚨 Troubleshooting

<details> <summary><b>Server Not Found Error</b></summary>

If uvx clio-kit mcp-server <server-name> fails:

```bash
# Verify server name is correct
uvx clio-kit mcp-servers

# Common names: hdf5, pandas, slurm, arxiv (not hdf5-mcp, pandas-mcp)
```

</details> <details> <summary><b>Import Errors or Missing Dependencies</b></summary>

For development or local testing:

```bash
cd clio-kit-mcp-servers/hdf5
uv sync --all-extras --dev
uv run hdf5-mcp
```

</details> <details> <summary><b>uvx Command Not Found</b></summary>

Install the uv package manager:

```bash
# Linux/macOS
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

# Or via pip
pip install uv
```

</details>

Team

Sponsored By

<img src="https://www.nsf.gov/themes/custom/nsf_theme/components/molecules/logo/logo-desktop.png" alt="NSF Logo" width="24" height="24"> NSF (National Science Foundation) - Supporting scientific computing research and AI integration initiatives

We welcome more sponsorships. Please contact the Principal Investigator.

Ways to Contribute

  • Submit Issues: Report bugs or request features via GitHub Issues
  • Develop New MCPs: Add servers for your research tools (CONTRIBUTING.md)
  • Improve Documentation: Help make guides clearer
  • Share Use Cases: Tell us how you're using CLIO Kit in your research

Full Guide: CONTRIBUTING.md

Community & Support


Learn More