ai.smithery/hithereiamaliff-mcp-datagovmy

This MCP server provides seamless access to Malaysia's government open data, including datasets, w…

2 · MIT · ai-ml


# Malaysia Open Data MCP

**MCP Endpoint:** `https://mcp.techmavie.digital/datagovmy/mcp`

**Analytics Dashboard:** [`https://mcp.techmavie.digital/datagovmy/analytics/dashboard`](https://mcp.techmavie.digital/datagovmy/analytics/dashboard)

MCP (Model Context Protocol) server for Malaysia's Open Data APIs, providing easy access to government datasets and collections.

Note that this is **NOT** an official MCP server from the Government of Malaysia, nor from Malaysia's Open Data, Jabatan Digital Negara, or Ministry of Digital teams.

## Features

- **Enhanced Unified Search** with flexible tokenization and synonym expansion
  - Intelligent query handling with term normalization
  - Support for plurals and common prefixes (e.g., "e" in "epayment")
  - Smart prioritization for different data types
- **Parquet File Support** using pure JavaScript
  - Parse Parquet files directly in the browser or Node.js
  - Support for BROTLI compression
  - Intelligent date field handling for empty date objects
  - Increased row limits (up to 500 rows) for comprehensive data retrieval
  - Fallback to metadata estimation when parsing fails
  - Automatic dashboard URL mapping for visualization
- **Live Data Access Architecture**
  - Real-time index fetching from GitHub (data-gov-my/datagovmy-meta)
  - In-memory caching with configurable TTL
  - Dynamic API calls for detailed metadata
- **Comprehensive Data Sources**
  - Malaysia's Data Catalogue with rich metadata
  - Interactive Dashboards for data visualization
  - Department of Statistics Malaysia (DOSM) data
  - Weather forecast and warnings
  - Public transport and GTFS data
- **Multi-Provider Malaysian Geocoding**
  - Three-tier geocoding system: GrabMaps, Google Maps, and Nominatim (OpenStreetMap)
  - Optimized for Malaysian addresses and locations, prioritizing GrabMaps for its local coverage
  - Intelligent service selection based on location and available API keys
  - Automatic fallback between providers, down to Nominatim when no API keys are provided
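
The geocoding fallback described above could be sketched roughly as follows. This is an illustration only, not the server's actual code: the provider names come from this README, but `providerOrder`, `geocodeWithFallback`, the `GeocodingKeys` shape, and the exact ordering rules are assumptions. It also ignores that GrabMaps may need additional credentials.

```typescript
// Hypothetical sketch: provider names come from the README; the ordering
// rules and all identifiers below are assumptions for illustration.
type Provider = "grabmaps" | "google" | "nominatim";

interface GeocodingKeys {
  grabMapsApiKey?: string;
  googleMapsApiKey?: string;
}

interface Coordinates {
  lat: number;
  lon: number;
}

type Geocoder = (query: string) => Promise<Coordinates | null>;

// Decide which providers to try, and in what order.
function providerOrder(keys: GeocodingKeys, isMalaysianQuery: boolean): Provider[] {
  const order: Provider[] = [];
  const hasGrab = Boolean(keys.grabMapsApiKey);
  const hasGoogle = Boolean(keys.googleMapsApiKey);
  if (isMalaysianQuery) {
    // Prefer the provider with better local coverage.
    if (hasGrab) order.push("grabmaps");
    if (hasGoogle) order.push("google");
  } else {
    if (hasGoogle) order.push("google");
    if (hasGrab) order.push("grabmaps");
  }
  order.push("nominatim"); // needs no API key, so it is always the last resort
  return order;
}

// Try each provider in turn and return the first successful result.
async function geocodeWithFallback(
  query: string,
  order: Provider[],
  geocoders: Record<Provider, Geocoder>,
): Promise<(Coordinates & { provider: Provider }) | null> {
  for (const provider of order) {
    try {
      const hit = await geocoders[provider](query);
      if (hit) return { provider, lat: hit.lat, lon: hit.lon };
    } catch {
      // This provider failed; fall through to the next one.
    }
  }
  return null;
}
```

With no keys configured, `providerOrder({}, true)` yields only `"nominatim"`, which matches the documented behaviour of falling back to Nominatim when no API keys are provided.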

## Architecture

This MCP server fetches dataset and dashboard metadata live from the [data-gov-my/datagovmy-meta](https://github.com/data-gov-my/datagovmy-meta) GitHub repository:

- **Live GitHub Indexes** - Fetches all dataset and dashboard metadata via the GitHub Trees API and raw content URLs
- **Cache Pre-Warming** - Indexes are fetched immediately on server startup, so the first user request is fast
- **In-Memory Caching** - Indexes are cached in memory with a configurable TTL (default: 1 hour)
- **Background Refresh** - When the cache expires, stale data is served immediately while a background refresh fetches updated indexes, so requests after startup do not wait on GitHub
- **Dynamic Detail Fetching** - Individual dataset/dashboard details are fetched on demand from GitHub raw content

This approach provides several benefits:
- Always up-to-date with the latest datasets and dashboards
- No static data that goes stale
- Low-latency index lookups, served from the pre-warmed cache while refreshes run in the background
- Consistent data access patterns
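
The pre-warming, TTL, and background-refresh behaviour described in this section is commonly called stale-while-revalidate. A minimal sketch of the pattern follows. The `SwrCache` class and its method names are hypothetical and not taken from this server's source; only the 1-hour default TTL comes from the description above.

```typescript
// Minimal stale-while-revalidate cache. All names here are hypothetical.
class SwrCache<T> {
  private value: T | undefined = undefined;
  private fetchedAt = 0;
  private inFlight: Promise<T> | null = null;
  private readonly fetcher: () => Promise<T>;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    fetcher: () => Promise<T>,
    ttlMs: number = 60 * 60 * 1000, // 1 hour, matching the documented default
    now: () => number = () => Date.now(),
  ) {
    this.fetcher = fetcher;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Fetch once at startup so the first request finds a warm cache.
  async warm(): Promise<void> {
    await this.refresh();
  }

  async get(): Promise<T> {
    if (this.value === undefined) {
      return this.refresh(); // cold cache: the caller has to wait
    }
    if (this.now() - this.fetchedAt >= this.ttlMs) {
      // Expired: start a refresh without awaiting it.
      this.refresh().catch(() => {
        // Keep serving stale data if the refresh fails.
      });
    }
    return this.value;
  }

  private refresh(): Promise<T> {
    if (this.inFlight) return this.inFlight; // share one refresh between callers
    const p: Promise<T> = Promise.resolve()
      .then(() => this.fetcher())
      .then((fresh) => {
        this.value = fresh;
        this.fetchedAt = this.now();
        return fresh;
      });
    this.inFlight = p;
    const clear = (): void => {
      if (this.inFlight === p) this.inFlight = null;
    };
    p.then(clear, clear);
    return p;
  }
}
```

In this sketch a caller only waits when the cache is completely cold, which is why warming at startup matters. On-demand detail fetches would still incur a network round trip.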

## Documentation

- **[TOOLS.md](./TOOLS.md)** - Detailed information about available tools and best practices
- **[PROMPT.md](./PROMPT.md)** - AI integration guidelines and usage patterns

## AI Integration

When integrating this MCP server with AI models:

1. **Use the unified search tool first** - Always start with `search_all` for any data queries
2. **Follow the correct URL patterns** - Use `https://data.gov.my/...` and `https://open.dosm.gov.my/...`
3. **Leverage Parquet file tools** - Use `parse_parquet_file` to access data directly or `get_parquet_info` for metadata
4. **Rely on live indexes** - Dataset and dashboard indexes are fetched live from GitHub and cached in memory, so there is no need to bundle static catalogue data
5. **Consider dashboard visualization** - For complex data, use the dashboard links provided by `find_dashboard_for_parquet`
6. **Use the multi-provider Malaysian geocoding** - For Malaysian location queries, the server automatically selects the best available provider (GrabMaps, Google Maps, or Nominatim) and falls back to Nominatim when no API keys are configured
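
As an illustration of the first guideline, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below only builds the request body for `search_all`. The `query` argument name is an assumption (see TOOLS.md for the real input schema), `buildToolCall` is a hypothetical helper, and a real client must complete the MCP initialization handshake before sending it.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request body.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "query" is an assumed argument name, used here for illustration only.
const searchRequest = buildToolCall("search_all", { query: "population by state" });
console.log(JSON.stringify(searchRequest, null, 2));
```

In practice an MCP client library typically handles this framing, so the sketch is mainly useful for understanding what travels over the wire.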

Refer to [PROMPT.md](./PROMPT.md) for comprehensive AI integration guidelines.

## Installation

```bash
npm install
```

## Quick Start (Hosted Server)

The easiest way to use this MCP server is via the hosted endpoint. **No installation required!**

**Server URL:**
```
https://mcp.techmavie.digital/datagovmy/mcp
```

### Using Your Own API Keys

You can provide your own API keys via URL query parameters:

```
https://mcp.techmavie.digital/datagovmy/mcp?googleMapsApiKey=YOUR_KEY
```

Or via headers:
- `X-Google-Maps-Api-Key: YOUR_KEY`
- `X-GrabMaps-Api-Key: YOUR_KEY`
- `X-AWS-Access-Key-Id: YOUR_KEY`
- `X-AWS-Secret-Access-Key: YOUR_KEY`
- `X-AWS-Region: ap-southeast-5`
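
A small sketch of how a client might assemble these values. The header names and the `googleMapsApiKey` query parameter are taken from this section. The `ApiKeys` interface and the `buildKeyHeaders` and `buildEndpointUrl` helpers are hypothetical names introduced for illustration.

```typescript
const ENDPOINT = "https://mcp.techmavie.digital/datagovmy/mcp";

// Hypothetical container for the optional credentials.
interface ApiKeys {
  googleMapsApiKey?: string;
  grabMapsApiKey?: string;
  awsAccessKeyId?: string;
  awsSecretAccessKey?: string;
  awsRegion?: string;
}

// Map each configured credential to the header name listed above.
function buildKeyHeaders(keys: ApiKeys): Record<string, string> {
  const mapping: Array<[keyof ApiKeys, string]> = [
    ["googleMapsApiKey", "X-Google-Maps-Api-Key"],
    ["grabMapsApiKey", "X-GrabMaps-Api-Key"],
    ["awsAccessKeyId", "X-AWS-Access-Key-Id"],
    ["awsSecretAccessKey", "X-AWS-Secret-Access-Key"],
    ["awsRegion", "X-AWS-Region"],
  ];
  const headers: Record<string, string> = {};
  for (const [field, header] of mapping) {
    const value = keys[field];
    if (value) headers[header] = value; // skip keys that were not provided
  }
  return headers;
}

// Alternative: pass the Google Maps key as a URL query parameter.
function buildEndpointUrl(googleMapsApiKey?: string): string {
  const url = new URL(ENDPOINT);
  if (googleMapsApiKey) url.searchParams.set("googleMapsApiKey", googleMapsApiKey);
  return url.toString();
}
```

Where the client supports custom headers, they are generally the safer choice, since URLs with embedded keys tend to end up in logs and browser history.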

**Supporte