
# ScrapeGraph MCP Server

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.13+](https://img.shields.io/badge/python-3.13+-blue.svg)](https://www.python.org/downloads/)
[![smithery badge](https://smithery.ai/badge/@ScrapeGraphAI/scrapegraph-mcp)](https://smithery.ai/server/@ScrapeGraphAI/scrapegraph-mcp)


A production-ready [Model Context Protocol](https://modelcontextprotocol.io/introduction) (MCP) server that provides seamless integration with the [ScrapeGraph AI](https://scrapegraphai.com) API. This server enables language models to leverage advanced AI-powered web scraping capabilities with enterprise-grade reliability.

[![API Banner](https://raw.githubusercontent.com/ScrapeGraphAI/Scrapegraph-ai/main/docs/assets/api_banner.png)](https://scrapegraphai.com/?utm_source=github&utm_medium=readme&utm_campaign=api_banner&utm_content=api_banner_image)

## Table of Contents

- [Key Features](#key-features)
- [Quick Start](#quick-start)
- [Available Tools](#available-tools)
- [Setup Instructions](#setup-instructions)
- [Remote Server Usage](#remote-server-usage)
- [Local Usage](#local-usage)
- [Google ADK Integration](#google-adk-integration)
- [Example Use Cases](#example-use-cases)
- [Error Handling](#error-handling)
- [Common Issues](#common-issues)
- [Development](#development)
- [Contributing](#contributing)
- [Documentation](#documentation)
- [Technology Stack](#technology-stack)
- [License](#license)

## Key Features

- **8 Powerful Tools**: From simple markdown conversion to complex multi-page crawling and agentic workflows
- **AI-Powered Extraction**: Intelligently extract structured data using natural language prompts
- **Multi-Page Crawling**: SmartCrawler supports asynchronous crawling with configurable depth and page limits
- **Infinite Scroll Support**: Handle dynamic content loading with configurable scroll counts
- **JavaScript Rendering**: Full support for JavaScript-heavy websites
- **Flexible Output Formats**: Get results as markdown, structured JSON, or custom schemas
- **Easy Integration**: Works seamlessly with Claude Desktop, Cursor, and any MCP-compatible client
- **Enterprise-Ready**: Robust error handling, timeout management, and production-tested reliability
- **Simple Deployment**: One-command installation via Smithery or manual setup
- **Comprehensive Documentation**: Detailed developer docs in `.agent/` folder

## Quick Start

### 1. Get Your API Key

Sign up and get your API key from the [ScrapeGraph Dashboard](https://dashboard.scrapegraphai.com).

### 2. Install with Smithery (Recommended)

```bash
npx -y @smithery/cli install @ScrapeGraphAI/scrapegraph-mcp --client claude
```

### 3. Start Using

Ask Claude or Cursor:
- "Convert https://scrapegraphai.com to markdown"
- "Extract all product prices from this e-commerce page"
- "Research the latest AI developments and summarize findings"

That's it! The server is now available to your AI assistant.

## Available Tools

The server provides **8 enterprise-ready tools** for AI-powered web scraping:

### Core Scraping Tools

#### 1. `markdownify`
Transform any webpage into clean, structured markdown format.

```python
markdownify(website_url: str)
```
- **Credits**: 2 per request
- **Use case**: Quick webpage content extraction in markdown
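Conceptually, `markdownify` performs an HTML-to-markdown conversion server-side. A toy local sketch of that idea (stdlib only, handling just headings, paragraphs, and links — the real service covers far more of HTML):

```python
from html.parser import HTMLParser


class MiniMarkdownifier(HTMLParser):
    """Toy HTML-to-markdown converter illustrating what markdownify does.

    Handles only h1-h3, paragraphs, and links; purely a sketch of the
    concept, not the actual service implementation.
    """

    def __init__(self):
        super().__init__()
        self.out = []
        self._href = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # <h2> becomes "## ", etc.
            self.out.append("#" * int(tag[1]) + " ")
        elif tag == "a":
            self._href = dict(attrs).get("href", "")
            self.out.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            self.out.append("\n\n")
        elif tag == "a":
            self.out.append(f"]({self._href})")
            self._href = None

    def handle_data(self, data):
        if data.strip():  # skip inter-tag whitespace
            self.out.append(data)


def to_markdown(html: str) -> str:
    parser = MiniMarkdownifier()
    parser.feed(html)
    return "".join(parser.out).strip()
```

For example, `to_markdown("<h1>Title</h1><p>See <a href='https://example.com'>docs</a></p>")` yields a heading followed by a paragraph with an inline markdown link.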

#### 2. `smartscraper`
Leverage AI to extract structured data from any webpage with support for infinite scrolling.

```python
smartscraper(
    user_prompt: str,
    website_url: str,
    number_of_scrolls: int | None = None,
    markdown_only: bool | None = None
)
```
- **Credits**: 10 base, plus a variable amount depending on scrolling
- **Use case**: AI-powered data extraction with custom prompts

#### 3. `searchscraper`
Execute AI-powered web searches with structured, actionable results.

```python
searchscraper(
    user_prompt: str,
    num_results: int | None = None,
    number_of_scrolls: int | None = None,
    time_range: str | None = None  # Filter by: past_hour, past_24_hours, past_week, past_month, past_year
)
```
- **Credits**: Variable (3-20 websites × 10 credits)
- **Use case**: Multi-source research and data aggregation
- **Time filtering**: Use `time_range` to filter results by recency (e.g., `"past_week"` for recent results)
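Given the stated pricing (3-20 websites at 10 credits each), a rough cost estimate can be sketched. The clamping of `num_results` to that range is an assumption inferred from the range above, not documented billing logic:

```python
def estimate_searchscraper_credits(num_results: int = 3) -> int:
    """Rough credit estimate for a searchscraper call.

    Assumes the number of scraped websites is num_results clamped to
    the documented 3-20 range, at 10 credits per website. Actual
    billing may differ; the dashboard is authoritative.
    """
    websites = max(3, min(num_results, 20))
    return websites * 10
```

So a 5-result search would cost roughly 50 credits, while very large requests top out around 200.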

### Advanced Scraping Tools

#### 4. `scrape`
Basic scraping endpoint to fetch page content with optional heavy JavaScript rendering.

```python
scrape(website_url: str, render_heavy_js: bool | None = None)
```
- **Use case**: Simple page content fetching with JS rendering support

#### 5. `sitemap`
Extract sitemap URLs and structure for any website.

```python
sitemap(website_url: str)
```
- **Use case**: Website structure analysis and URL discovery
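The tool's output is conceptually what you would get from parsing a site's `sitemap.xml`. A local stdlib sketch of that parsing step (the tool itself also locates the sitemap for you):

```python
import xml.etree.ElementTree as ET

# Sitemap protocol namespace, per sitemaps.org
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def parse_sitemap(xml_text: str) -> list[str]:
    """Extract the <loc> URLs from a sitemap.xml document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(f"{SITEMAP_NS}loc") if loc.text]


example = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>"""
```

Calling `parse_sitemap(example)` returns the two URLs in document order, which is the kind of URL list this tool hands back for structure analysis.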

### Multi-Page Crawling

#### 6. `smartcrawler_initiate`
Initiate intelligent multi-page web crawling (asynchronous operation).

```python
smartcrawler_initiate(
    url: str,
    prompt: str = None,
    extraction_mode: str = "ai",
    depth: int = None,
    max_pages: int = None,
    same_domain_