Code Pathfinder

Code intelligence MCP server: call graphs, type inference, and symbol search for Python/Go.



<div align="center">
  <img src="./assets/banner.png" alt="Code Pathfinder - Open-source SAST with cross-file dataflow analysis" width="100%">
</div>

<div align="center">

<h3>Open-source SAST engine that traces vulnerabilities across files and functions</h3>

[Website](https://codepathfinder.dev/) · [Docs](https://codepathfinder.dev/docs/quickstart) · [Rule Registry](https://codepathfinder.dev/registry) · [MCP Server](https://codepathfinder.dev/mcp) · [Blog](https://codepathfinder.dev/blog)

[![Build](https://github.com/shivasurya/code-pathfinder/actions/workflows/build.yml/badge.svg)](https://github.com/shivasurya/code-pathfinder/actions/workflows/build.yml)
[![GitHub Release](https://img.shields.io/github/v/release/shivasurya/code-pathfinder?label=release)](https://github.com/shivasurya/code-pathfinder/releases)
[![Apache-2.0 License](https://img.shields.io/badge/license-Apache--2.0-blue)](https://github.com/shivasurya/code-pathfinder/blob/main/LICENSE)
[![GitHub Stars](https://img.shields.io/github/stars/shivasurya/code-pathfinder?style=flat)](https://github.com/shivasurya/code-pathfinder/stargazers)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/shivasurya/code-pathfinder)

</div>

---

## Quick Start

**Install:**

```bash
brew install shivasurya/tap/pathfinder
```

**Scan a Python project** (rules download automatically):

```bash
pathfinder scan --ruleset python/all --project .
```

**Scan Dockerfiles:**

```bash
pathfinder scan --ruleset docker/all --project .
```

No config files, no API keys, no cloud accounts. Results in your terminal in seconds.

---


## What is Code Pathfinder?

Code Pathfinder is an open-source static analysis engine that builds a graph of your codebase and traces how data flows through it. It parses source code into abstract syntax trees (ASTs), constructs call graphs across files, and runs taint analysis to find source-to-sink vulnerabilities that span multiple files and function boundaries.

**v2.0** introduces **cross-file dataflow analysis**, which traces user input from an HTTP handler in one file, through helper functions, and into a SQL query in another file. Pattern-matching tools that inspect one file at a time cannot see this kind of flow.

### Cross-File Taint Analysis

Most open-source SAST tools analyze one file at a time. Code Pathfinder v2.0 tracks tainted data across file boundaries:

```
app.py:5    user_input = request.get("query")     ← Source: user-controlled input
  ↓ calls
db.py:12    cursor.execute(query)                  ← Sink: SQL execution
```
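To make the flow above concrete, here is a minimal runnable sketch of the vulnerable pattern. It collapses a hypothetical `app.py` and `db.py` into one snippet and uses an in-memory SQLite database; the function names `handle`, `find_user`, and `find_user_safe` are illustrative assumptions, not code from this project.

```python
import sqlite3

# --- db.py (hypothetical) ---
def find_user(conn, name):
    # Sink: the SQL string is built directly from the caller's argument.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # Parameterized query: the driver treats `name` as data, not as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()

# --- app.py (hypothetical) ---
def handle(conn, request):
    user_input = request.get("query")  # Source: user-controlled input
    return find_user(conn, user_input)  # Taint crosses the file boundary here

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = {"query": "x' OR '1'='1"}
leaked = handle(conn, payload)                  # injection returns every row
safe = find_user_safe(conn, payload["query"])   # no user has that literal name
print(len(leaked), len(safe))
```

Neither file looks dangerous in isolation: the source is in one place and the string concatenation is in another, which is why the finding depends on following the call between them.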

The engine builds a Variable Dependency Graph (VDG) per function, then connects them through inter-procedural taint transfer summaries. When `user_input` flows into a function parameter in another file, the taint propagates through the call graph to the sink.
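The idea of per-function graphs joined by summaries can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not Code Pathfinder's actual data structures: the `VDGS` dictionary, the `param#N`, `call:`, and `sink:` node labels, and the helper names are all hypothetical, and the sketch follows only one level of calls.

```python
from collections import deque

# Hypothetical per-function variable dependency graphs.
# An edge "a -> b" means the value of b depends on a.
VDGS = {
    "app.handle": {
        "request": ["user_input"],
        "user_input": ["call:db.run_query#0"],  # passed as argument 0
    },
    "db.run_query": {
        "param#0": ["query"],
        "query": ["sink:cursor.execute"],
    },
}

def reachable(graph, start):
    """Return every node reachable from `start` within one function's graph."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def summarize(func):
    """Taint transfer summary: parameter index -> sinks that parameter reaches."""
    graph = VDGS[func]
    summary = {}
    for node in graph:
        if node.startswith("param#"):
            sinks = sorted(n for n in reachable(graph, node) if n.startswith("sink:"))
            if sinks:
                summary[int(node.split("#")[1])] = sinks
    return summary

def find_flows(func, source):
    """Follow taint from `source` inside `func`, crossing calls via summaries."""
    findings = []
    for node in sorted(reachable(VDGS[func], source)):
        if node.startswith("sink:"):
            findings.append((func, node[len("sink:"):]))
        elif node.startswith("call:"):
            callee, index = node[len("call:"):].split("#")
            for sink in summarize(callee).get(int(index), []):
                findings.append((callee, sink[len("sink:"):]))
    return findings

flows = find_flows("app.handle", "request")
print(flows)  # prints [('db.run_query', 'cursor.execute')]
```

The summary lets the caller's analysis reuse what is already known about the callee instead of re-walking the callee's graph at every call site.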

### How It Works

```
Source Code → Tree-sitter AST → Call Graph → Variable Dependency Graph → Taint Analysis → Findings
                                     ↓
                              Inter-procedural
                              Taint Summaries
                              (cross-file flows)
```

1. **Parse**: Tree-sitter builds ASTs for Python, Dockerfiles, and Docker Compose files
2. **Index**: Extract functions, call sites, parameters, and assignments into a queryable call graph
3. **Analyze**: Build VDGs per function, resolve inter-procedural flows, run taint analysis
4. **Detect**: Python-based security rules query the graph to find source-to-sink paths
5. **Report**: Output findings as text, JSON, SARIF (GitHub Code Scanning), or CSV
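The Parse and Index steps above can be approximated in a few lines. The sketch below uses Python's standard `ast` module as a stand-in for Tree-sitter (an assumption made to keep the example dependency-free; it does not reflect how Code Pathfinder is implemented), and the `SOURCE` snippet and `build_call_graph` helper are hypothetical.

```python
import ast

# Hypothetical source to index; it is only parsed, never executed.
SOURCE = '''
def get_user(request):
    name = request.get("name")
    return lookup(name)

def lookup(name):
    return run_query("SELECT * FROM users WHERE name = " + name)

def run_query(sql):
    cursor.execute(sql)
'''

def build_call_graph(source):
    """Map each top-level function to the sorted names of the callables it invokes."""
    tree = ast.parse(source)
    graph = {}
    for func in tree.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        calls = set()
        for node in ast.walk(func):
            if isinstance(node, ast.Call):
                callee = node.func
                if isinstance(callee, ast.Name):         # plain call: lookup(...)
                    calls.add(callee.id)
                elif isinstance(callee, ast.Attribute):  # method call: cursor.execute(...)
                    calls.add(callee.attr)
        graph[func.name] = sorted(calls)
    return graph

call_graph = build_call_graph(SOURCE)
print(call_graph)
# prints {'get_user': ['get', 'lookup'], 'lookup': ['run_query'], 'run_query': ['execute']}
```

A real indexer also has to resolve imports and attribute receivers to decide which definition a name refers to; this sketch matches on bare names only.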

## 190 Security Rules, Ready to Use

Rules download from a CDN automatically, so there is no need to clone the repo or manage rule files.

| Language | Bundles | Rules | Coverage |
|----------|---------|-------|----------|
| **[Python](https://codepathfinder.dev/registry/python)** | django, flask, aws_lambda, cryptography, jwt, lang, deserialization, pyramid | 158 | SQL injection, RCE, SSRF, path traversal, XSS, deserialization, crypto misuse, JWT vulnerabilities |
| **[Docker](https://codepathfinder.dev/registry/docker)** | security, best-practice, performance | 37 | Root user, exposed secrets, image pinning, multi-stage builds, layer optimization |
| **[Docker Compose](https://codepathfinder.dev/registry/docker-compose)** | security, networking | 10 | Privileged mode, socket exposure, capability escalation, network isolation |

```bash
# Scan with a specific bundle
pathfinder scan --ruleset python/django --project .

# Scan with multiple bundles
pathfinder scan --ruleset python/flask --ruleset python/jwt --project .

# Scan with a single rule
pathfinder scan --ruleset python/PYTHON-DJANGO-SEC-001 --project .

# Scan with all rules for a language
pathfinder scan --ruleset python/all --project .
```

Browse all rules with examples and test cases at the [Rule Registry](https://codepathfinder.dev/registry).

## MCP Server for AI Coding Assistants

Code Pathfinder runs as an [MCP server](https://codep