# Quick Start Guide

Before we begin, make sure you have:

- [Bun](https://bun.sh) installed (the project uses Bun as its runtime and package manager)
- Git
- A Google Gemini API key

Clone the repository:

```sh
git clone https://github.com/mpmeetpatel/sniffhunt-scraper.git
cd sniffhunt-scraper
```
Install dependencies:

```sh
bun install
```

This installs all dependencies for the entire workspace, including all apps.

Copy the example environment file:

```sh
cp .env.example .env
```

Edit the .env file and add your Gemini API key:

```sh
# Required
GOOGLE_GEMINI_KEY=your_actual_api_key_here

# Optional: provide multiple keys for load balancing and to avoid rate limits
GOOGLE_GEMINI_KEY1=your_alternative_key_1
GOOGLE_GEMINI_KEY2=your_alternative_key_2
GOOGLE_GEMINI_KEY3=your_alternative_key_3

# Optional (defaults shown)
PORT=8080
MAX_RETRY_COUNT=2
RETRY_DELAY=1000
PAGE_TIMEOUT=10000
CORS_ORIGIN=*
```
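Before starting the server, it can help to fail fast if the required key is missing. A minimal sketch (the `require_env` helper is our own name, not part of SniffHunt):

```sh
# Hypothetical pre-flight check: fail fast if a required variable is empty.
require_env() {
  eval "val=\${$1:-}"          # indirect lookup of the variable named in $1
  if [ -z "$val" ]; then
    echo "error: $1 is not set - edit your .env" >&2
    return 1
  fi
}

# In a real script you would first load .env with: set -a; . ./.env; set +a
GOOGLE_GEMINI_KEY=your_actual_api_key_here
require_env GOOGLE_GEMINI_KEY && echo "GOOGLE_GEMINI_KEY is set"
```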

Choose your preferred way to use SniffHunt:

## Option 1: API Server + Web UI

Perfect for interactive use and web application integration.

```sh
bun run dev:server
```

The server will start on http://localhost:8080.

```sh
# In a new terminal
bun run dev:web
```

Open http://localhost:6001 in your browser for the beautiful web interface.

```sh
# Test the API
curl -X POST http://localhost:8080/scrape-sync \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "mode": "normal"}'
```

## Option 2: MCP Integration

Integrate SniffHunt directly with Claude Desktop, Cursor, or other MCP-compatible AI tools.

```sh
bun run setup:mcp
```

This builds the MCP server and makes it globally available.

Add this to your MCP client configuration (e.g., Cursor, Windsurf, VSCode, Claude Desktop):

`mcp_config.json`:

```json
{
  "mcpServers": {
    "sniffhunt-scraper": {
      "command": "npx",
      "args": ["-y", "sniffhunt-scraper-mcp-server"],
      "env": {
        "GOOGLE_GEMINI_KEY": "your-api-key-here"
      }
    }
  }
}
```
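Malformed JSON is a common reason an MCP client silently ignores a server entry. As a sanity check (assuming `python3` is available), you can write the config via a heredoc and validate it before restarting your client:

```sh
# Write the config shown above and sanity-check it before restarting the client.
cat > mcp_config.json <<'EOF'
{
  "mcpServers": {
    "sniffhunt-scraper": {
      "command": "npx",
      "args": ["-y", "sniffhunt-scraper-mcp-server"],
      "env": { "GOOGLE_GEMINI_KEY": "your-api-key-here" }
    }
  }
}
EOF

# python3 -m json.tool exits non-zero on invalid JSON
python3 -m json.tool mcp_config.json > /dev/null && echo "config is valid JSON"
```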

Restart your AI client and try asking:

> Scrape https://anu-vue.netlify.app/guide/components/alert.html & grab the 'Outlined Alert Code snippets'

The AI will automatically use SniffHunt to extract the content!

## Option 3: CLI

Perfect for automation, scripting, and one-off extractions.

```sh
# Scrape any website
bun run cli:scraper https://anu-vue.netlify.app/guide/components/alert.html
```

Output is saved as `scraped.raw.md` or `scraped.md` (the filename is generated automatically based on mode and query), along with `scraped.html`.
```sh
# Use normal mode for static sites
bun run cli:scraper https://anu-vue.netlify.app/guide/components/alert.html --mode normal

# Use beast mode for complex sites
bun run cli:scraper https://anu-vue.netlify.app/guide/components/alert.html --query "Grab the Outlined Alert Code snippets" --mode beast

# Add a semantic query for focused extraction, with a custom output filename
bun run cli:scraper https://anu-vue.netlify.app/guide/components/alert.html --query "Grab the Outlined Alert Code snippets" --output my-content
```
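If you need to scrape several pages, a small wrapper loop works well. The URL list and name derivation below are our own sketch, not a built-in feature; `echo` is used instead of running the scraper so you can preview the commands first:

```sh
# Hypothetical batch wrapper: derive an output name per URL and print the
# command that would run. Replace echo with the real command once it looks right.
urls="https://anu-vue.netlify.app/guide/components/alert.html
https://example.com"

for url in $urls; do
  # file-system-safe name: strip the scheme, replace anything non-alphanumeric
  name=$(printf '%s' "$url" | sed 's|^[a-z]*://||; s|[^A-Za-z0-9]|-|g')
  echo "would run: bun run cli:scraper $url --output $name"
done
```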
Check that the API server is healthy:

```sh
curl http://localhost:8080/health
```

Expected response:

```json
{
  "status": "healthy",
  "service": "SniffHunt Scraper API",
  "version": "1.0.0"
}
```
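In scripts you can assert on the `status` field rather than eyeballing the output. A sketch using the documented response, echoed inline here; in practice you would pipe `curl -s http://localhost:8080/health` instead:

```sh
# Parse the documented /health payload and check the "status" field.
# Swap the echo for: curl -s http://localhost:8080/health
status=$(echo '{"status":"healthy","service":"SniffHunt Scraper API","version":"1.0.0"}' \
  | python3 -c 'import json, sys; print(json.load(sys.stdin)["status"])')

[ "$status" = "healthy" ] && echo "API is healthy"
```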
Run your first real scrape via the API:

```sh
curl -X POST http://localhost:8080/scrape-sync \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://anu-vue.netlify.app/guide/components/alert.html",
    "mode": "normal",
    "query": "Grab the Outlined Alert Code snippets"
  }'
```
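For longer payloads, keeping the JSON body in a file avoids shell-quoting headaches; curl reads it with `-d @file`. The file name below is our own choice:

```sh
# Keep the request body in a file and let curl read it with -d @request.json.
cat > request.json <<'EOF'
{
  "url": "https://anu-vue.netlify.app/guide/components/alert.html",
  "mode": "normal",
  "query": "Grab the Outlined Alert Code snippets"
}
EOF

# Validate the body before sending (python3 -m json.tool fails on bad JSON):
python3 -m json.tool request.json > /dev/null && echo "request body is valid JSON"

# Then, with the server running:
# curl -X POST http://localhost:8080/scrape-sync \
#   -H "Content-Type: application/json" \
#   -d @request.json
```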
Or via the CLI:

```sh
bun run cli:scraper https://anu-vue.netlify.app/guide/components/alert.html --query "Grab the Outlined Alert Code snippets"
```
Or via the web UI:

1. Open http://localhost:6001
2. Enter URL: https://anu-vue.netlify.app/guide/components/alert.html
3. Select mode: "Normal"
4. Add query: "Grab the Outlined Alert Code snippets"
5. Click "Extract Content"

Ready to extract some content? Choose your preferred integration method above and start scraping! 🚀