The webglean command-line tool wraps the full API — scrape, crawl, extract, map, search, monitor, and batch scrape — for use directly from your terminal or shell scripts. It's built on top of the Node.js SDK.
Install
npm install -g webglean-cli
This installs a webglean binary. Package page: webglean-cli on npm.
Authenticate
webglean login <api-key> # saved to ~/.webglean/config.json
webglean whoami # show which key/base URL is active and where it came from
webglean logout # remove the saved key
The API key is resolved in this order, and the first source that provides a value wins: --api-key <key> flag → WEBGLEAN_API_KEY env var → the key saved by webglean login. --base-url / WEBGLEAN_BASE_URL follow the same order for overriding the API host, which is useful for pointing the CLI at a self-hosted or staging instance.
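The lookup order can be mimicked in a few lines of shell. This is a sketch of the precedence only, not the CLI's actual implementation; the function name `resolve_api_key` and its three positional arguments are hypothetical.

```shell
# Hypothetical helper mirroring the documented lookup order:
#   1. --api-key flag   2. WEBGLEAN_API_KEY env var   3. key saved by `webglean login`
# Arguments: $1 = flag value, $2 = env value, $3 = saved value (any may be empty).
resolve_api_key() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  elif [ -n "$3" ]; then
    printf '%s\n' "$3"
  else
    # Nothing configured anywhere: report on stderr and signal failure.
    echo "no API key configured" >&2
    return 1
  fi
}
```

In practice this means a one-off `--api-key` on the command line overrides an exported `WEBGLEAN_API_KEY`, which in turn overrides the saved login.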
Commands
# Scrape a single URL — prints Markdown to stdout by default
webglean scrape https://example.com
webglean scrape https://example.com --format html -o page.html
webglean scrape https://example.com --no-only-main-content # keep nav/ads/footers
# Crawl a site (waits for completion by default; --no-wait returns just the job id)
webglean crawl https://example.com --max-depth 2 --max-pages 20
webglean crawl https://example.com --output ./pages # write each page as markdown
webglean crawl-status <id>
# Extract structured fields via Claude
webglean extract https://example.com --prompt "get the product price and title"
webglean extract https://example.com --schema-file ./schema.json
# Discover URLs on a site without scraping content
webglean map https://example.com --max-urls 200
# Search the web, get back scraped Markdown results
webglean search "best espresso machines 2026" --num-results 5
# Monitor a URL for changes
webglean monitor create https://example.com --interval daily
webglean monitor list
webglean monitor get <id>
webglean monitor delete <id>
# Scrape many URLs in one call (waits by default)
webglean batch scrape https://a.com https://b.com https://c.com
webglean batch scrape https://a.com https://b.com --output ./results
webglean batch status <id>
Every command accepts a global --json flag to print raw JSON instead of formatted output, e.g. webglean scrape https://example.com --json.
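In scripts, the URL list for `batch scrape` often lives in a file rather than on the command line. A minimal sketch, assuming a plain-text file with one URL per line (blank lines and `#` comments skipped); the function name `batch_from_file` is hypothetical, and only flags documented on this page are used.

```shell
# Hypothetical wrapper: scrape every URL listed in a file (one per line)
# with a single `webglean batch scrape` call, writing results to a directory.
batch_from_file() {
  url_file="$1"
  out_dir="$2"
  # Clear the function's positional parameters, then collect the URLs into them.
  set --
  while IFS= read -r url || [ -n "$url" ]; do
    case "$url" in
      ''|'#'*) continue ;;   # skip blank lines and comment lines
    esac
    set -- "$@" "$url"
  done < "$url_file"
  if [ "$#" -eq 0 ]; then
    echo "no URLs found in $url_file" >&2
    return 1
  fi
  webglean batch scrape "$@" --output "$out_dir"
}
```

A call such as `batch_from_file urls.txt ./results` would then expand to the same form as the `batch scrape ... --output ./results` example above.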
Errors
Any non-2xx API response prints Error: <message> (HTTP <status>) to stderr and exits with code 1 — safe to check in scripts with $?.
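Because failures exit non-zero and write their message to stderr, a script can branch on the exit status and keep error text separate from scraped output. A sketch, assuming the hypothetical helper name `scrape_to_file` and a default log file `errors.log`:

```shell
# Hypothetical helper: scrape one URL to a file. On failure, the CLI's stderr
# message is appended to a log and the exit status is passed back to the caller.
scrape_to_file() {
  url="$1"
  out="$2"
  log="${3:-errors.log}"
  if webglean scrape "$url" -o "$out" 2>>"$log"; then
    echo "ok: $url"
  else
    status=$?   # exit status of the failed webglean call
    echo "failed: $url (exit $status)" >&2
    return "$status"
  fi
}
```

A caller can then write `scrape_to_file https://example.com page.md || exit 1` to stop on the first failure, or ignore the return value to continue with the next URL.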
Full reference
| Flag | Applies to | Description |
|---|---|---|
| -f, --format <format> | scrape, batch scrape | markdown (default), html, text, or json |
| --no-only-main-content | scrape, batch scrape | Include nav, ads, footers, and sidebars |
| -o, --output <path> | scrape, crawl, batch scrape | Write to a file (scrape) or directory (crawl, batch scrape) instead of stdout |
| --no-wait | crawl, batch scrape | Return the job id immediately instead of polling until it finishes |
| --poll-interval <ms> / --timeout <ms> | crawl, batch scrape | Tune the polling loop used when waiting |
| --schema-file <path> / --prompt <text> | extract | At least one is required |