The webglean command-line tool wraps the full API — scrape, crawl, extract, map, search, monitor, and batch scrape — for use directly from your terminal or shell scripts. It's built on top of the Node.js SDK.

Install

npm install -g webglean-cli

This installs a webglean binary. Package page: webglean-cli on npm.

Authenticate

webglean login <api-key>     # saved to ~/.webglean/config.json
webglean whoami               # show which key/base URL is active and where it came from
webglean logout                # remove the saved key

The API key is resolved in this order: --api-key <key> flag → WEBGLEAN_API_KEY env var → the key saved by webglean login. --base-url and WEBGLEAN_BASE_URL follow the same order and override the API host, which is useful for pointing the CLI at a self-hosted or staging instance.
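The lookup order above can be illustrated with a small shell function. This is only a sketch of the precedence rule, not the CLI's actual code: resolve_setting is a hypothetical helper, and the "(from ...)" labels are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the precedence rule: flag, then env var, then saved config.
# resolve_setting is a hypothetical helper, not part of webglean.
#
# Usage: resolve_setting FLAG_VALUE ENV_VALUE SAVED_VALUE
# Prints the first non-empty value and a label naming its source.
resolve_setting() {
  if [ -n "$1" ]; then
    printf '%s (from flag)\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s (from env)\n' "$2"
  elif [ -n "$3" ]; then
    printf '%s (from saved config)\n' "$3"
  else
    # Nothing was supplied by any of the three sources.
    echo "no value configured" >&2
    return 1
  fi
}
```

For example, resolve_setting "" "$WEBGLEAN_API_KEY" "" prints the environment value only when no flag value is given. To see what the real CLI resolved, run webglean whoami.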

Commands

# Scrape a single URL — prints Markdown to stdout by default
webglean scrape https://example.com
webglean scrape https://example.com --format html -o page.html
webglean scrape https://example.com --no-only-main-content   # keep nav/ads/footers

# Crawl a site (waits for completion by default; --no-wait returns just the job id)
webglean crawl https://example.com --max-depth 2 --max-pages 20
webglean crawl https://example.com --output ./pages          # write each page as Markdown
webglean crawl-status <id>

# Extract structured fields via Claude
webglean extract https://example.com --prompt "get the product price and title"
webglean extract https://example.com --schema-file ./schema.json

# Discover URLs on a site without scraping content
webglean map https://example.com --max-urls 200

# Search the web, get back scraped Markdown results
webglean search "best espresso machines 2026" --num-results 5

# Monitor a URL for changes
webglean monitor create https://example.com --interval daily
webglean monitor list
webglean monitor get <id>
webglean monitor delete <id>

# Scrape many URLs in one call (waits by default)
webglean batch scrape https://a.com https://b.com https://c.com
webglean batch scrape https://a.com https://b.com --output ./results
webglean batch status <id>

Every command accepts a global --json flag to print raw JSON instead of formatted output, e.g. webglean scrape https://example.com --json.

Errors

Any non-2xx API response prints Error: <message> (HTTP <status>) to stderr and exits with code 1 — safe to check in scripts with $?.
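A script can go one step further than checking $? and pull the HTTP status out of the error line. The sketch below assumes the error is the first line written to stderr and matches the shape shown above exactly. http_status_of and run_checked are hypothetical helpers, not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for scripts that wrap a failing command.

# Print the status code from a line shaped like
# "Error: <message> (HTTP <status>)"; print nothing if it does not match.
http_status_of() {
  printf '%s\n' "$1" | sed -n 's/^Error: .* (HTTP \([0-9][0-9]*\))$/\1/p'
}

# Run any command, pass its stdout through, and on failure report the
# exit code plus the HTTP status parsed from the first stderr line.
run_checked() {
  local err_file status code
  err_file=$(mktemp)
  if "$@" 2>"$err_file"; then
    status=0
  else
    status=$?
    code=$(http_status_of "$(head -n 1 "$err_file")")
    echo "failed with exit $status, HTTP ${code:-unknown}" >&2
  fi
  rm -f "$err_file"
  return "$status"
}
```

With the CLI installed and authenticated, a script might call run_checked webglean scrape https://example.com -o page.md || exit 1 and branch on the reported status.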

Full reference

| Flag | Applies to | Description |
| --- | --- | --- |
| -f, --format <format> | scrape, batch scrape | markdown (default), html, text, or json |
| --no-only-main-content | scrape, batch scrape | Include nav, ads, footers, and sidebars |
| -o, --output <path> | scrape, crawl, batch scrape | Write to a file (scrape) or directory (crawl, batch scrape) instead of stdout |
| --no-wait | crawl, batch scrape | Return the job id immediately instead of polling until it finishes |
| --poll-interval <ms> / --timeout <ms> | crawl, batch scrape | Tune the polling loop used when waiting |
| --schema-file <path> / --prompt <text> | extract | At least one is required |
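To make the two polling flags concrete, here is a generic polling loop with an interval and a timeout, both in milliseconds. It is a sketch of the general technique only: poll_until is a hypothetical helper, and how the CLI implements its own loop is not documented here. Fractional sleep values are assumed to be supported, as they are with GNU and BSD sleep.

```shell
#!/usr/bin/env bash
# Generic poll-with-timeout loop (hypothetical; not the CLI's own code).
#
# Usage: poll_until INTERVAL_MS TIMEOUT_MS COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL_MS until it succeeds. Gives up once the
# accumulated wait reaches TIMEOUT_MS and returns 1.
poll_until() {
  local interval_ms=$1 timeout_ms=$2 waited_ms=0
  shift 2
  while ! "$@"; do
    if [ "$waited_ms" -ge "$timeout_ms" ]; then
      echo "timed out after ${waited_ms} ms" >&2
      return 1
    fi
    # Convert milliseconds to a seconds value such as 0.250 for sleep.
    sleep "$(printf '%d.%03d' $((interval_ms / 1000)) $((interval_ms % 1000)))"
    # Only sleep time is counted; time spent inside COMMAND is ignored.
    waited_ms=$((waited_ms + interval_ms))
  done
}
```

A shorter interval notices completion sooner at the cost of more checks, while the timeout bounds how long a script can block on a job that never finishes.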