POST /v1/map

Returns the URLs discoverable on a site, up to maxUrls. It tries sitemap.xml first for speed and falls back to breadth-first (BFS) link extraction via Playwright if no sitemap is found.

Cost: 1 credit per job

Request

curl -X POST https://www.webglean.com/v1/map \
  -H "Authorization: Bearer wg_your_key" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "maxUrls": 100,
    "search": "/blog/"
  }'

Body parameters

Parameter  Type    Default   Description
url        string  required  The root domain or URL to map
maxUrls    number  100       Maximum number of URLs to return (up to 5000)
search     string  optional  Filter: only return URLs containing this string
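
For readers calling the endpoint from Python rather than curl, the parameters above could be assembled roughly as follows. This is a minimal sketch using only the standard library: build_map_payload and build_map_request are hypothetical helper names, and the lower bound of 1 on maxUrls is an assumption (only the upper bound of 5000 is documented).

```python
import json
import urllib.request

MAP_ENDPOINT = "https://www.webglean.com/v1/map"  # taken from the curl example
MAX_URLS_LIMIT = 5000  # documented upper bound for maxUrls


def build_map_payload(url, max_urls=100, search=None):
    """Return the JSON body for POST /v1/map as a dict (hypothetical helper)."""
    if not isinstance(url, str) or not url:
        raise ValueError("url is required")
    # Assumption: values below 1 are rejected; the docs only state the 5000 cap.
    if not 1 <= max_urls <= MAX_URLS_LIMIT:
        raise ValueError(f"maxUrls must be between 1 and {MAX_URLS_LIMIT}")
    payload = {"url": url, "maxUrls": max_urls}
    if search is not None:  # search is optional, so omit it when unset
        payload["search"] = search
    return payload


def build_map_request(api_key, payload):
    """Wrap the payload in a POST request with the headers from the curl example."""
    return urllib.request.Request(
        MAP_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_map_payload("https://example.com", max_urls=100, search="/blog/")
request = build_map_request("wg_your_key", payload)
```

Passing request to urllib.request.urlopen would send it. Validating maxUrls locally catches an out-of-range value before the request is sent.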

Response

{
  "success": true,
  "links": [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2"
  ],
  "total": 4
}
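
A response of this shape could be unpacked along these lines. The sketch assumes only the three fields shown above; parse_map_response is a hypothetical helper, and the body returned when success is false is not documented here, so the code checks the flag and nothing else.

```python
import json


def parse_map_response(body):
    """Return the list of links from a /v1/map JSON body, or raise on failure."""
    data = json.loads(body)
    if not data.get("success"):
        # The failure body is not documented, so only the flag is inspected.
        raise RuntimeError("map job did not report success")
    return data.get("links", [])


sample = """
{
  "success": true,
  "links": [
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2"
  ],
  "total": 4
}
"""

links = parse_map_response(sample)
print(len(links))  # prints 4 for the sample above
```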

Use cases

  • Discover all blog posts before crawling them
  • Build a site index for search or RAG
  • Find all product pages on an e-commerce site
  • Audit a site's URL structure
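
As one illustration of the URL-structure audit, the returned links could be grouped by their first path segment. This is a sketch and not part of the API; count_by_section is a hypothetical helper that works on any list of URLs.

```python
from collections import Counter
from urllib.parse import urlparse


def count_by_section(links):
    """Count URLs by their first path segment ("/" stands for the site root)."""
    counts = Counter()
    for link in links:
        segments = [s for s in urlparse(link).path.split("/") if s]
        section = "/" + segments[0] if segments else "/"
        counts[section] += 1
    return counts


sections = count_by_section([
    "https://example.com/",
    "https://example.com/about",
    "https://example.com/blog/post-1",
    "https://example.com/blog/post-2",
])
```

For the four sample links this gives two URLs under /blog and one each for /about and the root.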

Notes

  • Sitemap fast path: if sitemap.xml or sitemap_index.xml exists, results return in under 1 second
  • BFS fallback: if no sitemap is found, Playwright visits up to 5 pages and extracts their links, which also covers dynamic sites
  • The search filter is applied after URL discovery, not during, so it does not affect credit cost
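
To make the discover-then-filter order concrete, the sketch below reads the <loc> entries from a sitemap document and only afterwards applies a substring filter. It illustrates the general technique, not the service's actual implementation. Both function names are hypothetical, and case-sensitive matching is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def extract_sitemap_urls(xml_text):
    """Return every <loc> value from a sitemap or sitemap index document."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.iterfind(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


def apply_search(urls, search=None):
    """Keep URLs containing the substring; assumes a case-sensitive match."""
    if not search:
        return list(urls)
    return [url for url in urls if search in url]


sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/about</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

discovered = extract_sitemap_urls(sample_sitemap)  # discovery runs first
blog_only = apply_search(discovered, "/blog/")     # filtering runs afterwards
```

Because filtering is a separate pass over the discovered URLs, the same apply_search helper can slice an unfiltered result set on the client without starting another map job.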

Errors

Code  Reason
401   Invalid API key
402   Insufficient credits
429   Rate limit exceeded
400   Missing or invalid url, or the target domain doesn't exist
504   Map timed out; try again
502   The target site refused the connection, blocked automated access, or had an SSL error
500   Map failed for another reason
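
A client might handle these codes as sketched below. Retrying on 504 follows the "try again" hint in the table. Retrying on 429 with exponential backoff is an assumption, since no retry policy or rate-limit header is documented. call_with_retry is a hypothetical helper, and send stands for any zero-argument callable that performs the request and raises urllib.error.HTTPError on a non-2xx status.

```python
import time
import urllib.error

# Reasons copied from the error table.
ERROR_REASONS = {
    400: "Missing or invalid url, or the target domain doesn't exist",
    401: "Invalid API key",
    402: "Insufficient credits",
    429: "Rate limit exceeded",
    500: "Map failed for another reason",
    502: "The target site refused the connection, blocked automated access, "
         "or had an SSL error",
    504: "Map timed out",
}

# Assumption: only these two codes are worth retrying automatically.
RETRYABLE = {429, 504}


def call_with_retry(send, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying retryable HTTP errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return send()
        except urllib.error.HTTPError as err:
            last_attempt = attempt == attempts - 1
            if err.code not in RETRYABLE or last_attempt:
                reason = ERROR_REASONS.get(err.code, "Unknown error")
                raise RuntimeError(f"{err.code}: {reason}") from err
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

With the standard library, send could be lambda: urllib.request.urlopen(request). Errors such as 401 and 402 are raised immediately because repeating the same request would not change the outcome.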