Crawling of Entire Websites

Adrian Krebs, Co-Founder & CEO of Kadoa
20 February 2025

Today we're launching Kadoa Crawling, an addition to our platform that converts entire websites into a clean, LLM-ready format, so you can connect web content directly to your agent or LLM.

While our workflows with automated navigation work for most use cases, some applications require processing entire websites:

  • Content repositories
  • Product catalogs for market intelligence
  • Document archives
  • Documentation sites or knowledge bases

Our simple crawling API allows you to crawl all accessible subpages of a website and convert them into markdown.

curl -X POST "https://api.kadoa.com/v4/crawl" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "maxDepth": 10
  }'
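
If you call the endpoint from scripts, it can help to wrap the request above in small functions. The following is a minimal sketch, not an official client: the function names and the KADOA_API_KEY environment variable are assumptions made for this example, and the Content-Type header is included on the assumption that the API expects JSON bodies to be declared as such. Only the endpoint URL, the x-api-key header, and the url and maxDepth fields come from the request above.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the crawl request in reusable functions.
# Function names and KADOA_API_KEY are hypothetical, chosen for this example.

# Print the JSON request body for a start URL and a maximum crawl depth.
# The depth defaults to 10, matching the example request.
build_crawl_payload() {
  local url="$1" max_depth="${2:-10}"
  # Reject a non-numeric depth so the generated JSON stays valid.
  case "$max_depth" in
    *[!0-9]*)
      echo "maxDepth must be a non-negative integer" >&2
      return 1
      ;;
  esac
  # Simplification: the URL is inserted as-is, so it must not contain
  # double quotes or backslashes.
  printf '{"url": "%s", "maxDepth": %s}\n' "$url" "$max_depth"
}

# Send the crawl request. Aborts with a message if KADOA_API_KEY is unset.
start_crawl() {
  : "${KADOA_API_KEY:?set KADOA_API_KEY first}"
  local payload
  payload="$(build_crawl_payload "$@")" || return 1
  curl --fail --silent --show-error -X POST "https://api.kadoa.com/v4/crawl" \
    -H "x-api-key: ${KADOA_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$payload"
}
```

After sourcing the file, `start_crawl "https://docs.example.com" 3` would send the same request with a depth of 3. Keeping payload construction separate from the network call makes the body easy to inspect before anything is sent.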

Kadoa handles the entire pipeline:

  • Smart site navigation and discovery
  • JavaScript rendering
  • Content extraction
  • Markdown conversion
  • Metadata enrichment
  • Rate limiting and retry logic
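
Kadoa applies rate limiting and retries on its side during a crawl, and how it does so is not described here. For readers unfamiliar with the pattern, a generic retry loop with exponential backoff looks roughly like the sketch below. The function name and its parameters are made up for illustration and are not part of Kadoa's API.

```shell
#!/usr/bin/env bash
# Generic illustration of retry with exponential backoff.
# retry_with_backoff is a hypothetical helper, not a Kadoa feature.

# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached,
# doubling the pause after each failed attempt.
retry_with_backoff() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

The same helper could wrap a client-side API call, for example `retry_with_backoff 4 1 curl --fail ...`, which pauses 1, 2, and then 4 seconds between its four attempts.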

The result is clean, structured data ready for your LLM applications.

Get started

Crawling is available via the API today as part of our platform, and we will soon integrate it into our standard workflow setup process.

Visit our documentation to learn more or contact us to discuss your use case.

Feedback

Where do you struggle the most when working with unstructured data? How might Kadoa help you? Send us your thoughts, ideas, and concerns via the feedback form.