Python

Install and use the official Octivas Python SDK for web extraction, crawling, and search.

The SDK provides a typed interface for the Octivas API. The current release on PyPI is 0.1.3.

Installation

pip install octivas

Requires Python 3.9 or newer.

Quick Start

from octivas import Octivas

client = Octivas(api_key="your_api_key")

Use the client as a context manager so the HTTP connection is closed cleanly:

from octivas import Octivas

with Octivas(api_key="your_api_key") as client:
    result = client.scrape("https://example.com")
    print(result.markdown)
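
Hard-coding the key is fine for a first test, but shared code usually reads it from the environment. The following is a minimal sketch, assuming an environment variable named OCTIVAS_API_KEY. That name and the load_api_key helper are hypothetical and chosen for this example; this page does not say whether the SDK reads any environment variable on its own.

```python
import os


def load_api_key(env=None, name="OCTIVAS_API_KEY"):
    """Return the API key stored in the environment, or raise if it is missing.

    OCTIVAS_API_KEY is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable to your API key")
    return key
```

The returned value can then be passed to the constructor, for example Octivas(api_key=load_api_key()).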

Scrape a Page

Use scrape to fetch a single URL. The optional formats parameter selects which representations to return (for example, markdown, html, or summary). See the scrape endpoint for all options.

result = client.scrape(
    "https://example.com",
    formats=["markdown", "html"],
)

print(result.markdown)
if result.metadata:
    print(result.metadata.title)

Crawl a Website

Crawling runs as a job on the server: crawl submits the job and returns a job ID immediately. For a blocking flow, use crawl_and_wait, which polls until the job finishes (or times out).

The parameters are named url (the start URL) and limit (the maximum number of pages), not start_url and max_pages.

status = client.crawl_and_wait(
    "https://docs.example.com",
    limit=50,
)

if status.status == "completed" and status.results:
    # Stored crawl payload: pages_crawled, pages, credits_used, etc.
    payload = status.results
    pages = payload.get("pages", [])
    print(f"Crawled {payload.get('pages_crawled', len(pages))} pages")
    for page in pages:
        meta = page.get("metadata") or {}
        print(f"  {page.get('url')}: {meta.get('title')}")
elif status.status == "failed" and status.error:
    print(status.error.message)

To submit without blocking, call crawl, then poll with get_job(job_id, include_results=True) or use wait_for_crawl(job_id).
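
To make the polling step concrete, here is a rough sketch of a generic polling loop. The poll_job helper, its interval and timeout defaults, and the rule that any status other than "completed" or "failed" means the job is still running are assumptions made for this example; they are not part of the SDK, and wait_for_crawl remains the simpler choice when it fits.

```python
import time


def poll_job(fetch_status, interval=2.0, timeout=300.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job reaches a terminal state or time runs out.

    fetch_status is any zero-argument callable returning an object with a
    .status attribute. This helper is hypothetical, not an SDK function.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        # "completed" and "failed" are the terminal states used on this page;
        # every other value is assumed to mean the job is still in progress.
        if status.status in ("completed", "failed"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still {status.status!r} after {timeout} seconds")
        sleep(interval)
```

With a real client this would be called as poll_job(lambda: client.get_job(job_id, include_results=True)), where job_id is the ID returned by crawl.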

Search the Web

results = client.search(
    "python web scraping best practices",
    limit=10,
)

for item in results.results:
    print(item.title or item.url)
    print(item.url)
    if item.markdown:
        print(item.markdown[:200])
    print("---")

Optional parameters include tbs (time range), location, country, and only_main_content. See the search endpoint for the full list.

Map URLs on a Site

Discover links without scraping full page content:

mapped = client.map("https://example.com", limit=100)
for link in mapped.links:
    print(link.url, link.title)
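
A mapped link list is often narrowed down before any page is scraped. The sketch below keeps only the URLs under a given path prefix and drops duplicates. The links_under helper is hypothetical; it relies only on each link having a url attribute, as in the loop above.

```python
from urllib.parse import urlsplit


def links_under(links, prefix):
    """Return unique URLs whose path starts with prefix, in first-seen order."""
    seen = set()
    selected = []
    for link in links:
        url = link.url
        if url in seen:
            continue  # skip URLs that were already considered
        seen.add(url)
        if urlsplit(url).path.startswith(prefix):
            selected.append(url)
    return selected
```

For example, links_under(mapped.links, "/docs/") would return only the documentation URLs, which could then be passed to scrape one at a time.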

Async Usage

For asyncio code, use AsyncOctivas with the same methods (await client.scrape(...), await client.search(...), etc.):

import asyncio

from octivas import AsyncOctivas

async def main():
    async with AsyncOctivas(api_key="your_api_key") as client:
        result = await client.scrape("https://example.com", formats=["markdown"])
        print(result.markdown)

asyncio.run(main())
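
The main benefit of the async client is running several requests at once. A minimal sketch of that pattern follows, using asyncio.gather with a semaphore to cap the number of requests in flight. The scrape_many helper and its max_concurrency default are hypothetical, and the sketch assumes that one AsyncOctivas client can safely serve concurrent calls, which this page does not confirm.

```python
import asyncio


async def scrape_many(client, urls, max_concurrency=5):
    """Scrape urls concurrently and return results in the same order as urls.

    client is any object with an async scrape(url, formats=...) method.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def scrape_one(url):
        # The semaphore keeps at most max_concurrency requests in flight.
        async with semaphore:
            return await client.scrape(url, formats=["markdown"])

    return await asyncio.gather(*(scrape_one(url) for url in urls))
```

Inside an async with AsyncOctivas(...) as client block, this would be used as pages = await scrape_many(client, urls). A small cap also makes rate-limit errors less likely than launching every request at once.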

Complete Example

from octivas import Octivas

with Octivas(api_key="your_api_key") as client:
    page = client.scrape("https://example.com", formats=["markdown"])
    print(page.markdown)

    crawl = client.crawl_and_wait("https://example.com", limit=10)
    if crawl.status == "completed" and crawl.results:
        for p in crawl.results.get("pages", []):
            print(p.get("url"))

    found = client.search("python tutorials", limit=5)
    for r in found.results:
        print(r.title or r.url)

Error Handling

from octivas import Octivas
from octivas.exceptions import AuthenticationError, RateLimitError

with Octivas(api_key="your_api_key") as client:
    try:
        result = client.scrape("https://example.com")
    except AuthenticationError:
        print("Invalid API key")
    except RateLimitError:
        print("Rate limit exceeded, please wait")
    except Exception as e:
        print(f"An error occurred: {e}")

Other typed exceptions include BadRequestError, NotFoundError, ForbiddenError, and ServerError.
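
Rate-limit errors are usually transient, so one common response is to retry with exponential backoff. The sketch below is a generic helper. The name call_with_retries, the attempt count, and the delays are assumptions for illustration; this page does not say whether the SDK retries internally or reports how long to wait.

```python
import time


def call_with_retries(func, retry_on, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff when it raises retry_on.

    retry_on is an exception class or a tuple of classes. Hypothetical helper,
    not part of the SDK.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the last error
            # Wait base_delay, then twice that, then four times that, and so on.
            sleep(base_delay * 2 ** attempt)
```

With the imports from the example above, a call would look like call_with_retries(lambda: client.scrape("https://example.com"), retry_on=RateLimitError). Errors such as AuthenticationError are left out of retry_on on purpose, since retrying cannot fix an invalid key.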
