Python
Install and use the official Octivas Python SDK for web extraction, crawling, and search.
The official Python SDK provides a typed interface for the Octivas API. The current release on PyPI is 0.1.3 (pip install octivas).
Installation
pip install octivas
Requires Python 3.9 or newer.
Quick Start
from octivas import Octivas
client = Octivas(api_key="your_api_key")
Use the client as a context manager so the HTTP connection is closed cleanly:
from octivas import Octivas
with Octivas(api_key="your_api_key") as client:
    result = client.scrape("https://example.com")
    print(result.markdown)
Scrape a Page
Use scrape to fetch a single URL. The optional formats argument selects which representations to return (for example markdown, html, summary). See the scrape endpoint for all options.
result = client.scrape(
    "https://example.com",
    formats=["markdown", "html"],
)
print(result.markdown)
if result.metadata:
    print(result.metadata.title)
Crawl a Website
Crawling runs as a job on the server: crawl submits work and returns a job id immediately. For a blocking flow, use crawl_and_wait, which polls until the job finishes (or times out).
The parameters are named url (the start URL) and limit (the maximum number of pages), not start_url / max_pages.
status = client.crawl_and_wait(
    "https://docs.example.com",
    limit=50,
)
if status.status == "completed" and status.results:
    # Stored crawl payload: pages_crawled, pages, credits_used, etc.
    payload = status.results
    pages = payload.get("pages", [])
    print(f"Crawled {payload.get('pages_crawled', len(pages))} pages")
    for page in pages:
        meta = page.get("metadata") or {}
        print(f" {page.get('url')}: {meta.get('title')}")
elif status.status == "failed" and status.error:
    print(status.error.message)
To submit without blocking, call crawl, then poll with get_job(job_id, include_results=True) or use wait_for_crawl(job_id).
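
The manual polling flow can be sketched as a small generic helper. This is only an illustration of the pattern, not the SDK's implementation: poll_until_done, its parameters, and the default interval and timeout are hypothetical names and values chosen for the example. The helper does not import the SDK, so it can be read and tested on its own.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until_done(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() until is_done(result) is true, or raise TimeoutError.

    Hypothetical helper for illustration; not part of the Octivas SDK.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        # Give up if waiting one more interval would pass the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job did not finish within {timeout} seconds")
        sleep(interval)
```

With the SDK, fetch could be lambda: client.get_job(job_id, include_results=True) and is_done could be lambda job: job.status in ("completed", "failed"), assuming those two statuses are the only terminal ones. In most cases wait_for_crawl(job_id) or crawl_and_wait is the simpler choice; a hand-written loop is mainly useful when other work must happen between polls.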
Search the Web
results = client.search(
    "python web scraping best practices",
    limit=10,
)
for item in results.results:
    print(item.title or item.url)
    print(item.url)
    if item.markdown:
        print(item.markdown[:200])
    print("---")
Optional parameters include tbs (time range), location, country, and only_main_content. See the search endpoint.
Map URLs on a Site
Discover links without scraping full page content:
mapped = client.map("https://example.com", limit=100)
for link in mapped.links:
    print(link.url, link.title)
Async Usage
For asyncio code, use AsyncOctivas with the same methods (await client.scrape(...), await client.search(...), etc.):
import asyncio
from octivas import AsyncOctivas
async def main():
    async with AsyncOctivas(api_key="your_api_key") as client:
        result = await client.scrape("https://example.com", formats=["markdown"])
        print(result.markdown)
# Run the coroutine from synchronous code.
asyncio.run(main())
Complete Example
from octivas import Octivas
with Octivas(api_key="your_api_key") as client:
    page = client.scrape("https://example.com", formats=["markdown"])
    print(page.markdown)
    crawl = client.crawl_and_wait("https://example.com", limit=10)
    if crawl.status == "completed" and crawl.results:
        for p in crawl.results.get("pages", []):
            print(p.get("url"))
    found = client.search("python tutorials", limit=5)
    for r in found.results:
        print(r.title or r.url)
Error Handling
from octivas import Octivas
from octivas.exceptions import AuthenticationError, RateLimitError
with Octivas(api_key="your_api_key") as client:
    try:
        result = client.scrape("https://example.com")
    except AuthenticationError:
        print("Invalid API key")
    except RateLimitError:
        print("Rate limit exceeded, please wait")
    except Exception as e:
        print(f"An error occurred: {e}")
Other typed exceptions include BadRequestError, NotFoundError, ForbiddenError, and ServerError.
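
Rather than only printing a message when a rate limit is hit, a caller may want to retry after a pause. The following is a minimal sketch of exponential backoff, assuming that retrying a rate-limited request later is acceptable for your workload. call_with_backoff and its parameters are hypothetical and not part of the SDK; the helper takes the exception types to retry on as an argument, so it does not depend on the SDK itself.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration; not part of the Octivas SDK.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 1
    while True:
        try:
            return fn()
        except retry_on:
            # Out of attempts: let the last exception propagate to the caller.
            if attempt == max_attempts:
                raise
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** (attempt - 1)))
            attempt += 1
```

With the SDK, this might be used as call_with_backoff(lambda: client.scrape("https://example.com"), (RateLimitError,)). Whether the API reports a recommended wait time is not covered here, so the delays above are arbitrary defaults; AuthenticationError and BadRequestError are deliberately not retried, since repeating the same request would presumably fail the same way.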