hCaptcha solving for web scraping.

When a scrape target throws up an hCaptcha gate, you read its data-sitekey and page URL, POST them to NoneCap, get back a real P1_ token, and submit that token as the form’s h-captcha-response. Then you keep scraping. You run no headless browser and no image-recognition model on your side.

The pipeline, end to end

An hCaptcha gate is a small, fixed handshake. Your scraper detects it, hands the two identifying values to NoneCap, and submits the token NoneCap returns. The token does the unlocking; the rest of your scrape is unchanged.

hCaptcha solving pipeline for a scraper
Step	What you do	NoneCap field
1 · Detect	See the `h-captcha` div / `hcaptcha.com/1/api.js` in the HTML	(none)
2 · Read	Pull `data-sitekey` and the page URL	`sitekey`, `url`
3 · Solve	POST the job, block with `?wait` or poll	`POST /v1/solves`
4 · Token	Read the `P1_` string off the solved object	`token`
5 · Inject	Send it as the form’s `h-captcha-response`	(none)
6 · Continue	Scrape the now-unlocked response	(none)

1. Detect the gate and read the sitekey

hCaptcha hydrates a placeholder div client-side. In the HTML you already fetched, look for an h-captcha class or a reference to hcaptcha.com/1/api.js, then pull data-sitekey off the widget. That sitekey plus the page url are the only two values NoneCap needs.

What the gate looks like

# the gate is just a div hCaptcha hydrates client-side:
<div class="h-captcha" data-sitekey="f5ab1c2d-7e8f-4a9b-b1c2-d3e4f5a6b7c8"></div>

# the same sitekey is also passed to the hcaptcha.com/1/api.js script tag,
# or to grecaptcha-compat / hcaptcha.render() in the page's JS bundle.

Avoid hard-coding the sitekey: sites rotate them, and enterprise deployments bind the challenge to the page, so always read it from the live response.
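
Reading the sitekey from the live response can be sketched with the standard-library HTML parser, which tolerates single-quoted attributes and any attribute order. This is a minimal sketch, not NoneCap code: the names `find_hcaptcha_sitekey` and `_SitekeyFinder` are hypothetical, and it only covers the widget div, not keys passed to hcaptcha.render() in a JS bundle.

```python
from html.parser import HTMLParser
from typing import Optional


class _SitekeyFinder(HTMLParser):
    """Remembers data-sitekey from the first element with class h-captcha."""

    def __init__(self) -> None:
        super().__init__()
        self.sitekey: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.sitekey is not None:
            return  # keep the first widget only
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if "h-captcha" in classes and attr.get("data-sitekey"):
            self.sitekey = attr["data-sitekey"]


def find_hcaptcha_sitekey(html: str) -> Optional[str]:
    """Return the widget's sitekey, or None if no h-captcha div is present."""
    finder = _SitekeyFinder()
    finder.feed(html)
    return finder.sitekey
```

A `None` result on a page that still references hcaptcha.com/1/api.js usually means the key is set in JavaScript, and you have to pull it from the script instead.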

2. POST the solve and read the token

Create a solve with type: "hcaptcha", the sitekey, and the url. Pass ?wait=N (up to 90 seconds) to block until the solve reaches a terminal state and get the token back inline, so a scraper worker makes a single request instead of running a submit-then-poll loop. A solved response looks like this:

Solved object

{
  "id": "solve_01HQF7K3JKWZX",
  "object": "solve",
  "type": "hcaptcha",
  "status": "solved",
  "sitekey": "f5ab1c2d-7e8f-4a9b-b1c2-d3e4f5a6b7c8",
  "url": "https://target.example/listings",
  "token": "P1_eyJ0eXAi...UV8w",
  "error": null,
  "credits_charged": 1
}

The token lives in the token field whenever status is solved. For long-running crawls you can drop ?wait and pass a webhook_url, or poll GET /v1/solves/{id} instead. See the API reference for every field and language.
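
If you poll instead of blocking, the loop could look roughly like the sketch below. It is deliberately transport-agnostic: `fetch_solve` is a hypothetical callable that you would implement as a GET to /v1/solves/{id} returning the parsed JSON. The non-solved terminal status strings ("failed", "cancelled", "expired") are an assumption inferred from the billing notes, not confirmed field values, so check them against the API reference.

```python
import time
from typing import Callable, Dict

# Assumption: exact status strings other than "solved" are not confirmed here.
TERMINAL_STATUSES = {"solved", "failed", "cancelled", "expired"}


def poll_solve(
    fetch_solve: Callable[[], Dict],
    interval: float = 2.0,
    max_wait: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_solve() until the solve is terminal or max_wait elapses."""
    deadline = time.monotonic() + max_wait
    while True:
        solve = fetch_solve()
        if solve.get("status") in TERMINAL_STATUSES:
            return solve
        if time.monotonic() >= deadline:
            raise TimeoutError(f"solve not terminal after {max_wait}s")
        sleep(interval)
```

With requests, `fetch_solve` might be `lambda: requests.get(f"https://api.nonecap.com/v1/solves/{solve_id}", headers=auth, timeout=30).json()`, where `solve_id` and `auth` come from your own code.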

3. Inject the token and continue

The returned P1_ string is what the page’s own JavaScript would have produced after a human solved the challenge. Submit it as the h-captcha-response form field on the request that was gated, then carry on scraping the unlocked response.

Here is the whole loop in one Python file using requests:

scrape_gated.py

import os
import re

import requests

NONECAP_KEY = os.environ["NONECAP_KEY"]
session = requests.Session()


def solve_hcaptcha(sitekey: str, url: str) -> str:
    """Mint a real hCaptcha token via NoneCap and return the P1_ string."""
    r = requests.post(
        "https://api.nonecap.com/v1/solves",
        headers={"Authorization": f"Bearer {NONECAP_KEY}"},
        params={"wait": 90},                 # block up to 90s for a terminal state
        json={"type": "hcaptcha", "sitekey": sitekey, "url": url},
        timeout=120,
    )
    r.raise_for_status()
    solve = r.json()
    if solve["status"] != "solved":
        raise RuntimeError(f"solve {solve['status']}: {solve.get('error')}")
    return solve["token"]                    # a real P1_… hCaptcha token


def scrape_gated_page(url: str):
    # 1. fetch the page and detect the hCaptcha gate
    page = session.get(url, timeout=30)
    if "h-captcha" not in page.text and "hcaptcha.com" not in page.text:
        return page                          # no gate, keep scraping as normal

    # 2. read data-sitekey off the widget div in the HTML
    match = re.search(r'data-sitekey=["\']([0-9a-f-]+)["\']', page.text)
    if match is None:
        # gate markers are present but no widget div: the key is likely set in JS
        raise RuntimeError(f"hCaptcha detected on {url} but no data-sitekey found")
    sitekey = match.group(1)

    # 3. POST the sitekey + page URL, 4. get the token back
    token = solve_hcaptcha(sitekey, url)

    # 5. inject the token as the form's h-captcha-response and submit
    #    (a real form usually also needs its other fields and its action URL)
    resp = session.post(
        url,
        data={"h-captcha-response": token},
        timeout=30,
    )
    resp.raise_for_status()

    # 6. continue the scrape against the now-unlocked response
    return resp

The same shape works with httpx, aiohttp, or any other HTTP client. Only the request calls change. The hCaptcha handshake itself is identical regardless of your stack.

Cost on a scrape job

Billing is per challenge round: 1 credit = 1 hCaptcha round. hCaptcha decides how many rounds a request needs (often one, sometimes two or three). Credits cost $0.40 to $0.50 per 1,000 depending on the pack, with no subscription, and they never expire. That suits scraping, where volume is bursty and a slow week should cost nothing.
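
To budget a job, multiply gated requests by the average rounds per solve and by the pack price. The sketch below only does that arithmetic. The function name is hypothetical, and the 1.5 rounds-per-solve average in the usage note is an illustrative assumption, not a measured figure.

```python
def estimate_cost_usd(gated_requests: int, avg_rounds: float, price_per_1000: float) -> float:
    """Estimate spend: 1 credit per hCaptcha round, priced per 1,000 credits."""
    credits = gated_requests * avg_rounds
    return credits / 1000 * price_per_1000


# Illustrative inputs only: 10,000 gated requests at an assumed 1.5 rounds each.
low = estimate_cost_usd(10_000, 1.5, 0.40)
high = estimate_cost_usd(10_000, 1.5, 0.50)
print(f"${low:.2f} to ${high:.2f}")
```

Under those assumed inputs the job uses 15,000 credits, which lands between $6.00 and $7.50 depending on the pack.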

Failed, cancelled, and expired solves are auto-refunded, so a retry on a flaky target never double-bills. You only pay for solves that actually return a token.
Each key runs 5 concurrent solves by default (up to 50 on paid packs), so a parallel crawler can keep many gated requests in flight at once.
New accounts get 100 free credits, which is enough to wire the pipeline into your scraper and confirm tokens are accepted before you top up.
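
One way to stay inside the concurrency limit is to cap a thread pool at the per-key default. This is a sketch under assumptions: `solve_many` is a hypothetical helper, and `solve` is any callable with the same (sitekey, url) signature as solve_hcaptcha in scrape_gated.py.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, Tuple

MAX_CONCURRENT_SOLVES = 5  # default per-key limit; raise it if your pack allows more


def solve_many(
    jobs: Iterable[Tuple[str, str]],
    solve: Callable[[str, str], str],
    max_workers: int = MAX_CONCURRENT_SOLVES,
) -> List[str]:
    """Solve (sitekey, url) jobs with at most max_workers requests in flight."""
    job_list = list(jobs)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns tokens in input order and re-raises the first failure
        return list(pool.map(lambda job: solve(*job), job_list))
```

Capping the pool client-side keeps extra jobs queued in your own process instead of sending requests that go over the limit.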

Enterprise hCaptcha (`rqdata`)

Some targets run enterprise hCaptcha, which binds each challenge to a fresh rqdata blob captured from the page. When you hit one, scrape the rqdata value out of the request that sets up the widget and pass it with type: "hcaptcha_enterprise" alongside the sitekey and url. NoneCap returns tokens that enterprise sitekeys accept, where pure image-recognition solvers tend to fail.
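
The request body for both cases could be built as in the sketch below. The type values, sitekey, and url come from the text above. The JSON field name `rqdata` is an assumption based on the section title, and `build_solve_payload` is a hypothetical helper, so confirm the field name in the API reference.

```python
from typing import Dict, Optional


def build_solve_payload(sitekey: str, url: str, rqdata: Optional[str] = None) -> Dict[str, str]:
    """Build the POST /v1/solves body, switching to enterprise when rqdata is given."""
    payload = {"type": "hcaptcha", "sitekey": sitekey, "url": url}
    if rqdata:
        payload["type"] = "hcaptcha_enterprise"
        payload["rqdata"] = rqdata  # assumed field name, not confirmed here
    return payload
```

Because rqdata is described as fresh per challenge, capture it from the same page load you are about to submit and avoid caching it between requests.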

Scope: hCaptcha gates only

This pipeline unlocks hCaptcha gates only: regular, invisible, and enterprise rqdata. If your target sits behind reCAPTCHA v2/v3 or Cloudflare Turnstile, NoneCap does not solve those today (they are on the roadmap), so it cannot get you past a non-hCaptcha gate. Confirm the gate is hCaptcha before wiring this in. The h-captcha div and the hcaptcha.com script are the tell.
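
A quick check of which gate a page carries can be sketched as a substring scan over the fetched HTML. This is a heuristic only: `classify_gate` and `GATE_MARKERS` are hypothetical names, the marker strings are the widget classes and script hosts each vendor commonly uses, and a page can embed more than one gate (the first match in the order below wins).

```python
from typing import Optional

# Checked in order; hCaptcha first so its markers take priority.
GATE_MARKERS = {
    "hcaptcha": ("h-captcha", "hcaptcha.com/1/api.js"),
    "recaptcha": ("g-recaptcha", "google.com/recaptcha", "recaptcha.net/recaptcha"),
    "turnstile": ("cf-turnstile", "challenges.cloudflare.com/turnstile"),
}


def classify_gate(html: str) -> Optional[str]:
    """Return 'hcaptcha', 'recaptcha', 'turnstile', or None if no marker is found."""
    for name, markers in GATE_MARKERS.items():
        if any(marker in html for marker in markers):
            return name
    return None
```

Only a result of "hcaptcha" should be routed into the solve pipeline; anything else is out of scope for it.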

If your scrape needs a full browser context (sites that only render listings after JS execution, or that watch for a real DOM), pair this with hCaptcha solving in Playwright, where you inject the same token into the live page instead of a bare HTTP POST.

Last updated June 2026.

Frequently asked

Where do I get the sitekey for a scrape target?

It is in the page HTML you already fetched. hCaptcha renders a div with class h-captcha and a data-sitekey attribute; read that attribute (or the key passed to hcaptcha.render() in the page JS). Pass that exact value as sitekey along with the page url to POST /v1/solves.

Do I need a real browser to use the token?

No. NoneCap solves the challenge on its own side, and your scraper only needs to submit the returned P1_ string as the h-captcha-response form field on the request that was gated. A plain requests/httpx client works. If the target requires a full browser anyway, see hCaptcha solving with Playwright.

What happens to my credits if a solve fails on a flaky target?

Failed, cancelled, and expired solves are auto-refunded, so a retry never double-bills. A successful solve is billed one credit per challenge round it took (at least 1), reported in credits_charged, and only when it returns a token. That keeps retry loops on bursty scrape jobs cheap.

Can NoneCap get me past a reCAPTCHA or Cloudflare Turnstile gate on the same site?

No. NoneCap solves hCaptcha only: regular, invisible, and enterprise (rqdata). If a target is behind reCAPTCHA or Turnstile, those gates are out of scope today (they are on the roadmap), so this pipeline only unlocks the hCaptcha-gated requests.