hCaptcha solving for web scraping.
When a scrape target throws up an hCaptcha gate, you read its
data-sitekey and page URL, POST them to NoneCap, get back a real
P1_ token, and submit that token as the form’s
h-captcha-response. Then you keep scraping. You run no headless
browser and no image-recognition model on your side.
The pipeline, end to end
An hCaptcha gate is a small, fixed handshake. Your scraper detects it, hands the two identifying values to NoneCap, and submits the token NoneCap returns. The token does the unlocking; the rest of your scrape is unchanged.
| Step | What you do | NoneCap field |
|---|---|---|
| 1 · Detect | See the h-captcha div / hcaptcha.com/1/api.js in the HTML | (none) |
| 2 · Read | Pull data-sitekey and the page URL | sitekey, url |
| 3 · Solve | POST the job, block with ?wait or poll | POST /v1/solves |
| 4 · Token | Read the P1_ string off the solved object | token |
| 5 · Inject | Send it as the form’s h-captcha-response | (none) |
| 6 · Continue | Scrape the now-unlocked response | (none) |
1. Detect the gate and read the sitekey
hCaptcha hydrates a placeholder div client-side. In the HTML you
already fetched, look for an h-captcha class or a reference to
hcaptcha.com/1/api.js, then pull data-sitekey off the
widget. That sitekey plus the page url are the only two values
NoneCap needs.
# the gate is just a div hCaptcha hydrates client-side:
<div class="h-captcha" data-sitekey="f5ab1c2d-7e8f-4a9b-b1c2-d3e4f5a6b7c8"></div>
# the same sitekey is also passed to the hcaptcha.com/1/api.js script tag,
# or to grecaptcha-compat / hcaptcha.render() in the page's JS bundle.
Avoid hard-coding the sitekey: sites rotate them, and enterprise deployments bind the challenge to the page, so always read it from the live response.
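As a rough illustration of the read step, the sketch below pulls `data-sitekey` off the widget element with Python's standard-library HTML parser instead of a regular expression. The names `SitekeyFinder` and `find_sitekey` are made up for this example, and it only covers the case where the sitekey is present as an attribute in the fetched HTML, not one passed to `hcaptcha.render()` from a JS bundle.

```python
from html.parser import HTMLParser
from typing import Optional


class SitekeyFinder(HTMLParser):
    """Records the first data-sitekey found on an element with class h-captcha."""

    def __init__(self) -> None:
        super().__init__()
        self.sitekey: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.sitekey is not None:
            return  # keep only the first widget on the page
        attr_map = dict(attrs)
        # class can hold several names, and valueless attributes come back as None
        classes = (attr_map.get("class") or "").split()
        if "h-captcha" in classes and attr_map.get("data-sitekey"):
            self.sitekey = attr_map["data-sitekey"]


def find_sitekey(html: str) -> Optional[str]:
    """Return the widget's sitekey, or None when no h-captcha element is present."""
    finder = SitekeyFinder()
    finder.feed(html)
    return finder.sitekey
```

Returning `None` rather than raising lets the caller decide whether a missing sitekey means "no gate" or "gate rendered from JS, look elsewhere".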
2. POST the solve and read the token
Create a solve with type: "hcaptcha", the sitekey, and
the url. Pass ?wait=N (up to 90s) to block until the
solve reaches a terminal state and return the token inline, so a scraper worker
is a single request instead of a submit-then-poll loop:
{
  "id": "solve_01HQF7K3JKWZX",
  "object": "solve",
  "type": "hcaptcha",
  "status": "solved",
  "sitekey": "f5ab1c2d-7e8f-4a9b-b1c2-d3e4f5a6b7c8",
  "url": "https://target.example/listings",
  "token": "P1_eyJ0eXAi...UV8w",
  "error": null,
  "credits_charged": 1
}
The token lives in the token field whenever
status is solved. For long-running crawls you can drop
?wait and pass a webhook_url, or poll
GET /v1/solves/{id} instead. See the
API reference for every field and language.
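For the polling variant, a minimal sketch of the loop might look like the following. It takes the HTTP call as a `fetch` argument so that the loop itself stays independent of the client library. The helper name `poll_solve`, the two-second interval, and every status name other than `solved` are assumptions made for illustration; the API reference has the actual status values.

```python
import time
from typing import Callable, Dict

# "solved" comes from the response shown earlier. The other terminal names
# are guesses for illustration and may not match the real API.
TERMINAL = {"solved", "failed", "cancelled", "expired"}


def poll_solve(fetch: Callable[[str], Dict], solve_id: str,
               interval: float = 2.0, max_wait: float = 90.0) -> Dict:
    """Call fetch(solve_id) until the solve is terminal or max_wait elapses."""
    deadline = time.monotonic() + max_wait
    while True:
        solve = fetch(solve_id)
        if solve.get("status") in TERMINAL:
            return solve
        # stop before sleeping if the next attempt would land past the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"solve {solve_id} still {solve.get('status')!r}")
        time.sleep(interval)
```

In a real worker, `fetch` would wrap a `GET /v1/solves/{id}` request with the same bearer header used to create the solve and return the parsed JSON body.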
3. Inject the token and continue
The returned P1_ string is what the page’s own JavaScript would
have produced after a human solved the challenge. Submit it as the
h-captcha-response form field on the request that was gated, then
carry on scraping the unlocked response.
Here is the whole loop in one Python file using requests:
import os
import re

import requests

NONECAP_KEY = os.environ["NONECAP_KEY"]
session = requests.Session()


def solve_hcaptcha(sitekey: str, url: str) -> str:
    """Mint a real hCaptcha token via NoneCap and return the P1_ string."""
    r = requests.post(
        "https://api.nonecap.com/v1/solves",
        headers={"Authorization": f"Bearer {NONECAP_KEY}"},
        params={"wait": 90},  # block up to 90s for a terminal state
        json={"type": "hcaptcha", "sitekey": sitekey, "url": url},
        timeout=120,  # client timeout stays above the 90s server-side wait
    )
    r.raise_for_status()
    solve = r.json()
    if solve["status"] != "solved":
        raise RuntimeError(f"solve {solve['status']}: {solve.get('error')}")
    return solve["token"]  # a real P1_… hCaptcha token


def scrape_gated_page(url: str) -> requests.Response:
    # 1. fetch the page and detect the hCaptcha gate
    page = session.get(url, timeout=30)
    if "h-captcha" not in page.text and "hcaptcha.com" not in page.text:
        return page  # no gate, keep scraping as normal

    # 2. read data-sitekey off the widget div in the HTML
    match = re.search(r'data-sitekey="([0-9a-f-]+)"', page.text)
    if match is None:
        # gate markers present but no sitekey attribute (e.g. rendered from JS)
        raise RuntimeError("hCaptcha detected but no data-sitekey in the HTML")
    sitekey = match.group(1)

    # 3. POST the sitekey + page URL, 4. get the token back
    token = solve_hcaptcha(sitekey, url)

    # 5. inject the token as the form's h-captcha-response and submit
    resp = session.post(
        url,
        data={"h-captcha-response": token},
        timeout=30,
    )
    resp.raise_for_status()

    # 6. continue the scrape against the now-unlocked response
    return resp
The same shape works with httpx, aiohttp, or any other
HTTP client. Only the request calls change. The hCaptcha handshake itself is
identical regardless of your stack.
Cost on a scrape job
Billing is per challenge round: 1 credit = 1 hCaptcha round (hCaptcha decides how many rounds a request needs: often one, sometimes two or three), priced at $0.40-$0.50 per 1,000 credits depending on the pack, with no subscription and credits that never expire. That maps cleanly onto scraping, where volume is bursty and a slow week should cost nothing.
- Failed, cancelled, and expired solves are auto-refunded, so a retry on a flaky target never double-bills. You only pay for solves that actually return a token.
- 5 concurrent solves per key by default (up to 50 on paid packs), so a parallel crawler can keep many gated requests in flight at once.
- New accounts get 100 free credits, which is enough to wire the pipeline into your scraper and confirm tokens are accepted before you top up.
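To make the per-round billing concrete, here is a small back-of-the-envelope calculation. The job size (20,000 gated requests) and the average of 1.5 rounds per request are invented for the example; only the $0.40 and $0.50 per 1,000 credits prices come from the text above. The function name `estimate_cost_usd` is likewise hypothetical.

```python
def estimate_cost_usd(gated_requests: int, avg_rounds: float,
                      price_per_1k: float) -> float:
    """Credits = gated requests x average rounds; cost = credits / 1000 x price."""
    credits = gated_requests * avg_rounds
    return credits / 1000 * price_per_1k


# Illustrative job: 20,000 gated requests at an assumed 1.5 rounds each
# is 30,000 credits, priced at both ends of the quoted range.
low = estimate_cost_usd(20_000, 1.5, 0.40)
high = estimate_cost_usd(20_000, 1.5, 0.50)
print(f"${low:.2f} to ${high:.2f}")  # prints "$12.00 to $15.00"
```

Because refunded solves cost nothing, the estimate only needs to count requests that end with a token.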
Enterprise hCaptcha (rqdata)
Some targets run enterprise hCaptcha, which binds each challenge to a fresh
rqdata blob captured from the page. When you hit one, scrape the
rqdata value out of the request that sets up the widget and pass it
with type: "hcaptcha_enterprise" alongside the sitekey
and url. NoneCap returns tokens that enterprise sitekeys accept, where
pure image-recognition solvers tend to fail.
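A sketch of how the request body might differ between the two cases is shown below. The `type` values, `sitekey`, and `url` come from the text above; the JSON key name `rqdata` and the helper `build_solve_payload` are assumptions for illustration, so check the API reference for the actual field name before relying on it.

```python
from typing import Dict, Optional


def build_solve_payload(sitekey: str, url: str,
                        rqdata: Optional[str] = None) -> Dict[str, str]:
    """Build the JSON body for POST /v1/solves.

    NOTE: the "rqdata" key name is an assumption, not confirmed by the source.
    """
    payload = {"type": "hcaptcha", "sitekey": sitekey, "url": url}
    if rqdata:
        # enterprise gate: switch the type and attach the freshly captured blob
        payload["type"] = "hcaptcha_enterprise"
        payload["rqdata"] = rqdata
    return payload
```

Since each challenge is bound to a fresh blob, the `rqdata` value should be captured immediately before the solve rather than cached across requests.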
Scope: hCaptcha gates only
This pipeline unlocks hCaptcha gates only: regular, invisible, and enterprise
rqdata. If your target sits behind reCAPTCHA v2/v3 or Cloudflare Turnstile, NoneCap does not solve those today (they are on the roadmap), so it cannot get you past a non-hCaptcha gate. Confirm the gate is hCaptcha before wiring this in. The h-captcha div and the hcaptcha.com script are the tell.
If your scrape needs a full browser context (sites that only render listings after JS execution, or that watch for a real DOM), pair this with hCaptcha solving in Playwright, where you inject the same token into the live page instead of a bare HTTP POST.
Last updated June 2026.
Frequently asked
Where do I get the sitekey for a scrape target?
In the HTML you fetched, look for the element with the class h-captcha and a data-sitekey attribute; read that attribute (or the key passed to hcaptcha.render() in the page JS). Pass that exact value as sitekey along with the page url to POST /v1/solves.
Do I need a real browser to use the token?
No. Submit the P1_ string as the h-captcha-response form field on the request that was gated. A plain requests/httpx client works. If the target requires a full browser anyway, see hCaptcha solving with Playwright.
What happens to my credits if a solve fails on a flaky target?
Failed, cancelled, and expired solves are auto-refunded. A solve is billed credits_charged = the number of challenge rounds it took (at least 1), and only when it returns a token. That keeps retry loops on bursty scrape jobs cheap.
Can NoneCap get me past a reCAPTCHA or Cloudflare Turnstile gate on the same site?
No. NoneCap solves hCaptcha only (regular, invisible, and enterprise rqdata). If a target is behind reCAPTCHA or Turnstile, those gates are out of scope today (they are on the roadmap), so this pipeline only unlocks the hCaptcha-gated requests.