For Quant Researchers

Polymarket data for quantitative research

Resolved Markets is built for the kind of research that breaks when a vendor deletes data after 31 days. We keep every orderbook snapshot ever captured in ClickHouse, stamp each one with monotonic sequence numbers and millisecond-precision event and capture timestamps, and pair every crypto snapshot with the Binance spot price at capture time. The fields you need for microstructure work are first-class.

Last updated: 2026-05-05

Historical retention: Full archive
Crypto capture rate: ~20 Hz
Backend: ClickHouse
Gap detection: Per-token sequence numbers


What makes the dataset useful for research

Full archive, no retention cliff. Multi-month strategy validation, longitudinal studies, and event studies on dated events (elections, FOMC, sports playoffs, hurricanes) all need data older than 31 days. Resolved Markets keeps it.
Microstructure-grade fields. Each snapshot has full bids and asks arrays as Array(Tuple(price, size)), plus best bid/ask, mid, spread, top-5 cumulative depth, sequence number, event timestamp, capture timestamp, and the paired crypto spot price with its staleness in milliseconds.
Gap detection. The monotonic sequence_number on every snapshot lets you detect dropped events from the upstream Polymarket WebSocket — important when you are reconstructing a true book and want to know where you can trust the depth.
Cross-category coverage. Crypto, sports, economics, weather, and Hyperliquid perpetual futures are all available through one API key. This is useful for cross-asset and cross-event studies that competitor APIs can't support.
ClickHouse backend. Aggregations across billions of rows return in seconds. Enterprise customers get direct query access; everyone else gets the same data shape via REST and the rm-api download CLI, which writes Parquet files locally.

Snapshot schema (what every record contains)

{
  "condition_id":     "0x...",            // Polymarket market id
  "token_id":         "1234...",          // UP or DOWN token
  "side":             "UP",               // UP | DOWN
  "event_timestamp":  "2026-05-04T14:23:00.123Z",   // Polymarket emitted
  "capture_timestamp":"2026-05-04T14:23:00.131Z",   // we processed
  "sequence_number":  82731,
  "best_bid":         0.6231,
  "best_ask":         0.6244,
  "mid":              0.62375,
  "spread":           0.0013,
  "bids":             [[0.6231, 412.5], [0.6225, 800.0], ...],
  "asks":             [[0.6244, 350.0], [0.6250, 612.0], ...],
  "depth5_bid":       4123.7,             // cum size, top 5 levels
  "depth5_ask":       3998.4,
  "spot_crypto_usd":  62418.50,           // Binance ref at capture
  "spot_crypto_age_ms": 84
}
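Because the raw bids and asks arrays ship alongside the derived fields, a record can be sanity-checked before it enters a study. The following is a minimal sketch, assuming the field names from the example record above and that depth5_bid and depth5_ask sum the sizes of the five best price levels; the helper names derive_top_of_book and mismatched_fields are hypothetical and not part of any Resolved Markets library.

```python
from math import isclose


def derive_top_of_book(snapshot, levels=5):
    """Recompute best bid/ask, mid, spread and top-N depth from the raw book."""
    # Sort explicitly so the result does not depend on the order of the arrays.
    bids = sorted(snapshot["bids"], key=lambda lvl: lvl[0], reverse=True)
    asks = sorted(snapshot["asks"], key=lambda lvl: lvl[0])
    best_bid, best_ask = bids[0][0], asks[0][0]
    return {
        "best_bid": best_bid,
        "best_ask": best_ask,
        "mid": (best_bid + best_ask) / 2,
        "spread": best_ask - best_bid,
        "depth5_bid": sum(size for _, size in bids[:levels]),
        "depth5_ask": sum(size for _, size in asks[:levels]),
    }


def mismatched_fields(snapshot, tol=1e-9):
    """Return names of stored fields that disagree with the recomputed values."""
    derived = derive_top_of_book(snapshot)
    return [
        name
        for name, value in derived.items()
        if not isclose(value, snapshot[name], abs_tol=tol)
    ]
```

An empty list from mismatched_fields means the stored top-of-book fields agree with the book; a non-empty list names the fields worth inspecting before the record is used.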

Patterns for common quant tasks

Backfill a strategy in vectorbt or Backtrader

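import os

import pandas as pd
from rm_api import Client

condition_id = "0x..."  # Polymarket market id (placeholder)

c = Client(api_key=os.environ["RM_API_KEY"])
# limit=500 is the page size used in the FAQ below; longer windows may need pagination.
snaps = c.snapshots(condition_id, frm="2026-01-01", to="2026-04-01", limit=500)
df = pd.DataFrame(snaps)
# Parse the ISO strings so the index is a sorted DatetimeIndex.
df["capture_timestamp"] = pd.to_datetime(df["capture_timestamp"])
df = df.set_index("capture_timestamp").sort_index()
# Now feed df["mid"] into vectorbt / Backtrader as a price series.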

Bulk parquet export via CLI

rm-api download \
  --category crypto --subcategory BTC \
  --from 2026-01-01 --to 2026-05-01 \
  --format parquet --out ./btc_snapshots/

Detect gaps before computing realized spread

SELECT condition_id, token_id,
       max(sequence_number) - min(sequence_number) AS span,
       count() AS rows,
       (max(sequence_number) - min(sequence_number) + 1) - count() AS missing
FROM polymarket.snapshots_hf
WHERE capture_timestamp >= now() - INTERVAL 1 DAY
GROUP BY condition_id, token_id
HAVING missing > 0
ORDER BY missing DESC;
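The query above counts how many sequence numbers are missing per token but not where they fall. For data already pulled locally, a minimal Python sketch can locate the missing ranges so the affected windows can be excluded. It assumes, as the query does, that sequence numbers for a single token increase by exactly 1 per snapshot; the function name find_sequence_gaps is hypothetical.

```python
def find_sequence_gaps(sequence_numbers):
    """Return (first_missing, last_missing) ranges for one token's sequence numbers."""
    # Deduplicate and sort so input order and repeated rows do not matter.
    seen = sorted(set(sequence_numbers))
    gaps = []
    for prev, curr in zip(seen, seen[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps
```

Run it once per token_id, since the sequence is per token. For example, with a DataFrame of snapshots, df.groupby("token_id")["sequence_number"].apply(find_sequence_gaps) gives one list of gaps per token.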

Academic and research access

University researchers with a .edu email can request extended Enterprise access in exchange for a citation in published work. Email info@elcara.xyz with a short description of the project. The dataset has already been used for studies on prediction-market liquidity, microstructure of binary outcomes, and cross-asset event studies.

Frequently asked questions

Why is Resolved Markets useful for quantitative research?

Three reasons. (1) Full historical archive: there is no 31-day retention limit, so multi-month and multi-quarter studies are possible. (2) Microstructure-grade fields: every snapshot carries best bid/ask, full depth, mid, spread, sequence number, event_timestamp, capture_timestamp, and a paired crypto spot price for cross-asset joins. (3) ClickHouse backend: analytical queries over billions of rows return in seconds, not minutes.

How do I pull historical data efficiently?

Use /v1/markets/:id/snapshots with limit=500 and ISO from/to bounds, or paginate by sequence number. For bulk pulls across many markets, the rm-api download CLI command writes Parquet files locally. Enterprise customers get direct ClickHouse query access.
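Pagination by sequence number can be sketched as a cursor loop. This is an illustration under stated assumptions, not the documented client behavior: fetch_page is a hypothetical callable that you supply to wrap your own request, and it is assumed to return snapshots for a single token in ascending sequence order, strictly after the cursor. The exact query parameter for the cursor is not specified here.

```python
def iter_snapshots(fetch_page, start_after=None, limit=500):
    """Yield snapshots page by page, using the last sequence_number as the cursor.

    fetch_page(after_seq, limit) is a hypothetical callable. It must return a
    list of snapshot dicts with sequence_number > after_seq (or from the start
    when after_seq is None), sorted ascending, with at most `limit` items.
    """
    cursor = start_after
    while True:
        page = fetch_page(cursor, limit)
        if not page:
            return
        yield from page
        # Advance the cursor to the last row seen.
        cursor = page[-1]["sequence_number"]
        # A short page means the range is exhausted.
        if len(page) < limit:
            return
```

Because the cursor is a sequence number rather than an offset, a pull that is interrupted can resume by passing the last stored sequence_number as start_after.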

Is there gap detection in the data stream?

Yes. Every snapshot carries a sequence number that is monotonic per token. Gaps in the sequence indicate dropped events from the upstream Polymarket WebSocket and are reported by the /api/debug endpoint and the rm-api gaps command.

Can I cross-reference Polymarket prices with Binance spot?

Each snapshot includes the spot crypto price (from Binance) at capture time, with a price-staleness field measuring milliseconds since the last spot update. This lets you compute implied vs realized spreads or basis without joining external feeds.
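Before joining on the paired spot price, it can help to drop snapshots whose spot reference is stale and to measure how far capture lagged the upstream event. The sketch below assumes the field names from the snapshot schema and ISO 8601 timestamps with a trailing Z; the 250 ms default threshold is an arbitrary illustration, not a recommended value, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_ts(ts):
    """Parse an ISO 8601 timestamp such as 2026-05-04T14:23:00.123Z."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def capture_latency_ms(snapshot):
    """Milliseconds between the upstream event and our capture of it."""
    delta = parse_ts(snapshot["capture_timestamp"]) - parse_ts(snapshot["event_timestamp"])
    return delta / timedelta(milliseconds=1)


def fresh_spot_only(snapshots, max_age_ms=250):
    """Keep snapshots whose paired spot price is at most max_age_ms old."""
    return [s for s in snapshots if s["spot_crypto_age_ms"] <= max_age_ms]
```

For the example record in the schema section, capture_latency_ms gives 8 ms and the 84 ms spot age passes the default filter.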

Do you have an academic / research access program?

Yes — university researchers with .edu addresses can request extended Enterprise access in exchange for a citation in published work. Contact info@elcara.xyz with a brief project description.