I spent the last two months running a Polymarket order-book collector. The collector runs on a small VM, subscribes to the public WebSocket feed, and writes one Parquet file per UTC hour. By 2026-04-15 the archive had grown to 1,262 hourly files, 30,287,264,368 events, 623.8 GB on disk, covering 52 calendar days and 385,198 distinct market ids.

The first version of the paper is up on arXiv, the replication package is on GitHub and Zenodo (DOI 10.5281/zenodo.19811426), and the manuscript is under review at the Journal of Financial Markets. This post walks through what’s in it.

Why bother

Prediction markets aggregate dispersed beliefs into a single price that, in equilibrium, behaves like a probability. The empirical literature has historically focused on price-level questions: forecast accuracy, calibration against realised outcomes, the longshot bias, and the extent to which informed and uninformed traders coexist on the same venue. Yet microstructure is what determines the trading…