The high cost of split R-hat


Statistical Modeling, Causal Inference, and Social Science

1 hour ago · Tech

This post is by Bob.

I’ve been thinking a lot lately about R-hat, given that I’m using it for online convergence monitoring in our new Walnuts implementation. In that setting, where I use Welford accumulators to update R-hat estimates every iteration, I can’t use split R-hat without way too much buffering. So I’ve been thinking about the effect of splitting, too, and whether we need it. I asked Andrew and he said Kenny Shirley once produced an example where split R-hat diagnosed non-convergence that regular R-hat didn’t, but that example is lost to time and we’ve never seen this kind of behavior with NUTS as far as I know (please give us an example in the comments or via email to Andrew if you have one).

Relating R-hat and ESS

My intuition was that we could set a low enough R-hat threshold that it would ensure a high enough effective sample size (ESS) when we crossed it. The relation’s a little tighter than I thought, with Rhat^2 ≈ 1 + M / ESS, where M is the number of chains and ESS is the…
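To make the online setting concrete, here is a minimal Python sketch of non-split R-hat computed from per-chain Welford accumulators, plus the R-hat threshold implied by the approximation Rhat^2 ≈ 1 + M / ESS quoted above. This is not the Walnuts code: the names `Welford`, `rhat`, and `rhat_threshold` are hypothetical, the R-hat formula is the classic between/within-variance version, and the sketch assumes every chain has seen the same number of draws (at least two).

```python
import math


class Welford:
    """Running mean and variance for one chain (Welford's algorithm)."""

    def __init__(self):
        self.n = 0       # number of draws seen so far
        self.mean = 0.0  # running mean
        self.m2 = 0.0    # running sum of squared deviations from the mean

    def update(self, x):
        """Fold in one draw in O(1) time and memory, with no buffering."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def variance(self):
        """Unbiased sample variance; requires at least two draws."""
        return self.m2 / (self.n - 1)


def rhat(accumulators):
    """Non-split R-hat from per-chain accumulators.

    Assumes at least two chains, all with the same draw count n >= 2.
    """
    m = len(accumulators)
    n = accumulators[0].n
    means = [a.mean for a in accumulators]
    grand_mean = sum(means) / m
    # Between-chain variance divided by n: sample variance of chain means.
    b_over_n = sum((mu - grand_mean) ** 2 for mu in means) / (m - 1)
    # Within-chain variance: average of the per-chain sample variances.
    w = sum(a.variance() for a in accumulators) / m
    var_plus = (n - 1) / n * w + b_over_n
    return math.sqrt(var_plus / w)


def rhat_threshold(num_chains, target_ess):
    """R-hat value implied by Rhat^2 ~= 1 + M / ESS (an approximation)."""
    return math.sqrt(1.0 + num_chains / target_ess)


if __name__ == "__main__":
    import random

    random.seed(1234)
    chains = [Welford() for _ in range(4)]
    for _ in range(1000):
        for acc in chains:
            acc.update(random.gauss(0.0, 1.0))  # stand-in for sampler draws
    print("R-hat:", rhat(chains))
    print("threshold for ESS 400 with 4 chains:", rhat_threshold(4, 400))
```

Each accumulator stores three numbers per monitored quantity, which is why the plain version is cheap to update every iteration. Split R-hat needs the chain halves, and the midpoint moves as the chain grows, so the same trick does not apply without holding on to draws.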
