Stein’s method, learning and inference -or- how to really monitor convergence and thin chains

Statistical Modeling, Causal Inference, and Social Science

This post is from Bob.

I’ve been thinking a lot about scores (gradients of the log density function) and how they can be used for convergence monitoring. We know that the expected value of the score is zero. Stein generalized this with Stein operators. In the monomial case, the Stein operators give you a sequence of functions of increasing degree, all of which have zero expectation under the posterior. Here theta is the variable being sampled and S is the score function, so that S(theta) is the gradient of the target log density evaluated at theta.

Order 0: S(theta)
Order 1: 1 + theta .* S(theta)
Order 2: 2 * theta + theta^2 .* S(theta)

This leads to a natural test for convergence of first, second, and third moments. Just compute Monte Carlo estimates of these quantities and see if they’re close to zero, up to Monte Carlo error. We’d want to standardize by the standard deviation to make the result scale-free like R-hat.

To develop some intuitions, in a standard normal distribution p(theta) = normal(theta | 0, I), we have S(theta) =…
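To make the proposed check concrete, here is a minimal sketch in Python of the order 0, 1, and 2 quantities listed above, evaluated on draws in one dimension. Several things here are assumptions rather than anything from the post: the function names are hypothetical, the score used is the textbook score of a one-dimensional standard normal (-theta), the draws are independent rather than from an MCMC chain, and dividing each Monte Carlo mean by its standard error is just one plausible way to standardize.

```python
import math
import random


def score_std_normal(theta):
    """Score of a 1-D standard normal target: d/dtheta log p(theta) = -theta."""
    return -theta


def stein_monomials(theta, score):
    """Return the order 0, 1, and 2 Stein monomial values at theta."""
    s = score(theta)
    return (
        s,                           # order 0: S(theta)
        1.0 + theta * s,             # order 1: 1 + theta * S(theta)
        2.0 * theta + theta**2 * s,  # order 2: 2 * theta + theta^2 * S(theta)
    )


def stein_z_scores(draws, score):
    """Monte Carlo mean of each monomial divided by its standard error.

    Assumes independent draws. For autocorrelated MCMC draws, an effective
    sample size would have to replace len(draws) in the standard error.
    """
    n = len(draws)
    values = [stein_monomials(t, score) for t in draws]
    z = []
    for k in range(3):
        col = [v[k] for v in values]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        z.append(mean / math.sqrt(var / n))
    return z


rng = random.Random(1234)
good = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # draws from the target
bad = [rng.gauss(0.5, 1.0) for _ in range(20000)]   # draws with the wrong mean

z_good = stein_z_scores(good, score_std_normal)
z_bad = stein_z_scores(bad, score_std_normal)
print("correct draws:", [round(z, 2) for z in z_good])
print("shifted draws:", [round(z, 2) for z in z_bad])
```

With draws from the target, all three standardized values should look like standard normal noise. With the shifted draws, the order 0 value should sit far from zero, since the mean of the score is then about -0.5 with a standard error near 0.007.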
