The posterior density for the parameters is proportional to the product of the prior density and the likelihood function. Given the factorizations of the likelihood function in (6) and the prior density in (16), the posterior density can also be factored as the posterior for $b$ and $\sigma$ multiplied by the posterior for $\lambda$ and $G$. We analyze these two posteriors separately and then explain how we combine the posterior moments for $b$ and $\lambda$ to obtain posterior moments for the expected excess return.

D.1. Regression Parameters

The joint prior density $p(b, \sigma)$ is equal to the product $p(b \mid \sigma)\,p(\sigma)$, where the normal prior density for $b$ given $\sigma$ in (9) can be written as

where $\hat{b} = (X'X)^{-1}X'r$ and $T\hat{\sigma}^2 = (r - X\hat{b})'(r - X\hat{b})$. We compute moments of this joint posterior using the Metropolis-Hastings (MH) algorithm, a Markov chain Monte Carlo procedure introduced by Metropolis et al. (1953) and generalized by Hastings (1970). (For an introduction to the MH algorithm, see Chib and Greenberg (1995) or Gilks, Richardson, and Spiegelhalter (1996).)
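The least-squares quantities $\hat{b}$ and $T\hat{\sigma}^2$ defined above can be computed as in the following minimal sketch. It assumes, for illustration only, that $X$ holds a column of ones followed by the factor returns; the function name `ols_summary` and the simulated data are hypothetical and not taken from the paper.

```python
import numpy as np

def ols_summary(X, r):
    """Return (b_hat, T_sigma2_hat) for the regression of r on X."""
    # lstsq solves the least-squares problem without forming (X'X)^{-1} explicitly
    b_hat, *_ = np.linalg.lstsq(X, r, rcond=None)
    resid = r - X @ b_hat
    # T * sigma_hat^2 is the residual sum of squares
    return b_hat, float(resid @ resid)

# Simulated placeholder data (not the paper's sample)
rng = np.random.default_rng(0)
T = 120
factors = rng.normal(size=(T, 2))
X = np.column_stack([np.ones(T), factors])  # assumed layout: intercept, then factors
r = X @ np.array([0.1, 1.0, -0.5]) + rng.normal(scale=0.3, size=T)

b_hat, T_sigma2_hat = ols_summary(X, r)
sigma2_hat = T_sigma2_hat / T
```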

Briefly, a sequence of draws of $b$ and $\sigma$ is constructed by making "candidate" draws from a "proposal" density and then accepting the new candidate or retaining the previous value, based on a rule that ensures the resulting sequence for $(b, \sigma)$ forms a Markov chain whose invariant distribution is the "target" posterior density of interest. The posterior moments of the parameters are computed as the sample moments of a large number of draws. We use a "block-at-a-time" version of the MH algorithm, in which $b$ is drawn directly from the conditional density $p(b \mid \sigma, r, F^{(T)})$, but $\sigma$ is drawn from a proposal density given by the conditional posterior density for $\sigma$ that arises when $\sigma$ and $b$ are made independent in the normal-inverted-gamma prior.$^{13}$ The target density for $\sigma$ is the conditional density $p(\sigma \mid b, r, F^{(T)})$, which is proportional to the right-hand side of (23) (since $b$ is then viewed as a constant and the marginal density of $b$, by definition, does not involve $\sigma$).
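The block-at-a-time scheme can be illustrated with a small sketch. It is not the paper's sampler: the prior used here is an assumed stand-in in which the intercept's prior variance scales with $\sigma^2$ while the slope variances are fixed, and $\sigma^2$ has an inverted-gamma prior. The names `b0`, `kappa_alpha`, `v_beta`, `nu0`, and `s0_sq` are hypothetical hyperparameters, and the data are simulated. Under these assumptions the proposal for $\sigma^2$ equals the target divided by $p(b \mid \sigma)$ up to a constant, so the acceptance ratio reduces to a ratio of prior densities for $b$.

```python
import numpy as np

def prior_var(sigma2, kappa_alpha, v_beta):
    # Assumed prior: intercept variance = kappa_alpha * sigma^2, slope variances fixed
    return np.concatenate(([kappa_alpha * sigma2], v_beta))

def log_prior_b(b, sigma2, b0, kappa_alpha, v_beta):
    """Log of p(b | sigma) up to an additive constant (diagonal normal)."""
    var = prior_var(sigma2, kappa_alpha, v_beta)
    return -0.5 * np.sum(np.log(var)) - 0.5 * np.sum((b - b0) ** 2 / var)

def mh_block_sampler(X, r, b0, kappa_alpha, v_beta, nu0, s0_sq, n_draws, rng):
    T, k = X.shape
    XtX, Xtr = X.T @ X, X.T @ r
    sigma2 = s0_sq  # starting value for the chain
    b_draws, sigma_draws = np.empty((n_draws, k)), np.empty(n_draws)
    accepted = 0
    for i in range(n_draws):
        # Block 1: draw b directly from its normal conditional posterior given sigma
        prec0 = np.diag(1.0 / prior_var(sigma2, kappa_alpha, v_beta))
        cov = np.linalg.inv(prec0 + XtX / sigma2)
        mean = cov @ (prec0 @ b0 + Xtr / sigma2)
        b = mean + np.linalg.cholesky(cov) @ rng.standard_normal(k)
        # Block 2: propose sigma^2 from the inverted-gamma conditional that would
        # hold if b and sigma were independent in the prior
        resid = r - X @ b
        shape = 0.5 * (nu0 + T)
        scale = 0.5 * (nu0 * s0_sq + resid @ resid)
        cand = scale / rng.gamma(shape)  # scale / Gamma(shape, 1) is inverted gamma
        # Independence-chain MH step: weight = target / proposal, here p(b | sigma)
        log_ratio = (log_prior_b(b, cand, b0, kappa_alpha, v_beta)
                     - log_prior_b(b, sigma2, b0, kappa_alpha, v_beta))
        if np.log(rng.uniform()) < log_ratio:
            sigma2 = cand
            accepted += 1
        b_draws[i], sigma_draws[i] = b, np.sqrt(sigma2)
    return b_draws, sigma_draws, accepted / n_draws

# Simulated placeholder data and hypothetical hyperparameters
rng = np.random.default_rng(0)
T = 200
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
b_true, sigma_true = np.array([0.1, 1.0, -0.5]), 0.3
r = X @ b_true + rng.normal(scale=sigma_true, size=T)

b_draws, sigma_draws, acc_rate = mh_block_sampler(
    X, r, b0=np.zeros(3), kappa_alpha=1.0, v_beta=np.array([1.0, 1.0]),
    nu0=3.0, s0_sq=0.1, n_draws=3000, rng=rng)
post_mean_b = b_draws[500:].mean(axis=0)
post_mean_sigma = sigma_draws[500:].mean()
```

Because the candidate for $\sigma^2$ does not depend on the current value of $\sigma^2$, the step is an independence-chain update; how often candidates are accepted depends on how strongly $p(b \mid \sigma)$ varies with $\sigma$.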

We simulate an MH chain of 50,500 draws, discard the first 500 draws, and estimate the posterior moments of $b$ and $\sigma$ over the remaining 50,000 draws. The number of draws is chosen such that, across repeated independent runs of the MH algorithm, differences in the computed first and second moments of $b$ are small enough for us to report at least two decimal places in our results.
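One way to carry out such a check is sketched below. The helper names, the use of standard deviations to summarize second moments, and the 0.005 threshold for "two decimal places" are all assumptions made for illustration. The chains are i.i.d. normal placeholders; real MH output is autocorrelated, so its moments would typically be less precise for the same number of draws.

```python
import numpy as np

def posterior_moments(draws, burn_in=500):
    """Sample mean and standard deviation after discarding the first burn_in draws."""
    kept = np.asarray(draws)[burn_in:]
    return kept.mean(axis=0), kept.std(axis=0, ddof=1)

def max_gap_across_runs(runs, burn_in=500):
    """Largest run-to-run spread in the means and in the standard deviations."""
    means, sds = zip(*(posterior_moments(d, burn_in) for d in runs))
    means, sds = np.array(means), np.array(sds)
    # np.ptp gives max minus min across runs for each parameter
    return np.ptp(means, axis=0).max(), np.ptp(sds, axis=0).max()

# Placeholder chains: i.i.d. normal draws stand in for three independent MH runs
rng = np.random.default_rng(1)
runs = [rng.normal(loc=[0.0, 1.0], scale=[0.05, 0.2], size=(50_500, 2))
        for _ in range(3)]

mean_gap, sd_gap = max_gap_across_runs(runs)
# Hypothetical criterion: spreads below 0.005 leave the second decimal place stable
stable_to_two_decimals = max(mean_gap, sd_gap) < 0.005
```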
From (23) we see that the conditional posterior for $b$ given $\sigma$ can be written as