I’d like to do this very quickly based on snippets from wikipedia, rather than thinking much about it.
Snippets from wikipedia
Here is the first snippet…
To sample from the joint posterior of \((\mu, \Sigma)\), one simply draws samples from \(\Sigma \mid y \sim W^{-1}(\Psi_n, \nu_n)\), then draw \(\mu \mid \Sigma, y \sim N_p(\mu_n, \Sigma / \lambda_n)\), where
Ok, so this assumes we have used a prior distribution:
\[\Sigma \sim W^{-1}(\Psi, \nu)\]
\[\mu \mid \Sigma \sim N(\mu_0, \Sigma/\lambda)\]
I don’t want to think too hard about prior distributions, even though that’s dangerous. It isn’t the point of this exercise. I want \(\mu_0\), \(\lambda\), \(\nu\) and \(\Psi\) to be small. Let’s just say zero for now, even if that’s a terrible idea. This way I’ve reduced the problem to drawing samples from \(\Sigma \mid y \sim W^{-1}(S, n)\), then \(\mu \mid \Sigma, y \sim N_p(\bar{y}, \Sigma / n)\).
If \(X\sim W(\Sigma, \nu)\), then \(A = X^{-1}\) has an inverse-Wishart distribution \(A \sim W^{-1}(\Sigma^{-1}, \nu).\)
Ok, so this is mixing up notation now. Our target in our original notation is \(\Sigma \mid y \sim W^{-1}(S, n)\). This snippet says we can achieve that by simulating \(X\) from a Wishart distribution \(X\sim W(S^{-1}, n)\), and then taking \(\Sigma = X^{-1}\).
Example in R
# simulate some 3-dimensional normal data, n = 20n <-20Y <- mvtnorm::rmvnorm(n = n,mean =c(180, 190, 200),sigma =matrix(70, 3, 3) +diag(30, 3))# calculate Y_bar and SY_bar <-colMeans(Y)S <- (t(Y) - Y_bar) %*%t(t(Y) - Y_bar)# simulate from inverse-Wishart distributionX <-rWishart(1, df = n, Sigma =solve(S))Sigma <-solve(X[,,1])Sigma