Multivariate normal density using Woodbury matrix inversion lemma

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP





.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty margin-bottom:0;







up vote
6
down vote

favorite












I'm trying to speed up by multivariate normal density function by taking advantage of the fact that the covariance matrix is of the form A + U C U'. The inverse of such a matrix can be calculated using the Woodbury matrix lemma that allows us to take inverses of either diagonals or much smaller matrices.



The arguments to my function are dynamically sized Eigen::Matrix objects. The matrices A and C are diagonal, and C is much smaller than A.



typedef Eigen::Matrix< double, Eigen::Dynamic, 1 > Vec;
typedef Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic > Mat;
const double inv_sqrt_2pi(0.3989422804014327);
const double log_two_pi (1.83787706640935);

double evalMultivNormWBDA(const Vec &x, const Vec &meanVec, const Vec &A, const Mat &U, const Mat &C, bool log)

Mat Ainv = A.asDiagonal().inverse();
Mat Cinv = C.inverse();
Mat I = Cinv + U.transpose()*Ainv*U;
Mat SigInv = Ainv - Ainv * U * I.ldlt().solve(U.transpose() * Ainv);
Eigen::LLT<Mat> lltSigInv(SigInv);
Mat L = lltSigInv.matrixL(); // LL' = Sig^-1
double quadform = (L * (x-meanVec)).squaredNorm();
unsigned int dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition (assumes symmetric and positive definite)
double halfld (0.0);
// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
halfld += std::log(L(i,i));


return -.5*log_two_pi * dim + halfld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) * L.determinant();
return normConst * std::exp(-.5* quadform);




For 9-dimensional vectors it's actually slower than the function I have that doesn't take advantage of the lemma. How can I make this faster?



Here is the generic multivariate normal density function along with some rough timing code. Even for vectors of dimension 100 it's slower.



double evalMultivNorm(const Vec &x, const Vec &meanVec, const Mat &covMat, bool log)

// from Eigen: Remember that Cholesky decompositions are not rank-revealing.
/// This LLT decomposition is only stable on positive definite matrices,
// use LDLT instead for the semidefinite case. Also, do not use a Cholesky
// decomposition to determine whether a system of equations has a solution.
Eigen::LLT<Mat> lltM(covMat);
double quadform = (lltM.matrixL().solve(x-meanVec)).squaredNorm();
size_t dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition too
double ld (0.0);
Mat L = lltM.matrixL(); // the lower diagonal L such that M = LL^T

// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
ld += std::log(L(i,i));

ld *= 2; // covMat = LL^T

return -.5*log_two_pi * dim - .5*ld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) / lltM.matrixL().determinant();
return normConst * std::exp(-.5* quadform);




typedef std::chrono::high_resolution_clock Clock;


int main(int argc, char **argv)

auto begin = Clock::now();


unsigned int size(100);
unsigned int numiters(1e5);
Vec x = Vec::Random(size);
Vec mean = Vec::Zero(size);
Mat U = Mat::Random(size,1);
Mat C = Mat::Identity(1,1);
Mat Amat = Mat::Identity(size,size) * .01;
Vec Avec = Vec::Constant(size,.01);
Mat cov = U * C * U.transpose() + Amat;
for(size_t i = 0; i < numiters; ++i)
evalMultivNorm(x, mean, cov, true); // 134 milliseconds with ten, 1588 with 50, 7718 100
//evalMultivNormWBDA(x, mean, Avec, U, C, true); // 368, 2687 with 50, 15583 with 100







auto end = Clock::now();
std::cout << "Delta t2-t1: "
<< std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count()
<< " milliseconds" << std::endl;
return 0;







share|improve this question





















  • Can you post the original function as a reference?
    – Frank
    Feb 26 at 23:57










  • @Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
    – Taylor
    Feb 27 at 0:43










  • Can you add some example inputs to your post as well?
    – Yuushi
    Feb 27 at 3:07










  • Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
    – Darhuuk
    Feb 27 at 9:24







  • 1




    Use 'using' rather than typedefs.
    – JNS
    Feb 27 at 13:10
















up vote
6
down vote

favorite












I'm trying to speed up by multivariate normal density function by taking advantage of the fact that the covariance matrix is of the form A + U C U'. The inverse of such a matrix can be calculated using the Woodbury matrix lemma that allows us to take inverses of either diagonals or much smaller matrices.



The arguments to my function are dynamically sized Eigen::Matrix objects. The matrices A and C are diagonal, and C is much smaller than A.



typedef Eigen::Matrix< double, Eigen::Dynamic, 1 > Vec;
typedef Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic > Mat;
const double inv_sqrt_2pi(0.3989422804014327);
const double log_two_pi (1.83787706640935);

double evalMultivNormWBDA(const Vec &x, const Vec &meanVec, const Vec &A, const Mat &U, const Mat &C, bool log)

Mat Ainv = A.asDiagonal().inverse();
Mat Cinv = C.inverse();
Mat I = Cinv + U.transpose()*Ainv*U;
Mat SigInv = Ainv - Ainv * U * I.ldlt().solve(U.transpose() * Ainv);
Eigen::LLT<Mat> lltSigInv(SigInv);
Mat L = lltSigInv.matrixL(); // LL' = Sig^-1
double quadform = (L * (x-meanVec)).squaredNorm();
unsigned int dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition (assumes symmetric and positive definite)
double halfld (0.0);
// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
halfld += std::log(L(i,i));


return -.5*log_two_pi * dim + halfld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) * L.determinant();
return normConst * std::exp(-.5* quadform);




For 9-dimensional vectors it's actually slower than the function I have that doesn't take advantage of the lemma. How can I make this faster?



Here is the generic multivariate normal density function along with some rough timing code. Even for vectors of dimension 100 it's slower.



double evalMultivNorm(const Vec &x, const Vec &meanVec, const Mat &covMat, bool log)

// from Eigen: Remember that Cholesky decompositions are not rank-revealing.
/// This LLT decomposition is only stable on positive definite matrices,
// use LDLT instead for the semidefinite case. Also, do not use a Cholesky
// decomposition to determine whether a system of equations has a solution.
Eigen::LLT<Mat> lltM(covMat);
double quadform = (lltM.matrixL().solve(x-meanVec)).squaredNorm();
size_t dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition too
double ld (0.0);
Mat L = lltM.matrixL(); // the lower diagonal L such that M = LL^T

// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
ld += std::log(L(i,i));

ld *= 2; // covMat = LL^T

return -.5*log_two_pi * dim - .5*ld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) / lltM.matrixL().determinant();
return normConst * std::exp(-.5* quadform);




typedef std::chrono::high_resolution_clock Clock;


int main(int argc, char **argv)

auto begin = Clock::now();


unsigned int size(100);
unsigned int numiters(1e5);
Vec x = Vec::Random(size);
Vec mean = Vec::Zero(size);
Mat U = Mat::Random(size,1);
Mat C = Mat::Identity(1,1);
Mat Amat = Mat::Identity(size,size) * .01;
Vec Avec = Vec::Constant(size,.01);
Mat cov = U * C * U.transpose() + Amat;
for(size_t i = 0; i < numiters; ++i)
evalMultivNorm(x, mean, cov, true); // 134 milliseconds with ten, 1588 with 50, 7718 100
//evalMultivNormWBDA(x, mean, Avec, U, C, true); // 368, 2687 with 50, 15583 with 100







auto end = Clock::now();
std::cout << "Delta t2-t1: "
<< std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count()
<< " milliseconds" << std::endl;
return 0;







share|improve this question





















  • Can you post the original function as a reference?
    – Frank
    Feb 26 at 23:57










  • @Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
    – Taylor
    Feb 27 at 0:43










  • Can you add some example inputs to your post as well?
    – Yuushi
    Feb 27 at 3:07










  • Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
    – Darhuuk
    Feb 27 at 9:24







  • 1




    Use 'using' rather than typedefs.
    – JNS
    Feb 27 at 13:10












up vote
6
down vote

favorite









up vote
6
down vote

favorite











I'm trying to speed up by multivariate normal density function by taking advantage of the fact that the covariance matrix is of the form A + U C U'. The inverse of such a matrix can be calculated using the Woodbury matrix lemma that allows us to take inverses of either diagonals or much smaller matrices.



The arguments to my function are dynamically sized Eigen::Matrix objects. The matrices A and C are diagonal, and C is much smaller than A.



typedef Eigen::Matrix< double, Eigen::Dynamic, 1 > Vec;
typedef Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic > Mat;
const double inv_sqrt_2pi(0.3989422804014327);
const double log_two_pi (1.83787706640935);

double evalMultivNormWBDA(const Vec &x, const Vec &meanVec, const Vec &A, const Mat &U, const Mat &C, bool log)

Mat Ainv = A.asDiagonal().inverse();
Mat Cinv = C.inverse();
Mat I = Cinv + U.transpose()*Ainv*U;
Mat SigInv = Ainv - Ainv * U * I.ldlt().solve(U.transpose() * Ainv);
Eigen::LLT<Mat> lltSigInv(SigInv);
Mat L = lltSigInv.matrixL(); // LL' = Sig^-1
double quadform = (L * (x-meanVec)).squaredNorm();
unsigned int dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition (assumes symmetric and positive definite)
double halfld (0.0);
// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
halfld += std::log(L(i,i));


return -.5*log_two_pi * dim + halfld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) * L.determinant();
return normConst * std::exp(-.5* quadform);




For 9-dimensional vectors it's actually slower than the function I have that doesn't take advantage of the lemma. How can I make this faster?



Here is the generic multivariate normal density function along with some rough timing code. Even for vectors of dimension 100 it's slower.



double evalMultivNorm(const Vec &x, const Vec &meanVec, const Mat &covMat, bool log)

// from Eigen: Remember that Cholesky decompositions are not rank-revealing.
/// This LLT decomposition is only stable on positive definite matrices,
// use LDLT instead for the semidefinite case. Also, do not use a Cholesky
// decomposition to determine whether a system of equations has a solution.
Eigen::LLT<Mat> lltM(covMat);
double quadform = (lltM.matrixL().solve(x-meanVec)).squaredNorm();
size_t dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition too
double ld (0.0);
Mat L = lltM.matrixL(); // the lower diagonal L such that M = LL^T

// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
ld += std::log(L(i,i));

ld *= 2; // covMat = LL^T

return -.5*log_two_pi * dim - .5*ld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) / lltM.matrixL().determinant();
return normConst * std::exp(-.5* quadform);




typedef std::chrono::high_resolution_clock Clock;


int main(int argc, char **argv)

auto begin = Clock::now();


unsigned int size(100);
unsigned int numiters(1e5);
Vec x = Vec::Random(size);
Vec mean = Vec::Zero(size);
Mat U = Mat::Random(size,1);
Mat C = Mat::Identity(1,1);
Mat Amat = Mat::Identity(size,size) * .01;
Vec Avec = Vec::Constant(size,.01);
Mat cov = U * C * U.transpose() + Amat;
for(size_t i = 0; i < numiters; ++i)
evalMultivNorm(x, mean, cov, true); // 134 milliseconds with ten, 1588 with 50, 7718 100
//evalMultivNormWBDA(x, mean, Avec, U, C, true); // 368, 2687 with 50, 15583 with 100







auto end = Clock::now();
std::cout << "Delta t2-t1: "
<< std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count()
<< " milliseconds" << std::endl;
return 0;







share|improve this question













I'm trying to speed up by multivariate normal density function by taking advantage of the fact that the covariance matrix is of the form A + U C U'. The inverse of such a matrix can be calculated using the Woodbury matrix lemma that allows us to take inverses of either diagonals or much smaller matrices.



The arguments to my function are dynamically sized Eigen::Matrix objects. The matrices A and C are diagonal, and C is much smaller than A.



typedef Eigen::Matrix< double, Eigen::Dynamic, 1 > Vec;
typedef Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic > Mat;
const double inv_sqrt_2pi(0.3989422804014327);
const double log_two_pi (1.83787706640935);

double evalMultivNormWBDA(const Vec &x, const Vec &meanVec, const Vec &A, const Mat &U, const Mat &C, bool log)

Mat Ainv = A.asDiagonal().inverse();
Mat Cinv = C.inverse();
Mat I = Cinv + U.transpose()*Ainv*U;
Mat SigInv = Ainv - Ainv * U * I.ldlt().solve(U.transpose() * Ainv);
Eigen::LLT<Mat> lltSigInv(SigInv);
Mat L = lltSigInv.matrixL(); // LL' = Sig^-1
double quadform = (L * (x-meanVec)).squaredNorm();
unsigned int dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition (assumes symmetric and positive definite)
double halfld (0.0);
// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
halfld += std::log(L(i,i));


return -.5*log_two_pi * dim + halfld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) * L.determinant();
return normConst * std::exp(-.5* quadform);




For 9-dimensional vectors it's actually slower than the function I have that doesn't take advantage of the lemma. How can I make this faster?



Here is the generic multivariate normal density function along with some rough timing code. Even for vectors of dimension 100 it's slower.



double evalMultivNorm(const Vec &x, const Vec &meanVec, const Mat &covMat, bool log)

// from Eigen: Remember that Cholesky decompositions are not rank-revealing.
/// This LLT decomposition is only stable on positive definite matrices,
// use LDLT instead for the semidefinite case. Also, do not use a Cholesky
// decomposition to determine whether a system of equations has a solution.
Eigen::LLT<Mat> lltM(covMat);
double quadform = (lltM.matrixL().solve(x-meanVec)).squaredNorm();
size_t dim = x.rows();
if (log)

// calculate log-determinant using cholesky decomposition too
double ld (0.0);
Mat L = lltM.matrixL(); // the lower diagonal L such that M = LL^T

// add up log of diagnols of Cholesky L
for(size_t i = 0; i < dim; ++i)
ld += std::log(L(i,i));

ld *= 2; // covMat = LL^T

return -.5*log_two_pi * dim - .5*ld - .5*quadform;


else // not the log density
double normConst = std::pow(inv_sqrt_2pi, dim) / lltM.matrixL().determinant();
return normConst * std::exp(-.5* quadform);




typedef std::chrono::high_resolution_clock Clock;


int main(int argc, char **argv)

auto begin = Clock::now();


unsigned int size(100);
unsigned int numiters(1e5);
Vec x = Vec::Random(size);
Vec mean = Vec::Zero(size);
Mat U = Mat::Random(size,1);
Mat C = Mat::Identity(1,1);
Mat Amat = Mat::Identity(size,size) * .01;
Vec Avec = Vec::Constant(size,.01);
Mat cov = U * C * U.transpose() + Amat;
for(size_t i = 0; i < numiters; ++i)
evalMultivNorm(x, mean, cov, true); // 134 milliseconds with ten, 1588 with 50, 7718 100
//evalMultivNormWBDA(x, mean, Avec, U, C, true); // 368, 2687 with 50, 15583 with 100







auto end = Clock::now();
std::cout << "Delta t2-t1: "
<< std::chrono::duration_cast<std::chrono::milliseconds>(end - begin).count()
<< " milliseconds" << std::endl;
return 0;









share|improve this question












share|improve this question




share|improve this question








edited Apr 25 at 2:34









Jamal♦

30.1k11114225




30.1k11114225









asked Feb 26 at 22:06









Taylor

1312




1312











  • Can you post the original function as a reference?
    – Frank
    Feb 26 at 23:57










  • @Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
    – Taylor
    Feb 27 at 0:43










  • Can you add some example inputs to your post as well?
    – Yuushi
    Feb 27 at 3:07










  • Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
    – Darhuuk
    Feb 27 at 9:24







  • 1




    Use 'using' rather than typedefs.
    – JNS
    Feb 27 at 13:10
















  • Can you post the original function as a reference?
    – Frank
    Feb 26 at 23:57










  • @Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
    – Taylor
    Feb 27 at 0:43










  • Can you add some example inputs to your post as well?
    – Yuushi
    Feb 27 at 3:07










  • Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
    – Darhuuk
    Feb 27 at 9:24







  • 1




    Use 'using' rather than typedefs.
    – JNS
    Feb 27 at 13:10















Can you post the original function as a reference?
– Frank
Feb 26 at 23:57




Can you post the original function as a reference?
– Frank
Feb 26 at 23:57












@Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
– Taylor
Feb 27 at 0:43




@Frank you mean the one that doesn’t make use of the lemma that I’m kind of comparing it to? Sure
– Taylor
Feb 27 at 0:43












Can you add some example inputs to your post as well?
– Yuushi
Feb 27 at 3:07




Can you add some example inputs to your post as well?
– Yuushi
Feb 27 at 3:07












Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
– Darhuuk
Feb 27 at 9:24





Is it also slower if you don't evaluate intermediate results? I.e. don't store things in Mat variables, except for L. You can try using auto for intermediate results instead, that will store everything as the template expression types that Eigen uses internally. Be wary about the pitfalls associated with that though (eigen.tuxfamily.org/dox/TopicPitfalls.html). As a test you could try putting the entire calculation for L on a single line and see if that gives a speedup.
– Darhuuk
Feb 27 at 9:24





1




1




Use 'using' rather than typedefs.
– JNS
Feb 27 at 13:10




Use 'using' rather than typedefs.
– JNS
Feb 27 at 13:10










1 Answer
1






active

oldest

votes

















up vote
1
down vote













I hesitate to write this as it's not about the code obove but about a different algorithm. However I think, for small enough m (eg a quarter of n -- here A is nxn and B is mxm), that this is faster.



The below is in terms of upper cholesky factors, ie an upper triangular matrix U so that



U'*U = P


Of course this is just the transpose of the lower factor.



We want to find a cholesky factor of the inverse of



Cov = A + M*B*M', 
where A and B are diagonal and M is nxm


We can write



Cov = (A + M*B*M') = A + Sum k in [0,m-1] 
where the v[i] are the columns of M


To find U define U(i) to the the cholesky factor of the inverse of



A + Sum k in [0,i] 


then a little algebra shows



U(i+1)'*U(i+1) = inv( A + Sum v[i]*b[i]*v[i]')
= inv( A + Sum k in [0,i] + v[i+1]*v[i+1]')
= inv( inv( U(i)'*U(i)) + v[i+1]*v[i+1]')
= U(i)'*inv( I + w*w'/beta)*U(i)
where w = U(i)*v[i] and beta = 1/b[i]


So if we can find Z so that



Z'*Z = inv( I + w*w'/beta)


we will have



U(i+1) = Z*U(i)


It may seem that no progress has been made -- we have to do m factorisation and multiply up the results -- but in fact Z can be computed simply, and so can the products.



We note, by Woodbury, that



inv( I + w*w'/beta) = I - w*w'/(beta + w'*w)


and that, somewhat surprisingly, Z is given by



Z[k,k] = sqrt( N(k+1)/N(k))
Z[k,i] = - w[k]*w[i]/(sqrt( N(k)*N(k+1))) i>k


where



N(i) = beta + Sum w[k]*w[k]
Note N(i) = N(i+1) + w[i]*w[i]
and N(0) = beta+w'*w


For example, for the diagonal:



(Z'*Z)[i,i] = Sum Z[k,i]*Z[k,i]
= Z[i,i]*Z[i,i] + Sum k<i
= N(i+1)/N(i) + w[i]*w[i]*Sum w[k]*w[k]/(N(k+1)*N(k))
= N(i+1)/N(i) + w[i]*w[i]*Sum (N(k)-N(k+1))/(N(k+1)*N(k))
= N(i+1)/N(i) + w[i]*w[i]*Sum k<i
= N(i+1)/N(i) + w[i]*w[i]*(1/N(i) - 1/N(0))
= (N(i+1) + w[i]*w[i])/N(i) - w[i]*w[i]/N(0)
= 1 - w[i]*w[i]/(1+w'*w)


and the off diagonal case is similar



To compute Z*U we have:



Let y[k] = w[k]/(sqrt( N(k)*N(k+1)))

(Z*U)[i,j] = Sum Z[i,k]*U[k,j]
= Sum Z[i,k]*U[k,j]
= Z[i,i]*U[i,j] + Sum Z[i,k]*U[k,j]
= Z[i,i]*U[i,j] + Sum - w[i]*w[k]/(sqrt( N(i)*N(k+1)))*U[k,j]
= Z[i,i]*U[i,j] - w[i]*Sum w[k]*U[k,j]/(sqrt( N(i)*N(i+1)))
= Z[i,i]*U[i,j] - y[i]*Sum w[k]*U[k,j]

Let R[i,j] = Sum w[k]*U[k,j] then
R[j,j] = 0
R[i,j] = w[i+1]*U[i+1,j] + R[i+1,j]
and
(Z*U)[i,j] = Z[i,i]*U[i,j] - y[i]*R[i,j]


Thus incorporating another vector into U is of order n*n



Note that the initial step is a little different in that we want to find U so that



U'*U = inv( A + x*b*x' }


with A diagonal. Similar reasoning to the above leads to



Let M(i) = beta + Sum k>=i 
Note M(i) = M(i+1) + a[i]*x[i]*x[i]
and M(0) = beta+x'*a*x
where a[i] = 1/A[i]
U[i,i] = sqrt( a[i]*M[i+1]/M[i])
U[i,j] = sqrt( a[i]/(M[i+1]*M[i])) * x[i]* a[j]*x[j]


An implementation, that I believe works, is below. The matrices are in column order. The ws parameter is workspace, 4*C doubles (though only C doubles are used in the second routine).
Note that v is assumed to have dimension C, while really U is CxC, but
could be stored in a RxC array. If you are using square matrices set R to be C.
The function mat_ut_vec( R, C, U, v, w);
computes U*v (for upper triangular U) in w.



void mat_utspec_cholesky_update( Int R, Int C, double b, const double* x, double* U, double* ws)

double* y = ws;
double* d = y+C;
double* S = d+C;
double* v = S+C;
Int i, j;

mat_ut_vec( R, C, U, x, v);

double N;
double pN = 1.0/b;
for( i=C-1; i>=0; --i)
N = pN + v[i]*v[i];
double t = 1.0/sqrt( N*pN);
y[i] = v[i]*t;
d[i] = pN*t;
pN = N;


for( j=0; j<C; ++j)
S[j] = 0.0;
for( i=j-1; i>=0; --i)
S[i] = S[i+1] + v[i+1]*U[i+1+R*j];

for( i=j-1; i>=0; --i)
U[i+j*R] = d[i]*U[i+j*R] - y[i]*S[i];

U[j+j*R] *= d[j];




void mat_utspec_acholesky( Int R, Int C, double b, const double* v, const double* A, double* U, double* ws)

double* y = ws;
double* z = ws+C;
Int i, j;
double N;
double pN = 1.0/b;
double alpha;
for( i=C-1; i>=0; --i)
alpha = 1.0/A[i];
z[i] = alpha*v[i];
N = pN + z[i]*v[i];
double t = sqrt( alpha/(N*pN));
y[i] = -v[i]*t;
U[i+R*i] = pN*t;
pN = N;


for( j=0; j<C; ++j)
for( i=0; i<j; ++i)
U[i+j*R] = z[j]*y[i];








share|improve this answer























    Your Answer




    StackExchange.ifUsing("editor", function ()
    return StackExchange.using("mathjaxEditing", function ()
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
    );
    );
    , "mathjax-editing");

    StackExchange.ifUsing("editor", function ()
    StackExchange.using("externalEditor", function ()
    StackExchange.using("snippets", function ()
    StackExchange.snippets.init();
    );
    );
    , "code-snippets");

    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "196"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    convertImagesToLinks: false,
    noModals: false,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );








     

    draft saved


    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f188406%2fmultivariate-normal-density-using-woodbury-matrix-inversion-lemma%23new-answer', 'question_page');

    );

    Post as a guest






























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes








    up vote
    1
    down vote













    I hesitate to write this as it's not about the code obove but about a different algorithm. However I think, for small enough m (eg a quarter of n -- here A is nxn and B is mxm), that this is faster.



    The below is in terms of upper cholesky factors, ie an upper triangular matrix U so that



    U'*U = P


    Of course this is just the transpose of the lower factor.



    We want to find a cholesky factor of the inverse of



    Cov = A + M*B*M', 
    where A and B are diagonal and M is nxm


    We can write



    Cov = (A + M*B*M') = A + Sum k in [0,m-1] 
    where the v[i] are the columns of M


    To find U define U(i) to the the cholesky factor of the inverse of



    A + Sum k in [0,i] 


    then a little algebra shows



    U(i+1)'*U(i+1) = inv( A + Sum v[i]*b[i]*v[i]')
    = inv( A + Sum k in [0,i] + v[i+1]*v[i+1]')
    = inv( inv( U(i)'*U(i)) + v[i+1]*v[i+1]')
    = U(i)'*inv( I + w*w'/beta)*U(i)
    where w = U(i)*v[i] and beta = 1/b[i]


    So if we can find Z so that



    Z'*Z = inv( I + w*w'/beta)


    we will have



    U(i+1) = Z*U(i)


    It may seem that no progress has been made -- we have to do m factorisation and multiply up the results -- but in fact Z can be computed simply, and so can the products.



    We note, by Woodbury, that



    inv( I + w*w'/beta) = I - w*w'/(beta + w'*w)


    and that, somewhat surprisingly, Z is given by



    Z[k,k] = sqrt( N(k+1)/N(k))
    Z[k,i] = - w[k]*w[i]/(sqrt( N(k)*N(k+1))) i>k


    where



    N(i) = beta + Sum w[k]*w[k]
    Note N(i) = N(i+1) + w[i]*w[i]
    and N(0) = beta+w'*w


    For example, for the diagonal:



    (Z'*Z)[i,i] = Sum Z[k,i]*Z[k,i]
    = Z[i,i]*Z[i,i] + Sum k<i
    = N(i+1)/N(i) + w[i]*w[i]*Sum w[k]*w[k]/(N(k+1)*N(k))
    = N(i+1)/N(i) + w[i]*w[i]*Sum (N(k)-N(k+1))/(N(k+1)*N(k))
    = N(i+1)/N(i) + w[i]*w[i]*Sum k<i
    = N(i+1)/N(i) + w[i]*w[i]*(1/N(i) - 1/N(0))
    = (N(i+1) + w[i]*w[i])/N(i) - w[i]*w[i]/N(0)
    = 1 - w[i]*w[i]/(1+w'*w)


    and the off diagonal case is similar



    To compute Z*U we have:



    Let y[k] = w[k]/(sqrt( N(k)*N(k+1)))

    (Z*U)[i,j] = Sum Z[i,k]*U[k,j]
    = Sum Z[i,k]*U[k,j]
    = Z[i,i]*U[i,j] + Sum Z[i,k]*U[k,j]
    = Z[i,i]*U[i,j] + Sum - w[i]*w[k]/(sqrt( N(i)*N(k+1)))*U[k,j]
    = Z[i,i]*U[i,j] - w[i]*Sum w[k]*U[k,j]/(sqrt( N(i)*N(i+1)))
    = Z[i,i]*U[i,j] - y[i]*Sum w[k]*U[k,j]

    Let R[i,j] = Sum w[k]*U[k,j] then
    R[j,j] = 0
    R[i,j] = w[i+1]*U[i+1,j] + R[i+1,j]
    and
    (Z*U)[i,j] = Z[i,i]*U[i,j] - y[i]*R[i,j]


    Thus incorporating another vector into U is of order n*n



    Note that the initial step is a little different in that we want to find U so that



    U'*U = inv( A + x*b*x' }


    with A diagonal. Similar reasoning to the above leads to



    Let M(i) = beta + Sum k>=i 
    Note M(i) = M(i+1) + a[i]*x[i]*x[i]
    and M(0) = beta+x'*a*x
    where a[i] = 1/A[i]
    U[i,i] = sqrt( a[i]*M[i+1]/M[i])
    U[i,j] = sqrt( a[i]/(M[i+1]*M[i])) * x[i]* a[j]*x[j]


    An implementation, that I believe works, is below. The matrices are in column order. The ws parameter is workspace, 4*C doubles (though only C doubles are used in the second routine).
    Note that v is assumed to have dimension C, while really U is CxC, but
    could be stored in a RxC array. If you are using square matrices set R to be C.
    The function mat_ut_vec( R, C, U, v, w);
    computes U*v (for upper triangular U) in w.



    void mat_utspec_cholesky_update( Int R, Int C, double b, const double* x, double* U, double* ws)

    double* y = ws;
    double* d = y+C;
    double* S = d+C;
    double* v = S+C;
    Int i, j;

    mat_ut_vec( R, C, U, x, v);

    double N;
    double pN = 1.0/b;
    for( i=C-1; i>=0; --i)
    N = pN + v[i]*v[i];
    double t = 1.0/sqrt( N*pN);
    y[i] = v[i]*t;
    d[i] = pN*t;
    pN = N;


    for( j=0; j<C; ++j)
    S[j] = 0.0;
    for( i=j-1; i>=0; --i)
    S[i] = S[i+1] + v[i+1]*U[i+1+R*j];

    for( i=j-1; i>=0; --i)
    U[i+j*R] = d[i]*U[i+j*R] - y[i]*S[i];

    U[j+j*R] *= d[j];




    void mat_utspec_acholesky( Int R, Int C, double b, const double* v, const double* A, double* U, double* ws)

    double* y = ws;
    double* z = ws+C;
    Int i, j;
    double N;
    double pN = 1.0/b;
    double alpha;
    for( i=C-1; i>=0; --i)
    alpha = 1.0/A[i];
    z[i] = alpha*v[i];
    N = pN + z[i]*v[i];
    double t = sqrt( alpha/(N*pN));
    y[i] = -v[i]*t;
    U[i+R*i] = pN*t;
    pN = N;


    for( j=0; j<C; ++j)
    for( i=0; i<j; ++i)
    U[i+j*R] = z[j]*y[i];








    share|improve this answer



























      up vote
      1
      down vote













      I hesitate to write this as it's not about the code obove but about a different algorithm. However I think, for small enough m (eg a quarter of n -- here A is nxn and B is mxm), that this is faster.



      The below is in terms of upper cholesky factors, ie an upper triangular matrix U so that



      U'*U = P


      Of course this is just the transpose of the lower factor.



      We want to find a cholesky factor of the inverse of



      Cov = A + M*B*M', 
      where A and B are diagonal and M is nxm


      We can write



      Cov = (A + M*B*M') = A + Sum k in [0,m-1] 
      where the v[i] are the columns of M


      To find U define U(i) to the the cholesky factor of the inverse of



      A + Sum k in [0,i] 


      then a little algebra shows



      U(i+1)'*U(i+1) = inv( A + Sum v[i]*b[i]*v[i]')
      = inv( A + Sum k in [0,i] + v[i+1]*v[i+1]')
      = inv( inv( U(i)'*U(i)) + v[i+1]*v[i+1]')
      = U(i)'*inv( I + w*w'/beta)*U(i)
      where w = U(i)*v[i] and beta = 1/b[i]


      So if we can find Z so that



      Z'*Z = inv( I + w*w'/beta)


      we will have



      U(i+1) = Z*U(i)


      It may seem that no progress has been made -- we have to do m factorisation and multiply up the results -- but in fact Z can be computed simply, and so can the products.



      We note, by Woodbury, that



      inv( I + w*w'/beta) = I - w*w'/(beta + w'*w)


      and that, somewhat surprisingly, Z is given by



      Z[k,k] = sqrt( N(k+1)/N(k))
      Z[k,i] = - w[k]*w[i]/(sqrt( N(k)*N(k+1))) i>k


      where



      N(i) = beta + Sum w[k]*w[k]
      Note N(i) = N(i+1) + w[i]*w[i]
      and N(0) = beta+w'*w


      For example, for the diagonal:



      (Z'*Z)[i,i] = Sum Z[k,i]*Z[k,i]
      = Z[i,i]*Z[i,i] + Sum k<i
      = N(i+1)/N(i) + w[i]*w[i]*Sum w[k]*w[k]/(N(k+1)*N(k))
      = N(i+1)/N(i) + w[i]*w[i]*Sum (N(k)-N(k+1))/(N(k+1)*N(k))
      = N(i+1)/N(i) + w[i]*w[i]*Sum k<i
      = N(i+1)/N(i) + w[i]*w[i]*(1/N(i) - 1/N(0))
      = (N(i+1) + w[i]*w[i])/N(i) - w[i]*w[i]/N(0)
      = 1 - w[i]*w[i]/(1+w'*w)


      and the off diagonal case is similar



      To compute Z*U we have:



      Let y[k] = w[k]/(sqrt( N(k)*N(k+1)))

      (Z*U)[i,j] = Sum Z[i,k]*U[k,j]
      = Sum Z[i,k]*U[k,j]
      = Z[i,i]*U[i,j] + Sum Z[i,k]*U[k,j]
      = Z[i,i]*U[i,j] + Sum - w[i]*w[k]/(sqrt( N(i)*N(k+1)))*U[k,j]
      = Z[i,i]*U[i,j] - w[i]*Sum w[k]*U[k,j]/(sqrt( N(i)*N(i+1)))
      = Z[i,i]*U[i,j] - y[i]*Sum w[k]*U[k,j]

      Let R[i,j] = Sum w[k]*U[k,j] then
      R[j,j] = 0
      R[i,j] = w[i+1]*U[i+1,j] + R[i+1,j]
      and
      (Z*U)[i,j] = Z[i,i]*U[i,j] - y[i]*R[i,j]


      Thus incorporating another vector into U is of order n*n



      Note that the initial step is a little different in that we want to find U so that



      U'*U = inv( A + x*b*x' }


      with A diagonal. Similar reasoning to the above leads to



      Let M(i) = beta + Sum k>=i 
      Note M(i) = M(i+1) + a[i]*x[i]*x[i]
      and M(0) = beta+x'*a*x
      where a[i] = 1/A[i]
      U[i,i] = sqrt( a[i]*M[i+1]/M[i])
      U[i,j] = sqrt( a[i]/(M[i+1]*M[i])) * x[i]* a[j]*x[j]


      An implementation, that I believe works, is below. The matrices are in column order. The ws parameter is workspace, 4*C doubles (though only C doubles are used in the second routine).
      Note that v is assumed to have dimension C, while really U is CxC, but
      could be stored in a RxC array. If you are using square matrices set R to be C.
      The function mat_ut_vec( R, C, U, v, w);
      computes U*v (for upper triangular U) in w.



      void mat_utspec_cholesky_update( Int R, Int C, double b, const double* x, double* U, double* ws)

      double* y = ws;
      double* d = y+C;
      double* S = d+C;
      double* v = S+C;
      Int i, j;

      mat_ut_vec( R, C, U, x, v);

      double N;
      double pN = 1.0/b;
      for( i=C-1; i>=0; --i)
      N = pN + v[i]*v[i];
      double t = 1.0/sqrt( N*pN);
      y[i] = v[i]*t;
      d[i] = pN*t;
      pN = N;


      for( j=0; j<C; ++j)
      S[j] = 0.0;
      for( i=j-1; i>=0; --i)
      S[i] = S[i+1] + v[i+1]*U[i+1+R*j];

      for( i=j-1; i>=0; --i)
      U[i+j*R] = d[i]*U[i+j*R] - y[i]*S[i];

      U[j+j*R] *= d[j];




      void mat_utspec_acholesky( Int R, Int C, double b, const double* v, const double* A, double* U, double* ws)

      double* y = ws;
      double* z = ws+C;
      Int i, j;
      double N;
      double pN = 1.0/b;
      double alpha;
      for( i=C-1; i>=0; --i)
      alpha = 1.0/A[i];
      z[i] = alpha*v[i];
      N = pN + z[i]*v[i];
      double t = sqrt( alpha/(N*pN));
      y[i] = -v[i]*t;
      U[i+R*i] = pN*t;
      pN = N;


      for( j=0; j<C; ++j)
      for( i=0; i<j; ++i)
      U[i+j*R] = z[j]*y[i];








      share|improve this answer

























        up vote
        1
        down vote










        up vote
        1
        down vote









        I hesitate to write this as it's not about the code obove but about a different algorithm. However I think, for small enough m (eg a quarter of n -- here A is nxn and B is mxm), that this is faster.



        The below is in terms of upper cholesky factors, ie an upper triangular matrix U so that



        U'*U = P


        Of course this is just the transpose of the lower factor.



        We want to find a cholesky factor of the inverse of



        Cov = A + M*B*M', 
        where A and B are diagonal and M is nxm


        We can write



        Cov = (A + M*B*M') = A + Sum k in [0,m-1] 
        where the v[i] are the columns of M


        To find U define U(i) to the the cholesky factor of the inverse of



        A + Sum k in [0,i] 


        then a little algebra shows



        U(i+1)'*U(i+1) = inv( A + Sum v[i]*b[i]*v[i]')
        = inv( A + Sum k in [0,i] + v[i+1]*v[i+1]')
        = inv( inv( U(i)'*U(i)) + v[i+1]*v[i+1]')
        = U(i)'*inv( I + w*w'/beta)*U(i)
        where w = U(i)*v[i] and beta = 1/b[i]


        So if we can find Z so that



        Z'*Z = inv( I + w*w'/beta)


        we will have



        U(i+1) = Z*U(i)


        It may seem that no progress has been made -- we have to do m factorisation and multiply up the results -- but in fact Z can be computed simply, and so can the products.



        We note, by Woodbury, that



        inv( I + w*w'/beta) = I - w*w'/(beta + w'*w)


        and that, somewhat surprisingly, Z is given by



        Z[k,k] = sqrt( N(k+1)/N(k))
        Z[k,i] = - w[k]*w[i]/(sqrt( N(k)*N(k+1))) i>k


        where



        N(i) = beta + Sum w[k]*w[k]
        Note N(i) = N(i+1) + w[i]*w[i]
        and N(0) = beta+w'*w


        For example, for the diagonal:



        (Z'*Z)[i,i] = Sum Z[k,i]*Z[k,i]
        = Z[i,i]*Z[i,i] + Sum k<i
        = N(i+1)/N(i) + w[i]*w[i]*Sum w[k]*w[k]/(N(k+1)*N(k))
        = N(i+1)/N(i) + w[i]*w[i]*Sum (N(k)-N(k+1))/(N(k+1)*N(k))
        = N(i+1)/N(i) + w[i]*w[i]*Sum k<i
        = N(i+1)/N(i) + w[i]*w[i]*(1/N(i) - 1/N(0))
        = (N(i+1) + w[i]*w[i])/N(i) - w[i]*w[i]/N(0)
        = 1 - w[i]*w[i]/(1+w'*w)


        and the off diagonal case is similar



        To compute Z*U we have:



        Let y[k] = w[k]/(sqrt( N(k)*N(k+1)))

        (Z*U)[i,j] = Sum Z[i,k]*U[k,j]
        = Sum Z[i,k]*U[k,j]
        = Z[i,i]*U[i,j] + Sum Z[i,k]*U[k,j]
        = Z[i,i]*U[i,j] + Sum - w[i]*w[k]/(sqrt( N(i)*N(k+1)))*U[k,j]
        = Z[i,i]*U[i,j] - w[i]*Sum w[k]*U[k,j]/(sqrt( N(i)*N(i+1)))
        = Z[i,i]*U[i,j] - y[i]*Sum w[k]*U[k,j]

        Let R[i,j] = Sum w[k]*U[k,j] then
        R[j,j] = 0
        R[i,j] = w[i+1]*U[i+1,j] + R[i+1,j]
        and
        (Z*U)[i,j] = Z[i,i]*U[i,j] - y[i]*R[i,j]


        Thus incorporating another vector into U is of order n*n



        Note that the initial step is a little different in that we want to find U so that



        U'*U = inv( A + x*b*x' }


        with A diagonal. Similar reasoning to the above leads to



        Let M(i) = beta + Sum k>=i 
        Note M(i) = M(i+1) + a[i]*x[i]*x[i]
        and M(0) = beta+x'*a*x
        where a[i] = 1/A[i]
        U[i,i] = sqrt( a[i]*M[i+1]/M[i])
        U[i,j] = sqrt( a[i]/(M[i+1]*M[i])) * x[i]* a[j]*x[j]


        An implementation, that I believe works, is below. The matrices are in column order. The ws parameter is workspace, 4*C doubles (though only C doubles are used in the second routine).
        Note that v is assumed to have dimension C, while really U is CxC, but
        could be stored in a RxC array. If you are using square matrices set R to be C.
        The function mat_ut_vec( R, C, U, v, w);
        computes U*v (for upper triangular U) in w.



        void mat_utspec_cholesky_update( Int R, Int C, double b, const double* x, double* U, double* ws)

        double* y = ws;
        double* d = y+C;
        double* S = d+C;
        double* v = S+C;
        Int i, j;

        mat_ut_vec( R, C, U, x, v);

        double N;
        double pN = 1.0/b;
        for( i=C-1; i>=0; --i)
        N = pN + v[i]*v[i];
        double t = 1.0/sqrt( N*pN);
        y[i] = v[i]*t;
        d[i] = pN*t;
        pN = N;


        for( j=0; j<C; ++j)
        S[j] = 0.0;
        for( i=j-1; i>=0; --i)
        S[i] = S[i+1] + v[i+1]*U[i+1+R*j];

        for( i=j-1; i>=0; --i)
        U[i+j*R] = d[i]*U[i+j*R] - y[i]*S[i];

        U[j+j*R] *= d[j];




        void mat_utspec_acholesky( Int R, Int C, double b, const double* v, const double* A, double* U, double* ws)

        double* y = ws;
        double* z = ws+C;
        Int i, j;
        double N;
        double pN = 1.0/b;
        double alpha;
        for( i=C-1; i>=0; --i)
        alpha = 1.0/A[i];
        z[i] = alpha*v[i];
        N = pN + z[i]*v[i];
        double t = sqrt( alpha/(N*pN));
        y[i] = -v[i]*t;
        U[i+R*i] = pN*t;
        pN = N;


        for( j=0; j<C; ++j)
        for( i=0; i<j; ++i)
        U[i+j*R] = z[j]*y[i];








        share|improve this answer















        I hesitate to write this as it's not about the code obove but about a different algorithm. However I think, for small enough m (eg a quarter of n -- here A is nxn and B is mxm), that this is faster.



        The below is in terms of upper cholesky factors, ie an upper triangular matrix U so that



        U'*U = P


        Of course this is just the transpose of the lower factor.



        We want to find a cholesky factor of the inverse of



        Cov = A + M*B*M', 
        where A and B are diagonal and M is nxm


        We can write



        Cov = (A + M*B*M') = A + Sum k in [0,m-1] 
        where the v[i] are the columns of M


        To find U define U(i) to the the cholesky factor of the inverse of



        A + Sum k in [0,i] 


        then a little algebra shows



        U(i+1)'*U(i+1) = inv( A + Sum v[i]*b[i]*v[i]')
        = inv( A + Sum k in [0,i] + v[i+1]*v[i+1]')
        = inv( inv( U(i)'*U(i)) + v[i+1]*v[i+1]')
        = U(i)'*inv( I + w*w'/beta)*U(i)
        where w = U(i)*v[i] and beta = 1/b[i]


        So if we can find Z so that



        Z'*Z = inv( I + w*w'/beta)


        we will have



        U(i+1) = Z*U(i)


        It may seem that no progress has been made -- we have to do m factorisation and multiply up the results -- but in fact Z can be computed simply, and so can the products.



        We note, by Woodbury, that



        inv( I + w*w'/beta) = I - w*w'/(beta + w'*w)


        and that, somewhat surprisingly, Z is given by



        Z[k,k] = sqrt( N(k+1)/N(k))
        Z[k,i] = - w[k]*w[i]/(sqrt( N(k)*N(k+1))) i>k


        where



        N(i) = beta + Sum w[k]*w[k]
        Note N(i) = N(i+1) + w[i]*w[i]
        and N(0) = beta+w'*w


        For example, for the diagonal:



        (Z'*Z)[i,i] = Sum Z[k,i]*Z[k,i]
        = Z[i,i]*Z[i,i] + Sum k<i
        = N(i+1)/N(i) + w[i]*w[i]*Sum w[k]*w[k]/(N(k+1)*N(k))
        = N(i+1)/N(i) + w[i]*w[i]*Sum (N(k)-N(k+1))/(N(k+1)*N(k))
        = N(i+1)/N(i) + w[i]*w[i]*Sum k<i
        = N(i+1)/N(i) + w[i]*w[i]*(1/N(i) - 1/N(0))
        = (N(i+1) + w[i]*w[i])/N(i) - w[i]*w[i]/N(0)
        = 1 - w[i]*w[i]/(1+w'*w)


        and the off diagonal case is similar



        To compute Z*U we have:



        Let y[k] = w[k]/(sqrt( N(k)*N(k+1)))

        (Z*U)[i,j] = Sum Z[i,k]*U[k,j]
        = Sum Z[i,k]*U[k,j]
        = Z[i,i]*U[i,j] + Sum Z[i,k]*U[k,j]
        = Z[i,i]*U[i,j] + Sum - w[i]*w[k]/(sqrt( N(i)*N(k+1)))*U[k,j]
        = Z[i,i]*U[i,j] - w[i]*Sum w[k]*U[k,j]/(sqrt( N(i)*N(i+1)))
        = Z[i,i]*U[i,j] - y[i]*Sum w[k]*U[k,j]

        Let R[i,j] = Sum w[k]*U[k,j] then
        R[j,j] = 0
        R[i,j] = w[i+1]*U[i+1,j] + R[i+1,j]
        and
        (Z*U)[i,j] = Z[i,i]*U[i,j] - y[i]*R[i,j]


        Thus incorporating another vector into U is of order n*n



        Note that the initial step is a little different in that we want to find U so that



        U'*U = inv( A + x*b*x' }


        with A diagonal. Similar reasoning to the above leads to



        Let M(i) = beta + Sum k>=i 
        Note M(i) = M(i+1) + a[i]*x[i]*x[i]
        and M(0) = beta+x'*a*x
        where a[i] = 1/A[i]
        U[i,i] = sqrt( a[i]*M[i+1]/M[i])
        U[i,j] = sqrt( a[i]/(M[i+1]*M[i])) * x[i]* a[j]*x[j]


        An implementation, that I believe works, is below. The matrices are in column order. The ws parameter is workspace, 4*C doubles (though only C doubles are used in the second routine).
        Note that v is assumed to have dimension C, while really U is CxC, but
        could be stored in a RxC array. If you are using square matrices set R to be C.
        The function mat_ut_vec( R, C, U, v, w);
        computes U*v (for upper triangular U) in w.



        void mat_utspec_cholesky_update( Int R, Int C, double b, const double* x, double* U, double* ws)

        double* y = ws;
        double* d = y+C;
        double* S = d+C;
        double* v = S+C;
        Int i, j;

        mat_ut_vec( R, C, U, x, v);

        double N;
        double pN = 1.0/b;
        for( i=C-1; i>=0; --i)
        N = pN + v[i]*v[i];
        double t = 1.0/sqrt( N*pN);
        y[i] = v[i]*t;
        d[i] = pN*t;
        pN = N;


        for( j=0; j<C; ++j)
        S[j] = 0.0;
        for( i=j-1; i>=0; --i)
        S[i] = S[i+1] + v[i+1]*U[i+1+R*j];

        for( i=j-1; i>=0; --i)
        U[i+j*R] = d[i]*U[i+j*R] - y[i]*S[i];

        U[j+j*R] *= d[j];




        void mat_utspec_acholesky( Int R, Int C, double b, const double* v, const double* A, double* U, double* ws)

        double* y = ws;
        double* z = ws+C;
        Int i, j;
        double N;
        double pN = 1.0/b;
        double alpha;
        for( i=C-1; i>=0; --i)
        alpha = 1.0/A[i];
        z[i] = alpha*v[i];
        N = pN + z[i]*v[i];
        double t = sqrt( alpha/(N*pN));
        y[i] = -v[i]*t;
        U[i+R*i] = pN*t;
        pN = N;


        for( j=0; j<C; ++j)
        for( i=0; i<j; ++i)
        U[i+j*R] = z[j]*y[i];









        share|improve this answer















        share|improve this answer



        share|improve this answer








        edited Mar 25 at 12:56


























        answered Mar 24 at 14:10









        dmuir

        30112




        30112






















             

            draft saved


            draft discarded


























             


            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f188406%2fmultivariate-normal-density-using-woodbury-matrix-inversion-lemma%23new-answer', 'question_page');

            );

            Post as a guest













































































            Popular posts from this blog

            Greedy Best First Search implementation in Rust

            Function to Return a JSON Like Objects Using VBA Collections and Arrays

            C++11 CLH Lock Implementation