Internals
Documentation for Pathfinder.jl
's internal functions.
See the Public Documentation section for documentation of the public interface.
Index
Pathfinder.RankUpdateEuclideanMetric
Pathfinder.UniformSampler
Pathfinder.WoodburyPDFactorization
Pathfinder.WoodburyPDMat
Pathfinder.WoodburyPDRightFactor
Pathfinder.fit_mvnormals
Pathfinder.lbfgs_inverse_hessian
Pathfinder.lbfgs_inverse_hessians
Pathfinder.pdfactorize
Pathfinder.pdunfactorize
Pathfinder.resample
Internal Interface
Pathfinder.RankUpdateEuclideanMetric
— TypeRankUpdateEuclideanMetric{T,M} <: AdvancedHMC.AbstractMetric
A Gaussian Euclidean metric whose inverse is constructed by rank-updates.
Constructors
RankUpdateEuclideanMetric(n::Int)
RankUpdateEuclideanMetric(M⁻¹::Pathfinder.WoodburyPDMat)
Construct a Gaussian Euclidean metric of size (n, n)
with inverse of M⁻¹
.
Example
julia> using LinearAlgebra, Pathfinder, AdvancedHMC
julia> Pathfinder.RankUpdateEuclideanMetric(3)
RankUpdateEuclideanMetric(diag=[1.0, 1.0, 1.0])
julia> W = Pathfinder.WoodburyPDMat(Diagonal([0.1, 0.2]), [0.7 0.2]', Diagonal([0.3]))
2×2 Pathfinder.WoodburyPDMat{Float64, Diagonal{Float64, Vector{Float64}}, Adjoint{Float64, Matrix{Float64}}, Diagonal{Float64, Vector{Float64}}, Diagonal{Float64, Vector{Float64}}, QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, UpperTriangular{Float64, Matrix{Float64}}}:
0.247 0.042
0.042 0.212
julia> Pathfinder.RankUpdateEuclideanMetric(W)
RankUpdateEuclideanMetric(diag=[0.247, 0.21200000000000002])
Pathfinder.UniformSampler
— TypeUniformSampler(scale::Real)
Sampler that in-place modifies an array to be IID uniformly distributed on [-scale, scale]
Pathfinder.WoodburyPDFactorization
— TypeWoodburyPDFactorization{T,F} <: Factorization{T}
A "square root" factorization of a positive definite Woodbury matrix.
Pathfinder.WoodburyPDMat
— TypeWoodburyPDMat <: PDMats.AbstractPDMat
Lazily represents a real positive definite (PD) matrix as an update to a full-rank PD matrix.
WoodburyPDMat(A, B, D)
Constructs the $n \times n$ PD matrix
\[W = A + B D B^\mathrm{T},\]
where $A$ is an $n \times n$ full rank positive definite matrix, $D$ is an $m \times m$ symmetric matrix, and $B$ is an $n \times m$ matrix. Note that $B$ and $D$ must be chosen such that $W$ is positive definite; otherwise an error will be thrown during construction.
Upon construction, WoodburyPDMat
calls pdfactorize
to construct a WoodburyPDFactorization
, which is used in its overloads.
Pathfinder.WoodburyPDRightFactor
— TypeWoodburyPDRightFactor{T,TA,Q,TC} <: AbstractMatrix{T}
The right factor $R$ of a WoodburyPDFactorization
.
Pathfinder.fit_mvnormals
— Methodfit_mvnormals(θs, ∇logpθs; history_length=5) -> (dists, num_bfgs_updates_rejected)
Fit a multivariate-normal distribution to each point on the trajectory θs
.
Given θs
with gradients ∇logpθs
, construct LBFGS inverse Hessian approximations with the provided history_length
. The inverse Hessians approximate a covariance. The covariances and corresponding means that define multivariate normal approximations per point are returned.
The 2nd returned value is the number of BFGS updates to the inverse Hessian matrices that were rejected due to keeping the inverse Hessian positive definite.
Pathfinder.lbfgs_inverse_hessian
— Methodlbfgs_inverse_hessian(H₀, S₀, Y₀, history_ind, history_length) -> WoodburyPDMat
Compute approximate inverse Hessian initialized from H₀
from history stored in S₀
and Y₀
.
history_ind
indicates the column in S₀
and Y₀
that was most recently added to the history, while history_length
indicates the number of first columns in S₀
and Y₀
currently being used for storing history. S = S₀[:, history_ind+1:history_length; 1:history_ind]
reorders the columns of S₀
so that the oldest is first and newest is last.
From Byrd et al. [2], Theorem 2.2, the expression for the inverse Hessian $H$ is
\[\begin{align} B &= \begin{pmatrix}H_0 Y & S\end{pmatrix}\\ R &= \operatorname{triu}(S^\mathrm{T} Y)\\ E &= I \circ R\\ D &= \begin{pmatrix} 0 & -R^{-1}\\ -R^{-\mathrm{T}} & R^\mathrm{-T} (E + Y^\mathrm{T} H_0 Y ) R^\mathrm{-1}\\ \end{pmatrix}\ H &= H_0 + B D B^\mathrm{T} \end{align}\]
References
- [2]: Byrd et al. Math. Program. 63, 1994.
Pathfinder.lbfgs_inverse_hessians
— Methodlbfgs_inverse_hessians(
θs, ∇logpθs; Hinit=gilbert_init, history_length=5, ϵ=1e-12
) -> Tuple{Vector{WoodburyPDMat},Int}
From an L-BFGS trajectory and gradients, compute the inverse Hessian approximations at each point.
Given positions θs
with gradients ∇logpθs
, construct LBFGS inverse Hessian approximations with the provided history_length
.
The 2nd returned value is the number of BFGS updates to the inverse Hessian matrices that were rejected due to keeping the inverse Hessian positive definite.
Pathfinder.pdfactorize
— Methodpdfactorize(A, B, D) -> WoodburyPDFactorization
Factorize the positive definite matrix $W = A + B D B^\mathrm{T}$.
The result is the "square root" factorization (L, R)
, where $W = L R$ and $L = R^\mathrm{T}$.
Let $U^\mathrm{T} U = A$ be the Cholesky decomposition of $A$, and let $Q X = U^{-\mathrm{T}} B$ be a thin QR decomposition. Define $C = I + XDX^\mathrm{T}$, with the Cholesky decomposition $V^\mathrm{T} V = C$. Then, $W = R^\mathrm{T} R$, where
\[R = \begin{pmatrix} U & 0 \\ 0 & I \end{pmatrix} Q^\mathrm{T} V.\]
The positive definite requirement is equivalent to the requirement that both $A$ and $C$ are positive definite.
For a derivation of this decomposition for the special case of diagonal $A$, see Zhang et al. [1], appendix A.
See pdunfactorize
, WoodburyPDFactorization
, WoodburyPDMat
References
- [1]: Zhang et al. JMLR 23(306), 2022.
Pathfinder.pdunfactorize
— Methodpdunfactorize(F::WoodburyPDFactorization) -> (A, B, D)
Perform a reverse operation to pdfactorize
.
Note that this function does not compute the inverse of pdfactorize
. It only computes matrices that produce the same matrix $W = A + B D B^\mathrm{T}$ as for the inputs to pdfactorize
.
Pathfinder.resample
— Methodresample(rng, x, log_weights, ndraws) -> (draws, psis_result)
resample(rng, x, ndraws) -> draws
Draw ndraws
samples from x
, with replacement.
If log_weights
is provided, perform Pareto smoothed importance resampling.