Public Documentation
Documentation for Pathfinder.jl's public interface.
See the Internals section for documentation of internal functions.
Index
Pathfinder.MultiPathfinderResult
Pathfinder.PathfinderResult
Pathfinder.multipathfinder
Pathfinder.pathfinder
Public Interface
Pathfinder.pathfinder — Function
pathfinder(fun; kwargs...)
Find the best multivariate normal approximation encountered while maximizing a log density.
From an optimization trajectory, Pathfinder constructs a sequence of (multivariate normal) approximations to the distribution specified by a log density function. The approximation that maximizes the evidence lower bound (ELBO), or equivalently, minimizes the KL divergence between the approximation and the true distribution, is returned.
The covariance of the multivariate normal distribution is an inverse Hessian approximation constructed using at most the previous history_length steps.
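The ELBO that Pathfinder maximizes can be written as $\mathrm{ELBO}(q) = E_q[\log p(x) - \log q(x)]$, which for an unnormalized target equals the log normalizing constant minus the KL divergence from $q$ to the target. The following is a minimal sketch of a Monte Carlo estimate of this quantity for a multivariate normal $q$, using only the Julia standard library. The helper name estimate_elbo and its arguments are assumptions made for illustration; this is not Pathfinder.jl's internal implementation.

```julia
using LinearAlgebra, Random, Statistics

# Hypothetical helper (not part of Pathfinder.jl): Monte Carlo estimate of the
# ELBO of q = N(μ, Σ) with respect to a possibly unnormalized log density logp.
function estimate_elbo(rng, logp, μ, Σ, ndraws)
    d = length(μ)
    C = cholesky(Symmetric(Σ))      # Σ = L * L'
    logdetΣ = logdet(C)
    terms = map(1:ndraws) do _
        z = randn(rng, d)
        x = μ + C.L * z                                    # x ~ N(μ, Σ)
        logq = -(d * log(2π) + logdetΣ + dot(z, z)) / 2    # log q(x)
        logp(x) - logq
    end
    return mean(terms)
end

# Unnormalized standard normal target in 2 dimensions.
logp(x) = -dot(x, x) / 2
elbo = estimate_elbo(MersenneTwister(42), logp, zeros(2), Matrix(1.0I, 2, 2), 5)
```

In this example $q$ matches the target exactly, so the KL divergence is zero and every term equals the log normalizing constant, log(2π). A worse approximation would give a smaller estimate.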
Arguments
- fun: An object representing the log-density of the target distribution. Supported types include:
  - a callable with the signature f(params::AbstractVector{<:Real}) -> log_density::Real.
  - an object implementing the LogDensityProblems interface.
  - SciMLBase.OptimizationFunction: wraps the negative log density. It must have the necessary features (e.g. a gradient or Hessian function) for the chosen optimizer.
  - SciMLBase.OptimizationProblem: an optimization problem containing a function with the same properties as the above OptimizationFunction, as well as an initial point. If provided, init and dim are ignored.
  - DynamicPPL.Model: a Turing model. If provided, init and dim are ignored.
Keywords
- dim::Int: dimension of the target distribution. Ignored if init is provided.
- init::AbstractVector{<:Real}: initial point of length dim from which to begin optimization. If not provided and fun does not contain an initial point, an initial point of type Vector{Float64} and length dim is created and filled using init_sampler.
- init_scale::Real: scale factor $s$ such that the default init_sampler samples entries uniformly in the range $[-s, s]$.
- init_sampler: function with the signature (rng, x) -> x that modifies a vector of length dim in-place to generate an initial point.
- ndraws_elbo::Int=5: number of draws used to estimate the ELBO.
- ndraws::Int=ndraws_elbo: number of approximate draws to return.
- rng::Random.AbstractRNG: the random number generator to be used for drawing samples.
- executor::Transducers.Executor: Transducers.jl executor that determines if and how to perform ELBO computation in parallel. The default (Transducers.SequentialEx()) performs no parallelization. If rng is known to be thread-safe, and the log-density function is known to have no internal state, then Transducers.PreferParallel() may be used to parallelize log-density evaluation. This is generally only faster for expensive log density functions.
- history_length::Int=6: size of the history used to approximate the inverse Hessian.
- optimizer: optimizer to be used for constructing the trajectory. Can be any optimizer compatible with Optimization.jl, so long as it supports callbacks. Defaults to Optim.LBFGS(; m=history_length, linesearch=LineSearches.HagerZhang(), alphaguess=LineSearches.InitialHagerZhang()).
- adtype::ADTypes.AbstractADType: specifies which automatic differentiation backend should be used to compute the gradient, if fun does not already specify the gradient. Default is ADTypes.AutoForwardDiff(). See Optimization.jl's Automatic Differentiation Recommendations.
- ntries::Int=1_000: number of times to try the optimization, restarting if it fails. Before every restart, a new initial point is drawn using init_sampler.
- fail_on_nonfinite::Bool=true: if true, optimization fails if the log-density is NaN or Inf or if the gradient is ever non-finite. If ntries > 1, then optimization will be retried after reinitialization.
- kwargs...: remaining keywords are forwarded to Optimization.solve.
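To make the init_sampler contract concrete, here is a minimal sketch of a function with the signature (rng, x) -> x that fills a vector in-place with entries drawn uniformly from $[-s, s)$, as described for init_scale. The factory name make_uniform_sampler is hypothetical, and this sketch is not necessarily how the default sampler is implemented.

```julia
using Random

# Hypothetical factory (not part of Pathfinder.jl): returns a sampler with the
# signature (rng, x) -> x that overwrites x with uniform draws from [-s, s).
function make_uniform_sampler(s)
    return function (rng, x)
        rand!(rng, x)             # fill x in-place with draws from [0, 1)
        @. x = s * (2 * x - 1)    # rescale in-place to [-s, s)
        return x
    end
end

sampler = make_uniform_sampler(2.0)
x = zeros(5)
y = sampler(MersenneTwister(1), x)
```

Because the sampler mutates and returns the same vector, a function of this shape could be called again before each restart to draw a fresh initial point.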
Returns
- PathfinderResult
Pathfinder.PathfinderResult — Type
PathfinderResult
Container for results of single-path Pathfinder.
Fields
- input: User-provided input object, e.g. a LogDensityProblem, optim_fun, optim_prob, or another object.
- optimizer: Optimizer used for maximizing the log-density.
- rng: Pseudorandom number generator that was used for sampling.
- optim_prob::SciMLBase.OptimizationProblem: Optimization problem used for optimization.
- logp: Log-density function.
- fit_distribution::Distributions.MvNormal: ELBO-maximizing multivariate normal distribution.
- draws::AbstractMatrix{<:Real}: draws from multivariate normal with size (dim, ndraws).
- fit_distribution_transformed: fit_distribution transformed to the same space as the user-supplied target distribution. This is only different from fit_distribution when integrating with other packages, and its type depends on the type of input.
- draws_transformed: draws transformed to be draws from fit_distribution_transformed.
- fit_iteration::Int: Iteration at which ELBO estimate was maximized.
- num_tries::Int: Number of tries until Pathfinder succeeded.
- optim_solution::SciMLBase.OptimizationSolution: Solution object of optimization.
- optim_trace::Pathfinder.OptimizationTrace: container for optimization trace of points, log-density, and gradient. The first point is the initial point.
- fit_distributions::AbstractVector{Distributions.MvNormal}: Multivariate normal distributions for each point in optim_trace, where fit_distributions[fit_iteration + 1] == fit_distribution.
- elbo_estimates::AbstractVector{<:Pathfinder.ELBOEstimate}: ELBO estimates for all but the first distribution in fit_distributions.
- num_bfgs_updates_rejected::Int: Number of times a BFGS update to the reconstructed inverse Hessian was rejected to keep the inverse Hessian positive definite.
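The offset between fit_iteration and the index into fit_distributions is easy to get wrong, so the following sketch illustrates it with stand-in data. It assumes that fit_iteration is the position of the largest entry in elbo_estimates, which is consistent with the field descriptions but is not confirmed here. The values below are hypothetical: plain numbers and strings replace the actual ELBOEstimate and MvNormal objects.

```julia
# Hypothetical ELBO values, one per optimization iteration.
# The initial point has no ELBO estimate, so this is one shorter than fit_distributions.
elbo_values = [-12.3, -4.1, -5.0]

# Stand-ins for the fitted distributions; index 1 corresponds to the initial point.
fit_distributions = ["q0", "q1", "q2", "q3"]

# Iteration with the largest ELBO estimate (assumed definition of fit_iteration).
fit_iteration = argmax(elbo_values)

# Shift by one because fit_distributions also holds the initial point's distribution.
fit_distribution = fit_distributions[fit_iteration + 1]
```

With these stand-in values, the second iteration has the largest ELBO, so the selected distribution is the third entry of fit_distributions.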
Returns
Pathfinder.multipathfinder — Function
multipathfinder(fun, ndraws; kwargs...)
Run pathfinder multiple times to fit a multivariate normal mixture model.
For nruns=length(init), nruns parallel runs of pathfinder produce nruns multivariate normal approximations $q_k = q(\phi | \mu_k, \Sigma_k)$ of the posterior. These are combined into a mixture model $q$ with uniform weights.
$q$ is augmented with the component index to generate random samples, that is, elements $(k, \phi)$ are drawn from the augmented mixture model
\[\tilde{q}(\phi, k | \mu, \Sigma) = K^{-1} q(\phi | \mu_k, \Sigma_k),\]
where $k$ is a component index, and $K=$ nruns. These draws are then resampled with replacement. Discarding $k$ from the samples would reproduce draws from $q$.
If importance=true, then Pareto smoothed importance resampling is used, so that the resulting draws better approximate draws from the target distribution $p$ instead of $q$. This also prints a warning message if the importance weighted draws are unsuitable for approximating expectations with respect to $p$.
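The sampling scheme described above can be sketched with the Julia standard library alone. The sketch below draws pairs $(k, \phi)$ from the augmented mixture by picking $k$ uniformly and then sampling from the $k$-th component, and then resamples with replacement using caller-supplied weights. The function names draw_augmented and resample_indices are hypothetical, Pareto smoothing of the weights is omitted, and this is not Pathfinder.jl's implementation.

```julia
using LinearAlgebra, Random

# Hypothetical helper: draw n pairs (k, ϕ) with k uniform on 1:K and ϕ ~ N(μs[k], Σs[k]).
function draw_augmented(rng, μs, Σs, n)
    Ls = [cholesky(Symmetric(Σ)).L for Σ in Σs]   # Cholesky factor of each covariance
    ks = rand(rng, 1:length(μs), n)               # uniform mixture weights 1/K
    ϕs = [μs[k] + Ls[k] * randn(rng, length(μs[k])) for k in ks]
    return ks, ϕs
end

# Hypothetical helper: sample n indices with replacement, with probability
# proportional to the given nonnegative weights (inverse-CDF sampling).
function resample_indices(rng, weights, n)
    cdf = cumsum(weights ./ sum(weights))
    return map(1:n) do _
        u = 1 - rand(rng)                          # uniform on (0, 1]
        min(searchsortedfirst(cdf, u), length(cdf))
    end
end

rng = MersenneTwister(1)
μs = [zeros(2), ones(2)]
Σs = [Matrix(1.0I, 2, 2), Matrix(2.0I, 2, 2)]
ks, ϕs = draw_augmented(rng, μs, Σs, 10)

# Equal weights correspond to plain resampling; importance weights would favor
# draws where the target density is large relative to the mixture density.
idx = resample_indices(rng, ones(10), 5)
resampled_ks, resampled_ϕs = ks[idx], ϕs[idx]
```

Keeping the component index alongside each draw is what allows a result to report which run each resampled draw came from.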
Arguments
- fun: An object representing the log-density of the target distribution. Supported types include:
  - a callable with the signature f(params::AbstractVector{<:Real}) -> log_density::Real.
  - an object implementing the LogDensityProblems interface.
  - SciMLBase.OptimizationFunction: wraps the negative log density. It must have the necessary features (e.g. a gradient or Hessian function) for the chosen optimizer.
  - SciMLBase.OptimizationProblem: an optimization problem containing a function with the same properties as the above OptimizationFunction, as well as an initial point. If provided, init and dim are ignored.
  - DynamicPPL.Model: a Turing model. If provided, init and dim are ignored.
- ndraws::Int: number of approximate draws to return.
Keywords
- init: iterator of length nruns of initial points of length dim from which each single-path Pathfinder run will begin. length(init) must be implemented. If init is not provided, nruns must be, and so must dim if fun is provided.
- nruns::Int: number of runs of Pathfinder to perform. Ignored if init is provided.
- ndraws_per_run::Int: the number of draws to take for each component before resampling. Defaults to a number such that ndraws_per_run * nruns > ndraws.
- importance::Bool=true: perform Pareto smoothed importance resampling of draws.
- rng::AbstractRNG=Random.GLOBAL_RNG: pseudorandom number generator. It is recommended to use a parallelization-friendly PRNG like the default PRNG on Julia 1.7 and up.
- executor::Transducers.Executor: Transducers.jl executor that determines if and how to run the single-path runs in parallel, defaulting to Transducers.SequentialEx(). If an executor for multi-threaded computation is selected, you must first verify that rng and the log density function are thread-safe.
- executor_per_run::Transducers.Executor: Transducers.jl executor used within each run to parallelize PRNG calls, defaulting to Transducers.SequentialEx(). See pathfinder for further description.
- kwargs...: remaining keywords are forwarded to pathfinder.
Returns
- MultiPathfinderResult
Pathfinder.MultiPathfinderResult — Type
MultiPathfinderResult
Container for results of multi-path Pathfinder.
Fields
- input: User-provided input object, e.g. either logp, (logp, ∇logp), optim_fun, optim_prob, or another object.
- optimizer: Optimizer used for maximizing the log-density.
- rng: Pseudorandom number generator that was used for sampling.
- optim_prob::SciMLBase.OptimizationProblem: Optimization problem used for optimization.
- logp: Log-density function.
- fit_distribution::Distributions.MixtureModel: uniformly-weighted mixture of ELBO-maximizing multivariate normal distributions from each run.
- draws::AbstractMatrix{<:Real}: draws from fit_distribution with size (dim, ndraws), potentially resampled using importance resampling to be closer to the target distribution.
- draw_component_ids::Vector{Int}: component id of each draw in draws.
- fit_distribution_transformed: fit_distribution transformed to the same space as the user-supplied target distribution. This is only different from fit_distribution when integrating with other packages, and its type depends on the type of input.
- draws_transformed: draws transformed to be draws from fit_distribution_transformed.
- pathfinder_results::Vector{<:PathfinderResult}: results of each single-path Pathfinder run.
- psis_result::Union{Nothing,<:PSIS.PSISResult}: If importance resampling was used, the result of Pareto-smoothed importance resampling. psis_result.pareto_shape also diagnoses whether draws can be used to compute estimates from the target distribution.