# Public Documentation

Documentation for `Pathfinder.jl`

's public interface.

See the Internals section for documentation of internal functions.

## Index

`Pathfinder.MultiPathfinderResult`

`Pathfinder.PathfinderResult`

`Pathfinder.multipathfinder`

`Pathfinder.pathfinder`

## Public Interface

`Pathfinder.pathfinder`

— Function```
pathfinder(ℓ; kwargs...)
pathfinder(fun::SciMLBase::OptimizationFunction; kwargs...)
pathfinder(prob::SciMLBase::OptimizationProblem; kwargs...)
```

Find the best multivariate normal approximation encountered while maximizing a log density.

From an optimization trajectory, Pathfinder constructs a sequence of (multivariate normal) approximations to the distribution specified by a log density function. The approximation that maximizes the evidence lower bound (ELBO), or equivalently, minimizes the KL divergence between the approximation and the true distribution, is returned.

The covariance of the multivariate normal distribution is an inverse Hessian approximation constructed using at most the previous `history_length`

steps.

**Arguments**

`ℓ`

: an object, representing the log-density of the target distribution and its gradient, that implements the LogDensityProblems interface.`fun::SciMLBase.OptimizationFunction`

: an optimization function that represents the negative log density with its gradient. It must have the necessary features (e.g. a Hessian function, if applicable) for the chosen optimization algorithm. For details, see Optimization.jl: OptimizationFunction.`prob::SciMLBase.OptimizationProblem`

: an optimization problem containing a function with the same properties as`fun`

, as well as an initial point, in which case`init`

and`dim`

are ignored.

**Keywords**

`dim::Int`

: dimension of the target distribution, needed only if`fun`

is provided and`init`

is not.`init::AbstractVector{<:Real}`

: initial point of length`dim`

from which to begin optimization. If not provided, an initial point of type`Vector{Float64}`

and length`dim`

is created and filled using`init_sampler`

.`init_scale::Real`

: scale factor $s$ such that the default`init_sampler`

samples entries uniformly in the range $[-s, s]$`init_sampler`

: function with the signature`(rng, x) -> x`

that modifies a vector of length`dims`

in-place to generate an initial point`ndraws_elbo::Int=5`

: Number of draws used to estimate the ELBO`ndraws::Int=ndraws_elbo`

: number of approximate draws to return`rng::Random.AbstractRNG`

: The random number generator to be used for drawing samples`executor::Transducers.Executor=Transducers.SequentialEx()`

: Transducers.jl executor that determines if and how to perform ELBO computation in parallel. The default (`SequentialEx()`

) performs no parallelization. If`rng`

is known to be thread-safe, and the log-density function is known to have no internal state, then`Transducers.PreferParallel()`

may be used to parallelize log-density evaluation. This is generally only faster for expensive log density functions.`history_length::Int=6`

: Size of the history used to approximate the inverse Hessian.`optimizer`

: Optimizer to be used for constructing trajectory. Can be any optimizer compatible with Optimization.jl, so long as it supports callbacks. Defaults to`Optim.LBFGS(; m=history_length, linesearch=LineSearches.HagerZhang(), alphaguess=LineSearches.InitialHagerZhang())`

. See the Optimization.jl documentation for details.`ntries::Int=1_000`

: Number of times to try the optimization, restarting if it fails. Before every restart, a new initial point is drawn using`init_sampler`

.`fail_on_nonfinite::Bool=true`

: If`true`

, optimization fails if the log-density is a`NaN`

or`Inf`

or if the gradient is ever non-finite. If`nretries > 0`

, then optimization will be retried after reinitialization.`kwargs...`

: Remaining keywords are forwarded to`Optimization.solve`

.

**Returns**

`Pathfinder.PathfinderResult`

— Type`PathfinderResult`

Container for results of single-path Pathfinder.

**Fields**

`input`

: User-provided input object, e.g. a LogDensityProblem,`optim_fun`

,`optim_prob`

, or another object.`optimizer`

: Optimizer used for maximizing the log-density`rng`

: Pseudorandom number generator that was used for sampling`optim_prob::SciMLBase.OptimizationProblem`

: Otimization problem used for optimization`logp`

: Log-density function`fit_distribution::Distributions.MvNormal`

: ELBO-maximizing multivariate normal distribution`draws::AbstractMatrix{<:Real}`

: draws from multivariate normal with size`(dim, ndraws)`

`fit_distribution_transformed`

:`fit_distribution`

transformed to the same space as the user-supplied target distribution. This is only different from`fit_distribution`

when integrating with other packages, and its type depends on the type of`input`

.`draws_transformed`

:`draws`

transformed to be draws from`fit_distribution_transformed`

.`fit_iteration::Int`

: Iteration at which ELBO estimate was maximized`num_tries::Int`

: Number of tries until Pathfinder succeeded`optim_solution::SciMLBase.OptimizationSolution`

: Solution object of optimization.`optim_trace::Pathfinder.OptimizationTrace`

: container for optimization trace of points, log-density, and gradient. The first point is the initial point.`fit_distributions::AbstractVector{Distributions.MvNormal}`

: Multivariate normal distributions for each point in`optim_trace`

, where`fit_distributions[fit_iteration + 1] == fit_distribution`

`elbo_estimates::AbstractVector{<:Pathfinder.ELBOEstimate}`

: ELBO estimates for all but the first distribution in`fit_distributions`

.`num_bfgs_updates_rejected::Int`

: Number of times a BFGS update to the reconstructed inverse Hessian was rejected to keep the inverse Hessian positive definite.

**Returns**

`Pathfinder.multipathfinder`

— Function```
multipathfinder(ℓ, ndraws; kwargs...)
multipathfinder(fun::SciMLBase.OptimizationFunction, ndraws; kwargs...)
```

Run `pathfinder`

multiple times to fit a multivariate normal mixture model.

For `nruns=length(init)`

, `nruns`

parallel runs of pathfinder produce `nruns`

multivariate normal approximations $q_k = q(\phi | \mu_k, \Sigma_k)$ of the posterior. These are combined to a mixture model $q$ with uniform weights.

$q$ is augmented with the component index to generate random samples, that is, elements $(k, \phi)$ are drawn from the augmented mixture model

\[\tilde{q}(\phi, k | \mu, \Sigma) = K^{-1} q(\phi | \mu_k, \Sigma_k),\]

where $k$ is a component index, and $K=$ `nruns`

. These draws are then resampled with replacement. Discarding $k$ from the samples would reproduce draws from $q$.

If `importance=true`

, then Pareto smoothed importance resampling is used, so that the resulting draws better approximate draws from the target distribution $p$ instead of $q$. This also prints a warning message if the importance weighted draws are unsuitable for approximating expectations with respect to $p$.

**Arguments**

`ℓ`

: an object, representing the log-density of the target distribution and its gradient, that implements the LogDensityProblems interface.`fun::SciMLBase.OptimizationFunction`

: an optimization function that represents a negative log density with its gradient. It must have the necessary features (e.g. a Hessian function) for the chosen optimization algorithm. For details, see Optimization.jl: OptimizationFunction.`ndraws::Int`

: number of approximate draws to return

**Keywords**

`init`

: iterator of length`nruns`

of initial points of length`dim`

from which each single-path Pathfinder run will begin.`length(init)`

must be implemented. If`init`

is not provided,`nruns`

must be, and`dim`

must be if`fun`

provided.`nruns::Int`

: number of runs of Pathfinder to perform. Ignored if`init`

is provided.`ndraws_per_run::Int`

: The number of draws to take for each component before resampling. Defaults to a number such that`ndraws_per_run * nruns > ndraws`

.`importance::Bool=true`

: Perform Pareto smoothed importance resampling of draws.`rng::AbstractRNG=Random.GLOBAL_RNG`

: Pseudorandom number generator. It is recommended to use a parallelization-friendly PRNG like the default PRNG on Julia 1.7 and up.`executor::Transducers.Executor=Transducers.SequentialEx()`

: Transducers.jl executor that determines if and how to run the single-path runs in parallel. If a transducer for multi-threaded computation is selected, you must first verify that`rng`

and the log density function are thread-safe.`executor_per_run::Transducers.Executor=Transducers.SequentialEx()`

: Transducers.jl executor used within each run to parallelize PRNG calls. Defaults to no parallelization. See`pathfinder`

for a description.`kwargs...`

: Remaining keywords are forwarded to`pathfinder`

.

**Returns**

`Pathfinder.MultiPathfinderResult`

— Type`MultiPathfinderResult`

Container for results of multi-path Pathfinder.

**Fields**

`input`

: User-provided input object, e.g. either`logp`

,`(logp, ∇logp)`

,`optim_fun`

,`optim_prob`

, or another object.`optimizer`

: Optimizer used for maximizing the log-density`rng`

: Pseudorandom number generator that was used for sampling`optim_prob::SciMLBase.OptimizationProblem`

: Otimization problem used for optimization`logp`

: Log-density function`fit_distribution::Distributions.MixtureModel`

: uniformly-weighted mixture of ELBO- maximizing multivariate normal distributions from each run.`draws::AbstractMatrix{<:Real}`

: draws from`fit_distribution`

with size`(dim, ndraws)`

, potentially resampled using importance resampling to be closer to the target distribution.`draw_component_ids::Vector{Int}`

: component id of each draw in`draws`

.`fit_distribution_transformed`

:`fit_distribution`

transformed to the same space as the user-supplied target distribution. This is only different from`fit_distribution`

when integrating with other packages, and its type depends on the type of`input`

.`draws_transformed`

:`draws`

transformed to be draws from`fit_distribution_transformed`

.`pathfinder_results::Vector{<:PathfinderResult}`

: results of each single-path Pathfinder run.`psis_result::Union{Nothing,<:PSIS.PSISResult}`

: If importance resampling was used, the result of Pareto-smoothed importance resampling.`psis_result.pareto_shape`

also diagnoses whether`draws`

can be used to compute estimates from the target distribution. See`PSIS.PSISResult`

for details