Running Pathfinder on Turing.jl models

This tutorial demonstrates how Turing can be used with Pathfinder.

We'll demonstrate with a regression example.

using AdvancedHMC, Pathfinder, Random, Turing
Random.seed!(39)

@model function regress(x, y)
    α ~ Normal()
    β ~ Normal()
    σ ~ truncated(Normal(); lower=0)
    y ~ product_distribution(Normal.(α .+ β .* x, σ))
end
x = 0:0.1:10
y = @. 2x + 1.5 + randn() * 0.2
model = regress(collect(x), y)
n_chains = 8
8

For convenience, pathfinder and multipathfinder can take Turing models as inputs and produce MCMCChains.Chains objects as outputs.

result_single = pathfinder(model; ndraws=1_000)
Single-path Pathfinder result
  tries: 1
  draws: 1000
  fit iteration: 22 (total: 27)
  fit ELBO: -1.95 ± 0.03
  fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
dim: 3
μ: [1.4759589631384424, 2.0060269413732406, -1.5478356855731017]
Σ: [0.0018286217897182317 -0.0002601667184641306 -0.0005873937020381307; -0.0002601667184641315 5.100356931222657e-5 5.198418640540482e-5; -0.0005873937020381303 5.19841864054033e-5 0.006436905716392558]
)
result_multi = multipathfinder(model, 1_000; nruns=n_chains)
Multi-path Pathfinder result
  runs: 8
  draws: 1000
  Pareto shape diagnostic: 0.13 (good)

Here, the Pareto shape diagnostic indicates that it is likely safe to use these draws to compute posterior estimates.

When passed a DynamicPPL.Model, Pathfinder also gives access to the posterior draws in a familiar Chains object.

result_multi.draws_transformed
Chains MCMC chain (1000×3×1 Array{Float64, 3}):

Iterations        = 1:1:1000
Number of chains  = 1
Samples per chain = 1000
parameters        = α, β, σ

Use `describe(chains)` for summary statistics and quantiles.

We can also use these posterior draws to initialize MCMC sampling.

init_params = collect.(eachrow(result_multi.draws_transformed.value[1:n_chains, :, 1]))
8-element Vector{Vector{Float64}}:
 [1.4788183604100789, 2.0112966820963205, 0.22343950706112325]
 [1.5368618353360335, 1.9946464198347595, 0.2015055621388207]
 [1.4829238367162103, 2.0012141841460487, 0.2242403502093551]
 [1.4455906095998192, 2.0102084830342175, 0.19628402207501333]
 [1.4301563929979353, 2.0152338351394494, 0.20473519091764364]
 [1.5376201448246691, 2.005630109930768, 0.21815643058739184]
 [1.4849482910471787, 2.003480823341345, 0.21492971029770003]
 [1.4462537604418453, 2.0070495504654127, 0.24891172551914742]
chns = sample(model, Turing.NUTS(), MCMCThreads(), 1_000, n_chains; init_params, progress=false)
Chains MCMC chain (1000×17×8 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 7.53 seconds
Compute duration  = 5.56 seconds
parameters        = α, β, σ
internals         = n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, lp, logprior, loglikelihood

Use `describe(chains)` for summary statistics and quantiles.

We can use Pathfinder's estimate of the metric and only perform enough warm-up to tune the step size.

inv_metric = result_multi.pathfinder_results[1].fit_distribution.Σ
metric = Pathfinder.RankUpdateEuclideanMetric(inv_metric)
kernel = HMCKernel(Trajectory{MultinomialTS}(Leapfrog(0.0), GeneralisedNoUTurn()))
adaptor = StepSizeAdaptor(0.8, 1.0)  # adapt only the step size
nuts = AdvancedHMC.HMCSampler(kernel, metric, adaptor)

n_adapts = 50
n_draws = 1_000
chns = sample(
    model,
    externalsampler(nuts),
    MCMCThreads(),
    n_draws + n_adapts,
    n_chains;
    n_adapts,
    init_params,
    progress=false,
)[n_adapts + 1:end, :, :]  # drop warm-up draws
Chains MCMC chain (1000×18×8 Array{Float64, 3}):

Iterations        = 51:1:1050
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 3.19 seconds
Compute duration  = 2.69 seconds
parameters        = α, β, σ
internals         = n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt, lp, logprior, loglikelihood

Use `describe(chains)` for summary statistics and quantiles.

See Initializing HMC with Pathfinder for further examples.