Running Pathfinder on Turing.jl models
This tutorial demonstrates how Turing models can be used with Pathfinder.
We'll use a simple linear regression example.
using AdvancedHMC, Pathfinder, Random, Turing
Random.seed!(39)
@model function regress(x, y)
α ~ Normal()
β ~ Normal()
σ ~ truncated(Normal(); lower=0)
y ~ product_distribution(Normal.(α .+ β .* x, σ))
end
x = 0:0.1:10
y = @. 2x + 1.5 + randn() * 0.2
model = regress(collect(x), y)
n_chains = 8
For convenience, pathfinder and multipathfinder can take Turing models as inputs and produce MCMCChains.Chains objects as outputs.
result_single = pathfinder(model; ndraws=1_000)
Single-path Pathfinder result
tries: 1
draws: 1000
fit iteration: 22 (total: 27)
fit ELBO: -1.95 ± 0.03
fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
dim: 3
μ: [1.4759589593237075, 2.006026941391299, -1.5478356647744373]
Σ: [0.0018286220197252872 -0.00026016692981478836 -0.000587388771216234; -0.00026016692981478814 5.1003622229418224e-5 5.1983566955437696e-5; -0.0005873887712162338 5.198356695543741e-5 0.006436906231207773]
)
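The result object also exposes the draws directly. As a brief usage sketch (assuming the fields documented for PathfinderResult), draws holds the draws in the unconstrained space, while draws_transformed holds the same draws in the original (α, β, σ) space:
result_single.draws              # (dim × ndraws) matrix of unconstrained draws
result_single.draws_transformed  # the same draws in the original parameter space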
result_multi = multipathfinder(model, 1_000; nruns=n_chains)
Multi-path Pathfinder result
runs: 8
draws: 1000
Pareto shape diagnostic: 0.22 (good)
Here, the Pareto shape diagnostic indicates that it is likely safe to use these draws to compute posterior estimates.
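The diagnostic can also be inspected programmatically. A small sketch, assuming the multi-path result stores the Pareto-smoothed importance sampling result in its psis_result field; the conventional reliability threshold is roughly 0.7:
result_multi.psis_result.pareto_shape        # ≈ 0.22 for this run
result_multi.psis_result.pareto_shape < 0.7  # below ~0.7: draws are likely safe for posterior estimates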
When passed a DynamicPPL.Model, Pathfinder also gives access to the posterior draws in a familiar Chains object.
result_multi.draws_transformed
Chains MCMC chain (1000×3×1 Array{Float64, 3}):
Iterations = 1:1:1000
Number of chains = 1
Samples per chain = 1000
parameters = α, β, σ
Use `describe(chains)` for summary statistics and quantiles.
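This object behaves like any other MCMCChains.Chains, so the usual tooling applies. For example (an illustrative sketch, not part of the original output):
describe(result_multi.draws_transformed)      # summary statistics and quantiles
post = Array(result_multi.draws_transformed)  # 1000×3 matrix of draws (columns α, β, σ)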
We can also use these posterior draws to initialize MCMC sampling.
init_params = collect.(eachrow(result_multi.draws_transformed.value[1:n_chains, :, 1]))
8-element Vector{Vector{Float64}}:
[1.4390541778196517, 2.0121557827155407, 0.20120810911887976]
[1.4115687477656853, 2.019678627131055, 0.2096801497640124]
[1.512837000132197, 1.9985160932558892, 0.21759122435350517]
[1.3903687181025688, 2.0202399248118676, 0.21791335674841772]
[1.4934205235699054, 2.01026436070289, 0.2462402345498041]
[1.4992626991939928, 2.003316215314798, 0.1897072590471297]
[1.44326269927453, 2.007590670799191, 0.2153426948745106]
[1.5069739163429519, 1.998273937482774, 0.22273344601444686]
chns = sample(model, Turing.NUTS(), MCMCThreads(), 1_000, n_chains; init_params, progress=false)
Chains MCMC chain (1000×17×8 Array{Float64, 3}):
Iterations = 501:1:1500
Number of chains = 8
Samples per chain = 1000
Wall duration = 10.49 seconds
Compute duration = 8.41 seconds
parameters = α, β, σ
internals = n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, lp, logprior, loglikelihood
Use `describe(chains)` for summary statistics and quantiles.
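Because the data were simulated with α = 1.5, β = 2, and σ = 0.2 (see the definition of y above), the posterior summaries give a quick sanity check. The line below is illustrative, assuming summarystats as provided by MCMCChains:
summarystats(chns)  # posterior means should be near α ≈ 1.5, β ≈ 2, σ ≈ 0.2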
We can use Pathfinder's estimate of the inverse metric and perform only enough warm-up to tune the step size.
inv_metric = result_multi.pathfinder_results[1].fit_distribution.Σ
metric = Pathfinder.RankUpdateEuclideanMetric(inv_metric)
kernel = HMCKernel(Trajectory{MultinomialTS}(Leapfrog(0.0), GeneralisedNoUTurn()))
adaptor = StepSizeAdaptor(0.8, 1.0) # adapt only the step size
nuts = AdvancedHMC.HMCSampler(kernel, metric, adaptor)
n_adapts = 50
n_draws = 1_000
chns = sample(
model,
externalsampler(nuts),
MCMCThreads(),
n_draws + n_adapts,
n_chains;
n_adapts,
init_params,
progress=false,
)[n_adapts + 1:end, :, :]  # drop warm-up draws
Chains MCMC chain (1000×18×8 Array{Float64, 3}):
Iterations = 51:1:1050
Number of chains = 8
Samples per chain = 1000
Wall duration = 3.75 seconds
Compute duration = 3.26 seconds
parameters = α, β, σ
internals = n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt, lp, logprior, loglikelihood
Use `describe(chains)` for summary statistics and quantiles.
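Note that inv_metric above is not a dense matrix: as the fit-distribution type shows, Pathfinder stores the covariance as a Pathfinder.WoodburyPDMat (a diagonal plus low-rank factorization), which RankUpdateEuclideanMetric is designed to work with directly rather than densifying it. A quick illustrative check:
inv_metric isa Pathfinder.WoodburyPDMat  # true: diagonal-plus-low-rank, not a dense Matrix
size(inv_metric)                         # (3, 3)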
See Initializing HMC with Pathfinder for further examples.