# Running Pathfinder on Turing.jl models
This tutorial demonstrates how Turing.jl models can be used with Pathfinder. We'll use a simple linear regression model as an example.
```julia
using AdvancedHMC, Pathfinder, Random, Turing

Random.seed!(39)
```
```julia
@model function regress(x, y)
    α ~ Normal()
    β ~ Normal()
    σ ~ truncated(Normal(); lower=0)
    y .~ Normal.(α .+ β .* x, σ)
end
```
```julia
x = 0:0.1:10
y = @. 2x + 1.5 + randn() * 0.2
model = regress(collect(x), y)
n_chains = 8
```

```
8
```
For convenience, `pathfinder` and `multipathfinder` can take Turing models as inputs and produce `MCMCChains.Chains` objects as outputs.
```julia
result_single = pathfinder(model; ndraws=1_000)
```
```
Single-path Pathfinder result
  tries: 1
  draws: 1000
  fit iteration: 22 (total: 26)
  fit ELBO: -1.82 ± 0.1
  fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
dim: 3
μ: [1.4759842060455182, 2.0060275818274436, -1.5479825354725543]
Σ: [0.0018296810848421224 -0.0002589735319675427 -0.0006183452920039042; -0.0002589735319675427 5.0682815503249064e-5 5.534978328717612e-5; -0.0006183452920039042 5.534978328717612e-5 0.006425635871158675]
)
```
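Note that the fit distribution lives in the unconstrained space, so its third component corresponds to log σ. The result also stores the draws themselves, and since `fit_distribution` is an ordinary `MvNormal`, standard Distributions.jl operations apply to it. A minimal sketch (the shapes in the comments are expectations, not captured output):

```julia
# Draws are stored column-wise in the unconstrained space.
size(result_single.draws)  # expect (3, 1000): dims × ndraws

# Additional draws can be taken directly from the fitted normal.
extra_draws = rand(result_single.fit_distribution, 100)  # expect a 3×100 matrix
```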
```julia
result_multi = multipathfinder(model, 1_000; nruns=n_chains)
```
```
Multi-path Pathfinder result
  runs: 8
  draws: 1000
  Pareto shape diagnostic: 0.4 (good)
```
Here, the Pareto shape diagnostic of 0.4 indicates that it is likely safe to use these draws to compute posterior estimates; as a rule of thumb, values below 0.7 are considered reliable.
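The same check can be made programmatically. A short sketch, assuming the multi-path result exposes the PSIS.jl result through a `psis_result` field with a `pareto_shape` value:

```julia
# Warn if the Pareto shape exceeds the conventional 0.7 reliability threshold.
k = result_multi.psis_result.pareto_shape
k < 0.7 || @warn "Pareto shape $k exceeds 0.7; draws may be unreliable"
```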
When passed a Turing `Model`, Pathfinder also gives access to the posterior draws in a familiar `MCMCChains.Chains` object.
```julia
result_multi.draws_transformed
```
```
Chains MCMC chain (1000×3×1 Array{Float64, 3}):

Iterations        = 1:1:1000
Number of chains  = 1
Samples per chain = 1000
parameters        = α, β, σ

Summary Statistics
  parameters      mean       std      mcse    ess_bulk   ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64    Float64   Float64   ⋯

           α    1.4788    0.0457    0.0015    952.9201   979.3608    1.0010   ⋯
           β    2.0057    0.0074    0.0002    933.0380   904.9495    1.0019   ⋯
           σ    0.2186    0.0163    0.0005   1042.8884   947.3674    1.0003   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           α    1.3976    1.4481    1.4761    1.5074    1.5773
           β    1.9907    2.0002    2.0057    2.0113    2.0186
           σ    0.1897    0.2066    0.2175    0.2287    0.2539
```
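The posterior means are consistent with the data-generating values (α = 1.5, β = 2, σ = 0.2), with α pulled slightly toward zero by its standard normal prior. As an illustrative cross-check (not part of the Pathfinder API), one can compare against an ordinary least-squares fit:

```julia
using Statistics

# Illustrative OLS baseline for the same simulated data.
X = [ones(length(x)) collect(x)]  # design matrix: intercept and slope columns
α̂, β̂ = X \ y                      # least-squares estimates of α and β
σ̂ = std(y .- X * [α̂, β̂])          # residual standard deviation
```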
We can also use these posterior draws to initialize MCMC sampling.
```julia
init_params = collect.(eachrow(result_multi.draws_transformed.value[1:n_chains, :, 1]))
```
```
8-element Vector{Vector{Float64}}:
 [1.4759282967777165, 2.0080943498816506, 0.19831475747935035]
 [1.4611019738875703, 2.013202988431498, 0.26485184914663046]
 [1.4536435437635513, 2.0112987905457618, 0.216829720069575]
 [1.4281518099532862, 2.0124332531719316, 0.19539736180790648]
 [1.4386639969929398, 2.0164082389708864, 0.22183924030035906]
 [1.5278919191383389, 1.9929401187515967, 0.19556963421472995]
 [1.517692871682899, 2.002654866458797, 0.23382231537270487]
 [1.4840715079131972, 2.0083322478266146, 0.23299937285164243]
```
```julia
chns = sample(model, Turing.NUTS(), MCMCThreads(), 1_000, n_chains; init_params, progress=false)
```
```
Chains MCMC chain (1000×15×8 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 6.76 seconds
Compute duration  = 5.26 seconds
parameters        = α, β, σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse     ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64      Float64     Float64   Float64   ⋯

           α    1.4758    0.0429    0.0007    3533.3337   4066.7817    1.0013   ⋯
           β    2.0060    0.0075    0.0001    3701.2301   4367.0326    1.0008   ⋯
           σ    0.2167    0.0153    0.0002    5265.1954   4321.8918    1.0036   ⋯
                                                                 1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           α    1.3915    1.4469    1.4752    1.5044    1.5603
           β    1.9912    2.0011    2.0061    2.0111    2.0203
           σ    0.1894    0.2060    0.2157    0.2264    0.2493
```
We can also reuse Pathfinder's estimate of the metric and perform only enough warm-up to tune the step size.
```julia
# Use the covariance estimated by the first Pathfinder run as the inverse metric.
inv_metric = result_multi.pathfinder_results[1].fit_distribution.Σ
metric = Pathfinder.RankUpdateEuclideanMetric(inv_metric)
kernel = HMCKernel(Trajectory{MultinomialTS}(Leapfrog(0.0), GeneralisedNoUTurn()))
adaptor = StepSizeAdaptor(0.8, 1.0)  # adapt only the step size
nuts = AdvancedHMC.HMCSampler(kernel, metric, adaptor)
n_adapts = 50
n_draws = 1_000
```
```julia
chns = sample(
    model,
    externalsampler(nuts),
    MCMCThreads(),
    n_draws + n_adapts,
    n_chains;
    n_adapts,
    init_params,
    progress=false,
)[n_adapts + 1:end, :, :]  # drop warm-up draws
```
```
Chains MCMC chain (1000×16×8 Array{Float64, 3}):

Iterations        = 51:1:1050
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 2.91 seconds
Compute duration  = 2.54 seconds
parameters        = α, β, σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt

Summary Statistics
  parameters      mean       std      mcse     ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64      Float64     Float64   Float64   ⋯

           α    1.4750    0.0465    0.0006    8435.9524   5477.4297    1.0009   ⋯
           β    2.0060    0.0092    0.0002    8316.2670   5781.2053    1.0002   ⋯
           σ    0.2180    0.0425    0.0013    7896.5142   5311.5155    1.0006   ⋯
                                                                 1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           α    1.3889    1.4463    1.4757    1.5041    1.5598
           β    1.9913    2.0012    2.0061    2.0112    2.0210
           σ    0.1883    0.2059    0.2158    0.2271    0.2506
```
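Since only the step size was adapted here, it can be worth inspecting the value the adaptor settled on. A quick sketch, using the `step_size` series from the internals listed above:

```julia
using Statistics

# Average adapted step size across chains and retained iterations.
mean(chns[:step_size])
```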
See Initializing HMC with Pathfinder for further examples.