Running Pathfinder on Turing.jl models

This tutorial demonstrates how Pathfinder can be used with Turing.jl models.

We'll work through a simple linear regression example.

using AdvancedHMC, Pathfinder, Random, Turing
Random.seed!(39)

@model function regress(x, y)
    α ~ Normal()
    β ~ Normal()
    σ ~ truncated(Normal(); lower=0)
    y .~ Normal.(α .+ β .* x, σ)
end
x = 0:0.1:10
y = @. 2x + 1.5 + randn() * 0.2
model = regress(collect(x), y)
n_chains = 8

For convenience, pathfinder and multipathfinder can take Turing models as inputs and produce MCMCChains.Chains objects as outputs.

result_single = pathfinder(model; ndraws=1_000)
Single-path Pathfinder result
  tries: 1
  draws: 1000
  fit iteration: 22 (total: 26)
  fit ELBO: -1.82 ± 0.1
  fit distribution: Distributions.MvNormal{Float64, Pathfinder.WoodburyPDMat{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, Matrix{Float64}, Matrix{Float64}, Pathfinder.WoodburyPDFactorization{Float64, LinearAlgebra.Diagonal{Float64, Vector{Float64}}, LinearAlgebra.QRCompactWYQ{Float64, Matrix{Float64}, Matrix{Float64}}, LinearAlgebra.UpperTriangular{Float64, Matrix{Float64}}}}, Vector{Float64}}(
dim: 3
μ: [1.4759842060455182, 2.0060275818274436, -1.5479825354725543]
Σ: [0.0018296810848421224 -0.0002589735319675427 -0.0006183452920039042; -0.0002589735319675427 5.0682815503249064e-5 5.534978328717612e-5; -0.0006183452920039042 5.534978328717612e-5 0.006425635871158675]
)
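The fit distribution is an ordinary Distributions.jl MvNormal, so the usual Distributions API applies to it. As a minimal sketch (using a stand-in MvNormal of the same 3-dimensional shape, since the exact fit depends on the run), we can draw additional samples or evaluate the log-density:

```julia
using Distributions, LinearAlgebra

# Stand-in for result_single.fit_distribution (same 3-dimensional shape;
# the actual mean and covariance come from the Pathfinder run above).
fit = MvNormal([1.48, 2.01, -1.55], Diagonal([0.0018, 5.1e-5, 0.0064]))

extra_draws = rand(fit, 100)         # 3×100 matrix: one column per draw
lp = logpdf(fit, extra_draws[:, 1])  # log-density of a single draw
```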
result_multi = multipathfinder(model, 1_000; nruns=n_chains)
Multi-path Pathfinder result
  runs: 8
  draws: 1000
  Pareto shape diagnostic: 0.4 (good)

Here, the Pareto shape diagnostic is below the rule-of-thumb threshold of 0.7, indicating that it is likely safe to use these draws to compute posterior estimates.
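That rule of thumb comes from the Pareto-smoothed importance sampling literature: shape values below roughly 0.7 are considered reliable. A minimal sketch of such a check (pareto_shape_ok is a hypothetical helper, not part of Pathfinder's API):

```julia
# Hypothetical helper (not part of Pathfinder's API): interpret the
# Pareto shape diagnostic using the common 0.7 rule of thumb.
pareto_shape_ok(k; threshold=0.7) = k < threshold

pareto_shape_ok(0.4)   # the multi-path result above: likely safe
pareto_shape_ok(0.9)   # a shape this large would warrant caution
```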

When passed a Model, Pathfinder also gives access to the posterior draws in a familiar MCMCChains.Chains object.

result_multi.draws_transformed
Chains MCMC chain (1000×3×1 Array{Float64, 3}):

Iterations        = 1:1:1000
Number of chains  = 1
Samples per chain = 1000
parameters        = α, β, σ

Summary Statistics
  parameters      mean       std      mcse    ess_bulk   ess_tail      rhat    ⋯
      Symbol   Float64   Float64   Float64     Float64    Float64   Float64    ⋯

           α    1.4788    0.0457    0.0015    952.9201   979.3608    1.0010    ⋯
           β    2.0057    0.0074    0.0002    933.0380   904.9495    1.0019    ⋯
           σ    0.2186    0.0163    0.0005   1042.8884   947.3674    1.0003    ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

           α    1.3976    1.4481    1.4761    1.5074    1.5773
           β    1.9907    2.0002    2.0057    2.0113    2.0186
           σ    0.1897    0.2066    0.2175    0.2287    0.2539
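Because this is a standard MCMCChains.Chains object, individual parameters can be pulled out with the usual indexing. A minimal sketch, using a stand-in chain with the same layout in place of result_multi.draws_transformed:

```julia
using MCMCChains, Statistics

# Stand-in for result_multi.draws_transformed: a Chains with the same layout
# (1000 iterations × 3 parameters × 1 chain).
chn = Chains(randn(1000, 3, 1), [:α, :β, :σ])

β_draws = vec(chn[:β])   # extract one parameter's draws as a plain Vector
mean(β_draws)            # posterior mean estimate for β
```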

We can also use these posterior draws to initialize MCMC sampling.

init_params = collect.(eachrow(result_multi.draws_transformed.value[1:n_chains, :, 1]))
8-element Vector{Vector{Float64}}:
 [1.4759282967777165, 2.0080943498816506, 0.19831475747935035]
 [1.4611019738875703, 2.013202988431498, 0.26485184914663046]
 [1.4536435437635513, 2.0112987905457618, 0.216829720069575]
 [1.4281518099532862, 2.0124332531719316, 0.19539736180790648]
 [1.4386639969929398, 2.0164082389708864, 0.22183924030035906]
 [1.5278919191383389, 1.9929401187515967, 0.19556963421472995]
 [1.517692871682899, 2.002654866458797, 0.23382231537270487]
 [1.4840715079131972, 2.0083322478266146, 0.23299937285164243]
chns = sample(model, Turing.NUTS(), MCMCThreads(), 1_000, n_chains; init_params, progress=false)
Chains MCMC chain (1000×15×8 Array{Float64, 3}):

Iterations        = 501:1:1500
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 6.76 seconds
Compute duration  = 5.26 seconds
parameters        = α, β, σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64   ⋯

           α    1.4758    0.0429    0.0007   3533.3337   4066.7817    1.0013   ⋯
           β    2.0060    0.0075    0.0001   3701.2301   4367.0326    1.0008   ⋯
           σ    0.2167    0.0153    0.0002   5265.1954   4321.8918    1.0036   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

           α    1.3915    1.4469    1.4752    1.5044    1.5603
           β    1.9912    2.0011    2.0061    2.0111    2.0203
           σ    0.1894    0.2060    0.2157    0.2264    0.2493

Since Pathfinder's fit distribution approximates the posterior covariance, we can reuse it as HMC's inverse metric and perform only enough warm-up to tune the step size.

inv_metric = result_multi.pathfinder_results[1].fit_distribution.Σ
metric = Pathfinder.RankUpdateEuclideanMetric(inv_metric)
kernel = HMCKernel(Trajectory{MultinomialTS}(Leapfrog(0.0), GeneralisedNoUTurn()))
adaptor = StepSizeAdaptor(0.8, 1.0)  # adapt only the step size
nuts = AdvancedHMC.HMCSampler(kernel, metric, adaptor)

n_adapts = 50
n_draws = 1_000
chns = sample(
    model,
    externalsampler(nuts),
    MCMCThreads(),
    n_draws + n_adapts,
    n_chains;
    n_adapts,
    init_params,
    progress=false,
)[n_adapts + 1:end, :, :]  # drop warm-up draws
Chains MCMC chain (1000×16×8 Array{Float64, 3}):

Iterations        = 51:1:1050
Number of chains  = 8
Samples per chain = 1000
Wall duration     = 2.91 seconds
Compute duration  = 2.54 seconds
parameters        = α, β, σ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size, is_adapt

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64   ⋯

           α    1.4750    0.0465    0.0006   8435.9524   5477.4297    1.0009   ⋯
           β    2.0060    0.0092    0.0002   8316.2670   5781.2053    1.0002   ⋯
           σ    0.2180    0.0425    0.0013   7896.5142   5311.5155    1.0006   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5% 
      Symbol   Float64   Float64   Float64   Float64   Float64 

           α    1.3889    1.4463    1.4757    1.5041    1.5598
           β    1.9913    2.0012    2.0061    2.0112    2.0210
           σ    0.1883    0.2059    0.2158    0.2271    0.2506

See Initializing HMC with Pathfinder for further examples.