Hybrid Imaging of a Black Hole
In this tutorial, we will use hybrid imaging to analyze the 2017 EHT data. By hybrid imaging, we mean decomposing the model into simple geometric models, such as rings, plus a rasterized image model to soak up the additional structure. This approach was first developed in BB20 and applied to EHT 2017 data. We will use a similar model in this tutorial.
Introduction to Hybrid modeling and imaging
The benefit of a hybrid modeling approach is the effective compression of information/parameters when fitting the data. Hybrid modeling requires the user to incorporate specific knowledge of what the source is expected to look like. For instance, for M87 we expect the image to be dominated by a ring-like structure. Therefore, instead of using a high-dimensional raster to recover the ring, we can use a ring model plus a raster to soak up the additional degrees of freedom. This is the approach we will take in this tutorial to analyze the April 6 2017 EHT data of M87.
Loading the Data
To get started we will load Comrade
using Comrade
Load the Data
using Pyehtim
For reproducibility we use a stable random number generator
using StableRNGs
rng = StableRNG(11)
StableRNGs.LehmerRNG(state=0x00000000000000000000000000000017)
To download the data visit https://doi.org/10.25739/g85n-f134. To load the eht-imaging obsdata object we do:
obs = ehtim.obsdata.load_uvfits(joinpath(@__DIR__, "..", "..", "Data", "SR1_M87_2017_096_lo_hops_netcal_StokesI.uvfits"))
Python: <ehtim.obsdata.Obsdata object at 0x7f580273b7f0>
Now we do some minor preprocessing:
- Scan average the data, since the data have been preprocessed so that the gain phases are coherent.
- Add 2% fractional noise to the data (the add_fractional_noise(0.02) call below).
obs = scan_average(obs).add_fractional_noise(0.02)
Python: <ehtim.obsdata.Obsdata object at 0x7f57dfe3d390>
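To give a feel for what a 2% fractional noise term does to an error bar, here is a minimal sketch. It assumes the fractional term is added in quadrature to the thermal noise, which is our reading of add_fractional_noise and should be checked against the eht-imaging documentation. The helper inflate_noise is hypothetical and is not part of Pyehtim or Comrade.

```julia
# Hypothetical helper (not part of Pyehtim or Comrade): inflate a thermal
# error σ by a fractional systematic term, assumed to add in quadrature.
inflate_noise(σ, vis, frac) = sqrt(σ^2 + (frac * abs(vis))^2)

# A 0.5 Jy visibility with 0.01 Jy thermal noise and a 2% fractional term.
σ_new = inflate_noise(0.01, 0.5 + 0.0im, 0.02)
```

Under this assumption the fractional term matters most on high signal-to-noise points, where 2% of the amplitude is comparable to or larger than the thermal error.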
For this tutorial we will once again fit complex visibilities, since they provide the most information once the telescope/instrument model is taken into account.
dvis = extract_table(obs, Visibilities())
EHTObservationTable{Comrade.EHTVisibilityDatum{:I}}
source: M87
mjd: 57849
bandwidth: 1.856e9
sites: [:AA, :AP, :AZ, :JC, :LM, :PV, :SM]
nsamples: 274
Building the Model/Posterior
Now we build our intensity/visibility model. This is a function that takes in a named tuple of parameters, plus any metadata required to construct the model. For our model, we will use a raster or ContinuousImage model, an m-ring model, and a large Gaussian component (a fixed 250 μas Gaussian in the code below) to model the unresolved short-baseline flux.
function sky(θ, metadata)
    (;c, σimg, f, r, σ, ma, mp, fg) = θ
    (;ftot, grid) = metadata
    # Form the image model
    # Apply the non-centered transform (σimg.*c), then map to simplex space
    rast = (ftot*f*(1-fg)).*to_simplex(CenteredLR(), σimg.*c)
    mimg = ContinuousImage(rast, grid, BSplinePulse{3}())
    # Form the ring model
    α = ma.*cos.(mp)
    β = ma.*sin.(mp)
    ring = smoothed(modify(MRing(α, β), Stretch(r), Renormalize((ftot*(1-f)*(1-fg)))), σ)
    gauss = modify(Gaussian(), Stretch(μas2rad(250.0)), Renormalize(ftot*f*fg))
    # We group the geometric models together for improved efficiency. This will be
    # automated in future versions.
    return mimg + (ring + gauss)
end
sky (generic function with 1 method)
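To make the flux bookkeeping in sky easier to follow, the sketch below recomputes the flux handed to each component from the three Renormalize and scaling factors in the function. The helper flux_budget is a hypothetical name introduced only for illustration, and the parameter values are round numbers rather than fitted results.

```julia
# Flux assigned to each component, mirroring the scaling factors used in `sky`.
# `flux_budget` is a hypothetical helper, not part of Comrade.
function flux_budget(ftot, f, fg)
    raster = ftot * f * (1 - fg)        # scaling of the simplex raster
    ring   = ftot * (1 - f) * (1 - fg)  # Renormalize argument of the m-ring
    gauss  = ftot * f * fg              # Renormalize argument of the Gaussian
    return (; raster, ring, gauss, total = raster + ring + gauss)
end

budget = flux_budget(1.1, 0.7, 0.04)
```

With these factors the raster and ring together carry ftot*(1-fg), with f the raster's share and 1-f the ring's share, and the three components sum to ftot*(1 - fg*(1-f)).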
Unlike other imaging examples (e.g., Imaging a Black Hole using only Closure Quantities) we also need to include a model for the instrument, i.e., the gains. The gains will be broken into two components:
- Gain amplitudes, which are typically known to 10-20%, except for LMT, whose amplitude uncertainties are closer to 50-100%.
- Gain phases, which are more difficult to constrain and can shift rapidly.
using VLBIImagePriors
using Distributions
fgain(x) = exp(x.lg + 1im*x.gp)
G = SingleStokesGain(fgain)
intpr = (
    lg = ArrayPrior(IIDSitePrior(ScanSeg(), Normal(0.0, 0.2)); LM = IIDSitePrior(ScanSeg(), Normal(0.0, 1.0))),
    gp = ArrayPrior(IIDSitePrior(ScanSeg(), DiagonalVonMises(0.0, inv(π^2))); refant=SEFDReference(0.0))
)
intmodel = InstrumentModel(G, intpr)
InstrumentModel
with Jones: SingleStokesGain
with reference basis: CirBasis()
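The gain parameterization above is easy to check by hand: lg is a log-amplitude and gp is a phase in radians. The sketch below uses the same functional form as fgain but with plain scalar arguments; the name gain is introduced here only for illustration.

```julia
# Same functional form as `fgain` above, written with scalar arguments.
gain(lg, gp) = exp(lg + im * gp)

g1 = gain(0.2, π / 4)
amp, phase = abs(g1), angle(g1)   # amplitude is exp(lg), phase is gp
```

A one-standard-deviation draw from the Normal(0.0, 0.2) prior on lg therefore corresponds to a gain amplitude factor of exp(0.2), about 1.22, while the wider Normal(0.0, 1.0) prior for LM allows factors of order exp(1).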
Before we move on, let's go into the model function a bit. This function takes two arguments, θ and metadata. The θ argument is a named tuple of parameters that are fit to the data. The metadata argument is all the ancillary information we need to construct the model. For our hybrid model, the metadata holds two variables: the total flux ftot and a grid that specifies the locations of the image pixels. The grid is required since ContinuousImage is most easily computed using numerical Fourier transforms like the NFFT or FFT. To combine the models, we use Comrade's overloaded + operators, which will combine the images such that their intensities and visibilities are added pointwise.
Now let's define our metadata. First we will define the grid for the image. This is required to compute the numerical Fourier transform.
fovxy = μas2rad(200.0)
npix = 32
g = imagepixels(fovxy, fovxy, npix, npix)
RectiGrid(
executor: Serial()
Dimensions:
(↓ X Sampled{Float64} LinRange{Float64}(-4.69663253574863e-10, 4.69663253574863e-10, 32) ForwardOrdered Regular Points,
→ Y Sampled{Float64} LinRange{Float64}(-4.69663253574863e-10, 4.69663253574863e-10, 32) ForwardOrdered Regular Points)
)
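The grid limits printed above can be reproduced with a few lines of plain Julia, which shows that the coordinates are pixel centers rather than pixel edges. This sketch does not call Comrade; uas_to_rad is a stand-in written here for the μas2rad conversion.

```julia
# Plain-Julia stand-in for the μas2rad conversion used in this tutorial.
uas_to_rad(x) = x * π / (180 * 3600 * 1e6)

fov  = uas_to_rad(200.0)   # field of view in radians
npix = 32
dx   = fov / npix          # pixel size in radians (6.25 μas)

# Pixel centers sit half a pixel inside the edges of the field of view.
centers = LinRange(-fov/2 + dx/2, fov/2 - dx/2, npix)
```

The first and last entries of centers agree with the LinRange limits shown in the RectiGrid output.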
Part of hybrid imaging is to force a scale separation between the different model components to make them identifiable. To enforce this we will set the raster component to have a correlation length of 5 times the beam size.
beam = beamsize(dvis)
rat = (beam/(step(g.X)))
cprior = GaussMarkovRandomField(5*rat, size(g))
GaussMarkovRandomField(
Graph: MarkovRandomFieldGraph{1}(
dims: (32, 32)
)
Correlation Parameter: 19.9737623915157
)
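As a quick sanity check on the printed correlation parameter, we can back out the beam size it implies. The sketch below only uses numbers copied from the outputs above and assumes the pixel size is the 200 μas field of view divided by 32 pixels.

```julia
# Values copied from the outputs above.
corr_param = 19.9737623915157   # printed "Correlation Parameter", i.e. 5 * rat
pixel_uas  = 200.0 / 32         # pixel size in μas for the 32×32 grid

rat      = corr_param / 5       # beam size in units of pixels
beam_uas = rat * pixel_uas      # implied beam size in μas
```

The beam is thus about four pixels across, or roughly 25 μas, and the raster correlation length is about 20 pixels.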
For the other parameters we use uniform priors: the flux fraction f (the raster's share of the compact flux; the ring receives 1 - f), the ring radius r, the ring width σ, the flux fraction of the Gaussian component fg, and the amplitudes ma of the ring brightness modes. For the angular variables mp we use the von Mises prior with concentration parameter inv(π^2), which is essentially a uniform prior on the circle. Finally, for the standard deviation σimg of the MRF we use a half-normal distribution (truncated below at 0.01 in the code). This ensures that the MRF has small differences from the mean image.
skyprior = (
    c = cprior,
    σimg = truncated(Normal(0.0, 0.1); lower=0.01),
    f = Uniform(0.0, 1.0),
    r = Uniform(μas2rad(10.0), μas2rad(30.0)),
    σ = Uniform(μas2rad(0.1), μas2rad(10.0)),
    ma = ntuple(_->Uniform(0.0, 0.5), 2),
    mp = ntuple(_->DiagonalVonMises(0.0, inv(π^2)), 2),
    fg = Uniform(0.0, 1.0),
)
(c = GaussMarkovRandomField(
Graph: MarkovRandomFieldGraph{1}(
dims: (32, 32)
)
Correlation Parameter: 19.9737623915157
), σimg = Truncated(Distributions.Normal{Float64}(μ=0.0, σ=0.1); lower=0.01), f = Distributions.Uniform{Float64}(a=0.0, b=1.0), r = Distributions.Uniform{Float64}(a=4.84813681109536e-11, b=1.454441043328608e-10), σ = Distributions.Uniform{Float64}(a=4.848136811095359e-13, b=4.84813681109536e-11), ma = (Distributions.Uniform{Float64}(a=0.0, b=0.5), Distributions.Uniform{Float64}(a=0.0, b=0.5)), mp = (DiagonalVonMises{Float64, Float64, Float64}(μ=0.0, κ=0.10132118364233778, lnorm=-1.739120733481688), DiagonalVonMises{Float64, Float64, Float64}(μ=0.0, κ=0.10132118364233778, lnorm=-1.739120733481688)), fg = Distributions.Uniform{Float64}(a=0.0, b=1.0))
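To see why a small σimg keeps the raster close to a flat image, the sketch below maps a scaled field onto the simplex with a softmax. This is an illustrative stand-in that assumes to_simplex with CenteredLR behaves like the inverse centered log-ratio transform (a softmax); it is not the VLBIImagePriors implementation, and softmax_simplex is a hypothetical name.

```julia
# Illustrative stand-in, not the VLBIImagePriors implementation: map an
# unconstrained array onto the simplex with a softmax.
function softmax_simplex(x)
    w = exp.(x .- maximum(x))   # subtract the maximum for numerical stability
    return w ./ sum(w)
end

c    = [0.5 -1.0; 0.2 0.3]      # stand-in for a draw of the MRF field
σimg = 0.1                      # a small σimg shrinks the field toward zero
p    = softmax_simplex(σimg .* c)
flat = softmax_simplex(zeros(2, 2))
```

The result p is positive and sums to one, so multiplying it by a flux gives pixel fluxes that add up to that flux. When the scaled field is zero every pixel receives the same share, which is the flat image that the MRF perturbs.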
Now we form the metadata
skymetadata = (;ftot=1.1, grid = g)
skym = SkyModel(sky, skyprior, g; metadata=skymetadata)
SkyModel
with map: sky
on grid: RectiGrid
This is everything we need to specify our posterior distribution, which is the main object of interest in image reconstructions when using Bayesian inference.
using Enzyme
post = VLBIPosterior(skym, intmodel, dvis; admode=set_runtime_activity(Enzyme.Reverse))
VLBIPosterior
ObservedSkyModel
with map: sky
on grid: FourierDualDomain
ObservedInstrumentModel
with Jones: SingleStokesGain
with reference basis: CirBasis()
Data Products: Comrade.EHTVisibilityDatum
To sample from our prior we can do
xrand = prior_sample(rng, post)
(sky = (c = [-0.44529092134185344 0.03378078605047161 … -0.4741567124507198 -0.23433102934140357; -0.4686635305847552 0.18012124390317302 … -0.17609321912104284 -0.10771419292257058; … ; 0.44000245511552644 -0.2869933402739895 … 0.3548543065216182 0.40313489057389673; 0.3538865353388511 -0.6737438360167014 … 0.8952366303298505 -0.029392711178885984], σimg = 0.03980227243556875, f = 0.5676454586976165, r = 4.955112263430132e-11, σ = 1.1076060681084754e-11, ma = (0.02128458313753645, 0.39103319842041473), mp = (0.6885456830314665, -0.8710032911110378), fg = 0.8170587283681274), instrument = (lg = [-0.22265538080801509, 0.05957725391983268, 0.14343168186903135, -0.07258306817202874, -1.3057094576190185, -0.13638716173082255, -0.09333852136523127, 0.04692075531264257, 0.7783863806163226, -0.10780502461243083 … 0.21286064829799356, -0.15420064225238939, -1.6387018716890742, -0.2152268655348303, 0.0963947567851535, -0.06329661607019992, -0.2128057296996201, -0.7234394354877258, 0.5621991578468002, -0.10933291138585476], gp = [0.0, 1.0134045335677988, 0.0, -2.08881731850661, 2.3757067401785634, -0.2475011454636265, 0.0, 1.9449518769297747, -0.34648176580934803, -2.0364149631719046 … 2.034386889255044, -2.8089100833276595, 1.7580411164254417, -3.1394933556654108, 0.0, 0.1640288876455451, -1.0881883594200465, 2.171308713291473, 1.4083967803967004, 2.2543821470654635]))
and then plot the results
import CairoMakie as CM
gpl = imagepixels(μas2rad(200.0), μas2rad(200.0), 128, 128)
fig = imageviz(intensitymap(skymodel(post, xrand), gpl));
Reconstructing the Image
To find the image we will demonstrate two methods:
- Optimization to find the MAP (fast but often a poor estimator)
- Sampling to find the posterior (slow but provides a substantially better estimator)
For optimization we will use the Optimization.jl package and the LBFGS optimizer. To use this we call the comrade_opt function
using Optimization
using OptimizationOptimJL
xopt, sol = comrade_opt(post, LBFGS();
                        initial_params=xrand, maxiters=2000, g_tol=1e0)
((sky = (c = [-0.12831606472375262 -0.1890389982095457 … -0.4579811004250614 -0.3033131450685616; -0.15917003182874973 -0.2667480705001328 … -0.561937523438752 -0.38793037261527386; … ; 0.26716645060481237 0.16859462573038722 … -0.13680091903664413 -0.1707030730731025; 0.15408688202039558 0.05478641116139435 … -0.11167747829868073 -0.11851027361716268], σimg = 0.74875932231518, f = 0.6993743226502495, r = 1.0473586384791714e-10, σ = 1.3595089315434176e-11, ma = (0.27876962217357126, 0.07853702354231189), mp = (2.629886756349823, 2.588262120205745), fg = 0.038302000300862935), instrument = (lg = [0.02566550553606351, 0.025708700972675552, 0.019685737534160624, 0.030512105977859996, -0.21259049917676864, 0.12477672937070486, 0.008896391836136806, 0.03606967843005842, -0.010718350962117599, 0.09084047567010506 … 0.034543287719423735, 0.019991407333753465, -0.6366394432209154, 0.01988433814849745, 0.015642788235023724, 0.03385991133519022, -0.021299941234713543, 0.026323829178653935, -0.6785337232464154, 0.01643931114747018], gp = [0.0, 0.053638635953940104, 0.0, -2.1911695512610447, 1.3536914523977444, -0.1068266258213508, 0.0, -2.24132977639159, 1.4121386286792321, -0.3136477488393237 … -1.779008977343235, -0.011114951933827928, -2.7442075098640375, -3.1328048863822384, 0.0, -1.8306371355092281, -1.87755229502336, 0.009921487811946832, -2.8791978711391604, -3.1148770276364983])), retcode: Failure
u: [-0.12831606472375262, -0.15917003182874973, -0.15693493353950375, -0.12899793572879664, -0.12746163161244073, -0.19171000170713476, -0.2698191006979437, -0.22788720689062217, -0.14546550636938815, -0.04461903130760405 … -0.9078145671466141, -0.24134349409510028, -0.8894380322911237, -0.2817333255022532, 0.009286731757371464, 0.9359913770385953, -0.24220565558980944, -0.901775819607393, -0.02499373521331736, -0.9353248454261197]
Final objective value: -1612.9377384894888
)
First we will evaluate our fit by plotting the residuals
using Plots
fig = residual(post, xopt);
These residuals suggest that we are substantially overfitting the data. This is a common side effect of MAP imaging. As a result, if we plot the image, we see substantial high-frequency structure that isn't supported by the data.
imageviz(intensitymap(skymodel(post, xopt), gpl), figure=(;resolution=(500, 400),))
To improve our results we will now move to posterior sampling. This is the main method we recommend for all inference problems in Comrade. While it is slower, the results are often substantially better. To sample we will use the AdvancedHMC package.
using AdvancedHMC
chain = sample(rng, post, NUTS(0.8), 700; n_adapts=500, progress=false, initial_params=xopt);
[ Info: Found initial step size 0.003125
We then remove the adaptation/warmup phase from our chain
chain = chain[501:end]
PosteriorSamples
Samples size: (200,)
sampler used: AHMC
Mean
┌───────────────────────────────────────────────────────────────────────────────
│ sky ⋯
│ @NamedTuple{c::Matrix{Float64}, σimg::Float64, f::Float64, r::Float64, σ::Fl ⋯
├───────────────────────────────────────────────────────────────────────────────
│ (c = [-0.094047 -0.128681 … -0.295233 -0.191892; -0.127476 -0.18552 … -0.329 ⋯
└───────────────────────────────────────────────────────────────────────────────
2 columns omitted
Std. Dev.
┌───────────────────────────────────────────────────────────────────────────────
│ sky ⋯
│ @NamedTuple{c::Matrix{Float64}, σimg::Float64, f::Float64, r::Float64, σ::Fl ⋯
├───────────────────────────────────────────────────────────────────────────────
│ (c = [0.580965 0.629557 … 0.525759 0.495616; 0.557537 0.565906 … 0.576435 0. ⋯
└───────────────────────────────────────────────────────────────────────────────
2 columns omitted
Warning
This should be run for 4-5x more steps to properly estimate expectations of the posterior
Now let's plot the mean and standard deviation images. To do this we use the chain with the first 500 adaptation steps already removed, since during tuning the sampler is not sampling from the correct stationary distribution, and we keep every second sample.
using StatsBase
msamples = skymodel.(Ref(post), chain[begin:2:end]);
The mean image is then given by
imgs = intensitymap.(msamples, Ref(gpl))
fig = imageviz(mean(imgs), colormap=:afmhot, size=(400, 300));
fig = imageviz(std(imgs), colormap=:batlow, size=(400, 300));
We can also split up the model into its components and analyze each separately
comp = Comrade.components.(msamples)
ring_samples = getindex.(comp, 2)
rast_samples = first.(comp)
ring_imgs = intensitymap.(ring_samples, Ref(gpl));
rast_imgs = intensitymap.(rast_samples, Ref(gpl));
ring_mean, ring_std = mean_and_std(ring_imgs);
rast_mean, rast_std = mean_and_std(rast_imgs);
fig = CM.Figure(; resolution=(400, 400));
axes = [CM.Axis(fig[i, j], xreversed=true, aspect=CM.DataAspect()) for i in 1:2, j in 1:2]
CM.image!(axes[1,1], ring_mean, colormap=:afmhot); axes[1,1].title = "Ring Mean"
CM.image!(axes[1,2], ring_std, colormap=:afmhot); axes[1,2].title = "Ring Std. Dev."
CM.image!(axes[2,1], rast_mean, colormap=:afmhot); axes[2,1].title = "Rast Mean"
CM.image!(axes[2,2], rast_std./rast_mean, colormap=:afmhot); axes[2,2].title = "Rast std/mean"
CM.hidedecorations!.(axes)
using DisplayAs
fig |> DisplayAs.PNG |> DisplayAs.Text
Finally, let's take a look at some of the ring parameters
figd = CM.Figure(;resolution=(650, 400));
p1 = CM.density(figd[1,1], rad2μas(chain.sky.r)*2, axis=(xlabel="Ring Diameter (μas)",))
p2 = CM.density(figd[1,2], rad2μas(chain.sky.σ)*2*sqrt(2*log(2)), axis=(xlabel="Ring FWHM (μas)",))
p3 = CM.density(figd[1,3], -rad2deg.(chain.sky.mp.:1) .+ 360.0, axis=(xlabel = "Ring PA (deg) E of N",))
p4 = CM.density(figd[2,1], 2*chain.sky.ma.:2, axis=(xlabel="Brightness asymmetry",))
p5 = CM.density(figd[2,2], 1 .- chain.sky.f, axis=(xlabel="Ring flux fraction",))
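The unit conversions behind the first two panels are simple enough to check by hand. The sketch below repeats them in plain Julia, using rad_to_uas as a stand-in for rad2μas, and applies them to the r and σ values from the MAP output xopt printed earlier. The helper names are introduced here only for illustration.

```julia
# Plain-Julia stand-in for rad2μas.
rad_to_uas(x) = x * (180 * 3600 * 1e6) / π

diameter_uas(r) = 2 * rad_to_uas(r)                      # diameter from radius
fwhm_uas(σ)     = 2 * sqrt(2 * log(2)) * rad_to_uas(σ)   # Gaussian FWHM from std. dev.

# Values of r and σ copied from the MAP output `xopt` printed earlier.
d = diameter_uas(1.0473586384791714e-10)
w = fwhm_uas(1.3595089315434176e-11)
```

For the MAP point this gives a ring diameter of roughly 43 μas and a ring FWHM of roughly 6.6 μas; the density plots show how far the posterior spreads around such values.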
Now let's check the residuals using draws from the posterior
p = Plots.plot();
for s in sample(chain, 10)
    residual!(p, post, s, legend=false)
end
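When reading residual plots like these, it helps to keep in mind what a normalized residual is. The sketch below is a minimal illustration on made-up numbers, assuming residuals are defined as (data - model)/σ. Both helper functions are hypothetical and are not Comrade functions, and the simple reduced chi-square here ignores the number of fitted parameters.

```julia
# Hypothetical helpers for illustration only; these are not Comrade functions.
normalized_residuals(data, model, σ) = (data .- model) ./ σ

# Each complex visibility contributes two real numbers (real and imaginary parts).
reduced_chi2(res) = sum(abs2, res) / (2 * length(res))

# Made-up visibilities, model predictions, and errors.
data  = [1.0 + 0.5im, 0.8 - 0.2im]
model = [0.9 + 0.5im, 0.8 - 0.1im]
σ     = [0.1, 0.1]

res = normalized_residuals(data, model, σ)
χ2  = reduced_chi2(res)
```

Normalized residuals scattered with roughly unit variance indicate a fit consistent with the noise, while residuals that are much smaller than that, as in the MAP fit earlier, are a hint of overfitting.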
And everything looks pretty good! Now comes the hard part: interpreting the results...
This page was generated using Literate.jl.