Regional Heat Flow¶
reheatfunq.regional
¶
The reheatfunq.regional
module contains functionality to analyze regional
aggregate heat flow distributions using the
GammaConjugatePrior
and HeatFlowPredictive
classes.
The workflow for regional aggregate heat flow analysis using REHEATFUNQ consists
of the following steps:
Define the \(d_\mathrm{min}\) (e.g. \({d_\mathrm{min} = 20\,\mathrm{km}}\))
Define the conjugate prior to use. Obtain a
GammaConjugatePrior
instance (e.g. using the REHEATFUNQ default fromdefault_prior()
).Compute the posterior predictive heat flow distribution using the
HeatFlowPredictive
class. This class performs the bootstrapped updating of the gamma conjugate prior over the set of \(d_\mathrm{min}\)-conforming subsets of the heat flow data.
Exemplarily, the following code summarizes the analysis. First, we generate some toy heat flow data following a gamma distribution:
import numpy as np
rng = np.random.default_rng(123920)
alpha = 53.3
q = rng.gamma(alpha, size=15)
x = 100e3 * (rng.random(15) - 0.5)
y = 100e3 * (rng.random(15) - 0.5)
The resulting synthetic heat flow data should look like this:
Next, the GammaConjugatePrior
and HeatFlowPredictive
classes can be used to evaluate the regional aggregate heat flow distribution
from this data:
from reheatfunq.regional import (GammaConjugatePrior,
default_prior,
HeatFlowPredictive)
gcp = default_prior()
predictive = HeatFlowPredictive(q, x, y, gcp, dmin=20e3)
qplt = np.linspace(35, 85)
cdf = predictive.cdf(qplt)
The posterior predictive CDF is a broadened compared to the original CDF owing to the finite sample size and averaging over alternating data points for pairs within exclusory distance:
This image might vary slightly due to the non-fixed random number generator in
the HeatFlowPredictive
class.
A detailed use of the regional aggregate heat flow distribution estimation can be found in the Jupyter notebook jupyter/REHEATFUNQ/06-Heat-Flow-Analysis.ipynb.
- class GammaConjugatePrior(p, s, n, v, lp=None, amin=1.0)¶
Gamma conjugate prior by Miller [Miller1980].
- Parameters:
p (float | None) – The parameter \(p\) of the gamma conjugate prior. Can be seen as the initial product of heat flow values. Alternatively, \(\ln p\) can be specified through the
lp
parameter when passingNone
as argument forp
..s (float) – The parameter \(s\) of the gamma conjugate prior. Can be seen as the initial sum of heat flow values.
n (float) – The parameter \(n\) of the gamma conjugate prior. For normalization, \(n \geq v\) needs to be fulfilled.
v (float) – The parameter \(v\) of the gamma conjugate prior. For normalization, \(n \geq v\) needs to be fulfilled.
lp (float | None) – The natural logarithm of the parameter \(p\). An alternative way to specify \(p\).
amin (float) – The minimum \(\alpha\) for which the prior is defined. Has to be non-negative.
- kullback_leibler(other, amin=1.0, epsabs=0.0, epsrel=1e-10)¶
Compute the Kullback-Leibler divergence to another gamma conjugate prior.
- Parameters:
other (GammaConjugatePrior | Iterable[GammaConjugatePrior]) – Other gamma conjugate prior(s).
epsabs (float, optional) – Absolute tolerance parameter passed to the quadrature routines.
epsrel (float, optional) – Relative tolerance parameter passed to the quadrature routines
- Returns:
KL – The maximum of the Kullback-Leibler divergences from this reference prior PDF to ther
other
PDF(s).- Return type:
float
- log_likelihood(a, b)¶
Evaluate the log-likelihood given a set of gamma parameters \(\{(\alpha_i, \beta_i) : i = 1,...,N\}\).
- Parameters:
a (array_like) – Set of gamma distribution shape parameters \(a\). Has to be of the same shape as
b
.b (array_like) – Set of gamma distribution scale parameters \(b\). Has to be of the same shape as
a
.
- Returns:
p – The logarithm of the evaluated prior probability of the parameter pairs \(\{(\alpha_i, \beta_i)\}\).
- Return type:
array_like
- log_probability(a, b)¶
Evaluate the logarithm of the probability at parameter points.
- Parameters:
a (array_like) – Set of gamma distribution shape parameters \(a\). Has to be of the same shape as
b
.b (array_like) – Set of gamma distribution scale parameters \(b\). Has to be of the same shape as
a
.
- Returns:
p – The logarithm of the evaluated prior probability of the parameter pairs \(\{(\alpha_i, \beta_i)\}\).
- Return type:
array_like
- static maximum_likelihood_estimate(a, b, p0=1.0, s0=1.0, n0=1.5, v0=1.0, nv_surplus_min=0.04, vmin=0.1, amin=1.0, epsabs=0.0, epsrel=1e-10)¶
Compute the maximum likelihood estimate of the gamma conjugate prior (GCP) given a set of gamma distribution parameters \(\{(\alpha_i, \beta_i) : i = 1,...,N\}\).
- Parameters:
a (array_like) – Set of gamma distribution shape parameters \(a\). Has to be of the same shape as
b
.b (array_like) – Set of gamma distribution scale parameters \(b\). Has to be of the same shape as
a
.p0 (float, optional) – Initial guess for the GCP parameter \(p\).
s0 (float, optional) – Initial guess for the GCP parameter \(s\).
n0 (float, optional) – Initial guess for the GCP parameter \(n\).
v0 (float, optional) – Initial guess for the GCP parameter \(v\).
nv_surplus_min (float, optional) – Ensures that
n >= v * (1 + nv_surplus_min)
.amin (float, optional) – The minimum \(\alpha\) for which the prior is defined. Has to be non-negative.
epsabs (float, optional) – Absolute tolerance parameter passed to the optimization algorithm.
epsrel (float, optional) – Relative tolerance parameter passed to the optimization algorithm.
- Returns:
gcp – The gamma conjugate prior with optimized parameters.
- Return type:
- static minimum_surprise_estimate(hf_samples, pmin=1.0, pmax=100000.0, smin=0.0, smax=1000.0, vmin=0.02, vmax=1.0, nv_surplus_min=1e-08, nv_surplus_max=2.0, amin=1.0, shgo_kwargs={}, cache=None, verbose=False, return_shgo_result=False)¶
Compute the parameter estimate of the gamma conjugate prior (GCP) that minimizes the maximum Kullback-Leibler divergence between the GCP and any of the gamma distribution likelihood computed over a set of heat flow data sets.
- Parameters:
hf_samples (Iterable[array_like]) – A set of heat flow data sets.
pmin (float, optional) – Minimum value for the GCP \(p\) parameter.
pmax (float, optional) – Maximum value for the GCP \(p\) parameter.
smin (float, optional) – Minimum value for the GCP \(s\) parameter.
smax (float, optional) – Maximum value for the GCP \(s\) parameter.
vmin (float, optional) – Minimum value for the GCP \(v\) parameter.
vmax (float, optional) – Maximum value for the GCP \(v\) parameter.
nv_surplus_min (float, optional) – Lower bound for the GCP \(n\) parameter depending on the \(v\) parameter. Ensures that
n >= v * (1 + nv_surplus_min)
.nv_surplus_max (float, optional) – Upper bound for the GCP \(n\) parameter depending on the \(v\) parameter. Ensures that
n <= v * (1 + nv_surplus_max)
.amin (float, optional) – The minimum \(\alpha\) for which the prior is defined. Has to be non-negative.
shgo_kwargs (dict, optional) – Additional parameters to pass to the
scipy.optimize.shgo()
global optimizer.cache (GCPMSECache, optional) – A cache for the evaluation of the Kullback-Leiber distance cost function. Can be used to speed up the SHGO optimization when it samples a large number of times.
verbose (bool, optional) – If
True
, print some additional progress information.return_shgo_result (bool, optional) – If
True
, return thescipy.optimize.OptimizeResult
from the SHGO optimization.
- Returns:
gcp – The gamma conjugate prior with optimized parameters.
- Return type:
- static minimum_surprise_estimate_cache(hf_samples, amin=1.0)¶
Yields a cache for calling the
minimum_surprise_estimate()
method multiple times with the same parameters but different optimizer settings.- Parameters:
hf_samples (Iterable[array_like]) – The set of heat flow samples to which the Kullback-Leibler distance from various points of the parameter space is going to be computed.
amin (float, optional) – The minimum \(\alpha\) for which the prior is defined. Has to be non-negative.
- Returns:
cache – The cache object.
- Return type:
GCPMSECache
- static mle(a, b, p0=1.0, s0=1.0, n0=1.5, v0=1.0, nv_surplus_min=0.04, vmin=0.1, amin=1.0, epsabs=0.0, epsrel=1e-10)¶
Compute the maximum likelihood estimate of the gamma conjugate prior (GCP) given a set of gamma distribution parameters \(\{(\alpha_i, \beta_i) : i = 1,...,N\}\).
- Parameters:
a (array_like) – Set of gamma distribution shape parameters \(a\). Has to be of the same shape as
b
.b (array_like) – Set of gamma distribution scale parameters \(b\). Has to be of the same shape as
a
.p0 (float, optional) – Initial guess for the GCP parameter \(p\).
s0 (float, optional) – Initial guess for the GCP parameter \(s\).
n0 (float, optional) – Initial guess for the GCP parameter \(n\).
v0 (float, optional) – Initial guess for the GCP parameter \(v\).
nv_surplus_min (float, optional) – Ensures that
n >= v * (1 + nv_surplus_min)
.amin (float, optional) – The minimum \(\alpha\) for which the prior is defined. Has to be non-negative.
epsabs (float, optional) – Absolute tolerance parameter passed to the optimization algorithm.
epsrel (float, optional) – Relative tolerance parameter passed to the optimization algorithm.
- Returns:
gcp – The gamma conjugate prior with optimized parameters.
- Return type:
- posterior_predictive(q, inplace=False)¶
Evaluate the posterior predictive distribution for given heat flow data set \(\{q_i\}\).
- Parameters:
q (array_like) – Set of heat flow values.
inplace (bool, optional) – If
True
, overwrite the input array. Works only if the input is anumpy.ndarray
instance.
- Returns:
pdf – The evaluated posterior predictive PDF of heat flow.
- Return type:
array_like
Notes
The prior is agnostic to the physical unit of the heat flow values. However, to remain consistent, the posterior predictive and all successive Bayesian updates have to be performed with the same heat flow unit.
- posterior_predictive_cdf(q, inplace=False)¶
Evaluate the posterior predictive distribution for given heat flow data set \(\{q_i\}\).
- Parameters:
q (array_like) – Set of heat flow values.
inplace (bool, optional) – If
True
, overwrite the input array. Works only if the input is anumpy.ndarray
instance.
- Returns:
cdf – The evaluated posterior predictive CDF of heat flow.
- Return type:
array_like
Notes
The prior is agnostic to the physical unit of the heat flow values. However, to remain consistent, the posterior predictive and all successive Bayesian updates have to be performed with the same heat flow unit.
- probability(a, b)¶
Evaluate the probability at parameter points.
- Parameters:
a (array_like) – Set of gamma distribution shape parameters \(a\). Has to be of the same shape as
b
.b (array_like) – Set of gamma distribution scale parameters \(b\). Has to be of the same shape as
a
.
- Returns:
p – The evaluated prior probability of the parameter pairs \(\{(\alpha_i, \beta_i)\}\).
- Return type:
array_like
- updated(q)¶
Perform a Bayesian update given a heat flow data set.
- Parameters:
q (array_like) – Set of heat flow values.
- Returns:
gcp – An updated prior.
- Return type:
Notes
The prior is agnostic to the physical unit of the heat flow values. However, to remain consistent, all successive updates and the posterior predictive have to be performed with the same heat flow unit.
- visualize(ax, distributions=None, cax=None, log_axes=True, cmap='inferno', color_scale='log', plot_mean=True, q_mean=68.3, q_plot=[], qstd_plot=[], n_alpha=101, n_beta=100)¶
Visualize this GammaConjugatePrior instance on an axis.
- Parameters:
ax (matplotlib.axes.Axes) – The
matplotlib.axes.Axes
to visualize on.distributions (Iterable[array_like], optional) – A set of aggregate heat flow distributions, each given as a one-dimensional
numpy.ndarray
of heat flow values in \(\mathrm{mW}/\mathrm{m}^2\). Each distribution will be displayed via its \(\alpha\) and \(\beta\) maximum likelihood estimate, indicating regions of interest. This may also determine the extent of the plot.cax (matplotlib.axes.Axes, optional) – The
matplotlib.axes.Axes
for plotting a color bar.log_axes (bool, optional) – If
True
, set the axes scale to logarithmic, else use linear axes.cmap (str or matplotlib.colors.Colormap, optional) – Which color map to use for the background probability visualization.
color_scale (Literal['log','lin'], optional) – If ‘log’, plot log-probability in background, else plot probability linearly.
plot_mean (bool, optional) – If ‘False’, do not plot the mean heat flow lines.
q_mean (float, optional) – The global mean heat flow in \(\mathrm{mW}/\mathrm{m}^2\). The default value is 68.3 from Lucazeau (2019).
q_plot (Iterable[Tuple[float,float,float,str] | float], optional) – A set of additional average heat flow values to display. For each q a line through the \((\alpha,\beta)\) parameter space, enumerating parameter combinations whose distributions average to the given q. Each entry in q_plot needs to be either a float q or a tuple (q,amin,amax,c), where amin and amax denote the \(\alpha\)-interval within which the line should be plotted, and c is the color.
qstd_plot (Iterable[Tuple[float,float,float,str] | float], optional) – A set of additional heat flow standard deviations to display. For each qstd a line through the \((\alpha,\beta)\) parameter space, enumerating parameter combinations whose distributions are quantified by a standard deviation qstd. Each entry in qstd_plot needs to be either a float qstd or a tuple (q,amin,amax,c), where amin and amax denote the \(\alpha\)-interval within which the line should be plotted, and c is the color.
n_alpha (int, optional) – The number of grid points in the \(\alpha\) grid axis.
n_beta (int, optional) – The number of grid points in the \(\beta\) grid axis.
Notes
- Lucazeau, F. (2019). Analysis and mapping of an updated terrestrial
heat flow data set. Geochemistry, Geophysics, Geosystems, 20, 4001– 4024. https://doi.org/10.1029/2019GC008389
- class HeatFlowPredictive(q, x, y, gcp, dmin=20000.0, n_bootstrap=1000)¶
Posterior predictive distribution of regional heat flow taking into account spatial constraints (i.e. minimum distance) of the heat flow values.
- Parameters:
q (array_like) – Regional distribution of \(N\) heat flow values. Has to have the unit that the gamma conjugate prior
gcp
is optimized for.x (array_like) – \(x\) coordinates of the heat flow values.
y (array_like) – \(y\) coordinates of the heat flow values.
gcp (reheatfunq.regional.GammaConjugatePrior) – Gamma conjugate prior.
dmin (float, optional) – Minimum distance to be enforced between heat flow values of one independent sample (in m).
n_bootstrap (int, optional) – Number of randomized selections of
q
subsets adhering to the \(d_\mathrm{min}\) criterion.
- cdf(q, epsabs=0.0, epsrel=1e-10)¶
Computes the cumulative distribution function.
- Parameters:
q (array_like) – Heat flow \(q\) at which to evaluate the CDF.
epsabs (float, optional) – Absolute tolerance parameter passed to the quadrature engines.
epsrel (float, optional) – Relative tolerance parameter passed to the quadrature engines.
- Returns:
cdf – Posterior predictive cumulative distribution of regional heat flow.
- Return type:
array_like
- pdf(q, epsabs=0.0, epsrel=1e-10)¶
Computes the probability distribution function.
- Parameters:
q (array_like) – Heat flow \(q\) at which to evaluate the CDF.
epsabs (float, optional) – Absolute tolerance parameter passed to the quadrature engines.
epsrel (float, optional) – Relative tolerance parameter passed to the quadrature engines.
- Returns:
pdf – Posterior predictive probability distribution of regional heat flow.
- Return type:
array_like
- default_prior()¶
The default gamma conjugate prior from the REHEATFUNQ model description paper (Ziebarth et al., 2022a).
- Ziebarth, M. J. and von Specht, S.: REHEATFUNQ 1.4.0:
A model for regional aggregate heat flow distributions and anomaly quantification, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2023-222, 2023.
Notes
This prior is designed for heat flow data in mW/m².