Resilience Analysis

reheatfunq.resilience

This module contains functions to evaluate the performance of the REHEATFUNQ model for artificial gamma-distributed data, and its resilience against non-gamma regional aggregate heat flow distributions.

The function test_performance_cython() can be used to investigate how the REHEATFUNQ model performs for data drawn from a gamma distribution, distributed randomly within a \(R=80\,\mathrm{km}\) disk, and superposed by an AnomalyLS1980 anomaly. The sample size, the gamma distribution parameters, and the prior parameters are the tweakable parameters of this function. It is used in the Jupyter notebook jupyter/REHEATFUNQ/A3-Posterior-Impact.ipynb.
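
The data setup described above can be sketched in plain NumPy. This is a minimal illustration only, not the implementation used by test_performance_cython(): the helper name sample_disk_gamma and the example values for \(k\) and \(\theta\) are hypothetical, and the anomaly superposition is omitted.

```python
import numpy as np

def sample_disk_gamma(n, k, theta, radius=80e3, seed=0):
    """Draw n gamma-distributed heat flow values at uniformly random
    positions within a disk of the given radius (in meters)."""
    rng = np.random.default_rng(seed)
    # Uniform points in a disk: the radial coordinate scales with sqrt(U).
    r = radius * np.sqrt(rng.random(n))
    phi = 2.0 * np.pi * rng.random(n)
    xy = np.column_stack((r * np.cos(phi), r * np.sin(phi)))
    # Heat flow values from the gamma distribution (shape k, scale theta).
    q = rng.gamma(shape=k, scale=theta, size=n)
    return xy, q

# Hypothetical example values: 50 data points, k = 2, theta = 30.
xy, q = sample_disk_gamma(50, k=2.0, theta=30.0)
```

The square root in the radial coordinate ensures that the points are uniform in area rather than clustered towards the disk center.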

The function test_performance_mixture_cython() can be used to investigate how well the REHEATFUNQ model performs if data is not drawn from a gamma distribution but from a two-component Gaussian mixture distribution. It is thus a resilience test whose parameters can be tuned to a certain class of non-gamma regional aggregate heat flow distributions. It is also used in the Jupyter notebook jupyter/REHEATFUNQ/A3-Posterior-Impact.ipynb.
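
For illustration, drawing from a two-component Gaussian mixture could look as follows in NumPy. The helper name sample_normal_mixture_2 and the example values are hypothetical, and normalizing the weights a0 and a1 to sum to one is an assumption; the library may treat the weights differently.

```python
import numpy as np

def sample_normal_mixture_2(n, x0, s0, a0, x1, s1, a1, seed=0):
    """Draw n values from a two-component normal mixture."""
    rng = np.random.default_rng(seed)
    # Assumption: the weights are normalized so that they sum to one.
    w0 = a0 / (a0 + a1)
    # Select the component for each draw, then draw from that component.
    first = rng.random(n) < w0
    loc = np.where(first, x0, x1)
    scale = np.where(first, s0, s1)
    return rng.normal(loc, scale)

# Hypothetical example values for the two components.
q = sample_normal_mixture_2(1000, x0=40.0, s0=5.0, a0=0.7,
                            x1=80.0, s1=10.0, a1=0.3)
```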

The functions generate_synthetic_heat_flow_coverings_mix2() and generate_synthetic_heat_flow_coverings_mix3() generate synthetic RGRDCs that can mimic RGRDCs from real data. The two functions proceed as follows:

  1. Input the structure of the real-world data RGRDC: Represent each disk by a tuple \((N,k,\theta)\), where \(N\) is the sample size and \((k,\theta)\) is the maximum likelihood estimate of the gamma distribution for the regional aggregate heat flow distribution associated to the disk.

  2. Define a two-component (_mix2) or three-component (_mix3) “Gaussian” mixture distribution that describes the relative error distribution of the heat flow data. (“Gaussian” in quotation marks because the negative real line is ignored.)

  3. Repeat the following steps \(M\) times, each repetition generating one synthetic RGRDC:

    • for each \((N,k,\theta)\), draw a sample from the \((k,\theta)\) gamma distribution

    • draw a random relative error from the “Gaussian” mixture distribution and superpose the relative error randomly in positive or negative direction

    • accept or reject according to filter criteria (heat flow positivity and max heat flow)

    • repeat until \(N\) heat flow values are found

The two functions are used in the Jupyter notebook jupyter/REHEATFUNQ/A2-Goodness-of-Fit_R_and-Mixture-Distributions.ipynb.
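
The procedure listed above can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the code of the two functions: the helper names are hypothetical, negative mixture draws are assumed to be redrawn, the relative error \(e\) is assumed to act multiplicatively as \(q(1 \pm e)\), and the accept/reject step is assumed to apply to each value individually.

```python
import numpy as np

def draw_relative_error(rng, weights, locs, scales):
    """Draw one value from a normal mixture, redrawing negative values
    (assumed meaning of ignoring the negative real line)."""
    while True:
        i = rng.choice(len(weights), p=weights)
        e = rng.normal(locs[i], scales[i])
        if e >= 0.0:
            return e

def synthetic_covering(disks, weights, locs, scales, hf_max, rng):
    """Generate one synthetic covering from a list of (N, k, theta)."""
    covering = []
    for N, k, theta in disks:
        sample = []
        while len(sample) < N:
            q = rng.gamma(shape=k, scale=theta)
            e = draw_relative_error(rng, weights, locs, scales)
            # Superpose the relative error in a random direction
            # (assumed to be multiplicative).
            sign = rng.choice((-1.0, 1.0))
            q_err = q * (1.0 + sign * e)
            # Filter criteria: positivity and maximum heat flow.
            if 0.0 < q_err < hf_max:
                sample.append(q_err)
        covering.append(np.array(sample))
    return covering

rng = np.random.default_rng(2089)
# Hypothetical input structure: two disks given as (N, k, theta).
disks = [(10, 2.0, 30.0), (25, 3.5, 20.0)]
# M = 3 synthetic coverings with a two-component error mixture.
coverings = [
    synthetic_covering(disks, [0.7, 0.3], [0.0, 0.3], [0.05, 0.2],
                       250.0, rng)
    for _ in range(3)
]
```

Each element of coverings is a list with one array per disk, which mirrors the list[list] return type documented for the two functions.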

The function reheatfunq.resilience.generate_normal_mixture_errors_3() is an interface to the generation of the three-component “Gaussian” mixture distribution described above. An example sample from this distribution can be generated with the following code:

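from reheatfunq.resilience import \
    generate_normal_mixture_errors_3
# Weights of the first and second mixture components:
W0 = 0.3
W1 = 0.6
# Locations of the three mixture components:
X00 = 0.0
X01 = 0.30
X02 = 0.8
# Standard deviations of the three mixture components:
S0 = 0.05
S1 = 0.2
S2 = 0.05
# Draw 10000 random values using the seed 2089:
X = generate_normal_mixture_errors_3(10000, W0, X00, S0, W1,
                                     X01, S1, X02, S2, 2089)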
[Figure: example of the three-component “Gaussian” mixture distribution (../_images/example-normal-mix3.svg)]

It is used in the Jupyter notebook jupyter/REHEATFUNQ/A2-Goodness-of-Fit_R_and-Mixture-Distributions.ipynb.


test_performance_cython(long[:] Nset, size_t M, double P_MW, double K, double T, double[:] quantile, double PRIOR_P, double PRIOR_S, double PRIOR_N, double PRIOR_V, double amin=1.0, short verbose=True, short show_failures=False, size_t seed=848782, short use_cpp_quantiles=True, double tol=1e-3, unsigned char nthread=0)

Tests the performance of the gamma model (with and without prior) for synthetic data sets that stem from a gamma distribution. The analysis is performed for synthetic data randomly distributed within an \(80\,\mathrm{km}\) radius disk with a straight-line fault passing through its center.

Parameters:
  • Nset (array_like) – Sample sizes \(\{N_i\}\) for which to perform the test.

  • M (int) – Number of repetitions per sample size.

  • P_MW (float) – Power of the anomaly.

  • K (float) – Gamma distribution shape parameter \(k\).

  • T (float) – Gamma distribution scale parameter \(\theta\).

  • quantile (array_like) – Array of anomaly \(P_H\) posterior quantiles to evaluate. The array must be either 4 or 41 elements in size.

  • PRIOR_P (float) – Parameter \(p\) of the gamma conjugate prior.

  • PRIOR_S (float) – Parameter \(s\) of the gamma conjugate prior.

  • PRIOR_N (float) – Parameter \(n\) of the gamma conjugate prior.

  • PRIOR_V (float) – Parameter \(\nu\) of the gamma conjugate prior.

  • amin (float, optional) – The minimum shape parameter \(\alpha\) of the gamma distribution. Has to be positive.

  • verbose (bool, optional) – If True, print some progress information.

  • show_failures (bool, optional) – Currently without effect.

  • seed (int, optional) – Random number generator seed for reproducibility.

  • use_cpp_quantiles (bool, optional) – Currently without effect.

  • tol (float, optional) – Quantile inversion tolerance passed to the algorithms.

  • nthread (int, optional) – Number of threads to use.

Returns:

res – Quantiles of the \(P_H\) posteriors. The array has the shape (2, len(Nset), len(quantile), M).

Return type:

numpy.ndarray
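
As an illustration of how an array of the documented shape can be summarized, the following sketch takes the median over the \(M\) repetitions. The array res here is a random placeholder, not actual output of test_performance_cython(), and the example values of Nset, quantile, and M are hypothetical. Which entry along the first axis belongs to which prior is not stated in this description.

```python
import numpy as np

# Hypothetical inputs; quantile has 4 elements as required.
Nset = np.array([10, 20, 50])
quantile = np.array([0.01, 0.1, 0.5, 0.9])
M = 100

# Placeholder with the documented shape (2, len(Nset), len(quantile), M).
res = np.random.default_rng(0).random((2, len(Nset), len(quantile), M))

# Median over the M repetitions for each prior variant, sample size,
# and quantile.
med = np.median(res, axis=-1)

for i, N in enumerate(Nset):
    print(N, med[0, i], med[1, i])
```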

test_performance_mixture_cython(long[:] Nset, size_t M, double P_MW, double x0, double s0, double a0, double x1, double s1, double a1, double[:] quantile, double PRIOR_P, double PRIOR_S, double PRIOR_N, double PRIOR_V, double amin, short verbose=True, short show_failures=False, size_t seed=848782, short use_cpp_quantiles=True, double tol=1e-4)

Tests the performance of the gamma model (with and without prior) for synthetic data sets that do not stem from a gamma distribution but from a two-component normal mixture distribution. The analysis is performed for synthetic data randomly distributed within an \(80\,\mathrm{km}\) radius disk with a straight-line fault passing through its center.

Quantiles are computed both for the prior with the supplied parameters and for the “uninformed” prior (\(p=1\), \(s=n=\nu=0\)).
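
To see why the choice \(p=1\), \(s=n=\nu=0\) carries no prior information, consider the following sketch of a conjugate update. It assumes a prior density proportional to \(\beta^{\nu\alpha-1} p^{\alpha-1} e^{-s\beta} / \Gamma(\alpha)^n\) in the gamma shape \(\alpha\) and rate \(\beta\); this parametrization and the helper name are assumptions and are not taken from this page. Under that assumption, the updated parameters for data \(q_i\) are \(p\prod_i q_i\), \(s+\sum_i q_i\), \(n+N\), and \(\nu+N\).

```python
import numpy as np

def update_gamma_conjugate_prior(q, lp=0.0, s=0.0, n=0.0, v=0.0):
    """Update the prior parameters (ln p, s, n, nu) with data q.

    The defaults correspond to p = 1 (ln p = 0) and s = n = nu = 0.
    The logarithm of p is used to avoid overflow of the data product.
    """
    q = np.asarray(q, dtype=float)
    return (lp + np.log(q).sum(), s + q.sum(), n + q.size, v + q.size)

# With the default parameters, the result depends on the data only.
lp, s, n, v = update_gamma_conjugate_prior([40.0, 55.0, 70.0])
```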

Parameters:
  • Nset (array_like) – Sample sizes \(\{N_i\}\) for which to perform the test.

  • M (int) – Number of repetitions per sample size.

  • P_MW (float) – Power of the anomaly.

  • x0 (float) – Location of the first normal mixture component.

  • s0 (float) – Standard deviation of the first normal mixture component.

  • a0 (float) – Weight of the first normal mixture component.

  • x1 (float) – Location of the second normal mixture component.

  • s1 (float) – Standard deviation of the second normal mixture component.

  • a1 (float) – Weight of the second normal mixture component.

  • quantile (array_like) – Array of anomaly \(P_H\) posterior quantiles to evaluate. The array must be either 4 or 41 elements in size.

  • PRIOR_P (float) – Parameter \(p\) of the gamma conjugate prior.

  • PRIOR_S (float) – Parameter \(s\) of the gamma conjugate prior.

  • PRIOR_N (float) – Parameter \(n\) of the gamma conjugate prior.

  • PRIOR_V (float) – Parameter \(\nu\) of the gamma conjugate prior.

  • amin (float) – The minimum shape parameter \(\alpha\) of the gamma distribution. Has to be positive.

  • verbose (bool, optional) – If True, print some progress information.

  • show_failures (bool, optional) – Currently without effect.

  • seed (int, optional) – Random number generator seed for reproducibility.

  • use_cpp_quantiles (bool, optional) – Currently without effect.

  • tol (float, optional) – Quantile inversion tolerance passed to the algorithms.

Returns:

res – Quantiles of the \(P_H\) posteriors. The array has the shape (2, len(Nset), len(quantile), M).

Return type:

numpy.ndarray

generate_synthetic_heat_flow_coverings_mix2(const double[:] k, const double[:] t, const long[:] N, long M, double hf_max, double w0, double x00, double s0, double x10, double s1, size_t seed, unsigned short nthread)

Generate synthetic heat flow coverings using a two-component normal mixture distribution as an error distribution.

Parameters:
  • k (array_like) – M gamma distribution shape parameters \(k\).

  • t (array_like) – M gamma distribution scale parameters \(\theta\).

  • N (array_like) – M sample sizes to draw from the corresponding gamma distributions.

  • M (int) – Number of RGRDCs to draw.

  • hf_max (float) – Threshold below which to accept heat flow values.

  • w0 (float) – Weight of the first normal distribution describing the error mixture distribution.

  • x00 (float) – Location of the first normal distribution.

  • s0 (float) – Standard deviation of the first normal distribution.

  • x10 (float) – Location of the second normal distribution.

  • s1 (float) – Standard deviation of the second normal distribution.

  • seed (int) – Seed by which to initialize the random number generation.

  • nthread (int) – Number of threads to use. In combination with seed, this fixes the sequence of random number generation used in this run. Keep both values the same to obtain reproducible results.

Returns:

res – List of lists of distributions forming the RGRDCs.

Return type:

list[list]

generate_synthetic_heat_flow_coverings_mix3(list k, list t, list N, double hf_max, double w0, double x00, double s0, double w1, double x10, double s1, double x20, double s2, size_t seed, unsigned short nthread)

Generate synthetic heat flow coverings using a three-component normal mixture distribution as an error distribution.

Parameters:
  • k (list[array_like]) – \(M\) arrays of gamma distribution shape parameters \(k\).

  • t (list[array_like]) – \(M\) arrays of gamma distribution scale parameters \(\theta\).

  • N (list[array_like]) – \(M\) arrays of sample sizes to draw from the corresponding gamma distributions.

  • hf_max (float) – Threshold below which to accept heat flow values.

  • w0 (float) – Weight of the first normal distribution describing the error mixture distribution.

  • x00 (float) – Location of the first normal distribution.

  • s0 (float) – Standard deviation of the first normal distribution.

  • w1 (float) – Weight of the second normal distribution describing the error mixture distribution.

  • x10 (float) – Location of the second normal distribution.

  • s1 (float) – Standard deviation of the second normal distribution.

  • x20 (float) – Location of the third normal distribution.

  • s2 (float) – Standard deviation of the third normal distribution.

  • seed (int) – Seed by which to initialize the random number generation.

  • nthread (int) – Number of threads to use. In combination with seed, this fixes the sequence of random number generation used in this run. Keep both values the same to obtain reproducible results.

Returns:

res – List of lists of distributions forming the RGRDCs.

Return type:

list[list]

generate_normal_mixture_errors_3(size_t N, double w0, double x00, double s0, double w1, double x10, double s1, double x20, double s2, size_t seed)

Draw random numbers from the three-component normal mixture distribution.

Parameters:
  • N (int) – Number of random numbers to generate.

  • w0 (float) – Weight of the first mixture component.

  • x00 (float) – Center of the first mixture component.

  • s0 (float) – Standard deviation of the first mixture component.

  • w1 (float) – Weight of the second mixture component.

  • x10 (float) – Center of the second mixture component.

  • s1 (float) – Standard deviation of the second mixture component.

  • x20 (float) – Center of the third mixture component.

  • s2 (float) – Standard deviation of the third mixture component.

  • seed (int) – Random number generator seed for reproducibility.

Returns:

X – Random values.

Return type:

numpy.ndarray