8.1.9. pelicun.uq

Constants, classes and methods for uncertainty quantification.

Functions

fit_distribution_to_percentiles(values, ...)

Fit distribution to pre-defined values at a finite number of percentiles.

fit_distribution_to_sample(raw_sample, ...)

Fit a distribution to sample using maximum likelihood estimation.

mvn_orthotope_density(mu, cov[, lower, upper])

Estimate the probability density within a hyperrectangle for a multivariate normal (MVN) distribution.

rv_class_map(distribution_name)

Map convenient distributions to their corresponding class.

scale_distribution(scale_factor, family, theta)

Scale parameters of a random distribution.

Classes

BaseRandomVariable(name[, f_map, anchor])

Base abstract class for different types of random variables.

CoupledEmpiricalRandomVariable(name, theta)

Coupled empirical random variable.

DeterministicRandomVariable(name, theta[, ...])

Deterministic random variable.

EmpiricalRandomVariable(name, theta[, ...])

Empirical random variable.

LogNormalRandomVariable(name, theta[, ...])

Lognormal random variable.

MultilinearCDFRandomVariable(name, theta[, ...])

Multilinear CDF random variable.

MultinomialRandomVariable(name, theta[, ...])

Multinomial random variable.

NormalRandomVariable(name, theta[, ...])

Normal random variable.

Normal_COV(name, theta[, truncation_limits, ...])

Normal random variable with coefficient of variation.

Normal_STD(name, theta[, truncation_limits, ...])

Normal random variable with standard deviation.

RandomVariable(name, theta[, ...])

Random variable that needs values in inverse_transform.

RandomVariableRegistry(rng)

Random variable registry.

RandomVariableSet(name, rv_list, rho)

Random variable set.

UniformRandomVariable(name, theta[, ...])

Uniform random variable.

UtilityRandomVariable(name[, f_map, anchor])

Random variable that needs sample_size in inverse_transform.

WeibullRandomVariable(name, theta[, ...])

Weibull random variable.

class pelicun.uq.BaseRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Base abstract class for different types of random variables.

property sample: ndarray | None

Return the empirical or generated sample.

Returns:
ndarray

The empirical or generated sample.

property sample_DF: Series | None

Return the empirical or generated sample in a pandas Series.

Returns:
Series

The empirical or generated sample in a pandas Series.

property uni_sample: ndarray | None

Return the sample from the controlling uniform distribution.

Returns:
ndarray

The sample from the controlling uniform distribution.

class pelicun.uq.CoupledEmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Coupled empirical random variable.

inverse_transform(sample_size: int) → ndarray

Evaluate the inverse CDF.

Generates a new sample array from the existing empirical data by repeating the dataset until it matches the requested sample size.

Parameters:
sample_size: int

The desired size of the sample array to be generated. It dictates how many times the original dataset will be repeated to match or exceed this size, after which the array is trimmed to precisely match the requested size.

Returns:
ndarray

A new sample array derived from repeating the original dataset.
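The repeat-and-trim behavior described above can be sketched in plain Python. This is a minimal illustration, not pelicun's implementation; the helper name `repeat_to_size` is hypothetical and lists stand in for ndarrays.

```python
import math


def repeat_to_size(data, sample_size):
    """Repeat `data` until it covers `sample_size` entries, then trim."""
    # Number of full copies needed to match or exceed the requested size.
    repeats = math.ceil(sample_size / len(data))
    # Concatenate the copies and cut the result to the exact size.
    return (list(data) * repeats)[:sample_size]


print(repeat_to_size([1.0, 2.0, 3.0], 7))
# prints [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
```

Because the data is repeated in order, variables built from the same dataset stay aligned row by row.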

class pelicun.uq.DeterministicRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Deterministic random variable.

inverse_transform(sample_size: int) → ndarray

Evaluate the inverse CDF.

Parameters:
sample_size: int

The desired size of the sample array to be generated.

Returns:
ndarray

Sample array containing the deterministic value.

class pelicun.uq.EmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Empirical random variable.

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Maps given values to their corresponding positions within the empirical data array, simulating an inverse transformation based on the empirical distribution. This can be seen as a simple form of inverse CDF where values represent normalized positions within the empirical data set.

Parameters:
values: 1D float ndarray

Normalized values between 0 and 1, representing positions within the empirical data distribution.

Returns:
ndarray

The empirical data points corresponding to the given normalized positions.
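One way to read "normalized positions" is as a fraction of the way through the stored data. The sketch below assumes the position is scaled by the data length and floored to an index; that rule, and the helper name `empirical_inverse`, are assumptions made for illustration and not taken from pelicun.

```python
import math


def empirical_inverse(data, values):
    """Pick the entry of `data` at each normalized position in `values`."""
    n = len(data)
    result = []
    for u in values:
        # Scale the position to an index; clamp so that u == 1.0 stays in range.
        index = min(math.floor(u * n), n - 1)
        result.append(data[index])
    return result


data = [10.0, 20.0, 30.0, 40.0]
print(empirical_inverse(data, [0.0, 0.3, 0.5, 1.0]))
# prints [10.0, 20.0, 30.0, 40.0]
```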

class pelicun.uq.LogNormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Lognormal random variable.

cdf(values: ndarray) → ndarray

Return the CDF at the given values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the CDF

Returns:
ndarray

1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the inverse CDF

Returns:
ndarray

Inverse CDF values
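For a non-truncated lognormal variable, the CDF and its inverse have closed forms in terms of the standard normal distribution. The sketch below assumes theta holds the median and the log standard deviation (the parametrization listed for fitted lognormal distributions in this module). The function names are hypothetical, and truncation is not handled.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lognormal_cdf(x, median, beta):
    """CDF of a lognormal variable with the given median and log std."""
    return STD_NORMAL.cdf((math.log(x) - math.log(median)) / beta)


def lognormal_inverse(u, median, beta):
    """Inverse CDF: map a probability u in (0, 1) to a realization."""
    return median * math.exp(beta * STD_NORMAL.inv_cdf(u))


x = lognormal_inverse(0.9, median=2.0, beta=0.4)
print(x, lognormal_cdf(x, median=2.0, beta=0.4))
```

Feeding uniform samples through `lognormal_inverse` yields lognormal realizations, which is the idea behind inverse transform sampling.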

class pelicun.uq.MultilinearCDFRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Multilinear CDF random variable.

This RV is defined by the points of its cumulative distribution function (CDF), with linear interpolation between them.

cdf(values: ndarray) → ndarray

Return the CDF at the given values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the CDF

Returns:
ndarray

1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the inverse CDF

Returns:
ndarray

Inverse CDF values
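A piecewise-linear CDF can be evaluated and inverted with the same interpolation routine by swapping the two axes. The sketch below uses hypothetical CDF points and helper names; how pelicun lays out these points in theta is not shown here.

```python
def interpolate(x, xs, ys):
    """Linearly interpolate y at x for strictly increasing xs."""
    # Clamp to the end points outside the defined range.
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("xs must be strictly increasing")


# Hypothetical CDF definition: P(X <= 0) = 0, P(X <= 1) = 0.5, P(X <= 3) = 1.
x_points = [0.0, 1.0, 3.0]
cdf_points = [0.0, 0.5, 1.0]


def multilinear_cdf(x):
    return interpolate(x, x_points, cdf_points)


def multilinear_inverse(u):
    # Swapping the roles of the two axes inverts the piecewise-linear CDF.
    return interpolate(u, cdf_points, x_points)


print(multilinear_cdf(2.0), multilinear_inverse(0.25))
# prints 0.75 0.5
```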

class pelicun.uq.MultinomialRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Multinomial random variable.

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Transforms continuous values into discrete events based on the cumulative probabilities of the multinomial distribution derived from theta.

Parameters:
values: 1D float ndarray

Continuous values to be transformed into discrete events according to the multinomial distribution’s cumulative probabilities.

Returns:
ndarray

Discrete events corresponding to the input values.
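The mapping from continuous values to discrete events can be sketched as a lookup in the cumulative probabilities. The example assumes the probabilities of all outcomes are given explicitly and that events are labeled 0, 1, 2, and so on; both are assumptions for illustration, as is the helper name.

```python
from bisect import bisect_right
from itertools import accumulate


def multinomial_inverse(probabilities, values):
    """Map each value in [0, 1] to the index of the event it falls into."""
    # Running sums of the event probabilities, e.g. [0.2, 0.5, 1.0].
    cumulative = list(accumulate(probabilities))
    last = len(probabilities) - 1
    # The event index is the number of cumulative bounds at or below the value;
    # clamp so that a value of exactly 1.0 maps to the last event.
    return [min(bisect_right(cumulative, u), last) for u in values]


print(multinomial_inverse([0.2, 0.3, 0.5], [0.1, 0.25, 0.9, 1.0]))
# prints [0, 1, 2, 2]
```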

class pelicun.uq.NormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Normal random variable.

cdf(values: ndarray) → ndarray

Return the CDF at the given values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the CDF

Returns:
ndarray

1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Evaluates the inverse of the cumulative distribution function (CDF) for the given values. Used to generate random variable realizations.

Parameters:
values: 1D float ndarray

Values for which to evaluate the inverse CDF

Returns:
ndarray

Inverse CDF values
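A common way to combine inverse transform sampling with truncation limits is to squeeze the uniform values into the probability range between the two limits before inverting. The sketch below shows this for a normal variable defined by mean and standard deviation; it is an assumption about the general technique, not a copy of pelicun's code, and the function name is hypothetical.

```python
from statistics import NormalDist


def truncated_normal_inverse(u, mean, std, lower=None, upper=None):
    """Inverse CDF of a normal variable truncated to [lower, upper]."""
    dist = NormalDist(mean, std)
    # Probability mass below each limit; None means no limit on that side.
    p_low = 0.0 if lower is None else dist.cdf(lower)
    p_high = 1.0 if upper is None else dist.cdf(upper)
    # Rescale u from (0, 1) to (p_low, p_high), then invert the normal CDF.
    return dist.inv_cdf(p_low + u * (p_high - p_low))


for u in (0.01, 0.5, 0.99):
    print(truncated_normal_inverse(u, mean=0.0, std=1.0, lower=-1.0, upper=1.0))
```

All three printed values fall inside the truncation limits, whereas the non-truncated inverse CDF would return values beyond -1 and 1 for u = 0.01 and u = 0.99.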

class pelicun.uq.Normal_COV(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Normal random variable with coefficient of variation.

This class represents a normal random variable defined by mean and coefficient of variation.

class pelicun.uq.Normal_STD(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Normal random variable with standard deviation.

This class represents a normal random variable defined by mean and standard deviation.
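The two parametrizations describe the same family of distributions: the coefficient of variation is the standard deviation divided by the magnitude of the mean. A minimal sketch of the conversion follows; the helper names are hypothetical, and the use of the absolute value for negative means is an assumption.

```python
def cov_to_std(mean, cov):
    """Standard deviation implied by a mean and a coefficient of variation."""
    return abs(mean) * cov


def std_to_cov(mean, std):
    """Coefficient of variation implied by a mean and a standard deviation."""
    return std / abs(mean)


print(cov_to_std(200.0, 0.1), std_to_cov(200.0, 20.0))
```

The coefficient of variation is undefined for a zero mean, so the standard deviation form is the safer choice for variables centered at zero.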

class pelicun.uq.RandomVariable(name: str, theta: np.ndarray | None, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Random variable that needs values in inverse_transform.

constant_parameters() → bool

Check whether the RV has constant or variable parameters.

Constant parameters are the same in each realization.

Returns:
bool

True if the parameters are constant, False otherwise.

abstract inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

inverse_transform_sampling() → None

Create a sample with inverse transform sampling.

Raises:
ValueError

If there is no available uniform sample.

class pelicun.uq.RandomVariableRegistry(rng: Generator)

Random variable registry.

property RV: dict[str, BaseRandomVariable]

Return all random variables in the registry.

Returns:
dict

All random variables in the registry.

property RV_sample: dict[str, ndarray | None]

Return the sample for every random variable in the registry.

Returns:
dict

The sample for every random variable in the registry.

property RV_set: dict[str, RandomVariableSet]

Return the random variable set(s) in the registry.

Returns:
dict

The random variable set(s) in the registry.

RVs(keys: list[str]) → dict[str, BaseRandomVariable]

Return a subset of the random variables in the registry.

Parameters:
keys: list of str

Keys that define the subset.

Returns:
dict

A subset of the random variables in the registry.

add_RV(rv: BaseRandomVariable) → None

Add a new random variable to the registry.

Raises:
ValueError

When the RV already exists in the registry

add_RV_set(rv_set: RandomVariableSet) → None

Add a new set of random variables to the registry.

generate_sample(sample_size: int, method: str) → None

Generate samples for all variables in the registry.

Parameters:
sample_size: int

The number of samples requested per variable.

method: str

The sample generation method to use. Can be any of ‘MonteCarlo’, ‘LHS’, or ‘LHS_midpoint’. ‘MonteCarlo’ stands for conventional random sampling; ‘LHS’ is Latin Hypercube Sampling with a random sample location within each bin of the hypercube; ‘LHS_midpoint’ is like ‘LHS’, but the samples are assigned to the midpoints of the hypercube bins.

Raises:
NotImplementedError

When the RV parent class is unknown.
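The difference between the three methods is easiest to see for a single variable. The sketch below generates uniform values on the unit interval with each method using only the standard library; it illustrates the sampling schemes in general and is not pelicun's implementation. The function name `uniform_sample` is hypothetical.

```python
import random


def uniform_sample(sample_size, method, rng):
    """Generate `sample_size` uniform values in [0, 1) with the given method."""
    if method == "MonteCarlo":
        # Conventional random sampling: every value is drawn independently.
        return [rng.random() for _ in range(sample_size)]
    if method == "LHS":
        # Random location inside each of the equally wide bins.
        offsets = [rng.random() for _ in range(sample_size)]
    elif method == "LHS_midpoint":
        # Fixed location at the middle of each bin.
        offsets = [0.5] * sample_size
    else:
        raise ValueError(f"Unknown method: {method}")
    # One value per bin, with bin i covering [i / n, (i + 1) / n).
    sample = [(i + offsets[i]) / sample_size for i in range(sample_size)]
    # Shuffle so that the order of the values carries no information.
    rng.shuffle(sample)
    return sample


rng = random.Random(0)
print(sorted(uniform_sample(5, "LHS_midpoint", rng)))
print(sorted(uniform_sample(5, "LHS", rng)))
```

Both LHS variants place exactly one value in each bin, which covers the unit interval more evenly than independent draws for the same sample size.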

class pelicun.uq.RandomVariableSet(name: str, rv_list: list[BaseRandomVariable], rho: ndarray)

Random variable set.

Represents a set of random variables, each of which is described by its own probability distribution. The set allows the user to define correlations between the random variables, and provides methods to sample from the correlated variables and estimate various statistical properties of the set, such as the probability density within a specified range or orthotope.

Parameters:
name: string

A unique string that identifies the set of random variables.

rv_list: list of BaseRandomVariable

Defines the random variables in the set.

rho: float 2D ndarray

Defines the correlation matrix that describes the correlation between the random variables in the set. Currently, only the Gaussian copula is supported.

property RV: dict[str, RandomVariable]

Return the random variables assigned to the set.

Returns:
dict

The random variables assigned to the set.

Rho(var_subset: list[str] | None = None) → ndarray

Return the (subset of the) correlation matrix.

Returns:
ndarray

The (subset of the) correlation matrix.

apply_correlation() → None

Apply correlation to n dimensional uniform samples.

Currently, correlation is applied using a Gaussian copula. The Cholesky transformation is tried first. If the correlation matrix is not positive semidefinite and the Cholesky decomposition fails, SVD is used to apply the correlations while preserving as much of the correlation matrix as possible.
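The Cholesky path of a Gaussian copula can be sketched with the standard library alone: map the uniform samples to the standard normal space, mix them with the lower-triangular Cholesky factor of the correlation matrix, and map the result back to the uniform space. The code below is an illustration of that idea under the assumption of a positive definite correlation matrix. It does not include the SVD fallback, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cholesky(matrix):
    """Lower-triangular factor L with L * L^T == matrix (positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def apply_gaussian_copula(uniform_rows, rho):
    """Correlate independent uniform samples; one row per variable."""
    lower = cholesky(rho)
    # Map every uniform value to the standard normal space.
    normal_rows = [[STD_NORMAL.inv_cdf(u) for u in row] for row in uniform_rows]
    n_vars, n_samples = len(normal_rows), len(normal_rows[0])
    correlated = []
    for i in range(n_vars):
        # Row i of L mixes the independent normal samples of variables 0..i.
        mixed = [
            sum(lower[i][k] * normal_rows[k][s] for k in range(i + 1))
            for s in range(n_samples)
        ]
        # Map back to the uniform space with the standard normal CDF.
        correlated.append([STD_NORMAL.cdf(z) for z in mixed])
    return correlated


rng = random.Random(42)
independent = [[rng.random() for _ in range(2000)] for _ in range(2)]
rho = [[1.0, 0.8], [0.8, 1.0]]
correlated = apply_gaussian_copula(independent, rho)
```

The first variable is left unchanged by this construction, while each later variable becomes a weighted mix of itself and the variables before it.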

orthotope_density(lower: ndarray | float = nan, upper: ndarray | float = nan, var_subset: list[str] | None = None) → ndarray

Estimate the probability density within an orthotope for the RV set.

Use the mvn_orthotope_density function in this module for the calculation. The distribution of individual RVs is not limited to the normal family. The provided limits are converted to the standard normal space that is the basis of all RVs in pelicun. Truncation limits and correlation (using Gaussian copula) are automatically taken into consideration.

Parameters:
lower: float ndarray, optional, default: np.nan

Lower bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from below in a dimension, use ‘np.nan’ for that dimension.

upper: float ndarray, optional, default: np.nan

Upper bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from above in a dimension, use ‘np.nan’ for that dimension.

var_subset: list of strings, optional, default: None

If provided, allows for selecting only a subset of the variables in the RV_set for the density calculation.

Returns:
tuple
alpha: float

Estimate of the probability density within the orthotope.

eps_alpha: float

Estimate of the error in alpha.

property sample: dict[str, ndarray | None]

Return the sample of the variables in the set.

Returns:
dict

The sample of the variables in the set.

property size: int

Return the size of the RV set, i.e., the number of variables in it.

Returns:
int

The size of the RV set, i.e., the number of variables in it.

class pelicun.uq.UniformRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Uniform random variable.

cdf(values: ndarray) → ndarray

Return the CDF at the given values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the CDF

Returns:
ndarray

1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the inverse CDF

Returns:
ndarray

Inverse CDF values

class pelicun.uq.UtilityRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Random variable that needs sample_size in inverse_transform.

abstract inverse_transform(sample_size: int) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

inverse_transform_sampling(sample_size: int) → None

Create a sample with inverse transform sampling.

class pelicun.uq.WeibullRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)

Weibull random variable.

cdf(values: ndarray) → ndarray

Return the CDF at the given values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the CDF

Returns:
ndarray

1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:
values: 1D float ndarray

Values for which to evaluate the inverse CDF

Returns:
ndarray

Inverse CDF values
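The two-parameter Weibull distribution has a closed-form CDF and inverse CDF, which makes it a compact example of the inverse probability integral transformation. The sketch assumes a (scale, shape) parametrization; whether theta is ordered this way in pelicun is an assumption, and the function names are hypothetical.

```python
import math


def weibull_cdf(x, scale, shape):
    """CDF of a two-parameter Weibull distribution for x >= 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def weibull_inverse(u, scale, shape):
    """Inverse CDF: solve weibull_cdf(x) == u for x, with u in [0, 1)."""
    return scale * (-math.log(1.0 - u)) ** (1.0 / shape)


x = weibull_inverse(0.5, scale=2.0, shape=1.5)
print(x, weibull_cdf(x, scale=2.0, shape=1.5))
```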

pelicun.uq.fit_distribution_to_percentiles(values: list[float], percentiles: list[float], families: list[str]) → tuple[str, list[float]]

Fit distribution to pre-defined values at a finite number of percentiles.

Parameters:
values: list of float

Pre-defined values at the given percentiles. At least two values are expected.

percentiles: list of float

Percentiles where values are defined. At least two percentiles are expected.

families: list of strings {‘normal’, ‘lognormal’}

Defines the distribution family candidates.

Returns:
tuple
family: string

The optimal choice of family among the provided list of families

theta: array of float

Parameters of the fitted distribution.
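With exactly two percentiles, a two-parameter distribution can be matched in closed form, which shows what the fit is aiming for. The sketch below does this for the normal and lognormal families. It assumes percentiles are given as fractions between 0 and 1, and it is not the fitting procedure used by pelicun, which must also handle more than two percentiles and choose between families. The function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def fit_normal_to_two_percentiles(values, percentiles):
    """Return (mean, std) of the normal distribution through two percentiles."""
    # Standard normal scores of the two percentiles.
    z1, z2 = (STD_NORMAL.inv_cdf(p) for p in percentiles)
    v1, v2 = values
    # Solve v = mean + std * z for the two unknowns.
    std = (v2 - v1) / (z2 - z1)
    mean = v1 - std * z1
    return mean, std


def fit_lognormal_to_two_percentiles(values, percentiles):
    """Return (median, log std) of the lognormal through two percentiles."""
    logs = [math.log(v) for v in values]
    mean_log, beta = fit_normal_to_two_percentiles(logs, percentiles)
    return math.exp(mean_log), beta


print(fit_normal_to_two_percentiles([8.0, 12.0], [0.1, 0.9]))
print(fit_lognormal_to_two_percentiles([8.0, 12.0], [0.1, 0.9]))
```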

pelicun.uq.fit_distribution_to_sample(raw_sample: np.ndarray, distribution: str | list[str], truncation_limits: tuple[float, float] = (nan, nan), censored_count: int = 0, detection_limits: tuple[float, float] = (nan, nan), *, multi_fit: bool = False, logger_object: Logger | None = None) → tuple[np.ndarray, np.ndarray]

Fit a distribution to sample using maximum likelihood estimation.

The number of dimensions of the distribution is inferred from the shape of the sample data. Censoring is automatically considered if the number of censored samples and the corresponding detection limits are provided. Infinite or unspecified truncation limits lead to fitting a non-truncated distribution in that dimension.

Parameters:
raw_sample: float ndarray

Raw data that serves as the basis of estimation. The number of samples equals the number of columns and each row introduces a new feature. In other words: a list of sample lists is expected where each sample list is a collection of samples of one variable.

distribution: {‘normal’, ‘lognormal’}

Defines the target probability distribution type. Different types of distributions can be mixed by providing a list rather than a single value. Each element of the list corresponds to one of the features in the raw_sample.

truncation_limits: float ndarray, optional, default: (np.nan, np.nan)

Lower and/or upper truncation limits for the specified distributions. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the distribution is non-truncated from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.

censored_count: int, optional, default: 0

The number of censored samples that are beyond the detection limits. All samples outside the detection limits are aggregated into one set. This works the same way in one and in multiple dimensions. Prescription of specific censored sample counts for sub-regions of the input space outside the detection limits is not supported.

detection_limits: float ndarray, optional, default: (np.nan, np.nan)

Lower and/or upper detection limits for the provided samples. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the data is not censored from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.

multi_fit: bool, optional, default: False

If True, we attempt to fit a multivariate distribution to the samples. Otherwise, we fit each marginal univariate distribution independently and estimate the correlation matrix in the end based on the fitted marginals. Using multi_fit can be advantageous with censored data and if the correlation in the data is not Gaussian. It leads to substantially longer calculation time and does not always produce better results, especially when the number of dimensions is large.

logger_object: Logger, optional, default: None

Logging object to be used. If no object is specified, no logging is performed.

Returns:
tuple
theta: float ndarray

Estimates of the parameters of the fitted probability distribution in each dimension. The following parameters are returned for the supported distributions:

  • normal, normal_cov: mean, coefficient of variation

  • normal_std: mean, standard deviation

  • lognormal: median, log standard deviation

Rho: float 2D ndarray, optional

In the multivariate case, returns the estimate of the correlation matrix.

Raises:
ValueError

If NaN values are produced during standard normal space transformation
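In the simplest setting (one variable, no truncation, no censoring), the maximum likelihood estimates have closed forms. The sketch below covers only that case to show what the returned parameters mean; it is not how pelicun performs the fit, and the function names are hypothetical. It returns the standard deviation for the normal case (the normal_std form) and the median with the log standard deviation for the lognormal case.

```python
import math
from statistics import fmean, pstdev


def fit_normal_mle(sample):
    """Maximum likelihood estimates (mean, std) for a normal sample."""
    # The MLE of the standard deviation divides by n, hence pstdev.
    return fmean(sample), pstdev(sample)


def fit_lognormal_mle(sample):
    """Maximum likelihood estimates (median, log std) for a lognormal sample."""
    logs = [math.log(x) for x in sample]
    # The median of a lognormal variable is exp(mean of the logs).
    return math.exp(fmean(logs)), pstdev(logs)


sample = [1.0, 2.0, 3.0, 4.0, 5.0]
print(fit_normal_mle(sample))
print(fit_lognormal_mle(sample))
```

Truncation and censoring change the likelihood function, so these closed forms no longer apply and a numerical optimization is needed instead.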

pelicun.uq.mvn_orthotope_density(mu: float | ndarray, cov: ndarray, lower: float | ndarray = nan, upper: float | ndarray = nan) → tuple[float, float]

Estimate the probability density within a hyperrectangle for a multivariate normal (MVN) distribution.

Use the method of Alan Genz (1992) to estimate the probability density of a multivariate normal distribution within an n-orthotope (i.e., hyperrectangle) defined by its lower and upper bounds. Limits can be relaxed in any direction by assigning infinite bounds (i.e. numpy.inf).

Parameters:
mu: float scalar or ndarray

Mean(s) of the non-truncated distribution.

cov: float ndarray

Covariance matrix of the non-truncated distribution

lower: float vector, optional, default: np.nan

Lower bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from below in a subset of the dimensions, use either None or assign an infinite value (i.e. -numpy.inf) to those dimensions.

upper: float vector, optional, default: np.nan

Upper bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from above in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.

Returns:
tuple
alpha: float

Estimate of the probability density within the hyperrectangle.

eps_alpha: float

Estimate of the error in the calculated probability density.
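In one dimension the orthotope is an interval, and the quantity of interest reduces to a difference of two normal CDF values. The sketch below covers only this univariate case and treats NaN bounds as unbounded, following the defaults in the signature above. It takes a standard deviation rather than a covariance matrix and does not implement the Genz method. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def normal_interval_probability(mu, sigma, lower=math.nan, upper=math.nan):
    """Probability that a normal variable falls between lower and upper."""
    dist = NormalDist(mu, sigma)
    # A NaN bound means "unbounded" on that side; infinite bounds need no
    # special handling because the CDF evaluates to 0 or 1 there.
    p_low = 0.0 if math.isnan(lower) else dist.cdf(lower)
    p_high = 1.0 if math.isnan(upper) else dist.cdf(upper)
    return p_high - p_low


print(normal_interval_probability(0.0, 1.0, lower=-1.96, upper=1.96))
print(normal_interval_probability(0.0, 1.0, upper=0.0))
```

In several dimensions with correlated variables, no such closed form exists and the probability has to be estimated numerically.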

pelicun.uq.rv_class_map(distribution_name: str) → type[RandomVariable | UtilityRandomVariable]

Map convenient distributions to their corresponding class.

Parameters:
distribution_name: str

The name of a distribution.

Returns:
type[RandomVariable | UtilityRandomVariable]

The class of the corresponding random variable.

Raises:
ValueError

If the given distribution name does not correspond to a distribution class.

pelicun.uq.scale_distribution(scale_factor: float, family: str, theta: ndarray, truncation_limits: ndarray | None = None) → tuple[ndarray, ndarray | None]

Scale parameters of a random distribution.

Parameters:
scale_factor: float

Value by which to scale the parameters.

family: {‘normal’ (or ‘normal_cov’), ‘normal_std’, ‘lognormal’, ‘uniform’}

Defines the type of probability distribution for the random variable.

theta: float ndarray of length 2

Set of parameters that define the cumulative distribution function of the variable given its distribution type. See the expected parameters explained in the RandomVariable class. Each parameter can be defined by one or more values. If a set of values is provided for one parameter, they define ordinates of a multilinear function that is used to get the parameter values given an independent variable.

truncation_limits: float ndarray of length 2, default: None

Defines the [a,b] truncation limits for the distribution. Use None to assign no limit in one direction.

Returns:
tuple

A tuple containing the scaled parameters and truncation limits:

  • theta_new (float ndarray of length 2): Scaled parameters of the distribution.

  • truncation_limits (float ndarray of length 2 or None): Scaled truncation limits for the distribution, or None if no truncation is applied.

Raises:
ValueError

If the specified distribution family is unsupported.
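Multiplying a random variable by a positive factor affects the parameters of each family differently: location and scale parameters grow with the factor, while dimensionless dispersion parameters stay the same. The sketch below applies this reasoning to scalar parameters. It is an illustration of the underlying relationships rather than pelicun's code, it leaves out truncation limits, and the function name is hypothetical.

```python
def scale_parameters(scale_factor, family, theta):
    """Return the parameters of `scale_factor * X` given those of X."""
    first, second = theta
    if family in ("normal", "normal_cov", "lognormal"):
        # Mean (or median) scales; the coefficient of variation and the
        # log standard deviation are dimensionless and do not change.
        return [first * scale_factor, second]
    if family in ("normal_std", "uniform"):
        # Mean and standard deviation, or both bounds, scale together.
        return [first * scale_factor, second * scale_factor]
    raise ValueError(f"Unsupported distribution: {family}")


print(scale_parameters(2.0, "normal_cov", [10.0, 0.2]))   # prints [20.0, 0.2]
print(scale_parameters(2.0, "normal_std", [10.0, 3.0]))   # prints [20.0, 6.0]
print(scale_parameters(2.0, "lognormal", [5.0, 0.4]))     # prints [10.0, 0.4]
```

Truncation limits are expressed in the units of the variable, so they would be multiplied by the same factor.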