pelicun.uq

Constants, classes and methods for uncertainty quantification.

Functions

`_OLS_percentiles`(params, values, perc, family)	Estimate percentiles using ordinary least squares (OLS).
`_get_limit_probs`(limits, distribution, theta)	Get the CDF value at the specified limits.
`_get_std_corr_matrix`(std_samples)	Estimate the correlation matrix.
`_get_std_samples`(samples, theta, tr_limits, ...)	Transform samples to standard normal space.
`_get_theta`(params, inits, dist_list)	Return the parameters of the target distributions.
`_mvn_scale`(x, rho)	Scaling utility function.
`_neg_log_likelihood`(params, inits, ...[, ...])	Calculate negative log likelihood.
`fit_distribution_to_percentiles`(values, ...)	Fit distribution to pre-defined values at a finite number of percentiles.
`fit_distribution_to_sample`(raw_sample, ...)	Fit a distribution to sample using maximum likelihood estimation.
`mvn_orthotope_density`(mu, cov[, lower, upper])	Estimate the probability density within a hyperrectangle for a multivariate normal (MVN) distribution.
`rv_class_map`(distribution_name)	Map convenient distributions to their corresponding class.
`scale_distribution`(scale_factor, family, theta)	Scale parameters of a random distribution.

Classes

`BaseRandomVariable`(name[, f_map, anchor])	Base abstract class for different types of random variables.
`CoupledEmpiricalRandomVariable`(name, theta)	Coupled empirical random variable.
`DeterministicRandomVariable`(name, theta[, ...])	Deterministic random variable.
`EmpiricalRandomVariable`(name, theta[, ...])	Empirical random variable.
`LogNormalRandomVariable`(name, theta[, ...])	Lognormal random variable.
`MultilinearCDFRandomVariable`(name, theta[, ...])	Multilinear CDF random variable.
`MultinomialRandomVariable`(name, theta[, ...])	Multinomial random variable.
`NormalRandomVariable`(name, theta[, ...])	Normal random variable.
`Normal_COV`(name, theta[, truncation_limits, ...])	Normal random variable with coefficient of variation.
`Normal_STD`(name, theta[, truncation_limits, ...])	Normal random variable with standard deviation.
`RandomVariable`(name, theta[, ...])	Random variable that needs values in inverse_transform.
`RandomVariableRegistry`(rng)	Random variable registry.
`RandomVariableSet`(name, rv_list, rho)	Random variable set.
`UniformRandomVariable`(name, theta[, ...])	Uniform random variable.
`UtilityRandomVariable`(name[, f_map, anchor])	Random variable that needs sample_size in inverse_transform.
`WeibullRandomVariable`(name, theta[, ...])	Weibull random variable.

class pelicun.uq.BaseRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Base abstract class for different types of random variables.

property sample: ndarray | None

Return the empirical or generated sample.

Returns:

ndarray: The empirical or generated sample.

property sample_DF: Series | None

Return the empirical or generated sample in a pandas Series.

Returns:

Series: The empirical or generated sample in a pandas Series.

property uni_sample: ndarray | None

Return the sample from the controlling uniform distribution.

Returns:

ndarray: The sample from the controlling uniform distribution.

class pelicun.uq.CoupledEmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Coupled empirical random variable.

inverse_transform(sample_size: int) → ndarray[source]

Evaluate the inverse CDF.

Generates a new sample array from the existing empirical data by repeating the dataset until it matches the requested sample size.

Parameters:

sample_size: int: The desired size of the sample array to be generated. It dictates how many times the original dataset will be repeated to match or exceed this size, after which the array is trimmed to precisely match the requested size.

Returns:

ndarray: A new sample array derived from repeating the original dataset.
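The repeat-and-trim behavior described above can be sketched in a few lines. This is a minimal illustration on plain Python lists, not pelicun's implementation; the helper name `repeat_to_size` is hypothetical.

```python
import math


def repeat_to_size(data, sample_size):
    """Tile `data` until it covers `sample_size`, then trim the excess."""
    # Number of full copies needed to reach or exceed the requested size.
    repeats = math.ceil(sample_size / len(data))
    return (list(data) * repeats)[:sample_size]


print(repeat_to_size([1.0, 2.0, 3.0], 7))
# [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
```

Because the data are repeated in their original order, the relative ordering of the empirical values is preserved across realizations.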

class pelicun.uq.DeterministicRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Deterministic random variable.

inverse_transform(sample_size: int) → ndarray[source]

Evaluate the inverse CDF.

Parameters:

sample_size: int: The desired size of the sample array to be generated.

Returns:

ndarray: Sample array containing the deterministic value.

class pelicun.uq.EmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Empirical random variable.

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Maps given values to their corresponding positions within the empirical data array, simulating an inverse transformation based on the empirical distribution. This can be seen as a simple form of inverse CDF where values represent normalized positions within the empirical data set.

Parameters:

values: 1D float ndarray: Normalized values between 0 and 1, representing positions within the empirical data distribution.

Returns:

ndarray: The empirical data points corresponding to the given normalized positions.
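One way to read the mapping from normalized positions to empirical data points is sketched below. The exact index rule (scale by the data length, round down, clip at the last entry) is an assumption made for illustration, and `empirical_inverse` is a hypothetical helper rather than part of pelicun.

```python
import math


def empirical_inverse(data, values):
    """Pick the entry of `data` at each normalized position in `values`."""
    n = len(data)
    # Scale each value in [0, 1] to an index; clip so that 1.0 maps to the
    # last entry instead of running past the end of the data.
    return [data[min(math.floor(v * n), n - 1)] for v in values]


print(empirical_inverse([10, 20, 30, 40], [0.0, 0.3, 0.5, 1.0]))
# [10, 20, 30, 40]
```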

class pelicun.uq.LogNormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Lognormal random variable.

cdf(values: ndarray) → ndarray[source]

Return the CDF at the given values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the CDF

Returns:

ndarray: 1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the inverse CDF

Returns:

ndarray: Inverse CDF values
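For a non-truncated lognormal variable, the CDF and its inverse have closed forms in terms of the standard normal distribution. The sketch below assumes the parameters are the median and the log standard deviation, as listed for the lognormal family under fit_distribution_to_sample; truncation is ignored, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lognormal_cdf(x, median, beta):
    """CDF of a lognormal with the given median and log standard deviation."""
    return STD_NORMAL.cdf((math.log(x) - math.log(median)) / beta)


def lognormal_inverse(u, median, beta):
    """Inverse CDF: map a probability u in (0, 1) to a lognormal value."""
    return median * math.exp(beta * STD_NORMAL.inv_cdf(u))


x = lognormal_inverse(0.5, median=2.0, beta=0.4)
print(x)  # 2.0, because the median corresponds to u = 0.5
```

Feeding uniform values on (0, 1) through `lognormal_inverse` yields lognormally distributed realizations, which is the idea behind inverse transform sampling.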

class pelicun.uq.MultilinearCDFRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Multilinear CDF random variable.

This RV is defined by specifying the points that define its cumulative distribution function (CDF), with linear interpolation between them.

cdf(values: ndarray) → ndarray[source]

Return the CDF at the given values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the CDF

Returns:

ndarray: 1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the inverse CDF

Returns:

ndarray: Inverse CDF values
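A multilinear CDF and its inverse both reduce to piecewise-linear interpolation, with the roles of values and probabilities swapped. The sketch below is a dependency-free illustration; the `interp` helper and the example points are hypothetical, and the way pelicun packs these points into theta is not shown here.

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation; xs must be strictly increasing."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    # Find the segment that contains x and interpolate within it.
    for (x0, y0), (x1, y1) in zip(zip(xs, ys), zip(xs[1:], ys[1:])):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical CDF definition: values and their cumulative probabilities.
points = [0.0, 1.0, 3.0]
probs = [0.0, 0.5, 1.0]

print(interp(2.0, points, probs))   # CDF at 2.0 -> 0.75
print(interp(0.75, probs, points))  # inverse CDF at 0.75 -> 2.0
```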

class pelicun.uq.MultinomialRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Multinomial random variable.

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Transforms continuous values into discrete events based on the cumulative probabilities of the multinomial distribution derived from theta.

Parameters:

values: 1D float ndarray: Continuous values to be transformed into discrete events according to the multinomial distribution’s cumulative probabilities.

Returns:

ndarray: Discrete events corresponding to the input values.
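The transformation from continuous values to discrete events can be pictured as a lookup in the cumulative probabilities. In this sketch the events are labeled 0, 1, 2, ... and every probability is given explicitly; both choices are assumptions for illustration, and `multinomial_inverse` is a hypothetical helper.

```python
from bisect import bisect_right
from itertools import accumulate


def multinomial_inverse(values, probabilities):
    """Map values in [0, 1) to event indices using cumulative probabilities."""
    cumulative = list(accumulate(probabilities))
    last = len(probabilities) - 1
    # bisect_right counts how many cumulative probabilities are <= value,
    # which is the index of the event whose interval contains the value.
    return [min(bisect_right(cumulative, v), last) for v in values]


print(multinomial_inverse([0.05, 0.25, 0.95], [0.2, 0.5, 0.3]))
# [0, 1, 2]
```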

class pelicun.uq.NormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Normal random variable.

cdf(values: ndarray) → ndarray[source]

Return the CDF at the given values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the CDF

Returns:

ndarray: 1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Evaluates the inverse of the cumulative distribution function (CDF) for the given values. Used to generate random variable realizations.

Parameters:

values: 1D float ndarray: Values for which to evaluate the inverse CDF

Returns:

ndarray: Inverse CDF values
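When truncation limits are present, a common way to evaluate the inverse CDF is to rescale the uniform values into the probability range allowed by the limits before inverting. The sketch below shows that idea for a normal variable. It is an assumption about how truncation can be handled, not a copy of pelicun's code, and `truncated_normal_inverse` is a hypothetical name.

```python
from statistics import NormalDist


def truncated_normal_inverse(u, mean, std, lower=None, upper=None):
    """Map u in (0, 1) to a normal value restricted to [lower, upper]."""
    dist = NormalDist(mean, std)
    # Probability mass below each truncation limit (0 or 1 if unbounded).
    p_low = dist.cdf(lower) if lower is not None else 0.0
    p_high = dist.cdf(upper) if upper is not None else 1.0
    # Rescale u into the admissible probability range, then invert.
    return dist.inv_cdf(p_low + u * (p_high - p_low))


print(truncated_normal_inverse(0.5, mean=0.0, std=1.0))  # 0.0
x = truncated_normal_inverse(0.5, mean=0.0, std=1.0, lower=0.0)
print(0.0 < x < 1.0)  # True: the median of the half-normal is about 0.674
```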

class pelicun.uq.Normal_COV(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Normal random variable with coefficient of variation.

This class represents a normal random variable defined by mean and coefficient of variation.

class pelicun.uq.Normal_STD(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Normal random variable with standard deviation.

This class represents a normal random variable defined by mean and standard deviation.
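The two parameterizations describe the same family of distributions and differ only in the second parameter. A minimal sketch of the conversion, using the textbook definition of the coefficient of variation (standard deviation divided by the absolute value of the mean), is given below. The helper `cov_to_std` is hypothetical, and the treatment of negative or zero means is an assumption.

```python
def cov_to_std(mean, cov):
    """Standard deviation implied by a mean and a coefficient of variation."""
    if mean == 0:
        raise ValueError("The coefficient of variation is undefined for a zero mean.")
    return abs(mean) * cov


print(cov_to_std(mean=10.0, cov=0.2))  # 2.0
```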

class pelicun.uq.RandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Random variable that needs values in inverse_transform.

constant_parameters() → bool[source]

Check whether the RV has constant or variable parameters.

Constant parameters are the same in each realization.

Returns:

bool: True if the parameters are constant, False otherwise.

abstract inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

inverse_transform_sampling() → None[source]

Create a sample with inverse transform sampling.

Raises:

ValueError: If there is no available uniform sample.

class pelicun.uq.RandomVariableRegistry(rng: Generator)[source]

Random variable registry.

property RV: dict[str, BaseRandomVariable]

Returns all random variable(s) in the registry.

Returns:

dict: All random variable(s) in the registry.

property RV_sample: dict[str, ndarray | None]

Return the sample for every random variable in the registry.

Returns:

dict: The sample for every random variable in the registry.

property RV_set: dict[str, RandomVariableSet]

Return the random variable set(s) in the registry.

Returns:

dict: The random variable set(s) in the registry.

RVs(keys: list[str]) → dict[str, BaseRandomVariable][source]

Return a subset of the random variables in the registry.

Parameters:

keys: list of str: Keys that define the subset.

Returns:

dict: A subset of the random variable(s) in the registry.

add_RV(rv: BaseRandomVariable) → None[source]

Add a new random variable to the registry.

Raises:

ValueError: When the RV already exists in the registry.

add_RV_set(rv_set: RandomVariableSet) → None[source]

Add a new set of random variables to the registry.

generate_sample(sample_size: int, method: str) → None[source]

Generate samples for all variables in the registry.

Parameters:

sample_size: int: The number of samples requested per variable.
method: {‘MonteCarlo’, ‘LHS’, ‘LHS_midpoint’}: The sample generation method to use. ‘MonteCarlo’ stands for conventional random sampling; ‘LHS’ is Latin Hypercube Sampling with a random sample location within each bin of the hypercube; ‘LHS_midpoint’ is like ‘LHS’, but the samples are assigned to the midpoints of the hypercube bins.

Raises:

NotImplementedError: When the RV parent class is unknown.
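The difference between the three sampling methods is easiest to see in one dimension. The sketch below draws uniform samples on [0, 1) with each method. It uses Python's `random` module rather than the NumPy `Generator` that the registry takes, and `uniform_sample` is a hypothetical helper, so treat it as an illustration of the methods rather than the registry's implementation.

```python
import random


def uniform_sample(sample_size, method, rng):
    """Draw a 1D sample on [0, 1) with one of three sampling methods."""
    if method == "MonteCarlo":
        return [rng.random() for _ in range(sample_size)]
    if method in ("LHS", "LHS_midpoint"):
        # One point per equal-width bin: random offset or the bin midpoint.
        offsets = [
            0.5 if method == "LHS_midpoint" else rng.random()
            for _ in range(sample_size)
        ]
        points = [(i + off) / sample_size for i, off in enumerate(offsets)]
        rng.shuffle(points)  # remove the ordering of the bins
        return points
    raise ValueError(f"Unknown sampling method: {method}")


rng = random.Random(42)
print(sorted(uniform_sample(4, "LHS_midpoint", rng)))
# [0.125, 0.375, 0.625, 0.875]
```

With either LHS variant every bin of width 1 / sample_size receives exactly one point, which is what distinguishes the method from conventional random sampling.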

class pelicun.uq.RandomVariableSet(name: str, rv_list: list[BaseRandomVariable], rho: ndarray)[source]

Random variable set.

Represents a set of random variables, each of which is described by its own probability distribution. The set allows the user to define correlations between the random variables, and provides methods to sample from the correlated variables and estimate various statistical properties of the set, such as the probability density within a specified range or orthotope.

Parameters:

name: string: A unique string that identifies the set of random variables.
rv_list: list of RandomVariable: Defines the random variables in the set.
rho: float 2D ndarray: Defines the correlation matrix that describes the correlation between the random variables in the set. Currently, only the Gaussian copula is supported.

property RV: dict[str, RandomVariable]

Returns the random variable(s) assigned to the set.

Returns:

dict: The random variable(s) assigned to the set.

Rho(var_subset: list[str] | None = None) → ndarray[source]

Return the (subset of the) correlation matrix.

Returns:

ndarray: The (subset of the) correlation matrix.

apply_correlation() → None[source]

Apply correlation to n dimensional uniform samples.

Currently, correlation is applied using a Gaussian copula. The Cholesky transformation is tried first. If the correlation matrix is not positive semidefinite and Cholesky fails, SVD is used to apply the correlations while preserving as much of the correlation matrix as possible.
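The Cholesky branch of a Gaussian copula can be sketched without external dependencies: map the uniform samples to standard normal space, mix them with the lower-triangular Cholesky factor of the correlation matrix, and map the result back to uniform space. The code below is an illustration under those assumptions. The function names are hypothetical, and the SVD fallback is not shown.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def cholesky(matrix):
    """Lower-triangular L with L @ L.T == matrix (positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                # math.sqrt raises ValueError if the matrix is not positive definite.
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def apply_gaussian_copula(uniform_rows, rho):
    """Correlate independent uniform samples; one row per realization."""
    lower = cholesky(rho)
    correlated = []
    for row in uniform_rows:
        z = [STD_NORMAL.inv_cdf(u) for u in row]  # to standard normal space
        z_corr = [sum(l * zk for l, zk in zip(lrow, z)) for lrow in lower]
        correlated.append([STD_NORMAL.cdf(v) for v in z_corr])  # back to uniform
    return correlated


rho = [[1.0, 0.8], [0.8, 1.0]]
print(apply_gaussian_copula([[0.5, 0.5], [0.9, 0.2]], rho))
```

In the second realization the first variable stays at 0.9 while the second is pulled upward from 0.2, reflecting the positive correlation.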

orthotope_density(lower: ndarray | float = nan, upper: ndarray | float = nan, var_subset: list[str] | None = None) → ndarray[source]

Estimate the probability density within an orthotope for the RV set.

Use the mvn_orthotope_density function in this module for the calculation. The distribution of individual RVs is not limited to the normal family. The provided limits are converted to the standard normal space that is the basis of all RVs in pelicun. Truncation limits and correlation (using Gaussian copula) are automatically taken into consideration.

Parameters:

lower: float ndarray, optional, default: np.nan: Lower bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from below in a dimension, use ‘np.nan’ for that dimension.
upper: float ndarray, optional, default: np.nan: Upper bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from above in a dimension, use ‘np.nan’ for that dimension.
var_subset: list of strings, optional, default: None: If provided, allows for selecting only a subset of the variables in the RV_set for the density calculation.

Returns:

tuple

alpha: float: Estimate of the probability density within the orthotope.
eps_alpha: float: Estimate of the error in alpha.

property sample: dict[str, ndarray | None]

Returns the sample of the variables in the set.

Returns:

dict: The sample of the variables in the set.

property size: int

Returns the size of the RV set (i.e., the number of variables in it).

Returns:

int: The size of the RV set (i.e., the number of variables in it).

class pelicun.uq.UniformRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Uniform random variable.

cdf(values: ndarray) → ndarray[source]

Return the CDF at the given values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the CDF

Returns:

ndarray: 1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the inverse CDF

Returns:

ndarray: Inverse CDF values

class pelicun.uq.UtilityRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Random variable that needs sample_size in inverse_transform.

abstract inverse_transform(sample_size: int) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

inverse_transform_sampling(sample_size: int) → None[source]

Create a sample with inverse transform sampling.

class pelicun.uq.WeibullRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)[source]

Weibull random variable.

cdf(values: ndarray) → ndarray[source]

Return the CDF at the given values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the CDF

Returns:

ndarray: 1D float ndarray containing CDF values

inverse_transform(values: ndarray) → ndarray[source]

Evaluate the inverse CDF.

Uses inverse probability integral transformation on the provided values.

Parameters:

values: 1D float ndarray: Values for which to evaluate the inverse CDF

Returns:

ndarray: Inverse CDF values

pelicun.uq.fit_distribution_to_percentiles(values: list[float], percentiles: list[float], families: list[str]) → tuple[str, list[float]][source]

Fit distribution to pre-defined values at a finite number of percentiles.

Parameters:

values: list of float: Pre-defined values at the given percentiles. At least two values are expected.
percentiles: list of float: Percentiles where values are defined. At least two percentiles are expected.
families: list of strings {‘normal’, ‘lognormal’}: Defines the distribution family candidates.

Returns:

tuple

family: string: The optimal choice of family among the provided list of families
theta: array of float: Parameters of the fitted distribution.
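One simple way to relate values at known percentiles to distribution parameters is ordinary least squares on the standard normal quantiles: for a normal variable, value = mean + std * z, and for a lognormal variable the same relation holds for the logarithm of the values. The sketch below follows that idea. It is not pelicun's fitting routine, the function names are hypothetical, and the choice among candidate families is not shown.

```python
import math
from statistics import NormalDist, fmean


def fit_normal_to_percentiles(values, percentiles):
    """Least-squares fit of value = mean + std * z at the given percentiles."""
    z = [NormalDist().inv_cdf(p) for p in percentiles]
    z_bar, v_bar = fmean(z), fmean(values)
    # Slope of the regression line of values on z is the standard deviation.
    std = sum((zi - z_bar) * (vi - v_bar) for zi, vi in zip(z, values)) / sum(
        (zi - z_bar) ** 2 for zi in z
    )
    return v_bar - std * z_bar, std


def fit_lognormal_to_percentiles(values, percentiles):
    """Same fit in log space; returns (median, log standard deviation)."""
    mu_ln, beta = fit_normal_to_percentiles(
        [math.log(v) for v in values], percentiles
    )
    return math.exp(mu_ln), beta


mean, std = fit_normal_to_percentiles([7.44, 10.0, 12.56], [0.1, 0.5, 0.9])
print(round(mean, 2), round(std, 2))  # 10.0 2.0
```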

pelicun.uq.fit_distribution_to_sample(raw_sample: np.ndarray, distribution: str | list[str], truncation_limits: tuple[float, float] = (nan, nan), censored_count: int = 0, detection_limits: tuple[float, float] = (nan, nan), *, multi_fit: bool = False, logger_object: Logger | None = None) → tuple[np.ndarray, np.ndarray][source]

Fit a distribution to sample using maximum likelihood estimation.

The number of dimensions of the distribution are inferred from the shape of the sample data. Censoring is automatically considered if the number of censored samples and the corresponding detection limits are provided. Infinite or unspecified truncation limits lead to fitting a non-truncated distribution in that dimension.

Parameters:

raw_sample: float ndarray: Raw data that serves as the basis of estimation. The number of samples equals the number of columns and each row introduces a new feature. In other words: a list of sample lists is expected where each sample list is a collection of samples of one variable.
distribution: {‘normal’, ‘lognormal’}: Defines the target probability distribution type. Different types of distributions can be mixed by providing a list rather than a single value. Each element of the list corresponds to one of the features in the raw_sample.
truncation_limits: float ndarray, optional, default: (nan, nan): Lower and/or upper truncation limits for the specified distributions. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the distribution is non-truncated from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.
censored_count: int, optional, default: 0: The number of censored samples that are beyond the detection limits. All samples outside the detection limits are aggregated into one set. This works the same way in one and in multiple dimensions. Prescription of specific censored sample counts for sub-regions of the input space outside the detection limits is not supported.
detection_limits: float ndarray, optional, default: (nan, nan): Lower and/or upper detection limits for the provided samples. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the data is not censored from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.
multi_fit: bool, optional, default: False: If True, we attempt to fit a multivariate distribution to the samples. Otherwise, we fit each marginal univariate distribution independently and estimate the correlation matrix in the end based on the fitted marginals. Using multi_fit can be advantageous with censored data and if the correlation in the data is not Gaussian. It leads to substantially longer calculation time and does not always produce better results, especially when the number of dimensions is large.
logger_object: Logger, optional, default: None: Logging object to be used. If no object is specified, no logging is performed.

Returns:

tuple

theta: float ndarray: Estimates of the parameters of the fitted probability distribution in each dimension. The following parameters are returned for the supported distributions: normal and normal_cov: mean, coefficient of variation; normal_std: mean, standard deviation; lognormal: median, log standard deviation.
Rho: float 2D ndarray, optional: In the multivariate case, returns the estimate of the correlation matrix.

Raises:

ValueError: If NaN values are produced during standard normal space transformation
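In the simplest setting (one variable, no truncation, no censoring) the maximum likelihood estimates have closed forms, which makes a useful sanity check for a fitted result. The sketch below covers only that special case; truncated or censored data generally require numerical minimization of the negative log likelihood. The function names are hypothetical, and the lognormal parameters follow the (median, log standard deviation) convention listed above.

```python
import math
from statistics import fmean, pstdev


def fit_normal(sample):
    """Maximum likelihood estimates (mean, standard deviation) for a normal."""
    # The MLE of the standard deviation divides by n, hence pstdev.
    return fmean(sample), pstdev(sample)


def fit_lognormal(sample):
    """Maximum likelihood estimates (median, log standard deviation)."""
    logs = [math.log(x) for x in sample]
    return math.exp(fmean(logs)), pstdev(logs)


mean, std = fit_normal([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, round(std, 4))  # 3.0 1.4142
```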

pelicun.uq.mvn_orthotope_density(mu: float | ndarray, cov: ndarray, lower: float | ndarray = nan, upper: float | ndarray = nan) → tuple[float, float][source]

Estimate the probability density within a hyperrectangle for a multivariate normal (MVN) distribution.

Use the method of Alan Genz (1992) to estimate the probability density of a multivariate normal distribution within an n-orthotope (i.e., hyperrectangle) defined by its lower and upper bounds. Limits can be relaxed in any direction by assigning infinite bounds (i.e. numpy.inf).

Parameters:

mu: float scalar or ndarray: Mean(s) of the non-truncated distribution.
cov: float ndarray: Covariance matrix of the non-truncated distribution
lower: float vector, optional, default: np.nan: Lower bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from below in a subset of the dimensions, use either None or assign an infinite value (i.e. -numpy.inf) to those dimensions.
upper: float vector, optional, default: np.nan: Upper bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from above in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.

Returns:

tuple

alpha: float: Estimate of the probability density within the hyperrectangle.
eps_alpha: float: Estimate of the error in the calculated probability density.
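When the covariance matrix is diagonal, the result can be checked by hand because the probability within the hyperrectangle is the product of the univariate interval probabilities. The sketch below implements only that independent special case as a sanity check. It does not implement the Genz method, the helper name is hypothetical, and NaN bounds are treated as unbounded to mirror the defaults above.

```python
import math
from statistics import NormalDist


def independent_orthotope_probability(mu, sigma, lower, upper):
    """Probability of a box under independent normals (diagonal covariance)."""
    prob = 1.0
    for m, s, lo, hi in zip(mu, sigma, lower, upper):
        dist = NormalDist(m, s)
        # NaN means "unbounded in this direction"; infinite bounds are
        # handled by the CDF itself, which returns 0 or 1 for them.
        p_lo = 0.0 if math.isnan(lo) else dist.cdf(lo)
        p_hi = 1.0 if math.isnan(hi) else dist.cdf(hi)
        prob *= p_hi - p_lo
    return prob


p = independent_orthotope_probability(
    mu=[0.0, 0.0], sigma=[1.0, 1.0], lower=[-1.0, math.nan], upper=[1.0, 0.0]
)
print(round(p, 4))  # 0.3413
```

Note that `sigma` holds standard deviations, that is, the square roots of the diagonal entries of the covariance matrix.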

pelicun.uq.rv_class_map(distribution_name: str) → type[RandomVariable | UtilityRandomVariable][source]

Map convenient distributions to their corresponding class.

Parameters:

distribution_name: str: The name of a distribution.

Returns:

type[RandomVariable | UtilityRandomVariable]: The class of the corresponding random variable.

Raises:

ValueError: If the given distribution name does not correspond to a distribution class.

pelicun.uq.scale_distribution(scale_factor: float, family: str, theta: ndarray, truncation_limits: ndarray | None = None) → tuple[ndarray, ndarray | None][source]

Scale parameters of a random distribution.

Parameters:

scale_factor: float: Value by which to scale the parameters.
family: {‘normal’ (or ‘normal_cov’), ‘normal_std’, ‘lognormal’, ‘uniform’}: Defines the type of probability distribution for the random variable.
theta: float ndarray of length 2: Set of parameters that define the cumulative distribution function of the variable given its distribution type. See the expected parameters explained in the RandomVariable class. Each parameter can be defined by one or more values. If a set of values is provided for one parameter, the values define the ordinates of a multilinear function that is used to get the parameter values given an independent variable.
truncation_limits: float ndarray of length 2, default: None: Defines the [a,b] truncation limits for the distribution. Use None to assign no limit in one direction.

Returns:

tuple

A tuple containing the scaled parameters and truncation limits:

theta_new (float ndarray of length 2): Scaled parameters of the distribution.
truncation_limits (float ndarray of length 2 or None): Scaled truncation limits for the distribution, or None if no truncation is applied.

Raises:

ValueError: If the specified distribution family is unsupported.
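How each parameter responds to scaling follows from what the parameter means: location-type parameters (mean, median, bounds) and the standard deviation scale with the variable, while the coefficient of variation and the log standard deviation are scale-free. The sketch below encodes that reasoning for a positive scale factor. It is a hypothetical helper, not pelicun's function; it ignores truncation limits, and it assumes the uniform parameters are the lower and upper bounds.

```python
def scale_parameters(scale_factor, family, theta):
    """Parameters of s * X given the parameters of X (hypothetical helper)."""
    first, second = theta
    if family in ("normal", "normal_cov"):
        # The mean scales; the coefficient of variation is scale-free.
        return [first * scale_factor, second]
    if family == "lognormal":
        # The median scales; the log standard deviation is unchanged.
        return [first * scale_factor, second]
    if family in ("normal_std", "uniform"):
        # Mean and standard deviation, or both bounds, scale together.
        return [first * scale_factor, second * scale_factor]
    raise ValueError(f"Unsupported distribution: {family}")


print(scale_parameters(2.0, "normal_std", [10.0, 1.5]))  # [20.0, 3.0]
print(scale_parameters(2.0, "lognormal", [10.0, 0.4]))   # [20.0, 0.4]
```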