8.1.9. pelicun.uq
Constants, classes and methods for uncertainty quantification.
Functions
fit_distribution_to_percentiles | Fit distribution to pre-defined values at a finite number of percentiles.
fit_distribution_to_sample | Fit a distribution to sample using maximum likelihood estimation.
mvn_orthotope_density | Estimate the probability density within a hyperrectangle for an MVN distribution.
rv_class_map | Map convenient distributions to their corresponding class.
scale_distribution | Scale parameters of a random distribution.
Classes
BaseRandomVariable | Base abstract class for different types of random variables.
CoupledEmpiricalRandomVariable | Coupled empirical random variable.
DeterministicRandomVariable | Deterministic random variable.
EmpiricalRandomVariable | Empirical random variable.
LogNormalRandomVariable | Lognormal random variable.
MultilinearCDFRandomVariable | Multilinear CDF random variable.
MultinomialRandomVariable | Multinomial random variable.
NormalRandomVariable | Normal random variable.
Normal_COV | Normal random variable with coefficient of variation.
Normal_STD | Normal random variable with standard deviation.
RandomVariable | Random variable that needs values in inverse_transform.
RandomVariableRegistry | Random variable registry.
RandomVariableSet | Random variable set.
UniformRandomVariable | Uniform random variable.
UtilityRandomVariable | Random variable that needs sample_size in inverse_transform.
WeibullRandomVariable | Weibull random variable.
- class pelicun.uq.BaseRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Base abstract class for different types of random variables.
- property sample: ndarray | None
Return the empirical or generated sample.
- Returns:
- ndarray
The empirical or generated sample.
- class pelicun.uq.CoupledEmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Coupled empirical random variable.
- inverse_transform(sample_size: int) -> ndarray
Evaluate the inverse CDF.
Generates a new sample array from the existing empirical data by repeating the dataset until it matches the requested sample size.
- Parameters:
- sample_size: int
The desired size of the sample array to be generated. It dictates how many times the original dataset will be repeated to match or exceed this size, after which the array is trimmed to precisely match the requested size.
- Returns:
- ndarray
A new sample array derived from repeating the original dataset.
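The repeat-and-trim behavior described above can be sketched in plain Python. This is an illustration only, not pelicun's implementation; the helper name `repeat_to_size` is hypothetical.

```python
import math


def repeat_to_size(data, sample_size):
    """Repeat `data` until it holds at least `sample_size` items, then trim.

    Hypothetical helper that mirrors the documented behavior.
    """
    if not data:
        raise ValueError("data must not be empty")
    # Number of full copies needed to reach or exceed the requested size.
    repeats = math.ceil(sample_size / len(data))
    return (list(data) * repeats)[:sample_size]


sample = repeat_to_size([1.0, 2.0, 3.0], 7)
print(sample)  # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
```

Because the data is repeated in its original order, row i of one coupled variable stays aligned with row i of another, which is presumably why this class takes a sample size rather than uniform values.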
- class pelicun.uq.DeterministicRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Deterministic random variable.
- class pelicun.uq.EmpiricalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Empirical random variable.
- inverse_transform(values: ndarray) -> ndarray
Evaluate the inverse CDF.
Maps given values to their corresponding positions within the empirical data array, simulating an inverse transformation based on the empirical distribution. This can be seen as a simple form of inverse CDF where values represent normalized positions within the empirical data set.
- Parameters:
- values: 1D float ndarray
Normalized values between 0 and 1, representing positions within the empirical data distribution.
- Returns:
- ndarray
The empirical data points corresponding to the given normalized positions.
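A minimal sketch of such a position-based lookup follows. It assumes the index is obtained as floor(u * n), clamped to the last element; that rule and the function name are assumptions made for illustration, and pelicun's actual indexing may differ.

```python
def empirical_inverse_transform(data, values):
    """Map normalized positions in [0, 1] to entries of `data`.

    Assumption: position u selects index floor(u * n), clamped to n - 1.
    """
    n = len(data)
    result = []
    for u in values:
        if not 0.0 <= u <= 1.0:
            raise ValueError("values must lie in [0, 1]")
        index = min(int(u * n), n - 1)  # u == 1.0 maps to the last entry
        result.append(data[index])
    return result


data = [10.0, 20.0, 30.0, 40.0]
print(empirical_inverse_transform(data, [0.0, 0.3, 0.5, 0.99]))
# [10.0, 20.0, 30.0, 40.0]
```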
- class pelicun.uq.LogNormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Lognormal random variable.
- class pelicun.uq.MultilinearCDFRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Multilinear CDF random variable.
This RV is defined by specifying the points that define its Cumulative Distribution Function (CDF), with linear interpolation between them.
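To illustrate what a piecewise-linear CDF implies for sampling, the sketch below inverts one by linear interpolation. The argument layout (separate lists `xs` and `ps`) and the function name are assumptions for this example and do not reflect how theta is laid out in pelicun.

```python
from bisect import bisect_right


def multilinear_inverse_cdf(xs, ps, u):
    """Invert a piecewise-linear CDF given by points (xs[i], ps[i]).

    Assumes `ps` is non-decreasing, starts at 0.0 and ends at 1.0.
    """
    if ps[0] != 0.0 or ps[-1] != 1.0:
        raise ValueError("CDF ordinates must start at 0 and end at 1")
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    if u >= 1.0:
        return xs[-1]
    # Locate the segment with ps[i] <= u < ps[i + 1].
    i = bisect_right(ps, u) - 1
    fraction = (u - ps[i]) / (ps[i + 1] - ps[i])
    return xs[i] + fraction * (xs[i + 1] - xs[i])


xs = [0.0, 1.0, 3.0]
ps = [0.0, 0.5, 1.0]
print([multilinear_inverse_cdf(xs, ps, u) for u in (0.25, 0.5, 0.75)])
# [0.5, 1.0, 2.0]
```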
- class pelicun.uq.MultinomialRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Multinomial random variable.
- inverse_transform(values: ndarray) -> ndarray
Evaluate the inverse CDF.
Transforms continuous values into discrete events based on the cumulative probabilities of the multinomial distribution derived from theta.
- Parameters:
- values: 1D float ndarray
Continuous values to be transformed into discrete events according to the multinomial distribution’s cumulative probabilities.
- Returns:
- ndarray
Discrete events corresponding to the input values.
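The idea of mapping continuous values to discrete events through cumulative probabilities can be sketched as follows. The function name, the zero-based event labels, and the requirement that the probabilities sum to one are assumptions of this example, not a description of how pelicun interprets theta.

```python
from bisect import bisect_right
from itertools import accumulate


def multinomial_inverse_transform(probabilities, values):
    """Map values in [0, 1] to zero-based event indices."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    cumulative = list(accumulate(probabilities))
    last_event = len(probabilities) - 1
    events = []
    for u in values:
        if not 0.0 <= u <= 1.0:
            raise ValueError("values must lie in [0, 1]")
        # The event is the first one whose cumulative probability exceeds u.
        events.append(min(bisect_right(cumulative, u), last_event))
    return events


print(multinomial_inverse_transform([0.2, 0.5, 0.3], [0.1, 0.45, 0.95, 1.0]))
# [0, 1, 2, 2]
```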
- class pelicun.uq.NormalRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Normal random variable.
- cdf(values: ndarray) -> ndarray
Return the CDF at the given values.
- Parameters:
- values: 1D float ndarray
Values for which to evaluate the CDF
- Returns:
- ndarray
1D float ndarray containing CDF values
- inverse_transform(values: ndarray) -> ndarray
Evaluate the inverse CDF.
Evaluates the inverse of the Cumulative Distribution Function (CDF) for the given values. Used to generate random variable realizations.
- Parameters:
- values: 1D float ndarray
Values for which to evaluate the inverse CDF
- Returns:
- ndarray
Inverse CDF values
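The relationship between cdf and inverse_transform can be illustrated with the standard library's `statistics.NormalDist`. This sketch does not call pelicun; the mean and standard deviation are arbitrary example values, and truncation is ignored.

```python
from statistics import NormalDist

# Example parameters (assumed for illustration): mean 10, standard deviation 2.
dist = NormalDist(mu=10.0, sigma=2.0)

# Uniform values strictly inside (0, 1) become realizations via the inverse CDF.
uniform_values = [0.025, 0.5, 0.975]
realizations = [dist.inv_cdf(u) for u in uniform_values]

# Evaluating the CDF at those realizations recovers the uniform values.
recovered = [dist.cdf(x) for x in realizations]

print(realizations)
print(recovered)
```

Feeding uniform samples through the inverse CDF is the usual inverse transform sampling technique; `inv_cdf` raises an error for inputs of exactly 0 or 1.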
- class pelicun.uq.Normal_COV(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Normal random variable with coefficient of variation.
This class represents a normal random variable defined by mean and coefficient of variation.
- class pelicun.uq.Normal_STD(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Normal random variable with standard deviation.
This class represents a normal random variable defined by mean and standard deviation.
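The two normal parameterizations differ only in how dispersion is expressed. A minimal sketch of the conversion, assuming the usual definition of the coefficient of variation as standard deviation divided by the absolute mean (the helper names are hypothetical):

```python
def cov_to_std(mean, cov):
    """Convert a coefficient of variation to a standard deviation."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return abs(mean) * cov


def std_to_cov(mean, std):
    """Convert a standard deviation to a coefficient of variation."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std / abs(mean)


# A mean of 50 with a 20% coefficient of variation corresponds to std = 10.
print(cov_to_std(50.0, 0.2))
print(std_to_cov(50.0, 10.0))
```

The coefficient-of-variation form cannot describe a zero-mean variable, which is one reason to keep a standard-deviation form alongside it.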
- class pelicun.uq.RandomVariable(name: str, theta: np.ndarray | None, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Random variable that needs values in inverse_transform.
- constant_parameters() -> bool
Return whether the RV has constant parameters (as opposed to variable ones).
Constant parameters are the same in each realization.
- Returns:
- bool
True if the parameters are constant, false otherwise.
- class pelicun.uq.RandomVariableRegistry(rng: Generator)
Random variable registry.
- property RV: dict[str, BaseRandomVariable]
Return all random variable(s) in the registry.
- Returns:
- dict
All random variable(s) in the registry.
- property RV_sample: dict[str, ndarray | None]
Return the sample for every random variable in the registry.
- Returns:
- dict
The sample for every random variable in the registry.
- property RV_set: dict[str, RandomVariableSet]
Return the random variable set(s) in the registry.
- Returns:
- dict
The random variable set(s) in the registry.
- RVs(keys: list[str]) -> dict[str, BaseRandomVariable]
Return a subset of the random variables in the registry.
- Parameters:
- keys: list of str
Keys that define the subset.
- Returns:
- dict
A subset of the random variable(s) in the registry.
- add_RV(rv: BaseRandomVariable) -> None
Add a new random variable to the registry.
- Raises:
- ValueError
When the RV already exists in the registry
- add_RV_set(rv_set: RandomVariableSet) -> None
Add a new set of random variables to the registry.
- generate_sample(sample_size: int, method: str) -> None
Generate samples for all variables in the registry.
- Parameters:
- sample_size: int
The number of samples requested per variable.
- method: str
The sample generation method to use. Can be any of ‘MonteCarlo’, ‘LHS’, or ‘LHS_midpoint’. ‘MonteCarlo’ stands for conventional random sampling; ‘LHS’ is Latin Hypercube Sampling with a random sample location within each bin of the hypercube; ‘LHS_midpoint’ is like ‘LHS’, but the samples are assigned to the midpoints of the hypercube bins.
- Raises:
- NotImplementedError
When the RV parent class is unknown.
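The difference between the three methods can be sketched for a single variable in plain Python. This is a simplified illustration using the standard `random` module rather than the NumPy generator held by the registry; the function name `uniform_sample` is hypothetical.

```python
import random


def uniform_sample(sample_size, method, rng):
    """Draw uniform values on [0, 1) for one variable with the named method."""
    n = sample_size
    if method == "MonteCarlo":
        # Conventional random sampling: every draw is independent.
        return [rng.random() for _ in range(n)]
    if method in ("LHS", "LHS_midpoint"):
        # One point per bin [i/n, (i+1)/n): at the midpoint, or at a random offset.
        if method == "LHS_midpoint":
            offsets = [0.5] * n
        else:
            offsets = [rng.random() for _ in range(n)]
        points = [(i + offsets[i]) / n for i in range(n)]
        # Shuffle so that bins are paired randomly across variables.
        rng.shuffle(points)
        return points
    raise ValueError(f"Unknown sampling method: {method}")


rng = random.Random(42)
print(sorted(uniform_sample(4, "LHS_midpoint", rng)))
# [0.125, 0.375, 0.625, 0.875]
```

Both LHS variants guarantee exactly one sample per bin, which covers the unit interval more evenly than independent draws for the same sample size.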
- class pelicun.uq.RandomVariableSet(name: str, rv_list: list[BaseRandomVariable], rho: ndarray)
Random variable set.
Represents a set of random variables, each of which is described by its own probability distribution. The set allows the user to define correlations between the random variables, and provides methods to sample from the correlated variables and estimate various statistical properties of the set, such as the probability density within a specified range or orthotope.
- Parameters:
- name: string
A unique string that identifies the set of random variables.
- rv_list: list of RandomVariable
Defines the random variables in the set
- rho: float 2D ndarray
Defines the correlation matrix that describes the correlation between the random variables in the set. Currently, only the Gaussian copula is supported.
- property RV: dict[str, RandomVariable]
Return the random variable(s) assigned to the set.
- Returns:
- dict
The random variable(s) assigned to the set.
- Rho(var_subset: list[str] | None = None) -> ndarray
Return the (subset of the) correlation matrix.
- Returns:
- ndarray
The (subset of the) correlation matrix.
- apply_correlation() -> None
Apply correlation to n dimensional uniform samples.
Currently, correlation is applied using a Gaussian copula. A Cholesky transformation is tried first. If the correlation matrix is not positive definite and the Cholesky transformation fails, SVD is used to apply the correlations while preserving as much as possible of the correlation matrix.
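For two variables, the Gaussian copula step with a Cholesky factor can be written out by hand. The sketch below is a standalone illustration with a hypothetical function name; it covers only the 2 x 2 case and omits the SVD fallback.

```python
import math
from statistics import NormalDist


def correlate_uniform_pair(u1, u2, rho):
    """Impose correlation `rho` on two independent uniform values.

    Uses a Gaussian copula; inputs must lie strictly inside (0, 1).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    std_normal = NormalDist()
    # Step 1: map the uniform values to the standard normal space.
    z1 = std_normal.inv_cdf(u1)
    z2 = std_normal.inv_cdf(u2)
    # Step 2: multiply by the Cholesky factor of [[1, rho], [rho, 1]],
    # which is [[1, 0], [rho, sqrt(1 - rho^2)]].
    y1 = z1
    y2 = rho * z1 + math.sqrt(1.0 - rho**2) * z2
    # Step 3: map back to the uniform space.
    return std_normal.cdf(y1), std_normal.cdf(y2)


print(correlate_uniform_pair(0.3, 0.9, 0.8))
```

The first variable is left unchanged and the second is pulled toward it; the marginal distributions stay uniform, so each variable's own inverse CDF can still be applied afterwards.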
- orthotope_density(lower: ndarray | float = nan, upper: ndarray | float = nan, var_subset: list[str] | None = None) -> ndarray
Estimate the probability density within an orthotope for the RV set.
Use the mvn_orthotope_density function in this module for the calculation. The distribution of individual RVs is not limited to the normal family. The provided limits are converted to the standard normal space that is the basis of all RVs in pelicun. Truncation limits and correlation (using Gaussian copula) are automatically taken into consideration.
- Parameters:
- lower: float ndarray, optional, default: np.nan
Lower bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from below in a dimension, use np.nan for that dimension.
- upper: float ndarray, optional, default: np.nan
Upper bound(s) of the orthotope. A scalar value can be used for a univariate RV; a list of bounds is expected in multivariate cases. If the orthotope is not bounded from above in a dimension, use np.nan for that dimension.
- var_subset: list of strings, optional, default: None
If provided, allows for selecting only a subset of the variables in the RV_set for the density calculation.
- Returns:
- tuple
- alpha: float
Estimate of the probability density within the orthotope.
- eps_alpha: float
Estimate of the error in alpha.
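The conversion of limits to the standard normal space can be illustrated for a single lognormal variable. The sketch assumes the variable is described by its median and log standard deviation and is not truncated; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def lognormal_limit_to_std_normal(x, median, log_std):
    """Express a limit on a lognormal variable in standard normal space."""
    return (math.log(x) - math.log(median)) / log_std


# Example values (assumed): median 1.0, log standard deviation 0.5.
median, log_std = 1.0, 0.5
z_lower = lognormal_limit_to_std_normal(1.0, median, log_std)
z_upper = lognormal_limit_to_std_normal(math.exp(0.5), median, log_std)

# Once in standard normal space, the probability between the limits
# follows from the standard normal CDF.
std_normal = NormalDist()
probability = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)
print(z_lower, z_upper, probability)
```

After this conversion, every variable in the set can be handled by the same multivariate normal calculation regardless of its original family.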
- class pelicun.uq.UniformRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Uniform random variable.
- class pelicun.uq.UtilityRandomVariable(name: str, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Random variable that needs sample_size in inverse_transform.
- class pelicun.uq.WeibullRandomVariable(name: str, theta: np.ndarray, truncation_limits: np.ndarray | None = None, f_map: Callable | None = None, anchor: BaseRandomVariable | None = None)
Weibull random variable.
- pelicun.uq.fit_distribution_to_percentiles(values: list[float], percentiles: list[float], families: list[str]) -> tuple[str, list[float]]
Fit distribution to pre-defined values at a finite number of percentiles.
- Parameters:
- values: list of float
Pre-defined values at the given percentiles. At least two values are expected.
- percentiles: list of float
Percentiles where values are defined. At least two percentiles are expected.
- families: list of strings {‘normal’, ‘lognormal’}
Defines the distribution family candidates.
- Returns:
- tuple
- family: string
The optimal choice of family among the provided list of families
- theta: array of float
Parameters of the fitted distribution.
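With exactly two percentiles, a two-parameter distribution can be matched in closed form, which makes the idea easy to see. The sketch below is not pelicun's fitting routine: it assumes percentiles are given as probabilities in (0, 1), returns a mean and standard deviation for the normal case and a median and log standard deviation for the lognormal case, and does not choose between families.

```python
import math
from statistics import NormalDist


def normal_from_two_percentiles(values, percentiles):
    """Return (mean, std) of the normal distribution through two percentiles."""
    (v1, v2), (p1, p2) = values, percentiles
    z1 = NormalDist().inv_cdf(p1)
    z2 = NormalDist().inv_cdf(p2)
    # Solve v_i = mean + std * z_i for the two unknowns.
    std = (v2 - v1) / (z2 - z1)
    mean = v1 - std * z1
    return mean, std


def lognormal_from_two_percentiles(values, percentiles):
    """Return (median, log_std) of the lognormal through two percentiles."""
    log_values = [math.log(v) for v in values]
    mean_log, log_std = normal_from_two_percentiles(log_values, percentiles)
    return math.exp(mean_log), log_std


mean, std = normal_from_two_percentiles([8.0, 12.0], [0.1, 0.9])
median, log_std = lognormal_from_two_percentiles([8.0, 12.0], [0.1, 0.9])
print(mean, std)
print(median, log_std)
```

With more than two percentiles the system is overdetermined, so a fit has to minimize some error measure, and comparing that error across candidate families is one way to pick the best one.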
- pelicun.uq.fit_distribution_to_sample(raw_sample: np.ndarray, distribution: str | list[str], truncation_limits: tuple[float, float] = (nan, nan), censored_count: int = 0, detection_limits: tuple[float, float] = (nan, nan), *, multi_fit: bool = False, logger_object: Logger | None = None) -> tuple[np.ndarray, np.ndarray]
Fit a distribution to a sample using maximum likelihood estimation.
The number of dimensions of the distribution is inferred from the shape of the sample data. Censoring is automatically considered if the number of censored samples and the corresponding detection limits are provided. Infinite or unspecified truncation limits lead to fitting a non-truncated distribution in that dimension.
- Parameters:
- raw_sample: float ndarray
Raw data that serves as the basis of estimation. The number of samples equals the number of columns and each row introduces a new feature. In other words: a list of sample lists is expected where each sample list is a collection of samples of one variable.
- distribution: {‘normal’, ‘lognormal’}
Defines the target probability distribution type. Different types of distributions can be mixed by providing a list rather than a single value. Each element of the list corresponds to one of the features in the raw_sample.
- truncation_limits: float ndarray, optional, default: (nan, nan)
Lower and/or upper truncation limits for the specified distributions. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the distribution is non-truncated from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.
- censored_count: int, optional, default: 0
The number of censored samples that are beyond the detection limits. All samples outside the detection limits are aggregated into one set. This works the same way in one and in multiple dimensions. Prescription of specific censored sample counts for sub-regions of the input space outside the detection limits is not supported.
- detection_limits: float ndarray, optional, default: (nan, nan)
Lower and/or upper detection limits for the provided samples. A two-element vector can be used for a univariate case, while two lists of limits are expected in multivariate cases. If the data is not censored from one side in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.
- multi_fit: bool, optional, default: False
If True, we attempt to fit a multivariate distribution to the samples. Otherwise, we fit each marginal univariate distribution independently and estimate the correlation matrix in the end based on the fitted marginals. Using multi_fit can be advantageous with censored data and if the correlation in the data is not Gaussian. It leads to substantially longer calculation time and does not always produce better results, especially when the number of dimensions is large.
- logger_object:
Logging object to be used. If no object is specified, no logging is performed.
- Returns:
- tuple
- theta: float ndarray
Estimates of the parameters of the fitted probability distribution in each dimension. The following parameters are returned for the supported distributions: normal, normal_cov: mean, coefficient of variation; normal_std: mean, standard deviation; lognormal: median, log standard deviation.
- Rho: float 2D ndarray, optional
In the multivariate case, returns the estimate of the correlation matrix.
- Raises:
- ValueError
If NaN values are produced during standard normal space transformation
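In the simplest setting (one variable, no truncation, no censoring) the maximum likelihood estimates have closed forms. The sketch below shows only that special case with hypothetical function names; the truncated, censored, and multivariate cases described above need numerical optimization and are not covered.

```python
import math


def fit_normal_mle(sample):
    """Return the maximum likelihood (mean, std) of a normal distribution."""
    n = len(sample)
    mean = sum(sample) / n
    # The MLE of the variance divides by n, not n - 1.
    variance = sum((x - mean) ** 2 for x in sample) / n
    return mean, math.sqrt(variance)


def fit_lognormal_mle(sample):
    """Return the maximum likelihood (median, log_std) of a lognormal."""
    mean_log, log_std = fit_normal_mle([math.log(x) for x in sample])
    return math.exp(mean_log), log_std


print(fit_normal_mle([1.0, 2.0, 3.0, 4.0, 5.0]))
print(fit_lognormal_mle([math.exp(-1.0), 1.0, math.exp(1.0)]))
```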
- pelicun.uq.mvn_orthotope_density(mu: float | ndarray, cov: ndarray, lower: float | ndarray = nan, upper: float | ndarray = nan) -> tuple[float, float]
Estimate the probability density within a hyperrectangle for an MVN distribution.
Use the method of Alan Genz (1992) to estimate the probability density of a multivariate normal distribution within an n-orthotope (i.e., hyperrectangle) defined by its lower and upper bounds. Limits can be relaxed in any direction by assigning infinite bounds (i.e. numpy.inf).
- Parameters:
- mu: float scalar or ndarray
Mean(s) of the non-truncated distribution.
- cov: float ndarray
Covariance matrix of the non-truncated distribution
- lower: float vector, optional, default: np.nan
Lower bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from below in a subset of the dimensions, use either None or assign an infinite value (i.e. -numpy.inf) to those dimensions.
- upper: float vector, optional, default: np.nan
Upper bound(s) for the truncated distributions. A scalar value can be used for a univariate case, while a list of bounds is expected in multivariate cases. If the distribution is non-truncated from above in a subset of the dimensions, use either None or assign an infinite value (i.e. numpy.inf) to those dimensions.
- Returns:
- tuple
- alpha: float
Estimate of the probability density within the hyperrectangle.
- eps_alpha: float
Estimate of the error in the calculated probability density.
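When the covariance matrix is diagonal, the result factorizes into a product of univariate terms, which gives a convenient sanity check. The sketch below handles only that uncorrelated special case with a hypothetical function name; it is not the Genz method, which is needed once the variables are correlated.

```python
import math
from statistics import NormalDist


def independent_orthotope_probability(means, stds, lower, upper):
    """Probability content of a hyperrectangle for uncorrelated normals.

    Unbounded sides are given as -math.inf or math.inf.
    """
    probability = 1.0
    for mean, std, low, high in zip(means, stds, lower, upper):
        dist = NormalDist(mean, std)
        p_low = 0.0 if low == -math.inf else dist.cdf(low)
        p_high = 1.0 if high == math.inf else dist.cdf(high)
        probability *= p_high - p_low
    return probability


# Lower-left quadrant of a 2D standard normal distribution.
print(independent_orthotope_probability(
    [0.0, 0.0], [1.0, 1.0], [-math.inf, -math.inf], [0.0, 0.0]
))  # 0.25
```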
- pelicun.uq.rv_class_map(distribution_name: str) -> type[RandomVariable | UtilityRandomVariable]
Map convenient distributions to their corresponding class.
- Parameters:
- distribution_name: str
The name of a distribution.
- Returns:
- type[RandomVariable | UtilityRandomVariable]
The class of the corresponding random variable.
- Raises:
- ValueError
If the given distribution name does not correspond to a distribution class.
- pelicun.uq.scale_distribution(scale_factor: float, family: str, theta: ndarray, truncation_limits: ndarray | None = None) -> tuple[ndarray, ndarray | None]
Scale parameters of a random distribution.
- Parameters:
- scale_factor: float
Value by which to scale the parameters.
- family: {‘normal’ (or ‘normal_cov’), ‘normal_std’, ‘lognormal’, ‘uniform’}
Defines the type of probability distribution for the random variable.
- theta: float ndarray of length 2
Set of parameters that define the cumulative distribution function of the variable given its distribution type. See the expected parameters explained in the RandomVariable class. Each parameter can be defined by one or more values. If a set of values is provided for one parameter, they define the ordinates of a multilinear function that is used to get the parameter values given an independent variable.
- truncation_limits: float ndarray of length 2, default: None
Defines the [a,b] truncation limits for the distribution. Use None to assign no limit in one direction.
- Returns:
- tuple
A tuple containing the scaled parameters and truncation limits:
- theta_new: float ndarray of length 2
Scaled parameters of the distribution.
- truncation_limits: float ndarray of length 2 or None
Scaled truncation limits for the distribution, or None if no truncation is applied.
- Raises:
- ValueError
If the specified distribution family is unsupported.
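How scaling affects each family follows from what its parameters mean: location-type parameters scale with the variable, while dimensionless dispersion measures do not. The sketch below works from that reasoning with plain lists and a hypothetical function name; it is an assumption about the intended behavior, not a copy of pelicun's code.

```python
def scale_parameters(scale_factor, family, theta, truncation_limits=None):
    """Scale two-parameter distributions and their truncation limits."""
    theta_new = list(theta)
    if family in ("normal", "normal_cov"):
        # The mean scales; the coefficient of variation is dimensionless.
        theta_new[0] = theta[0] * scale_factor
    elif family == "normal_std":
        # Both the mean and the standard deviation carry the variable's units.
        theta_new = [theta[0] * scale_factor, theta[1] * scale_factor]
    elif family == "lognormal":
        # The median scales; the log standard deviation is dimensionless.
        theta_new[0] = theta[0] * scale_factor
    elif family == "uniform":
        # Both bounds scale.
        theta_new = [value * scale_factor for value in theta]
    else:
        raise ValueError(f"Unsupported distribution: {family}")

    limits_new = None
    if truncation_limits is not None:
        limits_new = [
            None if limit is None else limit * scale_factor
            for limit in truncation_limits
        ]
    return theta_new, limits_new


print(scale_parameters(2.0, "lognormal", [5.0, 0.4], [None, 8.0]))
# ([10.0, 0.4], [None, 16.0])
```

A typical use is a unit conversion, where the scale factor is the ratio between the two units.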