pelicun.base

Constants, basic classes, and methods for pelicun.

Functions

`_warning`(message, category, filename, lineno)	Display warnings in a custom format.
`check_if_str_is_na`(string)	Check if the provided string can be interpreted as N/A.
`control_warnings`()	Turn warnings on/off.
`convert_dtypes`(dataframe)	Convert columns to a numeric datatype whenever possible.
`convert_to_MultiIndex`()	Convert the index of a DataFrame to a MultiIndex.
`convert_to_SimpleIndex`()	Convert the index of a DataFrame to a simple, one-level index.
`convert_units`(values, unit, to_unit[, category])	Convert numeric values between different units.
`dedupe_index`(dataframe[, dtype])	Add a uid level to the index.
`describe`(data[, percentiles])	Extend descriptive statistics.
`dict_raise_on_duplicates`(ordered_pairs)	Construct a dictionary from a list of key-value pairs.
`ensure_value`(value)	Ensure a variable is not None.
`float_or_None`(string)	Convert strings to float or None.
`get`(d, path[, default])	Path-like dictionary value retrieval.
`int_or_None`(string)	Convert strings to int or None.
`invert_mapping`(original_dict)	Inverts a dictionary mapping from key to list of values.
`is_specified`(d, path)	Opposite of is_unspecified().
`is_unspecified`(d, path)	Check if something is unspecified.
`load_default_options`()	Load the default_config.json file to set options to default values.
`merge_default_config`(user_config)	Merge default config with user's options.
`multiply_factor_multiple_levels`(df, ...[, ...])	Multiply selected rows by a factor, in place.
`parse_units`([custom_file, preserve_categories])	Parse the unit conversion factor JSON file and return a dictionary.
`show_matrix`(data, *[, use_describe])	Print a matrix in a nice way using a DataFrame.
`split_file_name`(file_path)	Separate a file name from the extension.
`str2bool`(v)	Convert a string representation of truth to boolean True or False.
`stringterpolation`(arguments)	Linear interpolation from strings.
`update`(d, path, value, *[, ...])	Set a value in a nested dictionary using a path with '/' as the separator.
`update_vals`(update_value, primary, ...)	Transfer values between nested dictionaries.
`with_parsed_str_na_values`(df)	Identify string values interpretable as N/A.

Classes

`Logger`(log_file, *, verbose, log_show_ms, ...)	Generate log files documenting execution events.
`LoggerRegistry`()	Registry to manage all logger instances.
`Options`(user_config_options[, assessment])	Analysis options and logging configuration.

class pelicun.base.Logger(log_file: str | None, *, verbose: bool, log_show_ms: bool, print_log: bool)[source]

Generate log files documenting execution events.

add_warning(msg: str) → None[source]

Add a warning to the warning stack.

Parameters:

msg: str: The warning message.

Notes

Warnings are only emitted when emit_warnings is called.

div(*, prepend_timestamp: bool = False) → None[source]: Add a divider line in the log file.

emit_warnings() → None[source]: Issues all warnings and clears the warning stack.
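
The deferred-warning pattern described for add_warning and emit_warnings can be sketched as follows. This is a minimal standalone illustration, not pelicun's Logger: the class name WarningStack is hypothetical, and Python's standard warnings module is assumed as the emission mechanism.

```python
import warnings


class WarningStack:
    """Hypothetical stand-in that collects warnings and emits them later."""

    def __init__(self) -> None:
        self.stack: list[str] = []

    def add_warning(self, msg: str) -> None:
        # Queue the message; nothing is shown yet.
        self.stack.append(msg)

    def emit_warnings(self) -> None:
        # Issue every queued warning, then clear the stack.
        for msg in self.stack:
            warnings.warn(msg, stacklevel=2)
        self.stack = []


log = WarningStack()
log.add_warning("first issue")
log.add_warning("second issue")
pending_before = len(log.stack)

# Record the warnings so the effect of emit_warnings can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    log.emit_warnings()
```

Queuing warnings this way lets a long computation finish before the messages are reported in one place.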

msg(msg: str = '', *, prepend_timestamp: bool = True, prepend_blank_space: bool = True) → None[source]

Write a message in the log file with the current time as prefix.

The time is in ISO-8601 format, e.g. 2018-06-16T20:24:04Z

Parameters:

msg: string: Message to print.
prepend_timestamp: bool: Controls whether a timestamp is placed before the message.
prepend_blank_space: bool: Controls whether blank space is placed before the message.

print_system_info() → None[source]: Write system information in the log.

reset_log_strings() → None[source]: Populate the string-related attributes of the logger.

warning(msg: str) → None[source]

Add and emit a warning immediately.

Parameters:

msg: str: Warning message

class pelicun.base.LoggerRegistry[source]

Registry to manage all logger instances.

classmethod log_exception(exc_type: type[BaseException], exc_value: BaseException, exc_traceback: TracebackType | None) → None[source]: Log exceptions to all registered loggers.

classmethod register(logger: Logger) → None[source]: Register a logger instance.

class pelicun.base.Options(user_config_options: dict[str, Any] | None, assessment: AssessmentBase | None = None)[source]

Analysis options and logging configuration.

Attributes:

sampling_method: str: Sampling method to use. Specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code). Can be any of [‘LHS’, ‘LHS_midpoint’, ‘MonteCarlo’]. The default is ‘LHS’.
units_file: str: Location of a user-specified units file, which should contain the names of supported units and their conversion factors (the value by which a quantity in a given unit must be multiplied to be expressed in the base units). Value specified in the user configuration dictionary. Pelicun comes with a set of default units which are always loaded (see settings/default_units.json in the pelicun source code). Units specified in the units_file overwrite the default units.
demand_offset: dict: Demand offsets are used in the process of mapping a component location to its associated EDP. This allows components that are sensitive to EDPs of different levels to be specified as present at the same location (e.g., a desktop computer and a suspended ceiling, both at the same story). Each component’s offset value is specified in the component fragility database. This setting applies a supplemental global offset to specific EDP types. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
nondir_multi_dict: dict: Nondirectional components are sensitive to demands coming in any direction. Results are typically available in two orthogonal directions. FEMA P-58 suggests using the formula max(dir_1, dir_2) * 1.2 to estimate the demand for such components. This parameter allows modifying the 1.2 multiplier with a user-specified value. The change can be applied to “ALL” EDPs, or for specific EDPs, such as “PFA”, “PFV”, etc. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
rho_cost_time: float: Specifies the correlation between the repair cost and repair time consequences. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see “RepairCostAndTimeCorrelation” in settings/default_config.json in the pelicun source code).
eco_scale: dict: Controls how the effects of economies of scale are handled in the damaged component quantity aggregation for loss measure estimation. The dictionary is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
log: Logger: Logger object. Configuration parameters coming from the user’s configuration dictionary or the default configuration file control logging behavior. See Logger class.
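
The FEMA P-58 rule quoted for nondir_multi_dict amounts to simple arithmetic. A minimal sketch, assuming a dictionary with an "ALL" entry and optional per-EDP entries that take precedence over it; the function name, the precedence rule, and the 1.5 value for "PFA" are hypothetical and chosen only for illustration.

```python
def nondirectional_demand(dir_1: float, dir_2: float, edp_type: str,
                          multipliers: dict[str, float]) -> float:
    """Estimate a nondirectional demand as max(dir_1, dir_2) * multiplier."""
    # Assumption: an EDP-specific multiplier wins over the "ALL" entry.
    factor = multipliers.get(edp_type, multipliers["ALL"])
    return max(dir_1, dir_2) * factor


multipliers = {"ALL": 1.2, "PFA": 1.5}  # hypothetical user settings

# No "PID" entry, so the "ALL" multiplier applies: max(0.010, 0.015) * 1.2
pid = nondirectional_demand(0.010, 0.015, "PID", multipliers)

# "PFA" has its own entry: max(0.40, 0.30) * 1.5
pfa = nondirectional_demand(0.40, 0.30, "PFA", multipliers)
```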

property rng: Generator

rng property.

Returns:

Generator: Random generator

property seed: float | None

Seed property.

Returns:

float: Seed value

pelicun.base.check_if_str_is_na(string: Any) → bool[source]

Check if the provided string can be interpreted as N/A.

Parameters:

string: object: The string to evaluate

Returns:

bool: The evaluation result. True if the string is considered N/A.

pelicun.base.control_warnings() → None[source]: Turn warnings on/off.

See also: pelicun/pytest.ini. Devs: make sure to update that file when addressing & eliminating warnings.

pelicun.base.convert_dtypes(dataframe: DataFrame) → DataFrame[source]

Convert columns to a numeric datatype whenever possible.

The function replaces None with NA; otherwise, columns containing None would continue to have the object type.

Parameters:

dataframe: DataFrame: The DataFrame that will be modified.

Returns:

DataFrame: The modified DataFrame.

pelicun.base.convert_to_MultiIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) → DataFrame[source]

pelicun.base.convert_to_MultiIndex(data: Series, axis: int = 0, *, inplace: bool = False) → Series

Convert the index of a DataFrame to a MultiIndex.

We assume that the index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is expected to separate each level of the index.

Parameters:

data: DataFrame: The DataFrame that will be modified.
axis: int, optional, default:0: Identifies if the index (0) or the columns (1) shall be edited.
inplace: bool, optional, default:False: If True, the operation is performed directly on the input DataFrame and not on a copy of it.

Returns:

DataFrame: The modified DataFrame.

Raises:

ValueError: If an invalid axis is specified.

pelicun.base.convert_to_SimpleIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) → DataFrame[source]

pelicun.base.convert_to_SimpleIndex(data: Series, axis: int = 0, *, inplace: bool = False) → Series

Convert the index of a DataFrame to a simple, one-level index.

The target index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is used to separate each level of the index.

Parameters:

data: DataFrame: The DataFrame that will be modified.
axis: int, optional, default:0: Identifies if the index (0) or the columns (1) shall be edited.
inplace: bool, optional, default:False: If True, the operation is performed directly on the input DataFrame and not on a copy of it.

Returns:

DataFrame: The modified DataFrame

Raises:

ValueError: When an invalid axis parameter is specified
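
The dash-separated label convention behind convert_to_MultiIndex and convert_to_SimpleIndex can be illustrated without pandas. The sketch below is not pelicun's implementation; the helper names and the sample labels are hypothetical, and it works on plain lists rather than DataFrame indexes.

```python
def to_levels(labels: list[str]) -> list[tuple[str, ...]]:
    """Split each dash-separated label into a tuple of levels."""
    return [tuple(label.split("-")) for label in labels]


def to_simple(levels: list[tuple[str, ...]]) -> list[str]:
    """Join each tuple of levels back into one dash-separated label."""
    return ["-".join(parts) for parts in levels]


labels = ["PFA-1-1", "PFA-1-2", "PID-2-1"]  # hypothetical labels
levels = to_levels(labels)      # one tuple per label, one entry per level
round_trip = to_simple(levels)  # back to the original flat labels
```

With pandas, tuples of this kind are the usual input to pd.MultiIndex.from_tuples, which is one way to build the hierarchical index from them.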

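pelicun.base.convert_units(values: float | list[float] | np.ndarray, unit: str, to_unit: str, category: str | None = None) → float | list[float] | np.ndarray

Convert numeric values between different units.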

Supports conversion within a specified category of units and automatically infers the category if not explicitly provided. It maintains the type of the input in the output.

Parameters:

values: float | list[float] | np.ndarray: The numeric value(s) to convert.
unit: str: The current unit of the values.
to_unit: str: The target unit to convert the values into.
category: str, optional: The category of the units (e.g., ‘length’, ‘pressure’). If not provided, the category will be inferred based on the provided units.

Returns:

float or list[float] or np.ndarray: The converted value(s) in the target unit, in the same data type as the input values.

Raises:

TypeError: If the input values are not of type float, list, or np.ndarray.
ValueError: If the unit, to_unit, or category is unknown or if unit and to_unit are not in the same category.
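
The conversion logic described above can be sketched with a small table of factors, where each factor is the multiplier that expresses a unit in its category's base unit. This is a standalone illustration under stated assumptions: the FACTORS table and the convert helper are hypothetical, only float and list inputs are handled, and pelicun's actual unit names and factors may differ.

```python
# Hypothetical factors: multiply a value by its factor to reach the base unit.
FACTORS = {
    "length": {"m": 1.0, "cm": 0.01, "ft": 0.3048},
    "time": {"sec": 1.0, "minute": 60.0},
}


def convert(values, unit, to_unit, category=None):
    """Convert a float or a list of floats, keeping the input type."""
    if not isinstance(values, (float, list)):
        raise TypeError(f"Unsupported type: {type(values).__name__}")
    if category is None:
        # Infer the category from the source unit.
        matches = [c for c, units in FACTORS.items() if unit in units]
        if not matches:
            raise ValueError(f"Unknown unit: {unit}")
        category = matches[0]
    elif category not in FACTORS:
        raise ValueError(f"Unknown category: {category}")
    units = FACTORS[category]
    if unit not in units or to_unit not in units:
        raise ValueError(f"{unit} and {to_unit} are not both in {category}")
    # Source unit -> base unit -> target unit, folded into one scale factor.
    scale = units[unit] / units[to_unit]
    if isinstance(values, list):
        return [v * scale for v in values]
    return values * scale


in_meters = convert(250.0, "cm", "m")
in_centimeters = convert([1.0, 2.5], "m", "cm")
```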

pelicun.base.dedupe_index(dataframe: DataFrame, dtype: type = str) → DataFrame[source]

Add a uid level to the index.

Modifies the index of a DataFrame to ensure all index elements are unique by adding an extra level. Assumes that the DataFrame’s original index is a MultiIndex with specified names. A unique identifier (‘uid’) is added as an additional index level based on the cumulative count of occurrences of the original index combinations.

Parameters:

dataframe: pd.DataFrame: The DataFrame whose index is to be modified. It must have a MultiIndex.
dtype: type, optional: The data type for the new index level ‘uid’. Defaults to str.

Returns:

dataframe: pd.DataFrame: The original dataframe with an additional uid level at the index.
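
The idea of a uid based on the cumulative count of repeated index combinations can be shown on a plain list of tuples. The helper below is a hypothetical sketch rather than pelicun's code, and the sample keys are made up.

```python
from collections import defaultdict


def add_uid(index: list[tuple], dtype: type = str) -> list[tuple]:
    """Append a running count per repeated key so every entry is unique."""
    seen: dict[tuple, int] = defaultdict(int)
    result = []
    for key in index:
        # The uid is the number of earlier occurrences of the same key.
        result.append((*key, dtype(seen[key])))
        seen[key] += 1
    return result


index = [("B.10", "1"), ("B.10", "1"), ("C.20", "2"), ("B.10", "1")]
unique_index = add_uid(index)
```

In pandas, a cumulative count of this kind is available through GroupBy.cumcount; whether pelicun relies on it is not stated here.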

pelicun.base.describe(data: DataFrame | Series | ndarray, percentiles: tuple[float, ...] = (0.001, 0.023, 0.1, 0.159, 0.5, 0.841, 0.9, 0.977, 0.999)) → DataFrame[source]

Extend descriptive statistics.

Provides extended descriptive statistics for given data, including percentiles and log standard deviation for applicable columns. This function accepts both pandas Series and DataFrame objects directly, or any array-like structure which can be converted to them. It calculates common descriptive statistics and optionally adds log standard deviation for columns where all values are positive.

Parameters:

data: pd.Series, pd.DataFrame, or array-like: The data to describe. If array-like, it is converted to a DataFrame or Series before analysis.
percentiles: tuple of float, optional: Specific percentiles to include in the output. Default includes an extensive range tailored to provide a detailed summary.

Returns:

pd.DataFrame: A DataFrame containing the descriptive statistics of the input data, transposed so that each descriptive statistic is a row.

pelicun.base.dict_raise_on_duplicates(ordered_pairs: list[tuple]) → dict[source]

Construct a dictionary from a list of key-value pairs.

Constructs a dictionary from a list of key-value pairs, raising an exception if duplicate keys are found. This function ensures that no two pairs have the same key. It is particularly useful when parsing JSON-like data where unique keys are expected but not enforced by standard parsing methods.

Parameters:

ordered_pairs: list of tuples: A list of tuples, each containing a key and a value. Keys are expected to be unique across the list.

Returns:

dict: A dictionary constructed from the ordered_pairs without any duplicates.

Raises:

ValueError: If a duplicate key is found in the input list, a ValueError is raised with a message indicating the duplicate key.

Notes

This implementation is useful for contexts in which data integrity is crucial and key uniqueness must be ensured.

Examples

>>> dict_raise_on_duplicates(
...     [("key1", "value1"), ("key2", "value2"), ("key1", "value3")]
... )
ValueError: duplicate key: key1
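
A function with this signature fits the object_pairs_hook parameter of json.loads, which passes each JSON object to the hook as a list of key-value pairs. The sketch below uses a local stand-in named raise_on_duplicates so that it runs without pelicun installed; treating the hook as the intended use is an assumption based on the description above.

```python
import json


def raise_on_duplicates(ordered_pairs):
    """Stand-in with the behavior documented above."""
    result = {}
    for key, value in ordered_pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key}")
        result[key] = value
    return result


# Unique keys parse normally.
clean = json.loads('{"a": 1, "b": 2}', object_pairs_hook=raise_on_duplicates)

# A repeated key, which json.loads would otherwise silently overwrite,
# now raises an error.
try:
    json.loads('{"a": 1, "a": 2}', object_pairs_hook=raise_on_duplicates)
    duplicate_detected = False
except ValueError:
    duplicate_detected = True
```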

pelicun.base.ensure_value(value: T | None) → T[source]

Ensure a variable is not None.

This function checks that the provided variable is not None. It is used to assist with type hinting by avoiding repetitive assert value is not None statements throughout the code.

Parameters:

value: Optional[T]: The variable to check, which can be of any type or None.

Returns:

T: The same variable, guaranteed to be non-None.

Raises:

TypeError: If the provided variable is None.
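
The type-narrowing role of such a helper can be sketched as follows. The names ensure and find_port are hypothetical; the point is that a static type checker sees the result as T rather than Optional[T], with no assert statement at the call site.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def ensure(value: Optional[T]) -> T:
    """Return value unchanged, or raise TypeError if it is None."""
    if value is None:
        raise TypeError("Expected a value, got None.")
    return value


def find_port(config: dict) -> Optional[int]:
    """Hypothetical lookup that may return None."""
    return config.get("port")


# The annotation `int` is accepted because ensure() rules out None.
port: int = ensure(find_port({"port": 8080}))
```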

pelicun.base.float_or_None(string: str) → float | None[source]

Convert strings to float or None.

Parameters:

string: str: A string

Returns:

float or None: A float, if the given string can be converted to a float. Otherwise, it returns None

pelicun.base.get(d: dict | None, path: str, default: Any | None = None) → Any[source]

Path-like dictionary value retrieval.

Retrieves a value from a nested dictionary using a path with ‘/’ as the separator.

Parameters:

d: dict: The dictionary to search.
path: str: The path to the desired value, with keys separated by ‘/’.
default: Any, optional: The value to return if the path is not found. Defaults to None.

Returns:

Any: The value found at the specified path, or the default value if the path is not found.

Examples

>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value"
...             }
...         }
...     }
... }
>>> get(config, '/DL/Outputs/Format/JSON', default='default_value')
'desired_value'
>>> get(config, '/DL/Outputs/Format/XML', default='default_value')
'default_value'

pelicun.base.int_or_None(string: str) → int | None[source]

Convert strings to int or None.

Parameters:

string: str: A string

Returns:

int or None: An int, if the given string can be converted to an int. Otherwise, it returns None
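
float_or_None and int_or_None follow the same pattern: attempt the conversion and fall back to None. A minimal sketch with hypothetical helper names, assuming that a failed conversion is signaled by ValueError:

```python
from typing import Optional


def to_float_or_none(string: str) -> Optional[float]:
    """Return float(string), or None if the string is not a valid float."""
    try:
        return float(string)
    except ValueError:
        return None


def to_int_or_none(string: str) -> Optional[int]:
    """Return int(string), or None if the string is not a valid int."""
    try:
        return int(string)
    except ValueError:
        return None


parsed = [to_float_or_none(s) for s in ["3.5", "abc"]]
counts = [to_int_or_none(s) for s in ["7", "7.5"]]
```

Note that in this sketch "7.5" is not accepted as an int, because Python's int() rejects strings with a decimal point.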

pelicun.base.invert_mapping(original_dict: dict) → dict[source]

Inverts a dictionary mapping from key to list of values.

Parameters:

original_dict: dict: Dictionary with values that are lists of hashable items.

Returns:

dict: New dictionary where each item in the original value lists becomes a key and the original key becomes the corresponding value.

Raises:

ValueError: If any value in the original dictionary’s value lists appears more than once.
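
The inversion and its uniqueness check can be sketched in a few lines. The helper name invert and the sample data are hypothetical.

```python
def invert(original: dict) -> dict:
    """Map every item of each value list back to its original key."""
    inverted = {}
    for key, values in original.items():
        for value in values:
            if value in inverted:
                # An item listed twice would make the inverse ambiguous.
                raise ValueError(f"{value!r} appears more than once.")
            inverted[value] = key
    return inverted


groups = {"structural": ["B.10", "B.20"], "nonstructural": ["C.30"]}
lookup = invert(groups)
```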

pelicun.base.is_specified(d: dict[str, Any], path: str) → bool[source]

Opposite of is_unspecified().

Parameters:

d: dict: The dictionary to search.
path: str: The path to the desired value, with keys separated by ‘/’.

Returns:

bool: True if the value is specified, False otherwise.

pelicun.base.is_unspecified(d: dict[str, Any], path: str) → bool[source]

Check if something is unspecified.

Checks if a value in a nested dictionary is either non-existent, None, NaN, or an empty dictionary or list.

Parameters:

d: dict: The dictionary to search.
path: str: The path to the desired value, with keys separated by ‘/’.

Returns:

bool: True if the value is non-existent, None, NaN, or an empty dictionary or list. False otherwise.

Examples

>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value",
...                 "EmptyDict": {}
...             }
...         }
...     }
... }
>>> is_unspecified(config, '/DL/Outputs/Format/JSON')
False
>>> is_unspecified(config, '/DL/Outputs/Format/XML')
True
>>> is_unspecified(config, '/DL/Outputs/Format/EmptyDict')
True

pelicun.base.load_default_options() → dict[source]

Load the default_config.json file to set options to default values.

Returns:

dict: Default options

pelicun.base.merge_default_config(user_config: dict | None) → dict[source]

Merge default config with user’s options.

Merge the user-specified config with the configuration defined in the default_config.json file. If the user-specified config does not include some option available in the default options, then the default option is used in the merged config.

Parameters:

user_config: dict: User-specified configuration dictionary

Returns:

dict: Merged configuration dictionary
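
The merge rule described above (user options win, defaults fill the gaps) might look like the sketch below. It assumes the merge recurses into nested dictionaries; the helper name and the configuration keys are hypothetical and are not taken from default_config.json.

```python
import copy
from typing import Optional


def merge_with_defaults(user: Optional[dict], defaults: dict) -> dict:
    """Return user options with missing entries filled in from defaults."""
    merged = copy.deepcopy(user) if user else {}
    for key, default_value in defaults.items():
        if key not in merged:
            # Option absent from the user config: take the default.
            merged[key] = copy.deepcopy(default_value)
        elif isinstance(merged[key], dict) and isinstance(default_value, dict):
            # Both sides are nested sections: merge them recursively.
            merged[key] = merge_with_defaults(merged[key], default_value)
    return merged


defaults = {"Sampling": {"Method": "LHS", "Size": 1000}, "Verbose": False}
user = {"Sampling": {"Size": 100}}
merged = merge_with_defaults(user, defaults)
```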

pelicun.base.multiply_factor_multiple_levels(df: DataFrame, conditions: dict, factor: float, axis: int = 0, *, raise_missing: bool = True) → None[source]

Multiply selected rows by a factor, in place.

Multiplies selected rows of a DataFrame that is indexed with a hierarchical index (pd.MultiIndex) by a given factor. The change is done in place.

Parameters:

df: pd.DataFrame: The DataFrame to be modified.
conditions: dict: A dictionary mapping each level name to a single value. Only the rows whose index levels have the provided values are affected. The dictionary can be empty, in which case all rows are affected, or contain only some levels and values, in which case only the matching rows are affected.
factor: float: Scaling factor to use.
axis: int: With 0 the condition is checked against the DataFrame’s index; with 1 it is checked against the DataFrame’s columns.
raise_missing: bool: Raise an error if no rows match the given conditions.

Raises:

ValueError: If the provided axis value is neither 0 nor 1.
ValueError: If there are no rows matching the conditions and raise_missing is True.

pelicun.base.parse_units(custom_file: str | None = None, *, preserve_categories: bool = False) → dict[source]

Parse the unit conversion factor JSON file and return a dictionary.

Parameters:

custom_file: str, optional: If a custom file is provided, only the units specified in the custom file are used.
preserve_categories: bool, optional: If True, maintains the original data types of category values from the JSON file. If False, converts all values to floats and flattens the dictionary structure, ensuring that each unit name is globally unique across categories.

Returns:

dict: A dictionary where keys are unit names and values are their corresponding conversion factors. If preserve_categories is True, the dictionary may maintain its original nested structure based on the JSON file. If preserve_categories is False, the dictionary is flattened to have globally unique unit names.
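
The difference between the nested and the flattened result can be sketched as follows. The helper name and the sample units are hypothetical, and raising ValueError on a repeated unit name is an assumption about how global uniqueness might be enforced.

```python
def flatten_units(nested: dict) -> dict:
    """Flatten {category: {unit: factor}} into {unit: factor}."""
    flat = {}
    for category, units in nested.items():
        for name, factor in units.items():
            if name in flat:
                # Assumed behavior: a unit name may appear in one category only.
                raise ValueError(f"{name} is defined twice (again in {category}).")
            flat[name] = float(factor)
    return flat


nested = {"length": {"m": 1, "cm": 0.01}, "time": {"sec": 1}}
flat = flatten_units(nested)
```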

pelicun.base.show_matrix(data: ndarray | DataFrame, *, use_describe: bool = False) → None[source]

Print a matrix in a nice way using a DataFrame.

Parameters:

data: array-like: The matrix data to display. Can be any array-like structure that pandas can convert to a DataFrame.
use_describe: bool, default: False: If True, provides a descriptive statistical summary of the matrix including specified percentiles. If False, simply prints the matrix as is.

pelicun.base.split_file_name(file_path: str) → tuple[str, str][source]

Separate a file name from the extension.

Separates a file name from the extension accounting for the case where the file name itself contains periods.

Parameters:

file_path: str: Original file path.

Returns:

tuple

name: str: Name of the file.
extension: str: File extension.
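
One way to split a name that itself contains periods is to treat only the last suffix as the extension, which is what pathlib does. The sketch below is an illustration with a hypothetical helper name; whether pelicun returns the extension with its leading dot, or keeps the directory part in the name, is not stated here.

```python
from pathlib import Path


def split_name(file_path: str) -> tuple[str, str]:
    """Return (name, extension) using only the last period as separator."""
    path = Path(file_path)
    # Only the final suffix is the extension; earlier periods stay in the name.
    return path.stem, path.suffix


name, extension = split_name("results/run.v2.csv")
```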

pelicun.base.str2bool(v: str | bool) → bool[source]

Convert a string representation of truth to boolean True or False.

This function is designed to convert string inputs that represent boolean values into actual Python boolean types. It handles typical representations of truthiness and falsiness, and is case insensitive.

Parameters:

v: str or bool: The value to convert into a boolean. This can be a boolean itself (in which case it is simply returned) or a string that is expected to represent a boolean value.

Returns:

bool: The boolean value corresponding to the input.

Raises:

argparse.ArgumentTypeError: If v is a string that does not correspond to a boolean value, an error is raised indicating that a boolean value was expected.
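
Because the documented error type is argparse.ArgumentTypeError, a converter like this is a natural fit for the type argument of ArgumentParser.add_argument. The sketch below uses a local stand-in named to_bool; the accepted spellings and the --verbose flag are assumptions made for the example.

```python
import argparse

TRUE_STRINGS = {"yes", "true", "t", "y", "1"}   # assumed accepted spellings
FALSE_STRINGS = {"no", "false", "f", "n", "0"}  # assumed accepted spellings


def to_bool(v):
    """Convert a bool or a case-insensitive truth string to a bool."""
    if isinstance(v, bool):
        return v
    lowered = v.lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")


parser = argparse.ArgumentParser()
parser.add_argument("--verbose", type=to_bool, default=False)
args = parser.parse_args(["--verbose", "Yes"])
```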

pelicun.base.stringterpolation(arguments: str) → Callable[[np.ndarray], np.ndarray][source]

Linear interpolation from strings.

Turns a string of specially formatted arguments into a multilinear interpolating function.

Parameters:

arguments: str: String of arguments containing Y values and X values, separated by a pipe symbol (|). Individual values are separated by commas (,). Example: arguments = ‘y1,y2,y3|x1,x2,x3’

Returns:

Callable: A callable interpolating function
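
The argument format (Y values first, then X values after the pipe) can be illustrated with a pure-Python interpolator. This sketch is not pelicun's implementation: it accepts scalars rather than arrays, assumes the X values are strictly increasing, and raises ValueError outside the X range, which is an assumed choice.

```python
from bisect import bisect_right
from typing import Callable


def make_interpolator(arguments: str) -> Callable[[float], float]:
    """Build a piecewise-linear function from a 'y1,y2,...|x1,x2,...' string."""
    y_part, x_part = arguments.split("|")
    ys = [float(v) for v in y_part.split(",")]
    xs = [float(v) for v in x_part.split(",")]
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("Need matching X and Y lists with at least two points.")

    def interpolate(x: float) -> float:
        if not xs[0] <= x <= xs[-1]:
            raise ValueError(f"{x} is outside [{xs[0]}, {xs[-1]}]")
        # Locate the segment [xs[i], xs[i + 1]] that contains x.
        i = min(bisect_right(xs, x) - 1, len(xs) - 2)
        t = (x - xs[i]) / (xs[i + 1] - xs[i])
        return ys[i] + t * (ys[i + 1] - ys[i])

    return interpolate


# Y values 0, 10, 30 at X values 0, 1, 2.
f = make_interpolator("0,10,30|0,1,2")
```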

pelicun.base.update(d: dict[str, Any], path: str, value: Any, *, only_if_empty_or_none: bool = False) → None[source]

Set a value in a nested dictionary using a path with ‘/’ as the separator.

Parameters:

d: dict: The dictionary to update.
path: str: The path to the desired value, with keys separated by ‘/’.
value: Any: The value to set at the specified path.
only_if_empty_or_none: bool, optional: If True, only update the value if it is None or an empty dictionary. Defaults to False.

Examples

>>> d = {}
>>> update(d, 'x/y/z', 1)
>>> d
{'x': {'y': {'z': 1}}}

>>> update(d, 'x/y/z', 2, only_if_empty_or_none=True)
>>> d
{'x': {'y': {'z': 1}}}  # value remains 1 since it is not empty or None

>>> update(d, 'x/y/z', 2)
>>> d
{'x': {'y': {'z': 2}}}  # value is updated to 2

pelicun.base.update_vals(update_value: dict, primary: dict, update_path: str, primary_path: str) → None[source]

Transfer values between nested dictionaries.

Updates the values of the update nested dictionary with those provided in the primary nested dictionary. If a key already exists in update, and does not map to another dictionary, the value is left unchanged.

Parameters:

update_value: dict: Dictionary, which can contain nested dictionaries, to be updated based on the values of primary. New keys existing in primary are added to update. Values whose keys already exist in update are left unchanged.
primary: dict: Dictionary, which can contain nested dictionaries, to be used to update the values of update.
update_path: str: Identifier for the update dictionary. Used to make error messages more meaningful.
primary_path: str: Identifier for the primary dictionary. Used to make error messages more meaningful.

Raises:

ValueError: If primary[key] is dict but update[key] is not.
ValueError: If update[key] is dict but primary[key] is not.
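
The transfer rule and the two error cases can be sketched recursively. The helper name fill_missing, the default path labels, and the error wording are hypothetical; the sketch follows the description above rather than pelicun's source.

```python
def fill_missing(update: dict, primary: dict,
                 update_path: str = "update",
                 primary_path: str = "primary") -> None:
    """Copy entries from primary into update without overwriting values."""
    for key, primary_value in primary.items():
        if key not in update:
            # New key: take it from primary.
            update[key] = primary_value
            continue
        update_is_dict = isinstance(update[key], dict)
        primary_is_dict = isinstance(primary_value, dict)
        if update_is_dict and primary_is_dict:
            # Both are nested sections: descend one level.
            fill_missing(update[key], primary_value,
                         f"{update_path}/{key}", f"{primary_path}/{key}")
        elif update_is_dict != primary_is_dict:
            raise ValueError(
                f"{update_path}/{key} and {primary_path}/{key} disagree on "
                "whether the entry is a dictionary."
            )
        # Otherwise both are plain values: keep the one already in update.


update = {"a": 1, "n": {"x": 1}}
primary = {"a": 2, "b": 3, "n": {"x": 9, "y": 5}}
fill_missing(update, primary)
```

The path arguments exist only to make the error message point at the offending entry, mirroring the role of update_path and primary_path above.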

pelicun.base.with_parsed_str_na_values(df: DataFrame) → DataFrame[source]

Identify string values interpretable as N/A.

Given a dataframe, this function identifies values that have string type and can be interpreted as N/A, and replaces them with actual NA’s.

Parameters:

df: pd.DataFrame: Dataframe to process

Returns:

pd.DataFrame: The dataframe with proper N/A values.