8.1.3. pelicun.base

Constants, basic classes, and methods for pelicun.

Functions

check_if_str_is_na(string)

Check if the provided string can be interpreted as N/A.

control_warnings()

Turn warnings on/off.

convert_dtypes(dataframe)

Convert columns to a numeric datatype whenever possible.

convert_to_MultiIndex()

Convert the index of a DataFrame to a MultiIndex.

convert_to_SimpleIndex()

Convert the index of a DataFrame to a simple, one-level index.

convert_units(values, unit, to_unit[, category])

Convert numeric values between different units.

dedupe_index(dataframe[, dtype])

Add a uid level to the index.

describe(data[, percentiles])

Extend descriptive statistics.

dict_raise_on_duplicates(ordered_pairs)

Construct a dictionary from a list of key-value pairs.

ensure_value(value)

Ensure a variable is not None.

float_or_None(string)

Convert strings to float or None.

get(d, path[, default])

Path-like dictionary value retrieval.

int_or_None(string)

Convert strings to int or None.

invert_mapping(original_dict)

Inverts a dictionary mapping from key to list of values.

is_specified(d, path)

Opposite of is_unspecified().

is_unspecified(d, path)

Check if something is unspecified.

load_default_options()

Load the default_config.json file to set options to default values.

merge_default_config(user_config)

Merge default config with user's options.

multiply_factor_multiple_levels(df, ...[, ...])

Multiply selected rows by a factor, in place.

parse_units([custom_file, preserve_categories])

Parse the unit conversion factor JSON file and return a dictionary.

show_matrix(data, *[, use_describe])

Print a matrix in a nice way using a DataFrame.

split_file_name(file_path)

Separate a file name from the extension.

str2bool(v)

Convert a string representation of truth to boolean True or False.

stringterpolation(arguments)

Linear interpolation from strings.

update(d, path, value, *[, ...])

Set a value in a nested dictionary using a path with '/' as the separator.

update_vals(update_value, primary, ...)

Transfer values between nested dictionaries.

with_parsed_str_na_values(df)

Identify string values interpretable as N/A.

Classes

Logger(log_file, *, verbose, log_show_ms, ...)

Generate log files documenting execution events.

LoggerRegistry()

Registry to manage all logger instances.

Options(user_config_options[, assessment])

Analysis options and logging configuration.

class pelicun.base.Logger(log_file: str | None, *, verbose: bool, log_show_ms: bool, print_log: bool)

Generate log files documenting execution events.

add_warning(msg: str) -> None

Add a warning to the warning stack.

Parameters:
msg: str

The warning message.

Notes

Warnings are only emitted when emit_warnings is called.

div(*, prepend_timestamp: bool = False) -> None

Add a divider line in the log file.

emit_warnings() -> None

Issues all warnings and clears the warning stack.
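
The deferred behavior of add_warning and emit_warnings can be sketched as follows. This is a minimal illustration and not pelicun's implementation: the class name WarningStack is hypothetical, and the sketch assumes that messages are issued through Python's standard warnings module.

```python
import warnings


class WarningStack:
    """Hypothetical stand-in that mimics add_warning / emit_warnings."""

    def __init__(self) -> None:
        self.stack: list[str] = []

    def add_warning(self, msg: str) -> None:
        # Store the message; nothing is shown to the user yet.
        self.stack.append(msg)

    def emit_warnings(self) -> None:
        # Issue every stored message, then clear the stack.
        for msg in self.stack:
            warnings.warn(msg, stacklevel=2)
        self.stack.clear()


log = WarningStack()
log.add_warning("unit not specified")
log.add_warning("sample size is small")

# Record the warnings instead of printing them, so the effect is visible.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    log.emit_warnings()
```

Collecting messages first keeps related warnings together in the output, which is presumably why the two steps are separate methods.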

msg(msg: str = '', *, prepend_timestamp: bool = True, prepend_blank_space: bool = True) -> None

Write a message in the log file with the current time as prefix.

The time is in ISO-8601 format, e.g. 2018-06-16T20:24:04Z

Parameters:
msg: string

Message to print.

prepend_timestamp: bool

Controls whether a timestamp is placed before the message.

prepend_blank_space: bool

Controls whether blank space is placed before the message.
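
The timestamp format shown above can be reproduced with the standard library. The helper below is a hypothetical sketch that only builds the text of one log line. How the two flags interact, and how wide the blank padding is, are assumptions rather than documented behavior.

```python
from datetime import datetime, timezone
from typing import Optional

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # ISO-8601, e.g. 2018-06-16T20:24:04Z


def format_log_line(
    msg: str = "",
    *,
    prepend_timestamp: bool = True,
    prepend_blank_space: bool = True,
    now: Optional[datetime] = None,
) -> str:
    """Hypothetical helper: build one log line without writing it anywhere."""
    if now is None:
        now = datetime.now(timezone.utc)
    stamp = now.strftime(TIME_FORMAT)
    if prepend_timestamp:
        return f"{stamp} {msg}"
    if prepend_blank_space:
        # Assumption: padding matches the timestamp width so the text aligns.
        return " " * (len(stamp) + 1) + msg
    return msg


fixed = datetime(2018, 6, 16, 20, 24, 4, tzinfo=timezone.utc)
line = format_log_line("Assessment started", now=fixed)
continuation = format_log_line("second line", prepend_timestamp=False, now=fixed)
```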

print_system_info() -> None

Write system information in the log.

reset_log_strings() -> None

Populate the string-related attributes of the logger.

warning(msg: str) -> None

Add and emit a warning immediately.

Parameters:
msg: str

Warning message

class pelicun.base.LoggerRegistry

Registry to manage all logger instances.

classmethod log_exception(exc_type: type[BaseException], exc_value: BaseException, exc_traceback: TracebackType | None) -> None

Log exceptions to all registered loggers.

classmethod register(logger: Logger) -> None

Register a logger instance.

class pelicun.base.Options(user_config_options: dict[str, Any] | None, assessment: AssessmentBase | None = None)

Analysis options and logging configuration.

Attributes:
sampling_method: str

Sampling method to use. Specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code). Can be any of [‘LHS’, ‘LHS_midpoint’, ‘MonteCarlo’]. The default is ‘LHS’.

units_file: str

Location of a user-specified units file, which should contain the names of supported units and their conversion factors (the value by which a quantity in a given unit must be multiplied to be expressed in the base units). Value specified in the user configuration dictionary. Pelicun comes with a set of default units which are always loaded (see settings/default_units.json in the pelicun source code). Units specified in the units_file overwrite the default units.

demand_offset: dict

Demand offsets are used in the process of mapping a component location to its associated EDP. This allows components that are sensitive to EDPs of different levels to be specified as present at the same location (e.g., a desktop computer and a suspended ceiling, both at the same story). Each component’s offset value is specified in the component fragility database. This setting applies a supplemental global offset to specific EDP types. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).

nondir_multi_dict: dict

Nondirectional components are sensitive to demands coming in any direction. Results are typically available in two orthogonal directions. FEMA P-58 suggests using the formula max(dir_1, dir_2) * 1.2 to estimate the demand for such components. This parameter allows modifying the 1.2 multiplier with a user-specified value. The change can be applied to “ALL” EDPs, or for specific EDPs, such as “PFA”, “PFV”, etc. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).

rho_cost_time: float

Specifies the correlation between the repair cost and repair time consequences. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see “RepairCostAndTimeCorrelation” in settings/default_config.json in the pelicun source code).

eco_scale: dict

Controls how the effects of economies of scale are handled in the damaged component quantity aggregation for loss measure estimation. The dictionary is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).

log: Logger

Logger object. Configuration parameters coming from the user’s configuration dictionary or the default configuration file control logging behavior. See Logger class.
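
The FEMA P-58 rule quoted under nondir_multi_dict, max(dir_1, dir_2) * 1.2, can be sketched in a few lines. The function names and the example multiplier dictionary below are hypothetical. In particular, letting a specific EDP entry take precedence over "ALL" is an assumption, not documented behavior.

```python
def nondirectional_demand(dir_1: float, dir_2: float, multiplier: float = 1.2) -> float:
    """Estimate a nondirectional demand as max(dir_1, dir_2) * multiplier."""
    return max(dir_1, dir_2) * multiplier


# Hypothetical per-EDP multipliers, in the spirit of nondir_multi_dict.
multipliers = {"ALL": 1.2, "PFA": 1.0}


def multiplier_for(edp_type: str) -> float:
    # Assumption: a specific EDP entry overrides the "ALL" entry.
    return multipliers.get(edp_type, multipliers["ALL"])


# Peak floor acceleration uses the PFA-specific multiplier of 1.0.
pfa = nondirectional_demand(0.50, 0.40, multiplier_for("PFA"))
# Peak interstory drift has no specific entry and falls back to "ALL".
pid = nondirectional_demand(0.010, 0.015, multiplier_for("PID"))
```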

property rng: Generator

rng property.

Returns:
Generator

Random generator

property seed: float | None

Seed property.

Returns:
float

Seed value

pelicun.base.check_if_str_is_na(string: Any) -> bool

Check if the provided string can be interpreted as N/A.

Parameters:
string: object

The string to evaluate

Returns:
bool

The evaluation result. True if the string is considered N/A.

pelicun.base.control_warnings() -> None

Turn warnings on/off.

See also: pelicun/pytest.ini. Developers: make sure to update that file when addressing and eliminating warnings.

pelicun.base.convert_dtypes(dataframe: DataFrame) -> DataFrame

Convert columns to a numeric datatype whenever possible.

The function replaces None with NA; otherwise, columns containing None would continue to have the object type.

Parameters:
dataframe: DataFrame

The DataFrame that will be modified.

Returns:
DataFrame

The modified DataFrame.

pelicun.base.convert_to_MultiIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) -> DataFrame
pelicun.base.convert_to_MultiIndex(data: Series, axis: int = 0, *, inplace: bool = False) -> Series

Convert the index of a DataFrame to a MultiIndex.

We assume that the index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is expected to separate each level of the index.

Parameters:
data: DataFrame

The DataFrame that will be modified.

axis: int, optional, default:0

Identifies if the index (0) or the columns (1) shall be edited.

inplace: bool, optional, default:False

If True, the operation is performed directly on the input DataFrame and not on a copy of it.

Returns:
DataFrame

The modified DataFrame.

Raises:
ValueError

If an invalid axis is specified.

pelicun.base.convert_to_SimpleIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) -> DataFrame
pelicun.base.convert_to_SimpleIndex(data: Series, axis: int = 0, *, inplace: bool = False) -> Series

Convert the index of a DataFrame to a simple, one-level index.

The target index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is used to separate each level of the index.

Parameters:
data: DataFrame

The DataFrame that will be modified.

axis: int, optional, default:0

Identifies if the index (0) or the columns (1) shall be edited.

inplace: bool, optional, default:False

If True, the operation is performed directly on the input DataFrame and not on a copy of it.

Returns:
DataFrame

The modified DataFrame

Raises:
ValueError

When an invalid axis parameter is specified
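
The dash-separated labeling convention shared by convert_to_MultiIndex and convert_to_SimpleIndex can be illustrated without pandas. The sketch below works on plain lists of labels rather than on a DataFrame index, and both helper names are hypothetical. It only shows how one-level labels and multi-level tuples correspond to each other.

```python
def split_label(label: str) -> tuple:
    """Split a simple label such as 'PFA-1-1' into its levels."""
    return tuple(label.split("-"))


def join_levels(levels) -> str:
    """Join the levels of a multi-level label with dashes."""
    return "-".join(str(level) for level in levels)


simple_index = ["PFA-1-1", "PFA-1-2", "PID-1-1"]

# Simple labels -> tuples, one entry per index level.
multi_index = [split_label(label) for label in simple_index]

# Tuples -> simple labels; this recovers the original labels.
round_trip = [join_levels(levels) for levels in multi_index]
```

A consequence of this convention is that a level value containing a dash cannot be represented unambiguously in the simple form.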

pelicun.base.convert_units(values: float | list[float] | ndarray, unit: str, to_unit: str, category: str | None = None) -> float | list[float] | ndarray

Convert numeric values between different units.

Supports conversion within a specified category of units and automatically infers the category if not explicitly provided. It maintains the type of the input in the output.

Parameters:
values: (float | list[float] | np.ndarray)

The numeric value(s) to convert.

unit: (str)

The current unit of the values.

to_unit: (str)

The target unit to convert the values into.

category: (Optional[str])

The category of the units (e.g., ‘length’, ‘pressure’). If not provided, the category will be inferred based on the provided units.

Returns:
float or list[float] or np.ndarray

The converted value(s) in the target unit, in the same data type as the input values.

Raises:
TypeError

If the input values are not of type float, list, or np.ndarray.

ValueError

If the unit, to_unit, or category is unknown or if unit and to_unit are not in the same category.
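
The conversion logic described above can be sketched with a small factor table. Everything in this example is an assumption made for illustration: the FACTORS dictionary, its unit names, and the function name convert are hypothetical, and NumPy arrays are left out to keep the sketch dependency-free. Each factor is the number a value must be multiplied by to reach the base unit of its category.

```python
# Hypothetical factors: multiply by the factor to reach the base unit.
FACTORS = {
    "length": {"m": 1.0, "cm": 0.01, "mm": 0.001},
    "time": {"sec": 1.0, "minute": 60.0},
}


def convert(values, unit, to_unit, category=None):
    if category is None:
        # Infer the category from the source unit.
        matches = [name for name, units in FACTORS.items() if unit in units]
        if not matches:
            raise ValueError(f"Unknown unit: {unit}")
        category = matches[0]
    elif category not in FACTORS:
        raise ValueError(f"Unknown category: {category}")
    units = FACTORS[category]
    if unit not in units or to_unit not in units:
        raise ValueError(f"{unit} and {to_unit} are not both in {category}")
    # Source unit -> base unit -> target unit, combined into one scale.
    scale = units[unit] / units[to_unit]
    # Preserve the input type in the output.
    if isinstance(values, list):
        return [value * scale for value in values]
    if isinstance(values, float):
        return values * scale
    raise TypeError("values must be a float or a list of floats")


minutes = convert([120.0, 30.0], "sec", "minute")
millimeters = convert(2.5, "m", "mm", category="length")
```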

pelicun.base.dedupe_index(dataframe: DataFrame, dtype: type = str) -> DataFrame

Add a uid level to the index.

Modifies the index of a DataFrame to ensure all index elements are unique by adding an extra level. Assumes that the DataFrame’s original index is a MultiIndex with specified names. A unique identifier (‘uid’) is added as an additional index level based on the cumulative count of occurrences of the original index combinations.

Parameters:
dataframe: pd.DataFrame

The DataFrame whose index is to be modified. It must have a MultiIndex.

dtype: type, optional

The data type for the new index level ‘uid’. Defaults to str.

Returns:
dataframe: pd.DataFrame

The original dataframe with an additional uid level at the index.
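
The cumulative-count idea behind the uid level can be sketched on plain tuples. This is not the pandas-based implementation: the helper name add_uid is hypothetical, and zero-based counting is an assumption (it matches how a pandas groupby cumulative count behaves).

```python
from collections import defaultdict


def add_uid(index_rows, dtype=str):
    """Append a running count per repeated row, analogous to a 'uid' level."""
    seen = defaultdict(int)
    result = []
    for row in index_rows:
        # The count of earlier identical rows becomes the extra level.
        result.append((*row, dtype(seen[row])))
        seen[row] += 1
    return result


# The first two rows are identical, so the original index is not unique.
rows = [("B.10", "1", "1"), ("B.10", "1", "1"), ("C.20", "2", "1")]
deduped = add_uid(rows)
```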

pelicun.base.describe(data: DataFrame | Series | ndarray, percentiles: tuple[float, ...] = (0.001, 0.023, 0.1, 0.159, 0.5, 0.841, 0.9, 0.977, 0.999)) -> DataFrame

Extend descriptive statistics.

Provides extended descriptive statistics for given data, including percentiles and log standard deviation for applicable columns. This function accepts both pandas Series and DataFrame objects directly, or any array-like structure which can be converted to them. It calculates common descriptive statistics and optionally adds log standard deviation for columns where all values are positive.

Parameters:
data: pd.Series, pd.DataFrame, or array-like

The data to describe. If array-like, it is converted to a DataFrame or Series before analysis.

percentiles: tuple of float, optional

Specific percentiles to include in the output. Default includes an extensive range tailored to provide a detailed summary.

Returns:
pd.DataFrame

A DataFrame containing the descriptive statistics of the input data, transposed so that each descriptive statistic is a row.

pelicun.base.dict_raise_on_duplicates(ordered_pairs: list[tuple]) -> dict

Construct a dictionary from a list of key-value pairs.

Constructs a dictionary from a list of key-value pairs, raising an exception if duplicate keys are found. This function ensures that no two pairs have the same key. It is particularly useful when parsing JSON-like data where unique keys are expected but not enforced by standard parsing methods.

Parameters:
ordered_pairs: list of tuples

A list of tuples, each containing a key and a value. Keys are expected to be unique across the list.

Returns:
dict

A dictionary constructed from the ordered_pairs without any duplicates.

Raises:
ValueError

If a duplicate key is found in the input list, a ValueError is raised with a message indicating the duplicate key.

Notes

This implementation is useful for contexts in which data integrity is crucial and key uniqueness must be ensured.

Examples

>>> dict_raise_on_duplicates(
...     [("key1", "value1"), ("key2", "value2"), ("key1", "value3")]
... )
ValueError: duplicate key: key1
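
A function with this signature fits the object_pairs_hook argument of the standard json module, which by default silently keeps the last value of a repeated key. The sketch below uses a hypothetical stand-in, raise_on_duplicates, that follows the behavior documented above. Whether pelicun itself wires the function into json loading this way is an assumption.

```python
import json


def raise_on_duplicates(ordered_pairs):
    """Stand-in with the documented behavior (the name is hypothetical)."""
    result = {}
    for key, value in ordered_pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key}")
        result[key] = value
    return result


# Unique keys parse as usual.
clean = json.loads('{"a": 1, "b": 2}', object_pairs_hook=raise_on_duplicates)

# A repeated key is rejected instead of being silently overwritten.
try:
    json.loads('{"a": 1, "a": 2}', object_pairs_hook=raise_on_duplicates)
    message = None
except ValueError as error:
    message = str(error)
```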

pelicun.base.ensure_value(value: T | None) -> T

Ensure a variable is not None.

This function checks that the provided variable is not None. It is used to assist with type hinting by avoiding repetitive assert value is not None statements throughout the code.

Parameters:
value: Optional[T]

The variable to check, which can be of any type or None.

Returns:
T

The same variable, guaranteed to be non-None.

Raises:
TypeError

If the provided variable is None.
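
The type-narrowing role of such a helper can be sketched with a TypeVar. The names ensure and lookup_sample_size are hypothetical, and the error message is an assumption. The point is that a static type checker sees the result as T rather than Optional[T].

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def ensure(value: Optional[T]) -> T:
    """Return the value unchanged, or raise if it is None."""
    if value is None:
        raise TypeError("Unexpected None value")
    return value


def lookup_sample_size(settings: dict) -> Optional[int]:
    # Hypothetical lookup that may legitimately return None.
    return settings.get("sample_size")


# Without ensure(), a type checker would flag this annotation,
# because the lookup returns Optional[int].
sample_size: int = ensure(lookup_sample_size({"sample_size": 1000}))
```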

pelicun.base.float_or_None(string: str) -> float | None

Convert strings to float or None.

Parameters:
string: str

A string

Returns:
float or None

A float, if the given string can be converted to a float. Otherwise, it returns None
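
The convert-or-None pattern, which int_or_None shares, can be sketched with a try/except around the built-in constructors. The helper names are hypothetical, and the handling of edge cases (for example strings such as "nan", which float() accepts) is not specified in this reference, so the sketch simply follows Python's own parsing rules.

```python
from typing import Optional


def to_float_or_none(string: str) -> Optional[float]:
    try:
        return float(string)
    except ValueError:
        # The string is not a valid float literal.
        return None


def to_int_or_none(string: str) -> Optional[int]:
    try:
        return int(string)
    except ValueError:
        # The string is not a valid integer literal.
        return None


parsed = [to_float_or_none(text) for text in ["1.5", "3", "n/a"]]
```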

pelicun.base.get(d: dict | None, path: str, default: Any | None = None) -> Any

Path-like dictionary value retrieval.

Retrieves a value from a nested dictionary using a path with ‘/’ as the separator.

Parameters:
d: dict

The dictionary to search.

path: str

The path to the desired value, with keys separated by ‘/’.

default: Any, optional

The value to return if the path is not found. Defaults to None.

Returns:
Any

The value found at the specified path, or the default value if the path is not found.

Examples

>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value"
...             }
...         }
...     }
... }
>>> get(config, '/DL/Outputs/Format/JSON', default='default_value')
'desired_value'
>>> get(config, '/DL/Outputs/Format/XML', default='default_value')
'default_value'

pelicun.base.int_or_None(string: str) -> int | None

Convert strings to int or None.

Parameters:
string: str

A string

Returns:
int or None

An int, if the given string can be converted to an int. Otherwise, it returns None

pelicun.base.invert_mapping(original_dict: dict) -> dict

Inverts a dictionary mapping from key to list of values.

Parameters:
original_dict: dict

Dictionary with values that are lists of hashable items.

Returns:
dict

New dictionary where each item in the original value lists becomes a key and the original key becomes the corresponding value.

Raises:
ValueError

If any value in the original dictionary’s value lists appears more than once.
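
The inversion described above can be sketched as a nested loop. The function name invert, the error message, and the example data are hypothetical. The sketch only illustrates the documented input and output shapes and the duplicate check.

```python
def invert(original: dict) -> dict:
    inverted = {}
    for key, values in original.items():
        for value in values:
            if value in inverted:
                # The same item is listed more than once, so the
                # inverse mapping would be ambiguous.
                raise ValueError(f"Cannot invert mapping: duplicate value {value!r}")
            inverted[value] = key
    return inverted


groups = {"structural": ["B.10", "B.20"], "nonstructural": ["C.10"]}
lookup = invert(groups)
```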

pelicun.base.is_specified(d: dict[str, Any], path: str) -> bool

Opposite of is_unspecified().

Parameters:
d: dict

The dictionary to search.

path: str

The path to the desired value, with keys separated by ‘/’.

Returns:
bool

True if the value is specified, False otherwise.

pelicun.base.is_unspecified(d: dict[str, Any], path: str) -> bool

Check if something is unspecified.

Checks if a value in a nested dictionary is either non-existent, None, NaN, or an empty dictionary or list.

Parameters:
d: dict

The dictionary to search.

path: str

The path to the desired value, with keys separated by ‘/’.

Returns:
bool

True if the value is non-existent, None, NaN, or an empty dictionary or list. False otherwise.

Examples

>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value",
...                 "EmptyDict": {}
...             }
...         }
...     }
... }
>>> is_unspecified(config, '/DL/Outputs/Format/JSON')
False
>>> is_unspecified(config, '/DL/Outputs/Format/XML')
True
>>> is_unspecified(config, '/DL/Outputs/Format/EmptyDict')
True

pelicun.base.load_default_options() -> dict

Load the default_config.json file to set options to default values.

Returns:
dict

Default options

pelicun.base.merge_default_config(user_config: dict | None) -> dict

Merge default config with user’s options.

Merge the user-specified config with the configuration defined in the default_config.json file. If the user-specified config does not include some option available in the default options, then the default option is used in the merged config.

Parameters:
user_config: dict

User-specified configuration dictionary

Returns:
dict

Merged configuration dictionary
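
The merge rule (user values win, missing options fall back to the defaults) can be sketched as follows. The function name, the option keys, and the recursive handling of nested dictionaries are all assumptions made for illustration. The real default options live in default_config.json and are not reproduced here.

```python
import copy


def merge_with_defaults(user, defaults):
    # A missing user configuration behaves like an empty one.
    merged = copy.deepcopy(user) if user else {}
    for key, default_value in defaults.items():
        if key not in merged:
            # Option not set by the user: take the default.
            merged[key] = copy.deepcopy(default_value)
        elif isinstance(merged[key], dict) and isinstance(default_value, dict):
            # Both sides are dictionaries: merge them level by level.
            merged[key] = merge_with_defaults(merged[key], default_value)
    return merged


# Hypothetical option names, not the actual pelicun configuration keys.
defaults = {"sampling": {"method": "LHS", "size": 100}, "verbose": False}
user = {"sampling": {"size": 500}}
merged = merge_with_defaults(user, defaults)
```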

pelicun.base.multiply_factor_multiple_levels(df: DataFrame, conditions: dict, factor: float, axis: int = 0, *, raise_missing: bool = True) -> None

Multiply selected rows by a factor, in place.

Multiplies selected rows of a DataFrame that is indexed with a hierarchical index (pd.MultiIndex) by a factor. The change is done in place.

Parameters:
df: pd.DataFrame

The DataFrame to be modified.

conditions: dict

A dictionary mapping each level name to a single value. Only the rows where the index levels have the provided values will be affected. The dictionary can be empty, in which case all rows will be affected, or contain only some levels and values, in which case only the matching rows will be affected.

factor: float

Scaling factor to use.

axis: int

With 0 the condition is checked against the DataFrame’s index, otherwise with 1 it is checked against the DataFrame’s columns.

raise_missing: bool

Raise an error if no rows match the given conditions.

Raises:
ValueError

If the provided axis value is neither 0 nor 1.

ValueError

If there are no rows matching the conditions and raise_missing is True.

pelicun.base.parse_units(custom_file: str | None = None, *, preserve_categories: bool = False) -> dict

Parse the unit conversion factor JSON file and return a dictionary.

Parameters:
custom_file: str, optional

If a custom file is provided, only the units specified in the custom file are used.

preserve_categories: bool, optional

If True, maintains the original data types of category values from the JSON file. If False, converts all values to floats and flattens the dictionary structure, ensuring that each unit name is globally unique across categories.

Returns:
dict

A dictionary where keys are unit names and values are their corresponding conversion factors. If preserve_categories is True, the dictionary may maintain its original nested structure based on the JSON file. If preserve_categories is False, the dictionary is flattened to have globally unique unit names.

pelicun.base.show_matrix(data: ndarray | DataFrame, *, use_describe: bool = False) -> None

Print a matrix in a nice way using a DataFrame.

Parameters:
data: array-like

The matrix data to display. Can be any array-like structure that pandas can convert to a DataFrame.

use_describe: bool, default: False

If True, provides a descriptive statistical summary of the matrix including specified percentiles. If False, simply prints the matrix as is.

pelicun.base.split_file_name(file_path: str) -> tuple[str, str]

Separate a file name from the extension.

Separates a file name from the extension accounting for the case where the file name itself contains periods.

Parameters:
file_path: str

Original file path.

Returns:
tuple
name: str

Name of the file.

extension: str

File extension.
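
Splitting on the last period only, so that periods inside the file name are preserved, is what pathlib does. The sketch below uses a hypothetical helper, split_name. Whether the actual function keeps the leading dot in the extension or drops the directory part in the same way is an assumption.

```python
from pathlib import Path


def split_name(file_path: str) -> tuple:
    path = Path(file_path)
    # stem: final path component without its last suffix.
    # suffix: last extension, including the leading dot.
    return path.stem, path.suffix


name, extension = split_name("results/DMG_sample.v2.csv")
```

With this input, name is "DMG_sample.v2" and extension is ".csv": the period inside the name is left untouched.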

pelicun.base.str2bool(v: str | bool) -> bool

Convert a string representation of truth to boolean True or False.

This function is designed to convert string inputs that represent boolean values into actual Python boolean types. It handles typical representations of truthiness and falsiness, and is case insensitive.

Parameters:
v: str or bool

The value to convert into a boolean. This can be a boolean itself (in which case it is simply returned) or a string that is expected to represent a boolean value.

Returns:
bool

The boolean value corresponding to the input.

Raises:
argparse.ArgumentTypeError

If v is a string that does not correspond to a boolean value, an error is raised indicating that a boolean value was expected.
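
Raising argparse.ArgumentTypeError suggests the converter is meant to be used as a type callable in a command-line parser. The sketch below shows that pattern with a hypothetical stand-in, to_bool. The accepted spellings and the --verbose flag are assumptions, since this reference does not list them.

```python
import argparse

TRUE_STRINGS = {"yes", "true", "t", "y", "1"}   # assumed accepted spellings
FALSE_STRINGS = {"no", "false", "f", "n", "0"}  # assumed accepted spellings


def to_bool(v):
    if isinstance(v, bool):
        # Booleans pass through unchanged.
        return v
    lowered = v.lower()  # case-insensitive comparison
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")


# Hypothetical command-line flag that takes an explicit true/false value.
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", type=to_bool, default=True)
args = parser.parse_args(["--verbose", "False"])
```

Using a converter avoids the common pitfall of type=bool, where any non-empty string, including "False", evaluates to True.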

pelicun.base.stringterpolation(arguments: str) -> Callable[[np.ndarray], np.ndarray]

Linear interpolation from strings.

Turns a string of specially formatted arguments into a multilinear interpolating function.

Parameters:
arguments: str

String of arguments containing Y values and X values, separated by a pipe symbol (|). Individual values are separated by commas (,). Example: arguments = ‘y1,y2,y3|x1,x2,x3’

Returns:
Callable

A callable interpolating function
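
The string format can be illustrated with a pure-Python sketch that parses 'y1,y2,y3|x1,x2,x3' and returns a piecewise-linear function. It is not the library implementation: the name make_interpolator is hypothetical, the function works on scalars instead of NumPy arrays, X values are assumed to be sorted in ascending order, and clamping outside the X range is an assumption.

```python
from bisect import bisect_right


def make_interpolator(arguments: str):
    # Y values come first, then X values, separated by a pipe.
    y_text, x_text = arguments.split("|")
    ys = [float(v) for v in y_text.split(",")]
    xs = [float(v) for v in x_text.split(",")]
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")

    def interpolate(x: float) -> float:
        # Assumption: values outside the X range are clamped to the end points.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        # Find the segment [xs[i - 1], xs[i]] that contains x.
        i = bisect_right(xs, x)
        x0, x1, y0, y1 = xs[i - 1], xs[i], ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return interpolate


f = make_interpolator("10,20,40|1,2,4")
halfway = f(1.5)  # between (1, 10) and (2, 20)
```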

pelicun.base.update(d: dict[str, Any], path: str, value: Any, *, only_if_empty_or_none: bool = False) -> None

Set a value in a nested dictionary using a path with ‘/’ as the separator.

Parameters:
d: dict

The dictionary to update.

path: str

The path to the desired value, with keys separated by ‘/’.

value: Any

The value to set at the specified path.

only_if_empty_or_none: bool, optional

If True, only update the value if it is None or an empty dictionary. Defaults to False.

Examples

>>> d = {}
>>> update(d, 'x/y/z', 1)
>>> d
{'x': {'y': {'z': 1}}}
>>> update(d, 'x/y/z', 2, only_if_empty_or_none=True)
>>> d
{'x': {'y': {'z': 1}}}  # value remains 1 since it is not empty or None
>>> update(d, 'x/y/z', 2)
>>> d
{'x': {'y': {'z': 2}}}  # value is updated to 2

pelicun.base.update_vals(update_value: dict, primary: dict, update_path: str, primary_path: str) -> None

Transfer values between nested dictionaries.

Updates the values of the update nested dictionary with those provided in the primary nested dictionary. If a key already exists in update, and does not map to another dictionary, the value is left unchanged.

Parameters:
update_value: dict

Dictionary -which can contain nested dictionaries- to be updated based on the values of primary. New keys existing in primary are added to update. Values whose keys already exist in update are left unchanged.

primary: dict

Dictionary -which can contain nested dictionaries- to be used to update the values of update.

update_path: str

Identifier for the update dictionary. Used to make error messages more meaningful.

primary_path: str

Identifier for the primary dictionary. Used to make error messages more meaningful.

Raises:
ValueError

If primary[key] is dict but update[key] is not.

ValueError

If update[key] is dict but primary[key] is not.

pelicun.base.with_parsed_str_na_values(df: DataFrame) -> DataFrame

Identify string values interpretable as N/A.

Given a dataframe, this function identifies values that have string type and can be interpreted as N/A, and replaces them with actual NA’s.

Parameters:
df: pd.DataFrame

Dataframe to process

Returns:
pd.DataFrame

The dataframe with proper N/A values.