8.1.3. pelicun.base
Constants, basic classes, and methods for pelicun.
Functions
check_if_str_is_na | Check if the provided string can be interpreted as N/A.
control_warnings | Turn warnings on/off.
convert_dtypes | Convert columns to a numeric datatype whenever possible.
convert_to_MultiIndex | Convert the index of a DataFrame to a MultiIndex.
convert_to_SimpleIndex | Convert the index of a DataFrame to a simple, one-level index.
convert_units | Convert numeric values between different units.
dedupe_index | Add a uid level to the index.
describe | Extend descriptive statistics.
dict_raise_on_duplicates | Construct a dictionary from a list of key-value pairs.
ensure_value | Ensure a variable is not None.
float_or_None | Convert strings to float or None.
get | Path-like dictionary value retrieval.
int_or_None | Convert strings to int or None.
invert_mapping | Inverts a dictionary mapping from key to list of values.
is_specified | Opposite of is_unspecified().
is_unspecified | Check if something is unspecified.
load_default_options | Load the default_config.json file to set options to default values.
merge_default_config | Merge default config with user's options.
multiply_factor_multiple_levels | Multiply selected rows by a factor, in place.
parse_units | Parse the unit conversion factor JSON file and return a dictionary.
show_matrix | Print a matrix in a nice way using a DataFrame.
split_file_name | Separate a file name from the extension.
str2bool | Convert a string representation of truth to boolean True or False.
stringterpolation | Linear interpolation from strings.
update | Set a value in a nested dictionary using a path with '/' as the separator.
update_vals | Transfer values between nested dictionaries.
with_parsed_str_na_values | Identify string values interpretable as N/A.
Classes
Logger | Generate log files documenting execution events.
LoggerRegistry | Registry to manage all logger instances.
Options | Analysis options and logging configuration.
- class pelicun.base.Logger(log_file: str | None, *, verbose: bool, log_show_ms: bool, print_log: bool)[source]
Generate log files documenting execution events.
- add_warning(msg: str) None [source]
Add a warning to the warning stack.
- Parameters:
- msg: str
The warning message.
Notes
Warnings are only emitted when emit_warnings is called.
- msg(msg: str = '', *, prepend_timestamp: bool = True, prepend_blank_space: bool = True) None [source]
Write a message in the log file with the current time as prefix.
The time is in ISO-8601 format, e.g. 2018-06-16T20:24:04Z
- Parameters:
- msg: string
Message to print.
- prepend_timestamp: bool
Controls whether a timestamp is placed before the message.
- prepend_blank_space: bool
Controls whether blank space is placed before the message.
- class pelicun.base.LoggerRegistry[source]
Registry to manage all logger instances.
- classmethod log_exception(exc_type: type[BaseException], exc_value: BaseException, exc_traceback: TracebackType | None) None [source]
Log exceptions to all registered loggers.
- class pelicun.base.Options(user_config_options: dict[str, Any] | None, assessment: AssessmentBase | None = None)[source]
Analysis options and logging configuration.
- Attributes:
- sampling_method: str
Sampling method to use. Specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code). Can be any of [‘LHS’, ‘LHS_midpoint’, ‘MonteCarlo’]. The default is ‘LHS’.
- units_file: str
Location of a user-specified units file, which should contain the names of supported units and their conversion factors (the value by which a quantity in a given unit must be multiplied to be expressed in the base units). Value specified in the user configuration dictionary. Pelicun comes with a set of default units which are always loaded (see settings/default_units.json in the pelicun source code). Units specified in the units_file overwrite the default units.
- demand_offset: dict
Demand offsets are used in the process of mapping a component location to its associated EDP. This allows components that are sensitive to EDPs of different levels to be specified as present at the same location (e.g., a desktop computer and a suspended ceiling, both on the same story). Each component’s offset value is specified in the component fragility database. This setting applies a supplemental global offset to specific EDP types. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
- nondir_multi_dict: dict
Nondirectional components are sensitive to demands coming in any direction. Results are typically available in two orthogonal directions. FEMA P-58 suggests using the formula max(dir_1, dir_2) * 1.2 to estimate the demand for such components. This parameter allows modifying the 1.2 multiplier with a user-specified value. The change can be applied to “ALL” EDPs, or for specific EDPs, such as “PFA”, “PFV”, etc. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
- rho_cost_time: float
Specifies the correlation between the repair cost and repair time consequences. The value is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see “RepairCostAndTimeCorrelation” in settings/default_config.json in the pelicun source code).
- eco_scale: dict
Controls how the effects of economies of scale are handled in the damaged component quantity aggregation for loss measure estimation. The dictionary is specified in the user’s configuration dictionary, otherwise left as provided in the default configuration file (see settings/default_config.json in the pelicun source code).
- log: Logger
Logger object. Configuration parameters coming from the user’s configuration dictionary or the default configuration file control logging behavior. See Logger class.
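The nondirectional demand rule described under nondir_multi_dict can be sketched in a few lines. This is only an illustration of the max(dir_1, dir_2) * multiplier formula, not pelicun's implementation: the function name and the rule that an EDP-specific entry takes precedence over "ALL" are assumptions made for the example.

```python
def nondirectional_demand(dir_1, dir_2, edp_type, multipliers):
    """Estimate the demand for a nondirectional component (illustrative only)."""
    # Assumption: an EDP-specific multiplier wins over the "ALL" entry.
    factor = multipliers.get(edp_type, multipliers["ALL"])
    return max(dir_1, dir_2) * factor


# Hypothetical settings: 1.2 for every EDP type except PFA.
multipliers = {"ALL": 1.2, "PFA": 1.0}

pfv_demand = nondirectional_demand(0.5, 0.75, "PFV", multipliers)
pfa_demand = nondirectional_demand(0.5, 0.75, "PFA", multipliers)
```

With these hypothetical settings, the PFV demand is scaled by 1.2 while the PFA demand is just the larger of the two directional values.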
- pelicun.base.check_if_str_is_na(string: Any) bool [source]
Check if the provided string can be interpreted as N/A.
- Parameters:
- string: object
The string to evaluate
- Returns:
- bool
The evaluation result: True if the string is considered N/A, False otherwise.
- pelicun.base.control_warnings() None [source]
Turn warnings on/off.
See also: pelicun/pytest.ini. Devs: make sure to update that file when addressing & eliminating warnings.
- pelicun.base.convert_dtypes(dataframe: DataFrame) DataFrame [source]
Convert columns to a numeric datatype whenever possible.
The function replaces None with NA; otherwise, columns containing None would continue to have the object type.
- Parameters:
- dataframe: DataFrame
The DataFrame that will be modified.
- Returns:
- DataFrame
The modified DataFrame.
- pelicun.base.convert_to_MultiIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) DataFrame [source]
- pelicun.base.convert_to_MultiIndex(data: Series, axis: int = 0, *, inplace: bool = False) Series
Convert the index of a DataFrame to a MultiIndex.
We assume that the index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is expected to separate each level of the index.
- Parameters:
- data: DataFrame
The DataFrame that will be modified.
- axis: int, optional, default:0
Identifies if the index (0) or the columns (1) shall be edited.
- inplace: bool, optional, default:False
If True, the operation is performed directly on the input DataFrame and not on a copy of it.
- Returns:
- DataFrame
The modified DataFrame.
- Raises:
- ValueError
If an invalid axis is specified.
- pelicun.base.convert_to_SimpleIndex(data: DataFrame, axis: int = 0, *, inplace: bool = False) DataFrame [source]
- pelicun.base.convert_to_SimpleIndex(data: Series, axis: int = 0, *, inplace: bool = False) Series
Convert the index of a DataFrame to a simple, one-level index.
The target index uses standard SimCenter convention to identify different levels: a dash character (‘-’) is used to separate each level of the index.
- Parameters:
- data: DataFrame
The DataFrame that will be modified.
- axis: int, optional, default:0
Identifies if the index (0) or the columns (1) shall be edited.
- inplace: bool, optional, default:False
If True, the operation is performed directly on the input DataFrame and not on a copy of it.
- Returns:
- DataFrame
The modified DataFrame
- Raises:
- ValueError
When an invalid axis parameter is specified
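The dash-separated naming convention behind convert_to_MultiIndex and convert_to_SimpleIndex can be illustrated without pandas. The sketch below only shows how labels map to level tuples and back; the helper names and the sample labels are hypothetical, and the actual functions operate on DataFrame or Series objects.

```python
def split_label(label, sep="-"):
    """Split a simple index label into its levels."""
    return tuple(label.split(sep))


def join_levels(levels, sep="-"):
    """Join level values back into a simple, one-level label."""
    return sep.join(str(level) for level in levels)


# Hypothetical labels following the dash-separated convention.
simple_index = ["PFA-1-1", "PFA-1-2", "PID-2-1"]

multi_index = [split_label(label) for label in simple_index]
round_trip = [join_levels(levels) for levels in multi_index]
```

Because every level is joined with the same separator, the round trip is only lossless when the level values themselves contain no dash.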
- pelicun.base.convert_units(values: float | list[float] | ndarray, unit: str, to_unit: str, category: str | None = None) float | list[float] | ndarray [source]
Convert numeric values between different units.
Supports conversion within a specified category of units and automatically infers the category if not explicitly provided. It maintains the type of the input in the output.
- Parameters:
- values: (float | list[float] | np.ndarray)
The numeric value(s) to convert.
- unit: (str)
The current unit of the values.
- to_unit: (str)
The target unit to convert the values into.
- category: (Optional[str])
The category of the units (e.g., ‘length’, ‘pressure’). If not provided, the category will be inferred based on the provided units.
- Returns:
- float or list[float] or np.ndarray
The converted value(s) in the target unit, in the same data type as the input values.
- Raises:
- TypeError
If the input values are not of type float, list, or np.ndarray.
- ValueError
If the unit, to_unit, or category is unknown or if unit and to_unit are not in the same category.
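The conversion logic can be sketched as follows, assuming each unit stores a factor that converts it to the base unit of its category. The factor table, function names, and error messages are hypothetical and much smaller than pelicun's default units; np.ndarray input is omitted to keep the example free of dependencies.

```python
# Hypothetical factor table: each factor converts a unit to the base unit
# of its category (meters for length, pascals for pressure).
UNIT_FACTORS = {
    "length": {"m": 1.0, "cm": 0.01, "ft": 0.3048},
    "pressure": {"Pa": 1.0, "kPa": 1000.0},
}


def find_category(unit, to_unit):
    """Infer the category that contains both units."""
    for category, factors in UNIT_FACTORS.items():
        if unit in factors and to_unit in factors:
            return category
    raise ValueError(f"No common category for {unit!r} and {to_unit!r}")


def convert(values, unit, to_unit, category=None):
    """Convert a float or a list of floats, preserving the input type."""
    if category is None:
        category = find_category(unit, to_unit)
    if category not in UNIT_FACTORS:
        raise ValueError(f"Unknown category: {category!r}")
    factors = UNIT_FACTORS[category]
    if unit not in factors or to_unit not in factors:
        raise ValueError(f"Unknown unit in category {category!r}")
    # Go through the base unit: source unit -> base -> target unit.
    scale = factors[unit] / factors[to_unit]
    if isinstance(values, list):
        return [value * scale for value in values]
    if isinstance(values, float):
        return values * scale
    raise TypeError("values must be a float or a list of floats")


length_cm = convert(2.0, "m", "cm")
pressures_pa = convert([1.0, 2.5], "kPa", "Pa")
```

Routing every conversion through the base unit means only one factor per unit needs to be stored, rather than one per pair of units.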
- pelicun.base.dedupe_index(dataframe: DataFrame, dtype: type = str) DataFrame [source]
Add a uid level to the index.
Modifies the index of a DataFrame to ensure all index elements are unique by adding an extra level. Assumes that the DataFrame’s original index is a MultiIndex with specified names. A unique identifier (‘uid’) is added as an additional index level based on the cumulative count of occurrences of the original index combinations.
- Parameters:
- dataframe: pd.DataFrame
The DataFrame whose index is to be modified. It must have a MultiIndex.
- dtype: type, optional
The data type for the new index level ‘uid’. Defaults to str.
- Returns:
- dataframe: pd.DataFrame
The original dataframe with an additional uid level at the index.
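The cumulative-count idea behind the uid level can be shown on plain tuples. This is a sketch under assumptions, not pelicun's code: the helper name is hypothetical, the index is modeled as a list of tuples rather than a MultiIndex, and counting is assumed to start at 0.

```python
from collections import defaultdict


def add_uid(index_tuples, dtype=str):
    """Append a per-combination running counter to each index tuple."""
    seen = defaultdict(int)
    result = []
    for key in index_tuples:
        # The counter is the number of earlier occurrences of this key.
        result.append((*key, dtype(seen[key])))
        seen[key] += 1
    return result


index = [("A", "1"), ("A", "1"), ("B", "1"), ("A", "1")]
deduped = add_uid(index)
```

Every resulting tuple is unique because repeated combinations receive different counter values.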
- pelicun.base.describe(data: DataFrame | Series | ndarray, percentiles: tuple[float, ...] = (0.001, 0.023, 0.1, 0.159, 0.5, 0.841, 0.9, 0.977, 0.999)) DataFrame [source]
Extend descriptive statistics.
Provides extended descriptive statistics for given data, including percentiles and log standard deviation for applicable columns. This function accepts both pandas Series and DataFrame objects directly, or any array-like structure which can be converted to them. It calculates common descriptive statistics and optionally adds log standard deviation for columns where all values are positive.
- Parameters:
- data: pd.Series, pd.DataFrame, or array-like
The data to describe. If array-like, it is converted to a DataFrame or Series before analysis.
- percentiles: tuple of float, optional
Specific percentiles to include in the output. Default includes an extensive range tailored to provide a detailed summary.
- Returns:
- pd.DataFrame
A DataFrame containing the descriptive statistics of the input data, transposed so that each descriptive statistic is a row.
- pelicun.base.dict_raise_on_duplicates(ordered_pairs: list[tuple]) dict [source]
Construct a dictionary from a list of key-value pairs.
Constructs a dictionary from a list of key-value pairs, raising an exception if duplicate keys are found. This function ensures that no two pairs have the same key. It is particularly useful when parsing JSON-like data where unique keys are expected but not enforced by standard parsing methods.
- Parameters:
- ordered_pairs: list of tuples
A list of tuples, each containing a key and a value. Keys are expected to be unique across the list.
- Returns:
- dict
A dictionary constructed from the ordered_pairs without any duplicates.
- Raises:
- ValueError
If a duplicate key is found in the input list, a ValueError is raised with a message indicating the duplicate key.
Notes
This implementation is useful for contexts in which data integrity is crucial and key uniqueness must be ensured.
Examples
>>> dict_raise_on_duplicates(
...     [("key1", "value1"), ("key2", "value2"), ("key1", "value3")]
... )
ValueError: duplicate key: key1
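A function with this signature fits the object_pairs_hook parameter of the standard json module, which is one way to reject duplicate keys while parsing. The sketch below uses a local stand-in (raise_on_duplicates) so that it runs without pelicun; whether pelicun wires its function up this way is an assumption.

```python
import json


def raise_on_duplicates(ordered_pairs):
    """Stand-in for the documented behavior: build a dict, reject repeats."""
    result = {}
    for key, value in ordered_pairs:
        if key in result:
            raise ValueError(f"duplicate key: {key}")
        result[key] = value
    return result


# Unique keys parse as usual.
parsed = json.loads('{"a": 1, "b": 2}', object_pairs_hook=raise_on_duplicates)

# A repeated key is reported instead of being silently overwritten.
try:
    json.loads('{"a": 1, "a": 2}', object_pairs_hook=raise_on_duplicates)
except ValueError as error:
    message = str(error)
```

Without the hook, json.loads keeps only the last value for a repeated key.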
- pelicun.base.ensure_value(value: T | None) T [source]
Ensure a variable is not None.
This function checks that the provided variable is not None. It is used to assist with type hinting by avoiding repetitive assert value is not None statements throughout the code.
- Parameters:
- value: Optional[T]
The variable to check, which can be of any type or None.
- Returns:
- T
The same variable, guaranteed to be non-None.
- Raises:
- TypeError
If the provided variable is None.
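The type-narrowing pattern described here can be sketched with a generic helper. The names ensure_not_none and find_port are hypothetical stand-ins used so the example runs on its own.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def ensure_not_none(value: Optional[T]) -> T:
    """Return value unchanged, raising TypeError if it is None."""
    if value is None:
        raise TypeError("Expected a value, got None.")
    return value


def find_port(config: dict) -> Optional[int]:
    """Hypothetical lookup that may return None."""
    return config.get("port")


# A type checker sees `int` here, not `Optional[int]`.
port: int = ensure_not_none(find_port({"port": 8080}))
```

Compared with a bare assert, the check still runs when Python is started with -O, and the narrowing is expressed once in the helper's signature.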
- pelicun.base.float_or_None(string: str) float | None [source]
Convert strings to float or None.
- Parameters:
- string: str
A string
- Returns:
- float or None
A float, if the given string can be converted to a float. Otherwise, it returns None
- pelicun.base.get(d: dict | None, path: str, default: Any | None = None) Any [source]
Path-like dictionary value retrieval.
Retrieves a value from a nested dictionary using a path with ‘/’ as the separator.
- Parameters:
- d: dict
The dictionary to search.
- path: str
The path to the desired value, with keys separated by ‘/’.
- default: Any, optional
The value to return if the path is not found. Defaults to None.
- Returns:
- Any
The value found at the specified path, or the default value if the path is not found.
Examples
>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value"
...             }
...         }
...     }
... }
>>> get(config, '/DL/Outputs/Format/JSON', default='default_value')
'desired_value'
>>> get(config, '/DL/Outputs/Format/XML', default='default_value')
'default_value'
- pelicun.base.int_or_None(string: str) int | None [source]
Convert strings to int or None.
- Parameters:
- string: str
A string
- Returns:
- int or None
An int, if the given string can be converted to an int. Otherwise, it returns None
- pelicun.base.invert_mapping(original_dict: dict) dict [source]
Inverts a dictionary mapping from key to list of values.
- Parameters:
- original_dict: dict
Dictionary with values that are lists of hashable items.
- Returns:
- dict
New dictionary where each item in the original value lists becomes a key and the original key becomes the corresponding value.
- Raises:
- ValueError
If any value in the original dictionary’s value lists appears more than once.
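The inversion and its duplicate check can be sketched as follows. The function name and the sample data are hypothetical; the sketch only mirrors the behavior documented above.

```python
def invert(original):
    """Map each list item back to the key whose list contained it."""
    inverted = {}
    for key, items in original.items():
        for item in items:
            if item in inverted:
                # An item listed twice would make the inverse ambiguous.
                raise ValueError(f"{item!r} appears more than once")
            inverted[item] = key
    return inverted


# Hypothetical grouping of component names.
groups = {"structural": ["beam", "column"], "nonstructural": ["ceiling"]}
lookup = invert(groups)
```

Raising on repeated items keeps the inverse a proper one-to-one lookup from item to group.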
- pelicun.base.is_specified(d: dict[str, Any], path: str) bool [source]
Opposite of is_unspecified().
- Parameters:
- d: dict
The dictionary to search.
- path: str
The path to the desired value, with keys separated by ‘/’.
- Returns:
- bool
True if the value is specified, False otherwise.
- pelicun.base.is_unspecified(d: dict[str, Any], path: str) bool [source]
Check if something is unspecified.
Checks if a value in a nested dictionary is either non-existent, None, NaN, or an empty dictionary or list.
- Parameters:
- d: dict
The dictionary to search.
- path: str
The path to the desired value, with keys separated by ‘/’.
- Returns:
- bool
True if the value is non-existent, None, NaN, or an empty dictionary or list. False otherwise.
Examples
>>> config = {
...     "DL": {
...         "Outputs": {
...             "Format": {
...                 "JSON": "desired_value",
...                 "EmptyDict": {}
...             }
...         }
...     }
... }
>>> is_unspecified(config, '/DL/Outputs/Format/JSON')
False
>>> is_unspecified(config, '/DL/Outputs/Format/XML')
True
>>> is_unspecified(config, '/DL/Outputs/Format/EmptyDict')
True
- pelicun.base.load_default_options() dict [source]
Load the default_config.json file to set options to default values.
- Returns:
- dict
Default options
- pelicun.base.merge_default_config(user_config: dict | None) dict [source]
Merge default config with user’s options.
Merge the user-specified config with the configuration defined in the default_config.json file. If the user-specified config does not include some option available in the default options, then the default option is used in the merged config.
- Parameters:
- user_config: dict
User-specified configuration dictionary
- Returns:
- dict
Merged configuration dictionary
- pelicun.base.multiply_factor_multiple_levels(df: DataFrame, conditions: dict, factor: float, axis: int = 0, *, raise_missing: bool = True) None [source]
Multiply selected rows by a factor, in place.
Multiplies selected rows of a DataFrame that is indexed with a hierarchical index (pd.MultiIndex) by a factor. The change is done in place.
- Parameters:
- df: pd.DataFrame
The DataFrame to be modified.
- conditions: dict
A dictionary mapping level names with a single value. Only the rows where the index levels have the provided values will be affected. The dictionary can be empty, in which case all rows will be affected, or contain only some levels and values, in which case only the matching rows will be affected.
- factor: float
Scaling factor to use.
- axis: int
With 0 the condition is checked against the DataFrame’s index, otherwise with 1 it is checked against the DataFrame’s columns.
- raise_missing: bool
Raise an error if no rows match the given conditions.
- Raises:
- ValueError
If the provided axis value is neither 0 nor 1.
- ValueError
If there are no rows matching the conditions and raise_missing is True.
- pelicun.base.parse_units(custom_file: str | None = None, *, preserve_categories: bool = False) dict [source]
Parse the unit conversion factor JSON file and return a dictionary.
- Parameters:
- custom_file: str, optional
If a custom file is provided, only the units specified in the custom file are used.
- preserve_categories: bool, optional
If True, maintains the original data types of category values from the JSON file. If False, converts all values to floats and flattens the dictionary structure, ensuring that each unit name is globally unique across categories.
- Returns:
- dict
A dictionary where keys are unit names and values are their corresponding conversion factors. If preserve_categories is True, the dictionary may maintain its original nested structure based on the JSON file. If preserve_categories is False, the dictionary is flattened to have globally unique unit names.
- pelicun.base.show_matrix(data: ndarray | DataFrame, *, use_describe: bool = False) None [source]
Print a matrix in a nice way using a DataFrame.
- Parameters:
- data: array-like
The matrix data to display. Can be any array-like structure that pandas can convert to a DataFrame.
- use_describe: bool, default: False
If True, provides a descriptive statistical summary of the matrix including specified percentiles. If False, simply prints the matrix as is.
- pelicun.base.split_file_name(file_path: str) tuple[str, str] [source]
Separate a file name from the extension.
Separates a file name from the extension accounting for the case where the file name itself contains periods.
- Parameters:
- file_path: str
Original file path.
- Returns:
- tuple
- name: str
Name of the file.
- extension: str
File extension.
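The standard library's pathlib handles file names with extra periods in a similar way, which makes it a convenient illustration. This sketch is not pelicun's implementation: the helper name is hypothetical, and dropping the directory part and the leading dot of the extension are assumptions that may differ from what split_file_name returns.

```python
from pathlib import Path


def split_name(file_path):
    """Split off the last extension, keeping earlier periods in the name."""
    path = Path(file_path)
    # Assumption: return the bare file name and the extension without its dot.
    return path.stem, path.suffix.lstrip(".")


name, extension = split_name("results/run.v2.final.csv")
```

Only the text after the last period is treated as the extension, so "run.v2.final" stays intact as the name.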
- pelicun.base.str2bool(v: str | bool) bool [source]
Convert a string representation of truth to boolean True or False.
This function is designed to convert string inputs that represent boolean values into actual Python boolean types. It handles typical representations of truthiness and falsiness, and is case insensitive.
- Parameters:
- v: str or bool
The value to convert into a boolean. This can be a boolean itself (in which case it is simply returned) or a string that is expected to represent a boolean value.
- Returns:
- bool
The boolean value corresponding to the input.
- Raises:
- argparse.ArgumentTypeError
If v is a string that does not correspond to a boolean value, an error is raised indicating that a boolean value was expected.
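Raising argparse.ArgumentTypeError suggests use as a type= converter for command-line flags. The sketch below shows that pattern with a local stand-in (to_bool). The accepted spellings are assumptions, since the documentation above does not list them.

```python
import argparse

# Assumed spellings; the set accepted by pelicun may differ.
TRUE_STRINGS = {"yes", "true", "t", "y", "1"}
FALSE_STRINGS = {"no", "false", "f", "n", "0"}


def to_bool(v):
    """Convert a string (or bool) to a bool, ignoring case."""
    if isinstance(v, bool):
        return v
    text = v.lower()
    if text in TRUE_STRINGS:
        return True
    if text in FALSE_STRINGS:
        return False
    raise argparse.ArgumentTypeError("Boolean value expected.")


parser = argparse.ArgumentParser()
parser.add_argument("--verbose", type=to_bool, default=False)
args = parser.parse_args(["--verbose", "Yes"])
```

When the converter raises ArgumentTypeError inside parse_args, argparse reports it as a usage error for that argument rather than a traceback.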
- pelicun.base.stringterpolation(arguments: str) Callable[[np.ndarray], np.ndarray] [source]
Linear interpolation from strings.
Turns a string of specially formatted arguments into a multilinear interpolating function.
- Parameters:
- arguments: str
String of arguments containing Y values and X values, separated by a pipe symbol (|). Individual values are separated by commas (,). Example: arguments = ‘y1,y2,y3|x1,x2,x3’
- Returns:
- Callable
A callable interpolating function
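The 'y1,y2,y3|x1,x2,x3' format can be parsed and turned into a piecewise-linear function as sketched below. This is a dependency-free illustration rather than pelicun's implementation: it works on scalars instead of np.ndarray, assumes strictly increasing X values, and holds the end values constant outside the X range, all of which are assumptions.

```python
from bisect import bisect_right


def parse_interpolation(arguments):
    """Build a piecewise-linear function from a 'y1,y2|x1,x2' string."""
    # Y values come first, then X values, separated by a pipe.
    y_text, x_text = arguments.split("|")
    ys = [float(value) for value in y_text.split(",")]
    xs = [float(value) for value in x_text.split(",")]
    if len(xs) != len(ys):
        raise ValueError("X and Y must have the same number of values")

    def interpolate(x):
        # Assumption: hold the end values constant outside the X range.
        if x <= xs[0]:
            return ys[0]
        if x >= xs[-1]:
            return ys[-1]
        i = bisect_right(xs, x)  # xs[i - 1] <= x < xs[i]
        x0, x1 = xs[i - 1], xs[i]
        y0, y1 = ys[i - 1], ys[i]
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    return interpolate


f = parse_interpolation("10,20,40|0,1,3")
midpoint = f(0.5)
```

Note the order in the string: the Y values precede the X values, which is easy to get backwards.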
- pelicun.base.update(d: dict[str, Any], path: str, value: Any, *, only_if_empty_or_none: bool = False) None [source]
Set a value in a nested dictionary using a path with ‘/’ as the separator.
- Parameters:
- d: dict
The dictionary to update.
- path: str
The path to the desired value, with keys separated by ‘/’.
- value: Any
The value to set at the specified path.
- only_if_empty_or_none: bool, optional
If True, only update the value if it is None or an empty dictionary. Defaults to False.
Examples
>>> d = {}
>>> update(d, 'x/y/z', 1)
>>> d
{'x': {'y': {'z': 1}}}
>>> update(d, 'x/y/z', 2, only_if_empty_or_none=True)
>>> d
{'x': {'y': {'z': 1}}}  # value remains 1 since it is not empty or None
>>> update(d, 'x/y/z', 2)
>>> d
{'x': {'y': {'z': 2}}}  # value is updated to 2
- pelicun.base.update_vals(update_value: dict, primary: dict, update_path: str, primary_path: str) None [source]
Transfer values between nested dictionaries.
Updates the values of the update nested dictionary with those provided in the primary nested dictionary. If a key already exists in update, and does not map to another dictionary, the value is left unchanged.
- Parameters:
- update_value: dict
Dictionary (which can contain nested dictionaries) to be updated based on the values of primary. New keys existing in primary are added to update. Values whose keys already exist in update are left unchanged.
- primary: dict
Dictionary -which can contain nested dictionaries- to be used to update the values of update.
- update_path: str
Identifier for the update dictionary. Used to make error messages more meaningful.
- primary_path: str
Identifier for the primary dictionary. Used to make error messages more meaningful.
- Raises:
- ValueError
If primary[key] is dict but update[key] is not.
- ValueError
If update[key] is dict but primary[key] is not.
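The merge rules above (add missing keys, keep existing values, recurse into nested dictionaries, reject dict/non-dict mismatches) can be sketched as a short recursive function. The name fill_missing and the configuration keys are hypothetical, and the path arguments used for error messages are omitted.

```python
import copy


def fill_missing(update, primary):
    """Copy keys from primary into update without overwriting values."""
    for key, primary_value in primary.items():
        if key not in update:
            # New key: take the value from primary.
            update[key] = copy.deepcopy(primary_value)
        elif isinstance(primary_value, dict) and isinstance(update[key], dict):
            # Both sides are dictionaries: merge one level down.
            fill_missing(update[key], primary_value)
        elif isinstance(primary_value, dict) or isinstance(update[key], dict):
            raise ValueError(f"Type mismatch at key {key!r}")
        # Otherwise both are plain values: keep the one already in update.


# Hypothetical user and default settings.
user = {"Sampling": {"SampleSize": 500}}
defaults = {"Sampling": {"SampleSize": 1000, "Method": "LHS"}, "Verbose": False}
fill_missing(user, defaults)
```

After the call, user keeps its own SampleSize of 500 and gains the Method and Verbose entries from defaults.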
- pelicun.base.with_parsed_str_na_values(df: DataFrame) DataFrame [source]
Identify string values interpretable as N/A.
Given a dataframe, this function identifies values that have string type and can be interpreted as N/A, and replaces them with actual NA’s.
- Parameters:
- df: pd.DataFrame
Dataframe to process
- Returns:
- pd.DataFrame
The dataframe with proper N/A values.