pelicun.file_io

Classes and methods that handle file input and output.

Functions

load_data(data_source[, ...])

Load data assuming it follows standard SimCenter tabular schema.

load_from_file(filepath[, log])

Load data from a file and store it in a DataFrame.

save_to_csv(data, filepath[, units, ...])

Save data to a CSV file following the standard SimCenter schema.

substitute_default_path(data_paths[, log])

Substitute the default directory path.

pelicun.file_io.load_data(data_source: str | DataFrame, unit_conversion_factors: dict | None = None, orientation: int = 0, *, reindex: bool = True, return_units: bool = False, log: Logger | None = None) -> tuple[DataFrame, Series] | DataFrame

Load data assuming it follows standard SimCenter tabular schema.

The data is assumed to have a single header line and an index column. The second line may start with ‘Units’ in the index and provide the units for each column in the file.
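The layout described above can be illustrated with a small standard-library sketch. It is not pelicun's implementation: the sample table, its column names, and the parse_table helper are hypothetical, and the real function works with pandas DataFrames rather than dictionaries.

```python
import csv
import io

# Hypothetical sample in the layout described above: one header line,
# an index column, and an optional 'Units' row on the second line.
SAMPLE = """ID,PFA-1-1,PID-1-1
Units,g,rad
0,0.45,0.012
1,0.52,0.015
"""


def parse_table(text):
    """Split a SimCenter-style table into (data, units); illustrative only."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    columns = header[1:]  # the first column holds the index labels
    units = {}
    if body and body[0][0] == "Units":
        # The second line carries one unit label per data column.
        units = dict(zip(columns, body[0][1:]))
        body = body[1:]
    data = {
        row[0]: {col: float(val) for col, val in zip(columns, row[1:])}
        for row in body
    }
    return data, units


data, units = parse_table(SAMPLE)
print(units)
print(data["0"])
```

A table without a Units row passes through the same helper and yields an empty units mapping.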

Parameters:
data_source: string or DataFrame

If it is a string, the data_source is assumed to point to the location of the source file. If it is a DataFrame, the data_source is assumed to hold the raw data.

unit_conversion_factors: dict, optional

Dictionary containing key-value pairs of unit names and their corresponding factors. Conversion factors are defined as the number of times a base unit fits in the alternative unit. If no conversion factors are specified, then no unit conversions are made.

orientation: int, {0, 1}, default: 0

If 0, variables are organized along columns; otherwise they are along the rows. This is important when converting values to follow the prescribed units.

reindex: bool, default True

If True, reindexes the table to ensure a 0-based, continuous index.

return_units: bool, default False

If True, returns the units as well as the data to allow for adjustments in unit conversion.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
tuple
data: DataFrame

Parsed data.

units: Series

Labels from the data and corresponding units specified in the data. Units are only returned if return_units is set to True.

Raises:
TypeError

If data_source is neither a string nor a DataFrame, a TypeError is raised.
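To make the definition of unit_conversion_factors concrete, here is a minimal sketch that reads "the number of times a base unit fits in the alternative unit" as a multiplier toward base units. The factor values, the helper names to_base and from_base, and the choice to leave unknown units unscaled are assumptions made for illustration, not confirmed pelicun behavior.

```python
# Hypothetical factors with the meter as the base unit: each value is the
# number of base units that fit in one of the named unit.
unit_conversion_factors = {"m": 1.0, "cm": 0.01, "km": 1000.0}


def to_base(value, unit, factors):
    """Scale a value to base units.

    Units missing from `factors` are left unscaled (an assumption).
    """
    return value * factors.get(unit, 1.0)


def from_base(value, unit, factors):
    """Inverse of to_base: express a base-unit value in the named unit."""
    return value / factors.get(unit, 1.0)


height_m = to_base(250.0, "cm", unit_conversion_factors)
print(height_m)
print(from_base(2500.0, "km", unit_conversion_factors))
```

Under this reading, loading multiplies by the factor and saving divides by it, so a round trip through both helpers returns the original value up to floating-point error.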

pelicun.file_io.load_from_file(filepath: str, log: Logger | None = None) -> DataFrame

Load data from a file and store it in a DataFrame.

Currently, only CSV files are supported, but the function is easily extensible to support other file formats.

Parameters:
filepath: string

The location of the source file.

log: base.Logger, optional

Optional logger object.

Returns:
data: DataFrame

Data loaded from the file.

Raises:
FileNotFoundError

If the filepath is invalid.

ValueError

If the file is not a CSV.
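The two error conditions above can be sketched with pathlib alone. The helper name check_source_file is hypothetical, and the order of the checks (existence first, then suffix) is an assumption; the actual function may validate in a different order.

```python
from pathlib import Path
import tempfile


def check_source_file(filepath):
    """Validate a path along the lines of the Raises section (illustrative)."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f"File not found: {path}")
    if path.suffix.lower() != ".csv":
        raise ValueError(f"Unsupported file format: {path.suffix!r}")
    return path


# Demonstrate the happy path with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "demands.csv"
    good.write_text("ID,A\n0,1.0\n")
    print(check_source_file(good).name)
```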

pelicun.file_io.save_to_csv(data: DataFrame | None, filepath: Path | None, units: Series | None = None, unit_conversion_factors: dict | None = None, orientation: int = 0, *, use_simpleindex: bool = True, log: Logger | None = None) -> DataFrame | None

Save data to a CSV file following the standard SimCenter schema.

The produced CSV files have a single header line and an index column. The second line may start with ‘Units’ in the index or the first column may be ‘Units’ to provide the units for the data in the file.
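As a rough illustration of the output layout described above, the following sketch renders a small table with a Units row on the second line, using only the standard library. The write_table helper, the "ID" index label, and the sample values are hypothetical; dividing by the conversion factor on output is an assumption about how values stored in base units would be expressed in the prescribed units.

```python
import csv
import io


def write_table(data, units=None, factors=None):
    """Render {index: {column: value}} as CSV text with an optional Units row."""
    if units is not None and factors is None:
        # Mirrors the first ValueError condition listed under Raises.
        raise ValueError("units were given without unit_conversion_factors")
    columns = list(next(iter(data.values())))
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["ID", *columns])
    if units is not None:
        writer.writerow(["Units", *(units[c] for c in columns)])
    for index, row in data.items():
        # Assumption: values are held in base units and divided by the
        # factor to express them in the unit named for each column.
        scaled = [
            row[c] / factors[units[c]] if units is not None else row[c]
            for c in columns
        ]
        writer.writerow([index, *scaled])
    return buffer.getvalue()


text = write_table({"0": {"H": 2500.0}}, units={"H": "km"}, factors={"km": 1000.0})
print(text)
```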

Parameters:
data: DataFrame

The data to save.

filepath: Path

The location of the destination file. If None, the data is not saved, but returned in the end.

units: Series, optional

Provides a Series with variables and corresponding units.

unit_conversion_factors: dict, optional

Dictionary containing key-value pairs of unit names and their corresponding factors. Conversion factors are defined as the number of times a base unit fits in the alternative unit.

orientation: int, {0, 1}, default 0

If 0, variables are organized along columns; otherwise, they are along the rows. This is important when converting values to follow the prescribed units.

use_simpleindex: bool, default True

If True, MultiIndex columns and indexes are converted to SimpleIndex before saving.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
DataFrame or None

If filepath is None, returns the DataFrame with potential unit conversions and reformatting applied. Otherwise, returns None after saving the data to a CSV file.

Raises:
ValueError

If units is not None but unit_conversion_factors is None.

ValueError

If writing to a file fails.

ValueError

If the provided file name does not have the .csv suffix.
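The use_simpleindex option described in the parameters above flattens multi-level labels into single strings. A minimal sketch of that idea follows; the flatten_labels helper is hypothetical, and the dash separator is an assumption about the SimCenter convention rather than something stated on this page.

```python
def flatten_labels(labels, sep="-"):
    """Join each multi-level label into one string (illustrative only)."""
    return [sep.join(str(part) for part in label) for label in labels]


# Hypothetical three-level column labels.
multi_columns = [("PFA", "1", "1"), ("PID", "1", "1")]
print(flatten_labels(multi_columns))
```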

pelicun.file_io.substitute_default_path(data_paths: list[str | DataFrame], log: Logger | None = None) -> list[str | DataFrame]

Substitute the default directory path.

This function iterates over a list of data paths and replaces each path that contains the ‘PelicunDefault/’ substring with the full path to the corresponding model file in the built-in Damage and Loss Model Library. Default paths are expected to follow the PelicunDefault/method_name/model_type.extension structure:

  • method_name identifies the methodology among those available in the {base.pelicun_path}/resources/dlml_resource_paths.json file.

  • model_type identifies the type of model requested. Currently, the following types are supported: ‘fragility’, ‘consequence_repair’, ‘loss_repair’.

  • extension distinguishes ‘CSV’ files with model parameters from ‘JSON’ files with metadata.

The model_type and extension strings are not limited to the supported values. If you know that a particular file exists in the method’s folder, you can use the corresponding model_type.extension to access it.

Parameters:
data_paths: list of str or pd.DataFrame

A list containing the paths to data files. These paths may include a placeholder directory ‘PelicunDefault/’ that needs to be substituted with the actual path specified in the resource mapping.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
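list of str or pd.DataFrame

The input list with every ‘PelicunDefault/’ path replaced by the corresponding full path. All other entries are returned unchanged.
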
Raises:
KeyError

If the method_name after ‘PelicunDefault/’ does not exist in the resource_paths keys or in the legacy list of filenames preserved for backwards compatibility.

Notes

  • The function assumes that base.pelicun_path is properly initialized and points to the correct directory where resources are located.

  • If a path in the input list does not contain ‘PelicunDefault/’, the path is added to the output list unchanged.

Examples

>>> data_paths = ['PelicunDefault/Hazus Hurricane/fragility.csv', 'data/file2.txt']
>>> substitute_default_path(data_paths)
['{base.pelicun_path}/resources/DamageAndLossModelLibrary/'
  'hurricane/building/portfolio/Hazus v5.1 coupled/fragility.csv',
  'data/file2.txt']
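The substitution logic can also be sketched without pelicun installed. In the sketch below, PELICUN_PATH and RESOURCE_PATHS are hypothetical stand-ins for base.pelicun_path and the contents of dlml_resource_paths.json (the single mapping entry is taken from the example above), and the handling of legacy filenames is omitted.

```python
# Hypothetical stand-ins for base.pelicun_path and the contents of
# dlml_resource_paths.json; the real values ship with pelicun.
PELICUN_PATH = "/opt/pelicun"
RESOURCE_PATHS = {
    "Hazus Hurricane": "hurricane/building/portfolio/Hazus v5.1 coupled",
}
PREFIX = "PelicunDefault/"


def substitute(data_paths):
    """Expand 'PelicunDefault/method_name/file' entries; pass others through."""
    resolved = []
    for item in data_paths:
        if not (isinstance(item, str) and PREFIX in item):
            resolved.append(item)  # non-string entries and ordinary paths
            continue
        # Split 'method_name/model_type.extension' after the placeholder.
        method_name, file_name = item.split(PREFIX, 1)[1].split("/", 1)
        if method_name not in RESOURCE_PATHS:
            raise KeyError(f"Unknown method: {method_name}")
        resolved.append(
            f"{PELICUN_PATH}/resources/DamageAndLossModelLibrary/"
            f"{RESOURCE_PATHS[method_name]}/{file_name}"
        )
    return resolved


print(substitute(["PelicunDefault/Hazus Hurricane/fragility.csv", "data/file2.txt"]))
```

Because the file name is passed through untouched, any model_type.extension that exists in the method's folder resolves the same way.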