8.1.4. pelicun.file_io

Classes and methods that handle file input and output.

Functions

load_data(data_source, unit_conversion_factors)

Load data assuming it follows standard SimCenter tabular schema.

load_from_file(filepath[, log])

Load data from a file and store it in a DataFrame.

save_to_csv(data, filepath[, units, ...])

Save data to a CSV file following the standard SimCenter schema.

substitute_default_path(data_paths)

Substitute the default directory path with a specified path.

pelicun.file_io.load_data(data_source: str | DataFrame, unit_conversion_factors: dict | None, orientation: int = 0, *, reindex: bool = True, return_units: bool = False, log: Logger | None = None) -> tuple[DataFrame, Series] | DataFrame

Load data assuming it follows standard SimCenter tabular schema.

The data is assumed to have a single header line and an index column. The second line may start with ‘Units’ in the index and provide the units for each column in the file.
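The layout described above can be illustrated without pelicun. The following is a minimal sketch using only the standard library; the sample text, the column names, and the helper `split_units_row` are hypothetical and are not part of the pelicun API.

```python
import csv
import io

# Hypothetical sample in the layout described above: one header line,
# an index column, and an optional second line whose index is 'Units'.
SAMPLE = """ID,height,weight
Units,ft,lb
0,10.0,150.0
1,12.5,180.0
"""


def split_units_row(text):
    """Return (header, units, rows); units is None when no 'Units' row exists."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    units = None
    if body and body[0][0] == "Units":
        # Map each column label (skipping the index column) to its unit.
        units = dict(zip(header[1:], body[0][1:]))
        body = body[1:]
    return header, units, body


header, units, body = split_units_row(SAMPLE)
print(units)      # {'height': 'ft', 'weight': 'lb'}
print(len(body))  # 2
```

The sketch only separates the units row from the data rows; it does not attempt the unit conversion or reindexing that load_data is documented to perform.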

Parameters:
data_source: string or DataFrame

If it is a string, the data_source is assumed to point to the location of the source file. If it is a DataFrame, the data_source is assumed to hold the raw data.

unit_conversion_factors: dict, optional

Dictionary containing key-value pairs of unit names and their corresponding factors. Conversion factors are defined as the number of times a base unit fits in the alternative unit. If no conversion factors are specified, then no unit conversions are made.
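As an illustration of this definition, the sketch below assumes that a value is brought to the base unit by multiplying it with the factor of its unit. The factors (with the meter as the assumed base unit) and the helper `to_base` are hypothetical and chosen only for this example.

```python
# Hypothetical factors, assuming the meter is the base unit:
# 0.3048 m fit in one foot, 0.0254 m fit in one inch.
unit_conversion_factors = {"m": 1.0, "ft": 0.3048, "inch": 0.0254}


def to_base(value, unit, factors):
    """Convert a value given in `unit` to the base unit by multiplication."""
    return value * factors[unit]


height_m = to_base(10.0, "ft", unit_conversion_factors)
print(round(height_m, 4))  # 3.048
```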

orientation: int, {0, 1}, default: 0

If 0, variables are organized along columns; otherwise they are along the rows. This is important when converting values to follow the prescribed units.
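The effect of orientation on unit conversion can be sketched with a plain list-of-rows table. This is an illustrative reading of the description above, not pelicun's implementation; the function `convert`, the table, and the factors are assumptions made for the example.

```python
def convert(table, units, factors, orientation=0):
    """Scale a list-of-rows table to base units.

    With orientation 0, `units` holds one unit per column;
    otherwise it holds one unit per row.
    """
    scaled = []
    for i, row in enumerate(table):
        new_row = []
        for j, value in enumerate(row):
            unit = units[j] if orientation == 0 else units[i]
            new_row.append(value * factors[unit])
        scaled.append(new_row)
    return scaled


# Hypothetical factors, assuming the meter is the base unit.
factors = {"m": 1.0, "km": 1000.0}
table = [[1.0, 2.0], [3.0, 4.0]]

by_columns = convert(table, ["m", "km"], factors, orientation=0)
by_rows = convert(table, ["m", "km"], factors, orientation=1)
print(by_columns)  # [[1.0, 2000.0], [3.0, 4000.0]]
print(by_rows)     # [[1.0, 2.0], [3000.0, 4000.0]]
```

The same units list produces different results in the two cases, which is why the orientation has to match the layout of the data.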

reindex: bool, default: True

If True, reindexes the table to ensure a 0-based, continuous index.

return_units: bool, default: False

If True, returns the units as well as the data to allow for adjustments in unit conversion.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
tuple
data: DataFrame

Parsed data.

units: Series

Labels from the data and corresponding units specified in the data. Units are only returned if return_units is set to True.

Raises:
TypeError

If data_source is neither a string nor a DataFrame, a TypeError is raised.

pelicun.file_io.load_from_file(filepath: str, log: Logger | None = None) -> DataFrame

Load data from a file and store it in a DataFrame.

Currently, only CSV files are supported, but the function is easily extensible to support other file formats.

Parameters:
filepath: string

The location of the source file.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
DataFrame

Data loaded from the file.

Raises:
FileNotFoundError

If the filepath is invalid.

ValueError

If the file is not a CSV.
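The two error conditions listed above can be sketched as a small pre-check. The helper `check_csv_path`, the order of the checks, and the case-insensitive suffix comparison are assumptions made for illustration; they are not taken from pelicun's source.

```python
import tempfile
from pathlib import Path


def check_csv_path(filepath):
    """Raise FileNotFoundError or ValueError in the cases documented above."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    if path.suffix.lower() != ".csv":
        raise ValueError(f"Unsupported file type: {path.suffix!r}")
    return path


def describe(filepath):
    """Return 'ok' or the name of the exception check_csv_path raises."""
    try:
        check_csv_path(filepath)
    except (FileNotFoundError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


with tempfile.TemporaryDirectory() as tmp:
    csv_file = Path(tmp) / "demand.csv"
    txt_file = Path(tmp) / "demand.txt"
    for file in (csv_file, txt_file):
        file.write_text("ID,value\n0,1.0\n")
    results = [
        describe(csv_file),
        describe(txt_file),
        describe(Path(tmp) / "missing.csv"),
    ]

print(results)  # ['ok', 'ValueError', 'FileNotFoundError']
```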

pelicun.file_io.save_to_csv(data: DataFrame | None, filepath: Path | None, units: Series | None = None, unit_conversion_factors: dict | None = None, orientation: int = 0, *, use_simpleindex: bool = True, log: Logger | None = None) -> DataFrame | None

Save data to a CSV file following the standard SimCenter schema.

The produced CSV files have a single header line and an index column. The second line may start with ‘Units’ in the index or the first column may be ‘Units’ to provide the units for the data in the file.
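To make the column-oriented variant of this layout concrete, the sketch below writes a header line, an optional 'Units' line, and the data rows with the standard library. The helper `to_simcenter_csv` and the sample values are hypothetical; the sketch does not reproduce the unit conversion or index handling of save_to_csv.

```python
import csv
import io


def to_simcenter_csv(header, units, rows):
    """Return CSV text with an optional 'Units' line after the header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    if units is not None:
        # The index cell of the second line carries the 'Units' label.
        writer.writerow(["Units", *units])
    writer.writerows(rows)
    return buffer.getvalue()


text = to_simcenter_csv(["ID", "height"], ["ft"], [[0, 10.0], [1, 12.5]])
print(text)
# ID,height
# Units,ft
# 0,10.0
# 1,12.5
```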

Parameters:
data: DataFrame

The data to save.

filepath: Path

The location of the destination file. If None, the data is not saved but returned instead.

units: Series, optional

Series that maps variables to their corresponding units.

unit_conversion_factors: dict, optional

Dictionary containing key-value pairs of unit names and their corresponding factors. Conversion factors are defined as the number of times a base unit fits in the alternative unit.

orientation: int, {0, 1}, default 0

If 0, variables are organized along columns; otherwise, they are along the rows. This is important when converting values to follow the prescribed units.

use_simpleindex: bool, default True

If True, MultiIndex columns and indexes are converted to SimpleIndex before saving.

log: Logger, optional

Logger object to be used. If no object is specified, no logging is performed.

Returns:
DataFrame or None

If filepath is None, returns the DataFrame with potential unit conversions and reformatting applied. Otherwise, returns None after saving the data to a CSV file.

Raises:
ValueError

If units is not None but unit_conversion_factors is None.

ValueError

If writing to a file fails.

ValueError

If the provided file name does not have the .csv suffix.
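Two of these conditions concern the arguments alone and can be sketched as a validation step. The helper `validate_save_args` and its error messages are hypothetical; only the conditions themselves come from the list above.

```python
from pathlib import Path


def validate_save_args(filepath, units=None, unit_conversion_factors=None):
    """Raise ValueError for the argument combinations documented above."""
    if units is not None and unit_conversion_factors is None:
        raise ValueError("units were provided without unit_conversion_factors")
    # A filepath of None is allowed: the data is then returned, not saved.
    if filepath is not None and Path(filepath).suffix != ".csv":
        raise ValueError(f"expected a .csv file name, got {filepath!r}")


validate_save_args("results.csv")  # passes silently
validate_save_args(None)           # passes silently

try:
    validate_save_args("results.txt")
except ValueError as exc:
    message = str(exc)
    print(message)  # expected a .csv file name, got 'results.txt'
```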

pelicun.file_io.substitute_default_path(data_paths: list[str | DataFrame]) -> list[str | DataFrame]

Substitute the default directory path with a specified path.

This function iterates over a list of data paths and replaces occurrences of the ‘PelicunDefault/’ substring with the path specified by base.pelicun_path concatenated with ‘/resources/SimCenterDBDL/’. This operation is performed to update paths that are using a default location to a user-defined location within the pelicun framework. The updated list of paths is then returned.

Parameters:
data_paths: list of str

A list containing the paths to data files. These paths may include a placeholder directory ‘PelicunDefault/’ that needs to be substituted with the actual path specified in base.pelicun_path.

Returns:
list of str

The list with updated paths where ‘PelicunDefault/’ has been replaced with the specified path in base.pelicun_path concatenated with ‘/resources/SimCenterDBDL/’.

Notes

  • The function assumes that base.pelicun_path is properly initialized and points to the correct directory where resources are located.

  • If a path in the input list does not contain ‘PelicunDefault/’, it is added to the output list unchanged.

Examples

>>> data_paths = ['PelicunDefault/data/file1.txt',
...               'data/file2.txt']
>>> substitute_default_path(data_paths)
['{base.pelicun_path}/resources/SimCenterDBDL/data/file1.txt',
'data/file2.txt']
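
The substitution described in this entry can be sketched as a stand-alone function. This is an illustrative reimplementation, not pelicun's code: the name `substitute_default_path_sketch`, the explicit `pelicun_path` argument (standing in for base.pelicun_path), and the example directory `/opt/pelicun` are assumptions.

```python
def substitute_default_path_sketch(data_paths, pelicun_path):
    """Replace 'PelicunDefault/' in string entries; leave other entries unchanged."""
    replacement = f"{pelicun_path}/resources/SimCenterDBDL/"
    updated = []
    for path in data_paths:
        # Non-string entries (for example DataFrames) are passed through as-is.
        if isinstance(path, str) and "PelicunDefault/" in path:
            path = path.replace("PelicunDefault/", replacement)
        updated.append(path)
    return updated


paths = substitute_default_path_sketch(
    ["PelicunDefault/data/file1.txt", "data/file2.txt"],
    "/opt/pelicun",  # hypothetical installation directory
)
print(paths)
# ['/opt/pelicun/resources/SimCenterDBDL/data/file1.txt', 'data/file2.txt']
```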