8.1.4. pelicun.file_io
Classes and methods that handle file input and output.
Functions
load_data | Load data assuming it follows the standard SimCenter tabular schema.
load_from_file | Load data from a file and store it in a DataFrame.
save_to_csv | Save data to a CSV file following the standard SimCenter schema.
substitute_default_path | Substitute the default directory path with a specified path.
- pelicun.file_io.load_data(data_source: str | DataFrame, unit_conversion_factors: dict | None, orientation: int = 0, *, reindex: bool = True, return_units: bool = False, log: Logger | None = None) -> tuple[DataFrame, Series] | DataFrame
Load data assuming it follows the standard SimCenter tabular schema.
The data is assumed to have a single header line and an index column. The second line may start with ‘Units’ in the index and provide the units for each column in the file.
- Parameters:
- data_source: string or DataFrame
If it is a string, the data_source is assumed to point to the location of the source file. If it is a DataFrame, the data_source is assumed to hold the raw data.
- unit_conversion_factors: dict or None
Dictionary containing key-value pairs of unit names and their corresponding factors. A conversion factor is the number of times the base unit fits in the alternative unit; for example, if the base unit of length were the meter, the factor for a kilometer would be 1000. If None, no unit conversions are made.
- orientation: int, {0, 1}, default: 0
If 0, variables are organized along columns; otherwise they are along the rows. This is important when converting values to follow the prescribed units.
- reindex: bool, default: True
If True, reindexes the table to ensure a 0-based, continuous index.
- return_units: bool, default: False
If True, returns the units as well as the data to allow for adjustments in unit conversion.
- log: Logger, optional
Logger object to be used. If no object is specified, no logging is performed.
- Returns:
- DataFrame or tuple
- data: DataFrame
Parsed data.
- units: Series
Labels from the data and corresponding units specified in the data. Returned together with data as a tuple only if return_units is True; otherwise only data is returned.
- Raises:
- TypeError
If data_source is neither a string nor a DataFrame.
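The schema and the unit conversion described above can be illustrated with a small standard-library sketch. It is not pelicun's implementation and does not use pandas: the column names, the sample values, the `parse_table` helper, and the choice of m/s2 as the base acceleration unit are all assumptions made for illustration. It also assumes that loading multiplies each value by the factor of its unit, which is one reading of the conversion factor definition given for unit_conversion_factors.

```python
import csv
import io

# Hypothetical file content: one header line, an optional 'Units' row,
# then the data rows. The first column is the index.
RAW = """,accel,drift
Units,g,rad
0,0.45,0.012
1,0.62,0.020
"""

# Assumed factors: how many base units fit in one alternative unit
# (base acceleration unit taken to be m/s2 for this illustration).
FACTORS = {"g": 9.80665, "rad": 1.0}


def parse_table(text, factors=None):
    """Split off the optional 'Units' row and scale columns to base units."""
    rows = list(csv.reader(io.StringIO(text)))
    header = rows[0][1:]  # the first cell belongs to the index column
    body = rows[1:]
    units = None
    if body and body[0][0] == "Units":
        units = dict(zip(header, body[0][1:]))
        body = body[1:]
    data = []
    for row in body:
        record = {}
        for name, cell in zip(header, row[1:]):
            value = float(cell)
            if units is not None and factors is not None:
                value *= factors[units[name]]  # listed unit -> base unit
            record[name] = value
        data.append(record)
    return data, units


data, units = parse_table(RAW, FACTORS)
```

Here variables run along the columns, which corresponds to orientation 0; with orientation 1 the units would be attached to rows instead.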
- pelicun.file_io.load_from_file(filepath: str, log: Logger | None = None) -> DataFrame
Load data from a file and store it in a DataFrame.
Currently, only CSV files are supported, but the function is easily extensible to support other file formats.
- Parameters:
- filepath: string
The location of the source file.
- log: base.Logger, optional
Logger object to be used. If no object is specified, no logging is performed.
- Returns:
- DataFrame
Data loaded from the file.
- Raises:
- FileNotFoundError
If the filepath is invalid.
- ValueError
If the file is not a CSV.
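The two documented error conditions can be mirrored by a small pre-check before calling the loader. This is only a sketch: `check_csv_path` is a hypothetical helper, and the order of the checks and the case-sensitive suffix comparison are assumptions, not a description of how pelicun validates the path.

```python
import tempfile
from pathlib import Path


def check_csv_path(filepath):
    """Hypothetical pre-check mirroring the documented error conditions."""
    path = Path(filepath)
    if not path.is_file():
        raise FileNotFoundError(f"No file found at: {path}")
    if path.suffix != ".csv":
        raise ValueError(f"Unsupported file format: {path.suffix}")
    return path


# Usage with a throwaway file (hypothetical name and content).
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "demands.csv"
    good.write_text(",a\n0,1\n")
    checked = check_csv_path(good)
```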
- pelicun.file_io.save_to_csv(data: DataFrame | None, filepath: Path | None, units: Series | None = None, unit_conversion_factors: dict | None = None, orientation: int = 0, *, use_simpleindex: bool = True, log: Logger | None = None) -> DataFrame | None
Save data to a CSV file following the standard SimCenter schema.
The produced CSV files have a single header line and an index column. The second line may start with ‘Units’ in the index or the first column may be ‘Units’ to provide the units for the data in the file.
- Parameters:
- data: DataFrame
The data to save.
- filepath: Path or None
The location of the destination file. If None, the data is not saved; it is returned instead.
- units: Series, optional
Series that maps variables to their corresponding units.
- unit_conversion_factors: dict, optional
Dictionary containing key-value pairs of unit names and their corresponding factors. Conversion factors are defined as the number of times a base unit fits in the alternative unit.
- orientation: int, {0, 1}, default 0
If 0, variables are organized along columns; otherwise, they are along the rows. This is important when converting values to follow the prescribed units.
- use_simpleindex: bool, default True
If True, MultiIndex columns and indexes are converted to SimpleIndex before saving.
- log: Logger, optional
Logger object to be used. If no object is specified, no logging is performed.
- Returns:
- DataFrame or None
If filepath is None, returns the DataFrame with potential unit conversions and reformatting applied. Otherwise, returns None after saving the data to a CSV file.
- Raises:
- ValueError
If units is not None but unit_conversion_factors is None.
- ValueError
If writing to a file fails.
- ValueError
If the provided file name does not have the .csv suffix.
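The layout of the produced file and two of the documented error conditions can be sketched with the standard library. This is an illustration under stated assumptions, not pelicun's code: `write_table` is a hypothetical helper that takes a plain dict of columns rather than a DataFrame, it returns CSV text (not a DataFrame) when no path is given, and it assumes that saving divides each base-unit value by the factor of the listed unit. The millimeter base unit in the usage lines is also an assumption.

```python
import csv
import io
from pathlib import Path


def write_table(columns, index, filepath=None, units=None, factors=None):
    """Hypothetical writer; columns maps each variable name to base-unit values."""
    if units is not None and factors is None:
        raise ValueError("units were given without unit_conversion_factors")
    if filepath is not None and Path(filepath).suffix != ".csv":
        raise ValueError("the destination file name must end in .csv")

    names = list(columns)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([""] + names)  # header line; first cell is the index
    if units is not None:
        writer.writerow(["Units"] + [units[n] for n in names])
    for i, label in enumerate(index):
        row = [label]
        for n in names:
            value = columns[n][i]
            if units is not None:
                value = value / factors[units[n]]  # base unit -> listed unit
            row.append(value)
        writer.writerow(row)

    if filepath is None:
        return buffer.getvalue()
    Path(filepath).write_text(buffer.getvalue())
    return None


text = write_table(
    {"length": [25.4, 50.8]},
    index=[0, 1],
    units={"length": "inch"},
    factors={"inch": 25.4},  # assumed base unit: mm
)
```

The returned text starts with the header line, followed by a row whose index cell is 'Units', which matches the first of the two layouts described above.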
- pelicun.file_io.substitute_default_path(data_paths: list[str | DataFrame]) -> list[str | DataFrame]
Substitute the default directory path with a specified path.
This function iterates over a list of data paths and replaces occurrences of the ‘PelicunDefault/’ substring with the path specified by base.pelicun_path concatenated with ‘/resources/SimCenterDBDL/’. This operation is performed to update paths that are using a default location to a user-defined location within the pelicun framework. The updated list of paths is then returned.
- Parameters:
- data_paths: list of str
A list containing the paths to data files. These paths may include a placeholder directory ‘PelicunDefault/’ that needs to be substituted with the actual path specified in base.pelicun_path.
- Returns:
- list of str
The list with updated paths where ‘PelicunDefault/’ has been replaced with the specified path in base.pelicun_path concatenated with ‘/resources/SimCenterDBDL/’.
Notes
The function assumes that base.pelicun_path is properly initialized and points to the correct directory where resources are located.
If a path in the input list does not contain ‘PelicunDefault/’, it is added to the output list unchanged.
Examples
>>> data_paths = ['PelicunDefault/data/file1.txt', 'data/file2.txt']
>>> substitute_default_path(data_paths)
['{base.pelicun_path}/resources/SimCenterDBDL/data/file1.txt', 'data/file2.txt']
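The substitution itself can be sketched in a few lines. This is a stand-alone illustration, not the library's source: `substitute` is a hypothetical function, `PELICUN_PATH` is a made-up stand-in for base.pelicun_path, and passing non-string entries (such as DataFrames) through unchanged is an assumption suggested by the type hints.

```python
PELICUN_PATH = "/opt/pelicun"  # hypothetical stand-in for base.pelicun_path


def substitute(data_paths, root=PELICUN_PATH):
    """Replace the 'PelicunDefault/' placeholder in string entries."""
    updated = []
    for entry in data_paths:
        # Only strings can contain the placeholder; other entries pass through.
        if isinstance(entry, str) and "PelicunDefault/" in entry:
            entry = entry.replace(
                "PelicunDefault/", f"{root}/resources/SimCenterDBDL/"
            )
        updated.append(entry)
    return updated


paths = substitute(["PelicunDefault/data/file1.txt", "data/file2.txt"])
```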