cellpy.readers package

Subpackages#

cellpy.readers.instruments package

Submodules#

cellpy.readers.cellreader module#

Datareader for cell testers and potentiostats.

This module is used for loading data and databases created by different cell testers and exporting them in a common hdf5-format.

Example

>>> c = cellpy.get(["super_battery_run_01.res", "super_battery_run_02.res"]) # loads and merges the runs
>>> voltage_curves = c.get_cap()
>>> c.save("super_battery_run.h5")

class CellpyCell(filenames=None, selected_scans=None, profile=False, filestatuschecker=None, tester=None, initialize=False, cellpy_units=None, output_units=None, debug=False)[source]#

Bases: object

Main class for working with and storing data.

This class is the main work-horse for cellpy, where the methods for reading, selecting, and tweaking your data are located. It also contains the header definitions, both for the cellpy hdf5 format and for the various cell-tester file-formats that can be read.

data#: cellpy.Data object containing the data

cellpy_units#: cellpy.units object

cellpy_datadir#: path to cellpy data directory

raw_datadir#: path to raw data directory

filestatuschecker#: filestatuschecker object

force_step_table_creation#: force step table creation

ensure_step_table#: ensure step table

limit_loaded_cycles#: limit loaded cycles

profile#: profile

select_minimal#: select minimal

empty#: empty

forced_errors#: forced errors

capacity_modifiers#: capacity modifiers

sep#: delimiter to use when reading (when applicable) and exporting files

cycle_mode#: cycle mode

tester#: tester

cell_name#: cell name (session name, defaults to concatenated names of the subtests)

property active_electrode_area#: Returns the area

property active_mass#: Returns the active mass (same as mass)

add_to_summary(column: str, method: str = 'last', new_name: str | None = None) → CellpyCell[source]#

Augment the summary frame with one value per cycle pulled from raw.

For every cycle present in self.data.summary, group the raw rows of column by cycle_index and reduce them with method. The result is written onto the summary frame in place.

Parameters:

column – name of the column in self.data.raw to look up.
method – groupby reducer applied per cycle. One of "last" (default), "first", "mean", "min", "max".
new_name – name to use for the new summary column. Defaults to column.

Returns:

self (chainable).

Raises:

ValueError – if column is not present in the raw frame or method is not one of the supported reducers.
NoDataFound – propagated from self.data if no data is loaded.
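
The per-cycle reduction described above can be sketched in plain Python. This is a minimal stand-in, not cellpy's implementation: the helper name `reduce_per_cycle`, the list-of-dicts row layout, and the sample values are assumptions made for illustration, whereas cellpy itself works on pandas frames.

```python
from statistics import mean

# Hypothetical reducers mirroring the documented ``method`` options.
REDUCERS = {
    "last": lambda values: values[-1],
    "first": lambda values: values[0],
    "mean": mean,
    "min": min,
    "max": max,
}


def reduce_per_cycle(raw_rows, column, method="last"):
    """Group raw rows by cycle_index and reduce ``column`` with ``method``."""
    if method not in REDUCERS:
        raise ValueError(f"unsupported method: {method!r}")
    grouped = {}
    for row in raw_rows:
        if column not in row:
            raise ValueError(f"column {column!r} not found in raw data")
        grouped.setdefault(row["cycle_index"], []).append(row[column])
    # One reduced value per cycle, ready to be written to a summary column.
    return {cycle: REDUCERS[method](values) for cycle, values in grouped.items()}


# Made-up raw data: two rows per cycle.
raw_rows = [
    {"cycle_index": 1, "temperature": 24.0},
    {"cycle_index": 1, "temperature": 26.0},
    {"cycle_index": 2, "temperature": 25.0},
    {"cycle_index": 2, "temperature": 27.0},
]
last_per_cycle = reduce_per_cycle(raw_rows, "temperature")
mean_per_cycle = reduce_per_cycle(raw_rows, "temperature", method="mean")
```

With the default "last" reducer each cycle keeps the value of its final raw row, which is usually what you want for cumulative quantities.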

property cell_name#: Returns the session name

check_file_ids(rawfiles, cellpyfile, detailed=False)[source]#

Check the stats for the files (raw-data and cellpy hdf5).

This method checks whether the hdf5 file and the res-files have the same timestamps etc., to find out if the .res-files need to be loaded.

If detailed is set to True, the method returns a dict containing True or False for each individual raw-file. If not, it returns False if the raw files are newer than the cellpy hdf5-file (i.e. an update is needed), else True.

Parameters:

cellpyfile (str) – filename of the cellpy hdf5-file.
rawfiles (list of str) – name(s) of raw-data file(s).
detailed (bool) – return a dict containing True or False for each individual raw-file.

Returns:

Bool or dict
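
The return convention can be illustrated with a small sketch. It is a simplification under stated assumptions: the helper `files_are_up_to_date` is hypothetical and compares plain modification times passed in as numbers, while the real method compares the file identifiers stored by cellpy.

```python
def files_are_up_to_date(raw_mtimes, cellpy_mtime, detailed=False):
    """Compare raw-file modification times against the cellpy-file time.

    raw_mtimes: dict mapping raw-file name to modification time (assumed input).
    cellpy_mtime: modification time of the cellpy hdf5-file.
    """
    # True means "this raw-file is not newer than the cellpy-file".
    per_file = {name: mtime <= cellpy_mtime for name, mtime in raw_mtimes.items()}
    if detailed:
        return per_file
    # A single newer raw-file is enough to require an update (False).
    return all(per_file.values())


raw_mtimes = {"run_01.res": 100.0, "run_02.res": 300.0}
overall = files_are_up_to_date(raw_mtimes, cellpy_mtime=200.0)
per_file = files_are_up_to_date(raw_mtimes, cellpy_mtime=200.0, detailed=True)
```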

property cycle_mode#

property data#: Returns the DataSet instance

drop_edges(start: int, end: int) → CellpyCell[source]#: Select middle part of experiment (CellpyCell object) from cycle number ‘start’ to ‘end’

drop_from(cycle=None)[source]#: Select first part of experiment (CellpyCell object) up to cycle number ‘cycle’

drop_to(cycle=None)[source]#: Select last part of experiment (CellpyCell object) from cycle number ‘cycle’

property empty#: Gives True if the CellpyCell object is empty (or non-functional)

filtered_summary(*, rate=None, rate_columns=None, **extra_filters)[source]#

Return a filtered copy of the summary DataFrame.

Thin wrapper around cellpy.filters.filter_summary() that resolves the rate column names from self.headers_summary. See the underlying function for the full range semantics; in short (low, high) keeps rows where low < value <= high and {"value": v, "delta": d} keeps rows where v - d < value <= v + d.

Note

The name deliberately reads as a property-style “give me a filtered summary” - the return is just the summary DataFrame. The slot CellpyCell.filter_summary is reserved for a future method that returns a full CellpyCell with the summary, raw, and steps frames all filtered consistently.

Parameters:

rate – Range filter applied to the rate columns. None disables it (default).
rate_columns – Override which rate columns are filtered. Defaults to both (headers_summary.charge_c_rate, headers_summary.discharge_c_rate). Pass a single string to filter on only one side.
**extra_filters – Additional range filters registered with cellpy.filters.register_range_filter().

Returns:

Filtered copy of self.data.summary (cycle index reset to a column so the result is a plain DataFrame).
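
The two range forms mentioned above both describe a half-open interval that excludes the lower bound and includes the upper bound. A minimal sketch of that rule follows; `in_range` is a hypothetical helper written for illustration, not part of cellpy.filters.

```python
def in_range(value, spec):
    """Return True if ``value`` lies in the half-open range described by ``spec``.

    ``spec`` is either a (low, high) tuple or a {"value": v, "delta": d} dict.
    """
    if isinstance(spec, dict):
        low = spec["value"] - spec["delta"]
        high = spec["value"] + spec["delta"]
    else:
        low, high = spec
    # Lower bound excluded, upper bound included.
    return low < value <= high


rates = [0.05, 0.1, 0.5, 1.0]
kept_tuple = [r for r in rates if in_range(r, (0.05, 0.5))]
kept_delta = [r for r in rates if in_range(r, {"value": 0.75, "delta": 0.25})]
```

Because the lower bound is excluded, adjacent ranges such as (0.05, 0.5) and (0.5, 1.0) never select the same row twice.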

from_cycle(cycle: int) → CellpyCell[source]#: Select experiment (CellpyCell object) from cycle number ‘cycle’

from_raw(file_names=None, pre_processor_hook=None, post_processor_hook=None, is_a_file=True, refuse_copying=False, **kwargs)[source]#

Load a raw data-file.

Parameters:

file_names (list of raw-file names) – uses CellpyCell.file_names if None. If the list contains more than one file name, the runs will be merged together. Note that the order of the files in the list is important.
pre_processor_hook (callable) – function that will be applied to the data within the loader.
post_processor_hook (callable) – function that will be applied to the cellpy.Dataset object after initial loading.
is_a_file (bool) – set this to False if it is not a file-like object.
refuse_copying (bool) – if set to True, the raw-file will not be copied before loading.

Transferred Parameters:

recalc (bool) – used by merging. Set to False if you don’t want cellpy to automatically shift cycle number and time (e.g. add last cycle number from previous file to the cycle numbers in the next file).
bad_steps (list of tuples) – used by ArbinLoader. (c, s) tuples of steps s (in cycle c) to skip loading.
data_points (tuple of ints) – used by ArbinLoader. Load only data from data_points[0] to data_points[1] (use None for infinite). NOT IMPLEMENTED YET.
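
To show why file order matters when merging, here is a rough sketch of the shift that recalc describes. The helper `merge_runs`, the (cycle, time) tuple layout, and the choice to offset by the last row of the preceding data are assumptions for illustration; the actual merge logic in cellpy may differ in detail.

```python
def merge_runs(runs, recalc=True):
    """Concatenate runs given as lists of (cycle_number, time) tuples."""
    merged = []
    for run in runs:
        cycle_offset, time_offset = 0, 0.0
        if recalc and merged:
            # Continue counting from where the previous run ended.
            cycle_offset, time_offset = merged[-1]
        merged.extend((cycle + cycle_offset, time + time_offset) for cycle, time in run)
    return merged


run_01 = [(1, 0.0), (1, 10.0), (2, 20.0)]
run_02 = [(1, 0.0), (2, 10.0)]

shifted = merge_runs([run_01, run_02])
unshifted = merge_runs([run_01, run_02], recalc=False)
reordered = merge_runs([run_02, run_01])
```

Swapping the two files changes which run supplies the offsets, so the merged cycle numbers and times differ.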

get_cap(cycle=None, cycles=None, method='back-and-forth', insert_nan=None, shift=0.0, categorical_column=False, label_cycle_number=False, split=False, interpolated=False, dx=0.1, number_of_points=None, ignore_errors=True, inter_cycle_shift=True, interpolate_along_cap=False, capacity_then_voltage=False, mode='gravimetric', mass=None, area=None, volume=None, cycle_mode=None, usteps=None, dynamic=False, **kwargs)[source]#

Gets the capacity for the run.

Parameters:

cycle (int, list) – cycle number (s).
cycles (list) – list of cycle numbers.
method (string) –
how the curves are given:
- "back-and-forth" - standard back and forth; discharge (or charge) reversed from where charge (or discharge) ends.
- "forth" - discharge (or charge) continues along x-axis.
- "forth-and-forth" - discharge (or charge) also starts at 0 (or at shift if shift is not 0.0)
insert_nan (bool) – insert an externals.numpy.nan between the charge and discharge curves. Defaults to True for “forth-and-forth”, else False
shift – start-value for charge (or discharge) (typically used when plotting shifted-capacity).
categorical_column – add a categorical column showing if it is charge or discharge.
label_cycle_number (bool) – add column for cycle number (tidy format).
split (bool) – return a list of c and v instead of the default, which is to return them combined in a DataFrame. This is only possible for some specific combinations of options (neither categorical_column=True nor label_cycle_number=True is allowed).
interpolated (bool) – set to True if you would like to get interpolated data (typically if you want to save disk space or memory). Defaults to False.
dx (float) – the step used when interpolating.
number_of_points (int) – number of points to use (over-rides dx) for interpolation (i.e. the length of the interpolated data).
ignore_errors (bool) – don’t break out of loop if an error occurs.
inter_cycle_shift (bool) – cumulative shifts between consecutive cycles. Defaults to True.
interpolate_along_cap (bool) – interpolate along capacity axis instead of along the voltage axis. Defaults to False.
capacity_then_voltage (bool) – return capacity and voltage instead of voltage and capacity. Defaults to False.
mode (string) – ‘gravimetric’, ‘areal’, ‘volumetric’ or ‘absolute’. Defaults to ‘gravimetric’.
mass (float) – mass of active material (in set cellpy unit, typically mg).
area (float) – area of electrode (in set cellpy units, typically cm2).
volume (float) – volume of electrode (in set cellpy units, typically cm3).
cycle_mode (string) – if ‘anode’ the first step is assumed to be the discharge, else charge (defaults to CellpyCell.cycle_mode).
dynamic – for dynamically retrieving data from cellpy-file. [NOT IMPLEMENTED YET]
**kwargs – sent to get_ccap and get_dcap.

Returns:

pandas.DataFrame ((cycle) voltage, capacity, (direction (-1, 1))) unless split is explicitly set to True. Then it returns a tuple with capacity and voltage.
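
The three method options differ only in where the second half-cycle is placed on the capacity axis. The sketch below illustrates one reading of the descriptions above; `place_half_cycles` is a hypothetical helper, the capacities are made-up numbers, and applying shift to the first half-cycle in every method is an assumption.

```python
def place_half_cycles(first_capacity, second_capacity, method="back-and-forth", shift=0.0):
    """Return capacity-axis values for both half-cycles of one cycle."""
    first = [shift + c for c in first_capacity]
    end_of_first = first[-1]
    if method == "back-and-forth":
        # Second half runs backwards from where the first half ended.
        second = [end_of_first - c for c in second_capacity]
    elif method == "forth":
        # Second half continues along the axis.
        second = [end_of_first + c for c in second_capacity]
    elif method == "forth-and-forth":
        # Second half restarts at the same start value as the first.
        second = [shift + c for c in second_capacity]
    else:
        raise ValueError(f"unknown method: {method!r}")
    return first, second


charge = [0, 50, 100]
discharge = [0, 40, 80]

_, back = place_half_cycles(charge, discharge, "back-and-forth")
_, forth = place_half_cycles(charge, discharge, "forth")
_, restart = place_half_cycles(charge, discharge, "forth-and-forth")
```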

get_ccap(cycle=None, converter=None, mode='gravimetric', as_frame=True, usteps=False, **kwargs)[source]#

Returns charge capacity and voltage for the selected cycle.

Parameters:

cycle (int) – cycle number.
converter (float) – a multiplication factor that converts the values to specific values (e.g. from Ah to mAh/g). If not provided (or None), the factor is obtained from the self.get_converter_to_specific() method.
mode (string) – ‘gravimetric’, ‘areal’ or ‘absolute’. Defaults to ‘gravimetric’. Used if converter is not provided (or None).
as_frame (bool) – if True: returns externals.pandas.DataFrame instead of capacity, voltage series.
**kwargs (dict) – additional keyword arguments sent to the internal _get_cap method.

Returns:

pandas.DataFrame or list of pandas.Series if cycle=None and as_frame=False.

get_converter_to_specific(dataset: Data = None, value: float = None, from_units: CellpyUnits = None, to_units: CellpyUnits = None, mode: str = 'gravimetric') → float[source]#

Convert from absolute units to specific (areal or gravimetric).

The method provides a conversion factor that you can multiply your values with to get them into specific values.

Parameters:

dataset – data instance
value – value used to scale on.
from_units – defaults to data.raw_units.
to_units – defaults to cellpy_units.
mode (str) – gravimetric, areal or absolute

Returns:

conversion factor (float)
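
As a worked example of what such a factor looks like, the sketch below converts a capacity in Ah to a gravimetric capacity in mAh/g for an active mass given in mg. The helper `ah_to_mah_per_g` and the fixed unit choices are assumptions for illustration; the actual method derives the scaling from the raw units and cellpy units of the data.

```python
def ah_to_mah_per_g(mass_mg):
    """Factor that turns a capacity in Ah into mAh/g for a mass given in mg."""
    mah_per_ah = 1000.0          # 1 Ah = 1000 mAh
    mass_g = mass_mg / 1000.0    # 1 g = 1000 mg
    return mah_per_ah / mass_g


factor = ah_to_mah_per_g(2.0)        # 2 mg of active material
specific_capacity = 0.001 * factor   # 1 mAh measured on that electrode
```

Multiplying a whole capacity column by the factor once is cheaper than converting units row by row.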

get_current(cycle=None, with_index=True, with_time=False, as_frame=True)[source]#

Returns current (in raw units).

Parameters:

cycle – cycle number (all cycles if None).
with_index – if True, includes the cycle index as a column in the returned pandas.DataFrame.
with_time – if True, includes the time as a column in the returned pandas.DataFrame.
as_frame – if not True, returns a list of current values as numpy arrays (one for each cycle). Remark that with_time and with_index will be False if as_frame is set to False.

Returns:

pandas.DataFrame (or list of pandas.Series if cycle=None and as_frame=False)

get_cycle_numbers(steptable=None, rate=None, rate_on=None, rate_std=None, rate_agg='first', inverse=False)[source]#

Get an array containing the cycle numbers in the test.

Parameters:

steptable (pandas.DataFrame) – the step-table to use (if None, the step-table from the cellpydata object will be used).
rate (float) – the rate to filter on. Note that it should be given as a float, i.e. you will have to convert from C-rate to the actual numeric value. For example, use rate=0.05 if you want to filter on cycles that have a C/20 rate.
rate_on (str) – only select cycles based on the rate of this step-type (e.g. rate_on="discharge").
rate_std (float) – allow for this inaccuracy in C-rate when selecting cycles
rate_agg (str) – perform an aggregation on rate if more than one step of charge or discharge is found (e.g. “mean”, “first”, “max”). For example, if rate_agg="mean", the average rate for each cycle will be returned. Set to None if you want to keep all the rates.
inverse (bool) – select steps that do not have the given C-rate.

Returns:

numpy.ndarray of cycle numbers.
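
The rate filter can be pictured as a tolerance test around the requested rate. The sketch below is illustrative only: `select_cycles` is a hypothetical helper, the per-cycle rates are made up, and treating the tolerance bounds as inclusive is an assumption.

```python
def select_cycles(rates_by_cycle, rate, rate_std=0.0, inverse=False):
    """Return cycle numbers whose rate is within ``rate_std`` of ``rate``."""
    selected = []
    for cycle, cycle_rate in sorted(rates_by_cycle.items()):
        matches = abs(cycle_rate - rate) <= rate_std
        # With inverse=True, keep the cycles that do NOT match.
        if matches != inverse:
            selected.append(cycle)
    return selected


c_over_20 = 1 / 20  # C/20 expressed as a float, i.e. 0.05
rates_by_cycle = {1: 0.05, 2: 0.051, 3: 0.5, 4: 1.0}

slow = select_cycles(rates_by_cycle, rate=c_over_20, rate_std=0.005)
others = select_cycles(rates_by_cycle, rate=c_over_20, rate_std=0.005, inverse=True)
```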

get_datetime(cycle=None, with_index=True, with_time=False, as_frame=True)[source]#

Returns datetime (in raw units).

Parameters:

cycle – cycle number (all cycles if None).
with_index – if True, includes the cycle index as a column in the returned pandas.DataFrame.
with_time – if True, includes the time as a column in the returned pandas.DataFrame.
as_frame – if not True, returns a list of datetime values as numpy arrays (one for each cycle). Note that with_time and with_index will be False if as_frame is set to False.

Returns:

pandas.DataFrame (or list of pandas.Series if cycle=None and as_frame=False)

get_dcap(cycle=None, converter=None, mode='gravimetric', as_frame=True, usteps=False, **kwargs)[source]#

Returns discharge capacity and voltage for the selected cycle.

Parameters:

cycle (int) – cycle number.
converter (float) – a multiplication factor that converts the values to specific values (e.g. from Ah to mAh/g). If not provided (or None), the factor is obtained from the self.get_converter_to_specific() method.
mode (string) – ‘gravimetric’, ‘areal’ or ‘absolute’. Defaults to ‘gravimetric’. Used if converter is not provided (or None).
as_frame (bool) – if True: returns externals.pandas.DataFrame instead of capacity, voltage series.
**kwargs (dict) – additional keyword arguments sent to the internal _get_cap method.

Returns:

pandas.DataFrame or list of pandas.Series if cycle=None and as_frame=False.

get_ir()[source]#: Get the IR data (Deprecated).

get_mass()[source]#

Returns the mass of the active material (in mg).

This method will be deprecated in the future.

get_number_of_cycles(steptable=None)[source]#: Get the number of cycles in the test.

get_ocv(cycles=None, direction='up', remove_first=False, interpolated=False, dx=None, number_of_points=None) → DataFrame[source]#

Get the open circuit voltage relaxation curves.

Parameters:

cycles (list of ints or None) – the cycles to extract from (selects all if not given).
direction ("up", "down", or "both") – extract only relaxations that are performed during discharge for “up” (because then the voltage relaxes upwards) etc.
remove_first – remove the first relaxation curve (typically, the first curve is from the initial rest period between assembling the cell and the start of the actual testing/cycling)
interpolated (bool) – set to True if you want the data to be interpolated (e.g. for creating smaller files)
dx (float) – the step used when interpolating.
number_of_points (int) – number of points to use (over-rides dx) for interpolation (i.e. the length of the interpolated data).

Returns:

pandas.DataFrame with cycle-number, step-number, step-time, and voltage columns.

get_rates(steptable=None, agg='first', direction=None)[source]#

Get the rates in the test (only valid for constant current).

Parameters:

steptable – provide custom steptable (if None, the steptable from the cellpydata object will be used).
agg (str) – perform an aggregation if more than one step of charge or discharge is found (e.g. “mean”, “first”, “max”). For example, if agg=’mean’, the average rate for each cycle will be returned. Set to None if you want to keep all the rates.
direction (str or list of str) – only select rates for this direction (e.g. “charge” or “discharge”).

Returns:

pandas.DataFrame with cycle, type, and rate_avr (i.e. C-rate) columns.

get_raw(header, cycle: Iterable | int | None = None, with_index: bool = True, with_step: bool = False, with_time: bool = False, additional_headers: list | None = None, as_frame: bool = True, scaler: float | None = None) → DataFrame | List[array][source]#

Returns the values for column with given header (in raw units).

Parameters:

header – header name.
cycle – cycle number (all cycles if None).
with_index – if True, includes the cycle index as a column in the returned pandas.DataFrame.
with_step – if True, includes the step index as a column in the returned pandas.DataFrame.
with_time – if True, includes the time as a column in the returned pandas.DataFrame.
additional_headers (list) – additional headers to include in the returned pandas.DataFrame.
as_frame – if not True, returns a list of values as numpy arrays (one for each cycle). Note that with_time and with_index will be False if as_frame is set to False.
scaler – if not None, the returned values are scaled by this value.

Returns:

pandas.DataFrame (or list of numpy arrays if as_frame=False)

get_step_numbers(steptype: str = 'charge', allctypes: bool = True, pdtype: bool = False, cycle_number: int = None, trim_taper_steps: int = None, steps_to_skip: list | None = None, steptable: Any = None, usteps: bool = False) → dict | Any[source]#

Get the step numbers of selected type.

Returns the selected step_numbers for the selected type of step(s). Either in a dictionary containing a list of step numbers corresponding to the selected steptype for the cycle(s), or a pandas.DataFrame instead of a dict of lists if pdtype is set to True. The frame is a sub-set of the step-table frame (i.e. all the same columns, only filtered by rows).

Parameters:

steptype (string) – string identifying type of step.
allctypes (bool) – get all types of charge (or discharge).
pdtype (bool) – return results as pandas.DataFrame
cycle_number (int) – selected cycle, selects all if not set.
trim_taper_steps (int) – number of taper steps to skip (counted from the end, i.e. 1 means skip last step in each cycle).
steps_to_skip (list) – step numbers that should not be included.
steptable (pandas.DataFrame) – optional steptable

Returns:

dict or pandas.DataFrame

Example

>>> # c is assumed to be a CellpyCell instance with data loaded
>>> my_charge_steps = c.get_step_numbers(
...    "charge",
...    cycle_number=3
... )
>>> print(my_charge_steps)
{3: [5, 8]}

get_summary(use_summary_made=False)[source]#: Retrieve summary returned as a pandas DataFrame.

Warning

This function is deprecated. Use the CellpyCell.data.summary property instead.

get_timestamp(cycle=None, with_index=True, as_frame=True, in_minutes=False, units='raw')[source]#

Returns timestamp.

Parameters:

cycle – cycle number (all cycles if None).
with_index – if True, includes the cycle index as a column in the returned pandas.DataFrame.
as_frame – if not True, returns a list of timestamp values as numpy arrays (one for each cycle). Note that with_index will be False if as_frame is set to False.
in_minutes – (deprecated, use units=”minutes” instead) return values in minutes instead of seconds if True.
units – return values in given time unit (“raw”, “seconds”, “minutes”, “hours”).

Returns:

pandas.DataFrame (or list of pandas.Series if cycle=None and as_frame=False)

get_voltage(cycle=None, with_index=True, with_time=False, as_frame=True)[source]#

Returns voltage (in raw units).

Parameters:

cycle – cycle number (all cycles if None).
with_index – if True, includes the cycle index as a column in the returned pandas.DataFrame.
with_time – if True, includes the time as a column in the returned pandas.DataFrame.
as_frame – if not True, returns a list of voltage values as numpy arrays (one for each cycle). Note that with_time and with_index will be False if as_frame is set to False.

Returns:

pandas.DataFrame (or list of pandas.Series if cycle=None and as_frame=False)

has_data_point_as_column()[source]#: Check if the raw data has data_point as column.

has_data_point_as_index()[source]#: Check if the raw data has data_point as index.

has_no_full_duplicates()[source]#: Check if the raw data has no full duplicates.

has_no_partial_duplicates(subset='data_point')[source]#: Check if the raw data has no partial duplicates.

initialize()[source]#: Initialize the CellpyCell object with empty Data instance.

inspect_nominal_capacity(cycles=None)[source]#

Method for estimating the nominal capacity

Parameters:: cycles (list of ints) – the cycles where it is assumed that the data reaches nominal capacity.
Returns:: Nominal capacity (float).

load(cellpy_file, parent_level=None, return_cls=True, accept_old=True, selector=None, **kwargs)[source]#

Loads a cellpy file.

Parameters:

cellpy_file (OtherPath, str) – Full path to the cellpy file.
parent_level (str, optional) – Parent level. Warning! Deprecating this soon!
return_cls (bool) – Return the class.
accept_old (bool) – Accept loading old cellpy-file versions. Instead of raising WrongFileVersion it only issues a warning.
selector (str) – Experimental feature - select specific ranges of data.

Returns:

cellpy.CellpyCell class if return_cls is True

load_step_specifications(file_name, short=False)[source]#

Load a table that contains step-type definitions.

This method loads a file containing a specification for each step, or for each (cycle_number, step_number) combination if short==False, and runs the make_step_table method. The step_cycle specifications that are allowed are stored in the variable cellreader.list_of_step_types.

Parameters:

file_name (str) – name of the file to load
short (bool) – if True, the file only contains step numbers and step types. If False, the file contains cycle numbers as well.

Returns:

None

loadcell(raw_files, cellpy_file=None, mass=None, summary_on_raw=True, summary_on_cellpy_file=True, find_ir=True, find_end_voltage=True, force_raw=False, use_cellpy_stat_file=None, cell_type=None, loading=None, area=None, estimate_area=True, selector=None, **kwargs)[source]#

Loads data for given cells (soon to be deprecated).

Parameters:

raw_files (list) – name of res-files
cellpy_file (path) – name of cellpy-file
mass (float) – mass of electrode or active material
summary_on_raw (bool) – calculate summary if loading from raw
summary_on_cellpy_file (bool) – calculate summary if loading from cellpy-file.
find_ir (bool) – summarize ir
find_end_voltage (bool) – summarize end voltage
force_raw (bool) – only use raw-files
use_cellpy_stat_file (bool) – use stat file if creating summary from raw
cell_type (str) – set the data type (e.g. “anode”). If not, the default from the config file is used.
loading (float) – loading in units [mass] / [area], used to calculate area if area not given
area (float) – area of active electrode
estimate_area (bool) – calculate area from loading if given (defaults to True).
selector (dict) – passed to load.
**kwargs – passed to from_raw

Example

>>> srnos = my_dbreader.select_batch("testing_new_solvent")
>>> cell_datas = []
>>> for srno in srnos:
...     my_run_name = my_dbreader.get_cell_name(srno)
...     mass = my_dbreader.get_mass(srno)
...     rawfiles, cellpyfiles = filefinder.search_for_files(my_run_name)
...     cell_data = cellreader.CellpyCell()
...     cell_data.loadcell(raw_files=rawfiles,
...                        cellpy_file=cellpyfiles)
...     cell_data.set_mass(mass)
...     cell_data.make_summary() # etc. etc.
...     cell_datas.append(cell_data)

Warning

This method will soon be deprecated. Use cellpy.get instead.

make_step_table(step_specifications=None, short=False, override_step_types=None, override_raw_limits=None, profiling=False, all_steps=False, usteps=False, add_c_rate=True, skip_steps=None, sort_rows=True, from_data_point=None, nom_cap_specifics=None)[source]#

Create a table (v.4) that contains summary information for each step.

This function creates a table containing information about the different steps for each cycle and, based on that, decides what type of step it is (e.g. charge) for each cycle.

The format of the steps is:

index: cycleno - stepno - sub-step-no - ustep
Time info: average, stdev, max, min, start, end, delta
Logging info: average, stdev, max, min, start, end, delta
Current info: average, stdev, max, min, start, end, delta
Voltage info: average, stdev, max, min, start, end, delta
Type: (from pre-defined list) - SubType
Info: not used.

Parameters:

step_specifications (pandas.DataFrame) – step specifications
short (bool) – step specifications in short format
override_step_types (dict) – override the provided step types, for example set all steps with step number 5 to “charge” by providing {5: “charge”}.
override_raw_limits (dict) – override the instrument limits (resolution), for example set ‘current_hard’ to 0.1 by providing {‘current_hard’: 0.1}.
profiling (bool) – turn on profiling
usteps (bool) – investigate all steps including same steps within one cycle (this is useful for e.g. GITT).
add_c_rate (bool) – include a C-rate estimate in the steps
skip_steps (list of integers) – list of step numbers that should not be processed (future feature - not used yet).
sort_rows (bool) – sort the rows after processing.
from_data_point (int) – first data point to use.
nom_cap_specifics (str) – “gravimetric”, “areal”, or “absolute”.

Returns:

None
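
The per-step statistics listed in the format above can be illustrated with a short sketch for a single quantity. This is not cellpy's code: `summarize_step` is a hypothetical helper, the sample standard deviation is an assumption, and delta is taken here as end minus start, which may differ from how cellpy defines it (for instance as a relative change).

```python
from statistics import mean, stdev


def summarize_step(values):
    """Return summary statistics for the values recorded during one step."""
    return {
        "average": mean(values),
        # A single point has no spread; avoid the error stdev() would raise.
        "stdev": stdev(values) if len(values) > 1 else 0.0,
        "max": max(values),
        "min": min(values),
        "start": values[0],
        "end": values[-1],
        # Assumption: plain difference between last and first value.
        "delta": values[-1] - values[0],
    }


# Made-up voltage readings for one charge step.
voltage_stats = summarize_step([3.0, 3.5, 4.0])
```

Statistics like these are what allow a step to be classified: a step with a steady non-zero current and a rising voltage, for example, looks like a charge step.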

make_summary(find_ir=False, find_end_voltage=True, use_cellpy_stat_file=None, ensure_step_table=True, remove_duplicates=True, normalization_cycles=None, nom_cap=None, nom_cap_specifics=None, old=False, create_copy=False, exclude_types=None, exclude_steps=None, selector_type=None, selector=None, **kwargs)[source]#

Convenience function that makes a summary of the cycling data.

Parameters:

find_ir (bool) – if True, the internal resistance will be calculated.
find_end_voltage (bool) – if True, the end voltage will be calculated.
use_cellpy_stat_file (bool) – if True, the summary will be made from the cellpy_stat file (soon to be deprecated).
ensure_step_table (bool) – if True, the step-table will be made if it does not exist.
remove_duplicates (bool) – if True, duplicates will be removed from the summary.
normalization_cycles (int or list of int) – cycles to use for normalization.
nom_cap (float or str) – nominal capacity (if None, the nominal capacity from the data will be used).
nom_cap_specifics (str) – gravimetric, areal, or volumetric.
old (bool) – if True, the old summary method will be used.
create_copy (bool) – if True, a copy of the cellpy object will be returned.
exclude_types (list of str) – exclude these types from the summary.
exclude_steps (list of int) – exclude these steps from the summary.
selector_type (str) – select based on type (e.g. “non-cv”, “non-rest”, “non-ocv”, “only-cv”).
selector (callable) – custom selector function.
**kwargs – additional keyword arguments sent to internal method (check source for info).

Returns:

cellpy object with the summary added to it.

Return type:

cellpy.CellpyData

property mass#: Returns the mass

merge(datasets: list, **kwargs)[source]#: This function merges datasets into one set.

mod_raw_split_cycle(data_points: List) → None[source]#

Split cycle(s) into several cycles.

Parameters:: data_points – list of the first data point(s) for additional cycle(s).

property nom_cap#: Returns the nominal capacity

property nom_cap_specifics#: Returns the nominal capacity specific

property nominal_capacity#: Returns the nominal capacity

nominal_capacity_as_absolute(value=None, specific=None, nom_cap_specifics=None, convert_charge_units=False)[source]#: Get the nominal capacity as absolute value.

populate_step_dict(step)[source]#: Returns a dict with cycle numbers as keys and corresponding steps (list) as values.

print_steps()[source]#: Print the step table.

property raw_units#: Returns the raw_units dictionary

register_instrument_readers()[source]#: Register instrument readers.

save(filename, force=False, overwrite=None, extension='h5', ensure_step_table=None, ensure_summary_table=None, cellpy_file_format='hdf5')[source]#

Save the data structure to cellpy-format.

Parameters:

filename – (str or pathlib.Path) the name you want to give the file
force – (bool) save a file even if the summary is not made yet (not recommended)
overwrite – (bool) save the new version of the file even if old one exists.
extension – (str) filename extension.
ensure_step_table – (bool) make step-table if missing.
ensure_summary_table – (bool) make summary-table if missing.
cellpy_file_format – (str) format of the cellpy-file (only hdf5 is supported so far).

Returns:

None

select_steps(step_dict, append_df=False)[source]#: Select steps (not documented yet).

set_cellpy_datadir(directory=None)[source]#

Set the directory containing .hdf5-files.

Used for setting directory for looking for hdf5-files. A valid directory name is required.

Parameters:: directory (str) – path to hdf5-directory

Example

>>> d = CellpyCell()
>>> directory = "MyData/HDF5"
>>> d.set_cellpy_datadir(directory)

static set_col_first(df, col_names)[source]#

Set selected columns first in a pandas.DataFrame.

This function moves the columns named in col_names (a list) to the front of the DataFrame. The last column in col_names ends up first because it is processed last.
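
The resulting order can be checked with a plain-list sketch of the same idea. The helper `columns_first` is hypothetical and works on a list of column names rather than on a DataFrame.

```python
def columns_first(columns, col_names):
    """Move each name in ``col_names`` to the front, in the order given."""
    ordered = list(columns)
    for name in col_names:
        ordered.remove(name)
        # Each processed name is pushed to position 0, ahead of earlier ones.
        ordered.insert(0, name)
    return ordered


columns = ["voltage", "current", "cycle_index", "step_index"]
reordered = columns_first(columns, ["cycle_index", "step_index"])
```

To get cycle_index first and step_index second, the names would therefore be passed in reverse: ["step_index", "cycle_index"].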

set_instrument(instrument=None, model=None, instrument_file=None, unit_test=False, **kwargs)[source]#

Set the instrument (i.e. tell cellpy the file-type you use).

Three different modes of setting instruments are currently supported. You can provide one of the already supported instrument names (see the documentation, e.g. “arbin_res”). You can use the “custom” loader by providing the path to a yaml-file describing the file format. This can be done either by setting instrument to “instrument_name::instrument_definition_file_name”, or by setting instrument to “custom” and providing the definition file name through the instrument_file keyword argument. A last option exists where you provide the yaml-file name directly to the instrument parameter. Cellpy will then look into your local instrument folder and search for the yaml-file. Some instrument types also support a model key-word.

Parameters:

instrument – (str) in [“arbin_res”, “maccor_txt”,…]. If instrument ends with “.yml” a local instrument file will be used. For example, if instrument is “my_instrument.yml”, cellpy will look into the local instruments folders for a file called “my_instrument.yml” and then use LocalTxtLoader to load after registering the instrument. If the instrument name contains a ‘::’ separator, the part after the separator will be interpreted as ‘instrument_file’.
model – (str) optionally specify if the instrument loader supports handling several models (some instruments allow for exporting data in slightly different formats depending on the choices made during the export or the model of the instrument, e.g. different number of header lines, different encoding).
instrument_file – (path) instrument definition file.
unit_test – (bool) set to True if you want to print the settings instead of setting them.
kwargs (dict) – key-word arguments sent to the initializer of the loader class

Notes

If you are using a local instrument loader, you will have to register it first to the loader factory.

>>> c = CellpyCell()  # this will automatically register the already implemented loaders
>>> c.instrument_factory.register_builder(instrument_id, (module_name, path_to_instrument_loader_file))

It is highly recommended to use the module_name as the instrument_id.
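
The rules for interpreting the instrument argument can be summarized in a small sketch. It only mirrors the description above: `resolve_instrument` is a hypothetical helper, and the label "local_instrument" is a placeholder for the local-file case, not a name taken from cellpy.

```python
def resolve_instrument(instrument, instrument_file=None):
    """Return (instrument_name, instrument_file) following the documented rules."""
    if "::" in instrument:
        # "name::file" form: the part after the separator is the definition file.
        instrument, instrument_file = instrument.split("::", 1)
    elif instrument.endswith(".yml"):
        # A bare yaml-file name refers to a file in the local instrument folder.
        instrument, instrument_file = "local_instrument", instrument
    return instrument, instrument_file


built_in = resolve_instrument("arbin_res")
combined = resolve_instrument("custom::my_definition.yml")
keyword = resolve_instrument("custom", instrument_file="my_definition.yml")
local = resolve_instrument("my_instrument.yml")
```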

set_mass(mass, validated=None)[source]#: Warning

This function is deprecated. Use the setter instead (mass = value).

set_nom_cap(nom_cap, validated=None)[source]#: Warning

This function is deprecated. Use the setter instead (nom_cap = value).

set_raw_datadir(directory=None)[source]#

Set the directory containing .res-files.

Used for setting directory for looking for res-files. A valid directory name is required.

Parameters:: directory (str) – path to res-directory

Example

>>> d = CellpyCell()
>>> directory = "MyData/cycler-data"
>>> d.set_raw_datadir(directory)

set_tot_mass(mass, validated=None)[source]#: Warning

This function is deprecated. Use the setter instead (tot_mass = value).

sget_current(cycle, step)[source]#

Returns current for cycle, step.

Convenience function; same as issuing:

raw[(raw[cycle_index_header] == cycle) & (raw[step_index_header] == step)][current_header]

Parameters:

cycle – cycle number
step – step number

Returns:

pandas.Series or None if empty
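
The selection shown above can be tried on a small stand-in frame. This is a minimal sketch: the column names `cycle_index`, `step_index` and `current` are placeholders for the header names cellpy keeps in its header definitions, the data is made up, and `select_current` is a hypothetical helper rather than part of cellpy.

```python
import pandas as pd

# Placeholder header names (assumptions, not cellpy's actual header strings).
cycle_index_header = "cycle_index"
step_index_header = "step_index"
current_header = "current"

# Made-up raw data: two cycles with two steps each.
raw = pd.DataFrame(
    {
        cycle_index_header: [1, 1, 1, 2, 2],
        step_index_header: [1, 2, 2, 1, 2],
        current_header: [0.0, 0.5, 0.4, 0.0, 0.5],
    }
)


def select_current(raw, cycle, step):
    """Return the current column for rows matching both cycle and step."""
    mask = (raw[cycle_index_header] == cycle) & (raw[step_index_header] == step)
    return raw.loc[mask, current_header]


selected = select_current(raw, cycle=1, step=2)
```

The same boolean-mask pattern applies to the other `sget_*` methods; only the selected column changes.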

sget_step_numbers(cycle, step)[source]#

Returns step number for cycle, step.

Convenience function; same as issuing:

raw[(raw[cycle_index_header] == cycle) &
     (raw[step_index_header] == step)][step_index_header]

Parameters:

cycle – cycle number
step – step number (can be a list of several step numbers)

Returns:

pandas.Series

sget_steptime(cycle, step)[source]#

Returns step time for cycle, step.

Convenience function; same as issuing:

raw[(raw[cycle_index_header] == cycle) & (raw[step_index_header] == step)][step_time_header]

Parameters:

cycle – cycle number
step – step number

Returns:

pandas.Series or None if empty

sget_timestamp(cycle, step)[source]#

Returns timestamp for cycle, step.

Convenience function; same as issuing:

raw[(raw[cycle_index_header] == cycle) &
     (raw[step_index_header] == step)][timestamp_header]

Parameters:

cycle – cycle number
step – step number (can be a list of several step numbers)

Returns:

pandas.Series

sget_voltage(cycle, step)[source]#

Returns voltage for cycle, step.

Convenience function; same as issuing:

raw[(raw[cycle_index_header] == cycle) &
     (raw[step_index_header] == step)][voltage_header]

Parameters:

cycle – cycle number
step – step number

Returns:

pandas.Series or None if empty

split(cycle=None)[source]#: Split experiment (CellpyCell object) into two sub-experiments. If cycle is not given, it will split on the median cycle number.
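
The median-based default can be illustrated on a plain list of cycle numbers. This is a sketch under stated assumptions: `split_cycles` is a hypothetical helper, and the choice to put the split cycle itself into the second part is an assumption, since the text above does not say which side it belongs to.

```python
from statistics import median


def split_cycles(cycle_numbers, cycle=None):
    """Split cycle numbers into two lists at `cycle` (median if None).

    Assumption: the split cycle itself goes into the second list.
    """
    cycle_numbers = sorted(cycle_numbers)
    if cycle is None:
        # Fall back to the median cycle number, truncated to an integer.
        cycle = int(median(cycle_numbers))
    first = [c for c in cycle_numbers if c < cycle]
    second = [c for c in cycle_numbers if c >= cycle]
    return first, second


first, second = split_cycles(range(1, 11))
```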

split_many(base_cycles: int | List[int] | None = None) → List[CellpyCell][source]#

Split experiment (CellpyCell object) into several sub-experiments.

Parameters:: base_cycles (int or list of ints) – cycle(s) to do the split on.
Returns:: List of CellpyCell objects

to_bdf(filename=None, *, cycles=None, last_cycle=None, header_style='preferred', format='csv', extras=False, preprocess_fn=None, bdf_units=None)[source]#

Export the raw time-series in Battery Data Format (BDF).

See Battery Data Format for the full specification.

Parameters:

filename – Output path. If None or extensionless, a default <cell_name>.bdf.<format> (or <filename>.bdf.<format>) is used. An explicit suffix is honoured as-is.
cycles – Optional cycle filter. None exports all cycles; an int exports that single cycle; an iterable of ints exports the listed cycles. Combines with last_cycle.
last_cycle – If given, drop rows whose cycle index exceeds last_cycle.
header_style – "preferred" (default, BDF spec) writes headers like "Test Time / s". "machine" writes machine-readable names like "test_time_second".
format – "csv" (default) or "parquet".
extras – Append columns from data.raw that are not in the BDF column map. False (default) exports only the BDF columns. True appends every unmapped raw column verbatim (no unit conversion, original name preserved). A string or iterable of strings restricts the appended columns to the listed names. The resulting file is no longer strictly BDF-compliant.
preprocess_fn – A function that takes the raw DataFrame and returns a new DataFrame. This function is applied to the raw DataFrame after the cycle filter and before the BDF export.
bdf_units –
Optional CellpyUnits controlling the units written into the BDF file. None (default) emits a strictly BDF-compliant file (A, V, Ah, Wh, s, W, ohm). When set, each attribute on the CellpyUnits overrides the spec target for the corresponding column kind (charge → charge / discharge capacity, energy → charge / discharge energy, etc.); column labels and machine names are rebuilt from the override (e.g. "Charging Capacity / mAh" / "charging_capacity_mah") and values are scaled accordingly via pint. An incompatible unit (e.g. charge="kg") raises ValueError. A file written with overrides is no longer strictly BDF-compliant; this is logged once at INFO level.

Example:
```
from cellpy.parameters.internal_settings import CellpyUnits

# write charge in mAh and current in mA
bdf_units = CellpyUnits(charge="mAh", current="mA")
cell.to_bdf("out.bdf.csv", bdf_units=bdf_units)
```

Returns:

The path that the file was written to.

Return type:

pathlib.Path

Raises:

ValueError – If the cell has no raw data, any BDF-required column is missing from data.raw, or bdf_units specifies a unit that cannot be converted from the cell’s source unit.
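
The filename and cycle-filter rules described for to_bdf can be sketched in plain Python. Both helpers below (`resolve_bdf_path` and `keep_cycle`) are hypothetical names written for illustration from the parameter descriptions above; they are not taken from cellpy's implementation.

```python
from pathlib import Path


def resolve_bdf_path(filename, cell_name, fmt="csv"):
    """Apply the documented default-name rule (illustrative sketch)."""
    if filename is None:
        # No filename: fall back to <cell_name>.bdf.<format>
        return Path(f"{cell_name}.bdf.{fmt}")
    path = Path(filename)
    if path.suffix == "":
        # Extensionless filename: append ".bdf.<format>"
        return path.with_name(f"{path.name}.bdf.{fmt}")
    # An explicit suffix is kept as-is.
    return path


def keep_cycle(cycle_index, cycles=None, last_cycle=None):
    """Return True if a row with this cycle index passes both filters."""
    if cycles is not None:
        # An int selects a single cycle; an iterable selects the listed cycles.
        wanted = {cycles} if isinstance(cycles, int) else set(cycles)
        if cycle_index not in wanted:
            return False
    if last_cycle is not None and cycle_index > last_cycle:
        return False
    return True


kept = [c for c in range(1, 6) if keep_cycle(c, cycles=[2, 4, 5], last_cycle=4)]
```

In this sketch the two filters are combined with a logical AND, which is one reading of "Combines with last_cycle".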

to_cellpy_unit(value, physical_property)[source]#

Convert value to cellpy units.

Parameters:

value (numeric, pint.Quantity or str) – what you want to convert from
physical_property (str) – What this value is a measure of (must correspond to one of the keys in the CellpyUnits class).

Returns (numeric):: the value in cellpy units

to_csv(datadir=None, sep=None, cycles=False, raw=True, summary=True, shifted=False, method=None, shift=0.0, last_cycle=None)[source]#

Saves the data as .csv file(s).

Parameters:

datadir – folder where to save the data (uses current folder if not given).
sep – the separator to use in the csv file (defaults to CellpyCell.sep).
cycles – (bool) export voltage-capacity curves if True.
raw – (bool) export raw-data if True.
summary – (bool) export summary if True.
shifted (bool) – export with cumulated shift.
method (string) –
how the curves are given:
- “back-and-forth” - standard back and forth; discharge (or charge) reversed from where charge (or discharge) ends.
- “forth” - discharge (or charge) continues along x-axis.
- “forth-and-forth” - discharge (or charge) also starts at 0 (or shift if not shift=0.0)
shift – start-value for charge (or discharge)
last_cycle – process only up to this cycle (if not None).

Returns:

None

to_cycle(cycle: int) → CellpyCell[source]#: Select experiment (CellpyCell object) to cycle number ‘cycle’

to_excel(filename=None, cycles=None, raw=False, steps=True, nice=True, get_cap_kwargs=None, to_excel_kwargs=None)[source]#

Saves the data as .xlsx file(s).

Parameters:

filename – name of the Excel file.
cycles – (None, bool, or list of ints) export voltage-capacity curves if given.
raw – (bool) export raw-data if True.
steps – (bool) export steps if True.
nice – (bool) use nice formatting if True.
get_cap_kwargs – (dict) kwargs for CellpyCell.get_cap method.
to_excel_kwargs – (dict) kwargs for pandas.DataFrame.to_excel method.

property tot_mass#: Returns the total mass

total_time_at_voltage_level(cycles=None, voltage_limit=0.5, sampling_unit='S', at='low')[source]#

Experimental method for getting the total time spent at low / high voltage.

Parameters:

cycles – cycle number (all cycles if None).
voltage_limit – voltage limit (default 0.5 V). Can be a tuple (low, high) if at=“between”.
sampling_unit – sampling unit (default “S”):
- H: hourly frequency
- T, min: minutely frequency
- S: secondly frequency
- L, ms: milliseconds
- U, us: microseconds
- N: nanoseconds
at (str) – “low”, “high”, or “between” (default “low”)

unit_scaler_from_raw(unit, physical_property)[source]#

Get the conversion factor going from raw to given unit.

Parameters:

unit (str) – what you want to convert to
physical_property (str) – what this value is a measure of (must correspond to one of the keys in the CellpyUnits class).

Returns (numeric):: conversion factor (scaler)

classmethod vacant(cell=None)[source]#

Create a CellpyCell instance.

Parameters:: cell – the attributes from the data will be copied to the new CellpyCell instance.

with_cellpy_unit(parameter, as_str=False)[source]#: Return quantity as pint.Quantity object.

with_cycles(cycles: int | List[int]) → CellpyCell[source]#

Select a subset of cycles from the experiment (CellpyCell object).

This method should only be used for quick selection of cycles (e.g. for plotting).

Parameters:: cycles (int or iterable of ints) – cycle number(s) to keep.
Returns:: A new CellpyCell object containing only the selected cycles.

get(filename=None, instrument=None, instrument_file=None, cellpy_file=None, cycle_mode=None, mass: str | Number = None, nominal_capacity: str | Number = None, nom_cap_specifics=None, loading=None, area: str | Number = None, estimate_area=True, logging_mode=None, custom_log_dir=None, custom_log_config_path=None, auto_pick_cellpy_format=True, auto_summary=True, units=None, step_kwargs=None, summary_kwargs=None, selector=None, testing=False, refuse_copying=False, initialize=False, debug=False, **kwargs)[source]#

Create a CellpyCell object

Parameters:

filename (str, os.PathLike, OtherPath, or list of raw-file names) – path to file(s) or data-set(s) to load.
instrument (str) – instrument to use (defaults to the one in your cellpy config file).
instrument_file (str or path) – yaml file for custom file type.
cellpy_file (str, os.PathLike, or OtherPath) – if both filename (a raw-file) and cellpy_file (a cellpy file) are provided, cellpy will try to check if the raw-file has been updated since the creation of the cellpy-file and select the cellpy-file instead of the raw file if cellpy thinks they are similar (use with care!).
logging_mode (str) – “INFO” or “DEBUG”.
cycle_mode (str) – the cycle mode (e.g. “anode” or “full_cell”).
mass (float) – mass of active material (mg) (defaults to mass given in cellpy-file or 1.0).
nominal_capacity (float) – nominal capacity for the cell (e.g. used for finding C-rates).
nom_cap_specifics (str) – either “gravimetric” (per mass), or “areal” (per area). (“volumetric” is not fully implemented yet - let us know if you need it).
loading (float) – loading in units [mass] / [area].
area (float) – active electrode area (e.g. used for finding the areal capacity).
estimate_area (bool) – calculate area from loading if given (defaults to True).
auto_pick_cellpy_format (bool) – decide if it is a cellpy-file based on suffix.
auto_summary (bool) – (re-) create summary.
units (dict) – update cellpy units (used after the file is loaded, e.g. when creating summary).
step_kwargs (dict) – sent to make_steps.
summary_kwargs (dict) – sent to make_summary.
selector (dict) – passed to load (when loading cellpy-files).
testing (bool) – set to True if testing (will for example prevent making .log files)
refuse_copying (bool) – set to True if you do not want to copy the raw-file before loading.
initialize (bool) – set to True if you want to initialize the CellpyCell object (probably only useful if you want to return a cellpy-file with no data in it).
debug (bool) – set to True if you want to debug the loader.
**kwargs – sent to the loader.

Transferred Parameters:

model (str) – model to use (only for loaders that support models).
bad_steps (list of tuples) – (c, s) tuples of steps s (in cycle c) to skip loading (“arbin_res”).
dataset_number (int) – the data set number (‘Test-ID’) to select if you are dealing with arbin files with more than one data-set. Defaults to selecting all data-sets and merging them (“arbin_res”).
data_points (tuple of ints) – load only data from data_point[0] to data_point[1] (use None for infinite) (“arbin_res”).
increment_cycle_index (bool) – increment the cycle index if merging several datasets (default True) (“arbin_res”).
sep (str) – separator used in the file (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
skip_rows (int) – number of rows to skip in the beginning of the file (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
header (int) – row number of the header (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
encoding (str) – encoding of the file (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
decimal (str) – decimal separator (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
thousand (str) – thousand separator (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
pre_processor_hook (callable) – pre-processors to use (“maccor_txt”, “neware_txt”, “local_instrument”, “custom”).
bad_steps (list) – separator used in the file (not implemented yet) (“pec_csv”).

Returns:

CellpyCell object (if successful, None if not).

Examples

>>> # read an arbin .res file and create a cellpy object with
>>> # populated summary and step-table:
>>> c = cellpy.get("my_data.res", instrument="arbin_res", mass=1.14, area=2.12, loading=1.2, nominal_capacity=155.2)
>>>
>>> # load a cellpy-file:
>>> c = cellpy.get("my_cellpy_file.clp")
>>>
>>> # load a txt-file exported from Maccor:
>>> c = cellpy.get("my_data.txt", instrument="maccor_txt", model="one")
>>>
>>> # load a raw-file if it is newer than the corresponding cellpy-file,
>>> # if not, load the cellpy-file:
>>> c = cellpy.get("my_data.res", cellpy_file="my_data.clp")
>>>
>>> # load a file with a custom file-description:
>>> c = cellpy.get("my_file.csv", instrument_file="my_instrument.yaml")
>>>
>>> # load three subsequent raw-files (of one cell) and merge them:
>>> c = cellpy.get(["my_data_01.res", "my_data_02.res", "my_data_03.res"])
>>>
>>> # load a data set and get the summary charge and discharge capacities
>>> # in Ah/g:
>>> c = cellpy.get("my_data.res", units=dict(capacity="Ah"))
>>>
>>> # get an empty CellpyCell instance:
>>> c = cellpy.get()  # or c = cellpy.get(initialize=True) if you want to initialize it.

instruments_dict()[source]#

Create a dictionary with the available instrument loaders.

The dictionary keys are the instrument names and the values are lists of the available models. If no models are available, the list will be empty.

Returns:: dictionary with the available instrument loaders.
Return type:: dict

print_instruments()[source]#: Prints out the available instrument loaders and their models.

cellpy.readers.core module#

This module contains several of the most important classes used in cellpy.

It also contains functions that are used by readers and utils, as well as the file version definitions.

class BaseDbReader[source]#

Bases: object

Base class for database readers.

abstractmethod from_batch(batch_name: str | None = None, include_key: bool = False, include_individual_arguments: bool = False, **kwargs: Any) → dict[source]#

Get a dictionary with the data from a batch for the journal.

Parameters:

batch_name – name of the batch.
include_key – include the key (the cell ids).
include_individual_arguments – include the individual arguments.

Returns:

dictionary with the data.

Return type:

dict

class BaseSimpleDbReader[source]#

Bases: object

Base class for database readers.

abstractmethod from_batch(batch_name: str, include_key: bool = False, include_individual_arguments: bool = False) → dict[source]#

abstractmethod get_area(pk: int) → float[source]#

abstractmethod get_args(pk: int) → dict[source]#

abstractmethod get_by_column_label(pk: int, name: str) → Any[source]#

abstractmethod get_cell_name(pk: int) → str[source]#

abstractmethod get_cell_type(pk: int) → str[source]#

abstractmethod get_comment(pk: int) → str[source]#

abstractmethod get_experiment_type(pk: int) → str[source]#

abstractmethod get_group(pk: int) → str[source]#

abstractmethod get_instrument(pk: int) → str[source]#

abstractmethod get_label(pk: int) → str[source]#

abstractmethod get_loading(pk: int) → float[source]#

abstractmethod get_mass(pk: int) → float[source]#

abstractmethod get_nom_cap(pk: int) → float[source]#

abstractmethod get_total_mass(pk: int) → float[source]#

abstractmethod inspect_hd5f_fixed(pk: int) → int[source]#

abstractmethod select_batch(batch: str) → List[int][source]#
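
A concrete reader has to implement every abstract method before it can be instantiated. The sketch below shows the pattern on a stand-in base class that declares only three of the methods listed above. `SimpleDbReaderSketch`, `InMemoryReader` and the row layout are hypothetical and exist only for this illustration; a real reader would subclass BaseSimpleDbReader and implement the full list.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class SimpleDbReaderSketch(ABC):
    """Stand-in declaring a small subset of the abstract methods."""

    @abstractmethod
    def select_batch(self, batch: str) -> List[int]:
        """Return the keys of all cells in the batch."""

    @abstractmethod
    def get_cell_name(self, pk: int) -> str:
        """Return the cell name for key pk."""

    @abstractmethod
    def get_mass(self, pk: int) -> float:
        """Return the mass for key pk."""


class InMemoryReader(SimpleDbReaderSketch):
    """Hypothetical reader backed by a dict instead of a database."""

    def __init__(self, rows: Dict[int, dict]):
        self.rows = rows

    def select_batch(self, batch):
        return [pk for pk, row in self.rows.items() if row["batch"] == batch]

    def get_cell_name(self, pk):
        return self.rows[pk]["name"]

    def get_mass(self, pk):
        return self.rows[pk]["mass"]


reader = InMemoryReader(
    {
        1: {"batch": "b01", "name": "cell_a", "mass": 1.02},
        2: {"batch": "b01", "name": "cell_b", "mass": 0.95},
        3: {"batch": "b02", "name": "cell_c", "mass": 1.10},
    }
)
keys = reader.select_batch("b01")
```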

class Data(**kwargs)[source]#

Bases: object

Object to store data for a cell-test.

This class is used for storing all the relevant data for a cell-test, i.e. all the data collected by the tester as stored in the raw-files, and user-provided metadata about the cell-test.

raw_data_files#

list of FileID objects.

Type:: list

raw#

raw data.

Type:: pandas.DataFrame

summary#

summary data.

Type:: pandas.DataFrame

steps#

step data.

Type:: pandas.DataFrame

meta_common#

common meta-data.

Type:: CellpyMetaCommon

meta_test_dependent#

test-dependent meta-data.

Type:: CellpyMetaIndividualTest

custom_info#

custom meta-data.

Type:: Any

raw_units#

dictionary with units for the raw data.

Type:: dict

raw_limits#

dictionary with limits for the raw data.

Type:: dict

loaded_from#

name of the file where the data was loaded from.

Type:: str

property active_electrode_area#

property cell_name#

property empty#: Check if the data object is empty.

property has_data#

property has_steps#: check if the step table exists

property has_summary#: check if the summary table exists

property loading#

property mass#

property material#

property nom_cap#

populate_defaults()[source]#: Populate the data object with default values.

property raw_id#

property start_datetime#

property tot_mass#

class FileID(filename: str | OtherPathNew = None, is_db: bool = False)[source]#

Bases: object

class for storing information about the raw-data files.

This class is used for storing and handling raw-data file information. It is important to keep track of when the data was extracted from the raw-data files so that it is easy to know if the hdf5-files used for storing “treated” data are up-to-date.

name#

Filename of the raw-data file.

Type:: str

full_name#

Filename including path of the raw-data file.

Type:: str

size#

Size of the raw-data file.

Type:: float

last_modified#

Last time of modification of the raw-data file.

Type:: datetime

last_accessed#

last time of access of the raw-data file.

Type:: datetime

last_info_changed#

st_ctime of the raw-data file.

Type:: datetime

location#

Location of the raw-data file.

Type:: str

get_last()[source]#: Get last modification time of the file.

get_name()[source]#: Get the filename.

get_raw()[source]#

Get a list with information about the file.

The returned list contains name, size, last_modified and location.

get_size()[source]#: Get the size of the file.

property last_data_point#: Get the last data point.

populate(filename: str | OtherPathNew)[source]#

Finds the file-stats and populates the class with stat values.

Parameters:: filename (str, OtherPath) – name of the file.
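
The kind of file statistics listed in the attributes above can be gathered with `os.stat`. This is a minimal sketch, assuming a local file; `file_stats` is a hypothetical helper and does not reproduce how FileID handles remote paths or database entries.

```python
import os
import tempfile
from datetime import datetime


def file_stats(filename):
    """Collect a few os.stat values for a local file."""
    st = os.stat(filename)
    return {
        "name": os.path.basename(filename),
        "full_name": os.path.abspath(filename),
        "size": st.st_size,
        "last_modified": datetime.fromtimestamp(st.st_mtime),
        "last_accessed": datetime.fromtimestamp(st.st_atime),
        "last_info_changed": datetime.fromtimestamp(st.st_ctime),
    }


# Demonstrate on a throw-away file of 128 bytes.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "run_01.res")
    with open(path, "wb") as f:
        f.write(b"\x00" * 128)
    stats = file_stats(path)
```

Comparing a stored `size` and `last_modified` against fresh values is one simple way to decide whether a processed file is still up to date with its raw file.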

class InstrumentFactory[source]#

Bases: object

Factory for instrument loaders.

property builders#

create(key: str | None, **kwargs)[source]#

Create the instrument loader module and initialize the loader class.

Parameters:

key – instrument id
**kwargs – sent to the initializer of the loader class.

Returns:

instance of loader class.

create_all(**kwargs)[source]#

Create all the instrument loader modules.

Parameters:: **kwargs – sent to the initializer of the loader class.
Returns:: dict of instances of loader classes.

get_registered_builder(key)[source]#

get_registered_builders()[source]#

get_registered_kwargs()[source]#

query(key: str, variable: str) → Any[source]#

Performs a get_params lookup for the instrument loader.

Parameters:

key – instrument id.
variable – the variable you want to lookup.

Returns:

The value of the variable if the loader's get_params method supports it.

register_builder(key: str, builder: Tuple[str, Any], **kwargs) → None[source]#

Parameters:

key – instrument id
builder – (module_name, module_path)
**kwargs – stored in the factory (will be used in the future for allowing to set defaults to the builders to allow for using .query).

unregister_builder(key: str) → None[source]#

unregister an instrument loader module.

Parameters:: key – instrument id
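
The register / create / unregister cycle can be illustrated with a small registry. This is a sketch of the general factory pattern, not of InstrumentFactory itself: to stay self-contained it stores plain callables instead of (module_name, module_path) tuples, and `LoaderRegistrySketch` and `DummyLoader` are hypothetical names.

```python
class LoaderRegistrySketch:
    """Minimal registry mirroring the method names described above."""

    def __init__(self):
        self._builders = {}
        self._kwargs = {}

    def register_builder(self, key, builder, **kwargs):
        # Store the builder and any extra keyword arguments under the key.
        self._builders[key] = builder
        self._kwargs[key] = kwargs

    def unregister_builder(self, key):
        self._builders.pop(key, None)
        self._kwargs.pop(key, None)

    def create(self, key, **kwargs):
        builder = self._builders.get(key)
        if builder is None:
            raise KeyError(f"no loader registered for {key!r}")
        # kwargs are passed on to the initializer of the loader class.
        return builder(**kwargs)


class DummyLoader:
    """Hypothetical loader class used only for this illustration."""

    def __init__(self, sep=","):
        self.sep = sep


factory = LoaderRegistrySketch()
factory.register_builder("my_instrument", DummyLoader)
loader = factory.create("my_instrument", sep=";")
factory.unregister_builder("my_instrument")
```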

class PagesDictBase[source]#

Bases: TypedDict

Base structure for pages_dict with known journal columns.

area: List[float | None]#

argument: List[str | None]#

cell_type: List[str | None]#

cellpy_file_name: List[str | None]#

comment: List[str | None]#

experiment: List[str | None]#

file_name_indicator: List[str | None]#

filename: List[str | None]#

fixed: List[Any | None]#

group: List[str | None]#

id_key: List[int | float | str | None]#

instrument: List[str | None]#

label: List[str | None]#

loading: List[float | None]#

mass: List[float | None]#

nom_cap: List[float | None]#

nom_cap_specifics: List[str | None]#

raw_file_names: List[str | None]#

total_mass: List[float | None]#

class PickleProtocol(level)[source]#

Bases: object

Context for using a specific pickle protocol.
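
One way such a context could work is by temporarily overriding `pickle.HIGHEST_PROTOCOL` and restoring it on exit. That mechanism is an assumption made for this sketch, not a description of how PickleProtocol is implemented, and `highest_pickle_protocol` is a hypothetical name.

```python
import pickle
from contextlib import contextmanager


@contextmanager
def highest_pickle_protocol(level):
    """Temporarily set pickle.HIGHEST_PROTOCOL to `level`."""
    previous = pickle.HIGHEST_PROTOCOL
    pickle.HIGHEST_PROTOCOL = level
    try:
        yield
    finally:
        # Always restore the previous value, even if the body raises.
        pickle.HIGHEST_PROTOCOL = previous


original = pickle.HIGHEST_PROTOCOL
with highest_pickle_protocol(4):
    inside = pickle.HIGHEST_PROTOCOL
after = pickle.HIGHEST_PROTOCOL
```

Code that reads `pickle.HIGHEST_PROTOCOL` inside the `with` block sees the overridden value; code that passes an explicit protocol is unaffected.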

Q(*args, **kwargs)[source]#

check64bit(current_system='python')[source]#: checks if you are on a 64-bit platform

collect_capacity_curves(cell, direction='charge', trim_taper_steps=None, steps_to_skip=None, steptable=None, max_cycle_number=None, **kwargs)[source]#

Create a list of pandas.DataFrames, one for each charge step.

Each DataFrame is named by its cycle number.

Parameters:

cell (CellpyCell) – object
direction (str)
trim_taper_steps (integer) – number of taper steps to skip (counted from the end, i.e. 1 means skip last step in each cycle).
steps_to_skip (list) – step numbers that should not be included.
steptable (pandas.DataFrame) – optional steptable.
max_cycle_number (int) – only select cycles up to this value.

Returns:

list of pandas.DataFrames, list of cycle numbers, minimum voltage value, maximum voltage value

convert_from_simple_unit_label_to_string_unit_label(k, v)[source]#: Convert from simple unit label to string unit label.

find_all_instruments(name_contains: str | None = None) → Dict[str, Tuple[str, Path]][source]#: finds all the supported instruments

generate_default_factory()[source]#

This function searches for all available instrument readers and registers them in an InstrumentFactory instance.

Returns:: InstrumentFactory

get_ureg()[source]#

group_by_interpolate(df, x=None, y=None, group_by=None, number_of_points=100, tidy=False, individual_x_cols=False, header_name='Unit', dx=10.0, generate_new_x=True)[source]#

Do a pandas.DataFrame.group_by and perform interpolation for all groups.

This function is a wrapper around an internal interpolation function in cellpy (that uses scipy.interpolate.interp1d) that combines doing a group-by operation and interpolation.

Parameters:

df (pandas.DataFrame) – the dataframe to morph.
x (str) – the header for the x-value (defaults to normal header step_time_txt). Note that the default group_by column is the cycle column, and each cycle normally consists of several steps, so you risk interpolating / merging several curves on top of each other (not good).
y (str) – the header for the y-value (defaults to normal header voltage_txt).
group_by (str) – the header to group by (defaults to normal header cycle_index_txt)
number_of_points (int) – if generating new x-column, how many values it should contain.
tidy (bool) – return the result in tidy (i.e. long) format.
individual_x_cols (bool) – return as xy xy xy … data.
header_name (str) – name for the second level of the columns (only applies for xy xy xy … data) (defaults to “Unit”).
dx (float) – if generating new x-column and number_of_points is None or zero, distance between the generated values.
generate_new_x (bool) –
create a new x-column by using the x-min and x-max values from the original dataframe where the method is set by the number_of_points key-word:
1. if number_of_points is not None (default is 100):
```
new_x = np.linspace(x_max, x_min, number_of_points)
```
2. else:
```
new_x = np.arange(x_max, x_min, dx)
```

Returns:: pandas.DataFrame with interpolated x- and y-values. The returned dataframe is in tidy (long) format for tidy=True.

humanize_bytes(b, precision=1)[source]#: Return a humanized string representation of a number of bytes (b).

identify_last_data_point(data)[source]#: Find the last data point and store it in the fid instance

instrument_configurations(search_text: str = '') → Dict[str, Any][source]#

This function returns a dictionary with information about the available instrument loaders and their models.

Parameters:: search_text – string to search for in the instrument names.
Returns:: nested dictionary with information about the available instrument loaders and their models.
Return type:: dict

interpolate_y_on_x(df, x=None, y=None, new_x=None, dx=10.0, number_of_points=None, direction=1, **kwargs)[source]#

Interpolate a column based on another column.

Parameters:

df – DataFrame with the (cycle) data.
x – Column name for the x-value (defaults to the step-time column).
y – Column name for the y-value (defaults to the voltage column).
new_x (numpy array or None) – Interpolate using these new x-values instead of generating x-values based on dx or number_of_points.
dx – step-value (defaults to 10.0)
number_of_points – number of points for interpolated values (use instead of dx and overrides dx if given).
direction (-1,1) – if direction is negative, then invert the x-values before interpolating.
**kwargs – arguments passed to scipy.interpolate.interp1d

Returns:: DataFrame with interpolated y-values based on given or generated x-values.
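
The idea of interpolating y onto given or generated x-values can be sketched with NumPy. Note the substitution: this example uses `numpy.interp` (linear only) instead of `scipy.interpolate.interp1d`, and it assumes that generated x-values run evenly from x-min to x-max. `interpolate_sketch` is a hypothetical helper working on plain sequences rather than DataFrame columns.

```python
import numpy as np


def interpolate_sketch(x, y, new_x=None, dx=10.0, number_of_points=None):
    """Linearly interpolate y on x; x must be increasing."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if new_x is None:
        if number_of_points:
            # number_of_points takes precedence over dx when given.
            new_x = np.linspace(x.min(), x.max(), number_of_points)
        else:
            new_x = np.arange(x.min(), x.max(), dx)
    new_y = np.interp(new_x, x, y)
    return new_x, new_y


new_x, new_y = interpolate_sketch(
    [0, 10, 20], [3.0, 3.5, 4.0], number_of_points=5
)
```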

interpolate_y_on_x_per_monotonic_segments(df, x=None, y=None, dx=10.0, number_of_points=None, direction=1, max_segments=100, **kwargs)[source]#

Interpolate y on x per strictly monotonic segment, then concatenate.

When a curve has multiple steps (e.g. CC + taper), x may not be strictly monotonic (e.g. constant voltage during taper). scipy.interp1d requires strictly increasing x, so interpolating the whole curve drops steps or produces artefacts. This helper splits the dataframe into segments where x is strictly monotonic, interpolates each segment, and concatenates.

Many segments can occur with noisy x-data: every small reversal (x[i] <= x[i-1]) starts a new segment, so O(n) segments are possible. That would mean many calls to interpolate_y_on_x (slow) and many small DataFrames (memory). If the segment count exceeds max_segments, the function returns the dataframe unchanged and logs a warning.

Parameters:

df – DataFrame with the (cycle) data.
x – Column name for the x-value.
y – Column name for the y-value.
dx – step-value for interpolation.
number_of_points – number of points (overrides dx if given).
direction (-1, 1) – 1 = x must be strictly increasing, -1 = strictly decreasing.
max_segments – if the number of monotonic segments exceeds this, return df unchanged and log a warning (default 100). Set to None for no limit.
**kwargs – passed to interpolate_y_on_x.

Returns:

DataFrame with interpolated (x, y) preserving all segments, or df unchanged if segment count exceeds max_segments.
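
The segmentation rule described above (every x[i] <= x[i-1] starts a new segment) can be written in a few lines. This is an illustrative sketch for the increasing direction only; `split_strictly_increasing` and `split_or_none` are hypothetical helpers that work on a plain list and return index lists rather than DataFrames.

```python
def split_strictly_increasing(x):
    """Return lists of indices, one per strictly increasing run of x."""
    if len(x) == 0:
        return []
    segments = [[0]]
    for i in range(1, len(x)):
        if x[i] <= x[i - 1]:
            # A plateau or reversal starts a new segment.
            segments.append([i])
        else:
            segments[-1].append(i)
    return segments


def split_or_none(x, max_segments=100):
    """Return the segments, or None if there are more than max_segments."""
    segments = split_strictly_increasing(x)
    if max_segments is not None and len(segments) > max_segments:
        return None  # the caller would keep the data unchanged
    return segments


segments = split_strictly_increasing([0.0, 1.0, 2.0, 2.0, 2.0, 3.0, 1.5, 2.5])
```

A constant-x stretch (such as a voltage plateau during a taper step) produces one single-point segment per repeated value, which is why noisy data can generate many segments.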

pickle_protocol(level)[source]#

class ureg[source]#

Bases: object

Unit registry for pint.

This is a wrapper around the pint unit registry.

xldate_as_datetime(xldate, datemode=0, option='to_datetime')[source]#

Converts an xls date stamp to a more sensible format.

Parameters:

xldate (str, int) – date stamp in Excel format.
datemode (int) – 0 for 1900-based, 1 for 1904-based.
option (str) – the type of the return value, one of (“to_datetime”, “to_float”, “to_string”).

Returns:

datetime (datetime object, float, or string).
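
The conversion behind the two date modes can be sketched with the standard library. This is an illustrative helper (`excel_serial_to_datetime` is a hypothetical name) that covers only the datetime output; the epoch of 1899-12-30 for the 1900 system is valid for serial numbers from 61 (1900-03-01) onwards, because Excel counts a nonexistent 1900-02-29.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(xldate, datemode=0):
    """Convert an Excel serial date to a datetime.

    datemode 0 is the 1900 date system, datemode 1 the 1904 date system.
    """
    if datemode == 1:
        epoch = datetime(1904, 1, 1)
    else:
        epoch = datetime(1899, 12, 30)
    # The fractional part of the serial number is the time of day.
    return epoch + timedelta(days=float(xldate))


stamp = excel_serial_to_datetime(44197.5)
```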

cellpy.readers.dbreader module#

class DbSheetCols[source]#: Bases: object

class Reader(db_file=None, db_datadir=None, db_datadir_processed=None, db_frame=None, batch=None, batch_col_name=None)[source]#

Bases: BaseSimpleDbReader

extract_date_from_cell_name(force=False)[source]#

filter_by_col(column_names)[source]#

filters sheet/table by columns (input is column header)

The routine returns the serial numbers with values>1 in the selected columns.

Parameters:: column_names (list) – the column headers.
Returns:: pandas.DataFrame

filter_by_col_value(column_name, min_val=None, max_val=None)[source]#

filters sheet/table by column.

The routine returns the serial-numbers with min_val <= values <= max_val in the selected column.

Parameters:

column_name (str) – column name.
min_val (int) – minimum value of serial number.
max_val (int) – maximum value of serial number.

Returns:

pandas.DataFrame

filter_by_slurry(slurry, appender='_')[source]#

Filters sheet/table by slurry name.

Input is slurry name or list of slurry names, for example “es030” or [“es012”, “es033”, “es031”].

Parameters:

slurry (str or list of strings) – slurry names.
appender (chr) – char that surrounds slurry names.

Returns:

List of serial_number (ints).

filter_selected(serial_numbers)[source]#

from_batch(batch_name: str, include_key: bool = False, include_individual_arguments: bool = False) → dict[source]#

get_all()[source]#

get_area(serial_number)[source]#

get_areal_loading(serial_number)[source]#

get_args(serial_number: int) → dict[source]#

get_by_column_label(column_name, serial_number)[source]#

get_cell_name(serial_number)[source]#

get_cell_type(serial_number)[source]#

get_comment(serial_number)[source]#

get_experiment_type(serial_number)[source]#

get_file_name_indicator(serial_number)[source]#

get_fileid(serial_number, full_path=True)[source]#

get_group(serial_number)[source]#

get_instrument(serial_number)[source]#

get_label(serial_number)[source]#

get_loading(serial_number)[source]#

get_mass(serial_number)[source]#

get_nom_cap(serial_number)[source]#

get_nom_cap_specifics(serial_number)[source]#

get_total_mass(serial_number)[source]#

inspect_exists(serial_number)[source]#

inspect_hd5f_fixed(serial_number)[source]#

static intersect(lists)[source]#

pick_table()[source]#: Pick the table and return a pandas.DataFrame.

print_serial_number_info(serial_number, print_to_screen=True)[source]#

Print information about the run.

Parameters:

serial_number – serial number.
print_to_screen – runs the print statement if True, returns txt if not.

Returns:

txt if print_to_screen is False, else None.

select_all(serial_numbers)[source]#

Select rows for identification for a list of serial_number.

Parameters:: serial_numbers – list (or ndarray) of serial numbers
Returns:: pandas.DataFrame

select_batch(batch, batch_col_name=None, case_sensitive=True, drop=True, clean=False, **kwargs) → List[int][source]#

Selects the rows in column batch_col_name.

Parameters:

batch – batch to select
batch_col_name – column name to use for batch selection (default: DbSheetCols.batch).
case_sensitive – if True, the batch name must match exactly (default: True).
drop – if True, all un-selected rows are dropped from the table (default: True).
clean – if True and drop is True, the table is cleaned from duplicates and NaNs (default: False).

Returns:

List of row indices

select_serial_number_row(serial_number)[source]#

Select row for identification number serial_number

Parameters:: serial_number – serial number
Returns:: pandas.DataFrame

static subtract(list1, list2)[source]#

static subtract_many(list1, lists)[source]#

static union(lists)[source]#

cellpy.readers.do module#

Modifiers for cellpy.CellpyCell objects.

This module is used for modifying cellpy.CellpyCell objects after they have been created. All modifiers should take a cellpy.CellpyCell object as input and return a new cellpy.CellpyCell object. This is to ensure that the original cellpy.CellpyCell object is not modified in place and that the raw data is not changed (unless explicitly requested). This is an experimental feature of cellpy and is not yet fully implemented.

copy(c_old)[source]#

cellpy.readers.filefinder module#

Dumps the raw-file directory to a list.

Parameters:

raw_file_dir (path) – optional, directory where to look for run-files (default: read prm-file)
project_dir (path) – optional, subdirectory in raw_file_dir to look for run-files
extension (str) – optional, extension of run-files (without the ‘.’). If not given, all files will be listed.
glob_txt (str, optional) – optional, glob pattern to use when searching for files.
allow_error_level (int, optional) – accept errors up to this level when using the find command. Defaults to 3 (1 raises Exception, 2 skips, 3 tries to process the stdout regardless).

Returns:

list of file paths.

Return type:

list of str

Examples

>>> # find all files in your raw-file directory:
>>> filelist_1 = filefinder.find_in_raw_file_directory()

>>> # find all files in your raw-file directory in the subdirectory 'MY-PROJECT':
>>> filelist_2 = filefinder.find_in_raw_file_directory(raw_file_dir=rawdatadir/"MY-PROJECT")

>>> # find all files in your raw-file directory with the extension '.raw' in the subdirectory 'MY-PROJECT':
>>> filelist_3 = filefinder.find_in_raw_file_directory(raw_file_dir=rawdatadir/"MY-PROJECT", extension="raw")

>>> # find all files in your raw-file directory with the extension '.raw' in the subdirectory 'MY-PROJECT'
>>> # that contain the string 'good' in the file name:
>>> filelist_4 = filefinder.find_in_raw_file_directory(
...     raw_file_dir=rawdatadir/"MY-PROJECT",
...     glob_txt="*good*",
...     extension="raw",
... )

Notes

Uses ‘find’ and ‘ssh’ to search for files.

Dumps the raw-file directory to a list.

Parameters:

raw_file_dir (path) – optional, directory where to look for run-files (default: read prm-file)
project_dir (path) – optional, subdirectory in raw_file_dir to look for run-files
extension (str) – optional, extension of run-files (without the ‘.’). If not given, all files will be listed.
levels (int, optional) – How many sublevels to list. Defaults to 1. If you want to list all sublevels, use listdir(levels=-1). If you want to list only the current level (no subdirectories), use listdir(levels=0).
only_filename (bool, optional) – If True, only the file names will be returned. Defaults to False.
with_prefix (bool, optional) – If True, the full path to the files including the prefix and the location (e.g. ‘scp://user@server.com/…’) will be returned. Defaults to True.

Returns:

list of file paths (only the actual file names).

Return type:

list of str

Notes

This function might be rather slow and memory-consuming if you have many files in your raw-file directory. In that case, consider running it in a separate process (e.g. in a separate Python script or using multiprocessing).

The function currently returns the full path to the files from the root directory. It does not include the prefix (e.g. ssh://). Future versions might change this to either include the prefix or return the files relative to the raw_file_dir directory.
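
The meaning of the levels parameter (0 for the current level only, 1 for one sublevel, -1 for all sublevels) can be sketched for a purely local directory with pathlib. The helper list_files below is hypothetical and exists only for this illustration. It does not reproduce how cellpy lists directories, and it ignores remote locations and prefixes entirely.

```python
import tempfile
from pathlib import Path


def list_files(root: Path, levels: int = 1) -> list[Path]:
    """Hypothetical helper: list files at most `levels` sublevels below root.

    levels=0 keeps only files directly in root; levels=-1 keeps everything.
    """
    found = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # depth is 0 for files directly in root, 1 for one sublevel down, ...
        depth = len(path.relative_to(root).parts) - 1
        if levels < 0 or depth <= levels:
            found.append(path)
    return found


# Build a small throw-away directory tree to try the helper on.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    for name in ("top.res", "a/mid.res", "a/b/deep.res"):
        (root / name).touch()

    names_0 = [p.name for p in list_files(root, levels=0)]
    names_1 = [p.name for p in list_files(root, levels=1)]
    names_all = [p.name for p in list_files(root, levels=-1)]
```

Here names_0 holds only the top-level file, names_1 adds the file one sublevel down, and names_all holds all three.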

Searches for files (raw-data files and cellpy-files).

Parameters:

run_name (str) – run-file identification.
raw_extension (str) – optional, extension of run-files (without the ‘.’).
cellpy_file_extension (str) – optional, extension for cellpy files (without the ‘.’).
raw_file_dir (path) – optional, directory where to look for run-files (default: read prm-file)
project_dir (path) – optional, subdirectory in raw_file_dir to look for run-files
cellpy_file_dir (path) – optional, directory where to look for cellpy-files (default: read prm-file)
prm_filename (path) – optional parameter file can be given.
file_name_format (str) – format of raw-file names or a glob pattern (default: YYYYMMDD_[name]EEE_CC_TT_RR).
reg_exp (str) – use regular expression instead (defaults to None).
sub_folders (bool) – perform search also in sub-folders.
file_list (list of str) – perform the search within a given list of filenames instead of searching the folder(s). The list should not contain the full filepath (only the actual file names). If you want to provide the full path, you will have to modify the file_name_format or reg_exp accordingly.
with_prefix (bool) – if True, the file list contains full paths to the files (including the prefix and the location).
pre_path (path or str) – path to prepend the list of files selected from the file_list.

Returns:

run-file names (list of strings) and cellpy-file-name (str of full path).
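
The interplay between file_list, a glob pattern, and pre_path can be sketched with the standard-library fnmatch module. The helper select_from_file_list and the file names below are hypothetical and only illustrate the idea of matching bare file names and then prepending a path. The actual matching logic in cellpy may differ.

```python
import fnmatch
from pathlib import Path


def select_from_file_list(file_list, glob_txt, pre_path=None):
    """Hypothetical helper: pick the names in file_list that match glob_txt.

    The list is expected to hold bare file names (no directories). If
    pre_path is given, it is prepended to every selected name.
    """
    selected = fnmatch.filter(file_list, glob_txt)
    if pre_path is not None:
        return [str(Path(pre_path) / name) for name in selected]
    return selected


# Made-up file names for illustration only.
file_list = [
    "20240101_cellA_01_cc_01.res",
    "20240101_cellA_01_cc_02.res",
    "20240101_cellB_01_cc_01.res",
]
run_files = select_from_file_list(file_list, "20240101_cellA_01_cc*.res")
with_path = select_from_file_list(file_list, "*cellB*", pre_path="raw")
```

Matching on bare names keeps the pattern independent of where the files live, which is why a list with full paths would need a different pattern.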

cellpy.readers.sql_dbreader module#

This module is an example of how to implement a custom database reader for the batch utility in cellpy.

class Base(**kwargs: Any)[source]#

Bases: DeclarativeBase

Base class for the database models.

metadata: ClassVar[MetaData] = MetaData()#: Refers to the MetaData collection that will be used for new Table objects.

See also

orm_declarative_metadata

registry: ClassVar[registry] = <sqlalchemy.orm.decl_api.registry object>#: Refers to the registry in use where new Mapper objects will be associated.

class Batch(**kwargs)[source]#

Bases: Base

Model for batch objects in the database.

cells: Mapped[List[Cell]]#

comment: Mapped[str | None]#

name: Mapped[str]#

pk: Mapped[int]#

class Cell(**kwargs)[source]#

Bases: Base

Model for cell objects in the database.

active_material_mass_fraction: Mapped[float | None]#

area: Mapped[float | None]#

argument: Mapped[str | None]#

batches: Mapped[List[Batch] | None]#

cell_design: Mapped[str | None]#

cell_exists: Mapped[bool | None]#

cell_group: Mapped[str | None]#

cell_type: Mapped[str | None]#

cellpy_file_name: Mapped[str | None]#

channel: Mapped[str | None]#

comment_cell: Mapped[str | None]#

comment_general: Mapped[str | None]#

comment_history: Mapped[str | None]#

comment_slurry: Mapped[str | None]#

electrolyte: Mapped[str | None]#

experiment_type: Mapped[str | None]#

formation: Mapped[str | None]#

frozen: Mapped[bool | None]#

inactive_additive_mass: Mapped[float | None]#

instrument: Mapped[str | None]#

label: Mapped[str | None]#

loading_active: Mapped[float | None]#

mass_active: Mapped[float | None]#

mass_total: Mapped[float | None]#

material_class: Mapped[str | None]#

material_group_label: Mapped[str | None]#

material_label: Mapped[str | None]#

material_pre_processing: Mapped[str | None]#

material_solvent: Mapped[str | None]#

material_sub_label: Mapped[str | None]#

material_surface_processing: Mapped[str | None]#

name: Mapped[str]#

nominal_capacity: Mapped[float | None]#

pasting_thickness: Mapped[str | None]#

pk: Mapped[int]#

project: Mapped[str | None]#

raw_data: Mapped[List[RawData]]#

schedule: Mapped[str | None]#

selected: Mapped[bool | None]#

separator: Mapped[str | None]#

solvent_solid_ratio: Mapped[str | None]#

temperature: Mapped[float | None]#

test_date: Mapped[str | None]#

class RawData(**kwargs)[source]#

Bases: Base

Model for raw data objects in the database.

cell: Mapped[Cell]#

cell_pk: Mapped[int]#

is_file: Mapped[bool]#

name: Mapped[str]#

pk: Mapped[int]#
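
The Cell and RawData models above suggest a one-to-many relation: each raw-data row points back to its cell through cell_pk. The sketch below shows that relation with the standard-library sqlite3 module instead of SQLAlchemy. The table names cells and raw_data, the column types, and the sample rows are assumptions made for this illustration, and the schema that the models actually generate may differ.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE cells (pk INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE raw_data (
        pk INTEGER PRIMARY KEY,
        cell_pk INTEGER NOT NULL REFERENCES cells (pk),
        is_file INTEGER NOT NULL,
        name TEXT NOT NULL
    );
    """
)

# One cell with two raw-data files attached to it.
con.execute("INSERT INTO cells (pk, name) VALUES (1, 'my_cell')")
con.executemany(
    "INSERT INTO raw_data (cell_pk, is_file, name) VALUES (?, ?, ?)",
    [(1, 1, "my_cell_01.res"), (1, 1, "my_cell_02.res")],
)

# Follow the cell_pk reference to collect all raw-data names for one cell.
rows = con.execute(
    "SELECT r.name FROM raw_data AS r "
    "JOIN cells AS c ON c.pk = r.cell_pk "
    "WHERE c.name = ? ORDER BY r.name",
    ("my_cell",),
).fetchall()
raw_names = [name for (name,) in rows]
con.close()
```

In the ORM models this lookup corresponds to reading the raw_data attribute of a Cell object.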

class SQLReader(db_connection: str = None, batch: str = None, **kwargs)[source]#

Bases: BaseSimpleDbReader

A custom database reader for the batch utility in cellpy.

add_batch_object(batch: Batch) → None[source]#

Add a batch object to the database.

For this to work, you will have to create a batch object first, then populate it with data (including the cell objects that the batch refers to, see .add_cell_object), and finally add it to the database using this method.

Examples

>>> from cellpy.readers import sql_dbreader
>>> db = sql_dbreader.SQLReader()
>>> db.open_db("my_db.sqlite")

>>> # create a batch object:
>>> batch = sql_dbreader.Batch()
>>> batch.name = "my_batch"
>>> batch.comment = "my_comment"

>>> # add the cells to the batch:
>>> batch.cells = [cell1, cell2, cell3]

>>> db.add_batch_object(batch)

add_cell_object(cell: Cell) → None[source]#

Add a cell object to the database.

For this to work, you will have to create a cell object first, then populate it with data, and finally add it to the database using this method.

Examples

>>> from cellpy.readers import sql_dbreader
>>> cell = sql_dbreader.Cell()
>>> cell.name = "my_cell"
>>> cell.label = "my_label"
>>> cell.project = "my_project"
>>> cell.cell_group = "my_cell_group"
>>> # ...and so on...

>>> db = sql_dbreader.SQLReader()
>>> db.open_db("my_db.sqlite")
>>> db.add_cell_object(cell)

Parameters:: cell – cellpy.readers.sql_dbreader.Cell object
Returns:: None

add_raw_data_object(raw_data: RawData) → None[source]#

create_db(db_uri: str = 'sqlite:///cellpy.db', echo: bool = False, **kwargs) → None[source]#

extract_date_from_cell_name(force=False)[source]#

from_batch(batch_name: str, include_key: bool = False, include_individual_arguments: bool = False) → dict[source]#

Get a dictionary with the data from a batch for the journal.

Parameters:

batch_name – name of the batch.
include_key – include the key (the cell ids).
include_individual_arguments – include the individual arguments.

Returns:

dictionary with the data.

Return type:

dict

get_area(pk: int) → float[source]#

get_args(pk: int) → dict[source]#

get_by_column_label(pk: int, name: str) → Any[source]#

get_cell_name(pk: int) → str[source]#

get_cell_type(pk: int) → str[source]#

get_comment(pk: int) → str[source]#

get_experiment_type(pk: int) → str[source]#

get_group(pk: int) → str[source]#

get_instrument(pk: int) → str[source]#

get_label(pk: int) → str[source]#

get_loading(pk: int) → float[source]#

get_mass(pk: int) → float[source]#

get_nom_cap(pk: int) → float[source]#

get_total_mass(pk: int) → float[source]#

import_cells_from_excel_sqlite(db_path: str = None, echo: bool = False, allow_duplicates: bool = False, allow_updates: bool = True, process_batches=True, clear=False) → None[source]#

Import cells from old db to new db.

Parameters:

db_path – path to old db (if not provided, it will use the already loaded db if it exists).
echo – will echo sql statements (if loading, i.e. if db_path is provided).
allow_duplicates – will not import if cell already exists in new db.
allow_updates – will update existing cells in new db.
process_batches – will process batches (if any) in old db.
clear – will clear all rows in new db before importing (asks for confirmation).

Returns:

None

inspect_hd5f_fixed(pk: int) → int[source]#

load_excel_sqlite(db_path: str, echo: bool = False) → None[source]#

Load an old sqlite cellpy database created from an Excel file.

You can use the cellpy.utils.batch_tools.sqlite_from_excel.run() function to convert an Excel file to a sqlite database.

open_db(db_uri: str = 'sqlite:///cellpy.db', echo: bool = False, **kwargs) → None[source]#

select_batch(batch_name: str) → List[int][source]#

view_old_excel_sqlite_table_columns() → None[source]#: Prints the columns of the old sqlite database.
