cellpy.utils.batch#

Routines for batch processing of cells (v2).

Module Contents#

Classes#

Batch

A convenience class for running batch procedures.

Functions#

from_journal(→ Batch)

Create a Batch from a journal file.

init(→ Batch)

Returns an initialized instance of the Batch class.

iterate_batches(folder[, extension, glob_pattern])

Iterate through all journals in given folder.

load(name, project[, batch_col, allow_from_journal, ...])

Load a batch from a journal file or create a new batch and load it if the journal file does not exist.

load_journal(journal_file, **kwargs)

Load a journal file.

load_pages(→ pandas.DataFrame)

Retrieve pages from a Journal file.

naked(→ Batch)

Returns an empty instance of the Batch class.

process_batch(→ Batch)

Execute a batch run, either from a given file_name or by giving the name and project as input.

Attributes#

COLUMNS_SELECTED_FOR_VIEW

class Batch(*args, **kwargs)[source]#

A convenience class for running batch procedures.

The Batch class contains (among other things):

  • iterator protocol

  • a journal with info about the different cells where the main information is accessible as a pandas.DataFrame through the .pages attribute

  • a data lookup accessor .data that behaves similarly to a dict.

The initialization accepts arbitrary arguments and keyword arguments. It first looks for the file_name and db_reader keyword arguments.

Usage:

b = Batch((name, (project)), **kwargs)

Examples

>>> b = Batch("experiment001", "main_project")
>>> b = Batch("experiment001", "main_project", batch_col="b02")
>>> b = Batch(name="experiment001", project="main_project", batch_col="b02")
>>> b = Batch(file_name="cellpydata/batchfiles/cellpy_batch_experiment001.json")
Parameters:
  • name (str) – name of the batch.

  • project (str) – name of the project (optional).

Keyword Arguments:
  • file_name (str or pathlib.Path) – journal file name to load.

  • db_reader (str) – data-base reader to use (defaults to “default” as given in the config-file or prm-class).

  • frame (pandas.DataFrame) – load from given dataframe.

  • default_log_level (str) – custom log-level (defaults to None (i.e. default log-level in cellpy)).

  • custom_log_dir (str or pathlib.Path) – custom folder for putting the log-files.

  • force_raw_file (bool) – load from raw regardless (defaults to False).

  • force_cellpy (bool) – load cellpy-files regardless (defaults to False).

  • force_recalc (bool) – Always recalculate (defaults to False).

  • export_cycles (bool) – Extract and export individual cycles to csv (defaults to True).

  • export_raw (bool) – Extract and export raw-data to csv (defaults to True).

  • export_ica (bool) – Extract and export individual dQ/dV data to csv (defaults to True).

  • accept_errors (bool) – Continue automatically to next file if error is raised (defaults to False).

  • nom_cap (float) – give a nominal capacity if you want to use another value than the one given in the config-file or prm-class.
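
As a minimal sketch of how these keyword arguments might be combined, the helper below builds a set of options that switches off the csv exports and then hands them to init. The helper names (quiet_batch_kwargs, open_batch) and the nom_cap value are hypothetical and only serve as illustration; the keyword names are the ones listed above. cellpy is imported lazily so the option-building part can be checked without it installed.

```python
def quiet_batch_kwargs(**overrides):
    """Build keyword arguments that switch off the csv exports (hypothetical helper)."""
    kwargs = {
        "export_cycles": False,  # documented default is True
        "export_raw": False,     # documented default is True
        "export_ica": False,     # documented default is True
    }
    # Explicit overrides win over the defaults chosen above.
    kwargs.update(overrides)
    return kwargs


def open_batch(name, project, batch_col=None, **overrides):
    """Create a Batch with the quiet options (hypothetical helper, needs cellpy)."""
    # Imported here so that the rest of the sketch works without cellpy.
    from cellpy.utils import batch

    kwargs = quiet_batch_kwargs(**overrides)
    if batch_col is not None:
        kwargs["batch_col"] = batch_col
    return batch.init(name, project, **kwargs)


# Illustrative value only; use the nominal capacity of your own cells.
options = quiet_batch_kwargs(nom_cap=372.0)
```

With cellpy installed and configured, open_batch("experiment001", "main_project", batch_col="b02") would then be expected to behave like the Batch examples above, with the exports turned off.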

property cell_names: list[source]#
property cell_raw_headers: pandas.Index[source]#
property cell_step_headers: pandas.Index[source]#
property cell_summary_headers: pandas.Index[source]#
property cells: cellpy.utils.batch_tools.batch_core.Data[source]#

Access cells as a Data object (attribute lookup and automatic loading).

Note

Write b.cells.x and press <TAB>. Then a pop-up might appear, and you can choose the cell you would like to retrieve.

Warning

Tab completion on b.cells does not always work as intended (for example in JupyterLab). If it fails, use b.experiment.data instead, or write cells = b.cells and then use cells.x and press <TAB> to get the pop-up.

property info_file[source]#

The name of the info file.

Warning

Will be deprecated soon - use journal_name instead.

property journal: cellpy.utils.batch_tools.batch_journals.LabJournal[source]#
property journal_name[source]#
property labels[source]#
property name[source]#
property pages: pandas.DataFrame[source]#
property summaries[source]#

Concatenated summaries from all cells (multiindex dataframe).

property summary_headers[source]#

The column names of the concatenated summaries.

property view[source]#

Show the selected info about each cell.

Warning

Will be deprecated soon - use report() instead.

combine_summaries(export_to_csv=True, **kwargs) None[source]#

Combine selected columns from each of the cells into single frames.

Keyword Arguments:
  • export_to_csv (bool) – export the combined summaries to csv (defaults to True).

  • **kwargs – sent to the summary_collector.

Returns:

None

create_journal(description=None, from_db=True, auto_use_file_list=None, file_list_kwargs=None, **kwargs)[source]#

Create journal pages.

This method is a wrapper for the different Journal methods for making journal pages (Batch.experiment.journal.xxx). It is under development. If you want to use ‘advanced’ options (i.e. not loading from a db), please consider using the methods available in Journal for now.

Parameters:
  • description

    the information and meta-data needed to generate the journal pages:

    • empty: create an empty journal

    • dict: create journal pages from a dictionary

    • pd.DataFrame: create journal pages from a pandas.DataFrame

    • ’filename.json’: load cellpy batch file

    • ’filename.xlsx’: create journal pages from an Excel file.

  • from_db (bool) – Deprecation Warning: this parameter will be removed as it is the default anyway. Generate the pages from a db (the default option). This will be over-ridden if description is given.

  • auto_use_file_list (bool) – Experimental feature. If True, a file list will be generated and used instead of searching for files in the folders.

  • file_list_kwargs (dict) – Experimental feature. Keyword arguments to be sent to the file list generator.

  • **kwargs – sent to sub-function(s) (e.g. from_db -> simple_db_reader -> find_files -> filefinder.search_for_files).

The following keyword arguments are picked up by from_db:

Transferred Parameters:
  • project – None

  • name – None

  • batch_col – None

The following keyword arguments are picked up by simple_db_reader:

Transferred Parameters:
  • reader – a reader object (defaults to dbreader.Reader)

  • cell_ids – keys (cell IDs)

  • file_list – file list to send to filefinder (instead of searching in folders for files).

  • pre_path – prepended path to send to filefinder.

  • include_key – include the key col in the pages (the cell IDs).

  • include_individual_arguments – include the argument column in the pages.

  • additional_column_names – list of additional column names to include in the pages.

The following keyword arguments are picked up by filefinder.search_for_files:

Transferred Parameters:
  • run_name (str) – run-file identification.

  • raw_extension (str) – optional, extension of run-files (without the ‘.’).

  • cellpy_file_extension (str) – optional, extension for cellpy files (without the ‘.’).

  • raw_file_dir (path) – optional, directory where to look for run-files (default: read prm-file)

  • project_dir (path) – subdirectory in raw_file_dir to look for run-files

  • cellpy_file_dir (path) – optional, directory where to look for cellpy-files (default: read prm-file)

  • prm_filename (path) – optional parameter file can be given.

  • file_name_format (str) – format of raw-file names or a glob pattern (default: YYYYMMDD_[name]EEE_CC_TT_RR).

  • reg_exp (str) – use regular expression instead (defaults to None).

  • sub_folders (bool) – perform search also in sub-folders.

  • file_list (list of str) – perform the search within a given list of filenames instead of searching the folder(s). The list should not contain the full filepath (only the actual file names). If you want to provide the full path, you will have to modify the file_name_format or reg_exp accordingly.

  • pre_path (path or str) – path to prepend the list of files selected from the file_list.

The following keyword arguments are picked up by journal.to_file:

Transferred Parameters:

duplicate_to_local_folder (bool) – default True.

Returns:

None

drop(cell_label=None)[source]#

Drop cells from the journal.

If cell_label is not given, cellpy will look into the journal for session info about bad cells, and if it finds it, it will remove those from the journal.

Note

Remember to save your journal again after modifying it.

Warning

This method has not been properly tested yet.

Parameters:

cell_label (str) – the cell label of the cell you would like to remove.

Returns:

cellpy.utils.batch object (returns a copy if keep_old is True).

drop_cell(cell_label)[source]#

Drop a cell from the journal.

Parameters:

cell_label – the cell label of the cell you would like to remove.

drop_cells(cell_labels)[source]#

Drop cells from the journal.

Parameters:

cell_labels – the cell labels of the cells you would like to remove.

drop_cells_marked_bad()[source]#

Drop cells that have been marked as bad from the journal (experimental feature).

duplicate_cellpy_files(location: str = 'standard', selector: dict = None, **kwargs) None[source]#

Copy the cellpy files and make a journal with the new names available in the current folder.

Parameters:
  • location

    where to copy the files. Either choose among the following options:

    • ’standard’: data/interim folder

    • ’here’: current directory

    • ’cellpydatadir’: the stated cellpy data dir in your settings (prms)

    or if the location is not one of the above, use the actual value of the location argument.

  • selector (dict) – if given, the cellpy files are reloaded after duplicating and modified based on the given selector(s).

  • **kwargs – sent to Batch.experiment.update if selector is provided

Returns:

The updated journal pages.
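
The way the location argument is described above can be sketched as a small lookup. This is not the cellpy implementation: the function name resolve_location, the project_dir parameter, and the assumption that 'standard' means data/interim relative to the project directory are all hypothetical, and the cellpy data directory is passed in explicitly instead of being read from the settings.

```python
from pathlib import Path


def resolve_location(location="standard", cellpydatadir=None, project_dir="."):
    """Translate a location option into a target folder (hypothetical sketch)."""
    if location == "standard":
        # Assumption: the data/interim folder lives under the project directory.
        return Path(project_dir) / "data" / "interim"
    if location == "here":
        return Path.cwd()
    if location == "cellpydatadir":
        if cellpydatadir is None:
            raise ValueError("cellpydatadir must be supplied for this option")
        return Path(cellpydatadir)
    # Anything else is used as given.
    return Path(location)


standard_target = resolve_location("standard", project_dir="my_project")
custom_target = resolve_location("backup/cellpy_copies")
```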

duplicate_journal(folder=None) None[source]#

Copy the journal to folder.

Parameters:
  • folder (str or pathlib.Path) – folder to copy to (defaults to the current folder).

export_cellpy_files(path=None, **kwargs) None[source]#
export_journal(filename=None) None[source]#

Export the journal to xlsx.

Parameters:

filename (str or pathlib.Path) – the name of the file to save the journal to. If not given, the journal will be saved to the default name.

Link journal content to the cellpy-files and load the step information.

Parameters:
  • max_cycle (int) – set maximum cycle number to link to.

  • force_combine_summaries (bool) – automatically run combine_summaries (set this to True if you are re-linking without max_cycle for a batch that was previously linked with max_cycle)

load() None[source]#

Load the selected datasets.

Warning

Will be deprecated soon - use update instead.

make_summaries() None[source]#

Combine selected columns from each of the cells into single frames and export.

Warning

This method will be deprecated in the future. Use combine_summaries instead.

mark_as_bad(cell_label)[source]#

Mark a cell as bad (experimental feature).

Parameters:

cell_label – the cell label of the cell you would like to mark as bad.

paginate() None[source]#

Create the folders where cellpy will put its output.

plot(backend=None, reload_data=False, **kwargs)[source]#

Plot the summaries (e.g. capacity vs. cycle number).

Parameters:
  • backend (str) – plotting backend (plotly, bokeh, matplotlib, seaborn)

  • reload_data (bool) – reload the data before plotting

  • **kwargs – sent to the plotter

Keyword Arguments:
  • color_map (str, any) – color map to use (defaults to px.colors.qualitative.Set1 for plotly and “Set1” for seaborn)

  • ce_range (list) – optional range for the coulombic efficiency plot

  • min_cycle (int) – minimum cycle number to plot

  • max_cycle (int) – maximum cycle number to plot

  • title (str) – title of the figure (defaults to “Cycle Summary”)

  • x_label (str) – title of the x-label (defaults to “Cycle Number”)

  • direction (str) – plot charge or discharge (defaults to “charge”)

  • rate (bool) – (defaults to False)

  • ir (bool) – (defaults to True)

  • group_legends (bool) – group the legends so that they can be turned visible/invisible as a group (defaults to True) (only for plotly)

  • base_template (str) – template to use for the plot (only for plotly)
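
A small sketch of how the plotting options might be collected and checked before calling plot. The helper summary_plot_options and its checks are hypothetical and not part of cellpy; only the backend names and keyword names are taken from the lists above, and the example values are illustrative.

```python
SUPPORTED_BACKENDS = ("plotly", "bokeh", "matplotlib", "seaborn")


def summary_plot_options(backend="plotly", min_cycle=None, max_cycle=None, **extra):
    """Collect keyword arguments for Batch.plot (hypothetical helper)."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if min_cycle is not None and max_cycle is not None and min_cycle > max_cycle:
        raise ValueError("min_cycle must not be larger than max_cycle")
    options = {"backend": backend, **extra}
    # Only include the cycle limits that were actually given.
    if min_cycle is not None:
        options["min_cycle"] = min_cycle
    if max_cycle is not None:
        options["max_cycle"] = max_cycle
    return options


options = summary_plot_options(
    "plotly", min_cycle=1, max_cycle=100, ce_range=[95, 101], direction="discharge"
)
```

With a populated Batch object b, the options could then be passed on as b.plot(**options).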

plot_summaries(output_filename=None, backend=None, reload_data=False, **kwargs) None[source]#

Plot the summaries.

Warning

This method will be deprecated in the future. Use plot instead.

recalc(**kwargs) None[source]#

Run make_step_table and make_summary on all cells.

Keyword Arguments:
  • save (bool) – Save updated cellpy-files if True (defaults to True).

  • step_opts (dict) – parameters to inject to make_steps (defaults to None).

  • summary_opts (dict) – parameters to inject to make_summary (defaults to None).

  • indexes (list) – Only recalculate for given indexes (i.e. list of cell-names) (defaults to None).

  • calc_steps (bool) – Run make_steps before making the summary (defaults to True).

  • testing (bool) – Only for testing purposes (defaults to False).

Returns:

None

remove_mark_as_bad(cell_label)[source]#

Remove the bad cell mark from a cell (experimental feature).

Parameters:

cell_label – the cell label of the cell you would like to remove the bad mark from.

report(stylize=True, grouped=False, check=False)[source]#

Create a report on all the cells in the batch object.

Important

To create a report, cellpy needs to access all the data (and it might take some time).

Parameters:
  • stylize (bool) – apply some styling to the report (default True).

  • grouped (bool) – add information based on the group each cell belongs to (default False).

  • check (bool) – check if the data seems to be without errors (0 = no errors, 1 = partial duplicates) (default False).

Returns:

pandas.DataFrame

save() None[source]#

Save journal and cellpy files.

The journal file will be saved in the project directory and in the batch-file-directory (prms.Paths.batchfiledir). The latter is useful for processing several batches using the iterate_batches functionality.

The name and location of the cellpy files is determined by the journal pages.

save_journal() None[source]#

Save the journal (json-format).

The journal file will be saved in the project directory and in the batch-file-directory (prms.Paths.batchfiledir). The latter is useful for processing several batches using the iterate_batches functionality.

show_pages(number_of_rows=5)[source]#

Show the journal pages.

Warning

Will be deprecated soon - use pages.head() instead.

update(pool=False, **kwargs) None[source]#

Updates the selected datasets.

Keyword Arguments:
  • all_in_memory (bool) – store the cellpydata in memory (default False)

  • cell_specs (dict of dicts) – individual arguments per cell. The cell_specs keyword argument dictionary overrides the **kwargs and the parameters from the journal pages for the indicated cell.

  • logging_mode (str) – sets the logging mode for the loader(s).

  • accept_errors (bool) – if True, the loader will continue even if it encounters errors.

Additional keyword arguments are sent to the loader(s) if not picked up earlier. Note that you can set the same arguments per cell by providing a cell_specs dictionary. The kwargs take precedence over the parameters given in the journal pages, but are overridden by parameters given in cell_specs.

Merging picks up the following keyword arguments:

Transferred Parameters:

recalc (bool) – set to False if you don’t want automatic recalculation of cycle numbers etc. when merging several data-sets.

Loading picks up the following keyword arguments:

Transferred Parameters:

selector (dict) – selector-based parameters sent to the cellpy-file loader (hdf5) if loading from raw is not necessary (or turned off).
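
The precedence described for update (journal pages first, then **kwargs, then cell_specs for the indicated cell) can be illustrated with plain dictionaries. This is a sketch of the documented ordering only, not the cellpy implementation; the function name, the cell labels, and the parameter values are hypothetical.

```python
def effective_loader_arguments(cell_label, page_params, kwargs, cell_specs=None):
    """Merge parameters in the documented order of precedence (hypothetical sketch)."""
    merged = dict(page_params)  # lowest priority: journal pages
    merged.update(kwargs)       # kwargs override the journal pages
    if cell_specs:
        # Highest priority: arguments given for this specific cell.
        merged.update(cell_specs.get(cell_label, {}))
    return merged


page_params = {"nom_cap": 3579.0, "mass": 1.2}
kwargs = {"nom_cap": 1000.0}
cell_specs = {"cell_a": {"nom_cap": 500.0}}

for_cell_a = effective_loader_arguments("cell_a", page_params, kwargs, cell_specs)
for_cell_b = effective_loader_arguments("cell_b", page_params, kwargs, cell_specs)
```

Here cell_a ends up with the value from cell_specs, while cell_b (which has no entry in cell_specs) gets the value from kwargs; both keep the mass from the journal pages.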

from_journal(journal_file, autolink=True, testing=False) Batch[source]#

Create a Batch from a journal file.

init(*args, empty=False, **kwargs) Batch[source]#

Returns an initialized instance of the Batch class.

Parameters:
  • empty (bool) – if True, the batch will not be linked to any database and an empty batch is returned

  • *args

    passed directly to Batch()

    • name: name of batch.

    • project: name of project.

    • batch_col: batch column identifier.

Keyword Arguments:
  • file_name – json file if loading from pages (journal).

  • default_log_level – “INFO” or “DEBUG”. Defaults to “CRITICAL”.

Other keyword arguments are sent to the Batch object.

Examples

>>> from cellpy.utils import batch
>>> empty_batch = batch.init(db_reader=None)
>>> batch_from_file = batch.init(file_name="cellpy_batch_my_experiment.json")
>>> normal_init_of_batch = batch.init()
iterate_batches(folder, extension='.json', glob_pattern=None, **kwargs)[source]#

Iterate through all journals in given folder.

Warning

This function is from ancient times and needs to be updated. It might have grown old and grumpy. Expect it to change in the near future.

Parameters:
  • folder (str or pathlib.Path) – folder containing the journal files.

  • extension (str) – extension for the journal files (used when creating a default glob-pattern).

  • glob_pattern (str) – optional glob pattern.

  • **kwargs – keyword arguments passed to batch.process_batch.
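
To show how extension and glob_pattern might interact, here is a sketch that collects journal files from a folder with pathlib. It assumes that the default glob pattern is simply "*" followed by the extension; that is an assumption, not a confirmed detail of iterate_batches, and the function name and file names are hypothetical.

```python
import tempfile
from pathlib import Path


def find_journal_files(folder, extension=".json", glob_pattern=None):
    """List journal files in a folder (hypothetical sketch)."""
    if glob_pattern is None:
        # Assumption: the default pattern matches every file with the extension.
        glob_pattern = f"*{extension}"
    return sorted(Path(folder).glob(glob_pattern))


# Try it on a temporary folder with two journals and one unrelated file.
with tempfile.TemporaryDirectory() as tmp:
    for file_name in ("cellpy_batch_a.json", "cellpy_batch_b.json", "notes.txt"):
        (Path(tmp) / file_name).touch()
    found = [path.name for path in find_journal_files(tmp)]
    only_a = [path.name for path in find_journal_files(tmp, glob_pattern="*_a.json")]
```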

load(name, project, batch_col=None, allow_from_journal=True, drop_bad_cells=True, force_reload=False, **kwargs)[source]#

Load a batch from a journal file or create a new batch and load it if the journal file does not exist.

Parameters:
  • name (str) – name of batch

  • project (str) – name of project

  • batch_col (str) – batch column identifier (only used for loading from db with simple_db_reader)

  • allow_from_journal (bool) – if True, the journal file will be loaded if it exists

  • force_reload (bool) – if True, the batch will be reloaded even if the journal file exists

  • drop_bad_cells (bool) – if True, bad cells will be dropped (only apply if journal file is loaded)

  • auto_use_file_list (bool) – Experimental feature. If True, a file list will be generated and used instead of searching for files in the folders.

  • **kwargs – sent to Batch during initialization

Keyword Arguments:
  • db_reader (str) – data-base reader to use (defaults to “default” as given in the config-file or prm-class).

  • frame (pandas.DataFrame) – load from given dataframe.

  • default_log_level (str) – custom log-level (defaults to None (i.e. default log-level in cellpy)).

  • custom_log_dir (str or pathlib.Path) – custom folder for putting the log-files.

  • force_raw_file (bool) – load from raw regardless (defaults to False).

  • force_cellpy (bool) – load cellpy-files regardless (defaults to False).

  • force_recalc (bool) – Always recalculate (defaults to False).

  • export_cycles (bool) – Extract and export individual cycles to csv (defaults to True).

  • export_raw (bool) – Extract and export raw-data to csv (defaults to True).

  • export_ica (bool) – Extract and export individual dQ/dV data to csv (defaults to True).

  • accept_errors (bool) – Continue automatically to next file if error is raised (defaults to False).

  • nom_cap (float) – give a nominal capacity if you want to use another value than the one given in the config-file or prm-class.

Returns:

populated Batch object (cellpy.utils.batch.Batch)
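
The interplay between allow_from_journal and force_reload described above can be summarised as a small decision function. This is only one reading of the parameter descriptions, not the actual logic in load; the function name and the returned labels are hypothetical.

```python
def choose_source(journal_exists, allow_from_journal=True, force_reload=False):
    """Decide whether to load the journal or build a new batch (hypothetical sketch)."""
    if journal_exists and allow_from_journal and not force_reload:
        return "journal"
    # No usable journal, or a reload was requested: create the batch from scratch.
    return "new"


decisions = {
    "existing journal": choose_source(True),
    "no journal": choose_source(False),
    "forced reload": choose_source(True, force_reload=True),
    "journal not allowed": choose_source(True, allow_from_journal=False),
}
```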

load_journal(journal_file, **kwargs)[source]#

Load a journal file.

Parameters:
  • journal_file (str) – path to journal file.

  • **kwargs – sent to Journal.from_file

Returns:

journal

load_pages(file_name) pandas.DataFrame[source]#

Retrieve pages from a Journal file.

This function is here to let you easily inspect a Journal file without starting up the full batch-functionality.

Examples

>>> from cellpy.utils import batch
>>> journal_file_name = 'cellpy_journal_one.json'
>>> pages = batch.load_pages(journal_file_name)
Returns:

pandas.DataFrame

naked(name=None, project=None) Batch[source]#

Returns an empty instance of the Batch class.

Examples

>>> empty_batch = naked()
process_batch(*args, **kwargs) Batch[source]#

Execute a batch run, either from a given file_name or by giving the name and project as input.

Warning

This function is from ancient times and needs to be updated. It might have grown old and grumpy. Expect it to change in the near future.

Examples

>>> process_batch(file_name | (name, project), **kwargs)
Parameters:

*args – file_name or name and project (both string)

Keyword Arguments:
  • backend (str) – what backend to use when plotting (‘bokeh’ or ‘matplotlib’). Defaults to ‘matplotlib’.

  • dpi (int) – resolution used when saving matplotlib plot(s). Defaults to 300 dpi.

  • default_log_level (str) – What log-level to use for console output. Choose between ‘CRITICAL’, ‘DEBUG’, or ‘INFO’. The default is ‘CRITICAL’ (i.e. usually no log output to console).

Returns:

cellpy.utils.batch.Batch object
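
The two calling styles (a single file_name, or name and project) could be told apart by the number of positional arguments, as sketched below. This mapping is an assumption based on the parameter description above and is not taken from the process_batch source; the function name is hypothetical.

```python
def split_process_batch_args(*args):
    """Interpret positional arguments as file_name or (name, project) (hypothetical sketch)."""
    if len(args) == 1:
        return {"file_name": args[0]}
    if len(args) == 2:
        name, project = args
        return {"name": name, "project": project}
    raise TypeError("expected file_name, or name and project")


from_file = split_process_batch_args("cellpy_batch_experiment001.json")
from_names = split_process_batch_args("experiment001", "main_project")
```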

COLUMNS_SELECTED_FOR_VIEW[source]#