Batch processing#

The batch processing routines allow for convenient processing and comparison of multiple datasets simultaneously. These rely on a proper configuration of cellpy, including a properly working config file and a database file. A basic introduction to setting up and using the batch processing routines is given here.

Setting up things properly#

Make sure you have a properly working config file#

For cellpy to find your files, it needs to know where to look. A config file exists for this purpose. It is typically called .cellpy_prms_username.conf and is located in your home or user directory.
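
As a minimal sketch of the naming convention described above, the expected location can be assembled with pathlib. The helper name expected_config_path is hypothetical and not part of cellpy; it only applies the .cellpy_prms_username.conf pattern and assumes the file sits directly in the home directory.

```python
from pathlib import Path
from typing import Optional


def expected_config_path(username: str, home: Optional[Path] = None) -> Path:
    """Return the path where a cellpy config file is typically found.

    Hypothetical helper (not part of cellpy): it only applies the
    ``.cellpy_prms_<username>.conf`` naming pattern to a home directory.
    """
    home = Path.home() if home is None else Path(home)
    return home / f".cellpy_prms_{username}.conf"


# Example with an explicit (made-up) user name and home directory.
example = expected_config_path("jane", Path("/home/jane"))
print(example.name, "exists:", example.exists())
```

Passing the user name explicitly keeps the sketch independent of how the operating system reports the current user.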

For more details on the config file, have a look at Setup and configuration.

The database file#

This notebook uses the cellpy batch utility. For it to work properly (or at all), you have to provide it with a database. Currently, cellpy ships with a very simple database solution that hardly justifies the name. It reads an Excel file where the first row acts as column headers, the second row gives the type (e.g. string, bool, etc.), and the remaining rows provide the necessary information for each of the cells (one row per cell). You can of course choose to implement a database and a loader yourself.
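
To make the layout concrete, here is a small sketch that parses rows arranged as headers, types, then one record per cell. It is not cellpy's loader: the function parse_db_rows, the type names in CONVERTERS, and the columns "name" and "mass" are assumptions chosen for illustration (only "id" and "b01" are mentioned in this text).

```python
from typing import Any, Callable, Dict, List, Sequence

# Assumed type names for the second row; the real sheet may use others.
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "int": int,
    "float": float,
    "str": str,
    "bool": lambda value: str(value).strip().lower() in {"1", "true", "yes"},
}


def parse_db_rows(rows: Sequence[Sequence[Any]]) -> List[Dict[str, Any]]:
    """Turn [headers, types, *records] into a list of typed dicts."""
    headers, types, *records = rows
    parsed = []
    for record in records:
        parsed.append(
            {
                name: CONVERTERS[type_name](value)
                for name, type_name, value in zip(headers, types, record)
            }
        )
    return parsed


rows = [
    ["id", "name", "mass", "b01"],   # row 1: column headers
    ["int", "str", "float", "str"],  # row 2: column types
    ["1", "20180418_sf033_2_cc", "0.337149", "paper01"],  # one row per cell
    ["2", "20180418_sf033_3_cc", "0.343169", "paper01"],
]
cells = parse_db_rows(rows)
print(cells[0])
```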

A sample Excel file (“db-file”) is provided in the examples folder on GitHub. You will need to fill in values manually, one row for each cell you want to load. Then put the file in the database folder (as defined in your config file where it says db_path: in the Paths section). The name of the file must also match the one defined in the config file (db_filename:, i.e. cellpy_db.xlsx in the example config file).

When cellpy reads the file, it uses the batch column (see below) to select which rows (i.e. cells) to load. For example, if the “b01” batch column is the one you tell cellpy to use and you provide it with the name “casandras_experiment”, it will only select the rows that have “casandras_experiment” in the “b01” column. You provide cellpy with the “lookup” name when you issue the batch.init command, for example:

b = batch.init("paper01", "cool_project", batch_col="b01")
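
The row selection described above amounts to a simple filter on one column. The sketch below shows that idea on plain dictionaries; select_batch is a hypothetical helper and not how cellpy implements the lookup internally.

```python
from typing import Dict, List


def select_batch(
    rows: List[Dict[str, object]], batch_col: str, name: str
) -> List[Dict[str, object]]:
    """Keep only the rows whose batch column equals the lookup name."""
    return [row for row in rows if row.get(batch_col) == name]


# Made-up database rows with an id and one batch column.
db_rows = [
    {"id": 1, "b01": "casandras_experiment"},
    {"id": 2, "b01": "another_experiment"},
    {"id": 3, "b01": "casandras_experiment"},
]
selected = select_batch(db_rows, batch_col="b01", name="casandras_experiment")
print([row["id"] for row in selected])  # prints [1, 3]
```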

You must always fill out the columns colored green. Also make sure that the id column (the first one in the example xlsx file) has a unique integer for each row (it is used as a “key” when looking up entries in the file).
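
A quick sanity check of the id column could look like the following sketch. It assumes the ids have already been read into a Python list; find_invalid_ids is a hypothetical helper, not a cellpy function.

```python
from typing import Iterable, List


def find_invalid_ids(ids: Iterable[object]) -> List[object]:
    """Return ids that are not integers or that occur more than once."""
    seen = set()
    invalid: List[object] = []
    for value in ids:
        # bool is a subclass of int in Python, so exclude it explicitly.
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if (not is_int or value in seen) and value not in invalid:
            invalid.append(value)
        seen.add(value)
    return invalid


print(find_invalid_ids([1, 2, 2, "3", 4]))  # prints [2, '3']
```

An empty list means every id is an integer and appears only once.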

Filenames#

Make sure that the names of your experiment-files (for example your .res files) are of the form date_something_that_describes_the_cell.res (this is the name-format supported at the moment).
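
The naming scheme can be checked with a regular expression before loading. This sketch assumes the date is written as eight digits (YYYYMMDD), as in the file names that appear later in this notebook; the pattern and the helper split_raw_filename are illustrative and not taken from cellpy.

```python
import re
from typing import Optional, Tuple

# Assumption: the date is eight digits (YYYYMMDD), as in
# 20180418_sf033_2_cc_01.res; adjust the pattern if your dates differ.
FILENAME_PATTERN = re.compile(r"^(?P<date>\d{8})_(?P<description>\w+)\.res$")


def split_raw_filename(filename: str) -> Optional[Tuple[str, str]]:
    """Return (date, description) if the name fits the pattern, else None."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("date"), match.group("description")


print(split_raw_filename("20180418_sf033_2_cc_01.res"))
# prints ('20180418', 'sf033_2_cc_01')
print(split_raw_filename("sf033_2_cc.res"))  # prints None
```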

Loading batch data#

[1]:

import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from rich import print

import cellpy
from cellpy import prms
from cellpy import prmreader
from cellpy.utils import batch, collectors

Check and (if necessary) override some of the configuration parameters:

[2]:

prms.Paths.db_path = "."
prms.Paths.db_filename = "cellpy_db.xlsx"
prms.Paths.rawdatadir = "data/raw"
prms.Paths.cellpydatadir = "data/cellpyfiles"
prms.Paths.filelogdir = "out"
prms.Paths.notebookdir = "out"
prms.Paths.batchfiledir = "out"
prms.Paths.outdatadir = "out"
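
Since the paths above are relative, it can help to make sure the folders exist before running the rest of the notebook. The sketch below does this with pathlib inside a temporary directory; whether cellpy creates missing folders on its own is not covered here, and ensure_directories is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def ensure_directories(base: Path, relative_dirs: Iterable[str]) -> List[Path]:
    """Create each directory below ``base`` if missing and return the paths."""
    created = []
    for relative in relative_dirs:
        directory = Path(base) / relative
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


# Use a scratch directory here; in practice ``base`` would be the folder
# you run the notebook from.
with tempfile.TemporaryDirectory() as scratch:
    dirs = ensure_directories(Path(scratch), ["data/raw", "data/cellpyfiles", "out"])
    all_exist = all(directory.is_dir() for directory in dirs)
print(all_exist)  # prints True
```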

Initialising the cellpy batch object#

To create Journal Pages, appropriate names for the project and the experiment have to be set:

[3]:

project = "cool_project"
name = "paper01"
batch_col = "b01"

[4]:

print(" INITIALISATION OF BATCH ".center(80, "="))
b = batch.init(name, project, batch_col=batch_col)

=========================== INITIALISATION OF BATCH ============================

Setting some parameters on automatic export of selected files:

[5]:

b.experiment.export_raw = False
b.experiment.export_cycles = False
b.experiment.export_ica = False

Load info from your database and write the corresponding journal pages:

[6]:

b.create_journal()

Create the appropriate folders where cellpy will place the output files:

[7]:

b.paginate()

Have a look at the resulting dataframe:

[8]:

b.pages

[8]:

	argument	mass	total_mass	loading	nom_cap	area	experiment	fixed	label	cell_type	instrument	raw_file_names	cellpy_file_name	comment	group	sub_group
filename
20180418_sf033_2_cc	None	0.337149	0.56	0.190787	3118.817466	1.767146	cycling	0	sf033_2	anode	arbin_res	[data/raw\20180418_sf033_2_cc_01.res]	data/cellpyfiles/20180418_sf033_2_cc.h5	SF12 Filter D micro-slurry	1	1
20180418_sf033_3_cc	None	0.343169	0.57	0.194194	3118.817466	1.767146	cycling	0	sf033_3	anode	arbin_res	[data/raw\20180418_sf033_3_cc_01.res]	data/cellpyfiles/20180418_sf033_3_cc.h5	SF12 Filter D micro-slurry	1	2
20180418_sf033_4_cc	None	0.288984	0.48	0.163532	3118.817466	1.767146	cycling	0	sf033_4	anode	arbin_res	[data/raw\20180418_sf033_4_cc_01.res]	data/cellpyfiles/20180418_sf033_4_cc.h5	SF12 Filter D micro-slurry	1	3
20180418_sf033_5_cc	None	0.295005	0.49	0.166939	3118.817466	1.767146	cycling	0	sf033_5	anode	arbin_res	[data/raw\20180418_sf033_5_cc_01.res]	data/cellpyfiles/20180418_sf033_5_cc.h5	SF12 Filter D micro-slurry	1	4
20180420_sf036_2_cc	None	0.572383	0.95	0.323902	3122.348698	1.767146	cycling	0	sf036_2	anode	arbin_res	[data/raw\20180420_sf036_2_cc_01.res]	data/cellpyfiles/20180420_sf036_2_cc.h5	SF12 Filter 1 micro-slurry	2	1
20180420_sf036_3_cc	None	0.716985	1.19	0.405730	3122.348698	1.767146	cycling	0	sf036_3	anode	arbin_res	[data/raw\20180420_sf036_3_cc_01.res]	data/cellpyfiles/20180420_sf036_3_cc.h5	SF12 Filter 1 micro-slurry	2	2
20180420_sf036_4_cc	None	0.584433	0.97	0.330721	3122.348698	1.767146	cycling	0	sf036_4	anode	arbin_res	[data/raw\20180420_sf036_4_cc_01.res]	data/cellpyfiles/20180420_sf036_4_cc.h5	SF12 Filter 1 micro-slurry	2	3

Note: You can of course also create this dataframe yourself without loading from the .xlsx database file.

Loading data into the initialised batch object#

Now that everything is set up, b.update() loads the data (and exports the corresponding .csv files if export_(raw/cycles/ica) = True). Depending on the size of your data files, this might take some time:

[9]:

b.update()

Exploring batch data#

The report() method creates a report/summary on all the cells in your cellpy batch object:

[10]:

b.report()

[10]:

	mass	total_mass	loading	nom_cap	empty	raw_rows	steps_rows	summary_rows	last_cycle	average_capacity	max_capacity	min_capacity	std_capacity
filename
20180418_sf033_2_cc	0.337149	0.560000	0.190787	3118.817466	False	160059	1578	304	304	1567.198001	2079.481739	0.000000	209.150717
20180418_sf033_3_cc	0.343169	0.570000	0.194194	3118.817466	False	160980	1587	304	304	1597.665927	2103.339517	0.000000	205.046181
20180418_sf033_4_cc	0.288984	0.480000	0.163532	3118.817466	False	155754	1567	304	304	1493.788287	1952.530597	0.000000	189.297846
20180418_sf033_5_cc	0.295005	0.490000	0.166939	3118.817466	False	169567	1588	304	304	1741.579324	2302.442797	0.000000	227.149486
20180420_sf036_2_cc	0.572383	0.950000	0.323902	3122.348698	False	157750	1586	304	304	1479.043916	2319.709751	0.000000	474.421220
20180420_sf036_3_cc	0.716985	1.190000	0.405730	3122.348698	False	134496	1571	304	304	1062.506245	2323.285459	0.000000	622.550951
20180420_sf036_4_cc	0.584433	0.970000	0.330721	3122.348698	False	128547	1561	304	304	880.014288	2608.773865	0.000000	889.235451
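
One way to use these numbers is to rank the cells by how much their capacity varies relative to its average. The sketch below copies average_capacity and std_capacity from the report above into a plain dictionary; the ratio used here (standard deviation divided by average) is just one possible measure and is not something report() computes.

```python
# cell name -> (average_capacity, std_capacity), copied from the report above
report = {
    "20180418_sf033_2_cc": (1567.198001, 209.150717),
    "20180418_sf033_3_cc": (1597.665927, 205.046181),
    "20180418_sf033_4_cc": (1493.788287, 189.297846),
    "20180418_sf033_5_cc": (1741.579324, 227.149486),
    "20180420_sf036_2_cc": (1479.043916, 474.421220),
    "20180420_sf036_3_cc": (1062.506245, 622.550951),
    "20180420_sf036_4_cc": (880.014288, 889.235451),
}


def relative_spread(average: float, std: float) -> float:
    """Standard deviation expressed as a fraction of the average."""
    return std / average


# Cells with the largest relative variation come first.
ranked = sorted(report, key=lambda name: relative_spread(*report[name]), reverse=True)
for name in ranked:
    print(f"{name}: {relative_spread(*report[name]):.2f}")
```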

To get a visual overview of all cells in your cellpy batch object, you can use the convenient b.plot() function. This plots the charge capacity, coulombic efficiency and resistance vs. cycle number. Setting rate=True adds a plot of C-rates.

[11]:

b.plot(rate=True)

Working with batch objects#

The implemented collectors are meant to simplify plotting and exporting when working with batch objects. Available collectors include the BatchSummaryCollector, the BatchCyclesCollector and the BatchICACollector.

Summaries#

The BatchSummaryCollector class collects and shows summaries, including, e.g., the option to show statistical variations in the data (spread=True):

[12]:

group_labels = {1: "starts ok", 2: "starts best"}
discharge_cap_summaries_full = collectors.BatchSummaryCollector(
    b,
    columns=["discharge_capacity_gravimetric"],
    max_cycle=100,
    group_it=True,
    data_collector_arguments=dict(custom_group_labels=group_labels),
    spread=True,
    height=600,
)
discharge_cap_summaries_full.show()

figure name: paper01_collected_summaries_discharge_capacity_gravimetric_average
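
The idea behind grouping with a spread can be illustrated without the collector: cells in the same group are averaged and their standard deviation is reported alongside. The sketch below does this for a single number per cell (group taken from the pages table, average_capacity from the report) rather than per cycle, and it uses the sample standard deviation as an assumption; the collector's own calculation may differ.

```python
from collections import defaultdict
from statistics import mean, stdev

group_labels = {1: "starts ok", 2: "starts best"}

# cell name -> (group from b.pages, average_capacity from b.report())
cells = {
    "20180418_sf033_2_cc": (1, 1567.198001),
    "20180418_sf033_3_cc": (1, 1597.665927),
    "20180418_sf033_4_cc": (1, 1493.788287),
    "20180418_sf033_5_cc": (1, 1741.579324),
    "20180420_sf036_2_cc": (2, 1479.043916),
    "20180420_sf036_3_cc": (2, 1062.506245),
    "20180420_sf036_4_cc": (2, 880.014288),
}

# Collect the values of all cells that share a group label.
values_by_label = defaultdict(list)
for group, value in cells.values():
    values_by_label[group_labels[group]].append(value)

# Average and sample standard deviation per group.
group_stats = {
    label: (mean(values), stdev(values))
    for label, values in values_by_label.items()
}
for label, (average, spread) in group_stats.items():
    print(f"{label}: {average:.1f} +/- {spread:.1f}")
```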

These summaries can be saved for later:

[13]:

# discharge_cap_summaries_full.save(serial_number=1)

Summary data can also be accessed from b.summaries:

[14]:

discharge_capacity = b.summaries.discharge_capacity_gravimetric
charge_capacity = b.summaries.charge_capacity_gravimetric
coulombic_efficiency = b.summaries.coulombic_efficiency
ir_charge = b.summaries.ir_charge

and plotted using matplotlib:

[15]:

fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(discharge_capacity)
ax1.set_ylabel("capacity")
ax2.plot(ir_charge)
ax2.set_xlabel("cycle")
ax2.set_ylabel("resistance")

[15]:

Text(0, 0.5, 'resistance')


Cycles#

The BatchCyclesCollector class creates a collection of capacity plots, including several different options for customization. Two examples are shown here:

[16]:

cells_collected = collectors.BatchCyclesCollector(b, max_cycle=10)
cells_collected.show()

figure name: paper01_collected_cycles_intp_p100_bf_pr_cell

[17]:

cycles_collected = collectors.BatchCyclesCollector(
    b,
    cycles=[1, 2, 3, 10, 100, 200],
    collector_type="forth-and-forth",
    plot_type="fig_pr_cycle",
)
cycles_collected.show()

figure name: paper01_collected_cycles_intp_p100_ff_pr_cyc

Incremental capacity analysis (ICA)#

Similarly, the BatchICACollector creates a collection of ICA (dQ/dV) plots:

[18]:

icas_collected = collectors.BatchICACollector(b, cycles=[2, 3, 4])
icas_collected.show()

figure name: paper01_collected_ica_pr_cell

Looking at individual cells in a batch#

The batch object is in principle a collection of several CellpyCell objects. Those can of course be selected and looked at individually.

To check which cells are contained within your batch, you can simply print the cell names:

[19]:

cell_labels = b.experiment.cell_names
print(cell_labels)

[
    '20180418_sf033_2_cc',
    '20180418_sf033_3_cc',
    '20180418_sf033_4_cc',
    '20180418_sf033_5_cc',
    '20180420_sf036_2_cc',
    '20180420_sf036_3_cc',
    '20180420_sf036_4_cc'
]

Select one cell to look at:

[20]:

label = cell_labels[0]
c = b.experiment.data[label]

Now that you have selected one cell, you can use all the standard cellpy routines available for CellpyCells, e.g. view the available info on this cell:

[21]:

#c

And use the get_cap method to extract and plot voltage curves:

[22]:

cap = c.get_cap(categorical_column=True, method="forth-and-forth")
cap.head(2)

[22]:

	voltage	capacity	direction
267	2.721604	0.000054	-1
268	2.708690	0.002016	-1
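
With categorical_column=True the result carries a direction column, which makes it easy to separate the branches of a curve. The sketch below shows that split on plain tuples. The first two rows are taken from the output above, the last two are made-up values, and split_by_direction is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (voltage, capacity, direction). The first two rows come from the output
# above; the last two are made-up values for illustration only.
rows = [
    (2.721604, 0.000054, -1),
    (2.708690, 0.002016, -1),
    (0.050000, 0.000100, 1),  # hypothetical
    (0.060000, 0.002500, 1),  # hypothetical
]


def split_by_direction(
    rows: List[Tuple[float, float, int]]
) -> Dict[int, List[Tuple[float, float]]]:
    """Group (capacity, voltage) points by the value in the direction column."""
    branches = defaultdict(list)
    for voltage, capacity, direction in rows:
        branches[direction].append((capacity, voltage))
    return dict(branches)


branches = split_by_direction(rows)
print(sorted(branches))  # prints [-1, 1]
```

Plotting each branch separately avoids drawing a line from the end of one branch to the start of the next.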

[23]:

fig, ax = plt.subplots()
ax.plot(cap.capacity, cap.voltage)
ax.set_xlabel("capacity")
ax.set_ylabel("voltage");


Cleaning up the plot a bit…

[24]:

voltage_capacity_100 = c.get_cap(cycle=100, method="forth-and-forth", interpolated=True, number_of_points=80)
voltage_capacity_200 = c.get_cap(cycle=200, method="forth-and-forth", interpolated=True, number_of_points=80)

fig, ax = plt.subplots()
ax.set_xlabel(f"capacity ({c.cellpy_units.charge}/{c.cellpy_units.specific_gravimetric})")
ax.set_ylabel(f"voltage ({c.cellpy_units.voltage} vs. Li/Li+)")
ax.plot(voltage_capacity_100.capacity, voltage_capacity_100.voltage, "o-", label="cycle 100")
ax.plot(voltage_capacity_200.capacity, voltage_capacity_200.voltage, "o-", label="cycle 200")
ax.legend();
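
Resampling a curve to a fixed number of points, as requested with interpolated=True and number_of_points=80, can be pictured as linear interpolation onto an evenly spaced capacity grid. The sketch below is a pure-Python illustration of that idea under the assumption of strictly increasing capacity values; it is not cellpy's implementation, and resample_curve is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def resample_curve(
    capacity: Sequence[float], voltage: Sequence[float], number_of_points: int
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate voltage onto evenly spaced capacity values.

    Assumes ``capacity`` is strictly increasing and ``number_of_points >= 2``.
    """
    start, stop = capacity[0], capacity[-1]
    step = (stop - start) / (number_of_points - 1)
    new_capacity = [start + i * step for i in range(number_of_points)]
    new_voltage = []
    j = 0
    for x in new_capacity:
        # Advance to the segment [capacity[j], capacity[j + 1]] containing x.
        while j < len(capacity) - 2 and capacity[j + 1] < x:
            j += 1
        x0, x1 = capacity[j], capacity[j + 1]
        y0, y1 = voltage[j], voltage[j + 1]
        new_voltage.append(y0 + (y1 - y0) * (x - x0) / (x1 - x0))
    return new_capacity, new_voltage


# Made-up three-point curve resampled to four evenly spaced points.
new_capacity, new_voltage = resample_curve([0.0, 1.0, 3.0], [3.0, 2.0, 1.0], 4)
print(new_capacity)  # prints [0.0, 1.0, 2.0, 3.0]
print(new_voltage)   # prints [3.0, 2.0, 1.5, 1.0]
```

Putting all cycles on the same number of points makes curves from different cycles directly comparable point by point.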

