Batch processing#

The batch processing routines make it convenient to process and compare multiple datasets at once. They rely on a properly configured cellpy installation, including a working config file and a database file. This page gives a basic introduction to setting up and using the batch processing routines.

Setting up things properly#

Make sure you have a properly working config file#

For cellpy to find your files, it needs to know where to look. The config file serves this purpose. It is typically called .cellpy_prms_username.conf and is located in your home or user directory.

For more details on the config file, have a look at Setup and configuration.
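
As a quick sanity check, the sketch below lists candidate config files in a directory (the home directory by default). The glob pattern `.cellpy_prms_*.conf` is an assumption based on the typical file name mentioned above, and `find_config_files` is a hypothetical helper, not part of cellpy.

```python
from pathlib import Path


def find_config_files(directory=None):
    """Return candidate cellpy config files found in `directory`.

    Assumes the typical naming scheme `.cellpy_prms_<username>.conf`.
    `find_config_files` is a hypothetical helper, not a cellpy function.
    """
    directory = Path(directory) if directory is not None else Path.home()
    return sorted(directory.glob(".cellpy_prms_*.conf"))
```

Calling `find_config_files()` without arguments searches the home directory; an empty list suggests that the config file is missing or stored somewhere else.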

The database file#

This notebook uses the cellpy batch utility. For it to work properly (or at all), you have to provide it with a database. Currently, cellpy ships with a very simple database solution that hardly justifies the name: it reads an Excel file in which the first row holds the column headers, the second row gives the type of each column (e.g. string, bool, etc.), and the remaining rows provide the necessary information for each cell (one row per cell). You can of course choose to implement a database and a loader yourself.
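
To make the layout concrete, here is a minimal sketch that turns a sheet with a header row, a type row and data rows into a list of dictionaries. It does not use cellpy's own reader. The column names "name" and "mass", the type names ("int", "float", "str") and the helper `parse_db_rows` are assumptions for illustration; check the sample db-file for the names cellpy actually expects.

```python
def parse_db_rows(rows):
    """Convert header row / type row / data rows into a list of dicts.

    The supported type names are assumptions made for this sketch.
    """
    casters = {"int": int, "float": float, "str": str}
    headers, types, *data = rows
    records = []
    for row in data:
        record = {}
        for header, type_name, value in zip(headers, types, row):
            record[header] = casters[type_name](value)
        records.append(record)
    return records


# Hypothetical sheet content (illustration only).
sheet = [
    ["id", "name", "b01", "mass"],     # row 1: column headers
    ["int", "str", "str", "float"],    # row 2: column types
    [1, "20180418_sf033_2_cc", "paper01", 0.337149],  # one row per cell
    [2, "20180418_sf033_3_cc", "paper01", 0.343169],
]

records = parse_db_rows(sheet)
print(records[0])
```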

A sample Excel file (“db-file”) is provided in the examples folder on GitHub. You will need to fill in the values manually, one row for each cell you want to load. Then put the file in the database folder (as defined in your config file where it says db_path: in the Paths section). The name of the file must also match the one defined in the config file (db_filename:, e.g. cellpy_db.xlsx in the example config file).

When cellpy reads the file, it uses the batch column (see below) to select which rows (i.e. cells) to load. For example, if you tell cellpy to use the “b01” batch column and provide it with the name “casandras_experiment”, it will select only the rows that have “casandras_experiment” in the “b01” column. You provide cellpy with the “lookup” name when you issue the batch.init command, for example:

b = batch.init("paper01", "cool_project", batch_col="b01")
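
The selection rule can be sketched in plain Python as a simple filter. This illustrates the idea only and is not cellpy's implementation; the records, the name "another_experiment" and the helper `select_rows` are made up for the example.

```python
def select_rows(records, batch_col, lookup_name):
    """Return the records whose `batch_col` entry equals `lookup_name`."""
    return [r for r in records if r.get(batch_col) == lookup_name]


# Hypothetical records (illustration only).
records = [
    {"id": 1, "b01": "casandras_experiment"},
    {"id": 2, "b01": "another_experiment"},
    {"id": 3, "b01": "casandras_experiment"},
]

selected = select_rows(records, batch_col="b01", lookup_name="casandras_experiment")
print([r["id"] for r in selected])  # → [1, 3]
```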

The columns colored green must always be filled out. Also make sure that the id column (the first one in the example xlsx file) contains a unique integer for each row, since it is used as a “key” when looking up information in the file.
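
A small check like the following can catch duplicated ids before they cause confusing lookups. It is a sketch that assumes the id values have already been read into a list; `duplicated_ids` is a hypothetical helper.

```python
from collections import Counter


def duplicated_ids(ids):
    """Return the id values that occur more than once, in sorted order."""
    counts = Counter(ids)
    return sorted(value for value, n in counts.items() if n > 1)


print(duplicated_ids([1, 2, 3, 4]))     # → []
print(duplicated_ids([1, 2, 2, 5, 5]))  # → [2, 5]
```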

Filenames#

Make sure that the names of your experiment-files (for example your .res files) are of the form date_something_that_describes_the_cell.res (this is the name-format supported at the moment).
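
File names can be checked against this pattern with a regular expression. The sketch below assumes that the date part consists of eight digits (YYYYMMDD), as in the example file names used later on this page; cellpy itself may accept other date formats, and `split_raw_file_name` is a hypothetical helper.

```python
import re

# Assumption: the date part is eight digits (YYYYMMDD).
NAME_PATTERN = re.compile(r"^(?P<date>\d{8})_(?P<description>.+)\.res$")


def split_raw_file_name(file_name):
    """Return (date, description) if the name matches, otherwise None."""
    match = NAME_PATTERN.match(file_name)
    if match is None:
        return None
    return match.group("date"), match.group("description")


print(split_raw_file_name("20180418_sf033_2_cc_01.res"))
# → ('20180418', 'sf033_2_cc_01')
print(split_raw_file_name("sf033_2_cc.res"))
# → None
```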

Loading batch data#

[1]:
import numpy as np
import matplotlib.pyplot as plt
import plotly.express as px
from rich import print

import cellpy
from cellpy import prms
from cellpy import prmreader
from cellpy.utils import batch, collectors

Check and (if necessary) override some of the configuration parameters:

[2]:
prms.Paths.db_path = "."
prms.Paths.db_filename = "cellpy_db.xlsx"
prms.Paths.rawdatadir = "data/raw"
prms.Paths.cellpydatadir = "data/cellpyfiles"
prms.Paths.filelogdir = "out"
prms.Paths.notebookdir = "out"
prms.Paths.batchfiledir = "out"
prms.Paths.outdatadir = "out"
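
Whether cellpy creates missing directories on its own is not covered here, so it can be convenient to make sure they exist beforehand. The following sketch uses only the standard library; `ensure_directories` is a hypothetical helper.

```python
from pathlib import Path


def ensure_directories(base_dir, relative_dirs):
    """Create each directory below `base_dir` if it does not already exist."""
    created = []
    for relative in relative_dirs:
        path = Path(base_dir) / relative
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


# The relative paths mirror the values assigned to prms.Paths above, e.g.:
# ensure_directories(".", ["data/raw", "data/cellpyfiles", "out"])
```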

Initialising the cellpy batch object#

To create Journal Pages, appropriate names for the project and the experiment have to be set:

[3]:
project = "cool_project"
name = "paper01"
batch_col = "b01"
[4]:
print(" INITIALISATION OF BATCH ".center(80, "="))
b = batch.init(name, project, batch_col=batch_col)
=========================== INITIALISATION OF BATCH ============================

Set some parameters that control the automatic export of selected files:

[5]:
b.experiment.export_raw = False
b.experiment.export_cycles = False
b.experiment.export_ica = False

Load info from your database and write the corresponding journal pages:

[6]:
b.create_journal()

Create the appropriate folders where cellpy will place the output files:

[7]:
b.paginate()

Have a look at the resulting dataframe:

[8]:
b.pages
[8]:
filename             argument  mass      total_mass  loading   nom_cap      area      experiment  fixed  label    cell_type  instrument  raw_file_names                         cellpy_file_name                         comment                     group  sub_group
20180418_sf033_2_cc  None      0.337149  0.56        0.190787  3118.817466  1.767146  cycling     0      sf033_2  anode      arbin_res   [data/raw\20180418_sf033_2_cc_01.res]  data/cellpyfiles/20180418_sf033_2_cc.h5  SF12 Filter D micro-slurry  1      1
20180418_sf033_3_cc  None      0.343169  0.57        0.194194  3118.817466  1.767146  cycling     0      sf033_3  anode      arbin_res   [data/raw\20180418_sf033_3_cc_01.res]  data/cellpyfiles/20180418_sf033_3_cc.h5  SF12 Filter D micro-slurry  1      2
20180418_sf033_4_cc  None      0.288984  0.48        0.163532  3118.817466  1.767146  cycling     0      sf033_4  anode      arbin_res   [data/raw\20180418_sf033_4_cc_01.res]  data/cellpyfiles/20180418_sf033_4_cc.h5  SF12 Filter D micro-slurry  1      3
20180418_sf033_5_cc  None      0.295005  0.49        0.166939  3118.817466  1.767146  cycling     0      sf033_5  anode      arbin_res   [data/raw\20180418_sf033_5_cc_01.res]  data/cellpyfiles/20180418_sf033_5_cc.h5  SF12 Filter D micro-slurry  1      4
20180420_sf036_2_cc  None      0.572383  0.95        0.323902  3122.348698  1.767146  cycling     0      sf036_2  anode      arbin_res   [data/raw\20180420_sf036_2_cc_01.res]  data/cellpyfiles/20180420_sf036_2_cc.h5  SF12 Filter 1 micro-slurry  2      1
20180420_sf036_3_cc  None      0.716985  1.19        0.405730  3122.348698  1.767146  cycling     0      sf036_3  anode      arbin_res   [data/raw\20180420_sf036_3_cc_01.res]  data/cellpyfiles/20180420_sf036_3_cc.h5  SF12 Filter 1 micro-slurry  2      2
20180420_sf036_4_cc  None      0.584433  0.97        0.330721  3122.348698  1.767146  cycling     0      sf036_4  anode      arbin_res   [data/raw\20180420_sf036_4_cc_01.res]  data/cellpyfiles/20180420_sf036_4_cc.h5  SF12 Filter 1 micro-slurry  2      3

Note: You can of course also create this dataframe yourself without loading from the .xlsx database file.

Loading data into the initialised batch object#

Now that everything is set up, b.update() loads the data (and exports the corresponding .csv files if export_raw, export_cycles or export_ica is set to True). Depending on the size of your data files, this might take some time:

[9]:
b.update()

Exploring batch data#

The report() method creates a report/summary on all the cells in your cellpy batch object:

[10]:
b.report()
[10]:
filename             mass      total_mass  loading   nom_cap      empty  raw_rows  steps_rows  summary_rows  last_cycle  average_capacity  max_capacity  min_capacity  std_capacity
20180418_sf033_2_cc  0.337149  0.560000    0.190787  3118.817466  False  160059    1578        304           304         1567.198001       2079.481739   0.000000      209.150717
20180418_sf033_3_cc  0.343169  0.570000    0.194194  3118.817466  False  160980    1587        304           304         1597.665927       2103.339517   0.000000      205.046181
20180418_sf033_4_cc  0.288984  0.480000    0.163532  3118.817466  False  155754    1567        304           304         1493.788287       1952.530597   0.000000      189.297846
20180418_sf033_5_cc  0.295005  0.490000    0.166939  3118.817466  False  169567    1588        304           304         1741.579324       2302.442797   0.000000      227.149486
20180420_sf036_2_cc  0.572383  0.950000    0.323902  3122.348698  False  157750    1586        304           304         1479.043916       2319.709751   0.000000      474.421220
20180420_sf036_3_cc  0.716985  1.190000    0.405730  3122.348698  False  134496    1571        304           304         1062.506245       2323.285459   0.000000      622.550951
20180420_sf036_4_cc  0.584433  0.970000    0.330721  3122.348698  False  128547    1561        304           304         880.014288        2608.773865   0.000000      889.235451

To get a visual overview of all cells in your cellpy batch object, you can use the convenient b.plot() function. It plots the charge capacity, coulombic efficiency and resistance vs. cycle number. Setting rate=True adds a plot of C-rates.

[11]:
b.plot(rate=True)

Working with batch objects#

The implemented Collectors are meant to simplify plotting and exporting when working with batch objects. Available collectors include the BatchSummaryCollector, the BatchCyclesCollector and the BatchICACollector.

Summaries#

The BatchSummaryCollector class collects and shows summaries, with options such as showing the statistical variation in the data (spread=True):

[12]:
group_labels = {1: "starts ok", 2: "starts best"}
discharge_cap_summaries_full = collectors.BatchSummaryCollector(
    b,
    columns=["discharge_capacity_gravimetric"],
    max_cycle=100,
    group_it=True,
    data_collector_arguments=dict(custom_group_labels=group_labels),
    spread=True,
    height=600,
)
discharge_cap_summaries_full.show()
figure name: paper01_collected_summaries_discharge_capacity_gravimetric_average
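
One way to picture what a spread plot conveys is the per-cycle mean and standard deviation across the cells of a group. The sketch below computes these with the standard library. That spread=True corresponds to exactly these quantities is an assumption, not a description of cellpy's implementation, and the capacity values are made up.

```python
from statistics import mean, stdev


def group_spread(capacities_per_cell):
    """Return (mean, sample standard deviation) per cycle across cells.

    `capacities_per_cell` is a list of equally long lists, one per cell.
    """
    per_cycle = zip(*capacities_per_cell)
    return [(mean(values), stdev(values)) for values in per_cycle]


# Made-up capacities for three cells over three cycles (illustration only).
group = [
    [2000.0, 1900.0, 1800.0],
    [2100.0, 1950.0, 1850.0],
    [2200.0, 2000.0, 1900.0],
]

for cycle, (avg, std) in enumerate(group_spread(group), start=1):
    print(f"cycle {cycle}: {avg:.1f} +/- {std:.1f}")
```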

These summaries can be saved for later:

[13]:
# discharge_cap_summaries_full.save(serial_number=1)

Summary data can also be accessed from b.summaries:

[14]:
discharge_capacity = b.summaries.discharge_capacity_gravimetric
charge_capacity = b.summaries.charge_capacity_gravimetric
coulombic_efficiency = b.summaries.coulombic_efficiency
ir_charge = b.summaries.ir_charge

and plotted using matplotlib:

[15]:
fig, (ax1, ax2) = plt.subplots(2, 1)
ax1.plot(discharge_capacity)
ax1.set_ylabel("capacity")
ax2.plot(ir_charge)
ax2.set_xlabel("cycle")
ax2.set_ylabel("resistance")
[15]:
Text(0, 0.5, 'resistance')
[Figure: capacity (top panel) and resistance (bottom panel) vs. cycle]

Cycles#

The BatchCyclesCollector class creates a collection of capacity plots, including several different options for customization. Two examples are shown here:

[16]:
cells_collected = collectors.BatchCyclesCollector(b, max_cycle=10)
cells_collected.show()
figure name: paper01_collected_cycles_intp_p100_bf_pr_cell
[17]:
cycles_collected = collectors.BatchCyclesCollector(
    b,
    cycles=[1, 2, 3, 10, 100, 200],
    collector_type="forth-and-forth",
    plot_type="fig_pr_cycle",
)
cycles_collected.show()
figure name: paper01_collected_cycles_intp_p100_ff_pr_cyc

Incremental capacity analysis (ICA)#

Similarly, the BatchICACollector creates a collection of ICA (dQ/dV) plots:

[18]:
icas_collected = collectors.BatchICACollector(b, cycles=[2, 3, 4])
icas_collected.show()
figure name: paper01_collected_ica_pr_cell
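
For readers unfamiliar with ICA, dQ/dV is the derivative of capacity with respect to voltage. The sketch below estimates it with simple finite differences between neighbouring points. It only illustrates the quantity being plotted and is not cellpy's implementation (practical ICA usually also involves smoothing and interpolation); the data points are made up.

```python
def incremental_capacity(voltage, capacity):
    """Estimate dQ/dV by finite differences between neighbouring points.

    Returns (midpoint_voltages, dq_dv).
    """
    mid_voltage, dq_dv = [], []
    for i in range(len(voltage) - 1):
        dv = voltage[i + 1] - voltage[i]
        if dv == 0:
            continue  # skip flat voltage steps to avoid division by zero
        mid_voltage.append((voltage[i] + voltage[i + 1]) / 2)
        dq_dv.append((capacity[i + 1] - capacity[i]) / dv)
    return mid_voltage, dq_dv


# Made-up voltage/capacity points (illustration only).
voltage = [0.10, 0.20, 0.30, 0.30, 0.50]
capacity = [0.0, 10.0, 30.0, 35.0, 45.0]
v_mid, dqdv = incremental_capacity(voltage, capacity)
print([round(value, 3) for value in dqdv])
```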

Looking at individual cells in a batch#

The batch object is in principle a collection of several CellpyCell objects. These can be selected and inspected individually.

To check which cells are contained within your batch, you can simply print the cell names:

[19]:
cell_labels = b.experiment.cell_names
print(cell_labels)
[
    '20180418_sf033_2_cc',
    '20180418_sf033_3_cc',
    '20180418_sf033_4_cc',
    '20180418_sf033_5_cc',
    '20180420_sf036_2_cc',
    '20180420_sf036_3_cc',
    '20180420_sf036_4_cc'
]

Select one cell to look at:

[20]:
label = cell_labels[0]
c = b.experiment.data[label]
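
As an aside, the same two attributes (b.experiment.cell_names and b.experiment.data) can be combined to visit every cell in turn. The generator below is a small sketch; `iter_cells` is a hypothetical helper, not a cellpy function.

```python
def iter_cells(b):
    """Yield (label, cell) pairs for every cell in a batch object.

    Relies only on `b.experiment.cell_names` and `b.experiment.data[label]`.
    """
    for label in b.experiment.cell_names:
        yield label, b.experiment.data[label]


# Usage with an initialised and updated batch object `b`:
# for label, cell in iter_cells(b):
#     print(label, type(cell).__name__)
```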

Now that you have selected one cell, you can use all the standard cellpy routines available for CellpyCells, e.g. view the available info on this cell:

[21]:
#c

And use the get_cap method to extract and plot voltage curves:

[22]:
cap = c.get_cap(categorical_column=True, method="forth-and-forth")
cap.head(2)
[22]:
voltage capacity direction
267 2.721604 0.000054 -1
268 2.708690 0.002016 -1
[23]:
fig, ax = plt.subplots()
ax.plot(cap.capacity, cap.voltage)
ax.set_xlabel("capacity")
ax.set_ylabel("voltage");
[Figure: voltage vs. capacity for the selected cell]

Cleaning up the plot a bit…

[24]:
voltage_capacity_100 = c.get_cap(cycle=100, method="forth-and-forth", interpolated=True, number_of_points=80)
voltage_capacity_200 = c.get_cap(cycle=200, method="forth-and-forth", interpolated=True, number_of_points=80)

fig, ax = plt.subplots()
ax.set_xlabel(f"capacity ({c.cellpy_units.charge}/{c.cellpy_units.specific_gravimetric})")
ax.set_ylabel(f"voltage ({c.cellpy_units.voltage} vs. Li/Li+)")
ax.plot(voltage_capacity_100.capacity, voltage_capacity_100.voltage, "o-", label="cycle 100")
ax.plot(voltage_capacity_200.capacity, voltage_capacity_200.voltage, "o-", label="cycle 200")
ax.legend();
[Figure: voltage vs. capacity for cycles 100 and 200]
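
The options interpolated=True and number_of_points=80 give curves with a fixed number of points. The general idea of resampling a curve onto evenly spaced points can be sketched with linear interpolation as shown below. Which variable cellpy interpolates over, and by which method, is not covered here, so this is an illustration only; `interpolate_curve` is a hypothetical helper and the data are made up.

```python
def interpolate_curve(x, y, number_of_points):
    """Linearly resample (x, y) onto evenly spaced x values.

    Assumes `x` is strictly increasing and `number_of_points` >= 2.
    """
    step = (x[-1] - x[0]) / (number_of_points - 1)
    new_x = [x[0] + i * step for i in range(number_of_points)]
    new_x[-1] = x[-1]  # guard against floating-point drift at the end point
    new_y = []
    j = 0
    for xi in new_x:
        # Advance to the segment [x[j], x[j + 1]] that contains xi.
        while j < len(x) - 2 and xi > x[j + 1]:
            j += 1
        fraction = (xi - x[j]) / (x[j + 1] - x[j])
        new_y.append(y[j] + fraction * (y[j + 1] - y[j]))
    return new_x, new_y


# Made-up capacity/voltage points (illustration only).
capacity = [0.0, 1.0, 3.0]
voltage = [1.0, 0.5, 0.1]
new_capacity, new_voltage = interpolate_curve(capacity, voltage, number_of_points=4)
print(new_capacity)
```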