Basic usage

The getting started with cellpy tutorial (opinionated version)

This tutorial will help you get started with cellpy and tries to give you a step-by-step recipe. It starts with installation, and you should select the installation method that best suits your needs (or your level).

How to install and run cellpy - the teaspoon explanation for standard users

If you are used to installing stuff from the command line (or shell), then things might very well run smoothly. If you are not, then you might want to read through the guide for complete beginners first (see Setting up cellpy on Windows for complete beginners below).

1. Install a scientific stack of python 3.x

If the words “virtual environment” or “miniconda” do not ring any bells, you should install the Anaconda scientific Python distribution. Go to www.anaconda.com and select the Anaconda distribution (press the Download Now button). Use at least python 3.9, and select the 64 bit version (if you fail at installing the 64 bit version, then you can try the weaker 32 bit version). Download it and let it install.

Caution

The bit version sometimes matters, so try to make a mental note of what you selected. For example, if you plan to use the Microsoft Access odbc driver (see below) and it is 32-bit, you should probably choose to install a 32-bit Python version.
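
If you are unsure which bit version you ended up with, you can ask Python itself. The snippet below is a minimal sketch using only the standard library; the helper name python_bitness is made up for this illustration and is not part of cellpy.

```python
import struct
import sys


def python_bitness() -> int:
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    # struct.calcsize("P") is the size of a C pointer in bytes.
    return struct.calcsize("P") * 8


bits = python_bitness()
print(f"Python {sys.version_info.major}.{sys.version_info.minor} is a {bits}-bit build")
```

Run it in the same environment you plan to use for cellpy, since different environments can contain different Python builds.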

Python should now be available on your computer, as well as a large number of Python packages. Anaconda is kind enough to also install an alternative command window called “Anaconda Prompt” that has the correct settings ensuring that the conda command works as it should.

2. Create a virtual environment

This step can be omitted (but it’s not necessarily smart to do so). Create a virtual conda environment called cellpy (the name is not important, but it should be a name you are able to remember) by following the steps below:

Open up the “Anaconda Prompt” (or use the command window) and type

conda create -n cellpy

This creates your virtual environment (here called cellpy) in which cellpy will be installed and used.

You then have to activate the environment:

conda activate cellpy

3. Install cellpy

In your activated cellpy environment in the Anaconda Prompt if you chose to make one, or in the base environment if you chose not to, run:

conda install -c conda-forge cellpy

Congratulations, you have (hopefully) successfully installed cellpy.

If you run into problems, double-check that all your dependencies are installed and check your Microsoft Access odbc drivers.

4. Check your cellpy installation

The easiest way to check if cellpy has been installed is to issue the command for printing the version number to the screen:

cellpy info --version

If the program prints the expected version number, you probably succeeded. If it crashes, then you will have to retrace your steps, redo stuff and hope for the best. If it prints an older (lower) version number than you expect, there is a big chance that you have installed it earlier, and what you would like to do is an upgrade instead of an install:

python -m pip install --upgrade cellpy

If you want to install a pre-release (a version that is so bleeding edge that it ends with an alpha or beta release identification, e.g. ends with .b2), you will need to add the --pre modifier:

python -m pip install --pre cellpy

To run a more complete check of your installation, there exists a cellpy sub-command that can be helpful:

cellpy info --check
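
The installed version can also be looked up from inside Python, which is handy in a notebook. The following is a small sketch based on the standard library module importlib.metadata; the helper name installed_version is an assumption made for this example and not something cellpy provides.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("cellpy")
if version is None:
    print("cellpy is not installed in this environment")
else:
    print(f"cellpy {version} is installed")
```

If this reports that cellpy is missing even though the cellpy command works in another window, you are most likely running Python from a different environment.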

5. Set up cellpy

After you have installed cellpy it is highly recommended that you create an appropriate configuration file and folders for raw data, cellpy-files, logs, databases and output data (and inform cellpy about it).

To do this, run the setup command:

cellpy setup

To run the setup in interactive mode, use -i:

cellpy setup -i

This creates the cellpy configuration file .cellpy_prms_USERNAME.conf in your home directory (USERNAME = your user name) and creates the standard cellpy_data folders (if they do not exist). The -i option makes sure that the setup is done interactively: The program will ask you about where specific folders are, e.g. where you would like to put your outputs and where your cell data files are located. If the folders do not exist, cellpy will try to create them.

If you want to specify a root folder different from the default (your HOME folder), you can use the -d option e.g. cellpy setup -i -d /Users/kingkong/cellpydir

Hint

You can always edit your configuration directly in the cellpy configuration file .cellpy_prms_USERNAME.conf. This file should be located inside your home directory, ~ on POSIX systems and C:\Users\USERNAME on not-too-old versions of Windows.
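
If you want to look for the configuration file from Python, something along these lines should work. It is only a sketch built on the naming pattern described above; the function names are hypothetical, "kingkong" is a placeholder user name, and the cellpy info -l command remains the authoritative way of finding the file.

```python
from pathlib import Path


def expected_config_path(username: str, home: Path) -> Path:
    """Build the file name pattern described above: .cellpy_prms_USERNAME.conf in `home`."""
    return home / f".cellpy_prms_{username}.conf"


def find_config_files(home: Path) -> list:
    """List all files in `home` that match the .cellpy_prms_*.conf pattern."""
    return sorted(home.glob(".cellpy_prms_*.conf"))


home = Path.home()
print(expected_config_path("kingkong", home))  # "kingkong" is a placeholder user name
print(find_config_files(home))  # empty list if cellpy setup has not been run yet
```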

6. Create a notebook and run cellpy

Inside your Anaconda Prompt window, write:

jupyter notebook  # or jupyter lab

Your browser should then open and you are ready to write your first cellpy script.

There are many good tutorials on how to work with jupyter. This one by Real Python is good for beginners: Jupyter Notebook: An Introduction

Setting up cellpy on Windows for complete beginners

This guide provides step-by-step instructions for installing Cellpy on a Windows system, especially tailored for beginners.

1. Installing Python

  • First, download Python from the official website. Choose the latest version for Windows.

  • Run the downloaded installer. On the first screen of the setup, make sure to check the box saying “Add Python to PATH” before clicking “Install Now”.

  • After installation, you can verify it by opening the Command Prompt (see below) and typing:

    python --version
    

    This command should return the version of Python that you installed.

2. Opening Command Prompt

  • Press the Windows key, usually located at the bottom row of your keyboard, between the Ctrl and Alt keys.

  • Type “Command Prompt” into the search bar that appears at the bottom of the screen when you press the Windows key.

  • Click on the “Command Prompt” application to open it.

3. Creating a Virtual Environment

A virtual environment is a tool that helps to keep dependencies required by different projects separate by creating isolated Python environments for them. Here’s how to create one:

  • Open Command Prompt.

  • Navigate to the directory where you want to create your virtual environment using the cd command. For example:

    cd C:\Users\YourUsername\Documents
    
  • Type the following command and press enter to create a new virtual environment (replace envname with the name you want to give to your virtual environment):

    python -m venv envname
    
  • To activate the virtual environment, type the following command and press enter:

    envname\Scripts\activate
    

    You’ll know it worked if you see (envname) before the prompt in your Command Prompt window.
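
If you are in doubt about which environment a Python session is actually using, you can check from within Python. This is a minimal standard-library sketch (the helper name in_venv is made up for the example). It detects environments created with python -m venv; conda environments are not detected this way, so the conda-specific environment variable is printed as well.

```python
import os
import sys


def in_venv() -> bool:
    """True when running inside an environment created with `python -m venv`."""
    # Inside a venv, sys.prefix points to the environment while
    # sys.base_prefix still points to the base installation.
    return sys.prefix != sys.base_prefix


print("venv active:", in_venv())
print("environment location:", sys.prefix)
# conda sets this variable for an activated conda environment (not used by venv)
print("conda environment:", os.environ.get("CONDA_DEFAULT_ENV", "none"))
```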

4. Installing Jupyter Notebook and matplotlib

Jupyter Notebook is an open-source web application that allows you to create documents containing live code, equations, visualizations, and text. It’s very useful, especially for beginners. To install Jupyter Notebook:

  • Make sure your virtual environment is activated.

  • Type the following command and press enter:

    python -m pip install jupyter matplotlib
    

5. Installing cellpy

Next, you need to install cellpy. You can install it via pip (Python’s package manager). To install cellpy:

  • Make sure your virtual environment is activated.

  • Type the following command and press enter:

    python -m pip install cellpy
    

6. Launching Jupyter Notebook

  • Make sure your virtual environment is activated.

  • Type the following command and press enter:

    jupyter notebook
    
  • This will open a new tab in your web browser with the Jupyter interface. From there, create a new Python notebook by clicking on “New” > “Python 3”.

7. Trying out cellpy

Here’s a simple example of how to use Cellpy in a Jupyter notebook:

  • In the first cell of the notebook, import Cellpy by typing:

    import cellpy
    

    Press Shift + Enter to run the cell.

  • In the new cell, load your data file (replace “datafile.res” and “/path/to/your/data” with your actual filename and path):

    filepath = "/path/to/your/data/datafile.res"
    
    c = cellpy.get(filepath)  # create a new cellpy object
    

    Press Shift + Enter to run the cell and load the data.

  • To see a summary of the loaded data, create a new cell and type:

    print(c.data.summary.head())
    

    Press Shift + Enter to run the cell and print the summary.

Congratulations! You’ve successfully set up Cellpy in a virtual environment on your Windows PC and loaded your first data file. For more information and examples, check out the official Cellpy documentation.

Cellpy includes convenient functions for accessing the data. Here’s a basic example of how to plot voltage vs. capacity.

  • In a new cell in your Jupyter notebook, first, import matplotlib, which is a Python plotting library:

    import matplotlib.pyplot as plt
    

    Press Shift + Enter to run the cell.

  • Then, iterate through all cycle numbers, extract the capacity curves and plot them:

    for cycle in c.get_cycle_numbers():
        d = c.get_cap(cycle)
        plt.plot(d["capacity"], d["voltage"])
    plt.show()
    

    Press Shift + Enter to run the cell.

    This will draw one voltage-capacity curve for each cycle in the loaded data, all in the same figure.

Once you’ve loaded your data, you can save it to an hdf5 file for later use:

c.save("saved_data.h5")

This saves the loaded data to a file named ‘saved_data.h5’.

Now, let’s try to create some dQ/dV plots. dQ/dV is a plot of the change in capacity (Q) with respect to the change in voltage (V). It’s often used in battery analysis to observe specific electrochemical reactions. Here’s how to create one:

  • In a new cell in your Jupyter notebook, first, if you have not imported matplotlib:

    import matplotlib.pyplot as plt
    

    Press Shift + Enter to run the cell.

  • Then, calculate dQ/dV using Cellpy’s ica utility:

    import cellpy.utils.ica as ica
    
    dqdv = ica.dqdv_frames(c, cycle=[1, 10, 100], voltage_resolution=0.01)
    

    Press Shift + Enter to run the cell.

  • Now, you can create a plot of dQ/dV. In a new cell, type:

    plt.figure(figsize=(10, 8))
    plt.plot(dqdv["v"], dqdv["dq"], label="dQ/dV")
    plt.xlabel("Voltage (V)")
    plt.ylabel("dQ/dV (Ah/V)")
    plt.legend()
    plt.grid(True)
    plt.show()
    

    Press Shift + Enter to run the cell.

In the code above, plt.figure is used to create a new figure, plt.plot plots the data, plt.xlabel and plt.ylabel set the labels for the x and y axes, plt.legend adds a legend to the plot, plt.grid adds a grid to the plot, and plt.show displays the plot.

With this, you should be able to see the dQ/dV plot in your notebook.

Remember that the process of creating a dQ/dV plot can be quite memory-intensive, especially for large datasets, so it may take a while for the plot to appear.
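
To make the definition of dQ/dV concrete, it can be approximated by dividing the change in capacity by the change in voltage between neighbouring data points. The sketch below does just that in plain Python with made-up numbers. It is not how the ica module works internally (real data normally needs interpolation and smoothing first); the function name finite_difference_dqdv is an assumption for this illustration.

```python
def finite_difference_dqdv(voltage, capacity):
    """Approximate dQ/dV between consecutive points.

    Returns (midpoint_voltages, dqdv_values). Pairs with no change in
    voltage are skipped to avoid dividing by zero.
    """
    v_mid, dqdv = [], []
    points = list(zip(voltage, capacity))
    for (v0, q0), (v1, q1) in zip(points, points[1:]):
        dv = v1 - v0
        if dv == 0:
            continue
        v_mid.append((v0 + v1) / 2)
        dqdv.append((q1 - q0) / dv)
    return v_mid, dqdv


# made-up numbers for illustration only
voltage = [0.0, 0.5, 1.0, 1.0, 2.0]
capacity = [0.0, 1.0, 3.0, 3.5, 4.0]
v_mid, dqdv = finite_difference_dqdv(voltage, capacity)
print(v_mid)
print(dqdv)
```

A voltage plateau (small change in voltage for a large change in capacity) shows up as a peak in dQ/dV, which is why these plots are useful for spotting electrochemical reactions.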

For more information and examples, check out the official Cellpy documentation and the matplotlib documentation.

This recipe can only take you a certain distance. If you want to become more efficient with Python and Cellpy, you might want to try to install it using the method described in the chapter “Installing and setting up cellpy” in the official Cellpy documentation.

More about installing and setting up cellpy

Fixing dependencies

To make sure your environment contains the correct packages and dependencies required for running cellpy, you can create an environment based on the available environment.yml file. Download the environment.yml file and place it in the directory shown in your Anaconda Prompt. If you want to change the name of the environment, you can do this by changing the first line of the file. Then type (in the Anaconda Prompt):

conda env create -f environment.yml

Then activate your environment:

conda activate cellpy

cellpy relies on a number of other Python packages and these need to be installed. Most of these packages are included when creating the environment based on the environment.yml file as outlined above.

Basic dependencies

In general, you need the typical scientific Python stack, including

  • numpy

  • scipy

  • pandas

Additional dependencies are:

  • pytables is needed for working with the hdf5 files (the cellpy-files):

    conda install -c conda-forge pytables

  • lmfit is required to use some of the fitting routines in cellpy:

    conda install -c conda-forge lmfit

  • holoviz and plotly: plotting libraries used in several of our example notebooks.

  • jupyter: used for tutorial notebooks and in general a very useful tool for working with and sharing your cellpy results.

For more details, I recommend that you look at the documentation of these packages (google it) and install them. You can most likely use the same method as for pytables etc.
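
A quick way of seeing which of these packages are missing from the active environment is to ask Python whether it can find them. The following is a sketch using the standard library only; the helper name missing_modules is hypothetical. Note that the import name sometimes differs from the package name: pytables is imported as tables and python-box as box.

```python
from importlib import util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be found for import."""
    return [name for name in module_names if util.find_spec(name) is None]


# import names: pytables is imported as "tables", python-box as "box"
wanted = ["numpy", "scipy", "pandas", "tables", "lmfit", "box", "cellpy"]
missing = missing_modules(wanted)
print("missing:", missing if missing else "nothing")
```

This only checks that the modules can be located, not that they import without errors or have a suitable version.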

Additional requirements for .res files

Note! .res files from Arbin testers are actually in a Microsoft Access format.

For Windows users: if you do not have one of the most recent Office versions, you might not be allowed to install a driver with a different bit version than the one your Office installation is using (installers are available from Microsoft). Also note that the driver needs to have the same bit version as your Python (so, if you are using 32-bit Python, you will need the 32-bit driver).

For POSIX systems: I have not found any suitable drivers. Instead, cellpy will try to use mdbtools to first export the data to temporary csv-files, and then import from those csv-files (using the pandas library). You can install mdbtools using your system’s preferred package manager (e.g. apt-get install mdbtools).

The cellpy configuration file

The paths to raw data, the cellpy data base file, file locations etc. are set in the .cellpy_prms_USER.conf file that is located in your home directory.

To get the filepath to your config file (and other cellpy info), run:

cellpy info -l

The config file is written in YAML format and it should be relatively easy to edit it in a text editor.

Within the config file, the paths are the most important parts that need to be set up correctly. This tells cellpy where to find (and save) different files, such as the database file and raw data.

Furthermore, the config file contains details about the database-file to be used for cell info and metadata (i.e. type and structure of the database file such as column headers etc.). For more details, see chapter on Configuring cellpy.

The ‘database’ file

The database file should contain information (cell name, type, mass loading etc.) on your cells, so that cellpy can find and link the test data to the provided metadata.

The database file is also useful when working with the cellpy batch routine.

Useful cellpy commands

To help with installing and controlling your cellpy installation, a CLI (command-line interface) is provided with several commands (including the already mentioned info for getting information about your installation, and setup for helping you to set up your installation and writing a configuration file).

To get a list of these commands including some basic information, you can issue

cellpy --help

This will output some (hopefully) helpful text

Usage: cellpy [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  edit   Edit your cellpy config file.
  info   This will give you some valuable information about your cellpy.
  new    Set up a batch experiment.
  pull   Download examples or tests from the big internet.
  run    Run a cellpy process.
  serve  Start a Jupyter server
  setup  This will help you to setup cellpy.

You can also get information about the sub-commands by issuing --help after them. For example, issuing

cellpy info --help

gives

Usage: cellpy info [OPTIONS]

Options:
 -v, --version    Print version information.
 -l, --configloc  Print full path to the config file.
 -p, --params     Dump all parameters to screen.
 -c, --check      Do a sanity check to see if things works as they should.
 --help           Show this message and exit.

Running your first script

As with most software, you are encouraged to play a little with it. I hope there is some useful stuff in the code repository (for example in the examples folder).

Hint

The cellpy pull command can assist in downloading both examples and tests.

Start by trying to import cellpy in an interactive Python session. If you have an icon to press to start up Python in interactive mode, do that (it could also be for example an ipython console or a Jupyter Notebook). You can also start an interactive Python session from your terminal window or command window by just writing python and pressing enter. Hint: Remember to activate your cellpy (or whatever name you chose) environment.

Once inside Python, try issuing import cellpy. Hopefully you will not see any error messages.

Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:36:06)
[MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cellpy
>>>

Nothing bad happened this time. If you got an error message, try to interpret it and check if you have skipped any steps in this tutorial. Maybe you are missing the box package? If so, go out of the Python interpreter if you started it in your command window, or open another command window and write

pip install python-box

and try again.

Now let’s try to be a bit more ambitious. Start up python again if you are not still running it and try this:

>>> from cellpy import prmreader
>>> prmreader.info()

The prmreader.info() command should print out information about your cellpy settings, for example where you chose to look for your input raw files (prms.Paths.rawdatadir).

Try scrolling to find your own prms.Paths.rawdatadir. Does it look right? These settings can be changed by re-running the cellpy setup -i command (not in Python, but in the command window / terminal window). You probably need to use the --reset flag this time, since it is not your first time running it.

Interacting with your data

Read cell data

We assume that we have cycled a cell and that we have two files with results (we had to stop the experiment and re-start for some reason). The files are in the .res format (Arbin).

The easiest way to load data is to use the cellpy.get method:

import cellpy

electrode_mass = 0.658 # active mass of electrode in mg
file_name = "20170101_ife01_cc_01.res"
cell_data = cellpy.get(file_name, mass=electrode_mass, cycle_mode="anode")

Note

Even though the CellpyCell object in the example above got the name cell_data, it is more common to just simply name it c (i.e. c = cellpy.get(...)). Similarly, the cellpy naming convention for cellpy.utils.Batch objects is to name them b (i.e. b = batch.init(...) (assuming then that the batch module was imported somewhere earlier in the code)).

If you prefer, you can obtain the same result by using the cellpy.cellreader.CellpyCell object directly. However, we recommend using the cellpy.get method. But just in case you want to know how to do it…

First, import the cellreader-object from cellpy:

import os
from cellpy import cellreader

Then define some settings and variables and create the CellpyCell-object:

raw_data_dir = r"C:\raw_data"
out_data_dir = r"C:\processed_data"
cellpy_data_dir = r"C:\CellpyCell"
cycle_mode = "anode" # default is usually "anode", but...
# These can also be set in the configuration file

electrode_mass = 0.658 # active mass of electrode in mg

# list of files to read (Arbin .res type):
raw_file = ["20170101_ife01_cc_01.res", "20170101_ife01_cc_02.res"]
# the second file is a 'continuation' of the first file...

# list consisting of file names with full path
raw_files = [os.path.join(raw_data_dir, f) for f in raw_file]

# creating the CellpyCell object and set the cycle mode:
cell_data = cellreader.CellpyCell()
cell_data.cycle_mode = cycle_mode

Now we will read the files, merge them, and create a summary:

# if several files are given as a list, they are automatically merged:
cell_data.from_raw(raw_files)
cell_data.set_mass(electrode_mass)
cell_data.make_summary()
# Note: make_summary will automatically run the
# make_step_table function if the step table does not exist.

Save / export data

When you have loaded your data and created your CellpyCell object, it is time to save everything in the cellpy-format:

# defining a name for the cellpy_file (hdf5-format)
cellpy_data_dir = r"C:\cellpy_data\cellpy_files"
cellpy_file = os.path.join(cellpy_data_dir, "20170101_ife01_cc2.h5")
cell_data.save(cellpy_file)

The cellpy format is much faster to load than the raw-file formats typically encountered. It also includes the summary and step-tables, and it is easy to add more data to the file later on.

To export data to csv format, CellpyCell has a method called to_csv.

# export data to csv
out_data_directory = r"C:\processed_data\csv"
# this exports the summary data to a .csv file:
cell_data.to_csv(out_data_directory, sep=";", cycles=False, raw=False)
# export also the current voltage cycles by setting cycles=True
# export also the raw data by setting raw=True

Note

CellpyCell objects store the data (including the summary and step-tables) in pandas DataFrames. This means that you can easily export the data to other formats, such as Excel, by using the to_excel method of the DataFrame object. In addition, CellpyCell objects have a method called to_excel that exports the data to an Excel file.

More about the cellpy.get method

Note

This chapter would benefit from some more love and care. Any help on that would be highly appreciated.

The following keyword arguments are currently supported by cellpy.get:

# from the docstring:
Args:
    filename (str, os.PathLike, OtherPath, or list of raw-file names): path to file(s) to load
    instrument (str): instrument to use (defaults to the one in your cellpy config file)
    instrument_file (str or path): yaml file for custom file type
    cellpy_file (str, os.PathLike, OtherPath): if both filename (a raw-file) and cellpy_file (a cellpy file)
        are provided, cellpy will try to check if the raw-file has been updated since the
        creation of the cellpy-file and select the cellpy-file instead of the raw file if cellpy thinks
        they are similar (use with care!).
    logging_mode (str): "INFO" or "DEBUG"
    cycle_mode (str): the cycle mode (e.g. "anode" or "full_cell")
    mass (float): mass of active material (mg) (defaults to mass given in cellpy-file or 1.0)
    nominal_capacity (float): nominal capacity for the cell (e.g. used for finding C-rates)
    loading (float): loading in units [mass] / [area]
    area (float): active electrode area (e.g. used for finding the areal capacity)
    estimate_area (bool): calculate area from loading if given (defaults to True)
    auto_pick_cellpy_format (bool): decide if it is a cellpy-file based on suffix.
    auto_summary (bool): (re-) create summary.
    units (dict): update cellpy units (used after the file is loaded, e.g. when creating summary).
    step_kwargs (dict): sent to make_steps
    summary_kwargs (dict): sent to make_summary
    selector (dict): passed to load (when loading cellpy-files).
    testing (bool): set to True if testing (will for example prevent making .log files)
    **kwargs: sent to the loader

Reading a cellpy file:

c = cellpy.get("my_cellpyfile.cellpy")
# or
c = cellpy.get("my_cellpyfile.h5")

Reading anode half-cell data from arbin sql:

c = cellpy.get("my_cellpyfile", instrument="arbin_sql", cycle_mode="anode")
# Remark! if sql prms are not set in your config-file you have to set them manually (e.g. setting values in
#    prms.Instruments.Arbin.VAR)

Reading data obtained by exporting csv from arbin sql using non-default delimiter sign:

c = cellpy.get("my_cellpyfile.csv", instrument="arbin_sql_csv", sep=";")

Reading data obtained by exporting a csv file from Maccor using a sub-model (this example uses one of the models already available inside cellpy):

c = cellpy.get(filename="name.txt", instrument="maccor_txt", model="one", mass=1.0)

Reading csv file using the custom loader where the format definitions are given in a user-supplied yaml-file:

c = cellpy.get(filename="name.txt", instrument_file="my_custom_file_format.yml")

If you specify both the raw file name(s) and the cellpy file name to cellpy.get, you can let cellpy select whether to load directly from the raw file(s) or to use the cellpy-file instead. cellpy will check if the raw file(s) have been updated since the last time you saved the cellpy file; if not, it will load the cellpy file instead (this is usually much faster than loading the raw file(s)). You can also input the mass and enforce that a summary is created automatically.

cell_data = cellpy.get(
    filename=raw_files,
    cellpy_file=cellpy_file,
    mass=electrode_mass,
    auto_summary=True,
)

if not cell_data.check():
    print("Could not load the data")
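
The idea of preferring the cellpy-file unless the raw files have changed can be illustrated with a few lines of standard-library Python. This is a simplified sketch and not cellpy's own implementation: it compares modification times, whereas cellpy may use other criteria (the configuration file has a filestatuschecker setting, for example). The function name should_load_raw and the file names are made up for the illustration.

```python
import os
import tempfile


def should_load_raw(raw_files, cellpy_file):
    """Return True if the raw files should be (re)loaded.

    Simplified rule: reload when the cellpy-file is missing or when any
    raw file was modified after the cellpy-file was written.
    """
    if not os.path.isfile(cellpy_file):
        return True
    saved_at = os.path.getmtime(cellpy_file)
    return any(os.path.getmtime(raw) > saved_at for raw in raw_files)


# demo with throw-away files
with tempfile.TemporaryDirectory() as tmp:
    raw = os.path.join(tmp, "test_01.res")
    cellpy_file = os.path.join(tmp, "test.h5")
    open(raw, "w").close()
    # no cellpy-file has been saved yet, so the raw file must be loaded
    print(should_load_raw([raw], cellpy_file))
```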

Working with external files

To work with external files you will need to set some environment variables. This can most easily be done by creating a file called .env_cellpy in your user directory (e.g. C:\Users\jepe):

# content of .env_cellpy
CELLPY_PASSWORD=1234
CELLPY_KEY_FILENAME=C:\\Users\\jepe\\.ssh\\id_key
CELLPY_HOST=myhost.com
CELLPY_USER=jepe

You can then load the file using the cellpy.get method by providing the full path to the file, including the protocol (e.g. scp://) and the user name and host (e.g. jepe@myhost.com):

# assuming appropriate ``.env_cellpy`` file is present
raw_file = "scp://jepe@myhost.com/path/to/file.txt"
c = cellpy.get(filename=raw_file, instrument="maccor_txt", model="one", mass=1.0)

cellpy will automatically download the file to a temporary directory and read it.
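
Before trying to load a remote file, it can be useful to verify that your .env_cellpy file contains what you think it does. The snippet below is a hedged sketch of a tiny KEY=VALUE parser; cellpy reads the file itself, so this is only meant for inspection. The function name parse_env_text is hypothetical, and the example text is the file content shown above.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines; blank lines and lines starting with '#' are ignored."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


example = r"""
# content of .env_cellpy
CELLPY_PASSWORD=1234
CELLPY_KEY_FILENAME=C:\\Users\\jepe\\.ssh\\id_key
CELLPY_HOST=myhost.com
CELLPY_USER=jepe
"""

settings = parse_env_text(example)
# the four keys listed in the example file above
listed_keys = ["CELLPY_PASSWORD", "CELLPY_KEY_FILENAME", "CELLPY_HOST", "CELLPY_USER"]
missing = [key for key in listed_keys if key not in settings]
print("missing keys:", missing)
print("host:", settings.get("CELLPY_HOST"))  # avoid printing the password
```

To check your real file, read its text (for example with pathlib.Path.read_text) and pass it to the function instead of the example string.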

Save / export data

Saving data to cellpy format is done by the CellpyCell.save method. To export data to csv format, CellpyCell has a method called to_csv.

# export data to csv
out_data_directory = r"C:\processed_data\csv"
# this exports the summary data to a .csv file:
cell_data.to_csv(out_data_directory, sep=";", cycles=False, raw=False)
# export also the current voltage cycles by setting cycles=True
# export also the raw data by setting raw=True

Stuff that you might want to do with cellpy

Note

This chapter would benefit from some more love and care. Any help on that would be highly appreciated.

A more or less random collection of things that you might want to do with cellpy. This is not a tutorial, but rather a collection of examples.

Extract current-voltage graphs

If you have loaded your data into a CellpyCell-object, let’s now consider how to extract current-voltage graphs from your data. We assume that the name of your CellpyCell-object is cell_data:

cycle_number = 5
charge_capacity, charge_voltage = cell_data.get_ccap(cycle_number)
discharge_capacity, discharge_voltage = cell_data.get_dcap(cycle_number)

You can also get the capacity-voltage curves with both charge and discharge:

capacity, voltage = cell_data.get_cap(cycle_number)
# the second capacity (charge (delithiation) for typical anode half-cell experiments)
# will be given "in reverse".

The CellpyCell object has several get-methods, including getting current, timestamps, etc.

Extract summaries of runs

Summaries of runs include data per cycle for your data set. Examples of summary data are charge and discharge values, coulombic efficiencies and internal resistances. These are calculated by the make_summary method.

Note that not all the possible summary statistics are calculated by default. This means that you might have to re-run the make_summary method with appropriate parameters as input (e.g. normalization_cycle, to give the appropriate cycle numbers to use for finding nominal capacity).

Another method, make_step_table, is responsible for investigating the individual steps in the data. It is typically run automatically before creating the summaries (since the summary creation depends on the step table). This table is interesting in itself since it contains delta, minimum, maximum and average values for the measured values per step. These are used to find out what type of step it is, e.g. a charge-step or maybe an ocv-step. It is possible to provide information to this function if you already know what kind of step each step is. This saves cellpy a lot of work.

Note that the default is to calculate values for each unique (step-number - cycle-number) pair. For some experiments, a step can be repeated many times per cycle. If you need, for example, average values of the voltage for each step (as in GITT experiments), you need to tell make_step_table that it should calculate for all the steps (all_steps=True).
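
To give a feeling for what such per-step statistics look like, here is a small pure-Python sketch that groups made-up voltage measurements by (cycle, step) and computes the minimum, maximum, average and delta for each group. It only illustrates the idea and is not how make_step_table is implemented; the function name step_statistics and the data are assumptions for this example.

```python
from collections import defaultdict


def step_statistics(rows):
    """Group (cycle, step, voltage) rows by (cycle, step) and summarise the voltage.

    Returns {(cycle, step): {"min", "max", "avg", "delta"}} where delta is
    the last value minus the first value within the group.
    """
    groups = defaultdict(list)
    for cycle, step, voltage in rows:
        groups[(cycle, step)].append(voltage)
    stats = {}
    for key, values in groups.items():
        stats[key] = {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "delta": values[-1] - values[0],
        }
    return stats


# made-up measurements: (cycle number, step number, voltage)
rows = [
    (1, 1, 3.0), (1, 1, 3.5), (1, 1, 4.0),  # voltage rises during this step
    (1, 2, 4.0), (1, 2, 4.0),               # voltage stays constant
    (1, 3, 4.0), (1, 3, 3.0),               # voltage falls
]
for key, values in step_statistics(rows).items():
    print(key, values)
```

Because the rows are grouped by the (cycle, step) pair, repeated occurrences of the same step number within a cycle end up in one group, which mirrors the default behaviour described above.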

Create dQ/dV plots

The methods for creating incremental capacity curves are located in the cellpy.utils.ica module (see Extracting ica data).

Do some plotting

The plotting methods are located in the cellpy.utils.plotting module (see Have a look at the data).

What else?

There are many things you can do with cellpy. The idea is that you should be able to use cellpy as a tool to do your own analysis. This means that you need to know a little bit about python and how to use the different modules. It is not difficult, but it requires some playing around and maybe reading some of the source code. Let’s keep our fingers crossed and hope that the documentation will be improved in the future.

Why not just try out the highly popular (?) cellpy.utils.batch utility. You will need to make (or copy from a friend) the “database” (an excel-file with appropriate headers in the first row) and make sure that all the paths are set up correctly in your cellpy configuration file. Then you can process many cells in one go. And compare them.

Or, for example: If you would like to do some interactive plotting of your data, try to install plotly and use Jupyter Lab to make some fancy plots and dash-boards.

And why not: make a script that goes through all your thousands of measured cells, extracts the life-time (e.g. number of cycles until the capacity has dropped below 80% of the average of the three first cycles), and plot this versus time the cell was put. And maybe color the data-points based on who was doing the experiment?
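
The life-time criterion mentioned above (the first cycle where the capacity drops below 80% of the average of the first three cycles) could be sketched like this. The function works on a plain list of per-cycle capacities, so you would first have to pull those numbers out of the summary of each cell; the function name cycles_to_threshold and the numbers are made up for the illustration.

```python
def cycles_to_threshold(capacities, fraction=0.8, n_reference=3):
    """Return the 1-based cycle number where the capacity first drops below
    `fraction` times the mean of the first `n_reference` cycles, or None
    if that never happens."""
    if len(capacities) < n_reference:
        raise ValueError("not enough cycles to compute the reference capacity")
    reference = sum(capacities[:n_reference]) / n_reference
    limit = fraction * reference
    for cycle_number, capacity in enumerate(capacities, start=1):
        if capacity < limit:
            return cycle_number
    return None


# made-up capacities per cycle, for illustration only
capacities = [100.0, 100.0, 100.0, 95.0, 90.0, 85.0, 79.0, 70.0]
print(cycles_to_threshold(capacities))
```

Returning None for cells that never reach the limit makes it easy to separate cells that are still alive from those that have failed when you loop over many cells.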

Configuring cellpy

How the configuration parameters are set and read

When cellpy is imported, a default set of parameters is set. Then it tries to read the parameters from your .conf-file (located in your user directory). If successful, the parameters set in your .conf-file will over-ride the default.

The parameters are stored in the module cellpy.parameters.prms.

If you would like to change some of the settings during your script (or in your jupyter notebook), e.g. if you want to use the cycle_mode option “cathode” instead of the default “anode”, then import the prms module and set new values:

from cellpy.parameters import prms

# Changing cycle_mode to cathode
prms.Reader.cycle_mode = 'cathode'

# Changing delimiter to  ',' (used when saving .csv files)
prms.Reader.sep = ','

# Changing the default folder for processed (output) data
prms.Paths.outdatadir = 'experiment01/processed_data'

The configuration file

cellpy tries to read your .conf-file when it is imported the first time, and looks in your user directory for files named .cellpy_prms_SOMENAME.conf.

If you have run cellpy setup in the cmd window or in the shell, the configuration file will be placed in the appropriate place. It will have the name .cellpy_prms_USERNAME.conf (where USERNAME is your username).

The configuration file is a YAML-file and it is reasonably easy to read and edit (but remember that YAML is rather strict with regards to spaces and indentations).

As an example, here are the first lines from one of the authors’ configuration file:

---
Paths:
    outdatadir: C:\scripts\processing_cellpy\out
    rawdatadir: I:\Org\MPT-BAT-LAB\Arbin-data
    cellpydatadir: C:\scripts\processing_cellpy\cellpyfiles
    db_path: C:\scripts\processing_cellpy\db
    filelogdir: C:\scripts\processing_cellpy\logs
    examplesdir: C:\scripts\processing_cellpy\examples
    notebookdir: C:\scripts\processing_cellpy\notebooks
    templatedir: C:\scripting\processing_cellpy\templates
    batchfiledir: C:\scripts\processing_cellpy\batchfiles
    db_filename: 2023_Cell_Analysis_db_001.xlsx
    env_file: .env_cellpy


FileNames:
    file_name_format: YYYYMMDD_[NAME]EEE_CC_TT_RR

The first part contains definitions of the different paths, files and file-patterns that cellpy will use. This is the place where you will most likely have to make some edits at some point.

The next part contains definitions required when using a database:

# settings related to the db used in the batch routine
Db:
    db_type: simple_excel_reader
    db_table_name: db_table
    db_header_row: 0
    db_unit_row: 1
    db_data_start_row: 2
    db_search_start_row: 2
    db_search_end_row: -1

# definitions of headers for the simple_excel_reader
DbCols:
    id:
    - id
    - int
    exists:
    - exists
    - bol
    batch:
    - batch
    - str
    sub_batch_01:
    - b01
    - str
    .
    .

This part is rather long (since it needs to define the column names used in the db excel sheet).

The next part contains settings regarding your dataset and the cellreader, as well as for the different instruments. At the bottom you will find the settings for the batch utility.

# settings related to your data
DataSet:
    nom_cap: 3579

# settings related to the reader
Reader:
    diagnostics: false
    filestatuschecker: size
    force_step_table_creation: true
    force_all: false
    sep: ;
    cycle_mode: anode
    sorted_data: true
    select_minimal: false
    limit_loaded_cycles:
    ensure_step_table: false
    voltage_interpolation_step: 0.01
    time_interpolation_step: 10.0
    capacity_interpolation_step: 2.0
    use_cellpy_stat_file: false
    auto_dirs: true

# settings related to the instrument loader
# (each instrument can have its own set of settings)
Instruments:
    tester: arbin
    custom_instrument_definitions_file:

Arbin:
    max_res_filesize: 1000000000
    chunk_size:
    max_chunks:
    use_subprocess: false
    detect_subprocess_need: false
    sub_process_path:
    office_version: 64bit
    SQL_server: localhost
    SQL_UID:
    SQL_PWD:
    SQL_Driver: ODBC Driver 17 for SQL Server
    odbc_driver:
Maccor:
    default_model: one

# settings related to running the batch procedure
Batch:
    fig_extension: png
    backend: bokeh
    notebook: true
    dpi: 300
    markersize: 4
    symbol_label: simple
    color_style_label: seaborn-deep
    figure_type: unlimited
    summary_plot_width: 900
    summary_plot_height: 800
    summary_plot_height_fractions:
    - 0.2
    - 0.5
    - 0.3
...

As you can see, the author of this particular file most likely works with silicon as anode material for lithium-ion batteries (the nom_cap is set to 3579 mAh/g, i.e. the theoretical gravimetric lithium capacity for silicon at normal temperatures) and is using Windows.

By the way, if you are wondering what the ‘.’ means… it means nothing - it was just something I added in this tutorial text to indicate that there is more stuff in the actual file than what is shown here.

Working with the pandas.DataFrame objects directly

Note

This chapter would benefit from some more love and care. Any help on that would be highly appreciated.

The CellpyCell object stores the data in several pandas.DataFrame objects. The easiest way to get to the DataFrames is by the following procedure:

# Assumed name of the CellpyCell object: c

# get the 'test':
data = c.data
# data is now a cellpy Data object (cellpy.readers.cellreader.Data)

# pandas.DataFrame with data vs cycle number (coulombic efficiency, charge-capacity etc.):
summary_data = data.summary
# you could also get the summary data by:
summary_data = c.data.summary

# pandas.DataFrame with the raw data:
raw_data = data.raw

# pandas.DataFrame with statistics on each step and info about step type:
step_info = data.steps

You can then manipulate your data with the standard pandas.DataFrame methods (and pandas methods in general).
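
As an illustration of what such manipulation can look like, here is a small sketch using a hand-made DataFrame as a stand-in for the summary data. The column names and values are invented for the example and are not necessarily the ones cellpy uses; check the column names of your own frame (e.g. with summary_data.columns) first.

```python
import pandas as pd

# Hypothetical stand-in for c.data.summary (illustrative column names only).
summary_data = pd.DataFrame(
    {
        "cycle": [1, 2, 3, 4],
        "charge_capacity": [1000.0, 990.0, 985.0, 970.0],
        "discharge_capacity": [1050.0, 1000.0, 990.0, 973.0],
    }
).set_index("cycle")

# Add a derived column: capacity retention relative to the first cycle (in %).
first = summary_data["charge_capacity"].iloc[0]
summary_data["retention_pct"] = 100 * summary_data["charge_capacity"] / first

# Select the cycles where the retention is still at or above 98 %.
good_cycles = summary_data[summary_data["retention_pct"] >= 98.0]
print(good_cycles)
```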

Happy pandas-ing!

Data mining / using a database

Note

This chapter would benefit from some more love and care. Any help on that would be highly appreciated.

One important motivation for developing the cellpy project is to facilitate handling many cell testing experiments within a reasonable time and with a “tunable” degree of automation. It is therefore convenient to be able to look up the meta-data in some kind of database, and to save the extracted key parameters to either the same or another database (I recommend the latter). The database(s) will be a valuable asset for further data analyses (either using statistical methods, e.g. Bayesian modelling, or as input to machine learning algorithms, for example deep learning using CNNs).

Meta-data database

TODO.

Parameters and feature extraction

TODO.

Bayesian modelling

TODO.

Example: reinforcement deep learning (resnet)

TODO.

The cellpy command

To assist in using cellpy more efficiently, a set of routines is available from the command line by issuing the cellpy command at the shell (or in the cmd window).

$ cellpy
Usage: cellpy [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  edit   Edit your cellpy config or database files.
  info   This will give you some valuable information about your cellpy.
  new    Set up a batch experiment (might need git installed).
  pull   Download examples or tests from the big internet (needs git).
  run    Run a cellpy process (e.g.
  serve  Start a Jupyter server.
  setup  This will help you to set up cellpy.

The CLI is still under development (CLI stands for command-line interface, by the way). Both the cellpy new and the cellpy serve commands worked the last time I tried them, but they might not work on your computer. If you run into problems, let us know.

Information

A couple of commands are implemented to get some information about your cellpy environment (currently getting your cellpy version and the location of your configuration file):

$ cellpy info --version
[cellpy] version: 0.4.1

$ cellpy info --configloc
[cellpy] -> C:\Users\jepe\.cellpy_prms_jepe.conf

Setting up cellpy from the cli

To get the most out of cellpy it is best to set it up properly. To help with this, you can use the setup command. If you include the --interactive switch, you will be prompted for your preferred location for the different folders / directories cellpy uses (it will still work without them, though).

$ cellpy setup --interactive

The command will create a starting cellpy configuration file (.cellpy_prms_USERNAME.conf) or update it if it exists, and create the following directory structure:

batchfiles/
cellpyfiles/
db/
examples/
instruments/
logs/
notebooks/
out/
raw/
templates/
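
To check afterwards that the folders are in place, something like the following sketch could be used. It only relies on the folder names listed above; the helper missing_folders is hypothetical, and the demonstration uses a temporary directory instead of your actual cellpy root folder.

```python
import tempfile
from pathlib import Path

# Folder names taken from the list above.
CELLPY_FOLDERS = [
    "batchfiles", "cellpyfiles", "db", "examples", "instruments",
    "logs", "notebooks", "out", "raw", "templates",
]


def missing_folders(root):
    """Return the expected folder names that do not exist under root."""
    return [name for name in CELLPY_FOLDERS if not (Path(root) / name).is_dir()]


# Demonstration: create all folders except the last one in a temporary root.
with tempfile.TemporaryDirectory() as root:
    for name in CELLPY_FOLDERS[:-1]:
        (Path(root) / name).mkdir()
    demo_missing = missing_folders(root)
print(demo_missing)
```

Replace the temporary directory with the root folder you selected during setup to check your own installation.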

Note

It is recommended to rerun setup each time you update cellpy.

Note

You can get help for each sub-command by turning on the --help switch. For example, for setup:

$ cellpy setup --help

You will then get some more detailed information on the different switches you have at your disposal:

Usage: cellpy setup [OPTIONS]

  This will help you to setup cellpy.

Options:
  -i, --interactive       Allows you to specify div. folders and setting.
  -nr, --not-relative     If root-dir is given, put it directly in the root
                          (/) folder i.e. do not put it in your home directory.
                          Defaults to False. Remark that if you specifically
                          write a path name instead of selecting the suggested
                          default, the path you write will be used as is.
  -dr, --dry-run          Run setup in dry mode (only print - do not execute).
                          This is typically used when developing and testing
                          cellpy. Defaults to False.
  -r, --reset             Do not suggest path defaults based on your current
                          configuration-file
  -d, --root-dir PATH     Use custom root dir. If not given, your home
                          directory will be used as the top level where
                          cellpy-folders will be put. The folder path must
                          follow directly after this option (if used).
                          Example: $ cellpy setup -d 'MyDir'
  -n, --folder-name PATH
  -t, --testuser TEXT     Fake name for fake user (for testing)
  --help                  Show this message and exit.

The cellpy templating system

If you are performing the same type of data processing for many cells, and possibly many times, it is beneficial to start out with a template.

Currently, cellpy provides a template system defaulting to a set of Jupyter notebooks and a folder structure where the code is based on the batch utility (cellpy.utils.batch).

The templates are pulled from the cellpy_templates repository. The template system uses cookiecutter under the hood (and therefore needs git installed).

This repository contains several template sets. The default is named standard, but you can set another default in your configuration file.

You can also make your own templates and store them locally on your computer (in the templates directory). Each template should be a zip file with a name that starts with “cellpy_template” and ends with “.zip”.
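
The naming rule for local templates can be checked with a few lines of Python. This is only a sketch of the rule as stated above, not cellpy's own lookup code; the function name and the example file names are made up for illustration.

```python
def is_local_template(filename):
    """Return True if filename starts with cellpy_template and ends with .zip."""
    return filename.startswith("cellpy_template") and filename.endswith(".zip")


# Hypothetical content of a templates directory.
candidates = [
    "cellpy_template_standard.zip",
    "my_template.zip",
    "cellpy_template_draft.tar",
]
local_templates = [name for name in candidates if is_local_template(name)]
print(local_templates)
```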

$ cellpy new --help


Usage: cellpy new [OPTIONS]

  Set up a batch experiment (might need git installed).

Options:
  -t, --template TEXT        Provide template name.
  -d, --directory TEXT       Create in custom directory.
  -p, --project TEXT         Provide project name (i.e. sub-directory name).
  -e, --experiment TEXT      Provide experiment name (i.e. lookup-value).
  -u, --local-user-template  Use local template from the templates directory.
  -s, --serve                Run Jupyter.
  -r, --run                  Use PaperMill to run the notebook(s) from the
                             template (will only work properly if the
                             notebooks can be sorted in correct run-order by
                             'sorted'.
  -j, --lab                  Use Jupyter Lab instead of Notebook when serving.
  -l, --list                 List available templates and exit.
  --help                     Show this message and exit.

Automatically running batches

The run command is used for running the appropriate editor for your database, and for running (processing) files in batches.

$ cellpy run --help

Usage: cellpy run [OPTIONS] [NAME]

  Run a cellpy process (batch-job, edit db, ...).
  You can use this to launch specific applications.

  Examples:

      edit your cellpy database

         cellpy run db

      run a batch job described in a journal file

         cellpy run -j my_experiment.json

Options:
  -j, --journal         Run a batch job defined in the given journal-file
  -k, --key             Run a batch job defined by batch-name
  -f, --folder          Run all batch jobs iteratively in a given folder
  -p, --cellpy-project  Use PaperMill to run the notebook(s) within the given
                        project folder (will only work properly if the
                        notebooks can be sorted in correct run-order by
                        'sorted'). Warning! since we are using `click` - the
                        NAME will be 'converted' when it is loaded (same as
                        print(name) does) - so you can't use backslash ('\')
                        as normal in windows (use either '/' or '\\' instead).
  -d, --debug           Run in debug mode.
  -s, --silent          Run in silent mode.
  --raw                 Force loading raw-file(s).
  --cellpyfile          Force cellpy-file(s).
  --minimal             Minimal processing.
  --nom-cap FLOAT       nominal capacity (used in calculating rates etc)
  --batch_col TEXT      batch column (if selecting running from db)
  --project TEXT        name of the project (if selecting running from db)
  -l, --list            List batch-files.
  --help                Show this message and exit.
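
The warning about backslashes in the help text for the -p option is easy to run into on Windows. One way to avoid it is to convert the path to forward slashes before passing it on the command line. The sketch below uses only the Python standard library; the helper name cli_safe_path and the example path are made up.

```python
from pathlib import PureWindowsPath


def cli_safe_path(path):
    """Return a Windows path written with forward slashes instead of backslashes."""
    return PureWindowsPath(path).as_posix()


project_dir = cli_safe_path(r"C:\scripts\processing_cellpy\notebooks")
print(project_dir)
```

The printed path can then be used as the NAME argument, for example together with the -p option.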