Basic usage

The getting started with cellpy tutorial (opinionated version)

This tutorial will help you get started with cellpy and tries to give you a step-by-step recipe. The information in this tutorial can also (most likely) be found elsewhere. Novice users can jump directly to chapter 1.2.

How to install cellpy - the minimalistic explanation

If you know what you are doing, and only need the most basic features of cellpy, you should be able to get things up and running by issuing a simple

pip install cellpy

It is recommended that you use a Python environment (or conda environment) and give it an easy-to-remember name, e.g. cellpy.

You also need the typical scientific Python stack, including numpy, scipy, and pandas. It is recommended that you at least install scipy before you install cellpy (the main benefit being that you can use conda so that you don’t have to hassle with missing C-compilers if you are on a Windows machine).

Install a couple of other dependencies

You should also install some additional dependencies:

pytables is needed for working with the hdf5 files (the cellpy-files):

conda install -c conda-forge pytables

If you would like to use some of the fitting routines in cellpy, you will need to install lmfit:

conda install -c conda-forge lmfit

Other really handy tools are Jupyter and the plotting library bundle holoviz. You might already have them installed. If not, I recommend that you look at their documentation (google it) and install them. You can most likely use the same method as for pytables etc.

Note! In addition to the requirements set in the setup.py file, you will also need a Python ODBC bridge for loading .res-files from Arbin testers (and possibly also other ‘to-be-implemented’ file formats). I recommend pyodbc, which can be installed from conda-forge or using pip.

conda install -c conda-forge pyodbc

For reading .res-files (which actually are in a Microsoft Access format) you also need a driver or similar to help your ODBC bridge access them. A small hint for Windows users: if you don’t have one of the most recent Office versions, you might not be allowed to install a driver of a different bitness than your Office version is using (the installers can be found here). Also note that the driver needs to have the same bitness as your Python (so, if you are using 32 bit Python, you will need the 32 bit driver).
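Since the driver has to match your Python, it can help to check whether the Python you are running is 32 or 64 bit. Here is a minimal sketch using only the standard library (the helper name python_bitness is made up for this tutorial and is not part of cellpy):

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    # struct.calcsize("P") is the size of a C pointer in bytes (4 or 8)
    return struct.calcsize("P") * 8


print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {python_bitness()} bit")
```

If pyodbc is installed, pyodbc.drivers() lists the ODBC drivers that are visible to that Python, which is a quick way to see if a suitable Access driver is present.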

For POSIX systems, I have not found any suitable drivers. Instead, cellpy will try to use mdbtools to first export the data to temporary csv-files, and then import from those csv-files (using the pandas library). You can install mdbtools using your system’s preferred package manager (e.g. apt-get install mdbtools).
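To see which of the packages mentioned above are still missing from your environment, you can ask Python directly. This is only a sketch: the dictionary and the function name find_missing are my own choices, and the list of packages is just the ones named in this chapter (note that pytables is imported under the name tables).

```python
from importlib.util import find_spec

# install name -> import name (they are not always the same)
PACKAGES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "pandas": "pandas",
    "pytables": "tables",
    "lmfit": "lmfit",
    "pyodbc": "pyodbc",
}


def find_missing(packages):
    """Return the install names of the packages that Python cannot find."""
    return [
        name
        for name, import_name in packages.items()
        if find_spec(import_name) is None
    ]


missing = find_missing(PACKAGES)
print("Missing packages:", ", ".join(missing) if missing else "none")
```

Run it inside the environment where you installed cellpy; a different environment will most likely give a different answer.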

The tea spoon explanation

If you are used to installing stuff from the command line (or shell), then things might very well run smoothly. However, a considerable percentage of us don’t feel exceedingly comfortable installing things by writing commands inside a small black window. Let’s face it; we belong to the point-and-click (or double-click) generation, not the write-cryptic-commands generation. So, hopefully without insulting the savvy, here is a “tea-spoon explanation”:

Install a scientific stack of python 3.x

If the words “virtual environment” or “miniconda” don’t ring any bells, you should install the Anaconda scientific Python distribution. Go to www.anaconda.com and select the Anaconda distribution (press the Download Now button). And no, don’t select Python 2.7. Use at least Python 3.6. And select the 64 bit version (if you fail at installing the 64 bit version, then you can try the weaker 32 bit version). Download it and let it install.

Create a virtual environment

This step can be omitted (but it’s not necessarily very smart to do so). Create a virtual conda environment called my_cellpy (the name is not important, but it should be a name you are able to remember).

Open up a command window (you can find a command window on Windows by e.g. pressing the Windows button + r and typing cmd.exe), or even better, open up “anaconda prompt”. Then type

conda create -n my_cellpy

Then activate your environment:

conda activate my_cellpy

If you get an error message, then it could be that your Python version is not available for you (maybe you installed as root?). If you were using the command window on Windows, try to locate the “anaconda prompt” program and run that instead.

Install cellpy

conda install -c conda-forge cellpy

Note that the bit version matters sometimes, so try to make a mental note of what you selected (for example, if you plan to use the Microsoft Access odbc driver, and it is 32-bit, you probably should choose to install a 32-bit Python version (see next sub-chapter)).

If you don’t have the newest Office suite, you might need to install the Microsoft Access odbc driver which can be downloaded from this page

Check your installation

The easiest way to check if cellpy has been installed, is to issue the command for printing the version number to the screen

cellpy info --version

If the program prints the expected version number, you probably succeeded. If it crashes, then you will have to retrace your steps, redo stuff and hope for the best. If it prints an older (lower) version number than you expect, there is a good chance that you have installed it earlier, and what you would like to do is an upgrade instead of an install

pip install --upgrade cellpy

It could also be that you want to install a pre-release (a version that is so bleeding edge that it ends with an alpha or beta release identification, e.g. ends with .b2). Then you will need to add the --pre modifier

pip install --pre cellpy

To run a more complete check of your installation, there exists a cellpy sub-command that can be helpful

cellpy info --check
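If you prefer to check from inside Python, the standard library can tell you which version of a package is installed. This is a small sketch, not a cellpy feature: installed_version is a hypothetical helper, and importlib.metadata requires Python 3.8 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# prints None if cellpy is not installed in the current environment
print("cellpy version:", installed_version("cellpy"))
```

If this prints None even though the cellpy command works in your terminal, you are most likely running Python from a different environment than the one you installed cellpy into.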

The cellpy command to your rescue

To help you install and control your cellpy installation, a CLI is provided with several commands, including info for getting information about your installation, and setup for helping you to set up your installation and write a configuration file.

To get more information, you can issue

cellpy --help

This will output some (hopefully) helpful text

Usage: cellpy [OPTIONS] COMMAND [ARGS]...

Options:
  --help  Show this message and exit.

Commands:
  edit   Edit your cellpy config file.
  info   This will give you some valuable information about your cellpy.
  new    Set up a batch experiment.
  pull   Download examples or tests from the big internet.
  run    Run a cellpy process.
  serve  Start a Jupyter server
  setup  This will help you to setup cellpy.

You can also get information about the sub-commands by issuing --help after them. For example, issuing

cellpy info --help

gives

Usage: cellpy info [OPTIONS]

Options:
  -v, --version    Print version information.
  -l, --configloc  Print full path to the config file.
  -p, --params     Dump all parameters to screen.
  -c, --check      Do a sanity check to see if things works as they should.
  --help           Show this message and exit.

Using the cellpy command for your first time setup

After you have installed cellpy it is highly recommended that you create an appropriate configuration file and create folders for raw data, cellpy-files, logs, databases and output data (and inform cellpy about it)

cellpy setup -i

The -i option makes sure that the setup is done interactively. The program will ask you about where specific folders are, e.g. where you would like to put your outputs and where your cell data files are located. If the folders don’t exist, cellpy will try to create them.

If you want to specify a root folder different from the default (your HOME folder), you can use the -d option e.g. cellpy setup -i -d /Users/kingkong/cellpydir

Hint

If you don’t choose the -i option and go for accepting all the defaults, you can always edit your configurations directly in the cellpy configuration file (that should be located inside your home directory, ~/ on posix and C:\Users\NAME on not-too-old Windows).

When you have answered all the questions, a configuration file will be made and saved to your home directory. You can always issue cellpy info -l to find out where your configuration file is located (it’s written in YAML format and it should be relatively easy to edit it in a text editor).

Running your first script

As with most software, you are encouraged to play a little with it. I hope there is some useful stuff in the code repository (for example in the examples folder).

Hint

The cellpy pull command can assist in downloading both examples and tests.

Let’s start by trying to import cellpy in an interactive Python session. If you have an icon to press to start up Python in interactive mode, do that (it could also be for example an ipython console or a Jupyter Notebook). You can also start an interactive Python session from your terminal window or command window by just writing python and pressing enter.

Once inside Python, try issuing import cellpy. Hopefully you should not see any error-messages.

Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:36:06)
[MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cellpy
>>>

Nothing bad happened this time. If you got an error message, try to interpret it and check if you have skipped any steps in this tutorial. Maybe you are missing the box package? If so, exit the Python interpreter if you started it in your command window, or open another command window, and write

pip install python-box

and try again.

Now let’s try to be a bit more ambitious. Start up Python again if you are not still running it and try this:

>>> from cellpy import prmreader
>>> prmreader.info()

The prmreader.info() command should print out information about your cellpy settings. For example where you selected to look for your input raw files (prms.Paths.rawdatadir).

Try scrolling to find your own prms.Paths.rawdatadir. Does it look right? These settings can be changed either by editing the configuration file directly or by re-running the cellpy setup -i command (not in Python, but in the command window / terminal window). You probably need to use the --reset flag this time, since it is not your first time running it.

What next?

For example: If you want to use the highly popular (?) cellpy.utils.batch utility, you need to make (or copy from a friend) the “database” (an excel-file with appropriate headers in the first row) and make sure that all the paths are set up correctly in your cellpy configuration file.

Or, for example: If you would like to do some interactive plotting of your data, try to install holoviz and use Jupyter Lab to make some fancy plots and dash-boards.

And why not: make a script that goes through all your thousands of measured cells, extracts the life-time (e.g. number of cycles until the capacity has dropped below 80% of the average of the three first cycles), and plot this versus time the cell was put. And maybe color the data-points based on who was doing the experiment?
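To make the life-time idea a bit more concrete, here is a sketch of the calculation itself on a plain list of capacities (one value per cycle). It does not use cellpy at all; the function name cycle_life, its arguments and the numbers are made up for illustration, and in a real script you would take the capacities from the summary of each cell.

```python
def cycle_life(capacities, fraction=0.8, n_reference=3):
    """Return the first cycle number (starting at 1) where the capacity is below
    `fraction` of the average of the first `n_reference` cycles, or None."""
    if len(capacities) < n_reference:
        raise ValueError("need at least n_reference cycles")
    reference = sum(capacities[:n_reference]) / n_reference
    limit = fraction * reference
    for cycle_number, capacity in enumerate(capacities, start=1):
        if capacity < limit:
            return cycle_number
    return None  # the cell never dropped below the limit


# made-up capacities for eight cycles
capacities = [1000, 990, 980, 950, 900, 850, 780, 700]
print("life-time (cycles):", cycle_life(capacities))
```

Returning None for cells that never reach the limit makes it easy to separate "still alive" cells from dead ones when you plot the results.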

Configuring cellpy

How the configuration parameters are set and read

When cellpy is imported, it sets a default set of parameters. Then it tries to read the parameters from your .conf-file (located in your user directory). If it is successful, the parameters set in your .conf-file will over-ride the default ones.
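The "defaults first, then the file" idea can be sketched with ordinary dictionaries. This is only an illustration of the principle and not how cellpy implements it; the names DEFAULTS and apply_overrides, and the value used for outdatadir, are assumptions made for this example.

```python
import copy

# a tiny stand-in for the default parameters
DEFAULTS = {
    "Reader": {"cycle_mode": "anode", "sep": ";"},
    "Paths": {"outdatadir": "out"},
}


def apply_overrides(defaults, overrides):
    """Return a new nested dict where values from `overrides` replace the defaults."""
    merged = copy.deepcopy(defaults)  # leave the defaults untouched
    for section, values in overrides.items():
        merged.setdefault(section, {}).update(values)
    return merged


# pretend this was read from the .conf-file
from_file = {"Reader": {"cycle_mode": "cathode"}}
settings = apply_overrides(DEFAULTS, from_file)
print(settings["Reader"])
```

Only the values that are present in the file are replaced; everything else keeps its default.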

The parameters are stored in the module cellpy.parameters.prms.

If, during your script (or in your jupyter notebook), you would like to change some of the settings (e.g. if you want to use the cycle_mode option “cathode” instead of the default “anode”), then import the prms module and set new values:

from cellpy.parameters import prms

# Changing cycle_mode to cathode
prms.Reader.cycle_mode = 'cathode'

# Changing delimiter to  ',' (used when saving .csv files)
prms.Reader.sep = ','

# Changing the default folder for processed (output) data
prms.Paths.outdatadir = 'experiment01/processed_data'

The configuration file

cellpy tries to read your .conf-file when it is imported the first time. It looks in your user directory on posix or in the documents folder on windows (e.g. C:\Users\USERNAME\Documents on not-too-old versions of windows) for files named .cellpy_prms_SOMENAME.conf.

If you have run cellpy setup in the cmd window or in the shell, the configuration file will be placed in the appropriate place. It will have the name .cellpy_prms_USERNAME.conf (where USERNAME is your username).

The configuration file is a YAML-file and it is reasonably easy to read and edit (but remember that YAML is rather strict with regards to spaces and indentations).
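If you are unsure whether you have one (or several) configuration files lying around, you can search a folder for files following the naming pattern described above. This is a minimal sketch: find_config_files is a hypothetical helper, and it only searches the folder you give it (here your home folder), so on Windows you might have to point it to your documents folder instead.

```python
import os
from pathlib import Path


def find_config_files(folder):
    """Return all files in `folder` named .cellpy_prms_SOMENAME.conf, sorted by name."""
    return sorted(Path(folder).glob(".cellpy_prms_*.conf"))


home = Path(os.path.expanduser("~"))
for path in find_config_files(home):
    print(path)
```

The cellpy info -l command remains the authoritative way of finding the file that cellpy actually uses.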

As an example, here are the first lines from one of the authors’ configuration file:

---
Paths:
  outdatadir: C:\scripts\processing_cellpy\out
  rawdatadir: I:\Org\MPT-BAT-LAB\Arbin-data
  cellpydatadir: C:\scripts\processing_cellpy\cellpyfiles
  db_path: C:\scripts\processing_cellpy\db
  filelogdir: C:\scripts\processing_cellpy\logs
  examplesdir: C:\scripts\processing_cellpy\examples
  notebookdir: C:\scripts\processing_cellpy\notebooks
  templatedir: C:\scripting\processing_cellpy\templates
  batchfiledir: C:\scripts\processing_cellpy\batchfiles
  db_filename: 2020_Cell_Analysis_db_001.xlsx

FileNames:
  file_name_format: YYYYMMDD_[NAME]EEE_CC_TT_RR

The first part contains definitions of the different paths, files and file-patterns that cellpy will use. This is most likely the place where you will have to do some edits sometime.

Next comes definitions needed when using a db.

# settings related to the db used in the batch routine
Db:
  db_type: simple_excel_reader
  db_table_name: db_table
  db_header_row: 0
  db_unit_row: 1
  db_data_start_row: 2
  db_search_start_row: 2
  db_search_end_row: -1

# definitions of headers for the simple_excel_reader
DbCols:
  id:
  - id
  - int
  exists:
  - exists
  - bol
  batch:
  - batch
  - str
  sub_batch_01:
  - b01
  - str
  .
  .

It’s rather long (since it needs to define the column names used in the db excel sheet). After this come the settings for the datasets and the cellreader, as well as for the different instruments. You will also find the settings for the batch utility at the bottom.

# settings related to your data
DataSet:
  nom_cap: 3579

# settings related to the reader
Reader:
  diagnostics: false
  filestatuschecker: size
  force_step_table_creation: true
  force_all: false
  sep: ;
  cycle_mode: anode
  sorted_data: true
  load_only_summary: false
  select_minimal: false
  limit_loaded_cycles:
  ensure_step_table: false
  daniel_number: 5
  voltage_interpolation_step: 0.01
  time_interpolation_step: 10.0
  capacity_interpolation_step: 2.0
  use_cellpy_stat_file: false
  auto_dirs: true

# settings related to the instrument loader
# (each instrument can have its own set of settings)
Instruments:
  tester: arbin
  custom_instrument_definitions_file:

  Arbin:
    max_res_filesize: 1000000000
    chunk_size:
    max_chunks:
    use_subprocess: false
    detect_subprocess_need: false
    sub_process_path:
    office_version: 64bit
    SQL_server: localhost
    SQL_UID:
    SQL_PWD:
    SQL_Driver: ODBC Driver 17 for SQL Server
    odbc_driver:
  Maccor:
    default_model: one

# settings related to running the batch procedure
Batch:
  fig_extension: png
  backend: bokeh
  notebook: true
  dpi: 300
  markersize: 4
  symbol_label: simple
  color_style_label: seaborn-deep
  figure_type: unlimited
  summary_plot_width: 900
  summary_plot_height: 800
  summary_plot_height_fractions:
  - 0.2
  - 0.5
  - 0.3
...

As you can see, the author of this particular file most likely works with silicon as anode material for lithium ion batteries (the nom_cap is set to 3579 mAh/g, i.e. the theoretical gravimetric lithium capacity for silicon at normal temperatures). And, he or she is using Windows.

By the way, if you are wondering what the ‘.’ means… it means nothing - it was just something I added in this tutorial text to indicate that there is more stuff in the actual file than what is shown here.

Interacting with your data

Read cell data

We assume that we have cycled a cell and that we have two files with results (we had to stop the experiment and re-start for some reason). The files are in the .res format (Arbin).

The easiest way to load data is to use the cellpy.get method.

import cellpy

electrode_mass = 0.658 # active mass of electrode in mg
cell_data = cellpy.get("20170101_ife01_cc_01.res", mass=electrode_mass, cycle_mode="anode")

If you prefer, you can obtain the same result by using the cellpy.cellreader.CellpyData object directly. First, import the cellreader module from cellpy:

import os
from cellpy import cellreader

Then define some settings and variables and create the CellpyData-object:

raw_data_dir = r"C:\raw_data"
out_data_dir = r"C:\processed_data"
cellpy_data_dir = r"C:\CellpyData"
cycle_mode = "anode" # default is usually "anode", but...
# These can also be set in the configuration file

electrode_mass = 0.658 # active mass of electrode in mg

# list of files to read (Arbin .res type):
raw_file = ["20170101_ife01_cc_01.res", "20170101_ife01_cc_02.res"]
# the second file is a 'continuation' of the first file...

# list consisting of file names with full path
raw_files = [os.path.join(raw_data_dir, f) for f in raw_file]

# creating the CellpyData object and setting the cycle mode:
cell_data = cellreader.CellpyData()
cell_data.cycle_mode = cycle_mode

Now we will read the files, merge them, and create a summary:

# if the files are given as a list they are automatically merged
# (raw_files is already a list, so it is passed as it is):
cell_data.from_raw(raw_files)
cell_data.set_mass(electrode_mass)
cell_data.make_summary()
# Note: make_summary will automatically run the
# make_step_table function if it does not exist.

Then it’s probably best to save the data in the cellpy-format:

# defining a name for the cellpy_file (hdf5-format)
cellpy_file = os.path.join(cellpy_data_dir, "20170101_ife01_cc2.h5")
cell_data.save(cellpy_file)

For convenience, cellpy also has a method that can be used to select whether-or-not to load directly from the raw-file. Using the loadcell method, you can specify both the raw file name(s) and the cellpy file name, and cellpy will check if the raw file(s) is/are updated since the last time you saved the cellpy file - if not, then it will load the cellpy file instead (this is usually much faster than loading the raw file(s)). You can also input the masses and enforce that it creates a summary automatically.

# raw_files is already a list of file names, so it is passed as it is
cell_data.loadcell(raw_files=raw_files, cellpy_file=cellpy_file,
                   mass=[electrode_mass], summary_on_raw=True,
                   force_raw=False)

if not cell_data.check():
    print("Could not load the data")

More about the cellpy.get method

The following keyword arguments are currently supported by cellpy.get:

# from the docstring:
Args:
    filename (str, os.PathLike, or list of raw-file names): path to file(s)
    mass (float): mass of active material (mg) (defaults to mass given in cellpy-file or 1.0)
    instrument (str): instrument to use (defaults to the one in your cellpy config file) (arbin_res, arbin_sql, arbin_sql_csv, arbin_sql_xlxs)
    instrument_file (str or path): yaml file for custom file type
    nominal_capacity (float): nominal capacity for the cell (e.g. used for finding C-rates)
    logging_mode (str): "INFO" or "DEBUG"
    cycle_mode (str): the cycle mode (e.g. "anode" or "full_cell")
    auto_summary (bool): (re-) create summary.
    testing (bool): set to True if testing (will for example prevent making .log files)
    **kwargs: sent to the loader

Reading a cellpy file:

c = cellpy.get("my_cellpyfile.cellpy")
# or
c = cellpy.get("my_cellpyfile.h5")

Reading anode half-cell data from arbin sql:

c = cellpy.get("my_cellpyfile", instrument="arbin_sql", cycle_mode="anode")
# Remark! if sql prms are not set in your config-file you have to set them manually (e.g. setting values in
#    prms.Instruments.Arbin.VAR)

Reading data obtained by exporting csv from arbin sql using non-default delimiter sign:

c = cellpy.get("my_cellpyfile.csv", instrument="arbin_sql_csv", sep=";")

Reading data obtained by exporting a csv file from Maccor using a sub-model (this example uses one of the models already available inside cellpy):

c = cellpy.get(filename="name.txt", instrument="maccor_txt", model="one", mass=1.0)

Reading csv file using the custom loader where the format definitions are given in a user-supplied yaml-file:

c = cellpy.get(filename="name.txt", instrument_file="my_custom_file_format.yml")

Extract capacity-voltage curves

If you have loaded your data into a CellpyData-object, let’s now consider how to extract capacity-voltage curves from your data. We assume that the name of your CellpyData-object is cell_data:

cycle_number = 5
charge_capacity, charge_voltage = cell_data.get_ccap(cycle_number)
discharge_capacity, discharge_voltage = cell_data.get_dcap(cycle_number)

You can also get the capacity-voltage curves with both charge and discharge:

capacity, voltage = cell_data.get_cap(cycle_number)
# the second capacity (charge (delithiation) for typical anode half-cell experiments)
# will be given "in reverse".

The CellpyData object has several get-methods, including getting current, timestamps, etc.

Extract summaries of runs

Summaries of runs include data per cycle for your data set. Examples of summary data are charge- and discharge-values, coulombic efficiencies and internal resistances. These are calculated by the make_summary method.

Note that not all the possible summary statistics are calculated by default. This means that you might have to re-run the make_summary method with appropriate parameters as input (e.g. normalization_cycle, to give the appropriate cycle numbers to use for finding nominal capacity).

Another method is responsible for investigating the individual steps in the data (make_step_table). It is typically run automatically before creating the summaries (since the summary creation depends on the step_table). This table is interesting in itself since it contains delta, minimum, maximum and average values for the measured values per step. This is used to find out what type of step it is, e.g. a charge-step or maybe an ocv-step. It is possible to provide information to this function if you already know what kind of step each step is. This saves cellpy a lot of work.

Note that the default is to calculate values for each unique (step-number - cycle-number) pair. For some experiments, a step can be repeated many times per cycle. And if you need for example average values of the voltage for each step (for example if you are doing GITT experiments), you would need to tell make_step_table that it should calculate for all the steps (all_steps=True).
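To get a feeling for what such per-step statistics look like, here is a toy version that groups voltage readings by their (cycle-number, step-number) pair. It is only a sketch in plain Python and not what make_step_table does internally: the function name, the keys in the result and the definition of delta (last value minus first value) are my own assumptions.

```python
from collections import defaultdict


def step_statistics(rows):
    """Group (cycle, step, voltage) rows by (cycle, step) and compute simple statistics."""
    groups = defaultdict(list)
    for cycle, step, voltage in rows:
        groups[(cycle, step)].append(voltage)
    stats = {}
    for key, voltages in groups.items():
        stats[key] = {
            "min": min(voltages),
            "max": max(voltages),
            "avr": sum(voltages) / len(voltages),
            "delta": voltages[-1] - voltages[0],  # assumed definition: last minus first
        }
    return stats


rows = [
    (1, 1, 0.5), (1, 1, 0.3), (1, 1, 0.1),  # cycle 1, step 1: voltage goes down
    (1, 2, 0.1), (1, 2, 0.6), (1, 2, 1.0),  # cycle 1, step 2: voltage goes up
]
for key, values in step_statistics(rows).items():
    print(key, values)
```

Notice that a step that is repeated several times within one cycle ends up in the same group here, so its statistics get lumped together. That is the situation where you would ask for all_steps=True.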

Create dQ/dV plots

The methods for creating incremental capacity curves are located in the cellpy.utils.ica module.
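In its simplest form, an incremental capacity curve is just the change in capacity divided by the change in voltage between neighbouring data points. The sketch below shows that calculation in plain Python; it is not the code in cellpy.utils.ica (the function name and the numbers are made up), and real data are usually noisy enough that you need some interpolation and smoothing before the result looks nice.

```python
def incremental_capacity(voltage, capacity):
    """Return (v_mid, dq_dv) using finite differences between consecutive points."""
    v_mid, dq_dv = [], []
    for i in range(1, len(voltage)):
        dv = voltage[i] - voltage[i - 1]
        if dv == 0:
            continue  # skip flat segments to avoid dividing by zero
        v_mid.append((voltage[i] + voltage[i - 1]) / 2)
        dq_dv.append((capacity[i] - capacity[i - 1]) / dv)
    return v_mid, dq_dv


# made-up capacity-voltage points
voltage = [0.0, 0.5, 1.0, 1.0, 2.0]
capacity = [0.0, 1.0, 3.0, 3.5, 4.5]
v, dqdv = incremental_capacity(voltage, capacity)
print(list(zip(v, dqdv)))
```

The voltage and capacity lists could for example come from get_ccap or get_dcap for one cycle.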

Save / export data

Saving data to cellpy format is done by the CellpyData.save method. To export data to csv format, CellpyData has a method called to_csv.

# export data to csv
out_data_directory = r"C:\processed_data\csv"
# this exports the summary data to a .csv file:
cell_data.to_csv(out_data_directory, sep=";", cycles=False, raw=False)
# export also the current voltage cycles by setting cycles=True
# export also the raw data by setting raw=True

Working with the pandas.DataFrame objects directly

The CellpyData object stores the data in several pandas.DataFrame objects. The easiest way to get to the DataFrames is by the following procedure:

# Assumed name of the CellpyData object: cell_data

# get the 'test':
c = cell_data.cell
# c is now a cellpy Cell object (cellpy.readers.cellreader.Cell)

# pandas.DataFrame with data vs cycle number (e.g. coulombic efficiency):
summary_data = c.summary

# pandas.DataFrame with the raw data:
raw_data = c.raw

# pandas.DataFrame with statistics on each step and info about step type:
step_info = c.steps

You can then manipulate your data with the standard pandas.DataFrame methods (and pandas methods in general).

Note

At the moment, CellpyData objects can store several sets of test-data (several ‘tests’). They are stored in a list. It is not recommended to utilise this ‘possible to store multiple tests’ feature as it might be removed very soon (have not decided upon that yet).

Happy pandas-ing!

Data mining / using a database

One important motivation for developing the cellpy project is to facilitate handling many cell testing experiments within a reasonable time and with a “tunable” degree of automation. It is therefore convenient to be able to couple both the meta-data (look-up) to some kind of data-base, as well as saving the extracted key parameters to either the same or another database (where I recommend the latter). The database(s) will be a valuable asset for further data analyses (either using statistical methods, e.g. Bayesian modelling, or as input to machine learning algorithms, for example deep learning using cnn).

TODO.


The cellpy command

To assist in using cellpy more efficiently, a set of routines are available from the command line by issuing the cellpy command at the shell (or in the cmd window).

$ cellpy
Usage: cellpy [OPTIONS] COMMAND [ARGS]...

Options:
 --help  Show this message and exit.

Commands:
    edit   Edit your cellpy config file.
    info   This will give you some valuable information about your cellpy.
    new    Set up a batch experiment.
    pull   Download examples or tests from the big internet.
    run    Run a cellpy process.
    serve  Start a Jupyter server
    setup  This will help you to setup cellpy.

As can be seen from the help-text, the cli is still under development (cli stands for command-line-interface, by the way). Both the cellpy new and the cellpy serve commands worked the last time I tried them, but they might not work on your computer.

A couple of commands are implemented to get some information about your cellpy environment (currently getting your cellpy version and the location of your configuration file):

$ cellpy info --version
[cellpy] version: 0.3.1

$ cellpy info --configloc
[cellpy] ->C:\Users\jepe\_cellpy_prms_jepe.conf

The most important command is probably the setup command (that should be run when you install cellpy for the first time).

$ cellpy setup --interactive

Another very nice command is the new command that sets up a project structure for batch-processing cell data (using templates, either from github or from your local computer).

$ cellpy new