Basic usage
The getting started with cellpy tutorial (opinionated version)
This tutorial will help you get started with cellpy and tries to give
you a step-by-step recipe. The information in this tutorial
can also (most likely) be found elsewhere. Novice users should
jump directly to chapter 1.2.
How to install cellpy - the minimalistic explanation
If you know what you are doing, and only need the most basic features
of cellpy, you should be able to get things up and running by
issuing a simple
pip install cellpy
It is recommended that you use a Python environment (or conda
environment) and give it an easy-to-remember name, e.g. cellpy.
You also need the typical scientific Python stack, including numpy,
scipy, and pandas. It is recommended that you at least install
scipy before you install cellpy (the main benefit being that you
can use conda so that you don’t have to hassle with missing
C-compilers if you are on a Windows machine).
Install a couple of other dependencies
You should also install some additional dependencies:
pytables is needed for working with the hdf5 files (the cellpy-files):
conda install -c conda-forge pytables
If you would like to use some of the fitting routines in cellpy, you
will need to install lmfit:
conda install -c conda-forge lmfit
Another tool that is really handy is Jupyter, and so is the plotting library bundle holoviz. You might already have them installed. If not, I recommend that you look at their documentation (google it) and install them. You can most likely use the same method as for pytables etc.
Note! In addition to the requirements set in the setup.py file, you
will also need a Python ODBC bridge for loading .res-files from Arbin
testers, and possibly also for other ‘to-be-implemented’ file formats. I
recommend pyodbc, which can be installed from conda forge or using pip.
conda install -c conda-forge pyodbc
For reading .res-files (which actually are in a Microsoft Access format) you also need a driver or similar to help your ODBC bridge access them. A small hint for Windows users: if you don’t have one of the most recent Office versions, you might not be allowed to install a driver with a different bitness than the one your Office version is using (the installers can be found here). Also note that the driver needs to have the same bitness as your Python (so, if you are using 32 bit Python, you will need the 32 bit driver).
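If you are unsure whether your Python is 32 bit or 64 bit, a quick check from inside Python can help. The following is only a minimal sketch: the function name python_bitness is made up for this example, and the pyodbc part is skipped if pyodbc is not installed.

```python
import struct
import sys


def python_bitness():
    """Return 32 or 64 depending on the pointer size of the running interpreter."""
    return struct.calcsize("P") * 8


print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {python_bitness()}-bit")

# Optional: list the ODBC drivers that pyodbc can see (only if pyodbc is available).
try:
    import pyodbc
except ImportError:
    print("pyodbc is not installed (or could not be loaded)")
else:
    for driver in pyodbc.drivers():
        print(driver)
```

On Windows, a Microsoft Access driver should show up in the printed list if it has been installed with the same bitness as your Python.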
For POSIX systems, I have not found any suitable drivers. Instead,
cellpy will try to use mdbtools to first export the data to
temporary csv-files, and then import from those csv-files (using the
pandas library). You can install mdbtools using your system’s
preferred package manager (e.g. apt-get install mdbtools).
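To see which of the packages mentioned above are already available in your environment, you could run something like the sketch below. The helper name missing_packages and the split into "required" and "optional" lists are assumptions made for this example. Note that pytables is imported under the name tables.

```python
from importlib.util import find_spec

# Import names (not install names): pytables is imported as "tables".
REQUIRED = ["numpy", "scipy", "pandas", "tables"]
OPTIONAL = ["lmfit", "pyodbc"]


def missing_packages(names):
    """Return the names in `names` that cannot be found by the import system."""
    return [name for name in names if find_spec(name) is None]


print("missing required:", missing_packages(REQUIRED))
print("missing optional:", missing_packages(OPTIONAL))
```

An empty list means that all the packages in that group were found.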
The tea spoon explanation
If you are used to installing stuff from the command line (or shell), then things might very well run smoothly. However, a considerable percentage of us don’t feel exceedingly comfortable installing things by writing commands inside a small black window. Let’s face it; we belong to the point-and-click (or double-click) generation, not the write-cryptic-commands generation. So, hopefully without insulting the savvy, here is a “tea-spoon explanation”.
Install a scientific stack of python 3.x
If the words “virtual environment” or “miniconda” don’t ring any bells,
you should install the Anaconda scientific Python distribution. Go to
www.anaconda.com and select the
Anaconda distribution (press the Download Now button). And no, don’t
select python 2.7. Use at least python 3.6. And select the 64 bit version
(if you fail at installing the 64 bit version, then you can try the
weaker 32 bit version). Download it and let it install.
Create a virtual environment
This step can be omitted (but it’s not necessarily very smart to do so).
Create a virtual conda environment called my_cellpy (the name is not
important, but it should be a name you are able to remember).
Open up a command window (you can find a command window on Windows by
e.g. pressing the Windows button + r and typing cmd.exe), or even better,
open up “anaconda prompt”. Then type
conda create -n my_cellpy
Then activate your environment:
conda activate my_cellpy
If you get an error message, then it could be that your Python version is not available for you (maybe you installed as root?). If you were using the command window on windows, try to locate the “anaconda prompt” program and run that instead.
Install cellpy
conda install -c conda-forge cellpy
Note that the bit version matters sometimes, so try to make a mental note of what you selected (for example, if you plan to use the Microsoft Access odbc driver, and it is 32-bit, you probably should choose to install a 32-bit python version (see next sub-chapter)).
If you don’t have the newest Office suite, you might need to install the Microsoft Access odbc driver, which can be downloaded from this page.
Check your installation
The easiest way to check if cellpy has been installed is to issue
the command for printing the version number to the screen
cellpy info --version
If the program prints the expected version number, you probably
succeeded. If it crashes, then you will have to retrace your steps, redo
stuff and hope for the best. If it prints an older (lower) version
number than you expect, there is a big chance that you have installed it
earlier, and what you would like to do is an upgrade instead
of an install
pip install --upgrade cellpy
It could also be that you want to install a pre-release (a version that is so bleeding edge that it ends with an alpha or beta release identification, e.g. ends with .b2). Then you will need to add the --pre modifier
pip install --pre cellpy
To run a more complete check of your installation, there is a
cellpy sub-command that can be helpful
cellpy info --check
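The version number can also be looked up from inside Python without importing cellpy itself. This is a minimal sketch using the standard library (it needs Python 3.8 or newer); the helper name installed_version is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


print("cellpy version:", installed_version("cellpy"))
```

If this prints None, cellpy is not installed in the environment that your Python interpreter belongs to (a typical symptom of having forgotten to activate the conda environment).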
The cellpy command to your rescue
To help you install and control your cellpy installation, a CLI
is provided with several commands, including info for getting
information about your installation, and setup for helping you to
set up your installation and write a configuration file.
To get more information, you can issue
cellpy --help
This will output some (hopefully) helpful text
Usage: cellpy [OPTIONS] COMMAND [ARGS]...
Options:
  --help  Show this message and exit.
Commands:
  edit   Edit your cellpy config file.
  info   This will give you some valuable information about your cellpy.
  new    Set up a batch experiment.
  pull   Download examples or tests from the big internet.
  run    Run a cellpy process.
  serve  Start a Jupyter server
  setup  This will help you to setup cellpy.
You can get information about the sub-commands by issuing --help after them also. For example, issuing
cellpy info --help
gives
Usage: cellpy info [OPTIONS]
Options:
  -v, --version    Print version information.
  -l, --configloc  Print full path to the config file.
  -p, --params     Dump all parameters to screen.
  -c, --check      Do a sanity check to see if things works as they should.
  --help           Show this message and exit.
Using the cellpy command for your first time setup
After you have installed cellpy it is highly recommended that you
create an appropriate configuration file and create folders for raw
data, cellpy-files, logs, databases and output data (and inform
cellpy about it)
cellpy setup -i
The -i option makes sure that the setup is done interactively.
The program will ask you about where specific folders are, e.g. where
you would like to put your outputs and where your cell data files are
located. If the folders don’t exist, cellpy will try to create them.
If you want to specify a root folder different from the default (your HOME
folder), you can use the -d option, e.g.
cellpy setup -i -d /Users/kingkong/cellpydir
Hint
If you don’t choose the -i option and go for accepting all the defaults,
you can always edit your configurations
directly in the cellpy configuration file (that should be located inside your
home directory, ~ on POSIX and C:\Users\NAME on not-too-old Windows).
When you have answered all the questions, a configuration file will be
made and saved to your home directory. You can always issue
cellpy info -l
to find out where your configuration file is located
(it’s written in YAML format and it should be relatively easy to edit it
in a text editor).
Running your first script
As with most software, you are encouraged to play a little with it. I hope there is some useful stuff in the code repository (for example in the examples folder).
Hint
The cellpy pull command can assist in downloading
both examples and tests.
Let’s start by trying to import cellpy in an interactive Python session.
If you have an icon to press to start up Python in interactive mode,
do that (it could also be for example an ipython console or a
Jupyter Notebook).
You can also start an interactive Python session
if you are in your terminal window or command window by just writing python
and pressing enter.
Once inside Python, try issuing import cellpy. Hopefully you should not see
any error-messages.
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:36:06)
[MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cellpy
>>>
Nothing bad happened this time. If you got an error message, try to interpret
it and check if you have skipped any steps in this tutorial. Maybe you are
missing the box package? If so, go out of the Python interpreter if you
started it in your command window, or open another command window and write
pip install python-box
and try again.
Now let’s try to be a bit more ambitious. Start up python again if you are not still running it and try this:
>>> from cellpy import prmreader
>>> prmreader.info()
The prmreader.info() command should print out information about your
cellpy settings, for example where you selected to look for your input
raw files (prms.Paths.rawdatadir).
Try scrolling to find your own prms.Paths.rawdatadir. Does it look
right? These settings can be changed by re-running the
cellpy setup -i command (not in Python, but in the command window /
terminal window). You probably need to use the --reset flag this time,
since it is not your first time running it.
What next?
For example: If you want to use the highly popular (?) cellpy.utils.batch
utility, you need to make (or copy from a friend) the “database” (an
excel-file with appropriate headers in the first row) and make sure that
all the paths are set up correctly in your cellpy configuration file.
Or, for example: If you would like to do some interactive plotting of your data, try to install holoviz and use Jupyter Lab to make some fancy plots and dash-boards.
And why not: make a script that goes through all your thousands of measured cells, extracts the life-time (e.g. number of cycles until the capacity has dropped below 80% of the average of the three first cycles), and plots this versus the time the cell was put on test. And maybe color the data-points based on who was doing the experiment?
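The life-time criterion mentioned above could be sketched like this. It is plain Python working on a list of capacities (one value per cycle), not a cellpy function; the name cycle_life and its arguments are made up for this example, and the numbers are invented.

```python
def cycle_life(capacities, fraction=0.8, n_reference=3):
    """Return the first cycle number (1-based) whose capacity is below
    `fraction` times the mean of the first `n_reference` cycles.

    Returns None if the capacity never drops below the limit or if there
    are fewer than `n_reference` cycles.
    """
    if len(capacities) < n_reference:
        return None
    reference = sum(capacities[:n_reference]) / n_reference
    limit = fraction * reference
    for cycle_number, capacity in enumerate(capacities, start=1):
        if capacity < limit:
            return cycle_number
    return None


# Made-up capacities (one value per cycle), only for illustration.
example = [100.0, 100.0, 100.0, 95.0, 90.0, 85.0, 79.0, 70.0]
print(cycle_life(example))  # -> 7
```

In a real script you would typically feed it a capacity column taken from the summary of each cell.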
Configuring cellpy
How the configuration parameters are set and read
When cellpy is imported, it sets a default set of parameters.
Then it tries to read the parameters
from your .conf-file (located in your user directory). If it is successful,
the parameters set in your .conf-file will override the default ones.
The parameters are stored in the module cellpy.parameters.prms.
If you during your script (or in your jupyter notebook) would like to
change some of the settings (e.g. if you
want to use the cycle_mode option “cathode” instead of the
default “anode”), then import the prms module and set new
values:
from cellpy.parameters import prms
# Changing cycle_mode to cathode
prms.Reader.cycle_mode = 'cathode'
# Changing delimiter to ',' (used when saving .csv files)
prms.Reader.sep = ','
# Changing the default folder for processed (output) data
prms.Paths.outdatadir = 'experiment01/processed_data'
The configuration file
cellpy tries to read your .conf-file when imported the first time,
and looks in your user directory on posix or in the documents folder on
windows (e.g. C:\Users\USERNAME\Documents on not-too-old versions of windows) for
files named .cellpy_prms_SOMENAME.conf.
If you have run cellpy setup in the cmd window or in the shell, the
configuration file will be placed in the appropriate place.
It will have the name .cellpy_prms_USERNAME.conf
(where USERNAME is your username).
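If you want to check from Python which configuration files are lying around in a folder, a small sketch like the following could be used. The function name find_cellpy_config_files is made up for this example, and it only looks for the file name pattern described above in the folder you give it (here: your home directory).

```python
from pathlib import Path


def find_cellpy_config_files(folder):
    """Return files in `folder` whose names match .cellpy_prms_*.conf, sorted by name."""
    return sorted(Path(folder).glob(".cellpy_prms_*.conf"))


for config_file in find_cellpy_config_files(Path.home()):
    print(config_file)
```

The cellpy info -l command remains the authoritative way of finding the file that cellpy actually uses.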
The configuration file is a YAML-file and it is reasonably easy to read and edit (but remember that YAML is rather strict with regards to spaces and indentations).
As an example, here are the first lines from one of the authors’ configuration files:
---
Paths:
  outdatadir: C:\scripts\processing_cellpy\out
  rawdatadir: I:\Org\MPT-BAT-LAB\Arbin-data
  cellpydatadir: C:\scripts\processing_cellpy\cellpyfiles
  db_path: C:\scripts\processing_cellpy\db
  filelogdir: C:\scripts\processing_cellpy\logs
  examplesdir: C:\scripts\processing_cellpy\examples
  notebookdir: C:\scripts\processing_cellpy\notebooks
  templatedir: C:\scripting\processing_cellpy\templates
  batchfiledir: C:\scripts\processing_cellpy\batchfiles
  db_filename: 2020_Cell_Analysis_db_001.xlsx
FileNames:
  file_name_format: YYYYMMDD_[NAME]EEE_CC_TT_RR
The first part contains definitions of the different paths, files and file-patterns
that cellpy will use. This is probably the place
where you will most likely have to do some edits sometime.
Next come the definitions needed when using a db.
# settings related to the db used in the batch routine
Db:
  db_type: simple_excel_reader
  db_table_name: db_table
  db_header_row: 0
  db_unit_row: 1
  db_data_start_row: 2
  db_search_start_row: 2
  db_search_end_row: -1
# definitions of headers for the simple_excel_reader
DbCols:
  id:
  - id
  - int
  exists:
  - exists
  - bol
  batch:
  - batch
  - str
  sub_batch_01:
  - b01
  - str
.
.
It’s rather long (since it needs to define the column names used in the db excel sheet).
After this come the settings for the datasets and the cellreader,
as well as for the different instruments.
You will also find the settings for the batch utility at the bottom.
# settings related to your data
DataSet:
  nom_cap: 3579
# settings related to the reader
Reader:
  diagnostics: false
  filestatuschecker: size
  force_step_table_creation: true
  force_all: false
  sep: ;
  cycle_mode: anode
  sorted_data: true
  load_only_summary: false
  select_minimal: false
  limit_loaded_cycles:
  ensure_step_table: false
  daniel_number: 5
  voltage_interpolation_step: 0.01
  time_interpolation_step: 10.0
  capacity_interpolation_step: 2.0
  use_cellpy_stat_file: false
  auto_dirs: true
# settings related to the instrument loader
# (each instrument can have its own set of settings)
Instruments:
  tester: arbin
  custom_instrument_definitions_file:
  Arbin:
    max_res_filesize: 1000000000
    chunk_size:
    max_chunks:
    use_subprocess: false
    detect_subprocess_need: false
    sub_process_path:
    office_version: 64bit
    SQL_server: localhost
    SQL_UID:
    SQL_PWD:
    SQL_Driver: ODBC Driver 17 for SQL Server
    odbc_driver:
  Maccor:
    default_model: one
# settings related to running the batch procedure
Batch:
  fig_extension: png
  backend: bokeh
  notebook: true
  dpi: 300
  markersize: 4
  symbol_label: simple
  color_style_label: seaborn-deep
  figure_type: unlimited
  summary_plot_width: 900
  summary_plot_height: 800
  summary_plot_height_fractions:
  - 0.2
  - 0.5
  - 0.3
...
As you can see, the author of this particular file most likely works with
silicon as anode material for lithium ion batteries (the nom_cap
is set to 3579 mAh/g, i.e. the theoretical
gravimetric lithium capacity for silicon at
normal temperatures). And, he or she is using windows.
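For the curious, the number 3579 can be reproduced with a few lines of Python. The sketch below assumes that the fully lithiated phase at normal temperatures is Li15Si4 (i.e. 3.75 Li per Si), which is not stated in the configuration file itself; the function name is made up for this example.

```python
FARADAY = 96485.332  # Faraday constant in C/mol
MOLAR_MASS_SI = 28.0855  # molar mass of silicon in g/mol


def theoretical_capacity(n_electrons, molar_mass):
    """Theoretical gravimetric capacity in mAh/g for n_electrons per formula unit."""
    # 1 mAh = 3.6 C, so dividing by 3.6 converts C/g to mAh/g.
    return n_electrons * FARADAY / (3.6 * molar_mass)


# Assumption: Li15Si4 corresponds to 15 / 4 = 3.75 Li per Si atom.
print(round(theoretical_capacity(15 / 4, MOLAR_MASS_SI)))  # -> 3579
```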
By the way, if you are wondering what the ‘.’ means… it means nothing - it was just something I added in this tutorial text to indicate that there is more stuff in the actual file than what is shown here.
Interacting with your data
Read cell data
We assume that we have cycled a cell and that we have two files with results (we had to stop the experiment and re-start for some reason). The files are in the .res format (Arbin).
The easiest way to load data is to use the cellpy.get method.
import cellpy
electrode_mass = 0.658 # active mass of electrode in mg
cell_data = cellpy.get("20170101_ife01_cc_01.res", mass=electrode_mass, cycle_mode="anode")
If you prefer, you can obtain the same by using the cellpy.cellreader.CellpyData
object directly.
First, import the cellreader module from cellpy:
import os
from cellpy import cellreader
Then define some settings and variables and create the CellpyData-object:
raw_data_dir = r"C:\raw_data"
out_data_dir = r"C:\processed_data"
cellpy_data_dir = r"C:\CellpyData"
cycle_mode = "anode" # default is usually "anode", but...
# These can also be set in the configuration file
electrode_mass = 0.658 # active mass of electrode in mg
# list of files to read (Arbin .res type):
raw_file = ["20170101_ife01_cc_01.res", "20170101_ife01_cc_02.res"]
# the second file is a 'continuation' of the first file...
# list consisting of file names with full path
raw_files = [os.path.join(raw_data_dir, f) for f in raw_file]
# creating the CellpyData object and setting the cycle mode:
cell_data = cellreader.CellpyData()
cell_data.cycle_mode = cycle_mode
Now we will read the files, merge them, and create a summary:
# if the files are given in a list they are automatically merged
# (raw_files is already a list, so it is passed as it is):
cell_data.from_raw(raw_files)
cell_data.set_mass(electrode_mass)
cell_data.make_summary()
# Note: make_summary will automatically run the
# make_step_table function if it does not exist.
Then it’s probably best to save the data in the cellpy-format:
# defining a name for the cellpy_file (hdf5-format)
cellpy_file = os.path.join(cellpy_data_dir, "20170101_ife01_cc2.h5")
cell_data.save(cellpy_file)
For convenience, cellpy also has a method that can be used to select
whether-or-not to load directly from the raw-file.
Using the loadcell method, you can specify both the raw
file name(s) and the cellpy file name, and
cellpy will check if the raw file(s) is/are updated since
the last time you saved the cellpy file - if not,
then it will load the cellpy file instead (this is usually much faster
than loading the raw file(s)).
You can also input the masses and enforce that it creates a
summary automatically.
# raw_files is already a list of file names, so it is passed as it is
cell_data.loadcell(raw_files=raw_files, cellpy_file=cellpy_file,
                   mass=[electrode_mass], summary_on_raw=True,
                   force_raw=False)
if not cell_data.check():
    print("Could not load the data")
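The idea behind the "load the raw file only if it has been updated" check can be illustrated with a few lines of standard Python. This is only a sketch of the principle using file modification times; it is not how cellpy implements it (the filestatuschecker setting in the configuration file suggests that cellpy can for example compare file sizes instead), and the function name should_load_raw is made up for this example.

```python
import os


def should_load_raw(raw_files, cellpy_file):
    """Return True if the cellpy file is missing or older than any of the raw files."""
    if not os.path.isfile(cellpy_file):
        return True
    saved_at = os.path.getmtime(cellpy_file)
    return any(os.path.getmtime(raw) > saved_at for raw in raw_files)
```

With the variables defined earlier, should_load_raw(raw_files, cellpy_file) would tell you whether a fresh (and slower) load from the raw files is needed.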
More about the cellpy.get method
The following keyword arguments are currently supported by cellpy.get:
# from the docstring:
Args:
filename (str, os.PathLike, or list of raw-file names): path to file(s)
mass (float): mass of active material (mg) (defaults to mass given in cellpy-file or 1.0)
instrument (str): instrument to use (defaults to the one in your cellpy config file) (arbin_res, arbin_sql, arbin_sql_csv, arbin_sql_xlxs)
instrument_file (str or path): yaml file for custom file type
nominal_capacity (float): nominal capacity for the cell (e.g. used for finding C-rates)
logging_mode (str): "INFO" or "DEBUG"
cycle_mode (str): the cycle mode (e.g. "anode" or "full_cell")
auto_summary (bool): (re-) create summary.
testing (bool): set to True if testing (will for example prevent making .log files)
**kwargs: sent to the loader
Reading a cellpy file:
c = cellpy.get("my_cellpyfile.cellpy")
# or
c = cellpy.get("my_cellpyfile.h5")
Reading anode half-cell data from arbin sql:
c = cellpy.get("my_cellpyfile", instrument="arbin_sql", cycle_mode="anode")
# Remark! if sql prms are not set in your config-file you have to set them manually (e.g. setting values in
# prms.Instruments.Arbin.VAR)
Reading data obtained by exporting csv from arbin sql using non-default delimiter sign:
c = cellpy.get("my_cellpyfile.csv", instrument="arbin_sql_csv", sep=";")
Reading data obtained by exporting a csv file from Maccor
using a sub-model (this example uses one of the models already available inside cellpy):
c = cellpy.get(filename="name.txt", instrument="maccor_txt", model="one", mass=1.0)
Reading csv file using the custom loader where the format definitions are given in a user-supplied yaml-file:
c = cellpy.get(filename="name.txt", instrument_file="my_custom_file_format.yml")
Extract capacity-voltage graphs
If you have loaded your data into a CellpyData-object,
let’s now consider how to extract capacity-voltage graphs
from your data. We assume that the name of your
CellpyData-object is cell_data:
cycle_number = 5
charge_capacity, charge_voltage = cell_data.get_ccap(cycle_number)
discharge_capacity, discharge_voltage = cell_data.get_dcap(cycle_number)
You can also get the capacity-voltage curves with both charge and discharge:
capacity, voltage = cell_data.get_cap(cycle_number)
# the second capacity (charge (delithiation) for typical anode half-cell experiments)
# will be given "in reverse".
The CellpyData object has several get-methods, including getting current,
timestamps, etc.
Extract summaries of runs
Summaries of runs include data per cycle for your data set. Examples of
summary data are charge- and
discharge-values, coulombic efficiencies and internal resistances.
These are calculated by the make_summary method.
Note that not all the possible summary statistics are calculated by
default. This means that you might have to re-run the make_summary method
with appropriate parameters as input (e.g. normalization_cycle,
to give the appropriate cycle numbers to use for finding nominal capacity).
Another method is responsible for investigating the individual steps in the
data (make_step_table). It is typically run automatically before creating
the summaries (since the summary creation depends on the step_table). This
table is interesting in itself since it contains delta, minimum, maximum and
average values for the measured values per step. This is used to find out
what type of step it is, e.g. a charge-step or maybe an ocv-step. It is
possible to provide information to this function if you already know what
kind of step each step is. This saves cellpy a lot of work.
Note that the default is to calculate values for each unique (step-number -
cycle-number) pair. For some experiments, a step can be repeated many times
per cycle. And if you need for example average values of the voltage for each
step (for example if you are doing GITT experiments), you would need to
tell make_step_table that it should calculate for all the steps
(all_steps=True).
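To give an impression of what "delta, minimum, maximum and average values per step" means, here is a small stand-alone sketch that groups some made-up voltage readings by their (cycle-number, step-number) pair. It is not the make_step_table implementation; the function name step_statistics is made up for this example, and defining delta as "last value minus first value" is an assumption.

```python
from collections import defaultdict


def step_statistics(rows):
    """Group (cycle, step, value) rows and return statistics per (cycle, step) pair.

    `delta` is defined here as the last value minus the first value within the group.
    """
    groups = defaultdict(list)
    for cycle, step, value in rows:
        groups[(cycle, step)].append(value)
    return {
        key: {
            "delta": values[-1] - values[0],
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
        }
        for key, values in groups.items()
    }


# Made-up voltage readings: (cycle number, step number, voltage in V).
rows = [(1, 1, 3.0), (1, 1, 3.5), (1, 1, 4.0), (1, 2, 4.0), (1, 2, 3.0)]
stats = step_statistics(rows)
print(stats[(1, 1)])
```

Since the grouping is done on unique (cycle, step) pairs, a step that is repeated several times within one cycle ends up in one and the same group. This corresponds to the default behaviour described above.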
Create dQ/dV plots
The methods for creating incremental capacity curves are located in
the cellpy.utils.ica module.
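The principle behind an incremental capacity (dQ/dV) curve can be sketched with a naive finite difference, as shown below. This is not the method used in cellpy.utils.ica (real data normally needs smoothing and interpolation before differentiating); the function name incremental_capacity and the data are made up for this example.

```python
def incremental_capacity(voltage, capacity):
    """Return (midpoint voltages, dQ/dV) using simple forward differences.

    Points where the voltage does not change are skipped to avoid division by zero.
    """
    mid_voltage, dqdv = [], []
    for i in range(len(voltage) - 1):
        dv = voltage[i + 1] - voltage[i]
        if dv == 0:
            continue
        dq = capacity[i + 1] - capacity[i]
        mid_voltage.append((voltage[i] + voltage[i + 1]) / 2)
        dqdv.append(dq / dv)
    return mid_voltage, dqdv


# Made-up data where the capacity grows linearly with the voltage (slope 2).
v, dq = incremental_capacity([0.0, 0.5, 1.0], [0.0, 1.0, 2.0])
print(v, dq)
```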
Save / export data
Saving data to cellpy format is done by the CellpyData.save method.
To export data to csv format, CellpyData has a method called to_csv.
# export data to csv
out_data_directory = r"C:\processed_data\csv"
# this exports the summary data to a .csv file:
cell_data.to_csv(out_data_directory, sep=";", cycles=False, raw=False)
# export also the current voltage cycles by setting cycles=True
# export also the raw data by setting raw=True
Working with the pandas.DataFrame objects directly
The CellpyData object stores the data in several pandas.DataFrame objects.
The easiest way to get to the DataFrames is by the following procedure:
# Assumed name of the CellpyData object: cell_data
# get the 'test':
c = cell_data.cell
# c is now a cellpy Cell object (cellpy.readers.cellreader.Cell)
# pandas.DataFrame with data vs cycle number (e.g. coulombic efficiency):
summary_data = c.summary
# pandas.DataFrame with the raw data:
raw_data = c.raw
# pandas.DataFrame with statistics on each step and info about step type:
step_info = c.steps
You can then manipulate your data with the standard pandas.DataFrame methods
(and pandas methods in general).
Note
At the moment, CellpyData objects can store several sets of test-data
(several ‘tests’). They are stored
in a list. It is not recommended to utilise this
‘possible to store multiple tests’ feature as it might be
removed very soon (I have not decided upon that yet).
Happy pandas-ing!
Data mining / using a database
One important motivation for developing the cellpy project is to facilitate
handling many cell testing experiments within a reasonable time and with a
“tunable” degree of automation. It is therefore convenient to be able to
couple the meta-data (look-up) to some kind of database, as well as
to save the extracted key parameters to either the same or another database
(where I recommend the latter). The database(s) will be a valuable asset for
further data analyses (either using statistical methods, e.g. Bayesian
modelling, or as input to machine learning algorithms, for example deep
learning using CNNs).
TODO.
The cellpy command
To assist in using cellpy more efficiently, a set of routines is available from
the command line by issuing the cellpy command at the shell (or in the cmd window).
$ cellpy
Usage: cellpy [OPTIONS] COMMAND [ARGS]...
Options:
  --help  Show this message and exit.
Commands:
  edit   Edit your cellpy config file.
  info   This will give you some valuable information about your cellpy.
  new    Set up a batch experiment.
  pull   Download examples or tests from the big internet.
  run    Run a cellpy process.
  serve  Start a Jupyter server
  setup  This will help you to setup cellpy.
As can be seen from the help-text, the cli is still under development
(cli stands for command-line-interface, by the way). Both the cellpy new
and the cellpy serve commands worked the last time I tried them, but
they might not work on your computer.
A couple of commands are implemented to get some information about your cellpy environment (currently getting your cellpy version and the location of your configuration file):
$ cellpy info --version
[cellpy] version: 0.3.1
$ cellpy info --configloc
[cellpy] ->C:\Users\jepe\_cellpy_prms_jepe.conf
The most important command is probably the setup command (that should be run
when you install cellpy for the first time).
$ cellpy setup --interactive
Another very nice command is the new command that sets up a project structure
for batch-processing cell data (using templates, either from github or from your local computer).
$ cellpy new