Basic usage
The getting started with cellpy tutorial (opinionated version)
This tutorial will help you get started with cellpy and tries to give you a step-by-step recipe. It starts with installation, and you should select the installation method that best suits your needs (or your level).
How to install and run cellpy - the teaspoon explanation for standard users
If you are used to installing stuff from the command line (or shell), then things might very well run smoothly. If you are not, then you might want to read through the guide for complete beginners first (see Setting up cellpy on Windows for complete beginners below).
1. Install a scientific stack of python 3.x
If the words “virtual environment” or “miniconda” do not ring any bells,
you should install the Anaconda scientific Python distribution. Go to
www.anaconda.com and select the
Anaconda distribution (press the Download Now
button).
Use at least python 3.9, and select the 64 bit version
(if you fail at installing the 64 bit version, then you can try the
weaker 32 bit version). Download it and let it install.
Caution
The bit version matters sometimes, so try to make a mental note of what you selected. E.g., if you plan to use the Microsoft Access odbc driver (see below), and it is 32-bit, you should probably choose to install a 32-bit python version.
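If you are unsure which bit version you ended up with, you can ask Python itself. The snippet below is a minimal sketch using only the standard library; the helper name python_bitness is made up for this illustration and is not part of cellpy.

```python
import struct


def python_bitness() -> int:
    """Return 32 or 64 depending on the running Python interpreter."""
    # The size of a C pointer ("P") in bytes reveals the interpreter's bitness.
    return struct.calcsize("P") * 8


print(f"This Python is a {python_bitness()}-bit build")
```

Run it with the same Python that you plan to use for cellpy, since several Python installations can live side by side on one computer.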
Python should now be available on your computer, as well as a large number of Python packages. Anaconda is also kind enough to install an alternative command window called “Anaconda Prompt” that has the correct settings to ensure that the conda command works as it should.
2. Create a virtual environment
This step can be omitted (but it's not necessarily smart to do so).
Create a virtual conda environment called cellpy (the name is not important, but it should be a name you are able to remember) by following the steps below:
Open up the “Anaconda Prompt” (or use the command window) and type
conda create -n cellpy
This creates your virtual environment (here called cellpy) in which cellpy
will be installed and used.
You then have to activate the environment:
conda activate cellpy
3. Install cellpy
In your activated cellpy environment in the Anaconda Prompt (if you chose to make one), or in the base environment (if you chose not to), run:
conda install -c conda-forge cellpy
Congratulations, you have (hopefully) successfully installed cellpy.
If you run into problems, double-check that all your dependencies are installed and check your Microsoft Access odbc drivers.
4. Check your cellpy installation
The easiest way to check if cellpy has been installed is to issue the command for printing the version number to the screen:
cellpy info --version
If the program prints the expected version number, you probably succeeded. If it crashes, then you will have to retrace your steps, redo stuff and hope for the best. If it prints an older (lower) version number than you expect, there is a big chance that you have installed it earlier, and what you would like to do is an upgrade instead of an install:
python -m pip install --upgrade cellpy
If you want to install a pre-release (a version that is so bleeding edge that it ends with an alpha or beta release identification, e.g. ends with .b2), you will need to add the --pre modifier:
python -m pip install --pre cellpy
To run a more complete check of your installation, there exists a cellpy sub-command that can be helpful:
cellpy info --check
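If you prefer to check from within Python (for example from a notebook), the sketch below asks the standard library which version of a package is installed. It does not replace cellpy info --check; the helper name installed_version is an assumption made for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("cellpy")
if version is None:
    print("cellpy is not installed in this environment")
else:
    print(f"cellpy version {version} is installed")
```

If this reports that cellpy is missing even though you installed it, you are most likely running a different environment than the one you installed it into.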
5. Set up cellpy
After you have installed cellpy it is highly recommended that you create an appropriate configuration file and folders for raw data, cellpy-files, logs, databases and output data (and inform cellpy about it).
To do this, run the setup command:
cellpy setup
To run the setup in interactive mode, use -i:
cellpy setup -i
This creates the cellpy configuration file .cellpy_prms_USERNAME.conf
in your home directory (USERNAME = your user name) and creates the standard
cellpy_data folders (if they do not exist).
The -i option makes sure that the setup is done interactively: the program will ask you about where specific folders are, e.g. where you would like to put your outputs and where your cell data files are located. If the folders do not exist, cellpy will try to create them.
If you want to specify a root folder different from the default (your HOME folder), you can use the -d option, e.g.
cellpy setup -i -d /Users/kingkong/cellpydir
Hint
You can always edit your configurations directly in the cellpy configuration file .cellpy_prms_USERNAME.conf. This file should be located inside your home directory, ~ on POSIX and C:\Users\USERNAME on not-too-old Windows.
6. Create a notebook and run cellpy
Inside your Anaconda Prompt window, write:
jupyter notebook # or jupyter lab
Your browser should then open and you are ready to write your first cellpy script.
There are many good tutorials on how to work with jupyter. This one by Real Python is good for beginners: Jupyter Notebook: An Introduction
Setting up cellpy on Windows for complete beginners
This guide provides step-by-step instructions for installing Cellpy on a Windows system, especially tailored for beginners.
1. Installing Python
First, download Python from the official website. Choose the latest version for Windows.
- Run the downloaded installer. On the first screen of the setup, make sure to check the box saying “Add Python to PATH” before clicking “Install Now”.
After installation, you can verify it by opening the Command Prompt (see below) and typing:
python --version
This command should return the version of Python that you installed.
2. Opening Command Prompt
Press the Windows key, usually located at the bottom row of your keyboard, between the Ctrl and Alt keys.
Type “Command Prompt” into the search bar that appears at the bottom of the screen when you press the Windows key.
Click on the “Command Prompt” application to open it.
3. Creating a Virtual Environment
A virtual environment is a tool that helps to keep dependencies required by different projects separate by creating isolated Python environments for them. Here’s how to create one:
Open Command Prompt.
Navigate to the directory where you want to create your virtual environment using the cd command. For example:
cd C:\Users\YourUsername\Documents
Type the following command and press enter to create a new virtual environment (replace envname with the name you want to give to your virtual environment):
python -m venv envname
To activate the virtual environment, type the following command and press enter:
envname\Scripts\activate
You’ll know it worked if you see (envname) before the prompt in your Command Prompt window.
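Later on, when you are inside Python or a notebook, the prompt is no longer visible. The sketch below is one way to check which environment the running interpreter belongs to, using only the standard library. The helper name active_environment is made up for this example; the CONDA_DEFAULT_ENV variable is only set if you use conda.

```python
import os
import sys


def active_environment() -> str:
    """Describe the environment the running interpreter belongs to."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated
    conda_env = os.environ.get("CONDA_DEFAULT_ENV")
    if conda_env:
        return f"conda environment '{conda_env}'"
    # inside a venv, sys.prefix points to the environment, not the base install
    if sys.prefix != sys.base_prefix:
        return f"virtual environment at {sys.prefix}"
    return "no virtual environment (system or base Python)"


print(active_environment())
```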
4. Installing Jupyter Notebook and matplotlib
Jupyter Notebook is an open-source web application that allows you to create documents containing live code, equations, visualizations, and text. It’s very useful, especially for beginners. To install Jupyter Notebook:
Make sure your virtual environment is activated.
Type the following command and press enter:
python -m pip install jupyter matplotlib
5. Installing cellpy
Next, you need to install cellpy. You can install it via pip (Python’s package manager).
To install cellpy:
Make sure your virtual environment is activated.
Type the following command and press enter:
python -m pip install cellpy
6. Launching Jupyter Notebook
Make sure your virtual environment is activated.
Type the following command and press enter:
jupyter notebook
This will open a new tab in your web browser with the Jupyter interface. From there, create a new Python notebook by clicking on “New” > “Python 3”.
7. Trying out cellpy
Here’s a simple example of how to use Cellpy in a Jupyter notebook:
In the first cell of the notebook, import Cellpy by typing:
import cellpy
Press Shift + Enter to run the cell.
In the new cell, load your data file (replace “datafile.res” and “/path/to/your/data” with your actual filename and path):
filepath = "/path/to/your/data/datafile.res"
c = cellpy.get(filepath)  # create a new cellpy object
Press Shift + Enter to run the cell and load the data.
To see a summary of the loaded data, create a new cell and type:
print(c.data.summary.head())
Press Shift + Enter to run the cell and print the summary.
Congratulations! You’ve successfully set up Cellpy in a virtual environment on your Windows PC and loaded your first data file. For more information and examples, check out the official Cellpy documentation.
Cellpy includes convenient functions for accessing the data. Here’s a basic example of how to plot voltage vs. capacity.
In a new cell in your Jupyter notebook, first, import matplotlib, which is a Python plotting library:
import matplotlib.pyplot as plt
Press Shift + Enter to run the cell.
Then, iterate through all cycle numbers, extract the capacity curves and plot:
for cycle in c.get_cycle_numbers():
    d = c.get_cap(cycle)
    plt.plot(d["capacity"], d["voltage"])
plt.show()
Press Shift + Enter to run the cell.
This will produce a figure with one curve for each cycle in the loaded data.
Once you’ve loaded your data, you can save it to a hdf5 file for later use:
c.save("saved_data.h5")
This saves the loaded data to a file named ‘saved_data.h5’.
Now, let’s try to create some dQ/dV plots. dQ/dV is a plot of the change in capacity (Q) with respect to the change in voltage (V). It’s often used in battery analysis to observe specific electrochemical reactions. Here’s how to create one:
In a new cell in your Jupyter notebook, first, if you have not imported matplotlib:
import matplotlib.pyplot as plt
Press Shift + Enter to run the cell.
Then, calculate dQ/dV using Cellpy’s ica utility:
import cellpy.utils.ica as ica
dqdv = ica.dqdv_frames(c, cycle=[1, 10, 100], voltage_resolution=0.01)
Press Shift + Enter to run the cell.
Now, you can create a plot of dQ/dV. In a new cell, type:
plt.figure(figsize=(10, 8))
plt.plot(dqdv["v"], dqdv["dq"], label="dQ/dV")
plt.xlabel("Voltage (V)")
plt.ylabel("dQ/dV (Ah/V)")
plt.legend()
plt.grid(True)
plt.show()
Press Shift + Enter to run the cell.
In the code above, plt.figure is used to create a new figure, plt.plot plots the data, plt.xlabel and plt.ylabel set the labels for the x and y axes, plt.legend adds a legend to the plot, plt.grid adds a grid to the plot, and plt.show displays the plot.
With this, you should be able to see the dQ/dV plot in your notebook.
Remember that the process of creating a dQ/dV plot can be quite memory-intensive, especially for large datasets, so it may take a while for the plot to appear.
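To get a feeling for what dQ/dV actually is, here is a bare-bones sketch that computes it as a plain finite difference between neighbouring points. This is not what the ica utility in Cellpy does internally (real data usually needs interpolation and smoothing first); the function name simple_dqdv and the numbers are made up for the illustration.

```python
from typing import List, Sequence, Tuple


def simple_dqdv(
    voltage: Sequence[float], capacity: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (mid-point voltages, dQ/dV) from two equally long sequences."""
    if len(voltage) != len(capacity):
        raise ValueError("voltage and capacity must have the same length")
    v_mid, dqdv = [], []
    for i in range(1, len(voltage)):
        dv = voltage[i] - voltage[i - 1]
        if dv == 0:
            continue  # skip points where the voltage did not change
        dq = capacity[i] - capacity[i - 1]
        v_mid.append((voltage[i] + voltage[i - 1]) / 2)
        dqdv.append(dq / dv)
    return v_mid, dqdv


# made-up numbers, only to show the call
v, dq = simple_dqdv([0.0, 1.0, 2.0, 2.0, 4.0], [0.0, 2.0, 4.0, 5.0, 8.0])
print(v)
print(dq)
```

Dividing by small voltage differences amplifies measurement noise, which is one reason why a dedicated utility is preferable for real data.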
For more information and examples, check out the official Cellpy documentation and the matplotlib documentation.
This recipe can only take you a certain distance. If you want to become more efficient with Python and Cellpy, you might want to try to install it using the method described in the chapter “Installing and setting up cellpy” in the official Cellpy documentation.
More about installing and setting up cellpy
Fixing dependencies
To make sure your environment contains the correct packages and dependencies required for running cellpy, you can create an environment based on the available environment.yml file. Download the environment.yml file and place it in the directory shown in your Anaconda Prompt. If you want to change the name of the environment, you can do this by changing the first line of the file. Then type (in the Anaconda Prompt):
conda env create -f environment.yml
Then activate your environment:
conda activate cellpy
cellpy relies on a number of other Python packages and these need to be installed. Most of these packages are included when creating the environment based on the environment.yml file as outlined above.
Basic dependencies
In general, you need the typical scientific python pack, including
numpy
scipy
pandas
Additional dependencies are:
pytables is needed for working with the hdf5 files (the cellpy-files):
conda install -c conda-forge pytables
lmfit is required to use some of the fitting routines in cellpy:
conda install -c conda-forge lmfit
holoviz and plotly: plotting libraries used in several of our example notebooks.
jupyter: used for tutorial notebooks and in general a very useful tool for working with and sharing your cellpy results.
For more details, I recommend that you look at the documentation of these packages (google it) and install them. You can most likely use the same method as for pytables etc.
Additional requirements for .res files
Note! .res files from Arbin testers are actually in a Microsoft Access format.
For Windows users: if you do not have one of the most recent Office versions, you might not be allowed to install a driver with a different bit version than the one your Office installation is using (the installers can be found here). Also note that the driver needs to have the same bit version as your Python (so, if you are using 32 bit Python, you will need the 32 bit driver).
For POSIX systems: I have not found any suitable drivers. Instead, cellpy will try to use mdbtools to first export the data to temporary csv-files, and then import from those csv-files (using the pandas library). You can install mdbtools using your system's preferred package manager (e.g. apt-get install mdbtools).
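A quick way to see if mdbtools is reachable from Python is to look for one of its command-line programs on your PATH. The sketch below looks for mdb-export, which is shipped with mdbtools; treating it as the relevant program is an assumption made for this example, and the helper name mdbtools_available is made up.

```python
import shutil


def mdbtools_available() -> bool:
    """Return True if the mdb-export executable can be found on the PATH."""
    return shutil.which("mdb-export") is not None


if mdbtools_available():
    print("mdbtools found")
else:
    print("mdbtools not found; install it with your package manager")
```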
The cellpy configuration file
The paths to raw data, the cellpy database file, file locations etc. are set in the .cellpy_prms_USERNAME.conf file that is located in your home directory.
To get the filepath to your config file (and other cellpy info), run:
cellpy info -l
The config file is written in YAML format and it should be relatively easy to edit it in a text editor.
Within the config file, the paths are the most important parts that need to be set up correctly. They tell cellpy where to find (and save) different files, such as the database file and raw data.
Furthermore, the config file contains details about the database-file to be used for cell info and metadata (i.e. type and structure of the database file such as column headers etc.). For more details, see chapter on Configuring cellpy.
The ‘database’ file
The database file should contain information (cell name, type, mass loading etc.) on your cells, so that cellpy can find and link the test data to the provided metadata.
The database file is also useful when working with the cellpy batch routine.
Useful cellpy commands
To help installing and controlling your cellpy installation, a CLI (command-line-interface) is provided with several commands (including the already mentioned info for getting information about your installation, and setup for helping you to set up your installation and writing a configuration file).
To get a list of these commands including some basic information, you can issue
cellpy --help
This will output some (hopefully) helpful text
Usage: cellpy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
edit Edit your cellpy config file.
info This will give you some valuable information about your cellpy.
new Set up a batch experiment.
pull Download examples or tests from the big internet.
run Run a cellpy process.
serve Start a Jupyter server
setup This will help you to setup cellpy.
You can get information about the sub-commands by issuing --help after them also. For example, issuing
cellpy info --help
gives
Usage: cellpy info [OPTIONS]
Options:
-v, --version Print version information.
-l, --configloc Print full path to the config file.
-p, --params Dump all parameters to screen.
-c, --check Do a sanity check to see if things works as they should.
--help Show this message and exit.
Running your first script
As with most software, you are encouraged to play a little with it. I hope there is some useful stuff in the code repository (for example in the examples folder).
Hint
The cellpy pull command can assist in downloading both examples and tests.
Start by trying to import cellpy in an interactive Python session. If you have an icon to press to start up Python in interactive mode, do that (it could also be for example an ipython console or a Jupyter Notebook). You can also start an interactive Python session from your terminal window or command window by just writing python and pressing enter.
Hint: Remember to activate your cellpy (or whatever name you chose) environment.
Once inside Python, try issuing import cellpy. Hopefully you should not see any error messages.
Python 3.9.9 | packaged by conda-forge | (main, Dec 20 2021, 02:36:06)
[MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import cellpy
>>>
Nothing bad happened this time. If you got an error message, try to interpret it and check if you have skipped any steps in this tutorial. Maybe you are missing the box package? If so, go out of the Python interpreter if you started it in your command window, or open another command window and write
pip install python-box
and try again.
Now let’s try to be a bit more ambitious. Start up python again if you are not still running it and try this:
>>> from cellpy import prmreader
>>> prmreader.info()
The prmreader.info() command should print out information about your cellpy settings, for example where you selected to look for your input raw files (prms.Paths.rawdatadir). Try scrolling to find your own prms.Paths.rawdatadir. Does it look right? These settings can be changed by re-running the cellpy setup -i command (not in Python, but in the command window / terminal window). You probably need to use the --reset flag this time since it is not your first time running it.
Interacting with your data
Read cell data
We assume that we have cycled a cell and that we have two files with results (we had to stop the experiment and re-start for some reason). The files are in the .res format (Arbin).
The easiest way to load data is to use the cellpy.get method:
import cellpy
electrode_mass = 0.658 # active mass of electrode in mg
file_name = "20170101_ife01_cc_01.res"
cell_data = cellpy.get(file_name, mass=electrode_mass, cycle_mode="anode")
Note
Even though the CellpyCell object in the example above got the name cell_data, it is more common to simply name it c (i.e. c = cellpy.get(...)). Similarly, the cellpy naming convention for cellpy.utils.Batch objects is to name them b (i.e. b = batch.init(...), assuming that the batch module was imported somewhere earlier in the code).
If you prefer, you can obtain the same result by using the cellpy.cellreader.CellpyCell object directly. However, we recommend using the cellpy.get method. But just in case you want to know how to do it…
First, import the cellreader-object from cellpy:
import os
from cellpy import cellreader
Then define some settings and variables and create the CellpyCell-object:
raw_data_dir = r"C:\raw_data"
out_data_dir = r"C:\processed_data"
cellpy_data_dir = r"C:\CellpyCell"
cycle_mode = "anode" # default is usually "anode", but...
# These can also be set in the configuration file
electrode_mass = 0.658 # active mass of electrode in mg
# list of files to read (Arbin .res type):
raw_file = ["20170101_ife01_cc_01.res", "20170101_ife01_cc_02.res"]
# the second file is a 'continuation' of the first file...
# list consisting of file names with full path
raw_files = [os.path.join(raw_data_dir, f) for f in raw_file]
# creating the CellpyCell object and set the cycle mode:
cell_data = cellreader.CellpyCell()
cell_data.cycle_mode = cycle_mode
Now we will read the files, merge them, and create a summary:
# if the files are given in a list, they are automatically merged
# (raw_files is already a list, so it is passed on as it is):
cell_data.from_raw(raw_files)
cell_data.set_mass(electrode_mass)
cell_data.make_summary()
# Note: make_summary will automatically run the
# make_step_table function if it does not exist.
Save / export data
When you have loaded your data and created your CellpyCell object, it is time to save everything in the cellpy-format:
# defining a name for the cellpy_file (hdf5-format)
cellpy_data_dir = r"C:\cellpy_data\cellpy_files"
cellpy_file = os.path.join(cellpy_data_dir, "20170101_ife01_cc2.h5")
cell_data.save(cellpy_file)
The cellpy format is much faster to load than the raw-file formats typically encountered. It also includes the summary and step-tables, and it is easy to add more data to the file later on.
To export data to csv format, CellpyCell has a method called to_csv.
# export data to csv
out_data_directory = r"C:\processed_data\csv"
# this exports the summary data to a .csv file:
cell_data.to_csv(out_data_directory, sep=";", cycles=False, raw=False)
# export also the current voltage cycles by setting cycles=True
# export also the raw data by setting raw=True
Note
CellpyCell objects store the data (including the summary and step-tables) in pandas DataFrames. This means that you can easily export the data to other formats, such as Excel, by using the to_excel method of the DataFrame object. In addition, CellpyCell objects have a method called to_excel that exports the data to an Excel file.
More about the cellpy.get method
Note
This chapter would benefit from some more love and care. Any help on that would be highly appreciated.
The following keyword arguments are currently supported by cellpy.get:
# from the docstring:
Args:
filename (str, os.PathLike, OtherPath, or list of raw-file names): path to file(s) to load
instrument (str): instrument to use (defaults to the one in your cellpy config file)
instrument_file (str or path): yaml file for custom file type
cellpy_file (str, os.PathLike, OtherPath): if both filename (a raw-file) and cellpy_file (a cellpy file)
is provided, cellpy will try to check if the raw-file is has been updated since the
creation of the cellpy-file and select this instead of the raw file if cellpy thinks
they are similar (use with care!).
logging_mode (str): "INFO" or "DEBUG"
cycle_mode (str): the cycle mode (e.g. "anode" or "full_cell")
mass (float): mass of active material (mg) (defaults to mass given in cellpy-file or 1.0)
nominal_capacity (float): nominal capacity for the cell (e.g. used for finding C-rates)
loading (float): loading in units [mass] / [area]
area (float): active electrode area (e.g. used for finding the areal capacity)
estimate_area (bool): calculate area from loading if given (defaults to True)
auto_pick_cellpy_format (bool): decide if it is a cellpy-file based on suffix.
auto_summary (bool): (re-) create summary.
units (dict): update cellpy units (used after the file is loaded, e.g. when creating summary).
step_kwargs (dict): sent to make_steps
summary_kwargs (dict): sent to make_summary
selector (dict): passed to load (when loading cellpy-files).
testing (bool): set to True if testing (will for example prevent making .log files)
**kwargs: sent to the loader
Reading a cellpy file:
c = cellpy.get("my_cellpyfile.cellpy")
# or
c = cellpy.get("my_cellpyfile.h5")
Reading anode half-cell data from arbin sql:
c = cellpy.get("my_cellpyfile", instrument="arbin_sql", cycle_mode="anode")
# Remark! if sql prms are not set in your config-file you have to set them manually (e.g. setting values in
# prms.Instruments.Arbin.VAR)
Reading data obtained by exporting csv from arbin sql using non-default delimiter sign:
c = cellpy.get("my_cellpyfile.csv", instrument="arbin_sql_csv", sep=";")
Reading data obtained by exporting a csv file from Maccor using a sub-model (this example uses one of the models already available inside cellpy):
c = cellpy.get(filename="name.txt", instrument="maccor_txt", model="one", mass=1.0)
Reading csv file using the custom loader where the format definitions are given in a user-supplied yaml-file:
c = cellpy.get(filename="name.txt", instrument_file="my_custom_file_format.yml")
If you specify both the raw file name(s) and the cellpy file name to cellpy.get, you can let cellpy decide whether to load directly from the raw file(s) or to use the cellpy-file instead. cellpy will check if the raw file(s) have been updated since the last time you saved the cellpy file; if not, it will load the cellpy file instead (this is usually much faster than loading the raw file(s)).
You can also input the masses and enforce that it creates a summary automatically.
cell_data = cellpy.get(
    filename=raw_files,
    cellpy_file=cellpy_file,
    mass=electrode_mass,
    auto_summary=True,
)
if not cell_data.check():
print("Could not load the data")
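The idea of preferring the cellpy-file unless the raw files have changed can be sketched as follows. This is only an illustration that compares modification times; cellpy has its own check, which may use other criteria such as file size. The function name raw_is_newer and the file names are made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Union

PathLike = Union[str, Path]


def raw_is_newer(raw_files: Iterable[PathLike], cellpy_file: PathLike) -> bool:
    """Return True if the raw files should be (re)loaded."""
    cellpy_file = Path(cellpy_file)
    if not cellpy_file.is_file():
        return True  # nothing saved yet, so the raw files must be read
    saved = cellpy_file.stat().st_mtime
    # any raw file modified after the cellpy file was saved triggers a reload
    return any(Path(f).stat().st_mtime > saved for f in raw_files)


# small demonstration with temporary (empty) stand-in files
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "run_01.res"
    raw.write_text("raw data")
    print(raw_is_newer([raw], Path(tmp) / "run_01.h5"))  # no cellpy file yet
```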
Working with external files
To work with external files you will need to set some environment variables. This can most easily be done by creating a file called .env_cellpy in your user directory (e.g. C:\Users\jepe):
# content of .env_cellpy
CELLPY_PASSWORD=1234
CELLPY_KEY_FILENAME=C:\\Users\\jepe\\.ssh\\id_key
CELLPY_HOST=myhost.com
CELLPY_USER=jepe
You can then load the file using the cellpy.get method by providing the full path to the file, including the protocol (e.g. scp://) and the user name and host (e.g. jepe@myhost.com):
# assuming appropriate ``.env_cellpy`` file is present
raw_file = "scp://jepe@myhost.com/path/to/file.txt"
c = cellpy.get(filename=raw_file, instrument="maccor_txt", model="one", mass=1.0)
cellpy will automatically download the file to a temporary directory and read it.
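If you are wondering how a file like .env_cellpy is structured, it is simply a list of KEY=VALUE lines. The sketch below parses such text with plain Python. It is not how cellpy reads the file, it returns the values exactly as written (no handling of escaped backslashes or quotes), and the function name parse_env_text is made up for this example.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


example = """
# content of .env_cellpy
CELLPY_HOST=myhost.com
CELLPY_USER=jepe
"""
settings = parse_env_text(example)
print(settings["CELLPY_USER"], settings["CELLPY_HOST"])
```

Since the file can contain a password, it is wise to keep it out of shared folders and version control.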
Save / export data
Saving data to cellpy format is done by the CellpyCell.save
method.
Stuff that you might want to do with cellpy
Note
This chapter would benefit from some more love and care. Any help on that would be highly appreciated.
A more or less random collection of things that you might want to do with cellpy. This is not a tutorial, but rather a collection of examples.
Extract current-voltage graphs
If you have loaded your data into a CellpyCell-object, let’s now consider how to extract current-voltage graphs from your data. We assume that the name of your CellpyCell-object is cell_data:
cycle_number = 5
charge_capacity, charge_voltage = cell_data.get_ccap(cycle_number)
discharge_capacity, discharge_voltage = cell_data.get_dcap(cycle_number)
You can also get the capacity-voltage curves with both charge and discharge:
capacity, charge_voltage = cell_data.get_cap(cycle_number)
# the second capacity (charge (delithiation) for typical anode half-cell experiments)
# will be given "in reverse".
The CellpyCell object has several get-methods, including getting current, timestamps, etc.
Extract summaries of runs
Summaries of runs include data per cycle for your data set. Examples of summary data are charge- and discharge-values, coulombic efficiencies and internal resistances. These are calculated by the make_summary method.
Note that not all the possible summary statistics are calculated by default. This means that you might have to re-run the make_summary method with appropriate parameters as input (e.g. normalization_cycle, to give the appropriate cycle numbers to use for finding nominal capacity).
Another method is responsible for investigating the individual steps in the data (make_step_table). It is typically run automatically before creating the summaries (since the summary creation depends on the step_table). This table is interesting in itself since it contains delta, minimum, maximum and average values for the measured values per step. This is used to find out what type of step it is, e.g. a charge-step or maybe an ocv-step. It is possible to provide information to this function if you already know what kind of step each step is. This saves cellpy a lot of work.
Note that the default is to calculate values for each unique (step-number - cycle-number) pair. For some experiments, a step can be repeated many times per cycle. If you need, for example, average values of the voltage for each step (for example if you are doing GITT experiments), you need to tell make_step_table that it should calculate for all the steps (all_steps=True).
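To illustrate the kind of numbers a step table contains, here is a small sketch that groups measurements by (cycle-number, step-number) pair and computes minimum, maximum, average and delta for each group. It is not the make_step_table implementation; the function name step_statistics, the definition of delta (last value minus first value) and the data are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def step_statistics(
    rows: Iterable[Tuple[int, int, float]]
) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Group (cycle, step, voltage) rows and summarise each group."""
    groups = defaultdict(list)
    for cycle, step, voltage in rows:
        groups[(cycle, step)].append(voltage)
    stats = {}
    for key, values in groups.items():
        stats[key] = {
            "min": min(values),
            "max": max(values),
            "avg": sum(values) / len(values),
            "delta": values[-1] - values[0],  # last minus first value
        }
    return stats


# made-up measurements: (cycle number, step number, voltage)
rows = [(1, 1, 1.0), (1, 1, 0.5), (1, 2, 0.5), (1, 2, 1.5), (1, 2, 2.5)]
for key, values in step_statistics(rows).items():
    print(key, values)
```

Notice that a step that is repeated several times within one cycle ends up in a single group here, which is exactly the situation where you would want statistics for all the individual steps instead.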
Create dQ/dV plots
The methods for creating incremental capacity curves are located in the cellpy.utils.ica module (Extracting ica data).
Do some plotting
The plotting methods are located in the cellpy.utils.plotting module (Have a look at the data).
What else?
There are many things you can do with cellpy. The idea is that you should be able to use cellpy as a tool to do your own analysis. This means that you need to know a little bit about python and how to use the different modules. It is not difficult, but it requires some playing around and maybe reading some of the source code. Let’s keep our fingers crossed and hope that the documentation will be improved in the future.
Why not just try out the highly popular (?) cellpy.utils.batch utility? You will need to make (or copy from a friend) the “database” (an excel-file with appropriate headers in the first row) and make sure that all the paths are set up correctly in your cellpy configuration file. Then you can process many cells in one go. And compare them.
Or, for example: If you would like to do some interactive plotting of your data, try to install plotly and use Jupyter Lab to make some fancy plots and dash-boards.
And why not: make a script that goes through all your thousands of measured cells, extracts the life-time (e.g. number of cycles until the capacity has dropped below 80% of the average of the three first cycles), and plot this versus time the cell was put. And maybe color the data-points based on who was doing the experiment?
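The life-time criterion mentioned above can be sketched in a few lines of plain Python. The function name cycle_life and the capacities are made up for the example; with real data you would take the capacities from the summary of each cell.

```python
from typing import Optional, Sequence


def cycle_life(
    capacities: Sequence[float], fraction: float = 0.8, n_reference: int = 3
) -> Optional[int]:
    """Return the first cycle number (1-based) below the capacity limit."""
    if len(capacities) < n_reference:
        raise ValueError("not enough cycles to compute the reference capacity")
    # reference capacity: average of the first n_reference cycles
    reference = sum(capacities[:n_reference]) / n_reference
    limit = fraction * reference
    for cycle_number, capacity in enumerate(capacities, start=1):
        if cycle_number > n_reference and capacity < limit:
            return cycle_number
    return None  # the cell never dropped below the limit


# made-up capacities per cycle
print(cycle_life([100.0, 100.0, 100.0, 95.0, 85.0, 79.0, 70.0]))
```

The function returns None for cells that never reach the limit, so remember to handle those cells separately when plotting.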
Configuring cellpy
How the configuration parameters are set and read
When cellpy is imported, a default set of parameters is set. Then it tries to read the parameters from your .conf-file (located in your user directory). If successful, the parameters set in your .conf-file will override the defaults.
The parameters are stored in the module cellpy.parameters.prms.
If you would like to change some of the settings during your script (or in your jupyter notebook), e.g. if you want to use the cycle_mode option “cathode” instead of the default “anode”, then import the prms module and set new values:
from cellpy.parameters import prms
# Changing cycle_mode to cathode
prms.Reader.cycle_mode = 'cathode'
# Changing delimiter to ',' (used when saving .csv files)
prms.Reader.sep = ','
# Changing the default folder for processed (output) data
prms.Paths.outdatadir = 'experiment01/processed_data'
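The order of precedence described above (defaults first, then values from the .conf-file on top) can be illustrated with ordinary dictionaries. This is only a sketch of the principle, not how cellpy stores or merges its parameters; the function name merge_parameters is made up for this example.

```python
from typing import Any, Dict


def merge_parameters(
    defaults: Dict[str, Dict[str, Any]], overrides: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Return defaults updated with overrides, section by section."""
    # copy each section so that the defaults are left untouched
    merged = {section: dict(values) for section, values in defaults.items()}
    for section, values in overrides.items():
        merged.setdefault(section, {}).update(values)
    return merged


defaults = {"Reader": {"cycle_mode": "anode", "sep": ";"}}
from_conf_file = {"Reader": {"cycle_mode": "cathode"}}
print(merge_parameters(defaults, from_conf_file))
```

Settings that are not mentioned in the overrides (here sep) keep their default values.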
The configuration file
cellpy tries to read your .conf-file when imported the first time, and looks in your user directory for files named .cellpy_prms_SOMENAME.conf.
If you have run cellpy setup in the cmd window or in the shell, the configuration file will be placed in the appropriate place. It will have the name .cellpy_prms_USERNAME.conf (where USERNAME is your username).
The configuration file is a YAML-file and it is reasonably easy to read and edit (but remember that YAML is rather strict with regards to spaces and indentations).
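If you are curious about which configuration files are present in your user directory, the sketch below lists the files that follow the naming pattern described above. It only looks for files; which one cellpy actually picks is reported by cellpy info -l. The function name find_config_files is made up for this example.

```python
from pathlib import Path
from typing import List, Optional


def find_config_files(directory: Optional[Path] = None) -> List[Path]:
    """List files matching .cellpy_prms_*.conf (default: home directory)."""
    directory = Path.home() if directory is None else Path(directory)
    return sorted(directory.glob(".cellpy_prms_*.conf"))


for path in find_config_files():
    print(path)
```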
As an example, here are the first lines from one of the authors’ configuration file:
---
Paths:
  outdatadir: C:\scripts\processing_cellpy\out
  rawdatadir: I:\Org\MPT-BAT-LAB\Arbin-data
  cellpydatadir: C:\scripts\processing_cellpy\cellpyfiles
  db_path: C:\scripts\processing_cellpy\db
  filelogdir: C:\scripts\processing_cellpy\logs
  examplesdir: C:\scripts\processing_cellpy\examples
  notebookdir: C:\scripts\processing_cellpy\notebooks
  templatedir: C:\scripting\processing_cellpy\templates
  batchfiledir: C:\scripts\processing_cellpy\batchfiles
  db_filename: 2023_Cell_Analysis_db_001.xlsx
  env_file: .env_cellpy
FileNames:
  file_name_format: YYYYMMDD_[NAME]EEE_CC_TT_RR
The first part contains definitions of the different paths, files and file-patterns that cellpy will use. This is the place where you most likely will have to do some edits sometime.
The next part contains definitions required when using a database:
# settings related to the db used in the batch routine
Db:
  db_type: simple_excel_reader
  db_table_name: db_table
  db_header_row: 0
  db_unit_row: 1
  db_data_start_row: 2
  db_search_start_row: 2
  db_search_end_row: -1
# definitions of headers for the simple_excel_reader
DbCols:
  id:
  - id
  - int
  exists:
  - exists
  - bol
  batch:
  - batch
  - str
  sub_batch_01:
  - b01
  - str
.
.
This part is rather long (since it needs to define the column names used in the db excel sheet).
The next part contains settings regarding your dataset and the cellreader, as well as for the different instruments. At the bottom you will find the settings for the batch utility.
# settings related to your data
DataSet:
  nom_cap: 3579
# settings related to the reader
Reader:
  diagnostics: false
  filestatuschecker: size
  force_step_table_creation: true
  force_all: false
  sep: ;
  cycle_mode: anode
  sorted_data: true
  select_minimal: false
  limit_loaded_cycles:
  ensure_step_table: false
  voltage_interpolation_step: 0.01
  time_interpolation_step: 10.0
  capacity_interpolation_step: 2.0
  use_cellpy_stat_file: false
  auto_dirs: true
# settings related to the instrument loader
# (each instrument can have its own set of settings)
Instruments:
  tester: arbin
  custom_instrument_definitions_file:
  Arbin:
    max_res_filesize: 1000000000
    chunk_size:
    max_chunks:
    use_subprocess: false
    detect_subprocess_need: false
    sub_process_path:
    office_version: 64bit
    SQL_server: localhost
    SQL_UID:
    SQL_PWD:
    SQL_Driver: ODBC Driver 17 for SQL Server
    odbc_driver:
  Maccor:
    default_model: one
# settings related to running the batch procedure
Batch:
  fig_extension: png
  backend: bokeh
  notebook: true
  dpi: 300
  markersize: 4
  symbol_label: simple
  color_style_label: seaborn-deep
  figure_type: unlimited
  summary_plot_width: 900
  summary_plot_height: 800
  summary_plot_height_fractions:
  - 0.2
  - 0.5
  - 0.3
...
As you can see, the author of this particular file most likely works with silicon as anode material for lithium-ion batteries (the nom_cap is set to 3579 mAh/g, i.e. the theoretical gravimetric lithium capacity for silicon at normal temperatures) and is using Windows.
By the way, if you are wondering what the ‘.’ means… it means nothing - it was just something I added in this tutorial text to indicate that there is more stuff in the actual file than what is shown here.
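The nom_cap value of 3579 mAh/g can be checked with Faraday's law. The sketch below assumes that the fully lithiated phase is Li15Si4 (3.75 Li per Si), the phase usually quoted for lithiation at room temperature. That assumption and the variable names are not taken from the configuration file.

```python
# Theoretical gravimetric capacity of silicon, assuming Li15Si4 as the end phase.
FARADAY_CONSTANT = 96485.33212  # C/mol
MOLAR_MASS_SI = 28.0855  # g/mol
LI_PER_SI = 15 / 4  # assumption: Li15Si4

# 1 mAh = 3.6 C, so dividing by 3.6 converts C/g to mAh/g
capacity_mah_per_g = LI_PER_SI * FARADAY_CONSTANT / (3.6 * MOLAR_MASS_SI)
print(f"{capacity_mah_per_g:.0f} mAh/g")  # 3579 mAh/g
```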
Working with the pandas.DataFrame objects directly
Note
This chapter would benefit from some more love and care. Any help on that would be highly appreciated.
The CellpyCell object stores the data in several pandas.DataFrame objects.
The easiest way to get to the DataFrames is by the following procedure:
# Assumed name of the CellpyCell object: c
# get the 'test':
data = c.data
# data is now a cellpy Data object (cellpy.readers.cellreader.Data)
# pandas.DataFrame with data vs cycle number (coulombic efficiency, charge-capacity etc.):
summary_data = data.summary
# you could also get the summary data by:
summary_data = c.data.summary
# pandas.DataFrame with the raw data:
raw_data = data.raw
# pandas.DataFrame with statistics on each step and info about step type:
step_info = data.steps
You can then manipulate your data with the standard pandas.DataFrame methods (and pandas methods in general).
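As a small illustration, the sketch below applies ordinary pandas operations to a stand-in summary DataFrame. The column names (`cycle_index`, `charge_capacity`, `discharge_capacity`) and the numbers are invented for the example; check `c.data.summary.columns` for the names used by your cellpy version.

```python
import pandas as pd

# Stand-in for c.data.summary (invented column names and values)
summary_data = pd.DataFrame(
    {
        "cycle_index": [1, 2, 3, 4],
        "charge_capacity": [1000.0, 980.0, 975.0, 970.0],
        "discharge_capacity": [1200.0, 1000.0, 985.0, 978.0],
    }
).set_index("cycle_index")

# Add a derived column: capacity retention relative to the first cycle (in %)
first_cycle_capacity = summary_data["charge_capacity"].iloc[0]
summary_data["retention"] = 100 * summary_data["charge_capacity"] / first_cycle_capacity

# Select all cycles after the first one (label-based slicing on the index)
later_cycles = summary_data.loc[2:]

print(later_cycles[["charge_capacity", "retention"]])
```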
Happy pandas-ing!
Data mining / using a database
Note
This chapter would benefit from some more love and care. Any help on that would be highly appreciated.
One important motivation for developing the cellpy project is to facilitate
handling many cell testing experiments within a reasonable time and with a
“tunable” degree of automation. It is therefore convenient to be able to
couple the meta-data (look-up) to some kind of database, and to save the
extracted key parameters to either the same or another database
(I recommend the latter). The database(s) will be a valuable asset for
further data analyses (either using statistical methods, e.g. Bayesian
modelling, or as input to machine learning algorithms, for example deep
learning using CNNs).
Meta-data database
TODO.
Parameters and feature extraction
TODO.
Bayesian modelling
TODO.
Example: reinforcement deep learning (resnet)
TODO.
The cellpy command
To assist in using cellpy more efficiently, a set of routines is available from the command line by issuing the cellpy command at the shell (or in the cmd window).
$ cellpy
Usage: cellpy [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
edit Edit your cellpy config or database files.
info This will give you some valuable information about your cellpy.
new Set up a batch experiment (might need git installed).
pull Download examples or tests from the big internet (needs git).
run Run a cellpy process (e.g.
serve Start a Jupyter server.
setup This will help you to set up cellpy.
The cli is still under development (cli stands for command-line interface, by the way).
Both the cellpy new and the cellpy serve commands worked the last time I tried them, but they might not work on your computer. If you run into problems, let us know.
Information
A couple of commands are implemented to get some information about your cellpy environment (currently your cellpy version and the location of your configuration file):
$ cellpy info --version
[cellpy] version: 0.4.1
$ cellpy info --configloc
[cellpy] -> C:\Users\jepe\.cellpy_prms_jepe.conf
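If you want to pick up this information from a Python script, one option is to call the command through `subprocess`. The helper below is only a sketch (the function name `cellpy_info` is invented here); it returns None when the cellpy command is not found on the PATH.

```python
import shutil
import subprocess


def cellpy_info(switch="--version"):
    """Run `cellpy info <switch>` and return its output (None if cellpy is not found)."""
    if shutil.which("cellpy") is None:
        return None
    completed = subprocess.run(
        ["cellpy", "info", switch],
        capture_output=True,
        text=True,
        check=False,
    )
    return completed.stdout.strip()


if __name__ == "__main__":
    print(cellpy_info("--version"))
    print(cellpy_info("--configloc"))
```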
Setting up cellpy from the cli
To get the most out of cellpy, it is best to set it up properly. To help with this, you can use the setup command. If you include the --interactive switch, you will be prompted for your preferred location for the different folders / directories cellpy uses (it will still work without them, though).
$ cellpy setup --interactive
The command will create a starting cellpy configuration file (.cellpy_prms_USERNAME.conf)
or update it if it exists, and create the following directory structure:
batchfiles/
cellpyfiles/
db/
examples/
instruments/
logs/
notebooks/
out/
raw/
templates/
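The layout above can be pictured with a few lines of Python. This is only an illustration of the resulting folder structure, not what cellpy setup actually executes; the function name and the `dry_run` argument are invented for the sketch.

```python
from pathlib import Path

# Folder names taken from the directory structure listed above
CELLPY_FOLDERS = [
    "batchfiles", "cellpyfiles", "db", "examples", "instruments",
    "logs", "notebooks", "out", "raw", "templates",
]


def create_folder_structure(root_dir, dry_run=True):
    """Create (or, in dry-run mode, only list) the folders under root_dir."""
    root_dir = Path(root_dir)
    folders = [root_dir / name for name in CELLPY_FOLDERS]
    for folder in folders:
        if dry_run:
            print(f"would create {folder}")
        else:
            folder.mkdir(parents=True, exist_ok=True)
    return folders
```

Calling `create_folder_structure("my_cellpy_root")` with the default `dry_run=True` only prints the paths, which mirrors the idea behind the --dry-run switch of the setup command.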
Note
It is recommended to rerun setup each time you update cellpy.
Note
You can get help for each sub-command by turning on the --help switch.
For example, for setup:
$ cellpy setup --help
You will then get some more detailed information on the different switches you have at your disposal:
Usage: cellpy setup [OPTIONS]
This will help you to setup cellpy.
Options:
-i, --interactive Allows you to specify div. folders and setting.
-nr, --not-relative If root-dir is given, put it directly in the root
(/) folder i.e. do not put it in your home directory.
Defaults to False. Remark that if you specifically
write a path name instead of selecting the suggested
default, the path you write will be used as is.
-dr, --dry-run Run setup in dry mode (only print - do not execute).
This is typically used when developing and testing
cellpy. Defaults to False.
-r, --reset Do not suggest path defaults based on your current
configuration-file
-d, --root-dir PATH Use custom root dir. If not given, your home
directory will be used as the top level where
cellpy-folders will be put. The folder path must
follow directly after this option (if used).
Example: $ cellpy setup -d 'MyDir'
-n, --folder-name PATH
-t, --testuser TEXT Fake name for fake user (for testing)
--help Show this message and exit.
The cellpy templating system
If you are performing the same type of data processing for many cells, and possibly many times, it is beneficial to start out with a template.
Currently, cellpy provides a template system defaulting to a set of Jupyter notebooks and a folder structure where the code is based on the batch utility (cellpy.utils.batch).
The templates are pulled from the cellpy_templates repository. It uses cookiecutter under the hood (and therefore needs git installed).
This repository contains several template sets. The default is named standard, but you can set another default in your configuration file.
You can also make your own templates and store them locally on your computer (in the templates directory). A template must be a zip file whose name starts with “cellpy_template” and ends with “.zip”.
$ cellpy new --help
Usage: cellpy new [OPTIONS]
Set up a batch experiment (might need git installed).
Options:
-t, --template TEXT Provide template name.
-d, --directory TEXT Create in custom directory.
-p, --project TEXT Provide project name (i.e. sub-directory name).
-e, --experiment TEXT Provide experiment name (i.e. lookup-value).
-u, --local-user-template Use local template from the templates directory.
-s, --serve Run Jupyter.
-r, --run Use PaperMill to run the notebook(s) from the
template (will only work properly if the
notebooks can be sorted in correct run-order by
'sorted'.
-j, --lab Use Jupyter Lab instead of Notebook when serving.
-l, --list List available templates and exit.
--help Show this message and exit.
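To see which files in your templates directory follow the naming rule for local templates (a zip file starting with “cellpy_template” and ending with “.zip”), you could list them with a small helper. The function name is an assumption made for this sketch, and cellpy's own lookup may apply further rules.

```python
from pathlib import Path


def list_local_templates(templates_dir):
    """Return zip files in templates_dir that follow the cellpy_template*.zip naming rule."""
    return sorted(Path(templates_dir).glob("cellpy_template*.zip"))
```

The command cellpy new --list remains the authoritative way to see the templates cellpy itself can find.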
Automatically running batches
The run command is used for running the appropriate editor for your database, and for running (processing) files in batches.
$ cellpy run --help
Usage: cellpy run [OPTIONS] [NAME]
Run a cellpy process (batch-job, edit db, ...).
You can use this to launch specific applications.
Examples:
edit your cellpy database
cellpy run db
run a batch job described in a journal file
cellpy run -j my_experiment.json
Options:
-j, --journal Run a batch job defined in the given journal-file
-k, --key Run a batch job defined by batch-name
-f, --folder Run all batch jobs iteratively in a given folder
-p, --cellpy-project Use PaperMill to run the notebook(s) within the given
project folder (will only work properly if the
notebooks can be sorted in correct run-order by
'sorted'). Warning! since we are using `click` - the
NAME will be 'converted' when it is loaded (same as
print(name) does) - so you can't use backslash ('\')
as normal in windows (use either '/' or '\\' instead).
-d, --debug Run in debug mode.
-s, --silent Run in silent mode.
--raw Force loading raw-file(s).
--cellpyfile Force cellpy-file(s).
--minimal Minimal processing.
--nom-cap FLOAT nominal capacity (used in calculating rates etc)
--batch_col TEXT batch column (if selecting running from db)
--project TEXT name of the project (if selecting running from db)
-l, --list List batch-files.
--help Show this message and exit.
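The warning about backslashes for the -p option can be handled by converting a Windows path to forward slashes before passing it on. A minimal sketch using `pathlib` (the path is a made-up example):

```python
from pathlib import PureWindowsPath

# Made-up project folder, written as a raw string so the backslashes are kept
project_dir = PureWindowsPath(r"C:\scripts\processing_cellpy\my_project")

# as_posix() returns the same path with forward slashes
safe_project_dir = project_dir.as_posix()
print(safe_project_dir)  # C:/scripts/processing_cellpy/my_project
```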