Firaz Zakariya

Lithium-ion cells lose capacity as they cycle. For EVs, knowing how much capacity remains, and when a pack will fall below the threshold that makes a vehicle unusable, matters for resale value, warranty pricing, and second-life battery markets. This project is a reproducible benchmark of four model families on that prediction task.

Problem

Existing comparisons in the literature are hard to replicate: different datasets, different train/test splits, different error metrics. Before choosing a model architecture for a production system, I wanted an honest, apples-to-apples comparison on real driving profiles rather than lab constant-current cycling.

Approach

Data. The benchmark will use public EV field cycling data with documented provenance and reproducible train/test splits. Each record carries cycle-level signals: voltage, current, temperature, state of charge over time, and measured capacity. Dataset selection prioritises real driving profiles over lab constant-current cycling, so the comparison reflects conditions a production system would face.

Four model families compared:

Statistical baseline: exponential decay or linear regression on hand-crafted cycle features. Interpretable and fast; the baseline everything else must beat.
Gradient-boosted (LightGBM): hand-crafted features per cycle (capacity fade rate, internal resistance estimate, discharge curve statistics), trained with grouped cross-validation by cell to prevent leakage.
LSTM: sequence model on raw voltage/current curves within each cycle, with a fixed input window per cycle and a standard sequence-to-scalar training setup.
Physics-informed: a model that encodes degradation priors (e.g. Arrhenius temperature dependence) as constraints, so predictions respect known electrochemical behaviour rather than fitting curves in a vacuum.
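The grouped cross-validation mentioned for the gradient-boosted model can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual split code: the function name `grouped_folds`, the cell identifiers, and the round-robin fold assignment are all assumptions made for the example. The point it shows is that every record from a given cell lands on one side of the split only, so a model never sees cycles from a cell it is later tested on.

```python
from collections import defaultdict


def grouped_folds(cell_ids, n_folds=3):
    """Yield (train_idx, test_idx) pairs where no cell appears on both sides.

    Hypothetical helper for illustration; cell_ids[i] is the cell that
    record i was measured on.
    """
    # Collect the row indices belonging to each cell.
    rows_by_cell = defaultdict(list)
    for idx, cell in enumerate(cell_ids):
        rows_by_cell[cell].append(idx)

    # Assign whole cells to folds round-robin, in sorted order so the
    # split is reproducible from run to run.
    cells = sorted(rows_by_cell)
    if n_folds > len(cells):
        raise ValueError("need at least one cell per fold")
    fold_cells = [cells[k::n_folds] for k in range(n_folds)]

    for held_out in fold_cells:
        held = set(held_out)
        test_idx = [i for c in held_out for i in rows_by_cell[c]]
        train_idx = [i for c in cells if c not in held for i in rows_by_cell[c]]
        yield sorted(train_idx), sorted(test_idx)


# Made-up example: six cycle records drawn from three cells.
cell_ids = ["cell_a", "cell_a", "cell_b", "cell_b", "cell_c", "cell_c"]
folds = list(grouped_folds(cell_ids, n_folds=3))

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

In practice a library splitter such as scikit-learn's `GroupKFold` does the same job; the hand-rolled version here only makes the leakage rule explicit.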

Evaluation. Models are scored with prognostic metrics: RMSE on remaining useful life (RUL), relative error in the predicted end-of-life point, and cross-dataset generalisation (train on one dataset, evaluate on another).
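As a rough sketch of the first two metrics, the snippet below computes RMSE over remaining-useful-life predictions and a relative error on the end-of-life cycle. The exact definition of the end-of-life error is an assumption here (absolute difference between predicted and true end-of-life cycle, divided by the true value), as are the function names and the example numbers, which are invented purely to exercise the code.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def relative_eol_error(true_eol_cycle, predicted_eol_cycle):
    """Assumed definition: |predicted - true| / true, in cycles."""
    if true_eol_cycle <= 0:
        raise ValueError("true end-of-life cycle must be positive")
    return abs(predicted_eol_cycle - true_eol_cycle) / true_eol_cycle


# Invented numbers, in cycles, for illustration only.
true_rul = [100, 80, 60]
pred_rul = [110, 70, 60]

print(round(rmse(true_rul, pred_rul), 3))
print(relative_eol_error(500, 450))
```

Cross-dataset generalisation reuses the same functions: fit on one dataset, then compute the metrics on predictions for another.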

FastAPI. The best-performing model will be served via a small REST API: POST a cell’s recent cycle data, get back a predicted RUL and confidence interval.
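One possible shape for that request and response is sketched below using only the standard library, so it runs without a web framework. Every field name (`cycle_index`, `mean_voltage_v`, `rul_cycles`, `ci_lower`, and so on) and both helper functions are hypothetical; the real service may summarise cycles differently or return the interval in another form. The prediction values are passed in by the caller, since no model is included here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CycleRecord:
    """Hypothetical per-cycle summary sent by the client."""
    cycle_index: int
    mean_voltage_v: float
    mean_current_a: float
    mean_temperature_c: float
    capacity_ah: float


@dataclass
class RulPrediction:
    """Hypothetical response: point estimate plus interval bounds, in cycles."""
    rul_cycles: float
    ci_lower: float
    ci_upper: float


def parse_request(body):
    """Decode a JSON request body into a list of CycleRecord objects."""
    payload = json.loads(body)
    return [CycleRecord(**cycle) for cycle in payload["cycles"]]


def build_response(rul_cycles, half_width):
    """Encode a prediction as JSON; the lower bound is clipped at zero."""
    prediction = RulPrediction(
        rul_cycles=rul_cycles,
        ci_lower=max(0.0, rul_cycles - half_width),
        ci_upper=rul_cycles + half_width,
    )
    return json.dumps(asdict(prediction))


# Example round trip with invented values.
request_body = json.dumps({
    "cycles": [{
        "cycle_index": 1,
        "mean_voltage_v": 3.7,
        "mean_current_a": -1.2,
        "mean_temperature_c": 25.0,
        "capacity_ah": 2.9,
    }]
})
cycles = parse_request(request_body)
print(build_response(rul_cycles=120.0, half_width=15.0))
```

In a FastAPI service these dataclasses would typically become request and response models on a single POST route, with validation handled by the framework.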

Results

The benchmark has not been run yet. When complete, results will report in-distribution RMSE on remaining useful life, cross-dataset RMSE (train on one source, evaluate on another), and inference latency per model family. The narrative will focus on which architecture wins where, and what the generalisation gap says about deploying any of these in production.

Code

A public repository with data preprocessing, training scripts, evaluation notebooks, and the optional FastAPI service will follow once the benchmark is complete.