Automate recurring end-to-end ML training pipelines to enable Regression Testing and continuous performance improvement.
Regression Testing in ML is very similar to functional testing in software. Developers establish a dataset of test scenarios and continuously evaluate each newly trained model against them to ensure that no regressions are introduced.
Every time a code change is pushed (feature code, data processing code, training code, sampling, evaluation, etc.), the end-to-end training pipeline is triggered against the reference dataset. If all scenarios pass, the model is safe to ship.
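The gate described above can be sketched in plain Python. This is a minimal, hypothetical example, not any framework's API: the scenario format, the `tolerance` parameter, and the baseline scores are all illustrative assumptions.

```python
# Hypothetical regression-test gate for a retrained model.
# Scenario format, baseline scores, and tolerance are illustrative assumptions.

def evaluate(model, inputs, labels):
    """Accuracy of a callable model on one test scenario."""
    predictions = [model(x) for x in inputs]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)

def passes_regression_suite(model, scenarios, baseline_scores, tolerance=0.01):
    """True only if the new model matches or beats the baseline on every scenario."""
    for name, (inputs, labels) in scenarios.items():
        score = evaluate(model, inputs, labels)
        if score < baseline_scores[name] - tolerance:
            print(f"Regression in scenario '{name}': {score:.2f} < {baseline_scores[name]:.2f}")
            return False
    return True

# Toy example: a threshold "model" checked against two reference scenarios.
model = lambda x: int(x > 0.5)
scenarios = {
    "easy": ([0.1, 0.9, 0.2, 0.8], [0, 1, 0, 1]),
    "hard": ([0.49, 0.51], [0, 1]),
}
baseline_scores = {"easy": 1.0, "hard": 1.0}
print(passes_regression_suite(model, scenarios, baseline_scores))  # True: safe to ship
```

In CI, a check like this would run as the final pipeline step, failing the build (and blocking the ship) whenever any scenario falls below its recorded baseline.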
In an ideal world, incorrect predictions can easily be identified, labeled, and added to the training dataset for the model to learn from them. In reality, identifying incorrect inferences can be challenging due to the absence of ground truth.
Some systems collect human feedback; others rely on auto-labeling or heuristic error mining.
These failed examples must be identified, labeled, and added to the canonical training dataset so that the model can learn the long tail of infrequent events that were under-represented in the original training data.