Simple, secure, and reproducible packaging for AI/ML projects
Moving a model from a Jupyter notebook to an ML tool or development server, and then to a production environment like Kubernetes, is difficult because each tool uses its own packaging mechanism, which forces engineers to repackage the model multiple times. This slows down development and introduces risk.
KitOps is an open source DevOps project built to standardize the packaging, reproduction, deployment, and tracking of AI/ML models, so they can be run anywhere, just like application code.
KitOps solves multiple problems:
Unlike Dockerfiles, Kitfiles define a modular package: pull just part of the ModelKit, like the model or dataset, or pull the whole package with one simple command.
Storing ModelKits in your organization’s container registry provides a history of meaningful state changes for auditing. ModelKits are immutable, so they are a good fit for a software bill of materials (SBOM) initiative.
By building ModelKits on industry standards, anyone (not just data scientists) can participate in the model development lifecycle whether they’re integrating models with their application, experimenting with them locally, or deploying them to production.
ModelKits can be stored in your existing container registry and work with the tools your team is already using, so you can use the same deployment pipelines and endpoints you’ve hardened with your application development process.
Download and install the Kit CLI.
Create a simple manifest file called a Kitfile with your model, dataset and code. Then build and push the ModelKit to a registry for sharing.
Pull the ModelKit into your pipeline, or use kit dev to start working with the model locally.
Visit our GitHub repo for a list of all features and our roadmap.
A ModelKit package includes models, datasets, configurations, and code in an OCI artifact. Add as much or as little as your project needs.
Each ModelKit package is immutable and includes a SHA digest for itself and for every artifact it holds.
Each ModelKit is tagged and versioned so everyone knows which dataset and model work together.
ModelKits can be used with any AI, ML, or LLM project - even multi-modal models.
Pack or unpack a ModelKit locally or as part of your CI/CD workflow for testing, integration, or deployment.
Kit's Dev Mode lets you run an LLM locally, configure it, and prompt/chat with it instantly.
AI projects are more than just a model: you need a codebase, a dataset, and documentation too.
Our quickstart ModelKits have everything you need in one easy to find place.
The ModelKit is an OCI-compliant package (like a container, but more fully featured) that contains everything needed to integrate with a model or deploy it to production.
The ModelKit holds the serialized model, dataset, hyperparameters, input / output structure, and validation criteria. Kitfiles define a ModelKit in a modular and easy-to-understand way.
The Kit CLI is a command line interface (CLI) that performs actions on ModelKits.
You can build and version ModelKits, push them to or pull them from a model registry, run them locally with a RESTful API we generate for your model automatically, and deploy them to staging or production.
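As a rough illustration of that build, push, and pull flow, the Python sketch below assembles the Kit CLI invocations for one round trip and prints them without running anything. The subcommand names follow the workflow described above; the `-t` tag flag, the argument order, the helper name `kit_commands`, and the registry reference `registry.example.com/team/my-model:v1` are assumptions for illustration, so check them against the Kit CLI reference before relying on them.

```python
import shlex
from typing import List


def kit_commands(reference: str, context_dir: str = ".") -> List[List[str]]:
    """Build argv lists for a pack -> push -> pull round trip.

    Nothing is executed here. The flags and argument order are assumptions
    about the Kit CLI and should be verified against its documentation.
    """
    return [
        # Package the directory containing the Kitfile (assumed: -t sets the tag).
        ["kit", "pack", context_dir, "-t", reference],
        # Publish the ModelKit to the registry named in the reference.
        ["kit", "push", reference],
        # Fetch the same ModelKit elsewhere, e.g. in a pipeline.
        ["kit", "pull", reference],
    ]


if __name__ == "__main__":
    # Hypothetical registry reference, used only for this example.
    for argv in kit_commands("registry.example.com/team/my-model:v1"):
        print(shlex.join(argv))
```

Keeping the commands as argument lists rather than shell strings means they can later be passed to `subprocess.run` without extra quoting, once the flags have been confirmed.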
ModelKits handle both packaging and versioning. With a ModelKit, you can package all the parts of your AI project in one shareable asset and tag it with a version. ModelKits were designed for the model development lifecycle, where projects are handed off from data science teams to application teams to deployment teams. Versioning and packaging make it easy for team members to find the datasets and configurations that map to a specific model version. You can read more details about KitOps in our overview.
The easiest way to get started is to follow our Quick Start.
Yes [choir sings hallelujah], each ModelKit includes SHA digests for the ModelKit and every artifact it holds so you can quickly see if something changed between ModelKit versions.
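To make the idea concrete, here is a minimal Python sketch of digest-based change detection. It is not how KitOps computes or stores its digests; it only shows the general technique of hashing each file with SHA-256 and comparing the results between two versions. The function names (`sha256_digest`, `digest_artifacts`, `changed_artifacts`), the `sha256:` prefix, and the sample file names are illustrative assumptions.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large models never need to fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def digest_artifacts(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its digest."""
    return {
        path.relative_to(root).as_posix(): sha256_digest(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_artifacts(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Names whose digest differs, including added or removed files."""
    return sorted(
        name for name in old.keys() | new.keys() if old.get(name) != new.get(name)
    )


if __name__ == "__main__":
    # Hypothetical project layout created in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "model.bin").write_bytes(b"weights-v1")
        (root / "data.csv").write_bytes(b"a,b\n1,2\n")
        before = digest_artifacts(root)

        (root / "model.bin").write_bytes(b"weights-v2")  # simulate a change
        after = digest_artifacts(root)

        print(changed_artifacts(before, after))
```

Because a digest is derived from a file's bytes, any edit to an artifact changes its digest, while untouched artifacts compare equal.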
Increased speed: Teams can work faster with a centralized and versioned package for their AI project coordination. ModelKits eliminate hunting for datasets or code, and make it obvious which datasets and configurations are needed for each model. Handoffs can be automated and executed quickly and with confidence.
Reduced risk: ModelKits are self-verifying. Both the ModelKit itself and all the artifacts added to it are tamper-evident, so anyone can quickly and easily verify whether something has changed.
Improved efficiency: Models stored in ModelKits can be run locally for experimentation or application integration, or packaged for deployment with a single command. Any artifact in a ModelKit can be pulled separately, saving time and space on local or shared machines. This makes it easy for data scientists, application developers, and DevOps engineers to find and grab the pieces they need to do their jobs without being overwhelmed by unnecessary files.
ModelKits store their assets as OCI-compatible artifacts. This makes them compatible with nearly every development and deployment tool and registry in use today.
ModelKits can be stored in any OCI-compliant registry - for example, a container registry like Docker Hub or Jozu Hub, or your favorite cloud vendor’s container registry. They can even be stored in an artifact repository like Artifactory.
Yes, KitOps is licensed under the Apache 2.0 license and welcomes all users and contributors. If you’re interested in contributing, let us know.
No, ModelKits complement containers - in fact, KitOps can take a ModelKit and generate a container for the model automatically. However, not all models should be deployed inside containers - sometimes it’s more efficient and faster to deploy an init container linked to the model. Datasets may not need to be in containers either - many datasets are easier to read and manipulate for training and validation when they’re not in a container. Finally, each container is still separate, so even if you do put everything in its own container, it’s not clear to people outside the AI project which datasets go with which models and which configurations.
Models and datasets in AI projects are often tens or hundreds of gigabytes in size. Git was designed to work with many small files that can be easily diffed between versions. Git treats models and datasets stored in LFS (Large File Storage) as atomic blobs and can’t differentiate between versions of them. This makes it both inefficient and dangerous, since it’s easy for someone to tamper with the models and datasets in LFS without Git knowing. Finally, once you use LFS, a clone is no longer guaranteed to be the same as the original repo, because the repo refers to an LFS server that is independent of the clone and can change independently.
KitOps is the only standards-based and open source solution for packaging and versioning AI project assets. Popular MLOps tools use proprietary and often closed formats to lock you into their ecosystem. This makes handoffs between MLOps tool users and non-MLOps tool users (like your application development and DevOps teams) unnecessarily hard. The future of MLOps tools is still being written: it’s likely that many will be acquired or shut down, and the cost of moving projects from one proprietary format to another is high. By using the OCI standard that’s already supported by nearly every tool on the planet, ModelKits give you a future-proofed solution for packaging and versioning that is compatible with your MLOps tools as well as your development and DevOps tools, so everyone can collaborate regardless of the tools they use.
Enterprise support for ModelKits and the Kit CLI is available from Jozu.