Getting Started
Getting started with Feast and Kubeflow
This guide provides the necessary resources to install Feast alongside Kubeflow, describes the usage of Feast with Kubeflow components, and provides examples that users can follow to test their setup.
For an overview of Feast, please read Introduction to Feast.
Installing Feast with Kubeflow
Requirements
- This guide assumes that you already have a running Kubeflow cluster. If you don’t have Kubeflow installed, see the Kubeflow installation guide.
- This guide also assumes that you have a running online feature store that Feast supports (for example Redis, Datastore, or DynamoDB).
- The latest version of Feast does not need to be installed into Kubernetes. It is possible to run Feast entirely from CI or as a client library (during training or inference).
- Feast requires a bucket (S3, GCS, MinIO, etc.) to maintain a feature registry, an online feature store to serve feature values, and a scheduler to keep the online store up to date.
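As a rough illustration of how the registry bucket and online store requirements come together, the sketch below renders a minimal feature_store.yaml as text. The project name, bucket path, and Redis address are placeholders, the helper render_feature_store_yaml is hypothetical (it is not part of Feast), and the exact keys your Feast version and online store expect should be checked against the configuration reference. The scheduler is not part of this file and has to be set up separately.

```python
def render_feature_store_yaml(project, registry_path, redis_connection, provider="local"):
    """Render a minimal feature_store.yaml (hypothetical helper, not a Feast API).

    Assumes a Redis online store and a registry file kept in a bucket.
    """
    lines = [
        f"project: {project}",
        # The registry lives in a bucket, e.g. an s3:// or gs:// path.
        f"registry: {registry_path}",
        f"provider: {provider}",
        "online_store:",
        "  type: redis",
        f'  connection_string: "{redis_connection}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values; replace them with your own bucket and Redis address.
    print(render_feature_store_yaml("demo", "s3://my-bucket/registry.db", "localhost:6379"))
```

Writing the result to feature_store.yaml at the root of the feature repository is what lets the Feast CLI and SDK locate the registry and the online store.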
Installation
To use Feast with Kubeflow, follow these steps:
- Install Feast into your development environment, as well as any environment where you want to register feature views or read features from the feature store.
- Create a feature repository to store your feature views and entities. Make sure to configure your feature_store.yaml to point to your online store. See the online store configuration reference for more details.
- Deploy your feature store. This step configures your online store and sets up your feature registry.
- Build a training dataset. This step is typically executed from the Kubeflow Pipeline that trains your model.
- Load features into the online store. This step can also be executed from a Kubernetes cron job.
- Read features from the online store. This step is typically executed from your model serving service, right before calling your model for a prediction.
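The last three steps above can be sketched with the Feast Python SDK roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Feast has been installed (for example with pip install feast), that the feature repository has already been deployed with feast apply, and that a feature view named driver_stats with features conv_rate and acc_rate exists. Those names, and the helper functions themselves, are hypothetical.

```python
from datetime import datetime, timezone

# Hypothetical feature references in "<feature_view>:<feature>" form.
FEATURES = ["driver_stats:conv_rate", "driver_stats:acc_rate"]


def open_store(repo_path="."):
    """Open the feature repository (the directory holding feature_store.yaml)."""
    # Imported lazily so the helpers below can be exercised without Feast installed.
    from feast import FeatureStore

    return FeatureStore(repo_path=repo_path)


def build_training_dataset(store, entity_df, features=FEATURES):
    """Join historical feature values onto entity_df (entity keys plus event timestamps)."""
    return store.get_historical_features(entity_df=entity_df, features=features).to_df()


def load_online_store(store, end_date=None):
    """Materialize feature values up to end_date into the online store."""
    end_date = end_date or datetime.now(timezone.utc)
    store.materialize_incremental(end_date=end_date)
    return end_date


def read_online_features(store, entity_rows, features=FEATURES):
    """Fetch the latest feature values for the given entities."""
    return store.get_online_features(features=features, entity_rows=entity_rows).to_dict()
```

In this arrangement, build_training_dataset(open_store(), entity_df) would run inside a pipeline step, load_online_store(open_store()) inside a scheduled job, and read_online_features(open_store(), [{"driver_id": 1001}]) inside the serving service, where driver_id is again a hypothetical entity key.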
Advanced
- See the guide to best practices for running Feast in a production context.
- See the guide to upgrading from Feast 0.9 (Spark-based) to the latest Feast (0.12+).
Accessing Feast from Kubeflow
Once Feast is installed within the same Kubernetes cluster as Kubeflow, users can access its APIs directly without any additional steps.
Feast APIs can roughly be grouped into the following sections:
- Feature definition and management: Feast provides both a Python SDK and CLI for interacting with Feast Core. Feast Core allows users to define and register features and entities, along with their associated metadata and schemas. End users typically use the Python SDK from within a Jupyter notebook to administer Feast, but ML teams may opt to version control feature specifications in order to follow a GitOps-based approach.
- Model training: The Feast Python SDK can be used to trigger the creation of training datasets. The most natural place to use this SDK is to create a training dataset as part of a Kubeflow Pipeline prior to model training.
- Model serving: The Feast Python SDK can also be used for online feature retrieval. This client is used to retrieve feature values for inference with model serving systems such as KFServing, TFX, or Seldon.
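For the model serving case, the retrieved feature values usually have to be reshaped before they are sent to the model. The sketch below assumes the online response has been converted to a dictionary of columns (as the SDK's to_dict() produces), that its keys match the feature names used here, and that the serving system accepts a KFServing V1 style {"instances": [...]} request body. The helper names and feature names are hypothetical.

```python
def features_to_instances(feature_dict, feature_order):
    """Turn a dict of feature columns into one row of values per entity."""
    columns = [feature_dict[name] for name in feature_order]
    return [list(row) for row in zip(*columns)]


def build_predict_payload(feature_dict, feature_order):
    """Wrap the rows in an {"instances": [...]} request body."""
    return {"instances": features_to_instances(feature_dict, feature_order)}


if __name__ == "__main__":
    # Hypothetical online response for two drivers.
    online = {
        "driver_id": [1001, 1002],
        "conv_rate": [0.5, 0.7],
        "acc_rate": [0.9, 0.8],
    }
    print(build_predict_payload(online, ["conv_rate", "acc_rate"]))
```

Passing feature_order explicitly keeps the column order fixed to what the model was trained on, rather than relying on the order of keys in the response.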
Examples
Please see our tutorials section for a full list of examples.
Next steps
- For more details on Feast concepts, please see the Feast documentation.
- Please see our changelog and roadmap for new or upcoming functionality.
- Please use GitHub issues for any feedback, issues, or feature requests.
- If you would like to get involved with Feast, visit us in #Feast, join our community calls or mailing list, or have a look at our contribution process.
Last modified November 30, 2023: clean up external-add-ons section (3d4a099)