Google Summer of Code 2024
The Kubeflow community is excited to announce that we are applying to participate in Google Summer of Code 2024.
Below are the “ideas/features” from different Kubeflow projects that are being submitted to seek proposals from GSoC students.
Registration
Please go to Google Summer of Code 2024, sign up as a student, look up the Kubeflow organization, and select the projects for which you want to submit proposals. Note that you can submit more than one proposal. Contributor applications open on March 20, 2024 and close on April 4, 2024, so start writing your proposals.
Mentors
Mentoring in the Kubeflow community is an opportunity to make a lasting impact and share your experience. As a mentor, you’ll have the chance to guide and shape the next generation of talent, while also collaborating with like-minded individuals to elevate projects to new heights. With dedicated support mechanisms like community recognition and feedback channels, you’ll feel valued and empowered every step of the way. Join us in this exciting journey of mentorship and collaboration! If you want to submit a project idea, reach out to Ramesh Reddy on Slack.
Students
Welcome to an incredible chance to dive into the vibrant Kubeflow community! Join us and become a valued contributor to our rapidly expanding network dedicated to AI/ML technologies on Kubernetes. Kubeflow mentors bring years of experience and a passion for community growth and project enhancement, and they are the guiding stars of our program. You will gain insights from industry experts and sharpen your skills in open source collaboration, time management, and presentation, alongside making impactful technical contributions to the projects that interest you. Our dedicated mentors will be by your side every step of the way, offering guidance and support throughout the project. Get ready to embark on an exciting journey of growth and learning!
If you need to reach our mentors, join the Kubeflow Slack and look up your mentor, or find the project-specific channel there.
Project Ideas for 2024
Project 1: Kubeflow Notebooks 2.0
Kubeflow Notebooks is a widely used component of Kubeflow that allows Data Scientists and ML Engineers to run web-based IDEs (JupyterLab, VSCode, RStudio) on Kubernetes clusters.
There is currently an effort to create the next major version of Kubeflow Notebooks.
The main idea is to change the Kubeflow Notebook CRD so that it is no longer just a wrapper around a Kubernetes PodSpec.
This foundational change enables users to:
- Update existing notebooks after spawning, to change their “pod config” (CPU/GPU/RAM), “volumes” (storage), and “image” (what packages are installed) from options that are defined by their admin.
- Make spawning notebooks less confusing for end-users. Pod configs stop being about specific parts of the PodSpec (e.g. tolerations, requests, limits), and become a drop-down list of user-friendly names (e.g. “Big GPU Notebook - A100 - 128GB”), similar to cloud “instance types”.
- Give admins more control over how workspaces are spawned, and the lifecycle of the “options” which are available to users. For example, admins can now “redirect” existing image/pod configs to new ones, but delay the application of these updates until the next pod restart (during which, the interface will display a warning to users that a change is pending).
- Support new web-based IDEs without needing to specifically integrate with them. Cluster admins can define a custom “kind” for their internal app, or even make “flavors” of existing apps (like Jupyter and VSCode) with the packages and pod-sizes required for specific teams in their organization.
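As a rough illustration of the “redirect” idea in the list above, the sketch below resolves an option to its current target and reports whether a change would be pending for a running workspace. It is only a sketch under stated assumptions: the option ids, the `redirect_to` field, and the helper names are hypothetical and are not taken from the Notebooks 2.0 design.

```python
from typing import Dict, Optional

# Hypothetical admin-defined pod config options. The ids, display names, and
# the "redirect_to" field are illustrative only, not the real CRD schema.
POD_CONFIGS: Dict[str, dict] = {
    "small-cpu": {"display_name": "Small CPU Notebook", "redirect_to": None},
    "big-gpu-v1": {"display_name": "Big GPU Notebook (old)", "redirect_to": "big-gpu-v2"},
    "big-gpu-v2": {"display_name": "Big GPU Notebook - A100 - 128GB", "redirect_to": None},
}


def resolve_option(option_id: str, options: Dict[str, dict]) -> str:
    """Follow redirects until reaching an option that has no redirect."""
    seen = set()
    current = option_id
    while True:
        if current not in options:
            raise KeyError(f"unknown option: {current}")
        if current in seen:
            raise ValueError(f"redirect cycle detected at: {current}")
        seen.add(current)
        target = options[current]["redirect_to"]
        if target is None:
            return current
        current = target


def pending_change(current_id: str, options: Dict[str, dict]) -> Optional[str]:
    """Return the option a workspace would move to on its next restart, if any."""
    resolved = resolve_option(current_id, options)
    return resolved if resolved != current_id else None


if __name__ == "__main__":
    target = pending_change("big-gpu-v1", POD_CONFIGS)
    if target is not None:
        print("Change pending, next restart will use:", POD_CONFIGS[target]["display_name"])
```

Keeping the redirect as data on the option, rather than rewriting each workspace, is what would allow an update to be deferred until the next restart while the interface shows a warning in the meantime.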
You would be part of this larger effort and involved in one or more code deliverables. For background:
- See Kubeflow Notebooks docs: https://www.kubeflow.org/docs/components/notebooks/overview/
- See Kubeflow Notebooks 2.0 GitHub proposal: https://github.com/kubeflow/kubeflow/issues/7156
- See Kubeflow Notebooks 2.0 design document: https://docs.google.com/document/d/1_zk06zebbaTBdJ8TdU07Ibky25hqHGARXjVcsp2qEnU/edit
Skills required: Kubernetes Controllers (Golang - Kubebuilder) AND/OR Web Development (JS - Angular, Python - Flask)
Difficulty: medium/high
Length: 250 hrs
Mentors: Mathew Wicks, Kimonas Sitorchos, Julius von Kohout
Project 2: Rootless Kubeflow Container Images (Istio Ambient Mesh)
Kubeflow uses Istio as a service mesh, which by default requires “root level” network permissions for its init-containers. We want to reduce the number of privileged containers required to run Kubeflow, so we are investigating using the Istio CNI, and eventually the Istio Ambient mesh.
You would be involved in testing and investigating the impacts of these changes, and helping push the integration forwards.
See the proposal for more information: https://github.com/kubeflow/manifests/blob/master/proposals/20200913-rootlessKubeflow.md
Skills required: Istio, Kubernetes, YAML
Difficulty: medium
Length: 250 hrs
Mentors: Kimonas Sitorchos, Julius von Kohout
Project 3: Triage and Categorize Kubeflow GitHub Issues & PRs
The Kubeflow project needs help to triage, categorize, and highlight important Issues/PRs from the https://github.com/kubeflow/kubeflow GitHub repo. There are around 200 open Issues and 200 open PRs, in addition to many Issues/PRs that have been lost to time (closed automatically due to inactivity).
Specifically, your goal would be to:
- Decide which Issues/PRs are still relevant
- Categorize Issues/PRs by type
- De-duplicate multiple Issues for the same request
- Suggest which ones are the most important
- Help find “good first issues” for new members
- Review which PRs are likely safe to merge (especially dependabot ones)
Skills required: GitHub, Kubernetes, YAML, Python, Go, JS
Difficulty: medium
Length: 250 hrs
Mentors: Mathew Wicks, Kimonas Sitorchos, Julius von Kohout
Project 4: Implement LLM Tuning API for Katib
Recently, we implemented a new train Python SDK API in the Kubeflow Training Operator to easily fine-tune LLMs on multiple GPUs with a predefined dataset provider, model provider, and HuggingFace trainer.
To continue our roadmap around LLMOps in Kubeflow, we want to give users the ability to tune HyperParameters of LLMs using simple Python SDK APIs. This requires changes to the Katib Python SDK that allow users to set the model, dataset, and HyperParameters they want to optimize for an LLM.
Skills required: Kubernetes, YAML, Python
Difficulty: medium
Length: 250 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuki Iwai
Project 5: Support Distributed Jax for Training Operator
Open issue: https://github.com/kubeflow/training-operator/issues/1619
We want to integrate Jax in Training Operator to run distributed training and fine-tuning jobs on Kubernetes using the Jax ML framework. We need to create a new Kubernetes Custom Resource for Jax (e.g. JaxJob) and update the Training Operator controller to support it. Potentially, we can integrate Jax with the Training Operator Python SDK to give Data Scientists simple APIs to create JaxJob on Kubernetes.
Skills required: Kubernetes, Go, YAML, Python
Difficulty: medium
Length: 250 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuki Iwai
Project 6: Push-based metrics collection for Katib
Open issue: https://github.com/kubeflow/katib/issues/577
Katib implements Metrics Collector as a sidecar container to collect training metrics from the Trials once training is complete. This Metrics Collector waits until the training container is complete and parses training logs to get appropriate metrics like accuracy or loss to get evaluation results for the HyperParameter tuning algorithm.
Sometimes the sidecar container approach does not work for users, for example when their Trial resource executor doesn’t support sidecar containers. For such use cases, we want to add a new API to the Katib Python SDK that allows users to push metrics directly from their training scripts to the Katib DB.
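To make the existing pull-based approach concrete, the log-parsing step can be sketched roughly as follows. This is a minimal illustration and not Katib's actual collector: the `name=value` log format, the regular expression, and the `parse_metrics` helper are all assumptions made for the example.

```python
import re
from typing import Dict, Iterable, List, Set

# Assumed log format: metrics are printed as "name=value" pairs.
# This pattern is a simplification chosen for the sketch.
METRIC_RE = re.compile(r"([\w-]+)\s*=\s*([+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def parse_metrics(log_lines: Iterable[str], metric_names: Set[str]) -> Dict[str, List[float]]:
    """Collect every reported value of the requested metrics from training logs."""
    collected: Dict[str, List[float]] = {name: [] for name in sorted(metric_names)}
    for line in log_lines:
        for name, value in METRIC_RE.findall(line):
            if name in collected:
                collected[name].append(float(value))
    return collected


if __name__ == "__main__":
    logs = [
        "epoch 1: accuracy=0.71 loss=0.93",
        "epoch 2: accuracy=0.85 loss=0.41",
        "training finished",
    ]
    print(parse_metrics(logs, {"accuracy", "loss"}))
```

A push-based API would let the training script report these values itself, so nothing would need to scrape them from the logs after the training container exits.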
Skills required: Kubernetes, Go, YAML, Python
Difficulty: medium
Length: 250 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuki Iwai
Project 7: Automate docs generation for Kubeflow Python SDKs
Open issue: https://github.com/kubeflow/katib/issues/2081
The Training Operator and Katib SDKs have a valid docstring for each public API that users run. We want to automatically generate documentation for Kubeflow users from these docstrings, so users don’t need to read source code to understand API parameters.
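The idea of generating reference docs from docstrings can be sketched with the standard library alone. The example below is only an illustration of the technique: `ExampleClient` is a hypothetical stand-in rather than a real SDK class, and a real solution would more likely build on an established documentation tool.

```python
import inspect


class ExampleClient:
    """Hypothetical stand-in for an SDK client class."""

    def create_job(self, name: str, num_workers: int = 1) -> None:
        """Create a training job.

        Args:
            name: Name of the job.
            num_workers: Number of workers to run.
        """

    def _internal_helper(self) -> None:
        """Private helper that should not appear in the docs."""


def render_docs(cls: type) -> str:
    """Render a plain-text reference for the public methods of a class."""
    lines = [cls.__name__, inspect.getdoc(cls) or "", ""]
    for name, member in inspect.getmembers(cls, inspect.isfunction):
        if name.startswith("_"):
            continue  # skip private and dunder methods
        lines.append(f"{name}{inspect.signature(member)}")
        lines.append(inspect.getdoc(member) or "(no docstring)")
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_docs(ExampleClient))
```

Because the signature and docstring are read from the code itself, the generated page stays in sync with the API as long as the docstrings are maintained.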
Skills required: Python
Difficulty: medium
Length: 250 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuki Iwai, Shivay Lamba
Project 8: Support various parameter distributions like log-uniform in Katib
Open pull request: https://github.com/kubeflow/katib/pull/2059
We need to enhance the Katib Experiment APIs to support various parameter distributions, such as uniform, log-uniform, and qlog-uniform, to bring Katib closer to other HyperParameter tuning frameworks like Hyperopt. Currently, Katib supports only a uniform distribution for integer, float, and categorical HyperParameters.
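For readers unfamiliar with these distributions, the sketch below shows one common way to sample them. It is not Katib code: the function names are hypothetical, the bounds are given in the original (not log) space, and clamping the quantized value back into the bounds is an assumption of this example.

```python
import math
import random


def sample_uniform(rng: random.Random, low: float, high: float) -> float:
    """Every value between low and high is equally likely."""
    return rng.uniform(low, high)


def sample_log_uniform(rng: random.Random, low: float, high: float) -> float:
    """Sample so that log(value) is uniform; requires 0 < low < high."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_qlog_uniform(rng: random.Random, low: float, high: float, q: float) -> float:
    """Log-uniform sample rounded to the nearest multiple of q.

    Clamping to [low, high] is a choice made for this sketch.
    """
    value = round(sample_log_uniform(rng, low, high) / q) * q
    return min(max(value, low), high)


if __name__ == "__main__":
    rng = random.Random(0)
    # A learning rate searched across four orders of magnitude.
    print([sample_log_uniform(rng, 1e-5, 1e-1) for _ in range(3)])
    # A batch size searched in steps of 8.
    print([sample_qlog_uniform(rng, 8, 512, 8) for _ in range(3)])
```

The practical difference is easiest to see for a learning rate between 1e-5 and 1e-1: a uniform sampler almost never proposes values below 1e-3, while a log-uniform sampler spends roughly equal effort on each order of magnitude.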
Skills required: Kubernetes, Python, Go, YAML
Difficulty: medium
Length: 250 hrs
Mentors: Andrey Velichkevich, Johnu George, Yuki Iwai
Project 9: PostgreSQL integration in Kubeflow Pipelines
Open issue: https://github.com/kubeflow/pipelines/issues/9813
Kubeflow Pipelines must store information about pipelines, experiments, runs, and artifacts in a database. Currently, the only database it supports is MySQL/MariaDB.
We plan to support PostgreSQL as an alternative to MySQL/MariaDB so users will be able to reuse existing databases, and PostgreSQL will be a good use case for supporting multiple databases.
Skills required: Kubernetes, Python, Go, YAML
Difficulty: medium
Length: 250 hrs
Mentors: Ricardo Martinelli, Shivay Lamba
Project 10: Enhancing KF Model Registry Python client for seamless ML imports from alternative registries
We aim to extend the capabilities of the KF Model Registry Python client by enabling smooth imports from various machine learning registries. While import from HuggingFace is already implemented (and can be used as a basis), we seek to integrate support for MLflow and other popular registry formats.
Skills required: Python, ML model serialization formats, YAML, Kubernetes/Kubeflow as a plus
Difficulty: medium
Length: 250 hrs
Mentors: Matteo Mortari, Andrea Lamparelli