Commit bf36603

add yolo submodule and materializer
1 parent 717a328 commit bf36603

16 files changed, +146 -171 lines changed

.gitmodules

+3
@@ -0,0 +1,3 @@
+[submodule "sign-language-detection-yolov5/yolov5"]
+	path = sign-language-detection-yolov5/yolov5
+	url = https://github.com/safoinme/yolov5.git
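
Reviewer note: the new `.gitmodules` entry means a plain clone leaves the `yolov5` directory empty until `git submodule update --init` is run. As a minimal sketch, assuming only the three lines shown above, the registered submodules can be listed with the standard-library `configparser`. The helper name `list_submodules` is hypothetical and not part of this commit.

```python
import configparser

# Contents of the .gitmodules file added in this commit.
GITMODULES_TEXT = """\
[submodule "sign-language-detection-yolov5/yolov5"]
path = sign-language-detection-yolov5/yolov5
url = https://github.com/safoinme/yolov5.git
"""


def list_submodules(text: str) -> dict:
    """Return {submodule path: url} for every [submodule "..."] section."""
    # Strip leading tabs/spaces so indented keys are never read as continuation lines.
    cleaned = "\n".join(line.strip() for line in text.splitlines())
    # Disable interpolation so a literal "%" in a URL cannot raise an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)
    return {parser[name]["path"]: parser[name]["url"] for name in parser.sections()}


submodules = list_submodules(GITMODULES_TEXT)
```

A check like this can run before training to confirm that the fork URL is the one expected.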

sign-language-detection-yolov5/.dockerignore

-1
This file was deleted.
-155
@@ -1,157 +1,2 @@
-# Building and Using an MLOps Stack with ZenML
 
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/zenml)](https://pypi.org/project/zenml/)
 
-The purpose of this repository is to demonstrate how [ZenML](https://github.com/zenml-io/zenml) enables your machine
-learning projects in a multitude of ways:
-
-- By offering you a framework or template to develop within
-- By seamlessly integrating into the tools you love and need
-- By allowing you to easily switch orchestrator for your pipelines
-- By bringing much needed Zen into your machine learning
-
-**ZenML** is an extensible, open-source MLOps framework to create production-ready machine learning pipelines. Built for
-data scientists, it has a simple, flexible syntax, is cloud- and tool-agnostic, and has interfaces/abstractions that
-are catered towards ML workflows.
-
-At its core, **ZenML pipelines execute ML-specific workflows** from sourcing data to splitting, preprocessing, training,
-all the way to the evaluation of results and even serving. There are many built-in batteries to support common ML
-development tasks. ZenML is not here to replace the great tools that solve these individual problems. Rather, it
-**integrates natively with popular ML tooling** and gives standard abstraction to write your workflows.
-
-Within this repo we will use ZenML to build pipelines that seamlessly use [Evidently](https://evidentlyai.com/),
-[MLFlow](https://mlflow.org/), [Kubeflow Pipelines](https://www.kubeflow.org/) and post
-results to our [Discord](https://discord.com/).
-
-![](_assets/evidently+mlflow+discord+kubeflow.png)
-
-[![](https://img.youtube.com/vi/Ne-dt9tu11g/0.jpg)](https://www.youtube.com/watch?v=Ne-dt9tu11g)
-
-_Come watch along as Hamza Tahir, Co-Founder and CTO of ZenML showcases an early version of this repo
-to the MLOps.community._
-
-## :computer: System Requirements
-
-In order to run this demo you need to have some packages installed on your machine.
-
-Currently, this will only run on UNIX systems.
-
-| package | MacOS installation | Linux installation |
-| ------- | -------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- |
-| docker | [Docker Desktop for Mac](https://docs.docker.com/desktop/mac/install/) | [Docker Engine for Linux ](https://docs.docker.com/engine/install/ubuntu/) |
-| kubectl | [kubectl for mac](https://kubernetes.io/docs/tasks/tools/install-kubectl-macos/) | [kubectl for linux](https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/) |
-| k3d | [Brew Installation of k3d](https://formulae.brew.sh/formula/k3d) | [k3d installation linux](https://k3d.io/v5.2.2/) |
-
-## :snake: Python Requirements
-
-Once you've got the system requirements figured out, let's jump into the Python packages you need.
-Within the Python environment of your choice, run:
-
-```bash
-git clone https://github.com/zenml-io/zenfiles
-cd nba-pipeline
-pip install -r requirements.txt
-```
-
-If you are running the `run_pipeline.py` script, you will also need to install some integrations using zenml:
-
-```bash
-zenml integration install evidently -f
-zenml integration install mlflow -f
-zenml integration install kubeflow -f
-```
-
-## :basketball: The Task
-
-A couple of weeks ago, we were looking for a fun project to work on for the next chapter of our ZenHacks. During our
-initial discussions, we realized that it would be really great to work with an NBA dataset, as we could quickly get
-close to a real-life application like a "3-Pointer Predictor" while simultaneously entertaining ourselves with one
-of the trending topics within our team.
-
-As we were building the dataset around a "3-Pointer Predictor", we realized that there is one factor that we need to
-take into consideration first: Stephen Curry, The Baby Faced Assassin. In our opinion, there is no denying that he
-changed the way that the games are played in the NBA and we wanted to actually prove that this was the case first.
-
-That's why our story in this ZenHack will start with a pipeline dedicated to drift detection. As the breakpoint of this
-drift, we will be using the famous "Double Bang" game that the Golden State Warriors played against Oklahoma City
-Thunder back in 2016. Following that, we will build a training pipeline which will generate a model that predicts
-the number of three-pointers made by a team in a single game, and ultimately, we will use these trained models and
-create an inference pipeline for the upcoming matches in the NBA.
-
-![Diagram depicting the Training and Inference pipelines](_assets/Training and Inference Pipeline.png)
-
-## :notebook: Diving into the code
-
-We're ready to go now. You have two options:
-
-### Notebook
-
-You can spin up a step-by-step guide in `Building and Using An MLOPs Stack With ZenML.ipynb`:
-
-```python
-jupyter notebook
-```
-
-### Script
-
-You can also directly run the code, using the `run_pipeline.py` script.
-
-```python
-python run_pipeline.py drift # Run one-shot drift pipeline
-python run_pipeline.py train # Run training pipeline
-python run_pipeline.py infer # Run inference pipeline
-```
-
-## :rocket: Going from local orchestration to kubeflow pipelines
-
-ZenML manages the configuration of the infrastructure where ZenML pipelines are run using ZenML `Stacks`. For now, a Stack consists of:
-
-- A metadata store: To store metadata like parameters and artifact URIs
-- An artifact store: To store interim data step output.
-- An orchestrator: A service that actually kicks off and runs each step of the pipeline.
-- An optional container registry: To store Docker images that are created to run your pipeline.
-
-![Local ZenML stack](_assets/localstack.png)
-
-To transition from running our pipelines locally (see diagram above) to running them on Kubeflow Pipelines, we only need to register a new stack:
-
-```bash
-zenml container-registry register local_registry --flavor=default --uri=localhost:5000
-zenml orchestrator register kubeflow_orchestrator --flavor=kubeflow
-zenml stack register local_kubeflow_stack \
-    -m local_metadata_store \
-    -a local_artifact_store \
-    -o kubeflow_orchestrator \
-    -c local_registry
-```
-
-To reduce the amount of manual setup steps, we decided to work with a local Kubeflow Pipelines deployment in this repository (if you're interested in running your ZenML pipelines remotely, check out [our docs](https://docs.zenml.io/component-gallery/orchestrators/kubeflow#how-to-use-it).
-
-For the local setup, our kubeflow stack keeps the existing `local_metadata_store` and `local_artifact_store` but replaces the orchestrator and adds a local container registry (see diagram below).
-
-Once the stack is registered we can activate it and provision resources for the local Kubeflow Pipelines deployment:
-
-```bash
-zenml stack set local_kubeflow_stack
-zenml stack up
-```
-
-![ZenML stack for running pipelines on a local Kubeflow Pipelines deployment](_assets/localstack-with-kubeflow-orchestrator.png)
-
-## :checkered_flag: Cleaning up when you're done
-
-Once you are done running this notebook you might want to stop all running processes. For this, run the following command.
-(This will tear down your `k3d` cluster and the local docker registry.)
-
-```shell
-zenml stack set local_kubeflow_stack
-zenml stack down -f
-```
-
-## :question: FAQ
-
-1. **MacOS** When starting the container registry for Kubeflow, I get an error about port 5000 not being available.
-`OSError: [Errno 48] Address already in use`
-
-Solution: In order for Kubeflow to run, the docker container registry currently needs to be at port 5000. MacOS, however, uses
-port 5000 for the Airplay receiver. Here is a guide on how to fix this [Freeing up port 5000](https://12ft.io/proxy?q=https%3A%2F%2Fanandtripathi5.medium.com%2Fport-5000-already-in-use-macos-monterey-issue-d86b02edd36c).
Binary file not shown.
@@ -0,0 +1,15 @@
+# Copyright (c) ZenML GmbH 2022. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+# or implied. See the License for the specific language governing
+# permissions and limitations under the License.
+
+from materializer.yolo_model_materializer import Yolov5ModelMaterializer
@@ -0,0 +1,79 @@
+# Copyright (c) ZenML GmbH 2021. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+# or implied. See the License for the specific language governing
+# permissions and limitations under the License.
+"""Materializer for a trained YOLOv5 model."""
+
+import os
+from typing import Dict, Any
+import tempfile
+from typing import Type
+import torch
+
+from zenml.artifacts import ModelArtifact
+from zenml.io import fileio
+from zenml.logger import get_logger
+from zenml.materializers.base_materializer import BaseMaterializer
+from zenml.utils import io_utils
+
+logger = get_logger(__name__)
+
+DEFAULT_YOLOV5_MODEL_FILENAME = "model.pt"
+
+
+class Yolov5ModelMaterializer(BaseMaterializer):
+    """Materializer for a trained YOLOv5 model checkpoint."""
+
+    ASSOCIATED_TYPES = (dict,)
+    ASSOCIATED_ARTIFACT_TYPES = (ModelArtifact,)
+
+    def handle_input(self, data_type: Type[dict]) -> dict:
+        """Read the checkpoint from the artifact store and return it as a dict.
+
+        Args:
+            data_type: The dict type.
+
+        Returns:
+            The checkpoint dict loaded with torch.load.
+        """
+        super().handle_input(data_type)
+
+        # Create a temporary directory to store the model
+        temp_dir = tempfile.TemporaryDirectory()
+
+        # Copy from artifact store to temporary directory
+        io_utils.copy_dir(self.artifact.uri, temp_dir.name)
+
+        # Load the YOLOv5 checkpoint from the temporary directory
+        yolov5_model = torch.load(os.path.join(temp_dir.name, DEFAULT_YOLOV5_MODEL_FILENAME))
+        return yolov5_model
+
+    def handle_return(self, ckpt: dict) -> None:
+        """Write the checkpoint to the artifact store.
+
+        Args:
+            ckpt: A dict containing the YOLOv5 model checkpoint.
+        """
+        super().handle_return(ckpt)
+
+        # Create a temporary directory to store the model
+        temp_dir = tempfile.TemporaryDirectory(prefix="zenml-temp-")
+        temp_ckpt_path = os.path.join(temp_dir.name, DEFAULT_YOLOV5_MODEL_FILENAME)
+
+        # Save the checkpoint in the temporary directory
+        torch.save(ckpt, temp_ckpt_path)
+
+        # Copy the saved checkpoint to the artifact store
+        io_utils.copy_dir(temp_dir.name, self.artifact.uri)
+
+        # Remove the temporary directory
+        fileio.rmtree(temp_dir.name)
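
Reviewer note: the materializer above follows a "serialize to a temporary directory, then copy the directory to the artifact location" pattern. The sketch below reproduces that round trip with the standard library only, so it runs without ZenML or PyTorch. It is an illustration, not the commit's code: `pickle` stands in for `torch.save`/`torch.load`, `shutil.copytree` stands in for `io_utils.copy_dir`, and the function names `save_checkpoint` and `load_checkpoint` are hypothetical.

```python
import os
import pickle
import shutil
import tempfile

MODEL_FILENAME = "model.pt"  # same filename the materializer uses


def save_checkpoint(ckpt: dict, artifact_uri: str) -> None:
    """Serialize to a temp dir, then copy that directory to the artifact location."""
    with tempfile.TemporaryDirectory(prefix="zenml-temp-") as temp_dir:
        with open(os.path.join(temp_dir, MODEL_FILENAME), "wb") as f:
            pickle.dump(ckpt, f)  # stand-in for torch.save(ckpt, path)
        shutil.copytree(temp_dir, artifact_uri, dirs_exist_ok=True)
    # The context manager deletes the temp dir here, so no explicit rmtree is needed.


def load_checkpoint(artifact_uri: str) -> dict:
    """Copy the artifact directory to a temp dir and deserialize the checkpoint."""
    with tempfile.TemporaryDirectory() as temp_dir:
        shutil.copytree(artifact_uri, temp_dir, dirs_exist_ok=True)
        with open(os.path.join(temp_dir, MODEL_FILENAME), "rb") as f:
            return pickle.load(f)  # stand-in for torch.load(path)


# Round trip against a throwaway "artifact store" directory.
with tempfile.TemporaryDirectory() as store:
    artifact_uri = os.path.join(store, "artifact")
    save_checkpoint({"epoch": 3, "best_fitness": 0.5}, artifact_uri)
    restored = load_checkpoint(artifact_uri)
```

Using `TemporaryDirectory` as a context manager cleans up on both the save and the load path; in the commit, `handle_input` relies on the object being garbage-collected instead.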
@@ -0,0 +1,13 @@
+# Copyright (c) ZenML GmbH 2022. All Rights Reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at:
+#
+# https://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
+# or implied. See the License for the specific language governing
+# permissions and limitations under the License.

sign-language-detection-yolov5/pipelines/train_pipeline.py

+17-3
@@ -14,9 +14,24 @@
 
 
 from zenml.pipelines import pipeline
+from zenml.config import DockerSettings
+from zenml.integrations.constants import MLFLOW
 
+docker_settings = DockerSettings(parent_image="ultralytics/yolov5:latest", requirements="./requirements.txt",required_integrations=[MLFLOW])
 
-@pipeline(enable_cache=True)
+@pipeline(enable_cache=True,
+    settings={
+        "docker": docker_settings,
+        "orchestrator.local_docker": {
+            "run_args": {
+                "device_requests": [{ "device_ids": ["0"], "capabilities": [['gpu']] }],
+                "shm_size": 18446744073692774399,
+                "ipc_mode": "host",
+                "ulimit": [{ "name": "memlock", "soft": -1 },{ "name": "stack", "soft": -1 }],
+            }
+        }
+    }
+)
 def yolov5_pipeline(
     data_loader,
     train_augmenter,
@@ -28,5 +43,4 @@ def yolov5_pipeline(
     augmented_trainset = train_augmenter(train)
     augmented_validset = valid_augmenter(valid)
     model = trainer(augmented_trainset,augmented_validset)
-    detector = detector(test,model)
-
+    detector = detector(test,model)
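
Reviewer note: the `shm_size` literal in the `run_args` above is hard to read at a glance. The arithmetic below shows that it sits just under the unsigned 64-bit limit, which suggests it is meant as "effectively unlimited" shared memory. That reading is an assumption, as is the unit (bytes); the commit itself does not explain the value.

```python
SHM_SIZE = 18446744073692774399  # value of "shm_size" in the pipeline settings

# The literal equals 2**64 - 2**24 - 1, i.e. just below the unsigned 64-bit maximum.
assert SHM_SIZE == 2**64 - 2**24 - 1

# Size in exbibytes, assuming the value is interpreted as a byte count.
shm_size_eib = SHM_SIZE / 2**60

# True if the value can be stored in an unsigned 64-bit integer.
fits_in_uint64 = 0 <= SHM_SIZE < 2**64
```

If the intent is only "large enough for the PyTorch data loaders", a named constant or an explicit size would make the setting easier to review.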
@@ -1,3 +1,4 @@
-nba-api
-notebook
-zenml==0.6.2
+zenml==0.21.1
+roboflow==0.2.18
+albumentations==1.3.0
+albumentations[imgaug]

sign-language-detection-yolov5/run.py

+1-1
@@ -21,7 +21,7 @@
     data_loader=data_loader(),
     train_augmenter=train_augmenter(),
     valid_augmenter=valid_augmenter(),
-    trainer=trainer(),
+    trainer=trainer(), # .configure(output_materializers=Yolov5ModelMaterializer),
     detector=detector(),
 )
 pipeline_instance.run()

sign-language-detection-yolov5/steps/data_loader.py

+2
@@ -4,6 +4,7 @@
 import numpy as np
 from roboflow import Roboflow
 from zenml.steps import step, BaseParameters, Output
+#from zenml.materializers import BuiltInContainerMaterializer
 import cv2
 
 
@@ -22,6 +23,7 @@ def roboflow_download(api_key:str, workspace:str, project:str, annotation_type:s
     dataset = project.version(6).download(annotation_type)
     return dataset.location
 
+#@step(output_materializers={"train_images": BuiltInContainerMaterializer, "val_images": BuiltInContainerMaterializer, "test_images": BuiltInContainerMaterializer})
 @step
 def data_loader(
     params: TrainerParameters,

sign-language-detection-yolov5/steps/detector.py

+1-1
@@ -61,5 +61,5 @@ def image_saver(image_set:Dict):
         resized_image = cv2.resize(value[0], dim, interpolation = cv2.INTER_AREA)
         cv2.imwrite(f'inference/images/{key}', resized_image)
 
-def model_saver(model:torch.nn.Module):
+def model_saver(model:Dict):
     torch.save(model, "./inference/model/best.pt")

sign-language-detection-yolov5/steps/train_augmenter.py

+2
@@ -7,6 +7,7 @@
 import albumentations as A
 
 from zenml.steps import step, BaseParameters, Output
+#from zenml.materializers import BuiltInContainerMaterializer
 
 
 class AugmenterParameters(BaseParameters):
@@ -16,6 +17,7 @@ class AugmenterParameters(BaseParameters):
 
 
 
+#@step(output_materializers={"augmented_images": BuiltInContainerMaterializer})
 @step
 def train_augmenter(
     #params:AugmenterParameters,
