diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/README.md b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/README.md new file mode 100644 index 000000000..166ae0d63 --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/README.md @@ -0,0 +1,158 @@ +# Object Detection - ONNX Console App Sample + +| ML.NET version | API type | Status | App Type | Data type | Scenario | ML Task | Algorithms | +|----------------|-------------|------------|-------------|-------------|------------------|---------------|-----------------------------------| +| v1.7.1 | Dynamic API | Up-to-date | End-End app | image files | Object detection | Deep Learning | ONNX: Custom Vision | + +## Problem + +Object detection is one of the main applications of deep learning: it not only classifies part of an image, but also shows where in the image the object is with a bounding box. For deep learning scenarios, you can either use a pre-trained model or train your own model. This sample uses an object detection model exported from [Custom Vision](https://www.customvision.ai). + +## How the sample works + +This sample consists of a single console application that builds an ML.NET pipeline from an ONNX model downloaded from Custom Vision, makes predictions on any images in the "test" folder, and draws the resulting bounding boxes. + +## ONNX + +The Open Neural Network eXchange, or [ONNX](http://onnx.ai/), is an open format to represent deep learning models. With ONNX, developers can move models between state-of-the-art tools and choose the combination that is best for them. ONNX is developed and supported by a community of partners, including Microsoft. + +## Model input and output + +In order to parse the prediction output of the ONNX model, we need to understand the format (or shape) of the input and output tensors.
To do this, we'll start by using [Netron](https://netron.app/), a GUI visualizer for neural networks and machine learning models, to inspect the model. + +Below is an example of what we'd see upon opening this sample's model with Netron: + +![Output from inspecting the model with Netron](./assets/onnx-input.jpg) + +From the output above, we can see the ONNX model has the following input/output formats: + +### Input: 'image_tensor' 3x320x320 + +The first thing to notice is that the **input tensor's name** is **'image_tensor'**. We'll need this name later when we define the **input** parameter of the estimation pipeline. + +We can also see that the **shape of the input tensor** is **3x320x320**. This tells us that the image passed into the model should be 320 pixels high x 320 pixels wide. The '3' indicates the image(s) should be in BGR format; the first 3 'channels' are blue, green, and red, respectively. + +### Output + +We can see that the ONNX model has three outputs: +- **detected_classes**: An array of indexes into the **labels.txt** file indicating which classes have been detected in the image. The labels are the tags that are added when uploading images to the Custom Vision service. +- **detected_boxes**: An array of floats that are normalized to the input image dimensions. There is a set of four items in the array for each bounding box (the left, top, right, and bottom coordinates). +- **detected_scores**: An array of confidence scores for each detected class. + +## Solution + +## Code Walkthrough + +Create a class that defines the data schema to use while loading data into an `IDataView`. ML.NET supports the `Bitmap` type for images, so we'll specify a `Bitmap` property decorated with the `ImageTypeAttribute` and pass in the height and width dimensions we got by [inspecting the model](#model-input-and-output), as shown below.
+ +```csharp +public struct ImageSettings +{ + public const int imageHeight = 320; + public const int imageWidth = 320; +} + +public class StopSignInput +{ + [ImageType(ImageSettings.imageHeight, ImageSettings.imageWidth)] + public Bitmap Image { get; set; } +} +``` + +### ML.NET: Configure the model + +The first step is to create an empty `DataView` to obtain the schema of the data to use when configuring the model. + +```csharp +var data = context.Data.LoadFromEnumerable(new List<StopSignInput>()); +``` + +Next, we can use the input and output tensor names we got by [inspecting the model](#model-input-and-output) to define the **input** and **output** parameters of the ONNX model. We can use this information to define the estimator pipeline. Usually, when dealing with deep neural networks, you must adapt the images to the format expected by the network. For this reason, the code below resizes and transforms the images (pixel values are normalized across all R,G,B channels). Since we have multiple outputs in our model, we use the overload of **ApplyOnnxModel** that accepts a string array of output column names. + +```csharp +var pipeline = context.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image_tensor", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(StopSignInput.Image)) + .Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor")) + .Append(context.Transforms.ApplyOnnxModel(outputColumnNames: new string[] { "detected_boxes", "detected_scores", "detected_classes" }, + inputColumnNames: new string[] { "image_tensor" }, modelFile: "./Model/model.onnx")); +``` + +Lastly, create the model by fitting the `DataView`.
+ +```csharp +var model = pipeline.Fit(data); +``` + +## Create a PredictionEngine + +After the model is configured, create a `PredictionEngine`, and then pass images to the engine to make predictions with the model. + +```csharp +var predictionEngine = context.Model.CreatePredictionEngine<StopSignInput, StopSignPrediction>(model); +``` + +## Detect objects in an image + +When obtaining the prediction from images in the `test` directory, we get a `long` array in the `PredictedLabels` property, a `float` array in the `BoundingBoxes` property, and a `float` array in the `Scores` property. Each test image is loaded into a `FileStream` and parsed into a `Bitmap` object, which is then used as the model input to make a prediction. + +We use the `Chunk` method to split the flat array of box coordinates into one four-element set per predicted bounding box and use that to draw the bounding boxes on the image. To get the labels, we read the `labels.txt` file and use the `PredictedLabels` property to look up each label. + +```csharp +var labels = File.ReadAllLines("./Model/labels.txt"); + +var testFiles = Directory.GetFiles("./test"); + +Bitmap testImage; + +foreach (var image in testFiles) +{ + var predictedImage = $"{Path.GetFileName(image)}-predicted.jpg"; + + using (var stream = new FileStream(image, FileMode.Open)) + { + testImage = (Bitmap)Image.FromStream(stream); + } + + var prediction = predictionEngine.Predict(new StopSignInput { Image = testImage }); + + var boundingBoxes = prediction.BoundingBoxes.Chunk(prediction.BoundingBoxes.Count() / prediction.PredictedLabels.Count()); + + var originalWidth = testImage.Width; + var originalHeight = testImage.Height; + + for (int i = 0; i < boundingBoxes.Count(); i++) + { + var boundingBox = boundingBoxes.ElementAt(i); + + var left = boundingBox[0] * originalWidth; + var top = boundingBox[1] * originalHeight; + var right = boundingBox[2] * originalWidth; + var bottom = boundingBox[3] * originalHeight; + + var x = left; + var y = top; + var width = Math.Abs(right - left); + var height = Math.Abs(top - bottom); + + var label = labels[prediction.PredictedLabels[i]]; + + using var graphics = Graphics.FromImage(testImage); + + graphics.DrawRectangle(new Pen(Color.Red, 3), x, y, width, height); + graphics.DrawString(label, new Font(FontFamily.Families[0], 32f), Brushes.Red, x + 5, y + 5); + } + + if (File.Exists(predictedImage)) + { + File.Delete(predictedImage); + } + + testImage.Save(predictedImage); +} +``` + +## Output + +For this object detection scenario, the console application saves a new photo with the bounding boxes and labels drawn onto it. If the output photo already exists, the application deletes it and saves a new one. + +![Multiple bounding boxes output](./assets/object-detection-output.jpg) \ No newline at end of file diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX.sln b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX.sln new file mode 100644 index 000000000..2934e1600 --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX.sln @@ -0,0 +1,25 @@ + +Microsoft Visual Studio Solution File, Format Version 12.00 +# Visual Studio Version 17 +VisualStudioVersion = 17.1.32228.430 +MinimumVisualStudioVersion = 10.0.40219.1 +Project("{FAE04EC0-301F-11D3-BF4B-00C04F79EFBC}") = "StopSignDetection_ONNX", "StopSignDetection_ONNX\StopSignDetection_ONNX.csproj", "{37A33ADD-47A7-4B09-B323-CB9BCBC86851}" +EndProject +Global + GlobalSection(SolutionConfigurationPlatforms) = preSolution + Debug|Any CPU = Debug|Any CPU + Release|Any CPU = Release|Any CPU + EndGlobalSection + GlobalSection(ProjectConfigurationPlatforms) = postSolution + {37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Debug|Any CPU.ActiveCfg = Debug|Any CPU + {37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Debug|Any CPU.Build.0 = Debug|Any CPU + {37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Release|Any CPU.ActiveCfg = Release|Any CPU + {37A33ADD-47A7-4B09-B323-CB9BCBC86851}.Release|Any CPU.Build.0 = Release|Any CPU + EndGlobalSection + 
GlobalSection(SolutionProperties) = preSolution + HideSolutionNode = FALSE + EndGlobalSection + GlobalSection(ExtensibilityGlobals) = postSolution + SolutionGuid = {0FCF9329-4869-4595-94F9-56E4055DA8D4} + EndGlobalSection +EndGlobal diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/labels.txt b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/labels.txt new file mode 100644 index 000000000..fafbbfa30 --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/labels.txt @@ -0,0 +1 @@ +stop-sign \ No newline at end of file diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/model.onnx b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/model.onnx new file mode 100644 index 000000000..c005bb8b5 Binary files /dev/null and b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Model/model.onnx differ diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Program.cs b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Program.cs new file mode 100644 index 000000000..12b7adcce --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/Program.cs @@ -0,0 +1,80 @@ +using Microsoft.ML; +using Microsoft.ML.Transforms.Image; +using StopSignDetection_ONNX; +using System.Drawing; + +var context = new MLContext(); + +var data = context.Data.LoadFromEnumerable(new List<StopSignInput>()); +var root = new FileInfo(typeof(Program).Assembly.Location); +var assemblyFolderPath = root.Directory.FullName; + +// Create pipeline +var pipeline = context.Transforms.ResizeImages(resizing: ImageResizingEstimator.ResizingKind.Fill, outputColumnName: "image_tensor", imageWidth: ImageSettings.imageWidth, imageHeight: ImageSettings.imageHeight, inputColumnName: nameof(StopSignInput.Image)) + 
.Append(context.Transforms.ExtractPixels(outputColumnName: "image_tensor")) + .Append(context.Transforms.ApplyOnnxModel(outputColumnNames: new string[] { "detected_boxes", "detected_scores", "detected_classes" }, + inputColumnNames: new string[] { "image_tensor" }, modelFile: "./Model/model.onnx")); + +// Fit and create prediction engine +var model = pipeline.Fit(data); + +var predictionEngine = context.Model.CreatePredictionEngine<StopSignInput, StopSignPrediction>(model); + +var labels = File.ReadAllLines("./Model/labels.txt"); + +var testFiles = Directory.GetFiles("./test"); + +Bitmap testImage; + +foreach (var image in testFiles) +{ + // Load test image into memory + var predictedImage = $"{Path.GetFileName(image)}-predicted.jpg"; + + using (var stream = new FileStream(image, FileMode.Open)) + { + testImage = (Bitmap)Image.FromStream(stream); + } + + // Predict on test image + var prediction = predictionEngine.Predict(new StopSignInput { Image = testImage }); + + // Calculate how many sets of bounding boxes we get from the prediction + var boundingBoxes = prediction.BoundingBoxes.Chunk(prediction.BoundingBoxes.Count() / prediction.PredictedLabels.Count()); + + var originalWidth = testImage.Width; + var originalHeight = testImage.Height; + + // Draw boxes and predicted label + for (int i = 0; i < boundingBoxes.Count(); i++) + { + var boundingBox = boundingBoxes.ElementAt(i); + + var left = boundingBox[0] * originalWidth; + var top = boundingBox[1] * originalHeight; + var right = boundingBox[2] * originalWidth; + var bottom = boundingBox[3] * originalHeight; + + var x = left; + var y = top; + var width = Math.Abs(right - left); + var height = Math.Abs(top - bottom); + + // Get predicted label from labels file + var label = labels[prediction.PredictedLabels[i]]; + + // Draw bounding box and add label to image + using var graphics = Graphics.FromImage(testImage); + + graphics.DrawRectangle(new Pen(Color.NavajoWhite, 8), x, y, width, height); + graphics.DrawString(label, new 
Font(FontFamily.Families[0], 18f), Brushes.NavajoWhite, x + 5, y + 5); + } + + // Save the prediction image, but delete it if it already exists before saving + if (File.Exists(predictedImage)) + { + File.Delete(predictedImage); + } + + testImage.Save(Path.Combine(assemblyFolderPath, predictedImage)); +} \ No newline at end of file diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignDetection_ONNX.csproj b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignDetection_ONNX.csproj new file mode 100644 index 000000000..85423f34a --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignDetection_ONNX.csproj @@ -0,0 +1,35 @@ + + + + Exe + net6.0 + enable + enable + + + + + + + + + + + + + + + PreserveNewest + + + PreserveNewest + + + PreserveNewest + + + PreserveNewest + + + + diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignInput.cs b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignInput.cs new file mode 100644 index 000000000..6d633a0b4 --- /dev/null +++ b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignInput.cs @@ -0,0 +1,17 @@ +using Microsoft.ML.Transforms.Image; +using System.Drawing; + +namespace StopSignDetection_ONNX +{ + public struct ImageSettings + { + public const int imageHeight = 320; + public const int imageWidth = 320; + } + + public class StopSignInput + { + [ImageType(ImageSettings.imageHeight, ImageSettings.imageWidth)] + public Bitmap Image { get; set; } + } +} diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignPrediction.cs b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignPrediction.cs new file mode 100644 index 000000000..4aa0f191d --- /dev/null +++ 
b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/StopSignPrediction.cs @@ -0,0 +1,16 @@ +using Microsoft.ML.Data; + +namespace StopSignDetection_ONNX +{ + public class StopSignPrediction + { + [ColumnName("detected_classes")] + public long[] PredictedLabels { get; set; } + + [ColumnName("detected_boxes")] + public float[] BoundingBoxes { get; set; } + + [ColumnName("detected_scores")] + public float[] Scores { get; set; } + } +} diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-multiple-test.jpg b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-multiple-test.jpg new file mode 100644 index 000000000..67f0abcce Binary files /dev/null and b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-multiple-test.jpg differ diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-test.jpg b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-test.jpg new file mode 100644 index 000000000..2147ccd81 Binary files /dev/null and b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/StopSignDetection_ONNX/test/stop-sign-test.jpg differ diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/object-detection-output.jpg b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/object-detection-output.jpg new file mode 100644 index 000000000..b9035934e Binary files /dev/null and b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/object-detection-output.jpg differ diff --git a/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/onnx-input.jpg b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/onnx-input.jpg new file mode 100644 index 000000000..8c13a00c4 Binary files /dev/null and b/samples/csharp/end-to-end-apps/StopSignDetection_ONNX/assets/onnx-input.jpg differ