-**Important**: This F# sample needs to be updated to the new dynamic API available since ML.NET 0.6. It currently uses the deprecated LearningPipeline API.
-
-Contribution from the community will be welcomed!
-
-------------------------------------

In this introductory sample, you'll see how to use [ML.NET](https://www.microsoft.com/net/learn/apps/machine-learning-and-ai/ml-dotnet) to predict taxi fares. In the world of machine learning, this type of prediction is known as **regression**.
@@ -38,60 +31,83 @@ The common feature for all those examples is that the parameter we want to predi
## Solution
To solve this problem, first we will build an ML model. Then we will train the model on existing data, evaluate how good it is, and lastly we'll consume the model to predict taxi fares.
-Building a model includes: uploading data (`taxi-fare-train.csv` with `TextLoader`), transforming the data so it can be used effectively by an ML algorithm (with `ColumnCopier`, `CategoricalOneHotVectorizer`, `ColumnConcatenator`), and choosing a learning algorithm (`FastTreeRegressor`). All of those steps are stored in a `LearningPipeline`:
+Building a model includes: uploading data (`taxi-fare-train.csv` with `TextLoader`), transforming the data so it can be used effectively by an ML algorithm (`FastTreeRegressor` in this case):
+

```fsharp
// LearningPipeline holds all steps of the learning process: data, transforms, learners.
-let pipeline = LearningPipeline()
-
-// The TextLoader loads a dataset. The schema of the dataset is specified by passing a class containing
-// all the column names and their types. This will be used to create the model, and train it.
+// We apply our selected Trainer (SDCA Regression algorithm)
+let pipelineWithTrainer =
+    pipeline
+    |> Pipeline.append(new SdcaRegressionTrainer(mlcontext, new SdcaRegressionTrainer.Arguments(), "Features", "Label"))
```
### 2. Train model
-Training the model is a process of running the chosen algorithm on a training data (with known fare values) to tune the parameters of the model. It is implemented in the `Train()` API. To perform training we just call the method and provide the types for our data object `TaxiTrip` and prediction object `TaxiTripFarePrediction`.
+Training the model is the process of running the chosen algorithm on training data (with known fare values) to tune the parameters of the model. It is implemented in the `Fit()` API. To perform training, we call the method and pass it the `DataView`.

```fsharp
-let model = pipeline.Train<TaxiTrip, TaxiTripFarePrediction>()
+let model = pipelineWithTrainer.Fit dataView
```
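
As a rough illustration of what "tuning the parameters of the model" means, the sketch below fits a straight line to a few made-up (distance, fare) pairs using ordinary least squares. It is not the SDCA trainer and does not use ML.NET; the data and the names `trainingData`, `fit`, and `predictFare` are hypothetical and exist only for this example.

```fsharp
// Illustrative only: ordinary least squares on one feature, NOT the SDCA trainer.
// The (distance, fare) pairs are made-up training data with known fares.
let trainingData = [| (1.0, 5.0); (2.0, 7.5); (4.0, 12.5); (8.0, 22.5) |]

// "Fitting" tunes two parameters, slope and intercept, to minimize squared error.
let fit (data: (float * float)[]) =
    let meanX = data |> Array.averageBy fst
    let meanY = data |> Array.averageBy snd
    let covariance = data |> Array.sumBy (fun (x, y) -> (x - meanX) * (y - meanY))
    let variance = data |> Array.sumBy (fun (x, _) -> (x - meanX) ** 2.0)
    let slope = covariance / variance
    let intercept = meanY - slope * meanX
    slope, intercept

let slope, intercept = fit trainingData

// The fitted parameters are the "model": they turn a distance into a fare estimate.
let predictFare distance = intercept + slope * distance

printfn "fare = %.2f + %.2f * distance" intercept slope
```

A real trainer tunes many more parameters over many features, but the idea is the same: known fares go in, fitted parameters come out, and those parameters are then used to predict.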
### 3. Evaluate model
We need this step to determine how accurately our model performs on new data. To do so, the model from the previous step is run against another dataset that was not used in training (`taxi-fare-test.csv`). This dataset also contains known fares. `RegressionEvaluator` calculates the difference between known fares and values predicted by the model in various metrics.
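
To make "various metrics" more concrete, here is a minimal sketch that computes three common regression metrics by hand: mean absolute error, root mean squared error, and R squared. It does not call ML.NET, and the `actual` and `predicted` values are made-up fares chosen for illustration; the evaluator's own set of metrics and their exact definitions may differ.

```fsharp
// Illustrative only: hand-computed regression metrics, NOT the ML.NET implementation.
// `actual` and `predicted` are made-up fares for demonstration.
let actual    = [| 10.0; 20.0; 30.0; 40.0 |]
let predicted = [| 12.0; 18.0; 33.0; 39.0 |]

// Per-trip errors (predicted fare minus known fare).
let errors = Array.map2 (fun p a -> p - a) predicted actual

// Mean absolute error: the average size of the error, in fare units.
let meanAbsoluteError = errors |> Array.averageBy abs

// Root mean squared error: penalizes large errors more heavily.
let rootMeanSquaredError = errors |> Array.averageBy (fun e -> e * e) |> sqrt

// R squared: 1 minus the ratio of residual sum of squares to total sum of squares.
let meanActual = Array.average actual
let residualSumSquares = errors |> Array.sumBy (fun e -> e * e)
let totalSumSquares = actual |> Array.sumBy (fun a -> (a - meanActual) ** 2.0)
let rSquared = 1.0 - residualSumSquares / totalSumSquares

printfn "MAE = %.4f, RMSE = %.4f, R^2 = %.4f" meanAbsoluteError rootMeanSquaredError rSquared
```

Lower error values and an R squared closer to 1 indicate predictions that track the known fares more closely.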
```fsharp
-let testData = TextLoader(TestDataPath).CreateFrom<TaxiTrip>(separator=',')
-
-let evaluator = RegressionEvaluator()
-let metrics = evaluator.Evaluate(model, testData)
+let testDataView = MultiFileSource testDataLocation |> textLoader.Read
+
+printfn "=============== Evaluating Model's accuracy with Test data==============="
+
+let predictions = model.Transform testDataView
+
+let regressionCtx = RegressionContext mlcontext
+let metrics = regressionCtx.Evaluate(predictions, "Label", "Score")
```
>*To learn more on how to understand the metrics, check out the Machine Learning glossary from the [ML.NET Guide](https://docs.microsoft.com/en-us/dotnet/machine-learning/) or use any available materials on data science and machine learning*.
@@ -104,20 +120,31 @@ If you are not satisfied with the quality of the model, there are a variety of w
After the model is trained, we can use the `Predict()` API to predict the fare amount for a specified trip.

```fsharp
-let prediction = model.Predict(TestTaxiTrips.Trip1)
-Console.WriteLine(sprintf "Predicted fare: {prediction.FareAmount:0.####}, actual fare: 29.5")
+//Prediction test
+// Create prediction engine and make prediction.
+let engine = model.MakePredictionFunction<TaxiTrip, TaxiTripFarePrediction> mlcontext
```
Where `TestTaxiTrips.Trip1` stores the information about the trip we'd like to get the prediction for.

-```fsharp
-module TestTaxiTrips =
-    let Trip1 =
-        TaxiTrip(
-            VendorId = "VTS",
-            RateCode = "1",
-            PassengerCount = 1.0,
-            TripDistance = 10.33,
-            PaymentType = "CSH",
-            FareAmount = 0.0 // predict it. actual = 29.5
-        )
-```
+
+Finally, you can plot in a chart how the tested predictions are distributed and how the regression is performing with the implemented method `PlotRegressionChart()` as in the following screenshot: