From e87ebe9fede6ab47a627dfd660308aa07ddbb3b1 Mon Sep 17 00:00:00 2001
From: Jeff Handley
Date: Mon, 23 Dec 2024 04:14:01 -0800
Subject: [PATCH 1/2] Fix broken link format

---
 .../Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml b/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
index 540a3fb1c..51bc724a8 100644
--- a/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
+++ b/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
@@ -37,7 +37,7 @@
  * [Tokenization](https://en.wikipedia.org/wiki/Lexical_analysis#Tokenization)
  * [Text normalization](https://en.wikipedia.org/wiki/Text_normalization)
  * [Predefined and custom stopwords removal](https://en.wikipedia.org/wiki/Stop_words)
- * [Word-based or character-based Ngram extraction and SkipGram extraction (through the advanced [options](xref:Microsoft.ML.Transforms.TextFeaturizingEstimator.Options.WordFeatureExtractor))](https://en.wikipedia.org/wiki/N-gram)
+ * [Word-based or character-based Ngram extraction and SkipGram extraction](https://en.wikipedia.org/wiki/N-gram) (through the advanced [options](xref:Microsoft.ML.Transforms.TextFeaturizingEstimator.Options.WordFeatureExtractor))
  * [TF, IDF or TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf)
  * [L-p vector normalization](xref: Microsoft.ML.Transforms.LpNormNormalizingTransformer)
 

From 1fd9c66a7e65d9926f7edc3c67e5e00fe378f2c9 Mon Sep 17 00:00:00 2001
From: Jeff Handley
Date: Mon, 23 Dec 2024 04:18:46 -0800
Subject: [PATCH 2/2] Remove errant space breaking xref link

---
 .../Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml b/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
index 51bc724a8..4f4704d18 100644
--- a/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
+++ b/dotnet/xml/Microsoft.ML.Transforms.Text/TextFeaturizingEstimator.xml
@@ -39,7 +39,7 @@
  * [Predefined and custom stopwords removal](https://en.wikipedia.org/wiki/Stop_words)
  * [Word-based or character-based Ngram extraction and SkipGram extraction](https://en.wikipedia.org/wiki/N-gram) (through the advanced [options](xref:Microsoft.ML.Transforms.TextFeaturizingEstimator.Options.WordFeatureExtractor))
  * [TF, IDF or TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf)
- * [L-p vector normalization](xref: Microsoft.ML.Transforms.LpNormNormalizingTransformer)
+ * [L-p vector normalization](xref:Microsoft.ML.Transforms.LpNormNormalizingTransformer)
 
  By default the features are made of (word/character) n-grams/skip-grams​ and the number of features are equal to the vocabulary size found by analyzing the data.
  To output an additional column with the tokens generated, use [OutputTokensColumnName](xref:Microsoft.ML.Transforms.Text.TextFeaturizingEstimator.Options.OutputTokensColumnName).