Commit 4bb780d

Explain how to use NLP
1 parent 26f4eff commit 4bb780d

3 files changed: +33 -3 lines changed

README.md

Lines changed: 15 additions & 1 deletion
@@ -176,9 +176,23 @@ The **IsReadable** property will be false if no article was extracted, whatever
 
 ### Natural Language Processing Features
 
-Often the webpage containing the article will contain metadata about the language used in the article. Sometimes this is not true, so the library comes with a delegate `LanguageIdentification` that can be used to identify the language based on the text itself. The default method just returns the language in the metadata. This is the old, standard behavior.
+Often the webpage containing the article will contain metadata about the language used in the article. Sometimes this is not true, so the library comes with a delegate `LanguageIdentification` in `Article`, that can be used to identify the language based on the text itself. This delegate will be called on the `TextContent` of the `Article`. The default method just returns the language in the metadata. This is the old, standard behavior.
 
 You can use the delegate to implement your own method to identify the language. We also provide a decent implementation, that actually does something, using FastText. This implementation is distributed in a separate nuget package, `SmartReader.NaturalLanguageProcessing`.
+To use the language identification feature you will need to call the `Enable` method.
+
+```
+NLP.Enable();
+```
+
+This will change the delegate `LanguageIdentification`, so it will automatically identify the language and set the property `Language` in every `Article` object created by the library.
+
+
+To restore the default behavior, instead use `RestoreDefaults`.
+
+```
+NLP.RestoreDefaults();
+```
 
 There is also a delegate to create a summary of the article : `CreateSummary`. Also in this case the default implementation returns the summary provided by the metadata. In practice this is usually a short summary meant for social media sharing. At this moment we do not provide any better implementation.

docfx_project/articles/advanced.md

Lines changed: 17 additions & 1 deletion
@@ -133,7 +133,23 @@ Article.Converter = MagicConverter;
 
 ## Natural Language Processing
 
-Often the webpage containing the article will contain metadata about the language used in the article. Sometimes this is not true, so the library comes with a delegate `LanguageIdentification` that can be used to identify the language based on the text itself. The default method just returns the language in the metadata. This is the old, standard behavior.
+Often the webpage containing the article will contain metadata about the language used in the article. Sometimes this is not true, so the library comes with a delegate `LanguageIdentification` in `Article`, that can be used to identify the language based on the text itself. This delegate will be called on the `TextContent` of the `Article`. The default method just returns the language in the metadata. This is the old, standard behavior.
+
+You can use the delegate to implement your own method to identify the language. We also provide a decent implementation, that actually does something, using FastText. This implementation is distributed in a separate nuget package, `SmartReader.NaturalLanguageProcessing`.
+To use the language identification feature you will need to call the `Enable` method.
+
+```
+NLP.Enable();
+```
+
+This will change the delegate `LanguageIdentification`, so it will automatically identify the language and set the property `Language` in every `Article` object created by the library.
+
+
+To restore the default behavior, instead use `RestoreDefaults`.
+
+```
+NLP.RestoreDefaults();
+```
 
 The delegate accepts two arguments:
 - the first one will receive the text of the article
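
Note on the change above: the delegate swap that the added documentation describes can be illustrated with a minimal, self-contained C# sketch. It does not use SmartReader itself. `ArticleStandIn`, `NlpStandIn`, and the toy `Detect` method are hypothetical stand-ins, and the delegate shape `Func<string, string, string>` (article text, then metadata language) is an assumption, since this diff only shows that the first argument receives the article text. The real package is described as using FastText; the keyword check here is only a placeholder.

```csharp
using System;

// Hypothetical stand-in for the library's Article class (not the real type).
public static class ArticleStandIn
{
    // Assumed shape: (article text, language from metadata) -> language to report.
    // The default simply trusts the metadata, mirroring the documented default behavior.
    public static Func<string, string, string> LanguageIdentification { get; set; }
        = (text, metadataLanguage) => metadataLanguage;
}

// Hypothetical stand-in for the NLP helper with Enable / RestoreDefaults.
public static class NlpStandIn
{
    private static readonly Func<string, string, string> Default =
        (text, metadataLanguage) => metadataLanguage;

    // Swap in a text-based detector, falling back to metadata when it finds nothing.
    public static void Enable()
    {
        ArticleStandIn.LanguageIdentification = (text, metadataLanguage) =>
        {
            string detected = Detect(text);
            return string.IsNullOrEmpty(detected) ? metadataLanguage : detected;
        };
    }

    // Put the metadata-only behavior back.
    public static void RestoreDefaults()
    {
        ArticleStandIn.LanguageIdentification = Default;
    }

    // Toy detector: a placeholder for a real model, based on one common word per language.
    private static string Detect(string text)
    {
        if (text.Contains(" the ")) return "en";
        if (text.Contains(" che ")) return "it";
        return "";
    }
}

public static class Program
{
    public static void Main()
    {
        string text = "Questo è un articolo che parla di cucina.";

        // Default: the (wrong) metadata value is returned unchanged.
        Console.WriteLine(ArticleStandIn.LanguageIdentification(text, "en")); // en

        NlpStandIn.Enable();
        // Enabled: the toy detector overrides the metadata.
        Console.WriteLine(ArticleStandIn.LanguageIdentification(text, "en")); // it

        NlpStandIn.RestoreDefaults();
        Console.WriteLine(ArticleStandIn.LanguageIdentification(text, "en")); // en
    }
}
```

Keeping the delegate as a static property means one call changes the behavior for every article created afterwards, which matches the global effect the documentation attributes to `Enable` and `RestoreDefaults`.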

src/SmartReader.NaturalLanguageProcessing/README.md

Lines changed: 1 addition & 1 deletion
@@ -4,4 +4,4 @@ This library provides a basic implementation for the natural language processing
 
 We want to provide also a basic implementation for creating a summary, but there is no timeline yet.
 
-This library is meant to be just a wrapper to hook natural language processing features into SmartReader. Implementation of the features come from other libraries.
+This library is meant to be just a wrapper to hook natural language processing features into SmartReader. Implementation of the features come from other libraries.
