Moderation provider docs #2619
Merged
Changes from all commits (4 commits):

- d8cebff chat/moderation: add tisane rule docs (owenpearson)
- 11e2007 chat/moderation: add bodyguard rule docs (owenpearson)
- 5d56860 chat/moderation: extract common fields for before-publish rules (owenpearson)
- 421f4ea Apply suggestions from code review (owenpearson)
@@ -0,0 +1,29 @@
---
title: Bodyguard
meta_description: "Detect and remove unwanted content in a Chat Room using Bodyguard AI."
---

[Bodyguard](https://bodyguard.ai/) is a powerful contextual analysis platform that can be used to moderate content in chat rooms.

The Bodyguard integration can be applied to chat rooms so that you can use Bodyguard's content moderation capabilities to detect and handle inappropriate content before it's published to other users.

## Integration setup <a id="setup"/>

Configure the integration in your [Ably dashboard](https://ably.com/accounts/any/apps/any/integrations) or using the [Control API](/docs/account/control-api).

The following fields are specific to Bodyguard configuration:

| Field | Description |
| ----- | ----------- |
| Bodyguard API key | The API key for your Bodyguard account. |
| Channel ID | The ID of your Bodyguard channel where moderation rules are configured. |
| Default Language (optional) | The default language to use for content analysis. This will be used as a fallback if automatic language detection fails. |
| Model URL (optional) | A custom URL if using a custom moderation model. |

For additional configuration options shared across all before-publish moderation rules, see the [common configuration fields](/docs/chat/moderation#common-config).
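
If you use the Control API, the rule is created with a single authenticated POST. The sketch below is illustrative only: `POST /v1/apps/{appId}/rules` is the Control API's general pattern for integration rules, but the exact `ruleType` value and payload field names for a Bodyguard moderation rule are assumptions, not confirmed by this page.

```typescript
// Illustrative sketch: creating a Bodyguard moderation rule via the Control API.
// The endpoint follows the Control API's integration-rules pattern; the ruleType
// string and target field names below are assumptions for illustration only.
const ACCESS_TOKEN = "YOUR_CONTROL_API_TOKEN";
const APP_ID = "YOUR_APP_ID";

async function createBodyguardRule(): Promise<void> {
  const res = await fetch(`https://control.ably.net/v1/apps/${APP_ID}/rules`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${ACCESS_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      ruleType: "moderation.bodyguard", // hypothetical rule type name
      target: {
        apiKey: "YOUR_BODYGUARD_API_KEY",       // Bodyguard API key
        channelId: "YOUR_BODYGUARD_CHANNEL_ID", // Bodyguard channel holding your moderation rules
        defaultLanguage: "en",                  // optional fallback language
      },
    }),
  });
  if (!res.ok) {
    throw new Error(`Rule creation failed: ${res.status} ${await res.text()}`);
  }
  console.log("Bodyguard moderation rule created");
}
```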

Messages will be rejected if Bodyguard's analysis returns a `REMOVE` recommended action based on the moderation rules configured in your Bodyguard channel.

## Handling rejections <a id="rejections"/>

Messages are rejected when they fail Bodyguard's analysis: Bodyguard returns a `REMOVE` action, the message is not published to your channel, and the publish request is rejected with error code `42213`.
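
On the client side, a rejected publish surfaces as a failed send. The following is a minimal sketch assuming the `@ably/chat` JS SDK's rooms/messages API shape; checking the `code` on the thrown error is standard Ably `ErrorInfo` handling:

```typescript
import { ChatClient } from "@ably/chat";

// Minimal sketch: detect a before-publish moderation rejection when sending a
// message. The rooms/messages API shape is assumed from the @ably/chat SDK.
async function sendWithModerationHandling(chat: ChatClient, roomName: string, text: string) {
  const room = await chat.rooms.get(roomName);
  await room.attach();
  try {
    await room.messages.send({ text });
  } catch (err) {
    if ((err as { code?: number }).code === 42213) {
      // Rejected by the moderation rule; the message was never published.
      console.warn("Message rejected by moderation:", text);
      return;
    }
    throw err; // unrelated failure
  }
}
```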

@@ -0,0 +1,29 @@
---
title: Tisane
meta_description: "Detect and remove unwanted content in a Chat Room using Tisane AI."
---

[Tisane](https://tisane.ai/) is a powerful Natural Language Understanding (NLU) platform that can be used to moderate content in chat rooms.

The Tisane integration can be applied to chat rooms so that you can use Tisane's text moderation capabilities to detect and handle inappropriate content before it's published to other users.

## Integration setup <a id="setup"/>

Configure the integration in your [Ably dashboard](https://ably.com/accounts/any/apps/any/integrations) or using the [Control API](/docs/account/control-api).
The following fields are specific to Tisane configuration:

| Field | Description |
| ----- | ----------- |
| Tisane API key | The API key for your Tisane account. |
| Thresholds | A map of [text moderation categories](https://docs.tisane.ai/apis/tisane-api-response-guide#supported-types) to severity levels (`low`, `medium`, `high`, `extreme`). When moderating text, any message deemed to be at or above a specified threshold will be rejected and not published to the chat room. |
| Default Language | The language to use for content analysis. |
| Model URL (optional) | A custom URL if using a custom moderation model. |

For additional configuration options shared across all before-publish moderation rules, see the [common configuration fields](/docs/chat/moderation#common-config).
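
As an illustration, a thresholds map pairing Tisane category types with the minimum severity that should cause rejection might look like the sketch below. The category names follow Tisane's supported types; the exact JSON shape the rule configuration accepts is an assumption:

```typescript
// Illustrative thresholds map: Tisane moderation categories -> minimum severity
// at which a message is rejected. Category names follow Tisane's supported types;
// the precise shape accepted by the rule configuration is an assumption.
type Severity = "low" | "medium" | "high" | "extreme";

const thresholds: Record<string, Severity> = {
  personal_attack: "medium", // reject personal attacks of medium severity or above
  bigotry: "low",            // reject any detected bigotry, however mild
  profanity: "high",         // tolerate mild profanity; reject high/extreme
};
```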

## Handling rejections <a id="rejections"/>

If a message fails moderation, the message will not be published and the publish request will be rejected.

Moderation rejections will use error code `42213`.

@@ -19,6 +19,18 @@ When using before publish moderation, a message is reviewed by an automated mode

This approach provides additional safety guarantees, but may come at the cost of a small amount of latency, as messages must be vetted prior to being published.

#### Common configuration fields <a id="common-config"/>

All before-publish moderation rules share the following configuration fields:

| Field | Description |
| ----- | ----------- |
| Retry timeout | Maximum duration (in milliseconds) that invoking the rule may take, including any retries. The valid range is 0 - 5000ms. |
| Max retries | Maximum number of retries after the first attempt at invoking the rule. |
| Failed action | The action to take if the rule fails to invoke: either reject the publish request or publish the message anyway. |
| Too many requests action | The action to take if the moderation provider returns a 429 (Too Many Requests) response: either fail the rule invocation or retry. |
| Room filter (optional) | A regular expression to match specific chat rooms. |
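
To make the interaction between `Retry timeout` and `Max retries` concrete, the following sketch mirrors the semantics described in the table; it is an illustration of the documented behaviour, not Ably's actual implementation:

```typescript
// Sketch of the retry semantics described in the table above (illustrative only):
// "Retry timeout" caps the total time budget for the invocation including retries;
// "Max retries" caps how many retries may follow the first attempt.
async function invokeRule(
  invoke: () => Promise<void>,
  retryTimeoutMs: number, // 0 - 5000
  maxRetries: number,
): Promise<"ok" | "failed"> {
  const deadline = Date.now() + retryTimeoutMs;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    if (Date.now() > deadline) break; // total budget exhausted
    try {
      await invoke();
      return "ok";
    } catch {
      // fall through to the next attempt, if any remain
    }
  }
  // At this point the configured "Failed action" applies: reject or publish anyway.
  return "failed";
}
```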

### After publish <a id="after-publish"/>

When using after publish moderation, a message is published as normal, but is forwarded to a moderation engine after the fact. This enables you to avoid the latency penalty of vetting content prior to publish, at the expense of bad content being visible in the chat room (at least briefly). Many automated moderation solutions are able to process and delete offending messages within a few seconds of publication.

@@ -37,12 +49,26 @@ Alternatively, you might have a custom solution you wish to integrate with, or a

[Hive](https://hivemoderation.com) provide automated content moderation solutions. The first of these is the [model only](/docs/chat/moderation/direct/hive-model-only) solution, which provides access to a powerful ML model that takes content and categorises it against various criteria, for example, violence or hate speech. For each classification, it also provides an indication of the severity of the infraction. Using this information, you can determine what level of classification is appropriate for your chat room and filter or reject content accordingly. Hive offer free credits to allow you to experiment with this solution.
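
As a sketch of that filtering decision (the class and severity field names are illustrative, not Hive's actual response schema):

```typescript
// Illustrative filter over Hive-style classifications: each class carries a
// severity score, and a message is rejected when any class you care about meets
// your bar. Field names are assumptions; consult Hive's docs for the real schema.
interface Classification {
  class: string;    // e.g. "violence", "hate"
  severity: number; // e.g. 0 (none) .. 3 (severe)
}

const limits: Record<string, number> = { violence: 2, hate: 1 };

function shouldReject(classifications: Classification[]): boolean {
  return classifications.some(
    (c) => c.class in limits && c.severity >= limits[c.class],
  );
}
```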

The second solution is the [dashboard](/docs/chat/moderation/direct/hive-dashboard). This is an all-in-one moderation tool that allows you to combine automated workflows using ML models, as well as human review and decisions, to control the content in your chat room.

> **Review comment:** Not this PR but a quick fix - there's a few links in the direct vs custom section that are broken
>
> **Reply:** will fix in a separate PR!

Ably is able to integrate your chat rooms directly with both of these solutions, to allow you to get up and running with Hive moderation with minimal code required.
### Tisane <a id="tisane"/> | ||
|
||
Tisane provides automated content moderation through their [API solution](https://tisane.ai/). The platform uses Natural Language Understanding (NLU) to analyze user-generated content and categorize it against various criteria for content moderation. Each analysis includes information about the type and location of potentially problematic content, enabling you to determine appropriate moderation actions for your chat room. | ||
|
||
Tisane also offers an [on-premises solution](https://tisane.ai/) for organizations that need to deploy the moderation engine within their own infrastructure. Both options can be integrated with Ably to provide automated content moderation for your chat rooms. | ||
|
||
### Bodyguard <a id="bodyguard"/> | ||
|
||
Bodyguard provides content moderation through their [API solution](https://bamboo.bodyguard.ai/api). The service performs contextual analysis of messages to detect and categorize inappropriate content based on configurable criteria. Each analysis provides detailed information about the content, enabling you to implement appropriate moderation actions in your chat room. | ||
|
||
The platform is accessed through a REST API and is integrated with Ably to provide automated content moderation for your chat rooms. | ||
|
||

## What's next? <a id="whats-next"/>

* [Add moderation with Hive (model only)](/docs/chat/moderation/direct/hive-model-only)
* [Add moderation with Hive Moderation Dashboard](/docs/chat/moderation/direct/hive-dashboard)
* [Add moderation with Tisane](/docs/chat/moderation/direct/tisane)
* [Add moderation with Bodyguard](/docs/chat/moderation/direct/bodyguard)
* [Build custom moderation with the Moderation API](/docs/chat/moderation/custom)
> **Review comment:** We've now got 4 rules that will have a lot of field crossover (the before publish rule config). Worth extracting the common fields to an overview page (like we have for "custom") so that we don't have to repeat ourselves on every page (and just have a callout to that page on the individual provider pages)?
> **Reply:** good point! have extracted them all out now, see f0dbdf0