Our Feedback Aide API includes AI-powered content moderation, which helps identify inappropriate content in essays written by learners.
Note: AI moderation is not a substitute for human review. Given the wide diversity of writing patterns, colloquialisms, regional nuances, and personalities among learners, it may flag content incorrectly, both raising false alarms on benign writing and missing genuinely concerning content.
Types of content that will be flagged
AI-driven moderation helps ensure that sensitive, inappropriate, or crisis content is flagged and managed efficiently. This reduces manual workload, improves accuracy, and enhances the overall user experience.
The content flagged includes:
- Uncontextualized* expressions or promotion of hate towards any target.
- Threats of violence or harm towards any target.
- Depictions of violent or sexual acts.
- Depictions, promotion, or encouragement of acts of self-harm.
- Disclosures that the learner is engaging in, or intends to engage in, acts of self-harm, such as suicide, cutting, or eating disorders.
* For example, a contextualized hateful quote from a historical figure may not be flagged.
Enabling content moderation
The AI content moderation model is not enabled by default. To activate it, developers must explicitly request it by passing a Model Option when creating the feedback session with feedbackApp.feedbackSession(), as shown below:
const feedbackSession = await feedbackApp.feedbackSession(
  { // security
    ...
  },
  { // feedback session options
    state: 'grade',
    session_uuid: '36eebda5-b6fd-4e74-ad06-8e69dfb89e3e',
    stimulus: 'Write an essay about obesity and its impact on society',
    response: 'Obesity is ...',
    rubric: {...},
    options: {
      moderation: {
        enabled: true
      }
    }
  }
);
This ensures that moderation is applied only when needed, giving developers control over when to use AI moderation for their specific use case.
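For example, here is a minimal sketch of driving the flag from your own application setting. The moderationRequired constant, securityConfig object, and essayRubric value are hypothetical placeholders for illustration, not part of the Feedback Aide API:

// Hypothetical application-level setting deciding whether AI moderation should run for this session.
const moderationRequired = true;

const sessionOptions = {
  state: 'grade',
  session_uuid: '36eebda5-b6fd-4e74-ad06-8e69dfb89e3e',
  stimulus: 'Write an essay about obesity and its impact on society',
  response: 'Obesity is ...',
  rubric: essayRubric, // hypothetical placeholder for the rubric definition
  options: {
    moderation: {
      enabled: moderationRequired
    }
  }
};

const feedbackSession = await feedbackApp.feedbackSession(securityConfig, sessionOptions);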
Note: the legacy standard-essay-moderation and advanced-essay-moderation models are deprecated as of 2025-01-02. Instead, use the options flag to enable moderation on standard-essay, advanced-essay, or any other model that supports it, as described above.
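A rough migration sketch is shown below. It assumes the grading model is selected with a model field in the feedback session options (check your integration for the exact parameter name); securityConfig and essayRubric are hypothetical placeholders. The deprecated *-moderation model is replaced by its base model, and moderation is requested via the options flag instead:

const feedbackSession = await feedbackApp.feedbackSession(
  securityConfig, // hypothetical placeholder for the security object
  {
    state: 'grade',
    session_uuid: '36eebda5-b6fd-4e74-ad06-8e69dfb89e3e',
    stimulus: 'Write an essay about obesity and its impact on society',
    response: 'Obesity is ...',
    rubric: essayRubric, // hypothetical placeholder for the rubric definition
    // Previously: model: 'standard-essay-moderation' (deprecated since 2025-01-02)
    model: 'standard-essay', // assumed parameter name for selecting the grading model
    options: {
      moderation: {
        enabled: true // moderation is now enabled via the options flag
      }
    }
  }
);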
Moderation workflow example
When the essay is first graded by Feedback Aide, the user interface will display a warning message to the grader indicating it has detected content that may be of concern. The grader needs to acknowledge the message in order to continue.
Screenshot 1: Warning Message Shown to the Grader
When the grader has finished marking the essay and the feedback is ready for learner review, the grader will click ‘Submit to student’. They will then be shown a second dialog window asking for their acknowledgement before proceeding.
Screenshot 2: The Acknowledgement Window Shown to the Grader