Content moderation with Feedback Aide

Our Feedback Aide API includes content moderation powered by AI, which is helpful for identifying inappropriate content in essays written by learners.

Note: AI moderation is not a substitute for human review. Given the wide diversity of writing patterns, colloquialisms, regional nuances, and personalities among learners, it may flag instances incorrectly, both producing false alarms and missing critical content.

Types of content that will be flagged

AI-driven moderation helps ensure that sensitive, inappropriate, or crisis content is flagged and managed efficiently. This reduces manual workload, improves accuracy, and enhances the overall user experience.

The content flagged includes:

  • Uncontextualized* expressions or promotion of hate towards any target.
  • Threats of violence or harm towards any target.
  • Depictions of violent or sexual acts.
  • Depictions, promotion, or encouragement of acts of self-harm.
  • Disclosures that the learner is engaging in, or intends to engage in, acts of self-harm such as suicide, cutting, or eating disorders.

* For example, a contextualized hateful quote from a historical figure may not be flagged.

Enabling content moderation

The AI content moderation model is not enabled by default. To activate it, developers must explicitly request it by passing a moderation option when creating the feedback session with feedbackApp.feedbackSession(), as shown below:

const feedbackSession = await feedbackApp.feedbackSession(
    { // security
        ...
    },
    { // feedback session options
        state: 'grade',
        session_uuid: '36eebda5-b6fd-4e74-ad06-8e69dfb89e3e',
        stimulus: 'Write an essay about obesity and its impact on society',
        response: 'Obesity is ...',
        rubric: {...},
        options: {
            moderation: {
                enabled: true
            }
        }
    });

This ensures that moderation is only applied when needed, giving developers control over when to leverage AI moderation for their specific use case.
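For example, a host application might turn moderation on only for certain cohorts or assignment types. The sketch below is illustrative only: the shouldModerate() helper and the assignment and securityObject variables are hypothetical stand-ins for your own integration logic, while the options.moderation.enabled flag matches the request shown above.

// Hypothetical helper: decide per use case whether AI moderation is needed.
function shouldModerate(assignment) {
    return assignment.audience === 'minors' || assignment.openResponse === true;
}

const feedbackSession = await feedbackApp.feedbackSession(
    securityObject, // your signed security object, as in the example above
    {
        state: 'grade',
        session_uuid: assignment.sessionUuid,
        stimulus: assignment.stimulus,
        response: assignment.response,
        rubric: assignment.rubric,
        options: {
            moderation: {
                // Request AI moderation only when this use case calls for it.
                enabled: shouldModerate(assignment)
            }
        }
    });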

Note: the legacy standard-essay-moderation and advanced-essay-moderation moderation models are deprecated as of 2025-01-02. Instead, use the options flag to enable moderation on standard-essay, advanced-essay, or any other model that supports it, as described above.
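Migrating from the legacy models means selecting the base model and requesting moderation through the options flag. The before/after sketch below assumes a model field is used to choose a model in your session options; that field name is an assumption here, so adjust it to match how your own integration selects a model.

// Before (deprecated since 2025-01-02): moderation baked into the model name.
// The 'model' field is an assumption about how your integration selects a model.
const legacySessionOptions = {
    model: 'standard-essay-moderation'
};

// After: use the base model and enable moderation via the options flag.
const currentSessionOptions = {
    model: 'standard-essay',
    options: {
        moderation: {
            enabled: true
        }
    }
};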

Moderation workflow example

When the essay is first graded by Feedback Aide, the user interface displays a warning message to the grader indicating that content of potential concern has been detected. The grader must acknowledge the message in order to continue.

 

Screenshot 1: Warning Message Shown to the Grader

 

When the grader has finished marking the essay and the feedback is ready for learner review, the grader will click ‘Submit to student.’ They will then be shown a second dialog window, asking for their acknowledgement before proceeding.

 

Screenshot 2: The Acknowledgement Window Shown to the Grader

 

 
