
Moderation Agents: Custom AI-Powered Comment Moderation

Learn how to create, test, and manage custom Moderation Agents that automatically moderate brand-specific content on Facebook. Define your own moderation rules, train agents with real examples, and protect your community from the content that matters most to your brand.

Written by Filip Strycko
Updated over 3 weeks ago

What Are Moderation Agents?

Moderation Agents are custom AI-powered tools that let you define specific moderation rules beyond TrollWall's standard detection. While TrollWall automatically handles hate speech, spam, and other violations, Moderation Agents allow you to moderate content specific to your brand, community, or industry.

Key capabilities:

  • Define custom moderation topics with precise criteria

  • Train agents using real examples from your community

  • Let agents automatically account for context, typos, and misspellings

  • Test agent performance before activation

  • Modify rules as your needs evolve

Current availability: Moderation Agents work exclusively with Facebook accounts.

Need access? If you don't see "Moderation Agents" in your menu, contact your sales account manager or use the in-app chat (bottom-right corner).


Creating a New Moderation Agent

Step 1: Start Creation Process

  1. Navigate to Moderation Agents in the side panel

  2. Click "Create" to begin setup

  3. The agent will guide you through configuration via conversation

Step 2: Define Your Moderation Topic

The agent asks questions to understand what you want to moderate. Be as specific and detailed as possible. For example, instead of "negative comments," describe exactly what should be hidden, such as comments promoting competitor discount codes or accusing your brand of fraud.

The agent will also ask about edge cases and clarifications to better understand the scope of your moderation needs. Answer these follow-up questions to help the agent be as precise as possible.

Once you've provided sufficient context, click "Next" to proceed to comment sorting.

Step 3: Sort Comment Examples

The agent presents real comments from your social accounts—some matching your moderation criteria, others borderline cases. Your task is to classify each comment as Allowed or Hidden.

Requirements:

  • Minimum 10 examples marked as Allowed

  • Minimum 10 examples marked as Hidden

Important: These examples teach the agent the boundaries of your moderation policy. Take time to classify them accurately.
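
If it helps to picture what this step produces, here is a minimal sketch, assuming the examples form a simple list of labeled comments. It is purely illustrative and not TrollWall's internal data model; the LabeledComment class and MIN_EXAMPLES_PER_LABEL constant are hypothetical names.

```python
# Conceptual sketch only: the labeled examples collected in Step 3 and the
# 10 Allowed / 10 Hidden minimum. All names here are hypothetical.
from dataclasses import dataclass

MIN_EXAMPLES_PER_LABEL = 10  # minimum Allowed and minimum Hidden examples


@dataclass
class LabeledComment:
    text: str
    label: str  # "Allowed" or "Hidden"


def has_enough_examples(examples: list[LabeledComment]) -> bool:
    """Check the 10 Allowed / 10 Hidden minimum described above."""
    allowed = sum(1 for e in examples if e.label == "Allowed")
    hidden = sum(1 for e in examples if e.label == "Hidden")
    return allowed >= MIN_EXAMPLES_PER_LABEL and hidden >= MIN_EXAMPLES_PER_LABEL
```

The key point the sketch captures: every example carries exactly one of the two labels, and the agent only has enough signal to learn your boundary once both labels are well represented.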

Step 4: Assign to Social Accounts

  1. Select which Facebook account(s) should use this agent

  2. Click "Save" to complete setup

Your agent is now created and ready to use.


Testing a Moderation Agent

Before activating your agent, test its performance to ensure it behaves as expected.

Running Tests

  1. Go to Moderation Agents and click "Test" next to your agent

  2. Add test samples—example comments you want to evaluate

  3. For each sample, specify whether it should be Hidden or Allowed

  4. Click "Run Test" to see results

Testing Tip: Use new examples that weren't part of your agent's training. Testing with training examples will always show correct classification, which doesn't validate actual performance.

Understanding Results

After testing, you'll see:

  • Whether each sample was classified correctly (matches your expected result)

  • The agent's reasoning explaining its decision

Use the reasoning to:

  • Understand how the agent interprets different content

  • Identify gaps in your moderation rules

  • Refine your agent's configuration through additional training
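
As a way to think about what a test run checks, here is a minimal sketch, assuming each test sample carries an expected label and the agent returns a label plus its reasoning. The classify callable and the TestSample and TestResult types are hypothetical illustrations, not TrollWall's API.

```python
# Conceptual sketch only: compare expected labels with the agent's output and
# surface the reasoning for any mismatches. All names are hypothetical.
from dataclasses import dataclass
from typing import Callable


@dataclass
class TestSample:
    text: str
    expected: str  # "Allowed" or "Hidden"


@dataclass
class TestResult:
    sample: TestSample
    actual: str
    reasoning: str


def run_test(samples: list[TestSample],
             classify: Callable[[str], tuple[str, str]]) -> list[TestResult]:
    """Classify each sample and keep the agent's label and reasoning."""
    return [TestResult(s, *classify(s.text)) for s in samples]


def review(results: list[TestResult]) -> None:
    """Print mismatches so the reasoning can guide rule refinements."""
    for r in results:
        if r.actual != r.sample.expected:
            print(f"MISMATCH: {r.sample.text!r} expected {r.sample.expected}, "
                  f"got {r.actual}; reasoning: {r.reasoning}")
```

Reviewing mismatches this way mirrors how the reasoning shown in the Test section points to gaps or ambiguities in your moderation rules.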


Managing Moderation Agents

Agent Overview

The Moderation Agents screen displays all your agents with these options:

  • Create – Set up a new agent

  • Settings – Edit agent configuration

  • Test – Validate agent performance

  • Start/Pause – Control when the agent moderates

  • Delete – Remove an agent permanently

Editing Agent Configuration

Click "Settings" next to any agent to modify its setup:

Moderation Rules Tab

  • View current moderation rules generated from your configuration

  • Continue the conversation with your agent to refine rules

  • Add clarifications about recent developments or edge cases

Comments Tab

  • Review all example comments classified as Allowed or Hidden

  • Includes examples from initial setup plus any corrections made via the Train function in the Comments section

  • Change classification or remove examples as needed

Social Accounts Tab

  • Select which Facebook accounts use this agent

  • Modify account assignments without recreating the agent

Critical: After making any changes, click "Save" to retrain the agent with updated information. Changes take effect only after retraining completes.

Starting and Pausing Agents

  • Start – Activate the agent to begin moderating comments on assigned accounts

  • Pause – Temporarily disable moderation without deleting the agent

Remember: Paused agents don't moderate new comments, but their configuration remains saved for future use.


Best Practices

Creating Effective Agents

  • Provide detailed descriptions of what should be moderated during the initial setup conversation

  • Use real examples from your community when training the agent

  • Include both obvious violations and borderline cases to help the agent understand boundaries

  • Test thoroughly using the Test section before activating your agent

  • One topic per agent – Create separate agents for different moderation topics. Agents perform best when specialized on a single, specific topic rather than handling multiple unrelated issues

Important: If you need to moderate multiple topics, create a separate agent for each topic rather than combining them into one.

Maintaining Agent Performance

  • Correct classifications in the Comments section using "Hide + Train" when you find misclassified comments

  • Add new examples when you encounter edge cases not covered by your initial training

  • Test after major changes to verify your configuration updates work as expected

Troubleshooting Common Issues

Agent hides too many or too few comments:

  1. Make corrections directly – For any wrongly classified comment, hide or unhide it and select which agent should learn from this correction

  2. Refine your moderation description – Make your topic definition more specific in the agent settings

  3. Analyze reasoning – Add misclassified comments as test samples in the Test section to see the agent's reasoning. Use this information to adjust your configuration

  4. Continue the conversation – Go to Settings and talk with your agent to clarify boundaries and explain specific scenarios

Tip: When the agent makes mistakes, the reasoning provided in test results often reveals what needs clarification in your moderation rules.
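
To make the correction loop concrete, here is a minimal sketch, assuming a correction is simply a new labeled example that triggers retraining. The Agent class and retrain method are hypothetical stand-ins for what happens when you hide or unhide a comment and select an agent to learn from it; they are not TrollWall code.

```python
# Conceptual sketch only: a wrongly classified comment is relabeled, added to
# the agent's examples, and the agent is retrained. Names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Agent:
    name: str
    examples: list[tuple[str, str]] = field(default_factory=list)  # (text, label)

    def retrain(self) -> None:
        # Stands in for retraining after changes are saved.
        print(f"Retraining {self.name} on {len(self.examples)} examples")


def correct_classification(agent: Agent, comment_text: str, correct_label: str) -> None:
    """Add the corrected example and retrain, mirroring a manual correction."""
    agent.examples.append((comment_text, correct_label))
    agent.retrain()
```

The takeaway: each correction becomes part of the agent's example set, so consistent corrections gradually sharpen the boundary the agent applies.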


Getting Help

If you need assistance with Moderation Agents:

  • Use the in-app chat (bottom-right corner) for immediate support

  • Contact your TrollWall account manager for strategic guidance

  • Provide specific examples and test results when reporting issues

For technical issues, include:

  • Screenshots of agent settings

  • Examples of misclassified comments

  • Test results showing unexpected behavior
