Sexism Detection flags whether a response contains sexist content. The output is a binary label: sexist or not sexist.
Calculation method
Sexism detection is computed through a specialized process:
1. Model Architecture
The detection system is built on a Small Language Model (SLM) that combines training from both open-source datasets and carefully curated internal datasets to identify various forms of sexist content.
2. Performance Validation
The model achieves 83% accuracy on the Explainable Detection of Online Sexism (EDOS) dataset, a widely recognized benchmark for sexism detection.
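To make the binary output concrete, the sketch below shows how a wrapper around such a classifier might be structured. The names (`detect_sexism`, `SexismDetectionResult`) and the keyword heuristic standing in for the SLM are illustrative assumptions, not the actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SexismDetectionResult:
    is_sexist: bool  # binary classification output
    score: float     # classifier confidence in [0, 1]


def _slm_score(text: str) -> float:
    """Placeholder scorer standing in for the fine-tuned SLM.

    A trivial keyword heuristic so the sketch runs end to end;
    the real system uses a trained language model instead.
    """
    flagged_phrases = {"women can't", "girls are bad at"}
    lowered = text.lower()
    return 1.0 if any(p in lowered for p in flagged_phrases) else 0.0


def detect_sexism(response: str, threshold: float = 0.5) -> SexismDetectionResult:
    """Map a model response to a binary sexist / not-sexist label."""
    score = _slm_score(response)
    return SexismDetectionResult(is_sexist=score >= threshold, score=score)
```

In practice the `_slm_score` stub would be replaced by a call to the trained detection model; the surrounding wrapper simply thresholds its confidence into the binary label described above.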
Optimizing your AI system
Addressing Sexism in Your System
Implement guardrails: Flag responses before they are served so that sexist content never reaches users.
Fine-tune models: Adjust model behavior to reduce sexist outputs.
Identify responses that contain sexist content and take preventive measures to ensure fair and unbiased AI interactions.
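The guardrail step above can be sketched as a thin serving-time check: run the detector on each candidate response and substitute a safe fallback when it is flagged. The `serve_with_guardrail` function and the stub detector below are hypothetical illustrations, not part of the platform's API.

```python
from typing import Callable

FALLBACK = "I can't provide that response. Let me try again."


def serve_with_guardrail(
    response: str,
    is_sexist: Callable[[str], bool],
    fallback: str = FALLBACK,
) -> str:
    """Return the response unchanged unless the detector flags it."""
    if is_sexist(response):
        # Flagged before being served; log or replace per policy.
        return fallback
    return response


# Stub detector so the sketch runs; swap in the real classifier here.
def stub_detector(text: str) -> bool:
    return "belong in the kitchen" in text.lower()
```

The detector is passed in as a callable so the same guardrail can wrap the SLM-based classifier in production and a cheap stub in tests.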