QUADRA: Intent-Specific Counterspeech Generation

Overview

Combating Hate Speech with Intent-Aware AI

We explore intent-specific counterspeech generation to tackle hate speech online. Using the IntentCONAN v2 dataset—with 9,532 training examples balanced across four rhetorical intents—we propose a modular framework with a shared HateBERT encoder and intent-specific BART decoders.

Our research investigates three fusion mechanisms (Linear, Shared, and Cross Attention) to combine hate speech embeddings with intent representations. For evaluation, we introduce DialoRank, a zero-shot DialoGPT method that ranks responses by intent relevance.
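The ranking step in DialoRank can be pictured as "score every candidate response against the target intent, then sort". The sketch below is only an illustration of that step, not the DialoRank implementation: `rank_by_intent`, `toy_scorer`, and the `CUES` table are hypothetical names, and the cue-word scorer is a stand-in for whatever DialoGPT-based score the method actually uses.

```python
from typing import Callable, List, Tuple

Scorer = Callable[[str, str], float]   # (intent, response) -> relevance score


def rank_by_intent(responses: List[str], intent: str, scorer: Scorer) -> List[Tuple[str, float]]:
    """Sort candidate responses from most to least relevant to the intent."""
    scored = [(resp, scorer(intent, resp)) for resp in responses]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-in scorer: counts cue strings per intent.
# A DialoGPT-based scorer would replace this; its exact form is an assumption here.
CUES = {
    "questioning": ("?", "what", "why"),
    "informative": ("study", "according", "data"),
}


def toy_scorer(intent: str, response: str) -> float:
    text = response.lower()
    return float(sum(text.count(cue) for cue in CUES.get(intent, ())))


candidates = [
    "According to a study, the data say otherwise.",
    "What evidence do you have? Why generalize?",
]
ranking = rank_by_intent(candidates, "questioning", toy_scorer)
print(ranking[0][0])
```

Keeping the scorer as a plain callable means the sorting logic stays the same whichever model supplies the relevance score.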

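Our intent-aware models outperform the DialoGPT and GPS baselines on every lexical metric (ROUGE-1/2/L, METEOR) and on BERTScore. SharedFusion scores highest on most metrics, though only narrowly ahead of LinearFusion (ROUGE-1 0.251 vs. 0.250; BERTScore F1 0.871 vs. 0.870). Classification accuracy is the exception: GPS reaches 0.754, against 0.751 to 0.752 for the fusion models.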

📚

Informative

Providing facts and accurate information to counter false claims

⚖️

Denouncing

Calling out hate speech and condemning harmful behavior

❓

Questioning

Challenging assumptions through thought-provoking questions

💚

Positive

Promoting empathy, understanding, and constructive dialogue

Technical Approach

QUADRA Architecture

💬

Hate Speech Input

Raw hateful text

→

🔍

HateBERT Encoder

Pre-trained on hate speech for contextual embeddings

→

🔗

Fusion Layer

Combines hate embeddings with intent signals

→

✍️

BART Decoder

Intent-specific generation

→

💡

Counterspeech

Generated response
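The data flow above (shared encoder, fusion, then a decoder chosen by intent) can be sketched as a small routing class. This is a minimal sketch under stated assumptions: `IntentPipeline` and the `toy_*` helpers are hypothetical names, and the toy functions merely stand in for HateBERT, the fusion layer, and the BART decoders so that the example runs without model weights.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vector = List[float]

INTENTS = ("informative", "denouncing", "questioning", "positive")


@dataclass
class IntentPipeline:
    """Shared encoder, one fusion step, and one decoder per intent."""
    encode: Callable[[str], Vector]                # stand-in for the HateBERT encoder
    fuse: Callable[[Vector, str], Vector]          # stand-in for the fusion layer
    decoders: Dict[str, Callable[[Vector], str]]   # stand-ins for the BART decoders

    def generate(self, hate_speech: str, intent: str) -> str:
        if intent not in self.decoders:
            raise ValueError(f"unknown intent: {intent!r}")
        embedding = self.encode(hate_speech)       # shared across all intents
        fused = self.fuse(embedding, intent)       # inject the intent signal
        return self.decoders[intent](fused)        # route to the matching decoder


# Toy components (assumptions for illustration only).
def toy_encode(text: str) -> Vector:
    return [float(len(text)), float(len(text.split()))]


def toy_fuse(embedding: Vector, intent: str) -> Vector:
    return embedding + [float(INTENTS.index(intent))]


def make_toy_decoder(intent: str) -> Callable[[Vector], str]:
    return lambda fused: f"[{intent}] response from {len(fused)}-dim input"


pipeline = IntentPipeline(
    encode=toy_encode,
    fuse=toy_fuse,
    decoders={name: make_toy_decoder(name) for name in INTENTS},
)
print(pipeline.generate("example input", "questioning"))
```

The encoder is called once per input regardless of intent, which is the point of sharing it; only the final decoder lookup depends on the requested intent.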

Fusion Mechanisms

⊕

Linear Fusion

Simple linear combination of hate speech embeddings and intent vectors through concatenation and projection.

ROUGE-1: 0.250 | BERTScore (F1): 0.870
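Concatenation followed by projection can be written in a few lines. The sketch below uses plain Python lists and toy numbers; the dimensions, the weight matrix, and the function name `linear_fusion` are assumptions for illustration, not values or code from the trained model.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear_fusion(hate: Vector, intent: Vector, weight: Matrix, bias: Vector) -> Vector:
    """Concatenate the two vectors, then apply one affine projection."""
    joined = hate + intent                      # concatenation: [h ; i]
    if any(len(row) != len(joined) for row in weight) or len(weight) != len(bias):
        raise ValueError("weight must be out_dim x (len(hate) + len(intent))")
    return [
        sum(w * x for w, x in zip(row, joined)) + b
        for row, b in zip(weight, bias)
    ]


# Toy numbers: 3-dim hate embedding, 2-dim intent vector, projected back to 3 dims.
hate_embedding = [1.0, 2.0, 3.0]
intent_vector = [1.0, 0.0]
weight = [
    [1.0, 0.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.5, 0.5],
]
bias = [0.0, 0.0, 0.0]

fused = linear_fusion(hate_embedding, intent_vector, weight, bias)
print(fused)  # [1.5, 2.0, 3.5]
```

In a trained model the weight matrix and bias would be learned, and the projection would typically map back to the decoder's hidden size.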

⚡

Shared Fusion

Best Performance

Shared representation learning that jointly models hate speech and intent in a unified embedding space.

ROUGE-1: 0.251 | BERTScore (F1): 0.871

✕

Cross Attention

Cross-attention mechanism that allows intent signals to selectively attend to relevant hate speech features.

ROUGE-1: 0.242 | BERTScore (F1): 0.870
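One way to picture intent signals attending to hate speech features is single-head scaled dot-product attention, with the intent vector as the query and the token features as keys and values. The sketch below assumes exactly that and omits learned projections; `cross_attend` and the toy vectors are hypothetical, and the actual layer may differ (multiple heads, projections, residual connections).

```python
import math
from typing import List, Tuple

Vector = List[float]


def cross_attend(query: Vector, tokens: List[Vector]) -> Tuple[Vector, List[float]]:
    """Single-head scaled dot-product attention with one query.

    The intent vector is the query; each hate speech token feature acts as
    both key and value (learned projections are omitted for brevity).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, tok)) / scale for tok in tokens]
    peak = max(scores)                              # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]             # softmax over tokens
    dim = len(tokens[0])
    context = [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
    return context, weights


intent_query = [1.0, 0.0]
token_features = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
context, weights = cross_attend(intent_query, token_features)
print([round(w, 3) for w in weights])
```

The first token aligns with the query, so it receives most of the attention weight and dominates the returned context vector.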

Evaluation

Experimental Results

Text Generation Metrics

Model	ROUGE-1	ROUGE-2	ROUGE-L	METEOR
SharedFusion	0.251	0.065	0.176	0.158
LinearFusion	0.250	0.064	0.175	0.154
CrossFusion	0.242	0.061	0.171	0.152
GPS	0.176	0.030	0.132	0.116
DialoGPT	0.130	0.003	0.105	0.040
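For readers unfamiliar with ROUGE-1, it measures unigram overlap between a generated response and a reference. The sketch below computes a simplified ROUGE-1 F1 with lowercasing and whitespace tokenisation; it is an illustration of the metric, not the evaluation script behind the table, which may use stemming, a different tokeniser, or a different aggregation. The function name `rouge1_f1` and the example sentences are assumptions.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 with whitespace tokenisation and lowercasing."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())        # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1(
    "immigrants help the economy grow",
    "immigrants contribute to the economy",
)
print(round(score, 3))  # 0.6
```

Here three of five unigrams match on each side, so precision and recall are both 0.6 and the F1 is 0.6.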

Semantic Similarity Metrics

Model	BS (P)	BS (R)	BS (F1)	CA
SharedFusion	0.871	0.870	0.871	0.751
LinearFusion	0.869	0.871	0.870	0.752
CrossFusion	0.870	0.869	0.870	0.752
GPS	0.240	0.121	0.180	0.754
DialoGPT	0.791	0.808	0.799	0.681

BS: BERTScore (P: precision, R: recall, F1: F1 score) | CA: Classification Accuracy

Demonstrations

Counterspeech in Action

See how our model generates different types of counterspeech responses to hate speech. Each intent type addresses the harmful content in a unique, rhetorically strategic way.

Hate Speech

"Immigrants are taking all our jobs and ruining our country."

Informative Response

Actually, immigrants contribute significantly to the economy. According to a study by the National Bureau of Economic Research, immigrants have a positive impact on job creation and innovation. Additionally, immigrants often fill labor gaps, boosting economic growth.

Hate Speech

"Muslims don't belong in Western society."

Denouncing Response

It is unacceptable to generalize and stereotype an entire religious group based on the actions of a few individuals. We should promote understanding and respect for all individuals, regardless of their religious background.

Hate Speech

"Jews control all the banks and media."

Questioning Response

What evidence do you have to support the claim that Jews are controlling banks and media? Can you provide examples to support this claim? Isn't it unfair to generalize an entire group based on stereotypes?

Hate Speech

"Women are too emotional to be leaders."

Positive Response

I understand that everyone has their own opinions, but it's important to remember that equality and respect are fundamental values in our society. Women have proven time and again that they are more than capable of leading and excelling in any field they choose.

Access

Resources & Downloads

📦

Model Weights

Pre-trained weights for the fine-tuned DialoGPT, Linear Fusion, Shared Fusion, and Cross Attention models

Download from Google Drive →

📄

QUARC Paper

Original research paper on Intent Distribution Learning and Persistent Fusion

View on GitHub →

Project Report

Full research report with detailed methodology, experiments, and analysis

Download PDF →

📊

IntentCONAN v2 Dataset

9,532 training examples balanced across four rhetorical intents

Train: 9,532 | Validation: 1,191 | Test: 2,972