Methodology

How StreamOracle analyzes viewership data to produce suspicion scores.

Philosophy

StreamOracle follows the principle of "Score, don't accuse." Suspicion scores are statistical indicators based on data patterns, not definitive proof of manipulation. Many legitimate factors can influence scores, including raids, embeds, events, and platform promotions. Always consider context when interpreting results.

Detection Signals

Seven independent signals each produce a score from 0 to 100.

Chatter-to-Viewer Ratio (CVR)

Weight: 1.5

Compares the number of unique chatters to the reported viewer count. Legitimate streams typically keep a fairly consistent ratio of active chatters to viewers, while artificially inflated streams show very low chat engagement relative to their viewer count.

Detects: Inflated viewer counts with low chat engagement
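As a rough illustration of this signal, the sketch below turns a chatter-to-viewer ratio into a 0 to 100 score. The linear mapping and the `expected_ratio` parameter are assumptions made for the example; the text above does not specify how StreamOracle converts the ratio into a score.

```python
from typing import Optional


def cvr_score(unique_chatters: int, viewers: int,
              expected_ratio: float = 0.10) -> Optional[float]:
    """Score a chatter-to-viewer ratio from 0 (normal) to 100 (suspicious).

    expected_ratio is a hypothetical baseline, not a documented value.
    """
    if viewers <= 0:
        return None  # no viewers reported, so the ratio is undefined
    ratio = unique_chatters / viewers
    # Assumed mapping: the further the ratio falls below the baseline,
    # the higher the score; ratios at or above the baseline score 0.
    shortfall = 1.0 - ratio / expected_ratio
    return 100.0 * min(1.0, max(0.0, shortfall))
```

With the assumed baseline of 0.10, a stream reporting 1,000 viewers and only 10 unique chatters scores close to 90, while one with 200 chatters scores 0.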

Step Function Detection

Weight: 1.2

Identifies sudden, sharp jumps or drops in viewer count that occur without natural ramp-up. Organic viewership changes tend to be gradual, following discovery and sharing patterns.

Detects: Sudden, unnatural viewer count changes
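One simple way to look for such jumps is to compare consecutive snapshots and flag large relative changes. The sketch below does that; `rel_threshold` and `min_viewers` are hypothetical parameters chosen for illustration, not values taken from StreamOracle.

```python
from typing import List, Tuple


def find_steps(counts: List[int], rel_threshold: float = 0.5,
               min_viewers: int = 50) -> List[Tuple[int, float]]:
    """Return (index, relative_change) for each abrupt jump or drop.

    rel_threshold and min_viewers are illustrative assumptions.
    """
    steps = []
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        # Floor the baseline so tiny channels do not trigger on noise
        # (for example, going from 2 to 4 viewers).
        base = max(prev, min_viewers)
        change = (cur - prev) / base
        if abs(change) >= rel_threshold:
            steps.append((i, change))
    return steps


series = [100, 105, 110, 400, 410, 200]
flagged = [index for index, _ in find_steps(series)]  # indices 3 and 5
```

A flagged step is only a candidate: a raid produces the same shape, so context such as the raid source still has to be checked.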

Chat Entropy

Measures the diversity and randomness of chat messages. Genuine chat exhibits varied vocabulary, sentence structure, and timing. Low entropy suggests repetitive or scripted messages.

Detects: Scripted, repetitive, or bot-generated chat messages
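Message diversity can be approximated with Shannon entropy over the words used in chat. The sketch below covers only vocabulary; it does not model sentence structure or timing, and the whitespace tokenization is an assumption made for the example.

```python
import math
from collections import Counter
from typing import List


def token_entropy(messages: List[str]) -> float:
    """Shannon entropy, in bits, of the word distribution across messages."""
    tokens = [tok for msg in messages for tok in msg.lower().split()]
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    total = len(tokens)
    # H = -sum(p * log2(p)) over the observed word frequencies.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A chat consisting only of the same repeated word has an entropy of 0 bits, while four equally frequent words give 2 bits.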

Follower Ratio

Weight: 0.8

Analyzes the relationship between follower count and concurrent viewers. Channels with unusually high viewer-to-follower ratios may be receiving artificial viewers.

Detects: Viewer count disproportionate to follower base

Growth Analysis

Weight: 0.8

Evaluates channel growth trajectory over time. Natural growth follows discoverable patterns tied to content, raids, and platform promotion. Anomalous growth can indicate artificial inflation.

Detects: Unnatural growth spikes and trajectories

Benford's Law

Weight: 0.7

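Tests the distribution of leading digits in viewer counts against Benford's Law, the observation that in many naturally occurring datasets the leading digit is not uniformly distributed: 1 appears first about 30% of the time, and larger digits appear progressively less often. Artificially generated numbers tend to deviate from this distribution.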

Detects: Statistically improbable viewer count patterns
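A common way to test for this is a chi-square statistic comparing observed leading-digit counts with the Benford expectation, P(d) = log10(1 + 1/d). The sketch below assumes that approach; the text above does not say which test StreamOracle applies. Benford's Law fits best when values span several orders of magnitude, so a short or narrow series of viewer counts may deviate for innocent reasons.

```python
import math
from collections import Counter
from typing import Iterable, Optional

# Expected frequency of each leading digit under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_chi_square(viewer_counts: Iterable[int]) -> Optional[float]:
    """Chi-square distance between observed leading digits and Benford."""
    digits = [int(str(n)[0]) for n in viewer_counts if n > 0]
    n = len(digits)
    if n == 0:
        return None  # nothing to test
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in BENFORD.items()
    )
```

Larger values mean a stronger deviation; how a value is converted into a 0 to 100 score is left open here.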

Temporal Pattern

Examines how viewer counts change over time, looking for suspiciously regular patterns, flat lines, or mathematically perfect curves that differ from organic viewership behavior.

Detects: Artificially stable or patterned viewer counts
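Flat lines are the easiest of these patterns to illustrate. The sketch below uses the coefficient of variation (standard deviation divided by mean) as a stand-in measure of stability; it is an assumed metric for the example and does not cover regular cycles or perfect curves.

```python
import statistics
from typing import List, Optional


def coefficient_of_variation(counts: List[int]) -> Optional[float]:
    """Population standard deviation of the counts divided by their mean."""
    if len(counts) < 2:
        return None  # not enough points to measure variation
    mean = statistics.fmean(counts)
    if mean == 0:
        return None  # avoid dividing by zero on an empty stream
    return statistics.pstdev(counts) / mean


flat = coefficient_of_variation([500] * 12)
organic = coefficient_of_variation([480, 510, 530, 495, 560, 470])
```

A value at or near zero over many snapshots means the count barely moved, which is unusual for a live audience of any size.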

Aggregation Formula

final_score = Sum(score_i * weight_i * confidence_i) / Sum(weight_i * confidence_i)

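Each signal's score is multiplied by its weight and its confidence level, and the sum of these products is divided by the sum of weight times confidence across all signals. Because the result is a weighted average, it stays within the 0 to 100 range, and signals with higher weight or higher confidence have more influence on the final score.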
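The formula can be sketched in a few lines. The `SignalResult` type, the 0.0 to 1.0 confidence range, and the choice to return `None` when no signal has any confidence are assumptions made for the example.

```python
from typing import List, NamedTuple, Optional


class SignalResult(NamedTuple):
    score: float       # 0 to 100
    weight: float      # per-signal weight
    confidence: float  # assumed to lie between 0.0 and 1.0


def aggregate(signals: List[SignalResult]) -> Optional[float]:
    """Confidence-weighted average of the signal scores."""
    denominator = sum(s.weight * s.confidence for s in signals)
    if denominator == 0:
        return None  # no usable signal, so no score can be produced
    numerator = sum(s.score * s.weight * s.confidence for s in signals)
    return numerator / denominator


final_score = aggregate([
    SignalResult(score=80, weight=1.5, confidence=1.0),
    SignalResult(score=20, weight=0.8, confidence=0.5),
])
```

In this example the result is 128 / 1.9, roughly 67.4: the first signal dominates because both its weight and its confidence are higher.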

Score Labels

Score Range	Label
0 - 20	Normal
21 - 40	Low
41 - 60	Moderate
61 - 80	Elevated
81 - 100	High
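The table maps directly to a lookup. The sketch below assumes that a fractional score between two bands (for example 20.5) belongs to the higher band; the table only lists whole numbers, so that rounding rule is a guess.

```python
def score_label(score: float) -> str:
    """Return the label for a final score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # Upper bound of each band, taken from the Score Labels table.
    bands = [
        (20, "Normal"),
        (40, "Low"),
        (60, "Moderate"),
        (80, "Elevated"),
        (100, "High"),
    ]
    for upper, label in bands:
        if score <= upper:
            return label
    return "High"  # unreachable after the range check above
```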

Data Collection

Viewership snapshots are collected at regular intervals (typically every 5 minutes) for tracked channels. Each snapshot records the viewer count, chatter count, and current category. Analysis requires a minimum number of data points to produce reliable scores.
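A snapshot as described here could be modeled as follows. The field names, the timestamp field, and the `MIN_SNAPSHOTS` value are hypothetical; the text above does not state the actual minimum number of data points.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass(frozen=True)
class Snapshot:
    taken_at: datetime   # assumed field: when the snapshot was recorded
    viewer_count: int
    chatter_count: int
    category: str


MIN_SNAPSHOTS = 12  # hypothetical: one hour of data at 5-minute intervals


def has_enough_data(snapshots: List[Snapshot],
                    minimum: int = MIN_SNAPSHOTS) -> bool:
    """Check whether a channel has enough snapshots to be analyzed."""
    return len(snapshots) >= minimum


start = datetime(2024, 1, 1, tzinfo=timezone.utc)
hour = [
    Snapshot(start + timedelta(minutes=5 * i), 500 + i, 40, "Just Chatting")
    for i in range(12)
]
```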

Limitations & Disclaimers

Important Limitations

Scores are statistical indicators, not definitive proof of viewership manipulation.
Raids, hosted streams, embeds, and platform promotions can naturally cause anomalous patterns.
New or small channels may have insufficient data for accurate analysis.
The system cannot distinguish between purchased viewers and legitimate viral events with certainty.
Platform API changes may affect data accuracy.