Spam Score Interpretation

What is a Phone Spam Score?

A phone spam score is a numerical value, typically 0-100, indicating the likelihood that a phone number is associated with spam, scam, or fraudulent activity. Higher scores indicate higher risk.

Score Sources

Consumer complaints - Reports submitted to FTC, carriers, and spam-blocking apps
Call pattern analysis - High volume, short duration, geographic spread
Honeypot data - Calls placed to decoy numbers that have no legitimate reason to receive calls
Carrier intelligence - Internal carrier spam detection systems
Industry databases - Shared blocklists and fraud consortiums
STIR/SHAKEN data - Attestation failures and patterns

Score Components

While exact algorithms are proprietary, spam scores typically weight these factors:

Component	Description	Typical Weight
Report Volume	Number of consumer complaints	High
Report Recency	How recently complaints were filed	High
Report Diversity	Complaints from multiple sources	Medium-High
Call Patterns	Volume, timing, duration patterns	Medium
Robocaller Flags	Known automated calling systems	High
Carrier Risk	Historical spam rates from carrier	Low-Medium
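
Since the exact algorithms are proprietary, the following is only an illustrative sketch of how the qualitative weights in the table could be combined into a 0-100 score. The numeric weights, the component key names, and the assumption that each component arrives pre-normalized to 0.0-1.0 are all hypothetical and do not describe any provider's actual formula.

```python
# Hypothetical numeric stand-ins for the qualitative weights in the table.
WEIGHTS = {
    "report_volume": 3.0,     # High
    "report_recency": 3.0,    # High
    "report_diversity": 2.5,  # Medium-High
    "call_patterns": 2.0,     # Medium
    "robocaller_flags": 3.0,  # High
    "carrier_risk": 1.5,      # Low-Medium
}


def weighted_spam_score(components):
    """Combine component signals (each 0.0-1.0) into a 0-100 score.

    Missing components count as 0.0; values are clamped to [0.0, 1.0].
    """
    total_weight = sum(WEIGHTS.values())
    weighted = sum(
        weight * min(1.0, max(0.0, components.get(name, 0.0)))
        for name, weight in WEIGHTS.items()
    )
    return round(100 * weighted / total_weight)
```

In this sketch a number whose only signal is a fully saturated carrier risk lands at the low end of the scale, which matches the idea that carrier-level factors alone should not push a number into blocking territory.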

Understanding Score Ranges

The table below shows how to interpret each range of the 0-100 scale:

Score Interpretation Guide

Score Range	Risk Level	Typical Characteristics
0-10	Minimal	Clean number, no complaints, established history
11-25	Low	Minor or old complaints, or characteristics similar to spam numbers
26-40	Moderate	Some recent complaints, patterns warrant monitoring
41-60	Elevated	Multiple complaints, risky call patterns
61-80	High	Many complaints, known spam associations
81-100	Critical	Confirmed spam/scam, robocaller databases

What Each Range Means in Practice

0-10: Minimal Risk

Numbers in this range are likely legitimate with clean histories. However, don't assume zero risk: a newly activated fraud number may not have accumulated complaints yet. For new numbers, use additional signals such as line type and velocity.

11-25: Low Risk

These numbers may have minor historical complaints or characteristics similar to spam numbers (e.g., VoIP from a carrier with spam history). Generally safe to proceed with normal verification.

26-40: Moderate Risk

Warrants attention. Recent complaints may be accumulating, or patterns are emerging. Consider requiring additional verification for sensitive actions.

41-60: Elevated Risk

Significant spam signals present. Challenge with additional verification (SMS OTP, voice call) before allowing high-value actions. Monitor closely after allowing.

61-80: High Risk

Strong spam association. Block or require substantial verification (document upload, video verification). Only proceed if verification passes and user demonstrates legitimate intent.

81-100: Critical Risk

Confirmed spam or scam association. Block automatically for most use cases. Only allow with exceptional verification and manual review for legitimate appeals.
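
The ranges above can be turned into a small lookup. This is a minimal sketch: the range boundaries come from the interpretation table, while the function name and the short handling labels are hypothetical summaries of the guidance in this section.

```python
from bisect import bisect_left

# Inclusive upper bound of each range in the interpretation table.
_UPPER_BOUNDS = [10, 25, 40, 60, 80, 100]

# (risk level, hypothetical label summarizing the suggested handling)
_LEVELS = [
    ("minimal", "allow"),
    ("low", "allow"),
    ("moderate", "monitor"),
    ("elevated", "challenge"),
    ("high", "block_or_strong_verification"),
    ("critical", "block"),
]


def interpret_score(score):
    """Return the risk level and suggested handling for a 0-100 spam score."""
    if not 0 <= score <= 100:
        raise ValueError(f"spam score out of range: {score}")
    # bisect_left keeps each upper bound inside its own range (10 -> minimal).
    level, handling = _LEVELS[bisect_left(_UPPER_BOUNDS, score)]
    return {"risk_level": level, "handling": handling}
```

For example, `interpret_score(41)` falls in the elevated range and suggests a challenge, while `interpret_score(40)` is still moderate.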

Selecting Appropriate Thresholds

Choosing the right threshold depends on your risk tolerance, user base, and the action being protected.

Threshold Considerations

False positive cost - What's the impact of blocking a legitimate user?
False negative cost - What's the impact of letting fraud through?
User base characteristics - Are your users likely to have VoIP, prepaid, etc.?
Action sensitivity - Account creation vs. $10,000 transaction
Verification options - Can you challenge instead of block?
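
One way to make the first two considerations concrete is to put a price on each kind of error and pick the threshold with the lowest total cost. The sketch below assumes you have labeled historical records with `spam_score` and `was_fraud` fields; the function name and the per-error cost parameters are hypothetical.

```python
def pick_threshold_by_cost(records, fp_cost, fn_cost, candidates=range(0, 101, 5)):
    """Return (threshold, total_cost) for the cheapest block threshold.

    records: iterable of dicts with 'spam_score' and 'was_fraud' keys.
    fp_cost: cost of blocking one legitimate user.
    fn_cost: cost of letting one fraudulent user through.
    """
    records = list(records)
    best_threshold, best_cost = None, float("inf")
    for threshold in candidates:
        false_positives = sum(
            1 for r in records
            if r["spam_score"] >= threshold and not r["was_fraud"]
        )
        false_negatives = sum(
            1 for r in records
            if r["spam_score"] < threshold and r["was_fraud"]
        )
        cost = fp_cost * false_positives + fn_cost * false_negatives
        # Strict comparison keeps the lowest threshold when costs tie.
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost
```

Raising `fp_cost` relative to `fn_cost` pushes the chosen threshold upward (fewer legitimate users blocked), and the reverse pushes it downward.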

Threshold Recommendations by Use Case

Use Case	Block Threshold	Challenge Threshold	Rationale
Account Registration	80+	50+	Allow most signups, challenge suspicious
SMS 2FA Setup	70+	40+	Don't send OTPs to likely spam numbers
Financial Transaction	60+	35+	Higher fraud cost justifies more friction
High-Value Transaction	50+	25+	Maximum protection for big transactions
Inbound Call Screening	85+	60+	Don't want to block legitimate callers
Lead Verification	70+	45+	Don't waste sales resources on bad leads
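
The table can be applied directly as a lookup. This is a minimal sketch: the threshold values are copied from the table, while the use-case keys and the function name are hypothetical.

```python
# (block_threshold, challenge_threshold) per use case, copied from the table.
USE_CASE_THRESHOLDS = {
    "account_registration": (80, 50),
    "sms_2fa_setup": (70, 40),
    "financial_transaction": (60, 35),
    "high_value_transaction": (50, 25),
    "inbound_call_screening": (85, 60),
    "lead_verification": (70, 45),
}


def decide(use_case, spam_score):
    """Return 'block', 'challenge', or 'allow' for a score in a given use case."""
    block_at, challenge_at = USE_CASE_THRESHOLDS[use_case]
    if spam_score >= block_at:
        return "block"
    if spam_score >= challenge_at:
        return "challenge"
    return "allow"
```

The same score of 55 is allowed for inbound call screening, challenged at account registration, and blocked for a high-value transaction, which is the point of setting thresholds per use case.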

Threshold Tuning Process

def tune_thresholds(historical_data, target_false_positive_rate=0.02):
    """
    Tune spam score thresholds based on historical data.

    Args:
        historical_data: List of {spam_score, was_fraud} records
        target_false_positive_rate: Maximum acceptable FP rate
    """

    # Sort by spam score
    sorted_data = sorted(historical_data, key=lambda x: x['spam_score'])

    results = []

    for threshold in range(0, 101, 5):
        # Calculate metrics at this threshold
        blocked = [d for d in sorted_data if d['spam_score'] >= threshold]
        allowed = [d for d in sorted_data if d['spam_score'] < threshold]

        true_positives = sum(1 for d in blocked if d['was_fraud'])
        false_positives = sum(1 for d in blocked if not d['was_fraud'])
        true_negatives = sum(1 for d in allowed if not d['was_fraud'])
        false_negatives = sum(1 for d in allowed if d['was_fraud'])

        total_fraud = true_positives + false_negatives
        total_legit = true_negatives + false_positives

        precision = true_positives / (true_positives + false_positives) if blocked else 0
        recall = true_positives / total_fraud if total_fraud else 0
        fpr = false_positives / total_legit if total_legit else 0

        results.append({
            'threshold': threshold,
            'precision': precision,
            'recall': recall,
            'false_positive_rate': fpr,
            'blocked_count': len(blocked)
        })

    # Find optimal threshold meeting FPR constraint
    valid_thresholds = [r for r in results if r['false_positive_rate'] <= target_false_positive_rate]

    if valid_thresholds:
        # Among valid thresholds, pick one with best recall
        optimal = max(valid_thresholds, key=lambda x: x['recall'])
        return optimal
    else:
        # Can't meet constraint - return threshold with lowest FPR
        return min(results, key=lambda x: x['false_positive_rate'])

Combining Spam Score with Other Signals

Spam scores are most effective when combined with other phone intelligence and behavioral signals.

Signal Combination Strategy

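from datetime import datetime

# Third-party python-dateutil; assumes the API returns naive dates (no timezone offset)
from dateutil.parser import parse as parse_date

def calculate_combined_phone_risk(phone_data, context):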
    """
    Combine spam score with other signals for comprehensive risk assessment.

    Args:
        phone_data: API response with spam, lrn, cnam data
        context: Additional context (transaction amount, user history, etc.)

    Returns:
        Combined risk score and recommendation
    """

    # Base spam score (0-100)
    spam_score = phone_data.get('spam', {}).get('score', 0)

    # Apply modifiers based on other signals
    modifiers = []

    # Line type modifier
    line_type = phone_data.get('lrn', {}).get('line_type')
    if line_type == 'voip':
        modifiers.append({'factor': 'voip', 'adjustment': 15})
    elif line_type == 'landline':
        modifiers.append({'factor': 'landline', 'adjustment': -5})

    # Number age modifier
    activation_date = phone_data.get('lrn', {}).get('activation_date')
    if activation_date:
        days_old = (datetime.now() - parse_date(activation_date)).days
        if days_old < 7:
            modifiers.append({'factor': 'very_new', 'adjustment': 20})
        elif days_old < 30:
            modifiers.append({'factor': 'new', 'adjustment': 10})
        elif days_old > 365:
            modifiers.append({'factor': 'established', 'adjustment': -5})

    # Robocaller flag (binary, high impact)
    if phone_data.get('spam', {}).get('is_robocaller'):
        modifiers.append({'factor': 'robocaller', 'adjustment': 30})

    # CNAM presence
    if not phone_data.get('cnam', {}).get('name'):
        modifiers.append({'factor': 'no_cnam', 'adjustment': 5})

    # Recent porting (possible SIM swap)
    if phone_data.get('lrn', {}).get('ported'):
        port_date = phone_data.get('lrn', {}).get('port_date')
        if port_date:
            days_since_port = (datetime.now() - parse_date(port_date)).days
            if days_since_port < 7:
                modifiers.append({'factor': 'recent_port', 'adjustment': 15})

    # Context-based modifiers
    if context.get('transaction_amount', 0) > 1000:
        modifiers.append({'factor': 'high_value', 'adjustment': 10})

    # Calculate final score
    total_adjustment = sum(m['adjustment'] for m in modifiers)
    combined_score = min(100, max(0, spam_score + total_adjustment))

    # Determine action
    if combined_score >= 70:
        action = 'block'
    elif combined_score >= 45:
        action = 'challenge'
    elif combined_score >= 25:
        action = 'monitor'
    else:
        action = 'allow'

    return {
        'base_spam_score': spam_score,
        'combined_score': combined_score,
        'modifiers': modifiers,
        'action': action
    }

Signal Weight Matrix

Signal	Low Risk	Medium Risk	High Risk
Spam Score	< 25	25-60	> 60
Line Type	Landline	Mobile	VoIP
Number Age	> 1 year	1-12 months	< 1 month
CNAM	Present, matches	Present, generic	Missing
Porting	Not ported	Ported > 30 days	Ported < 30 days
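
The matrix can be read as a per-signal classification. The sketch below is one possible encoding: the `cnam_status` values (`match`, `generic`, `missing`), the parameter names, and the treatment of a port exactly 30 days old as medium risk are assumptions, since the matrix does not specify them.

```python
_RANK = {"low": 0, "medium": 1, "high": 2}


def classify_signals(spam_score, line_type, number_age_days,
                     cnam_status, days_since_port=None):
    """Map each signal to 'low', 'medium', or 'high' following the matrix.

    days_since_port is None for a number that has never been ported.
    Unknown line_type or cnam_status values raise KeyError.
    """
    if days_since_port is None:
        porting = "low"
    elif days_since_port < 30:
        porting = "high"
    else:
        porting = "medium"

    levels = {
        "spam_score": ("low" if spam_score < 25
                       else "medium" if spam_score <= 60 else "high"),
        "line_type": {"landline": "low", "mobile": "medium",
                      "voip": "high"}[line_type],
        "number_age": ("high" if number_age_days < 30
                       else "medium" if number_age_days <= 365 else "low"),
        "cnam": {"match": "low", "generic": "medium",
                 "missing": "high"}[cnam_status],
        "porting": porting,
    }
    # Report the most severe level so callers can see the worst signal at a glance.
    levels["worst"] = max(levels.values(), key=_RANK.get)
    return levels
```

Reporting the worst level alongside the individual ones is a design choice for this sketch; a weighted combination, as in the function above the matrix, is an equally reasonable way to merge the signals.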

Understanding Spam Types

Beyond numeric scores, spam type classification provides actionable context:

Common Spam Type Classifications

Spam Type	Description	Typical Risk
robocaller	Automated calling systems	High
telemarketer	Sales calls (may be legal)	Medium
scam_likely	Potential scam operation	Very High
debt_collector	Collection agencies	Medium
political	Campaign/political calls	Low-Medium
survey	Research/survey calls	Low
nuisance	Unwanted but not fraudulent	Low

def handle_spam_type(spam_data, context):
    """Apply spam type-specific handling."""

    spam_type = spam_data.get('spam_type')
    spam_score = spam_data.get('score', 0)

    # Type-specific overrides
    if spam_type == 'scam_likely':
        # Always block numbers classified as scam_likely, regardless of score
        return {'action': 'block', 'reason': 'scam_classification'}

    if spam_type == 'robocaller':
        # Block robocallers for phone verification
        if context.get('action') == 'send_otp':
            return {'action': 'block', 'reason': 'robocaller_otp_abuse'}

    if spam_type == 'telemarketer':
        # Telemarketers might be legitimate businesses
        # Only challenge, and only if the score is also high
        if spam_score >= 50:
            return {'action': 'challenge', 'reason': 'telemarketer_high_score'}

    if spam_type == 'debt_collector':
        # Could be legitimate - proceed with caution
        return {'action': 'allow', 'flag': 'debt_collector'}

    # Default: use score-based logic
    return None

Handling Score Changes Over Time

Spam scores are dynamic - they change as new complaints are filed and old ones age out.

Score Volatility Patterns

Sudden increases - New complaint wave, often indicates active spam campaign
Gradual increases - Accumulating complaints over time
Score decreases - Old complaints aging out, number may have changed hands
Score stability - Consistent behavior (good or bad)
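
These patterns can be detected from a short history of scores for the same number. The sketch below is one simple approach; the function name and the `jump` and `drift` cutoffs are hypothetical and should be tuned against your own data.

```python
def classify_score_trend(history, jump=20, drift=10):
    """Label a chronological list of scores with one of the four patterns.

    jump: minimum rise between consecutive readings to count as sudden.
    drift: minimum net change over the whole window to count as a trend.
    """
    if len(history) < 2:
        return "stable"

    # Change between each pair of consecutive readings.
    steps = [later - earlier for earlier, later in zip(history, history[1:])]
    if max(steps) >= jump:
        return "sudden_increase"

    net_change = history[-1] - history[0]
    if net_change >= drift:
        return "gradual_increase"
    if net_change <= -drift:
        return "decrease"
    return "stable"
```

A sudden increase is checked first because it is the pattern most likely to indicate an active campaign and therefore the one that most warrants a forced cache refresh or re-verification.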

Caching Strategy

import json
import time

class SpamScoreCache:
    """Intelligent spam score caching."""

    def __init__(self, redis, phone_api):
        self.redis = redis
        self.api = phone_api

    async def get_spam_score(self, phone, force_refresh=False):
        """Get spam score with intelligent caching."""

        cache_key = f"spam:{phone}"

        if not force_refresh:
            cached = await self.redis.get(cache_key)
            if cached:
                data = json.loads(cached)
                age_minutes = (time.time() - data['cached_at']) / 60

                # Score-based TTL: Higher scores refresh more frequently
                max_age = self._get_ttl_minutes(data['score'])

                if age_minutes < max_age:
                    return data

        # Fetch fresh data
        result = await self.api.lookup(phone, {'spam': True})
        spam_data = result.get('spam', {})

        cache_data = {
            'score': spam_data.get('score', 0),
            'is_robocaller': spam_data.get('is_robocaller', False),
            'spam_type': spam_data.get('spam_type'),
            'cached_at': time.time()
        }

        # Cache with score-appropriate TTL
        ttl_seconds = self._get_ttl_minutes(cache_data['score']) * 60
        await self.redis.set(cache_key, json.dumps(cache_data), ex=ttl_seconds)

        return cache_data

    def _get_ttl_minutes(self, score):
        """Higher risk numbers get shorter cache TTL."""
        if score >= 70:
            return 30   # 30 minutes for high risk
        elif score >= 40:
            return 120  # 2 hours for medium risk
        else:
            return 360  # 6 hours for low risk

Handling False Positives

No spam detection system is perfect. Handle false positives gracefully:

False Positive Mitigation

Don't block silently - Tell users why they're challenged/blocked
Offer alternatives - Different phone, email verification, document upload
Implement appeals - Let users request review
Track and learn - Log false positives to improve thresholds

from datetime import datetime

class FalsePositiveHandler:
    """Handle potential false positives gracefully."""

    def block_with_appeal(self, phone, spam_score, action):
        """Block action but offer appeal path."""

        return {
            'blocked': True,
            'reason': 'phone_risk_score',
            'appeal_available': True,
            'appeal_options': [
                {
                    'type': 'alternate_phone',
                    'description': 'Use a different phone number'
                },
                {
                    'type': 'email_verification',
                    'description': 'Verify via email instead'
                },
                {
                    'type': 'manual_review',
                    'description': 'Request manual review (24-48 hours)'
                }
            ],
            'message': (
                "We couldn't verify your phone number. "
                "Please try an alternative verification method."
            )
        }

    async def process_appeal(self, user_id, phone, appeal_type, evidence):
        """Process a false positive appeal."""

        appeal = {
            'user_id': user_id,
            'phone': phone,
            'appeal_type': appeal_type,
            'evidence': evidence,
            'status': 'pending',
            'created_at': datetime.now()
        }

        # Auto-approve certain cases
        if appeal_type == 'alternate_phone':
            new_phone = evidence.get('new_phone')

            if new_phone:
                # Assumes get_spam_score (not shown) returns a numeric 0-100 score
                new_score = await self.get_spam_score(new_phone)

                if new_score < 30:
                    appeal['status'] = 'auto_approved'
                    return {'approved': True, 'new_phone': new_phone}

        # Queue for manual review
        await self.queue_for_review(appeal)
        return {'approved': False, 'status': 'pending_review'}

Best Practices

Don't use spam score alone - Combine with line type, age, and context
Set thresholds per use case - Registration vs. transaction vs. call screening
Challenge before blocking - Give users a chance to verify
Cache intelligently - Shorter TTL for high-risk scores
Track outcomes - Measure false positive/negative rates
Provide appeals - Gracefully handle legitimate users with bad numbers
Monitor score distributions - Watch for anomalies in your traffic
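
For the last practice, a simple way to watch for anomalies is to compare how your traffic is spread across the score ranges against a baseline period. This is a minimal sketch assuming integer scores; the function names and the 10-point `tolerance` are hypothetical.

```python
from collections import Counter

# Score ranges from the interpretation table (inclusive bounds).
BUCKETS = [(0, 10), (11, 25), (26, 40), (41, 60), (61, 80), (81, 100)]


def bucket_shares(scores):
    """Return the fraction of scores that fall in each range."""
    counts = Counter()
    for score in scores:
        for low, high in BUCKETS:
            if low <= score <= high:
                counts[(low, high)] += 1
                break
    total = len(scores)
    return {b: (counts[b] / total if total else 0.0) for b in BUCKETS}


def find_anomalies(baseline_scores, current_scores, tolerance=0.10):
    """List ranges whose share of traffic moved by more than `tolerance`."""
    baseline = bucket_shares(baseline_scores)
    current = bucket_shares(current_scores)
    return [b for b in BUCKETS if abs(current[b] - baseline[b]) > tolerance]
```

A jump in the share of 81-100 traffic, for instance, may point to an attack in progress, while a shift across all ranges may instead mean the provider changed its scoring.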

Frequently Asked Questions

Why does a new phone number have a non-zero spam score?

New numbers may inherit some risk from carrier-level factors (carriers with historically high spam), line type (VoIP numbers have baseline risk), or number range characteristics. Additionally, "new" to the current user doesn't mean new to the system - the number may have been previously assigned to someone who received complaints. Scores reflect all available intelligence, not just current-owner activity.

How quickly do spam scores update after complaints?

Score update speed varies by data source. Real-time carrier feeds update within minutes. Consumer complaint aggregators may take hours to days. Industry databases update daily to weekly. For most spam score providers, significant complaint volumes are reflected within hours. VeriRoute Intel processes new intelligence continuously, with typical score updates occurring within 1-4 hours of new report submissions.

Should I block or challenge numbers with medium spam scores?

For medium scores (typically 40-60), challenge rather than block. This balances fraud prevention with user experience. Send an SMS OTP, request voice verification, or ask for an alternate phone. Only escalate to blocking if the challenge fails or if you combine the medium spam score with other high-risk signals (VoIP + new number + high velocity, for example). The goal is graduated friction, not binary decisions.
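
Graduated friction for the medium range can be sketched as a small decision function. The 40-60 band and the escalation rules come from the answer above; the function name, the parameters, the default of three corroborating signals, and the simplified handling outside the medium band are assumptions.

```python
def graduated_decision(spam_score, challenge_passed=None,
                       high_risk_signals=0, signals_to_block=3):
    """Decide 'allow', 'challenge', or 'block' with graduated friction.

    challenge_passed: None if no challenge has been attempted yet, else bool.
    high_risk_signals: count of other high-risk signals present
        (for example VoIP, new number, high velocity).
    """
    # Outside the medium band this sketch simplifies to allow or block.
    if spam_score < 40:
        return "allow"
    if spam_score > 60:
        return "block"

    # Medium band: escalate only on corroborating signals or a failed challenge.
    if high_risk_signals >= signals_to_block:
        return "block"
    if challenge_passed is None:
        return "challenge"
    return "allow" if challenge_passed else "block"
```

Calling it once before the challenge and again with the challenge result gives the two-step flow: `graduated_decision(50)` asks for a challenge, and `graduated_decision(50, challenge_passed=False)` then escalates to a block.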

Do spam scores differ for mobile vs. VoIP numbers?

Spam scores measure complaint/behavior history regardless of line type, so a clean VoIP can score lower than a mobile with spam history. However, VoIP numbers statistically have higher spam rates because they're easier to obtain and dispose of. Best practice is to use spam score AND line type together - a VoIP with score 30 is riskier than a mobile with score 30, even though the scores are identical.

Key Takeaways