Understanding the Robocall Landscape
Robocalls are automated telephone calls that deliver pre-recorded messages. While some robocalls are legitimate - appointment reminders, prescription notifications, emergency alerts - the vast majority are unwanted spam or outright fraud attempts.
The Scale of the Problem
- 50+ billion robocalls per year in the United States alone
- $30+ billion in annual fraud losses attributed to phone scams
- Roughly 150 robocalls per person per year at that volume (50 billion calls across about 330 million people), or about three per week
- 60-70% of robocalls involve spoofed caller ID
Types of Robocalls
| Type | Description | Legality |
|---|---|---|
| Informational | Appointment reminders, flight updates, prescription ready | Legal with consent |
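| Political | Campaign messages, voter outreach | Legal to landlines without consent; calls to mobile phones need prior consent (exempt from Do Not Call rules, not from the TCPA) |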
| Telemarketing | Sales calls, lead generation | Legal with prior express written consent |
| Spam | Unwanted commercial calls without consent | Illegal |
| Fraud/Scam | IRS scams, tech support, prize notifications | Illegal |
Pre-Call Detection Patterns
Before a call even connects, you can estimate robocall probability using phone number intelligence. This pre-call screening can filter out many robocalls while adding very little latency.
Phone Number Intelligence Signals
Line Type Analysis
The type of phone line provides immediate risk indication:
- VoIP (Voice over IP) - Highest robocall risk. VoIP numbers are cheap, easily obtained in bulk, and frequently used for high-volume calling campaigns.
- Toll-free numbers - Elevated risk. Often used by legitimate businesses but also by scammers impersonating companies.
- Mobile - Moderate risk. Harder to obtain in bulk but increasingly used via SIM farms.
- Landline - Lowest risk. Traditional PSTN numbers are more traceable and costly to abuse.
    # Pre-call screening example
    def screen_incoming_call(caller_number):
        """Assess robocall probability before answering."""
        # lookup_phone_intelligence is a placeholder for your phone-data provider's client
        phone_data = lookup_phone_intelligence(caller_number)
        risk_score = 0
        risk_factors = []
        # Line type scoring
        line_type = phone_data.get('line_type')
        if line_type == 'voip':
            risk_score += 30
            risk_factors.append('VoIP number')
        elif line_type == 'toll_free':
            risk_score += 15
            risk_factors.append('Toll-free number')
        # Spam reputation
        spam_score = phone_data.get('spam_score', 0)
        if spam_score > 70:
            risk_score += 40
            risk_factors.append(f'High spam score ({spam_score})')
        elif spam_score > 40:
            risk_score += 20
            risk_factors.append(f'Elevated spam score ({spam_score})')
        # Known robocaller flag
        if phone_data.get('is_robocaller'):
            risk_score += 50
            risk_factors.append('Known robocaller')
        return {
            'risk_score': min(risk_score, 100),
            'risk_factors': risk_factors,
            'recommendation': 'block' if risk_score > 70 else 'caution' if risk_score > 40 else 'allow'
        }
Number Age and Activation Patterns
Robocallers frequently burn through phone numbers, obtaining new ones as old numbers get blocked or reported. Checking the activation date recorded with the number's LRN (Location Routing Number) reveals suspicious patterns:
- Very new numbers (< 7 days) - Elevated risk; legitimate businesses rarely call from brand-new numbers
- Recently ported numbers - May indicate a number acquisition pattern
- Frequent porting history - Numbers that change carriers often are suspicious
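The age and porting checks above can be sketched as a small scoring function. This is a minimal illustration, not a provider API: the function name, the point values, the 30-day "recently ported" window, and the "three ports in a year" rule are all assumptions. Only the 7-day threshold comes from the list above.

```python
from datetime import datetime, timedelta, timezone

def score_number_age(activated_at, port_dates, now=None):
    """Return (score, factors) from activation age and porting history.

    activated_at: timezone-aware datetime of the most recent activation.
    port_dates: list of timezone-aware datetimes, one per carrier change.
    All point values and windows below are illustrative assumptions.
    """
    now = now or datetime.now(timezone.utc)
    score, factors = 0, []

    # Very new numbers (< 7 days), the threshold named in the list above
    if now - activated_at < timedelta(days=7):
        score += 20
        factors.append('activated_under_7_days')

    # Recently ported: assumed window of 30 days
    if any(now - d < timedelta(days=30) for d in port_dates):
        score += 10
        factors.append('recently_ported')

    # Frequent porting: assumed to mean 3 or more ports in the past year
    ports_last_year = [d for d in port_dates if now - d < timedelta(days=365)]
    if len(ports_last_year) >= 3:
        score += 15
        factors.append('frequent_porting')

    return score, factors
```

The returned score is meant to be added to a running risk score like the one in the pre-call screening example, so the weights should be tuned against your own traffic.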
Carrier and OCN Analysis
Certain carriers are disproportionately associated with robocall traffic:
- High-volume VoIP providers - Carriers offering bulk cheap numbers with minimal vetting
- Gateway carriers - Carriers known to transit international robocall traffic
- Problematic OCNs - Specific Operating Company Numbers with poor reputations
Spam Database Signals
Community-reported spam databases provide powerful robocall detection:
- Consumer complaint aggregation - FTC complaints, carrier reports, app submissions
- Robocall-specific flags - Distinction between general spam and automated robocalls
- Scam type classification - IRS scam, tech support, warranty, etc.
Call Detail Record Analysis
Analyzing Call Detail Records (CDRs) reveals behavioral patterns that distinguish robocallers from legitimate callers.
Volume and Velocity Patterns
| Pattern | Legitimate Caller | Robocaller |
|---|---|---|
| Calls per hour | 1-20 | 100-1000+ |
| Unique destinations | Few, repeated contacts | Many unique numbers |
| Geographic spread | Concentrated | Widely dispersed |
| Call timing | Business hours | All hours, systematic |
| Call duration distribution | Varied | Bimodal (very short or fixed length) |
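Two of the rows in this table, calls per hour and unique destinations, are straightforward to compute from raw CDRs. The sketch below assumes each record is a dict with hypothetical `start_ts` (Unix seconds) and `destination` fields. The 100 calls/hour cutoff is the lower bound from the table, while the 0.9 unique-destination ratio is an assumed threshold.

```python
from collections import Counter

def volume_velocity_metrics(call_records):
    """Summarize one caller's volume and destination spread from CDR dicts."""
    if not call_records:
        return {'peak_calls_per_hour': 0,
                'unique_destination_ratio': 0.0,
                'suspicious': False}

    # Bucket calls by clock hour and take the busiest hour
    per_hour = Counter(int(r['start_ts']) // 3600 for r in call_records)
    peak = max(per_hour.values())

    # Distinct destinations divided by total calls:
    # a value near 1.0 means almost no repeat contacts
    distinct = len({r['destination'] for r in call_records})
    unique_ratio = distinct / len(call_records)

    return {
        'peak_calls_per_hour': peak,
        'unique_destination_ratio': unique_ratio,
        # Assumed rule: both signals must fire together
        'suspicious': peak >= 100 and unique_ratio > 0.9,
    }
```

Requiring both signals together is a deliberate choice in this sketch: a busy support line can exceed 100 calls per hour, but it tends to call the same customers repeatedly.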
Call Duration Patterns
Robocalls exhibit distinctive duration distributions:
- High abandonment rate - Many calls under 3 seconds (voicemail detection, no answer)
- Fixed message duration - Answered calls cluster around message playback length
- Low engagement rate - Few calls extended beyond initial message
    import statistics

    def analyze_duration_pattern(call_records):
        """Detect robocall patterns from call duration distribution."""
        durations = [call['duration_seconds'] for call in call_records]
        if not durations:
            # Nothing to analyze; also avoids dividing by zero below
            return {'is_robocall_pattern': False, 'indicators': [], 'metrics': {}}
        # Calculate metrics
        very_short = sum(1 for d in durations if d < 3) / len(durations)
        avg_duration = sum(durations) / len(durations)
        std_deviation = statistics.pstdev(durations)  # population standard deviation
        # Robocall indicators
        indicators = []
        # High abandonment rate
        if very_short > 0.30:
            indicators.append({
                'pattern': 'high_abandonment',
                'value': very_short,
                'threshold': 0.30
            })
        # Low duration variance (same message repeated)
        if std_deviation < 5 and avg_duration > 10:
            indicators.append({
                'pattern': 'fixed_duration',
                'value': std_deviation,
                'threshold': 5
            })
        return {
            'is_robocall_pattern': len(indicators) > 0,
            'indicators': indicators,
            'metrics': {
                'abandonment_rate': very_short,
                'avg_duration': avg_duration,
                'duration_std': std_deviation
            }
        }
Temporal Patterns
Robocallers often exhibit machine-like timing precision:
- Consistent inter-call intervals - Calls spaced exactly N seconds apart
- Sequential dialing patterns - Incrementing through number blocks
- Burst patterns - High volume followed by quiet periods
- Time zone ignorance - Calling patterns that ignore local time zones
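Two of these temporal signals reduce to simple statistics. The sketch below measures interval consistency as the coefficient of variation of the gaps between calls (0 means perfectly even spacing) and measures sequential dialing as the share of consecutive destinations that differ by exactly one. The function names are hypothetical, and the inputs are assumed to be Unix timestamps and digit-only destination strings in dial order.

```python
import statistics

def interval_regularity(timestamps):
    """Coefficient of variation of gaps between consecutive calls.

    Returns None when there are fewer than 3 calls (fewer than 2 gaps).
    """
    ts = sorted(timestamps)
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    if len(gaps) < 2:
        return None
    mean_gap = statistics.mean(gaps)
    if mean_gap == 0:
        return 0.0  # all calls share one timestamp
    return statistics.pstdev(gaps) / mean_gap

def sequential_dialing_ratio(destinations):
    """Fraction of consecutive dialed numbers that increase by exactly 1."""
    nums = [int(d) for d in destinations]
    if len(nums) < 2:
        return 0.0
    steps = sum(1 for a, b in zip(nums, nums[1:]) if b - a == 1)
    return steps / (len(nums) - 1)
```

A human caller usually produces a coefficient of variation well above zero, so a value close to zero over dozens of calls is a reasonable trigger. The exact cutoff is left to tuning.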
Detecting Neighbor Spoofing
Neighbor spoofing is a technique where robocallers spoof caller ID to match the recipient's area code and exchange (the NPA-NXX, the first six digits of a ten-digit North American number), making the call appear local. This significantly increases answer rates.
Detection Techniques
- CNAM inconsistency - Spoofed numbers often have mismatched or missing CNAM data
- Carrier mismatch - The actual originating carrier doesn't match the number's assigned carrier
- Impossible geography - Call originates from a location inconsistent with the number
- STIR/SHAKEN attestation - Low or failed attestation indicates potential spoofing
    def detect_neighbor_spoofing(call_data, recipient_number):
        """Detect potential neighbor spoofing attacks."""
        # Assumes both numbers are 10-digit NANP strings with no country code,
        # so the first six digits are the area code and exchange (NPA-NXX)
        caller = call_data['caller_number']
        recipient = recipient_number
        spoofing_signals = []
        # Check if caller matches recipient's NPA-NXX
        if caller[:6] == recipient[:6]:
            # Same area code and exchange - potential neighbor spoof
            # Verify with phone intelligence (placeholder lookup function)
            caller_data = lookup_phone_intelligence(caller)
            # Check CNAM availability (tolerates a missing or null 'cnam' field)
            if not (caller_data.get('cnam') or {}).get('name'):
                spoofing_signals.append('no_cnam')
            # Check line type: a VoIP line behind a local-looking number
            if caller_data.get('line_type') == 'voip':
                spoofing_signals.append('voip_local_appearance')
            # Check STIR/SHAKEN attestation
            attestation = call_data.get('stir_shaken_attestation')
            if attestation in ['C', None]:
                spoofing_signals.append('low_attestation')
        return {
            'is_neighbor_spoof_risk': len(spoofing_signals) >= 2,
            'signals': spoofing_signals
        }
Audio-Based Detection
When pre-call screening isn't sufficient, audio analysis during the call provides much stronger evidence that a call is automated.
Audio Fingerprinting
Robocalls use the same recorded message across thousands of calls. Audio fingerprinting can match these:
- Acoustic fingerprints - Hash-like signatures of audio content
- Voice pattern matching - Identifying the same synthetic or recorded voice
- Background audio signatures - Consistent background noise patterns
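To make the idea of a hash-like signature concrete, here is a deliberately simplified fingerprint. Production systems typically fingerprint spectral features, whereas this toy version uses only the loudness envelope: it emits one bit per frame, set when the frame's energy rose compared with the previous frame. The function names and the 400-sample frame size are assumptions. The input is assumed to be a list of PCM sample values.

```python
def envelope_fingerprint(samples, frame_size=400):
    """Return a tuple of bits: 1 where frame energy rose versus the previous frame."""
    energies = [
        sum(s * s for s in samples[i:i + frame_size])
        for i in range(0, len(samples) - frame_size + 1, frame_size)
    ]
    return tuple(1 if b > a else 0 for a, b in zip(energies, energies[1:]))

def fingerprint_similarity(fp_a, fp_b):
    """Fraction of matching bits over the shared prefix (1.0 means identical)."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    return sum(1 for a, b in zip(fp_a, fp_b) if a == b) / n
```

Because only the direction of energy change is stored, the same recording played at a different volume yields the same bits. The sketch does not handle time offsets between recordings, which a real matcher would need to search over.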
Speech Pattern Analysis
- Text-to-speech detection - Synthetic voices have detectable artifacts
- Recording quality - Pre-recorded messages often have different compression artifacts than live audio
- Response latency - Robocalls with IVR have specific response timing patterns
Silence and Pause Detection
The initial moments of a call are telling:
- Predictive dialer silence - 1-3 second pause while connecting to agent/recording
- AMD artifacts - Answering machine detection creates specific pause patterns
- Message start timing - Robocalls often start message at consistent intervals
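The predictive dialer pause can be measured directly from the first seconds of audio. The following sketch assumes 16-bit PCM samples at 8 kHz and treats any 20 ms frame whose RMS level stays under an assumed threshold of 500 as silence. The function names and the threshold are illustrative and would need calibration against real line noise.

```python
import math

def leading_silence_seconds(samples, sample_rate=8000, frame_ms=20, rms_threshold=500):
    """Seconds of audio before the first frame whose RMS exceeds the threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(s * s for s in frame) / frame_len)
        if rms > rms_threshold:
            return start / sample_rate
    # No frame crossed the threshold: the whole clip counts as silence
    return len(samples) / sample_rate

def looks_like_dialer_pause(samples, sample_rate=8000):
    """True when the initial silence falls in the 1-3 second range described above."""
    return 1.0 <= leading_silence_seconds(samples, sample_rate) <= 3.0
```

In practice the measurement should start when the called party finishes their greeting, not at call setup, since the pause of interest follows the answer.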
STIR/SHAKEN Integration
The STIR/SHAKEN framework provides cryptographic caller ID authentication, helping identify spoofed robocalls.
Attestation Levels
| Level | Meaning | Robocall Risk |
|---|---|---|
| A (Full) | Carrier verified both number and caller identity | Low |
| B (Partial) | Carrier verified call origin but not caller's right to use number | Medium |
| C (Gateway) | Carrier is gateway only; cannot verify | Higher |
| None | No attestation provided | Highest |
    def incorporate_stir_shaken(call_data, existing_risk_score):
        """Adjust risk score based on STIR/SHAKEN attestation."""
        attestation = call_data.get('stir_shaken', {}).get('attestation')
        verification = call_data.get('stir_shaken', {}).get('verified', False)
        adjustments = {
            'A': -20 if verification else 0,  # Reduce risk if fully attested and verified
            'B': -5 if verification else 10,  # Slight reduction or increase
            'C': 15,                          # Gateway attestation increases risk
            None: 25                          # No attestation significantly increases risk
        }
        adjustment = adjustments.get(attestation, 20)
        return max(0, min(100, existing_risk_score + adjustment))
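The attestation level itself travels in a PASSporT, a JSON Web Token carried in the SIP `Identity` header, where the SHAKEN extension adds the `attest` and `origid` claims. The sketch below only reads those claims from the token portion of the header. It does not verify the signature, so its output must not be treated as `verified`. The function name and the shape of the returned dict are assumptions.

```python
import base64
import json

def read_passport_claims(identity_token):
    """Decode a PASSporT payload WITHOUT verifying its signature.

    identity_token is the 'header.payload.signature' part of the Identity
    header value, with the ;info= ;alg= ;ppt= parameters already removed.
    Returns None if the token is malformed.
    """
    try:
        _header_b64, payload_b64, _signature_b64 = identity_token.split('.')
        # JWTs strip base64 padding, so restore it before decoding
        padded = payload_b64 + '=' * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return None
    if not isinstance(payload, dict):
        return None
    return {
        'attestation': payload.get('attest'),           # 'A', 'B' or 'C'
        'orig_tn': (payload.get('orig') or {}).get('tn'),
        'dest_tn': (payload.get('dest') or {}).get('tn'),
        'origid': payload.get('origid'),
        'iat': payload.get('iat'),
    }
```

A real deployment would fetch the signing certificate from the URL in the header's `info` parameter and validate the ES256 signature before trusting any claim. That step is omitted here.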
Implementation Architecture
A robust robocall detection system layers multiple detection methods:
Multi-Layer Detection Pipeline
    # PhoneIntelligenceAPI, AudioAnalyzer, CDRAnalyzer and the _score_* and
    # _weighted_average helpers are placeholders assumed to be defined elsewhere.
    class RobocallDetector:
        """Multi-layer robocall detection system."""

        def __init__(self, phone_api_key):
            self.phone_api = PhoneIntelligenceAPI(phone_api_key)
            self.audio_analyzer = AudioAnalyzer()
            self.cdr_analyzer = CDRAnalyzer()

        def analyze_incoming_call(self, call_data):
            """Complete robocall analysis pipeline."""
            results = {
                'caller': call_data['caller_number'],
                'timestamp': call_data['timestamp'],
                'layers': {}
            }
            # Layer 1: Pre-call phone intelligence
            phone_data = self.phone_api.lookup(call_data['caller_number'])
            layer1_score = self._score_phone_intelligence(phone_data)
            results['layers']['phone_intelligence'] = {
                'score': layer1_score,
                'data': phone_data
            }
            # Layer 2: STIR/SHAKEN attestation
            layer2_score = self._score_attestation(call_data)
            results['layers']['stir_shaken'] = {
                'score': layer2_score,
                'attestation': call_data.get('stir_shaken', {}).get('attestation')
            }
            # Layer 3: Historical CDR patterns (if available)
            if self.cdr_analyzer.has_history(call_data['caller_number']):
                cdr_analysis = self.cdr_analyzer.analyze(call_data['caller_number'])
                layer3_score = cdr_analysis['robocall_probability'] * 100
                results['layers']['cdr_patterns'] = {
                    'score': layer3_score,
                    'patterns': cdr_analysis['patterns']
                }
            # Calculate composite score (the CDR layer may be absent)
            weights = {'phone_intelligence': 0.4, 'stir_shaken': 0.3, 'cdr_patterns': 0.3}
            composite = self._weighted_average(results['layers'], weights)
            results['composite_score'] = composite
            results['is_robocall'] = composite > 60
            results['action'] = self._determine_action(composite)
            return results

        def _determine_action(self, score):
            if score > 85:
                return 'block'
            elif score > 60:
                return 'captcha_challenge'
            elif score > 40:
                return 'flag_and_monitor'
            else:
                return 'allow'
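The pipeline calls a `_weighted_average` helper that is not shown. One plausible implementation, written here as a standalone function, renormalizes the weights over whichever layers actually produced a score, so that a caller with no CDR history is not scored as if the missing layer were zero. This is an assumption about the intended behavior, not the original helper.

```python
def weighted_average(layers, weights):
    """Weighted mean of layer scores, renormalized over the layers present.

    layers: {'layer_name': {'score': float, ...}, ...}
    weights: {'layer_name': float, ...}
    """
    # Keep only weights whose layer actually ran
    present = {name: w for name, w in weights.items() if name in layers}
    total_weight = sum(present.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(layers[name]['score'] * w for name, w in present.items())
    return weighted_sum / total_weight
```

With the weights from the pipeline, a call scoring 80 on phone intelligence and 50 on attestation, with no CDR history, comes out at (0.4 * 80 + 0.3 * 50) / 0.7, roughly 67, where dividing by the full weight of 1.0 would give 47.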
Response Strategies
Once a robocall is detected, several response strategies are available:
Pre-Answer Interventions
- Silent blocking - Reject call without notifying caller
- Send to voicemail - Don't ring, route directly to voicemail
- Warning announcement - Play warning before connecting
Challenge-Response
- Audio CAPTCHA - Require caller to press a key or speak a word
- Callback verification - Hang up and call back to verify
- Reputation query - Ask caller to identify themselves
Post-Detection Actions
- Report to databases - Contribute to community blocklists
- Notify carrier - Report to originating carrier
- Update internal blocklist - Block future calls from this source
- Regulatory filing - Report to FTC/FCC for egregious cases
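Because robocallers discard numbers quickly and those numbers may later belong to someone else, an internal blocklist arguably should not be permanent. The sketch below is one way to implement the "update internal blocklist" action with expiring entries. The class name and the 30-day default lifetime are assumptions, and the store is in memory only.

```python
import time

class ExpiringBlocklist:
    """In-memory blocklist whose entries lapse after ttl_seconds."""

    def __init__(self, ttl_seconds=30 * 24 * 3600):
        self.ttl_seconds = ttl_seconds
        self._expires_at = {}  # number -> Unix time when the block lapses

    def block(self, number, now=None):
        """Block a number, or refresh its expiry if already blocked."""
        now = time.time() if now is None else now
        self._expires_at[number] = now + self.ttl_seconds

    def is_blocked(self, number, now=None):
        now = time.time() if now is None else now
        expiry = self._expires_at.get(number)
        if expiry is None:
            return False
        if now >= expiry:
            del self._expires_at[number]  # lazily drop the stale entry
            return False
        return True
```

Calling `block()` again on a repeat offender refreshes the expiry, so persistent sources stay blocked while one-off reports age out.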
Best Practices
- Layer detection methods - No single technique catches all robocalls
- Tune for your use case - B2C companies tolerate false positives differently than call centers do, so set thresholds accordingly
- Monitor false positives - Legitimate automated calls exist; don't block everything
- Update continuously - Robocall techniques evolve rapidly
- Leverage STIR/SHAKEN - Cryptographic attestation provides strong signals
- Share intelligence - Participate in industry threat sharing
Frequently Asked Questions
What percentage of robocalls use spoofed caller ID?
Studies indicate that 60-70% of robocalls use some form of caller ID spoofing. Neighbor spoofing (making calls appear local) is particularly common because it increases answer rates by 3-4x. STIR/SHAKEN implementation is helping reduce this, but spoofing remains prevalent.
How effective is STIR/SHAKEN at stopping robocalls?
STIR/SHAKEN is highly effective at identifying spoofed caller ID, but it's not a complete solution. Robocallers can still obtain legitimate numbers and achieve "A" attestation. STIR/SHAKEN works best as one layer in a multi-factor detection system that also considers spam reputation, call patterns, and phone intelligence.
Why do robocallers use VoIP numbers?
VoIP numbers are preferred by robocallers because they're cheap to obtain in bulk, easy to configure for high-volume calling, can be acquired without strict identity verification, and can be quickly discarded when blocked. Some VoIP providers sell thousands of numbers for cents each.
How can I detect if a call is from a predictive dialer?
Predictive dialers create a distinctive 1-3 second silence after you answer as they connect you to an available agent. This pause, combined with high call volumes from the source number and consistent timing between calls, strongly indicates predictive dialer usage. Audio analysis detecting this initial silence is one reliable detection method.