ReDem Quality Days 2025 – Day 1 Recap
Data quality in online surveys is no longer a niche concern—it has become a central topic for the research industry. On June 11, the ReDem Quality Days 2025 opened with two expert-led presentations that addressed this evolving landscape from complementary perspectives.
Dr. Gargi Sawhney (Associate Professor at Auburn University) offered an academic lens on the effectiveness of bot detection tools and how some are no longer sufficient to address the sophistication of today’s fraudulent activity. Following her, industry leader Efrain Ribeiro (former COO and CRO at TNS, Kantar, and Ipsos) shared practical insights into how high-frequency survey takers can influence data integrity.
Talk 1: Are Current Survey Bot Detection Techniques Sufficient in the Age of AI Automation?
Speaker: Dr. Gargi Sawhney (Auburn University)
Imagine a world where a bot can fill out surveys, type thoughtful open-ended responses, simulate scrolling, and even “correct” a previous answer like a human would. That world is now.
Dr. Sawhney’s talk showcased the findings of her latest lab study, which evaluated four different types of bots – a simple script-based bot, Skyvern, Claude Computer Use, and Mariner – and measured how well they could pass 12 common detection techniques.
The Findings:
- AI bots passed 11 of 12 standard fraud checks. Only visual CAPTCHAs (those with images, not the invisible kind) offered some layer of defense; invisible CAPTCHAs offered no protection.
- Conventional quality checks such as attention checks, age consistency questions, and instructed response items failed to flag the bots. Even open-ended responses appeared credible, showing how convincingly AI can mimic human answers.
- All four bot types passed honeypot questions and Google’s reCAPTCHA v3, rendering these techniques completely ineffective, even against simple bots.
- Behavioral analytics still offered a meaningful line of defense. Indicators like typing rhythm, mouse movement, and scrolling behavior remain distinct between bots and human respondents.
- Additionally, survey completion times differed noticeably: bots took significantly longer (for example, due to the API calls made by the AI-powered bots), whereas human participants completed the survey more quickly.
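The completion-time signal above can be operationalized with a simple outlier check. A minimal sketch, assuming durations are collected per respondent; the threshold factor is illustrative, not from the study:

```python
from statistics import median

def flag_by_completion_time(durations, factor=3.0):
    """Flag respondents whose completion time deviates strongly from
    the sample median. Unusually slow sessions can indicate AI-powered
    bots waiting on API calls; unusually fast ones indicate speeders.
    Returns the indices of flagged respondents."""
    med = median(durations)
    flagged = []
    for i, d in enumerate(durations):
        if d > med * factor or d < med / factor:
            flagged.append(i)
    return flagged

# Example: most sessions finish around 300 s; one took 1500 s.
print(flag_by_completion_time([290, 310, 305, 1500, 295]))  # [3]
```

A fixed multiple of the median is only a starting point; in practice the cutoffs would be calibrated per questionnaire length.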
“Behavior can be scripted. It’s only a matter of time before bots mimic human behavior perfectly.” – Dr. Gargi Sawhney
Another critical weakness: detection tools often rely on JavaScript embedded in the browser. But users can disable or alter it with browser extensions—meaning behavioral signals and device fingerprints can be faked or erased entirely.
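One way to notice disabled or tampered JavaScript is to have the survey page’s script echo a server-embedded nonce and a count of captured behavioral events, then verify both server-side. A minimal sketch; the field names (`js_token`, `page_nonce`, `events_count`) are hypothetical, not from the talk:

```python
def js_integrity_flags(response: dict) -> list:
    """Flag responses where client-side JavaScript apparently never ran
    or was altered. The survey page's script is assumed to copy the
    nonce embedded in the page into 'js_token' and to report how many
    behavioral events (keystrokes, mouse moves) it captured."""
    flags = []
    if response.get("js_token") != response.get("page_nonce"):
        flags.append("js_token_missing_or_altered")
    if response.get("events_count", 0) == 0:
        flags.append("no_behavioral_events")
    return flags

# A submission where the script never executed:
print(js_integrity_flags({"page_nonce": "abc123"}))
# ['js_token_missing_or_altered', 'no_behavioral_events']
```

This does not stop a determined attacker who re-implements the script, but it cheaply separates sessions where behavioral signals simply do not exist from those where they can be trusted at least somewhat.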
What Should Researchers Do?
Dr. Gargi Sawhney recommends focusing on:
- Behavioral analytics (as long as it still holds).
- Cross-checking responses in real-time and post-field.
- Investigating whether JavaScript was manipulated.
- Studying bot behavior continuously to stay ahead of the curve.
Talk 2: Data Quality and Frequent Survey Participants
Speaker: Efrain Ribeiro (CASE4Quality)
While Dr. Gargi Sawhney looked at machines pretending to be humans, Ribeiro examined the damage caused by humans acting like machines.
In his talk, he revisited findings from the 2022 CASE Fraud Study—one of the first comprehensive efforts to measure fraud and inattentiveness in online samples—and showed that frequent survey takers quietly compromise data just as much as bots.
Key Stats:
- In total, ~40% of survey responses had to be removed for quality reasons.
- 20–25% showed strong signs of inattention, such as straight-lining or contradicting themselves. This was detected through traditional quality checks.
- 5–10% of survey data was pure fraud (e.g. fake identities or bots), detected by four different firms.
But the real eye-opener came from frequency analysis:
- 24% of participants had entered more than 25 surveys in the past 24 hours.
- A tiny 3% of devices were responsible for 19% of all surveys, and 10% of devices for 42% of surveys.
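Concentration figures like these can be reproduced from raw completion logs. A minimal sketch, assuming each complete is tagged with a device identifier; the log format is illustrative:

```python
from collections import Counter

def device_concentration(device_ids, top_share=0.03):
    """Return the share of all completes contributed by the most
    active `top_share` fraction of devices (e.g. 0.03 = top 3%)."""
    counts = Counter(device_ids)
    ranked = sorted(counts.values(), reverse=True)
    n_top = max(1, int(len(ranked) * top_share))
    return sum(ranked[:n_top]) / len(device_ids)

# Toy log: one device completed 6 surveys, nine devices one each,
# so the top 10% of devices (one device) account for 6/15 = 40%.
log = ["d0"] * 6 + ["d%d" % i for i in range(1, 10)]
print(device_concentration(log, top_share=0.10))  # 0.4
```

Running the same computation over a supplier’s traffic is a quick way to check whether its sample shows the kind of device concentration the CASE study reported.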
These “super takers” may not technically be fraudsters, but their overuse leads to skewed and less thoughtful answers. One study presented by Efrain Ribeiro found that “brand awareness scores were 20 points lower among high-frequency respondents than among casual ones.”
Ribeiro also traced how this came to be: From online panels in 1997 to routers in 2002, and the rise of programmatic sampling, the industry has unintentionally eliminated most of the structural barriers that used to prevent overuse and fraud.
Category controls? Gone. Participation caps? Rare. Transparency into where sample comes from? Often lacking.
“Efficiency has been prioritized over validation. But we can’t just accept that as the cost of doing business.” – Efrain Ribeiro
What Ties These Talks Together?
Both sessions shared one subtle but powerful insight: poor-quality data isn’t just about bots. It’s about unchecked automation—whether that’s programmatic sample delivery, repeated participation by the same human, or a bot running in a browser. We are in a system optimized for volume, not validity.
Recommendations for better data quality from the speakers:
1. Implement layered defenses.
No single fraud detection method is enough anymore. Combine CAPTCHAs, open-end analysis, coherence and logic checks, behavioral analytics, and metadata analysis.
2. Demand transparency.
Ask your suppliers: how many surveys does each respondent complete per day? Which quality assurance measures are in place? How is participant performance monitored over time?
3. Track behavior in real-time.
Consider tools to capture keystroke patterns, tab-switching, or scrolling patterns to detect bots in your surveys.
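A server-side pass over such captured events could apply simple heuristics. A minimal sketch, assuming each session logs keystroke timestamps (in seconds) and a mouse-move count; the thresholds are illustrative assumptions, not from the talks:

```python
from statistics import pstdev

def looks_scripted(key_times, mouse_moves):
    """Heuristic bot flag: a complete absence of mouse movement, or
    near-perfectly uniform keystroke intervals, is unlike human input.
    Returns True when the session looks scripted."""
    if mouse_moves == 0:
        return True
    intervals = [b - a for a, b in zip(key_times, key_times[1:])]
    if len(intervals) >= 2 and pstdev(intervals) < 0.005:
        return True  # typing rhythm too regular for a human
    return False

# A bot typing one key every 100 ms with no mouse activity:
bot_keys = [i * 0.100 for i in range(20)]
print(looks_scripted(bot_keys, mouse_moves=0))  # True
```

As Dr. Sawhney cautioned, behavior can be scripted, so heuristics like these should be treated as one layer among several rather than a standalone defense.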
Catch up on the full webinar session at your convenience. We will send the recording link to your inbox.