Threat index · Q2 2026 · Updated April 2026
AI Cheating Threat Index: Q2 2026
Quarterly tracking of commercial AI overlays, open-source forks, on-device LLMs, remote-access tools, and proxy exam services, with threat scores, detection verdicts, and defense effectiveness ratings. Updated at the start of each quarter.
Q2 2026 (April – June) · Next revision: July 2026
Executive summary
Q2 2026 in three paragraphs.
Fourteen tool categories are tracked in this index. Each is scored on threat severity (1–10), detection difficulty (TRIVIAL / EASY / MEDIUM / HARD / IMPOSSIBLE), and network dependency (whether internet access is required at exam time). The composite Threat Score, a weighted average across categories, reached 87/100 in Q2 2026, up from 81/100 in Q1.
The single most important development: open-source forks with fully customizable process names now represent the dominant threat vector for technically sophisticated candidates. Commercial tools remain the primary threat from unsophisticated users. The adversary profile has bifurcated: the casual cheater uses Cluely; the determined cheater compiles their own binary.
Defense effectiveness has not kept pace. Six of the seven monitored detection approaches are bypassed in whole or in part. Network-layer enforcement, which closes the outbound path to AI APIs before the assessment begins, remains the only approach in this index with no documented bypass within its stated scope; fully offline local models and separate physical hardware fall outside that scope.
Composite threat score
87/100. Critical. Up 6 points from Q1.
Weighted average across seven threat categories. Higher score = harder to defend against with current detection-only architectures. A score of 100 would represent a landscape where no deployed defense has any effectiveness.
- Commercial AI overlays
- Open-source forks (compiled)
- On-device LLMs
- Remote-access tools (proxy use)
- Proxy & exam fraud services
- Hardware (earpieces, smart glasses)
- Browser automation / scripting
- Composite: all categories
Scores reflect detection difficulty for current deployed assessment defenses, weighted by observed prevalence and ease of use for the median attacker. Scores are not a measure of absolute harm; a 100 would represent a theoretically undefeatable threat ecosystem.
§1/Commercial AI overlays
Six commercial tools. Each active, each priced for mass adoption.
Commercial invisible overlay tools are subscription software products with pricing tiers, customer support, and changelog updates. Their viability depends on continued access to LLM APIs, which is also their single point of failure under network-layer enforcement.
| Tool | Price | Threat score | Detection difficulty | Key capability this quarter | Network dependency | Q2 status |
|---|---|---|---|---|---|---|
| Cluely | $20/month | 8/10 | HARD | Invisible mode works across Zoom, Teams, Meet; audio pipeline added Nov 2025; open-source clones proliferating | Required: calls OpenAI API | ACTIVE |
| Interview Coder | $100 lifetime | 8/10 | HARD | 100k+ users; screenshot → code solution pipeline; invisible in Activity Monitor and Dock on macOS | Required: LLM backend call per query | ACTIVE |
| Parakeet AI | $20–40/month | 9/10 | HARD | Real-time audio transcription + LLM; 50+ languages; covers coding, system design, and behavioral rounds | Required: audio transcription + LLM calls | ACTIVE |
| Ultracode AI | $899 lifetime | 7/10 | MEDIUM | Premium tier; claims invisibility even on full screen share; visible in Windows taskbar (known limitation) | Required: cloud LLM backend | ACTIVE |
| LockedIn AI | $55–70/month | 7/10 | MEDIUM | Cloud-based assistant; every session transcript passes through vendor servers (privacy risk for user) | Required: fully cloud-dependent | ACTIVE |
| Final Round AI | $149/month | 7/10 | MEDIUM | Audio earpiece integration; targets behavioral and structured interview rounds; taskbar icon visible | Required: transcription and LLM | ACTIVE |
§2/Open-source forks
20+ repositories. Arbitrary process names. The signature-detection ceiling.
The open-source fork ecosystem is the existential challenge for detection-first proctoring. Each fork can be compiled with a completely custom binary name, icon, and process signature, rendering all signature-based detection permanently obsolete for any candidate with basic developer skills (which describes most of the candidates being assessed).
| Fork | Platform | Threat score | Distinguishing capability | Process-name evasion | Network dependency |
|---|---|---|---|---|---|
| OpenCluely | GitHub (MIT) | 9/10 | Invisible overlay for DSA/coding; multi-language; Gemini integration | Full: compile with any binary name | Required: Gemini/OpenAI API |
| Pluely | GitHub (Tauri/Rust) | 9/10 | 10 MB; 50% less RAM than Cluely; invisible in Zoom/Teams/Meet; GPT-4/Claude/Gemini/Grok multi-model | Full: Rust source, recompile in <1 hr | Required: multi-model API support |
| Natively | GitHub (MIT) | 10/10 | Local RAG; BYOK; zero server storage; explicitly disguises as Terminal, Activity Monitor, or System Settings | Full: documented feature, named as system utilities by design | Optional: BYOK supports local inference |
| MindWhisperAI | GitHub (MIT) | 9/10 | GPT-4o/Claude/Gemini/Grok support; stealth mode; handles coding, system design, behavioral | Full: MIT license, no telemetry, fully forkable | Required: multi-API |
| ShadeCoder | GitHub | 8/10 | Whisper STT integration; screen-capture → code pipeline; low latency vs. Cluely | Full: open source | Required: transcription + LLM |
| LeetcodeWizard | GitHub | 8/10 | LeetCode-specific; includes humanizer pipeline targeting perplexity normalization to defeat AI detectors | Full: open source | Required: LLM + humanizer API |
| DIY (Tesseract + Whisper + any LLM) | Any developer | 10/10 | No GitHub signature exists; OCR + STT + LLM in a few hundred lines of Python; fully custom | N/A: no signature | Required: LLM API |
| DIY with local model | Any developer | 10/10 | No external network trace; Ollama-backed; zero internet required at exam time | N/A: no signature, no network | None; fully offline |
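
The signature-detection ceiling described in this section can be illustrated with a minimal Python sketch. The signature list and the process names are hypothetical placeholders chosen for illustration; they are not rules taken from any proctoring product, and a real scanner would likely inspect more than the process name.

```python
# Hypothetical signature list; real scanners use larger, product-specific rule sets.
KNOWN_SIGNATURES = {"cluely", "opencluely", "pluely"}


def flag_processes(process_names):
    """Return the process names that contain a known signature substring."""
    flagged = []
    for name in process_names:
        lowered = name.lower()
        if any(sig in lowered for sig in KNOWN_SIGNATURES):
            flagged.append(name)
    return flagged


# A stock build is caught by name. The same code recompiled under an
# arbitrary name (the made-up "sysmond" here) passes the scan untouched.
stock_run = ["Finder", "OpenCluely", "zoom.us"]
renamed_run = ["Finder", "sysmond", "zoom.us"]

print(flag_processes(stock_run))
print(flag_processes(renamed_run))
```

The scan has nothing to match once the binary name is arbitrary, which is why the table above treats process-name evasion as "Full" for every fork.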
§3/On-device LLMs
No network. No trace. The fastest-growing threat vector this quarter.
Local LLM inference is the fastest-growing threat development of Q2 2026. A candidate running Ollama, LM Studio, or a self-compiled inference server generates zero external network traffic. Their device calls no AI API and issues no DNS queries to known AI providers, so blocklists for those providers have nothing to match. Network-layer enforcement at the gateway sees nothing.
The constraint is hardware: running a capable model (7B+ parameters) requires a GPU with sufficient VRAM or a sufficiently fast CPU with adequate RAM. As consumer hardware crosses these thresholds (Apple Silicon, NVIDIA RTX 4060+, Snapdragon X Elite), the barrier to local inference approaches zero.
| Runtime / model | Threat score | Minimum hardware | Internet required? | Q2 prevalence | Detection surface |
|---|---|---|---|---|---|
| Ollama (any 7B model) | 8/10 | 8 GB RAM + modern CPU (M1/M2 Mac, Ryzen 7) | No; fully local after download | HIGH; mainstream on developer machines | Device activity + hardware resource signals |
| LM Studio (any model) | 8/10 | 8 GB RAM; GUI installer for non-developers | No; fully local after download | HIGH; lowers technical barrier further | Device activity + hardware resource signals |
| llama.cpp (CLI) | 9/10 | 4 GB RAM for quantized models | No | MEDIUM; developer-only | Device activity |
| GPT4All | 7/10 | 8 GB RAM; very low-skill GUI | No | MEDIUM; consumer-friendly packaging | Device activity + hardware resource signals |
| Offline Gemma 2B (phone) | 9/10 | Modern Android or iOS device | No | EMERGING; ML Kit on-device API | Second device; outside candidate machine |
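
The minimum-hardware column follows from simple arithmetic on model size. As a rough sketch in Python, weight memory is parameter count times bits per weight, divided by eight. The function name is illustrative, and the estimate covers weights only: context cache and runtime overhead come on top, so actual requirements are higher than these figures.

```python
def weights_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Weights-only estimates; real usage adds context cache and runtime overhead.
for label, params, bits in [
    ("7B at 16-bit", 7, 16),
    ("7B at 4-bit", 7, 4),
    ("2B at 4-bit", 2, 4),
]:
    print(f"{label}: ~{weights_memory_gb(params, bits):.1f} GB")
```

Under these assumptions a 4-bit 7B model needs about 3.5 GB for weights, which is broadly in line with the 4–8 GB RAM figures in the table.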
§4/Remote-access tools
A human operator (or AI pipeline) controlling the enrolled device remotely.
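Remote-access tools used for exam fraud operate in two modes: human proxy (a skilled operator sitting the exam from another location) and AI pipeline (a local automation script feeding questions to an LLM and injecting answers). Human-proxy mode requires the enrolled device to maintain a remote-control connection, and a cloud-backed AI pipeline requires outbound API calls; both create a network-observable signal. A pipeline backed by a local model creates none and has to be caught through device activity and hardware signals.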
| Tool | Threat score | Fraud use case | Network dependency | Detection surface | Q2 status |
|---|---|---|---|---|---|
| AnyDesk | 9/10 | Human proxy sits the exam; enrolled device shows blank screen or fake video feed | Required: relay server connection | Relay IP blocked at network layer; behavioral anomalies from remote operator | ACTIVE; widely used in proxy exam rings |
| TeamViewer | 8/10 | Same as AnyDesk; older, more detectable signatures | Required: relay server | Known relay IP ranges; process detection | ACTIVE; declining vs. AnyDesk |
| Chrome Remote Desktop | 7/10 | Requires Google account; less operational security for rings | Required: Google relay | DNS query to Google relay domains (blocked under exam policy) | MONITORING |
| Custom SSH tunnel + VNC | 10/10 | No known commercial signature; operator uses SSH for control, VNC for screen | Required, but tunneled through SSH to a controlled host | Behavioral; operator typically less fluent than genuine candidate | EMERGING; seen in APAC-targeted rings |
| AI pipeline over localhost | 10/10 | Local script: OCR screen → HTTP to local LLM → inject answer | None if local model; minimal if cloud | Device activity (localhost HTTP) + hardware signals | EMERGING |
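
The "localhost HTTP" detection surface in the last row can be sketched as a loopback port probe. This is an illustrative Python example, not a description of any vendor's agent. The two ports are the defaults for Ollama and LM Studio; both are user-configurable, so an open port is only a weak signal and a closed port proves nothing.

```python
import socket

# Default local ports for two common inference servers. Both can be changed
# by the user, so treat a hit as one weak signal among several.
DEFAULT_PORTS = {"Ollama": 11434, "LM Studio": 1234}


def is_listening(port, host="127.0.0.1", timeout=0.25):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def probe_local_inference():
    """Map each runtime name to whether its default port is open locally."""
    return {name: is_listening(port) for name, port in DEFAULT_PORTS.items()}


print(probe_local_inference())
```

A candidate who moves the server to another port or embeds the model in-process defeats this probe, which is why the table pairs device activity with hardware resource signals.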
§5/Proxy & exam fraud services
$200–$500 per exam, pay after passing: a mature fraud-as-a-service ecosystem.
Proxy exam fraud services operate as structured marketplaces: a client posts an upcoming exam, operators bid, and payment is released on successful completion. Pay-after-pass pricing removes financial risk for the buyer and creates strong performance incentives for operators. The ecosystem is most active in cybersecurity certifications, technical hiring assessments, and academic examinations.
| Service type | Threat score | Typical price | Dominant exam category | Detection challenge |
|---|---|---|---|---|
| Certification proxy rings (cybersecurity) | 9/10 | $200–$500 pay-after-pass | IT and cybersecurity certification programs | Remote desktop injection inside exam software; operator is often certified and has sat same exam before |
| Hiring assessment proxy (technical) | 9/10 | $100–$300 per session | LeetCode-style, HackerRank, CodeSignal | AI overlay or skilled human operator; cross-session intelligence needed to detect repeat operators |
| Academic exam proxy | 8/10 | $50–$200 | University finals and graduate admissions assessments | Remote desktop through screen-share software; camera feed sometimes replaced with pre-recorded footage |
| Deepfake identity fraud | 8/10 | Bundled with full fake application services | Video interview rounds, identity verification checkpoints | Live deepfake video generation; FBI-documented against US tech employers |
| Telegram / Discord fraud channels | 7/10 | Variable; answer leaks, shared accounts | All categories | Content sharing; hard to attribute; primary signal is answer similarity across candidates |
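
The answer-similarity signal named in the last row can be sketched with the Python standard library. The candidate IDs, the sample answers, and the 0.9 threshold are all illustrative assumptions; a production system would need a calibrated threshold and a way to discount answers that are legitimately near-identical (for example, one-line canonical solutions).

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_pairs(answers, threshold=0.9):
    """Return (id_a, id_b, ratio) for answer pairs at or above the threshold.

    `answers` maps a candidate ID to submitted text. The default threshold
    is an illustrative assumption, not a calibrated value.
    """
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(answers.items()), 2):
        ratio = SequenceMatcher(None, text_a, text_b).ratio()
        if ratio >= threshold:
            pairs.append((id_a, id_b, round(ratio, 2)))
    return pairs


# Made-up submissions: two identical answers and one unrelated answer.
submissions = {
    "cand-1": "heapq.nlargest(k, nums)[-1]",
    "cand-2": "heapq.nlargest(k, nums)[-1]",
    "cand-3": "sort descending, then index k - 1",
}

print(similar_pairs(submissions))
```

Pairwise comparison grows quadratically with the number of candidates, so at scale this would likely be replaced by hashing or clustering; the sketch only shows the shape of the signal.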
§6/Hardware attack surface
Earpieces, smart glasses: the attack surface that software cannot reach.
Hardware-based cheating exists entirely outside the enrolled device and its network. No software agent, however deep, can detect a Bluetooth earpiece paired to a phone running ChatGPT in a candidate's pocket, or smart glasses with a camera and audio pipeline. This is the honest boundary of what software-layer assessment security can achieve.
| Attack vector | Threat score | How it works | Software detection? | Q2 status |
|---|---|---|---|---|
| Bluetooth earpiece + phone AI | 8/10 | Phone runs ChatGPT; candidate subvocalizes question; earpiece delivers answer | No; entirely separate hardware | ACTIVE; $30–50 earpiece, free AI |
| Smart glasses with camera | 7/10 | Camera captures screen; phone processes via LLM; earpiece delivers answer | No | EMERGING; Ray-Ban Meta glasses and equivalents |
| Second phone below webcam | 9/10 | Phone runs full AI chat app; candidate types question, glances briefly at response | No; camera proctoring can detect if calibrated for downward gaze | ACTIVE; the most common hardware vector |
| Hardware AI wristband / ring | 6/10 | Experimental; vibration-based Morse code delivery of answers | No | EXPERIMENTAL |
| Second laptop behind the primary | 8/10 | Positioned behind primary machine; candidate rotates to query AI, rotates back | No; camera may detect posture shift | ACTIVE |
§7/Defense effectiveness
Seven defense approaches. Six bypassed in whole or in part. One without a known bypass.
Each defense is rated against the full threat landscape documented in Sections 1–6. "Bypassed" means a working, documented evasion technique exists and is accessible to any motivated candidate.
| Defense approach | Verdict | Catches | Misses | Why bypassed |
|---|---|---|---|---|
| Process-name signature scanning | BYPASSED | Unsophisticated users of commercial tools without renaming | Any open-source fork compiled with custom binary name; local models; hardware | Open-source forks can be recompiled with any process name in under 1 hour |
| Browser lockdown / secure browser | BYPASSED | Tab switching; copy-paste from other browser windows; basic tab-based cheating | Any OS-level process; overlay tools; local models; remote access; hardware | Overlay tools are native OS applications; browser restrictions have no authority below the browser |
| Keystroke dynamics analysis | BYPASSED | Automated script injection of pre-written answers (non-human timing patterns) | Manual transcription of AI-generated output; human proxy input | arXiv 2601.17280 (2026): manually transcribing AI output produces patterns statistically indistinguishable from genuine composition |
| Gaze / eye tracking | BYPASSED | Obvious downward eye movement toward a secondary device; absence from frame | Overlay positioned below webcam; audio-only pipelines (earpiece); mental recall | Overlay can be positioned so that reading gaze appears as forward-facing camera contact |
| LLM output similarity / perplexity scoring | PARTIALLY BYPASSED | Unmodified AI-generated answers pasted directly; obvious LLM boilerplate | Humanizer pipelines; rephrased AI output; answers adapted for specific context | Humanizer tools (LeetcodeWizard and equivalents) explicitly target perplexity normalization |
| Adaptive follow-up questioning | PARTIALLY EFFECTIVE | Candidates who cannot elaborate on AI-generated answers; basic AI users | Candidates who studied their answer before follow-up; audio pipelines that continue during verbal questions | Best current human-judgment method; incomplete coverage for prepared candidates |
| Network-layer enforcement (Aiseptor) | NO KNOWN BYPASS | All commercial overlays (require internet); open-source forks with network dependency; remote-access tools; encrypted resolver bypass attempts | Fully offline local LLMs after model download; hardware attack surface (second devices) | Per-session network enclave with approved-domain enforcement and OS-level signal detection. Offline local models and separate physical hardware are outside the enrolled device boundary. |
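
The approved-domain enforcement named in the last row rests on a default-deny check. The Python sketch below shows only that one idea under stated assumptions: the domain names are placeholders, the function is hypothetical, and nothing here describes Aiseptor's actual implementation.

```python
# Hypothetical per-session allowlist; real policies are product-specific.
APPROVED_DOMAINS = {"assessment.example.com", "cdn.example.com"}


def is_allowed(hostname, approved=APPROVED_DOMAINS):
    """Allow a hostname only if it equals, or is a subdomain of, an approved domain."""
    host = hostname.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in approved)


# Default-deny: anything not explicitly approved is blocked, so a tool does
# not need to be known in advance for its API call to fail.
for name in ["assessment.example.com", "static.cdn.example.com", "api.openai.com"]:
    print(name, "->", "allow" if is_allowed(name) else "block")
```

A fully offline model never issues a request for this check to evaluate, which matches the "Misses" column for this row.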
§8/Quarter-over-quarter
Q1 to Q2 2026: what changed, what accelerated, what declined.
| Development | Q1 status | Q2 status | Direction | Significance |
|---|---|---|---|---|
| Open-source fork count | ~12 tracked repositories | 20+ tracked repositories | ↑ ESCALATING | Each new fork increases the evasion surface; signature detection permanently obsolete for technical candidates |
| On-device LLM adoption | Niche (developer-only) | Mainstream developer hardware | ↑ ESCALATING | Apple M-series + NVIDIA RTX 4060 accessible to broad candidate pool; zero-network-trace threat growing |
| Audio pipeline coverage | Coding and system design only | Extended to behavioral / STAR rounds | ↑ ESCALATING | Audio pipelines alone now cover every interview round type; no round remains AI-resistant |
| Proxy ring sophistication | AnyDesk + human operator | SSH + VNC + AI-assisted operators | ↑ ESCALATING | Operators using AI overlays on their own machines while proxying, compounding the signal complexity |
| Commercial overlay pricing | Avg $80/month | Avg $60/month (competition) | ↓ Accessibility increasing | Lower barrier = higher adoption; $20/month tools converting casual users |
| Browser automation / scripting | MEDIUM threat | Declining relative to overlays | ↓ Lower priority | Overlays are easier and more capable; browser automation declining as primary vector |
| Hardware tool visibility | Cluely CEO mentioned hardware roadmap (Jan 2026) | No commercial product shipped yet | Monitoring | Expected Q3–Q4 2026 based on stated roadmap; threat score will rise when products ship |
| Deepfake identity in hiring | FBI-documented incidents | Gartner projects 1 in 4 profiles fabricated by 2028 | ↑ ESCALATING | Identity verification becoming necessary pre-assessment step at enterprise scale |
§9/Forward signals
What we are watching for Q3 2026.
- Hardware AI products from commercial overlay vendors. The CEO of Cluely publicly stated intent to ship hardware products (earpieces, smart glasses integration) targeting the physical interview setting. If this ships in Q3–Q4 2026, the threat score for the hardware category will jump from 75 to 85+. Physical proctoring requirements will need to be updated accordingly.
- Compact models approaching 7B-class capability. Gemma 3 1B and Qwen2.5 3B can now run usably on mid-range mobile hardware. Phone-resident local models with no internet dependency represent the next zero-trace frontier. We expect on-device mobile LLMs to move from an emerging to a documented threat vector in Q3 2026.
- Cross-session intelligence adoption by assessment platforms. CodeSignal's Suspicion Score and Fabric's behavioral analytics are the current leaders in cross-session signal aggregation. If this becomes standard (tracking operator behavioral patterns across different candidate accounts), proxy ring threat scores will drop for platforms that implement it.
- Regulatory pressure on overlay tool vendors. The EU AI Act's employment-use provisions take effect in stages through 2026. US states are advancing disclosure bills. Commercial overlay vendors face potential legal exposure. We track whether this reduces commercial tool availability or simply accelerates migration to open-source.
- Agent-mode cheating. Current tools operate in query–response mode: candidate inputs question, tool outputs answer. The next generation uses AI agents capable of autonomously solving multi-step coding challenges, navigating assessment environments, and maintaining context across a full interview loop. No commercial product has shipped this capability yet as of Q2 2026. We expect first movers in Q3–Q4.
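
The cross-session intelligence signal in the list above (one operator's behavioral pattern appearing under several candidate accounts) can be sketched as a simple grouping step. This Python example is an assumption-laden illustration, not the method used by CodeSignal or Fabric: the `fingerprint` value stands in for whatever behavioral or device signature a platform derives, and the account names are made up.

```python
from collections import defaultdict


def repeat_operators(sessions, min_accounts=2):
    """Return fingerprints seen under at least `min_accounts` distinct accounts.

    Each session is a (candidate_account, fingerprint) pair. How a fingerprint
    is derived (typing rhythm, device traits, etc.) is left abstract here.
    """
    accounts_by_fp = defaultdict(set)
    for account, fingerprint in sessions:
        accounts_by_fp[fingerprint].add(account)
    return {
        fp: sorted(accounts)
        for fp, accounts in accounts_by_fp.items()
        if len(accounts) >= min_accounts
    }


# Made-up sessions: "fp-1" appears under two different candidate accounts.
sessions = [
    ("alice", "fp-1"),
    ("bob", "fp-1"),
    ("carol", "fp-2"),
    ("alice", "fp-1"),
]

print(repeat_operators(sessions))
```

The hard part in practice is producing a fingerprint stable enough to survive across sessions without colliding between unrelated candidates; the grouping itself is trivial.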
Methodology & citation
How scores are calculated. How to cite this index.
Scoring methodology
Threat score (1–10 per tool, 1–100 per category): a weighted composite of (a) detection difficulty against deployed defenses, 40%; (b) ease of use for the median attacker, 30%; (c) prevalence in the wild, sourced from platform reports and researcher observation, 20%; and (d) capability ceiling (i.e., maximum sophistication achievable), 10%.
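Read literally, these weights give a per-tool score of the following form. The symbols $D$ (detection difficulty), $E$ (ease of use), $P$ (prevalence), and $C$ (capability ceiling) are introduced here for illustration, and the formula assumes each component is itself rated on the 1–10 scale, which the methodology does not state explicitly.

$$
T = 0.40\,D + 0.30\,E + 0.20\,P + 0.10\,C
$$

As a hypothetical worked example, component ratings of $D = 9$, $E = 8$, $P = 7$, $C = 10$ would give

$$
T = 0.40(9) + 0.30(8) + 0.20(7) + 0.10(10) = 3.6 + 2.4 + 1.4 + 1.0 = 8.4 .
$$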
Composite index score: Category scores weighted by observed usage distribution across real assessment events. Commercial overlay weight: 25%. Open-source forks: 25%. On-device LLMs: 20%. Remote access: 10%. Proxy services: 10%. Hardware: 10%.
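The composite calculation can be sketched in Python using the weights stated above. The stated weights cover six categories and sum to 100%; browser automation / scripting is listed among the tracked categories but no weight is given for it, so it is omitted here. The category scores in the example are made-up placeholders, not the index's actual per-category values.

```python
# Category weights as stated in the methodology (they sum to 100%).
WEIGHTS = {
    "commercial_overlays": 0.25,
    "open_source_forks": 0.25,
    "on_device_llms": 0.20,
    "remote_access": 0.10,
    "proxy_services": 0.10,
    "hardware": 0.10,
}


def composite_score(category_scores, weights=WEIGHTS):
    """Weighted average of 0-100 category scores, using the given weights."""
    missing = set(weights) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total_weight = sum(weights.values())
    weighted = sum(category_scores[name] * w for name, w in weights.items())
    return weighted / total_weight


# Placeholder inputs for illustration only; not published index values.
example_scores = {
    "commercial_overlays": 80,
    "open_source_forks": 90,
    "on_device_llms": 85,
    "remote_access": 90,
    "proxy_services": 90,
    "hardware": 75,
}

print(round(composite_score(example_scores), 1))
```

Dividing by the total weight keeps the result a true weighted average even if the weights are later adjusted and no longer sum exactly to one.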
Detection verdicts: BYPASSED = documented, working evasion technique accessible to a motivated candidate with ≤8 hours of preparation. PARTIALLY BYPASSED = evasion exists but requires significant preparation or has meaningful false-positive costs. NO KNOWN BYPASS = no documented evasion technique for the stated scope.
Update cadence: Quarterly. Q2 2026 covers April–June 2026. Data collected through end of March 2026. Next update: July 2026 (Q3).
APA 7th citation
Bhanushali, D. (2026, April). AI Cheating Threat Index: Q2 2026. Aiseptor. https://aiseptor.com/research/threat-index
Data sources
- Aiseptor Threat Intelligence: reverse-engineered tool analysis
- GitHub repositories: OpenCluely, Pluely, Natively, MindWhisperAI, ShadeCoder, LeetcodeWizard
- CodeSignal Fraud Rate Report (Feb 2026)
- Fabric: 19,368 AI interview analysis (Jan 2026)
- Talview AI Threat Index Report 2026
- TechCrunch: overlay tool reporting (2025–2026)
- arXiv 2601.17280: keystroke dynamics study (Jan 2026)
- FBI IC3 advisory: state-sponsored hiring fraud
- Experian 2026 Fraud Forecast
- Gartner: fabricated profile projections
The only rated defense with no known bypass
One defense passed the rating. See how it works live.
Every network-dependent tool in this index (commercial overlays, open-source forks that call cloud APIs, even under custom binary names, and remote-access pipelines) requires a network path to function. Aiseptor closes that path at the OS level before the assessment begins. Book a 30-minute demo and watch us block the Q2 2026 toolkit in real time on a candidate device.