
Perplexity AI’s rapid evolution from a niche “answer engine” into a multibillion-dollar search and productivity platform has captivated technologists and investors alike. Yet a parallel story is unfolding: the same innovations that make Perplexity’s tools more accurate, ubiquitous, and lucrative are simultaneously enlarging the attack surface for data theft, phishing, disinformation, and intellectual-property abuse. The past six months have produced a steady drumbeat of headline-grabbing incidents and research findings that illustrate how a single AI startup can unintentionally supercharge cyber-risk for publishers, businesses, and everyday users.
The article below dissects seven distinct vectors of misuse, each grounded in recent reporting, technical analysis, or legal filings, and explains how they intertwine: the same mechanisms that make the AI ecosystem more capable also leave the public more vulnerable.
Anatomy of Recent Misuse: Seven Vectors at a Glance
1. Copyright Scraping and “Skip-the-Links” Plagiarism Feed an Underground Data Market

Perplexity’s core retrieval-augmented generation (RAG) pipeline hinges on real-time scraping, but multiple investigations show the company gathering entire articles—even those explicitly protected by robots.txt—without permission. The BBC warned in June that it may seek an injunction unless existing copies of its content are deleted and future scraping ends. Dow Jones and NYP Holdings went further, filing a Southern District of New York lawsuit accusing Perplexity of “massive illegal copying” and trademark dilution.
Illicitly acquired publisher data has knock-on effects beyond lost ad revenue. Scraped archives become fodder for phishing kits and malspam campaigns, because criminals can instantly remix credible journalistic text, inject malicious links, and distribute plausible-looking emails that bypass basic spam heuristics. The legal fight therefore doubles as a cybersecurity battleground: preventing AI companies from indiscriminately harvesting copyrighted prose also reduces the corpus available to threat actors.
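For publishers and platform engineers, the baseline defense is unglamorous: honor robots.txt before any fetch. The sketch below, using Python's standard urllib.robotparser, shows the check a compliant crawler would perform; the user-agent string and target URL are placeholders rather than Perplexity's actual values.

```python
from urllib import robotparser
from urllib.parse import urlsplit

# Placeholder crawler identity and target URL; neither reflects Perplexity's
# actual infrastructure.
USER_AGENT = "ExampleRAGBot/1.0"
TARGET_URL = "https://www.example.com/news/some-article"

def may_fetch(url: str, user_agent: str = USER_AGENT) -> bool:
    """Return True only if the site's robots.txt permits this user agent to fetch the URL."""
    parts = urlsplit(url)
    rp = robotparser.RobotFileParser()
    rp.set_url(f"{parts.scheme}://{parts.netloc}/robots.txt")
    rp.read()  # download and parse the site's robots.txt
    return rp.can_fetch(user_agent, url)

if __name__ == "__main__":
    if may_fetch(TARGET_URL):
        print("robots.txt permits this fetch")
    else:
        print("robots.txt disallows this fetch; a compliant crawler stops here")
```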
How Scraping Supercharges Cybercrime
- Automated pseudo-journalism: Forbes documented Perplexity “knock-off stories” that lifted fragments from investigative articles without attribution. These free text fragments let scammers craft convincing lures at scale.
- API abuse vectors: Once scraped, paywalled content often ends up in unsecured cloud buckets or developer sandboxes, ripe targets for credential-stuffing and misconfiguration attacks aimed at exfiltrating entire datasets.
- Trust dilution: As AI-generated summaries replace visits to original sites, users lose traditional trust signals (padlock icons, known domains)—a perfect environment for spoofed look-alike URLs.
2. Hallucinated Links Funnel Users Toward Phishing Domains
Large language models can mistake similar string patterns for verified domains. Netcraft’s controlled experiment asked GPT‑4.1 family models (deployed by Perplexity) for the login pages of 50 brands; 34% of the 131 suggested URLs were either dormant or entirely unrelated to the target company. In one live test, Perplexity provided a Google Sites phishing clone instead of wellsfargo.com.
Because AI chat interfaces summarize answers without the domain-by-domain ranking users expect from traditional search, malicious links appear authoritative, removing the friction that historically gave people pause.
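A crude illustration of the missing guardrail: before echoing a login URL, an answer engine could compare the suggested host against a curated allowlist of official domains for the brand in question. The allowlist and helper below are assumptions for illustration only; a production system would pair this with reputation feeds and Certificate Transparency data.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-maintained allowlist; a real system would source this
# from verified brand metadata or a reputation feed.
OFFICIAL_DOMAINS = {
    "wells fargo": {"wellsfargo.com"},
    "example bank": {"examplebank.com"},
}

def looks_official(brand: str, suggested_url: str) -> bool:
    """Return True if the suggested URL's host matches or ends with an allowlisted domain."""
    host = urlsplit(suggested_url).hostname or ""
    return any(
        host == domain or host.endswith("." + domain)
        for domain in OFFICIAL_DOMAINS.get(brand.lower(), set())
    )

# A hosted phishing clone (e.g. a *.google.com/sites page) fails the check.
print(looks_official("Wells Fargo", "https://www.wellsfargo.com/login"))        # True
print(looks_official("Wells Fargo", "https://sites.google.com/view/wf-login"))  # False
```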
Why the Problem Persists
- Sparse brand representation: Mid-size banks and utility portals appear infrequently in training data, increasing hallucination odds.
- Lack of reputation scoring: Perplexity’s answer engine does not weigh established domain authority the way Google’s PageRank does, so typosquats slip through.
- User psychology: Plain-language answers from a confident chatbot short-circuit habitual URL inspection, accelerating click-throughs to fake sites.
3. Critical Vulnerabilities in the Official Android App
Appknox researchers disclosed six exploitable weaknesses in Perplexity’s Android client, including missing SSL certificate pinning and exposure to the StrandHogg task-hijacking flaw. Attackers can intercept unsecured network calls, overlay phishing screens, and even repackage the APK to inject malware via the Janus signature-bypass vulnerability.
Significantly, one flaw lets users call Perplexity’s premium APIs free of charge by tampering with local traffic—a revenue threat to the company, but also an invitation for malicious bot herders seeking cost-free LLM access to craft spear-phishing emails.
Even when Perplexity patches such issues, the broader ecosystem remains fragile: countless mirrors of older APKs circulate on shady app stores, and supply-chain counterfeiters can weaponize them long after official fixes.
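The missing control at the heart of the Appknox findings, certificate pinning, is straightforward to express. The Python sketch below is a language-neutral illustration of the concept rather than a fix for the Android client (which would pin in Kotlin/Java inside its networking stack); the host and pinned fingerprint are placeholders.

```python
import hashlib
import socket
import ssl

# Placeholder values; a real pin would be the SHA-256 of the server's
# certificate (or its public key) captured out of band at build time.
PINNED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"
HOST, PORT = "api.example.com", 443

def certificate_matches_pin(host: str, port: int, pinned: str) -> bool:
    """Connect over TLS and compare the leaf certificate's SHA-256 fingerprint to the pin."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            der_cert = tls.getpeercert(binary_form=True)  # raw DER bytes of the leaf cert
    fingerprint = hashlib.sha256(der_cert).hexdigest()
    return fingerprint == pinned.lower()

if __name__ == "__main__":
    if not certificate_matches_pin(HOST, PORT, PINNED_SHA256):
        raise SystemExit("Certificate does not match pin; refusing to send data")
```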
4. A Flood of Counterfeit Apps and Impersonation Schemes

Security vloggers found dozens of Play Store listings mimicking Perplexity’s iconography but funneling users into ad-laden or subscription-trap shells. The apps request intrusive permissions—contact lists, camera, microphone—turning curious first-time users into data donors for spam farms.
HardReset’s teardown showed how to verify the genuine developer signature, yet fake apps still rank high in sponsored search slots. Because Perplexity’s brand is itself new and its visual identity shifts quickly, recognition fatigue aids impostors. The result: malware authors harvest query logs, location data, and even voice snippets under the guise of an AI assistant.
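Until store-level curation improves, hash and signature checks remain the user's best defense. The sketch below compares a downloaded APK's SHA-256 against a value the developer would publish through an authenticated channel; the file path and published hash are placeholders, and on Android the stronger check is the signing certificate itself (for example via apksigner verify --print-certs).

```python
import hashlib

# Both values are placeholders: the path to a downloaded APK and the SHA-256
# the developer publishes through an authenticated channel.
APK_PATH = "perplexity.apk"
PUBLISHED_SHA256 = "0000000000000000000000000000000000000000000000000000000000000000"

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file and return its hex-encoded SHA-256 digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

if __name__ == "__main__":
    if sha256_of_file(APK_PATH) != PUBLISHED_SHA256.lower():
        raise SystemExit("Hash mismatch: do not install this APK")
    print("APK matches the published hash")
```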
5. Comet Browser: From Answer Engine to Surveillance Hub
Perplexity’s July launch of the Comet browser extends LLM integration from search box to full device telemetry. Malwarebytes warns that Comet’s raison d’être is behavioral tracking: CEO Aravind Srinivas explicitly said the firm wants insight into “what are you spending time browsing … to build a better us”.
The standalone Comet Privacy Notice confirms broad data collection, including “pages viewed, scroll depth, and interaction timings,” which may be stored on overseas clouds. Privacy advocates note that by pre-installing Comet on smartphones—talks confirmed with OEMs in July—Perplexity could observe 24×7 browsing habits, not merely occasional AI queries.
Browser supply-chain dominance creates cascading cyber-risk: if a malicious actor compromises Comet’s extension framework or update channel, the attacker gains privileged, persistent access to passwords and cookies across millions of devices.
6. Misinformation Feedback Loops and “Second-Hand Hallucinations”
Perplexity markets itself as an antidote to ChatGPT-style hallucinations by attaching citations. Yet GPTZero’s June study shows the citations themselves are increasingly AI-generated; users encounter an AI article citing another AI article, forming a closed loop of synthetic “authority”. Reddit anecdotes back this up: one user’s China land-reform report listed sources dated “July 2025”, months before that date had arrived, and Perplexity could not locate the links when challenged.
As these phantom references propagate through social media, disinformation acquires the sheen of scholarly rigor. Cybercriminals seize on this dynamic to seed deepfake “research” that legitimizes pump-and-dump crypto tokens or health scams. Once LLMs ingest the newly minted spam, the cycle repeats, embedding falsehoods deeper into the knowledge graph.
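One way to catch such phantom references is to ask whether a cited page was ever captured by the Internet Archive before the date a report claims. The sketch below uses the Archive's public availability endpoint; the cited URL and cutoff date are placeholders, and a missing snapshot is a weak signal that warrants manual review, not proof of fabrication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

def closest_snapshot(cited_url: str, before: str) -> dict | None:
    """Ask the Internet Archive for the snapshot closest to a YYYYMMDD date, if any."""
    query = urlencode({"url": cited_url, "timestamp": before})
    with urlopen(f"https://archive.org/wayback/available?{query}", timeout=10) as resp:
        payload = json.load(resp)
    return payload.get("archived_snapshots", {}).get("closest")

# Placeholder citation pulled from a hypothetical AI-generated report.
CUTOFF = "20240101"
snapshot = closest_snapshot("https://example.com/land-reform-study", CUTOFF)
if snapshot and snapshot["timestamp"][:8] <= CUTOFF:
    print("Archived before that date:", snapshot["url"])
else:
    print("No archive evidence the page existed by then; review the citation manually")
```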
7. Intensifying Legal and Regulatory Dragnet
Perplexity’s Acceptable Use Policy explicitly forbids facilitation of cybercrime, but enforcement lags behind scale. The company now faces pressure from three flanks:
- Publishers: BBC’s threat letter and News Corp’s lawsuit demand deletion of infringing datasets and monetary damages.
- Cloud providers: AWS is investigating whether Perplexity violates scraping rules; a negative finding could suspend critical compute resources.
- Data-protection regulators: Cross-border data transfers in Comet and enterprise APIs raise red flags under GDPR and India’s DPDPA, as privacy lawyers note ambiguity around subcontractor access.
If any one investigation forces Perplexity to purge historical logs or retrain models from scratch, downstream services—including WhatsApp fact-checking bots that aim to combat scams—could become less reliable overnight, ironically undercutting tools designed to fight misinformation.
Policy Implications and Risk-Mitigation Roadmap
For AI Developers
- Mandatory domain-reputation scoring: Integrate phishing blacklists and Certificate Transparency logs before echoing links in answers (see the sketch after this list).
- Citation provenance auditing: Require cryptographic signatures or web-archive snapshots to verify that every cited page existed at crawl time.
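On the Certificate Transparency side, even a public log front-end such as crt.sh exposes the raw material for typosquat screening. The sketch below queries crt.sh's JSON endpoint for certificate names containing a brand string; the brand and official domain are placeholders, the wildcard query can be slow and rate-limited, and a production pipeline would consume CT logs directly rather than scrape a web front-end.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

def cert_names_containing(keyword: str, limit: int = 25) -> list[str]:
    """Query crt.sh's public JSON endpoint for certificate names containing the keyword."""
    # %25 is a URL-encoded '%' wildcard, so the query matches any name containing the keyword.
    url = f"https://crt.sh/?q=%25{quote(keyword)}%25&output=json"
    with urlopen(url, timeout=60) as resp:
        entries = json.load(resp)
    names: set[str] = set()
    for entry in entries[:limit]:
        names.update(entry.get("name_value", "").splitlines())
    return sorted(names)

# Placeholder brand string and official domain; names outside the official
# domain are candidates for typosquat review before an answer engine links to them.
for name in cert_names_containing("wellsfargo"):
    if not (name == "wellsfargo.com" or name.endswith(".wellsfargo.com")):
        print("Review:", name)
```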
For Regulators
- Licensing transparency: Enforce public disclosure of content-licensing agreements and scraping exemptions to curb IP abuse.
- AI-specific privacy labels: Treat browsers like Comet as high-risk data processors subject to opt-in consent for behavioral tracking.
For End-Users
- Verify before click: Scrutinize every domain surfaced by AI tools, especially when asked to enter credentials.
- Install from first-party stores: Confirm developer names and package hashes to avoid counterfeit Perplexity apps.
- Use compartmentalized profiles: Separate AI-powered browsing sessions from online banking to minimize cross-site risk.