LinkedIn scraping in 2026: what's legal, what got Proxycurl shut down

Proxycurl was doing $10M/year scraping LinkedIn. Then LinkedIn and Microsoft sued. Here's what the court cases actually say about scraping legality, and what alternatives work without the legal risk.

Alexandre Sarfati

Published February 21, 2026
Updated April 2, 2026
Proxycurl was doing $10M a year. Then LinkedIn sued.

In January 2025, LinkedIn and Microsoft filed a federal lawsuit against Proxycurl and its founder Steven Goh. The allegation: hundreds of thousands of fake accounts used to scrape millions of LinkedIn profiles. By July 2025, Proxycurl - one of the most commercially successful LinkedIn scraping operations ever, with $10 million in annual recurring revenue - was shut down permanently.

The founder was candid about why. As Goh explained, they'd built the company organically without VC funding and simply couldn't afford to fight a multi-billion dollar corporation in court. The settlement required deleting all LinkedIn data obtained through unauthorized means.

This wasn't the first time LinkedIn went after scrapers. But Proxycurl's scale and public profile made it the most visible example yet of what happens when LinkedIn decides to enforce.

If you're collecting LinkedIn data for prospecting, this matters. The legal landscape has shifted dramatically since the hiQ ruling that everyone cites as proof that scraping is legal.

The hiQ case doesn't say what people think it says

The hiQ Labs v. LinkedIn case gets referenced constantly as evidence that LinkedIn scraping is legal. Here's what actually happened.

The ruling everyone cites

In April 2022, the Ninth Circuit ruled that scraping publicly accessible data doesn't violate the Computer Fraud and Abuse Act (CFAA). The court's logic: public websites impose no access authorization requirements - "there were no gates to lift or lower in the first place."

This was a genuine legal milestone. It established that accessing public data isn't "hacking" under federal law.

What happened next (the part people skip)

Later that same year, the district court found that hiQ had breached LinkedIn's user agreement - the Terms of Service that prohibit scraping. The court also found hiQ had failed to preserve evidence and awarded sanctions.

The parties settled. hiQ, which had lost funding, clients, and employees during years of litigation, closed down.

The practical lesson: you probably won't go to prison for scraping public LinkedIn data, but LinkedIn can still sue you for breach of contract, ban your account, and make running your business impossible.

Where the law stands now

| Question | Legal answer |
| --- | --- |
| Is scraping public LinkedIn profiles a federal crime? | No (Ninth Circuit, hiQ ruling, 2022) |
| Can LinkedIn sue you for breach of Terms of Service? | Yes - and they do |
| Can LinkedIn ban your account for scraping? | Yes, immediately and permanently |
| Does GDPR allow you to scrape EU residents' data? | Only with a valid legal basis and proper compliance |
| Can you scrape data behind a login wall? | This crosses a much clearer legal line |

The critical distinction: CFAA legality is not the same as "safe to do." LinkedIn has the resources and willingness to pursue scrapers through civil litigation, and the hiQ case itself proves that winning the CFAA argument doesn't save your business.

Why LinkedIn's enforcement is escalating

LinkedIn has gotten much more aggressive about scraping enforcement since 2024. The Proxycurl lawsuit wasn't an isolated incident.

The AI factor. Bloomberg Law reported that LinkedIn's war against bot scrapers has ramped up specifically because AI companies want LinkedIn data for training models. The volume of scraping attempts has increased dramatically, and LinkedIn is investing in detection technology to match.

Machine learning detection. LinkedIn's detection systems now use ML to identify scraping patterns - not just volume, but behavioral signatures like timing patterns, session characteristics, and device fingerprinting. The same Q4 2024 algorithm update that increased automation detection rates also improved scraping detection.

The fake account problem. Proxycurl's downfall wasn't just scraping - it was operating hundreds of thousands of fake accounts. LinkedIn's lawsuit highlighted this specifically. Creating fake accounts to access data is a much clearer violation than scraping public data, and it's what LinkedIn has pursued most aggressively in court.

Most LinkedIn data needs can be met without scraping. Here are the approaches practitioners actually use, ranked by risk.

Zero risk: LinkedIn's own export

Most people don't know this exists. Go to Settings > Data Privacy > Get a copy of your data. LinkedIn emails you an archive of your connections, including names, companies, titles, and emails (when shared).

It's your data. It's 100% compliant. And for many use cases - importing your network into a CRM, building a list of warm contacts to reach out to - it's all you need.

Limitation: You only get your own connections, and the export is manual.
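For the CRM-import use case, the archive can be processed with a few lines of Python. This is a minimal sketch, not a description of LinkedIn's export format: the column names ("First Name", "Email Address", and so on) and the presence of preamble lines before the header row are assumptions, so check them against your own file before relying on it.

```python
import csv
import io

# Assumed column names; verify against the header row of your own export.
FIELDS = ("First Name", "Last Name", "Email Address", "Company", "Position")


def load_connections(text: str) -> list:
    """Parse an exported connections CSV into a list of trimmed records."""
    lines = text.splitlines()
    # Assumption: the header is the first line that starts with "First Name";
    # anything before it (notes, blank lines) is skipped.
    start = next(
        (i for i, line in enumerate(lines) if line.startswith("First Name")),
        None,
    )
    if start is None:
        raise ValueError("no header row found")
    reader = csv.DictReader(io.StringIO("\n".join(lines[start:])))
    # Keep only the fields a CRM import needs, with whitespace trimmed.
    return [
        {name: (row.get(name) or "").strip() for name in FIELDS}
        for row in reader
    ]


def with_email(rows: list) -> list:
    """Keep only connections who chose to share an email address."""
    return [row for row in rows if row["Email Address"]]


# Hypothetical sample data in the assumed layout.
sample = """Notes:
"Preamble text that is not part of the table"

First Name,Last Name,URL,Email Address,Company,Position,Connected On
Ada,Lovelace,https://example.com/in/ada,ada@example.com,Analytical Engines,Engineer,01 Jan 2026
Alan,Turing,https://example.com/in/alan,,Bletchley,Researcher,02 Jan 2026
"""

rows = load_connections(sample)
print(len(rows), "connections,", len(with_email(rows)), "with a shared email")
```

Filtering on the email column matters because the export only includes addresses that connections have chosen to share, so many rows will have that field empty.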

Low risk: third-party data providers

Companies like Apollo.io, ZoomInfo, and Cognism collect professional contact data through legitimate means - partnerships, public filings, user opt-ins, and licensed databases. You're buying data from a provider that handles the compliance.

Bright Data is worth noting specifically: they won court cases against Meta and X in 2024, becoming the first web scraping company to be thoroughly examined in U.S. courts and win twice. That legal validation matters if compliance is your priority.

| Provider | Starting price | Contacts | GDPR compliant |
| --- | --- | --- | --- |
| Apollo.io | Free tier (50 credits/mo) | 270M+ | Yes |
| ZoomInfo | ~$10K/year | 70M+ | Yes |
| Cognism | Enterprise pricing | Phone-verified mobiles | Yes |
| Lusha | $29/mo per user | B2B contacts | Yes |

The trade-off: You're paying for data someone else collected. Data quality and freshness vary by provider. And you're trusting the provider's compliance claims.

Medium risk: browser-based automation

Tools that run in your browser, using your actual account and IP, occupy a gray area. They technically violate LinkedIn's Terms of Service, but they're much harder to detect than server-side scraping and they don't involve fake accounts.

The key distinction from Proxycurl's approach: you're using your real account, from your real browser, at human-like speeds. LinkedIn's detection focuses on scale and pattern anomalies, not on whether you clicked "connect" yourself or had a browser extension do it.

BeReach takes this approach - a Chrome extension handles authentication, and the API handles the rest. Starting at EUR49/month, it gives you access to LinkedIn data through legitimate authentication rather than fake accounts or server-side scraping.

Risk mitigation:

  • Stay within daily limits (20-30 actions)
  • Use your real account and IP
  • Don't bulk-export data - interact with it
  • Accept that ToS violation risk exists
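The first item in the list above, a hard daily cap, is simple to enforce in code. The sketch below is a generic per-day counter, not how BeReach or any other tool works; the class name `DailyBudget` and the default of 25 (the midpoint of the 20-30 range) are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DailyBudget:
    """Counts actions per calendar day and refuses any beyond the limit."""

    limit: int = 25  # assumption: midpoint of the 20-30 range
    _day: Optional[date] = None
    _used: int = 0

    def try_spend(self, today: date) -> bool:
        """Return True and record one action if today's budget allows it."""
        if today != self._day:
            # First action of a new day: reset the counter.
            self._day, self._used = today, 0
        if self._used >= self.limit:
            return False
        self._used += 1
        return True


budget = DailyBudget(limit=3)
first_day = date(2026, 4, 2)
print([budget.try_spend(first_day) for _ in range(4)])  # fourth call is refused
print(budget.try_spend(date(2026, 4, 3)))  # new day, counter resets
```

Passing the date in explicitly, rather than reading the clock inside the method, keeps the counter easy to test and makes the reset behavior obvious.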

No-risk alternative: earn the data

The approaches above all involve taking data from LinkedIn. There's a fundamentally different strategy: create content that makes prospects come to you.

Lead magnets, webinars, and gated tools collect LinkedIn-equivalent data with explicit consent. A "Free LinkedIn Profile Audit" tool that asks for a profile URL gives you the same data scraping would - but the prospect handed it to you voluntarily.

This doesn't scale the same way, and it requires content investment. But the data you collect is higher quality (they opted in), fully compliant (consent-based), and comes with built-in buying intent.

GDPR: the constraint most articles underestimate

If any of your prospects are in the EU, GDPR can apply regardless of where your company is based. And scraped personal data gets no special treatment: you need a legal basis to collect and store it, the same as for any other personal data.

Legitimate interest is the legal basis most B2B companies rely on for outreach. It can work, but it requires:

  • A documented assessment that your business interest outweighs the individual's privacy rights
  • A clear opt-out mechanism in every communication
  • Data minimization - only collect what you need
  • Deletion on request - and you need a process for this
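Three of the requirements above (opt-out, data minimization, deletion on request) translate directly into how you store prospect records. The following is a minimal in-memory sketch under stated assumptions: the `ProspectStore` class, the `ALLOWED_FIELDS` whitelist, and the `source` field are all hypothetical names chosen for illustration, and a real system would need persistence and an audit trail on top.

```python
from dataclasses import dataclass, field

# Assumption: the only fields this team needs for outreach.
ALLOWED_FIELDS = {"name", "company", "title", "email"}


@dataclass
class ProspectStore:
    """In-memory prospect records with minimization, opt-out, and erasure."""

    records: dict = field(default_factory=dict)
    opted_out: set = field(default_factory=set)

    def add(self, email: str, raw: dict, source: str) -> bool:
        """Store a prospect unless they opted out; drop fields not needed."""
        key = email.lower()
        if key in self.opted_out:
            return False
        # Data minimization: keep whitelisted fields only.
        kept = {k: v for k, v in raw.items() if k in ALLOWED_FIELDS}
        # Record where the data came from, for transparency requests.
        kept["source"] = source
        self.records[key] = kept
        return True

    def opt_out(self, email: str) -> None:
        """Delete the record and remember the address so it is not re-added."""
        key = email.lower()
        self.opted_out.add(key)
        self.records.pop(key, None)

    def erase(self, email: str) -> bool:
        """Handle a deletion request; True if a record was removed."""
        return self.records.pop(email.lower(), None) is not None


store = ProspectStore()
store.add("Ada@example.com", {"name": "Ada", "home_address": "unused"}, "webinar signup")
print(store.records)  # home_address was never stored
store.opt_out("ada@example.com")
print(store.add("ada@example.com", {"name": "Ada"}, "list import"))  # refused
```

Note the design choice in `opt_out`: it keeps the bare email address in a suppression set so that a later import cannot silently re-add the person. Whether that retention is appropriate is something to cover in your documented assessment.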

What GDPR doesn't allow: bulk scraping personal data, storing it indefinitely, or using it without transparency about how you got it.

The Clearview AI settlement in 2025 - roughly $51 million and 23% company equity to plaintiffs - came out of U.S. biometric privacy litigation rather than GDPR, but it shows what happens when facial recognition and personal data scraping intersect with privacy law. LinkedIn data is less sensitive than facial scans, but the regulatory direction is clear.

If you're scraping LinkedIn profiles of EU residents and storing that data without a valid GDPR legal basis, you're exposed. Fines can reach 4% of global annual turnover or EUR20 million, whichever is higher. This isn't theoretical - enforcement is increasing.
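The "whichever is higher" wording means the EUR20 million figure acts as a floor on the maximum, not a cap. A short worked example makes that concrete. The function name is hypothetical, and the result is the statutory ceiling only, not a prediction of what a regulator would actually impose.

```python
def gdpr_fine_ceiling_eur(annual_turnover_eur: int) -> int:
    """Upper bound of the fine: the higher of 4% of turnover and EUR20 million."""
    four_percent = annual_turnover_eur * 4 // 100  # integer math, whole euros
    return max(four_percent, 20_000_000)


# Hypothetical turnover figures, in euros.
for turnover in (5_000_000, 500_000_000, 1_000_000_000):
    print(turnover, "->", gdpr_fine_ceiling_eur(turnover))
```

For any company with turnover under EUR500 million, 4% is less than EUR20 million, so the ceiling stays at EUR20 million. That is why the exposure is disproportionately large for small teams.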

The shift from extraction to interaction

The Proxycurl shutdown signals a broader trend: the era of large-scale LinkedIn data extraction is ending. LinkedIn is too well-resourced, too motivated (especially with AI training data concerns), and too legally aggressive for scraping to be a sustainable business model.

What's replacing it is interaction-based data collection. Instead of extracting thousands of profiles into a database, you interact with prospects on LinkedIn - visit their profiles, engage with their content, send connection requests - and collect data through those interactions.

This is fundamentally what tools like BeReach do. You're not extracting LinkedIn data in bulk. You're working inside the platform through your own account, with automation handling the repetitive parts. The data you get is fresher (it comes from real-time interactions), more relevant (you're already engaging with the prospect), and less legally exposed than bulk extraction, although the Terms of Service risk described earlier still applies.

The trade-off is speed. You can't build a database of 50,000 prospects overnight this way. But you can build a pipeline of 50 genuinely warm prospects per week, which is what most B2B teams actually need.

Frequently asked questions

What happened to Proxycurl?

LinkedIn and Microsoft filed a federal lawsuit in January 2025 alleging Proxycurl operated hundreds of thousands of fake accounts to scrape millions of profiles. Despite $10M in annual revenue, Proxycurl couldn't afford to fight and shut down by July 2025. The settlement required permanent deletion of all scraped data. It's the highest-profile LinkedIn scraping enforcement action to date.

Does the hiQ ruling mean LinkedIn scraping is safe?

No. The hiQ ruling only addressed the CFAA (federal hacking law). The same case found hiQ breached LinkedIn's Terms of Service (contract law). hiQ ultimately closed down after years of litigation. The ruling means scraping public data isn't a federal crime, but LinkedIn can still sue you for breach of contract, ban your account, and make running your business extremely difficult.

Can I use LinkedIn data under GDPR?

B2B prospecting can use "legitimate interest" as a legal basis under GDPR, but you must document why your business interest outweighs the individual's privacy rights, provide opt-out mechanisms, minimize data collection, and honor deletion requests. Bulk scraping without these safeguards exposes you to fines of up to 4% of global annual turnover or EUR20 million, whichever is higher. Using compliant data providers is the safer path.
