The Web Scraping Legal Landscape in 2026

By Hex Proxies Engineering Team

This article is for informational purposes only and does not constitute legal advice. Consult qualified counsel for guidance specific to your situation.

The legal status of web scraping in the United States has stabilized over the past four years, but it has not become simple. Three decisions form the spine of current doctrine: hiQ Labs, Inc. v. LinkedIn Corp., 938 F.3d 985 (9th Cir. 2019), reaffirmed on remand, 31 F.4th 1180 (9th Cir. 2022); Van Buren v. United States, 593 U.S. 374 (2021); and the 2024 summary judgment order in Meta Platforms, Inc. v. Bright Data Ltd., No. 3:23-cv-00077 (N.D. Cal.). Read together, they have narrowed the Computer Fraud and Abuse Act as a tool against scrapers of public data and pushed platforms toward contract and trespass-to-chattels theories instead.

This article walks through the current state of each doctrine and describes the practical compliance posture a data team should hold in 2026.

The CFAA After hiQ and Van Buren

The Computer Fraud and Abuse Act, 18 U.S.C. § 1030, criminalizes accessing a computer "without authorization" or in a manner that "exceeds authorized access." For two decades, platforms argued that a cease-and-desist letter or a terms-of-service clause was enough to revoke authorization, making subsequent scraping a federal crime.

The Supreme Court narrowed "exceeds authorized access" in Van Buren. The Court held that the phrase applies only to information located in areas of a computer system that the user is not permitted to access at all, rejecting a broader reading that would have criminalized any use of information that violated a policy. Justice Barrett, writing for a 6-3 majority, adopted a "gates-up-or-down" approach: if the gate is open, walking through it does not exceed authorized access, even if you later use the data in a way the owner dislikes.

The Ninth Circuit applied that logic to scraping in its 2022 hiQ remand opinion, which affirmed a preliminary injunction in hiQ's favor. LinkedIn had sent hiQ a cease-and-desist letter demanding that it stop scraping public profile pages. The court concluded that because the profiles were publicly accessible without authentication, LinkedIn's letter could not transform the access into access "without authorization" for CFAA purposes. 31 F.4th at 1201. The court wrote that the CFAA's prohibition "does not apply to the scraping of information that is publicly available."

Three things follow from this. First, scraping data that is visible to any logged-out browser is not a CFAA violation in the Ninth Circuit, and courts in other circuits, including the Second and Fourth, have cited hiQ approvingly. Second, scraping behind a login wall is still risky under the CFAA, because authentication acts as a gate. Third, the CFAA regulates access, not use; once the data is acquired, downstream copyright, contract, and privacy claims remain available to plaintiffs.

The Shift to Contract Theories: Meta v. Bright Data

Because the CFAA is no longer a strong weapon against scrapers of public data, platforms have pivoted to breach-of-contract and trespass-to-chattels theories. Meta's 2023 suit against Bright Data tested both. Meta alleged that Bright Data, as a logged-in user of Facebook and Instagram, had agreed to terms prohibiting automated collection, and that its scraping therefore breached that contract. Meta also argued trespass based on server load.

On January 23, 2024, Judge Edward Chen granted Bright Data summary judgment in part, finding that Meta had not shown that Bright Data scraped while logged in, and that scraping public-facing pages while logged out did not subject Bright Data to the terms of service. The court reasoned that a terms-of-service agreement requires acceptance, and that browsing a publicly accessible page without creating an account or clicking through a clickwrap does not create a contract. The parties subsequently settled on undisclosed terms.

The practical takeaway is that the logged-out versus logged-in distinction now carries significant legal weight. Scraping while authenticated creates a contractual nexus that scraping from a clean browser does not.

Copyright and the hot-news doctrine

Scraping factual data generally does not infringe copyright because facts are not copyrightable under Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991). Scraping creative expression (articles, photos, product descriptions written by the seller) is a copyright question and turns on fair use analysis under 17 U.S.C. § 107 and the four-factor test applied most recently in Andy Warhol Foundation v. Goldsmith, 598 U.S. 508 (2023).

The hot-news misappropriation doctrine, descended from International News Service v. Associated Press, 248 U.S. 215 (1918), is largely preempted by the Copyright Act under the Second Circuit's analysis in NBA v. Motorola, 105 F.3d 841 (2d Cir. 1997), and survives only in narrow time-sensitive contexts.

Trespass to chattels

eBay v. Bidder's Edge, 100 F. Supp. 2d 1058 (N.D. Cal. 2000), is still cited, but its reach was narrowed by Intel Corp. v. Hamidi, 71 P.3d 296 (Cal. 2003), which held that the tort requires actual harm to the system, not mere unwanted contact. Modern trespass claims against scrapers survive only when the plaintiff can show measurable degradation of server performance or resource exhaustion. Well-behaved scrapers that observe rate limits and distribute load are rarely viable targets.
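
The rate-limiting point can be made concrete. The following is a minimal Python sketch, not a prescribed implementation, of a per-host throttle that enforces a minimum gap between requests to the same host. The class name PerHostThrottle, the min_interval_s parameter, and the two-second default are illustrative assumptions; the right interval depends on the target site and is an engineering judgment, not a legal safe harbor.

```python
import time
from urllib.parse import urlsplit


class PerHostThrottle:
    """Enforce a minimum delay between requests to the same host.

    Hypothetical helper for illustration; the 2-second default is arbitrary.
    """

    def __init__(self, min_interval_s=2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock      # injectable so the throttle can be tested
        self._sleep = sleep
        self._next_allowed = {}  # host -> earliest time the next request may start

    def wait(self, url):
        """Block until a request to url's host is allowed; return seconds waited."""
        host = urlsplit(url).netloc.lower()
        now = self._clock()
        ready_at = self._next_allowed.get(host, now)
        delay = max(0.0, ready_at - now)
        if delay > 0:
            self._sleep(delay)
        # Schedule the next permitted request one interval after this one starts.
        self._next_allowed[host] = max(now, ready_at) + self.min_interval_s
        return delay
```

A caller would invoke throttle.wait(url) immediately before each request. Keeping the state per host means a crawl spread over many sites does not slow down globally, while no single server sees bursts.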

State Computer Crime Statutes

While the federal CFAA has contracted, several state computer crime statutes remain broader. California Penal Code § 502, in particular, uses "without permission" language that has not been authoritatively narrowed the way Van Buren narrowed the CFAA. The Ninth Circuit in United States v. Christensen, 828 F.3d 763 (9th Cir. 2015), treated § 502 as reaching conduct the CFAA might not. Operators should not assume the hiQ reasoning automatically transfers to state law.

International Exposure

The EU Digital Services Act, fully applicable since February 2024, does not directly regulate scraping but increases platform obligations that may affect how platforms structure anti-scraping defenses. The Court of Justice of the European Union's 2021 decision in Ryanair DAC v. DataBird GmbH and the UK Supreme Court's reasoning in Lloyd v. Google LLC [2021] UKSC 50 have shaped downstream privacy claims. Scraping personal data of EU residents also triggers GDPR, which requires a separate analysis covered in our GDPR compliance article.

Practical Compliance Posture

A defensible scraping program in 2026 has six ingredients:

  1. Scrape logged-out whenever possible. Authentication creates contract exposure that anonymous access does not.
  2. Honor robots.txt where feasible. It is not a legal instrument in the United States, but ignoring it is used by plaintiffs as evidence of bad faith.
  3. Rate limit. Doing so keeps trespass-to-chattels claims out of range by ensuring no measurable harm to the target system.
  4. Exclude authenticated platforms absent clear authorization. Social networks, walled-garden marketplaces, and sites behind paywalls are higher-risk targets.
  5. Do not resell copyrightable expression without a fair use analysis. Facts are fine; prose and photos require review.
  6. Document your purpose. Research, price comparison, and market intelligence have stronger fair use and legitimate interest defenses than competitive replication.
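
Items 1 and 2 in the list above lend themselves to an automated pre-flight check. The sketch below, in Python, uses the standard library's urllib.robotparser to evaluate a robots.txt body and refuses any request that carries credentials. The function names build_robots_policy and is_fetch_allowed, the AUTH_HEADERS set, and the "example-bot" user agent are hypothetical names chosen for illustration; treating Cookie and Authorization headers as the marker of a logged-in request is a simplifying assumption, not a legal test.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a request carrying either header is treated as authenticated.
AUTH_HEADERS = {"authorization", "cookie"}


def build_robots_policy(robots_txt):
    """Parse the text of a robots.txt file that was fetched separately."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_fetch_allowed(parser, user_agent, url, headers=None):
    """Return True only for logged-out requests that robots.txt permits."""
    headers = headers or {}
    if any(name.lower() in AUTH_HEADERS for name in headers):
        return False  # keep the program on the logged-out side of the line
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    policy = build_robots_policy("User-agent: *\nDisallow: /private/\n")
    print(is_fetch_allowed(policy, "example-bot", "https://example.com/public/page"))
    print(is_fetch_allowed(policy, "example-bot", "https://example.com/private/x"))
```

Logging each decision alongside the stated purpose of the crawl would also support item 6, since it produces a contemporaneous record of what the scraper declined to fetch.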

A Note on Proxy Providers

A proxy provider is a common carrier of IP traffic. In Bright Data, the court did not hold the underlying proxy network liable for customer conduct any more than AT&T is liable for what its subscribers dial. But providers that market themselves as tools for circumventing authentication or bypassing legal restrictions have a weaker posture than those that position themselves as general-purpose network infrastructure. Hex Proxies maintains an acceptable use policy and responds to abuse reports within published timelines; see our compliance page for details.

Key Citations

  • hiQ Labs, Inc. v. LinkedIn Corp., 31 F.4th 1180 (9th Cir. 2022).
  • Van Buren v. United States, 593 U.S. 374 (2021).
  • Meta Platforms, Inc. v. Bright Data Ltd., No. 3:23-cv-00077 (N.D. Cal. Jan. 23, 2024) (order on summary judgment).
  • Feist Publications, Inc. v. Rural Telephone Service Co., 499 U.S. 340 (1991).
  • Intel Corp. v. Hamidi, 71 P.3d 296 (Cal. 2003).
  • Andy Warhol Foundation v. Goldsmith, 598 U.S. 508 (2023).
  • 18 U.S.C. § 1030 (Computer Fraud and Abuse Act).
  • Cal. Penal Code § 502.