Legality of Data Scraping Using AI Revisited in Canada
Overview
Canadian courts are once again due to rule on the use of artificial intelligence technology (“AI”) in scraping data from third-party websites. Data scraping is the process of automatically extracting data from websites or other online sources using software. The courts in Canada have determined that scraping data from a website is not permissible. However, with the advent of AI and its increased accessibility, companies in Canada continue to scrape data, contrary to law and best practices.
Websites can be protected by copyright law. For example, the written content, graphics and design elements of a website are generally protected by copyright. Previously, in 2019, the Federal Court of Canada upheld the copyright in content on websites and condemned the actions of web/data scrapers in The Toronto Real Estate Board v. Monghouse.com et al.
Additionally, in 2022, privacy regulators across Canada found that Clearview AI, a U.S.-based company, violated Canadian federal and provincial privacy laws by scraping upwards of three billion images of individuals worldwide and reselling that data to law enforcement officials in the U.S. It is clear in Canada that copying, scraping, downloading and distributing third-party content without express permission is illegal. Any organization engaging in data scraping in Canada, even where the information collected does not include personal information, runs the risk of violating copyright law and facing claims for breach of contract.
On November 4, 2024, the Canadian Legal Information Institute (“CanLII”) filed a Notice of Claim with the Supreme Court of British Columbia against 134 BC Ltd et al, including Caseway AI Legal Ltd. (“Caseway”, collectively the “Defendants”). In the Notice of Claim, CanLII alleges, among other things, that the Defendants violated the CanLII website’s terms of use, which expressly prohibit the bulk downloading and web scraping of the CanLII website without permission or a license. CanLII seeks, among other things, an injunction prohibiting the use of any material scraped from the CanLII website.
CanLII is a not-for-profit organization that owns and operates a legal database that includes more than three-and-a-half million works, including court decisions, legislation and other secondary sources. CanLII provides the public with free access to its database, subject to the terms of use of its website. The terms of use of the CanLII website are intended to “balance the public’s interest in free and open access to Canadian legal materials with interests of the participants in judicial proceedings that could be at risk by the bulk access and use of those materials by commercial and other parties.”
The terms of use of the CanLII website specifically outline prohibitions on use of the content from the CanLII website and restrict users from engaging in “the bulk or systematic downloading of CanLII Works, including by way of programmatic means or by way of hiring human resources to manually download the CanLII Works and the incorporation of the CanLII Works into another website, whereby the user masks the original source of the CanLII Works by way of framing, re-use of search processes or any other means, so as to create confusion or misrepresentation of the fact that CanLII Works originated from CanLII and the CanLII Website.”
The Defendants are accused of misappropriating data from the CanLII website. Caseway operates an AI platform trained on Canadian court decisions that aims to improve access to justice by providing enhanced legal research tools. The platform is available to users who pay a $49.99 CAD monthly subscription fee. Although the Caseway platform does not disclose the source of its catalogue of legal documents, the Defendants have suggested in public articles and promotional materials that the database is based on data scraped from the CanLII website. CanLII’s claims include breach of contract, copyright infringement, conversion of property and unjust enrichment, and CanLII also seeks punitive damages for the Defendants’ alleged flagrant misconduct. Despite CanLII’s demand that the Defendants cease and desist from using and distributing data illegally obtained from the CanLII website (“CanLII Works”), the Defendants continue to operate their infringing platform and publish and distribute CanLII Works without authorization.
Data scraping is not only the subject of ongoing legal proceedings but is also on the radar of privacy law officials in Canada. The Office of the Privacy Commissioner of Canada finalized its joint statement on data scraping and the protection of privacy in October 2024, which was endorsed by several members of the Global Privacy Assembly’s International Enforcement Cooperation Working Group. The statement noted that unlawful data scraping has gained increased attention, in part due to the rapid onset and deployment of generative AI systems. The joint statement highlights that all companies are expected to protect the publicly accessible personal information they host against unlawful scraping, and that any failure to do so could result in regulatory intervention, including enforcement action.
In light of the above, we await the decision of the Supreme Court of British Columbia on whether or not web scraping can be the basis of good business practices in 2024.
For more information about this or other privacy-related questions, please contact a member of the firm’s Privacy & Data Management Group.