In the past few years following the Indian Supreme Court’s 2017 decision upholding the right to privacy, the Indian Government has pursued the creation of a holistic data regulation framework in India extending beyond the mere issue of privacy. However, before the economic potential of data can be harnessed to drive innovation in India, regulators must contend with the fundamental questions regarding its ownership under the country’s intellectual property regime. The following discussion explores these questions in the context of the proposed non-personal data regulation framework mooted by the Indian Government.
Background to Indian Data Regulation and Ownership
The regulatory framework for data governance in India is undergoing rapid shifts to keep pace with the ever-expanding commercial applications for data. There is a pressing need to develop a mature legal framework to regulate and nurture the technology sector. As a part of the drive towards digitization, the Government of India, for its part, endeavoured to provide robust data protection through the now-withdrawn Personal Data Protection Bill, initially introduced in 2019. Extended debates surrounding this legislation also gave rise to ideas of broad data governance in India, and to the possibility of unlocking the economic potential of aggregated data that was not personal yet was derived from Indian communities. To this end, the Government constituted a Committee of Experts (CoE) in September 2019 to submit findings on regulating what it termed ‘non-personal data’ (NPD). The CoE published its first draft report on the Non-Personal Data Governance Framework in July 2020, and a revised draft report based on public feedback in December 2020 (Revised Report). More recently, in May 2022, the Government also unveiled a draft National Data Governance Framework Policy (NDGFP) to enhance accessibility and innovation based primarily on government-held data. However, eventual NPD regulation through new laws, including for private data, is not precluded by the NDGFP.
Expected regulatory trends for NPD can be discerned from the Revised Report. While the Revised Report envisions an entire complex data sharing framework for NPD under a new authority, its core proposals revolve around the creation of ‘high-value datasets’ and the provision of access to these datasets for ‘public good’ purposes as defined in the report. This includes sharing ‘high-value NPD datasets’ for purposes such as policy making and public service delivery. This was proposed so that the benefits of NPD reach the communities from which the datasets were derived. The Revised Report also contained proposals for the sharing of underlying data and meta-data by data businesses under appropriate regulations, where ‘data businesses’ refers broadly to entities collecting and managing both personal data and NPD.
A fundamental assumption underpinning the proposals of the CoE was the lack of significant conflict between the new NPD framework and existing laws, specifically those relating to intellectual property. This notion was heavily criticized in industry submissions to the CoE on its draft reports. These submissions from members of the information technology industry argued that NPD datasets collected and managed by corporations and other entities were protected by the intellectual property regime in India, and particularly under the Copyright Act, 1957. However, the CoE contested these submissions in the Revised Report, relying on its own evaluation that Indian copyright law did not protect the underlying raw NPD from being accessed under the data sharing framework for NPD. A considered analysis of these contested claims reveals a range of complications regarding the precise nature and extent of intellectual property rights in datasets.
Position of Datasets under the Indian Intellectual Property Regime
At the outset, it should be noted that unlike jurisdictions such as the European Union, India does not have a sui generis legal framework for the protection of non-original databases. The term ‘non-original’ is of utmost importance in this context, as original computer databases are indeed protected under Indian copyright law. Original computer databases are included within the definition of literary works under the Copyright Act. To understand the implications of this law for conflict with the CoE’s proposed framework, it is necessary to analyse the meaning of ‘originality’ under copyright law.
The subsisting Indian legal precedent for the interpretation of this term in the context of a database arguably stems from the Supreme Court judgment in the case of Eastern Book Company v. D.B. Modak, which arose on account of a copyright infringement dispute involving a publisher of court judgments. This publisher published edited versions (or modified copies) of court judgments which included paragraph segregation and numbering, identification of concurring or dissenting opinions, and the creation of ‘headnotes’ for cases. The Supreme Court held in this case that a database would have to exhibit just a minimal degree of creativity to qualify for copyright protection. The pure investment of labour, skill and capital into a work would not result in the creation of an original work entitled to copyright protection under Indian law.
Implications for NPD Protection
In the context of a factual database derived from the collection and aggregation of NPD, the creativity necessary to establish originality would consequently be required in the selection, compilation and organisation of the database. However, the Supreme Court’s interpretation of minimal creativity creates more doubt than clarity in practical terms. In Eastern Book Company v. D.B. Modak, the numbering of paragraphs and their segregation, alongside indicating the ‘majority opinion’ or ‘dissent’, was found to involve great skill and judgment akin to creativity. The court found these aspects of the editing process to meet the minimum threshold of creativity for providing copyright protection to the publisher’s edited judgments. This conception of likening ‘skill and judgment’ to creative output carries awkward ramifications for database protection through copyright law. In private entities managing NPD databases, database administrators generally use considerable technical skill in the organisation of information to suit the requirements of the business and draw out key insights. If the low threshold of creativity determined by the Supreme Court is applied analogously to these activities, we arrive at the possibility that a majority of databases would be subject to copyright protection, increasing the potential points of conflict between copyright and the proposed NPD sharing framework of the CoE.
Additionally, the CoE argued in the Revised Report that its framework for data access involves raw NPD, which would not be covered by copyright protection in any case. However, this position may also be contentious, and the meaning of ‘raw data’ is itself a notion worthy of greater scrutiny. While individual data points comprising a dataset would undoubtedly be categorised as raw data, the Revised Report envisions accessing high-value datasets where even the relevant fields for collected data are to be shared by data businesses. These fields are expected to be pre-determined subsets of the original database. It is difficult to conclude that such subsets would also qualify as raw data, since they would be reliant on the data organisation carried out by the creator of the database. Further, if the specific data accessed from a high-value database through an NPD data sharing framework is adequate to re-compose at least a component of the original database, categorisation of this shared NPD as ‘raw data’ also cannot be automatically assumed.
Concluding Remarks and Recommendations
The analysis above demonstrates that the relationship between Indian copyright law and the Revised Report’s proposals for NPD sharing is not necessarily harmonious and is, in fact, contested. The complexities of protecting original computer databases composed of aggregated NPD through the Copyright Act demand a practical resolution. This would not only prepare the ground for a tenable NPD sharing legal framework, but also help data holding entities identify the true extent of intellectual property protection for their ever-expanding databases. One manner in which the Government of India may proceed towards such resolution would be a direct amendment to the Copyright Act, 1957 to create a specific and precise definition for ‘original computer databases’ entitled to copyright protection. It would be of considerable utility to enact such an amendment within the current year. While reports indicate that NPD may be excluded from the ambit of the upcoming Indian data protection law, expected to be introduced by the Government in the upcoming sessions of parliament in the year 2022, current trends do not preclude regulation of privately held NPD in the near future. A strong indication of this can be discerned from the report of the parliamentary Joint Committee on the Personal Data Protection Bill, 2019, which recommended that a joint regulator for both personal data and NPD be established under the law.
It should also be noted that Indian law protecting confidential information through trade secrets provides a separate conflict point for the sharing of privately held NPD. The merits of such trade secret protection for databases have also been considered by the CoE in the Revised Report, and these would form the basis for an independent legal discussion on trade secret protection for data in India.
See Section 2(o), Copyright Act, 1957.
*The author is a legal, technology and public policy professional consultant based out of Delhi, India. The views expressed in this piece reflect the personal opinions of the author.