By the Bertelsmann Foundation

The study “The Role of Indian Data for European AI” investigates the potential benefits of closer collaboration on data exchange between India and Europe. It explores the Indian AI landscape, Germany’s need for access to larger data pools, and the requirements for making cooperation possible. While no quick wins are to be expected from closer cooperation, the long-term prospects are promising, provided that several important obstacles are cleared first. India’s new regulation on data privacy and security, due to be passed soon, appears to be a make-or-break issue.

After a period of sluggish progress in the Euro-Indian strategic partnership, the EU-India summit held in July 2020 reaffirmed that both sides are keen to deepen their cooperation in security, the environment, innovation and public health. The two large democratic blocs, together accounting for roughly a quarter of the world’s population and GDP, see huge potential in closer cooperation. One area that seems particularly promising is innovation and artificial intelligence (AI). The leaders of India and Germany also gave AI cooperation a prominent place on their agenda, as reflected in the joint statement issued after their last intergovernmental consultations in November 2019.

This push also builds on the assumption that data from India might be valuable for promoting AI development in Europe. As India is a vast and diverse country, it produces a lot of data that might complement data available in Europe, and vice versa. But is this assumption valid?

Acknowledging the strategic importance of close cooperation between the EU and India, this study, prepared for the Bertelsmann Stiftung by CPC Analytics, an Indo-German AI consultancy, investigates whether cooperation on data between India and Germany/the EU can boost the German and European AI ecosystem. It also asks whether a realistic case can be made for AI cooperation, and especially for cross-border data exchange between India and Germany/the EU.

The study focuses on the situation in India in order to assess the potential for data collaboration in light of the country’s actor and regulatory landscape. As a first step, it sketches the AI landscape in India. This picture is then complemented by interviews with relevant actors in the field, which shed light on practitioners’ perspectives on using cross-border data to build AI. Because a regulatory environment that allows data exchange is a necessary condition for such exchange to happen on a large scale, the study also describes the current regulatory situation and plans.

Indeed, the analyses of India’s digitalization efforts and of the private sector landscape point towards a rapidly evolving environment for AI. India’s government has launched a series of initiatives to support the country’s digital development. Useful data is increasingly being collected and the country certainly has the talent pool and industry to make use of this data. These factors make cooperation on data for AI between Germany and India promising. Yet major obstacles are present, including regulatory and data availability issues. These obstacles are discussed in the study, which also takes a detailed look at two sectors, health and e-commerce, finding that collaboration possibilities vary greatly between them.

The potential for cooperation is considerable but remains theoretical for now with regard to large-scale data exchanges (i.e. across a wide variety of sectors and companies). Rapid implementation, let alone quick economic benefits, does not seem realistic at the moment. This situation is unlikely to change before there is clarity on the effects of India’s Personal Data Protection (PDP) bill, which is not expected until 2022.

The US and China can be regarded as the world’s most advanced countries in the field of AI, both technologically and in terms of data access, while the rest of the world is following at a distance. Europe needs to find its place in this environment and make sure it is not left behind. Thus, on February 19, 2020, the European Commission published its European Strategy for Data, which lists as an important goal increasing access to larger data sets relevant to AI development. The long-term potential for cooperation with India in this regard is considerable for various reasons. Besides obvious factors like the country’s large population, market size and a thriving IT services sector, three aspects of its digital economy stand out: the establishment of a digital baseline infrastructure (“India Stack”), the political push towards digitization (national AI strategy, work on PDP) and the digital start-up landscape, which has created more than 20 unicorns in recent years. All of these aspects result in a growing pool of data that could be used for AI.

While the study estimates the current size of the Indo-German AI-related economy to be roughly between €500 million and €1.5 billion (with total bilateral trade at about €20 billion), many obstacles need to be overcome before meaningful cooperation can be realized. Moreover, there seems to be “no low-hanging fruit” in Indian data that could be picked easily. At the moment, most data in India seems to be either privately held and inaccessible, context specific and not transferable, or of lower quality and thus not usable. India is making progress in this regard, however, and cooperation is likely to be promising for both countries in the longer run as more data of higher quality becomes available. A lot will depend on the realization and implementation of data protection laws in India and on whether a data-adequacy decision by the European Commission can be attained. Adding to these difficulties, the Hindu-nationalist agenda of the current Indian government has called the country’s traditional role as a “natural partner” for Germany/the EU into question. Various ethical considerations will thus need to be resolved, too, before real progress can be made. The argument so far has been made from a European perspective. It should not be forgotten that, while there is interest in India in cooperating more with Germany and the EU, the latter are not the first priority for India’s very entrepreneurial and dynamic AI companies.

Partly because of the huge potential and partly because both countries will need strong partners in the future since they cannot become relevant counterweights to China or the US on their own, closer cooperation seems appropriate. The current geopolitical situation represents a window of opportunity in this regard. There is obvious potential for cooperation between India and Germany/the EU in the area of AI, but rather surprisingly the study could not identify major examples of cross-border data exchange for AI development. Even within companies, such examples seemed to be very limited.

Against this background, a major political effort to foster cooperation on data for AI might not be the most promising candidate for inclusion on the bilateral agenda for now. It would tie up a lot of policy makers’ time and resources over a long period, for a mission that is theoretically appealing but not yet fully proven in practical terms. Championing an Indian PDP bill that is likely to gain adequacy status in the EU therefore seems to be the single most important high-level topic for policy makers to focus on for now. There is an open window of opportunity, as the Indian data protection regulation has not yet been finalized and European initiatives might fall on fertile ground. If this succeeds, it will be worthwhile to step up efforts toward closer cooperation.