How do national strategies address data accessibility for AI?

Summary

Under the proposed Cloud and AI Development Act (CADA), Member States are legally required to adopt national cloud and AI strategies. Article 7(2)(h) mandates that these strategies include "measures to ensure the accessibility of high-quality data for AI development, notably by preventing data bottlenecks encountered by organisations." This requirement is not standalone; it complements the Act's broader operational objectives, specifically Article 4(7)(c), which aims to "promote the sharing and reusing of training data and AI models across the Union's public services." Together, these provisions create a dual framework: one requiring national governments to actively remove barriers to data access, and another driving the public sector to share its data and models for AI training, with the aim of strengthening the Union's technological sovereignty and competitiveness.

Detail

The proposed Cloud and AI Development Act (CADA), COM(2026) 502 final, establishes a comprehensive framework to strengthen Europe's cloud and AI ecosystem. A central pillar of this framework is the obligation for Member States to develop and implement national cloud and AI strategies. These strategies are not merely aspirational documents; they are binding instruments intended to align national policies with Union-level goals for technological sovereignty, resilience, and innovation.

The Mandatory Obligation: Article 7(2)(h)

Article 7 of the CADA proposal sets out the requirements for these national strategies. Member States must establish these strategies within one year of the Regulation's entry into force. While the strategies must cover a wide range of objectives—from accelerating AI adoption in strategic sectors to deploying data centre capacity—Article 7(2)(h) specifically targets the data dimension of AI development.

The text of Article 7(2)(h) explicitly requires national strategies to include:

"measures to ensure the accessibility of high-quality data for AI development, notably by preventing data bottlenecks encountered by organisations."

This provision acknowledges a critical reality: data is the foundational fuel for artificial intelligence. Even the most sophisticated algorithms cannot function effectively without access to large volumes of high-quality, relevant data. The proposal identifies "data bottlenecks" as a significant barrier to innovation. These bottlenecks can disproportionately affect small and medium-sized enterprises (SMEs) and start-ups, which often lack the resources to negotiate complex data-sharing agreements, build proprietary datasets from scratch, or navigate fragmented regulatory landscapes.

By mandating that national strategies address these bottlenecks, the EU aims to create a more level playing field. The goal is to treat data as a strategic asset that can be leveraged for broader economic benefit, ensuring that the Union's AI ecosystem is not held back by a lack of accessible training data.

Complementing Public Sector Data Sharing

The requirement in Article 7(2)(h) does not exist in a vacuum. It is intrinsically linked to the broader operational objectives of the Cloud and AI Leadership Initiatives outlined in Article 4. Specifically, Article 4(7) focuses on increasing the development and adoption of AI models and systems across the Union's public sectors.

Within this operational objective, Article 4(7)(c) states that the Initiatives shall:

"promote the sharing and reusing of training data and AI models across the Union's public services;"

Furthermore, Article 4(7)(d) adds a specific focus on facilitating "secure, privacy-enhancing health data reuse for AI models and tools in healthcare."

Read together, Article 7(2)(h) and Article 4(7)(c) form a two-pronged approach to data accessibility:

Supply-Side Activation (Article 4): The Act drives the public sector to actively share its vast repositories of data. By promoting the reuse of training data and models across public services, the EU seeks to unlock the value of data that is currently siloed within national administrations.
Demand-Side Enablement (Article 7): The Act ensures that national strategies actively remove the barriers that prevent organisations from accessing this data. By requiring measures to prevent "data bottlenecks," Member States must address the legal, technical, and economic obstacles that might otherwise hinder the uptake of this newly available public data.

This combination ensures that data is not only made available by the public sector but is also practically accessible to the organisations that need it for AI development.

Understanding "Data Bottlenecks" in the CADA Context

The term "data bottlenecks" in the context of CADA refers to a variety of obstacles that restrict the flow of data necessary for AI development. While the proposal does not provide an exhaustive definition, the context suggests these include:

Legal and Regulatory Barriers: Unclear rules on data sharing, intellectual property rights, or privacy compliance that deter organisations from sharing or using data.
Technical Barriers: A lack of interoperability between different data systems, the absence of standardized formats, or the difficulty of integrating data from disparate sources.
Economic Barriers: High costs associated with data acquisition, cleaning, preparation, or the fees charged for access to high-value datasets.
Fragmentation: The existence of data silos within and between Member States, preventing the creation of a unified European data space for AI.

By requiring national strategies to address these bottlenecks, CADA empowers Member States to tailor solutions to their specific national contexts. For example, a Member State might establish a national data space for healthcare, implementing privacy-enhancing technologies (PETs) like federated learning to allow AI training without moving sensitive patient data. Another Member State might focus on simplifying the legal framework for sharing anonymized public sector data or creating "data trusts" to facilitate secure sharing.
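
For readers unfamiliar with federated learning, the sketch below illustrates the general idea using federated averaging, one common variant. It is a toy example under stated assumptions, not something the CADA proposal prescribes: the Site class, the function names, the one-parameter model, and the sample numbers are all hypothetical. Each site fits the model on its own records and returns only a model weight and a record count, so the raw records never leave the site.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical data holder (e.g. a hospital); its records stay local."""
    xs: list[float]
    ys: list[float]

    def local_update(self, w: float, lr: float = 0.01, steps: int = 5) -> tuple[float, int]:
        """Run a few gradient steps on y ~ w * x using only local records."""
        n = len(self.xs)
        for _ in range(steps):
            # Gradient of the mean squared error with respect to w.
            grad = sum(2 * (w * x - y) * x for x, y in zip(self.xs, self.ys)) / n
            w -= lr * grad
        # Only the updated weight and the record count leave the site.
        return w, n


def federated_average(updates: list[tuple[float, int]]) -> float:
    """Combine local weights, weighted by each site's record count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def train(sites: list[Site], rounds: int = 50) -> float:
    """Alternate local training and central averaging for a number of rounds."""
    w = 0.0
    for _ in range(rounds):
        w = federated_average([site.local_update(w) for site in sites])
    return w


# Toy data (made up for illustration): both sites follow y = 2 * x.
sites = [
    Site(xs=[1.0, 2.0, 3.0], ys=[2.0, 4.0, 6.0]),
    Site(xs=[4.0, 5.0], ys=[8.0, 10.0]),
]
w = train(sites)
print(f"learned weight: {w:.4f}")
```

In a real deployment the exchanged model updates can still leak information about the underlying records, which is why federated learning is often combined with further safeguards such as secure aggregation or differential privacy. This sketch omits those.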

Alignment with the "AI First" Principle

These data accessibility measures are part of a broader "AI first" principle embedded in the national strategies. Article 7(2)(a) requires strategies to include key objectives and priorities for cloud and AI adoption "in line with the 'AI first' principle."

The "AI first" principle, as defined in the explanatory memorandum and referenced in Article 7(2)(a), urges organisations to reflect on their business processes, considering the needs and opportunities offered by AI. Access to high-quality data is a prerequisite for applying this principle effectively. Without accessible data, organisations cannot fully assess the potential of AI to transform their operations or develop the innovative solutions required to meet the Union's strategic goals.

Coordination via the European Artificial Intelligence Board

To ensure consistency and coordination across the Union, Article 7(6) establishes a role for the European Artificial Intelligence Board (AI Board), which was established under the AI Act (Regulation (EU) 2024/1689).

The Board shall:

"advise and assist the Member States as regards the coordination of national strategies. The AI Board shall facilitate exchange of best practices among Member States."

This mechanism is crucial for preventing a fragmented approach where one Member State's strategy to prevent data bottlenecks is incompatible with another's. By facilitating the exchange of best practices, the AI Board will help Member States learn from each other's approaches to overcoming data bottlenecks and promoting public sector data sharing, thereby fostering a more cohesive European AI ecosystem.

What this means for you

For CTOs, data architects, AI developers, and SMEs, the CADA proposal signals a significant shift in the EU's approach to data governance and AI development. Here is how these provisions impact your operations and strategy:

New Opportunities for SMEs: If your SME relies on data for AI development, the national strategies driven by Article 7(2)(h) should lead to improved access to high-quality datasets, particularly from the public sector. This can significantly reduce your data acquisition costs and accelerate your AI projects. Keep a close watch on your national government's strategy development to identify new data-sharing initiatives, national data spaces, or specific programs designed to remove bottlenecks.
Architectural Considerations: As public sector data becomes more accessible, you may need to adapt your AI architectures to integrate with new data sources. This could involve implementing privacy-enhancing technologies (PETs) to ensure compliance with data protection regulations while leveraging shared data. Be prepared to work with data formats and interoperability standards that align with national and EU-wide data spaces.
Advocacy and Engagement: Engage with your national authorities and industry associations during the drafting of national strategies. Ensure that the measures taken to prevent data bottlenecks are practical and effective. Provide feedback on the specific barriers you face in accessing data. Your input can help shape national strategies that are more responsive to industry needs and more likely to succeed in unlocking data value.
Preparation for Compliance: While CADA focuses on accessibility, it does not replace existing data protection laws like the GDPR. Ensure that your data handling practices remain compliant with all applicable regulations. The increased availability of data should not come at the cost of privacy or security. The "secure, privacy-enhancing" nature of the data sharing mentioned in Article 4(7)(d) underscores the importance of maintaining robust compliance frameworks.

Common misconceptions

Misconception 1: CADA mandates that all data must be open.

Reality: CADA does not require all data to be open or public. It focuses on ensuring accessibility and preventing bottlenecks. This can be achieved through various means, including controlled access, anonymization, privacy-enhancing technologies, and secure data spaces. Data protection and security remain paramount, as highlighted by the specific reference to "secure, privacy-enhancing" reuse in Article 4(7)(d).

Misconception 2: National strategies are optional or voluntary.

Reality: Article 7 makes it a binding obligation for Member States to adopt national cloud and AI strategies. While the specific measures to prevent data bottlenecks can vary based on national contexts, the requirement to address data accessibility is mandatory. Member States must notify the Commission of their strategies and update them regularly.

Misconception 3: Data accessibility is only about public sector data.

Reality: While the proposal highlights public sector data sharing (Article 4(7)(c)), the obligation to prevent data bottlenecks (Article 7(2)(h)) applies more broadly. It refers to bottlenecks "encountered by organisations" in general, so national measures can improve data accessibility for AI development across all sectors, including through public-private partnerships and industry-led initiatives. The goal is to ensure that all organisations, not just public bodies, can access the data they need.

Misconception 4: The AI Board will dictate national strategies.

Reality: The AI Board's role, as defined in Article 7(6), is to "advise and assist" and "facilitate exchange of best practices." It does not have the power to dictate the content of national strategies. The primary responsibility for designing and implementing these strategies remains with the Member States.

This is general information about a draft EU regulation, not legal advice.