CADA Data Training Ban: Can Cloud Providers Train AI on Customer Data?


Summary

As proposed, the Cloud and AI Development Act (CADA) imposes a strict prohibition on cloud computing service providers using customer data to train or fine-tune AI systems operated by third countries. For services seeking Union Assurance Levels 2, 3, or 4, the proposal mandates that data generated by using the service must never be transferred outside the Union and cannot be used for any AI training by non-EU entities. This rule applies to all data generated by the service, including metadata and telemetry, and is a cumulative requirement for recognition as a sovereign cloud service under Annex II.

Detail

The proposed Cloud and AI Development Act (CADA) establishes a comprehensive sovereignty framework designed to mitigate strategic dependencies and protect the Union's public order. A critical component of this framework is the prevention of "data leakage" into foreign AI ecosystems. The proposal explicitly addresses the risk that cloud providers might leverage customer data to improve AI models controlled by entities outside the EU's legal jurisdiction.

The Core Prohibition: No Training Third-Country AI

Under the proposed text, cloud computing service providers aiming for higher levels of Union assurance must adhere to strict data usage restrictions. These restrictions are codified in Annex II, which sets out the cumulative criteria for recognition.

For Union Assurance Level 2, Annex II 2.1(f) states:

"the data generated by using the audited service are not used to train or fine-tune any AI system operated by a third country or a legal entity established in a third-country, and are not transferred outside the Union in any case;"

The same prohibition is repeated verbatim for Union Assurance Level 3 in Annex II 3.1(f) and for Union Assurance Level 4 in Annex II 4.1(f).

This means that a cloud provider seeking to serve public sector bodies that require these higher assurance levels may not use the data its customers generate, including logs, telemetry, and usage patterns, to improve AI models operated by a third country or by a legal entity established in a third country. Furthermore, the data itself is strictly confined to the Union: under the criterion, it is "not transferred outside the Union in any case."

Scope of "Data Generated"

The prohibition is broad and covers "data generated by using the audited service." Annex III (Audit Evidence), which guides the assessment of these criteria, clarifies the scope of this data. It includes:

Customer-derived data: Any data produced through the customer's use of the service.
Logs and telemetry: Records of who used the service, when, which functions were accessed, and configuration data.
Interaction data: Any data resulting from the interaction with the audited service by the cloud service customer.

This definition ensures that even passive data collection, such as system performance metrics, access logs, or metadata, cannot be siphoned off to train foreign AI models. The intent is to prevent the "free" extraction of value from EU public sector data by third-country AI developers.

Union Assurance Level 1: The Baseline

Union Assurance Level 1 serves as the baseline for all public sector procurement under Article 30(2). Annex II 1.1(c) requires that customer data remain exclusively within the Union unless the public sector body explicitly requires otherwise, but Level 1 does not contain the training prohibition found in the higher tiers.

However, Level 1 requires the provider to demonstrate compliance with state-of-the-art cybersecurity standards (Annex II 1.1(e)) and to ensure that subcontracting does not compromise operational autonomy (Annex II 1.1(d)). While Level 1 provides a baseline of data residency, the specific, explicit ban on using data to train third-country AI models is a distinguishing feature of the higher assurance levels (2, 3, and 4). Consequently, for public sector bodies seeking the strongest guarantee against AI training by foreign entities, the risk assessment under Article 29 should mandate Level 2 or higher.
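The difference between the tiers can be sketched as a small lookup table. This is an illustrative simplification of the rules as summarised in this article, not a rendering of the legal text; the names LevelRules, LEVEL_RULES, levels_with_training_ban, and satisfies_training_ban are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LevelRules:
    # True if the level carries the explicit point (f) ban on using
    # service-generated data to train or fine-tune third-country AI.
    training_ban: bool
    # True if the public sector body may explicitly require that data
    # leave the Union (the carve-out described for Annex II 1.1(c)).
    transfer_allowed_on_request: bool


# Assumption: a simplified reading of Annex II as described in this article.
LEVEL_RULES: Dict[int, LevelRules] = {
    1: LevelRules(training_ban=False, transfer_allowed_on_request=True),
    2: LevelRules(training_ban=True, transfer_allowed_on_request=False),
    3: LevelRules(training_ban=True, transfer_allowed_on_request=False),
    4: LevelRules(training_ban=True, transfer_allowed_on_request=False),
}


def levels_with_training_ban() -> List[int]:
    """Return the assurance levels that include the explicit training ban."""
    return sorted(lvl for lvl, rules in LEVEL_RULES.items() if rules.training_ban)


def satisfies_training_ban(level: int) -> bool:
    """Check whether a given assurance level includes the training ban."""
    return LEVEL_RULES[level].training_ban
```

A procurement team could use a table like this to check quickly whether the level named in a tender carries the guarantee it is relying on, for example by calling satisfies_training_ban(1) before accepting a Level 1 service for a sensitive workload.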

Audit and Verification

Compliance with these criteria is not a matter of self-declaration for Levels 2, 3, and 4. Providers must undergo independent third-party audits as outlined in Article 20. Annex III specifies the evidence auditing organizations must request to verify compliance with the training ban:

Contractual clauses: Explicit statements that data will not be used to train or fine-tune any AI model operated by a third country.
Data flow diagrams: Documentation of end-to-end data flows, showing where AI pipelines or machine learning operations (MLOps) connect with customer data.
Model cards: Statements confirming that data generated by using the service does not leave the Union.
MLOps records: Evidence that build, test, and release locations for AI models are within the EU.
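The four evidence types above lend themselves to a simple completeness check. The following is a minimal sketch, assuming an auditor or procurement team tracks evidence under the hypothetical keys shown; the key names and the missing_evidence helper are illustrative and do not come from the proposal.

```python
from typing import Dict, Iterable, List

# Assumption: short keys standing in for the four evidence types
# this article attributes to Annex III.
REQUIRED_EVIDENCE: Dict[str, str] = {
    "contractual_clauses": "No training or fine-tuning of third-country AI models",
    "data_flow_diagrams": "End-to-end flows showing where AI pipelines touch customer data",
    "model_cards": "Statements that service-generated data does not leave the Union",
    "mlops_records": "Build, test, and release locations for AI models are within the EU",
}


def missing_evidence(provided: Iterable[str]) -> List[str]:
    """Return the required evidence types that have not been supplied."""
    return sorted(set(REQUIRED_EVIDENCE) - set(provided))


# Example: a provider has supplied only two of the four evidence types.
gaps = missing_evidence(["contractual_clauses", "model_cards"])
for key in gaps:
    print(f"Missing: {key} ({REQUIRED_EVIDENCE[key]})")
```

A check like this only confirms that each document exists; whether its content actually demonstrates compliance remains a matter for the independent auditor.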

What this means for you

For in-house counsel, compliance officers, and public procurement teams, these provisions represent a significant shift in cloud contract management and vendor selection.

1. Procurement and Contracting Strategy

Public sector bodies must align their procurement strategies with the risk assessments mandated by Article 29. If your organization's activities contribute to the preservation of public order (e.g., healthcare, energy, justice, law enforcement), Article 30(3) requires you to procure only services recognized at Union Assurance Level 2, 3, or 4.

Your contracts must explicitly reflect the prohibitions in Annex II 2.1(f), 3.1(f), and 4.1(f). Ensure your service level agreements (SLAs) and data processing agreements (DPAs) include clear, enforceable clauses prohibiting the use of your data for any AI training purposes by the provider or its third-country affiliates. Relying on standard terms that allow "service improvement" data usage is no longer sufficient for sovereign cloud services.

2. Vendor Due Diligence

You must verify that your cloud provider has obtained the necessary Union Assurance recognition. Marketing claims of "EU data residency" are insufficient without the specific audit opinion regarding AI training. You should:

Request access to the central repository of recognized services maintained by the Commission (Article 22) to confirm the provider's status and assurance level.
During due diligence, ask for the audit report or summary demonstrating compliance with the specific data residency and AI training restrictions.
Verify that the provider has implemented the necessary technical and organizational measures to prevent data transfer outside the Union "in any case."

3. Data Mapping and Telemetry

Conduct an internal audit of the data you send to cloud providers, and identify all telemetry, logs, and metadata generated by your use of cloud services. Under the proposed rules, this mapping is not just a best practice; it is a regulatory requirement. Ensure that your provider's DPAs explicitly forbid the use of this data for model training. If your current provider uses your data to train AI models operated by a third country or a third-country entity, it cannot be recognized at Level 2, 3, or 4, and you may be in breach of Article 30 if you procure its services for public-order-relevant activities.
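A data-mapping exercise of this kind can start from a simple inventory. The sketch below assumes you record, for each data flow, whether its destination is inside the Union and whether it feeds third-country AI training; the DataFlow fields, the flag_flows helper, and the sample flows are all hypothetical and would need to be replaced with findings from your own audit.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataFlow:
    name: str
    category: str  # e.g. "logs", "telemetry", "metadata", "interaction"
    destination_in_union: bool
    used_for_third_country_ai_training: bool


def flag_flows(flows: List[DataFlow]) -> List[Tuple[str, List[str]]]:
    """Return (flow name, reasons) for flows that conflict with the
    Level 2-4 criteria as described in this article."""
    findings = []
    for flow in flows:
        reasons = []
        if not flow.destination_in_union:
            reasons.append("transferred outside the Union")
        if flow.used_for_third_country_ai_training:
            reasons.append("used to train or fine-tune third-country AI")
        if reasons:
            findings.append((flow.name, reasons))
    return findings


# Hypothetical inventory for illustration only.
inventory = [
    DataFlow("access-logs", "logs", True, False),
    DataFlow("usage-telemetry", "telemetry", False, False),
    DataFlow("prompt-history", "interaction", True, True),
]
findings = flag_flows(inventory)
for name, reasons in findings:
    print(f"{name}: {'; '.join(reasons)}")
```

The value of the exercise lies in the inventory itself: the two boolean fields can only be filled in reliably from the provider's contractual terms and data flow documentation, not from marketing material.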

4. Penalties and Liability

Article 24 outlines the penalty framework. Member States must lay down rules on penalties that are "effective, proportionate and dissuasive." While specific fine amounts are left to national implementation, the criteria for imposition include the nature, gravity, scale, and duration of the infringement, as well as the financial benefits gained. For public sector bodies, using a non-compliant provider could lead to procurement irregularities, potential liability for data protection breaches under the GDPR, and reputational damage regarding the protection of public order.

Common misconceptions

Misconception 1: This only applies to personal data. The prohibition on training third-country AI systems applies to "data generated by using the audited service." This includes non-personal data such as system logs, performance metrics, and telemetry. Even if the data is anonymized or aggregated, it cannot be used to train foreign AI models if the service is recognized at Level 2, 3, or 4.

Misconception 2: I can allow data to leave the EU if I give explicit consent. For Levels 2, 3, and 4, the requirement is absolute: data "are not transferred outside the Union in any case." While Annex II 1.1(c) (Level 1) allows data to leave the Union if the public sector body explicitly requires otherwise, the higher assurance levels do not provide this exception for the data generated by the service itself. The integrity of the sovereignty framework relies on strict data residency for these critical tiers.

Misconception 3: This only affects large non-EU hyperscalers. The rules apply to any cloud computing service provider seeking recognition under the Union Assurance Framework, including European providers and joint ventures. If a European provider uses a third-country AI model for its own services and trains it on customer data, it will fail to meet the criteria for Levels 2, 3, and 4. The focus is on the operation of the AI system and the location of the data, not just the provider's nationality.

Misconception 4: The AI Act already covers this. The AI Act (Regulation (EU) 2024/1689) focuses on the safety, fundamental rights, and transparency of AI systems themselves. It does not contain specific provisions regarding the sovereignty of cloud infrastructure or the prohibition of using customer data to train foreign AI models in the context of cloud service procurement. CADA fills this specific gap by establishing a sovereignty framework for the underlying cloud and AI infrastructure.

This is general information about a draft EU regulation, not legal advice.
