When developing AI systems, practitioners often focus on model building and can underestimate the importance of analysing how training data is collected. The mechanisms used for data collection are diverse, and that diversity deserves closer attention, since each mechanism has different implications for AI developers, data subjects, and other rights holders whose data has been collected. This policy paper maps the principal mechanisms currently used to source data for training AI systems and proposes a taxonomy to support policy discussions around privacy, data governance, and responsible AI development.
Mapping relevant data collection mechanisms for AI training
Policy paper