G. Barrett
Cirrus AI, South Africa
Artificial Intelligence in Science

Artificial intelligence for science in Africa
Introduction
Academic and other research institutions in Africa carry out much fundamental and applied scientific research, but few as yet use artificial intelligence (AI). African science needs to take up AI methods. In the absence of such methods, an increasing number of scientific disciplines at African institutions will border on irrelevance. A greater use of AI in scientific research in Africa will bring numerous benefits, deepening African science, broadening global research agendas and incentivising the location of corporate research and development (R&D) labs. Ultimately, the use of AI in science will have spillover effects, helping to upgrade the capabilities of civil society more broadly.
Cirrus and the AI Africa Consortium are a major response to the AI deficit in African science. They aim to broaden researcher access to computing, data, engineering resources and trained students. Ultimately, this will make AI for science feasible in numerous academic institutions across Africa – not just elite academic institutions and large technology firms. In so doing, they will help commercialise research findings. With human capital central to AI, online learning can play an important role in knowledge transfer to Africa.
Prioritising AI for science in Africa
AI-enabled scientific research is not yet happening in Africa. Most of the leading corporate research operations are active in Asia, Europe and North America but not in Africa. This is an important barrier to collaborative research and commercialisation efforts at African institutions.
Data from the QS World University Rankings, since 2012, show that Fortune 500 companies collaborated six times more with the top 50 universities than with those ranked between places 301 and 500, where most African universities are situated1 (Ahmed and Wahed, 2020). This imbalance in collaboration exacerbates disparities between academic institutions in Africa and top tier academic institutions in the rest of the world.
Furthermore, Fortune 500 technology companies and the top 50 universities publish five times as many papers annually per AI conference as universities ranked between 200 and 500. The research budgets of premier academic research institutions like Carnegie Mellon University’s Robotics Institute – at USD 90 million in 2019 (Spice, 2019) – are a fraction of those of the major industrial companies. However, they are still orders of magnitude greater than the budget of any academic institution in Africa.
While world-class research does take place at African institutions, African researchers lack the data, computing infrastructure and engineering resources to develop and apply the more powerful and critical AI methods. Even for the world’s elite academic institutions and researchers, it is increasingly difficult to work at the frontier of AI research (Sample, 2017). For example, OpenAI analysed the relationship between the availability of computational resources and 15 relatively well-known breakthroughs in AI between 2012 and 2018 (Amodei and Hernandez, 16 May 2019). Of the 15 developments examined, 11 were achieved by private companies, while only 4 came from academic institutions.
In terms of training and human capital development, universities are fortunate that the AI field offers many feasible options to rapidly upskill researchers. This is resulting in a paradigm shift for many in academia who are accustomed to building courseware.2 Forward-thinking universities have been steadily moving towards a “flipped classroom”. In this format, learners watch videos and complete in-depth assignments and online quizzes at home, then come to class for discussions. The classes generally culminate in an open-ended final project, supported by the teaching team. The university often uses previously developed high-quality Massive Open Online Courses as the core course material. It then focuses on supplementary domain-specific materials, projects and assignments. With this approach, students in developing countries can access courseware used at elite universities. The costs to both students and the university are well below those of the previous alternatives.
A range of new capabilities and leadership is required to deploy AI for science
New capabilities and leadership are needed if African research institutions are to harness new AI methods. Such capabilities require engineering personnel to prepare data and configure hardware, software and machine-learning algorithms, which are absent in most of Africa. In addition, the ad hoc mix of campus computers and commercial clouds that Africa’s educators and researchers rely on today is inadequate.3
Simply providing underserved academic and research organisations with the data, hardware, software and engineering resources is insufficient. To truly reduce barriers to AI-enhanced research, underserved institutions need access to experts who can implement best practices. Key areas include approaches to problems, learning methods, selection of tools for tasks and optimisation of workflows.
An example is the development of AI-ready datasets. In some fields of science, data are abundant. However, in many scientific areas, sufficiently large datasets either do not exist or are not accessible in forms that permit the use of AI methods. Substantial effort is required to create new datasets. This could include locating and cleaning the data, aligning the schemas of disparate data, ensuring machine readability and providing relevant metadata pertaining to issues such as data provenance, quality and completeness. Without shared tooling, this expensive and error-prone process must be repeated for each analysis. This becomes a barrier to using data, and also leads to problems of research reproducibility. Furthermore, privacy and security issues need to be addressed from the beginning rather than after the fact. The process must provide integrated assurances and audit capabilities to advance research in the public interest.
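The preparation steps named above (aligning disparate schemas, cleaning values and recording provenance and completeness metadata) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the rainfall fields, the two source layouts and the function names are hypothetical and do not describe any Cirrus tooling.

```python
# Hypothetical target schema: field name -> type used to coerce raw values.
TARGET_SCHEMA = {"station_id": str, "date": str, "rainfall_mm": float}

# Hypothetical column mappings for two differently structured sources
# (source column name -> target column name).
COLUMN_MAPS = {
    "source_a": {"id": "station_id", "day": "date", "rain": "rainfall_mm"},
    "source_b": {"StationID": "station_id", "Date": "date",
                 "Rainfall(mm)": "rainfall_mm"},
}


def align_record(raw, source):
    """Rename columns to the target schema and coerce types.

    Missing or unparseable values are stored as None so that gaps
    stay visible instead of being silently dropped.
    """
    mapping = COLUMN_MAPS[source]
    renamed = {mapping[key]: value for key, value in raw.items() if key in mapping}
    aligned = {}
    for name, caster in TARGET_SCHEMA.items():
        value = renamed.get(name)
        try:
            aligned[name] = caster(value) if value not in (None, "") else None
        except (TypeError, ValueError):
            aligned[name] = None
    return aligned


def build_dataset(sources):
    """Merge all sources into one record list plus a metadata summary."""
    records = []
    provenance = {}
    for source, rows in sources.items():
        aligned_rows = [align_record(row, source) for row in rows]
        records.extend(aligned_rows)
        provenance[source] = len(aligned_rows)  # rows contributed per source
    total = len(records)
    # Completeness: share of records with a usable value, per field.
    completeness = {
        name: (sum(r[name] is not None for r in records) / total if total else 0.0)
        for name in TARGET_SCHEMA
    }
    metadata = {
        "n_records": total,
        "provenance": provenance,
        "completeness": completeness,
    }
    return records, metadata


# Illustrative input only; the values are made up.
example_sources = {
    "source_a": [{"id": "ST1", "day": "2021-01-01", "rain": "3.5"}],
    "source_b": [{"StationID": "ST2", "Date": "2021-01-01", "Rainfall(mm)": "n/a"}],
}
example_records, example_metadata = build_dataset(example_sources)
```

Storing the metadata alongside the records is what lets a later user judge quality without redoing the cleaning work. A production pipeline would also need the privacy, security and audit controls mentioned above, which this sketch leaves out.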
Data engineering is often needed to develop specific software tools to construct the dataset for AI. Most of this tool development happens without considering possible public or inter-experiment collaborations. When collaborations are eventually sought, researchers may find their work has already been duplicated.
Providing a data management platform to enable efficient AI development and sharing is a priority for Cirrus. Such a platform will enable users to store, manage, share and find data with which to develop AI systems. This includes tracking data, versioning support for various data formats and complete metadata to allow for retraining and understanding models built from the data. Such a platform will drive advances in AI by enabling researchers to experiment with existing and new methods in new contexts. It will benefit the disciplines in which the datasets are created.
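One way to picture the tracking and versioning described above is content-addressed versioning, where the version identifier is derived from the data itself. The sketch below is only an illustration under that assumption: the `DatasetRegistry` class and its methods are hypothetical, it keeps everything in memory, and it is not a description of how the Cirrus platform is built.

```python
import hashlib
import json


def content_version(records):
    """Return a SHA-256 hex digest of the records as canonical JSON.

    Sorting keys and fixing separators makes the digest depend only on
    the content, not on dictionary ordering or whitespace.
    """
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DatasetRegistry:
    """Hypothetical in-memory registry of dataset versions.

    A real platform would need persistent storage, access control
    and support for formats other than JSON-serialisable records.
    """

    def __init__(self):
        self._versions = {}  # dataset name -> list of version entries

    def register(self, name, records, metadata):
        """Store a dataset version and return its identifier."""
        version = content_version(records)
        history = self._versions.setdefault(name, [])
        # Identical content yields the same identifier, so registering
        # the same data twice does not create a duplicate entry.
        if not any(entry["version"] == version for entry in history):
            history.append({
                "version": version,
                "metadata": dict(metadata),
                "records": list(records),
            })
        return version

    def get(self, name, version):
        """Look up one stored version, for example to retrain a model."""
        for entry in self._versions.get(name, []):
            if entry["version"] == version:
                return entry
        raise KeyError(f"{name}@{version} not found")

    def history(self, name):
        """Return version identifiers in registration order."""
        return [entry["version"] for entry in self._versions.get(name, [])]
```

Recording the version identifier next to a trained model would let a researcher retrieve exactly the data the model was built from, which is the retraining and reproducibility property the paragraph above describes.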
For African academic and research institutions, moving forward on AI also requires a significant increase in the scientific throughput that feeds AI systems. Governments, and academic and research institutions in the region need to generate more and better quality data, and to make data accessible. The use of findable, accessible, interoperable and reusable (FAIR) data principles, and participation in a centralised set of standards for benchmark datasets in scientific domains, are both needed. These will help govern data storage formats, access and metadata to reduce engineering overhead and lower the barriers to training and comparing model performance.4 A high priority must be to identify and use existing and potential scientific data-generating programmes to produce AI-ready data repositories. Opening up data in a privacy-preserving manner must extend across science, from Earth observation to health care. Doing so will support science and aid in using AI to address diverse pressing social problems.
Cirrus and the AI Africa Consortium
Cirrus and the AI Africa Consortium are ambitious by African standards. Cirrus emerged in 2017 from a need to use AI in a scientific collaboration at Wits University. The university leadership then decided that Cirrus should benefit all academic and research institutions in Africa.
Over five years, the legal groundwork has been laid to operationalise Cirrus and the AI Africa Consortium. Some activities have already begun, including the rollout of machine learning for embedded devices. Full implementation will commence following the confirmed participation of the Strategic Founding Partners (SFPs).
Cirrus
Cirrus is designed to provide data, dedicated compute infrastructure and engineering resources at no cost to academic and research institutions through the AI Africa Consortium.
Providing dedicated compute infrastructure will be enormously important. Based solely on hardware costs, it is more cost effective to own infrastructure when computing demand is close to continuous. Estimates show that commercial cloud services are more expensive per compute cycle than a dedicated high-performance computing cluster (Villa and Troiano, 30 July 2020). The initial costs of subsidising cloud use might be less than building public infrastructure. However, studies show that relying on commercial cloud services will likely be much more expensive in the long term (Wang and Casado, 2021).
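The utilisation argument above can be made concrete with a simple break-even calculation. The sketch below is an assumption-laden illustration, not a reproduction of the cited estimates: it assumes cloud cost scales linearly with hours used, that owned hardware is written off in equal annual amounts, and that operating costs are fixed. All parameter names are hypothetical and no figures from the source are implied.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cloud_cost(hourly_rate, utilisation):
    """Yearly cost of renting one machine for a fraction of the year.

    utilisation is the share of hours in use, between 0.0 and 1.0.
    """
    return hourly_rate * HOURS_PER_YEAR * utilisation


def annual_owned_cost(purchase_price, lifetime_years, annual_operating_cost):
    """Yearly cost of owning one machine, independent of utilisation.

    Uses straight-line amortisation, which is a simplification.
    """
    return purchase_price / lifetime_years + annual_operating_cost


def break_even_utilisation(hourly_rate, purchase_price, lifetime_years,
                           annual_operating_cost):
    """Utilisation above which owning costs less than renting.

    A result above 1.0 means renting is cheaper even at continuous use.
    """
    owned = annual_owned_cost(purchase_price, lifetime_years,
                              annual_operating_cost)
    return owned / (hourly_rate * HOURS_PER_YEAR)
```

Under these assumptions, the owned cost is a flat line and the cloud cost rises with use, so the closer demand is to continuous, the more likely it is to exceed the break-even point. Real comparisons would also have to account for staffing, energy, hardware refresh cycles and cloud discounts.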
Through a variety of financial and other mechanisms, Cirrus is designed to help attract corporate research in AI (and associated venture capital activity), targeting multinationals not yet active in this field in Africa. Ultimately, Cirrus would be owned, through equity, by around 15‑25 multinational corporations. Each of these SFPs would commit USD 7‑20 million.
The diversity in ownership should bring with it a diversity of research interests. This will help avoid AI research focused on a narrow set of ideas and methods biased to the interests of any particular private sector participant. The research mission of Cirrus is also isolated from political influence, from changes in political administrations and from politically appointed administrators. Allocation of Cirrus resources to the AI Africa Consortium will occur through a mix of peer review, lottery and equitable distribution criteria. As a private sector entity, Cirrus is also not encumbered by the intellectual property constraints that ensnare research commercialisation efforts at publicly funded universities.5 This provides Cirrus with the flexibility to support a range of commercialisation options.
Cirrus has three components. First, it will house co‑operation programmes, the state-of-the-art computing and data infrastructure, engineering personnel and the open learning programmes. Second, the Cirrus FOUNDRY – a form of business incubator – is equipped with everything needed to turn insights from scientific research into start-ups and eventually larger commercial applications. Third, the Cirrus FOUNDRY Fund is an in-house fund to support start-ups in the Cirrus FOUNDRY. The Cirrus FOUNDRY Fund has a target capitalisation of USD 35 million and will undertake pre-seed and seed stage investments.
The physical infrastructure and operations for Cirrus are to be housed at Wits University in Johannesburg, South Africa. Wits University was selected as the host institution for three reasons:
1. South Africa is the most scientifically advanced country on the African continent (Mouton et al., 2019), and Wits is one of Africa’s leading academic research institutions.6
2. Wits is situated geographically in the highest concentration of economic, academic and research activity in Africa.
3. Wits has the land available to house the necessary infrastructure, including for energy generation and storage.
The AI Africa Consortium
The AI Africa Consortium fosters collaboration agreements with parties across the African R&D ecosystem. The agreements focus on helping identify research priorities, spreading AI research resources and engaging African research talent.7 The Consortium aims to create significant AI research capabilities by developing skills and recruiting researchers and other skilled personnel from across Africa. It will then pair these capabilities with those provided through Cirrus.
Figure 1. The organisational layout of Cirrus and the AI Africa Consortium
The Consortium will:
help and encourage researchers to interact and collaborate beyond disciplinary or institutional silos
reduce redundancy of effort and cost as new research projects will not have to build capabilities or collect new data from scratch each time
accelerate discovery and improve reproducibility through sharing of datasets, metadata, models, software, hardware and other resources
reduce the cost for individual research programmes involved in integrating capabilities and/or comparing their work with that of others
foster a co-design culture where teams of scientific users, engineers and instrument providers can help develop new and broadly applicable capabilities and tools
support a research ecosystem that understands the full context for AI solutions.
Figure 1 sets out the organisational structure of the Consortium. At the time of writing, the next step is the appointment of the lead investment bank for solicitation of the SFPs. Following placement of the SFPs, the Partner, Affiliate and Co-development programmes will be rolled out.
Efforts underway within the AI Africa Consortium include:
TinyML4D: The rollout of machine learning on embedded devices, targeted at developing countries. It includes the provision of free hardware kits, workshops, courseware and a network of research and collaboration opportunities.8 TinyML4D began in 2021 and is being scaled up.
MLCommons: Fostering African participation in the development of science benchmarks, particularly those relevant to African researchers.9
Remote Excellence Fellowships: A remote internship system to help talented graduate students connect with leading researchers in Europe. The first cohort is planned for September 2022.
Conclusion
Fundamental and applied R&D at academic and research institutions in Africa are at risk of marginalisation. Resources essential to AI – compute, hardware, software, accessible data and machine-learning engineering – are out of reach. The growing imbalance in AI resources and innovation between Africa and the rest of the world requires an unprecedented response. The establishment of Cirrus and the AI Africa Consortium is one of Africa’s responses. It aims to help spread opportunity more widely; support students and researchers at universities and research institutions across Africa; activate the talent of researchers once they have access to AI infrastructure and other resources; and create fertile ground for commercialisation through entrepreneurship.
For science in Africa, Cirrus and the AI Africa Consortium afford a major opportunity to develop and exploit AI techniques and methods. This will improve both the efficacy and efficiency of science, and also the operation and optimisation of scientific infrastructure (because system scale and complexity demand AI-assisted design, operation and optimisation).
Strengthening science in Africa through AI methods will broaden global research agendas and elevate African research. To accomplish this, Africa must also act collectively and collaborate to grow the scientific output needed to exploit opportunities presented by AI.
The goals described in this essay are challenging and the proposed solutions will require significant investment. However, the potential return on that investment is enormous: new types of data analysis; improved and even autonomous operations and performance of scientific instruments; innovative commercial products emerging from science, with even the potential for new industries; and an opportunity for Africa to become a producer of AI for science and not merely a consumer of the resulting breakthroughs.
References
Ahmed, N. and M. Wahed (2020), “The de-democratization of AI: Deep learning and the compute divide in artificial intelligence research”, arXiv, https://arxiv.org/pdf/2010.15581.pdf.
Amodei, D. and D. Hernandez (16 May 2019), “AI and compute”, OpenAI Blog, https://openai.com/blog/ai-and-compute.
Cutcher-Gershenfeld, J. et al. (2017), “Five ways consortia can catalyse open science”, Nature, Vol. 543, pp. 615-617, https://doi.org/10.1038/543615a.
Mouton, J. et al. (2019), The State of the South African Research Enterprise, DST-NRF Centre of Excellence in Scientometrics and Science, Technology and Innovation Policy, Stellenbosch University, Matieland, South Africa, www0.sun.ac.za/crest/wp-content/uploads/2019/08/state-of-the-South-African-research-enterprise.pdf.
OECD (2021), Recommendation of the Council concerning Access to Research Data from Public Funding, OECD, Paris, https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0347.
QS World University Rankings (2021), “QS World University Rankings” webpage, www.topuniversities.com/university-rankings/world-university-rankings/2021 (accessed 6 January 2023).
Reddi, V.J. et al. (2021), “Widening access to applied machine learning with TinyML”, arXiv, arXiv:2106.04008v2, 9 July, https://arxiv.org/pdf/2106.04008.pdf.
Rubiera, C. (19 July 2021), “AlphaFold 2 is here: What’s behind the structure prediction miracle”, Oxford Protein Informatics Group blog, www.blopig.com/blog/2021/07/alphafold-2-is-here-whats-behind-the-structure-prediction-miracle.
Sample, I. (2017), “‘We can’t compete’: Why universities are losing their best AI scientists”, 1 November, The Guardian, www.theguardian.com/science/2017/nov/01/cant-compete-universities-losing-best-ai-scientists.
South African Government (2010), Intellectual Property Rights from Publicly Financed Research and Development Act: Regulations, www.gov.za/documents/intellectual-property-rights-publicly-financed-research-and-development-act-regulations-1.
Spice, B. (2019), “Hebert named dean of Carnegie Mellon's top-ranked School of Computer Science”, 8 August, Carnegie Mellon Computer Science Department, https://csd.cmu.edu/news/hebert-named-dean-carnegie-mellons-top-ranked-school-computer-science.
The Times Higher Education (2022), Emerging Economies University Rankings 2022 (database), www.timeshighereducation.com/world-university-rankings/2022/emerging-economies-university-rankings (accessed 6 January 2023).
Villa, J. and D. Troiano (30 July 2020), “Choosing your deep learning infrastructure: The cloud vs. on-prem debate”, Determined AI blog, https://determined.ai/blog/cloud-v-onprem.
Wang, S. and M. Casado (2021), “The cost of cloud, a trillion dollar paradox”, Andreessen Horowitz, 27 May, https://a16z.com/2021/05/27/cost-of-cloud-paradox-market-cap-cloud-lifecycle-scale-growth-repatriation-optimization.
Notes
← 1. Africa’s highest ranked university in 2021 was the University of Cape Town, in 220th place. For the full list of rankings, see QS World University Rankings (2021).
← 2. Reddi et al. (2021) provide an example of what it takes to build and maintain high quality courseware for machine learning.
← 3. For commentary on the engineering skills that went into developing AlphaFold 2, see Rubiera (19 July 2021).
← 4. For recommendations concerning access to research data from public funding, see OECD (2021).
← 5. For the regulations governing intellectual property rights from publicly financed research in South Africa, see South African Government (2010).
← 6. See The Times Higher Education Emerging Economies University Rankings (2022).
← 7. For an overview of why consortia can catalyse open science, see Cutcher-Gershenfeld et al. (2017).
← 8. For information on TinyML4D, see http://tinyml.seas.harvard.edu/4D/.
← 9. For information on the MLCommons Science Working Group, see https://mlcommons.org/en/groups/research-science/.