Mahbub ul Haq Research Centre at LUMS


Pakistan's AI Health Gold Rush and What it is Costing the Frontline

By:

Maryam Mustafa

June 24, 2026

On 16 May 2026, the National AI Hub convened Pakistan's first National AI x Health gathering at LUMS, Lahore. For the first time, the event brought multi-sectoral representation into one room: government partners, start-ups, researchers, academics and donor agencies, all there to talk about and connect over building Pakistan's next generation of AI tools for health. Fourteen presentations were delivered by a diverse range of presenters whose platforms and products were at varying stages of development. The day closed with a private roundtable with provincial health partners from Sindh, Balochistan, ICT and Punjab on what health system needs AI can actually support.

The last ten years of studying, building and deploying solutions in Pakistan’s digital health landscape reveal that the introduction of AI alone has accelerated changes to that landscape tenfold. Yet the emerging discourse on the issue is lacking in grave and disconcerting ways. This piece flags what today’s conversations include, what they omit, and what is urgent to reflect on before any substantial next steps can be taken.

Building Tools, not Evidence

Across the fourteen presentations, almost none used their own data to investigate the most fundamental questions the sector needs to answer: Does AI work for Pakistan’s localized context and needs? What is the actual impact on the patient, the provider, and the system? Can this be embedded into existing health infrastructures? The conversation defaulted to features, partnerships, and pipeline. Evidence of what works, what does not, and for whom was largely absent.

This is not unique to Pakistan. Reviews of digital health implementation across sub-Saharan Africa have repeatedly noted that despite the rapid spread of digital tools across primary care, the evidence base for whether these tools actually strengthen health systems or improve access remains thin (Owoyemi et al., 2022). The most candid voices in global health have a name for the pattern: pilotitis, a churn of small, parallel pilots that prevents digital health from ever consolidating into something a health system can rely on (Bhatia, Matthan, Khanna & Balsari, 2020). Uganda took this seriously enough in 2012 to impose a moratorium on digital health pilots until the fragmentation could be contained. The moratorium followed a period in which over 50 m-health (mobile health) and eHealth (electronic health) projects were being piloted concurrently in the country in 2010, with some initiatives overlapping in product features within the same regions or health facilities. Pakistan, unfortunately, is still very far from having this type of conversation about health, despite being well past the point where it should have begun.

The Work is Removed from Ground Realities

The presentations shared at the AI meet were also disconnected from the ground realities of health care in Pakistan. They paid little attention to key components such as workflows in primary, secondary and tertiary care settings; the actors involved and their respective roles; the identification of bottlenecks; information entry and flow; referral gaps; and how decision-making responsibility is distributed when AI intervenes, to name just a few. Without such grounding, AI in healthcare will remain a theoretical abstraction, separate from the everyday lived conditions and requirements of care.

Take an EMR tool, or a tool for diagnosing gestational diabetes. The presentations centered on the model — accuracy, the interface, validation. All under ideal, synthetic conditions. There was no conversation about where the data going into these systems is actually coming from. Who is doing the data entry? How long will it take them? What does it do to their existing workflow? What does the accuracy look like in the wild? An EMR is only as good as the data sitting inside it, and that data is being entered by a midwife, an LHV, or a junior doctor who is already managing multiple registers, a queue, and a clinic that runs on understaffing. None of that was visible in what was presented.

The clearest example of this disconnect is who these tools are designed for. Most of the solutions are being built in English, for smartphones, for users who can read, write, and navigate apps. That excludes the vast majority of the women, frontline workers and patients these tools claim to serve, including women who do not read or write in English, who have limited or no digital literacy, who share a phone with the household, or who have none (GSMA, 2025; Jamil, 2021). Underlying all of the work was the assumption that medical officers will have approximately 30 to 40 minutes per patient to enter data. Not a single product focused on reimagining the data pipeline, which is the hardest part of building the AI infrastructure.

The Silence on Ethics and Guardrails

The most concerning thing in the room was what was not said. Across fourteen presentations, almost no one talked about the ethics of deploying these tools directly with populations: consent, bias, harms, redress, what happens when a tool fails for a woman at a primary care site, what happens to her data after she has used it, whether she even knew she was using AI.

This is harder, slower work, and it is not optional. Without functioning oversight, AI in health does not stay neutral; it absorbs and amplifies the inequities of the system it sits on top of. The structural risk is not bad actors. It is the quiet inheritance of training data that reflects decades of unequal care (Obermeyer, Powers, Vogeli & Mullainathan, 2019; Buolamwini & Gebru, 2018). In LMIC contexts, where regulatory scaffolding is weaker and IRB pathways for AI specifically are still being figured out, that inheritance is not a footnote — it is the default.

There is also an extraction problem to name directly. Communities — mostly women, mostly poor, mostly rural — become the data layer for products that are then owned and monetized elsewhere. Scholars working on AI in African health systems have called this data colonialism: data extracted from LMIC populations to develop technologies that are ultimately owned, commercialized and valued by institutions in high-income countries (Couldry & Mejias, 2019; Mohamed, Png & Isaac, 2020; Birhane, 2020). It must be stated plainly: pilots which mine communities for data without returning much to them are not benign, even when the funder is benevolent, and the slide deck is inclusive.

No One is Asking Whether AI is Even the Right Answer

This is the question that recurred most often during the day. Of the fourteen presentations I saw, many could have been non-AI tools. Some could have been a checklist, a better referral form, a phone tree, or a training. Worse, the funding being routed into AI in some of these cases could have done more for outcomes if it had gone to community midwives, training cadres, supplies, or transport to facilities: the unglamorous spine of MNCH delivery.

The sector is missing a systematic way of asking: what is the most significant need? What can AI support? What can it NOT support? And what is better solved with people, money, and a functioning system?

Practitioners working on AI in African health systems have started asking exactly this question out loud. As Dino Rech puts it in a recent commentary on AI health investment on the continent, “Budget pressure should not drive adoption of flashy tools just to prove innovation. The first questions must always revolve around what health system problems are being solved” (Rech, 2025). Without that discipline, AI becomes a distraction at a moment when a health system can least afford one.

What Pilots Cost the Frontline

The donor-funded AI tool model is unsustainable, and it has direct costs at the site of care.

For instance, in my recent fieldwork in Karachi, a visit to a primary care site revealed that a community midwife was running her normal workflow alongside two separate AI interventions, on two separate devices, with four physical registers. One of the interventions was a hand-held ultrasound pilot. Word had travelled across the catchment, so women were now walking miles, taking buses, and bypassing closer facilities to reach hers. They did not know it was a short pilot. For each patient she registered, the midwife was pulling information from four different places to make one entry in one register. Her monthly registered patient count was climbing. Her clinic targets will be reset upward on the back of those numbers — and she will not be able to maintain them once the ultrasound is withdrawn. The women will arrive, expect the service, and find it gone.

This is what pilotitis looks like at the level of one midwife and one waiting room. It is not abstract. Studies of community health worker digitization in Kenya describe the same dynamic: different mobile solutions deployed county by county, sometimes within a single county, with no interoperability and no ability to aggregate data across the system (Njoroge, Zurovac, Ogara, Chuma & Kirigia, 2017).

These pilots erode trust. They raise expectations they cannot keep. They take time from a midwife who is already stretched. And when the pilot ends — and almost all of them end — patients and providers are left holding the gap.

Who Pays After the Donor Leaves?

Another key question surfaces around the economics of AI. Almost no one in the room had a serious answer to: who pays for these AI interventions when the grant ends? AI tokens, server costs, cloud storage, model fine-tuning, support — none of this is free, and the vast majority of patients in this country cannot pay for any of it. What benefit is any of this to them if it disappears the moment the funder moves on?

How long will Pakistan's health system continue to depend on third-party donors to fund tools and platforms that should sit inside the country’s own health departments and workflows? Tools that often carry the donor's incentives and reasoning — which do not always align with what is needed at the community level?

This concern is well-established in the LMIC digital health literature. Even where programs have reached scale, sustainability remains uncertain (LeFevre, Chamberlain, Singh, Scott, Menon, Barron, Ved & George, 2021). Across sub-Saharan Africa, digital health initiatives remain fragmented, poorly coordinated and underfunded; the absence of a financing model is consistently named as one of the central reasons digital tools fail to embed inside national health systems (Karamagi et al., 2022). Sustainability is not a final-chapter problem. It is a design problem. A tool that cannot answer who pays when the donor leaves is a tool whose design is not yet finished.

Where Does This Leave Us?

This blog is not contesting the potential of AI in Pakistan’s health system(s), a potential that is especially strong for the social groups our current setup excludes: women in rural Sindh and Balochistan, low-literacy patients, midwives carrying impossible workloads. The current state of Pakistan’s health provision crises, however, demands much more discipline, not more demos. That discipline was not on display at the AI meeting.

This is what a list of urgencies for AI-powered healthcare in Pakistan should ideally look like:

A problem-first framework should inform the decision to innovate. Before we build, we ask what the gap is, who is affected, what the workflow looks like, and whether AI is the best instrument here, or whether the better answer is more midwives, better training, or a functioning referral pathway.

Evidence should be non-negotiable. Every tool deployed in a public facility should have a built-in evaluation plan. Funding and adoption should depend on the rigor of the evaluation, preferably one conducted by third-party actors. That means real data, real metrics, real publication, and a real willingness to find out that the tool does not work.

The ethics infrastructure should be made explicit. Operational regulatory pathways for AI in health, consent frameworks that account for low-literacy populations, and data governance that does not assume women in Sindh and Balochistan understand or accept what is happening with their data should be clearly laid out. Additionally, frontline workers should receive compensation and training when a tool adds to their load, and every pilot should come with a plan for what happens after it ends.

To ultimately reach a sustainable scale, the target should be for the government to be the owner, not the audience. If a tool cannot eventually live inside a provincial health department's workflow and budget, we should not be piloting it at scale.

Pakistan has one of the highest maternal and newborn mortality burdens in the world. The cost of getting this wrong is not measured in lost pilot funding. It is measured in trust eroded at a primary care site, in a midwife whose targets were reset and then abandoned, and in a woman who walked miles to a facility for a service that was no longer there.

Authors: