Bringing NTD data to the masses

Community health worker collecting survey data

Reliable, up-to-date maps and data identifying the location and levels of parasitic worm infections are essential tools for targeting and treating populations at risk.

One of the key challenges in mapping the global distribution of NTDs is that the empirical data are often difficult to access and make sense of. Survey data are either left unpublished or are spread over thousands of separate journal articles.

4 April 2017

The Global Atlas for Helminth Infections (GAHI) was created to help improve access to these vast reservoirs of valuable data by collating all available information on the prevalence of soil-transmitted helminths (STH), schistosomiasis and lymphatic filariasis into a single open-access online database. GAHI was developed by researchers from the London Applied & Spatial Epidemiology Research Group (LASER), based at the London School of Hygiene & Tropical Medicine.

At the time of writing, GAHI’s NTD database has collated datasets from over 10,000 surveys, including 23,000 individual records from over 1,600 reports and publications. This open online platform gives users free access to parasitological data and maps covering 124 countries.

LASER’s Dr Anwar Musah explains the rigorous approach that GAHI follows to ensure that no stone is left unturned in the search for accurate NTD data.

Data weeding

The GAHI database draws its data from an exhaustive global online search for NTD surveys contained within historical and contemporary formal and grey literature sources. Keyword searches of research papers are conducted through publication listings such as PubMed, Medline, EMBASE, Web of Science and Google Scholar. Unpublished grey literature is collected from a range of national and regional NTD control programmes. The identified publications are archived in reference management software, where they are then screened for extractable NTD data.

The screening weeds out data that are not relevant to GAHI, leaving only studies that have determined the prevalence of infection at school, household, village or community level and that provide the spatial location of the dataset. To meet these criteria, we look for surveys which:

  1. have a study design which is either cross-sectional, repeated cross-sectional or longitudinal;
  2. are non-clinical and not from a hospital setting;
  3. took place where no worm control or treatment programme had been conducted in the past 12 months; and
  4. feature whole distinct populations rather than smaller non-representative sub-groups.
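
As a rough illustration, the four criteria above could be expressed as a simple screening check. This is a minimal sketch and not GAHI's actual tooling: the `Survey` class and all of its field names are hypothetical, and treating "the past 12 months" as strictly fewer than 12 months is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Study designs named in criterion 1.
ALLOWED_DESIGNS = {"cross-sectional", "repeated cross-sectional", "longitudinal"}


@dataclass
class Survey:
    """Hypothetical summary of one screened survey (field names are assumptions)."""
    design: str
    hospital_based: bool
    months_since_treatment: Optional[float]  # None = no known prior programme
    whole_population: bool


def meets_inclusion_criteria(survey: Survey) -> bool:
    """Return True only if the survey passes all four screening criteria."""
    # 1. Cross-sectional, repeated cross-sectional or longitudinal design.
    if survey.design.lower() not in ALLOWED_DESIGNS:
        return False
    # 2. Non-clinical and not from a hospital setting.
    if survey.hospital_based:
        return False
    # 3. No control or treatment programme in the past 12 months
    #    (assumed here to mean strictly fewer than 12 months ago).
    recent = survey.months_since_treatment
    if recent is not None and recent < 12:
        return False
    # 4. Whole distinct population rather than a non-representative sub-group.
    return survey.whole_population


school_survey = Survey("Cross-sectional", False, None, True)
clinic_survey = Survey("cross-sectional", True, None, True)
print(meets_inclusion_criteria(school_survey), meets_inclusion_criteria(clinic_survey))
```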

If a study meets these criteria, the paper is mined for:

  1. Publication details: Alongside the usual citation we also collect author contact details in case we need to confirm anonymised location or prevalence information.
  2. Survey information: Including geographical location (latitude and longitude coordinates), site details (name of country, province, district etc.) and type of survey employed, including the characteristics of the population under study.
  3. Epidemiological and parasitological data: Disease prevalence is calculated from the number of survey participants and the number recorded as positive for a given parasitic infection. Additionally, we collect information on the type of diagnostic tool used in the survey.
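
The prevalence calculation described in the last item is a simple proportion: the number of participants recorded as positive divided by the number of participants surveyed. A minimal sketch follows; the function name and the input checks are illustrative assumptions, not part of GAHI's own process.

```python
def prevalence(n_positive: int, n_examined: int) -> float:
    """Proportion of surveyed participants recorded as positive for an infection."""
    if n_examined <= 0:
        raise ValueError("the number of participants must be greater than zero")
    if not 0 <= n_positive <= n_examined:
        raise ValueError("positives must lie between 0 and the number of participants")
    return n_positive / n_examined


# Illustrative numbers only: 45 positives among 180 participants.
print(prevalence(45, 180))  # 0.25
```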

These compiled and filtered data are made freely available for users to download. The screening and categorising process employed by GAHI means that users can tailor the datasets they need by geographical location, species, diagnostic method, the year in which the original data were collected and the characteristics of the survey.
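
For readers who work with downloaded records in code, this kind of tailoring amounts to filtering rows on a few fields. The sketch below assumes each record is a plain dictionary with hypothetical keys (`country`, `species`, `method`, `year`); the real download format and column names may differ, and the example rows are made up.

```python
from typing import Dict, Iterable, List, Optional


def filter_surveys(
    records: Iterable[Dict],
    country: Optional[str] = None,
    species: Optional[str] = None,
    method: Optional[str] = None,
    year_from: Optional[int] = None,
    year_to: Optional[int] = None,
) -> List[Dict]:
    """Keep only records matching every filter that was supplied."""
    selected = []
    for rec in records:
        if country is not None and rec["country"] != country:
            continue
        if species is not None and rec["species"] != species:
            continue
        if method is not None and rec["method"] != method:
            continue
        if year_from is not None and rec["year"] < year_from:
            continue
        if year_to is not None and rec["year"] > year_to:
            continue
        selected.append(rec)
    return selected


# Made-up rows for illustration only, not real GAHI data.
records = [
    {"country": "Kenya", "species": "hookworm", "method": "Kato-Katz", "year": 2009},
    {"country": "Kenya", "species": "S. mansoni", "method": "Kato-Katz", "year": 1998},
    {"country": "Uganda", "species": "hookworm", "method": "Kato-Katz", "year": 2011},
]
kenya_recent = filter_surveys(records, country="Kenya", year_from=2000)
print(len(kenya_recent))  # 1
```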

To ensure this mass of valuable data is available for those without access to, or knowledge of, mapping software, GAHI uses the data to create over 1,000 downloadable static maps organised by country, disease and intervention type.