TY - UNPB
T1 - Genome-based source attribution using a one health Escherichia coli isolate collection from 2013-23 in Scotland
AU - Chalka, Antonia
AU - Crozier, Louise
AU - Vallejo-Trujillo, Adriana
AU - Qarkaxhija, Vesa
AU - Low, Alison
AU - McAteer, Sean
AU - Templeton, Kate E.
AU - Tongue, Sue C
AU - Evans, Judith
AU - Foster, Geoffrey
AU - Evans, Thomas
AU - Marwick, Charis A
AU - Raza, Ahmed
AU - Parcell, Benjamin J
AU - Holden, Matthew TG
AU - McNeilly, Tom
AU - Fitzgerald, Stephen
AU - Mitchell, Mairi
AU - Silva, Nuno
AU - Robertshaw-McFarlane, Emily
AU - Hamilton, Scott
AU - Wells, Elizabeth
AU - Hamilton, Clare
AU - Watson, Eleanor
AU - Findlay, David
AU - Bolland, Julie
AU - Redshaw, John
AU - Walker, David
AU - Heywood, Jane
AU - King, Charlotte
AU - Baker-Austin, Craig
AU - Papadopoulou, Athina
AU - Powell, Andy
AU - Paterson, Gavin K
AU - Morgan, Genever
AU - McElhiney, Jacqui
AU - Gally, David L.
N1 - The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-ND 4.0 International license.
PY - 2025/11/25
Y1 - 2025/11/25
AB - Random Forest-based source attribution models were developed from a ‘one health’ resource comprising 4,230 high-quality whole genome assemblies from E. coli. These were isolated from a wide range of sources, predominantly originating in Scotland, including wastewater, livestock, food, and clinical infections of humans and dogs. Using these models, we derived a probabilistic assignment of E. coli isolates from food, shellfish and water samples to potential livestock and human sources of contamination. The incorporation of E. coli sequences from wastewater alongside those from human clinical infections enabled us to capture a wide diversity of human strains in our analyses. The sequence types (STs) of isolates from human bacteraemia and urinary tract infections (UTI) were compared with livestock and food isolates. While only 2.3% of the E. coli isolated from food samples in the study were from STs primarily associated with human bacteraemia and UTI, the models found a livestock signal associated with 15% of the human clinical isolates. In the food and private water samples, livestock-human co-attribution of E. coli isolates was common and consistent with routine human exposure to specific subsets of livestock E. coli, potentially a result of selection during food and water processing. Overall, this research demonstrates the potential value of including source attribution models in national surveillance programmes to understand the transmission of E. coli through the agri-food chain and support risk management to protect public health.
U2 - 10.1101/2025.11.25.25340865
DO - 10.1101/2025.11.25.25340865
M3 - Preprint
BT - Genome-based source attribution using a one health Escherichia coli isolate collection from 2013-23 in Scotland
PB - medRxiv
ER -