1. Introduction to Infectious Disease Modelling

Simulation models have become increasingly powerful and popular tools for studying the epidemiology of infectious diseases because they allow us to explore the potential outcomes of different disease control scenarios without the time and expense of conducting experiments in the real world.   However, no model is ever perfect and it is important for veterinarians, researchers, and policy-makers to have a good understanding of how these models are built so that they can make the appropriate inferences and ultimately the appropriate decisions based on the model results.  In this first module, we will cover the basic principles behind infectious disease modelling and develop an action plan for how we can start building models for our diseases of interest.

Learning Outcomes

By the end of this module, students should be able to:

  • Understand the basic principles behind infectious disease pathogenesis in individuals and infectious disease transmission dynamics in populations
  • Identify the critical steps involved in developing an infectious disease simulation model from initial planning through to final publication.
  • Understand how the basic programming principles of variables, vectors, and loops can be used to generate an infectious disease model

Introduction

Infectious diseases place a significant burden on animal populations through their many effects on animal health, welfare, and performance.  Consequently, there is strong interest in understanding how we can better prevent animals from becoming infected in the first place and, if they do become infected, what interventions we can implement to reduce the clinical impacts as well as the chances of onwards transmission.  Our task would be relatively straightforward if we had unlimited resources to spend on infectious disease control and if there was only one possible option for controlling disease.  However, in reality, we typically have a very limited budget and there are often many different measures we can take that each have their own relative costs, benefits, and effectiveness.  The situation is made even more complicated by the fact that any decisions we make about managing infectious disease in a single animal or on a single farm will have externalities for every other individual they are at risk of infecting.

For many reasons, it is often difficult for us to conduct formal research studies in the real world to evaluate the effectiveness of different disease management options:

  • Exotic pathogens like foot-and-mouth disease virus spread very quickly through susceptible animal populations and, in an outbreak situation, policy-makers need to make rapid decisions about disease control to minimize economic losses without the time to conduct studies.
  • Endemic pathogens like bovine tuberculosis operate over very long time scales and it could take decades before any actionable results were available from the field studies.
  • Every animal and every herd has different demographic, management, and health profiles so it may be difficult to extrapolate the results from one herd to another given all the potential confounders.
  • For questions around national disease control, there is only one population and so it would be impossible (and also incredibly expensive) to conduct controlled experiments.

Instead, what we can do is build a virtual representation of the population and the transmission dynamics of disease within the population that we can then manipulate to make inferences about the potential outcomes of different interventions.  This leads to all kinds of practical questions around how much detail is needed to accurately capture the disease dynamics and how we know whether our models are actually doing a good job.  What you will find through the process of model building is that we often develop additional insights about the disease pathogenesis and identify gaps in basic research knowledge that must be filled to reduce model uncertainty.  Don’t be surprised when the final model you produce looks very different from the one you started out building!

Infectious Disease Basics

The term infectious disease dynamics is commonly used to describe the interactions between pathogens and their host populations over time.  This includes understanding what happens to individuals when they become infected with the pathogen as well as the mechanisms that pathogens use to spread between individuals in the host population. 

When a susceptible individual is exposed to a pathogen, there are a number of different exposure, infection, and disease states they can pass through as shown in the figure below.  It is important to distinguish between these three categories for each disease because it determines whether the individual is actively infectious to other individuals and when we can expect to be able to diagnose individuals based on the presence of clinical signs, isolation of the pathogen, and/or the presence of antibodies against the pathogen.  Some of the most difficult pathogens to control are the ones with long incubation periods or subclinical periods where the animal is carrying the pathogen but is not detectable by any diagnostic test.  Also note that depending on the disease, animals can pass backwards or forwards through the different states.

The key immunological and infection states we are most often interested in are:

  • Susceptible (S) – naïve animals that can become infected with the pathogen.
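  • Exposed (E) – animals that have been exposed to the pathogen but are not yet infectious. This period is often loosely called the incubation period, although strictly the incubation period runs from infection to the onset of clinical signs, while the time from infection to becoming infectious is the latent period.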
  • Infectious (I) – animals that are actively excreting the pathogen and are infectious to others.
  • Recovered (R) – animals that have gained natural immunity to the pathogen following exposure.
  • Maternal Antibody (MA) – animals that are fully or partially protected from disease through colostral antibodies.
  • Carrier (C) – animals that are chronically shedding the pathogen.
  • Latent (L) – animals where the pathogen has become dormant in the body.
  • Vaccinated (V) – animals that are fully or partially protected from disease through vaccination.

Fundamentally, we know that in order for a pathogen to persist in a population long term, it must get copies of itself out of its currently infected host and into at least one other susceptible host on average before (1) the current host dies naturally or as the direct result of infection or (2) the current host mounts an immune response or receives some kind of treatment that kills the pathogen.  This propensity can be described as the basic reproductive number or R0, which measures the number of secondary infections on average caused by a single infected individual interacting with an entirely susceptible population. The following table lists estimates of R0 for several common infectious diseases of animal populations.

Disease system | R₀ estimate | Reference
Bovine herpesvirus 1 in dairy cattle | 7 | Hage et al. (1997)
Paratuberculosis in dairy cattle | 11 | Roermund et al. (1999)
Bovine viral diarrhoea virus in dairy cattle | 4 | Moerman et al. (1993)
Staphylococcus aureus mastitis in dairy cattle | 0.5–8.0 | Lam et al. (1998)
Rabies virus in dogs | 2.0–2.5 | Coleman and Dye (1996)
Escherichia coli O157 in beef calves | 2.9–5.6 | Laegrid and Keen (2004)

In general, the more susceptible individuals that a single infected individual manages to infect (high R0 values), the faster disease will spread through the population and the harder it will be to control the disease.  The primary exception to that rule is with fast-moving diseases in small populations where the pathogen can deplete the population of susceptible animals.  Diseases with R0 values close to 1 will tend to remain at a steady state with the prevalence of infected individuals staying relatively constant over time.  The closer R0 gets to zero, the faster disease will disappear from the population.  There are a few other important caveats to these rules, which we will discuss in more depth in later modules. 
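As a rough illustration of these thresholds, the sketch below tracks the expected number of new cases in each generation of infection when every case produces R0 secondary cases on average. It deliberately ignores the depletion of susceptible animals and any chance effects, and the function name and example values are assumptions chosen only for illustration.

```python
def expected_cases_by_generation(r0, generations, index_cases=1.0):
    """Expected new cases per generation, ignoring susceptible depletion."""
    cases = [index_cases]
    for _ in range(generations):
        # Each case in the current generation causes r0 new cases on average.
        cases.append(cases[-1] * r0)
    return cases


# Compare a fading (R0 < 1), steady (R0 = 1), and growing (R0 > 1) scenario.
for r0 in (0.5, 1.0, 2.0):
    print(r0, [round(c, 2) for c in expected_cases_by_generation(r0, 5)])
```

In a real population the growing scenario cannot continue indefinitely because the pool of susceptible animals shrinks, which is one of the caveats covered in later modules.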

If we unpack the concept of R0 a little further, there are three primary things that influence its value:

1. Number of contacts an infected individual makes with other individuals during the infectious period

  • This will depend on the length of the infectious period and the behaviours of the infected individual during that time. Contacts can include direct contacts such as face-to-face interactions as well as indirect contacts such as through contamination of the environment.  It is important to also remember that being infected with a pathogen can cause individuals to change their typical contact patterns.  Classic examples are rabid animals and mice infected with Toxoplasma gondii, which lose their normal inhibitions about coming into contact with other animals.

2. Probability that at least one of those contacts is with a susceptible individual who can become infected

  • This will depend on the overall level of immunity in the population. Common factors that can influence the level of immunity are recovery from previous infections, immunization, and maternal antibodies.

3. Probability that transmission will successfully occur as the result of the contact

  • This will depend on the total pathogen load the susceptible individual is exposed to as well as the level of biosecurity and hygiene measures that are in place to prevent disease from occurring through the contact.

As infectious disease modellers, this means we need a framework for describing the contact patterns of individuals, for tracking the immunologic and infection status of individuals, and for estimating the transmission probabilities.  Once this is in place, we can then begin to look at the different interventions we can use to get that value of R0 below one.
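To make the three components above concrete, here is a minimal sketch that multiplies them together to give the expected number of secondary cases from one infected animal. It simplifies the second component to the proportion of contacts that are with susceptible animals, and every parameter name and value is a hypothetical placeholder rather than an estimate for any real disease.

```python
def expected_secondary_cases(contacts_per_day, infectious_days,
                             prop_susceptible, p_transmission):
    """Expected secondary cases caused by one infected individual."""
    # 1. Total contacts made over the infectious period.
    total_contacts = contacts_per_day * infectious_days
    # 2. Share of those contacts that are with susceptible individuals.
    # 3. Chance that each such contact results in transmission.
    return total_contacts * prop_susceptible * p_transmission


# Entirely susceptible population: this is R0 under the assumptions above.
r0 = expected_secondary_cases(4, 5, 1.0, 0.2)
# Same disease after immunity leaves only 20% of contacts susceptible.
r_after_immunity = expected_secondary_cases(4, 5, 0.2, 0.2)
print(r0, r_after_immunity)
```

Writing the calculation this way shows where each type of intervention acts: movement restrictions reduce the contacts, vaccination reduces the susceptible proportion, and biosecurity reduces the transmission probability.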

Getting Started

If this is the first time you have tried building an infectious disease simulation model, it can be pretty intimidating even just knowing where to start.  Since the usual goal is to produce a model that will pass peer-review for publication, I always recommend setting up a Word document in the same format as you would for a manuscript and then outlining the different sections that you know you will need to fill in. 


Reviewing the Literature

Much of the information we need to build infectious disease models has already been published in the literature and the model building process is primarily about pulling it together in a unified framework that will provide decision-makers with the information they need to effectively manage disease.  Regardless of the disease system you are working with, you will need to gather information from the following seven areas:

  1. Animal Demographics

Describing the normal physiological stages and management events for animals in the host population as they progress through the lifecycle from birth to slaughter.

  2. Prevalence and Incidence

Describing the estimated prevalence and incidence of disease at the animal, herd, and industry levels to better quantify disease burdens in the population.

  3. Disease Risk Factors

Describing the different pathways that allow disease to spread between hosts as well as the different host, environment, and pathogen factors that can increase the risk of disease transmission.  This includes knowing whether there are other reservoirs or vectors for the disease.

  4. Pathogenesis and Impacts

Describing the clinical progression, production impacts, and outcomes of disease in animals once they have become infected.  This helps us understand how long individuals are likely to stay infectious to others and when we may be able to detect them with the different diagnostic tests.

  5. Diagnostic Tests

Describing the different methodologies that can be used to identify infected animals and herds including clinical signs, laboratory tests, and post-mortem examinations. It is important to discuss the sensitivity and specificity of the tests at both the animal and herd levels.

  6. Management Options

Describing the different interventions that can be used to control disease including treatments, vaccinations, culling, and other management changes.  It is important to discuss the efficacy of the interventions in reducing the incidence, duration, and severity of the disease.  It is also important to know if the interventions have negative consequences such as withholding periods, toxicities, or reductions in carcass values.

  7. Externalities

Describing other factors that should be considered when making management decisions about the disease including antimicrobial resistance, trade restrictions, animal welfare, environmental impacts, social motivations, competing disease priorities, and farmer business objectives.

While doing our literature review, we may find that there are significant knowledge gaps in certain areas and this can help us prioritize further research studies to get better estimates for the model parameters.  We’ll discuss this in more detail in Module 6.
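The distinction between animal-level and herd-level test performance mentioned under Diagnostic Tests can be sketched with two commonly used approximations. This example assumes a herd is classified as positive if any tested animal is positive, that animals are sampled from a large herd, and that false positives can be ignored when calculating herd sensitivity; the function names and input values are hypothetical.

```python
def herd_sensitivity(animal_se, prevalence, n_tested):
    """Chance that at least one of n tested animals is a true positive."""
    # Probability a randomly chosen animal is infected AND tests positive.
    p_detect_one = animal_se * prevalence
    return 1.0 - (1.0 - p_detect_one) ** n_tested


def herd_specificity(animal_sp, n_tested):
    """Chance that all n tested animals in an uninfected herd test negative."""
    return animal_sp ** n_tested


# Testing more animals raises herd sensitivity but lowers herd specificity.
for n in (10, 30, 60):
    print(n,
          round(herd_sensitivity(0.7, 0.1, n), 3),
          round(herd_specificity(0.99, n), 3))
```

The trade-off printed by the loop is the reason sampling protocols need to be described alongside the test characteristics when reviewing the literature.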

Defining Model Scope

Before starting to build infectious disease models, we need to have a clear understanding of the main research question and how we intend to use the model results to support decision-making since this will drive the model structure as well as determine what information we need to collect at each step in the process.  The following is an initial checklist of six basic questions to ask yourself – note that it is okay if you don’t have complete answers at this stage and, again, you may find that these answers change as you start the process of building the model.

Why is the model needed?

Before investing time and effort in building a model, you should be able to justify why it is important to do so.  This usually includes factors like how commonly the disease occurs in the population, what effects the disease has on animal health and production, and other important externalities (trade restrictions, welfare, public health).  Or sometimes it’s simply because we don’t know how much the disease is actually impacting populations and need more information to guide future decisions around research and control priorities.

What is the purpose of the model?

  • Making backward inferences

This is for situations where we don’t actually understand a lot about the basic epidemiology and pathogenesis of disease.  We usually develop several theoretical models of how we think the system works and then see which one best matches the empirical data observed in the field.

  • Making forward predictions

This is for situations where we have the disease system well described and want to use the models to make predictions about what might happen in the future either in the presence or absence of disease control measures.

At what scale will decisions be made?

  • Animal-level

This is for situations where we want to make decisions about how to manage a clinical case in a single infected individual. This requires modelling how an individual animal progresses through the course of an infection from exposure to resolution.

  • Herd-level

This is for situations where we want to make decisions about managing infectious disease in a discrete population of animals, which could be a herd for livestock, a city or town for companion animals, or a community or colony in the case of wildlife.  This requires modelling pathogen transmission between infectious and susceptible animals within the population.

  • Industry-level

This is for situations where we are looking to develop plans for controlling or eliminating infectious disease at a large-scale regional or national level. This requires modelling the pathogen transmission between infected and susceptible herds within the industry.

Is the disease epidemic or endemic in the population?

  • Epidemic

This is for situations where the entire population starts out naïve and you are looking to track the outcomes of an outbreak occurring.  A classic example is the foot-and-mouth disease (FMD) models developed for disease-free countries to help policy-makers prepare and plan for a response.  We are most often trying to convince decision-makers of the need to invest in preventative measures to reduce the risk and severity of outbreaks.

  • Endemic

This is for situations where the disease is already established in the population and you are looking to compare different strategies for controlling or eliminating disease.  A classic example is disease models for bovine tuberculosis or bovine viral diarrhoea virus.  We are most often trying to convince decision-makers of the value in investing money to fix their current problems with disease.

Are there existing models for the disease?

  • No

This gives you the most scope for developing your own model system but can also be the most challenging because you have no basic guidelines to work from.

  • Yes

In this situation, you will need to justify why the existing models are not adequate for meeting your intended purpose.  This is usually because (1) the demographic structure of the population doesn’t match yours, (2) you suspect the disease dynamics work differently in your population, and/or (3) the models were missing components like different disease control interventions.

Who is the target audience for the model?

  • Farmers and Veterinarians

These individuals are interested in knowing things like how much the disease is currently impacting their business and what returns they might expect to see by investing in control programmes.

  • National Policy-Makers

These individuals are interested in knowing what national strategy will have the biggest benefit to industry at the lowest cost to individual farms or how they should allocate scarce resources amongst different competing interests.

  • Researchers

These individuals are interested in knowing how the different assumptions about the disease and the population structure impact the inferences that can be made from the model and how this modelling framework could be applied to other situations.

Overview of Modelling Methods

Although the structure of simulation models will vary widely depending on the disease system and preferences of the modeller, the model needs to be described in enough detail that another researcher could replicate the code and generate the same outputs.  The methods section for a modelling manuscript should typically be divided into six sections:

1. Model Overview

This first section is a short paragraph highlighting the key structural features of your model. There are several decisions about model structure that need to be made up front.

Mechanistic models represent a hypothesized relationship between variables based on underlying biological principles and the parameters for the model often have distinct biological definitions. 

Phenomenological or statistical models seek only to find the relationship that best describes the data. Most infectious disease simulations are built to be mechanistic because we are generally trying to describe the biology of host and pathogen populations.

In compartmental models, the population of animals (or herds) is divided into groups based on their disease and/or physiologic status and we only keep track of the total number of animals (or herds) in each group over time. 

In individual-based models, each animal (or herd) is created as its own separate unit and we can track more specific details about that individual such as age, breed, sex, pregnancy status, production levels, movement history, and disease status.  Individual-based models are usually far more computationally expensive than compartmental models, but it can be important to track this level of detail for chronic, slow-spreading diseases like bovine tuberculosis or Johne’s disease where the history and characteristics of the animal are important for determining their risk of getting disease as well as the epidemiological and clinical outcomes once they are infected.
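The difference between the two approaches is largely a difference in what we store. The sketch below, with hypothetical field names and values, represents the same 100-animal herd first as compartment counts and then as a list of individual animal records.

```python
from dataclasses import dataclass

# Compartmental representation: only the number of animals in each state.
compartments = {"S": 98, "I": 2, "R": 0}


# Individual-based representation: one record per animal with its own details.
@dataclass
class Animal:
    animal_id: int
    age_days: int
    status: str = "S"  # disease state: "S", "I", or "R"


herd = [Animal(animal_id=i, age_days=400 + i) for i in range(100)]
herd[0].status = "I"
herd[1].status = "I"


def count_by_status(animals):
    """Collapse individual records back into compartment counts."""
    counts = {"S": 0, "I": 0, "R": 0}
    for animal in animals:
        counts[animal.status] += 1
    return counts


print(count_by_status(herd))
```

The individual records can always be summarised into compartment counts, but the reverse is not possible, which is why individual-based models carry the extra memory and run-time cost.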

In single population models, there is only one set of disease dynamics being described for the whole population. 

In metapopulation models, the population is divided into discrete units (subpopulations) that each have their own unique transmission dynamics.  A good example of this is modelling disease transmission in the cattle industry where the national animal population is divided into discrete herds and each herd has its own unique epidemiological processes occurring.  It’s also possible within-herds to have separately managed mobs that each have their own disease dynamics.  For metapopulation models, we need to account for both the within-herd and between-herd spread of disease.

In homogenous mixing models, all animals in the population are assumed to make the same number of contacts over time and have an equal chance of coming into contact with all other animals in the population.  

In heterogenous mixing models, each individual has their own unique contact patterns reflecting differences in individual behaviour as well as other factors such as geographic location, seasonality, and management practices that influence contact probabilities.  Homogenous mixing models are usually fine for describing the transmission dynamics of animals that are managed as well-mixed mobs.  Heterogenous mixing models are generally preferred for industry-level models where each herd has a unique demographic profile and geographically distinct location, which will determine the frequency of contact and the likelihood of making contact with other herds in the population. 
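One simple way to picture the two mixing assumptions is in how a contact partner is chosen. In this sketch, which uses made-up herd names and contact weights, homogeneous mixing picks any other herd with equal probability while heterogeneous mixing picks according to herd-specific weights (for example, reflecting distance or trading history).

```python
import random

herds = ["A", "B", "C", "D"]

# Hypothetical relative contact weights for herd A (e.g. nearer herds score higher).
contact_weights = {"A": {"B": 0.80, "C": 0.15, "D": 0.05}}


def pick_contact_homogeneous(source, all_herds, rng):
    """Every other herd is equally likely to be contacted."""
    others = [h for h in all_herds if h != source]
    return rng.choice(others)


def pick_contact_heterogeneous(source, weights, rng):
    """Contact partners are drawn according to source-specific weights."""
    partners = list(weights[source])
    partner_weights = [weights[source][p] for p in partners]
    return rng.choices(partners, weights=partner_weights, k=1)[0]


rng = random.Random(1)
print([pick_contact_homogeneous("A", herds, rng) for _ in range(10)])
print([pick_contact_heterogeneous("A", contact_weights, rng) for _ in range(10)])
```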

In deterministic models, the parameter values and processes remain fixed across the entire simulation so that the model will always produce the exact same results every single time it runs. 

In stochastic models, the parameter values are usually sampled from distributions of what may be expected in the real world, which will potentially produce a range of different outcomes at the end of each simulation.  It is more difficult to build stochasticity into compartmental models than individual-based models. 
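The contrast can be seen in a single transmission step. In the sketch below, the deterministic version always returns the expected number of new infections, while the stochastic version gives each susceptible animal its own random chance of becoming infected; the function names and the 10% infection probability are illustrative assumptions.

```python
import random


def new_infections_deterministic(n_susceptible, p_infection):
    """Always returns the expected number of new infections."""
    return n_susceptible * p_infection


def new_infections_stochastic(n_susceptible, p_infection, rng):
    """Each susceptible animal is infected independently with p_infection."""
    return sum(1 for _ in range(n_susceptible) if rng.random() < p_infection)


rng = random.Random(42)  # fixed seed so the example can be reproduced
print(new_infections_deterministic(100, 0.1))
print([new_infections_stochastic(100, 0.1, rng) for _ in range(5)])
```

Running the stochastic function repeatedly gives values scattered around the deterministic answer, which is the variation that later needs to be summarised across replicates.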

In continuous time models, it’s assumed that events occur as a continuous process and these models are most often represented as a series of ordinary differential equations (ODEs) where the movements between different disease or physiological states occur at a constant rate. 

In discrete time models, the simulation is divided into discrete time steps of a given length (days, weeks, months) and events occur with a certain probability during that time step.  Discrete time models can be particularly useful for modelling seasonal livestock production systems where demographic events often happen over short time periods or on single dates rather than continuously over the year.
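A minimal sketch of a discrete time model is shown below for a simple deterministic S-I-R system. It assumes frequency-dependent transmission and converts the continuous rates into per-step probabilities using 1 - exp(-rate × time step); the parameter names (beta, gamma, dt) and values are placeholders rather than estimates for any particular disease.

```python
import math


def rate_to_probability(rate, dt):
    """Probability that an event with a constant rate occurs within one time step."""
    return 1.0 - math.exp(-rate * dt)


def run_discrete_sir(s, i, r, beta, gamma, dt, n_steps):
    """Step a deterministic S-I-R model forward in fixed time steps."""
    history = [(s, i, r)]
    for _ in range(n_steps):
        n = s + i + r
        # Force of infection depends on the current proportion infectious.
        p_infection = rate_to_probability(beta * i / n, dt)
        p_recovery = rate_to_probability(gamma, dt)
        new_infections = s * p_infection
        new_recoveries = i * p_recovery
        s = s - new_infections
        i = i + new_infections - new_recoveries
        r = r + new_recoveries
        history.append((s, i, r))
    return history


history = run_discrete_sir(99.0, 1.0, 0.0, beta=0.5, gamma=0.2, dt=1.0, n_steps=100)
peak_infectious = max(i for _, i, _ in history)
print(round(peak_infectious, 1))
```

Shortening dt makes the discrete model behave more like its continuous time equivalent, at the cost of more steps per simulation.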

Depending on the epidemiology of the disease, the simulation model may need to include separate dynamics for each species of reservoir host, intermediate hosts, and vectors.

In general, the field of infectious disease modelling for livestock populations is increasingly moving towards the use of stochastic individual-based metapopulation models that operate in discrete time and use network-based approaches to describe the patterns of contacts between hosts.  This is largely in recognition of the fact that individuals in a population have very heterogenous behaviour and failing to account for those differences can significantly alter the predictions made by the model.  This section should also usually include a description of your target population (e.g. we developed a deterministic, discrete-time, compartmental model to describe the transmission dynamics and control of swine influenza virus in a typical 1,000 sow commercial breeding herd in New Zealand).

2. Model Structure

Almost every infectious disease simulation model will require three basic components: (i) a demographic component describing the transition of animals through normal physiological states from birth to death as well as their normal contact patterns in the absence of disease, (ii) a disease component describing how susceptible individuals become infected and then progress through the different disease states including the accompanying impacts on production, and (iii) a control component describing the different diagnostic, treatment, and/or prevention strategies that can be used to control disease.  This is followed by (iv) a description of the simulation algorithm. If you are using a discrete-time model, you should also include here a brief description of the time step duration.

The demographic component often acts as your negative control, showing what you would expect the baseline health or production status to be in a disease-free state.  For fast-moving and indiscriminate diseases like FMD, you may not need much detail here.  However, for diseases operating over a long time scale or where the specific health impacts vary based on the animal’s physiological status, you may need to track much more information about the population. Depending on the scale of your model, the demographic component may include:

  • Animal-level demographics: describing the different physiological states an animal moves through from birth to death and the relevant health or production parameters that need to be tracked to assess disease impacts.
  • Within-herd demographics: describing the processes through which animals enter and exit a herd as well as how they are grouped into different management units and the relevant management events.
  • Between-herd demographics: describing the rules for how animals are selected and moved between different herds in the population and/or any assumptions around the spatial distribution of herds that may influence the potential for local spread.

Once you know what “normal” looks like for your system, you can then go on to describe the impacts of disease on animal health and production in the disease component. This includes:

  • Transmission model: describing how a susceptible animal becomes infected including the different mechanisms of within-population and between-population spread.
  • Pathogenesis: describing the progression of disease in an individual once they have become infected. We often use compartmental style models to help illustrate this.
  • Health and production impacts: describing the effects of disease on animal health and production, which is important for estimating the economic impacts.

Some models will stop after the disease component if the primary objective is just to predict basic things like outbreak size and duration while other models will be used to evaluate the efficacy of different management interventions.  For the latter, the control component will need a clear description of:

  • Diagnostic tests: describing the different methods that can be used to diagnose an infected animal including the recommended sampling protocol, test sensitivity, test specificity, and potential adverse reactions to the test.
  • Treatments: describing the different strategies that can be used to treat infected animals including the criteria for treating, the recommended treatment protocol, treatment efficacy, and potential adverse reactions to treatment.
  • Prevention: describing the different strategies that can be used to prevent disease transmission from occurring including vaccinations, biosecurity measures, and other management interventions. These will also need an estimate of efficacy.

Once you have described the different pieces of the model, you then need a section on the simulation algorithm explaining how they link together to form a unified simulation framework.  This section describes either (1) the software programme you used to solve the continuous time models or (2) the events and decisions that occur on each time step of a discrete time simulation to generate the outputs over time.  For the latter, you will also need to specify how long the model was allowed to run or any conditions that may result in early termination of the model (e.g. the epidemic burns out or disease is detected).
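For a discrete time model, the simulation algorithm can often be summarised as an ordered list of events inside a loop. The following sketch uses a simple chain-binomial (Reed-Frost style) assumption for a single herd, where each infectious animal has a fixed per-step chance of infecting each susceptible animal; the function name, parameters, and values are hypothetical and only meant to show the ordering of events and the stopping rules.

```python
import random


def simulate_outbreak(n_animals, n_index, p_transmit, p_recover, max_steps, seed):
    """Run one stochastic outbreak and return (step, S, I, R) for each time step."""
    rng = random.Random(seed)
    s, i, r = n_animals - n_index, n_index, 0
    history = [(0, s, i, r)]
    for step in range(1, max_steps + 1):
        # 1. Transmission: chance a susceptible escapes all i infectious animals.
        p_infection = 1.0 - (1.0 - p_transmit) ** i
        new_infections = sum(rng.random() < p_infection for _ in range(s))
        # 2. Recovery: each infectious animal recovers with a fixed probability.
        new_recoveries = sum(rng.random() < p_recover for _ in range(i))
        # 3. Update the state counts and record the outputs for this step.
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        history.append((step, s, i, r))
        # 4. Early termination: the epidemic has burned out.
        if i == 0:
            break
    return history


history = simulate_outbreak(100, 1, p_transmit=0.02, p_recover=0.2, max_steps=365, seed=7)
print(len(history) - 1, history[-1])
```

Writing out the order of events explicitly matters because, in discrete time, changing the order (for example, recovering animals before transmission occurs) can change the results.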

It is helpful to include simple figures or flow charts outlining the transition of animals through the different states as well as a table with your definitions and estimations for the parameter values.

A lot of people get caught up trying to get specific estimates for each parameter value or describe every process in fine detail at this planning stage, which can put a roadblock on progress.  For now, it’s more than enough to know that you need something to describe that feature and then you can worry about the specifics later.  This is also not a linear process and you will find yourself refining the model over time.

3. Model Data and Validation

If you are using any datasets to describe the population or validate the model, you should use this section to provide an overview of (1) what the data contains (population coverage and variables), (2) what records you extracted from the databases for processing (selection criteria), and (3) the processing steps you used to generate the final dataset.

Another key challenge for you as an infectious disease modeller is convincing your target audience that your model adequately describes the disease and population in question.  Sometimes the structure and parameters can be justified based on what makes biological sense or what has previously been reported in the literature.  Sometimes you will have longitudinal field data from infected herds that you can use to show that your model is able to accurately replicate the observed patterns of disease.  Sometimes you will simply end up tweaking your parameter values until the model produces a result that looks biologically plausible.

4. Simulation Conditions

This section describes the assumptions that are made at the start of each simulation through to generating the final model outputs.

  • Starting Conditions

This section describes how you initialized all the values at the start of the simulation.  For epidemics, you need to decide who will be the index case(s).  For endemic diseases, you usually need to have a burn-in period to allow the disease to reach endemic stability in your system before you can start evaluating the impacts of control measures unless there is good data on the current distribution of disease in the population.

  • Model Outputs

This section describes the outcome measures you want from each simulation model run to assess the epidemiologic and economic picture.  This can include things like the counts of animals in each disease state over time, the incidence of new infections, the total number of animals or farms that were ever infected, costs of disease interventions that were applied, and the total economic outputs or key performance indicators for the farm. It helps to think about the kinds of statistics, figures, and tables you might want to present to decision-makers to help them evaluate the results from the model.

  • Description of Replicates

Because stochastic models will produce a different result every time, you will need to decide how many simulations to run so that you get a good representation of the range of outputs from the model.  You may also want to test a range of different conditions to see what may happen to the results.
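The idea of running replicates can be sketched as follows. The example wraps a small, hypothetical chain-binomial outbreak model in a loop, gives each replicate its own random seed so the run can be reproduced, and then summarises the spread of final outbreak sizes; the model, parameter values, and the choice of 200 replicates are assumptions for illustration only.

```python
import random
import statistics


def outbreak_final_size(rng, n_animals=50, p_transmit=0.03, p_recover=0.3, max_steps=200):
    """Total number of animals ever infected in one stochastic outbreak."""
    s, i = n_animals - 1, 1  # start with a single index case
    for _ in range(max_steps):
        if i == 0:
            break
        p_infection = 1.0 - (1.0 - p_transmit) ** i
        new_infections = sum(rng.random() < p_infection for _ in range(s))
        new_recoveries = sum(rng.random() < p_recover for _ in range(i))
        s, i = s - new_infections, i + new_infections - new_recoveries
    return n_animals - s


def run_replicates(n_replicates, base_seed=1):
    """Run the model repeatedly, using a different seed for each replicate."""
    return [outbreak_final_size(random.Random(base_seed + k)) for k in range(n_replicates)]


sizes = run_replicates(200)
cut_points = statistics.quantiles(sizes, n=20)  # 5%, 10%, ..., 95% cut points
summary = {
    "min": min(sizes),
    "p5": cut_points[0],
    "median": statistics.median(sizes),
    "p95": cut_points[-1],
    "max": max(sizes),
}
print(summary)
```

A practical check on whether enough replicates have been run is to increase the number and see whether the summary statistics change appreciably.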

5. Sensitivity Analysis

Oftentimes we won’t be able to find good estimates for our model parameters, in which case we perform a sensitivity analysis to see how much changing our assumptions about each parameter will change the outcomes predicted by the model.  In an ideal situation, we hope to find that changing the parameter values makes little difference in the results or the ultimate decision we would take based on the results.  If not, this often means we need to conduct further studies to get better estimates for the model parameters before we are confident in using the model outputs to make decisions.  It is often helpful to include a table here showing the ‘best guess’ value for each parameter and then the range of different values you tested in the model.  For stochastic models, you will again need to specify the number of replicates performed for each parameter value tested.
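A simple one-at-a-time sensitivity analysis can be sketched like this. Each parameter is varied across its test range while the others are held at their ‘best guess’ value, and the model output is recorded for each combination; the toy deterministic model, the parameter names, and all values below are hypothetical.

```python
def peak_prevalence(beta, gamma, n_steps=365):
    """Peak proportion infectious from a simple daily-step S-I-R model."""
    s, i = 0.99, 0.01
    peak = i
    for _ in range(n_steps):
        new_infections = beta * s * i
        new_recoveries = gamma * i
        s, i = s - new_infections, i + new_infections - new_recoveries
        peak = max(peak, i)
    return peak


best_guess = {"beta": 0.4, "gamma": 0.1}
test_ranges = {"beta": [0.2, 0.4, 0.6], "gamma": [0.05, 0.1, 0.2]}


def one_at_a_time(best_guess, test_ranges):
    """Vary one parameter at a time, holding the others at their best guess."""
    results = []
    for name, values in test_ranges.items():
        for value in values:
            params = dict(best_guess)
            params[name] = value
            results.append((name, value, peak_prevalence(**params)))
    return results


for name, value, peak in one_at_a_time(best_guess, test_ranges):
    print(name, value, round(peak, 3))
```

If the output barely moves across the range tested for a parameter, a rough estimate is probably good enough; large swings point to parameters that deserve better data.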

6. Statistical Analysis

This final section of the methods details any descriptive statistics or statistical analyses you will be using to present the results from the model.  It should align closely with the text, tables, and figures presented in the results section.  You will most often start with a basic description of the population demographics followed by different sections for each of the predictions or comparisons you made with the model.
