Tuesday, May 5, 2009

Experimenting with An Agent-Based Model for Influenza


In the previous article I introduced an agent-based model of influenza A (H1N1), aka "Swine Flu", and shared some simple exploratory runs. In this next piece, I'll share some thoughts about how we perform basic experiments with ABMs. Then you can check out a running version of the model online and play with the parameters yourself!

A Quick Quiz


Sorry, but did you expect to get out of here without doing some work? Take a look at the following chart -- it displays the overall number of individuals who are symptomatic and infectious for seven different model runs. Referring to the runs we explored in the previous article, what can you say about each of the runs and the parameters they used? Looking at the red and gray runs, which parameters might be similar and which different? The answers are below; don't peek!



Nothing is different. All of these model runs were produced with the same set of parameters. I used the same setup as in the final model of the last article, except for one change -- agents are now assigned an individual contact probability which can vary from 0.08 to 0.16. More on why this is an interesting and useful thing to do in a later article. But the important point is that the only difference between these models is that different random series of numbers were used.

I hate trick questions too, but hey, it's in service of an important point. What if we made recommendations for hospital equipment needs based on the green curves, without running enough models to see the red curves? Not good. And in effect, isn't this also what we do when we use real-world data from a few events to make predictions about future needs? Even when none of the key factors change, we can see very significant changes from one model run to another. This is another signal feature of ABMs called "path dependency" -- but we don't really have time to get into that now. The point is that it generally isn't a good idea to draw conclusions from the results of a few runs.
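The quiz can be reproduced in miniature. What follows is a minimal sketch, not the actual Epidemic model: a well-mixed toy epidemic in which only the random seed changes between runs. The function names, the population size, the number of daily contacts and the infectious period are all assumptions made up for illustration; only the 0.08 to 0.16 range for individual contact probability comes from the text, and applying it on the infector's side is also a guess.

```python
import random


def run_toy_epidemic(seed, n_agents=300, days=120, contacts_per_day=3,
                     infectious_days=5, initial_infected=3):
    """Toy stochastic epidemic (hypothetical stand-in for the real model).

    Returns a list with the number of infectious agents at the end of each day.
    """
    rng = random.Random(seed)
    # Each agent gets an individual contact probability in [0.08, 0.16].
    contact_prob = [rng.uniform(0.08, 0.16) for _ in range(n_agents)]
    # state: 0 = susceptible, 1 = infectious, 2 = recovered
    state = [0] * n_agents
    days_left = [0] * n_agents
    for i in rng.sample(range(n_agents), initial_infected):
        state[i] = 1
        days_left[i] = infectious_days

    history = []
    for _ in range(days):
        newly_infected = []
        for i in range(n_agents):
            if state[i] != 1:
                continue
            # An infectious agent meets a few random agents each day.
            for _ in range(contacts_per_day):
                j = rng.randrange(n_agents)
                if state[j] == 0 and rng.random() < contact_prob[i]:
                    newly_infected.append(j)
        # Infectious agents move one day closer to recovery.
        for i in range(n_agents):
            if state[i] == 1:
                days_left[i] -= 1
                if days_left[i] == 0:
                    state[i] = 2
        # Today's new infections become infectious tomorrow.
        for j in newly_infected:
            if state[j] == 0:
                state[j] = 1
                days_left[j] = infectious_days
        history.append(sum(1 for s in state if s == 1))
    return history


def peaks_for_seeds(seeds):
    """Peak number of infectious agents for each seed, same parameters throughout."""
    return [max(run_toy_epidemic(seed)) for seed in seeds]


# Seven runs, identical parameters, different random series.
for seed, peak in zip(range(7), peaks_for_seeds(range(7))):
    print(f"seed {seed}: peak infectious = {peak}")
```

Running it with the same seed twice gives the same curve; changing only the seed is enough to move the peak around, which is the whole point of the quiz.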

Batch Runs

A lot can be learned just noodling around with an Agent-Based Model. In fact, that's usually where we make the most interesting discoveries. That's the "interactive" side of ABM and it can be really compelling, even addictive. But the "batch" side of the model exploration process is really the most important. Without systematically studying the behavior of the model, you really can't have any confidence that your model results are anything more than flukes or artifacts. A modeling artifact is some phenomenon that seems to be a feature of the domain system we're studying but turns out to be a consequence of the way we've chosen to implement the model. I can't think of a really good example right now, but suffice it to say that they usually involve the coolest, most interesting, mind-blowing behavior. We'll be looking at a model and have a eureka moment ("I'm going to get an article in Nature!") only to discover that the behavior only exists under a particular set of conditions ("Whoops, never mind...").

More often though, we'll observe differences between different model parameter settings based on a few model runs, but when we do more runs those differences turn out not to be that significant. To examine our models in depth we run formal experiments just as one might do in a physics or biology lab. This is something that the social sciences have never been able to do before, and a real benefit of ABM. You certainly couldn't give a bunch of people a deadly virus, restrict their movement and prevent contact with the outside world for weeks on end! Our tests involve multiple runs of our models under different parameters, spatial structures, behavior variations and so on. Typically, we'll do parameter sweeps, which involve taking a few model parameters, changing them systematically and running the model many times for each unique set of parameters. We'll go through a few of these to give you a sense for how it works. Have your snooze alarm handy?

Maximum Health Care Burden

So given the rather wild differences between the model runs above, how can we know that the results from our interactive runs weren't simply a fluke? Again, we need to do a boatload of runs and test our hypothesis against different conditions. Ah, basic science -- and so very different from the way that social science is typically done! The first thing we'll do is look at the effect of our interventions on the maximum number of people infected symptomatically at any given time. Even if we can't keep people from getting sick, perhaps we can do something to lower the maximum burden on the system. So first, we'll vary the probability of movement from 0.0 to 1.0 in increments of 0.05, with 21 runs for each probability setting. In this chart, we'll see the average maximum value (got that?) in dark red, and the minimum and maximum ranges for the maximum infection rate in light red.

Well, it's pretty clear from that chart that lowering movement generally reduces the maximum level of health care resources we'll need. But the key word here is "generally". From run to run it would be somewhat questionable to make a policy choice between, say, a movement restriction of 0.4 and one of 0.8; the ranges have very significant overlap. Do we really want to plan for, say, 100 patients, given that a completely random set of circumstances could cause those numbers to rise substantially higher? Next, let's look at the effect of contact transmission probability. We've seen that this also seems to make a difference.

Here at least the variation in ranges seems a bit more predictable. (Part of that might be due to the fact that we are using a constant rate of transmission probability for everyone, but that's too much to go into now. At some point we really need to be comparing movement rate and the heterogeneity of contact transmission probability. Starting to sound like a paper title here -- a very boring paper title.)
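For readers who want to see the mechanics, the movement sweep described above (0.0 to 1.0 in steps of 0.05, 21 runs per setting, then the mean, minimum and maximum of each run's peak) can be sketched as follows. This is only a sketch under stated assumptions: the toy model inside peak_infectious is a hypothetical stand-in for the real model, and its population size, contact counts and infectious period are invented for illustration.

```python
import random
import statistics


def peak_infectious(move_prob, seed, contact_prob=0.12, n_agents=200,
                    days=80, contacts_per_move=4, infectious_days=5):
    """Peak number of simultaneously infectious agents in one toy run.

    Hypothetical stand-in model: an infectious agent moves on a given day
    with probability move_prob and, if it moves, meets a few random agents.
    """
    rng = random.Random(seed)
    remaining = [0] * n_agents        # infectious days left; 0 = not infectious
    susceptible = [True] * n_agents
    for i in rng.sample(range(n_agents), 3):
        susceptible[i] = False
        remaining[i] = infectious_days
    peak = 3
    for _ in range(days):
        infectious = [i for i in range(n_agents) if remaining[i] > 0]
        if not infectious:
            break
        peak = max(peak, len(infectious))
        newly = set()
        for i in infectious:
            if rng.random() < move_prob:
                for _ in range(contacts_per_move):
                    j = rng.randrange(n_agents)
                    if susceptible[j] and rng.random() < contact_prob:
                        newly.add(j)
            remaining[i] -= 1
        for j in newly:
            susceptible[j] = False
            remaining[j] = infectious_days
    return peak


def sweep_movement(move_probs, runs_per_setting):
    """For each movement probability, return (prob, mean, min, max) of the peak."""
    rows = []
    for k, p in enumerate(move_probs):
        # A distinct seed per (setting, run) pair keeps the sweep repeatable.
        peaks = [peak_infectious(p, seed=1000 * k + r)
                 for r in range(runs_per_setting)]
        rows.append((p, statistics.mean(peaks), min(peaks), max(peaks)))
    return rows


move_probs = [round(i * 0.05, 2) for i in range(21)]   # 0.0, 0.05, ..., 1.0
for p, mean_peak, lo, hi in sweep_movement(move_probs, runs_per_setting=21):
    print(f"move={p:.2f}  mean peak={mean_peak:6.1f}  range=[{lo}, {hi}]")
```

The mean column corresponds to the dark red line in the chart and the range column to the light red band. Comparing the bands for two settings, rather than the means alone, is what tells you whether a policy difference is distinguishable from run-to-run noise.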

Overall Health Effects

As a final example let's look at a somewhat different policy goal. What can we do to control the overall number of infections? Suppose we have two policy levers, one to control movement and the other to control the probability of transmission. Which one to pull?

Just as before, the answer is both. Imagine that there is some kind of increasing cost for either intervention: It's super easy to get people to not go to football games, but really really hard to keep them from visiting their close friends and family. Similarly, though it would be nice to be able to lower transmission as much as possible, you can get people to wash their hands after using the bathroom, but not to do a surgical scrub every time they shake someone else's hand. So there has to be a sweet spot in there somewhere; a place in policy space where a given set of interventions combine to do the most good. This is exactly the kind of balancing act that makes risk management so challenging and the ABM approach so potentially useful.
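One way to go hunting for that sweet spot is a two-parameter sweep combined with a cost for each intervention. The sketch below is purely illustrative: the toy model is a hypothetical stand-in for the real one, and the cost function (quadratic in how far each lever is pulled, with an arbitrary weight) is an assumption invented here to mimic "increasing cost", not something measured or taken from the model.

```python
import itertools
import random
import statistics


def total_infected(move_prob, contact_prob, seed, n_agents=200, days=80,
                   contacts_per_move=4, infectious_days=5):
    """Total number of agents ever infected in one toy run (hypothetical model)."""
    rng = random.Random(seed)
    remaining = [0] * n_agents        # infectious days left; 0 = not infectious
    susceptible = [True] * n_agents
    for i in rng.sample(range(n_agents), 3):
        susceptible[i] = False
        remaining[i] = infectious_days
    for _ in range(days):
        infectious = [i for i in range(n_agents) if remaining[i] > 0]
        if not infectious:
            break
        newly = set()
        for i in infectious:
            if rng.random() < move_prob:
                for _ in range(contacts_per_move):
                    j = rng.randrange(n_agents)
                    if susceptible[j] and rng.random() < contact_prob:
                        newly.add(j)
            remaining[i] -= 1
        for j in newly:
            susceptible[j] = False
            remaining[j] = infectious_days
    return susceptible.count(False)


def intervention_cost(move_prob, contact_prob, base_contact=0.16, weight=150.0):
    """Made-up cost, in 'infection equivalents', that grows faster the harder
    each lever is pulled. Both the shape and the weight are assumptions."""
    movement_cut = 1.0 - move_prob
    transmission_cut = 1.0 - contact_prob / base_contact
    return weight * (movement_cut ** 2 + transmission_cut ** 2)


def score_grid(move_probs, contact_probs, runs=10):
    """Mean infections plus intervention cost for every lever combination."""
    scores = {}
    for m, c in itertools.product(move_probs, contact_probs):
        infections = [total_infected(m, c, seed=r) for r in range(runs)]
        scores[(m, c)] = statistics.mean(infections) + intervention_cost(m, c)
    return scores


scores = score_grid([0.2, 0.4, 0.6, 0.8, 1.0], [0.04, 0.08, 0.12, 0.16])
best = min(scores, key=scores.get)
print(f"lowest combined burden at move={best[0]}, contact={best[1]}: "
      f"{scores[best]:.1f}")
```

Where the minimum lands depends entirely on the assumed cost weights, so the useful output is less the single best cell than the shape of the whole grid.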


I hope all of the charts aren't a major bore after the snazzy QuickTime movies in the last article. Yes, there is not as much eye candy in this mode of modeling, and it can involve its share of drudgery, but doing rigorous parameter exploration always seems to turn up new areas for exploration and provides a healthy dose of negative results. This then gives us the confidence to turn potential insights into real recommendations. You can now run a live version of the model and do your own interactive experiments. And the actual model is at http://metaabm.org/downloads/models/Epidemic.metaabm -- so if you're feeling ambitious you can run your own batch experiments as well.
