
Forecasting the Number of Cases of Ebola Virus from the 2014 Outbreak in West Africa During the Next 12 Weeks

on Thu, 09/04/2014 - 05:36

For the last five months I have been following the evolution of the Ebola virus disease outbreak in West Africa. As of August 26th the cumulative count stood at 3,069 cases, including 1,552 deaths, for a case fatality rate of 51%, making this the largest Ebola outbreak ever recorded. A previous post illustrated the chronology of Ebola virus outbreaks from 1976 to 2014, and an interactive tool for tracking this outbreak was also published.

Last week, on August 28th, 2014, the World Health Organization (WHO) issued the Ebola Response Roadmap to guide and coordinate the international response to the outbreak of Ebola virus disease (EVD) in West Africa. The roadmap assumes that in many areas of intense transmission the actual number of cases may be 2-4 fold higher than currently reported, and it acknowledges that the aggregate caseload of EVD could exceed 20,000 over the next nine months.

The WHO estimates mentioned above, together with the results of researchers and research teams modeling the EVD outbreak, which project a rapidly rising number of cases, inspired me to apply mathematical modeling and forecasting to predict the cumulative number of EVD cases over the coming weeks.

Using the cumulative total numbers of clinical cases (confirmed, probable and suspected) reported by WHO, I applied the forecasting capability of Tableau Software to predict the total number of EVD cases during the next 12 weeks (from September 7th to November 23rd, 2014).

The interactive data visualization below shows the actual trend lines of cumulative EVD cases from March 16 to August 26, 2014, and the estimated number of cases with 95% confidence intervals (CI) from September 7 to November 23, 2014 (12 weeks). Hovering the mouse over the actual and projected (estimated) trend lines shows the number of cases per week.

Main findings

Based on the forecasting model, and assuming the current conditions of the Ebola virus outbreak in West Africa (diagnosis, treatment, interventions and reporting) remain as they have been, a rapid, exponential increase in the number of cases and deaths is expected.

The cumulative number of cases is estimated to reach more than 8,000 (95% CI: 6,466-9,954) for the three countries (Guinea, Liberia and Sierra Leone) in the next 12 weeks.

Liberia has contributed strongly to the high number of cases in the outbreak in recent weeks. Liberia is estimated to reach more than 4,000 cases in the next 12 weeks, practically half of the total estimated for the whole outbreak (three countries together). Liberia and Sierra Leone have the highest average rates of cases per week.

Urgent actions and interventions should be taken to control the outbreak and avoid more cases and deaths.

I hope this helps to raise awareness of the current and near future situation of the 2014 Ebola virus outbreak in West Africa.

Your comments about this post are welcome.



P.S. I've been working on refining the predictive model for a better fit and updating the data with the most current figures. I recommend that all readers take a look at the new data visualization, which includes the new estimates. Thanks to all who have come to read my post, and especially to those who have given their thoughts and feedback in the comment section.


Nancy Abramson's picture

Thanks again for showing how rapidly this disease is increasing. Hopefully this will help illustrate why the increases are of international concern.

Spencer's picture

Hi there,

I'm curious to hear what methods you used to make these forecasts. "Tableau Software" is not something I'm really familiar with. At the very least, are statistical forecasting models (e.g. ARIMA models) or mechanistic ones (e.g. state space models) being used?

Thank you,

martiner's picture

Hi Spencer,

Thanks for your comment and question.

Tableau Software implements exponential smoothing models that include two basic components of time series analysis: trend and seasonality.

Exponential smoothing models iteratively forecast future values of a time series from weighted averages of its past values. The simplest model, simple exponential smoothing, computes the next level (or smoothed value) from a weighted average of the last actual value and the last level value. The method is exponential because the value of each level is influenced by every preceding actual value to an exponentially decreasing degree. More recent values are given greater weight.
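The update rule described above can be written in a few lines. This is a minimal sketch of textbook simple exponential smoothing, not Tableau's actual implementation; the function names, the choice of the first observation as the starting level, and the sample values are all assumptions made for illustration.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation in `series`.

    `alpha` (0 < alpha <= 1) is the smoothing weight: higher values make
    the level follow the most recent observations more closely.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if len(series) == 0:
        raise ValueError("series must not be empty")
    level = series[0]  # assumption: start the level at the first actual value
    levels = [level]
    for actual in series[1:]:
        # New level = weighted average of the last actual value and the last level.
        level = alpha * actual + (1 - alpha) * level
        levels.append(level)
    return levels


def observation_weights(alpha, n):
    """Weights applied to the n most recent observations, newest first."""
    # Each step back in time multiplies the weight by (1 - alpha),
    # which is the "exponentially decreasing" influence.
    return [alpha * (1 - alpha) ** k for k in range(n)]


# Hypothetical values, for illustration only (not outbreak data).
print(simple_exponential_smoothing([10, 20, 30], alpha=0.5))
print(observation_weights(alpha=0.5, n=3))
```

Simple exponential smoothing on its own produces a flat forecast equal to the last level, so a steadily rising series such as cumulative case counts also needs the trend component mentioned above.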


NICURN's picture

Hi. Starting about mid-May, known cases have been doubling on average about every 29 days (and the pace seems to be speeding up slightly). So my question is: if there are 3707 cases by Aug 31, why wouldn't you forecast 7400 cases by Sept 29, 14,800 by Oct 26 and close to 29,600 by Nov 24? It seems like cases will continue to grow at the current rate or faster unless the modeling assumes a positive response to some future public health measures. Maybe the graph skews low because it uses outdated numbers to start with, as there are currently more cases on Aug 31 than the modeling projects for Sept 7. Thanks in advance for your answer!
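The doubling arithmetic in this comment can be reproduced with a short sketch. It assumes a constant doubling time of exactly 29 days and the 3707 starting count quoted above; the function name is hypothetical, and the resulting dates land a day or two away from the rounded ones in the comment.

```python
from datetime import date, timedelta


def project_by_doubling(start_cases, start_date, doubling_days, periods):
    """Project cumulative cases assuming they double every `doubling_days` days."""
    projections = []
    for k in range(1, periods + 1):
        when = start_date + timedelta(days=doubling_days * k)
        cases = start_cases * 2 ** k  # k doublings after the start date
        projections.append((when, cases))
    return projections


# Starting point and doubling time taken from the comment above.
for when, cases in project_by_doubling(3707, date(2014, 8, 31), 29, 3):
    print(when.isoformat(), cases)
```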

martiner's picture


Thanks for your multiple comments. I followed a different approach, trying to apply a more robust time series analysis method. This method also gives us the uncertainty (95% confidence intervals).

The result is not perfect, but at least it takes into account the pattern of the time series, giving more weight to the most recent values of the series.

I also fitted an exponential regression model, which fitted very well and predicted a higher number of cases. We probably need to explore this further and publish those results.
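For readers wondering what an exponential regression looks like in practice, here is one common way to do it: ordinary least squares on the logarithm of the counts. This is a sketch under that assumption, not the model fitted for this post; the function names and the synthetic series are invented for illustration.

```python
import math


def fit_exponential(times, values):
    """Fit values ~ a * exp(b * t) by least squares on log(values).

    Returns (a, b). All values must be positive.
    """
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    b = sxy / sxx                       # growth rate per unit of time
    a = math.exp(mean_y - b * mean_t)   # fitted value at t = 0
    return a, b


def doubling_time(b):
    """Time needed for the fitted curve to double, in the units of t."""
    return math.log(2) / b


# Synthetic series that doubles every time step (illustration only).
weeks = [0, 1, 2, 3, 4]
cases = [100, 200, 400, 800, 1600]
a, b = fit_exponential(weeks, cases)
print(a, b, doubling_time(b))
```

Fitting on the log scale weights relative errors equally across the series, so early small counts influence the fit as much as recent large ones. That is one reason its forecasts can differ from those of a smoothing model that favors recent values.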

NICURN's picture

The previous comment that I posted relies on numbers from the Virology down under blog at He in turn took them from . However, when I checked the WHO numbers they were slightly lower than the ones at VDU; the Sept 4 DON gives 3685 cases instead of 3707 through August 31. It's probably moot, however, as these KNOWN cases are "vastly underestimated" to begin with!

NICURN's picture

I just realized that the number 3707 includes Nigeria and Senegal, whereas the smaller 3685 I used is just for Liberia, Guinea and Sierra Leone. So the original post stands. However, the fact that your model leaves out Nigeria and Senegal could also result in a lower prediction.

martiner's picture

You are right. I'm now adding data from the WHO report published today. I'll share the results and try to give more details about the model.

Thanks again for your interest and constructive comments.




BioBob's picture

Two points:
It would seem to make more sense to use the 25 to 50% undercount estimate of reported cases rather than some "standard" CI based on such unreliable numbers. So you could employ the WHO numbers as the min, the 2x as the average and the 4x as the max (or something).

Any model should only use the known week-over-week increase of the LAST known week, especially for Liberia, since it is obvious that this outbreak is completely out of control there and in the urban populations of Monrovia and Freetown. Any consideration of many prior weeks' data is clearly not warranted, as the rate of growth has been increasing exponentially, as has the number of infections.

For example, last week's cases roughly doubled in Freetown for the first time, indicating some organic loss in control capability (e.g. nurses' strike, not enough contact tracing, health workers not showing up for work, etc.).

martiner's picture

Hi BioBob,

Thanks for your comment and recommendation of an alternative calculation method that accounts for the under-registration of cases. Your approach sounds interesting, so it is worth a try.
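As a rough illustration of BioBob's first point, the sketch below turns a reported count into a minimum, middle and maximum scenario using the 1x, 2x and 4x multipliers from his comment. The function name and the dictionary keys are assumptions; the 3685 figure is the reported count for the three countries through August 31 quoted earlier in this thread.

```python
def undercount_scenarios(reported_cases, mid_factor=2, max_factor=4):
    """Scale a reported count into min / mid / max scenario estimates.

    The reported count is treated as the minimum; the default factors
    follow the 2-4 fold undercount range discussed in the comments.
    """
    return {
        "min": reported_cases,
        "mid": reported_cases * mid_factor,
        "max": reported_cases * max_factor,
    }


# Reported cases for Guinea, Liberia and Sierra Leone through August 31.
print(undercount_scenarios(3685))
```

The same scaling could be applied to each forecast week to draw a scenario band alongside the statistical confidence interval.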



BioBob's picture

BTW, THANKS for updating !!!

I very much like these plots and find them interesting & useful.

1) Why not show both your normal (e.g. "....") and 25%-50% (e.g. "----") confidence-interval forecast cones on the ESTIMATE portions of the above graph, like they do for a hurricane track?

I suspect it would be VERY interesting indeed.

2) Also, using the same scale for all plots might better indicate the differences among countries which is currently confounded by the differing scales. Perhaps use current graphs with CI added and then individual country plots on SAME 4th plot without the CI for direct comparison ?

3) Being able to click to enlarge or have a link to a larger version of the plots would be very useful.

martiner's picture

Thanks BioBob for your great suggestions. 

I will try your suggestions and share any results from further iterations.


Blair's picture

WHO or Wikipedia are not reliably reporting the actual count "as of" the specified date.

The Aug 26th count you have is from Aug 24th. The Aug 31st count is accurate, but the Sept 5th count released on Sept 5th is inaccurate, and is from around Sept 3rd. For your convenience, the Sept 6th count is 4354.

These numbers can be found from the primary source distributed by the Health Ministries, and are often uploaded to and


martiner's picture

Dear Blair,

Thanks for your comments and observations. I'm using as my source the official data reported by the World Health Organization (WHO), which are themselves based on data reported by the Health Ministries.

The data set used in this analysis includes reported data from March 22nd to August 31st, which were prorated by calendar week for the purposes of time series analysis.
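One possible way to prorate irregularly dated reports onto calendar weeks is linear interpolation of the cumulative counts between the two nearest report dates. The sketch below assumes that reading; it is not necessarily the procedure used for this analysis, and the function name and sample figures are hypothetical.

```python
from datetime import date


def interpolate_cumulative(reports, target):
    """Linearly interpolate a cumulative count at `target`.

    `reports` is a list of (date, cumulative_cases) sorted by date.
    Raises ValueError if `target` falls outside the reported range.
    """
    for (d0, c0), (d1, c1) in zip(reports, reports[1:]):
        if d0 <= target <= d1:
            # Fraction of the interval between the two reports that has elapsed.
            fraction = (target - d0).days / (d1 - d0).days
            return c0 + fraction * (c1 - c0)
    raise ValueError("target date is outside the reported range")


# Hypothetical report dates and counts, for illustration only.
reports = [(date(2014, 8, 20), 100), (date(2014, 8, 30), 200)]
print(interpolate_cumulative(reports, date(2014, 8, 25)))
```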

Data for the weeks from September 7 to November 23rd are estimates, or predicted values, based on a time series model. This model was fitted to the actual (reported) data mentioned in the previous paragraph.

I'm aware of the most recent Ebola virus disease outbreak data, published by WHO last Friday, September 5th, in the Ebola Response Roadmap Situation Report 2.

More results from the application of predictive analysis, using more recent data, will be published later.

Again, thank you so much for visiting Health Intelligence, taking the time to read this post, and providing your input and thoughts.



Steph's picture

shows a caseload 20 % above your estimate. Anyway WHO figures are notoriously unreliable both in terms of volume and reporting time so the trend is what matters here and it's always good to have a real time trend reporting. Thanks for that

martiner's picture

Thanks Steph for your comment.

I'm refining the predictive model and updating my dataset based on the new counts from WHO. I'll post the results soon.

Aaron saxton's picture

About 2.5 weeks ago I used a forecast model that has panned out exceptionally well, which shows active cases should hit 4000 by 23 September, with an increase in cases of 500% every 3 weeks thereafter.

At the time of building the model it forecast 100 new cases per day by 11 September and 150 per day by 19 September.

The model shows once 4000 active cases are reached if there are not 20,000 active on-ground medical personnel in addition to logistical then the outbreak reaches a point where containment is fairly futile unless drastic actions are taken.

We will see how it unfolds. Based on the model, I currently predict 1100 cases that are off the radar in addition to the current reported cases.
