In this paper, we develop and compare two models for forecasting the 2020 U.S. presidential election: multiple linear regression (MLR) and the machine learning method of extreme gradient boosting (XGBoost). We predict each state’s Republican vote share from seven continuous predictors measured from 1976 to 2016, together with a dummy column for each state. After computing a 95% confidence interval for each prediction, we determine the candidates’ electoral college probabilities. XGBoost appears to be a very strong predictor, accounting for 98.6% of the variance with a root mean square error (RMSE) of 3.34%, whereas the MLR accounts for only 71.8% of the variance and leaves an RMSE of 6.35%. We observe that 1) both models predict a Democratic electoral college landslide in the 2020 election, 2) Georgia, Iowa, Florida, North Carolina, and Ohio are crucial for a Republican win, and 3) extreme gradient boosting is an attractive alternative to MLR in election forecasting.
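The abstract does not spell out how the 95% confidence intervals are turned into electoral college probabilities. The sketch below shows one plausible route. It is not necessarily the authors' procedure, and it rests on three assumptions: each state's prediction error is roughly normal, a Republican share above 50% wins the state, and states are independent of one another. The function names, the 50% threshold, and the toy numbers are all hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal (about 1.96).
Z_95 = NormalDist().inv_cdf(0.975)


def state_win_probability(predicted_share, ci_low, ci_high):
    """P(Republican share > 50%) under an assumed normal approximation.

    The standard deviation is backed out of the 95% interval width,
    so the interval must have positive width.
    """
    sd = (ci_high - ci_low) / (2 * Z_95)
    return 1.0 - NormalDist(mu=predicted_share, sigma=sd).cdf(50.0)


def electoral_college_probability(states, needed=270, n_sims=20000, seed=0):
    """Monte Carlo estimate of P(Republican electoral votes >= needed).

    `states` is a list of (predicted_share, ci_low, ci_high, electoral_votes).
    States are simulated independently, which is a simplifying assumption:
    real state-level errors are usually correlated.
    """
    rng = random.Random(seed)
    probs = [(state_win_probability(s, lo, hi), ev) for s, lo, hi, ev in states]
    wins = 0
    for _ in range(n_sims):
        # Award a state's electoral votes when a uniform draw falls below
        # its win probability, then total the votes for this simulation.
        total = sum(ev for p, ev in probs if rng.random() < p)
        if total >= needed:
            wins += 1
    return wins / n_sims


# Hypothetical three-state example (made-up numbers, 35 electoral votes total).
toy_states = [
    (52.0, 46.0, 58.0, 10),
    (48.0, 42.0, 54.0, 20),
    (50.0, 44.0, 56.0, 5),
]
toy_probability = electoral_college_probability(toy_states, needed=18)
```

Under these assumptions, a state whose interval is centered exactly on 50% is a coin flip, and a narrower interval pushes the win probability toward 0 or 1. Treating states as independent tends to understate the chance of a sweep in either direction, so a fuller treatment would draw correlated errors.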
This work is licensed under a Creative Commons Attribution 4.0 International License.