
New York Times Data Team Using R & "Leo" Model for Senate Election Forecast

POSTED IN: Data Analytics & Visualization Blog

Overview (from R-Bloggers)

Nate Silver's departure to relaunch FiveThirtyEight.com left a bit of a hole at the New York Times, which The Upshot (the new data journalism practice at the Times) seeks to fill. And they've gotten off to a great start with the new Senate forecasting model, called Leo. Leo was created by Amanda Cox (longtime graphics editor at the NYT) and Josh Katz (creator of the Dialect Quiz), and uses a poll-aggregation methodology similar to that used by Silver. The model itself is implemented in the R language, and the R code is available for inspection on GitHub.

What is shown here

This is basically my newbie journey in accessing the New York Times' Git-based code. I added a couple of lines to visualize the data, and also pulled various links into one place below.

Links for Context

Get Source Code & Data:


Brief instructions

This model, created by Amanda Cox and Josh Katz, combines polls with other information to predict how many Senate races Democrats and Republicans will win this year. To run it:

  • Please make sure the following R packages have been installed: gam, gtools, lubridate, maps, RJSONIO, gdata, plotrix, zoo

  • Change directory to the top-level working directory of this Git repository.

  • Run: Rscript master-public.R

  • Prediction output can then be found in the data-publisher/public/_big_assets/ subdirectory.
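The package check in the first step can be sketched in R along these lines. This is only a rough sketch: the helper name `missing_packages` is my own invention and is not part of the Leo repository, and the install call is left commented out so nothing gets installed by accident.

```r
# Packages listed in the instructions above.
required <- c("gam", "gtools", "lubridate", "maps",
              "RJSONIO", "gdata", "plotrix", "zoo")

# Hypothetical helper (not part of the Leo repository): returns the
# entries of `required` that do not appear in `installed`.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # Uncomment to install them from CRAN:
  # install.packages(to_install)
} else {
  message("All required packages are installed.")
}
```

Running this before `Rscript master-public.R` should make a missing package show up as a clear message rather than as an error partway through the model run.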

Basic Code (after you download the Git repository as a ZIP and unpack it):

# Adjust this path to wherever you unzipped the repository
setwd("C:/Users/Home/Documents/DTL Data Viz Community/leo-senate-model2/")

workingDir  <-  getwd()
dataDir     <-  paste(workingDir, "data-publisher/", sep = "/")
modelDir    <-  paste(workingDir, "model", sep = "/")
fundyDir    <-  paste(workingDir, "fundamentals", sep = "/")

### Run the model (make sure all the packages above are installed)
n.days <- 30         # number of days to simulate; set to "all" to run all days
just.today <- TRUE   # if TRUE, overrides n.days
n.sims <- 50000      # number of simulations

if (just.today) source("combine-data.R")

## OK, let's fire this into a quick viz
setwd("C:/Users/Home/Documents/DTL Data Viz Community/leo-senate-model2/data-publisher/public/_big_assets/")
# Named senate_hist so it does not mask base R's hist() function
senate_hist <- read.delim("histogram.tsv", header = TRUE, sep = "\t")
barplot(senate_hist$chance, names.arg = senate_hist$dem, border = NA, main = "April 27 Prediction - US Senate result", las = 2)
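Beyond the bar chart, the same two columns could be boiled down to a single number. The sketch below assumes that `dem` holds a Democratic seat count and `chance` holds the likelihood of that count, which is how I read histogram.tsv but is not something the repository documents here. The helper `chance_at_least` and the toy numbers are made up for illustration and are not model output.

```r
# Hypothetical helper: share of the total `chance` that falls at or
# above a given seat count. Dividing by the total means it works
# whether `chance` is stored as a fraction or as a percentage.
chance_at_least <- function(df, seats) {
  sum(df$chance[df$dem >= seats]) / sum(df$chance)
}

# Made-up numbers for illustration only, not model output.
toy <- data.frame(dem = 48:52,
                  chance = c(0.10, 0.20, 0.30, 0.25, 0.15))

chance_at_least(toy, 50)
```

To try it on the real output, pass the data frame read from histogram.tsv in place of `toy`.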



About the Author

Ryan Anderson

Hi! I like to play with data, analytics and hack around with robots and gadgets in my garage. Lately I've been learning about machine learning.

About this blog

Data Analytics & Visualization Blog - Generating insights from Data since 2013

Created: July 25, 2014

