Source can be viewed on GitHub
Begin by loading the data used for the visualizations in part 1.
#### Run Rscript0 to load useful packages and functions ----
source("Rscript0 - libraries.R")
source("f.Wrangle.R")
#### Model data cleanup ----
source("Rscript1a - modeling data appends.R")
Recall:
The existing model (“JMod”) has two components. First, the probability a solicitation \(S_{i}\) comes in before the end of the fiscal year is a function of the current stage progress \(c\) and time until the end of the year \(t\), \(P(S_{i}=1)=f(c_{i},t)\). The expected FRP in the current fiscal year is this probability times the expected amount, \(E(FRP)=P(S_{i}=1)E(S_{i})\).
Focus on the probability model. I assume that there should also be some effect due to the average close rate, season (e.g. people give more in December), planned ask amount, actual ask amount, and potentially interactions. Something like this:
\[P(S_{ij}=1)=f(c_{i},t,\mu,t_{i}^{*},a_{j}^{*},a_{j})\]
Start by trying to predict the booked-in-FY probabilities directly. The simplest approach is to treat everything that isn’t continuous as a factor and chuck it into a logistic regression model.
\[\operatorname{logit}(p)=\log\Big(\frac{p}{1-p}\Big)=\eta, \qquad E(\eta_{i})=c_{i}+t_{i}^{*}\]
In other words, the logit of the expected close rate depends on a stage-specific intercept \(c_{i}\) and a month effect \(t_{i}^{*}\).
# Load some functions I wrote to calculate model diagnostics
source("f.Diagnostics.R")
## Model each stage outcome by the month the solicitation was planned
glm.pln.1 <- glm(FY.Plan.Book ~ as.factor(month(mdat$Planning.Dt)), data=mdat, family=binomial())
glm.clr.1 <- glm(FY.Clear.Book ~ as.factor(month(mdat$Planning.Dt)), data=mdat, family=binomial())
glm.ask.1 <- glm(FY.Ask.Book ~ as.factor(month(mdat$Planning.Dt)), data=mdat, family=binomial())
glm.ora.1 <- glm(FY.Oral.Book ~ as.factor(month(mdat$Planning.Dt)), data=mdat, family=binomial())
modnames <- c("Plan","Clear","Ask","Oral")
# Compare each of the models
kable(CompareDeviances(models=list(glm.pln.1, glm.clr.1, glm.ask.1, glm.ora.1), modnames=modnames, debug=F), digits=3)
Stage | null.dev | dev | diff | rel.diff | df.p | df.q | p
---|---|---|---|---|---|---|---
Plan | 1792.257 | 1609.836 | 182.421 | 0.102 | 1308 | 1297 | 0.000
Clear | 1647.203 | 1591.628 | 55.576 | 0.034 | 1192 | 1181 | 0.000
Ask | 1551.004 | 1511.474 | 39.529 | 0.025 | 1136 | 1125 | 0.000
Oral | 297.946 | 276.139 | 21.807 | 0.073 | 361 | 350 | 0.026
Month-by-month coefficients result in an unambiguously better model compared to just an intercept per stage.
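As an aside, the deviance comparison amounts to a likelihood-ratio test: the drop in deviance from adding the month factors is referred to a chi-squared distribution with as many degrees of freedom as added parameters. Here is a minimal sketch using only base R and the fitted glm objects; CompareDeviances() presumably computes something along these lines, and the DevianceTest() name is mine.
# Likelihood-ratio test for a single fitted glm (sketch; not the CompareDeviances() implementation)
DevianceTest <- function(fit) {
  diff <- fit$null.deviance - fit$deviance          # deviance explained by the month factors
  df   <- fit$df.null - fit$df.residual             # number of added parameters
  c(null.dev = fit$null.deviance, dev = fit$deviance, diff = diff,
    rel.diff = diff / fit$null.deviance,
    p = pchisq(diff, df = df, lower.tail = FALSE))  # chi-squared tail probability
}
round(DevianceTest(glm.pln.1), 3)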
Why four separate models rather than a factor for stage? Note that the goal is to determine whether hitting a certain stage results in a solicitation closing that fiscal year; the target \(Y\) changes from stage to stage, so the model needs to change as well.
JMod has the following structure. Cell entries are the probability of closing, given that a solicitation is in the indicated stage and month and has an expected close date in that fiscal year:
Stage | July-Feb | Mar-Apr | May | June
---|---|---|---|---
Plan | 1/6 | 1/8 | 0 | 0
Clear | 1/3 | 1/6 | 1/8 | 0
Ask | 2/3 | 1/3 | 1/6 | 1/8
Oral | 2/3 | 1/3 | 1/6 | 1/8
Paperwork | 1 | 1 | 1 | 1
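To make the structure concrete, here is a hedged sketch of JMod’s lookup as R code; the encoding and the jmod.frp() name are mine, not the production implementation in f.JMod.R.
# JMod close probabilities as a stage x month-group lookup (illustrative sketch only)
jmod.probs <- rbind(
  Plan      = c(1/6, 1/8, 0,   0),
  Clear     = c(1/3, 1/6, 1/8, 0),
  Ask       = c(2/3, 1/3, 1/6, 1/8),
  Oral      = c(2/3, 1/3, 1/6, 1/8),
  Paperwork = c(1,   1,   1,   1)
)
colnames(jmod.probs) <- c("Jul-Feb", "Mar-Apr", "May", "Jun")
# Map a calendar month (1-12) to the month grouping above
jmod.month.group <- function(mo) {
  ifelse(mo %in% 3:4, "Mar-Apr", ifelse(mo == 5, "May", ifelse(mo == 6, "Jun", "Jul-Feb")))
}
# E(FRP) = P(S=1) * E(S), for a solicitation expected to close this fiscal year
jmod.frp <- function(stage, mo, expected.amt) {
  jmod.probs[stage, jmod.month.group(mo)] * expected.amt
}
jmod.frp("Ask", mo = 3, expected.amt = 50000)  # 1/3 of 50K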
Here are the confusion matrices, with a classification threshold of \(p=.5\):
# Count tables
ConfusionMatrix(list(glm.pln.1, glm.clr.1, glm.ask.1, glm.ora.1), modnames=modnames)
## $Plan
## pred
## truth FALSE TRUE Sum
## FALSE 501 239 740
## TRUE 205 364 569
## Sum 706 603 1309
##
## $Clear
## pred
## truth FALSE TRUE Sum
## FALSE 256 296 552
## TRUE 191 450 641
## Sum 447 746 1193
##
## $Ask
## pred
## truth FALSE TRUE Sum
## FALSE 180 304 484
## TRUE 165 488 653
## Sum 345 792 1137
##
## $Oral
## pred
## truth TRUE Sum
## FALSE 52 52
## TRUE 310 310
## Sum 362 362
# Proportion tables
ConfusionMatrix(list(glm.pln.1, glm.clr.1, glm.ask.1, glm.ora.1), modnames=modnames, counts=F)
## $Plan
## pred
## truth FALSE TRUE Sum
## FALSE 0.38 0.18 0.57
## TRUE 0.16 0.28 0.43
## Sum 0.54 0.46 1.00
##
## $Clear
## pred
## truth FALSE TRUE Sum
## FALSE 0.21 0.25 0.46
## TRUE 0.16 0.38 0.54
## Sum 0.37 0.63 1.00
##
## $Ask
## pred
## truth FALSE TRUE Sum
## FALSE 0.16 0.27 0.43
## TRUE 0.15 0.43 0.57
## Sum 0.30 0.70 1.00
##
## $Oral
## pred
## truth TRUE Sum
## FALSE 0.14 0.14
## TRUE 0.86 0.86
## Sum 1.00 1.00
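For reference, the count tables above are just a cross-tabulation of the observed outcomes against the fitted probabilities thresholded at \(p=.5\); a sketch of the idea, assuming the ConfusionMatrix() helper in f.Diagnostics.R wraps something like this:
# Threshold fitted probabilities at 0.5 and cross-tabulate against the truth (sketch only)
confusion.sketch <- function(fit, threshold = 0.5) {
  truth <- model.frame(fit)[, 1]        # observed booked-in-FY outcome
  pred  <- fitted(fit) >= threshold     # predicted to close this FY?
  addmargins(table(truth = truth, pred = pred))
}
confusion.sketch(glm.pln.1)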
Compare errors for JMod versus the month-intercept logistic model, by stage. My error function is the \(L_{1}\) norm:
\[L_{1}(\epsilon)=\sum_{i=1}^{n}\left|\epsilon_{i}\right| = \sum_{i=1}^{n}\left|\hat{p}_{i}-p_{i}\right|\]
I also included the squared \(L_{2}\) norm (equivalent to \(RSS\) in OLS regression) for comparison, but it seems like an odd metric for range-restricted data.
## Load functions for JMod
source("f.JMod.R")
## JMod glm
## L1 525.5833 557.8019
## L2 427.2917 278.9010
For the Plan stage, JMod wins on the sum of absolute residuals, but it is very negatively biased, to the tune of:
## [1] -0.3449198
## JMod glm
## L1 545.8333 566.2442
## L2 367.6667 283.1221
For the Clear stage, the sum of absolute residuals is a tiny bit closer, but JMod is still negatively biased, around:
## [1] -0.3174071
## JMod glm
## L1 454.5833 537.3278
## L2 267.2188 268.6639
For the Ask stage, a clear win for JMod; the bias is only:
## [1] -0.1302404
## JMod glm
## L1 157.12500 83.46150
## L2 92.75174 41.73075
For the Oral stage, JMod’s bias is:
## [1] -0.3942219
Basically, this comes down to how concerned we are about bias. The negative bias is due to overaggressive discounting as the end of the fiscal year (June 30) approaches.
One more idea for model comparison: why not look at the classification tables directly? Since JMod has hard cutoffs it doesn’t really make sense to use a threshold. Instead, add the predicted probabilities to the corresponding cells of a confusion matrix for each observation.
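A hedged sketch of what ConfusionMatrixWeighted() presumably does: each observation contributes its predicted probability to the TRUE column and the complement to the FALSE column of its truth row (the actual implementation is in f.Diagnostics.R).
# Weighted confusion matrix: accumulate predicted probabilities instead of hard classifications
# (sketch of the idea; the version used below is ConfusionMatrixWeighted() from f.Diagnostics.R)
confusion.weighted.sketch <- function(preds, truth) {
  out <- matrix(0, 2, 2, dimnames = list(truth = c("FALSE", "TRUE"), pred = c("FALSE", "TRUE")))
  for (i in seq_along(preds)) {
    row <- if (truth[i]) "TRUE" else "FALSE"
    out[row, "TRUE"]  <- out[row, "TRUE"]  + preds[i]        # probability mass on "closes this FY"
    out[row, "FALSE"] <- out[row, "FALSE"] + (1 - preds[i])  # remaining mass on "doesn't close"
  }
  round(out, 2)
}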
## Confusion matrix summing probabilities, rather than counting classifications
# GLM for plan stage
ConfusionMatrixWeighted(preds=fitted(glm.pln.1), truth=model.frame(glm.pln.1)[,1], dimnames=c("truth", "pred"))
## pred
## truth FALSE TRUE
## FALSE 461.1 278.9
## TRUE 278.9 290.1
# JMod for plan stage
(jmod.1$errtable <- ConfusionMatrixWeighted(preds=jmod.1$prob, truth=mdat$FY.Plan.Book, dimnames=c("truth", "pred")))
## pred
## truth FALSE TRUE
## FALSE 702.96 37.04
## TRUE 488.54 80.46
Note that the antidiagonal terms sum to the corresponding \(L_{1}\) norm from the Plan-stage JMod versus glm error comparison above.
Planned solicitations: while the glm is well balanced, JMod overwhelmingly classifies planned solicitations as unlikely to come in. This could be because JMod is tuned toward directly predicting actual dollar amounts, and uses its weights to discount major gift (MG) solicitations rather than worrying about the close probability of every annual fund (AF) gift.
# GLM for clear stage
ConfusionMatrixWeighted(preds=fitted(glm.clr.1), truth=model.frame(glm.clr.1)[,1], dimnames=c("truth", "pred"))
## pred
## truth FALSE TRUE
## FALSE 268.88 283.12
## TRUE 283.12 357.88
# JMod for clear stage
(jmod.2$errtable <- ConfusionMatrixWeighted(preds=na.omit(jmod.2$prob), truth=na.omit(mdat$FY.Clear.Book), dimnames=c("truth", "pred")))
## pred
## truth FALSE TRUE
## FALSE 468.42 83.58
## TRUE 462.25 178.75
Same story – for cleared solicitations JMod overwhelmingly categorizes them as unlikely to close.
# GLM for ask stage
ConfusionMatrixWeighted(preds=fitted(glm.ask.1), truth=model.frame(glm.ask.1)[,1], dimnames=c("truth", "pred"))
## pred
## truth FALSE TRUE
## FALSE 215.34 268.66
## TRUE 268.66 384.34
# JMod for ask stage
(jmod.3$errtable <- ConfusionMatrixWeighted(preds=na.omit(jmod.3$prob), truth=na.omit(mdat$FY.Ask.Book), dimnames=c("truth", "pred")))
## pred
## truth FALSE TRUE
## FALSE 330.75 153.25
## TRUE 301.33 351.67
Same story for asked solicitations, but look how much better JMod does in the True-True quadrant (bottom right) than it did at the earlier stages.
# GLM for oral stage
ConfusionMatrixWeighted(preds=fitted(glm.ora.1), truth=model.frame(glm.ora.1)[,1], dimnames=c("truth", "pred"))
## pred
## truth FALSE TRUE
## FALSE 10.27 41.73
## TRUE 41.73 268.27
# JMod for oral stage
(jmod.4$errtable <- ConfusionMatrixWeighted(preds=na.omit(jmod.4$prob), truth=na.omit(mdat$FY.Oral.Book), dimnames=c("truth", "pred")))
## pred
## truth FALSE TRUE
## FALSE 44.79 7.21
## TRUE 149.92 160.08
Oral is different; JMod is much too pessimistic. Oral pledges actually behave closer to paperwork in house than to ask, and should mainly be assigned \(p=0\) when scheduled to be closed in a future fiscal year.
Definitely include an indicator for expected date in the future.
From part 1, some interesting variables include the expected amount, and whether the ratio of ask amount to planned ask amount is greater or less than 1.
Potentially interesting derived variables include the number of days left in the year, the number of days until expected close, and the amount of time that’s passed since the previous stage.
There are a few other modeling ideas to consider below; first, let’s create those derived variables.
#### Derived variables ----
source("Rscript1b - derived variables.R")
Think about the tree-based idea first: I’m a fan of classification trees, and that general method is the closest to the existing approach. It would also be interesting to see which of the proposed variables end up being chosen.
Try the random forest approach to select variables. First, split up the variables according to the stages in which they could legitimately be used for prediction.
## Identify which dependent variables contain information specific to the various stages
## Obviously shouldn't include after-the-fact information (FY. objects, actual amount, etc.)
head.all <- names(mderived)[names(mderived) %in% c("Solicitation.Type.Desc", "Planned.Fiscal.Mo", "Planned.Amt", "Expected.Fiscal.Mo", "Expected.Amt", "Planned.Ask.Band")]
# Look for "plan" and "Plan" in header ([pP] = both cases), but ignore the FY. outcome objects
head.plan <- names(mderived)[setdiff(grep("[pP]lan", names(mderived)), grep("FY.", names(mderived)))]
head.plan <- setdiff(head.plan, c(head.all, "plan2clear")) #take out plan2clear which wouldn't have been known
# Look for "Clear" in header
head.clear <- names(mderived)[setdiff(grep("[cC]lear", names(mderived)), grep("FY.", names(mderived)))]
head.clear <- setdiff(head.clear, c(head.all, "clear2ask")) #take out clear2ask which wouldn't have been known
# Look for "Ask" in header
head.ask <- names(mderived)[setdiff(grep("[aA]sk", names(mderived)), grep("FY.", names(mderived)))]
head.ask <- setdiff(head.ask, c(head.all, "ask2oral")) #take out ask2oral which wouldn't have been known
head.ask <- c(head.ask, "plan2clear") #add fields that would have been known
# Look for "Oral" in header
head.oral <- names(mderived)[setdiff(grep("[oO]ral", names(mderived)), grep("FY.", names(mderived)))]
head.oral <- setdiff(head.oral, head.all)
head.oral <- c(head.oral, "plan2clear", "clear2ask", "Ask.Amt", "Ask.Band", "Ask.Amt.Over.Pln.Amt", "Ask.Amt.Over.Pln.Fac") #add fields that would've been known
Now grow a random forest for each stage.
library(randomForest)
set.seed(15584)
# Planned stage forest
treedat <- na.omit(mderived[,c("FY.Plan.Book", head.all, head.plan)])
rf.plan <- randomForest(factor(FY.Plan.Book) ~ ., data=treedat, importance=T)
# Clear stage forest
treedat <- na.omit(mderived[,c("FY.Clear.Book", head.all, head.clear)])
rf.clear <- randomForest(factor(FY.Clear.Book) ~ ., data=treedat, importance=T)
# Ask stage forest
treedat <- na.omit(mderived[,c("FY.Ask.Book", head.all, head.ask)])
rf.ask <- randomForest(factor(FY.Ask.Book) ~ ., data=treedat, importance=T)
# Oral stage forest
treedat <- na.omit(mderived[,c("FY.Oral.Book", head.all, head.oral)])
rf.oral <- randomForest(factor(FY.Oral.Book) ~ ., data=treedat, importance=T)
The green lines in the following plots are the out-of-bag error (similar to cross-validation). Note that PlotForest() is defined in f.Diagnostics.R.
PlotForest(rf.plan, title="Plan")
PlotForest(rf.clear, title="Clear")
PlotForest(rf.ask, title="Ask")
PlotForest(rf.oral, title="Oral")
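PlotForest() produced the figures here; if it isn’t handy, the stock randomForest plots show roughly the same information.
# Stock alternatives to PlotForest(): error rate by number of trees, then variable importance
plot(rf.plan, main = "Plan")
varImpPlot(rf.plan, main = "Plan")
tail(rf.plan$err.rate[, "OOB"], 1)  # final out-of-bag error rate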
Finally, compare misclassification rates, \(\frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\{x_{i}\text{ misclassified}\}\): the GLM errors from the threshold-\(p=0.5\) confusion matrices, the JMod errors from the weighted confusion matrices, and the random forest out-of-bag errors:
Model | Plan | Clear | Ask | Oral
---|---|---|---|---
GLM | 0.34 | 0.41 | 0.42 | 0.14
JMod | 0.40 | 0.46 | 0.40 | 0.43
Random forest | 0.18 | 0.23 | 0.23 | 0.09
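The errors data frame that gets extended by rbind() further down isn’t shown being built; here is a hedged reconstruction of how the three rows above might be assembled (the helper names are mine):
# Hedged reconstruction of the errors table: GLM errors from the p = .5 confusion matrices,
# JMod errors from the weighted confusion matrices, random forest errors from the OOB rate
glm.err  <- function(fit) mean((fitted(fit) >= 0.5) != model.frame(fit)[, 1])
jmod.err <- function(tab) (tab["FALSE", "TRUE"] + tab["TRUE", "FALSE"]) / sum(tab)
oob.err  <- function(rf) tail(rf$err.rate[, "OOB"], 1)
errors <- data.frame(
  Model = c("GLM", "JMod", "Random forest"),
  Plan  = c(glm.err(glm.pln.1), jmod.err(jmod.1$errtable), oob.err(rf.plan)),
  Clear = c(glm.err(glm.clr.1), jmod.err(jmod.2$errtable), oob.err(rf.clear)),
  Ask   = c(glm.err(glm.ask.1), jmod.err(jmod.3$errtable), oob.err(rf.ask)),
  Oral  = c(glm.err(glm.ora.1), jmod.err(jmod.4$errtable), oob.err(rf.oral)),
  stringsAsFactors = FALSE
)
kable(errors, digits = 2)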
Great out-of-bag error rates across the board. The random forests are definitely doing something right. Which variables do they find the most helpful?
* The Stage.Future, stage2EOFY, stage.fiscal.mo, and stage2expect variables show up as a distinctive double-peak pattern per stage. Note that they are all correlated; because of how the forests are constructed (random branching), that means they all show up as fairly important. I’d drop stage2EOFY and consider dropping one of the stage.fiscal.mo or stage2expect variables as well.
* Expected.Amt and Planned.Amt are good predictors before the Oral stage; Planned.Ask.Band is worse across the board.
* Expected.Fiscal.Mo is important across the board, while Planned.Fiscal.Mo is useful before the Oral stage.
* Solicitation.Type.Desc is a decent predictor.
* The time-between-stages variables such as plan2clear are not good predictors.
* Ask.Amt is a good predictor in the Ask stage, certainly much better than Ask.Band; the two Ask.Amt.Over. fractions are useless.

Re-fit the random forests, cutting out the clearly unimportant variables.
## Update the explanatory variables
exclude <- c("plan2EOFY", "clear2EOFY", "ask2EOFY", "oral2EOFY", "plan2clear", "clear2ask", "ask2oral", "Planned.Ask.Band", "Ask.Band", "Ask.Amt.Over.Pln.Amt", "Ask.Amt.Over.Pln.Fac")
head.all <- setdiff(head.all, exclude)
head.plan <- setdiff(head.plan, exclude)
head.clear <- setdiff(head.clear, exclude)
head.ask <- setdiff(head.ask, exclude)
head.oral <- setdiff(head.oral, exclude)
## Regrow the random forests
set.seed(66461)
# Planned stage forest
treedat <- na.omit(mderived[,c("FY.Plan.Book", head.all, head.plan)])
rf.plan <- randomForest(factor(FY.Plan.Book) ~ ., data=treedat, importance=T)
# Clear stage forest
treedat <- na.omit(mderived[,c("FY.Clear.Book", head.all, head.clear)])
rf.clear <- randomForest(factor(FY.Clear.Book) ~ ., data=treedat, importance=T)
# Ask stage forest
treedat <- na.omit(mderived[,c("FY.Ask.Book", head.all, head.ask)])
rf.ask <- randomForest(factor(FY.Ask.Book) ~ ., data=treedat, importance=T)
# Oral stage forest
treedat <- na.omit(mderived[,c("FY.Oral.Book", head.all, head.oral)])
rf.oral <- randomForest(factor(FY.Oral.Book) ~ ., data=treedat, importance=T)
Plotting the new importances:
And the new errors table:
Model | Plan | Clear | Ask | Oral
---|---|---|---|---
GLM | 0.34 | 0.41 | 0.42 | 0.14
JMod | 0.40 | 0.46 | 0.40 | 0.43
Random forest | 0.18 | 0.23 | 0.23 | 0.09
Random forest 2 | 0.17 | 0.22 | 0.23 | 0.09
Looks like I might even be able to get away with taking out some more predictors. What if we exclude the redundant stage2expect variables, plus Expected.Fiscal.Mo and Planned.Fiscal.Mo, which are overwhelmingly December or June?
## Update the explanatory variables
exclude <- c(exclude, "plan2expect", "clear2expect", "ask2expect", "oral2expect", "Expected.Fiscal.Mo", "Planned.Fiscal.Mo")
head.all <- setdiff(head.all, exclude)
head.plan <- setdiff(head.plan, exclude)
head.clear <- setdiff(head.clear, exclude)
head.ask <- setdiff(head.ask, exclude)
head.oral <- setdiff(head.oral, exclude)
## Regrow the random forests
set.seed(48676)
# Planned stage forest
treedat <- na.omit(mderived[,c("FY.Plan.Book", head.all, head.plan)])
rf.plan <- randomForest(factor(FY.Plan.Book) ~ ., data=treedat, importance=T)
# Clear stage forest
treedat <- na.omit(mderived[,c("FY.Clear.Book", head.all, head.clear)])
rf.clear <- randomForest(factor(FY.Clear.Book) ~ ., data=treedat, importance=T)
# Ask stage forest
treedat <- na.omit(mderived[,c("FY.Ask.Book", head.all, head.ask)])
rf.ask <- randomForest(factor(FY.Ask.Book) ~ ., data=treedat, importance=T)
# Oral stage forest
treedat <- na.omit(mderived[,c("FY.Oral.Book", head.all, head.oral)])
rf.oral <- randomForest(factor(FY.Oral.Book) ~ ., data=treedat, importance=T)
## randomForest errors
errors <- rbind(errors, data.frame(Model="Random forest 3", Plan=tail(rf.plan$err.rate[,"OOB"],1), Clear=tail(rf.clear$err.rate[,"OOB"],1), Ask=tail(rf.ask$err.rate[,"OOB"],1), Oral=tail(rf.oral$err.rate[,"OOB"],1), stringsAsFactors=F))
## error table
kable(errors, digits=2)
Model | Plan | Clear | Ask | Oral
---|---|---|---|---
GLM | 0.34 | 0.41 | 0.42 | 0.14
JMod | 0.40 | 0.46 | 0.40 | 0.43
Random forest | 0.18 | 0.23 | 0.23 | 0.09
Random forest 2 | 0.17 | 0.22 | 0.23 | 0.09
Random forest 3 | 0.17 | 0.21 | 0.23 | 0.10
Pretty much the same as the other forest methods, so I prefer this last Random forest 3 as the more parsimonious model. Finally, how do random forests compare to JMod when given the same amount of information?
## Regrow the random forests
set.seed(30635)
# Planned stage forest
treedat <- na.omit(mderived[,c("FY.Plan.Book", "planning.fiscal.mo", "plan2EOFY")])
rf.plan.jm <- randomForest(factor(FY.Plan.Book) ~ ., data=treedat, importance=T)
# Clear stage forest
treedat <- na.omit(mderived[,c("FY.Clear.Book", "clear.fiscal.mo", "clear2EOFY")])
rf.clear.jm <- randomForest(factor(FY.Clear.Book) ~ ., data=treedat, importance=T)
# Ask stage forest
treedat <- na.omit(mderived[,c("FY.Ask.Book", "ask.fiscal.mo", "ask2EOFY")])
rf.ask.jm <- randomForest(factor(FY.Ask.Book) ~ ., data=treedat, importance=T)
# Oral stage forest
treedat <- na.omit(mderived[,c("FY.Oral.Book", "oral.fiscal.mo", "oral2EOFY")])
rf.oral.jm <- randomForest(factor(FY.Oral.Book) ~ ., data=treedat, importance=T)
## randomForest errors
errors <- rbind(errors, data.frame(Model="Random forest, JMod fields only", Plan=tail(rf.plan.jm$err.rate[,"OOB"],1), Clear=tail(rf.clear.jm$err.rate[,"OOB"],1), Ask=tail(rf.ask.jm$err.rate[,"OOB"],1), Oral=tail(rf.oral.jm$err.rate[,"OOB"],1), stringsAsFactors=F))
## error table
kable(errors, digits=2)
Model | Plan | Clear | Ask | Oral
---|---|---|---|---
GLM | 0.34 | 0.41 | 0.42 | 0.14
JMod | 0.40 | 0.46 | 0.40 | 0.43
Random forest | 0.18 | 0.23 | 0.23 | 0.09
Random forest 2 | 0.17 | 0.22 | 0.23 | 0.09
Random forest 3 | 0.17 | 0.21 | 0.23 | 0.10
Random forest, JMod fields only | 0.32 | 0.43 | 0.40 | 0.15
## save to disk for later
write.table(errors, "classification.errors.txt", sep="\t", row.names=F)
Interesting: limiting the random forest to JMod fields (allowing splits only on Stage, stage.fiscal.mo, and stage2EOFY) results in a model that doesn’t do any better than the basic GLM. Random forest 3 is parsimonious, so it looks like we may have a final set of variables.
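For the record, the per-stage variable sets printed below are presumably just the remaining headers:
# Final candidate predictors by stage
c(head.all, head.plan)
c(head.all, head.clear)
c(head.all, head.ask)
c(head.all, head.oral)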
## [1] "Solicitation.Type.Desc" "Planned.Amt"
## [3] "Expected.Amt" "Plan.Future"
## [5] "planning.fiscal.mo"
## [1] "Solicitation.Type.Desc" "Planned.Amt"
## [3] "Expected.Amt" "Clear.Future"
## [5] "clear.fiscal.mo"
## [1] "Solicitation.Type.Desc" "Planned.Amt"
## [3] "Expected.Amt" "Ask.Amt"
## [5] "Ask.Future" "ask.fiscal.mo"
## [1] "Solicitation.Type.Desc" "Planned.Amt"
## [3] "Expected.Amt" "Oral.Future"
## [5] "oral.fiscal.mo" "Ask.Amt"
We see the following are consistently important:

* Solicitation.Type.Desc, Planned.Amt, and Expected.Amt are useful across the board.
* Stage.Future and stage.fiscal.mo for the corresponding stages are useful.
* Ask.Amt is useful in the Ask and Oral stages.

Stage.Future and stage.fiscal.mo are both used by JMod, so this introduces three new predictors useful for assessing a solicitation’s probability of being booked in a given year (plus Ask.Amt for the last two stages).
session_info()
## Session info --------------------------------------------------------------
## setting value
## version R version 3.3.0 (2016-05-03)
## system x86_64, mingw32
## ui RTerm
## language (EN)
## collate English_United States.1252
## tz America/Chicago
## date 2016-05-25
## Packages ------------------------------------------------------------------
## package * version date source
## assertthat 0.1 2013-12-06 CRAN (R 3.3.0)
## colorspace 1.2-6 2015-03-11 CRAN (R 3.3.0)
## DBI 0.4-1 2016-05-08 CRAN (R 3.3.0)
## devtools * 1.11.1 2016-04-21 CRAN (R 3.3.0)
## digest 0.6.9 2016-01-08 CRAN (R 3.3.0)
## dplyr * 0.4.3 2015-09-01 CRAN (R 3.3.0)
## evaluate 0.9 2016-04-29 CRAN (R 3.3.0)
## formatR 1.4 2016-05-09 CRAN (R 3.3.0)
## ggplot2 * 2.1.0 2016-03-01 CRAN (R 3.3.0)
## gridExtra * 2.2.1 2016-02-29 CRAN (R 3.3.0)
## gtable 0.2.0 2016-02-26 CRAN (R 3.3.0)
## highr 0.6 2016-05-09 CRAN (R 3.3.0)
## htmltools 0.3.5 2016-03-21 CRAN (R 3.3.0)
## knitr * 1.13 2016-05-09 CRAN (R 3.3.0)
## labeling 0.3 2014-08-23 CRAN (R 3.3.0)
## lazyeval 0.1.10 2015-01-02 CRAN (R 3.3.0)
## lubridate * 1.5.6 2016-04-06 CRAN (R 3.3.0)
## magrittr 1.5 2014-11-22 CRAN (R 3.3.0)
## memoise 1.0.0 2016-01-29 CRAN (R 3.3.0)
## munsell 0.4.3 2016-02-13 CRAN (R 3.3.0)
## plyr 1.8.3 2015-06-12 CRAN (R 3.3.0)
## R6 2.1.2 2016-01-26 CRAN (R 3.3.0)
## randomForest * 4.6-12 2015-10-07 CRAN (R 3.3.0)
## Rcpp 0.12.4 2016-03-26 CRAN (R 3.3.0)
## rmarkdown 0.9.6 2016-05-01 CRAN (R 3.3.0)
## scales * 0.4.0 2016-02-26 CRAN (R 3.3.0)
## stringi 1.0-1 2015-10-22 CRAN (R 3.3.0)
## stringr 1.0.0 2015-04-30 CRAN (R 3.3.0)
## tidyr * 0.4.1 2016-02-05 CRAN (R 3.3.0)
## withr 1.0.1 2016-02-04 CRAN (R 3.3.0)
## yaml 2.1.13 2014-06-12 CRAN (R 3.3.0)