Trump vs Clinton - Polling Opinions: How the polls were wrong and how to fix...
Final Paper
Final Paper by Arthur Gailes
Introduction
With the 2016 election on the horizon, we have no shortage of up-to-the-moment analysis of each
primary result, favorability poll, and quote of the moment. Given that, I became curious about what an
out-of-context analysis could tell us about the probabilities involved in the upcoming presidential
election.
I have endeavored to use the outgoing president[i], Senate[ii], and House of Representatives to develop a
model that would predict the outcome of the election: the party of the winner, the electoral
margin of victory, and the popular vote margin of victory.
Before looking at the data, I would project:
The outgoing president would predict a new president from a different party.
Because Senate and House elections occur every two years, their margin would predict a member of the same party winning the election.
The House would be a stronger indicator than the Senate, because all members have elections every two years.
Please note that I use endnotes (e.g. [i]) to refer to data.
Summary of results
First, I looked at how the dependent variables[iii] (winner's party and the margin of victory for the
electoral and popular votes) were affected by the independent variables (control of the House and
Senate, party of outgoing president, length of previous president's term, and year). From this data, I
made the first preliminary conclusions:
Year of election had no predictive power. Including the variable actually reduced the likelihood of
significance of the model (via F-test[iv]).
Transforming the qualitative factors does not improve the explanatory power of the regression.[v]
Including a binary value for control of House and Senate significantly improves the explanatory
power of every regression, with an average increase of roughly 0.1 in R² value.
Using the Cook-Weisberg test and a graph of residuals, the model appears to display
heteroscedasticity at high margins of electoral victory.[vi] This was resolved using a Feasible
Generalized Least Squares model.[vii]
The popular vote only explains about 72% of the electoral vote count, so the electoral vote is of
greater interest.
A model simply predicting the party of the winner had the most explanatory power, testing as
significant at the 99.9% confidence level (R² = .52).[viii] However, I chose not to use this model
because the binary dependent variable did not allow for a projection of gains/losses due to the
independent variables.
Following a two-term president, while interesting, did not have a large enough sample size
(10) to test its impact on the winning party.
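The heteroscedasticity correction mentioned in the list above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the author's Stata procedure: the variance model (log squared residuals regressed on the same regressors), the synthetic data, and the helper names `ols` and `fgls` are all assumptions made for the example.

```python
import numpy as np

def ols(X, y, weights=None):
    """Least-squares fit; with weights w_i it minimizes sum(w_i * resid_i**2)."""
    if weights is not None:
        root_w = np.sqrt(weights)
        X = X * root_w[:, None]
        y = y * root_w
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def fgls(X, y):
    """Two-step feasible GLS under an assumed log-linear variance model."""
    # Step 1: ordinary least squares to obtain residuals.
    resid = y - X @ ols(X, y)
    # Step 2: estimate how the error variance changes with the regressors.
    gamma = ols(X, np.log(resid ** 2))
    var_hat = np.exp(X @ gamma)
    # Step 3: refit, down-weighting observations with large estimated variance.
    weights = 1.0 / var_hat
    return ols(X, y, weights), weights

# Synthetic example (hypothetical data): noise grows with x.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(1, 10, n)
X = np.column_stack([np.ones(n), x])
y = 2.0 + 3.0 * x + rng.normal(0, 0.5 * x)

beta_fgls, w = fgls(X, y)
print(beta_fgls)
```

The weights play the same role as the analytic weights in a weighted regression: observations whose errors are estimated to be noisier count for less in the final fit.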
Given the findings above, I settled on my final model for explaining the effect of the outgoing Senate,
House, and president:
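Written out from the coefficients reported in the paper's final regression output, the fitted model is approximately the following. The symbol names mirror the Stata variable names; the hat notation and the rounding to three decimals are presentation choices made here rather than the author's.

```latex
\widehat{\text{Electoral}} = 18.652
  + 0.788\,\text{Senate\_Margin}
  + 0.189\,\text{House\_Margin}
  - 40.153\,\text{D\_SH}
  - 32.885\,\text{OutgoingD}
```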
Using this model to predict the 2016 Presidential election, I use:
Y = 18.652 + (-8 * .788) + (59 * .189) + (0) - 32.885 = -9.386 electoral votes.
Confidence interval: -9.386 ± 1.708(27.019) = -55.53 to 36.76 electoral votes.
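The arithmetic above can be reproduced with a short script. It is only a sketch that mirrors the text's own calculation: the coefficients are the rounded values from the regression output, the 2016 inputs are the ones used in the text, and, as in the text, the root MSE stands in for the standard error of the prediction. The dictionary keys and the `predict` helper are names chosen here for illustration.

```python
# Rounded coefficients from the final regression output.
coef = {
    "const": 18.652,
    "Senate_Margin": 0.788,
    "House_Margin": 0.189,
    "D_SH": -40.153,
    "OutgoingD": -32.885,
}

# 2016 inputs exactly as used in the text's calculation.
x_2016 = {
    "const": 1,
    "Senate_Margin": -8,
    "House_Margin": 59,
    "D_SH": 0,
    "OutgoingD": 1,
}

def predict(coef, x):
    """Linear prediction: sum of coefficient * input over all terms."""
    return sum(coef[name] * x[name] for name in coef)

T_CRIT = 1.708     # multiplier used in the text
ROOT_MSE = 27.019  # from the regression output

point = predict(coef, x_2016)
half_width = T_CRIT * ROOT_MSE
lower, upper = point - half_width, point + half_width

print(f"{point:.3f} [{lower:.2f}, {upper:.2f}]")  # -9.386 [-55.53, 36.76]
```

Because the interval spans zero, the point estimate of about nine electoral votes is well within the model's margin of error.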
Conclusion
Based on this model, the Republican nominee for president can expect to have a slight advantage (worth
about nine electoral votes) in the upcoming election. Most significant to the Republican Party's chances
is the fact that they are running against an outgoing Democratic president, which is expected to gain
them about 32 electoral votes.
For the Democratic Party, the best takeaway is that they do not control both the House and Senate,
which also has significant negative consequences in an election year. In fact, Republican control of both
House and Senate may completely offset the advantage they would otherwise gain from running
against the party of an outgoing president.
Another way of putting this would be to say that the American electorate has displayed a statistically
significant tendency to vote against the party perceived to be in power. Furthermore, that tendency
usually outweighs the signaling of the prior House and Senate elections.
On balance, this data indicates that the Democratic Party must field a significantly stronger candidate
and/or align more closely with the American public to overcome the structural disadvantages of this
election cycle.
. regress Electoral Senate_Margin House_Margin D_SH OutgoingD [aweight=1/(wght)^2]
(sum of wgt is 1.2340e-01)

Source      SS            df    MS
Model       14115.54       4    3528.88501
Residual    18250.0726    25    730.002905
Total       32365.6127    29    1116.05561

Number of obs = 30
F(4, 25)      = 4.83
Prob > F      = 0.0050
R-squared     = 0.4361
Adj R-squared = 0.3459
Root MSE      = 27.019

Electoral       Coef.        Std. Err.    t       P>|t|    [95% Conf. Interval]
Senate_Margin   .7882464     .4958446      1.59   0.124    -.2329647    1.809458
House_Margin    .1892397     .1108122      1.71   0.100    -.0389823    .4174616
D_SH            -40.15296    13.73011     -2.92   0.007    -68.43065    -11.87527
OutgoingD       -32.88488    13.98332     -2.35   0.027    -61.68406    -4.085693
_cons           18.65166     11.87956      1.57   0.129    -5.814761    43.11808
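As a quick consistency check, the summary statistics in the regression output above can be recomputed from its sums of squares and degrees of freedom. The sketch below uses only the standard textbook definitions; the variable names are chosen here for illustration.

```python
import math

# Sums of squares and degrees of freedom from the regression output.
ss_model, df_model = 14115.54, 4
ss_resid, df_resid = 18250.0726, 25
ss_total, df_total = ss_model + ss_resid, df_model + df_resid

ms_model = ss_model / df_model   # mean square explained by the model
ms_resid = ss_resid / df_resid   # mean squared error
ms_total = ss_total / df_total

r2 = ss_model / ss_total             # share of variation explained
adj_r2 = 1 - ms_resid / ms_total     # penalized for number of regressors
f_stat = ms_model / ms_resid         # overall F statistic, F(4, 25)
root_mse = math.sqrt(ms_resid)       # standard error of the regression

print(f"R2={r2:.4f} adjR2={adj_r2:.4f} F={f_stat:.2f} RootMSE={root_mse:.3f}")
```

The recomputed values agree with the reported R-squared (0.4361), adjusted R-squared (0.3459), F statistic (4.83), and root MSE (27.019).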
Critique
Obviously, this is a small sample size, and the nature of US voting has changed dramatically
across the sample. In particular, the ability of people of color and women to vote, the
growth of the population, and political realignment of the parties present challenges to the
model's consistency.
Since the introduction of the Democratic Party, there have been three opposing parties
(National Republicans, Whigs, and Republicans).
Calculating the Congressional margin of control by number rather than percentage may be a
fault, because the size of both houses increased over the sample.
o Future research could also study the change in both house margins in the previous
off-year elections. This is especially true for the Senate, which has six-year terms, meaning
that the composition of the Senate only partially reflects the current sentiment of
voters.
The large standard errors in the constant and in the Senate Margin and House Margin variables show
that there are likely omitted variables here, which seems obvious from the nature of the model.
Strengths
This model tests strongly as significant and, despite the flaws listed above, is consistent across
years, indicating that House and Senate composition has similar effects
throughout time.
Viewing the election through this lens allows us to analyze some of the causes of electoral
trends that persist throughout elections. Of course, the actual candidates and issues likely have
a much greater consequence than
If combined with similar observations from different countries, this could tell us something
about how humans on a very basic level react to incumbent parties and politicians.
The fact that only 43% of variation can be explained by trends in party composition implies a
confirmation of the importance much of the public and media place on the presidency.
i Data begins at 1828, the first presidential election after the advent of the Democratic Party. This was a natural
starting point due to changes in the counting of the popular and electoral votes before that period.
Vice Presidents who ascend to the presidency (e.g. Lyndon Johnson) are not included in “Winner is Same Party,” even
if re-elected for the next term. When re-elected, they are essentially incumbents already. However, if they were
members of the same party as the president they replaced, the following president will be marked 1 for follows
2-term president (e.g. Richard Nixon, but not Jimmy Carter).
ii Independent members of Congress are not counted unless they caucused with a major party.
iii The variables used:
Electoral – A normalized margin of victory for electoral delegates by percentage. Negative numbers
represent a loss by the incumbent party's candidate; positive numbers represent a win. Formula:
c = total possible electoral votes; w = winner electoral votes; r = runner-up electoral votes
Popular – Percentage won/lost (positive/negative) by incumbent party’s candidate
Democrat_Winner – 1 if winner is a member of the Democratic Party.
Year – Year of election.
D_Senate – 1 if senate is controlled by the Democratic Party at time of election. For example, the election
year 2008 uses the 110th congress (2007-2009).
D_House – 1 if House is controlled by the Democratic Party at time of election.
D_SH – D_House*D_Senate
Senate_Margin – Margin of control by outgoing Senate. Negative numbers represent the opposing party's
control.
House_Margin – Margin of control by outgoing House. Negative numbers represent the opposing party's
control.
Follows_2Term – Follows a two-term president
1's (or positive numbers) represent the Democratic Party because it was the first major modern party to
come into existence. 0 (or negative numbers) may indicate the Republican Party, Whigs or National
Republicans.
iv All tests and confidence intervals use a 95% confidence level unless otherwise noted.
v One example (also tested linear-log and log-linear):

. regress Electoral Senate_Margin House_Margin Follows_2Term OutgoingD, level(90)

Source      SS            df    MS
Model       15679.4946     4    3919.87366
Residual    35810.2856    25    1432.41143
Total       51489.7803    29    1775.50966

Number of obs = 30
F(4, 25)      = 2.74
Prob > F      = 0.0513
R-squared     = 0.3045
Adj R-squared = 0.1932
Root MSE      = 37.847

Electoral       Coef.        Std. Err.    t       P>|t|    [90% Conf. Interval]
Senate_Margin   .3737769     .5633529      0.66   0.513    -.5885092    1.336063
House_Margin    .1233509     .1221979      1.01   0.322    -.0853802    .332082
Follows_2Term   -15.01311    14.96189     -1.00   0.325    -40.57012    10.54391
OutgoingD       -33.50862    17.18491     -1.95   0.062    -62.86287    -4.15437
_cons           10.04358     10.04665      1.00   0.327    -7.117521    27.20468