A Bayesian analysis of the time through the order penalty in baseball
Ryan S. Brill, Sameer K. Deshpande, and Abraham J. Wyner
June 2, 2023
Abstract
As a baseball game progresses, batters appear to perform better the more times
they face a particular pitcher. The apparent drop-off in pitcher performance from one
time through the order to the next, known as the Time Through the Order Penalty
(TTOP), is often attributed to within-game batter learning. Although the TTOP has
largely been accepted within baseball and influences many managers’ in-game decision
making, we argue that existing approaches of estimating the size of the TTOP cannot
disentangle continuous evolution in pitcher performance over the course of the game
from discontinuities between successive times through the order. Using a Bayesian
multinomial regression model, we find that, after adjusting for confounders like batter
and pitcher quality, handedness, and home field advantage, there is little evidence of
strong discontinuity in pitcher performance between times through the order. Our
analysis suggests that the start of the third time through the order should not be
viewed as a special cutoff point in deciding whether to pull a starting pitcher.
Ryan S. Brill: Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania.
Correspondence to: ryguy123@sas.upenn.edu
Sameer K. Deshpande: Dept. of Statistics, University of Wisconsin–Madison
Abraham J. Wyner: Dept. of Statistics and Data Science, The Wharton School, University of Pennsylvania
arXiv:2210.06724v2 [stat.AP] 31 May 2023
1 Introduction
In Game 6 of the 2020 World Series, the Tampa Bay Rays’ manager, Kevin Cash, pulled
his starting pitcher, Blake Snell, midway through the sixth inning. When he was pulled,
Snell had been pitching extremely well; he had allowed just two hits and struck out nine
batters on 73 pitches. Moreover, the Rays had a one-run lead. Snell’s replacement, Nick
Anderson, promptly gave up two runs, which ultimately proved decisive: the Rays went on
to lose the game and the World Series. After the game, Cash justified his decision to pull
Snell, remarking that he “didn’t want Mookie [Betts] or [Corey] Seager seeing Blake a third
time” (Rivera, 2020).
In his justification, Cash cites the third Time Through the Order Penalty (TTOP), which
was first formally identified in Tango et al. (2007, pp. 187–190) and recently popularized
by Lichtman (2013). It has long been observed that, on average, batters tend to perform
better the more times they face a pitcher; for instance, they tend to get on base more
often on their third time facing a pitcher than their second. Tango et al. (2007) quantified
the corresponding drop-off in pitcher performance as increases in weighted on-base average
(wOBA; see Section 2.4 for details). They observed that the average wOBA of a plate
appearance in the first time through the order (1TTO) is about 9 wOBA points less than
that in the second TTO (2TTO). Further, the average wOBA of a plate appearance in the
second TTO is about 8 wOBA points less than that in the third TTO (3TTO) (Tango et al.,
2007, Table 81).
The TTOP is considered canon by much of the baseball community. Announcers routinely
mention the 3TTOP during broadcasts and several managers regularly use the 3TTOP to
justify their decisions to pull starting pitchers at the start of the third TTO. For instance,
A.J. Hinch, who managed the Houston Astros from 2015 to 2019, noted “the third time
through is very difficult for a certain caliber of pitchers to get through.” Brad Ausmus, who
managed the Detroit Tigers from 2014 to 2017, explained “the more times a hitter sees a
pitcher, the more success that hitter is going to have” (Laurila, 2015).
Tango et al. (2007) attribute the increased average wOBA from one TTO to the next to
within-game batter learning. According to them, batters learn the tendencies of a pitcher
as the game progresses. In fact, they observe “pitchers hitting a wall after 10 or 11 batters”
rather than a “steady degradation in [pitcher] performance” (Tango et al., 2007, pg. 189).
Lichtman (2013) agrees and goes further, stating “the TTOP is not about fatigue. It is
about [batter] familiarity.”
We argue that the analysis of Tango et al. (2007) is insufficient to justify such sweeping
conclusions. Tango et al. (2007) estimated the 2TTOP and 3TTOP by first binning plate
appearances by lineup position and TTO. They then computed the average wOBA within each bin.
Their analysis, by design, cannot disentangle continuous evolution in pitcher performance
over the course of a game (e.g., from pitcher fatigue) from discontinuities between successive
TTOs (e.g., from batter learning). Further, they provide no uncertainty quantification about
their estimated TTOPs.
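To make this confound concrete, the following Python sketch averages a purely smooth, made-up wOBA trajectory within each TTO. The intercept and slope are hypothetical values chosen only for illustration, and the binning is simplified to TTO alone (not lineup position), so this is not a reproduction of the analysis in Tango et al. (2007).

```python
def tto_of(t):
    """Map a batter sequence number t (1..27) to its time through the order."""
    if not 1 <= t <= 27:
        raise ValueError("t must be between 1 and 27")
    return (t - 1) // 9 + 1


def mean_by_tto(values_by_t):
    """Average a per-batter value within each time through the order."""
    sums = {1: 0.0, 2: 0.0, 3: 0.0}
    counts = {1: 0, 2: 0, 3: 0}
    for t, value in values_by_t.items():
        tto = tto_of(t)
        sums[tto] += value
        counts[tto] += 1
    return {tto: sums[tto] / counts[tto] for tto in sums}


# Hypothetical numbers: expected wOBA rises linearly in t, with NO jump
# between times through the order.
INTERCEPT = 0.300
SLOPE = 0.001
smooth = {t: INTERCEPT + SLOPE * t for t in range(1, 28)}

means = mean_by_tto(smooth)
gap_12 = means[2] - means[1]  # apparent "2TTOP" from binning
gap_23 = means[3] - means[2]  # apparent "3TTOP" from binning
print(round(gap_12, 6), round(gap_23, 6))
```

In this toy setting both gaps equal nine times the slope even though nothing changes abruptly between TTOs, so binned averages alone cannot tell a steady decline from a discontinuity.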
We conduct a more rigorous statistical analysis of the trajectory of pitcher performance over
the course of a baseball game. Specifically, we fit a Bayesian multinomial logistic regression
model to predict the outcome of each plate appearance as a function of the batter sequence
number, batter quality, pitcher quality, handedness match, and home field advantage. The
batter sequence number simply counts how many batters the pitcher has faced up to and
including the current plate appearance. We find that the expected wOBA forecast by our
model increases steadily over the course of a game and does not display sharp discontinuities
between times through the order. Based on these results, we recommend managers cease
pulling starting pitchers at the beginning of the 3TTO.
The remainder of this paper is organized as follows. We introduce our Bayesian multinomial
logistic regression model of a plate appearance outcome in Section 2. We present our main
findings in Section 3 and conclude by discussing implications of our results in Section 4.
2 Data and model specification
We begin with a brief overview of our MLB plate appearance dataset and identify several
variables that may be predictive of the outcome of a plate appearance. We then introduce
our Bayesian multinomial logistic regression model.
2.1 Retrosheet data
We scraped every plate appearance from 1990 to 2020 from the Retrosheet database. For
each plate appearance, we record the outcome (e.g., out or single), the event wOBA,
the handedness match between the batter and pitcher, and whether the batter is at home.
We further compute measures of batter and pitcher quality for each plate appearance (see
Section 2.5 for details). We include our final dataset, along with all pre-processing and data
analysis scripts, in the Supplementary Materials. We used R (R Core Team, 2020) for all
analyses.
We restrict our analysis to every plate appearance from 2012 to 2019 featuring a starting
pitcher in one of the first three times through the order, using the 2017 season as our primary
example. We remove plate appearances featuring switch hitters from our dataset. Our 2017
dataset consists of 108,519 plate appearances, 691 unique batters, and 315 unique starting
pitchers.
There are $K = 7$ possible outcomes of a plate appearance: out, unintentional walk (uBB),
hit by pitch (HBP), single (1B), double (2B), triple (3B), and home run (HR). For each
$i = 1, \ldots, n$, let $y_i$ be the categorical variable indicating the outcome of the $i$th plate appearance.
Notationally, we write
$$y_i \in \{1, 2, \ldots, 7\} = \{\text{Out}, \text{uBB}, \text{HBP}, \text{1B}, \text{2B}, \text{3B}, \text{HR}\}. \tag{1}$$
In predicting the probability of each plate appearance outcome, we need to control for several
factors. We introduce the batter sequence number $t \in \{1, \ldots, 27\}$, which records how many
batters the pitcher has faced up to and including that plate appearance. We additionally
construct indicators of being in the 2TTO and 3TTO, $\mathbb{I}(10 \le t \le 18)$ and $\mathbb{I}(19 \le t \le 27)$.
Intuitively, we expect that most pitchers are more likely to give up base hits and home runs
to elite batters than they are to strike out elite batters. Similarly, we expect elite pitchers
would have more plate appearances ending in outs than base hits against most batters.
Accordingly, when modeling the outcome of a plate appearance, we adjust for the quality or
skill of the batter and pitcher. To this end, let $x^{(p)}$ and $x^{(b)}$ denote the estimates of pitcher
and batter quality, respectively. We discuss the computation of both quality measures in
Section 2.5.
Additionally, we expect that a pitcher whose handedness matches that of the batter (e.g.,
the pitcher and batter are both right handed) is less likely to give up base hits and home runs
than a pitcher whose handedness does not match the batter’s. To this end, we define hand,
an indicator that is equal to one when the batter and pitcher have matching handedness and
zero otherwise. Finally, we expect that a pitcher on the road is more likely to give up base
hits and home runs than a pitcher at home. Thus we define home, an indicator that is equal
to one when the batter is at home and zero otherwise.
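As a rough illustration of the indicators described above, the following Python sketch builds them for a single plate appearance. The function names, argument names, and dictionary keys are hypothetical, and the quality measures $x^{(b)}$ and $x^{(p)}$ are left out because their computation is deferred to Section 2.5.

```python
def tto_indicators(t):
    """Return (I(10 <= t <= 18), I(19 <= t <= 27)) for batter sequence number t."""
    if not 1 <= t <= 27:
        raise ValueError("t must be between 1 and 27")
    return int(10 <= t <= 18), int(19 <= t <= 27)


def build_covariates(t, batter_hand, pitcher_hand, batter_is_home):
    """Assemble the non-quality covariates for one plate appearance.

    batter_hand and pitcher_hand are "L" or "R" (switch hitters are
    excluded from the dataset, so no third value is handled here).
    """
    in_2tto, in_3tto = tto_indicators(t)
    return {
        "t": t,
        "in_2tto": in_2tto,
        "in_3tto": in_3tto,
        "hand": int(batter_hand == pitcher_hand),  # 1 if handedness matches
        "home": int(batter_is_home),               # 1 if the batter is at home
    }


# The 19th batter faced: a left-handed home batter against a right-handed pitcher.
example = build_covariates(t=19, batter_hand="L", pitcher_hand="R", batter_is_home=True)
print(example)
```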
Table 1 summarizes the variables that we record from plate appearance $i$.
Table 1: Summary of variables measured for each plate appearance that are used in our analysis.

| Covariate symbol | Covariate description |
| --- | --- |
| $y_i$ | outcome of the $i$th plate appearance, $y_i \in \{1, \ldots, K = 7\}$ |
| $t_i$ | the batter sequence number, $t_i \in \{1, \ldots, 27\}$ |
| $\mathbb{I}(t_i \in \text{2TTO})$ | binary variable indicating whether the pitcher is in his second TTO |
| $\mathbb{I}(t_i \in \text{3TTO})$ | binary variable indicating whether the pitcher is in his third TTO |
| $x_i^{(b)}$ | running-average estimator of batter quality |
| $x_i^{(p)}$ | running-average estimator of pitcher quality |
| $\text{hand}_i$ | binary variable indicating handedness match between batter and pitcher |
| $\text{home}_i$ | binary variable indicating whether the batter is at home |
| $x_i$ | $x_i = (x_i^{(b)}, x_i^{(p)}, \text{hand}_i, \text{home}_i)$ |
2.2 A multinomial logistic regression model
We fit a Bayesian multinomial logistic regression model to predict the outcome of each plate
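appearance. For each non-out result ($k \neq 1$), we model
$$\log\left(\frac{P(y_i = k)}{P(y_i = 1)}\right) = \alpha_{0k} + \alpha_{1k} t_i + \beta_{2k}\,\mathbb{I}(t_i \in \text{2TTO}) + \beta_{3k}\,\mathbb{I}(t_i \in \text{3TTO}) + x_i^\top \eta_k, \tag{2}$$
where the vector $x_i$ concatenates our batter and pitcher quality and indicators for handedness
and home team: $x_i = (x_i^{(b)}, x_i^{(p)}, \text{hand}_i, \text{home}_i)$.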
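To show how Equation (2) translates into outcome probabilities, here is a minimal Python sketch that evaluates the log-odds for each non-out outcome and normalizes them with "out" as the reference category. All parameter and covariate values are made up for illustration; they are not estimates from the paper, and the function names are hypothetical.

```python
import math

OUTCOMES = ["Out", "uBB", "HBP", "1B", "2B", "3B", "HR"]


def log_odds(t, x, alpha0, alpha1, beta2, beta3, eta):
    """Right-hand side of Equation (2) for one non-out outcome k."""
    in_2tto = 1.0 if 10 <= t <= 18 else 0.0
    in_3tto = 1.0 if 19 <= t <= 27 else 0.0
    linear = sum(xj * ej for xj, ej in zip(x, eta))
    return alpha0 + alpha1 * t + beta2 * in_2tto + beta3 * in_3tto + linear


def outcome_probabilities(t, x, params):
    """Probabilities of the 7 outcomes; params holds one dict per non-out outcome."""
    scores = [0.0] + [log_odds(t, x, **p) for p in params]  # "Out" is the reference
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical parameter values, one dict per non-out outcome (uBB, ..., HR).
made_up_intercepts = [-2.5, -4.5, -1.7, -3.0, -5.5, -3.4]
params = [
    {"alpha0": a0, "alpha1": 0.004, "beta2": 0.01, "beta3": 0.02,
     "eta": [1.0, 1.0, -0.05, 0.03]}
    for a0 in made_up_intercepts
]
x = [0.32, 0.31, 1, 0]  # hypothetical (batter quality, pitcher quality, hand, home)

p9 = outcome_probabilities(9, x, params)
p10 = outcome_probabilities(10, x, params)
# Change in the uBB-vs-out log-odds when moving from batter 9 to batter 10:
jump = math.log(p10[1] / p10[0]) - math.log(p9[1] / p9[0])
print(dict(zip(OUTCOMES, (round(p, 4) for p in p10))), round(jump, 6))
```

Under these made-up values, the log-odds change from the 9th to the 10th batter is $\alpha_{1k} + \beta_{2k}$: the first term is the smooth per-batter drift and the second is the extra shift at the start of the second TTO.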
The parameters $\alpha_{0k}$ and $\alpha_{1k}$ control the continuous evolution of the probability of each plate
appearance outcome throughout the game. In contrast, the parameters $\beta_{2k}$ and $\beta_{3k}$ allow for
discontinuities in these probabilities between different times through the order. Pitchers face
each of the opposing team’s batters, and so we interpret the term $\alpha_{0k} + \alpha_{1k} t$ as the continuous
effect of a change in pitcher performance on the probability of each outcome. Batters, on
the other hand, take turns facing the opposing team’s pitcher, and so we interpret $\beta_{2k}$ and
$\beta_{3k} - \beta_{2k}$ as the respective discontinuous effects of a change in batter performance between the
first and second times through the order and between the second and third times through the
order. Observe that for $k \neq 1$, a large positive value of $\beta_{2k}$ suggests that the non-out outcome
$k$ is systematically more likely to occur in the second time through the order than the
first. Similarly, a large positive value of $\beta_{3k} - \beta_{2k}$ suggests that the outcome is more
likely to occur in the third time through the order than the second. Consequently, based