A Bayesian analysis of the time through the order penalty in baseball

Ryan S. Brill∗, Sameer K. Deshpande†, and Abraham J. Wyner‡

June 2, 2023

Abstract

As a baseball game progresses, batters appear to perform better the more times they face a particular pitcher. The apparent drop-off in pitcher performance from one time through the order to the next, known as the Time Through the Order Penalty (TTOP), is often attributed to within-game batter learning. Although the TTOP has largely been accepted within baseball and influences many managers’ in-game decision making, we argue that existing approaches of estimating the size of the TTOP cannot disentangle continuous evolution in pitcher performance over the course of the game from discontinuities between successive times through the order. Using a Bayesian multinomial regression model, we find that, after adjusting for confounders like batter and pitcher quality, handedness, and home field advantage, there is little evidence of strong discontinuity in pitcher performance between times through the order. Our analysis suggests that the start of the third time through the order should not be viewed as a special cutoff point in deciding whether to pull a starting pitcher.

∗Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania. Correspondence to: ryguy123@sas.upenn.edu

†Dept. of Statistics, University of Wisconsin–Madison

‡Dept. of Statistics and Data Science, The Wharton School, University of Pennsylvania

arXiv:2210.06724v2 [stat.AP] 31 May 2023

1 Introduction

In Game 6 of the 2020 World Series, the Tampa Bay Rays’ manager, Kevin Cash, pulled his starting pitcher, Blake Snell, midway through the sixth inning. When he was pulled, Snell had been pitching extremely well; he had allowed just two hits and struck out nine batters on 73 pitches. Moreover, the Rays had a one-run lead. Snell’s replacement, Nick Anderson, promptly gave up two runs, which ultimately proved decisive: the Rays went on to lose the game and the World Series. After the game, Cash justified his decision to pull Snell, remarking that he “didn’t want Mookie [Betts] or [Corey] Seager seeing Blake a third time” (Rivera, 2020).

In his justification, Cash cites the third Time Through the Order Penalty (TTOP), which was first formally identified in Tango et al. (2007, pp. 187–190) and recently popularized by Lichtman (2013). It has long been observed that, on average, batters tend to perform better the more times they face a pitcher; for instance, they tend to get on base more often on their third time facing a pitcher than their second. Tango et al. (2007) quantified the corresponding drop-off in pitcher performance as increases in weighted on-base average (wOBA; see Section 2.4 for details). They observed that the average wOBA of a plate appearance in the first time through the order (1TTO) is about 9 wOBA points less than that in the second TTO (2TTO). Further, the average wOBA of a plate appearance in the second TTO is about 8 wOBA points less than that in the third TTO (3TTO) (Tango et al., 2007, Table 81).

The TTOP is considered canon by much of the baseball community. Announcers routinely mention the 3TTOP during broadcasts and several managers regularly use the 3TTOP to justify their decisions to pull starting pitchers at the start of the third TTO. For instance, A.J. Hinch, who managed the Houston Astros from 2015 to 2019, noted “the third time through is very difficult for a certain caliber of pitchers to get through.” Brad Ausmus, who managed the Detroit Tigers from 2014 to 2017, explained “the more times a hitter sees a pitcher, the more success that hitter is going to have” (Laurila, 2015).

Tango et al. (2007) attribute the increased average wOBA from one TTO to the next to within-game batter learning. According to them, batters learn the tendencies of a pitcher as the game progresses. In fact, they observe “pitchers hitting a wall after 10 or 11 batters” rather than a “steady degradation in [pitcher] performance” (Tango et al., 2007, pg. 189). Lichtman (2013) agrees and goes further, stating “the TTOP is not about fatigue. It is about [batter] familiarity.”

We argue that the analysis of Tango et al. (2007) is insufficient to justify such sweeping conclusions. Tango et al. (2007) estimated the 2TTOP and 3TTOP by first binning plate appearances by lineup position and TTO. They then computed the average wOBA within each bin. Their analysis, by design, cannot disentangle continuous evolution in pitcher performance over the course of a game (e.g., from pitcher fatigue) from discontinuities between successive TTOs (e.g., from batter learning). Further, they provide no uncertainty quantification about their estimated TTOPs.
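To see why binned averages alone cannot separate the two explanations, consider a minimal sketch in which pitcher performance declines along a purely linear trend with no jump at all between times through the order. The slope of one wOBA point per batter is a hypothetical value chosen for illustration, not an estimate from any data, and the function name `binned_means` is likewise an assumption of this sketch.

```python
def binned_means(slope: float, baseline: float = 0.0) -> list[float]:
    """Average the linear trend baseline + slope * t within each TTO bin.

    The bins are t = 1..9 (1TTO), 10..18 (2TTO), and 19..27 (3TTO).
    """
    means = []
    for start in (1, 10, 19):
        values = [baseline + slope * t for t in range(start, start + 9)]
        means.append(sum(values) / len(values))
    return means


# Hypothetical slope: one wOBA point per batter faced, with no discontinuity.
means = binned_means(slope=1.0)
gaps = [means[1] - means[0], means[2] - means[1]]
print(means, gaps)
```

Under this assumed trend, consecutive bin averages differ by nine times the slope even though the trajectory has no discontinuity, so a between-bin gap by itself does not reveal whether the underlying change is gradual or abrupt.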

We conduct a more rigorous statistical analysis of the trajectory of pitcher performance over the course of a baseball game. Specifically, we fit a Bayesian multinomial logistic regression model to predict the outcome of each plate appearance as a function of the batter sequence number, batter quality, pitcher quality, handedness match, and home field advantage. The batter sequence number simply counts how many batters the pitcher has faced up to and including the current plate appearance. We find that the expected wOBA forecast by our model increases steadily over the course of a game and does not display sharp discontinuities between times through the order. Based on these results, we recommend managers cease pulling starting pitchers at the beginning of the 3TTO.

The remainder of this paper is organized as follows. We introduce our Bayesian multinomial logistic regression model of a plate appearance outcome in Section 2. We present our main findings in Section 3 and conclude by discussing implications of our results in Section 4.

2 Data and model specification

We begin with a brief overview of our MLB plate appearance dataset and identify several variables that may be predictive of the outcome of a plate appearance. We then introduce our Bayesian multinomial logistic regression model.

2.1 Retrosheet data

We scraped every plate appearance from 1990 to 2020 from the Retrosheet database. For each plate appearance, we record the outcome (e.g., out or single), the event wOBA, the handedness match between the batter and pitcher, and whether the batter is at home. We further compute measures of batter and pitcher quality for each plate appearance (see Section 2.5 for details). We include our final dataset, along with all pre-processing and data analysis scripts, in the Supplementary Materials. We used R (R Core Team, 2020) for all analyses.

We restrict our analysis to every plate appearance from 2012 to 2019 featuring a starting pitcher in one of the first three times through the order, using the 2017 season as our primary example. We remove plate appearances featuring switch hitters from our dataset. Our 2017 dataset consists of 108,519 plate appearances, 691 unique batters, and 315 unique starting pitchers.

There are $K = 7$ possible outcomes of a plate appearance: out, unintentional walk (uBB), hit by pitch (HBP), single (1B), double (2B), triple (3B), and home run (HR). For each $i = 1, \ldots, n$, let $y_i$ be the categorical variable indicating the outcome of the $i$th plate appearance. Notationally, we write

$$y_i \in \{1, 2, \ldots, 7\} = \{\text{Out}, \text{uBB}, \text{HBP}, \text{1B}, \text{2B}, \text{3B}, \text{HR}\}. \tag{1}$$

In predicting the probability of each plate appearance outcome, we need to control for several factors. We introduce the batter sequence number $t \in \{1, \ldots, 27\}$, which records how many batters the pitcher has faced up to and including that plate appearance. We additionally construct indicators of being in the 2TTO and 3TTO, $I(10 \le t \le 18)$ and $I(19 \le t \le 27)$.
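The mapping from batter sequence number to the two TTO indicators can be sketched as follows. This is a minimal illustration rather than the code used for the analysis (which was written in R); the function names `tto` and `tto_indicators` are hypothetical.

```python
def tto(t: int) -> int:
    """Return the time through the order (1, 2, or 3) for batter sequence number t."""
    if not 1 <= t <= 27:
        raise ValueError("batter sequence number must be in 1..27")
    # Batters 1-9 form the 1TTO, 10-18 the 2TTO, and 19-27 the 3TTO.
    return (t - 1) // 9 + 1


def tto_indicators(t: int) -> tuple[int, int]:
    """Return the pair (I(10 <= t <= 18), I(19 <= t <= 27))."""
    order = tto(t)
    return (int(order == 2), int(order == 3))


# The indicators switch exactly at t = 10 and t = 19.
for t in (9, 10, 18, 19, 27):
    print(t, tto(t), tto_indicators(t))
```

Both indicators are zero in the first time through the order, which therefore acts as the baseline against which the 2TTO and 3TTO are compared.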

Intuitively, we expect that most pitchers are more likely to give up base hits and home runs to elite batters than they are to strike out elite batters. Similarly, we expect elite pitchers would have more plate appearances ending in outs than base hits against most batters. Accordingly, when modeling the outcome of a plate appearance, we adjust for the quality or skill of the batter and pitcher. To this end, let $x^{(p)}$ and $x^{(b)}$ denote the estimates of pitcher and batter quality, respectively. We discuss the computation of both quality measures in Section 2.5.

Additionally, we expect that a pitcher whose handedness matches that of the batter (e.g., the pitcher and batter are both right handed) is less likely to give up base hits and home runs than a pitcher whose handedness doesn’t match the batter’s. To this end, we define $\text{hand}$, an indicator that is equal to one when the batter and pitcher have matching handedness and zero otherwise. Finally, we expect that a pitcher on the road is more likely to give up base hits and home runs than a pitcher at home. Thus we define $\text{home}$, an indicator that is equal to one when the batter is at home and zero otherwise.

Table 1 summarizes the variables that we record from plate appearance i.

Table 1: Summary of variables measured for each plate appearance that are used in our analysis.

| Covariate symbol | Covariate description |
| --- | --- |
| $y_i$ | outcome of the $i$th plate appearance, $y_i \in \{1, \ldots, K = 7\}$ |
| $t_i$ | the batter sequence number, $t_i \in \{1, \ldots, 27\}$ |
| $I(t_i \in 2\text{TTO})$ | binary variable indicating whether the pitcher is in his second TTO |
| $I(t_i \in 3\text{TTO})$ | binary variable indicating whether the pitcher is in his third TTO |
| $x_i^{(b)}$ | running-average estimator of batter quality |
| $x_i^{(p)}$ | running-average estimator of pitcher quality |
| $\text{hand}_i$ | binary variable indicating handedness match between batter and pitcher |
| $\text{home}_i$ | binary variable indicating whether the batter is at home |
| $x_i$ | $x_i = (x_i^{(b)}, x_i^{(p)}, \text{hand}_i, \text{home}_i)$ |

2.2 A multinomial logistic regression model

We fit a Bayesian multinomial logistic regression model to predict the outcome of each plate appearance. For each non-out result ($k \neq 1$), we model

$$\log\left(\frac{P(y_i = k)}{P(y_i = 1)}\right) = \alpha_{0k} + \alpha_{1k} t_i + \beta_{2k} I(t_i \in 2\text{TTO}) + \beta_{3k} I(t_i \in 3\text{TTO}) + x_i^\top \eta_k, \tag{2}$$

where the vector $x_i$ concatenates our batter and pitcher quality and indicators for handedness and home team: $x_i^\top = (x_i^{(b)}, x_i^{(p)}, \text{hand}_i, \text{home}_i)$.
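Equation (2) fixes the log-odds of each non-out outcome relative to an out, and the seven outcome probabilities follow by exponentiating and normalizing. The sketch below shows that step for a single plate appearance. It is an illustration only: the function names are hypothetical, and every coefficient and covariate value is made up for the example rather than taken from the fitted model.

```python
import math


def log_odds(k_params, t, x):
    """Right-hand side of equation (2) for one non-out outcome k."""
    a0, a1, b2, b3, eta = k_params
    in_2tto = 1 if 10 <= t <= 18 else 0
    in_3tto = 1 if 19 <= t <= 27 else 0
    return a0 + a1 * t + b2 * in_2tto + b3 * in_3tto + sum(e * v for e, v in zip(eta, x))


def outcome_probs(params, t, x):
    """Return [P(y = 1), ..., P(y = 7)] given one parameter tuple per non-out outcome."""
    # The reference category (out, k = 1) has log-odds 0 by construction.
    scores = [0.0] + [log_odds(p, t, x) for p in params]
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Made-up coefficients (alpha_0k, alpha_1k, beta_2k, beta_3k, eta_k), not fitted values.
params = [(-2.0, 0.01, 0.05, 0.08, (0.5, -0.5, -0.1, 0.1))] * 6
x = (0.3, 0.3, 1, 0)  # hypothetical (batter quality, pitcher quality, hand, home)

p9 = outcome_probs(params, 9, x)
p10 = outcome_probs(params, 10, x)

# Change in log-odds of outcome k = 2 between the last batter of the 1TTO
# and the first batter of the 2TTO: equals alpha_1k + beta_2k.
jump = math.log(p10[1] / p10[0]) - math.log(p9[1] / p9[0])
print(sum(p9), jump)
```

With these made-up values the change in log-odds from $t = 9$ to $t = 10$ is $\alpha_{1k} + \beta_{2k}$: the slope contributes one batter's worth of continuous drift and the indicator contributes the discontinuity.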

The parameters $\alpha_{0k}$ and $\alpha_{1k}$ control the continuous evolution of the probability of each plate appearance outcome throughout the game. In contrast, the parameters $\beta_{2k}$ and $\beta_{3k}$ allow for discontinuities in these probabilities between different times through the order. Pitchers face each of the opposing team’s batters, and so we interpret the term $\alpha_{0k} + \alpha_{1k} t$ as the continuous effect of a change in pitcher performance on the probability of each outcome. Batters, on the other hand, take turns facing the opposing team’s pitcher, and so we interpret $\beta_{2k}$ and $\beta_{3k} - \beta_{2k}$ as the respective discontinuous effects of a change in batter performance between the first and second times through the order and between the second and third times through the order. Observe that for $k \neq 1$, a large positive value of $\beta_{2k}$ suggests that the non-out outcome $k$ is systematically more likely to occur in the second time through the order than the first. Similarly, a large positive value of $\beta_{3k} - \beta_{2k}$ suggests that the outcome is more likely to occur in the third time through the order than the second. Consequently, based
