Hey Now, You’re an All-Star. You Probably Earned It in the First Half.
Over the past few decades, MLB's All-Star selections have trended toward favoring strong first-half performances over prior reputations.
I love the MLB All-Star Game. I’ve argued before that it’s pretty much the only major All-Star event that holds up as an actual representation of the sport itself, rather than the uncanny-valley versions we see in other sports. At its core, baseball’s ASG remains, well, baseball.
But one thing I’ve always puzzled over when it comes to the All-Star Game is how we choose who plays in it. Right now, it’s a convoluted process involving fan ballots, player voting and even the commissioner’s office. (Previously, it also involved the managers for each league making picks.) But the goal has always been somewhat unclear: Are we picking the biggest “stars” — regardless of their performance this season — or the best players of the first half? Or some combination of both?
My interest in this question was piqued anew when my friend and former FiveThirtyEight colleague, Ben Lindbergh of The Ringer, reached out and asked if I could determine exactly how much of the All-Star selection process is based on this season versus previous ones… and whether that has changed over time.
So I decided to look at the question empirically. Specifically, I pulled Wins Above Replacement data from FanGraphs¹ for batters since 1990 over three different stretches of time:
The first half of the season in question (i.e., pre-All-Star break), pro-rated to a full-season pace
The previous season
Two seasons prior
Using those data points — and tossing out any seasons that overlapped with partial schedules, like the strike or the pandemic — I created a “weighted WAR,” in which each of the three categories was assigned a weight and the weights summed to 100 percent. That weighted WAR was then used in a logistic regression to predict the odds of making that year’s All-Star team for every batter with at least one plate appearance during the season.
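For concreteness, here’s a minimal sketch of how a setup like that could be coded in Python. To be clear, this is my reconstruction, not the actual code behind the analysis: the column names are hypothetical, the pro-rating step assumes a 162-game pace, and the grid search over weight combinations is just one plausible way to find the weights that best fit the logistic regression.

```python
# Minimal sketch (not the article's actual code). Assumes a DataFrame `df` with
# one row per batter-season and hypothetical columns: war_first_half,
# team_games_pre_break, war_prev1, war_prev2 and all_star (1 if the player
# made that year's All-Star team, 0 otherwise).
import numpy as np
import statsmodels.api as sm

def prorate_first_half(df):
    """Scale first-half WAR up to a full 162-game pace."""
    return df["war_first_half"] * 162 / df["team_games_pre_break"]

def fit_weighted_war(df, step=0.05):
    """Grid-search weight triples (summing to 1) on the three WAR inputs and
    keep the combination whose weighted WAR best fits a logistic regression on
    All-Star status. Returns (log-likelihood, weights, fitted model)."""
    war_pace = prorate_first_half(df)
    best = None
    for w1 in np.arange(0.0, 1.0 + step, step):
        for w2 in np.arange(0.0, 1.0 - w1 + step / 2, step):
            w3 = max(0.0, 1.0 - w1 - w2)
            weighted_war = (w1 * war_pace
                            + w2 * df["war_prev1"]
                            + w3 * df["war_prev2"]).rename("weighted_war")
            X = sm.add_constant(weighted_war)
            fit = sm.Logit(df["all_star"], X).fit(disp=0)
            if best is None or fit.llf > best[0]:
                best = (fit.llf, (w1, w2, w3), fit)
    return best
```

The percentage weights reported below would then simply be read off the best-fitting combination.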
I ran four versions of this regression model: one for the entire 1990-2023 period, one for just the 1990s, one for just the 2000s and one for the 2010s/20s. The different weights on each WAR timeframe should tell us the relative importance of each category in predicting All-Stars, and how much that importance varies by era.
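Under the same assumptions as the sketch above, the era-by-era versions would just repeat that search on subsets of the data, something like:

```python
# Hypothetical era splits matching the article's groupings (assumes a "season" column).
eras = {
    "1990-2023": (1990, 2023),
    "1990s": (1990, 1999),
    "2000s": (2000, 2009),
    "2010s/20s": (2010, 2023),
}
era_weights = {
    name: fit_weighted_war(df[df["season"].between(lo, hi)])[1]
    for name, (lo, hi) in eras.items()
}
# era_weights maps each era to its (first-half, last-season, two-seasons-ago) weights.
```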
Here are the results of those percentage weightings:
Indeed, there does seem to be a trend over time toward the process rewarding strong first halves of seasons, relative to a player’s previous track record. In the 1990s, first-half WAR was only 69 percent of the equation for becoming an All-Star, with 14 percent going to last season and even more weight (18 percent) going to two seasons prior. Compare that with the most recent decade-plus, in which 82 percent weight goes to first-half play, 15 percent to last season and just 3 percent to two seasons earlier.
It’s tough to say why exactly this change happened. Certainly, it would make sense that track record matters less in MLB’s post-steroid era, during which younger players (with fewer years of stardom behind them) began making up a higher share of leaguewide value. But there may also simply be more of a genuine focus on performance within the current season, since WAR is so commonly used to guide — and then judge — All-Star picks. While All-Star selections haven’t exactly trended toward greater alignment with WAR over time, many of the most undeserved ASG choices came in the pre-WAR era, or very early in it.
That may even help explain why we see All-Stars this year like Jurickson Profar, Alec Bohm, Jordan Westburg, David Fry and Heliot Ramos, who collectively produced 3.8 WAR over the previous two seasons before breaking out for a combined 21.5-WAR pace this season. According to my regressions, a player like Ramos is 1.6 times more likely to make an All-Star roster in the current era than he would have been in the 1990s, when a half-season of success wasn’t enough to overcome a lack of previous production and/or star power.
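Continuing the hypothetical sketch above, a comparison like that boils down to plugging one player’s WAR line into the 1990s model and the 2010s/20s model and taking the ratio of predicted probabilities. (The WAR inputs below are illustrative placeholders, not Ramos’s actual numbers.)

```python
def all_star_probability(era_fit, war_1h_pace, war_prev1, war_prev2):
    """Predicted All-Star probability for one player's WAR line under an era's fit."""
    _, (w1, w2, w3), model = era_fit
    weighted_war = w1 * war_1h_pace + w2 * war_prev1 + w3 * war_prev2
    const, slope = model.params          # intercept and weighted-WAR coefficient
    return 1.0 / (1.0 + np.exp(-(const + slope * weighted_war)))

# Placeholder inputs: a strong first-half pace with little prior track record.
fit_90s = fit_weighted_war(df[df["season"].between(1990, 1999)])
fit_now = fit_weighted_war(df[df["season"].between(2010, 2023)])
p_90s = all_star_probability(fit_90s, war_1h_pace=4.0, war_prev1=0.5, war_prev2=0.0)
p_now = all_star_probability(fit_now, war_1h_pace=4.0, war_prev1=0.5, war_prev2=0.0)
print(p_now / p_90s)  # ratio of predicted probabilities across the two eras
```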
One final question might be whether this is a positive change or not. Baseball still sorely lacks household names relative to other sports, and the All-Star Game was originally designed to showcase the sport’s biggest stars, even if we might argue it has outlived that purpose. An increased focus on players who have performed well over the previous few months of action may be a “fairer” way to choose All-Stars, but it does little to foster a sense of familiarity between fans and big-name players.
To use a historical example, Cal Ripken Jr. may have racked up a lot of statistically questionable ASG appearances over his career, but he also put the “star” in “All-Star.” By contrast — and this is nothing against Ramos and company — just about anybody can have a hot first half in a sport as chaotic as baseball. But does that equate to stardom? For better or worse, MLB’s All-Star selection process increasingly says it does.
1. Which makes up half of my beloved JEFFBAGWELL (the Joint Estimate Featuring FanGraphs and B-R Aggregated to Generate WAR, Equally Leveling Lists).
How did you handle younger All-Stars with no data from the past two years, or even the prior season, like Paul Skenes and Gunnar Henderson this year, or Julio Rodriguez two years ago?