Making Sense of 1969's Lowered Mound and Expansion, Via League Statistics
Introduction
When changes in baseball’s conditions through the eras are delineated, a favorite tack is to play the contrarian, to insist that the public has traditionally been told it all wrong. An example is William Curran in Big Sticks: The Batting Revolution of the Twenties1 apparently contending that a lively ball wasn’t really introduced in 1920.
I certainly understand the value of debunking myths, and why an audience feels gratified by the attempt, but at the same time I am a bit uncomfortable with the hunger with which we analysts seek misconceptions. I wonder if we don’t typically poke rather than find holes, and I wonder if at bottom we aren’t motivated by that ugly desire to show ourselves the lone smart one.
Not wanting to make hay of the record, then, but just to present it, I humbly mention that batters in the National League actually struck out more in 1969 (15.89% of non-IBB PA), the year after the mound was lowered, than they did in 1968 (15.83%). Things went more predictably in the American League, where the strikeout rate dropped to 14.73% in 1969 from 16.18%.
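For the technically inclined, "strikeout rate" here always means strikeouts divided by plate appearances with intentional walks removed from the denominator. A minimal sketch of that bookkeeping, with made-up strikeout and plate-appearance counts chosen only to land near the NL's quoted 15.89%, not the actual league totals:

```python
def strikeout_rate(so: int, pa: int, ibb: int) -> float:
    """Strikeouts per non-IBB plate appearance, the definition used throughout."""
    return so / (pa - ibb)

# Illustrative (not actual) league totals:
print(round(strikeout_rate(so=11_625, pa=74_500, ibb=1_344), 4))  # -> 0.1589
```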
This paper will revolve around that perhaps curious National League mark, and I will attack it from two main perspectives. First, I will expend most of my energy in evaluating alternative explanations, on the thought that, if I am unable to establish them, then we can feel more confident that the surface trend really does speak to the effect of the lowered mound, an effect contrary to our intuition. Second, I will ask if a lower mound's having a negligible effect on the strikeout rate does perhaps make sense after all, despite the way I first reacted to the possibility. As part of this consideration, I will explore the range of effects that the lowered mound seems to have had, and where strikeouts might fit into this picture.
As a reader, I would probably want to know the bottom line, and might even (God forbid!) skim the piece for it, particularly led towards a narrow focus by what seems to be a clear question that is posed upfront — specifically, what effect the lowered mound had on the strikeout rate. I will warn you, though, that while I found this research very valuable, I never got seriously close to definitively answering the question.
Unlike my recent piece on 1960s saves leaders, this is a work of data analysis and research, not history, but in its contours, it did not really allow for the absolutes one might assume with so much data explored. I was often working without datasets, or was forced to conceive of them on my terms. My questions were rarely of the yes/no variety. The task was more like furnishing legal argument, maybe like one does in constitutional law, than like working out a whodunnit. At the end of the research, I was left noting the relative balance of arguments and the strength of those arguments rather than with a measure or an answer. But not having an answer properly speaking didn't leave me feeling that I hadn't learned anything, or feeling unfulfilled.
Part of that satisfaction came about because even when I failed to land the plane, I accrued the normal benefits one does from ample research. I was the recipient of insights I didn’t directly pursue. I will once again use the analogy of Columbus discovering America even as he didn’t find a more efficient route to India.
I also will at least contend that because I approached these subdomains within the context of a larger goal, I was left with more than just the random knowledge that days of research inevitably produce. I hope you, too, will benefit from some of these fortuitous encounters, and that you will better be able to wait out the slow time because I give you a complete explanation of where I thought I was going. I hope everything did come together on some level. My drift is that organization (while hardly one of my great strengths) is valuable in that it imparts meaning beyond just tangible yield.
All of this said, at the risk of being called a panderer or a hypocrite, I have produced a (still plenty long) accompanying summary at the end of the piece that I encourage you to consult at will. I enjoy the process of doing these — the challenge of settling on the crucial facts, the different rhythm involved in the writing, and the opportunity of really learning my own work.2
I guess one place to start is why we would think there would be an association between mound height and strikeout rate. While strikeouts would probably only have been of indirect interest to baseball, the decision to lower the mound was logically hastened by 1968’s featuring the lowest runs rate since 1908. Nothing says “pitcher” like the strikeout. It is to pitchers what home runs are to batters. I remember a Bill James piece demonstrating that a minimum strikeout rate was necessary for a pitcher to have consistent success over a period of years, so it can be argued they are indispensable to the breed.
Along with the general strong performance of pitchers in 19683, three pitchers put up numbers that seemed to show how lopsided things had become. I am thinking of Gibson's 1.12 E.R.A., Drysdale's 58 scoreless-innings streak, and Denny McLain's 31 wins. Gibson also set a single-game strikeout record in that year's World Series with 17. All three pitchers can certainly be linked to strikeouts. Gibson would retire 2nd on the all-time list behind only Walter Johnson; McLain's 280 strikeouts in 1968 trailed only Sam McDowell in the American League; and Drysdale,4 though coming off his lowest strikeout total in a decade, was a three-time league strikeout champion and a six-time member of the 200-strikeout club. So reining in these pitchers would necessarily include reining in their strikeouts.
Strikeout Trend Lines
In a first attempt to interpret that 15.89% 1969 National League rate as high, low, or indifferent, the 1968 rate of 15.83% isn't necessarily our only gauge. There are surrounding years we can also use. But what makes this difficult is that projection is an art in itself. For a good reading, I would need to bring in an expert, and I have a feeling few are to be found. Whether we are talking horses or stocks or baseball, the age-old question is whether we should seize on trends for a projection or expect regression. For individual players, the informed stance is not to bet on a career year twice, but league-wide base rates may follow a different law. This thought seems particularly pertinent with strikeouts, which, in the major leagues, increased per game every year from 2005 to 2019.
Was something like that happening in the late ‘60s? 1968’s 15.83% was preceded by 15.68%, which was preceded by 15.35%. The increase from ‘67 to ‘68 seems slight (a case of dramatic increase would be the 18.74% to 19.89% in MLB from 2011 to 2012), but it is true that the increase on top of ‘68 was less still, suggesting strikeouts were perhaps below expectations, setting aside the mound-height change.
1970’s National League strikeout result also is in the eye of the beholder. That it was down to 15.37% argues against a rocketing rate, but might also suggest 15.89% was a high fluke. If you see it that way, you’re taking more of a regression approach.
We can make a laboratory of the same years in the American League, but doing this points to nothing decisive.
1966 through 1970: 15.84%, 16.52%, 16.18%, 14.73%, 14.89%.
There was no reason based on that to think that, absent changed circumstances like a lowered mound, the strikeout rate would be higher in 1969 than it had been in '68. These data, including the 2011 to 2012 MLB comparison, also indicate that the fall-off in the American League in '69 was really quite substantial.
The League Strikeout Difference — Could It Be Luck?
That the strikeout rate in the two leagues took on such different progressions can be argued to be every bit as confounding as the National League’s staying steady despite a lowered mound, particularly if the American League change really should be considered notable, as I am coming to think. Just based on what is normative, the difference certainly wouldn’t seem to be luck, but a statistician is more skeptical, and naturally inclined to believe the null hypothesis. Because a change in mound height affecting the leagues in materially different ways also seems unlikely if not implausible, that is all the more reason to take a formal look at the statistics.
In comparing the different league rates, one runs into the issue of non-independence of data points. What one doesn't want to do is analyze with the assumption that each league had one a priori strikeout rate, and that the final league strikeout rate was some approximation of that, run over the 73156 plate appearance encounters, in the case of the National League. If you do this, you are told that the NL 15.89 strikeout percentage had a 0.14 standard error (the American League would have a 0.13 standard error). With a 1.16-point difference between the leagues, you would thus be well on your way to declaring a significant difference between the leagues (a z difference of 6.17 using the test of two population proportions). But this is wrong because far fewer than 73156 individual hitters and pitchers determine that NL strikeout rate, and these hitters and pitchers all have their own strikeout characteristic, different from 15.89%.
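For reference, here is that naive calculation, treating every plate appearance as an independent trial. The NL total of 73,156 non-IBB plate appearances comes from the text; the AL total is not quoted, so the figure below is an assumption used only to show that the arithmetic reproduces numbers in the neighborhood of those cited.

```python
from math import sqrt

def prop_se(p: float, n: int) -> float:
    """Standard error of a single proportion under the naive binomial model."""
    return sqrt(p * (1 - p) / n)

def two_prop_z(p1: float, n1: int, p2: float, n2: int) -> float:
    """z statistic for the test of two population proportions (pooled)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se_diff = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_diff

n_nl = 73_156   # non-IBB PA, per the text
n_al = 74_000   # assumed; the AL total is not quoted above
print(round(prop_se(0.1589, n_nl), 4))                    # ~0.0014, i.e., the 0.14 quoted
print(round(two_prop_z(0.1589, n_nl, 0.1473, n_al), 2))   # ~6.18, near the 6.17 quoted
```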
So, while this is inartful5 and probably goes too far the other way in what it declines to assume, what I did was break the leagues down by hitters who had 200 plate appearances and pitchers who faced 200 batters. I ran two analyses, one on the basis of hitters, the other on the basis of pitchers. Once a hitter or pitcher clears that 200 threshold, it doesn't matter how much playing time he has: his strikeout rate counts as one data point. It's not a perfect approach, but it addresses both the question of significant difference and the problem of non-independence.
I will start with the study comparing league hitters. Organized this way, the mark for each league dropped. The 135 NL batters with at least 200 PA had an average strikeout rate of 14.18%, and the 139 AL batters with at least 200 PA had an average strikeout rate of 13.02%. As luck would have it, the 1.16% gap between the leagues remained identical to the actual difference. The decrease in both leagues is explained by those bad hitters, largely pitchers forced to hit, no longer inflating the strikeout rates.
An unexpected aspect of the question was brought to light by the requirement of first checking whether the leagues differed in strikeout variation before turning to the significance of the difference in means. I am not looking to make your head spin, but the NL standard deviation in strikeout percentage organized by hitter was almost a third greater than the same figure for the AL. The test I ultimately ran to compare the means, a simple across-groups t-test, includes a significance test for variance differences as a screening tool, and it starred this F of 9.92 at the .01 level.
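For those who would like to reproduce this kind of check, here is a sketch of the player-level comparison, with scipy's Levene test standing in for the variance screen (the software actually used above isn't specified, and the column names are assumptions about how the data might be laid out):

```python
import pandas as pd
from scipy import stats

def qualified_rates(batters: pd.DataFrame, min_pa: int = 200) -> dict:
    """One unweighted strikeout rate per batter with 200+ PA, grouped by league."""
    q = batters[batters["pa"] >= min_pa].copy()
    q["so_rate"] = q["so"] / (q["pa"] - q["ibb"])
    return {lg: grp["so_rate"] for lg, grp in q.groupby("league")}

def compare_leagues(nl: pd.Series, al: pd.Series) -> None:
    # Screen for a variance difference first, then run the across-groups t-test,
    # letting the screen decide whether to assume equal variances.
    lev_stat, lev_p = stats.levene(nl, al)
    t_stat, t_p = stats.ttest_ind(nl, al, equal_var=(lev_p > 0.05))
    print(f"Levene p = {lev_p:.3f}, t = {t_stat:.2f}, p = {t_p:.3f}")
```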
The difference made for a new interesting mystery in its own right, but of more pressing concern was the possibility that the higher strikeout rate might not be a universal phenomenon, but the product of a few hitters. However, a higher NL standard deviation could also mean more NL hitters with low strikeout rates than predicted by the mean.
To get to the bottom of the phenomenon, there were a couple of other strikeout rate parameters worth examining.
First, skewness. For a primer, a distribution with commensurate extreme values above and below the mean has skewness of 0. Positive skew is reflected by a positive number, and negative skew by a negative number.
Unlike for standard deviation, there was very little difference in skew by league. It was 0.616 in the National League, 0.608 in the American League.
The value of skewness in this case is really just to make the relation between the means and medians make sense. The more critical question was if the higher strikeout rate in the NL remained on a median basis. And it did. The typical NL player had a 13.47% strikeout rate; the typical American Leaguer, 12.62%.
Just to satisfy my curiosity on the standard deviation front, I looked at the players with the highest and lowest strikeout rates, regardless of league, and then saw how they were divided by league. Since we know that players in the National League struck out more on average and varied more on average, it would have been a surprise if they did not dominate high strikeout rates, and indeed they did. Twelve of the 15 players with the highest rates were in the senior circuit. The trend remained if the 25 highest strikeout rates were broken down: of those, 19 of the 25 players were National Leaguers.
However, where mean and standard deviation conflicted, at the low end of strikeout rates, the National League was more often found there, too, taking up 10 of the bottom 15 slots.
I find this a very strange thing. It’s not easy to explain how it can be easier to strike out a lot in a league and easier to largely avoid strikeouts. I suppose one could come up with a theory about the two divisions in the National League posing different probabilities, what with each playing a somewhat different schedule.
As for the main test (which was, to remind you, whether the 14.18% NL average by batter and the 13.02% AL average significantly differ), the difference scored as marginally significant (p = .075). So, if this level of conservatism with non-independence is justified, we have to be open to the possibility that the high National League mark is in fact not worthy of particular attention, as the difference failed to exceed thresholds often employed, like p = .05. In light of this, one might very well argue that the change in the pooled MLB strikeout rate (all players and plate appearances) of a 0.69% drop from 1968 to 1969 should be our one and only proper focus.
Before pivoting from these results, however, I promised you I would supplement the study grouping by pitcher within league instead of by hitter within league, so I had better live up to my word. For some reason, hitters tend to get mentioned before pitchers, just as couples are usually referred to with the man’s name coming first, but in this as most instances, they deserve equally large roles.
A look at the descriptives made plain that the pitcher analysis would be tricky and nuanced.
NL: n = 113, mean = .1525, SD = .0406, % of LG SO represented = 94.5
AL: n = 117, mean = .1476, SD = .0343, % of LG SO represented = 92.1
First, even organized by pitcher, the National League again rates as more variable, although this time the Levene Equality of Variance test is above the .05 significance level (.103). However, a bigger issue is that grouping by pitcher and not weighting the pitchers by batters faced has changed the nature of the comparison. The American League no longer trails by more than 1% in strikeout rate, but is within 0.50%. So, when we run a significant difference test, the results don't reflect whether the original difference transcended the sample size that produced it, but whether this smaller difference passes muster.
And the significance test is quite emphatic that it does not. The t was 1.00, meaning that a mean difference of 0.49% between groups of this size (n = 113 in one, 117 in the other) is entirely consistent with a random split. To be clear, the difference, organized by pitcher, is not significant.
How to explain the change in mean difference that resulted? You can see from my summary that a relatively low number of plate appearances were discarded by only using pitchers who faced 200 batters, but to know whether that restriction is the culprit, we can figure the league strikeout difference just with these data. In other words, we use the same data, but instead of averaging the pitchers' strikeout rates in each group, we hold off on figuring the rates until summing their strikeouts and batters faced.
Doing this, the NL rate rises notably to .1595, while the new rate of .1484 in the AL is much less transformed. We are almost where we started; the difference in the leagues is back to 1.11%. So, the change in the size of the strikeout difference doesn’t reflect a change in sample, but a different effect of not weighting by batters faced in the two leagues.
The suggestion from this was very much that strikeout rate in the National League correlated more with batters faced than it did in the American League. I generated the respective correlations, and they confirmed this framing as the right one: strikeout rate and batters faced in the National League correlated at .332, while in the American League, the correlation was just .049.
Weird again, right?
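Before moving on, here is a sketch of the two aggregation choices and the rate/workload correlation just described. The input is assumed to be one league's 200-batters-faced qualifiers, with 'so' and 'bf' columns; the column names are mine.

```python
import pandas as pd

def unweighted_mean_rate(pitchers: pd.DataFrame) -> float:
    """Average the per-pitcher strikeout rates: every qualifier counts equally."""
    return (pitchers["so"] / pitchers["bf"]).mean()

def pooled_rate(pitchers: pd.DataFrame) -> float:
    """Sum strikeouts and batters faced first, then divide: big workloads count more."""
    return pitchers["so"].sum() / pitchers["bf"].sum()

def rate_workload_corr(pitchers: pd.DataFrame) -> float:
    """Pearson correlation between a pitcher's strikeout rate and his batters faced."""
    return (pitchers["so"] / pitchers["bf"]).corr(pitchers["bf"])
```

When the correlation is high, as in the NL, the pooled rate sits well above the unweighted mean; when it is near zero, as in the AL, the two figures come out close together.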
To get a further look at both the variation piece and the disparate correlations between strikeout rate and batters faced, I repeated what I did for batters, sorting by strikeout rate, and noting the profiles of the top and bottom players.
In terms of which league pitchers at the top of the list would belong to, the same principle applied as with the hitters: a higher average and standard deviation in the NL certainly predicted NL advantage among the highest strikeout rates. That panned out: of the 15 pitchers with the highest strikeout rates, 11 were in the National League.6
The second component I was interested in, workload difference by league, was harder to gauge, but the only one of those National Leaguers with at least 1000 batters faced was Fergie Jenkins. This was actually a lower percentage of pitchers with 1000 batters faced than in the dataset as a whole, which included 33 pitchers who faced 1000+ batters (14.3%). The NL did, however, also have Bob Veale (980 BF), Don Wilson (979 BF), and Steve Carlton (968 BF), all of whom were in the MLB top 15 in strikeout rate. While the AL had fewer top 15 strikeout-percentage pitchers in general, two of them (Mickey Lolich, 1172 BF, and Sam McDowell, 1166 BF) carried serious workloads.
Going to the bottom 15 of the strikeout list, just as with hitters, standard deviation won out over mean in tipping which league was more represented. Even with the higher mean, nine of the 15 pitchers with the worst strikeout rates were in the National League.
Only one of the bottom-15 strikeout men, the Yankee Mel Stottlemyre (20-14, bWAR 5.7), faced 1000 or more batters. The bottom strikeout rates therefore lent little perspective on the overall correlations with workload in each league.
Thinking of the simplest scenario, I wondered if the disparity between the strikeout proficiency/workload correlation in the leagues was a manifestation of a differential between the strikeout percentage of starters and relievers in the NL and AL. Starters throw more innings, after all. Starter/reliever status could even be looked upon as a sort of confound, obscuring what would logically be a correlation between use and strikeout rate.
In any event, here are the 1969 data.
Starter Strikeout Rates: NL 16.14%, AL 14.35%
Reliever Strikeout Rates: AL 15.64%, NL 15.23%
The difference in the correlation almost certainly does trace to a difference between starters and relievers in the two leagues, then. The data crisscross. Starters score a rather decisive victory in NL strikeouts, while relievers do the same in the American League. And we don't just get a different verdict on whether starters or relievers are the higher strikeout pitchers depending on which league we examine, we get a different result for the higher strikeout league. Finally, it makes sense that not weighting by batters faced leads to a higher relative strikeout rate in the American League than if composite statistics are used.
Again, I think we can question the wherefore, and whether the difference is meaningful. But the split is also certainly not driving the NL’s overall edge. A defensible narrative could be that the AL was the more innovative league, as symbolized by the stardom achieved by Dick Radatz and Hoyt Wilhelm in the ‘60s. But relievers pitch fewer innings, so turning up great relievers is to little avail when it comes to total strikeouts.
I suppose one could argue that in fact the AL was overdoing strikeout relievers, and this was costing them in the final count. That some of these men should have been starting. But this seems not to sufficiently appreciate the greater difficulty of pitching well in a starting role. Pitching well in relief often does not (and probably usually does not) translate into pitching well as a starter.7
1968 starter/reliever data seems needed to complete the picture, both in terms of whether the difference was of more than one year’s standing, and whether a change in it could have led to the NL jumping up so much compared to the AL in total strikeout rate. Here are the 1968 data.
Starter Strikeout Rates: AL 16.11%, NL 16.03%
Reliever Strikeout Rates: AL 16.38%, NL 15.21%
We learn a lot from this, I think. Some of the picture was the same in 1969, but some was different. The interaction between league and role is still striking, and in the same direction, with relievers the better power pitchers in the AL, and starters the better power pitchers in the NL. However, breaking down the data this way shows the relative spike in National League strikeout rate generally. To bring home that point, in 1969, NL starters trounced American League starters in strikeout percentage. By contrast, in 1968, they actually lost out, showing a lower rate. Even just focusing on relievers, NL pitchers closed the gap from a 1.17% strikeout loss in 1968, to just a 0.41% loss in 1969.
In any event, starter and reliever differences, even if they had emerged as a striking change between 1968 and 1969, could be construed as a byproduct of a lowered mound only with very imaginative thinking8. Taking a wider view, this is the difficulty posed in general by the leagues having different changes in strikeout rate from '68 to '69: not only did they both see the same drop in mound height, but they both expanded from 10 to 12 teams, to give a small preview. That makes the hypothesis that the difference was entirely luck attractive.
The statistical tests I did keep this possibility surprisingly viable. However, the analyses required the NL to show a higher rate than the AL, in the process disregarding the fact that the NL was actually a little lower overall in 1968. That initial difference in 1968 was small, but it should be accounted for. In general, I am also guessing that not taking stock of the 1968 data of individual players reduced my statistical power, maybe even dramatically. The results I present here convince me that a random split of some 250 players could produce mean differences such as existed, but I'm not sure those really should be the analytical terms. On the other side, one could say there could be other sources of non-independence not reflected here, like umpires, making observed differences more likely than a straightforward assessment of the sample size would say (although the umpires, too, probably carried over by league almost entirely from 1968, reducing their influence).
Strikeouts as a Second-Order Effect, and Where They Fit in 1968/1969 with Other Statistics
As I said before, the end goal of lowering the mound was likely not to reduce strikeouts, but to increase offense. And that happened in 1969, in both leagues. The intervention worked like a charm. The National League went from 3.43 runs a game to 4.05; the American League, from 3.41 runs to 4.09. Play, however, was quite different in the two leagues, and the differences were actually fairly consistent across the two years, with the exception of the aforementioned strikeouts. Batters walked more in the American League and hit more home runs, while they had a better “batting average on balls in play” (BAbip) in the National League. Perhaps representing a case of “regression to the mean” in action, while the league that got the offensive nod in each case in 1968 retained it in 1969, the gap in all of these categories actually narrowed.
I will first show the gap between leagues and how it narrowed, then show the composite gain from 1968 to 1969.
Comparison of NL and AL
Unintentional Walk Percentage: AL 22.5% higher in 1968, 12.4% higher in 1969
Home Runs/AB: AL 26.7% higher in 1968, 12.5% higher in 1969
BAbip: NL 17 points higher in 1968, 12 points higher in 1969
(2B + 3B)/AB: NL 4.1% higher in 1968, 5.6% higher in 1969
Overall MLB Change by Category, 1968 to 1969
Unintentional Walk Percentage: Up 23.2%, from .0663 to .0817
Home Runs/AB: Up 29.4%, from 10.56/575 AB to 13.66/575 AB
BAbip: Up 7 points, from .269 to .276
(2B + 3B)/AB: Up 3.1%, from 24.17/575 AB to 24.91/575 AB
Strikeout Percentage: Down 4.3%, from .1600 to .1531
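The per-575-AB figures and the percentage changes in this table are simple normalizations; a quick sketch of the arithmetic (the raw home run and at-bat totals in the example are placeholders, not the actual 1968 counts):

```python
def per_575_ab(events: int, at_bats: int) -> float:
    """Normalize a counting stat to a per-575-at-bat rate."""
    return 575 * events / at_bats

def pct_change(before: float, after: float) -> float:
    """Percentage change between two rates."""
    return 100 * (after - before) / before

print(round(per_575_ab(events=1_995, at_bats=108_600), 2))  # -> 10.56 (placeholder totals)
print(round(pct_change(10.56, 13.66), 1))                   # -> 29.4, matching the HR line
```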
It should go without saying that none of these changes happen in isolation. Since the increases in walks and home runs were dramatic, I thought they may have spoken to what did or didn’t happen with strikeout rate.
One interpretation of the increase in walk rate, pending review, would be that pitchers weren’t used to the lower mound, and this caused them to miss their spots more often. If this were true, then pitchers should have issued fewer walks as the season went on. An adjustment period should also have meant that the walk rate stabilized some in 1970, too.
An alternate interpretation of the increase in walk rate would be that pitchers pitched more carefully in response to the greater threat in home runs. If this were true, then one would think that walks may have increased as the season went on and pitchers became aware of just what they were facing. Or the same thing may have come about in half the same way if it was hitters who gained a better understanding of the increased home run potential during the season and consequently worked counts more.
This last theory also seems to suggest that strikeouts might have increased as the season went on, as the offensive approach of taking strikes would logically lead to more strikeouts, too. Perhaps, then, from this perspective, the flat National League strikeout rate between 1968 and 1969 is a misguided focus, and our attention should instead be on the improvement in BB-to-SO ratio. Batters could have been striking out less in the National League in '69 than '68 if they'd wanted to, this theory goes, but they didn't, because they were trying to hit home runs and draw walks. This could also go with the high standard deviation in strikeouts across batters in the National League in '69, with the different strikeout rates seen among players reflecting varying adoption of different strategies in taking pitches.
Something I haven’t addressed is why the home run percentage itself increased so much from ‘68 to ‘69, if we assume it was brought about by the lowered mound. But even as someone who shies away from mechanics, just not having a feel for that kind of science, I think I can see that this relation makes a good deal of sense. Pitchers unable to throw from the same downhill angle were unable to generate the same ground ball rates, and that led to more home runs. While now I feel stupid, with the benefit of having the data, I confess that a link between mound height and home runs probably actually makes more sense than a link between mound height and strikeouts.
With all of my talk about intra-season trends, perhaps you can see that I am building up to presenting statistics by month, specifically walk and home run statistics by month. Since the year-to-year trend was the same in each league, just differing in size (one league merely having a big increase, while the other had a really big increase), I combined both leagues below. Walks are again unintentional walks only.
April: 8.53 BB%, 13.83 HR/575 AB
May: 8.43 BB%, 14.26 HR/575 AB
June: 8.17 BB%, 14.00 HR/575 AB
July: 7.85 BB%, 14.16 HR/575 AB
August: 8.01 BB%, 14.72 HR/575 AB
September/October: 8.21 BB%, 11.26 HR/575 AB
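A sketch of the month-by-month aggregation behind this table, with both leagues combined. The input is assumed to be league totals by month with 'month', 'pa', 'bb', 'ibb', 'hr', and 'ab' columns; those names are mine, not from any particular source.

```python
import pandas as pd

def monthly_rates(monthly_totals: pd.DataFrame) -> pd.DataFrame:
    """Unintentional-walk percentage and HR per 575 AB for each month, leagues combined."""
    g = monthly_totals.groupby("month").sum(numeric_only=True)
    out = pd.DataFrame(index=g.index)
    out["bb_pct"] = 100 * (g["bb"] - g["ibb"]) / (g["pa"] - g["ibb"])
    out["hr_per_575_ab"] = 575 * g["hr"] / g["ab"]
    return out
```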
It would be easy to make something of walks decreasing from month-to-month in the first three intervals, but having noted an elevated April rate over each of the last three years as well, I think this is a common thing unrelated to getting acclimated to the mound. It is also not like the walk rate ever descended to near its 1968 level of 6.63%. So, the best statement about the walk rate, in my opinion, is that it was much higher from the get-go, and did not change markedly during the season (the April/July difference is only 8% — 0.68/8.53).
The explanation, I can’t begin to tell you. A lower mound seems to have just leveled the playing field, so to speak, from a walks perspective. Or maybe what was going on had nothing to do with a lowered mound (we will explore the possible effect of expansion shortly). That walks were up because pitchers needed time to get used to the lower mound is also belied by the fact that walks in 1970 were very slightly up from 1969 (8.29% across MLB, vs. 8.17 in 1969). It’s possible that the lower mound did more to screw with umpires than pitchers, changing the appearance of pitches. But if it had this effect, again, the data say it was a permanent effect.
The story with home runs was very much similar to the story with walks. The home run rate for a median month in 1969 was 14.08 HR/575 AB. Considering the colder weather in April, the 13.83 April rate was very strong. So I don't see any reason to think that batters gradually warmed to the home run possibilities like the weather. I expected walks and home runs might fluctuate together, but the truth is that, except for a big September fall-off in home runs,9 they were both consistently elevated versus 1968. The evidence is of a condition change, not of players playing differently as time went on and exploiting new possibilities. And, while we can't absolutely know that batters weren't all thinking "home run" more from the outset of the season with an understanding of the possibilities the lower mound presented, that the home run rate was elevated right off the bat certainly seems to argue against the theory that the strikeout rate largely spoke to batters' increased efforts to "go for the downs" in 1969.
Given much of the minutiae I tackle, it is a hole in my knowledge I should have long ago rectified, but I admit I don’t have anything approaching an authoritative grasp on just how walks and strikeouts relate in different contexts. Often when a player’s strikeouts rise, it seems his walks come along with them. In my estimation, the reason this mostly comes about is because he takes strikes more than he used to or swings through more pitches, and thus gives the pitcher more chances to walk him. This model only seems applicable, however, when strikeouts actually go up. That was not the case in the 1969 National League, when strikeouts were flat (strikeouts were perhaps surprisingly high, but only walks went up). The solo increase in walks indicates a lot more balls thrown, not just more opportunities to throw them. More balls by themselves don’t do anything to increase strikeouts.
The Expansion Piece
Introduction
In this open-ended exercise of trying to find other explanations for statistical differences between 1968 and 1969 other than the lowered mound, there is one easy one to latch onto, which is expansion. Both the National and American League went from 10 teams to 12 teams.
Just what effect expansion may have had requires research and informed speculation, but however it showed up, we can take it as a matter of faith that the play was at a reduced level. Even if one believes a league by default improves from year to year through the delayed influence of more people over time playing the game and more people playing it seriously, it is implausible this gentle upward slope could overnight offset league membership becoming twenty percent less exclusive.
To measure roughly the effects of expansion, we can try reasoning backwards. It is awfully hard to make blanket statements about what better and worse leagues look like statistically, particularly around strikeouts and walks, and the balance almost certainly depends on just how high on the ladder of play we’re going. But it doesn’t do any harm to organize what we know, and may even be helpful as a sort of brainstorming exercise.
While some may harrumph at hitters today keeping their swings long with two strikes and giving away at-bats, I think the broad consensus and the clear-eyed view is that the pitching stuff is often just too nasty to hit now. The increase in strikeouts today is a manifestation of good pitching beating good hitting, then. Or, maybe, one could say that more strikeouts reflect that the pitching space had more room for improvement than the hitting space.
But in light of this inference (that the greater number of major league strikeouts today than in 1969, say, is a manifestation of improved play), note that what happened in the NL in 1969 with strikeouts was the opposite of what such an understanding of expansion would dictate. Play got worse, and strikeouts increased (or at least, controlling for the lowered mound, they did). Of course, if one doesn’t buy my theory about the reason for the increase in strikeouts today, there is no contradiction.
After strikeouts, I am primarily interested in how walks were affected by expansion. The reason is that, first, walks are hard to divorce from strikeouts. Second, the 1969/1968 statistical change in walks was so profound that it is natural to wonder if expansion played a role, and if so, how large a one.
To have a handle on general lightly held opinion is difficult, but I am almost positive I have heard multiple people talk about the watered-down pitching of expansion having greased Maris’s path towards 61 home runs, and think I have heard sophisticated presumption as well that expansion gives opportunity to pitchers who create a walk-filled game. So, for some reason (and it could even specifically trace to what happened in this ‘69 season), perhaps there is a perception that expansion increases walks, everything else being equal.
This past off-season, I took notes on the data of a couple of leagues, those being last year's college ACC, and the most recent Dominican Winter League. Lining those leagues up with the major leagues, walks per 9 innings run inverse to the level of play: in the majors last year, there were 3.1 walks per 9 innings, while in LIDOM there were 3.7, and in the ACC, there were 4.5.
So those are a few stray points, and evidence of a sort linking more walks with the worse play that we can assume is a byproduct of expansion. But a curiosity is that people don't seem to consider that expansion could work the other way. They don't seem to realize that sub-par hitters are also introduced into the pool by expansion, sub-par hitters who on their end might subtract from the league walk total and add to the strikeout total.10 So the question really is one of the net effect of the advent of borderline hitters and pitchers.
The question, then, was relatively clear. How to go about answering it was another matter. The difficulties were all too apparent to me. Inevitably, by the time I had gathered the data I needed for a particular study, I had resigned myself to the idea that the results I would get really wouldn't mean anything, that they would be fine as far as they went, but they wouldn't go very far. Yet I still felt they might add to an overall story, and also suspected that if I didn't do the study, someone would rightly ask why. Not wanting to leave any stone unturned, I plowed ahead.
I am quite sure I didn’t proceed in just the order that I will be presenting the mini studies. My organizational structure is to start with what I consider to have been my worst plan of attack and work my way forward to my best. My hope is that this will keep the larger questions in front of you and give you a window into my developing thought process. But if I hold off on pointing out limitations of the studies until I have presented their results, this does not mean I think the approach was sound. Indeed, I am reporting on the studies as much to explain how they guided my next steps as for their independent value.
Statistics of Expansion Teams Themselves (Expansion Study 1)
New teams have been rare since the founding of the American League. Counting NL and AL expansions in the same year as two expansions, there have only been eight expansions11 and 14 expansion teams. So there is actually not a lot to study, although for some unfathomable reason this was a revelation to me.
A straightforward and beginning approach would be to look at the expansion teams themselves in terms of their walks and strikeouts. Since offensive and defensive walks and strikeouts are at issue, we are really looking at four categories, not two. Again the question is what net impact these teams had on walks and strikeouts. Cumulatively, did they add or subtract walks to the league? What about for strikeouts?
Comparing to league averages is generally kosher anyway, but because a team's goal on offense is the opposite of its goal on defense, it is a necessity for framing expansion teams' data correctly, given my interest here. I defined both walk and strikeout rate for both the teams and their leagues just as I did in analyzing 1966-1970 data, which is to say, excluding intentional walks, and excluding intentional walks as plate appearances.
Presented below are the mean and median for the four components for the 14 expansion teams as a group. It can be difficult to process these small numbers, so keep in mind that .0100, for instance, might refer to a team that struck out 16% of the time in a league that struck out 15% of the time. Positive numbers always indicate more walks or strikeouts than league average, and negative numbers fewer walks or strikeouts. In the case of pitching walks and hitting strikeouts, however, negative numbers are good. Note that the differentials are imperfect because I didn't take out the subject team from the league in figuring the difference from the league. That is more consequential in the case of 1961 and 1962, when the leagues expanded to 10 teams, than in 1998, when they expanded to 14 and 16 teams, respectively.
Offensive BB Diff: Mean -.0019, Median -.0035
Defensive BB Diff: Mean .0076, Median .0079
Offensive SO Diff: Mean .0124, Median .0146
Defensive SO Diff: Mean -.0071, Median -.0039
Not surprisingly, by average and median, expansion teams have been below average on all four indices.
Accurately reading strikeouts and walks for the purpose of comparing them requires some context. Strikeouts are a more common occurrence, and so tend to have more variation, which means more difference from league average. Across the expansions, the average league strikeout-to-walk ratio approached 2.00-1. But even making that mental adjustment, one can see that expansion teams have been much less deficient in their offensive walks (-.0019 mean) than their offensive strikeouts (.0124 mean).12
Besides walking some at the plate, the other component where expansion teams certainly haven’t disgraced themselves is with their pitchers striking out some batters. Note that, by median, the Defensive SO Diff, despite the larger scale, is almost as close to 0 as the Offensive Walk Diff.
So, in terms of walk/strikeout performance, expansion teams’ relative strengths have been walking on offense and striking out hitters on the pitcher mound, and their relative weaknesses have been walking hitters and striking out at the plate.
To make a total account, and to address the overriding question of a potential effect on league average, we can add the numbers for the two walk components and the two strikeout components. The average differential for an average expansion team in the two categories is therefore
Walk Effect Total: Mean 0.57%
Strikeout Effect Total: Mean 0.54%
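A sketch of the bookkeeping: each component is the expansion team's rate minus its league's rate (intentional walks excluded), so a positive number always means more walks or strikeouts than average, and the effect totals are simply sums. The inputs below are the group means from the table above, so rounding of those inputs explains landing on 0.53% rather than 0.54% for strikeouts.

```python
def walk_effect_total(off_bb_diff: float, def_bb_diff: float) -> float:
    """Net walks a team adds to (+) or subtracts from (-) its league, vs. average."""
    return off_bb_diff + def_bb_diff

def so_effect_total(off_so_diff: float, def_so_diff: float) -> float:
    """Net strikeouts a team adds to (+) or subtracts from (-) its league, vs. average."""
    return off_so_diff + def_so_diff

print(round(walk_effect_total(-0.0019, 0.0076), 4))  # 0.0057, the 0.57% mean
print(round(so_effect_total(0.0124, -0.0071), 4))    # 0.0053, ~ the 0.54% mean
```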
To what degree these differences might come into play at the entire league level is another question, but we can at least note the direction. The data of expansion teams have been of a piece with what happened in the National League in 1969, which is to say, with a league that had an unexpectedly high walk and strikeout rate.
The reliability of the trend for both walks and strikeouts doesn’t quite hold up under scrutiny, however. At least, not if you think median matters.13 The total walk and strikeout effects can be figured by team, and for both variables, only seven teams had positive scores. The median total effect is just a matter of the mean of the 7th and 8th highest cases, and for strikeouts, it was actually negative.
Let’s take walks first and see how the data break down. There were five expansion teams above league average in walking, and four better than average at not walking hitters, but those were two separate groups, with no overlap. No team was on both lists. So, right off the bat, we know that the tendency of expansion teams to have cumulatively positive walks can’t have always applied. We know it didn’t hold for the 1961 Senators, the 1962 Colt .45s, the 1969 Royals, and the 1998 Diamondbacks. They walked a below-average number of hitters, while walking a below average amount on offense themselves, subtracting from the league walk total.
For the other five teams, those who stunk in both offensive and defensive walks, the direction of “walk effect total” depends on the absolute value of the two numbers. As it turns out, three of these cases were of teams whose offensive walks were farther below average than their defensive walks were above average. So that’s a total of seven teams, then, seven of 14, at variance with the rule.
Oddly, skewness (0.487 with a 0.597 standard error of the skew) does not seem to show “walk effect total” as a highly positively skewed variable. But the negative “walk effect totals” top out at just -1.23%, while six of the seven net positive sums were greater than 1.23%. So that explains a clearly positive mean. Even forgetting about the mean, if one takes note of the individual values (and it’s not a function of an individual positive outlier),14 a distribution around +0.57% fits better than a distribution centered around 0. The mean would be a better indicator for prediction going forward, I think. But the complete breakdown does poke holes in the idea that a positive “walk effect total” is anything like a given for an expansion team.
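The 0.597 figure is the usual standard error of sample skewness for n = 14, and it is easy to check; scipy's bias=False option gives the adjusted sample skewness most packages report. The 14 "walk effect total" values themselves are not reproduced here.

```python
from math import sqrt
from scipy.stats import skew

def se_of_skew(n: int) -> float:
    """Standard error of sample skewness for a sample of size n."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

print(round(se_of_skew(14), 3))   # -> 0.597, as quoted for the 14 expansion teams
# With the 14 walk effect totals in hand:
# skew(walk_effect_totals, bias=False) should land near the 0.487 quoted above.
```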
Now, to dissect strikeouts in the same way. Three pitching staffs (the ‘61 Angels, the ‘69 Seattle Pilots, and the ‘62 Colt .45s) struck out an above-average rate of hitters. But, as the Colt .45s were also an above-average offensive strikeout team (meaning that they struck out less than average), they have to be dealt with separately. The total rate in Pilots games was +2.11%, but they were no match for the Angels, who had a cumulative 5.75% differential.15 So, here we do have an outlier; the corresponding z for the Angels is 2.90 (theoretically a 1-in-500-positive value), and skewness for “strikeout effect total,” 2.049.
The sign of the Colt .45s' cumulative score depended on which part of their good strikeout data, pitching or hitting, was better. Their sum was 0.79%, indicating it was their pitching.16
Other than the Colt .45s, the ‘77 Mariners were the only team that struck out less than league average, although their offensive differential was only -0.29%. They did not defy expansion odds on both fronts, failing to strike out a league average rate on the mound. However, they also tracked fairly close to center there, with a -0.47% differential. So they were a team that was much less impactful in lowering league strikeouts (a strikeout effect total of -0.76%) than the Angels were in raising them, the pertinent fact when analyzing a contribution to the mean.
At this point, we have only looked at teams that were better than average in at least one of the strikeout categories, and three have had positive cumulative strikeout differentials, and one has had a negative.
The 10 other teams, then, were below average in both hitting strikeouts and pitching strikeouts, and just as with a two-checks good team like the Colt .45s, the sign of the overall effect depends on the larger absolute value of the two strikeout components. As it happens, six of the 10 teams were more deficient in pitching strikeout percentage than hitting strikeout percentage. And that explains how the overall count of positive and negative strikeout effect totals ended 7-7: we add this 4 positive/6 negative split to the previous 3-1.
Of the negative nets among these teams with conflicting signs, none was larger in absolute value than the '62 Mets' -1.13%. On the positive end, the '93 Marlins were +1.57%, but that's a pretty tame total. Ultimately, the "mixed" verdict for each of these 10 teams stands as quite a fair description even after the distance from 0 for both components has been noted.
Outliers like the ‘61 Angels are sometimes illustrative and sometimes provide important clues, one of my graduate school teachers told me, and four of the largest absolute values were teams on the positive side of the strikeout ledger. So, like with walks, I would not say the data reflect even balance. And like with walks, the suggested net effect is a league increase, something that actually happened in the case of both variables in the National League in 1969. What trend there is that can be asserted among the expansion teams is even weaker for strikeouts than for walks, however.
To advocate for a smaller sample might be blasphemy, but it can be argued that my analysis here is overly broad. That, if the idea is that expansion teams’ statistics are deserving of special note, and that they affect a league’s total, then my only concern should be with the 1969 expansion teams, since I am trying to understand what happened in 1969.
To take this subset of four, as it happens, their data are very similar to the larger group's. For both walks and strikeouts, two teams had positive effects and two negative, but the means for both walks and strikeouts were positive, with averages of +0.59% and +0.48%. Those are essentially indistinguishable from the total expansion group means. The individual team numbers were as follows.
Walk Effect Total
Expos +2.60%
Pilots +1.24%
Padres -0.59%
Royals -0.90%
SO Effect Total
Pilots +2.11%
Padres +0.21%
Expos -0.20%
Royals -0.20%17
To this point, little I have shown has really been inconsistent with the idea that the general nature of expansion teams contributed to the intricacies of the 1969 league data. However, remember that strikeouts in the National and American leagues went in opposite directions. The surprisingly high number was in the National League, yet it is the American League Pilots who drive the positive SO Effect Total. The Padres and Expos had hitter and pitcher strikeouts that essentially canceled themselves out.
Pre-, Post-Analysis of Statistics in Other Expansion Leagues (Expansion Study 2)
God knows, I have no qualms about being quantitative even at the risk of taxing my readers. And you might say that I could get a definitive idea of the impact of the expansion teams by calculating the league walk and strikeout rates without their games included, and seeing how that changed from the year before. But in truth, I feel strongly that a laser focus on expansion teams’ data is misplaced. For expansion does more than just create expansion teams, it affects all of the teams in a league. It is probably true that the majority of players on expansion teams are marginal players who sometimes wouldn’t have been in the league without expansion, and more often wouldn’t have seen more than occasional action. However, the players who take the place of the players used to fill out the expansion teams also are part of the dynamic. To focus on the expansion teams to the exclusion of other teams surely then doesn’t give the whole picture, although it can be argued that paying the expansion teams more heed makes sense.
This then brings us to a second way of analyzing the effect expansion has on walks and strikeouts, which is to look at the whole league changes in the other six leagues that have seen an expansion. Our interest is in pre- and post-; there is no necessity of throwing the seasons right after the first expansion season into the mix.
By my standards at least, hopefully I can make this short and sweet. Looking at total walks, four of the six leagues saw increases, and going by strikeouts, three of the six leagues did. Given the recent period when strikeouts regularly increased with a pace that exceeded even Dick Morris’s wildest dreams for his candidates’ poll numbers, it may seem strange, but the increases came in the early expansions, and the decreases in the later expansions. For instance, 1998 saw decreases in walks and strikeouts in both leagues (it is important to note, however, that these were certainly the mildest expansions, being the only ones that added just one team, and to leagues of the largest existing size).
Although only three of six leagues saw increases in strikeouts (all expansions before 1980), the average strikeout gain of 0.25% was greater than the average walk gain of 0.08%. The numbers are put into context looking at 1969, however. The largest strikeout gain was for AL 1961, and that was 0.76%. Strikeouts in the AL in 1969 went down 1.45%. In terms of walks, none of the leagues showed an increase anything like the 1969 increases of 1.73% in the National League, and 1.34% in the American League. The greatest increase in walk percentage was the 0.25% in AL 1961. Scientists would tell us we can't conclude from this that the lowering of the mound was the reason for more walks in 1969, but given the history of six other expansions, if expansion had anywhere near that effect on its own, I think we would know.
I don’t know that the design of comparing the statistics of the entire league before and after the expansion in the other six cases is convincing, but I do think it ended up being informative. We still don’t really know what expansion does, but we know what it doesn’t do, at least in terms of walks. Strikeouts remain the bigger puzzle, just because the National and American Leagues were affected so differently in 1969.
A Comparison of The Statistics of the Worst Players by WAR (Expansion Study 3A)
The third methodological approach I adopted seemed to do a better job of speaking to the actual question, and so I began the research with genuine enthusiasm. But in one respect it was uniquely weak among the three: the parameters of Study 3A were fuzzy and dependent on my own judgment. We know what an expansion league is, and we know what an expansion team is, but who, exactly, is an expansion player?18
My idea was to get past stand-ins for the expansion effect, and to try to measure it directly. Comparing a whole expansion league to a whole non-expansion league and thinking one is getting to any answers requires believing on faith that the difference in the walk and strikeout totals is a product of the expansions. There are probably other ways to characterize why correlation is weaker evidence than causation, but at the heart of the problem with correlation is really its indirect nature, and that is a fair description of drawing inferences about expansion from an expansion league.
So what I wanted to do was direct measurement. I wanted to find expansion-level players, and capture and analyze their statistics. The method again is that this is done for hitters and pitchers, and then their walks and strikeouts are compared. The guiding question is, what is the cumulative effect in a league of having more bad hitters and bad pitchers than it had previously, in terms of walks and strikeouts?
I ran my procedure for three leagues: 1969 National League, 1993 National League, and 2024 MLB. The argument for using actual expansion years was to hopefully be able to identify the walk and strikeout profiles of the kinds of players perhaps let in to the league at that time. It did cross my mind that using non-expansion years instead might have the advantage of adding independent data. It could be argued, if rather elaborately, that making a study of a year when I already had the overall data biased my results in favor of the same trend. However, we don’t know that expansion, as opposed to, say, the lowering of the mound, was indeed the reason for the change in statistics, and this study is intended to figure that very thing out. It is also true that since expansion was attended with a different effect for walks and strikeouts in 1993, sampling two expansion years put us in a unique position to see if the statistics I gathered did indeed correlate with the outcomes of concern. As for the reason for including 2024, I suppose I thought doing so might be fun, and indicate how the game was different today.
So, to finally get into specifics, in this study I figured bWAR-per-plate appearance for every hitter and pitcher in the league. I am of course aware that hitters add and lose WAR by means other than plate appearances, but I thought plate appearances was the best stand-in for position-player playing time. For the definition of the worst players in the league19, I bandied about players at the level of the bottom 7.5% or the bottom 15%, but I understood that if that was the kind of player I wanted to capture, that figure had to be the group's median, not its threshold, or the group would end up centered on a player half as far up the ladder as I wanted. So I put my threshold at the bottom 25%, thinking I'd be looking at about the bottom 12.5% player.
Many of my decisions later had me thinking that I had not conceptualized issues well, but one that had perhaps the unique distinction of being somehow simultaneously weird and uninspired was how I got to this bottom 25%. Thinking in terms of plate appearances, not players, I sorted the players from worst to best in terms of WAR-per-PA, and counted the cumulative PA of these players. Once I got to a quarter of the league plate appearances, I had my group representing the bottom.
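A sketch of that selection, under the assumption that the input is a DataFrame of one league's hitters with 'war' (bWAR) and 'pa' columns; the column names, and the choice to include the player who crosses the 25% line, are mine rather than necessarily the author's exact steps.

```python
import pandas as pd

def bottom_quarter_by_pa(hitters: pd.DataFrame, share: float = 0.25) -> pd.DataFrame:
    """Worst hitters by WAR-per-PA, accumulated until about 25% of league PA is covered."""
    df = hitters.copy()
    df["war_per_pa"] = df["war"] / df["pa"]
    df = df.sort_values("war_per_pa")          # worst rate first
    cutoff = share * df["pa"].sum()            # a quarter of the league's PA
    cum_before = df["pa"].cumsum().shift(fill_value=0)
    # Keep players until the running PA total first reaches the cutoff
    # (the text says the author stopped at whichever point left him closest to 25%).
    return df[cum_before < cutoff]
```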
I don’t know that it cost me anything, although the nature of these comparisons did hinge on precision, but one consequence of my method was that players who played more were weighted more, and that was true no matter how they played: whether they were really bad, or had a WAR-per-PA that just barely placed them under the cut. So the quality and characteristics of the cumulative player represented by the method perhaps depended on whether there happened to be a high number of awful players who played a lot, or simply merely bad players who played a lot. And this does seem to take us in a direction that has nothing to do with what the bottom of a league might look like in one year versus another, and how that bottom might change in terms of an expansion.
Another decision that can be questioned is that I compiled the data for the year of expansion, and not for the year before. In a perfect world, one would probably want to identify “four A” players in a year before expansion to see the characteristics of the players who might have been promoted through the expansion. How this could be done for players who didn’t actually play MLB ball in 1968 is unclear, however. And because I was also looking at effectiveness (WAR/PA) and not actual playing time in the year of expansion, it’s not like I was only looking at players who only assumed backup roles.20 I was rather getting general characteristics of the league bottom.
While walks and strikeouts are hitting stats and separate from fielding and base running21, it was realistic and right for me to define quality of play in terms of WAR. If I had just used a pure hitting statistic to define the league’s worst players, I would have been incorrectly assuming that all teams care about is hitting, and exaggerated the negative effect on hitting statistics with a looser league entrance requirement. Some of those players on the outside looking in before expansion will actually be better hitters than ones rated above them, but worse fielders and base runners. If we want to simulate expansion, the holistic quality of WAR is the way to go, all the more so if the effects analyzed are offensive effects.
There is generally no separation of the league that produces the bottom 25% of plate appearances exactly, so following my sort, I took the cumulative plate appearance total that left me closest. This seemed to work well, and no unforeseen problems cropped up in its execution.
For the 1969 and 1993 NL, hitting pitchers were excluded as bottom offensive WAR players. The tweak also extended to adjusting the size of the starting pie that determined the bottom 25%, and using league statistics without hitting pitchers for the comparison points. In the same vein, the pool of 2024 pitchers did not include actual position players used on occasion for expediency, and position-player pitching statistics were removed from the comparison statistics.22 I will spare you my exact steps for cleaning the position player and pitcher groups of those not doing their main job, but having checked my work many times, I’m confident I was successful. Please contact me for details if you would like them.
My variables of interest were of course walks and strikeouts. For these, I figured the cumulative rates for the bottom position-player and pitcher groups.
I also documented just how bad the groups were according to bWAR. An estimate of that was interesting in its own right, but also provided some perspective on whether differences between hitters and pitchers for walks and strikeouts just reflected those particular categories, or more reflected the overall difference in the strength of the groups. The WAR statistic I give is the cumulative rate per 625 plate appearances.
I also figured cumulative OBP and SA for the bottom hitters and pitchers, and expressed these relative to the league average.
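For readers who want the scaling spelled out, here is how I read the rate statistics reported below, as a small sketch; the function names are mine, not anything from the study.

```python
# Sketch of the rate statistics used in this section; function names are illustrative.

def war_per_625(total_war, total_pa):
    """Cumulative WAR expressed per 625 plate appearances (or batters faced)."""
    return total_war * 625 / total_pa

def rate_diff_pct(group_events, group_pa, league_events, league_pa):
    """Walk or strikeout rate, group minus league, in percentage points."""
    return 100 * (group_events / group_pa - league_events / league_pa)

def points_vs_league(group_avg, league_avg):
    """OBP or SA relative to league average, in points (e.g., .278 vs. .320 -> -42)."""
    return round((group_avg - league_avg) * 1000)
```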
O.k., time to cover and discuss results.
1969 NL, Topline
WAR: Position Players -1.48/625 PA, Pitchers -1.03/625 PA
BB: Position Players -1.19%, Pitchers +0.79%
SO: Position Players +1.77%, Pitchers -2.26%
The first thing I would note here is that these results clearly do not mesh with the changes I have reported for NL 1969. If those changes were mostly brought about by expansion, then the net walk differential in this analysis should first and foremost be very positive, since walks increased by 29%. But the cumulative walk differential here is negative (-0.40%).23 What that means is that bottom hitters were farther below average in drawing walks than bottom pitchers were above average in issuing them.
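To spell out the arithmetic behind that cumulative figure (a small sketch, using the topline numbers above):

```python
# The "cumulative" walk figure is just the sum of the two differentials
# from the 1969 NL topline above.
pitchers_bb_diff = +0.79   # bottom pitchers walked batters at a rate 0.79% above average
hitters_bb_diff  = -1.19   # bottom hitters drew walks at a rate 1.19% below average
net_bb_diff = pitchers_bb_diff + hitters_bb_diff   # -0.40%: a projected net loss of walks
```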
Then the whole behemoth of a piece here revolves around the NL strikeout rate’s having been curiously high, given a rule change designed to increase offense. Yet, if you subscribe to the logic of this design for assessing expansion, bad pitchers were more likely to lower the strikeout rate than bad hitters were to raise it.
Trying to piece the two parts together, the actual 1969 data and this analysis of bottom players, is complicated, and unfortunately not apples to apples. Specifically, a devil’s advocate who wanted to explain away the gap between the walk differential of these potential expansion players and pitchers and the actual change that occurred in NL ‘69, and who wanted to say that these results do not challenge the expansion explanation, could say that the bottom pitchers I captured here (-1.03 rate bWAR) were better than the bottom players I captured here (-1.48 rate bWAR).
I have a couple of rebuttals. First, the finding of this overall difference could be said to be part of the overall weakness of the expansion argument — that, while we are interested specifically in walks, if those greatly increased because of expansion, then we probably would have found that expansion-level pitchers were worse than expansion-level players, too. Then, regardless of how the cumulative bWAR numbers for players and pitchers came out, data for players and pitchers just often look different. I’m not sure we can take the bWAR number as absolute evidence that the expansion stand-in pitchers were, in fact, worse than their position player counterparts. The groups are fair and representative from the standpoint that both represent the bottom 25%.
I can be that devil’s advocate, too, and the argument can go on all day long. If hitters’ and pitchers’ bWAR is apples and oranges, maybe hitters’ and pitchers’ wider data, most pertinently walks and strikeouts, are as well. It is at least certainly true that whatever conclusions we reach regarding theoretical effects of expansion may have to do with the relative extremity of hitters and pitchers, and not expansion per se.
Regardless, in this model of expansion, results contradicted the reality of 1969 NL, a failure that must be given some weight.
1993 NL, Topline
WAR: Position Players -1.84/625 PA, Pitchers -1.21/625 PA
BB: Position Players -1.47%, Pitchers +1.09%
SO: Position Players +2.44%, Pitchers -1.67%
As walks and strikeouts in the NL 1993 were little changed from the year before, it is unclear how much weight to give these results when evaluating the hypothesis that expansion increases both walks and strikeouts (relative to a lowered mound), as happened in the 1969 NL. An argument can certainly be made that these results are nearly as informative as bottom-level results for any other year, on the theory that a one-to-one link between borderline players in the “year of” and actual league change depends on a fanciful notion of tidiness. In this conception, each expansion model provides one data point, and with enough gathered, we might get some idea of probable trends.
In any event, with strikeouts, no clear picture is emerging, as unlike 1969, this analysis found bubble hitters to be worse in strikeouts than bubble pitchers.
The walk data did resemble their 1969 counterpart, however. Position players’ walk shortfall exceeded pitchers’ walk surplus by 0.38%, and in 1969, the corresponding gap was 0.40%. Again, the walk comparison was in line with WAR, which also showed position players worse.
2024 MLB, Topline
WAR: Position Players -1.30/625 PA, Pitchers -1.65/625 PA
BB: Position Players -0.67%, Pitchers +1.13%
SO: Position Players +2.52%, Pitchers -2.64%
The procedure applied to 2024 did yield the first evidence of an expansion having the potential to add walks to a league’s total, as bad pitchers were 0.46 percentage points farther from league average than bad hitters. Again, however, the direction of walks followed WAR generally, which shifted for 2024: bottom pitchers were cumulatively worse this time than bottom position players.
Strikeouts came out a push. With strikeouts higher overall in 2024 than in 1993 and 1969 (not to mention virtually every other year), both the PP and pitcher differential were the highest we have seen yet, but they were about equally high.
All Three Years Average, Topline
WAR: Position Players -1.54/625 PA, Pitchers -1.30/625 PA
BB: Position Players -1.11%, Pitchers +1.00%
SO: Position Players +2.24%, Pitchers -2.19%
When we view the data this way, it is very easy to say the data reflect no real difference between bottom hitters and pitchers on walks and strikeouts, and that expansions in turn should have the same absence of effect on league totals. The best guess at the 1969 mound-lowering effect, then, need not account for expansion, in this analysis.
While I am generally supportive of this position, at least if I have no more to go on than what I have presented so far, I think it is an overreading of the data and places excessive faith in this particular study. My results deviated from year to year, and that creates more doubt than if the walk and strikeout differentials were near 0 on average and about the same every year. Even if the model is relevant, and WAR-per-PA a sound approach, it is well-nigh impossible to know exactly how differences in the differentials translate to actual changes in an expansion year, and this uncertainty makes it harder to interpret results that differ on the surface. Was the variation from case to case just of expected size, which is to say, random? It is certainly plausible that it wasn’t random. It is possible that the relative strength of bad hitters and pitchers is not the same in one year or another, particularly when we are looking across decades. But the overarching concern is just the previously discussed doubts about the methodology, particularly the great differences in weight assigned to players.
While theory doesn’t lead me to expect a result in one direction or another, I also am hesitant to embrace the null result because I take the proposition that bad hitters and bad pitchers are equally bad as dubious. Baseball is at least full of outcomes that are more controlled by the hitter or the pitcher, with one example being that home runs have more to do with the batter. No major league pitcher is going to make Myles Straw into a home run hitter. Whether these examples mean that there are overall differences in quality (or maybe, better put, in variation), of course, is another matter.
The one thing that is clear is that the study settled nothing, meaning that it absolutely did not establish a clear walk or strikeout effect, either in terms of projected increase or decrease. Both variables were split “2-1” between hitters and pitchers over the three years. Bad hitters were worse than bad pitchers on two of three occasions in walks, while bad pitchers were worse than bad hitters on two of the three occasions in strikeouts. Leading to even less clarity, for both variables, the one decision that did not go with the majority showed a bigger difference than the two that did.
Bottom WAR Position Players vs. Pitchers: the Numbers Over Time, and Factors Affecting Them
My reason for showing cumulative WAR along with walks and strikeouts was to give you an opportunity to evaluate the hitter/pitcher ratio in light of the WAR, although I don’t think it’s obvious that’s the correct line to take. But I am sure some people will be interested in what the WAR numbers show beyond just statistical context. Certainly, the broad design I used lent itself to potentially powerful insights.
But unfortunately we have to think about what the data actually mean, and how the variables were operationalized, not just what it seems like they might mean. If I were more “big picture,” I would doubtless have more fun and draw more dramatic conclusions. But the sad truth is that, in analysis, if you are just a little bit wrong, particularly in more than one place, you will usually end up completely wrong. And my goal is to be right, not dramatic.
To address the more black-and-white aspects, I was surprised to find that negative WAR is as common as it is. I will look upon WAR somewhat differently now. One perspective is that if 0 represents a replacement player, basically everyone should be above 0. But actual WAR numbers seem to say that there are a lot of guys every year who should get fired.24 This is mostly definitional, not something to endorse or reject, but it is interesting.
So that you have the numbers all in one place, here again was cumulative WAR per 625 plate appearances (or batters faced) for the bottom position players and pitchers in each year. The meaning of that “bottom” is again tricky: defined by WAR-per-PA, but with the bottom-quarter mark defined not by number of players, but at the WAR-per-PA point where 25% of league plate appearances had been reached. That mark seemed to vary moderately at most, but I’ll show it as well just for the sake of clarity.
Position Players: 1969 -1.48; 1993 -1.84; 2024 -1.30
Pitchers: 1969 -1.03; 1993 -1.21; 2024 -1.65
Bottom 25%, Everyone On Down From….
Pos Players 1969, +0.62/625; Pitchers 1969, +0.26/625
Pos Players 1993, +0.34/625; Pitchers 1993, +0.31/625
Pos Players 2024, +0.63/625; Pitchers 2024, +0.21/625.
Another productive framing is to look at the percentage of the league in each year, by cumulative plate appearances, that had negative WAR. So, instead of working from bottom WAR-per-PA, stopping at 25% plate appearances, totaling the WAR, and noting the 25th percentile rate, I work from bottom WAR-per-PA until I get to the 0 WAR players, and note what percentage of plate appearances has been reached.
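A sketch of that calculation, with an illustrative data structure of (PA, WAR) pairs; note that walking up from the worst WAR-per-PA until WAR turns positive amounts to summing the PA of every negative-WAR player.

```python
# Share of league plate appearances belonging to negative-WAR players.
# Walking up from the worst WAR-per-PA until the 0 WAR players are reached is
# equivalent to summing the PA of everyone whose WAR is below zero.

def negative_war_pa_share(players):
    """players: iterable of (plate_appearances, war) pairs for one league."""
    league_pa = sum(pa for pa, _ in players)
    negative_pa = sum(pa for pa, war in players if war < 0)
    return negative_pa / league_pa
```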
Position Players: 1969 18.2%; 1993 20.4%; 2024 18.3%
Pitchers: 1969 19.1%; 1993 21.3%; 2024 21.1%
As you can see, the proportions are quite consistent in the three years and across the two groups. In fact, they run so close that I wouldn’t rule out that a very similar procedure was undertaken to determine the definition of negative WAR.25
Cumulative WAR for pitchers on the bottom says that 2024 had a group less competitive than in 1969 and 1993.26 Additionally, 2024 was also the only time when the bad pitchers were worse than the bad hitters.
However, the percentage of negative pitchers gives a different view. 21.3% of the batter matchups in NL 1993 went to negative WAR pitchers, while 21.1% of the batter matchups went to this contingent in 2024 MLB.
How, then, to reconcile this with the cumulative WAR figures, which project 1993’s bottom quarter at just -87.8 against 2024’s -120.2, even when the 1993 group’s batters faced are scaled up to the 2024 level?
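The comparison behind those two figures, as I read it: each group’s WAR-per-625 rate applied to a common batters-faced base (2024’s bottom quarter). The batters-faced number below is a placeholder, not a figure from the study.

```python
# Scaling both bottom groups to the same batters-faced base; the BF total is a
# placeholder for 2024's bottom-quarter workload, not a figure from the study.
def cumulative_war_at(rate_per_625, batters_faced):
    return rate_per_625 * batters_faced / 625

bottom_quarter_bf_2024 = 45_000                                   # placeholder
war_1993_scaled = cumulative_war_at(-1.21, bottom_quarter_bf_2024)
war_2024        = cumulative_war_at(-1.65, bottom_quarter_bf_2024)
```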
In the most literal terms, the meaning is that these 2024 negative guys were more negative than the 1993 negative guys. In terms of how this came about, one likely factor is the serial churning of pitchers on a 2024 roster. The modern dynamic is that there are lots of guys working in small samples, and when they don’t work out, the team is on to the next guy. Trust my statistical instinct that this is a way to tag-team to a very negative cumulative WAR within the parameters of this particular study.
Relative workloads of one pitcher to the next have also certainly changed since 1993, and depending on how effective each specimen is, the cumulative WAR at the bottom of WAR-per-PA can look different. The simple number of players and pitchers comprising the bottom 25% of hitting and pitching PA in each year shows the shift away from position players toward pitchers on rosters since 1993 (i.e., rosters used to maybe be 40% pitchers, and are now maybe 50%). Given what seems to be an expansion of relief pitching even beyond the change in roster allocation, it’s actually surprising my shift in sample at the bottom wasn’t even greater.
1969: 111 players, 84 pitchers (1.32-1)
1993: 142 players, 110 pitchers (1.29-1)
2024: 282 players, 336 pitchers (0.84-1)
With so many more pitchers holding major league jobs now, a certain logic, bolstered by these data, could say that their quality has decreased. In other words, that an expansion of sorts has occurred, even without an addition of teams. But when we look at the average velocity of relief pitchers, and at the fact that they perform as well as starters in a bad year, this belief seems to crumble.
OBP and SA Data, Expansion Study 3A
With the bottom groups set for each year, there was no reason I couldn’t obtain any number of other statistics. I took advantage to figure OBP and SA in each instance. My reasons for doing this were twofold. First, as a sort of reality check on the bWAR, and to make the groups seem more real to me; second, in the belief that calculating OBP and SA could tell me something about how baseball was played in each time, and which things were valued.
The numbers I give below are points (+ or -) relative to the league average for the bottom groups.
1969 Position Players: -42 OBP, -70 SA
1993 Position Players: -44 OBP, -60 SA
2024 Position Players: -35 OBP, -64 SA
1969 Pitchers: +32 OBP, +53 SA
1993 Pitchers: +32 OBP, +51 SA
2024 Pitchers: +39 OBP, +68 SA
These data are fun and are in easy, digestible form for a change, but I would submit that they are actually very hard to interpret. The groups were defined by WAR, so one thing that is being reflected is just the mathematical underpinnings of WAR — specifically that on-base percentage and slugging average are very important in it. We didn’t need this study to know that, and this certainly isn’t anything like a straightforward way of representing it.
One might make something of the fact that OBP among 1993 position players was worse than in the other two years, while SA was not as bad. But again, I couldn’t really argue that this says anything about which characteristics were valued in the leagues, as WAR-per-PA is still ultimately driving the groups. How one group got up to 25% league PA versus how another group did is an extremely complicated business, but I can’t see that the differences give any insight into which characteristics were most valued. WAR is WAR, and 25% is 25%, and those are the confines of this design.
It is certainly possible, however, that relative OBP and SA differences of the bottom in the different years reflect broader differences in the standard deviation of those categories. We would need to see how the top of the league, etc., compared to league average to know. We do know, from average WAR, that the 1993 bottom NL was particularly bad overall, and this certainly affects the likely standing of any individual statistic.
Along the same lines, if you compare the position players and pitchers in the given years, you will see that the verdict WAR gave as to which was the weaker group, players or pitchers, would be seconded in each instance if just OBP were used, or just SA. I actually expected pitchers to show bigger deficits in OBP and SA than position players, because their job is solely pitching, while defense contributes strongly to negative WAR values, and might get respectable hitters placed in the bottom 25%. OBP and SA also ignore positions, while WAR doesn’t, folding the positional adjustment (as a plus or minus) into Defensive WAR. Conversely, all pitchers are pitchers, although there is a relief tax put into WAR.
But I think the reason why the pitchers had negative WARs that rated with position players every bit as much as their OBPs and SAs is that pitcher WARs adjust for the fielding they are believed to have received. The adjustment for team defense is at times large, and means that Bailey Ober could be within 0.1 of Shota Imanaga in 2024 WAR with a run average 0.65 higher, despite pitching only 5 more innings. Supporting defense functions much as player defense does in getting from OPS to WAR, acting as an unaccounted-for variable. Just as summing defensive WAR for players with bottom 25% WAR results in a negative number,27 I believe defensive adjustments for bottom 25% WAR pitchers would be found to have moved their numbers significantly back, were they calculated.
A Comparison of Replacement-Level Players and Pitchers Using Playing Time (Expansion Study 3B)
It would also have left a lot to be desired, but to gauge the effect of expansion, instead of tying the categories I was interested in to WAR, I think I would have done better just looking at a distribution of the key variables (strikeouts, walks) alone for hitters and pitchers, and comparing the distance from league average in each case. Choosing not to identify actual potential expansion players is a concession to reality, no doubt. It is the equivalent of a company scaling down and trying to raise millions instead of hundreds of millions, with all of the emotional deflation that would entail.28 But for the loss of baggage, I think it would have been decidedly preferable. While the league’s worst walk and strikeout players are not necessarily the worst players per se, because of its concreteness, I think it would have offered something that entangling the variables couldn’t.
However, dreams die hard. Not having full perspective on my work at the time, I elected to do a follow-up that employed the same strategy of identifying expansion-level players and pitchers and comparing them, but on a bit of a different footing. Instead of using each player’s WAR-per-PA and his plate appearances, and then finding and analyzing the players linked to the 25% worst plate appearances, I decided to rank players just based on their plate appearances. I still used a cutoff of 25%,29 but I sorted by just plate appearances. The players’ totals — in other words, their walks, strikeouts, OBP, and SA relative to league average, and their cumulative WAR — thus represented the players who played the least in that league.
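A minimal sketch of that revised selection, again with illustrative (name, PA) records rather than the study’s actual data:

```python
# Study 3B-style selection: rank by plate appearances alone, fewest first, and
# keep players until roughly 25% of league PA is covered. Records are (name, PA).

def bottom_quarter_by_playing_time(players, share=0.25):
    league_pa = sum(pa for _, pa in players)
    target = share * league_pa
    ranked = sorted(players, key=lambda p: p[1])   # fewest plate appearances first
    group, covered = [], 0
    for name, pa in ranked:
        if covered >= target:
            break
        group.append((name, pa))
        covered += pa
    cutoff = group[-1][1] if group else 0          # e.g., "players up to and including 330 PA"
    return group, cutoff
```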
This, I think, is a much more conventional take on a replacement player, or an expansion player. The theory relies on finding an inferior group of players not by looking at how well or poorly they played, but by the natural inclination of teams to use their worse players less. The approach takes advantage of a team bench and sometimes unseen “Columbus shuttle” and presumes teams have control over who is in the lineup and who is on the bench, beyond just injury rate.
I think this representation better reflects reality. Stocking a team entirely with bottom-25% WAR-per-PA guys probably creates a team worse than any real team could ever be. While the exact level that should represent the expansion effect is nebulous, the starkly negative WAR composition that came out of Study 3A wasn’t really what I had in mind.
Another advantage to this approach, to review my earlier point, was that theoretically freeing the categories under study from the criteria of who was in the study was sounder. Additionally, while not my original interest, it opened up a whole area of what player characteristics teams valued in different eras. It is meaningful to say that low slugging average players were shunted to the side in a given year, while it is predetermined that low WAR players have low slugging averages, and that is what I was reduced to saying before.
Unaddressed was that the ultimate rates reflected in the bottom group still in effect weighted players by plate appearances. If you give the matter no thought, this is certainly what you naturally fall into. While there wasn’t really any good argument for the de facto weighting with WAR-per-PA, it also was probably a less important aspect, because there wasn’t the variation in WAR-per-PA there is when playing time is the basis for inclusion. In this instance, it made a certain amount of sense for players with more plate appearances to count more, since more plate appearances made their performance a better reflection of their ability. On the other hand, one could argue that the guy at the 3rd percentile in plate appearances is much more of a replacement player than the guy at the 24th percentile, and so should be counted more. Above all, it is still troubling that the most important player to the ultimate rate in 2024 was the player with 313 plate appearances, Joey Meneses, while the player with 314 plate appearances, Kevin Pillar, didn’t count at all. So it’s a complicated subject, and viable alternate approaches would certainly have changed the hitter and pitcher samples, and theoretically the results in a substantial way. I would regard these differences in sample as random, though, rather than systematic, and hopefully that would mean no changes in results beyond sampling error.
I also think that making playing time the criterion for the bottom brings to the fore the wrongheadedness of capturing replacement-level players from the year of expansion rather than from the year before (if one is to take this exercise literally as one of modeling the conditions of expansion). Just to take the case of the expansion teams, expansion starters weren’t part of the bottom group under analysis, because they obviously had too much playing time to qualify.30 And by all means, these seem like the kinds of players the study is trying to capture. These players certainly had a greater effect on league numbers than they would have had the year before.31
For pitchers, I was concerned at the prospect of just rating playing time by batters faced, because this would naturally create a bottom group skewed towards relievers. Consequently, I amended the method.
The first piece of information I used was the percentage of position players in that league who made up the bottom group, with everyone with at least one plate appearance counting towards the total N. Then, I applied that percentage to pitchers with at least one batter faced, but instead of using batters faced as the barometer of activity, I used games_started*2.1 + relief_games. I used games instead of batters faced because, unlike batters faced, they are easily available by role, and I weighted starts extra in the manner I did because relievers pitch more games than starters.32 I again cut off the pitcher group based on the target percentage of players.
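A sketch of the pitcher side as just described; the record layout is illustrative, not the study’s actual dataset.

```python
# Pitcher playing-time points: a start counts 2.1, a relief appearance counts 1.
# The bottom pitcher group is the same percentage of all pitchers (with at least
# one batter faced) as the bottom position-player group was of all players.

def workload_points(games_started, relief_games):
    return games_started * 2.1 + relief_games

def bottom_pitchers(pitchers, player_share):
    """pitchers: list of (name, games_started, relief_games); player_share: the
    fraction of position players that formed the bottom group (e.g., 0.62 in 1969)."""
    ranked = sorted(pitchers, key=lambda p: workload_points(p[1], p[2]))
    keep = round(len(ranked) * player_share)
    return ranked[:keep]
```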
While the first study did not have any obvious systematic bias towards more starters or relievers in the way that defining the group by batters faced would have had, neither was there any effort made to balance starters and relievers. So a difference between studies in percentage of starters and relievers could exist, and could lead to different results.
Expansion Study 3B, Results
Here are some specifics for each year, to make the study more concrete for you. In line with footnote 29, be aware that what I identify as 25% is not 25% exactly.
-In the 1969 NL, 62% (155 of 250) of players could be made to cover the bottom 25% of league plate appearances. This included players up to and including 330 PA.
My bottom 62% of pitchers, selected with the 2.1-points-a-start, 1-point-a-relief-game weighting, accounted for 25.96% of batters faced.
-In the 1993 NL, 62.46% of players were needed to cover the bottom 25% of league plate appearances. This included players up to and including 302 PA.
My bottom 62.46% of pitchers, selected with the 2.1-points-a-start, 1-point-a-relief-game weighting, accounted for 33.80%33 of batters faced.
-In 2024 MLB, 58.09% of players were needed to cover the bottom 25% of league plate appearances. This included players up to and including 313 PA.
My bottom 58.09% of pitchers, selected with the 2.1-points-a-start, 1-point-a-relief-game weighting, accounted for 26.42% of batters faced.
We will start in earnest now and look at WAR. The raw numbers show that the revised method of making playing time the criterion for the bottom, and not performance specifically, had the effect of substantially elevating performance across the board.
Position Players
Study 3A, WAR/625 PA, Three Years Averaged: -1.54
Study 3B, WAR/625 PA, Three Years Averaged: -0.04
Pitchers
Study 3A, WAR/625 PA, Three Years Averaged: -1.30
Study 3B, WAR/625 PA, Three Years Averaged: 0.29
In addition to the level moving up by about a win and a half, the other thing to take away is that pitchers retained an overall edge over position players, and by about the same margin. Around that general difference, too, the individual years fell in the same pattern they did the first time. Pitchers registered their biggest victory in 1993, while position players beat them in 2024, but this time by only 0.05 WAR/625 PA, not 0.28.
Since 2024 pitchers made a more respectable showing than before, this study might actually pour some cold water on the analysis that there is something notable in the degree to which today’s bad pitchers struggle. I would actually have thought using playing time as a simple threshold would have made the phenomenon I described before, a kind of selection where bad performance goes in the books without the pitchers having the opportunity to redeem themselves and show their actual level, more acute, but the results went in the opposite direction.
We have a bit of an opposite question now with the playing time method of defining the bottom, as the 2024 position player group was flattered to the tune of +0.25 WAR/625 PA. Before getting way out ahead of myself, however, and wondering if this says something about current short benches, or teams doing an all-time worst job of identifying their worst players, or about methodological issues with WAR, etc., I would need to see what the WAR rate is for the other quartiles, and for the other quartiles in other years. Again, we need to know to what extent the difference in the bottom quartile reflects a difference in just that quartile, and to what extent it reflects a change in the whole range.
Turning to walks and strikeouts, the variables primarily under investigation: to review, in Study 3A, I found that the bottom groups netted -0.11% in walks (summing players and pitchers) and +0.05% in strikeouts (summing players and pitchers). So, at least going by the average year, there was no reason to think opening up a league to more players and pitchers would result in the increase in walks seen in 1969, or in the hypothetical increase in strikeouts, absent the lowered mound, seen the same year.
But this time, I got different results.
1969, 1993, 2024 Average
BB Differential: Position Players -0.22%, Pitchers +0.84%, Cumulative +0.62%
SO Differential: Position Players +2.29%, Pitchers -1.79%, Cumulative +0.50%
Just 1969
BB Differential: Position Players +0.20%, Pitchers +1.15%, Cumulative +1.35%
SO Differential: Position Players +2.35%, Pitchers -1.62%, Cumulative +0.73%
With 1969 providing the strongest support, this model of expansion fit what really happened with walks and strikeouts in the NL in 1969.
Taking walks first, marginal position players, defined in terms of playing time, were not low in walks. In fact, in 1969, they walked at an above-average rate. However, pitchers who were used less did walk quite a few more batters than average. So, while fully acknowledging that there’s a huge difference between noting this and knowing that it played out this way in real life, and even then whether such an imbalanced infusion was of any real importance, it makes sense that increasing the roles of both of these marginal groups to the same degree would result in a net increase in walks.
1993 seems to indicate, however, that this is not a staple of the bottom every year. The cumulative walk effect for that year was negative (-0.07%), resulting from the least-used pitchers combining for a walk rate 0.48% worse than average, while the least-used position players lagged average by 0.55%. So the bottom pitchers were not so wild in this year. It is also perhaps worth noting that pitchers rather thumped players in WAR that year, 0.41 to -0.34. Considering that, the players’ loss in walks was by comparison small, but regardless of the explanation, its effect would be the same.
As you can see, the net strikeout differential also says that expansion leagues should be high-strikeout leagues, in addition to high-walk leagues. Since strikeouts are more common than walks, however, the narrower gap between players and pitchers on strikeouts indicates a smaller projected effect in real terms.
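A rough way to see the scale point; the league rates below are placeholders, purely to illustrate the adjustment, not figures from the study.

```python
# A net differential matters relative to how common the event is. League rates
# here are placeholders, used only to illustrate the scale adjustment.
def relative_effect(net_differential_pct, league_rate_pct):
    return net_differential_pct / league_rate_pct

walk_effect      = relative_effect(0.62, 8.5)    # ~7% of the (placeholder) walk rate
strikeout_effect = relative_effect(0.50, 15.0)   # ~3% of the (placeholder) strikeout rate
```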
Maybe I am overly influenced by the 1961 Angels, but in my mind, there is a logic in the profile of this kind of position player who may be elevated by expansion: he’ll take pitches and hold his own drawing walks, but intrinsic to the player, regardless of his approach, is a bad strikeout-to-walk ratio, and consequently a really high number of strikeouts.
Fulfilling hopes, the findings for on-base percentage, slugging average, and Defensive WAR had enough nuance to potentially generate future studies, as well as qualifying as statistically provocative, at least in this researcher’s mind.
The special difficulties of the 1993 position players boiled down to their poor Defensive WAR. Here is Defensive WAR for each year per 625 PA, accompanied by OPS difference from league average.
1969: OPS -95, Defensive WAR/625 PA -0.12
1993: OPS -76, Defensive WAR/625 PA -0.45
2024: OPS -77, Defensive WAR/625 PA -0.15
Modern analysts, who are among the people running baseball teams, flatter themselves on a new understanding and appreciation of defense. It is therefore interesting that it appears that bad defense was much more likely to keep you out of the lineup in 1993. We don’t absolutely know what dWAR for the other quartiles looks like, but as WAR needs to balance, the whole picture would presumably show a strong association of some sort between better defense and more playing time in 1993. On a theoretical level, I guess it could be argued that the bad defensive players were making it to part-time play and the bench in 1993, and they would not even be doing that now, but WAR can’t represent abstract defensive talent. It is always relative. The 2024 poor defenders, if only statistically speaking, show up somewhere, and this analysis says the distribution was different in 1993, and in a way most of us wouldn’t have predicted.34
In thinking through dWAR, remember that it is not strictly a measure of defensive performance, but has the positional adjustment mixed in with it. We can expand our thinking even more by remembering that 2024 included designated hitters, who didn’t have a spot carved out for them in the 1969 or 1993 National League. But DHs mean, I think, that average defensive WAR overall was lower in 2024, and that’s not what the bottom quartile for 2024 is showing versus 1993.
Bottom pitchers were closer to the league average in both OBP and SA than bottom position players in all three years. Since we know that the players beat the pitchers in relative walk rate in two of the three years, the OBP piece shows that pitchers stacked up better in batting average. The average pitcher advantage on batting average for the three years was 12 points.35 Slugging Average, meanwhile, showed a composite 27-point average advantage for pitchers. This reinforces the point I made earlier, that no pitcher is going to turn Myles Straw into Darryl Strawberry.
Probably the slash component that sticks out the most is the .316 slugging average for the 1969 NL bottom position players. That was 68 points below the average for all non-pitchers in the league. With the (relatively) great walk rate and better defense than in 1993, however, bWAR came out very close to replacement. Except for the walk rate, it’s a profile that makes one think of middle infielders, but again, the method has built-in protection against that sort of thing: it is hard to see how middle infielders could get much more backup play in one year than another (someone has to start at those positions). The low slugging average does seem to suggest that teams discriminated on the basis of power, but I wouldn’t run too far with that interpretation. The number could be an anomaly. (If you are interested in more detail on the low slugging percentage of the ‘69 NL bottom, done in my “notes” style, see the following footnote)36.
Adding 1969, AL
After the first half of this study, the question was just whether the essentially null results were sound. Study 3B resulted in a complete shift. I was now interested in whether the results would replicate, and I seemed close to having a case on the expansion piece. I was in fact in that position where one slightly altered study giving a consistent conceptual result would go a long way.
Specifically, I was intrigued by the opportunity to get the corresponding numbers for the 1969 American League. 1969 National League showed only a solid net positive cumulative strikeout difference, but since the actual strikeout trend in the American League had been of a very different nature, with the National League becoming the strikeout league, it seemed it would be a real point in favor of the methodology if I found a much lower cumulative strikeout rate (and maybe even a negative one) for the American League in the projection.
The other thing that had stood out about the 1969 National League in the second expansion analysis was a +1.35% cumulative walk rate. That was accompanied by a dramatic real increase in walks in the league. While the American League fell a bit shy of that increase, walks were very much of the same trajectory compared to 1968 as they were in the NL. As there was less divergence between the leagues in walks than strikeouts, the cumulative walks of the bottom in the AL seemed less critical to obtain, but I still was curious.
Additionally, that backwards walk differential for the NL bottom position players really surprised me. I was interested in it in its own right, and wanted to see if it also prevailed in the AL. The implication of the era’s possible disregard for offensive walks seemed more interesting than any modeling it suggested for expansion, frankly.
American Leaguers with 288 plate appearances and under combined for 24.97% of the total non-pitcher plate appearances and were the group used. They made up 60.32% of players with a plate appearance, and so the same percentage was used to define bottom pitchers, with the normal 2.1-points-a-start, 1-point-a-relief-appearance weighting determining them. The pitcher group summed to 26.66% of total league batters faced.
The results fit well with my overall Study 3B findings, but more tended to contradict any theories I had been developing about differences between the leagues in 1969 than to further those theories. Most clearly inconsistent was the American League bottom strikeout data.
AL: Position Players +2.36%, Pitchers -0.68%, Cumulative +1.68%
NL: Position Players +2.35%, Pitchers -1.62%, Cumulative +0.73%
While the other leagues also showed positive net differences between position players and pitchers, this was the only one that reached 1.00%. A believer that this approach could predict the impact of an expansion would have expected something like this in the 1969 NL, where the strikeout rate remained level despite the lowered mound. But in fact it was the 1969 AL where this difference emerged.
I will make a few observations here. First, we had a hint of this result from my earlier data that strikeout rate correlated with batters faced in the NL, but not in the AL. The strikeout data in the expansion study doesn’t quite provide the same thing, and we could have gotten a different result, but there is certainly overlap. The previous study said that, among pitchers with at least 200 batters faced, strikeouts and batters faced in the NL correlated, while they did not in the American League. This study roughly took some of the guys of that group towards the low end of batters faced, added some guys with even fewer batters faced, and compared them to pitchers who worked more regularly. In any event, the picture was consistent: pitchers who didn’t work that much in the AL did ok strikeout wise.37
My other thought is that, again, while it would take a lot of faith to think that even a very well-designed study could enable one to make accurate predictions about a specific change in strikeouts or walks after an expansion, the logic really does break down when the “year of” is sampled for the bottom players and not the year before. Pitchers who got irregular work in 1969 do not seem like expansion pitchers.
Additionally, while there could come a point where the calculated net differences were so large I would have little doubt they made expansion years different from non-expansion years, the net differences I am seeing fall far short of that level for me. And even in such a scenario of dramatic differences between bottom hitters and pitchers, I would only feel comfortable asserting a directional effect, and would not know the walk and strikeout change that should result in the league as a result. So this work seems in the realm of models and theories, and not quantitative analysis.
The walk rates among the part-time AL players and pitchers summed to +0.56%. Taking the 1969 actual change as a barometer, it appears to fit with the NL’s 1.35%, since that league did show a greater increase in walks. Where the result was a bit of a letdown was that AL bottom position players didn’t come close to outperforming the league average in walks. In fact, the -0.63% difference was the worst of the four leagues. The theory that players who walked were not recognized the way they would be later was therefore not supported. The clear positive cumulative rate then came about because of the +1.19% differential of the bottom pitchers. Whether it reflected the pitching priorities of that time or just a shortage of pitching, the commonality in walks between the two leagues in 1969 turned out to be on the pitching mound: 1969 NL was only just behind the AL, at 1.15%.
Here again the failure to look at the year before the expansion seems regrettable. The high rate of walks among fringe pitchers in 1969 seems to make sense as an effect of expansion, more than perhaps explaining how a big increase in walks happened to come about.
If the data are looked upon in that way, though, the small +0.48% walk differential for 1993 NL bottom pitchers stands out as interesting. It was lower than 2024’s +0.89%, suggesting that 1993 was well able to handle an expansion from the standpoint of pitching depth.
Not a great deal stood out with WAR, OBP, and SA for 1969 AL. Walks turned out to be the bottom pitchers’ only relative weakness, as they beat position players by 0.60 in WAR/625 PA, by 16 points in batting average relative to league average, and by 37 points in slugging average relative to league average. The bottom pitchers were a decent group, but just behind 1993 NL in batting average differential and slugging average differential, so the strength of their performance shouldn’t be overstated.
The hitters in the 1969 AL set new marks, if only narrowly, for worst OBP differential (-31) and worst BA differential (-28). At -58, their slugging average deficit did not keep pace with the National League’s -68, which gave the AL a lower cumulative OPS differential than the NL, but it did nothing to discourage the idea that power very much separated those who played in this era from those who didn’t. Something that shows either the complexity of these data or their randomness is that dWAR for the AL came out as particularly bad, -0.45/625 PA, the same rate that the 1993 NL bottom players produced.
Adding the 1969 AL data to the other three leagues, the final averages come out as follows.
BB Differential: Position Players -0.32%, Pitchers +0.93%, Cumulative +0.61%
SO Differential: Position Players +2.31%, Pitchers -1.52%, Cumulative +0.79%
The good strikeout showing for the 1969 AL bottom pitchers, next to the typical strikeout showing for bottom batters, reversed the order in size of cumulative strikeouts and walks, although both are positive, and walks almost certainly still more positive adjusting for scale. All four leagues had net positives for strikeouts, while three of four did for walks. And that is probably a good synopsis, and as good a place as any to leave this section.
Conclusion
A mystery sometimes is resolved when a source who saw everything chooses to speak up. That this happens enables us to know how well we can piece things together with just circumstantial evidence, if we record what we understood and thought before the truth arrived. Comparing the real truth to the estimated truth, we always find at the very least that things become less hazy and details are filled in. But over 99% of the time, we also find that we had significant things wrong. That, in spite of our better judgment, we let our imaginations get the best of us and failed to apply Occam’s razor. That we missed basic explanations. The torment or maybe the fun is that this source with complete knowledge still often distorts or gilds, so a doubt lingers in the back of our mind if even now we have it right.
I am tempted to pretend to be that source here with the 1969 league data, pretending that I am able to reconcile everything, or have magically learned the truth. After a piece of this length, your patience really deserves no less. And maybe only some role playing can enable me to shed my inhibitions and offer some much needed clarity even if it also means oversimplification. Since I am afraid that I am incapable of fiction even in the statistical realm, however, I will simply give you my opinions on all of the important points, and resist my temptation to play the clairvoyant.
My advice is to make neither too little nor too much of opinions. An opinion is best understood as more than a baseless theory, but certainly less than the known truth.
In the process of working on this, I dreaded putting together a conclusion, but felt it was part of being accountable for my research and writing, and also, again, my debt to you for your patience. But once I had actually finished the research and writing, I became more confident that I at least had opinions, and my conclusion would not be entirely manufactured. I felt my mastery of the subject modest, but I thought myself less than completely confused, and in the process of working on an argument. Those are not small things.
So, without further ado, here we go.
Offense in 1969 was substantially up from 1968. Runs were up 20% in the AL and 18% in the NL. Taking the National League strikeout rate as the final say on the matter, we could say that strikeouts were exempt from this trend of increased offense, but the drop in the American League was one of the larger same-league single-season drops ever, and there was no real reason for the two leagues to differ. Additionally, the 1968 NL rate soon did come to look high. Strikeouts-per-game in the NL in ‘69 would not be topped again until 1986. Everything says that the 1969 rate was a fluke.
However, my belief that a lowered mound would show up more in strikeouts than anywhere else was probably wrong. Strikeouts were not what made 1968 the Year of the Pitcher. Speaking now from little more than inference, if not pure theory, I believe that what a higher mound really does is create an angle for pitchers that makes it hard for batters to hit the ball in the air. This decreases home runs.
A lower mound also seems to raise walk totals. Why exactly, I don’t know. In 1969 control was particularly poor early in the season, but most of the effect persisted. Based on the evidence, I don’t think the elevated walks were a product of hitters bashing more home runs, or of a general turn in hitters’ favor. Physicists will have to weigh in on whether starting from a higher angle gives more margin for error in throwing a ball accurately. An intriguing possibility is that umpires perceive more strikes when the ball is starting from a higher position.38
The 1969 season featured not just the lowered mound but expansion, and analyzing and modeling its effects spun off into something of its own paper. We will have another expansion, while I have not heard any talk of a different mound height, so I wish I better understood the general effect expansion has on the balance between offense and defense.
I return to a critical point I opened with: every effect of a league diluted by less skilled pitching has its theoretical counterpart with less talented hitters receiving a chance. Lacking evidence of a strong overall effect, then, the null hypothesis weighs very heavily for me.
That said, I studied the expansion dynamic in three different ways, and all three suggest that it might have contributed, if ever so slightly, to what did in fact occur in the 1969 National League: an increased walk rate, and a stubbornly level strikeout rate. Whether we look at expansion teams, expansion years, or the most marginal players in a league, the increase in walks that comes with marginal pitchers is greater than the decrease that comes with marginal batters, and strikeouts are more affected by expansion batters than by expansion pitchers. As walks and strikeouts have opposite relations (a walk being a win for the offense, and a strikeout a win for the defense), this split decision in who has greater control of which theoretically leads to a “walk and strikeout happy” league. However, the expansion hitters and pitchers are both deficient in walks and strikeouts, just to different degrees. The balance is probably particularly baked in the cake with strikeouts. The projection is truly one of net gain only, and this limits its size.
Summary
(1) After 1968, the Year of the Pitcher, the mound was lowered in both the National and American leagues. However, strikeouts-per-batter in the National League were essentially the same as they had been the year before, not shifting things towards batters the way one might have expected.
(2) In the American League in 1969, strikeouts dropped precipitously, from 16.18% of plate appearances in 1968 to 14.73%. The NL 1968 rate had been 15.83%, so the leagues were not starting in greatly different places.
(3) I explored the statistical probability that the difference between the leagues in 1969 was in fact just random. A basic requirement of doing this analysis is representing the data by pitcher or by batter, rather than acting as if all plate appearances are independent. Requiring a minimum of 200 plate appearances and grouping strikeout rates by hitter, I found the strikeout difference between leagues to be marginally significant, meaning that we have some reason to reject the luck hypothesis, but not a compelling case. Representing strikeout rates in terms of individual pitchers, in its turn, did not reveal a significant difference in strikeouts between the leagues, marginal or otherwise.
There seemed more to the observed strikeout difference than just mean rates, however. First, whether we grouped by hitter or pitcher, the standard deviation in strikeout percentage in the NL was greater. The difference was large enough that NL hitters and pitchers, despite the higher overall mean, were more likely to have very low strikeout rates than AL hitters and pitchers.
A second complicating difference I found between the leagues was that strikeout rate and batters faced correlated at .332 in the NL versus .049 in the AL. Comparing strikeouts just pitcher to pitcher, then, and eliminating effective weighting by batters faced, the strikeout difference between the leagues decreased. Consequently, the difference did not score as even marginally significant when represented on a per-pitcher basis.
One reason for the difference in correlation, keeping in mind that starters face more batters than relievers, is that NL starters had a strikeout rate quite a bit better than NL relievers, while the opposite was true in the American League.
There is little evidence, however, that this difference in the leagues explained the differential between the leagues in strikeout rate, or the difference in strikeout rate compared to 1968. 1968 showed the same pattern of starters in the NL striking out a higher percentage than relievers, with the opposite happening in the AL. While AL relievers held up much better in strikeouts in 1969 than AL starters did, they, too, lost about three-quarters of a point off their strikeout percentage.
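For concreteness, here is a hedged sketch of the kind of per-pitcher comparison summarized in point (3). I am not claiming this is the exact test I ran; Welch’s t-test is simply one natural choice, and the data structures are illustrative.

```python
# Sketch of a per-pitcher league comparison: Welch's t-test on individual strikeout
# rates, plus the strikeout-rate vs. batters-faced correlation within each league.
from scipy import stats

def per_pitcher_comparison(nl, al, min_bf=200):
    """nl, al: lists of (strikeouts, batters_faced) tuples, one entry per pitcher."""
    nl_rates, nl_bf = zip(*[(so / bf, bf) for so, bf in nl if bf >= min_bf])
    al_rates, al_bf = zip(*[(so / bf, bf) for so, bf in al if bf >= min_bf])
    t_stat, p_value = stats.ttest_ind(nl_rates, al_rates, equal_var=False)
    r_nl, _ = stats.pearsonr(nl_rates, nl_bf)   # e.g., ~.33 in the 1969 NL
    r_al, _ = stats.pearsonr(al_rates, al_bf)   # e.g., ~.05 in the 1969 AL
    return p_value, r_nl, r_al
```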
(4) The MLB home run rate was up 29% in 1969, and the MLB unintentional walk rate was up 23%. While the American League rates were higher in both categories in both years than the National League rates, the gap in these statistics between the leagues was somewhat smaller in 1969 than 1968.
(5) This paper did not research why a lower mound might have caused such an increase in home runs, but it stands to reason the lower mound changed the angle of incoming pitches and made them easier to elevate.
(6) It was thought that monthly walk and home run data might shed some light on whether and how players adjusted their play during the season given the altered possibilities and probabilities. However, the 1969 effects were more or less constant for both walks and home runs throughout the season, and so did not give any indication of a learning or adjustment effect.
(7) The connection between a lower mound and more walks therefore remains elusive, but the change between 1968 and 1969 in walks certainly was dramatic. There is sometimes a connection between strikeouts and walks, with walks just representing longer at-bats, not hitters having gotten the upper hand on pitchers. But with strikeouts down compositely in 1969 between the leagues, this theory doesn’t seem to hold water.
(8) The 1969 National League added two teams, the Expos and the Padres, so we can ask whether this was behind the strikeout and walk changes. In other words, we can ask what role expansion played, and the consideration of this question took up the rest of the paper. The reader should be aware that the 1969 American League was also an expansion league, with the one-season-wonder Seattle Pilots and the Royals, the new teams in question.
(9) All credible analysis of expansion must take stock of the effect it had on both offense and defense. So, in this case, where the concern is with strikeouts and walks, there are two questions:
a. Did expansion hitters, presumably adding strikeouts to the league total, add more than expansion pitchers took away?
b. Did expansion hitters, presumably lowering the league walk total, subtract fewer walks than expansion pitchers added?
(10) How to get at these questions remained to be decided.
(11) I started by looking at the 14 AL and NL expansion teams themselves throughout history. On average, they were in the black both in terms of total strikeouts and total walks. This is to say that, once everything was converted to percentages, their pitching walk rate minus the league average, less the league average minus their hitters’ walk rate, came out to 0.57%, while the same arrangement for strikeouts showed a 0.54% positive differential on average. While those numbers look indistinguishable, the walk trend was quite a bit larger than the strikeout trend, adjusting for the greater rarity of walks.
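Restated as a formula, as a sketch of the bookkeeping just described rather than code from the study:

```python
# A team's net contribution to the league total of an event, with both of its
# sides expressed as rates relative to the league average. For walks, pitching
# above average adds walks and hitting below average removes them; for strikeouts,
# the roles reverse, but the arithmetic is the same.
def team_net_differential(team_pitching_rate, team_hitting_rate, league_rate):
    return (team_pitching_rate - league_rate) + (team_hitting_rate - league_rate)
```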
(12) While I do believe the true net trends were positive, the medians did not tell the same story, underscoring the small size of the finding. The reader should note that for both walks and strikeouts, only half of the teams had positive differentials.
(13) Looking at the 1969 expansion teams specifically, they displayed the same pattern as expansion teams overall: mean walk and strikeout differential were moderately positive, but two of the four teams had net negatives for each statistic. At odds with the 1969 data, the positive net strikeout differential among the four expansion teams was actually driven by the Pilots, an American League team, while it was the National League that had the higher strikeout rate.
(14) So, on the whole, expansion teams themselves do not suggest expansion was anything more than a small factor in the elevated strikeouts and walk totals that prevailed in the 1969 National League.
(15) A model that represents expansion as just consisting of expansion teams ignores the systemic effects of expansion, and that the existing teams provide the new teams with players. Therefore, it can be argued that looking at changes on a league level is more meaningful.
(16) Excluding 1969, six other expansions have been carried out. These were in 1961 AL, 1962 NL, 1977 AL, 1993 NL, 1998 AL, 1998 NL. Cases of strikeout and walk increases and decreases have been about evenly split, although the general trend was one of moderate increase before the 1993 NL’s expansion. To the extent that the expansions closer in time to 1969 are more relevant to the impact that the 1969 expansion had, league expansion data analysis can then again be seen as supporting small increases in walks and strikeouts.
In this case, the data were stronger for strikeout increase than for walk increase.
(17) A third attempt at measuring the effect of expansion on walks and strikeouts sought to improve upon looking just at the data of expansion teams and to look at expansion-level players instead.
Running the procedure for three different leagues (1969 NL, 1993 NL, 2024 MLB), I kept hitters and pitchers separate. Within each group, I sorted by bWAR-per-plate-appearance, and found the bWAR-per-PA rate that would encompass the bottom 25% of the league in terms of plate appearances (or the closest rate to that). I then compiled aggregate statistics for these players.
(18) The average for the three years found only a 0.11% difference in terms of the distance from average of bottom hitters versus bottom pitchers (pitchers scoring as less bad) in walk percentage, and only a 0.05% difference in terms of the distance from average of bottom hitters versus bottom pitchers in strikeout percentage (pitchers again scoring as less bad). Mean net strikeouts were therefore in the same direction as the 1969 National League trend, while mean net walks were not, but both means were judged to be tiny, and therefore not supportive of the idea that the categories are much influenced by expansion.
Not all six cells (three years, two variables) gave the same picture as the overall variable mean. In the 2024 dataset, the projected net cumulative effect for walks was clearly an increase, while in the 1969 dataset, the projected net cumulative effect for strikeouts was clearly a decrease. Making sense of the size of these individual-year differences is difficult, both on their own terms and for the effects they might have had on a league, but both appear to be moderate.
(19) The procedure used for identifying expansion players had the side benefit of shedding some light on the distribution of bWAR.
First, for every league that was examined, both from the standpoint of players and pitchers, a good percentage of players turned out to be rated as worse than replacement (meaning having negative WAR), ranging from 18.2% of the PA of 1969 position players, to 21.3% of the PA for 1993 NL pitchers.
Second, in 1969 NL and 1993 NL, the cumulative bWAR of bottom position players was notably worse than for bottom pitchers. The opposite was true in 2024 MLB, with the worst pitchers also scoring as worse than in 1969 NL and 1993 NL.
One possible explanation for this is that more pitchers today are given a shot and then discarded after very poor results.
The average position player/pitcher discrepancy over the three years, which on balance was in the direction of worse position players, was not ideal, but it was not accompanied by any real difference in strikeouts. The difference in average bWAR may have been artifactual rather than real. It is important to keep in mind that, irrespective of the difference in average bWAR, each group was defined by the same bottom 25% of plate appearances in WAR-per-PA.
(20) I also figured OBP and SA relative to the league for each bottom group. The differences in OPS relative to the league for hitters and pitchers were more or less proportional to their WAR rates, with some tendency for hitters to have still greater deficits from the average. In one sense, this is surprising, because position player fielding is so much more important in WAR than pitcher fielding, and defensive performance isn’t reflected in OPS. However, the hidden element in pitcher WAR is the defense a pitcher has behind him. This reduces the total amount of pitcher WAR that is made up of OPS Allowed. It is very likely that if calculated, the average supporting defense for pitchers in the bottom WAR group would be found to be very good. Their WARs are adjusted negatively for this good helping defense, and WAR therefore represents more than just runs allowed, or its close facsimile, OPS Allowed.
(21) I redid Study 3A with one important change. Instead of finding the bottom 25% of the leagues by WAR-per-PA, I simply used PA. The theory was that, while players with low plate appearances may have had that total only because they were injured, overlooked, or called up from the minor leagues mid-season, in general there is a correlation between the amount of playing time a player receives and his ability.
(22) For pitchers, because I did not want the bias toward starting pitchers that using batters faced would have created, I used (games started × 2.1 + relief games) to assess playing time. Every pitcher active in the league was sorted from fewest points to most under this system. With batters faced no longer the yardstick, the size of the pitcher group was set by borrowing the percentage of position players that made up the position player group under the cumulative 25%-of-PA standard.
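A small sketch of that playing-time metric; the 2.1 weight comes from the text, while the surrounding mechanics are my own guess at the procedure:

```python
# Pitcher playing-time "points": starts weighted 2.1x relative to relief
# appearances, per the formula in the text.
def playing_time_points(games_started: int, relief_games: int) -> float:
    return games_started * 2.1 + relief_games

# Pitchers are then sorted from fewest points to most, and the bottom slice
# is sized so its share of pitcher headcount matches the share of position
# players captured by the cumulative 25%-of-PA cutoff.
```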
(23) There were a few ways in which this follow-up (Study 3B) seemed like it might be superior to its predecessor. First, just using playing time was more straightforward and less prone to unintended consequences. Second, it seemed more realistic and true to expansion than an effective ban on players with positive WAR (just as part-time players can have good statistics, expansion teams usually turn out to have some legitimately valuable players). Third, it kept the categories under study independent of the definition of who was in the study, which wasn’t possible when WAR was directly referenced, since WAR is correlated with walks and strikeouts. Fourth, it eliminated the same problem for categories beyond walks and strikeouts, and opened up in earnest the analysis of how different characteristics have been valued in different eras.
(24) In both Study 3B and Study 3A, the more plate appearances a player had, the more he effectively counted toward the statistics of the bottom group. This dimension was more coherent in Study 3B, where plate appearances themselves determined eligibility for the group, but it was still unsatisfying in that players at the 25th percentile had maximum weight and those at the 26th percentile none. Additionally, what level of plate appearances represented the quintessential expansion player was not explored in this study, and the effective weighting by plate appearances was a residue of convenience rather than a conscious choice.
Another choice that was perhaps ill-conceived was to study low-playing-time players in the two expansion years themselves, not the years prior to those expansions. Effects therefore almost seem to be treated as antecedents in the design, except that even the purported effects were represented by part-time statistics, while the full impact replacement players could have in an expansion would come only if they graduated into starters.
(25) The new procedure for choosing the group raised the average level of players and pitchers to about replacement level, an improvement of roughly 1.5 wins per 625 plate appearances. Pitchers remained the stronger group on average, although again the opposite was true for 2024. The gap favoring position players in 2024 was down to 0.05 wins per 625 plate appearances, however.
(26) Study 3B offered clear if perhaps not dramatic support for the idea that the general expansion effect on walks and strikeouts is in line with what occurred in the 1969 NL. Summing the player and pitcher differentials from the league for walks produced a net average of +0.62%, and doing the same for strikeouts produced a net average of +0.50%. Most persuasively, net walks in NL 1969 itself were +1.35%.
The 1993 NL did not mimic the overall trend, with a net walk rate of -0.07% and a net strikeout rate of just +0.03%.
Key to net walks being positive on average was that the bottom hitters typically had walk rates that looked as much “major league average” as “expansion typical.”
(27) If we take the cue of Defensive WAR, the least-used players in 1993 were woeful on defense. While other readings are possible, this seems to suggest a surprising appreciation of defense in 1993, since the bad defenders would logically have been offset by better-fielding players who were getting the playing time.
(28) Having gotten some encouragement that there is in fact a difference in strikeout and walk deviation between expansion-level players and pitchers, I decided to apply the method of Study 3B to the 1969 AL. I was interested in this league particularly because I thought that if the strikeout differential came out lower in the model for the 1969 AL than for the 1969 NL, that would support the model’s validity. (The 1969 AL had in fact lost more strikeouts relative to 1968 than the NL had.) Additionally, I wanted to see if the comparison of the 1969 AL cumulative walk rate to its NL counterpart was consistent with the changes in walks for the two leagues in 1969.
The 1969 NL had also shown its bottom position players with an outright higher walk rate than the league as a whole, a backwards result, and I was curious whether this would replicate in the American League. If it did, the indication would perhaps be that position player walks correlated differently with playing time in 1969 than in later years.
(29) With the models from both leagues completed, cumulative strikeout effect was in fact inconsistent with the comparative change in strikeouts in the leagues. American League 1969 showed a cumulative difference between bottom players and pitchers of +1.68%, almost a full percent greater than any of the other three leagues I examined. In reality, strikeouts in the American League were well down in 1969 from 1968. The key to the positive cumulative strikeout effect was that bottom AL pitchers struck out hitters at a competitive rate, a quirk of that league we encountered before.
(30) The cumulative walk rate in AL 1969 was +0.56%, very much in line with the average of 0.62% in the other three leagues.
(31) However, bottom position players in AL 1969 did not outwalk the rest of the league, as I found for NL 1969. In fact, the group’s walks relative to league average were quite on the negative side.
(32) The evidence for a distinct walk trend among replacement-level players in 1969 instead shifted to pitchers: the two 1969 leagues had an average bottom-pitcher walk differential of +1.17%, against corresponding figures of +0.89% and +0.48% for the non-1969 leagues.
(33) The failed attempt to tie the model results to developments in particular leagues made the mistake of modeling the expansion year itself, rather than the year before, more glaring, as that misalignment couldn’t be ruled out as the reason for the lack of correspondence.
(34) One interpretation of the overall +1.17% pitcher walk differential in 1969, compared to the lower +0.48% in the 1993 NL, is that the 1993 NL had more depth in pitching control. The 1993 NL was an expansion league as well.
(35) Among the findings from the complementary data obtained for 1969 AL was that bottom pitchers beat bottom hitters, +0.28 WAR/625 PA to -0.32 WAR/625 PA, reinforcing the overall pattern. Additionally, Defensive WAR for the position players in 1969 AL tied 1993 NL for the worst of the four leagues at -0.45/625 PA.
(36) Updating the total record to include 1969 AL, all four of the leagues sampled had net positive strikeout differentials, and three of the leagues sampled had net positive walk differentials.
(37) There is also broad agreement across all of the expansion analyses that expansion leads to more walks and more strikeouts, although the effects and projected effects would charitably be interpreted as modest at most. The theory that expansion altered walk and strikeout data in the 1969 National League is also severely compromised because most of my underlying data seemed to suggest that strikeouts should have been more upwardly affected in the American League, while in fact the strikeout rate was higher in the National League.
I didn’t read that one myself. My direct experience with Curran came from his Mitts, which I read when I was about 11, but am still convinced was absolutely awful.
That is not to say that I don’t struggle doing them, though. In fact, I grew so frustrated with this one that, if I hadn’t had the backup of the main text, giving me the feeling that there was really very little to lose in trying, I might have given up. The difficulty is that summaries usually have a super-objective sound and a passive voice. Those are not necessarily staples of most good writing, but if executed correctly, they can help make the summary a more concise version of the original. It is a kind of writing you had better know how to do, though, and one that really conflicts with my normal mania for thoroughness. When writing in this style, my normal series of conditionals inevitably results in a baroque product. I won’t apologize for the frustrating writing, but I understand if you are pushed more toward the “unsubscribe” side than toward sharing with a friend.
I am aware that low scoring over a period of years provoked the mound-height change. A 1968 season in the presence of other seasons of reasonable run scoring would likely have prompted a more cautious approach.
Dodger Stadium was also associated with a mound that by legend somehow skirted regulation and had a steeper slope, although perhaps it was only after mounds were lowered to 10 inches that this reputation took hold. I checked statistics for Dodger hitters and pitchers at home and on the road in 1967 and 1968. The four home cells had an average strikeout rate of 16.8%, and the four road cells had an average strikeout rate of 16.0%. An index like that, which would read 103 on Baseball Reference, does not support mound height driving strikeout rate (although plenty of assumptions go into that read).
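To make the arithmetic explicit: the raw home/road ratio reads about 105, and one way to land near the 103 cited is the common “halved” park-factor convention (teams play roughly half their games at home). Whether that is the exact convention behind the figure is an assumption on my part.

```python
# Home/road strikeout index for the Dodger Stadium check.
home_k, road_k = 0.168, 0.160                    # averages of the four home and four road cells

raw_index = 100 * home_k / road_k                # ~105.0
halved_index = 100 * (home_k / road_k + 1) / 2   # ~102.5, halved toward 100

print(round(raw_index, 1), round(halved_index, 1))
```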
Academic stats has passed me by, but I’m sure something like a Hierarchical Linear Model would be more powerful. A tool that measures different sources of non-independence simultaneously is ideal.
I can’t resist a note detailing the rise in strikeout rates today compared to 1969. No pitcher in 1969, throwing out intentional walks for ease of comparison, faced 200 batters and posted a 25% strikeout rate. That contrasts with 123 such pitchers last year (to help if you’re a K-per-9 guy, 121 of those 25%+ pitchers also struck out at least one batter an inning).
On the other end of the spectrum, there were 22 pitchers in 1969 with 200 batters faced and a strikeout percentage under 10%. Only the Angels’ tall rookie Jack Kochanowicz met that criterion last year.
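A sketch of those leaderboard filters, assuming per-season pitcher data in a pandas DataFrame with hypothetical BF, IBB, and SO columns (and hypothetical p_1969 / p_2024 frames in the usage note); whether the 200-batter floor applies before or after removing intentional walks is my guess.

```python
# Build an adjusted strikeout-rate table: intentional walks removed throughout.
import pandas as pd

def k_rate_table(seasons: pd.DataFrame, min_bf: int = 200) -> pd.DataFrame:
    df = seasons.assign(BF_adj=seasons["BF"] - seasons["IBB"])
    df = df[df["BF_adj"] >= min_bf]
    return df.assign(k_rate=df["SO"] / df["BF_adj"])

# e.g. (k_rate_table(p_2024)["k_rate"] >= 0.25).sum()  # the 25%+ count
#      (k_rate_table(p_1969)["k_rate"] <  0.10).sum()  # the sub-10% count
```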
The two 1969 pitchers with 200 batters faced and the very worst strikeout rates, by the way, are both big names: Johnny Podres, who had the very worst, and Joe Niekro. Podres was on his last tour of duty, after being out of the majors altogether in 1968, and threw only 64.2 innings. He called the expansion Padres his home team (at least one assumes he did, since that was the truth). Joe Niekro didn’t come over to the Padres until the third week of the season (he was part of a three-player package the Cubs forked over for Dick Selma), but was more core to the team, compiling 202 innings.
One might ask if the phenomenon of the slant towards relieving extended to general effectiveness beyond strikeouts. Indeed it did: AL relievers led NL relievers in 1969 E.R.A., 3.50 to 3.82, but the tables turned for starter statistics, with the NL coming out on top 3.52 to 3.68. The interaction, at least directionally, was also there in 1968: a 2.95 to 3.09 AL edge for relievers in E.R.A., but a 2.99 to 2.96 AL deficit in E.R.A. for starters.
Maybe there’s a ground ball/fly ball difference between starters and relievers, something that would also be touched by a lower mound? That’s the only thing I could come up with.
I suspected that year may have featured a cold September, and that this was why home runs were down so dramatically. While, not being a meteorologist, I’m not equipped to analyze weather rigorously even with extensive data, this doesn’t appear to have been the case. ChatGPT, apparently informed by the National Weather Service site, gave me the daily average September temperature by year in Central Park from 1965-1974. The 1969 average of 69.0 degrees rated colder than average, but an unremarkable 0.91 standard deviations below the mean year.
When I probed for information on Chicago, ChatGPT thought I was getting lazy and should do the work myself. It turned out my standard source for current weather, wunderground, has monthly historical data by region. Chicago’s 1969 September daily average temperature was 66.2 degrees. But it is colder there than in New York, so that actually scored as a positive z, 0.24, based on the 1965-1974 average.
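The z-scores here are just the standard computation; below is a sketch with placeholder inputs, since the actual 1965-1974 monthly means would come from the sources just described.

```python
# Z-score of one year's September mean temperature against the 1965-1974 series.
from statistics import mean, stdev

def september_z(yearly_means: dict, target_year: int) -> float:
    temps = list(yearly_means.values())
    return (yearly_means[target_year] - mean(temps)) / stdev(temps)

# e.g. september_z(central_park_sept_means, 1969) would come out around -0.91
# per the text, and the Chicago series around +0.24.
```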
Note that the average Central Park temperature in April of 1969 was under 55 degrees, as it always was in that era. Yet home runs-per-575-at-bats were 13.83, compared to 11.26 in September. So average temperature obviously explains only so much.
The September 1969 home run rate was just a blip. Home runs took another step forward in 1970, up another 9.2% in MLB from 1969 as a whole. The overall 14.92 HR/575 AB topped any individual month of 1969.
But those who said that Maris’s 61 home runs were aided by expansion should get a pass here and cannot automatically be said to have engaged in kneejerk criticism. You might say that they were out of line in sentiment, and even that analytically they made a mountain out of a molehill, but the question of the effect of expansion on league statistics is wholly separate from the benefits it provided for individual players. There, of course, Maris’s statistics, as well as those of every other hitter, would be helped by facing worse overall pitching. The same would obviously go for any pitcher in the context of expansion.
1961 American League, 1962 National League, 1969 National and American Leagues, 1977 American League, 1993 National League, 1998 National and American Leagues
That kind of makes sense. Expansion teams have held their own in drawing walks, but perhaps at the cost of a really high number of strikeouts.
Not to be confused with the liberal group Media Matters! You know you are a statistician when you speak instead of median mattering.
The highest individual sum of +3.16 belonged to the 1961 Angels, corresponding to a 1.76 z, or the 96th percentile. Granted, it’s constrained by the small sample size it belongs to, but as an individual z value, this is not really notable.
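For anyone checking the conversion, the percentile follows from the standard normal CDF (assuming normality, which is itself an approximation here):

```python
# Convert a z value to a percentile via the standard normal CDF.
from math import erf, sqrt

def percentile_from_z(z: float) -> float:
    return 100 * 0.5 * (1 + erf(z / sqrt(2)))

print(round(percentile_from_z(1.76), 1))   # ~96.1
```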
As the Mets would the next year, the expansion Angels led their league in walks drawn, but their pitchers walked batters at a rate 1.68% above league average. Going by the gap from league average, they were actually worse at issuing walks than they were good at drawing them. They completed both ends of the deal, leading the league in walks issued as well as walks drawn.
The Angels’ offensive strikeout differential of +3.72% (17.14% vs. a 13.42% league rate) was very much a team effort, but as high-strikeout guys I will cite Ken Hunt, who finished 2nd in the AL in strikeouts, along with Steve Bilko, George Thomas, and Gene Leek, who together struck out about as frequently per plate appearance as Hunt but didn’t play as much.
On the pitching side, the Angels were 2.04% over the league average. Although he threw just 99 innings for them, Ryne Duren grades out as most responsible for that: his 23.58% strikeout rate was more than 75% above league average. Rookie Ken McBride, 5th in the AL with 180 punchouts, also did his part.
The good strikeout showing of the Colt .45 pitchers was led by Turk Farrell, 4th in the NL in strikeouts (behind Drysdale, Koufax, and Gibson), and by Ken Johnson, who was tied for 6th. Johnson was 2nd in the league in SO-per-9.
The identical SO Effect total for the Expos and Royals was also exactly the same in the components: a 0.08% Offensive SO Diff, and -0.28% Pitching SO Diff. I am ambivalent about sharing some of these things, but some of you really enjoy them, and I have also met people in sore need of hard evidence of the existence of surprising coincidence, so I feel it does provide a service.
Perhaps we can annually adopt some into a body and hold ceremonies, like for the Hall of Fame. Wouldn’t want to be gifted an “expansion player jacket,” though. Wouldn’t exactly hold the prestige of a Green Jacket.
I might have felt more tied to something specific, and less at liberty to spitball, if I had known at the time that replacement level is specifically defined by Baseball Reference as .294 winning baseball.
The salience of this objection to studying expansion years and not the years previous to them can wait until the next study.
They are two of the three “true outcomes,” one clue to that.
OBP and SA did not change at three decimal places; the SO rate without position players increased from 22.64% to 22.70%, and the walk rate fell from 7.93% to 7.92%.
I know juxtaposing the two percentages probably suggests a comparison is in order, but please note this percentage is vastly different from the 29% increase in walks. The only comparison that is intended is in terms of the sign. We are talking about different variables in different forms. Don’t be a Trump and think “same difference” when it comes to tariff rates and trade deficits, after all!
and I guess in the case of contemporary pitchers, are getting fired.
Given the little I know, it seems unlikely, though. Much more probable is that the league average rates were used for hitting, fielding, and baserunning categories, and then a certain percentage below used to define replacement.
I struggled with how to describe the more negative WAR because we need to remember that the WAR numbers must all balance. Everything is within the confines of a distribution, and absolute statements technically cannot be made about a part of a distribution versus the same part of another distribution. Numbers at the bottom might be more negative, but we don’t know if this is because everyone else was a little better than usual, or the worst pitchers were truly worse.
All three years sampled had cumulative negative Defensive WAR, with the defensive component in fact making up a majority of the negative WAR: in every year, cumulative negative Defensive WAR was at least 62% of total WAR (which was negative in each case). I’m not sure exactly what to conclude from this, although I think it would be erroneous to conclude that defense is more central to WAR than offense. My suspicion is that it’s a scaling issue of some kind. It could also be a reflection on the composition of negative WAR in the case of players with limited playing time.
O.k. I do take my baseball research a little seriously.
Actually, to keep everything as close to constant as possible, I used as my target the exact share of plate appearances I had used in that league in the first study. So, if players with 25.2% of the total plate appearances had been used in the first study, I cut off plate appearances at the number that would return a share as close as possible to 25.2% in this second part.
Although at least I did capture their performance when I looked at expansion teams specifically.
Why I didn’t adjust this aspect midstream here (i.e., do the year before expansion), I’m not sure. It was probably an oversight, or a desire to keep things consistent between parts A and B, more than a consideration of possible time savings from having done the earlier study.
The formula is, certainly, inexact and subjective (maybe it didn’t have to be, but I did nothing more than eye game totals of top starters and relievers). The right weighting is again a matter of what you want, but it does seem likely that the right ratio of a relief game to a start would not be the same for all eras.
It was odd that one year produced quite a different figure for this percentage than the other two, all the more so because the outlier was 1993, the middle year in terms of relief pitcher usage. Personally, I don’t think the difference is of any great moment. If forced to hazard a guess, I would say that fewer relief pitchers meant fewer low-inning pitchers than there were in 2024, while five-man rotations, probably not in widespread use in 1969, tilted the balance away from relief pitchers in general.
An alternate theory to a higher value placed on defense in ‘93 would be more terrible defenders in circulation that year, but with these terrible defenders still generally kept on the bench. This is a possibility, and one that could be checked. Bad defenders, statistically speaking, are necessary in every year; terrible defenders are not.
Note that this was broadly consistent with the strikeout difference between the groups. I do think if BAbip were worked out, batters would still be shown to be more in arrears to league average than pitchers, but I won’t promise this would be the case.
1993, at least, shows a clear trend in BAbip. With the strikeout rates relative to average almost the same, bottom pitchers only allowed hitters an average 11 points over the mean, while bottom hitters hit 27 points worse than league average. Note that what is reflected is not just the respective correlations between BAbip and bad hitters and bad pitchers, but how managers react to BAbip in terms of giving players and pitchers more opportunities.
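For anyone recomputing these, a minimal BAbip helper using the standard formula; whether sacrifice flies were included in the figures above is an assumption on my part.

```python
# BAbip: hits on balls in play over balls in play. Sacrifice flies in the
# denominator is the usual convention, though variants exist.
def babip(hits: int, home_runs: int, at_bats: int, strikeouts: int, sac_flies: int = 0) -> float:
    return (hits - home_runs) / (at_bats - strikeouts - home_runs + sac_flies)
```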
I found clear evidence of the slugging average split, which is to say that it’s not one of those things that appears in a summary but is hard to make much sense of. Of the 63 NL players who qualified for the batting/slugging average titles that season and therefore had at least 501 plate appearances, only two, Hal Lanier (.251) and Freddy Patek (.296), slugged under .300. Taking the 40 players in the bottom playing time group who had at least 200 plate appearances, meanwhile, 12 had slugging averages under .300. (I suppose part of the difference can be attributed to just more variability in smaller samples, not the emphasis teams put on slugging percentage. For what it’s worth, no one in the bottom quarter really showed the opposite trend in slugging average, though, with Carl Taylor best at .457. And Taylor stood out much more for his average (.348) and OBP (.432) than for his power.)
The worst slugging average in this 200-330 PA group belonged to Reds’ rookie shortstop Darrel Chaney, whose batting line struck all of the themes contained in the composite numbers. Chaney didn’t homer in 201 AB and hit .191. He struck out 75 times, for a very modern 31.6% k rate. He did, however, walk 24 times, clearing the 10% mark.
Just 21 in 1969, he would carve out a nice 11-year career for himself. He had five more seasons of 200 or more plate appearances, and took at-bats in the 1970, 1972, and 1975 World Series, although he had nothing to show for them except a couple of intentional walks issued him by Dick Williams in ‘72.
Leaving Cincinnati for Atlanta in ‘76, he started 147 games at short, and hit .252 with a .324 OBP. The Rtot method never liked his shortstop defense, and so unless he played another position, his overall WAR was likely going to come out negative, which it did that season.
To document an impressive turnaround: after spending most of ‘70 on the Reds bench, and most of ‘71 with Indianapolis, he took considerable playing time away from Davey Concepcion in the second half of ‘72. Chaney’s surprising offensive year included just 28 ks in 230 PA, and 29 walks. Remarkable progress from ‘69!
Chaney and Concepcion are only about three months apart in age. Concepcion hit .260 as a rookie in 1970 in 265 AB, but followed with OPS+es of 44 and 59 the next two years in larger sample sizes. He was 24 at the end of 1972. Maybe his metamorphosis wasn’t quite Justin Turner-like, but who could have seen .282, 81 home runs, 217 steals, 9 All-Star games, and an All-Star game MVP (earned by hitting a home run off Dennis Eckersley) coming in the next decade?
The overall edge for the AL pitchers hinged very much on their statistics, by the way, not the AL hitters having unusually bad strikeout rates and making them look good by comparison. The AL hitter strikeout differential was +2.36%, which compared to +2.81%, 2.35%, and 1.70% in the three other leagues.
I suppose this could be tested by looking at pitch-framing data broken down by pitcher height, in cases with a significant height discrepancy.