Grand Slam Opportunities by Era: It's Not Just about OBP and Runs Scored

Sep 12, 2023

In a recent game recap, the Associated Press gave context to one-time first overall pick Royce Lewis’s grand slam binge, noting that he was “the fourth player to hit three in a span of eight games or less.” What grabbed my attention, however, was that one of the previous times it happened came in 1968, of all seasons. This mountain climber was Jim Northrup of the champion Detroit Tigers, who actually hit these slams over just four games, not eight.

Perhaps sabermetrically-inclined fans have gotten so disciplined in noting the effects of era that they are now inclined to exaggerate them. Three American Leaguers (Frank Howard, Willie Horton, and Hawk Harrelson) hit at least 35 home runs that season, after all. Northrup himself hit 21. While Lou Gehrig was the easy correct answer to this trivia question, why couldn’t Northrup, with his 21 home runs, be another? It struck me, however, that the effects of a bad offensive environment would be more likely to show up in grand slams than home runs. A grand slam is a team achievement, and it logically should be harder for four players to defy averages than for one. So I set out to measure just how tough the conditions Northrup faced in 1968 were.

My mindset was of someone doing exploratory work, and I did not intend to do an extensive study (no surprise that I failed). So I variously analyzed number of grand slams, number of bases loaded opportunities, and what affected number of bases loaded opportunities, before settling on principally studying the latter.

To start, I used Baseball Reference Stathead and got a count of grand slams and of plate appearances-per-game with the bases loaded for every year back to 1914. Total grand slams in a year are in part a function of the number of teams in operation, and to a lesser extent, whether the schedule was 154 games, 162 games, or a different number, as it was in 2020. There is also the issue of Stathead not having complete coverage of games with situational splits such as “bases loaded,” although aside from 1914-19 and the World War II years, missing games are not extensive. Because of these factors, while total grand slams make the data easier to understand, and are useful in comparing years that are apples to apples, I worked most seriously with grand slams-per-game. My source for the number of games with available data was the PBP (Play by Play) game count Baseball Reference lists on a page entitled “Data Coverage.”
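To make the per-game adjustment concrete, here is a minimal Python sketch. It is only an illustration, not the spreadsheet work behind this study; the function name `per_game_rate` and both season totals are hypothetical.

```python
def per_game_rate(event_count, covered_games):
    """Events per game, counting only games that have play-by-play coverage."""
    if covered_games <= 0:
        raise ValueError("covered_games must be positive")
    return event_count / covered_games

# Hypothetical seasons (NOT real totals): B has more slams, but also
# more teams and a longer schedule, so more covered games.
season_a = per_game_rate(event_count=40, covered_games=1230)
season_b = per_game_rate(event_count=60, covered_games=2430)

print(f"A: {season_a:.4f}  B: {season_b:.4f}")
```

In this made-up pair, the season with more total grand slams has the lower per-game rate, which is why the raw totals are only comparable between like seasons.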

The first numbers that I reviewed suggested Northrup’s feat of hitting three grand slams in four games really did deserve special recognition for happening in 1968. For the major-league year as a whole, there were only 37 grand slams hit. However, then I saw that there were also only 39 grand slams in 1967, not that that was a hitter’s season, featuring just 0.35 more runs per game scored than in ‘68. The .023 grand slams-per-game in ‘68 was the lowest since 1943.

Up until 1968, the most recorded grand slams in a season had been hit, wouldn’t you know it, in 1961, when 77 were recorded. Even with a couple more expansions, it wouldn’t be until 1987 that a year got into three digits for grand slams, and that was 100 exactly. The current record for MLB grand slams in a year is one home run record that survived the onslaught of 2019, as the 176 hit in 2000 remain the standard. With counts this low, it perhaps strains mathematical belief for one player to hit three grand slams in four games in any season, no matter which one.

A statistic that speaks more specifically to opportunity is bases loaded plate appearances per game. You can’t hit a grand slam if the bases aren’t loaded. The years 1966-1968 form a cluster, with averages of 1.507, 1.508, and 1.517, respectively. Those were the lowest yearly averages since the Dead Ball Era (I calculate 1.475 bases loaded plate appearances per game for 1919). But 1968 was not the nadir, as you can see. The reason 1968 grand slams were the very lowest in the cluster was that the home run rate per plate appearance in those bases loaded situations was the lowest.

My initial hypothesis was that runs scored would not be the best gauge of bases loaded opportunity, but on-base percentage minus home runs would be. Basically, it seemed to me that slugging average would not be helpful in producing bases loaded situations. Runs scored are in part a function of slugging average, so I thought on-base percentage would be more indicative. On-base percentage itself is also helped by home runs, so I thought on-base percentage with the home runs taken out would be best of all. Comparing the three statistics for the seasons 1914-2023, however, with bases loaded plate appearances per game the constant variable and criterion, produced the following three correlations.

OBP .654

Runs .601

OBP w/o HR .509
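For anyone who wants to run the same kind of comparison, the following is a minimal sketch of a Pearson correlation between a yearly predictor and BL PA/G. The five season values are invented for illustration and are not the Stathead data; `pearson_r` is a name I am assuming here, not anything from Baseball Reference.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of yearly values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical league-wide values for five seasons (NOT real data).
obp = [0.310, 0.325, 0.340, 0.318, 0.332]
bl_pa_per_game = [1.52, 1.70, 1.95, 1.66, 1.80]

r = pearson_r(obp, bl_pa_per_game)
print(round(r, 3))
```

Swapping in runs per game or OBP without home runs as the first list would give the other two coefficients, with BL PA/G held constant as the criterion.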

It appeared my theory that home runs didn’t make bases loaded situations more likely was simply wrong. But for some reason, runs was quite clearly outperformed by OBP as a predictor of bases loaded situations.

The next step I took to get some perspective was to sort all years by the rate of grand slams per game. The rate in 1968 was 19th-lowest of the 110 years, but, oddly, the surrounding years were mostly ones we are taught to think of as dissimilar to 1968. Namely, there were a lot of 1920s seasons with similar grand slam rates, including 1927, which was one place ahead of 1968. 1930, with the .296 MLB average, was two spots behind 1968, and 1936, when the great Yankees scored 1065 runs, four spots in arrears.

Having once won a bet with someone who didn’t know that the hitting happiness of the ‘20s and ‘30s extended to real power for only a few players, I could accept this result, even though it wasn’t one I had anticipated. But this rationalization failed when I noticed that the low grand slam rates in the ‘20s and ‘30s didn’t all trace back to very low home run percentages; bases loaded plate appearances per game were also not standing out. Ranking all years by OBP and bases loaded plate appearances per game, respectively, a typical 1920s season ranked 12th in OBP but 64th in BL PA/G. This was particularly curious because the correlation between the two over the entire 110 years (.654) had already been established.

I could only conclude, therefore, that Baseball Reference must be missing more BL plate appearances than their game coverage indicates. This could also be affecting the grand slam counts themselves. I double checked from the Data Coverage page that when they say a game is covered, they mean that it is completely covered, and they did appear to mean that. There was no reason to think that the PBP count was inflated, but the discrepancy I was seeing made little sense to me otherwise. Happily, however, I was simply not yet entertaining the idea that bases loaded frequency could be as complex as it appears to be (I say “happily” because it is very stressful to discover and ultimately to try to redress a data problem).

While the low grand slam rate of 1936 had caught my attention, now that I was focusing on plate appearances with the bases loaded, 1936 ironically emerged as a turning point, as it marked the end of the surprising low-bases-loaded counts. BL PA/G were the 4th-highest of the past 110 years, the same rank as on-base percentage. My initial theory about the low grand slam counts of the era really did pertain to 1936: the reason there were only 22 grand slams hit that year was because hitters just did not go deep with the bases loaded much, despite their ample opportunity. There were more MLB home runs hit in 1936 than 1935, so by the norms of the time, it was not a low-home run season. But it was a fluky year when grand slam home run percentage was lower than overall home run percentage. In 1937, OBP and BL PA/G both rank 15th, and within a few years, the theory that bases loaded plate appearances are not proportionate to reported games because they were not getting recorded clearly loses plausibility.

I reran the correlations between BL PA/G and OBP, Runs, and OBP w/o home runs, this time excluding 1914-1935. No one category figured to necessarily gain at the expense of the others with the revision, but erroneously low BL PA/G counts would have had the potential to distort and attenuate the correlations. The new correlations were

OBP .830

OBP w/o HR .810

Runs .679

A few observations. First, the correlations as a whole rose substantially. Second, there is now no serious case to be made that runs are the best indicator of an era’s bases loaded opportunity; on-base percentage is better. This is not new, as we were already using OBP as the gold standard. Third, removing home runs from on-base percentage does not now seem to be harming the measure in the same way it was before, but given its deficit to regular OBP, neither does there seem to be any reason to pay attention to it.

While the period before 1936 might contain the most disconnect between OBP and BL PA/G, it is still evident that the two variables do not run on the same track for all of 1936-2023, even if the correlation coefficient is respectable. Noting the differences and the direction of those differences is the first step before trying to figure out their cause.

To give you the history, then, the pattern from 1940-1956 is the opposite of the pattern before 1936. Bases loaded plate appearances are more common than the league on-base percentages suggest. All of the differences between the ranks of the two variables are in this direction, with an average difference of 25 ranks (ranks 1 and 110 would be a difference of 109; there are 110 seasons of data). When the difference in ranks is only small, it is usually because the rank in OBP is high enough to begin with that there was little room for improvement. That problem can be surmounted by dividing the ranks; the average rank of BL PA/G is 38% of the average rank of OBP.
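The rank comparison described above can be sketched as follows. This is a hedged illustration with six invented seasons rather than the 110 real ones; `rank_descending` is a hypothetical helper, and giving tied seasons their average rank is my assumption about how ties were handled.

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of the 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

# Hypothetical league values for six seasons (NOT real data).
obp = [0.330, 0.318, 0.345, 0.318, 0.350, 0.300]
bl_pa_g = [1.90, 1.85, 1.80, 1.60, 1.95, 1.50]

obp_rank = rank_descending(obp)
bl_rank = rank_descending(bl_pa_g)

# Treat the first two seasons as the "period" under study.
period = [0, 1]
diffs = [obp_rank[i] - bl_rank[i] for i in period]  # positive: BL PA/G ranks better
mean_diff = sum(diffs) / len(diffs)
ratio = sum(bl_rank[i] for i in period) / sum(obp_rank[i] for i in period)

print(mean_diff, round(ratio, 3))
```

The ratio is the useful number when a period's OBP ranks are already near the top, since a difference in ranks has little room to grow there.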

Convergence is better from ‘57-’78, but I will stress that the difference still runs in patterns, meaning it is clearly not random, and that there is more going into BL PA/G than just OBP. Those undocumented factors, tending to be similar from year to year, are presumably playing a part in governing BL PA/G.

1979-1987 is another time when BL PA/G lag behind OBP. This is true of every season in this interval. The OBPs aren’t brutally low, with an average rank of 66, but bases loaded plate appearances were hard to come by (average rank of 90).

A variable I never look at formally in this study is stolen bases; if there were a negative correlation with BL PA/G, it probably wouldn’t be because of them, per se, but because they indicate a high level of speed in the game, and therefore a high rate of runners going from 2nd to home, etc., and a potential scattering of those needed clean bases loaded lines. For what it’s worth, even though the most prolific base stealer of all time debuted in ‘79, and Lou Brock took his final baseball bows then, ‘79-’87 isn’t a credible delineation of the golden stolen base era of the modern age. There were more stolen bases per game in 1976 and in 1992 than there were in 1985, for instance. But maybe ‘79-’87 was the height of the artificial turf era, which would seem to have more to do directly with runners taking extra bases than stolen bases (But then, while balls landing in the outfield on artificial turf often go to the wall, facilitating the taking of an extra base, I suppose one can argue that balls slowing down in the outfield on grass are best for letting runners go 1st-to-3rd and 2nd-to-home. Another possibility is that ‘79-’87 was a time when the hit-and-run was employed most often, and more hitting-and-running meant less frequent bases-loaded situations).

From 1988-2001, there is good alignment between OBP rank and BL PA/G rank. If you stop to think about it, you know OBP increased considerably in this period, but BL PA/G changed along with it. In general, BL PA/G lagged a bit behind OBP, but not to the extent it did from ‘79-’87. This is not the only example of a trend with the difference of these ranks enduring into what most would think of as a different era because of a change in the level of scoring. Perhaps, then, when we think about eras, we pay too much attention to level of scoring, and not to other aspects which define the play.

The story has been different from 2002-2023. There has not been a season where OBP rank was better than BL PA/G rank. There are unmistakable mini eras within this larger era, so it’s probably wrong to generalize about the size of the difference. The time when the data are most discrepant is from 2018-2021. Mean OBP rank was 88, but hitters enjoyed a relatively high rate of bases loaded situations, with the average rank 46. In 2022 and 2023, the numbers have gotten more in line, with a difference of 12.5 ranks between BL PA/G and OBP in both years. It would make for a narrative to say that banning the shift had reduced bases loaded situations, but as I said, the same difference is found in 2022, when full shifts flourished.

I gained further clarity into the dynamics of bases loaded opportunity by sorting the years from highest to lowest in BL PA/G. The top 10 are 1948, 1949, 1950, 1936, 1938, 1951, 1947, 2000, 1941, 2008. So every year from ‘48-’51 was in there. Those powerhouse Red Sox teams aside, we don’t usually think of that era as the greatest offensive era in baseball history.

But what does stand out about that time? All the walks.

I added hit-by-pitch counts to walks, as in their effect they are indistinguishable, and on a per-game rate, 1948-1951 are four of the six highest seasons in the database. It was the Ted Williams era, and also of the walking Eddies (Stanky, Yost, and Joost). It was the time of Tommy Byrne walking 489 men over three seasons (and hitting 50 more, although that was not emblematic of the era).

Could there be a connection? I wondered if walks and hit-by-pitches somehow did more to increase bases loaded situations than other baserunners.

Deviating a bit from the OBP framework, then, which is calculated per plate appearance, not per game, I nevertheless decided to work with hits per game and “walks plus hit-by-pitches” per game separately.[1] I regressed BL PA/G on both of those variables. If the type of baserunner didn’t matter, the two weights would be equal.

Instead, I obtained the following weights

BB+HBP .528 (Beta .628)

H .203 (Beta .368)
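A regression of this shape can be sketched in plain Python. What follows is an illustration under stated assumptions, not the software used for the study: the function `fit_two_predictors` and the five seasons of data are hypothetical, and I am assuming that “Beta” above means the usual standardized coefficient, b multiplied by the predictor’s standard deviation and divided by the criterion’s.

```python
from math import sqrt

def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = a + b1*x1 + b2*x2.

    Returns (a, b1, b2, beta1, beta2), where each beta is the
    standardized weight b * sd(x) / sd(y).
    """
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(u * u for u in d1)
    s22 = sum(u * u for u in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * v for u, v in zip(d1, dy))
    s2y = sum(u * v for u, v in zip(d2, dy))
    syy = sum(u * u for u in dy)
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are perfectly collinear")
    # Solve the 2x2 normal equations on the centered data.
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    beta1 = b1 * sqrt(s11 / syy)
    beta2 = b2 * sqrt(s22 / syy)
    return a, b1, b2, beta1, beta2

# Hypothetical per-game league values for five seasons (NOT real data).
bb_hbp = [3.2, 3.6, 3.9, 3.4, 3.7]
hits = [9.1, 8.8, 9.4, 8.6, 9.0]
bl_pa_g = [1.60, 1.75, 2.00, 1.62, 1.85]

a, b_walks, b_hits, beta_walks, beta_hits = fit_two_predictors(bb_hbp, hits, bl_pa_g)
print(round(b_walks, 3), round(b_hits, 3))
```

If the type of baserunner did not matter, `b_walks` and `b_hits` would come out roughly equal; a larger `b_walks` is the pattern reported above.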

I derived these weights only using seasons after 1935, just to take no chances on the accuracy of the BL PA/G data. While before I had correlated OBP with BL PA/G and found a .830 correlation, using baserunners per game, the construct dropped to .798. But what was notable was that the model, with different weights for walks/HBP and hits, upped that to .834 (adjusted downward for the advantage of the two variables). Moreover, the relative weights showed that the adage a “walk is as good as a hit,” overselling walks when it comes to their general value, undersells them when it comes to their role in producing bases loaded situations. Walks are more useful than hits.

My next step was to re-rank seasons, applying the regression formula and not OBP. Every season receives a projected BL PA/G score. 1948-1951, which had ranked 22nd, 12th, 7th, and 30th in OBP while occupying four of the top six spots in BL PA/G, ranked 7th, 3rd, 1st, and 9th with the added bonus for walks and hit by pitches.
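Here is a small sketch of that re-ranking step. The two weights are the ones reported above; the three seasons are invented. The regression intercept is not given in the text, so it is left out, which is harmless for this purpose because adding the same constant to every season cannot change the order.

```python
W_BB_HBP = 0.528  # weight for (BB + HBP) per game, as reported in the article
W_HITS = 0.203    # weight for hits per game, as reported in the article

def projected_score(bb_hbp_per_game, hits_per_game):
    """Weighted sum used only for ordering seasons (intercept omitted)."""
    return W_BB_HBP * bb_hbp_per_game + W_HITS * hits_per_game

# Hypothetical seasons (NOT real data): (BB+HBP per game, hits per game).
# A and B have the same total baserunners per game, split differently.
seasons = {"A": (3.9, 8.8), "B": (3.1, 9.6), "C": (3.5, 9.0)}

scores = {name: projected_score(*vals) for name, vals in seasons.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Seasons A and B put the same number of men on base, but A gets there with more walks, so the walk-heavy season is projected for more bases loaded plate appearances.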

It seems that bases loaded is a precarious state, and the most impactful offensive events, i.e., hits, tend to knock over this house of cards. That’s overstating things a bit, because hits still increase bases loaded likelihood (the significance level of the variable left no doubt about that), but they are not the efficient vehicles that walks and hit by pitches are.

In finding this essential role for walks, could walks really just be a proxy for intentional walks? After all, if I ask you to name the first intentional walk situation that comes to mind, it might be a manager issuing one with runners on 2nd and 3rd and loading the bases. Maybe it’s not walks per se that have the special added importance for bases loaded. Maybe intentional walks and bases loaded cases greatly overlap, and walks as a circle containing intentional walks reflect both.

If I had thought of this earlier, I would have evaluated it through modeling, but reviewing the historical data for intentional walk frequency seems to dispel the notion. Intentional walks per game basically rose from their first recording in 1924 until about 1970. The most acceleration occurred in the mid-60s[2], but intentional walks remained higher in the ‘80s than they were in the ‘50s. The high-walk years in question, anyway, were ‘47-’51, and it was not until the 21st century that intentional walks per game fell below the levels of those seasons.

The immense advantage of using this weighted OBP instead of regular OBP to project bases-loaded rates manifested outside of just the first seasons after World War II. While the two measures correlate at .79, in this context, that actually betrays substantial differences. The 1920s, where the BL PA/G rank had missed by an average of 52 ranks, had their ranks go down 43 per season with the weighted measure. There were individual changes of as much as 62 ranks. The verdict on BL PA/G in 1927 went from a miss of -53 to +3.

You have probably gathered from this that walk rates were low in the 1920s, and indeed they were. Walks differ from a lot of statistics, in that their fluctuation has been modest over most of baseball history. Even when I talk about all of the walks in the Williams era, it is within that context. Walk rates transcend era, to an extent. When Babe Ruth walked 170 times in ‘23 and set a record that would last until Barry Bonds, if you standardized his data for the league average, he was not doing what he did with home runs. And, at least since 1900, there have always been players who drew walk totals that would impress us today. 1916, for instance, rated 102nd in walk rate, but Burt Shotton (the future manager) drew 110 walks, and Jack Graney drew 102. What the model is saying, however, is that these small changes over time do go to explain a lot about bases loaded counts per game, which are themselves in a narrow window, with a range of less than 1.00 in the data.

Anyway, as a decade, the 1920s had an average BB+HBP ranking of 92nd, with a median of 95th. Hit-by-pitches are greatly outnumbered by walks in this amalgam, but were higher than they would be at any point until about 1993, so they are not the reason why projected bases loaded counts are as low as they are in the 1920s despite the high batting averages.

I have said that bases loaded PA/G and OBP converged in about 1936, and what do you know? Walk plus HBP ranking jumped up from 74th in 1935 to 45.5 in 1936, and did not drop below 62nd again until 1944, an aberration likely reflecting the war-depleted personnel.

The illumination that walks were a key to bases loaded frequency had greater resonance to me because it vindicated the accuracy of the early data. Truly, discovering this relationship probably saved me from the ignominy of writing Baseball Reference an e-mail expressing grave concerns about their reported data coverage. Now I know that the 1914-1935 Bases Loaded counts do not stand out as peculiar. Batting averages were superior and on-base percentages were high, but walks were not.

Based on what I have detailed so far, the realignment effected by weighting walks and HBP more seems downright chiropractic, but it is necessary to finish assessing all seasons before reaching this conclusion. Using just on-base percentage, I said before that 1940-1956 was a rocky time, with BL PA/G running well ahead of OBP. I have also said that accounting for walks did wonders to resolve this from ‘47-’56, when BB+HBP ranked in the top 20 in 9 of those 10 years. But, as walk rate was average from ‘40-’46, the big differences between the predictor and the criterion remain for those years. All still differ by at least 20 ranks.

Clearly, we would have to take a hard look to understand why hitters hit with the bases loaded as often as they did in these seasons in light of the hit and walk totals. Something about the way baseball was played in this time must account for it, and whatever it was, it was a change from the ‘20s and ‘30s, when players did not hit with the bases loaded more often than the model predicts.

The 1940-1946 period poses the biggest challenge if you think the job is basically completed by weighting walks in OBP more than hits. Over the past 75 years or so, I see a coherent picture, not that there aren’t rough spots. A twist on using correlation coefficients to assess the improvement is to see how often the revised formula came closer to the year’s actual rank in BL PA/G than OBP. I’ve covered the pre-1957 years, and the model comes out a resounding winner using this method. As for 1957 on, the walk-weighted model compiled a record of 39 wins, 25 losses, and 3 ties against OBP. Errors of 20 or more ranks were cut by 6, although there were still 16 of them.
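The win-loss-tie tally can be sketched like this. The function names and the five seasons of ranks are hypothetical; the only thing taken from the text is the rule that a model “wins” a season when its rank lands closer to the actual BL PA/G rank, plus the 20-rank threshold for a big miss.

```python
def head_to_head(actual, model_a, model_b):
    """Count seasons where model_a's rank is closer to the actual rank
    than model_b's (wins), farther away (losses), or equally close (ties)."""
    wins = losses = ties = 0
    for act, a, b in zip(actual, model_a, model_b):
        err_a, err_b = abs(a - act), abs(b - act)
        if err_a < err_b:
            wins += 1
        elif err_a > err_b:
            losses += 1
        else:
            ties += 1
    return wins, losses, ties

def big_misses(actual, predicted, threshold=20):
    """Number of seasons where the predicted rank is off by >= threshold."""
    return sum(1 for act, p in zip(actual, predicted) if abs(p - act) >= threshold)

# Hypothetical ranks for five seasons (NOT real data).
actual_rank = [10, 40, 75, 90, 22]
weighted_rank = [12, 55, 70, 60, 22]
obp_rank = [30, 45, 70, 95, 22]

record = head_to_head(actual_rank, weighted_rank, obp_rank)
print(record, big_misses(actual_rank, weighted_rank), big_misses(actual_rank, obp_rank))
```

Note that this scoring treats a win by one rank the same as a win by 20, which is a real limitation of the method.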

No other period suddenly snaps into place like the Williams and Goose Goslin periods did, but I do note quite a few modest corrections, which could in part be a product of some of the other OBP rankings that were so badly messed up being corrected. Every ranking exists within the context of the others.

1979-1984 is a good example of these modest improvements. Bases loaded PA/G were worse than OBP rankings, but these were all relatively low-walk years. The weighted-walk model’s errors are of the same sign as the OBP-centered ones, but they are less pronounced. The pattern of BL PA/G being overestimated extended to ‘85-’87, however, even though walks became more frequent. The model in fact resulted in a small regression from OBP rankings for these years.

The weighted-walk method does create one problem that wasn’t there before, which is the ‘94-’98 seasons. These were seasons when walks made up an above-average portion of on-base percentage. Regular OBP aligned fairly well with BL PA/G, 7 ranks better on average, and OBP as a predictor had an average miss of 10.4 (the numbers are different because BL PA/G was better than OBP in one instance). But adopting the model, the average error became 23. That’s a far cry from the average miss of 36 from 1940-1946, but still a real discrepancy.

As I try to understand these data, the theory that intrigues me is again that whatever it is exactly that a high stolen base rate signifies is something that keeps BL PA/G in check. Remember that with the proliferation in steals in 2023, we are only back to where we were in the late ‘90s in steals, so there were a lot of steals then. Stolen bases per game in 1997 were higher than they were in 1984.

In citing the overall better performance of the weighted-walk method post 1956, I did not realize how much of that was confined to post 2001, a time during which I noted BL PA/G have exceeded OBP in every season. While the median and mean improvement, 5 and 8 places respectively[3], might not seem large, the rank of the weighted-walk model compiled a record of 19-2-1 versus the rank of OBP. Subtracting that from the 39-25-3 mark for 1957 on leaves weighted walks with a losing record of 20-23-2 against OBP from 1957-2001, although it would not be a losing record without the 1-8 stretch from 1993 to 2001. The pervasiveness of the improvement over the past 22 years can be hard to understand; BB+HBP rates have been quite low in some years, most notably from 2012-2015 (ranging from 77th to 92nd). We do know how low batting averages have in general been the last few years, and while that was not true from 2002-2009, walk rates have generally been one step ahead.

Using the new variable’s rankings, no longer does every year in the 2002-2023 period produce a positive differential subtracted from BL PA/G rank. There are six exceptions over the past 22 years, although 2005-2016 still shows as pure systematic difference.

One of these is 2023. This season, based on the latest data I had, the imbalance in ranks in favor of BB+HBP and away from hits (87 ranks) trails only 2020 among seasons since 1914. At the same time, the gap between OBP and BL PA has narrowed. The upshot is that the weighted-walk model overpredicted BL PA by 33 ranks.

If I stop to reiterate the point, it is only because I had not appreciated it: 2023 has been quite a walkathon, and when you add in HBP, the per-game rate is the 7th-highest since 1914. Taken alone, walks per game were tied for 85th last year, and are 15th this year. The movement toward relievers inflates walks, but the basic model of “six and shower” existed in 2022, making it easy to cite the pitch clock as an explanation for the increase.[4]

Something else I noted was that when the weighted-walk model made a bad prediction, generally in the middle of otherwise good predictions, it was sometimes also a year of aberrantly high home runs. Bases loaded plate appearances undershot the model by 30 rankings in 1961 (Roger Maris’s year, of course), and by 34 rankings in 1962, when HR/G were down just 0.02, and otherwise the highest since 1956. The weighted-walk model undershot BL PA/G by 29 ranks in 1977, when George Foster became the first player of the decade to hit 50, and home runs were up about 50% from 1976. Home runs were up in 1986, and the model missed by 33 ranks; they were really up in ‘87, and the model missed by 44.

While I had already compared OBP without home runs to OBP, and found it wanting as a predictor, beyond these examples, there was a reason why the concept deserved another look. Both the eyeball-test evidence and the theory suggested that home runs lowered bases loaded opportunities, or at least padded OBP. But home runs correlate with both league walks and league HBP[5], and BB+HBP have been found to be very salutary for BL PA/G. So, one reason why the correlation of OBP and BL PA/G may have dropped when home runs were dropped from OBP is that low-HR, high-OBP years may also tend to have low walks, and low walks are not good for BL PA/G. By forsaking a correlation for a model that would break out walks separately, this confound would be removed.

To keep the terms pure, I didn’t add HR to H and BB+HBP as a variable, but just included (H - HR), and BB+HBP. A model with all three variables unaltered would have been flawed because home runs are a part of hits. Subtracting home runs eliminates the error of the model expecting MLB years where home runs drove a high hit total to have more bases loaded situations on that account. Note that the model is technically identical to the previous one, except hits have home runs removed. I am in a way analyzing the effect of home runs indirectly.

The multiple R took a sizeable jump over the previous model, from .839 to .882. So both my observation and theory seemed to be correct. The Beta weights for BB+HBP (.698) and H (.442) were both well up as well, hits by a greater percentage than walks, suggesting to me that the variable had indeed been cleansed. The slope for hits without home runs was actually down by 0.002, but I would take that as a meaningless reflection of the fact that hits without home runs has deviated more per year than regular hits. The larger the standard deviation of a variable, the smaller b is for the same correlation. Note as well that, although I no longer considered it necessary, for consistency’s sake, I again ran the regression on just 1936-2023, and the weights and multiple R reflect that.
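The point about standard deviation and slope can be checked with a toy example. This sketch uses a single predictor, which is a simplification of the two-variable model, and the numbers are invented; `slope_and_r` is a hypothetical helper. Doubling every value of the predictor doubles its standard deviation, leaves the correlation untouched, and cuts the slope b in half.

```python
from math import sqrt

def slope_and_r(xs, ys):
    """Least-squares slope b and Pearson r for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical per-game values for five seasons (NOT real data).
hits = [8.6, 8.8, 9.0, 9.1, 9.4]
bl_pa_g = [1.62, 1.75, 1.85, 1.60, 2.00]

# Same pattern, twice the spread: every value is doubled.
hits_wide = [2 * h for h in hits]

b_narrow, r_narrow = slope_and_r(hits, bl_pa_g)
b_wide, r_wide = slope_and_r(hits_wide, bl_pa_g)
print(round(b_narrow / b_wide, 6), round(r_narrow - r_wide, 6))
```

That is why a small drop in the raw slope alongside a rise in the Beta weight is not a contradiction.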

Adding home runs to the two other variables did not represent a methodological problem as long as hits were still defined without them. Indeed, at first this is what I had done, before thinking I had been so stupid I should hit myself in the head with a flyswatter. But the model including home runs is actually a defensible model. It theorizes that, not only are home runs not helpful for producing bases loaded situations (which actually isn’t really debatable; no one ever ended up part of a bases loaded situation after homering), but they are harmful. The idea is that home runs short-circuit bases loaded opportunities. Every time there is the beginning of a potential bases loaded situation, and a batter goes big fly, that bases-loaded opportunity goes out the window, and the team has to start over. One can also argue, given their covariation with BB+HBP, for including home runs simply as a placeholder, if the goal is to get the best assessment of the BB+HBP variable.

Home runs did have a negative sign in this model, of a size larger than would be produced by average variation, but it was small and not significant (t = 1.63; p = .12 two-tailed). In keeping with the t of over 1, the adjusted r rose to .8826, representing the best adherence yet. Nevertheless, for reasons of both theory and parsimony, I am more comfortable with the model with only statistically-significant variables. I don’t think the case for assessing an actual penalty for home runs is strong enough to warrant adopting the model that includes it.

Leaving theory behind for a bit, I repeated the step I took before of applying the latest regression model, generating new ranks, and comparing them with BL PA/G ranks. This experience left me at times mesmerized by how well the model sometimes seemed to fare, and at other times sobered by established problem periods that the model rated even more inaccurately than before.

My first observation was that the correction applied to the 1920s by weighting walks was diluted by removing home runs from hits. Since there weren’t many home runs hit in the 1920s, this change predicted more bases loaded situations than previously. The latest model was too high on bases loaded counts for every year from 1919-1930, and by an average of 22 ranks per year.

Then comes a period of remarkably smooth sailing. There is only one double-digit error in ranks from ‘31-’39.

The underestimation of bases loaded counts from ‘40-’46 is still there after freeing hits from home runs, but it is definitely improved, although the years ‘44-’46 remain quite beyond fixing.

The high-walk years are just flat out nailed, with all ranks within 6 of actual bases loaded rankings from ‘47 to ‘57. Part of this is because the ranks were close to the top, naturally making for less variation and a ceiling effect, but the model comes close on the 57th bases loaded rank of ‘57 as well as the other years in the period.

Those stray home run years such as 1987 are still off, but they are moderately improved (I suppose the model that directly penalized home runs might have made a better account of them). The 1987 rank difference moved from -44 to -29.

Removing home runs from hits did little to lower projected rankings and bring them more in line with actual BL PA/G from ‘94-’99. That’s somewhat surprising, based on our reflexive understanding of that time as home run happy, an image which is backed up to a certain extent in the numbers (The ranks for home runs were slightly above the ranks for hits). But it’s also true that hits per game have only gone down since that time, and home runs per game were quite a bit lower than in 2016-2023 (average of 1.06 from ‘94 to ‘99 counting all years equally; average of 1.21 from ‘16 to ‘23). So, it was not so much of a home run era, as just a great hitting era in general.

I suppose the effect that removing home runs from hits had on already underestimated bases loaded counts from 2002 forward should have been obvious, but getting the skinny was bloody indeed. The model moved projected bases loaded counts down when they needed to go up. The previous model made an estimate as good or better than this one every year from 2002-2022, to compile a 19-0-1 mark versus it. Finally, for this year, when the high ratio of bases loaded counts-to-base runners that has prevailed in recent years has cooled, reducing hits by removing home runs was actually helpful, and the latest model is making the better prediction.

With so many visible lumps, I was surprised that Model 3 only lost out to Model 2 by a score of 52-48-10, pitting them against each other year by year. The weakness of ranks is significant enough that such a narrow verdict cannot be thought of as dispositive. I don’t think any statistician would prefer them to regression in evaluating collections of variables. All differences in rank count the same, no matter whether scores are in a cluster or far apart. Relying on the head-to-head count seems to inevitably also mean giving one model virtually all of the yearly wins in one era, and then the other model virtually all of the wins in another. It’s not clear to me that the relative length of the eras should decide the best model. There is also a categorical imperfection within the categorical imperfection, as I counted all yearly victories the same, irrespective of whether they came by one rank or 20.
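To make the scoring objection concrete, here is a minimal sketch of a year-by-year head-to-head tally next to a margin-aware alternative (mean absolute rank error). The function names and the four seasons of ranks are made up for illustration; they are not the study's data or the exact procedure I used.

```python
def head_to_head(actual, model_a, model_b):
    """Return (a_wins, b_wins, ties), judging each year by absolute rank error."""
    a_wins = b_wins = ties = 0
    for year, true_rank in actual.items():
        err_a = abs(model_a[year] - true_rank)
        err_b = abs(model_b[year] - true_rank)
        if err_a < err_b:
            a_wins += 1
        elif err_b < err_a:
            b_wins += 1
        else:
            ties += 1
    return a_wins, b_wins, ties


def mean_abs_error(actual, model):
    """Average absolute rank miss, so a 20-rank miss counts more than a 1-rank miss."""
    return sum(abs(model[year] - rank) for year, rank in actual.items()) / len(actual)


# Hypothetical ranks for four seasons (illustrative only, not real data).
actual = {2001: 10, 2002: 20, 2003: 30, 2004: 40}
model_a = {2001: 11, 2002: 21, 2003: 31, 2004: 60}
model_b = {2001: 12, 2002: 22, 2003: 32, 2004: 41}

record = head_to_head(actual, model_a, model_b)
mae_a = mean_abs_error(actual, model_a)
mae_b = mean_abs_error(actual, model_b)
```

In this toy case the first model wins the head-to-head count 3-1-0 while missing by 5.75 ranks on average against 1.75 for the second, which is exactly the kind of disagreement a win-loss tally hides.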

I certainly believe, however, that no matter where you come down on the right adjustments for hits, walks, and home runs, there are still going to be significant discrepancies in the data that reflect more than random fluctuation. Simply put, there is likely more to bases loaded counts than hitting, and since I have only modeled hitting, it is inevitable that certain periods will be underestimated and some overestimated. Now that we have base running statistics that go beyond stolen bases, the puzzle piece is only missing because I haven’t tried to fit it.

One of the pleasures of doing this research was cementing my knowledge of the progression of different offensive norms. Although we never think of it as a statistic proper, bases loaded count most assuredly is one. Not only can its trend line be traced, but it appears to respond to its own mix of variables that do not just equate to runs scored. One of the key ingredients turns out to be bases on balls. So, if your goal were to hit a grand slam, finding that fourth ‘50s Eddie, Gaedel, in your clubhouse would be very welcome, as would finding copies of Moneyball strewn throughout the locker room.

This is as good a point as any to note that one imperfection in my study is that the general yearly statistics and ranks derived from them come from a Baseball Reference file with data including the Negro Leagues as well as the American League and the National League. This is an issue because Negro League data is not specific enough for something like “bases loaded” plate appearances to be collected. So, I am comparing yearly data on one front that has Negro League data, and on another that does not. Of course, this problem does not affect any season over the past 75 years or so. It also probably does nothing to explain why BL PA/G ran so far behind OBP from 1914-1935 and particularly during the 1920s. That was a hot hitting time in MLB where MLB on-base percentages on their own would have rated very highly, perhaps outpacing those in the Negro Leagues. Yet MLB-only BL PA/G was tepid.

Yes, 1968 had the co-second highest rate of intentional walks of any season in history! I would assume this just reflects the temperature of the time, and 1968 managers, if transported to the dugout and directing games in the year 2000, would have really gone crazy with intentional walks. But there is also the possibility that the mindset of pitching carefully in close, low-scoring games, of being afraid to make that one mistake that can beat you, really does take hold, and so intentional walks are not proportional to base runners.

The three years where weighted walks did the same or worse are included in these numbers, with 0 when there was a tie, and a negative number when there wasn’t.

Note as well that we have more stolen bases in 2023, consistent with the theory that they have a negative effect on the rate of bases loaded situations.

Over the 110 seasons, the correlation is .60 between (BB+HBP)/G and HR/G and .41 between just BB/G and HR/G. Seeing the difference in those correlations, I imagined a correlation between HBP alone and HR near 1, but in fact it is just .64.
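As a rough illustration of why adding a second series can lift a correlation without that series itself correlating anywhere near 1, here is a small sketch with a hand-rolled Pearson r. The three short lists are invented toy numbers standing in for BB/G, HBP/G, and HR/G; they are not the actual yearly figures.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Toy stand-ins (assumed values, for illustration only).
bb = [1, 2, 1, 2]
hbp = [1, 1, 2, 2]
hr = [2, 3, 3, 4]
combined = [w + h for w, h in zip(bb, hbp)]

r_bb = pearson(bb, hr)              # about 0.707
r_hbp = pearson(hbp, hr)            # about 0.707
r_combined = pearson(combined, hr)  # 1.0 in this toy case
```

Here each piece correlates with the third series at about .71, yet their sum correlates perfectly, because the two pieces explain different parts of the variation. The real data is far messier, but the same mechanism could be at work.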

With (H-HR) and (BB+HBP) as the predictors, r was also .882, but adjusted r was .879.
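For readers wondering where the small drop comes from, the standard adjustment penalizes R-squared for the number of predictors. The sketch below assumes the adjusted r quoted here is the square root of the usual adjusted R-squared, with n = 110 seasons and p = 2 predictors; that is my reading of the setup, not something the regression output confirms.

```latex
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1},
\qquad
r_{\text{adj}} = \sqrt{R^2_{\text{adj}}}
```

With R near .882, n = 110, and p = 2, this works out to roughly .88, in the neighborhood of the figure above; the last digit depends on how many decimals of the unrounded r are carried.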
