
Swing, Swing by The Small-Sample Rejects

Nathan Ray Seebeck-USA TODAY Sports

Have you ever had a friend enthusiastically recommend that you watch a TV show and then say, “It takes a few episodes to get going, and the timeline gets weird at the end, and one or two of the main characters can be kind of annoying, but other than that it’s SO GOOD”? Initially you might be put off, thinking that a truly good show wouldn’t require that many qualifiers. Sometimes you’re right about that, but sometimes it turns out the show is Parks and Recreation and even though the first season is about as appealing as living in a pit, the rest of the show is an absolute treat.

Sometimes small components of a larger body of work do a poor job of representing the work as a whole. The oddities that occur in small samples are likely not a new concept to FanGraphs readers, nor will it shock anyone when I note that what constitutes a small sample depends on what exactly we want to measure. Recently, the fine folks at MLB Advanced Media gifted us with a handful of new metrics that make use of Statcast’s bat tracking technology. Every time we dig into a new metric, we must consider the appropriate serving size to satiate our hunger for knowledge, lest we find ourselves hangrily generating takes that we later regret.

For this article, we’ll attempt to determine appropriate sample thresholds for measuring a hitter’s average bat speed; so that players without bats don’t feel left out, we’ll do the same for sword rate from the pitcher’s perspective. For many metrics, the sample size is measured in pitches or plate appearances, but since both bat speed and sword rate are tied specifically to bat movement, their samples will be composed of swings. To determine reasonable sample sizes, I used the split-half correlation method. The idea is to randomly select two samples of size X from a player’s collection of swings, calculate the player’s average bat speed or sword rate for both samples, lather/rinse/repeat for a bunch of players, then take the full set of two-sample pairs for all players and see how well they correlate. We complete the experiment by repeating the process for progressively larger sample sizes. And just to be super thorough, we’ll re-run the experiment several times and average the correlation values.
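For readers who want to try this at home, here is a minimal Python sketch of the split-half procedure described above. It is not the code behind the charts in this article. The function names, the `swings_by_player` layout (a dict mapping each player to a list of per-swing values), the choice to draw the two samples without overlap, and the default of 20 repeat runs are all assumptions made for illustration.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # Raises ZeroDivisionError if either list has no variation.
    return cov / (var_x * var_y) ** 0.5


def split_half_correlation(swings_by_player, sample_size, rng):
    """Correlate two random samples of `sample_size` swings per player.

    Assumption: the two samples do not overlap, so players with fewer
    than 2 * sample_size swings are skipped.
    """
    first, second = [], []
    for swings in swings_by_player.values():
        if len(swings) < 2 * sample_size:
            continue
        drawn = rng.sample(swings, 2 * sample_size)  # without replacement
        first.append(mean(drawn[:sample_size]))
        second.append(mean(drawn[sample_size:]))
    return pearson(first, second)


def reliability_curve(swings_by_player, sample_sizes, n_runs=20, seed=0):
    """Average the split-half correlation over `n_runs` repeats per size."""
    rng = random.Random(seed)
    return {
        size: mean(
            split_half_correlation(swings_by_player, size, rng)
            for _ in range(n_runs)
        )
        for size in sample_sizes
    }
```

For average bat speed, each swing value would be a speed in mph. For sword rate, each swing could be recorded as 1 (sword) or 0 (no sword), so that the mean of a sample is the rate.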

The theory behind the method is that with large enough samples, the metric will contain more signal and less noise, thus representing the player more accurately. Therefore, two samples of sufficient size should look comparable to one another. Once we hit a sample size where the correlation is strong enough that the metric is considered to be what statisticians term “reliable,” that sample size becomes our minimum threshold for relying on the descriptive power of the metric. The poor six-episode showing from Parks and Recreation in its first season did not wind up providing a large enough sample to accurately depict the series’ overall episode quality. We needed to see more from the folks in Pawnee.

Starting with average bat speed, the chart below depicts the results of each experiment (in gray) and the average of all experiments (in green), with the sample sizes on the horizontal axis and the corresponding correlation coefficient on the vertical axis. Statistical standards dictate that once the correlation rises above 0.8, we’re in good shape. With that in mind, the output suggests that average bat speed becomes a reliably descriptive metric around 30 swings, which most players accumulate over 20ish plate appearances.
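Reading the threshold off a curve like this can also be done in code. The sketch below assumes the averaged results are stored as a dict of sample size to correlation; the function name and the example numbers are made up for illustration and are not the values behind the chart. Because these curves can wobble, it looks for the smallest sample size at which the correlation is above the cutoff and stays above it for every larger size tested.

```python
def minimum_reliable_sample(curve, threshold=0.8):
    """Smallest sample size from which every tested size exceeds `threshold`.

    `curve` maps sample size -> averaged split-half correlation.
    Returns None if the largest tested size is still at or below the cutoff.
    """
    answer = None
    # Walk down from the largest size and stop at the first failure.
    for size in sorted(curve, reverse=True):
        if curve[size] > threshold:
            answer = size
        else:
            break
    return answer


# Made-up curves, for illustration only.
rising_curve = {10: 0.55, 20: 0.72, 30: 0.81, 40: 0.85}
flat_curve = {50: 0.10, 100: 0.20, 150: 0.15, 200: 0.18}

print(minimum_reliable_sample(rising_curve))  # 30
print(minimum_reliable_sample(flat_curve))    # None
```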

To emphasize the importance of the 30-swing minimum, I decided to find the wackiest 20-swing stretches in this metric’s short life so far. By wacky, I really just mean the span of 20 swings where the player’s average bat speed most differed from his season-long average. Topping the leaderboard is Ildemaro Vargas, who earned his spot by attempting to bunt against five of six consecutive pitches spread across two games on July 4 and July 5, leaving him with an average bat speed over 20 swings that was 20 mph slower than his season average of 69 mph. The first four bunt attempts were split evenly between two PA on July 4, where Vargas came up with a runner on first and no outs (a classic bunting scenario). On July 5, Vargas pinch-hit to start the bottom of the 11th with the zombie runner on second (a modern classic bunting scenario). His final attempt registered a bat speed of 9 mph, which looks like this:

The Vargas example highlights an important aspect of the average bat speed calculation. Per Baseball Savant: “The fastest 90% of a player’s swings, plus any 60+ MPH swings resulting in an exit velocity of 90+ MPH, are deemed to be his ‘competitive’ swings. The average of these swings are his seasonal average.” It’s possible that more complex logic is used on the backend, but from what I could find, no omissions are made for check swings, bunts, foul tips, etc. Furthermore, a spot check of the season-long averages I calculated against Savant’s bat speed leaderboard matched up nicely.

To me, this says that the calculation relies heavily on throwing out the bottom 10% of swings to remove these less earnest offerings. And in a sample of 50 swings, a bunting spree à la Vargas would get lopped off (admittedly this concentration of bunting is unusual), but 10% of 20 swings is only two swings, so the other three attempts, plus any other noncommittal swings, stay in and skew the calculation. Judging Vargas based on this 20-swing stretch would be a bit like judging The Wire based solely on season two (which I liked, but many didn’t). Vargas temporarily went all-in on bunting, while The Wire went all-in on the stevedores storyline, patterns of behavior that ultimately wouldn’t last.
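The trimming arithmetic is easy to check with a short Python sketch of the competitive-swing rule as quoted from Savant. This is one reading of that definition, not Savant's actual implementation: the function name, the `(bat_speed, exit_velocity)` tuple layout, and rounding the bottom 10% down to a whole number of swings are assumptions. The swing values are invented stand-ins (a flat 69 mph for a normal swing, 10 mph for a bunt), not Vargas's real data.

```python
from statistics import mean


def competitive_average(swings):
    """Average bat speed over 'competitive' swings.

    `swings` is a list of (bat_speed_mph, exit_velocity_mph or None).
    Keeps the fastest 90% of swings, plus any swing of 60+ mph that
    produced an exit velocity of 90+ mph.
    """
    ordered = sorted(swings, key=lambda s: s[0])
    n_drop = len(ordered) // 10  # assumption: bottom 10%, rounded down
    slowest, fastest = ordered[:n_drop], ordered[n_drop:]
    rescued = [
        (speed, ev) for speed, ev in slowest
        if speed >= 60 and ev is not None and ev >= 90
    ]
    return mean(speed for speed, _ in fastest + rescued)


# Invented values: 69 mph for a normal swing, 10 mph for a bunt attempt.
normal, bunt = (69.0, None), (10.0, None)
short_stretch = [normal] * 15 + [bunt] * 5   # 20 swings, 5 bunts
long_stretch = [normal] * 45 + [bunt] * 5    # 50 swings, 5 bunts

print(round(competitive_average(short_stretch), 1))  # 59.2
print(round(competitive_average(long_stretch), 1))   # 69.0
```

In the 20-swing case only two of the five bunts are trimmed, so the other three pull the average down by roughly 10 mph. In the 50-swing case all five fall inside the bottom 10% and disappear.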

While Vargas was hurt by a high volume of bunt attempts, others got dinged by their check swing habits. Juan Soto is famous for his knowledge of the strike zone and patience at the plate, but this means he likes to gather as much information as possible before committing to a swing, frequently pulling his bat back at the last second. During two games against the Mariners and their excellent pitching in late May, Soto pulled his bat back seven times, logging partial swings with low bat speeds, and dragging his 20-swing average 15 mph below his full-season number. The “swing” below registered a bat speed of 10 mph, and because he checked, it also earned him a walk:

The TV comp for Soto’s rough 20-swing stretch might be a Ross-heavy episode of Friends, which is to say, an overall good show/hitter that occasionally gives too much emphasis to an annoying character or particular habit.

Fernando Tatis Jr. is also a big check-swinger, but during a series against the Mets in mid-June, a few abandoned swings buddied up with a smattering of oddly hit foul balls to drag his small sample bat speed 19 mph below his full-season mark. The foul ball shown below resulted from a swing clocked at 43 mph:

The weirdness of the Tatis 20-swing sample could be considered akin to an episode from the gas leak season of Community, which, after parting ways with the original creator, still looked like the same show only with poorer execution, leading to mishits and uncertain decision-making.
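A search for stretches like the ones above can be sketched in Python as a sliding window over a player's swings in chronological order. The function name is hypothetical, and comparing each window's plain mean against the plain mean of all swings is a simplifying assumption; it skips the competitive-swing trimming entirely.

```python
from statistics import mean


def wackiest_stretch(speeds, window=20):
    """Find the run of `window` consecutive swings furthest from the norm.

    `speeds` is a chronological list of bat speeds in mph.
    Returns (start_index, gap), where gap is the window mean minus the
    mean of all swings. Ties go to the earliest window.
    """
    if len(speeds) < window:
        raise ValueError("need at least `window` swings")
    overall = mean(speeds)
    best = None
    running = sum(speeds[:window])
    for start in range(len(speeds) - window + 1):
        if start > 0:
            # Slide the window: add the newest swing, drop the oldest.
            running += speeds[start + window - 1] - speeds[start - 1]
        gap = running / window - overall
        if best is None or abs(gap) > abs(best[1]):
            best = (start, gap)
    return best
```

Running this for every hitter and sorting by the absolute gap would produce a leaderboard of the oddest stretches.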

Moving on to sword rate, finding an adequate sample size turned out to be a tough ask, mostly because the correlation graph (which you can see below) resembles television static from back when TVs were big boxy things; if the cable cut out, you were left with nothing to watch but squiggly black and white chaos. Here we see no gradual improvement as the sample expands; the correlation tops out around 0.2, well shy of the 0.8 target:

line plot showing the results of a reliability analysis on sword rate

This analysis suggests that getting swords at a consistent rate is not a reliable skill for pitchers, at least not given the currently available samples. Perhaps if we have full-season samples to work with, the measurement will stabilize, but the lack of any distinct upward trend in the correlation makes that seem unlikely. Instead we can treat sword rate like SNL, which in its current form doesn’t demand to be watched live in its entirety. You can catch whatever clips pop up online afterward, and while you’re scrolling, check out whatever swords Pitching Ninja posted.

Out of curiosity, I looked at sword rate from the batter’s perspective, since the bat itself (and therefore, the act of committing a sword) is actually in the hitter’s control, suggesting the skill might be more reliable for the player making the swing decision. The results were more promising, but even a 250-swing sample fell short of the 0.8 correlation cutoff, topping out with a correlation of 0.46.

Few players, even the best ones, are consistent performers with respect to any given metric. Variation, randomness, and external factors lead to noisy, uneven performances. Likewise, even the best shows have hits and misses. A series might crush it at holiday episodes, but still insist on doing musical episodes or dream sequences, or decide to dabble in time travel. In drawing conclusions about a performance, it’s important to make sure the sample size is large enough to distinguish between uncharacteristic miscues and a new state of being — like when Chris Davis forgot how to hit and Michael Scott left The Office.

Content Source: blogs.fangraphs.com
