(*) Not even perhaps. It is.

If you see Pedro with a .230 BABIP one year and a .330 BABIP the adjacent year, did Pedro’s TALENT actually change that much? The only way to know that is to know (a) how many BIP he had and (b) what is the MLB spread in BABIP. The first is easy, the second is not. But once you know that one SD is about 10 BABIP points, suddenly you are faced with the realization that a 100 point change in BABIP is either the result of an enormous change in talent that can only be caused by something like an injury, or the effect of Random Variation.
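To put numbers on that Random Variation, here’s a quick sketch. The league BABIP of .300 and the 400 BIP season are my illustrative assumptions; the 10-point talent SD is the figure cited above:

```python
import math

# Illustrative assumptions: league BABIP and a single-season BIP count.
# The 10-point talent SD is the figure cited in the text.
league_babip = 0.300
talent_sd = 0.010
n_bip = 400

# Binomial (luck) SD of one season's observed BABIP
luck_sd = math.sqrt(league_babip * (1 - league_babip) / n_bip)

# SD of the year-to-year CHANGE due to luck alone (two independent seasons)
change_sd = math.sqrt(2) * luck_sd

print(f"one-season luck SD: {luck_sd:.3f}")          # ~23 points
print(f"year-to-year luck SD: {change_sd:.3f}")      # ~32 points
print(f"a 100-point swing = {0.100 / change_sd:.1f} luck SDs")
```

So a .230-to-.330 swing, on roughly 400 BIP a season, is only about three SDs of pure luck: rare, but nowhere near proof that the talent itself moved.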

And since the seasons in question are 1999-2000, perhaps the 700-day period in the 150-year history of baseball where we witnessed the greatest pitching performance ever, we know that Pedro was not hampered physically. And thus was born Voros, crystallizing Bill James in a very real-world way, begetting Moneyball, and a decade later, the entire revolution of sabermetrics was complete.

***

Now, let’s talk about Sprint Speed. We have hundreds of runs of Byron Buxton on the field of play. Most are low-effort, because in baseball most plays require low effort. But when max-effort plays are in play, Buxton turns it on. In order to identify POTENTIAL max-effort plays, last year we focused on runs of 2+ bases. The idea is that at some point during those runs, Buxton will, for at least one second, run very fast. Indeed, once we do that, we find we can focus on his 50% fastest runs, which is terrific. There’s a tremendous amount of signal in 2+ base runs.

How much? Statcast Data Jock Travis Petersen looked into it, and the year-to-year correlation is so strong on this metric that we only need to add 4 “league average” runs in order to establish a runner’s true talent. So, if you have 40 to 80 2+ base runs in a season, we can basically say that the observed Sprint Speed is 90-95% real. This is a fantastic finding. It’s to the point that you can just look at the leaderboard and not have to make any adjustment. I put in a cutoff of 10 runs, meaning the 5 fastest runs, which is great since we end up with a leaderboard of 400+ players, far more than any other kind of meaningful leaderboard. You won’t get that with OBP or wOBA or even K/PA. What you see is what you get.
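As a sketch of what “adding 4 league-average runs” means mechanically (the 27 ft/s league average here is my placeholder, not an official figure):

```python
def regressed_sprint_speed(observed_speeds, league_avg=27.0, ballast=4):
    """Pad the observed qualifying runs with `ballast` league-average
    runs to estimate true talent. 27.0 ft/s is a placeholder average."""
    n = len(observed_speeds)
    return (sum(observed_speeds) + ballast * league_avg) / (n + ballast)

# A runner with 60 qualifying runs at 29 ft/s barely moves:
# (60*29 + 4*27) / 64 = 28.875, i.e. ~94% of his edge over average survives
print(regressed_sprint_speed([29.0] * 60))
```

With 40-80 qualifying runs, the 4 ballast runs barely matter, which is exactly why the observed number is “90-95% real.”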

Except. Well, except Magneuris Sierra. Last year, Statcast Thought Jock Mike Petriello asked me “how come I don’t see Sierra on the leaderboard yet?” He only had 7 runs, not enough to qualify under my semi-arbitrary 10-run limit. But EVERY single one of those 7 runs had a Sprint Speed of at least 29 ft/s.

Which means that even if he had the three slowest runs ever to push him to 10, his 5 fastest were already in the bank. This is the George Brett exception rule, from his chase for .400 and the batting title: if he fell short in his 502 PA quest, but would still have won the batting title going 0-fer in the missing PA, then he gets the batting title, AND that cements his OBSERVED .400+ average as legitimate.

So, I introduced the Sierra exception: for anyone with 10 or fewer attempts, I take his 5 fastest; for everyone else, I take his 50% fastest. So, at 5, 6, 7, 8, 9, or 10 runs, I take his 5 fastest (rather than the 3 or 4 that the 50% rule would have insisted on). At 11 or 12 runs, it’s 6; at 13 or 14, it’s 7; and so on. In effect, for anyone with fewer than 10 runs, I padded his 50% fastest runs with his next fastest to get to 5 runs (adding 1 or 2 runs from the “previously discarded” pile).
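The selection rule can be written compactly; this is a sketch of the logic as described, not the production code:

```python
import math

def runs_to_keep(n_runs):
    """How many of a runner's fastest qualifying runs to average:
    his 50% fastest, but never fewer than 5 (the Sierra exception)."""
    return min(n_runs, max(5, math.ceil(n_runs / 2)))

def sprint_speed(run_speeds):
    """Average of the selected fastest runs, speeds in ft/s."""
    k = runs_to_keep(len(run_speeds))
    fastest = sorted(run_speeds, reverse=True)[:k]
    return sum(fastest) / len(fastest)

# 7 runs -> keep 5; 12 runs -> keep 6; 13 runs -> keep 7
```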

***

Here’s where worlds collide.

We want to focus on a player’s TRUE speed, which means we should add 4 runs of league average. So for Sierra and his 7 runs, Regression Toward The Mean (RTTM) would insist we base it on his 4 fastest, plus the 4 league-average runs that we add to all players. (The 4 league-average runs is coincidentally the same number as his 4 fastest of 7.) That would make the data science world happy.

Mike, however, said we are measuring his speed, so why would we do this? Why do we even want to do this? It looks wrong.

And I agree. What may make sense in the Voros world of DIPS doesn’t seem to really make sense here. So, taking Mike’s position as our de facto position, what can we do? I had an unconventional thought: one thing we can do is rely on the other 50% of a runner’s runs that we did NOT use. Instead of padding a runner’s Sprint Speed with 4 league-average runs, what if instead we pad it with X “discarded” runs? What if we take that runner’s 2 or 3 or 5 fastest runs that didn’t make our original cut? Remember, we did this for Sierra, but now the question is: what if we did something like this for everyone?

So Travis comes back in, and compares the results of the classic RTTM approach to this Petriello-inspired, Tango-concocted approach of relying SOLELY on the Statcast measurements. And what he found is that if we pad each runner with his 2 fastest of the discarded set, we end up with something VERY CLOSE to the RTTM approach. In other words, we’re able to establish a runner’s true talent level WITHOUT CONSIDERING THE POPULATION DISTRIBUTION.
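Here’s a sketch of the two estimators side by side. The league-average speed in the RTTM version is a placeholder value of mine:

```python
import math

def speed_via_discard_pad(run_speeds, n_pad=2):
    """Top 50% of a runner's runs, padded with the n_pad fastest
    runs from the discarded (slower) half."""
    speeds = sorted(run_speeds, reverse=True)
    k = math.ceil(len(speeds) / 2)
    kept = speeds[:k + n_pad]          # top half, plus next-fastest pad
    return sum(kept) / len(kept)

def speed_via_rttm(run_speeds, league_avg=27.0, ballast=4):
    """Classic regression toward the mean: top 50% of runs, plus
    `ballast` league-average runs (27.0 ft/s is a placeholder)."""
    speeds = sorted(run_speeds, reverse=True)
    k = math.ceil(len(speeds) / 2)
    return (sum(speeds[:k]) + ballast * league_avg) / (k + ballast)

# Sierra-style case: 7 runs, so 4 fastest + 2 from the discard pile = 6 runs
sierra = [30.0, 29.8, 29.6, 29.4, 29.2, 29.1, 29.0]
```

The discard-pad version never looks at the population at all; Travis’s finding is that across real runners it lands very close to the RTTM version anyway.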

***

This result is contrary to everything that Travis has been taught, and everything that the lay person is in favor of. Remember those 7 Sierra runs, of which we chose his top 5? Well, now we take the top 50% (meaning his 4 fastest runs) and we pad it with the 2 fastest of his discard pile. We use 6 of his 7 runs.

And Albert Pujols, with his 20 or so runs, of which we took his 10 fastest, ended up last in the league. Do we pad it with 4 league-average runs? Nope, we pad it with his 2 fastest from the discard pile, for a total of 12 runs. While this new approach hurts the slowest runners slightly more than the fastest runners, given the de facto position we’ve taken, this is a tiny price to pay for the enormous benefit of not having to say “regression toward the mean” to the reader.

So, there you have it, that’s how we can get the EFFECT of RTTM without doing RTTM. And all while treating the player’s runs as its own population independent of all the other runners.

***

In addition to that, we focused on trying to get even more runs. And we ended up being able to identify other highly competitive plays: weakly hit or topped batted balls. This has the effect of allowing us to more than double our sample. Those balls, you may remember, are in these two zones of speed+angle (“weak contact”, “hit over the ball”), representing, basically, balls hit to an infielder where the batter is going to bust it down the line.

***

You will see the leaderboard updated on Savant in the coming days. And we’re not done yet. In the coming months, hopefully by the All-Star Break, we’ll have broken down every runner’s speed into components:

- Explosiveness / out of the gate speed
- Burst / acceleration
- Sprint Speed

The combination of the three will give us the one number we have been aiming for. Whereas the NFL has their 40-yard times for all their runners, we’ll have an estimate of every player’s 90-foot times. We’re getting there.

***

His core metric is Win Shares. And his main calculation is to subtract 17 Win Shares to convert this into Hall of Fame points, but never going into negatives. Since a full-time player has about 30 “Game Shares”, Bill is essentially counting win shares above a .550, maybe .600, level.

This is interesting because I’ve been advocating “positive wins above average”. That is, I’ve been doing wins over .500, never going into negatives. Bill basically has a bit higher standard.

Except… when you get TOO HIGH, he starts capping seasons. The cap starts at a win % of around 1.000. Then the player doesn’t get as much credit. However, this may be an artifact of his “bonus points” system. In other words, he sort of double-counts in some categories, so he has to “half count” to get things in balance. Or something like that.

Anyway, we intersect to a large degree, and differ only at the periphery.

***

If you are running 62.5 feet all out at 25 ft/s, you are going to cover that distance in 62.5/25 = 2.5 seconds.

When you give a slow runner 12-13 feet of a head start AND you give him essentially a running start, you are conceding the stolen base as was given to Max Scherzer.

***

(Add about 1 second for the start/stop involved.)

***

Some, wrongly, or very wrongly depending on your point of view, compare nonpitchers such as LF only to other LF and RF only to other RF. We shouldn’t do that because LF and RF are essentially from the same pool of players. LF are a subset of CF+RF. RF are a subset of CF. The only potential exception is with catchers, since the flow of catchers to the other positions is limited. In other words, we have pitchers, catchers, and the pool for the other 7 positions.

Now we have Ohtani breaking that rule. The impact is going to be profound. When Ohtani is a pitcher, his batting performance in NL parks is compared to other pitchers. When he is not a pitcher, his performance is compared to nonpitchers. In other words, the impact of his hitting performance going 1-for-4 is going to be treated differently as a DH and as a P. As a DH, it’s a net negative, and as a pitcher, it’s a net positive.

Does this make sense? For the most part, yes. For the part that it doesn’t make sense, the solution to that will make the rest of the positional adjustment make no sense at all. So, the current method still stands as the best way to do positional adjustments, even with Ohtani.

***

- if the ball was hit with an EV + LA such that it was potentially close to the generic fence line, then we’re also including the spray direction
- if the ball was topped or hit weakly (which we are treating as the batter potentially legging out a hit), then we’re also including the batter’s seasonal sprint speed

We’re going to have a more elaborate post on this. Those who attended SABR got a sneak peek at this.

Update: this is current for 2018 season, as well as retrospectively for 2017 season. And we’ll be updating 2016 and 2015 as soon as we can. Might be a few weeks.

***

In this gray zone of 214-220 feet, there were 17 fielders of which 8 were IF and 9 were OF. (This is 2017 only.)

Another way to look at it is purely based on counts. And here we see that under 200 feet and over 240 feet, the counts are noticeable. So, the gray area is somewhere around 200-240, in terms of deciding what it means if an IF plays at 200-220 and an OF plays at 220-240. At that range, is an OF actually now part of “the shift”? Is an IF in that range actually now a “4th outfielder”? So, that’s the gray area to discuss, and where we can decide to set the IF/OF line. I have it at 220, but I can certainly see how it can be moved anywhere between 210 and 230.

We can even say that you are an “infielder” if you have a reasonable chance of getting the batter out on a force play at 1B. If you can’t, then you are really an outfielder, there to catch balls on the fly. To that end, maybe an infielder has to be shallower, like at 190-210, and what we see as deep “shifts” is in fact not a shift, but a 4th outfielder. Anyway, lots to think about.

What say you?

***

Shift v NoShift (pdf)

Below you will find MGL’s initial response to this research, followed by my response to him.

MGL:

***

For those who followed my old NBA Replacement Level thread, you will see where I’m going with this. To summarize that one: I sorted the players by playing time, and ended up with around 270 players, as that gave me the right number of minutes PER GAME to fill up 30 teams. That is, if everyone was healthy, and they all played as they normally would in terms of minutes, all minutes would be accounted for. That accounted for about 85% of the total minutes, meaning the replacement players accounted for the other 15% of the playing time.

Baseball is a bit more complicated in terms of playing time. But, we can “back in” to that point by taking a tiny leap of faith (for you, not for me) that the .300 win% is close to the replacement level. We have good reasons to use that .300, and we can get there in a variety of ways. We saw Bobby’s terrific research a few weeks ago as further pointing toward somewhere close to .300, as one recent example. If you try it, you will see that it’s hard to get to a replacement level that is below .250 or above .350. In any case, we need a starting point, and .300 works for the illustration I’m about to present.

Anyway, by taking the bottom 20% in playing time, year by year, and figuring each player’s Individualized Won-Loss Record (The Indis), you end up with an observed W-L record of close to .300. Essentially it hovers around a .270 to .320 win%.
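For the aspiring saberist, the bottom-20% pool can be built like this. The field names and toy data are hypothetical, and The Indis themselves are computed elsewhere:

```python
def replacement_win_pct(players):
    """players: list of dicts with 'pa' (playing time) and the player's
    Individualized W-L record as 'ind_w' / 'ind_l' (hypothetical fields).
    Pools the bottom ~20% of playing time and returns its observed win%."""
    total_pa = sum(p['pa'] for p in players)
    pool, running = [], 0
    for p in sorted(players, key=lambda p: p['pa']):
        if running >= 0.20 * total_pa:
            break
        pool.append(p)
        running += p['pa']
    w = sum(p['ind_w'] for p in pool)
    l = sum(p['ind_l'] for p in pool)
    return w / (w + l)
```

Run that year by year, and the claim is the resulting series hovers in the .270-.320 range.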

The next question is whether we’d want to regress that as well, since we’re estimating to begin with. The 2011-to-2012 change seems to be a good test case. And to test that, we can look at great players, as ideally we’d want their talent level (as a group) to remain fairly constant. That’d be the avenue to investigate: year to year, we shouldn’t see much change there, except when you get expansion, war years, etc. Anyway, aspiring saberists, that’s where you come in at this point. This is the kind of thing I would do during my day job that had nothing to do with baseball. Ironically, now that I do have a day job that has everything to do with baseball, I won’t be doing this at the moment. Priorities and all.

For those who want to see if what I’m suggesting has any merit, look at these specific time periods:

- 2014-2015-2016: 2015 is a big outlier
- 1958-1962: it’s jumping up and down every year, and it hovers

The question is whether a constant 0.294 repl win% makes more sense, or whether we want a dynamic one. Therefore, what you would do is look at a bunch of great players across 2014-2016, preferably around 24-32 years of age. You total up their Individualized W-L records, and then “back in” a replacement level that would keep their totals constant in each year, 2014-2016. Since we’re dealing with samples, we might need to take some artistic licence.

1958-1962 will be interesting as well, as there is not only a lot of up and down in that chart, but the end of that period also includes expansion.

Anyway, this is the kind of project where 2 hours will lead to 2 days or 2 weeks. Eventually I’d be up for it, but not now. So, I’m putting it out there, and maybe an aspiring saberist will wow us.

***

Let’s figure it out. Billy Hamilton has reached base by single, double, walk, hit batter, or error a bit over 600 times in his career, and his baserunning, according to MGL’s method published at FanGraphs, is worth 46 runs. Which means that Hamilton adds close to an extra 0.08 runs each time he reaches base. That is a huge number. Joey Votto, for example, adds close to 0.08 runs each time he comes up to bat. The issue is Hamilton reaching base to begin with, since he can only use that 0.08 runs the 30% of the time he reaches base; when he comes to bat, his run value is minus 0.03 runs, which is why he is a net negative as an offensive player. And what about his fielding? He adds about 0.01 runs each inning.

So, let’s add it up: by having someone else bat, you gain 0.03 runs. By having someone else field, you lose 0.01 runs. Having him come in as a pinch runner gains you 0.08 runs. That’s a 0.10 run gain. That’s like adding Mike Trout for one time up.

The cost is that you lose a player for the game. You might end up losing Gennett or Peraza. You need a sub to come in if Hamilton is coming in for an infielder (or catcher!). The sub is a sub for a reason: he’s not league average, but somewhere between average and replacement level. And you’ll have to use him for the rest of the game. Figure that a replacement level player is worth -0.03 runs per PA. So, our bench player is probably going to be -0.02 runs per PA. Give him 3, maybe 4 PA per game, and that’s a cost of say 0.07 runs. But that’s going to only happen if it’s a 2b, ss, 3b, or C, meaning 4/7ths of the time. So, a total cost of 0.04 runs.

That’s the tradeoff: a huge instant gain of getting Hamilton into the game by placing him on base, with the dribble-drabble cost of a bench player usually being paired with him, and losing a roster player.

Unless I missed something in my back of the envelope calculation above, this seems like a decent tradeoff, a total gain of 0.06 runs, every game, for the cost of a roster player.
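The back-of-the-envelope arithmetic above, in one place (all figures are the ones from the text):

```python
# Gains from swapping Hamilton in as a pinch runner (runs per game)
bat_gain   = 0.03      # a better hitter bats in his place
field_loss = 0.01      # you give up his fielding
pr_gain    = 0.08      # his baserunning once placed on base
gross_gain = bat_gain - field_loss + pr_gain           # 0.10

# Cost of burning a bench player to cover the field
bench_per_pa = -0.02   # bench bat: between average and replacement
pa_per_game  = 3.5
infield_freq = 4 / 7   # only needed when he runs for a 2B/SS/3B/C
cost = -bench_per_pa * pa_per_game * infield_freq      # ~0.04

print(f"gross {gross_gain:.2f} - cost {cost:.2f} = net {gross_gain - cost:.2f} runs/game")
```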

***

You bet. You just need a bit of baseball knowledge. The “average” fence line in baseball goes from 330 to 400 feet, depending on whether you look down the line or straightaway. Figure the average distance is 370 feet.

The average fence is about 10 feet high. Now, our first baseball assumption is that HR that just clear the fence do so at 45 degrees. Whether it’s true or not, it’s probably close enough to the truth that we can therefore take the simple property that a ball that crosses the fence 10 feet above the ground will also land 10 feet behind the fence.

In other words, the MINIMUM distance for a HR, for the purposes of the simple model we’re creating, is 380 feet. When you create a model, you take shortcuts, and this is going to be a useful shortcut.

We also know, as a baseball fan, that HR that go some 480+ feet are rare. For this model, let’s just force the MAXIMUM distance for a HR as 480 feet. In other words, there’s about 100 feet of play between 380 feet and 480 feet.

We expect some sort of “tapering off” in HR hit the farther away you get from 380 feet. That is, we have 6000 HR, where we get more at 380 than at 480, and it goes progressively down.

Given that, we can come up with a simple function that shows how many of the 6000 HR hit last year would have been hit at 380 feet, how many at 381, how many at 382… and so on up to 480 feet.

You can go ahead and try it… I’ll wait…

Ok, all done? I got around 3% of HR hit at 380-380.999 feet. Maybe it’s a bit lower, down to 2.5%. It could be as high as 4%. I’d be surprised if anyone got more than 5% or less than 2%. Let’s go with 3%. What you will find with whatever model you have is that, close to the fence line, the change will be about 0.1% per foot. So, HR from 381-381.999 feet is 2.9%, the next foot is at 2.8%, and so on.

Which means that for each foot in the other direction, via lowering the fence or bringing it in (essentially the same thing), we’d see 3.1% at 379 feet, 3.2% at 378 feet, and so on.

So, if we put it all together: with 6000 HR hit overall, or 200 at each park, we estimate that 3% of that, or 6 HR, were hit at 380-380.999 feet. Which means we’d expect 3.1% of 200, or 6.2, to be at 379, 6.4 at 378, and 6.6 at 377.

Remember, Anaheim dropped their fence by 10 feet across 30% of the fence line, or an average of 3 feet. So, adding up, we get 6.2+6.4+6.6 more HR in Anaheim, or about 19 more HR. That’s 19/200 = 9.5%.
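The whole toy model fits in a few lines. The 3% starting point and the 0.1%-per-foot slope are the ones derived above:

```python
def hr_share(distance):
    """Share of all HR landing in [distance, distance+1) feet:
    3% at 380, tapering by 0.1% per foot, floored at zero."""
    return max(0.0, 0.03 - 0.001 * (distance - 380))

park_hr = 200   # ~6000 HR league-wide over 30 parks

# Anaheim: fence dropped 10 ft over 30% of the line, an average of ~3 ft,
# which opens up three extra "feet" of HR at 379, 378, 377
extra = sum(hr_share(d) * park_hr for d in (379, 378, 377))
print(f"extra HR: {extra:.1f} ({extra / park_hr:.1%} of park HR)")
```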

As you can see looking at actual data, we would have guessed 17: https://www.mlb.com/angels/news/angel-stadium-will-add-more-homers-in-2018/c-267458738

And that’s how you can do physics-free HR estimates, using simple math, and your baseball knowledge.

***

The replacement level is a very real, very tangible place for a baseball team or a baseball player; drop under it and they release you. But nothing happens to you if you’re a little better than average or a little worse than average; the difference between .490 and .510 is no different than the difference between .520 and .540, or any other similar distance.

This group of players can be represented by MLB players who sign minor league free agent contracts. These are players who are good enough to ride the bench, but who lose some talent, or run into enough bad luck, that they drop below “the line”; suddenly, they are offered a minor league deal, and they have to fight to make it onto the team. ESPN provides a 10-year list of free agents who signed minor league deals. All that is needed is for an aspiring saberist to go through that list and show us how good those guys are. Bobby has done just that. He was nice enough to ask for my occasional input, but all the hard work was his.

There is some evidence that the replacement level, as represented by this group of players, is a bit too low. However, note that these guys who sign these deals are NOT signing for the MLB league minimum, but rather some $0.5MM to $1.0MM ABOVE the minimum. From that perspective, it’s very possible that a .300 win% is indeed the right replacement level.

It’s fantastic work, and you should give it a read.

***

How do we go about finding it? Here’s how I tried. For every hitter, I looked at every batted ball or strike (i.e., I excluded balls, hit batters, 2-strike fouls, and bunts). For strikes, I gave a blanket -0.07 run value (though in a future iteration, I’d also consider the ball-strike count). For the batted ball, I used the estimated wOBA based on the exit velocity + launch angle.

I then did a rolling average of these run values, with a “window” size of 4 inch x 4 inch, in steps of 0.05 feet. Whichever 4x4 window had the highest performance, I’d call that the hitter’s Nitro Zone. Here’s what it looks like for Aaron Judge.
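Here’s a simplified sketch of that rolling-window search. The grid bounds are an assumption matching a 3 ft x 3 ft canvas; this is illustration, not the actual implementation:

```python
def nitro_zone(pitches, window=4/12, step=0.05, min_pitches=10):
    """pitches: list of (x, z, run_value), with x and z in feet.
    Slides a window-by-window box over a grid of centers and returns
    the center of the box with the best average run value."""
    def frange(lo, hi):
        v = lo
        while v <= hi + 1e-9:
            yield round(v, 2)
            v += step

    best_avg, best_center = None, None
    # Grid bounds assumed from the 3 ft x 3 ft canvas
    for cx in frange(-1.5, 1.5):
        for cz in frange(1.0, 4.0):
            vals = [rv for x, z, rv in pitches
                    if abs(x - cx) <= window / 2 and abs(z - cz) <= window / 2]
            if len(vals) < min_pitches:   # the "black spots" cutoff
                continue
            avg = sum(vals) / len(vals)
            if best_avg is None or avg > best_avg:
                best_avg, best_center = avg, (cx, cz)
    return best_center, best_avg

# Toy data: a hot cluster in on the hitter, a cold cluster up-and-away
pitches = [(-0.3, 2.4, 0.5)] * 12 + [(0.5, 3.0, -0.1)] * 12
center, avg = nitro_zone(pitches)
```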

This is from the catcher’s point of view. Black spots mean fewer than 10 pitches.

The canvas is 3 feet x 3 feet, with the green box being 2 feet x 2 feet (roughly corresponding to the strike zone). The purple box is 1 foot x 1 foot, to give you a frame of reference, and would be considered the “heart” of the strike zone.

The white spot is a “neutral” run value, and the more blue, the more it favors the pitcher. The red is the hitter’s zone, with the darkest red being his Nitro Zone. As you can see with Judge, it’s hard to pinpoint just one Nitro Zone. Technically, if we pick out the one spot where he does the best, it’s a 4 inch x 4 inch square centered at -0.3 feet along the x (that is, 0.3 feet toward him from the center of the plate), and 2.4 feet off the ground. I would expect pitchers to be throwing him low and/or away. Of course, Judge knows this is going to happen, so it will be interesting to see if he changes his approach.

Here is Mike Trout:

Trout’s Nitro Zone is low, at -0.1 feet along the x, 2.05 feet off the ground. But he’s got fairly good coverage overall, except up high.

Joey Votto:

Votto has a couple of Nitro Zones, the main one being +0.25 feet along the x, 2.45 feet off the ground. But, really, he’s got great coverage.

Finally, Altuve:

He also has great coverage like Votto, except in his case he’s a bit higher, with his Nitro Zone centered at -0.45 feet along the x, 2.6 feet off the ground. The best you can hope for is low and away, but not too low.

I plotted the Nitro Zone for 331 hitters, and here’s where their center points are:

As you’d expect, it hovers around down-the-middle (x=0, 2.4 feet off the ground).

I have a couple of ideas I’m working on, and hopefully I’ll be able to do this tomorrow, so stay tuned.

***

SANTA = -4.18 + 17.3*{Ball Percent} + 96.7*{Near_Barrel_Percent}

If we assume 144 pitches per 9 innings, we can change “ball percent” to “balls per 144 pitches”, and same for near barrel. We get:

SANTA = -4.18 + 17.3/144*{Balls per game} + 96.7/144*{Near_Barrel per game}

Which is

SANTA = -4.18 + 0.12*{Balls per game} + 0.67*{Near_Barrel per game}
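A quick sanity check on that conversion (variable names are mine; 144 pitches per 9 innings is the stated assumption):

```python
# The per-game form of the SANTA coefficients, from the percent form
pitches_per_game = 144

ball_pct_coeff = 17.3
barrel_pct_coeff = 96.7

ball_per_game_coeff = ball_pct_coeff / pitches_per_game      # ~0.12
barrel_per_game_coeff = barrel_pct_coeff / pitches_per_game  # ~0.67

def santa(balls_per_game, near_barrels_per_game):
    return (-4.18
            + ball_per_game_coeff * balls_per_game
            + barrel_per_game_coeff * near_barrels_per_game)
```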

This is baselined against “Strikes per game”, as well as all the non-barrel contacts, which are the missing variable. In other words, the above is saying that each ball is worth 0.12 runs above “the rest”.

Eli seems to suggest, by lumping them, that a strike and a non-barrel contact are worth roughly the same, and… it looks like he’s right! I don’t have the data like Eli has it, but it looks like a non-barrel wOBAcon is around .280, which is about -0.04 runs. A strike is closer to -0.07 runs or so. So, that probably averages out to -0.06 runs. With the ball 0.12 runs above “the rest”, that implies the ball at +0.06 runs.

That seems in the ballpark of being ok.

I don’t have the data exactly ready like Eli has, but a near-barrel and above would have a wOBA probably close to 1.100, or about 0.770 wOBA points above average or 0.62 runs above average. Baselined to Eli’s “the rest”, that makes it 0.56 runs above the rest. The near-barrel (and above) being 0.67 runs above “the rest” may be off, but maybe my assumptions aren’t good enough.

All to say that:

a. we could have gotten to where Eli is using linear weights

b. his equation basically checks out

If it were me, I’d try to merge the two approaches: the linear weights approach (which is the “truth”) with the regression approach (which captures some subtleties, like the number of pitches per 9 innings being different for great and bad pitchers), and try to get some “simpler” numbers. Like if you can get “100” instead of “96.7”, I’d go for that. And the same for the ball term, if that can be either 15 or 20.

Anyway, love the work and approach.

***

Here is a similarly themed idea from Boguslaw, relying on BB and K instead of balls and strikes, and on Barrels as opposed to Near-Barrels.

Anyway, both of these guys are on the right path, in trying to focus on barrels and barrel-like events, as well as something related to balls and walks.

Maybe the answer is different weighting for Super Barrels, Barrels, and Near-Barrels.

But, what is great with what Eli did is that he kept it simple. Rather than the “soup” approach where you can’t see what is going on, he took a methodical approach, so you can isolate the variables.

***

Ozuna was -0.007 outs per play in CF and +0.014 in the corners. On the flip side was Aaron Hicks, +0.022 in CF and +0.003 in the corners.

Note that Catch Probability doesn’t distinguish between CF and the corners in establishing a baseline. So, Ozuna/CF is being compared to the same generic player as Ozuna/LF. When we add up all 180 players, weighting on the lesser of their playing time between CF and the corners, all of these outfielders were +0.005 outs per play in CF and +0.003 outs per play in the corners. With about 300-400 plays in a full season, this comes out to a difference of less than 1 out per season. To the extent that there is some sort of bias in using the same baseline for CF and the corners, it is negligible.

Therefore, we can use Catch Probability to compare between the outfielder positions. Over the 2016-17 time period, the average CF was +6.2 outs above the generic outfielder, while the average corner OF was -3.1 outs, for a gap of CF to corner of 9.3 outs.

The corner OF was actually -4.8 outs in LF and -1.5 outs in RF, showing a gap of 3.3 outs in performance between the two corner positions.

***

I’m surprised he made no mention of Poisson, which should essentially be your prior, if not the entire model. You can see the Poisson model here, along with Leverage Index. I never incorporated the PP situations, which is a gap, so I’m very grateful that he did.

I’d be interested to see how it compares to the market approach, as determined by Gambeltron 2000.

And the @pullbot is another great innovation.

If recent history is any indication, expect Peter to be hired by an NHL team before the season ends, with his site taken down.
