In the offseason, our priority was: no nulls in launch angle and speed (and, as much as possible, spray direction). And we finally reached that point. We have several levels of processing for the data. We have radar tracking on the ball, which may or may not tell the full story each and every time. We have camera tracking on the ball, which may or may not tell the full story each and every time. We have camera tracking on the players, which may or may not tell the full story each and every time. We have timestamps for all this. Occasionally, none of the play was picked up, whether for a biased reason (high popups) or an unbiased reason (tracking system was down for half an inning). But we have stringers who tell us what they think happened on every single play. We have the actual outcome of the batted ball. Every piece of data has a confidence level, so based on the confidence level of each source, we weight the various sources as best we can to represent the play in the most accurate way possible.
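The weighting itself isn't spelled out here. As a rough sketch only, assuming each source supplies an estimate plus a confidence, a confidence-weighted average is one simple way to combine them. The function name and the sample numbers are hypothetical, not the production method.

```python
def blend_estimates(estimates):
    """Combine (value, confidence) pairs into one confidence-weighted value.

    Hypothetical helper: the real system may weight sources differently.
    """
    total_confidence = sum(conf for _, conf in estimates)
    if total_confidence <= 0:
        raise ValueError("at least one source needs a positive confidence")
    return sum(value * conf for value, conf in estimates) / total_confidence


# Made-up example: radar says 10 degrees (confidence 3), camera says 20 (confidence 1).
launch_angle = blend_estimates([(10.0, 3.0), (20.0, 1.0)])
```

A source with zero confidence simply drops out of the blend, which is the behavior you would want when a tracker missed the play.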

As an example, if the tracking system was down, we rely totally on the stringer and the outcome. So a groundball out would get assigned -14 degrees, 84mph, because that’s the average of all groundball outs. But if the tracking system was up, but simply missed the play, we consider that “biased”. In that case a GB out would get assigned -21 degrees, 83mph.

See, in the case that the tracking system is operational, the kinds of plays it misses on are sharp grounders and high popups. So, we have to treat them differently. This is essentially the worst-case scenario. Well, the really worst-case is that we provide nulls, which is the same thing as giving out the “average” value of all batted balls, so say +11 degrees launch. This is what causes bias in the data, and this is what we need to avoid with the no-nulls requirement.
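To make the two fallback cases concrete, here is a minimal sketch using only the groundball-out figures quoted above. The dictionary keys and the function name are assumptions for illustration; other outcomes would need their own averages, which are not given here.

```python
# Fallback (launch_angle_deg, exit_speed_mph) for a groundball out.
# Values come from the text; key names are hypothetical.
GB_OUT_FALLBACK = {
    "system_down": (-14.0, 84.0),  # unbiased: average of all groundball outs
    "missed_play": (-21.0, 83.0),  # biased: system was up but missed the play
}


def impute_gb_out(tracked, tracking_status):
    """Return launch parameters for a groundball out, never a null.

    tracked: a (launch_angle_deg, exit_speed_mph) tuple, or None if untracked.
    """
    if tracked is not None:
        return tracked
    return GB_OUT_FALLBACK[tracking_status]
```

The point of the sketch is the last line: an untracked play gets a value chosen for the reason it was missed, rather than a null that behaves like the all-batted-ball average.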

Far more often, we have more precise tracking via the radar and/or camera, and so the uncertainty level is much smaller. All to say, there are now no nulls in batted-ball launch parameters, as they pertain to launch angle and speed. This applies to ALL batted balls. And this should get reflected on Baseball Savant in the coming days. We are working on making this retroactive to 2015, because we have the data. The cool part is that any time we come up with better ways to estimate, we can rerun on all the data. It just takes time to reprocess all the games, and make it hit all the endpoints.

]]>So, Nate is proposing essentially using the situations that LI identifies, and tallying how often a pitcher gives up no runs. The “rules”, while not straightforwardly obvious, are no more confusing than the Save rule or Hold rule. It’s certainly a step forward.

]]>While I circled Jace Peterson, right behind him on the list is David Freese, who is a Statcast highlight player, in that any time I run a report, Freese stands out as someone exceptional. Freese is a very low launch angle hitter who also sprays the ball, and gets great success when he sprays the ball. When he pulls the ball, he’s one of the worst hitters in the league, but when he goes the other way, he’s one of the best hitters in the league.

He also gets his accomplishments by launching at 28 degrees, the ideal HR launch angle. So, he’s very confusing to analyze, because he seems to defy what he should do, yet he’s getting good success anyway. But, can he be even better?

This is what I did. When he launches at 28 degrees (24-32 degrees), he does that 9% of the time, and his wOBA is .854. But what if he shifted his entire launch angle by 8 degrees? He still gets his .854 wOBA at 28 degrees, but now instead of doing so 9% of the time, he does so 12% of the time. How did I get 12%? Well, at 20 degrees, he launches it at that angle 12% of the time. So, we just “shift” his frequency up one rung of 8 degrees. When we do that, his wOBA on batted balls goes from an above-average .409 to a top twenty .443. But, Freese knows Freese, and maybe there’s a cost to pay for shifting his launch angle. Maybe his K rate will go up, so that while he may get more impact per batted ball, he’ll have fewer batted balls.
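The "shift up one rung" idea can be sketched as follows. This is an illustration under assumptions: the rungs are 8-degree launch-angle buckets ordered low to high, the wOBA in each bucket stays fixed, and whatever would shift past the top bucket stays in the top bucket (the text does not say how the edges are handled). The function names and sample numbers are made up.

```python
def woba_on_contact(freqs, wobas):
    """Frequency-weighted wOBA across launch-angle rungs."""
    return sum(f * w for f, w in zip(freqs, wobas)) / sum(freqs)


def shift_up_one_rung(freqs):
    """Move each rung's frequency to the next-higher rung.

    Edge handling is an assumption: the lowest rung becomes empty and the
    top rung keeps its own frequency plus what shifts into it.
    """
    shifted = [0.0] + list(freqs[:-1])
    shifted[-1] += freqs[-1]
    return shifted


# Made-up three-rung hitter, low to high launch angle.
freqs = [0.2, 0.5, 0.3]
wobas = [0.1, 0.3, 0.8]
before = woba_on_contact(freqs, wobas)
after = woba_on_contact(shift_up_one_rung(freqs), wobas)
```

Note what the sketch leaves out, which is exactly the caveat in the text: it holds the per-rung wOBA and the number of batted balls constant, so any strikeout cost of the swing change is invisible to it.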

And maybe his swing is perfectly tuned that there’s no reason to mess with a good thing. That is, remember when I said that he’s one of the worst in the league when pulling? Well if he adds 8 degrees to his vertical launch, it may change his swing so he ends up pulling more, thereby reducing his spray skill, the one thing that he’s really exceptional at. He may simply be the exception that breaks the rules.

Still, it’s always fascinating when you see someone who is among the league-low in launch angle, at a time when many batters are trying to increase their launch angles.

]]>

We have 467 batted balls for David Ortiz. One third of that is 155. I asked this question: If we sort all of David Ortiz’s batted balls based on his launch angle from largest to smallest, which is the contiguous window with 155 batted balls where he MAXIMIZED his wOBA? That is, what is the ideal launch angle for David Ortiz? And the answer is 22.5 degrees. That is, from launch angle 13.8 to launch angle 35.0, David Ortiz had 155 batted balls where he had a wOBA of .929. And that was the highest wOBA for Ortiz in any 155 batted ball window.
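Here is a minimal sketch of that window search, assuming each batted ball is a `(launch_angle, woba_value)` pair. The function name and the `divisor` parameter are my own labels; the "ideal" angle is taken as the average launch angle inside the best window.

```python
def best_launch_window(batted_balls, divisor=3):
    """Find the contiguous launch-angle window with the highest wOBA.

    batted_balls: list of (launch_angle_deg, woba_value) pairs.
    divisor: 3 means the window holds one third of the batted balls.
    Returns (mean_angle, low_angle, high_angle, window_woba).
    """
    balls = sorted(batted_balls)          # order by launch angle
    size = len(balls) // divisor          # e.g. 467 // 3 = 155
    if size < 1:
        raise ValueError("not enough batted balls for this window")

    # Slide the window one batted ball at a time, keeping a running sum.
    window_sum = sum(w for _, w in balls[:size])
    best_start, best_sum = 0, window_sum
    for start in range(1, len(balls) - size + 1):
        window_sum += balls[start + size - 1][1] - balls[start - 1][1]
        if window_sum > best_sum:
            best_start, best_sum = start, window_sum

    angles = [a for a, _ in balls[best_start:best_start + size]]
    return sum(angles) / size, angles[0], angles[-1], best_sum / size
```

Sorting low to high instead of high to low finds the same window. Passing `divisor=2` gives the half-of-batted-balls version of the same search.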

I ran this for all batters with at least 200 batted balls. That’s 276 batters. The average ideal launch angle for these batters is 18.3 degrees, well above the actual league average. This suggests that batters should be, by and large, trying for higher launch angles. Chris Davis has the highest ideal launch angle at 27.2 degrees, followed by Statcast Launch Angle hero Kris Bryant at 25.9 degrees.

The fun though is at the bottom. Dee Gordon maximized his personal wOBA at an average launch angle of 6.5 degrees (-5.9 to +20.2), and with only a .502 wOBA at his best. Then we have Jed Lowrie at 9.1, Billy Hamilton at 9.5. Basically, your speedsters have the consideration that when they hit the ball lower, they can use their speed to get on base.

Does this method make sense? It SEEMS to make sense. However, is one-third the appropriate number? Maybe it should be one-half. I used one-third on the idea that the batting average on contact (BACON) is somewhat close to that number. But, let me try to maximize wOBA with a window that captures half of a hitter’s batted balls. Does that change anything?

Let’s see… ok, using the larger window, the average ideal launch angle is now 15.4 degrees, still well above the league average. Curtis Granderson now has the highest ideal launch angle at 25.3, and Statcast hero Kris Bryant is again #2 at 23.2 degrees.

At the bottom, we have Dee Gordon with an ideal launch angle of 0.6, followed by Billy Hamilton at 2.4, and Billy Burns at 2.6.

I should point out that strikeouts will be the issue: a player can’t necessarily change his swing without his strikeouts taking a hit. Take for example Kris Bryant, who has the ideal launch angle of 23.2 degrees, based on his best 50% batted ball window. He had 105 batted balls hit higher than his window (meaning above 39.1 degrees) and 143 batted balls hit lower than his window (meaning below 7.1 degrees). If he increases his actual launch angle (average of 20.7) to his (presumably) ideal 23.2 degrees, is it going to increase his number of strikeouts?

In addition, the average wOBA of those 105 “too high” launch angles was .082 while the average wOBA of those 143 “too low” launch angles was .243. So, that is one benefit of NOT going for the ideal launch angle, because when you mishit, you’d rather mishit low than mishit high. The high mishits are simply filled with easy pop outs, while the low mishits have a decent chance of giving you a groundball single.

So, it’s very possible that Kris Bryant has figured out that while 23.2 degrees may be his ideal launch angle when he nails it, he should actually be going for 20.7 because it maximizes his overall production for the 50% of the time that he mishits it.

More to come…

]]>

...the speed of the ball off the bat is presumed to be a consequence of three things: the speed with which the ball was thrown, the bat speed, and the degree to which the bat was centered on the ball. Hopefully after we have a few years of data people will figure out that exit velocity doesn’t correlate very highly with the quality of offensive production, and then we can stop speaking about bat speed as if it was actually important.

With regards to this, “(a) the speed with which the ball was thrown, (b) the bat speed, and (c) the degree to which the bat was centered on the ball”: that’s actually a great summary. ...

For (a), Alan Nathan has shown that 18% of the pitcher’s speed is added to the exit velocity. So, 100mph incoming and 90mph outgoing means that 18 of that 90 is due to the speed of the pitch, and the other 72 is the batter himself. Meaning that had he done the exact same thing (with b and c above), it would have exited at 72. ...
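The arithmetic for (a) is simple enough to write down. A minimal sketch, using the 18% figure quoted above; the constant and function names are my own:

```python
PITCH_SPEED_SHARE = 0.18  # share of pitch speed added to exit velocity, per the text


def batter_contribution(exit_mph, pitch_mph):
    """Exit velocity with the pitch-speed component removed."""
    return exit_mph - PITCH_SPEED_SHARE * pitch_mph


# The example above: 100 mph in, 90 mph out.
batter_only = batter_contribution(90.0, 100.0)
```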

For (c), Alan has also formulated this in terms of trading speed to get loft, something that anyone hitting knows, but can’t necessarily express precisely. I show an example from the HR Derby with Stanton, where 2 of his 3 highest exit velocities were among his shortest hits, because he hit them too dead-on (no loft), along with Alan’s technical explanation.

In terms of comparing exit speed (and a series of other metrics) to future production, the excellent saberist Craig Edwards at Fangraphs has a study here:

http://www.fangraphs.com/blogs/exit-velocity-part-iii-applying-meaning-to-the-data/

And he shows that the first-half components of ISO, Exit Speed, wOBA, SLG, OBP each forecast the second-half wOBA around the same. Fascinatingly, the one metric that does best is BB/PA. Which really means that a high BB is an indicator or proxy for a lot of other things, which of course you can say about any of them, including exit speed. The real test is: if you already have everything else that I mentioned, how much more (if any) does Exit Speed add?

Well. . .my concern would be the loss in bat control. Bat speed is competing with bat control. We’ve reached the point NOW where I’m ready to say that we’ve gone too far; we’d be better off (in building a team) to focus on guys who make solid contact, rather than those who hit the ball hard when they do hit it. I’m not CERTAIN whether this is true; it is just kind of what I think. Not sure how exactly the research relates to this point.

“rather than those who hit the ball hard when they do hit it”: I agree with you. It’s the Nuke LaLoosh issue, but for batters. Carlos Peguero is probably a good example. Basically, by not making contact, you get a “pass” in terms of it not counting against you in exit speed. So, there is an extra layer needed there to handle the swing-and-miss. Another example would be Ortiz v Votto: for all we know, if Votto tried to swing harder, he could hit as hard as Ortiz, but may strike out even more than [he already does]. So, you can’t blindly go on exit speed. Therefore, I think we’re on the same page here.

]]>In order to figure out “speed”, we’d like to be able to put the three of them along a common scale, meaning fixing the distance or the time as a constant. Football has the 40-yard dash, with the 4.4 seconds as the standard to beat. We could try to do distance in baseball, perhaps choosing a short distance like 30 feet or 50 feet, so that we’d be able to cover all kinds of runs. That’s when I turned over to my followers at Twitter and asked them how they would measure it, and I asked it in terms of “number of strides”. That is, how many strides do you need. And it came out to around 7 or so, which is around 30 feet, or, more interestingly, 1 second.

One second. That is perfect. Because if we’re going to have a unit as a “per” second, whether I say “feet per second” or I say “most feet covered in his fastest 1 second”, it’s going to be the same number. Billy Hamilton runs at 30 feet per second? Great. Billy Hamilton covered 30 feet in his fastest 1 second window? Great. In either case, all I have to do is get that “30 feet” and that “1 second” or “per second” out there, and everything else can be tailored to the discussion.

One second. The other reason that is perfect is because we use cameras to measure the position of the players. And the cameras operate at 30 frames per second. Therefore, technically, I’m looking at the Nth frame and the N+30th frame, and finding those two frames where his positions are the farthest apart.
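The frame N versus frame N+30 search can be sketched like this, assuming one `(x, y)` position in feet per camera frame. The function name and the input layout are assumptions; the 30 frames per second is from the text.

```python
import math

FRAMES_PER_SECOND = 30  # camera rate, per the text


def fastest_one_second(positions):
    """Most feet covered between any frame N and frame N+30.

    positions: list of (x, y) in feet, one per frame, in time order.
    Returns None when the run lasted less than one full second.
    """
    if len(positions) <= FRAMES_PER_SECOND:
        return None
    return max(
        math.dist(positions[n], positions[n + FRAMES_PER_SECOND])
        for n in range(len(positions) - FRAMES_PER_SECOND)
    )
```

Because the window is exactly one second, the result reads as either "feet per second" or "feet covered in his fastest 1-second window" without any conversion.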

Interlude: why not MPH? Because it doesn’t mean anything. MPH is the end of the discussion. You can’t do anything if I say “20 MPH”. The story ends. But, if I say “30 feet per second” or I say “he covered 30 feet in 1 second”, and then I show you how he missed a play by 1 foot, you can now relate it directly. The entirety of everything we do is grounded in feet and in seconds. The unit should reflect the reality we are living in.

Visually, we can show it like this:

It’s clear and concise.

Now, on to more specifics, and we are going to deal with speed only as fielders, in this installment. So, outfielders being outfielders. The two fastest outfielders (2015-2016) are Billy Hamilton and Byron Buxton. Everyone else is racing for 3rd place. Do we just take all of Billy Hamilton’s runs in the outfield? Well, this is where all the fun begins. Hamilton’s four fastest runs are all runs where he did not catch the ball on the fly. That is, he did not have to track the ball, he simply put his head down and went to get the rolling ball.

So, our first challenge: we need to separate each outfielder’s run, based on whether he touched the ball on the fly or whether it landed first. When we do that, we see that Hamilton’s fastest run on a putout was 0.3 feet per second slower than his fastest run on a hit. Indeed, in the 111 outfielders in my sample, the average putout was around 0.34 ft/s slower than the average hit. While we’re tracking everything, the standard for the outfielder will be that he had to touch the ball on the fly (putout or catching error).

Next up is to figure out how many runs you want for an outfielder. We can simply rely on his “personal best”, but that hardly seems like something we’d do for anything else. I asked a runner, Julia Prusaczyk, how track times compare to a runner’s “personal best”, and her answer came back that there’s a 4 to 5% difference. So, for example, if Usain Bolt has a personal best at close to 9.6 seconds, then his “average top” time is around 10 seconds. If we take the 95th percentile and above for each outfielder (average of 27.6 feet per second for the 111 outfielders), and compare it to each outfielder’s “personal best” (average of 28.8 feet per second), that gives us a difference of just over 4%.

We’re all set at this point. Billy Hamilton had 477 runs tracked where he made the putout, and if we take his fastest 5%, that’s an average of 30.2 feet per second. His personal best is 31.6 feet per second. That’s a difference of 4 1/2 percent. Byron Buxton is second at 30.1 feet per second.
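As a sketch of the "fastest 5%" calculation, assuming each run has already been reduced to its fastest 1-second speed in feet per second. The function names are hypothetical, and rounding the count down (477 runs gives 23) is my assumption; the text does not say how the cutoff is rounded.

```python
def sprint_speed(run_speeds, top_percent=5):
    """Average of a fielder's fastest runs (top 5% by default)."""
    ranked = sorted(run_speeds, reverse=True)
    count = max(1, len(ranked) * top_percent // 100)  # rounding down is an assumption
    return sum(ranked[:count]) / count


def relative_gap(personal_best, typical_top):
    """How far the typical top speed sits below the personal best, as a fraction."""
    return (personal_best - typical_top) / personal_best


# Hamilton's figures from the text: 31.6 personal best, 30.2 for his fastest 5%.
hamilton_gap = relative_gap(31.6, 30.2)
```

Using the top slice rather than the single best run keeps one outlier measurement from defining a player's speed.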

Now, how about #3? I asked my readers who they thought was the fastest, and I gave them 3 obviously fast runners and Lorenzo Cain. The fans chose longtime mate Jarrod Dyson over Lorenzo Cain. But the third fastest runner is objectively Lorenzo Cain. Poz will have more on Cain, Dyson, and Orlando in a few days. But this is the great thing here, that we can now have a new way to talk about something as (seemingly) simple as speed. You young pups may not remember Tim Raines and Andre Dawson, best friends and teammates. Raines was a far faster runner than Dawson. But it was Dawson that had the nickname of Hawk. Trying to catch a ball is more than just speed, but “tracking speed”.

In alphabetical order, joining Hamilton, Buxton and Cain in the top 10: Bourjos, Gose, Herrera, Kiermaier, Marisnick, Orlando, Polanco. And in the bottom 5 in alphabetical order: Bautista, Beltran, Cabrera, Cruz, Francoeur. In a few weeks, leaderboards will be coming to a Savant near you. This should be fun.

Oh, and the Fans. Boy do they know what they see. Here is how the Sprint Speed from Statcast compares to the Fans Scouting Report. And with a healthy r=.80, we can see how the Fans have a pretty good way to measure speed, just with their eyes. The fun however is where the fans are wrong. After all, as Bill James said, a metric that never surprises is probably useless and a metric that always surprises is probably wrong. We’re looking for that sweet spot of a metric that confirms 4 out of 5 things we know, and the fifth one is the new thing we learned that makes the metric worthwhile.

]]>I don’t know if this is an epiphany. One of the issues with looking at the forecasts of poorly forecasted players is that when one of them has a terrific start, he will keep playing, while those who have poor starts won’t have a chance to keep pitching. And then we end up weighting the players based on playing time, giving more weight to the guys who got luckier.

So, when I saw this post from MGL, talking about Guthrie, I had a thought: what if MGL gives us the list of pitchers who he says are AA-level (or worse). And then we look at the FIRST game they play in MLB. And only the first game. Indeed, I would even say: let’s look at the first 9 batters they face in MLB. Or even, just the very first batter.

I presume there are some 100 pitchers every year that MGL thinks are AA-level or worse who pitch each year, facing at least one batter. He does this for the last 10 years of his forecasts, and now we have 1000 PA of what should be AA-level or worse. Did they match MGL’s forecasts? Or did the teams know something he didn’t?

]]>

Let’s talk about outfielders. Outfielders as noted have to track balls. So the first thing we need to do is separate their runs based on whether they caught the ball, or whether they didn’t. The idea being that when you catch the ball, you have to track it, and when you didn’t, you (most often) will just put your head down and run to get it after it lands. When I do so, we get a difference of about 0.3 feet per second. Not much of a difference but enough to be noticeable if we look at the fastest runs in the outfield.

There’s more to come, but for now, there’s a couple of tweets out there to take a look at.

]]>- Hit Probability is completely agnostic of the fielding alignment on the play in question. It’s agnostic as to the identity of the fielders. Whether it’s Kiermaier playing 330 feet from home in dead center, or Kemp playing in an emergency role 280 feet from home dead center, the hit probability does not change. This is because we are calculating: “GIVEN the exit velocity and launch angle of the ball, and knowing nothing else about the launch characteristics, what is the probability that we will see a hit, in some random park on some random day against some random set of fielders playing some random fielding alignment?”
- Catch Probability is completely dependent on where the fielder is positioned, how many feet he has to run, and how much time he has to run. It makes a general allowance for balls hit “straight back” (defined as straight back +/- 22.5 degrees, or splitting the zone around you into 8 slices, and the one slice straight back is the “straight back”). It is (currently) unaware of the wall.
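The eight-slice idea for run direction can be sketched as below. The angle convention is an assumption on my part (0 degrees means straight back, measured clockwise), as are the function names; only the +/- 22.5 degree slice width comes from the text.

```python
def direction_slice(bearing_deg):
    """Map a run direction to one of 8 slices of 45 degrees each.

    Assumed convention: 0 degrees is straight back. Slice 0 is centered
    on straight back and spans -22.5 (inclusive) to +22.5 (exclusive).
    """
    return int(((bearing_deg + 22.5) % 360) // 45)


def is_straight_back(bearing_deg):
    """True when the run falls in the straight-back slice."""
    return direction_slice(bearing_deg) == 0
```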

At its most extreme therefore, you can have a BARREL at 105mph hit at 30 degrees (hit probability say at 90%), but launched to dead center, where Kevin Kiermaier was positioned very deep and he jogs to the warning track where he’s camped under it (catch probability over 99%).

So, there’s your challenge. Say that in 10 seconds.

***

UPDATE:

A good way to think about out+hit not equal to 100%:

- REGARDLESS of where the fielder is positioned, the **hit probability** does not change
- BASED SPECIFICALLY on where the fielder is positioned, the **catch probability** will change completely

So, just ask yourself this question:

“Do I care where the fielder was when the ball was hit?”

- If yes, Catch Probability.
- If not, then Hit Probability.

How does that compare to the velocity at y=50 (meaning 50 feet from the back tip of home plate), which was the previous number being reported? Glad you asked. Here are two charts, one based on the difference, and the other based on the rate. Each chart uses both the extension of the pitcher, as well as the pitch speed out of his hand.

Because the percentage retained is virtually entirely based on the distance, we can collapse the above chart like so

***

Just as interesting, an industry-leading site Brooks Baseball has been reporting measurements at y=55, meaning taking y=50 data and inferring speed at y=55 (whether the pitcher releases at 54 feet or 56 feet).

There are good reasons to have a fixed point (whether y=50 or y=55 or ... see below) as well as the actual release point. Both will be tracked. But in terms of the real-time tracking number, **the out-of-hand is what you will be seeing**.

UPDATE: As I noted above, I said BOTH will be tracked. The out-of-hand is what you will see. In order to see the fixed point, you can interpret it from the XML file. The key value you are after is vy0, which is in feet per second, which you can convert to MPH by multiplying by 0.681818. It’s velocity along the y-axis. Thanks to Dan Brooks below for reminding me that to get the speed toward the plate, you need all three axis values, vx0, vy0, vz0. You’d square them all, add them up, and square root.
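That conversion, as a short sketch. The field names `vx0`, `vy0`, `vz0` and the 0.681818 factor are from the paragraph above; the function name is mine.

```python
import math

FPS_TO_MPH = 0.681818  # feet per second to miles per hour (3600 / 5280)


def speed_mph(vx0, vy0, vz0):
    """Speed toward the plate in MPH from the three velocity components in ft/s."""
    return math.sqrt(vx0 ** 2 + vy0 ** 2 + vz0 ** 2) * FPS_TO_MPH
```

Using `vy0` alone slightly understates the speed, since the ball is also moving a little sideways and downward at that point.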

***

Ok, the “fixed” point, presumably to make sure every pitcher is being compared the same. Let’s say two pitchers throw a ball, one that releases it 7 feet from the mound, and the other releases it 5 feet from the mound. By the time the ball reaches y=50 (meaning 50 feet from the back tip of home plate), both balls are traveling at 95mph. Are they equally impactful from the perspective of the batter?

The guy who released it with longer extension (i.e., closer to the plate), released it, out of hand, at 95.5. The guy who released it with shorter extension, released it, out of hand, at 95.8. Are those two equivalent, from the perspective of the batter?

I don’t know (yet). If they are not equivalent, then there’s no real purpose to reporting the y=50 value. We don’t calculate data for the purpose of calculating data. **We organize baseball data to be able to answer baseball questions**.

It may very well be that the best way to organize the data is to show: (a) speed out of hand, and (b) x,y,z position of the ball at T minus 250 ms, where T=0 is front of home plate (or perhaps T=0 is where y=2 feet from back tip of home plate). Once we figure out what we want, then we’ll do that.

]]>I then showed it to @mtmeyers and he blurted out “Kelly Leak!” I flipped the paper over like a magician, to show him I had already written Kelly Leak.

With @JasonBernard_ doing his magic, it was up to @mike_petriello to bring it all together, and his bang-up job is right here. Check out the video especially.

***

So we talked about the Billy Hamilton Statcast Play of the Year. There’s two main ways to approach the question of attaching run values to the play: bottom-up or top-down. A top-down approach starts with the actual outcome of the play, and tries to split things off into components, so that each player gets his deserved share, within the confines of that play. A bottom-up approach starts with what the player would normally deserve, and then tries to account for any gaps to non-player variables.

Bill James for example is top-down, and MGL is bottom-up.

I was thinking about hockey, where they have (up to) two assists per goal. While the official totals don’t mark the players as first-assist (the one directly prior to the goal) or second-assist (the pass to the first-assist player), the saber-hockey guys do. And I was thinking about that with respect to the Billy Hamilton play. And ESPECIALLY with the Kelly Leak plays.

When Herrera makes an 85% catch probability play ahead of the 99% corner outfielder, it does expose his talent (MGL’s bottom-up). But it also does nothing for the team at all (Bill’s top-down). So I was thinking: I should simply count them separately. When Herrera makes a play where someone else was initially closer, I give him +.15 outs… as a “second assist” kind of play. And we create a “-.14 outs”, which we may or may not attach to Herrera, as “unavailable” outs, so that at the team level, you still get your +.01 outs. Billy Hamilton gets +.93 outs for his Statcast play of the year… as “second assist”. When we start to tally the contributions of all our fielders, we can total them as his contributions when he was the closest fielder, and a separate tally when he was not the closest fielder.

Then, when we want to show the MGL approach of “talent”, we simply merge the two primary and secondary “assists” into one number. When we want to show the Bill James approach of “available value”, we merge the two and the “unavailable outs”.
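A sketch of that ledger for a single caught ball, using the Herrera numbers above. The function name and tuple layout are hypothetical; the arithmetic is just the +.15, -.14, +.01 split described in the text.

```python
def second_assist_ledger(catcher_prob, closest_fielder_prob):
    """Split the credit when a farther fielder makes the catch.

    catcher_prob: catch probability for the fielder who made the play.
    closest_fielder_prob: catch probability for the initially closer fielder.
    Returns (talent_credit, unavailable_outs, team_credit).
    """
    talent_credit = 1 - catcher_prob            # bottom-up view
    team_credit = 1 - closest_fielder_prob      # top-down view
    unavailable_outs = team_credit - talent_credit
    return talent_credit, unavailable_outs, team_credit


talent, unavailable, team = second_assist_ledger(0.85, 0.99)
```

The "talent" view reports the first number on its own, while the "available value" view adds the first two, which always lands on the team credit.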

From an official leaderboard perspective, I’d go with the bottom-up approach, since this is going to be consistent with how we are doing “Barrels” and the other hitting profile slices we are creating.

]]>

This was the Statcast Play of the Year. Billy Hamilton ran for 123 feet to catch that. His fastest 1-second window covered 31.6 feet, which was the second-fastest last year, behind his own 31.7 feet. That was a catch probability of 7%.

The issue? Adam Duvall was actually closer, at 108 feet. HIS catch probability was 38%.

Disregarding the fielders, from the perspective of the Reds team, that was a 38% catch probability, maybe 40% considering that was a potential two-fielder play. So, there was +.60 outs to be earned for the catch and -.40 outs for allowing it to drop.

From the perspective of Billy Hamilton, Duvall was not in play. Hamilton did everything he needed to do to earn +.93 outs. Except of course, there’s only +.60 outs to be earned. We can debit Duvall 0.33 outs, so that we are all in sync.

Had Hamilton not caught the ball, Duvall would have been charged -0.38 outs, and the CF would have been charged -0.02 for not covering.
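The bookkeeping for both branches of this play can be sketched as follows, using the 7%, 38% and 40% figures above. The function names are hypothetical, and this is only the approach proposed here, not a settled method.

```python
def credit_on_catch(catcher_prob, team_prob):
    """Catch made: the catcher earns his full credit, the closer fielder is debited the excess."""
    catcher_credit = 1 - catcher_prob
    team_credit = 1 - team_prob
    closer_fielder_debit = team_credit - catcher_credit
    return catcher_credit, closer_fielder_debit


def debit_on_miss(closer_prob, team_prob):
    """Ball drops: the closer fielder takes his share, the other fielder the remainder."""
    closer_fielder_debit = -closer_prob
    other_fielder_debit = -(team_prob - closer_prob)
    return closer_fielder_debit, other_fielder_debit


hamilton, duvall_on_catch = credit_on_catch(0.07, 0.40)
duvall_on_miss, cf_on_miss = debit_on_miss(0.38, 0.40)
```

In both branches the individual lines add up to the team-level value: +.60 on the catch and -.40 on the drop.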

Therefore, is this a reasonable approach to follow? What other considerations would you like me to take?

After you respond, I will then give you a wrinkle to consider that will basically question your response, or at least, need to revise it with conditions.

]]>

(No runners on)

]]>

]]>There’s plenty of terrific people I met, and I don’t want to go through the whole list, out of fear I’ll be missing some. Let me pick out a couple of youngsters, to make a larger point. If I even have one. I have no idea, I’m just typing right now. Alex Marcotte came down from Montreal to see me and Mitchel (who I met for the first time as well). Alex is a kid. I don’t know if he’s a teenager or in his early 20s. Alex came into our saber world when I had put out an APB to have someone fix my Wiki. My webhost did some upgrades, which affected the software I was using. And I had NO inclination to spend time trying to fix it. That’s a job for someone in their 20s, maybe 30s. That’s not me any more. He offered, and a few hours later, with some back and forth suggestions on my part, he restored my wiki into archive mode. Those were my very first dealings with him. When clubs and then Mickey were looking for some tech help, I immediately sent Alex their way. Terrific kid, and it was great that he made the trek from Montreal, all with his Expos cap on.

After the presentation, there was a line forming, which took me very much by surprise. Alex was in that line to thank me, when really I should have been the one to thank him. Whatever he thinks I may have done for him, he did more for me. Then there were a couple of girls also in line, probably no older than Alex, named Tess and Julia. You gotta understand, this saber-thing is dominated by men, to an absurd extent. To a very inefficient extent. There’s no reason that the % of women fans can’t also be the same as the % of saber fans. And I gotta figure that attendance must be some 40%+ women. Everyone should commit to adding one female follower on their Twitter feed every day, just to try to counter the bias. So, I was very surprised to see The Book, released a bit over 11 years ago when so many weren’t even teenagers, somehow have the staying power to have whatever impact it had on Tess and Julia. I thank them as well. And in due time, I imagine they’ll make their mark on the saber world.

I also met Bill James. His presentation was directly before ours. But I didn’t meet him then, nor did he see our presentation. He wasn’t aware of the schedule, and he had other media commitments. I spent two hours with him and Tom Tippett later in the weekend. Tom Tippett basically led the saber-life I had envisioned for myself. He’s about ten years older than me, so you can say he hit the saber-world at the exact perfect time. Tom’s path is the one I admire, but Bill is the one we all revere. Brian Kenny called him “genius emeritus”. A perfect title for Bill. Just as the hockey world is lucky to have Gordie Howe as the face of hockey, one that later generation stars Wayne Gretzky and Bobby Orr still will instantly deflect to whenever their own exploits are brought up, I’m glad we have Bill James as that face. There’s several dozen people you can hold out as having had or will soon have great impact on the saber world. But it’s Bill’s world, and he was the perfect guy to start it all, and to continue to inspire even more. None of us is Bill James, which means we can all find our own little place in this saber world, without thinking that someone is trying to get to the top. It’s a big playground, and we’re all on the same team.

Now that I typed that, I see I had no larger point to make. Let me now post those slides. See you in another thread…

]]>