the BEST base thief

February 13, 2010

Who is the best base stealer in MLB history? Most “who da best?” questions don’t provoke answers as seemingly obvious as this one. But… I’ll post a quickie with my take that the obvious answer isn’t the right one. Maybe in a week or two. For now, y’all can think about it.

Matsuzaka’s Journey Back

September 27, 2009

Two things we all know (or is that “know”?) about pitching:

— There’s no such thing as a pitcher with the skill of pitching out of jams. If a pitcher has much better numbers with runners on (or RISP) than with the bases empty, that’s luck and it will normalize, sooner than later.

— There’s no such thing as a pitcher with the skill of getting lots of easy outs on balls in play. While there are real differences in BABIP skill (with knuckleballers leading the pack), if a pitcher has a low enough BABIP, that’s luck and it will normalize, sooner than later.

What, then, do we make of a pitcher who consistently pitches out of jams by improving his rate of easy outs on balls in play?

I think we all knew that Dice-K has both a crazy bases empty / runners on split (and hence strand rate) and a crazy BABIP. What I didn’t know until I ran the numbers is that the crazy BABIP happens only with runners on:

Matsuzaka Tricks

Split          PA    BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Bases Empty    1002  .271  .357  .429  .787  6.02   .211  .111  .039  .330   .088     .265         .268
Men On         835   .215  .310  .355  .665  3.31   .229  .104  .039  .259   .073     .200         .281
% Improvement  -     -     -     -     -     -      9%    6%    0%    22%    17%      24%          -5%
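
For anyone who wants to check the “% Improvement” row, here is a minimal sketch of how it appears to be computed. The formula is an assumption, not something the post states: relative change from Bases Empty to Men On, signed so that positive always means better for the pitcher (higher K%, lower everything else). The inputs are the rounded rates from the table, so a result can land a point off the printed row.

```python
# Rounded rates copied from the table above: (bases empty, men on).
SPLITS = {
    "K%": (0.211, 0.229),
    "BB%": (0.111, 0.104),
    "HRC": (0.039, 0.039),
    "BABIP": (0.330, 0.259),
    "XBH/BIP": (0.088, 0.073),
    "1B/(1B+OIP)": (0.265, 0.200),
    "XBH/HIP": (0.268, 0.281),
}

# Assumption: K% is the only column where a higher number helps the pitcher.
HIGHER_IS_BETTER = {"K%"}


def pct_improvement(stat, empty, men_on):
    """Relative change from bases empty to men on; positive = better for the pitcher."""
    if stat in HIGHER_IS_BETTER:
        return 100.0 * (men_on - empty) / empty
    return 100.0 * (empty - men_on) / empty


improvements = {s: pct_improvement(s, e, m) for s, (e, m) in SPLITS.items()}

for stat, pct in improvements.items():
    print(f"{stat:<12} {pct:6.1f}%")
```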

The improvement in strike zone command seems modest, but it’s crucial: even without the change in BABIP it would be enough to reduce his component ERA with runners on from 6.02 to 4.58. But that guy would still be lousy; he needs the low BABIP with runners on to succeed. And the improvement in strike zone command with runners on shows that the notion that he nibbles more to avoid harder contact is just wrong; when runners get on he comes after hitters more — and with more success.

It’s not merely that he’s doing it with smoke and mirrors, it’s like the smoke is in front of the mirrors and nowhere else.

One might begin to think that he simply pitches better out of the stretch. But then what do we make of this already legendary split, even with its eenie-weenie-teenie-tiny sample size?

Matsuzaka Tricks Reloaded

Split          PA   BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Other Men On   778  .218  .315  .361  .676  3.40   .226  .105  .039  .263   .074     .204         .281
Bases Full     57   .163  .246  .265  .511  2.20   .263  .088  .028  .200   .057     .152         .286
% Improvement  -    -     -     -     -     -      16%   17%   30%   24%    23%      26%          -2%

Or this more obscure and even more puzzling one?

Matsuzaka Tricks Revolutions

Split           PA   BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Leadoff Batter  441  .290  .379  .464  .842  7.10   .200  .111  .044  .347   .088     .285         .253
Other Empty     561  .256  .340  .402  .743  5.27   .219  .111  .035  .316   .089     .249         .281
% Improvement   -    -     -     -     -     -      10%   1%    20%   9%     -1%      12%          -11%

Let’s just combine the last two for easy scanning:

Matsuzaka Tricks, The Box Set

Split           PA   BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Leadoff Batter  441  .290  .379  .464  .842  7.10   .200  .111  .044  .347   .088     .285         .253
Other Empty     561  .256  .340  .402  .743  5.27   .219  .111  .035  .316   .089     .249         .281
Other Men On    778  .218  .315  .361  .676  3.40   .226  .105  .039  .263   .074     .204         .281
Bases Full      57   .163  .246  .265  .511  2.20   .263  .088  .028  .200   .057     .152         .286

Pure science fiction.

I definitely intend to attack this question with PITCHf/x data over the winter. Right now, I’m open to any possibility.

Taken from: Sons of Sam Horn

Streakiness and the “Fog”

September 9, 2009

An aside on the notion of streakiness (part 1B of the series, if you will).

There have been many studies showing that all streakiness in sports is random. There was an exhaustive study a couple of years ago on Retrosheet that found that hot and cold streaks had no predictive power.

The problem is in the assumption that if a hot streak or cold streak is real, it must have significant predictive power.

In baseball, the standard test of the reality of hitting streaks is serial correlation. Is a player’s performance in one game predictive of his performance in the next? The problem with this is that there’s an enormous amount of “noise” added to the signal. A hot hitter will face Sabathia and go 0-4, a cold one will face a AAA callup and / or get two bloop hits.  Red Sox and Yankee fans may remember a series in NYC (May 27-29, 2005) where Manny Ramirez, one of baseball’s truly streaky hitters, came in 1 for his last 12 and looking awful and went 7-13, each and every one a cheap single; he then left town and put up a 562 OPS in his next 10 games.

The further problem is that in a serial correlation, the end of each streak and the beginning of the next form a pair of points included in the correlation, when our hypothesis is that they should anti-correlate. That further reduces the strength of the measured correlation.
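
A toy simulation shows how much the noise swamps the signal. Everything in it is an assumption chosen for illustration: a hitter who alternates between a .330 “hot” true average and a .230 “cold” one in fixed 15-game blocks, four at-bats a game, nothing else going on. The sketch compares the lag-1 serial correlation of his true talent with that of his actual hits per game.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def lag1(series):
    """Serial correlation: each game paired with the next one."""
    return pearson(series[:-1], series[1:])


def simulate(games=1500, streak_len=15, hot=0.330, cold=0.230, at_bats=4, seed=1):
    """Hypothetical streaky hitter: true talent flips every `streak_len` games."""
    rng = random.Random(seed)
    talent, hits = [], []
    for game in range(games):
        p = hot if (game // streak_len) % 2 == 0 else cold
        talent.append(p)
        # Each at-bat is an independent coin flip at the current true average.
        hits.append(sum(rng.random() < p for _ in range(at_bats)))
    return talent, hits


talent, hits = simulate()
true_r = lag1(talent)      # what we would see if talent were observable
observed_r = lag1(hits)    # what a game-to-game study actually measures

print(f"lag-1 correlation of true talent: {true_r:.3f}")
print(f"lag-1 correlation of hits/game:   {observed_r:.3f}")
```

In this setup the true-talent series is strongly correlated from one game to the next, and the only thing holding it below 1 is the block boundaries, the same anti-correlated pairs described above. The hits series, which is all a study ever sees, shows only a small fraction of that.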

In fact, if you take Manny’s career with Boston and divide it into apparent hot (actually just normal) and cold streaks and remove the anti-correlated data pairs, you do get a significant or nearly significant serial correlation (of linear weights / PA). And as I have noted elsewhere, player seasons often divide into chunks that chi-square tells us are unlikely to be random.

Standard statistical tests of streakiness just aren’t up to the task of demonstrating it’s real. That doesn’t mean it isn’t real — a perfect example of what Bill James calls the “fog.”

I’m 100% certain that a study using experienced baseball scouts could prove the existence of streakiness by having them significantly outperform chance in their ability to predict the end of slumps by streaky players like Manny (as Jerry Remy used to do). IOW, they’d say, “OK, today player X fixed his mechanics and should perform better over the next N games than the last N.” And they would be right most of the time.


Understanding Player Streakiness #1: The Epic Slump of Big Papi

August 26, 2009

Why do we believe that David Ortiz’s unimaginably bad April and May should nearly be ignored when projecting his performance for the rest of the year? Why should we believe he has the skills of a .259 / .352 / .566 hitter (his numbers from June 6 to August 24) rather than those of a .227 / .320 / .439 hitter (his overall season numbers)? Isn’t this “cherry-picking” of the worst sort? Why would you toss out a significant chunk of a season like that?

It’s not a trivial question: if Ortiz really is a 759 OPS hitter now, he should be getting no PT for the Red Sox, not when Casey Kotchman is sitting on the bench.  In fact, some on SonsofSamHorn.net (from which much of this post is adapted) have been arguing just that: bench Papi, he’s toast.  But if he’s actually a 918 OPS hitter, that would be crazy.

The first thing to understand is that you could play Strat-O-Matic Baseball or Diamond Mind from now until the day you die and not see a .439 slugger put up a .288 SA in his first 221 PA and a .566 SA in his next 264. Ortiz’s HR / Contact has gone from .007 to .106; the odds against seeing that in a random simulation are something like 3,823 to 1 (chi-square, p < .0003). [NB: Yes, I know that chi-square is not exactly accurate with any of the n < 5, but it’s close enough for sabermetrics.]
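
To make the chi-square claim concrete, here is a minimal sketch of a 2x2 test on home runs per ball in contact. The counts are hypothetical, picked only to roughly match the quoted rates of .007 and .106; they are not Ortiz’s actual totals, so the p-value it prints is illustrative and won’t reproduce the 3,823-to-1 figure exactly. With one degree of freedom the chi-square tail probability has a closed form, erfc(sqrt(x/2)), so nothing beyond the standard library is needed.

```python
import math


def chi_square_2x2(hr_a, other_a, hr_b, other_b):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Rows are the two stretches of the season; columns are HR vs. any other
    ball in contact. Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = hr_a + other_a + hr_b + other_b
    row_a, row_b = hr_a + other_a, hr_b + other_b
    col_hr, col_other = hr_a + hr_b, other_a + other_b
    stat = n * (hr_a * other_b - other_a * hr_b) ** 2 / (
        row_a * row_b * col_hr * col_other
    )
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# HYPOTHETICAL counts, not real data: 1 HR in 150 balls in contact (~.007)
# versus 19 HR in 179 balls in contact (~.106).
stat, p_value = chi_square_2x2(1, 149, 19, 160)
odds_against = (1 - p_value) / p_value

print(f"chi-square = {stat:.2f}, p = {p_value:.5f}")
print(f"odds against: about {odds_against:,.0f} to 1")
```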

It’s important to understand that streakiness is real. Player seasons which divide like this and give ridiculous odds of happening randomly according to chi-square are commonplace.  That serial day-to-day correlation is not significant does not disprove the notion of streakiness; it just fails to give positive evidence.  Remember that a correlation does not measure the strength of a relationship; it measures the strength of a relationship minus the noise. Add sufficient noise, and any real relationship can fail to show a significant correlation.

The second thing that’s crucial is that we have a very good explanation for why Ortiz (or any hitter and especially any aging hitter) could have such a miserable stretch of season. When looking at big splits, “do I understand how this happened?” is one of the two crucial interpretive questions you must ask yourself. (The other, and sometimes it’s the same question, is “did I look for this split to confirm an existing hypothesis or suspicion, or did I stumble on it while examining all of his splits?”)

The explanation (and here we shift from sabermetric mode to scouting mode, and it’s something that every sabermetrician needs to be able to do) begins with the psychological contrast between the “Big Papi” of legend and the Ortiz of April and May. From the ’04 post-season to the end of ’06 Ortiz may have been the most confident athlete you’ll ever have the pleasure to watch.  I’ve done studies which showed that his success in walk-off situations literally had millions-to-one (maybe billions-to-one) improbability. He not only knew he could hit, he knew they couldn’t get him out when it really counted.

This absolute confidence disappeared when he started suffering the health problems that come with ordinary aging. In ’07 and ’08 his clutch differential was actually negative, which was just being average plus bad luck.

Compare the guy who knew that no pitcher on the planet could get him out when the game was on the line to the guy who told the press “Papi sucks.” Ortiz suffered a complete collapse of confidence, complete self-doubt.

Now, the way this affects hitting is that it causes you to think about mechanics while you’re up there. That’s the last thing you want to do; it’s got to be what people call “muscle memory.” I suspect former Sox #1 prospect Lars Anderson has struggled this year in part because he’s so damn smart, and I suspect that the success of guys like Wade Boggs and Manny Ramirez is directly correlated to, shall we say, their unlikelihood of ever joining Mensa. I think it took Dwight Evans half his career to stop thinking too much while he was up there. (NB: I’m not talking about the “what’s he going to throw me next” thoughts between pitches, just whether the hitter can shut out conscious thoughts about swing mechanics.)

It’s important to note that there were a few weeks where we had persistent reports that Ortiz was having great BP but was still struggling in games. BP gives you a chance to work on mechanics and, having made an adjustment, get it out of your conscious mind, let it settle into muscle memory, and take a bunch of repetitions. Bringing that to games can be a big challenge. It’s the reason why slumps last as long as they do despite hitting coaches, video study, and extensive extra BP. It’s absolutely like the “don’t think about elephants” dilemma. It’s not just that you have to get past the stage where you’re actively thinking about mechanics, you have to get past the stage where you’re thinking that you shouldn’t think about your mechanics. That takes repetitions and confidence. You have to literally forget you’re slumping.  You can’t be trying really hard to relax.

To sum up:

Age and declining health -> increased likelihood of mechanics getting out of whack, at the purely physical level. Your knees hurt, you lessen the depth of your crouch, suddenly the swing is just a bit off.

Declining health -> loss of general confidence. You know you’re not physically the player you used to be.

Loss of confidence -> increased likelihood of thinking about swing mechanics while at the plate. Once the thought even crosses your mind that the swing might not be right, thinking about the swing while up there just makes things worse.

That’s how slumps start. And then the bad performance of the slump creates a further loss of self-confidence which leads to more thinking which leads to yet worse performance.

The reason why we can ascribe Ortiz’s epic slump to these psychological processes (writ much larger than usual) rather than to a fundamental loss of skills is obvious and trivial: his performance after the slump is over.  The numbers are arguably even more dramatic than the ones I noted at the beginning, because in the 11 games after the PED story broke, Ortiz hit .114 / .204 / .136 in 49 PA, and according to his own testimony he wasn’t sleeping at night.  Those 49 PA can actually be excluded by the same logic, leaving us with a “true maximum skill level” of something like .293 / .386 / .668, which is basically his season projections with a big power boost.

The reason why you don’t project him to hit like that the rest of the year is equally obvious and trivial: he is not immune to further slumps. There is even a small probability that the next slump will be extended, like the first one, but that is mitigated by two factors: he is much more likely to fear that he has lost his skills when he’s slumping at the start of the season rather than in the middle, and the April and May slump was exacerbated by the pressure of not having hit his first home run of the year. From the date of his first homer on May 20 to the end of the slump on June 5, he showed real signs of life, with a huge increase in pulled line drive percentage. This was precisely the period where he was reported to be having great BP.

In terms of pure, peak, hitting skills, David Ortiz is probably 90% the hitter he used to be a few years ago.  He probably has something like a 500% higher probability of getting himself into a serious sustained slump, especially at the start of the season (April ’08 was also terrible).  The specter of these extended slumps diminishes his overall value, but they do not much affect our sense of what he’s likely to do in the short run.


Texas Rangers – who’d a thunk?

August 17, 2009

The Texas Rangers are LEADING the AL in run prevention! I did NOT see that coming. So, how are they doing it?

Check out the rotation: of the Rangers’ 6 pitchers who have started the most games, the THIRD best ERA is Holland’s, at 4.88! Hmmm… doesn’t seem all that special.

New call-up Tommy Hunter is on a roll, but a fly-ball pitcher with a fairly low KO rate is not a prescription for an ERA in the low 2s, which is what he has so far. Combine that with his 4.2 minor league ERA in ’09, and I think he’s over his head.

The Rangers have allowed an incredibly low 28 UNearned runs (league avg is 42), even though they have committed an average number of errors. Great ‘clutch’ play there I guess.

Their bullpen has been superb – anyone see Darren O’Day’s ERA of 1.69 coming, after the Mets waived him since his ERA last season was 4.57?

All told, I’m not big on expecting Texas to win the wild card; sure looks like smoke and mirrors. But go ahead guys, surprise me.


Writing ’bout ‘roids

July 31, 2009

In today’s world, steroids provide most of the controversy in sports. However, ballplayers do not like being accused of juicing. This provides a serious dilemma for writers, as the threat of libel hangs over all those who are brave enough to make accusations. I was planning to do an all-steroid all star team, but I must refrain, lest my list or my liability be compromised. Think about that. I have to worry about making the same statements I hear thousands of times a day merely because I am putting them on a blog. Is that truly freedom of speech?