A Talent / Value Adjustment You’ve Never Thought About

September 28, 2009

(Exclusive to this web site!)

One of the great missing adjustments to even our most advanced stats is the adjustment for the quality of the opposition. A given stat line put up in the AL East means something rather different from the same line put up in the NL Central. Compounding the problem is that the adjustment needs to be iterative: once you have calculated the opposition quality for everyone and adjusted all the stats, you have to paste the adjusted stats over the originals, recalculate the opposition quality, and so on, again and again until the values stabilize. I do that with my adjusted standings at SoSH, but no one, AFAIK, has ever done that with individual hitting and pitching lines.
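To make the iteration concrete, here is a minimal Python sketch. It assumes something the post does not specify: a simple additive model in which each batter-pitcher matchup rate is league average plus a batter effect plus a pitcher effect. The function name `adjust_for_opposition`, the tuple layout, and the toy players are all hypothetical, and the rate could be OBP or any other per-PA stat.

```python
from collections import defaultdict


def _refit(matchups, league, other, own_idx, other_idx):
    """PA-weighted mean of (rate - league - opponent effect) for each player."""
    num = defaultdict(float)
    den = defaultdict(float)
    for row in matchups:
        pa, rate = row[2], row[3]
        num[row[own_idx]] += pa * (rate - league - other[row[other_idx]])
        den[row[own_idx]] += pa
    return {k: num[k] / den[k] for k in num}


def adjust_for_opposition(matchups, tol=1e-10, max_iter=10_000):
    """matchups: list of (batter, pitcher, pa, rate) tuples.

    Alternates between re-estimating batter effects given the current
    pitcher effects and vice versa, until the values stop moving.
    ASSUMPTION: additive batter + pitcher effects, not the author's method.
    """
    total_pa = sum(m[2] for m in matchups)
    league = sum(m[2] * m[3] for m in matchups) / total_pa
    bat = {m[0]: 0.0 for m in matchups}
    pit = {m[1]: 0.0 for m in matchups}
    for _ in range(max_iter):
        new_bat = _refit(matchups, league, pit, 0, 1)
        new_pit = _refit(matchups, league, new_bat, 1, 0)
        shift = max(
            max(abs(new_bat[k] - bat[k]) for k in bat),
            max(abs(new_pit[k] - pit[k]) for k in pit),
        )
        bat, pit = new_bat, new_pit
        if shift < tol:  # values have stabilized
            break
    return league, bat, pit


# Hypothetical toy data: hitter A is .040 better than B by construction,
# but A sees the tough pitcher X in 80% of his PA and B in only 20%.
toy_matchups = [
    ("A", "X", 80, 0.330),
    ("A", "Y", 20, 0.390),
    ("B", "X", 20, 0.290),
    ("B", "Y", 80, 0.350),
]

league, bat, pit = adjust_for_opposition(toy_matchups)
print(f"league rate: {league:.3f}")
print(f"adjusted gap, A minus B: {bat['A'] - bat['B']:.3f}")
```

In the toy data the raw lines come out to .342 for A and .338 for B, a gap of only .004, while the gap built into the data is .040; the loop is meant to recover the larger figure. A model like this only pins down differences between players, which is why the example compares A to B rather than reporting each effect on its own.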

How exactly would you do this? You could just take opponent overall quality, which would in fact be your best adjustment for value. But an interesting and in some ways better alternative would be to include handedness. For instance, each LHB would get one adjustment based on how the LHP he faced fared against LHB, and a separate adjustment for the RHP.

(As an aside, if you’re studying whether some hitters have persistent quality-of-opposition splits, you have to do it this way. AFAIK, no one ever has — the few studies showing no such persistence have used opponent ERA, which adds a huge amount of noise. Jon Lieber in his prime was a great pitcher vs. RHB and a lousy one vs. LHB; why count him as average vs. everyone?)

This notion of adjusting for opposition quality by handedness immediately suggests a value adjustment I’ve never heard mentioned. As a rule, elite LHB face a better quality of LHP than do average LHB. Not only are they less likely to get benched against a C. C. Sabathia, they are hugely more likely to face a nasty left-handed reliever (LHR) in the late innings. The differences among LHB are mitigated (slightly? more than that? I don’t think anyone knows) by this. The quality-of-opposition adjustment I just outlined would put the proper distance between the Alex Coras and Adrian Gonzalezes of the world. And this would be hugely desirable and interesting, at least when you’re assessing talent.

Now, the funny thing is this: when assessing value, this can probably be safely overlooked. It’s built into the way the game is played now that this will happen. That we are underestimating how much better Gonzalez is than Cora is pretty much negated by the better pitching Gonzalez faces as a result.

However, there is probably a small but very interesting class of exceptions to this rule. You would expect there to be some hitters who get too much or not enough respect from opposing managers, and thus face more or fewer LHR than they ought to based on their own platoon splits. You would adjust for this by finding the relationship between LHB platoon splits and the percentage of time they are at the platoon disadvantage, and then comparing the expected number and quality of LHR each hitter faced with the actual. The players with the biggest differences, in both directions, would make for very interesting lists. It’s possible that some of the “noise” in platoon splits is actual signal: as LHB establish reputations, managers begin to match them up with their lefty-killers. But reputations lag behind reality, both at the start and end of careers (David Ortiz may now be seeing tougher LHP than other LHB of his quality).
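As a rough illustration of the expected-versus-actual comparison, here is a Python sketch that fits a straight line through (platoon split, share of PA at the platoon disadvantage) and ranks hitters by how far they sit from it. Everything in it is an assumption for illustration: the linear fit, the function names, and the five made-up hitters. It covers only the number of disadvantaged PA, not the quality of the LHR faced.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y on x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def respect_gaps(hitters):
    """hitters: dict of name -> (platoon_split, share_of_pa_vs_lhp).

    Returns (name, actual minus expected share) pairs, largest first.
    A positive gap means the hitter saw more lefties than his split predicts.
    """
    names = list(hitters)
    xs = [hitters[n][0] for n in names]
    ys = [hitters[n][1] for n in names]
    slope, intercept = fit_line(xs, ys)
    gaps = {
        n: hitters[n][1] - (intercept + slope * hitters[n][0]) for n in names
    }
    return sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)


# HYPOTHETICAL data: platoon split in OPS points, share of PA against LHP.
toy_hitters = {
    "A": (0.020, 0.22),
    "B": (0.060, 0.24),
    "C": (0.100, 0.26),
    "D": (0.140, 0.28),
    "E": (0.100, 0.34),  # same split as C, but sees far more lefties
}

for name, gap in respect_gaps(toy_hitters):
    print(f"{name}: {gap:+.3f}")
```

The two ends of the sorted list are the "very interesting lists": hitters at the top are getting more lefty matchups than their splits justify, and hitters at the bottom fewer.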

(As an aside, I know that Trot Nixon’s career path of splits vs. LHP was made completely nonsensical by the genius of Jimy Williams, who benched Nixon against even the easiest LH starters but never pinch-hit for him against even the toughest LHR. So he was probably leading the league in toughest quality of LHP faced despite being nowhere near the top of the list for overall LHB quality. That’s the sort of guy it would be neat to identify and adjust.)

Matsuzaka’s Journey Back

September 27, 2009

Two things we all know (or is that “know”?) about pitching:

- There’s no such thing as a pitcher with the skill of pitching out of jams. If a pitcher has much better numbers with runners on (or RISP) than with the bases empty, that’s luck and it will normalize, sooner rather than later.

- There’s no such thing as a pitcher with the skill of getting lots of easy outs on balls in play. While there are real differences in BABIP skill (with knuckleballers leading the pack), if a pitcher has a low enough BABIP, that’s luck and it will normalize, sooner rather than later.

What, then, do we make of a pitcher who consistently pitches out of jams by improving his rate of easy outs on balls in play?

I think we all knew that Dice-K has both a crazy bases empty / runners on split (and hence strand rate) and a crazy BABIP. What I didn’t know until I ran the numbers is that the crazy BABIP happens only with runners on:

Matsuzaka Tricks

Split           PA    BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Bases Empty     1002  .271  .357  .429  .787  6.02   .211  .111  .039  .330   .088     .265         .268
Men On          835   .215  .310  .355  .665  3.31   .229  .104  .039  .259   .073     .200         .281
% Improvement                                        9%    6%    0%    22%    17%      24%          -5%
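For anyone checking the "% Improvement" row, here is a small Python sketch that recomputes it from the three-decimal rates in the table above. The sign convention is my inference from the numbers, not something stated in the post: improvement is measured from the pitcher's side, so a higher K% counts as better and a lower value counts as better for everything else. The published figures were presumably computed from unrounded rates, so a recomputation from the rounded table values can differ by a point.

```python
# Three-decimal rates copied from the table above.
bases_empty = {"K%": 0.211, "BB%": 0.111, "HRC": 0.039, "BABIP": 0.330,
               "XBH/BIP": 0.088, "1B/(1B+OIP)": 0.265, "XBH/HIP": 0.268}
men_on = {"K%": 0.229, "BB%": 0.104, "HRC": 0.039, "BABIP": 0.259,
          "XBH/BIP": 0.073, "1B/(1B+OIP)": 0.200, "XBH/HIP": 0.281}
# The "% Improvement" row as published.
published = {"K%": 9, "BB%": 6, "HRC": 0, "BABIP": 22,
             "XBH/BIP": 17, "1B/(1B+OIP)": 24, "XBH/HIP": -5}

# ASSUMPTION: only strikeout rate is "higher is better" for the pitcher.
HIGHER_IS_BETTER = {"K%"}


def pct_improvement(stat, before, after):
    """Percent change from `before` to `after`, signed so that positive = better."""
    change = (after - before) / before * 100
    return change if stat in HIGHER_IS_BETTER else -change


recomputed = {
    stat: pct_improvement(stat, bases_empty[stat], men_on[stat])
    for stat in bases_empty
}

for stat, value in recomputed.items():
    print(f"{stat:12s} recomputed {value:6.1f}%   published {published[stat]:3d}%")
```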

The improvement in strike zone command seems modest, but it’s crucial: even without the change in BABIP it would be enough to reduce his component ERA with runners on from 6.02 to 4.58. But that guy would still be lousy; he needs the low BABIP with runners on to succeed. And the improvement in strike zone command with runners on shows that the notion that he nibbles more to avoid harder contact is just wrong; when runners get on he comes after hitters more — and with more success.

It’s not merely that he’s doing it with smoke and mirrors, it’s like the smoke is in front of the mirrors and nowhere else.

One might begin to think that he simply pitches better out of the stretch. But then what do we make of this already legendary split, even with its eenie-weenie-teenie-tiny sample size?

Matsuzaka Tricks Reloaded

Split           PA    BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Other Men On    778   .218  .315  .361  .676  3.40   .226  .105  .039  .263   .074     .204         .281
Bases Full      57    .163  .246  .265  .511  2.20   .263  .088  .028  .200   .057     .152         .286
% Improvement                                        16%   17%   30%   24%    23%      26%          -2%

Or this more obscure and even more puzzling one?

Matsuzaka Tricks Revolutions

Split           PA    BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Leadoff Batter  441   .290  .379  .464  .842  7.10   .200  .111  .044  .347   .088     .285         .253
Other Empty     561   .256  .340  .402  .743  5.27   .219  .111  .035  .316   .089     .249         .281
% Improvement                                        10%   1%    20%   9%     -1%      12%          -11%

Let’s just combine the last two for easy scanning:

Matsuzaka Tricks, The Box Set

Split           PA    BA    OBP   SA    OPS   CR/27  K%    BB%   HRC   BABIP  XBH/BIP  1B/(1B+OIP)  XBH/HIP
Leadoff Batter  441   .290  .379  .464  .842  7.10   .200  .111  .044  .347   .088     .285         .253
Other Empty     561   .256  .340  .402  .743  5.27   .219  .111  .035  .316   .089     .249         .281
Other Men On    778   .218  .315  .361  .676  3.40   .226  .105  .039  .263   .074     .204         .281
Bases Full      57    .163  .246  .265  .511  2.20   .263  .088  .028  .200   .057     .152         .286

Pure science fiction.

I definitely intend to attack this question with pitch/fx data over the winter. Right now, I’m open to any possibility.

Taken from: Sons of Sam Horn

Strikeout Balance- What’s your preference?

September 23, 2009

Let’s Take a Look at Pedro, Circa 2000

September 23, 2009

I have friends who are not baseball fans at all, but big-time math-science geeks, i.e., they’re folks who understand the normal distribution of natural phenomena and can appreciate exceptions. I have entertained them greatly by reading the leader boards from those two years (1999 and 2000) in descending order. Especially 2000, where the 2-5 finishers in most stat categories were tightly clustered.

Let’s play “what’s the next term in this sequence?”!

ERA: 4.17, 4.14, 4.13, 4.12, 4.11, 3.88, 3.79, 3.79, 3.70 . . .

How many people had 1.74?

WHIP (as baserunners per nine innings): 11.52, 11.48, 11.18, 10.79 … 7.22

Opposition OBP: .306, .303, .298, .291 … .213.

Opposition SLG: .392, .384, .374, .371 … .259.

This (as well as 1999, of course) is an essentially superhuman performance. If there had been a league as much better than MLB as MLB is than AAA, and then another league with the same performance differential above that, Pedro would still have been the best pitcher in that league.


The Applause For Jeter is Still Ringing

September 13, 2009

The applause for Jeter is still ringing, as longevity rings true to the clappers. The question this Neanderthal baseball aficionado may be asking is where he fits into the hierarchy of Yankees.

While I was known to sport 8-column accounting paper and index cards to compile (hence the tag of Neanderthal), I am doing this from gut, memory and historical data free-floating through my brain, and a sneak or two into Sam’s library.

1-Ruth, Rank #1 RF All Time, HOF, James rank #1

2-Gehrig, Rank #1 1st Base All Time, HOF, James rank #1

3-DiMaggio, #3 CF All Time, HOF, James rank #5 (placed in front of Mantle due to the fact that Mantle was not the best CF of his era nor the best in his home city, and was often #3 in his city; the undue influence of many old-time fans who always said blah blah about Joltin’ Joe; and the fact that his career was interrupted by WWII)

4-Berra, Rank top 2 catchers All Time, HOF, James rank #1. His job was to move runners along, not stare at close pitches like his Boston rival who cared more for his averages.

5-Mantle, #5 CF, HOF, James rank #3. Loved when the Mick would limp up to the plate after a night out at the Copa and hit a 500-foot HR just to make sure the game was ended and he could get a drink ASAP. But after Doc Gooden, the greatest waste of talent I have ever seen, and ranked behind Berra as the best player on the Yankees many years. The Yankee fans have a proclivity to allow players to waste their talents because they were surrounded by other talent. Example: Gooden and Strawberry were bums later in their careers with the cross-town team but beloved by Yankee fans. Yup, if you beat your wife, had children scattered around the country, drank too much, took drugs and were a Yankee, it was all OK. Of course if you acted like a thug in the Bronx and were not a Yankee, then lock ’em up and throw away the key.

5-Rivera, Rank #1 CL All Time, F-HOF. The best of his era and no MVP award.

6-Ford, I would rank among the best pitchers in the clutch, HOF, James ranked 22 (Sam do not edit out my words), and if you saw Whitey win the opening game of series after series, year after year, you would get it. The bottom line is the great Yankee dynasty of the Stengel years was dominated by two overriding facts: the Yankees had a “major” league farm team in KC after 1955, and the Yankees had great pitching. So to achieve the winning they did they needed pitching, and Whitey was Chairman of the Board. He continued his importance during the next Yankee dynasty, the Houk years.

7-Jeter, Rank Top ???? SS All Time

8-Gomez, Ruffing, Pennock, Reynolds: the Yankees of all winning eras needed a stopper to maintain the level of winning they achieved.

9-Gossage, one of the great RPs of All Time, James rank #37. One of the best of the genre of RP, fun to watch, when baseball was more a blood sport instead of a constant bonding experience between players. HOF

10-Dickey, Must have been great, and he was a catcher. HOF, James ranked 7

11-Lazzeri, Williams, etc.

No Mattingly, No Rodriguez. To be on the Yankees and considered an all time great you had to have won the World Series. This is not the Cubs and we do not have to worry about Ernie Banks.


Defensive Spectrum Be Damned

September 12, 2009

Over the last 10 years or so, the “advanced” statistics that became popular evaluated players against a position specific offensive baseline – VORP, for example. If a shortstop and a second baseman had the exact same batting line, the shortstop would rate higher by that kind of metric, due to the fact that second basemen hit better as a group than shortstops. As such, it’s become exceedingly common to see people write things like “he’s got enough offense to be valuable as a shortstop, but he doesn’t hit enough to play second or third”.

Positions are essentially just a way to arrange players in a manner that produces the most efficient defense possible. You can literally play anyone anywhere – there’s no rule preventing the Nationals from sticking Adam Dunn at shortstop, for instance. They realize, however, that they will field a better team by minimizing the amount of times that Dunn has to move laterally in order to make a play, so they hide him at first base.

From Cristian Guzman and Position Changes

I agree with the heart of this post. However, I believe it misses a couple of crucial points. It does not take into account how a player can get used to his position, or how he can have a skill set that is more attuned to a position’s needs. For instance, a third baseman needs less lateral range than a second baseman, who in turn needs less back-and-forth range than a shortstop, and the defensive spectrum fails to recognize this. UZR is not perfect, and no measure but FieldFX ever will be. However, this is such an imperfect science that I don’t expect we will ever be able to grasp all the nuances of defense.


Streakiness and the “Fog”

September 9, 2009

An aside on the notion of streakiness (part 1B of the series, if you will).

There have been many studies showing that all streakiness in sports is random. There was an exhaustive study a couple of years ago on Retrosheet that found that hot and cold streaks had no predictive power.

The problem is in the assumption that if a hot streak or cold streak is real, it must have significant predictive power.

In baseball, the standard test of the reality of hitting streaks is serial correlation. Is a player’s performance in one game predictive of his performance in the next? The problem with this is that there’s an enormous amount of “noise” added to the signal. A hot hitter will face Sabathia and go 0-4, a cold one will face a AAA callup and/or get two bloop hits. Red Sox and Yankee fans may remember a series in NYC (May 27-29, 2005) where Manny Ramirez, one of baseball’s truly streaky hitters, came in 1 for his last 12 and looking awful and went 7-13, each and every one a cheap single; he then left town and put up a .562 OPS in his next 10 games.

The further problem is that in a serial correlation, the end of each streak and the beginning of the next form a pair of points included in the correlation, when our hypothesis is that they should anti-correlate. That further reduces the strength of the measured correlation.
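The dilution from streak boundaries is easy to see in a simulation. The Python sketch below invents a hitter who alternates between a hot level and a cold level in fixed ten-game streaks, adds game-to-game noise, and then measures the lag-1 correlation twice: once over every consecutive pair of games, and once over only the pairs that fall inside the same streak. All of the numbers (levels, noise, streak length, seed) are made up, and the simulation knows where the streaks start and end by construction, which real data never tells you.

```python
import random
from statistics import mean


def pearson(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    var_x = sum((a - mx) ** 2 for a in xs)
    var_y = sum((b - my) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_games(n_streaks=400, streak_len=10, hot=0.400, cold=0.250,
                   noise=0.150, seed=2009):
    """Return (streak_id, performance) per game for a HYPOTHETICAL streaky hitter."""
    rng = random.Random(seed)
    games = []
    for streak_id in range(n_streaks):
        level = hot if streak_id % 2 == 0 else cold
        for _ in range(streak_len):
            games.append((streak_id, rng.gauss(level, noise)))
    return games


def lag1_correlations(games):
    """Lag-1 correlation over all consecutive pairs, and within streaks only."""
    consecutive = list(zip(games, games[1:]))
    all_pairs = [(a[1], b[1]) for a, b in consecutive]
    # Drop the pairs that straddle the end of one streak and the start of the next.
    within = [(a[1], b[1]) for a, b in consecutive if a[0] == b[0]]
    return pearson(all_pairs), pearson(within)


r_all, r_within = lag1_correlations(simulate_games())
print(f"all consecutive pairs: r = {r_all:.3f}")
print(f"within-streak pairs:   r = {r_within:.3f}")
```

Even with the streaks built in, the noise keeps both correlations modest, and the boundary pairs pull the all-pairs figure down further.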

In fact, if you take Manny’s career with Boston and divide it into apparent hot (actually just normal) and cold streaks and remove the anti-correlated data pairs, you do get a significant or nearly significant serial correlation (of linear weights / PA). And as I have noted elsewhere, player seasons often divide into chunks that chi-square tells us are unlikely to be random.
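The chi-square check on chunks of a season can be sketched as follows. This is a plain test of homogeneity: given successes and failures in each chunk (say, times on base and outs), it asks whether the chunks look like draws from one underlying rate. The function name and the three 200-PA chunks are hypothetical, and the sketch assumes the chunk boundaries were fixed in advance rather than picked after looking at the results.

```python
def chi_square_homogeneity(chunks):
    """chunks: list of (successes, failures) per chunk.

    Returns the chi-square statistic and its degrees of freedom for the
    hypothesis that every chunk shares the same success rate.
    """
    total_s = sum(s for s, _ in chunks)
    total_f = sum(f for _, f in chunks)
    total = total_s + total_f
    stat = 0.0
    for s, f in chunks:
        n = s + f
        expected_s = n * total_s / total
        expected_f = n * total_f / total
        stat += (s - expected_s) ** 2 / expected_s
        stat += (f - expected_f) ** 2 / expected_f
    return stat, len(chunks) - 1


# Standard 5% critical value of the chi-square distribution with 2 d.f.
CRITICAL_5PCT_2DF = 5.991

# HYPOTHETICAL season split into three 200-PA chunks: (times on base, outs).
toy_chunks = [(70, 130), (45, 155), (72, 128)]

stat, dof = chi_square_homogeneity(toy_chunks)
print(f"chi-square = {stat:.2f} on {dof} d.f.")
print("unlikely to be random at the 5% level:", stat > CRITICAL_5PCT_2DF)
```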

Standard statistical tests of streakiness just aren’t up to the task of demonstrating it’s real. That doesn’t mean it isn’t real — a perfect example of what Bill James calls the “fog.”

I’m 100% certain that a study using experienced baseball scouts could prove the existence of streakiness by having them significantly outperform chance in their ability to predict the end of slumps by streaky players like Manny (as Jerry Remy used to do). IOW, they’d say, “OK, today player X fixed his mechanics and should perform better over the next N games than the last N.” And they would be right most of the time.