Sunday, July 05, 2009

Updated Wins Above Replacement

I've made a few changes:

1. Add a column for reaching on errors for batters.

2. Cap the replacement column, which is based on plate appearances, at 4 PA per game. The batter will be evaluated as if he has the lesser of 4 PA per game or his actual plate appearances. This eliminates the leadoff bonus, where leadoff hitters may have added 1-2 runs per year to their rating just because they bat at the top of the lineup and get more at bats. One reason for this is that in evaluating runs over replacement, you have to assume that the replacement player will bat not at the top of the order but at the bottom, with the other hitters moving up.

3. Pitchers' hitting and pitching records are all on the same page.

4. Pitchers have a new set of columns, showing how far above/below average they were in several independent categories. This is meant not as a value measure, but more as a descriptor. Not "How great a pitcher was he?", but "What kind of pitcher was he?" This shows that practically all of Randy Johnson's value came from his strikeouts; he was essentially an average pitcher if he didn't whiff you. Roger Clemens's value lies more in a mix of strikeouts and home run prevention. Tom Glavine, on the other hand, was below average in strikeouts but excelled in keeping the ball in the park and stranding runners.

5. Pitchers' hitting records include a position adjustment, so a pitcher's WAR Total shows how valuable he was relative to the average pitcher. (For pitchers' hitting, average and replacement level are the exact same thing.)

6. And finally, the numbers go all the way back to 1871 for hitters and 1876 for pitchers. Some of the estimates used to fill these stats in, like the baserunning regression formula or the JAARF fielding numbers, are not to be trusted as anything more than a reasonable guess. Catcher defensive ratings are based on passed balls and errors only. It is not worth it to even try to estimate performance against the running game by catcher assists. Mike Piazza had about as many assists per game as Johnny Bench. Enough said.

7. The 300 lists have not been updated. When I do, it will probably be a 500 list since so many more players are added.
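The cap in item 2 comes down to a single min(). Here is a minimal sketch, assuming the cap is applied to season totals; the function name and the sample players are made up for illustration, and how the capped PA then scales into replacement runs is not shown.

```python
def capped_pa(pa: int, games: int, cap_per_game: int = 4) -> int:
    """PA credited toward the replacement column: the lesser of the
    batter's actual PA and cap_per_game PA for each game played."""
    return min(pa, cap_per_game * games)


# Hypothetical leadoff hitter: 740 PA in 160 games, credited with 4 * 160.
print(capped_pa(740, 160))  # 640

# Hypothetical part-timer: 300 PA in 100 games, under the cap, so unchanged.
print(capped_pa(300, 100))  # 300
```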

Friday, July 03, 2009

Highest Leverage Index of All Time

...Or at least since 1954, the years Retrosheet has play-by-play files for.

I sorted my career pitcher log by career leverage index, which measures the volatility of a game situation (1 is average; 9th inning with a 1 run lead is high; 7th inning with a 12 run lead is low). Since there are some pitchers who came up for a cup of coffee, might have for some emergency found themselves in a crucial situation, and never pitched again, I was expecting to see a lot of 2-5 inning guys before I got to the real careers.

In reality, there are only a few of them. Bruce Sutter ranks 8th with a 2.0 leverage index, behind 7 guys who pitched no more than 3 innings each. A few pitchers come in at 1.9, K-Rod, Percival, John Franco, and Trevor Hoffman among them. Mariano Rivera is at 1.8.

Only one pitcher has a leverage index of 3.0, hitting that on the nose. Surprise is, it wasn't even a real pitcher, but catcher Brent Mayne for one inning on August 22, 2000. The game between the Rockies and Braves went 12 innings, and the Rockies had already used 9 pitchers before handing the ball to Mayne. I don't remember the whole story, maybe the last guy got hurt and Mayne had to pitch since there was nobody left. The 6 pitchers before him didn't work very hard, throwing between 3 and 12 pitches each.

Anyway, Mayne faced Tom Glavine and got a ground out. Walt Weiss flew out to center. This pitching stuff ain't that hard, is it? A single, wild pitch, and walk later, Mayne found himself staring down the reigning NL MVP, Larry Wayne Jones, who had already homered in the game. No big deal, Jones grounded out to third, the Rockies won the game in the bottom of the inning, and Mayne's legend as the highest-leverage pitcher of all time was established.

Tuesday, June 30, 2009

Updated Hitter Projections

The player pages haven't changed, but I did add a hitter update in an Excel spreadsheet on the right hand side of Baseballprojection.com.

Some things that are not yet in the new program: Runs, RBI, SB, CS, and any estimate of playing time. For every player I just project 350 plate appearances, which is about what they'll get if they play every day from here on out.

The projections include all the pre-2009 minor league data that went into last winter's projections, but no 2009 minor league data. Exceptions are Brandon Wood, Sean Rodriguez, Matt Wieters and Jake Fox, who I entered by hand.

Wednesday, June 24, 2009

David Wright

I've been reading this thread on David Wright. It made me wonder what the proper weights are for projecting a player, varied by the component. I looked at strikeout rate, walk rate, homer rate, and BA on balls in play. The hitters I used are guys who had 400+ AB in 4 straight years from 1982 to 2005. I'm using the first 3 years to try to project year 4. The weights are year-1, year-2, year-3, and LG average.

For K, the weights I get are 7/3/2/1, which yields a new .235 K rate for Wright.

BB: 9/5/4/1. Not a whole lot of regression needed when you have 3 full years of these players. Wright comes in at .133 per PA, the only part of his game where he's playing at his normal level.

HR: 11/7/5/2. For David, a .043 rate per contact (AB-K) which means 12 more homers, and a projected season total of only 16.

BABIP: 10/7/6/10. Here's where regression plays a big role, but still gives a rest of season figure of .368. We can do a better job by considering batted ball data and player speed, but that's it for tonight.

I didn't do extra-base hits, but assume 20 2B and 3 3B, put the pieces together, and I get a rest of season line of 303/395/475. That does surprise me a bit; I didn't think his power projection would drop so much. But he's still a great player, even if that's all he does from here on out.
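Each set of weights above works as a weighted average of three seasons plus the league rate. Here is a minimal sketch, assuming year-1 is the most recent season; the function names and the sample K rates are hypothetical, not Wright's actual numbers.

```python
# Weights from the post, in order: year-1, year-2, year-3, league average.
WEIGHTS = {
    "K": (7, 3, 2, 1),
    "BB": (9, 5, 4, 1),
    "HR": (11, 7, 5, 2),
    "BABIP": (10, 7, 6, 10),
}


def project_rate(y1, y2, y3, league, weights):
    """Weighted average of three seasons and the league rate."""
    w1, w2, w3, wl = weights
    return (w1 * y1 + w2 * y2 + w3 * y3 + wl * league) / (w1 + w2 + w3 + wl)


def regression_share(weights):
    """Fraction of the projection that comes from the league average."""
    return weights[3] / sum(weights)


# Hypothetical K rates for some hitter (made-up inputs).
print(round(project_rate(0.200, 0.180, 0.170, 0.190, WEIGHTS["K"]), 3))

# How hard each component is pulled toward the league average.
for stat, w in WEIGHTS.items():
    print(stat, round(regression_share(w), 3))
```

Read this way, the league average gets 1 part in 13 for strikeouts but 10 parts in 33 for BABIP, which is the heavy regression the BABIP paragraph describes.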

Monday, June 22, 2009

Outfield Ratings for pre-Retrosheet

A few weeks ago I asked people to rate, on a scale of 1-5, the defensive play of some outfielders who played from 1900 to 1950. Only 3 people answered the call, but I'll take what I can get (thank you to those who helped), and compare them to my system, JAARF (Just Another Adjusted Range Factor).

I'll score each of my results as a hit if it reasonably matches the subjective rating, a foul if it isn't horrible, and a whiff if it's too far off. There were a few misses, but overall I'm happy with the results given the crude data I worked with. Even with today's advanced defensive stats, there are still a few whiffs where either the numbers or the impressions of observers are way off. Take Torii Hunter. His UZR is a bit below average though he regularly amazes Angel fans, including this blogger. I'm not going to go on an anti-UZR (or anti-TotalZone) diatribe, but I'm still quite content to have Torii in CF every day.

Anyway, the ratings. First is a number, 1 to 5, average reader response. 5 is better, 3 is average. Second is career JAARF runs at the position.

The hits:
Averill, cf, 2.3, -59
Carey, cf, 4, +103
Cobb, cf, 3.3, -15
Dom DiMaggio, cf, 5, 74
Goslin, lf, 3.7, 66
Heilmann, rf, 2, -70
Hooper, rf, 5, 150
Joe DiMaggio, cf, 5, 81
C Klein, rf, 2, -57
L Waner, cf, 4, 33
D Lewis, lf, 4.5, 48
Speaker, cf, 5, 182
P Waner, rf, 4, 55
Cy Williams, cf, 2.5, -16
T Williams, lf, 2, +3

I counted Williams as a hit since this stat only shows him through 1954, after that TotalZone takes over and rates him below average for his remaining years. It looks like a normal career progression, and Ted was an OK fielder when he was younger, and 20 years too early for the DH role after that. I'm proud of having Speaker with the big rating.

The fouls: Not good ratings, but not clearly wrong:

Crawford, rf, 3.7, -48
Bob Johnson, lf, 2.7, 46
Kiner, lf, 1, -16
Manush, lf, 2, 28
S Rice, rf, 3.7, 91
Simmons, lf, 3, 105
Slaughter, lf, 3, 40
Z Wheat, lf, 3.3, 109
Cramer, cf, 2.7, -70

Perhaps these can be explained partially by position. Most of these are corner outfielders whom readers rate about average but who have good numbers. They weren't especially good, thus they were stuck in corners, but might have been better than some of the oafs out there. Cramer is the opposite in center: he was a bit below average, but compared to much better outfielders his run rating is very low.

The ones who don't fit the pattern are Crawford, who probably should rate better, and Kiner, who should rate worse.

The whiffs:

Medwick, lf, 2.5, +97
Nicholson, rf, 1, +26
Mel Ott, rf, 2.5, +87
Roush, cf, 4, -8
Ruth, rf, 2.3, +103
Veach, lf, 2.7, 72

Mostly bad fielders who rate well by some flaw. I don't think Ott was a bad fielder, but I tried to keep my own opinion out of it. Roush was considered a great defensive centerfielder, perhaps the best in the NL of his day. Ruth? I've got some thoughts on this but with his high defensive ratings (justified or not) he and Bonds have very similar profiles on the WAR charts. Except for pitching, of course.

So overall, the ratings are probably right 50% of the time, wrong 20% of the time, and inconclusive 30%. I'm only doing them because it has become an obsession, to rate every damn player who ever played the game.
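Those percentages follow from counting the three lists above: 15 hits, 9 fouls and 6 whiffs among 30 outfielders. A quick check, with variable names of my own choosing:

```python
# Number of players in each list above.
counts = {"hit": 15, "foul": 9, "whiff": 6}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0%}")
# hit: 50%
# foul: 30%
# whiff: 20%
```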

Saturday, June 20, 2009

Praise for a few announcers

The normal thing for bloggers to do is wait for announcers to say something stupid, then write a post explaining exactly how they are stupid. Whole websites have been built around this concept. Sometimes, though, credit is due.

Today's game of the week features Tampa Bay and the Mets. David Wright is currently hitting .350 but is on pace for 170 strikeouts and only 10 homers. It is certainly a weird season. Ken Rosenthal brought up these numbers and mentioned Wright's .480 BABIP. Even Tim McCarver seemed to grasp regression to the mean, as he (rightly) doesn't think Wright will continue striking out at a 170K pace. It was a discussion you'd be more likely to read at a site like the Hardball Times.

Wright is having an odd mixture of good and bad luck. There's some bad luck in that he's not making contact with balls he normally hits. Some bad luck in that balls he normally hits out of the park are staying in. And quite a bit of good luck in that balls he does contact are dropping in for hits. The thing about Wright is that he's not the best at any aspect of the game, but is well above average at just about everything. It's weird to see him with an extreme BABIP, and an extreme whiff rate.

I would think Wright will almost certainly hit more like his .300, 30 hr, 115K self from here on than the statistical freak he's been so far. He's having as good a season as he normally does, but it is difficult to see how one could consciously trade off one's abilities to get to where he is now.

Want a higher BA at the expense of homers? This can probably be done by shortening your swing, going with the pitch, etc., but that approach should decrease your strikeout rate, not increase it.

If you consciously swing harder, and accept more strikeouts as the cost, I can see where such an approach could increase your BABIP and your K rate at the same time. But that approach should not kill your power, should it?

I'll add a little batted ball data from Fangraphs. Wright is hitting slightly more line drives (25% to 23% career) at the expense of ground balls. Not a big enough increase to explain the huge BABIP increase. He's hit 3 more line drives than his career averages would expect, yet has 20 more hits in play. His flyball percentage is unchanged, so it's hard to see why he's stopped hitting homers.

Just a weird season for Wright, but I expect him to display his normal profile of skills going forward.

Tuesday, June 16, 2009

Fact Check for Rob Dibble

I generally try to ignore most of the blabbering from the announcers' booth during games. But sometimes they latch onto a point and just keep hammering at it, getting on my nerves. So I have to put my 2 cents in through the small forum I have into the baseball world, this blog.

Today it's Rob Dibble, talking about the greatness of Derek Jeter's defense. He's telling us that Jeter didn't win a gold glove until 2004 because Omar Vizquel was in the league.

Fact: Omar Vizquel won AL gold gloves every year from 1993 to 2001.

Fact: When Omar's streak was broken, the new gold glove shortstop was a guy who currently plays on the left side of the infield for the Yankees.

Fact: It was not Derek Jeter.

A-Rod broke the Vizquel streak, and had Vizquel declined/gotten hurt/been traded to the National League earlier, it is likely that A-Rod would have had a gold glove earlier, not Jeter. Jeter did break A-Rod's streak, of course, with some help from his manager and team, who decided to take A-Rod out of the competition.