When It Comes to Hitting Formulas, Take Your Pick

By Kenneth Broder

Los Angeles Herald Examiner

April 10, 1988

Mathematics is supposed to be an exact science. So why is it so hard to find two SABRmetricians (baseball stat freaks) who agree on the best way to measure a baseball hitter's production?

The Society for American Baseball Research is loaded with people, armed with calculators and computer programs, who can "prove" that Tony Gwynn and a half-dozen other guys had better years in '87 than MVP Andre Dawson. And while it's hard to argue with their general proposition that these things can be measured, it's disheartening that these number-crunchers can't agree to the decimal point on the specifics.

Last week, I used a Bill James formula that ranked Andre the Giant eighth in the National League in total offense. As with most of these mathematical souffles, James lightly whips together eight or nine offensive categories (like hits, doubles, triples, homers, etc.) and out pops a single statistic.

James' number, called Runs Created, is like a cross between RBI and Runs Scored (in Andre's case 113.8) and tells you how many runs a player contributed to his team's total. The theory is that even if a base hit with a man on first doesn't drive home a run, the hitter should get credit for contributing to the increased likelihood that the runner will score.
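For readers who want to see the shape of such a formula, here is a minimal sketch of the simplest published form of Runs Created: times on base multiplied by total bases, divided by opportunities. This is an assumption for illustration only. The version used in this column folds in eight or nine categories and will not match this basic form exactly, and the function name and sample inputs below are hypothetical.

```python
def runs_created_basic(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB).

    This is the simplest form of the stat, not the fuller
    multi-category version the column relies on.
    """
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("at_bats + walks must be positive")
    return (hits + walks) * total_bases / opportunities


# Hypothetical stat line, not any real player's season.
example = runs_created_basic(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(example, 1))
```

The multiplication is the point: getting on base and advancing runners compound each other, which is why a hitter gets credit even when his hit doesn't drive anyone home.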

If Runs Created is too abstract for you, tough. The formula works; it predicts with more than 90 percent accuracy how many runs a team should score. You can either trust me on this or flip the newspaper in the trash now and tune in to Stu Nahan of "SportsTalk" for the inside dope on how John Shelby's gritty determination inspires his Dodger teammates to greater heights. (Personally, I'm tired of guys with 32 walks and on-base percentages worse than the immortal Chris Speier's winning MVP awards.)

Unfortunately, Bill James hasn't got a corner on the hitting-formula market. In fact, there's a glut. John Thorn and Pete Palmer call theirs Linear Weights Batting Runs. Paul Johnson's is Estimated Runs. Steve Mann has the Run Productivity Average. Tom Boswell flaunts Total Average in Inside Sports magazine every year. You get the picture. Their methods are often quite different but are part and parcel of the same statistical approach to baseball that ignores much of the "conventional wisdom" about the value of team momentum, clutch hitting, RBI, stolen bases and runs scored.

The results of these valiant statistical crusades may vary (James rates Gwynn ahead of Murphy and Johnson reverses them), but for the most part the good ones have the virtue of excluding Andre Dawson from consideration as MVP.

Anyway, we're only looking for a tool to help determine how good a hitter is, not the perfect formula. But the least you can do before entering the Hot Stove League wars is arm yourself with the most powerful ammunition around. And that means valid statistics. Remember, fans once argued about the value of hitters without having the benefit of batting averages or ERAs. It's time for the next step.

A few years back I read a stupefying mathematical explanation of how Bill James' Runs Created formula is flawed, something about the good hitters being rated too high and the poor ones too low. But the critic's substitute formula, called Runs Produced, used stats like "double plays hit into" that the average fan doesn't want to bother with. With a little fudging, though, the formula seems to work. This is how it compares to Runs Created when applied to the Dodgers and Angels who started on Opening Day.

| Angels | Runs Created | Runs Produced | Dodgers | Runs Created | Runs Produced |
|---|---|---|---|---|---|
| McLemore | 42.8 | 44.8 | Sax | 77.0 | 75.8 |
| Ray | 75.6 | 72.6 | Griffin | 51.5 | 52.5 |
| C. Davis | 75.1 | 79.1 | Gibson | 90.2 | 89.9 |
| Joyner | 109.9 | 106.2 | Guerrero | 120.4 | 113.0 |
| Downing | 105.1 | 107.2 | Marshall | 57.0 | 55.9 |
| White | 86.9 | 68.4 | Shelby | 71.9 | 72.0 |
| Howell | 67.6 | 69.6 | M. Davis | 74.7 | 75.5 |
| Boone | 36.1 | 36.2 | Scioscia | 57.5 | 58.8 |
| Schofield | 54.4 | 53.4 | | | |

Runs Produced (RP) is one of only two stats I generally use when evaluating a player. The other one is just as crucial because it puts the run production in a context of outs. If Andre and Jack Clark produce virtually the same number of runs, but Clark bats 90 fewer times, as he did in '87, who do you want on your team? When Clark wasn't playing, some other guy was producing more runs and adding to the production of Clark's spot in the lineup.

Andre was just supplying outs; 144 more outs to be exact. Hitting must be looked at in the context of outs. Here's a formula I've seen used in a few places that puts Runs Produced into perspective.

**Runs Produced Per Game = RP x 26 / (At Bats - Hits + CS)**

It changes Runs Produced into a statistic that puts a hitter's production into a 26-out game context. (Twenty-six, instead of 27, is used for technical reasons.) For example, if you had nine Andre Dawsons in your lineup, your team would theoretically score 6.4 runs per game. But a few players had better seasons than that. (RP/G is Runs Produced Per Game.)
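The arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the denominator is taken to be the hitter's outs, counted as at-bats minus hits plus times caught stealing (CS), and the function name and sample numbers are hypothetical rather than taken from any player in the tables.

```python
def runs_produced_per_game(runs_produced, at_bats, hits, caught_stealing,
                           outs_per_game=26):
    """Scale a season's Runs Produced to a single 26-out game.

    Assumes outs = at-bats - hits + caught stealing.
    """
    outs = at_bats - hits + caught_stealing
    if outs <= 0:
        raise ValueError("the hitter must have made at least one out")
    return runs_produced * outs_per_game / outs


# Hypothetical season: 100 Runs Produced on 360 outs (500 - 150 + 10).
rate = runs_produced_per_game(100.0, at_bats=500, hits=150, caught_stealing=10)
print(round(rate, 1))
```

Two hitters with the same Runs Produced come out very differently here if one of them used up far more outs to get there, which is the whole argument for looking at the rate and not just the total.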

THE TOP 20

| American League | Runs Produced | RP/G | National League | Runs Produced | RP/G |
|---|---|---|---|---|---|
| Boggs | 135.2 | 9.9 | J. Clark | 113.9 | 9.9 |
| Molitor | 109.6 | 9.2 | Daniels | 91.2 | 9.4 |
| Dw. Evans | 122.5 | 8.3 | Gwynn | 129.3 | 8.8 |
| Trammell | 122.5 | 8.1 | E. Davis | 114.0 | 8.7 |
| Henderson | 80.4 | 8.0 | Raines | 119.2 | 8.6 |
| McGwire | 120.3 | 7.9 | Murphy | 131.8 | 8.5 |
| Greenwell | 81.6 | 7.6 | Strawberry | 122.1 | 8.1 |
| Hrbek | 99.4 | 7.5 | Guerrero | 113.0 | 8.0 |
| Tartabull | 117.0 | 7.5 | Schmidt | 107.1 | 7.5 |
| Mattingly | 110.5 | 7.4 | Kruk | 87.7 | 7.2 |

And for the sadists out there:

THE BOTTOM 20

| American League | Runs Produced | RP/G | National League | Runs Produced | RP/G |
|---|---|---|---|---|---|
| Pettis | 35.6 | 2.9 | Templeton | 43.4 | 2.6 |
| Boone | 36.2 | 3.2 | Pena | 34.0 | 2.9 |
| Moses | 39.7 | 3.3 | Dunston | 32.8 | 3.2 |
| McLemore | 44.5 | 3.4 | Santana | 44.2 | 3.5 |
| Lombardozzi | 44.5 | 3.5 | Jefferson | 46.7 | 3.6 |
| Griffin | 52.5 | 3.6 | C. Reynolds | 40.1 | 3.7 |
| Hill | 43.6 | 3.6 | Candaele | 50.1 | 3.9 |
| Guillen | 59.5 | 3.8 | Carter | 60.0 | 3.9 |
| Brookens | 49.2 | 3.8 | Larkin | 51.4 | 4.0 |
| Schofield | 53.4 | 3.8 | Stillwell | 45.6 | 4.0 |

Dishonorable mention goes to Wayne Tolleson, cut this week by the Yankees, who for want of a single at-bat failed to lead the AL Bottom 10 with a 2.6 RP/G.

RP describes a whole season's production; RP/G gives you an indication of what potential is there. There is a way of combining these two stats into one highly descriptive figure, but somehow I don't think now would be a good time to trot out the dreaded, long-forgotten Pythagorean Theorem.

Of course, hitting isn't everything in baseball. Pitching is 44 percent of the game. At least Pete Palmer and John Thorn think so. Next week, we'll look at starting pitchers and that great mythological figure, Nolan Ryan.