The recent controversy concerning Roger Clemens and his alleged use of steroids and/or human growth hormone has led to two statistical arguments. One was set forth by statisticians hired by Clemens’ defense team. The other, by statisticians from the Wharton School, attempted to poke holes in the Clemens team’s argument.
The argument produced by Clemens’ team compares Clemens’ performance in his later years with that of Nolan Ryan, Curt Schilling, and Randy Johnson across many pitching statistics. The comparison showed that Clemens’ performance was not much different from that of the other three great pitchers. The team concluded that, compared with other great pitchers, Clemens’ performance is not atypical.
The Wharton team criticized this argument, claiming the study was invalid because of selection bias. The Wharton team instead chose pitchers based on the following criteria:
(1) Starting date of 1968
(2) Pitched at least 15 years, each year of which they recorded at least 10 wins
(3) Pitched at least 3000 innings.
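The three criteria above can be sketched as a simple filter. This is only an illustration, not the Wharton team's actual procedure: the `PitcherCareer` record, the `meets_criteria` function, and the four sample pitchers are all hypothetical, and the numbers are made up. It also rests on two reading assumptions: that "starting date of 1968" means a debut in 1968 or later, and that criterion (2) means at least 15 seasons with 10 or more wins.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PitcherCareer:
    """Hypothetical career record; field names are illustrative only."""
    name: str
    debut_year: int
    wins_by_season: List[int]   # wins recorded in each season pitched
    innings_pitched: float      # career total


def meets_criteria(p: PitcherCareer,
                   min_debut: int = 1968,
                   min_seasons: int = 15,
                   min_wins: int = 10,
                   min_innings: float = 3000.0) -> bool:
    # Assumption: criterion (1) is read as "debuted in 1968 or later".
    started_late_enough = p.debut_year >= min_debut
    # Assumption: criterion (2) is read as "at least 15 seasons with 10+ wins".
    qualifying_seasons = sum(1 for w in p.wins_by_season if w >= min_wins)
    enough_seasons = qualifying_seasons >= min_seasons
    # Criterion (3): at least 3000 career innings.
    enough_innings = p.innings_pitched >= min_innings
    return started_late_enough and enough_seasons and enough_innings


# Made-up records, chosen so that each one fails a different criterion.
pitchers = [
    PitcherCareer("Pitcher A", 1970, [12] * 16, 3400.0),            # passes all
    PitcherCareer("Pitcher B", 1965, [15] * 18, 4000.0),            # debut too early
    PitcherCareer("Pitcher C", 1980, [12] * 10 + [8] * 6, 3100.0),  # too few 10-win seasons
    PitcherCareer("Pitcher D", 1975, [11] * 15, 2900.0),            # too few innings
]

selected = [p.name for p in pitchers if meets_criteria(p)]
print(selected)
```

Keeping the thresholds as parameters makes it easy to see how sensitive the resulting sample is to each choice, which is the heart of the selection-bias concern.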
They found 31 pitchers who met these criteria and then compared Clemens’ performance in his later years with that of these 31 pitchers. In this analysis, Clemens’ performance was an outlier.
This leads to the following questions concerning Barry Bonds’ performance as a batter between the ages of 35 and 40:
(1) Can we produce the same two arguments for Bonds as those discussed above for Clemens?
(2) What starting date should be imposed for use in (1) above?
(3) Can we then pick criteria, in a way similar to what was done by the Wharton people, to produce a much larger sample of batters? What starting year should be used?
Once the criteria for the batters are established, I will select the players who match them and then run an analysis to see whether Bonds’ batting performance is atypical of this group.
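One simple way to check whether a player is atypical of a comparison group is a z-score: how many standard deviations the player's value lies from the group mean. The sketch below is an assumption about how such a check could look, not the method used in either Clemens study or the one I will necessarily use for Bonds. The function names, the threshold of 2, and the sample values are all hypothetical placeholders, not real batting statistics.

```python
from statistics import mean, stdev
from typing import Sequence


def z_score(value: float, group_values: Sequence[float]) -> float:
    """Distance of `value` from the group mean, in sample standard deviations."""
    return (value - mean(group_values)) / stdev(group_values)


def is_atypical(value: float,
                group_values: Sequence[float],
                threshold: float = 2.0) -> bool:
    # Assumption: "atypical" means more than `threshold` standard deviations
    # from the comparison group's mean, in either direction.
    return abs(z_score(value, group_values)) > threshold


# Placeholder values for some batting metric over ages 35 to 40 (made up).
comparison_group = [0.80, 0.85, 0.90, 0.95, 1.00]
player_value = 1.20

print(round(z_score(player_value, comparison_group), 2))
print(is_atypical(player_value, comparison_group))
```

A single-number comparison like this ignores how performance changes with age, so a fuller analysis would probably compare trajectories across seasons rather than one summary value.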