I collected the data for every rush by Shaun Alexander in 2002: approximately 300 rushes for 1200 yards, a nice average of about 4.0. I should have ignored touchdown runs, since their length is cut short by the end zone, but I don't think this causes a big problem in this analysis. I then counted how many times he rushed for each yardage.

I then created his SP card based on the Avalon Hill rules and worked out how many times each yardage came up. I did the same with a simpler card formula that just puts -1 at Run 12 and goes up by 1 for each run number from there. I strongly suspect that the data for most starting running backs is very similar in the range -1 through 4 yards; I will analyse this later. It would probably not be a bad idea to have just 2 columns for ALL running backs: one for those with a 3.0 average and one for those with a 4.0 average. I am convinced the difference in yards-per-carry average over a season comes from the big gains of 10-20 and 20+ yards.


I plotted a curve showing the spread of rushes in real life and added curves for unadjusted Statis Pro (meaning no blocks or tackles), plus 2 cases where Seattle (best blockers +3,+3 by the Avalon Hill formula) rushed against NFL-average defenders [-2,-2] and against the best pair [-4,-4], ignoring any -5 tacklers!


What do we see from these curves? Looking at each region, we can clearly see:

* Losses of yardage are much rarer in real life than in Statis Pro, even though Alexander DID, to my surprise, have a -7 rush and a couple of -6s and -5s. If you look at the tails of all the curves to the left of the y-axis, you will see that all the Statis Pro simulations have far too many negative plays: 30 against the average defense, or 53 against the best defense. In reality only 15 of his 300 rushes lost more than 1 yard, so SP is giving 2 to 4 times too many of these. The unadjusted SP results are of course limited to -1 yd, and this actually looks OK.

* Looking at the medium-range carries, all the SP results give considerably more runs in the 5-10 yard range. This is consistent with my theory that the averages work out overall because SP has too many big negative plays and too many high-yardage carries. These more or less balance in the long run, but they do not reflect the real distribution of rushing yards, and this does affect your drives in SP.

* Breakaways worked really well for Alexander. Reality had him with 10 rushes for 324 yards; SP had about 10 for 400. This also shows that these long rushes are what give a back his good average: without them Alexander had only about 900 yds on 290 carries (avg 3.1), but those 10 big gains pushed his season average up to 4.0! I expect a starting back with a 3.5 avg to have the same stats as Alexander apart from those extra long runs, and a 3.0 rusher to have perhaps slightly worse stats (but not very different) and, again, far fewer big gains. I know, for example, that Stephen Davis of the Redskins has an average somewhere between 3.0 and 3.5 and very few big gains. I fully expect Davis's stats in the range -1 to 4 to be almost the same as Alexander's.

* The mid range is where it all goes wrong. In reality the bulk of rushes are between -1 and 4 or 5 yards, with 50% of the season's rushes going for 0, 1, 2 or 3. Because SP produces so many results of 6-10 yards and of worse than -2 yards, the mid range is much less common in SP than it should be, even if you ignore blocking and tackling.
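
To make the breakaway point concrete, here is a small Python sketch of the arithmetic. It uses the real-life figures from the season summary further down this post (292 normal carries for 885 yards, plus 10 big runs for 324 yards); the function and variable names are my own.

```python
def average(carries, yards):
    """Yards per carry."""
    return yards / carries

# Real-life figures quoted in the season summary of this post
normal_carries, normal_yards = 292, 885
big_carries, big_yards = 10, 324

# Average on the everyday carries only
everyday = average(normal_carries, normal_yards)

# Average once the 10 big gains are added back in
season = average(normal_carries + big_carries, normal_yards + big_yards)

print(f"without big runs: {everyday:.2f}")
print(f"whole season:     {season:.2f}")
```

Ten carries out of roughly 300 are enough to lift the average by about a full yard, which is why the rest of this post treats the long runs separately.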


Alexander's card gives him -1 on 12, then 0 on 11 and 10, and then 1, 2, 3, 5, 6, 7, 8, 9. A 4 does not even appear on his card, and that is his average. If you tweak his card to run -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 you get the orange curve, which has a slightly better mid range and better losses (although no losses worse than 1 yard, which isn't right either) but still too many shortish gains.

Because Short Gains occur with Run #1 and give 10+ yards on his card, they are not the main problem, although they are still a bit of one. The main problem areas are the losses of yards and the runs between 5 and 10 yards. Because blocks and tackles are more or less equally distributed (slightly more tackles), whether you take rushing results with NO blocking/tackling, or assume an average or a premium defense, you just get too much action in the range 5-10 yards.


Conclusion

- Statis Pro gives too many results with losses of 2 yards or worse when you include normal blocking/tackling rules

- Statis Pro gives too many results in the ranges 5-10 and 10-15 yards; this somewhat balances the excessive losses, so overall stats usually seem to work

- Statis Pro suffers from a major lack of mid-range results, especially in the range 0-3, which accounts for at least half of the typical results for this RB (and, I expect, for all RBs)

- Statis Pro handles breakaways fairly well


Possible Suggestion

If you did want to try to implement a more realistic system, you could either ignore all block/tackle settings except "Break", or continue to use blocking and tackling assignments. If you keep them, I would say you need to reduce them. An easy way would be to treat all "versus" results as a sum (so +3 vs -4 would be a -1 adjustment) and all double results as "take the best": for LG+LT, use just the block of the better of the two; for E+J, use the best tackle value. This would cut down the +6 to +8 and -6 to -8 adjustments, which should cut out some of the problem areas.
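
As a rough illustration of that reduced rule, here is a short Python sketch. The function name and the list-based inputs are my own invention, and I am assuming that block values are positive, tackle values are negative, and the "best" tackle is the most negative one.

```python
def net_adjustment(blocks, tackles):
    """Combine block and tackle ratings under the reduced rule.

    blocks  -- ratings of the blockers named on the play, e.g. [3, 2] for LG+LT
    tackles -- ratings of the defenders named on the play, e.g. [-4, -2]

    Assumptions (mine): blocks are positive, tackles are negative, and
    the "best" tackle is the most negative value.
    """
    best_block = max(blocks) if blocks else 0      # double block: take the best
    best_tackle = min(tackles) if tackles else 0   # double tackle: take the best
    return best_block + best_tackle                # "versus": sum the two sides

print(net_adjustment([3], [-4]))     # +3 vs -4
print(net_adjustment([3, 3], []))    # double block, no tackler
print(net_adjustment([], [-4, -4]))  # double tackle, no blocker
```

Under the normal rules the two double results would be +6 and -8; here they come out as +3 and -4, which is the whole point of the change.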

The other (more in-depth) adjustment would be to first work out the rusher's Long column in the following way:

Run 1 = if the rusher's longest run is a TD of 50+ yards, put "TD"; otherwise use the longest run
Run 2 = if Run 1 is TD, use the longest run here; otherwise interpolate this value
Run 12 = 15

Runs 2-11 (or 3-11 if Run 1 is TD) = interpolate in equal steps between the longest run and the 15 at Run 12.


Add up the values in the Long column, counting a TD result as the longest gain plus whatever the step is between run numbers on his Long column. This gives you 12 "breakaways". Now subtract these 12 carries and their total yardage from the rusher's season carries and yards to get his "everyday average".
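
The Long column and the everyday average can be sketched in Python as below. This is only my reading of the steps: the function names are made up, fractional steps are truncated (an assumption, chosen because it reproduces the Alexander Long column shown in this post), and the season totals of 302 carries for 1209 yards are the real-life normal carries plus big runs from the season summary.

```python
def long_column(longest, td=False, floor=15):
    """Long column for Runs 1-12.

    Run 12 is always 15. Without a TD, Run 1 is the longest gain; with a
    TD, Run 1 is "TD" and Run 2 is the longest gain. The runs in between
    are interpolated in equal steps, truncated to whole yards (assumption).
    """
    n = 10 if td else 11  # number of steps from the longest gain down to 15
    values = [(longest * (n - i) + floor * i) // n for i in range(n + 1)]
    return ["TD"] + values if td else values


def everyday_average(carries, yards, column):
    """Season average after removing the 12 Long-column 'breakaways'."""
    numeric = [v for v in column if v != "TD"]
    total = sum(numeric)
    if column[0] == "TD":
        # Count the TD as the longest gain plus one interpolation step
        total += numeric[0] + (numeric[0] - numeric[1])
    return (yards - total) / (carries - 12)


alexander_long = long_column(58)
print(alexander_long)
print(round(everyday_average(302, 1209, alexander_long), 2))
```

For Alexander this gives an everyday average a little under 2.7, which rounds to 3.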


Now using the actual FAC deck, the run numbers for all backs would look like this:

1 - use SG column
2 - everyday avg +3
3 - everyday avg +2
4 - everyday avg +1
5 - everyday avg
6 - everyday avg
7 - everyday avg -1
8 - everyday avg -1
9 - everyday avg -2
10 - everyday avg -3
11 - everyday avg -4
12 - everyday avg -5

For the Short Gain column, start at everyday avg +4 on Run 12 and add 1 for every two run numbers as you go up the card, so Runs 1 and 2 get everyday avg +9.

For Alexander this would give
N: SG/6/5/4/3/3/2/2/1/0/-1/-2.
S: 12/12/11/11/10/10/9/9/8/8/7/7
L: 58/54/50/46/42/38/34/30/26/22/18/15
I suggest this would give better results, as the second chart shows.
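
The N and S columns above follow mechanically from the rounded everyday average. Here is a minimal Python sketch, with function names of my own choosing and `x` standing for that rounded average (3 for Alexander):

```python
# Offsets from the everyday average for Runs 2-12 of the Normal column
NORMAL_OFFSETS = [3, 2, 1, 0, 0, -1, -1, -2, -3, -4, -5]


def normal_column(x):
    """Normal column for Runs 1-12; Run 1 defers to the Short Gain column."""
    return ["SG"] + [x + offset for offset in NORMAL_OFFSETS]


def short_gain_column(x):
    """Short Gain column: x+9 on Runs 1-2, dropping 1 every two runs to x+4."""
    return [x + 9 - (run - 1) // 2 for run in range(1, 13)]


print("N:", "/".join(str(v) for v in normal_column(3)))
print("S:", "/".join(str(v) for v in short_gain_column(3)))
```

A back whose everyday average rounds to 4 would simply have every number on both columns shifted up by one.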



If you wanted to use your own cards or a random system, just create a "Normal" distribution, i.e. a simple Gaussian or bell curve. Put his everyday avg at the peak, normalise the curve, and round each rushing amount to the nearest integer.
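
For anyone going the random-system route, here is one way that bell curve might look in Python. The spread (`sigma`) and the yardage limits are pure guesses on my part, since I have not fitted them to the real data; the -7 lower limit is just Alexander's worst rush.

```python
import math
import random


def rushing_distribution(everyday_avg, sigma=2.5, lo=-7, hi=12):
    """Discrete bell curve over whole-yard gains, peaked at everyday_avg.

    sigma, lo and hi are assumptions, not values fitted to real rushes.
    Returns a dict mapping yards gained to probability (summing to 1).
    """
    weights = {
        yards: math.exp(-((yards - everyday_avg) ** 2) / (2 * sigma ** 2))
        for yards in range(lo, hi + 1)
    }
    total = sum(weights.values())
    return {yards: w / total for yards, w in weights.items()}


dist = rushing_distribution(3)

# Simulate one season of 292 everyday carries
random.seed(2002)
season = random.choices(list(dist), weights=list(dist.values()), k=292)
print(sum(season), "yards on", len(season), "carries")
```

Because the curve is cut off at the yardage limits, the simulated average will not land exactly on the everyday average, so the peak or the limits may need a small nudge.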



Season Long simulation

Real life: 292-885, plus 10-324 on big runs (carries-yards)
Statis Pro:
unadjusted: 292-1205, plus 10-400 on big runs
vs avg defense: 292-1209, plus 10-400
vs best defense: 292-941, plus 10-400 (closest match statswise)

My adjusted Statis Pro:
unadjusted: 292-795, plus 10-400 (my best match)
single block only: 292-782, plus 10-400
normal block/tackle: 292-726, plus 10-400
(both cases use +3,-3 for off and def boxes)


Stats wise comparison
Real life: 292-885, plus 10-324
SP best: 292-941, plus 10-400, if all 16 games are against a strong rush defense
My best: 292-795, plus 10-400; about 5 yds per game short on normal carries, but look at the distribution of yardages in chart 2!





---------------
OVERALL RESULTS
---------------

You get a much better rushing simulation with the following rules:


1. Create Long Gain column for rusher
Run #12 = 15 yds
Run #1 = Longest Gain
Run #2-11 - interpolate equally between these two amounts

If Longest gain is 50+ and a TD, put "TD" in #1 and put the long gain in #2, interpolate from 3-11 in this case.

2. Add up the Long Gain column: total the 12 values, and if "TD" appears, count it as the longest gain again.

3. Subtract 12 carries from the rusher's season stats, and subtract the yardage you added up in step 2.

4. Work out his new "effective average" and round it to the nearest whole number. It will usually be 3 or 4.

5. Set his normal rushing column to be (where X = the value you worked out in step 4)
SG/X+3/X+2/X+1/X/X/X-1/X-1/X-2/X-3/X-4/X-5

6. Short gain column should be
X+9/X+9/X+8/X+8/X+7/X+7/X+6/X+6/X+5/X+5/X+4/X+4

7. I suggest separate endurance ratings for rushes and catches. For rushing, I would have
A = 240 or more rushes
B = 160 or more rushes
C = 80 or more rushes
D = 16 or more rushes
E = less than 16 rushes
A-E are equivalent to 0-4, but backs then have 2 ratings: a letter for rushing and a number for catching.
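
The rushing endurance letters in rule 7 amount to a simple threshold lookup. The Python sketch below assumes each figure is the minimum number of season rushes for that letter, which is my reading of the list; the function name is my own.

```python
def rushing_endurance(rushes):
    """Rushing endurance letter from a back's season rushing attempts.

    Assumption: each threshold is the minimum number of rushes needed
    for that letter, and anything under 16 is an E.
    """
    thresholds = (("A", 240), ("B", 160), ("C", 80), ("D", 16))
    for letter, minimum in thresholds:
        if rushes >= minimum:
            return letter
    return "E"


print(rushing_endurance(302))  # a full-time starter such as Alexander
print(rushing_endurance(45))   # an occasional ball carrier
```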


IMPROVEMENTS

Looking at the curves, it's clear that an almost perfect sim could be realised if we could take out those double occurrences of X and X-1 on the chart (run numbers 5,6,7,8) and spread those 2 duplicates over the 4 results X-2,X-3,X-4,X-5. But that's not possible with the current FAC system.

However, I shall try and look at other combinations of distances on the Normal Gain column for even better alternatives.