The Heptathlon is one of the greatest tests of an all-round athlete that exists in World Athletics. Women compete in seven events over two days for the title. But this sport is suffering because of a biased scoring system. Why is this, and what can be done?
Let’s start with a bit about the Heptathlon. It has seven events: three running events (100m Hurdles, 200m and 800m), two jumping events (High Jump and Long Jump), and two throwing events (Shot Put and Javelin). Equations are then used to turn the athletes’ raw results into points, which are totalled, and whoever has the most points wins.
The current scoring system was developed by Dr Karl Ulbrich in 1952, and amended in 1984. He based these early scoring systems on certain benchmarks: certain results were adjudged to be worth 1000 points, and others worth 0 points. A line was then put through these, but it was an upwards curve rather than a straight line, to account for the fact that the better an athlete performs, the harder it is to better that score by a certain amount. In other words, it is much easier for an athlete to reduce a time for the 100m Hurdles from 13.5s to 13.0s than from 13.0s to 12.5s.
He used three different formulae to calculate the score, P, for each event:
$P = a(b - T)^c$
for the running events, where T is the time in seconds;
$P = a(M - b)^c$
for the jumping events, where M is the height or length in centimetres; and
$P = a(D - b)^c$
for the throwing events, where D is the distance in metres.
a, b and c are different for each event, and are given in the following table.
| Event | a | b | c |
|---|---|---|---|
| 200 metres | 4.99087 | 42.5 | 1.81 |
| 800 metres | 0.11193 | 254 | 1.88 |
| 100 metres hurdles | 9.23076 | 26.7 | 1.835 |
| High Jump | 1.84523 | 75.0 | 1.348 |
| Long Jump | 0.188807 | 210 | 1.41 |
| Shot Put | 56.0211 | 1.50 | 1.05 |
| Javelin Throw | 15.9803 | 3.80 | 1.04 |
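As a rough illustration, the three formulae and the coefficients above can be combined into one small Python function. The names `COEFFS` and `event_points` are made up for this sketch, and returning 0 for a performance worse than the zero-point mark b is an assumption. Official results are quoted as whole points; rounding is left out here.

```python
# Coefficients (kind, a, b, c) for each event, copied from the table above.
# "kind" records which of the three formulae applies.
COEFFS = {
    "200m":         ("run",   4.99087,  42.5,  1.81),
    "800m":         ("run",   0.11193,  254.0, 1.88),
    "100m hurdles": ("run",   9.23076,  26.7,  1.835),
    "high jump":    ("jump",  1.84523,  75.0,  1.348),
    "long jump":    ("jump",  0.188807, 210.0, 1.41),
    "shot put":     ("throw", 56.0211,  1.50,  1.05),
    "javelin":      ("throw", 15.9803,  3.80,  1.04),
}


def event_points(event, result):
    """Points for one event.

    result is in seconds for runs, centimetres for jumps, metres for throws.
    """
    kind, a, b, c = COEFFS[event]
    # Runs: a lower time is better, so the margin is b - T.
    # Jumps and throws: a bigger mark is better, so the margin is result - b.
    margin = b - result if kind == "run" else result - b
    if margin <= 0:
        return 0.0  # assumption: no points at or beyond the zero-point mark
    return a * margin ** c


if __name__ == "__main__":
    print(event_points("100m hurdles", 12.93))  # a 12.93 s hurdles run
    print(event_points("high jump", 186))       # a 1.86 m high jump, in cm
```

The 800m time has to be converted to seconds before it is passed in, so 2:10.00 becomes 130.0.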
This system works in principle, but a look at modern results begins to reveal a problem. Here is a table of the highest and lowest scores in each individual event at the Heptathlon at the 2011 World Championships in Daegu.
Times are given in seconds (minutes:seconds for the 800m), and distances in metres. The difference is in points.
A first glance at this suggests that the system is biased towards the Hurdles, because the scores for this event seem to be much higher than the others. However, how high or low the scores are is actually irrelevant. For example, if you decided to add 100 points to everyone’s javelin score, the outcome of the competition would still be exactly the same, but with everyone’s total simply 100 points higher.
The important statistic is instead how large the spread of scores is, here shown by the difference between the highest and lowest points scored. For example, the best hurdler at the Championships only gained 201 points over the worst hurdler, but the best javelin thrower gained a huge 400 points over the worst! In this sense, with the current scoring, the Javelin is worth almost twice as much as the Hurdles, and the Shot Put is worth nearly as much, leaving specialist throwers with a great advantage.
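To make the idea of spread concrete, here is a minimal sketch for the Hurdles. It assumes the best and worst Daegu times of 12.93 s and 14.32 s that are used in the worked example later in this post. The function name `hurdles_points` and its `bonus` parameter are invented for the illustration.

```python
# Current Hurdles coefficients from the scoring table.
A, B, C = 9.23076, 26.7, 1.835


def hurdles_points(t, bonus=0.0):
    """Current Hurdles score for a time t in seconds, plus an optional flat bonus."""
    return A * (B - t) ** C + bonus


best, worst = 12.93, 14.32  # best and worst Hurdles times in Daegu

# Spread: how many points the best hurdler gains over the worst.
spread = hurdles_points(best) - hurdles_points(worst)

# Adding the same 100 points to everyone leaves the spread unchanged,
# which is why the absolute level of the scores does not matter.
shifted_spread = hurdles_points(best, 100) - hurdles_points(worst, 100)

if __name__ == "__main__":
    print(round(spread), round(shifted_spread))
```

Both numbers come out the same, which is the point: only the gap between athletes affects the final ranking.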
These differences are not just due to individual weak points giving very poor, anomalous scores. In the table below are the 10th place results for each event, and the distance from first.
At the top end, the competition is even more biased towards the Javelin, which is nearly 3 times as important as the Hurdles when considering only the top 10 results.
In order to have a truly fair system, and to provide the best test of who is really the top all-rounder, the difference between any two given positions should be as close as possible for each event.
The values for b have been carefully chosen to set the result that will score 0 points, and c has been chosen to give the graph the right shape. These values seem to work, and should not be changed. However, we can change ‘a’ to try and make the difference between the best and worst athletes equal. Using the data here, we can find new values for ‘a’ for each sport.
Let us set the difference between the best and worst results for each discipline to be 300 points. Taking the Hurdles as an example, we can form the following equation.
$a(b - T_{\text{best}})^c = a(b - T_{\text{worst}})^c + 300$
$a(b - T_{\text{best}})^c - a(b - T_{\text{worst}})^c = 300$
Then substitute in the values.
$a(26.7 - 12.93)^{1.835} - a(26.7 - 14.32)^{1.835} = 300$
$a(13.77)^{1.835} - a(12.38)^{1.835} = 300$
Then factorise the left side of the equation to get
$a(13.77^{1.835} - 12.38^{1.835}) = 300$
$21.8198368a = 300$
$a = 13.748957096$
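The rearrangement above can be checked with a few lines of Python. This is only a sketch for a running event; the function name `rescaled_a` and its parameters are assumptions for illustration, and for jumps and throws the two brackets would be (result - b) instead of (b - T).

```python
def rescaled_a(b, c, best, worst, target=300.0):
    """Value of a that makes best-minus-worst equal to target points.

    Written for a running event, where the score is a * (b - T) ** c.
    """
    # Factorised form: a * ((b - T_best)^c - (b - T_worst)^c) = target
    return target / ((b - best) ** c - (b - worst) ** c)


# Hurdles: b and c from the current table, best and worst times from Daegu.
new_a = rescaled_a(b=26.7, c=1.835, best=12.93, worst=14.32)

if __name__ == "__main__":
    print(new_a)
```

Changing `target` shows that a scales in direct proportion to the chosen spread, so the choice of 300 only fixes the overall scale.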
Using the same method, we can find new values of a for all other disciplines to give a difference of 300. These values turn out to be the following:
Using this scoring system, our table from before turns out like this.
The only problem with this is that the scores have radically different values awarded to them. This doesn’t impact on fairness (except in the rare event of three fouls or false starts in any event), but could be very confusing to athletes and spectators, and may also affect competitors psychologically. To fix this, we can introduce a new value, d, which is added to the end of each formula, to change the highest score from each of these disciplines to 1050 and the lowest to 750.
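One way to find d for a single event is sketched below for the Hurdles, again assuming the Daegu best and worst times of 12.93 s and 14.32 s. The variable names and the function `proposed_hurdles_points` are hypothetical; the steps simply pin the best result to 1050 points and the worst to 750.

```python
# Hurdles zero-point mark and exponent, kept from the current system.
B, C = 26.7, 1.835
BEST, WORST = 12.93, 14.32      # best and worst Hurdles times in Daegu
TOP, BOTTOM = 1050.0, 750.0     # target scores for those two results

raw_best = (B - BEST) ** C
raw_worst = (B - WORST) ** C

# a sets the spread (TOP - BOTTOM = 300 points)...
a = (TOP - BOTTOM) / (raw_best - raw_worst)
# ...and d shifts every score so that the best result lands on TOP.
d = TOP - a * raw_best


def proposed_hurdles_points(t):
    """Proposed Hurdles score P = a(b - T)^c + d for a time t in seconds."""
    return a * (B - t) ** C + d


if __name__ == "__main__":
    print(proposed_hurdles_points(BEST), proposed_hurdles_points(WORST))
```

Because d is the same for every athlete in an event, it changes how the scores look without changing any gaps between competitors.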
This also improves the differences between the top ten.
The Standard Deviation of the difference between 1st and 10th place is 13.4 with the system we propose, compared to 61.3 with the old system. This is clearly a much better and fairer method, and one we hope the IAAF will consider. To sum up, here is the Proposed Scoring System in full.
For running events:
$P = a(b - T)^c + d$
For jumping events:
$P = a(M - b)^c + d$
For throwing events:
$P = a(D - b)^c + d$
And the following table of values is used:
Unfortunately, with the current scoring system, British Gold hopeful Jessica Ennis is at an acute disadvantage. Her best events, Hurdles and High Jump, are two of the most under-awarded events, and her weak point, Javelin, is the most over-awarded! Jessica came 2nd at these Championships in Daegu by a margin of 127 points, with only her Javelin below par on this occasion, but with these scoring revisions would have lost by a much tighter margin of 42 points. With the current scoring system, it seems that Jessica, along with all the other athletes, should turn their focus towards the Shot Put and Javelin in hope of securing Gold.
We’re all behind you Jess!
Tomorrow (the 5th of August) we will post again on this subject. We will discuss the difference our system could have made and how.
We hope you enjoyed this post; it was certainly very interesting to research and write. We don’t have any Olympic Heptathlon tickets, but we will be watching Jess in earnest.
Theo Caplan
If you want to get in touch you can follow and mention us on Twitter, @theaftermatter, email us at contactus@theaftermatter.com or search “The Aftermatter” on Facebook.
Check out our last two posts:
The Physics of Gymnastics – The forces exerted on a gymnast’s body during some routines are extreme, so how does it work?
Excellent work…keep it up such an interesting insight and really hope they take your proposal for the new scoring system…they need to listen to our youngest and brightest…regards James – Engineer by trade, Physicist by hobby
Great article – lots of valuable information, explained and presented clearly, and it enabled me to calculate the scores myself so I could check my understanding. This will be very helpful as I follow the Olympic heptathlon over the next two days.
I agree the system needs updating. However, as I understand it, your proposed system is based around the first and tenth scores in Daegu, and I think it needs to be based upon a much, much larger database of results.
You can’t have a system based upon the performance on the day. Athletes need a scoring system fixed in place for at least two years in the future, so they can study the scoring system and plan their training around the discipline in which they think they can pick up more points.
Nice tidy and easy to use website.
Hi Ian,
Glad you enjoyed the article and like our new look!
You’re almost right – our system is based around the first and last results in Daegu. Our proposals are currently being examined by UK Athletics and the IAAF, and if they decide to make a change from the current system, then obviously we could draw from a much larger dataset to get more and more precise scoring tables. However, the Daegu Championships were fairly representative of most major Heptathlons, and this dataset is enough to highlight the shortcomings of the current system.
If a change is made, it would be with a fixed system like the one we have now, just with different values for a, b and c in the equations. If a new system were to be adopted, there would be a transition period of at least a year, and likely longer, for athletes and coaches to get used to the new systems and plan their training, just like you say.
This Sunday, we’ll have a new post talking about the Olympic Heptathlon starting tomorrow, and what sort of effect our new system would have. We hope you will like it!
Enjoy the Games!
Theo Caplan
Good stuff, but (as you acknowledge), using extremes is horribly unstable. Why not either use the standard deviation, or the inter-quartile range to standardise?
Also, what would happen if you used a different competition to benchmark the method? For example, the last Olympics, or the 2010 world championships?
Great article, I’d always wondered if there was any sensible analysis behind scoring such as this — and pleased to find there is, even if improvements are possible.
One flaw in your analysis is possibly the use of “best” and “worst” scores — this is vulnerable to skew from outlying performances. Wouldn’t you be better to use upper and lower quartile (or something) over athletes to determine your values for a, b and c?
I found this an absolutely fascinating and persuasive article. I am watching heptathlon (and Jessica) as I write.
I look forward to the new post. It would be very interesting to know how this would have affected the placings of, say, world events since the event began, and in order not to run the risk of “name and shame”, which would be invidious, I would like to see tabulated numbers instead of names. So for example, in event A, the top placed competitor had 6487 points. Under the revised scoring A would have had 5995 points and position 3rd. Of course, competitors to some extent must play to the system but it would be nice to know just how big a difference the new system would make. Those who really want to do the work can research and match names and numbers for themselves.
I’ve just seen your Einstein quote at the masthead. Nice.
Haven’t seen this one before, but a variant quote was Lord Rutherford to his British atomic bomb team in the 1940s: “If you can’t explain it to the charlady, you don’t know what you are doing”.
Really enjoyed reading this – thank you
Nice one! Though I am from India I support Britain in Athletics, for obvious reasons. Came here to just see what was the chance Jessica might not end up getting the gold at this point. Pretty minuscule. She is going to get it unless something out of the ordinary happens!!
We are posting again this Sunday on what difference our scoring system will have made and if these games have brought to light anything about our system or the current one. Make sure to come back for that one!
Ned Summers
Thanks – now the system has been explained clearly with the variances that it has per discipline. And even so, Jess still managed it!!
How did you get a b c?