Here is a quote from Wayne Channon's article in Eurodressage from earlier this year. He is discussing the accuracy of awarding scores for dressage competition and I think he has some very good points.

"...I asked a mathematician at Imperial College, London, to give his view on our current system. His view was extremely interesting.

"He analysed our current scoring system and pointed out a possible perverse result of integer scoring. Take two horses, Horse A is a 6.4 mover for all 36 moves and Horse B is a 5.7 mover for 34 moves and a 6.6 mover for the other two.

"Clearly, horse A is the better horse of the two and should score more highly. However, when you work out their score for a Grand Prix test, the result is amazing:

* Horse A scores 6.4 at all 36 moves and hence is awarded a mark of 6 for every move and gets an average of 6.0 or 60%. Horse B scores 5.7 at 34 moves and 6.6 at the two remaining moves which happen to have a coefficient of two. The judges would award 34 times the mark 6 and twice the mark 7. The average is 6.083 which amount to a total mark of 60.83%
* So Horse B would win by 0.83%
* If the precision of the individual scores would be decimal, then the correct scores would be Horse A 64% and Horse B 57.75%. This is the problem with integer scoring!"
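
A rough Python sketch of that arithmetic follows. It is only an illustration under stated assumptions, not the mathematician's own calculation: marks are rounded to the nearest whole number, the two coefficient-2 movements simply count double in a weighted average, and the function names are made up for the example. Under those assumptions the figures differ slightly from the ones quoted (Horse B gets about 61.05% with integer marks and about 57.95% with decimal marks), but the reversal of the placings is the same.

```python
import math

def to_integer_mark(quality):
    # Assumption: the judge rounds to the nearest whole mark (halves go up).
    return math.floor(quality + 0.5)

def percentage(marks, coefficients):
    # Coefficient-weighted total, expressed as a percentage of the maximum (10 per unit).
    total = sum(m * c for m, c in zip(marks, coefficients))
    return 100 * total / (10 * sum(coefficients))

# Both horses ride the same test: 34 movements with coefficient 1
# and 2 movements with coefficient 2 (the last two in the list).
coefficients = [1] * 34 + [2] * 2

horse_a = [6.4] * 36               # a 6.4 mover throughout
horse_b = [5.7] * 34 + [6.6] * 2   # 6.6 only on the two double-weighted movements

integer_a = percentage([to_integer_mark(q) for q in horse_a], coefficients)
integer_b = percentage([to_integer_mark(q) for q in horse_b], coefficients)
decimal_a = percentage(horse_a, coefficients)
decimal_b = percentage(horse_b, coefficients)

print(f"Integer marks: A {integer_a:.2f}%  B {integer_b:.2f}%")
print(f"Decimal marks: A {decimal_a:.2f}%  B {decimal_b:.2f}%")
```

With whole marks Horse B finishes ahead of Horse A; with decimal marks the order flips and Horse A wins clearly.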

For the whole piece go to http://www.eurodressage.com/editor/wayne/20080118_halfpoints-sequal...

I think it is about time the dressage scoring system was reviewed, and I am surprised this issue has not been discussed more. We owe it to ourselves to make the marking system clearer.

Here is another point made by Wayne, which really made me think:

"It does not only affect the riders at the top of the sport it has just as big an effect on the everyday dressage competitor.

"For example, take a rider that achieved a solid 62% to 64% all year. Over the winter she makes considerable improvement only to find that she is still “rewarded” with 62% to 63%! How can this be?

"To understand this, we have to look at how judges judge according to the current marking system. Take a normal horse, ie one that has basic paces for a 6 or a 7 or somewhere in between. Few have movement for a 5 or less or for more than a 7. So when a judge looks at a combination they start on a “this is a 6 or a 7 trot or canter” and move up or down on how well each movement is performed.

"Now look a little more closely at what happens. Let’s say that the horse has a trot for a 6.4 – or 64%. With the current system, all of its marks will start on a 6 – yes, 60% – it has to be rounded down to a 6! If it moved for a 6.6 (only 2% better) it would start on a 7 or 70%. A 10% difference!!!"


http://www.eurodressage.com/editor/wayne/20071112_halfpoints.html

It is utterly mindblowing that the FEI has not acted on this already.


Replies to This Discussion

Sorry to say, but I don't agree.

Even though the mathematical reasoning is true, it neglects the fact that the underlying data are human judgements.

Any scores derived from human judgement will not qualify as a ratio scale for mathematics. In other words, calculating those percentages only makes sense mathematically if the data are specific enough. For example, calculating a percentage only makes sense if you can assume that the distances between values are equal. So the distance from 5 to 6 must be the same as from 8 to 9, and the distance from 6.4 to 6.6 must be the same as from 2.4 to 2.6. Otherwise mathematical operations like calculating averages or standard deviations make no sense. And no dressage rider would claim that the distance from a 4 to a 5 is the same as from a 9 to a 10.
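
A small Python sketch makes the point concrete. The two riders' marks and the "stretched" relabelling are invented purely for illustration: the relabelling keeps every mark in the same order but assumes the step from 7 to 8 is worth twice as much as the other steps.

```python
from statistics import mean

# Hypothetical marks for two riders over four movements.
rider_x = [7, 7, 7, 7]
rider_y = [5, 6, 8, 8]

# Hypothetical relabelling: the order of the marks is unchanged,
# but the gap between 7 and 8 counts double.
stretched = {5: 5, 6: 6, 7: 7, 8: 9}

plain_x = mean(rider_x)
plain_y = mean(rider_y)
stretched_x = mean([stretched[m] for m in rider_x])
stretched_y = mean([stretched[m] for m in rider_y])

print("Plain averages:    ", plain_x, plain_y)          # X is ahead
print("Stretched averages:", stretched_x, stretched_y)  # Y is ahead
```

Nothing about the individual judgements changed, only the assumed spacing between marks, yet the winner by average is different. That is why averages need equal distances to be meaningful.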

This is a fundamental problem in all such human judgement processes, no matter whether the values are integers or decimals.

To be mathematically correct, a human expert can only judge a comparison such as "this one is better than the other". This would give a ranking, but no values to build averages from.

As we all know, even the ranking is sometimes questionable, especially in situations where the performances are very similar, as they usually are with the top riders.

In addition, the judges have to make their decisions very quickly. There is not much time for consideration when you have to score each movement of a Grand Prix performance on the spot.

Therefore there have been many attempts in history to find an accurate process that allows judges to do their job and cope with the inevitably imprecise human judgement. In fact there have been ranking judgements in the past, which would be mathematically the most correct. For example, Alois Podhajsky suggested those after the 1936 Olympics.

However, riders feel that they want more 'objective' scores for each movement. The reason is not only that they wouldn't trust the judges' rating, but also that marks per movement give more specific feedback and hints for improvement. Marks per movement also give the judge a more impartial view of the performance, as mistakes in one movement are less likely to distort the judgement of the others.

However, the judges cannot build rankings for each movement during a performance, but they can give a categorized answer such as: very good, good, mediocre, just OK and not OK. This is basically what judges do now with the integer scores. In most cases the scores are between 4 and 8. The integer scale actually runs from 0 to 10, but the extremes are used very seldom; the scores 9 and 10, for example, are very rare. So what the judges do in fact is categorize the performance, and psychologically that is the most appropriate way to do it. If we asked judges to score with decimals, it would just be an illusory precision that is not really there.

You can easily see this when you look at lower-level competitions in Germany, where judges do in fact give a joint score with decimals, but only one per rider. Then a rider will come and ask why his performance at 6.3 is worse than the other one at 6.4. All judges will tell you that they had some reason why they thought the other one slightly better, but whether it is worth 0.1 or 0.2 they couldn't tell. In the training sessions for judges in Germany, the instructor will tell you openly that any score within 0.5 of another one has to be considered OK.

I guess we'll have to live with imprecise human judgements. Forcing the judges to issue more precise numbers with decimal points will not improve anything. All it does is produce a false precision and consequently increase complaints about wrong judgements. The human brain doesn't work in ratio-scale numbers, but it can differentiate good from bad. Well, at least it should.

Ciao
Bernd

Thanks for your very interesting reply! You have made your point very well, and yes ultimately it comes down to finding the best way of recording subjective human judgements in a way which makes sense for scoring purposes.

However I don't think allowing judges to give decimal or half marks would be a bad thing, and here's why:

Most judges go through a test keeping a mental accounting system along the lines of "big 6", "small 7", giving scores of 7 for both, then the next "big 6" gets a 6, and so on...

Firstly, allowing use of decimals or halves would eliminate the need for judges to carry out this additional memorization challenge, on top of everything else they are trying to follow in the test. I'm sure even the sharpest judge will have trouble keeping track of what scores are "owed" at times. I have sat with international judges, and know the pace at which they have to make these decisions.

As a rider, I would rather see that the judge saw my first pirouette was a 6.5, and the second one was a 6, than get an average for both movements (probably both 6's!). As you correctly pointed out, the perceived distance between a 5 and 6 is different from that between an 8 and 9, so it would be nice if I was lucky enough to get an 8.5 and know it was more than an 8! Also, you mention that calculating averages makes no sense when based on numbers that have different distances between them: that is how the winner is decided when scores are added and turned into a % isn't it?

Also, and you might be able to clarify the mathematics of this, but often the placings in a class will be separated by as little as 0.05%. This is pretty strange to me.

Finally, it would not be compulsory for judges to use the whole available range of scores; if they are happy with the current system, they would be perfectly free to give 6.0s and 7.0s etc. As I see it, decimal or half point scoring could actually simplify the judging process and give better feedback for riders and spectators.

Cheers

Hi,

thanks for your nice and interesting reply. I really enjoy this discussion.

I guess one of your points will be inevitable, and that is the close scores for the winners. I think the main reason for that will always be that the top two riders (or even more) are in fact very similar, so that the scores will be very close, too.

That's the same for many sport disciplines. Just look at show jumping. If the winner in a jump-off does it in 36.35 seconds and the second one gets 36.42, can we really say that the winner was better? It is an objective result, OK, but the difference is so small that we can consider it unimportant. The next day it may be the other way round.

So if the winner in a dressage test is just 0.05% ahead, it may well be that both were actually equal in performance. In fact, when you look at many competitions this is the case, and many arguments by supporters of one or the other rider just reflect this.

But looking at your main point about half point scoring I would agree with you. In many cases judges may feel it to be more appropriate.

I think the main reason for the integer scoring is a practical consideration as well as a way of dealing with the inherently imprecise human judgement.

First, using half point scores makes the job of calculating the totals more difficult. While this is probably no big deal in championships, where you have enough computer equipment today, it is an issue in competitions where people do paper and pencil scoring and use a calculator afterwards. This sounds like an easy problem, but it is sometimes an important difficulty.

Then, looking at the scoring process during the competition, integer scoring is considered to be easier for the judges. That may be a question of experience and training, though.

I could imagine this to be an interesting experiment. Why not try it on some competitions and see what the feedback would be? Maybe we should suggest that to the new FEI committee.

Coming back to the topic of psychology and mathematics, we will have to live with the fact that distances between judgement scores are not equal. Still, once scores are given, we can make the mathematical steps and compute percentages. We should keep in mind, though, that those percentages are not comparable over time.

In other words, a score of 74.5% in one competition may actually represent a worse performance than a score of 72% in another competition. It just depends on the different judges' views or the previous competitors' scores.

We now have this new FEI committee in place, charged with the job of improving dressage judging among other things. It will be interesting to see whether they can come up with some good ideas.

Personally I would like to see the dressage tests more variable. If we had tests just designed one hour before a competition we might see more variable results. Dressage would be much more interesting that way. Even show jumping would be boring if they always jumped the exact same course as dressage horses have to do.

Ciao and my best wishes for a good new year
Bernd

A happy new year to you, too - sorry for the long delay, but we are in the busy part of our season in this part of the world!

Thanks for your comments, thoughtful and interesting as always :-)

I am definitely in favour of having some experimentation with the scoring system, it would certainly do no harm to give it a try. I would be keen to make some suggestions to the new Dressage committee, but I'm not sure what would be the best way to do it...

(I'll start a new discussion asking for suggestions)
