
Tuesday, April 1, 2014

Can Mid-Week Projections Work?

Two weeks ago, I proposed a method to project an athlete's overall ranking before score submissions had closed for the week. To me, it made sense on paper, but it was admittedly untested. So I put out a request for help on testing it in week 4, and thanks to Andrew Havko (among others), I was able to make that happen.

So can it work? It appears that it can. That's not to say the projections are 100% accurate, and they are far from precise very early in the week. But I think it's clear that the projections can give an athlete a good sense of where they would likely finish the week if they stuck with their current score, which is something that's essentially impossible to gauge right now.

I tested these projections at three points during week 4: Friday 8 a.m., Saturday 5:30 p.m. and Sunday 3:30 a.m. (all EDT). The method requires one key assumption, which is the percentage of athletes who will drop off from the prior week, and for this I used 10%. Certainly this would need a bit more careful thought if it were to be implemented by HQ.
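
For those curious about the mechanics, here's a rough sketch in Python of the general idea (the exact formulas are in the earlier post; the function and variable names below are mine, just for illustration). The core of it is taking an athlete's current rank among the scores submitted so far, converting it to a percentile, and scaling that percentile up to the field size we expect once attrition is applied to the prior week's field:

    def project_final_rank(current_rank, scores_submitted, prior_week_field, attrition=0.10):
        # current_rank:     athlete's rank among the scores submitted so far
        # scores_submitted: number of scores submitted so far
        # prior_week_field: number of athletes who posted a score the prior week
        # attrition:        assumed fraction of the prior week's field that drops off
        projected_field = prior_week_field * (1 - attrition)
        percentile = current_rank / scores_submitted  # 0 = best, 1 = worst
        return round(percentile * projected_field)

    # Example: ranked 3,200th of 20,000 scores in so far, 120,000 athletes last week
    print(project_final_rank(3200, 20000, 120000))  # 17280 with the 10% assumption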

For each athlete, I projected their overall worldwide ranking at each of these times. For athletes whose score did not change by the end of the week, I compared my projection to their ultimate ranking. In total, the errors of my projections were as follows:
  • Friday 8 a.m. (<1% of field reporting) - 9,575 mean absolute error*, 9,404 mean error
  • Saturday 5:30 p.m. (16% of field reporting) - 1,003 mean absolute error*, -787 mean error
  • Sunday 3:30 a.m. (21% of field reporting) - 1,454 mean absolute error*, -1,362 mean error
Interestingly, the projections (at least using this first basic method) got slightly worse overall from Saturday to Sunday. The reason is that the distribution of scores submitted by Saturday 5:30 p.m. was more similar to the ultimate distribution than the one on Sunday. What I found was that, in general, the scores submitted very early in the week are well above average, and the quality slowly declines throughout the week. That is, until Monday evening, when a slew of athletes replace their first score with an improved second submission. In this case, it turned out that Saturday afternoon gave a pretty accurate picture of how the week's scores would ultimately be distributed.

However, let's look a little more closely at the errors. Although an error of 1,003 (our best mean absolute error) is pretty small for an athlete finishing, say, 40,000th, it would be a very large error for an athlete finishing 2,000th. Thankfully, the size of the errors generally increased as the ranking increased. Below is a chart showing the percentage error for athletes across the spectrum of rankings, using our Saturday afternoon projections.


So you see that generally, we never really stray further than 3% error at any point. That's not too bad when you consider that there's currently no way to get even a good ballpark estimate until at least mid-day Monday.
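
For reference, the percentage error in that chart is just the projection error taken relative to the athlete's final ranking (at least, that's how I'd read it). Again, a quick sketch with made-up names:

    def percentage_error(projected_rank, actual_rank):
        # Signed error of the projection, as a percentage of the actual final rank
        return (projected_rank - actual_rank) / actual_rank * 100

    # An athlete projected 41,000th who actually finished 40,000th:
    print(percentage_error(41000, 40000))  # 2.5 (%), within the ~3% band above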

Still, maybe we can do better. What if we had actually used the perfect assumption (8% in this case) for the percentage of athletes who would drop off from the prior week? Well, overall, our Saturday and Sunday projections improve, with the mean absolute error dropping to 338 for Saturday and 581 for Sunday. Interestingly, though, in this particular case the improvement isn't uniform across the board for Saturday and Sunday. Below is the same chart as above, but with the perfect assumption for attrition.
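
Reusing the hypothetical sketch from earlier, the only thing that changes is the attrition parameter:

    # Same hypothetical athlete as before, but with the attrition that actually materialized
    print(project_final_rank(3200, 20000, 120000, attrition=0.08))  # 17664 vs. 17280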


Although our error gets a little worse near the top, once we get near the middle of the pack, these projections are nearly spot-on. And even near the top, a 5% error isn't that bad - that's like these projections putting Josh Bridges at 100th overall, whereas he actually finishes 105th.

One way we could theoretically get even closer is to adjust for the skill level of the athletes who have submitted scores at a given point. This could involve looking at the average ranking of those athletes from the prior weeks' scores and comparing that to what we'd expect by week's end. The trouble is, it's challenging to know what that level will be at week's end. You might expect the field to average out to the 50th percentile in prior weeks, but that wasn't actually the case here. The average athlete submitting a score for 14.4 was actually at about the 48th percentile in prior weeks, because the athletes dropping out after 14.3 were generally from the bottom of the pack.
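
If you did want to try it, here's one naive way to illustrate the idea (and it is only an illustration, not something I tested): nudge each athlete's percentile by the gap between the average prior-week percentile of the athletes who have reported so far and whatever you expect that average to be at week's end (about the 48th percentile here), then convert back to a rank.

    def adjust_for_field_strength(pct_among_submitted, projected_field,
                                  avg_prior_pct_so_far, expected_final_avg_pct=0.48):
        # If the athletes reporting so far were stronger last week (lower average
        # percentile) than we expect the final field to be, an athlete's percentile
        # among the submitted scores overstates how deep they'll finish, so we
        # shift it down by the gap before converting it to a rank. The size of
        # the shift here is deliberately crude.
        gap = expected_final_avg_pct - avg_prior_pct_so_far
        adjusted_pct = min(1.0, max(0.0, pct_among_submitted - gap))
        return round(adjusted_pct * projected_field)

    # Early reporters averaging the 40th percentile of the prior weeks' field:
    print(adjust_for_field_strength(0.16, 108000, 0.40))  # 0.08 * 108,000 = 8640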

My point is that while such an adjustment is possible, it might not be practical. And considering the projections even with my base 10% attrition assumption weren't too bad, I don't think further adjustments are necessary, beyond refining that attrition assumption to make it as accurate as we can.

Finally, while I think this method would produce reasonable results if implemented by HQ next year, there are some caveats about the testing done here:

  • I've only done testing for one week. There may be more (or less) error if we made these projections in week 2 or week 5.
  • I'm almost certain that the percentage error would increase a bit if we did this for each region. The sample size is much smaller, which means that even if the same principles apply, we're likely to see more variability. For one thing, it's going to take longer each week before the projections are even remotely meaningful, since many regions had fewer than 100 entries until late each Friday afternoon.
  • I only tested this for the men's field. I don't see any reason why the results would be much different for women, aside from the field being smaller, which would likely increase our percentage error a bit.
All that being said, I feel that implementing this method would provide a realistic glimpse into where an athlete will wind up. As long as athletes understand that this is merely an estimate, the information provided can be quite useful. 

Would this revolutionize the sport? Of course not. But I think it would be yet another improvement to the athlete experience as the largest stage of our sport continues to grow.


*Mean absolute error is the average of our errors, if we ignore the direction of the error. So if we are off by -500 for one athlete and +500 for another, the mean absolute error is 500 but the mean error is 0.
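
Or, in code (names mine), the two metrics from the footnote look like this:

    def mean_error(errors):
        # Average signed error; over- and under-projections cancel each other out
        return sum(errors) / len(errors)

    def mean_absolute_error(errors):
        # Average of the error magnitudes; direction is ignored
        return sum(abs(e) for e in errors) / len(errors)

    errors = [-500, 500]                # off by -500 for one athlete, +500 for another
    print(mean_error(errors))           # 0.0
    print(mean_absolute_error(errors))  # 500.0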

2 comments:

  1. Not data driven, but I did find I could predict fairly accurately where I would place in my region by looking at what percentile my current score placed me at and then factoring in the total number of scores I expected to be posted. It worked best for weeks 1-4. Week 5 it didn't work as well; not sure if that was because it was for time and not for reps.

    Replies
    1. I think that works reasonably well if your score on the current event puts you at a similar percentile to where you were overall after the prior event. For instance, if you're in the 30th percentile after 3 events, and on Saturday afternoon you're in the 30th percentile on event 4, then you can probably assume you'll be near the 30th percentile overall after event 4.

      But if you're at the 10th percentile on event 4 on Saturday afternoon, your overall percentile as of Saturday afternoon won't really be accurate, because that great finish on event 4 isn't weighted enough yet (there are so few scores for that event at that point).

      The problem is, it's really challenging to estimate your own overall percentile without estimating the placement for everyone and then recalculating.
