Tuesday, 9 August 2011

Scrabble Statistics (or: How Much is J Really Worth?)

After a rip-roaring knockout tournament on Monday (15 August, dedicated post to follow soon), many an experienced Scrabbler in attendance was lamenting the fickle allocations of Lady Luck. "It's just not fair! One game of never-ending vowel sets cannot give a true reflection of my innate awesomeness!" A simple solution is to play multiple games, but this places serious strain on both pizza supplies and reserves of patience before the coveted Scrabble (Not At All Dominoes) Champion Trophy* can be awarded.

Just how much Scrabble is a game of luck versus a game of skill has been investigated by some seriously statsy folk. A statistics professor, Andrew C. Thomas from Carnegie Mellon University, set up a Scrabble simulation to see how luck (i.e. tile allocations) affects Scrabble games. (Go check out some results at http://blog.revolutionanalytics.com/2011/07/scrabble-luck-and-skill.html.) He then quantified the luck factor by looking at the effective point value of different letters. (What a clever guy!) Yip, that Q might be assigned 10 points by the Scrabble Deities, but it's actually (on average, after a gazillion games) a 5-point dastardly disadvantage!

[[WARNING! Geek speak coming up; skip to the friendly bullet list below for the juicy results]].
Prof Thomas controlled for the Lady Luck mojo handout by considering a fixed sequence of tile draws. He controlled for the skill effect (inherent player awesomeness) by repeating oodles of Scrabble simulations on that same fixed tile sequence. Of course, different game situations lead to words of different lengths, which in turn lead to different batches of tile draws. But Talented Prof Thomas sneaked past this chaos trap by having his Sim players draw from opposite ends of a long row of tiles. The effect of the fuzzy tile allocation in the middle (due to different word lengths) gets cancelled out by repeated simulations. By tracking Scrabble scores across lots of random tile sequences, he could estimate per-letter (dis)advantages.
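
For fellow geeks, here is a minimal Python sketch of the opposite-ends trick. It is a toy, not Prof Thomas's actual simulation: there is no board, no dictionary and no scoring, and the number of tiles used per turn is simply random (an assumption standing in for "different word lengths"). The names `make_row` and `replay` are made up for illustration. All it shows is that on a fixed row each player's early draws are identical in every replay, and only the tiles in the middle change hands.

```python
import random
from collections import deque

# Standard English Scrabble tile counts; "?" is the blank (100 tiles in total).
DISTRIBUTION = {
    "A": 9, "B": 2, "C": 2, "D": 4, "E": 12, "F": 2, "G": 3, "H": 2, "I": 9,
    "J": 1, "K": 1, "L": 4, "M": 2, "N": 6, "O": 8, "P": 2, "Q": 1, "R": 6,
    "S": 4, "T": 6, "U": 4, "V": 2, "W": 2, "X": 1, "Y": 2, "Z": 1, "?": 2,
}


def make_row(rng):
    """Shuffle a full tile set into one fixed row (the 'luck' for this row)."""
    tiles = [letter for letter, count in DISTRIBUTION.items() for _ in range(count)]
    rng.shuffle(tiles)
    return tiles


def replay(row, rng, rack_size=7):
    """Replay one toy game on a fixed row; return the tiles each player drew."""
    bag = deque(row)
    # Player A always draws from the left end, player B from the right end.
    draw = {"A": bag.popleft, "B": bag.pop}
    drawn = {"A": [], "B": []}

    # Opening racks: these are the same in every replay of the same row.
    for player in ("A", "B"):
        for _ in range(rack_size):
            drawn[player].append(draw[player]())

    player = "A"
    while bag:
        # Assumption: a random number of tiles used stands in for word length.
        used = rng.randint(1, rack_size)
        for _ in range(min(used, len(bag))):
            drawn[player].append(draw[player]())
        player = "B" if player == "A" else "A"
    return drawn


row = make_row(random.Random(2011))
games = [replay(row, random.Random(seed)) for seed in range(1000)]

# Share of replays in which player A ended up with the tile in the middle.
middle_share = sum(len(g["A"]) > 50 for g in games) / len(games)
print(f"Player A received the middle tile in {middle_share:.0%} of replays")
```

Because A's draws are always a prefix of the row and B's a prefix of the reversed row, the only thing that differs between replays is where the two players meet in the middle, which is the "fuzzy" part the paragraph above describes.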

[[OK, safe to read again, here's the good stuff!]]

(Verbatim from the Revolutions blog):
  • The blank is worth about 30 points to a good player, mainly by making 50-point "bingo" plays possible.
  • Each S is worth about 10 points to the player who draws it.
  • The Q is a burden to whichever player receives it, effectively serving as a 5 point penalty for having to deal with it due to its effect in reducing bingo opportunities, needing either a U or a blank for a chance at a bingo and a 50-point bonus.
  • The J is essentially neutral pointwise.
  • The X and the Z are each worth about 3-5 extra points to the player who receives them. Their difficulty in playing in bingoes is mitigated by their usefulness in other short words.
  • Thomas also finds that the player who goes first generally has an advantage, to the tune of about 14 points.
And that's why statistics is awesome!
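
For anyone wanting ammunition at the next tournament, the figures above can be turned into a rough "luck tally". This is only a back-of-the-envelope sketch with loud assumptions: the effects are treated as simply additive, X and Z use 4 points (the midpoint of the quoted 3-5 range), and every letter not mentioned in the list counts as zero because the list gives no figure for it. The function name `luck_points` and the two sets of tiles are invented for illustration.

```python
# Approximate effective values taken from the bullet list above.
# Assumption: X and Z use 4, the midpoint of the quoted 3-5 point range.
EFFECTIVE_VALUE = {"?": 30, "S": 10, "Q": -5, "J": 0, "X": 4, "Z": 4}
FIRST_MOVE_BONUS = 14  # approximate advantage of going first


def luck_points(tiles_drawn, went_first):
    """Rough luck tally: sum of effective tile values plus the first-move bonus.

    Assumption: letters without a quoted figure count as 0 and effects add up.
    """
    total = sum(EFFECTIVE_VALUE.get(tile, 0) for tile in tiles_drawn)
    return total + (FIRST_MOVE_BONUS if went_first else 0)


# Hypothetical tiles for two players ("?" is a blank).
mine = list("?SQAEIOT")
theirs = list("SSXJNRLE")

print(luck_points(mine, went_first=False))   # 30 + 10 - 5 = 35
print(luck_points(theirs, went_first=True))  # 10 + 10 + 4 + 0 + 14 = 38
```

In this made-up game the opponent's two S tiles, X and first move (38) slightly outweigh a blank, an S and the dreaded Q (35), so the luck gap is only about 3 points.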

* Given the two large, shiny, plastic gold dominoes on the trophy, confusion might arise as to the exact nature of the recipient's skill.


