How large of an
advantage
does Player #2
have in OSKI
?
 

by 
Edward D. Collins



 

 

OSKI is a word game recently implemented at the Little Golem turn-based site.  Briefly, players alternate turns by placing any letter of their choice on an empty cell, and forming a new word.  The number of letters in the length of the new word formed is the number of points the player earned for that move.  Since the initial seed word always consists of three letters, and with exactly 19 cells on the board, all games last just eight moves each.  The player with the most points at the end of the game is the winner.

After playing a number of games, I suspected the player moving second (Player #2) had a distinct advantage.  Discussions in the Little Golem forum with other players all agreed with this.  I was curious to know how much of an advantage Player #2 had.  How often did Player #2 win?  On the average, what was the average margin of victory?  How much would this margin of victory change based upon the vocabulary of the players?

To determine the answers to these and other questions, last weekend I wrote a little program that analyzes OSKI positions and plays OSKI games against itself.  I named the the program "Catherine" after my 90-year-old grandmother who passed away a couple of weeks ago.  (I think my grandmother would have liked this, since she enjoyed word games, especially the Wheel Of Fortune television game show.)

Catherine is capable of analyzing any OSKI board position and returning all of the possible plays, up to 12 letters in length or less.  Catherine can also play games against itself, uninterrupted, for any number of games desired.

Catherine follows all the rules of OSKI - it does not play valid words that are 'invalid' for that particular game.

Currently, with the code I wrote and the speed of my computer, Catherine is capable of playing a complete game against itself, (16 moves total) looking for all possible words of 9 letters or less, in about eight minutes. 

If I look for words that are just 8 letters or less in length, Catherine can play a full game in just two minutes.

If I look for all words that are 7 letters or less in length, Catherine can play a full game in about 30 seconds.  6 letters or less in length?  About twelve seconds.  5 letters?  About eight seconds.  (Programmers who wish to create their own OSKI program can use these times as benchmarks.  Trust me, they won't be too hard to beat.  After writing my code, I already have a few ideas on how I could have written it to make it more efficient, but what else is new.)

Alas, playing a complete game from start to finish, and looking for all words on every move of 10 letters or less in length, takes about 45 minutes!  As you can see, looking for words that are 10 letters or longer in length requires much more time.

Thus, in the interest of playing as many games in a short of period of time as possible, and to generate as much data as quickly as possible, the following statistics below are games played against itself with a vocabulary knowledge of words that are 9 letters or less in length.  For the purpose of deciding how much of an advantage Player #2 has, which was my goal, this should be more than adequate.  In my early tests I noticed that 10-letter words (and longer) are somewhat rare.  Looking for all words that are 9 letters or less shouldn't alter the average margin of victory very much at all, or Player #2's win percentage much at all.  Yes, the average number of points scored per game will be very slightly less, but the margin of victory and win percentage should be approximately the same.  At least that was my hypothesis.  Obviously, on any given turn, Black (Player #1) would be "penalized" by not being able to form a nine-letter word if such a word is available, but that holds true for White (Player #2) as well.
 



As I initially suspected, usually there are hundreds and hundreds and hundreds of possible plays in most middle-game board positions.  In fact, at times the number of possible plays for a given board position is often more than 1,000! 

Note that Catherine currently only looks one ply ahead - at the given position.  She does not look ahead any further than the given position, (two-ply) although it would be easy to modify her to do this.  Part of this reason I did not is that darn branching factor.  With hundreds and hundreds of different plays available on each turn, if I had Catherine look even just one move ahead, it would much longer to complete one full game than it does now.

When playing games against itself, Catherine begins the game by placing a random 3-letter word in the center position. As per the rules, the 3-letter word must contain three different letters. There are 1,153 different three-letter words that fit this description in the SOWPODS word list and Catherine chooses one at random for each game.  (SOWPODS is the word list used for this and other word games at the Little Golem website.)

Some of these words appear to be mirror images of each other but actually are not.  For example, take the words SAW and WAS.  Per the rules of the game, the four-letter word WASP is a valid word when SAW is the initial word, but that word is not valid at all if WAS is the initial word!

In order to make sure each game played is unique, Catherine first generates a list of all possible plays, and then sorts that list by length.  She then picks a word randomly from the list of all of the top scoring (longest) words.  (Obviously, if a given position only has one 'longest word' to choose from, that play is chosen.)

To clarify, Catherine will always plays a word that is at least equal to the longest word available, and if more than one 'longest word' exists, Catherine chooses a word from this list on a random basis.  If I did not choose a word randomly and instead chose a word alphabetically, or always chose the first long word found, etc., then games starting with the same initial seed word would all be exactly the same.

 



To generate the data below, I let the program run for many hours, usually while I was sleeping or while I was away at work.  In fact, on more than one evening I had two different computers both running the program!

I found some of the results below very surprising, the main one being I did not suspect that White (Player #2) would win nearly as often as White does!  Also, the number of draws is far less than what I would have suspected.

In addition to just keeping track of the final score and the win percentage of each player, I also kept track of the score at "halftime" - after each player had made just four moves each.  As you can see from the data, Player #2 is not only almost always winning at the midway point, Player #2 is often winning by a larger margin than the final score margin!

I also kept track of the number of times Player #2 scored less than Player #1 on the final move, and thus was "stuck" with that last cell.

I ran the program five different times, each time with different "strength" setting - the strength being vocabulary knowledge.  For example, the data below in the "8-letter words" column is from a run of 800 games played, and in each of those games Catherine looked for words up to 8 letters in length.  (Meaning all 9-letter words in length, and longer, were not in her vocabulary.)

 
 

  9-letter words 8-letter words 7-letter words 6-letter words 5-letter words
Number of Games Played 262 800 1,650 3,500 4,500
Number of wins by Player #1 17 (6.49%) 35 (4.38%) 29 (1.76%) 16 (0.46%) 3 (0.07%)
Number of wins by Player #2 209 (79.77%) 677 (84.62%) 1,492 (90.42%) 3,197 (91.34%) 3,819 (84.87%)
Number of Draws 36 (13.74%) 88 (11.00%) 129 (7.82%) 287 (8.20%) 678 (15.07%)
Average Final Score of Player #1 55.06 53.55 50.45 45.28 38.87
Average Final Score of Player #2 57.09 55.57 52.19 46.52 39.76
           
Number of Halftime Wins by Player #1 3 (1.15%) 8 (1.00%) 13 (0.79%) 10 (0.29%) 2 (0.04%)
Number of Halftime Wins by Player #2 244 (93.13%) 749 (93.62%) 1,554 (94.18%) 3,224 (92.11%) 3,821 (84.91%)
Number of Halftime Draws 15 (5.73%) 43 (5.38%) 83 (5.03%) 266 (7.60%) 677 (15.04%)
Average Halftime Score by Player #1 23.40 23.15 22.69 21.30 18.87
Average Halftime Score by Player #2 25.46 25.20 24.47 22.54 19.76
           
Number of wins by Player #1 by exactly 1 point 10 (58.82%) 24 (68.57%) 21 (72.41 %) 13 (81.25%) 2 (66.67%)
Number of wins by Player #1 by exactly 2 points 5 (29.41%) 10 (28.57%) 5 (17.24 %) 3 (18.75%) 1 (33.33%)
Number of wins by Player #1 by exactly 3 points 1 (5.88%) 1 (2.86%) 3 (10.34 %) 0 (0.00%) 0 (0.00%)
Number of wins by Player #1 by exactly 4 points 0 (0.00%) 0 (0.00%) 0 (0.00%) 0 (0.00%) 0 (0.00%)
Number of wins by Player #1 by 5 or more points 1 (5.88%) 0 (0.00%) 0 (0.00%) 0 (0.00%) 0 (0.00%)
           
Number of wins by Player #2 by exactly 1 point 55 (26.32%) 160 (23.63%) 445 (29.83 %) 2,149 (67.22%) 3,653 (95.65%)
Number of wins by Player #2 by exactly 2 points 45 (21.53%) 225 (33.23%) 743 (49.80 %) 957 (29.93%) 165 (4.32%)
Number of wins by Player #2 by exactly 3 points 55 (26.32%) 170 (25.11%) 246 (16.49 %) 84 (2.63%) 1 (0.03%)
Number of wins by Player #2 by exactly 4 points 28 (13.40%) 84 (12.41%) 50 (3.35 %) 7 (0.22%) 0 (0.00%)
Number of wins by Player #2 by 5 or more points 26 (12.44%) 38 (5.61 %) 8 (0.54 %) 0 (0.00%) 0 (0.00%)
           
Number of games Player #1 did better on final move 75 (29.63%) 183 (22.88%) 134 (8.12%) 38 (1.09%) 2 (0.04%)
Number of games Player #2 did better on final move 66 (25.19%) 129 (16.12%) 70 (4.24%) 20 (0.57%) 0 (0.00%)
Number of games final move of each player was equal 121 (46.18%) 488 (61.00%) 1,446 (87.64%) 3,442 (98.45) 4,498 (99.96%)

 

Observations:

Against two players of equal strength, who always play a word that is at least equal to the longest possible word on for that move, (and who both both do not look ahead more than the given board position), Player #2 does indeed have a very large advantage.  Player #2 wins 79% to 91% of the games outright, no matter what the vocabulary.

The average margin of victory for Player #2 increases with the vocabulary.  With a full vocabulary of 9-letter or 8-letter words or less, the average margin of victory is about two points However, note that about 37% - 39% of the time, Player #2 wins by exactly three or four points.

The frequency that Player #2 is leading the game at halftime is always larger than the final Player #2 win percentage!  For example, with a complete knowledge of all 8-letter and less words, Player #2 wins the game 84.62% of the time, but Player #2 is leading at halftime a full 93.62% of the time!

When Player #1 does win, 90% of the time it is by just one or two letters.  Only one time out of all 10,712 games did Player #1 win by four points or more.

Player #1 is often able to make up a very tiny bit of ground on his final move of each game.  For example, with a vocabulary of all 8-letter and less words, 22.88% of the time Player #1 is able to play a move that is worth more than Player #2's final move.  16.12% of the time Player #2 played a word that was worth more than Player #1s final move.  Note: When Player #2 does better on the final move, this means, of course, that this move only became possible as a result of Player #1's last move.  (Otherwise, Player #1 would have played it when it was his turn.)  A one move look-a-head feature will only benefit Player #1 in this case.

 


 

Frequency of 9-Letter Words


In the future I would like to play more 9-letter games.  262 games is a good starting sample, but I would like to eventually take a larger sample than that.  However, I don't believe my final percentages will change too much.

After the 262, 9-letter or less games were played, I later determined the total number of 9-letter words that were formed in all of those games.  I was surprised by the number - it was slightly higher than I initially would have suspected. 

The first opportunity to form a 9-letter word is on Player #2's third move.  At this point there are already eight letters on the board, making a 9-letter word possible, but somewhat unlikely.  And from that point on, of course, forming a 9-letter word will always be possible for the remainder of the game.  Thus, with the 262 games I played, there was a potential for a total of 2,882 9-letter moves.  (262 games x 11 opportunities in each game (Moves 6 through 16) = 2,882.)  The actual number of 9-letter words formed was 623, or 21.62% of the possible total.  This was a little more than I would have initially guessed.  My initial guess would have been, on the average, just one or two 9-letter words formed each game, for a grand total somewhere between 262 and 524.

 

If you're curious, there were nine games (of 262) that DID indeed start with a 9-letter word on Black's third move, the first opportunity to see such a word:

1)  sap, pass, pases, peases, pleases, pleasers, pleasures, pleasured, reassured, treasured, reassures, resamples, tressures, simplesse, masterly, sesamoid, promisers

2)  cit, etic, telic, tickle, icklest, trickles, tricklets, strickled, brickles, articles, barmiest, prickled, comparted, docilest, articled, coomiest, utricles

3)  sip, pirs, scrip, scrimp, crimps, scrumps, scrumpies, crumpiest, scrummies, crummiest, scrummier, spammiest, clumpiest, clumpish, jammiest, pessimum, lummiest

4)  ped, deep, speed, depose, reposed, reposted, respotted, replotted, bespotted, repotted, besotted, crested, cabresto, allottees, scalloper, escallops, steddies

5)  bet, beat, bleat, oblate, boblet, bomblet, bombilate, bobtail, obligate, obligato, obligati, litigate, litigator, mitigator, orbital, lobelia, bornite

6)  pes, pecs, sepic, apices, plaices, special, allspices, spadices, illapses, diallists, palliates, tailpiece, raspiest, palliator, lapstones, nosepiece, medallist

7)  kaw, wank, knawe, weaken, wakener, rewakens, reawakens, reawakes, reawaked, sneaked, rewakes, rewashed, nakedest, snatched, swatches, switched, stitched

8)  cod, docs, cosed, closed, dulcose, colludes, colluders, occluders, closured, occluded, unclouded, concluder, unsounded, concluded, resounded, occludent, flounders

9)  rad, dari, triad, airted, detrain, urinated, ruminated, ruminates, septarium, minareted, septarian, mirandise, laminates, laminated, designate, settering, retearing

In the game immediately above, Player #2 scored a total of 66 points... the most points possible with a vocabulary of just 9-letter words.  (5 points on Move #2, 7 points on Move #4, and 9 points on EACH of her other moves.)  This was the highest scoring game of all 10,712 games I played.

 


 

Interesting Games


The two games out of 4,500 played in which Player #1 scored better on the final move than Player #2, (with a vocabulary of just 5-letter words), looked like this:

Game # 253: 
Words Played:
yin, nixy, dixy, dins, dints, vints, vends, finds, hends, whets, nides, ended, oxids, edify, byded, toxin, myxo
Board Position:



Catherine correctly formed the longest move available, TOXIN (OXIDE and EXINE were also available) leaving a choice of nothing but several 4-letter words for Player #2 on the final move.  (Catherine randomly chose MYXO.)

 


Game # 2242: 
Words played:
  vum, mux, hum, umph, gump, grump, rumps, smugs, muras, farms, rasps, gursh, chugs, margs, ichs, fichu, fash
Board Position:

Catherine formed the longest move available, FICHU, leaving a choice of nothing but several 4-letter words for Player #2 on the final move.  (Catherine randomly chose FASH.)

 

This is a position from one of the two games out of 4,500 that Player #1 was ahead at halftime, with a vocabulary of nothing but 5-letter words:

Game # 524: 
Words played:
  fix, faix, waif, wuxia, fiar, rawin, xenia, exine, raine, wined, farer, ainee, mined, twine, firer, imine, neemb
Board Position:

Player #1 just formed the word 5-letter word WUXIA.  It's Player # 2's turn to play and there are no other 5-letter words possible.  Player #2 randomly picked FIAR.  For the rest of the game, nothing but 5-letter words were played, and thus Player #1 went on to win the game by one point.

 

 

The Most Interesting Game Of All
 

The one game out of all 10,712 games, in which Player #1 won by five points, looked like this:

Game # 88: 
Words played:
  miz, zimb, bima, ambit, timbal, timbral, tombal, rabbito, frabbit, rabbity, rabbitry, rabbiter, biometry, algometry, obiter, hobbitry, bizzy
Board Position:



Scenario: 
With just two moves left each, Black (Player #1) is down by one point.  Player #1 forms the word ALGOMETRY, a 9-letter word, and thus goes ahead by eight points. 

Surprisingly, the longest word now possible is just six letters long and there is only one - OBITER.  Player #2 naturally plays it.  With just one move left each, Player #1 is now ahead by two full points. 

Player #1 then responds with the 8-letter word HOBBITRY, which is now possible because of White's last move.  For the final move, Player #2 is reduced to nothing but four, five-letter words, and chooses BIZZY.  Thus, Player #1 outgained Player #2 by six full points on the final two moves and won the game by five full points!

 



 
Update:

A small change to the program was made and the 8-letter vocabulary test was rerun.

This time, Player #1, when given of choice of two or more top-scoring moves, always chose the word using the less frequently seen letter.  For example, if placing a Z on the board and forming the word ZEBRA was possible, and placing a B on the board and forming the word REBELS was also possible, then ZEBRA would be chosen since the letter Z occurs less frequently than the letter B.

The idea was to see if this 'defensive strategy' would make it slightly more difficult for Player #2 to find a reply on Player #2's next move.

This won't change the mechanics of the game, of course.  Player #2 would still has an overwhelming advantage, since Player #2 could ALSO adopt this defensive strategy.  But I was curious to see how MUCH this defensive strategy would help one of the two players.

The order of the letter frequency list was determined by the exact number of times each letter appears in all 3 to 8-letter words in the SOWPODS word list... the exact words in the program's vocabulary!

After 1,300 games played, with a vocabulary of all 8-letter words and less, Player #1's win percentage jumped up noticeably so... from 4.28% to to 15.1%!  The strategy does indeed work.

Playing 1,300 games turned out to be overkill.  Almost the exact same percentages occurred after the program had played just 800 games.  (After 800 games, the Player #1s win percentage was 15.88%,  Player #2's win percentage was 64.50%, the average final score for Player #1 was 51.11, etc.... very similar percentages to the data below, for 1,300 games.) 

When comparing the data below, since there was a large difference between runs in the number of games played, be sure to just look at the percentages and averages.

 

  8-letter words 8-letter words
but Player #1 chose the
LEAST popular letter
Number of Games Played 800 1,300
Number of wins by Player #1 35 (4.38%) 199 (15.31 %)
Number of wins by Player #2 677 (84.62%) 846 (65.08 %)
Number of Draws 88 (11.00%) 255 (19.62) %
Average Final Score of Player #1 53.55 50.98
Average Final Score of Player #2 55.57 52.09
     
Number of Halftime Wins by Player #1 8 (1.00%) 27 (2.08%)
Number of Halftime Wins by Player #2 749 (93.62%) 1,073 (82.54%)
Number of Halftime Draws 43 (5.38%) 200 (15.38%)
Average Halftime Score by Player #1 23.15 22.13
Average Halftime Score by Player #2 25.20 23.59
     
Number of wins by Player #1 by exactly 1 point 24 (68.57%) 122 (61.31 %)
Number of wins by Player #1 by exactly 2 points 10 (28.57%) 55 (27.64 %)
Number of wins by Player #1 by exactly 3 points 1 (2.86%) 16 (8.04 %)
Number of wins by Player #1 by exactly 4 points 0 (0.00%) 5 (2.51 %)
Number of wins by Player #1 by 5 or more points 0 (0.00%) 1 (0.50 %)
     
Number of wins by Player #2 by exactly 1 point 160 (23.63%) 328 (38.77 %)
Number of wins by Player #2 by exactly 2 points 225 (33.23%) 265 (31.32 %)
Number of wins by Player #2 by exactly 3 points 170 (25.11%) 155 (18.32 %)
Number of wins by Player #2 by exactly 4 points 84 (12.41%) 70 (8.27 %)
Number of wins by Player #2 by 5 or more points 38 (5.61 %) 28 (3.31 %)
     
Number of games Player #1 did better on final move 183 (22.88%) 389 (29.92%)
Number of games Player #2 did better on final move 129 (16.12%) 176 (13.54%)
Number of games final move of each player was equal 488 (61.00%) 735 (56.54%)
     
Average Number of Moves Available - Move 1 unknown 188.78
Average Number of Moves Available - Move 2 unknown 268.90
Average Number of Moves Available - Move 3 unknown 404.20
Average Number of Moves Available - Move 4 unknown 477.41
Average Number of Moves Available - Move 5 unknown 595.25
Average Number of Moves Available - Move 6 unknown 641.47
Average Number of Moves Available - Move 7 unknown 711.54
Average Number of Moves Available - Move 8 unknown 690.51
Average Number of Moves Available - Move 9 unknown 723.39
Average Number of Moves Available - Move 10 unknown 666.96
Average Number of Moves Available - Move 11 unknown 649.40
Average Number of Moves Available - Move 12 unknown 564.39
Average Number of Moves Available - Move 13 unknown 524.81
Average Number of Moves Available - Move 14 unknown 411.27
Average Number of Moves Available - Move 15 unknown 340.90
Average Number of Moves Available - Move 16 unknown 199.50

 

As you can see, the program now also keeps track of the number of possible moves (words) available during each turn.  As you can imagine, these numbers for both sides will be higher when Player #1 is not using the 'less frequently chosen letter' strategy.

With more time (and more of an interest) there are lots of other types of data and totals and percentages I could keep track of.  And as mentioned above, someday I'd like to include a larger sample of runs for 9-letter words, as well as a sample of runs with the possibility of 10 and 11-letter words.  MAYBE I will do this in the future, but I've already moved on to another programming project, so don't look for any further updates any time soon.

- Ed
October 21, 2010