BT2450 – The Test
Legend: ??? = program crashed while analyzing.

| # | Answer | Chess Tiger move | Time (s) | ChessGenius (ARM) move | Time (s) | ChessGenius (68K) move | Time (s) | PocketChess move | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| (BT) ELO | | 2114 | | 2244 | | 1812 | | 1647 | |
| 1 | Nxg7 | Nxg7 | 117 | Nxg7 | 36 | Qb3 | 900 | Qb3 | 900 |
| 2 | Bxb6 | Bxb6 | 342 | Bxb6 | 53 | Bxb6 | 62 | Be3 | 900 |
| 3 | Re6 | Re6 | 331 | Re6 | 28 | Re6 | 87 | Nh3 | 900 |
| 4 | Qf7 | Rb1 | 900 | Qf7 | 35 | Rf1 | 900 | Qb1 | 900 |
| 5 | Ka6 | Ka6 | 69 | Kc5 | 900 | Kb4 | 900 | Bf3+ | 900 |
| 6 | e3 | e3 | 347 | e3 | 0 | e3 | 81 | Re8 | 900 |
| 7 | Rd6 | Rd6 | 53 | Rd6 | 0 | Rc2 | 900 | Nc4 | 900 |
| 8 | Rxc6+ | Rxc6 | 102 | Rxc6+ | 23 | Rxc6+ | 2 | Rxc6+ | 236 |
| 9 | g5 | g5 | 1 | cxd4 | 900 | Rf8 | 900 | Bxe5 | 900 |
| 10 | Rxg7+ | Bh6 | 900 | Rxg7+ | 1 | c4 | 900 | c4 | 900 |
| 11 | Qxh2+ | Qxh2 | 536 | Qxh2+ | 1 | Qh5 | 900 | Qh5 | 900 |
| 12 | Qe4 | Qe4 | 44 | Qf5 | 900 | Qf5 | 900 | b6 | 900 |
| 13 | Nb4 | Ne3 | 900 | Nb4 | 63 | Ne3 | 900 | Qc5 | 900 |
| 14 | Rxh7 | Rxh7 | 93 | Rxh7 | 1 | Qxb5 | 900 | Qxb5 | 900 |
| 15 | Rg6 | Rg6 | 1 | Rg6 | 0 | Rg6 | 1 | Rg6 | 332 |
| 16 | g6 | g6 | 7 | g6 | 0 | g6 | 327 | ??? | 900 |
| 17 | Qxf4 | Qxf4 | 68 | Qxf4 | 3 | Rxf4 | 900 | Rxf4 | 900 |
| 18 | d6 | a5 | 900 | d6 | 18 | d6 | 380 | Kxd4 | 900 |
| 19 | f3 | Bxf1 | 900 | f3 | 33 | Bxf1 | 900 | Bxf1 | 900 |
| 20 | Ra2 | Ra2 | 106 | Ra2 | 35 | Ng5 | 900 | Ng5 | 900 |
| 21 | Re6 | Re6 | 226 | Re6 | 4 | Qe4 | 900 | Qe4 | 900 |
| 22 | a3 | a3 | 13 | a3 | 35 | cxd5 | 900 | Bxf6 | 900 |
| 23 | Qf6 | Qf6 | 63 | Qf6 | 2 | Qf6 | 192 | Bh6 | 900 |
| 24 | g6 | Ra1 | 900 | Ra1 | 900 | Ra5 | 900 | Ra5 | 900 |
| 25 | Nd3 | Kc5 | 900 | Nd3 | 289 | Kc5 | 900 | g2 | 900 |
| 26 | f5 | f5 | 2 | Re5 | 900 | Qe2 | 900 | f5 | 546 |
| 27 | e6 | e6 | 356 | e6 | 2 | Qxe4 | 900 | ??? | 900 |
| 28 | Ne4 | Ne4 | 1 | Ne4 | 0 | Ne4 | 1 | Ne4 | 238 |
| 29 | Ke1 | Ke1 | 17 | Qd8 | 900 | Ke1 | 1 | Ke1 | 240 |
| 30 | f4 | d4 | 900 | f4 | 125 | Rg8+ | 900 | Rg8+ | 900 |
Hubert Bednorz and Fred Toennissen developed two test suites to measure the tactical capability of chess engines: BT2450 and its predecessor, BT2630.
Each test suite contains 30 positions, and the BT2450 test can be seen here. A chess engine is given 15 minutes (900 seconds) to analyze each position.
If a position is solved, the solution time is recorded in seconds. It doesn't count as a solution if the engine finds the move and then changes its mind. If the engine finds the move, changes its mind, then finds the move again, the second time is used. Any position that is not solved scores as 900 seconds. The 30 times are averaged, and that average is subtracted from 2450 to give the (BT) ELO rating.
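The scoring rules above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the `(seconds, move)` log format, the function names, and the sample log are invented for the example, and rounding the result to the nearest whole number is a guess at how the published figures were obtained. The list of times is the Chess Tiger column from the table.

```python
import math

TIME_LIMIT = 900    # seconds allowed per position (15 minutes)
BASE_RATING = 2450  # starting value for the BT2450 suite


def solution_time(events, answer, limit=TIME_LIMIT):
    """Return the time at which the engine settled on `answer` for good.

    `events` is a chronological list of (seconds, move) pairs, one per
    change of the engine's preferred move (a hypothetical log format).
    """
    found_at = None
    for seconds, move in events:
        if seconds > limit:
            break  # anything after the time limit is ignored
        if move == answer:
            if found_at is None:
                found_at = seconds  # start of a new stretch on the right move
        else:
            found_at = None  # engine changed its mind; the earlier find is void
    return found_at if found_at is not None else limit


def bt_rating(times, base=BASE_RATING):
    """Subtract the average solution time from the base rating."""
    return base - sum(times) / len(times)


# Chess Tiger's 30 times, copied from the table (900 = not solved).
CHESS_TIGER_TIMES = [
    117, 342, 331, 900, 69, 347, 53, 102, 1, 900,
    536, 44, 900, 93, 1, 7, 68, 900, 900, 106,
    226, 13, 63, 900, 900, 2, 356, 1, 17, 900,
]

# Invented log: the engine finds Nxg7, drops it, then finds it again.
sample_log = [(3, "Qb3"), (40, "Nxg7"), (90, "Qb3"), (117, "Nxg7")]
print(solution_time(sample_log, "Nxg7"))   # 117 (the second find counts)

rating = bt_rating(CHESS_TIGER_TIMES)
print(rating, math.floor(rating + 0.5))    # 2113.5 2114
```

The times sum to 10095 seconds, for an average of 336.5, and 2450 - 336.5 = 2113.5, which agrees with the 2114 shown for Chess Tiger once rounded.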
Again, these tests were constructed to measure tactical capability, not necessarily positional capability. So an engine's (BT) ELO rating may not be an accurate representation of its true ELO rating.
ChessGenius doesn't have a custom "Time per move" option, but it does have an "Analyze Game" setting. So I set the time controls to "Game in 60 minutes" and got the stopwatch ready. I used the small board view so I could watch the main line of thinking. As soon as it locked on to the right answer, I marked that time.
PocketChess, on the other hand, does have a custom "Time per move" option. So after loading each position, I reset the time controls to 15 minutes per move, then asked for a Hint. Unlike ChessGenius, PocketChess doesn't show its main line. So I timed the little progress clock and presumed (maybe erring to its benefit, but probably erring slightly to its detriment) that it had reached its final decision when that clock hit 12:00 (completely filled). If a "Main line" option is added sometime in the future, I'll be happy to re-test it and hopefully get more accurate results.
The Results
After running these tests, I was surprised by a couple of things. First, I was surprised to see the difference in ratings. After seeing the games and ratings against Chessmaster, I expected ChessGenius to do a little better. There were a couple of positions I fully expected ChessGenius to solve, but it didn't happen. Either way, Chess Tiger cleaned house, solving 22 of the 30 problems.
My second surprise came from PocketChess. On two of these tests, PocketChess crashed. For reasons unknown to me, it gave a "Fatal Exception" that caused a Reset. On #16, it thought for around 6 minutes before crashing; on #27, it thought for around 8.5 minutes before crashing. Tinyware is aware of this problem.

