Rarest knight forks in chess history

2023 July 23
chess

What pair of pieces have only been forked a single time in professional chess history1by which I mean games played by high-level players with long time controls. – or perhaps never?

A knight fork, for our purposes, is when a knight is moved to a position from which it attacks two or more opponent’s pieces, pawns excluded. This is a somewhat broader definition of fork than is typically used2Typically we only are interested in forks which threaten a material advantage, so the knight must not be in a position where it can be captured, and must threaten two or more pieces which are undefended or of higher material value. However the boundaries of what exactly falls under this stricter definition is a bit fuzzy and depends on evaluation. For example, if the knight is pinned, does that count as a fork?, but unambiguous. At most 7 pieces can be forked at a time as the square the knight came from cannot hold an opponent’s piece.

While most pairs of pieces have been attacked in countless different games in chess history, there is one specific pair that is so unlikely I believed it would have never happened at all. Let’s walk through how I set up a simple python program to analyze historical chess games and the rarest knight forks in history.

Scanning a chess database

Chess databases are a big deal in professional chess, and there are extensive tools for sophisticated search of chess games, a problem very unlike standard relational database lookups. However ChessBase, possibly the best(?) database of high-level chess games, costs 500 euro (the software alone with no database is 150 euro!), and most online game search tools are ill-suited for this question.

The best freely available database I found was Caissabase, containing 4874268 games, the large majority of which with at least one player rated about 2000 ELO, and mostly much higher than that. The earliest game is from 1610 and the median year is 2009. About 90% were played over-the-board38.45% were played on chess.com, 1.36% were played on chess24.com, and .2% were played on lichess.. There are a few computer vs computer games (from the TCEC) and it includes games at rapid or bullet time controls, but I have included every game from Caissabase as part of my search4If for no other reason than because accurately filtering them is hard. Lichess also publishes a free database of all 4.6 billion rated games played on lichess but this will include lots of games I am uninterested in (low level or rapid time controls) and, more importantly, does not have any pre-computer games. (And is also far too large for my code to handle.)

Slightly irritatingly Caissabase is in a format that can be read only by the open-source SCID vs PC so the first step is to download and compile that software and use it to export the database as a single flat text file in PGN format, growing the size 5 fold to 3.5 GB. This now lacks any of the advanced search functionality the database had (though unhelpful to this problem) but can now be processed at will.

PGN, unlike computer-friendly formats like LAN or UCI, requires significant computation to parse because redundant information is omitted, and so the parser must keep track of the game state and have full knowledge of all rules of chess. E.g., disambiguating a move might depend on one of the pieces being pinned, and thus the parser must verify whether a move would place a king in check.

I considered writing my own parser (building off of the terrible chess engine I wrote in Cython) but opted for the easier route of using the excellent and fully-featured, though slow, chess library python-chess. My code will not be fast enough to handle lichess’s 4.6b games, and will take about 30 hours to go through the Caissabase, but writing it was very painless, taking only a few hours and mostly working on the first try.

Let’s dive in. I will omit some boilerplate or obvious code. First, we want to read the database as a stream of chess games:

def games(max_count = None):
    count = 0
    # The database is in UTF8 with BOM
    with open(pgn_file, 'r', encoding = 'utf-8-sig') as f:
        while (max_count is None) or (count < max_count):
            count += 1
            c = chess.pgn.read_game(f)
            if c is None:
                return
            yield c

I also wrote a function find_game_by_index for seeking to a particular game without wasting time parsing all of the preceding games. Let us test whether a particular board state has a knight fork:

def get_forked(board, sq, target_color):
    hit = board.occupied_co[target_color] & chess.BB_KNIGHT_ATTACKS[sq]
    hit -= board.pawns & hit
    if hit > 0 and chess.popcount(hit) >= 2:
        result = chess.popcount(board.kings & hit)
        result *= 8
        result += chess.popcount(board.queens & hit)
        result *= 8
        result += chess.popcount(board.rooks & hit)
        result *= 8
        result += chess.popcount(board.bishops & hit)
        result *= 8
        result += chess.popcount(board.knights & hit)
        return result
    else:
        return 0

Here, occupied_co[color] is a bit board (i.e., an integer representing a sequence of 64 bits) of all squares having pieces of that color. sq is the square the knight is on, and BB_KNIGHT_ATTACKS is a bit board of squares attacked from there. board.pawns etc are bit boards of locations of pieces. popcount computes the number of 1 bits in an integer. We return an integer representing the collection of attacked pieces in base 8.

Now we have to iterate through all board states of each game:

# g         game
# forks     dictionary from fork type to all games with that fork
# idx       index of this game in the database
def compute_forks(g, forks, idx):
    g = g.next()        # advances the game state by one move
    while not (g is None):
        sq = g.move.to_square
        b = g.board()
        if b.knights & chess.BB_SQUARES[sq]:
            f = get_forked(b, sq, b.turn)

            if f in forks:
                forks[f].add(idx)
            else:
                forks[f] = set()
                forks[f].add(idx)
        g = g.next()

g.move is the move that had just been played, and to_square is the square moved to. BB_SQUARES converts a square into a bit board of only that square (equivalent to 1 << sq). b.turn is which color is about to move.

Finally, it is a simple matter of iterating over the whole database and printing progress.

def find_forks(max_count = None, outfile = 'forks_listing'):
    start = time.time()
    fork2games = {}

    def status():
        dur = (time.time() - start) / 60
        table = write_forks_table(fork2games)
        with open(outfile, 'w') as f:
            f.write(table)
        print(table)
        print(f'{dur:.2f} minutes elapsed')

    for idx, g in enumerate(games(max_count)):
        if idx > 0 and idx % 1000 == 0:
            status()
            print('Next game will be', idx)
        compute_forks(g, fork2games, idx)

    status()

python-chess even comes with utilities for displaying chess boards as SVG files. Since we want to actually see any interesting forks we find:

def render_fork_on_board(idx, fork_string, outfile):
    g = find_game_by_index(idx)

    # Scan for the fork
    g = g.next()
    while not (g is None):
        sq = g.move.to_square
        b = g.board()
        if b.knights & chess.BB_SQUARES[sq]:
            f = get_forked(b, sq, b.turn)
            if f > 0 and fork2str(f) == fork_string:
                break
        g = g.next()

    svg = chess.svg.board(
                g.board(),
                lastmove = g.move,
                arrows = [(g.move.from_square, g.move.to_square)],
                size = 1000)
    with open(outfile, 'w') as f:
        f.write(svg)

Now as of this writing, the program is only about half way and 1043 minutes through the database.

Intermission

Intermission

Intermission

Rare forks

A total of 29.6 hours later, we have our results!

First, let’s take a quick diversion to look at some of the rarest forks seen in chess.

Double-queen forks are very rare, with only 40 games featuring such forks in my database. Two games however go further, with a KQQ fork. (There were no double-queen forks with any other piece.)

In the above game, as more pieces began to fall I feared I had a bug or somehow selected the wrong game, but then was very satisfied to see the fork unfold elegantly. Both black queens are promoted.

Here the black king was driven across the board into the midst of the two black queens, with the white knight lagging behind. White delivered the fork and then immediately resigned (normally one resigns after one’s opponent’s move; clearly white already knew he was lost and just wanted to go out with a spectacular fork).

Tigran L Petrosian, named after the unrelated World Champion Tigran Petrosian and also a chess grandmaster himself, is best known for cheating and insulting Wesley So so egregiously that he caused his team to forfeit the finals of the tournament he was in and earned lifetime bans from multiple venues.

Across all games there was only a single 6-way knight fork, shown below. However it was simply a piece trade: white captured a bishop with the knight and was in turn captured by black’s knight, so the presence of the grand fork did not influence the course of the game.

Now back to our original question: how rare are specific pairs of pieces to be forked? Here I have tabulated the number of games in the database in which at least one fork of the specified pair occurs. Note that I have only included forks of exactly two pieces (so a KQR fork is not a KQ fork).

King Queen Rook Bishop Knight
King - 92256 228873 218087 145917
Queen 92256 40 194557 382769 298664
Rook 228873 194557 78387 399765 181290
Bishop 218087 382769 399765 1 671303
Knight 145917 298664 181290 671303 38449

QQ forks are of course very rare, as you only start with one queen. But the rarest of all – which I had predicted have never occurred – is the BB fork.

Why? A knight can only fork two pieces on the same colored square. To fork two bishops means one of the bishops must be promoted. But bishop promotions are very rare! I counted (approximately) 323831 queen promotions in my database, compared to 4837 knight promotions, 3538 rook promotions, and only 976 bishop promotions. While knights are occasionally useful if it allows you to promote with check, there is almost no circumstance in which a rook or bishop is better than a queen; even in highly contrived chess problems5Something I have a little experience with bishop promotions are very rare, as there is only so much you can do with them.

Therefore I expect most bishop promotions in chess play to be either showing off in highly forcing lines leading directly to mate (427 of the bishop promotions are with check(mate)), or are immediately captured the turn after promotion (by eyeball this seems to be super common), or both. Neither of those circumstances are conducive to getting forked by a knight. Moreover, knights are often traded off in the early and midgame well before any promotions can occur.

Yet, despite this, exactly one game featured a BB fork. Let’s take a look:

The game was blitz (typically this means 3 to 5 minute time controls) between two quite highly rated players – white is a grandmaster. The computer analysis shows wild swings in evaluation, with multiple massive blunders on both sides. Indeed the bishop promotion is one such blunder, bringing the game from a theoretical draw to a crushing advantage for black that white never recovers from.

Or is it? Both players ignore the hanging black rook on g8 for several moves, in a way that even amateur players on bullet time controls would never do, and blitz is comparatively a leisurely stroll. With 3 minutes, GMs absolutely have plenty of time to notice pieces blatantly hanging.

However, what they may not have time to do is correctly record their moves on their score pad. (Generally keeping notation live is not required in blitz games.) I strongly suspect there are multiple transcription errors in this game. The black rook on g8 does not move again after turn 41 and is ignored by both sides – I believe it is not really there, and possibly never even reached g8, with some errors around turn 38.

The white bishop on d7 is also ignored by both sides after it moves there on turn 35, and I again suspect that it is not actually there, though this is much more unclear than with the rook. Possibly the pawn never promoted to a bishop and instead the existing bishop moved to e8. If so, then the two bishops forked by the knight were actually the same bishop.

If I am right that the bishop and rook are not actually there, then the final position is an easy draw instead of black to mate in 9. (White won: presumably on time. Or maybe even that is a typo, as a draw agreement would be likely in the final position.)

While I was unable to reconstruct a plausible alternate game history, it seems certain there at least some transcription errors with the rooks, and I am fairly confident the bishop-bishop fork never happened.

Update: It was suggested that the pawn actually promoted to a queen, the rook on g8 captured it, and then the bishop on d7 captured that, then the game proceeded as recorded. If so, then the bishop and rook are missing, as stated, and the game is plausible though there are still some blunders. It is possible that if the promoted queen had not been seated properly on the square then an electronic board might have failed to record the move pair, and ended up thinking it was a pawn promoting to a bishop. Corrected game here. Thus there are no known bishop-bishop forks in professional chess history. There was, however, a close miss, in which on move 50 a pawn promoted to a bishop in a forked position (this game is in my chess database, but does not count as a fork as I defined it above).

Update 2: I also looked for games with the most (under)promotions, but most of them were clearly silly. The most I saw was QQRRNNBBBB, 1 game had QQQQQNB, and 2 games each of 7Q, 5N, and 5B. Most rooks was 4.

For completeness’ sake, here is the inventory of all knight forks found in the Caissabase:

fork count
BN 671303
RB 399765
QB 382769
QN 298664
KR 228873
KB 218087
QR 194557
RN 181290
KN 145917
KQ 92256
RR 78387
QBN 58958
QRB 44602
NN 38449
KRB 26791
RBN 26712
KRN 23100
QRN 20336
KQR 17254
KBN 15848
RRB 12656
KQB 10321
BNN 8228
KQN 7500
QRR 7185
RRN 6183
KRR 4898
QNN 4198
QRBN 3629
RNN 2386
KRBN 1714
KNN 1676
KQRB 1399
KQRN 1240
QRRB 1158
QBNN 914
RRBN 696
KQBN 569
QRRN 541
KRRB 324
KRRN 297
RBNN 287
KQRR 276
KRNN 258
QRNN 233
KBNN 151
KQRBN 74
KQNN 70
RRNN 68
QRRBN 55
QQ 40
QRBNN 33
KQRRB 17
KRRBN 12
KQRRN 12
RRBNN 7
QRRNN 7
KQBNN 7
KQRNN 6
KRBNN 4
KRRNN 2
KQQ 2
KQRRNN 1
BB 1

Follow RSS/Atom feed for updates.