_{25 Dec 2022}

# Solving perfect clears with Rust and WebAssembly

_{Building a real-time Tetris perfect clear solver.}

## Introduction

One of the most difficult challenges in modern Tetris is **10 PC** – clearing 40 lines by achieving 10 consecutive perfect clears (PC), where no filled cells are left on the board after a line clear. Several guides exist on how to complete the 10 PC challenge, but this requires *hours upon hours of practice, memorization, and effort*.

Being a computer science student with too much time on his hands, I figured it would be fun to build an application that would find solutions on the fly and help new Tetris players develop their 10 PC skills in a reasonable amount of time.

I wanted my application to run on the web. After all, it seemed excessive to download and install an entire program for what amounts to a game helper tool. For the sake of my wallet, I also wanted to avoid doing any heavy computation server-side, so the solver had to run client-side. However, a regular JavaScript web application would not suffice; the solver will be computationally expensive and will need as much performance as it can get.

### Native performance with WebAssembly

Enter WebAssembly. WebAssembly is a browser standard that allows web applications to run specific parts of their logic in a near-native environment. It is a binary format which is usually compiled to from lower-level languages like C. Gone are the days where we need to download native applications for native performance. **With WebAssembly, we can deliver native performance without compromising on ease of access or delivery of content.**

I’ve been dabbling in Rust for awhile, and saw this as an opportunity to apply my newfound Rust knowledge to an interesting project. Thanks to the Rust team and their amazing work on WebAssembly compatibility with `wasm-bindgen`

, this process was surprisingly easy.

## Solver strategy

When starting a new game of Tetris, we are presented with an active piece and a queue with the next 5 pieces.

You may have noticed that each piece is made of 4 cells. Since the board is 10 cells wide, the only possible way to achieve a perfect clear is to place 10 pieces to fill 40 cells within 4 lines.

Technically 2-line perfect clears are possible with 5 pieces, but not with the starting pieces. We will come back to 2-line PCs later.

Since we can only see 6 pieces at the start of the game, the last 4 pieces to make our perfect clear must be determined stochastically. Because SRS uses a 7-bag system, the probability of seeing any next piece is not uniform and can be used to make educated guesses on what pieces should appear next. This allows us to calculate the probability of seeing any 10-piece sequence.

My approach was simple:

- use a backtracking search algorithm to explore all possible moves given the visible and guessed pieces,
- find all move sequences that end in a perfect clear and their associated probabilities,
- group sequences based on their first move and sum their probabilities,
- recommend the move with the highest summed probability to the player,
- when the player places a piece and reveals the next piece in the queue, filter out sequences that are no longer possible, and update the probabilities.

This process can be repeated until all 10 pieces are placed in a perfect clear.

With that approach in mind, let's start coding!

## Implementation

### Board representation

A Tetris board comprises 22 rows of 10 cells (20 visible rows and 2 hidden rows above the visible board) where each cell is either filled or empty. This means that a full Tetris board contains 220 cells with binary values, and can be represented as a bitfield with 220 bits. Given that the largest integer supported in WebAssembly is 64 bits, 4 `u64`

values are required to represent a full board.

```
struct Board {
fill: [u64; 4],
}
```

Because we want to memoize our backtracking search during the solver’s execution, many board states must be stored in memory. Since we are only concerned with the bottom 4 rows when solving for perfect clears, we can reduce the memory footprint of our solver by only tracking the bottom 4 rows with 2 hidden rows above. This means each board only contains 60 cells and can be stored in a single `u64`

.

```
struct Board {
fill: u64,
}
```

This simplifies and speeds up operations on the board since we can use simple bitwise operations to modify the board or check the state of the board.

### Positioning and rotating pieces

There are a multitude of different rulesets for Tetris which affect how pieces behave and interact with the game, but the most popular variant in modern Tetris is the Super Rotation System (or SRS for short). These rulesets define behaviors such as piece spawn orientation and location, rotation behaviour, and wall kicks. Despite extensive documentation of these behaviours, there are a few technical details that stood out during implementation.

In SRS, pieces are given a center point which defines their position and center of rotation. However, since some pieces are 2 or 4 cells wide, this center point can sometimes be found in between cells i.e. in half-integer coordinates. These half-integer coordinates are difficult to represent in an integer-based coordinate system and cause many complications for the calculation of rotations.

To simplify the system, we can instead treat the position coordinate as the bottom-left-most cell of a square bounding box, and standardize rotations as rotations of a 3x3 or 4x4 bounding box. This preserves the behaviors described in the SRS specification while eliminating the need to track half-integer coordinates, thereby simplifying our code overall.

### Solver state machine

The solver is modelled as a state machine with 3 general stages:

- an inactive piece stage,
- an active piece stage, and
- a decision stage.

#### Inactive piece stage

During the inactive stage, actions do not move pieces on the board but instead determine the solver’s next piece. This can be achieved by either:

- consuming the next piece in queue if available, or
- guessing the next piece in queue.

To continue solving, a given board with no active piece is branched into many possible boards with an active piece.

```
fn branch_state_for_piece(config: &Config, state: &State) -> Vec<State> {
// An active piece already exists, return the current state.
if state.game.piece.is_some() {
return vec![state.clone()];
}
// Try to consume the next piece in queue.
if let Ok(state_after_consume_queue) = state.reduce(config, &Action::ConsumeQueue) {
return vec![state_after_consume_queue];
}
// Otherwise, guess the next piece.
PIECE_KINDS
.iter()
.filter_map(|&kind| {
if state.piece_probability(kind) < 0.01 {
return None;
}
match state.reduce(config, &Action::WithNextPiece { kind }) {
Ok(state) => Some(state),
Err(_) => None,
}
})
.collect()
}
```

To account for the hold piece updating the next sequence, an additional branch is considered where the hold piece is used.

```
fn branch_game_on_hold(config: &Config, game: &Game) -> Vec<Game> {
// Either switch or do not switch the hold piece.
[true, false]
.iter()
.filter_map(|&switch| game.reduce(config, &GameAction::Hold { switch }).ok())
.collect()
}
```

#### Active piece stage

During the active stage, the game is played to move the active piece to all possible piece positions and orientations where the piece can be placed. This step is memoized as multiple movement actions can lead to the same piece position and orientation.

```
fn branch_game_to_placable_pieces(config: &Config, game: &Game) -> Vec<Game> {
let piece = game.piece.unwrap();
// Since the board and piece are given, we only need to memoize
// on the position and orientation of the active piece.
let mut memo = HashMap::new();
// Run the memoized recursive search.
generate_placable_pieces(config, game, &mut memo);
// Get all generated positions that are placable.
memo.into_iter()
.filter_map(|(k, v)| if v.is_placable { Some(k) } else { None })
.map(|(position, orientation)| Game {
piece: Some(Piece {
orientation,
position,
..piece
}),
..game.clone()
})
.collect()
}
type PlaceablePiecesKey = (Point, Orientation);
struct PlaceablePiecesValue {
is_placable: bool,
previous_key: Option<PlaceablePiecesKey>,
}
fn generate_placable_pieces(
config: &Config,
game: &Game,
memo: &mut HashMap<PlaceablePiecesKey, PlaceablePiecesValue>,
) {
let piece = game.piece.unwrap();
config
// The moves available depend on the user's configuration.
.possible_moves()
.iter()
.filter_map(|&mov| game.reduce(config, &GameAction::Move(mov)).ok())
.for_each(|next_game| {
let next_piece = next_game.piece.unwrap();
let key = (next_piece.position, next_piece.orientation);
// If the next position and orientation has already been
// visited, there's no need to revisit the node.
if memo.contains_key(&key) {
return;
}
memo.entry((next_piece.position, next_piece.orientation))
.or_insert(PlaceablePiecesValue {
is_placable: next_game.board.can_place(&next_piece.get_points(config)),
previous_key: Some((piece.position, piece.orientation)),
});
generate_placable_pieces(config, &next_game, memo);
});
}
```

## Performance

### Naive approach

The current solver explores all possible solver states by recursively branching all possible moves on each state. **In hindsight, this was not a smart decision.** For every solver state, assuming there are no pieces in the queue, we have to:

- guess approximately 7 possible next pieces for 7 states,
- account for the hold piece which doubles our possible next states to 14 states, and
- check an average of 20 possible placable positions and orientations, increasing the number of possible next states to approximately 280.

For a single perfect clear with 4 guessed pieces, that amounts to a minimum of **6 billion possible states**. Since we want the solver to provide real-time recommendations, this is not computationally feasible given the time and memory constraints.

### Memoization in a graph

Many sets of actions can result in the same solver state. To eliminate repeated work, we can memoize solver states in a graph. We use the `petgraph`

crate for its wonderful graph implementation alongside a `HashMap<SolverState, NodeIndex>`

to keep track of the node index for a given solver state.

This greatly reduces the work required to find all sequences that achieve a perfect clear, however this still does not perform well enough for a real-time solver.

### Pruning invalid states

The next easy optimization we can perform is to ignore all branches where a piece sits above the bottom 4 lines, since these states will never be able to achieve a perfect clear in 4 lines. On a 15-inch 2017 Macbook Pro, this brings our average execution time to find the first perfect clear sequence down to 5 minutes. However, not only has the solver not finished finding all valid sequences, the time taken to find a single perfect clear is still orders of magnitude too slow.

With all that we’ve learnt, it seems as if solving perfect clears computationally in real-time is an insurmountable problem, or is it?

## Future improvements

At the time of writing this post, I have spent far too much time working on this project and have decided to take a break instead. It seems as if the problem of solving perfect clears on the fly is insurmountable. However, many Tetris players are able to do in their head what my solver is not capable of. As of 2022, the record for most consecutive perfect clears belongs to Holifyre with 312 perfect clears under an hour. What’s their secret?

### Memorizing patterns

The trick to solving perfect clears quickly is to learn setups and solutions for a given piece sequence. As Clcck puts it, “70% […] comes down to memorised setups/solutions, [and] the other 30% [is] based on feel and understanding of how 7-bag works”.

I’ve mentioned 10 PC guides in the introduction but did not spend the time to learn them or understand how perfect clears are solved by regular players. With more time on my hands, it would make sense to integrate those memorized patterns into the solver to reduce the computational work required and possibly achieve reasonable solving times.

## Conclusion

If there’s one thing I’ve learnt from this project, it is the value of conventional wisdom applied to new problems. I committed to brute-forcing the problem early on with the expectation that raw computing power would prevail. However, I let my own hubris blind me from existing knowledge that would have been extremely useful.

This is my first project which focuses on algorithmic efficiency and performance, and my mistakes clearly stand out. Moving forward, I would save myself a lot of wasted time and effort by evaluating my approach with performance in mind before coding.

If you’d like to view the rest of the solver’s code as it was during the time of this post, you can view the repository on tag `v0.1`

.