Points for Tricks

Part II: Exploration

Elby
5 min read · Sep 27, 2020

This blog is the second part of a series about my favourite card game. The aim of this series of posts is to find the optimal scoring system for the game. For more information, please see the first instalment.

Preamble: Developments Since the Last Post

Some digging has been done since the last post, and the game itself has been identified! It appears that the official name of the game is Oh Hell, or more specifically, the British variant thereof. If the entire project fails, at the very least we won’t have to argue about what the game is actually called anymore. I choose to take this as a significant result.

Some more digging has revealed a (large) number of scoring systems that exist around the world. This will be a useful starting point for this mini-project, as we can use these existing scoring systems to identify desirable scoring features, rather than starting from scratch.

Introduction

The aim of this post is to take a deep(er) dive into the mathematical mechanics of the game, in order to gain a better understanding of how various scoring systems will affect overall outcomes. Using this understanding, we will be in a better position to simulate actual games under various scoring methods, and thereby determine the best scoring system according to the criteria outlined in the first post. The simulations themselves, and how best to construct them using Python, will be the subject of future posts.

We will begin with a brief recap of the rules of the game, before then looking into the mathematical issues that will need to be accounted for in any simulation.

Recap: Oh Hell

The game, in brief, goes like this: four players are each dealt a single card, and a single card from the remaining stack is turned over, revealing the trump suit for the round. Each player makes a bid for the number of tricks they expect to win, and if they predict their result correctly (or, in Oh Hell lingo, if they fulfil their contract), they win points. In the next round, the players are dealt two cards each and bid to win the two available tricks. The game proceeds in this way up to the thirteenth round (in which all the cards are dealt, and therefore there is no trump suit), which is played four times, and then all the way back down to one again, for a total of 28 rounds.
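The round structure described above can be sketched in a few lines of Python (the function name `round_schedule` is my own, purely for illustration):

```python
# Sketch of the 28-round schedule: rounds 1 to 13, with round 13
# played four times in total, then back down to 1.
def round_schedule():
    up = list(range(1, 14))        # rounds 1..13
    plateau = [13] * 3             # three further 13-card rounds
    down = list(range(12, 0, -1))  # rounds 12..1
    return up + plateau + down

schedule = round_schedule()
print(len(schedule))  # 28
```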

Finally, recall that the final bidder of the four is not permitted to bid in such a way that the total number of bids made is equal to the round number (e.g., in round three, the total number of bids cannot be equal to three).

Endless (Almost) Possibilities

In the first round, each player can bid for either one or zero tricks, which means that there are 2⁴ = 16 possible bid combinations in total for the round. Some of these will sum to one, however, and therefore will not be allowed. In this case, there are four possible bid combinations that sum to one, and therefore there are 12 allowed bid combinations.
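These round-one counts are small enough to verify by brute force:

```python
from itertools import product

# Each of the four players bids 0 or 1; combinations summing to exactly
# one (the round number) are disallowed by the final-bidder rule.
bids = list(product([0, 1], repeat=4))
disallowed = [b for b in bids if sum(b) == 1]
allowed = [b for b in bids if sum(b) != 1]
print(len(bids), len(disallowed), len(allowed))  # 16 4 12
```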

In general, the number of possible bid combinations in round n is given by (n + 1)⁴ (each of the four players can bid anywhere from zero to n), and the number of disallowed bid combinations can be shown to be expressible as follows (note that relevant proofs will be included in an appendix to this blog series):

I have a truly marvellous demonstration of this proposition which this caption is too narrow to contain
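Without reproducing the formula here, the count can at least be checked numerically. Assuming the disallowed combinations are exactly those in which four non-negative bids sum to n, a standard stars-and-bars argument gives C(n + 3, 3) of them (for n = 1, that is C(4, 3) = 4, matching the round-one count above):

```python
from itertools import product
from math import comb

# Check the stars-and-bars count C(n + 3, 3) against brute-force
# enumeration of all four-player bid combinations summing to n.
for n in range(1, 7):
    brute = sum(1 for b in product(range(n + 1), repeat=4) if sum(b) == n)
    assert brute == comb(n + 3, 3), (n, brute)
print("stars-and-bars count matches for n = 1..6")
```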

Over the course of the 28 rounds, the total number of possible allowed bid combinations is approximately 10⁹⁶, and for each of these bid combinations there are even more possible outcomes depending on whether or not each player fulfils their contract. It is clear, therefore, that the project of finding the optimal scoring system cannot be based on a brute-force approach in which every possible game scenario is simulated and scored under different systems. If a simulation were to run every nanosecond, it would take approximately 3 × 10⁷⁹ years to cover all the possible bid combinations, not even taking into account the outcomes of each round. Note for illustrative purposes that the universe is approximately 1.38 × 10¹⁰ years old. In short, a different approach is required.
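The estimate above can be reproduced (under the same assumption that round n has (n + 1)⁴ minus C(n + 3, 3) allowed bid combinations) by multiplying across the 28-round schedule:

```python
from math import comb, log10

# Allowed bid combinations per round, multiplied over the full schedule
# (rounds 1..13, three extra 13-card rounds, then 12 back down to 1).
schedule = list(range(1, 14)) + [13] * 3 + list(range(12, 0, -1))
total = 1
for n in schedule:
    total *= (n + 1) ** 4 - comb(n + 3, 3)

print(round(log10(total), 1))               # roughly 96 orders of magnitude
years = total / 1e9 / (365.25 * 24 * 3600)  # one simulation per nanosecond
print(f"about {years:.0e} years")
```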

Oh Heavens!

As it is clear that we cannot simulate every possible game outcome, we will have to choose some subset to simulate. The question now becomes: how do we decide which outcomes to include?

If we consider the full list of possible outcomes, we can imagine that many (or even most) of them are perhaps not representative of a real game played by human players. For example, the full list will contain a game in which one player has 13 terrible cards in round 13, but still bids for 13 tricks. This type of scenario, while not impossible, is implausible assuming a minimum level of player competence. It is worth considering, therefore, whether a subset of realistic outcomes could be simulated, where the outcomes included in the subset would be more representative of real play and would exhibit at least a semblance of strategic thinking.

Suppose, say, that we were somehow able to codify competent play into a random game; that is, randomly deal out the cards to each player over the course of 28 rounds, and generate a game outcome based on a few simple guidelines (for example, bidding for zero tricks in round one if the player is dealt the 2 of hearts when spades are the trump). If it is possible to account for merely competent play in this way, a round outcome could be generated from a random deck, and therefore an entire game outcome could be generated over the 28 rounds. If this can be done once, there is no reason why it couldn’t be done ten times, hundreds of times, or even thousands of times. Each game outcome could then be scored based on the various possible scoring systems, and the results could be evaluated in terms of the criteria of merriment discussed in the previous post. I believe this would be the best way to proceed, since it would generate enough data to be worth looking into, while discarding the vast majority of implausible game outcomes.
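As a minimal sketch of what such a "merely competent" bidder might look like: the heuristic below (expect a trick for each trump or each ace/king held) is purely illustrative, not a claim about good play, and the final-bidder restriction is omitted for brevity:

```python
import random

RANKS = list(range(2, 15))  # 2..10, J=11, Q=12, K=13, A=14
SUITS = ["clubs", "diamonds", "hearts", "spades"]

def deal(n_cards, n_players=4):
    """Deal n_cards to each player from a shuffled deck; turn a trump card
    if any cards remain (in the 13-card round there is no trump)."""
    deck = [(rank, suit) for rank in RANKS for suit in SUITS]
    random.shuffle(deck)
    hands = [deck[i * n_cards:(i + 1) * n_cards] for i in range(n_players)]
    used = n_players * n_cards
    trump = deck[used][1] if used < len(deck) else None
    return hands, trump

def naive_bid(hand, trump):
    """Illustrative heuristic: one expected trick per trump or per ace/king."""
    return sum(1 for rank, suit in hand if suit == trump or rank >= 13)

hands, trump = deal(3)
round_bids = [naive_bid(hand, trump) for hand in hands]
```

Repeating this for every round in the schedule yields one full game outcome; repeating that thousands of times yields the dataset the scoring systems can then be evaluated against.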

Codifying Competence

The next step, therefore, is to come up with a system of rules for a simulation to play with.
