Winning by always betting on the favourite?
Disclaimer: I am not a betting expert. Please do not take this as financial advice. Do your own research before making any bets.
I did this analysis just for fun. I am not sure it has any practical value; I was simply curious whether always betting on the favourite would be a good strategy. My intuitive answer is that it cannot be, because the odds on the favourite are always lower and the favourite does not win every time.
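To put a number on that intuition: with decimal odds, a winning 1€ bet returns a profit of odds - 1 and a losing one costs 1€, so a flat-stake bet only breaks even if it wins at least 1/odds of the time. A minimal sketch (the function name is my own, purely for illustration):

```python
def break_even_win_rate(decimal_odds: float) -> float:
    """Minimum share of bets that must win for a flat stake to break even.

    Expected profit per 1 unit staked is p * (odds - 1) - (1 - p) = p * odds - 1,
    which is zero when p = 1 / odds.
    """
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


# A favourite priced at 1.40 has to win more than roughly 71% of its matches
rate = break_even_win_rate(1.40)
print(f"{rate:.3f}")
```

The lower the odds, the higher the hit rate the favourite needs just to avoid a loss.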
Betting data
To answer the question, I collected the data for 3 different leagues:
- English Premier League
- Spanish La Liga
- German Bundesliga
The data contains all matches from the 2018/2019 season to the 2023/2024 season.
The raw data looks like this:
| | id | home | away | home-name | away-name | sport-id | date-start-timestamp | result | homeResult | awayResult | home-winner | away-winner | postmatchResult | country-id | country-name | 1_avg_odds | x_avg_odds | 2_avg_odds | 1_max_odds | x_max_odds | 2_max_odds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3263915 | 8002902 | 8002903 | Liverpool | Manchester City | 1 | 1538926200 | 0:0 | 0 | 0 | draw | draw | 0:0 | 198 | England | 2.75 | 3.60 | 2.60 | 2.75 | 3.60 | 2.60 |
| 1 | 3263917 | 8002906 | 8002907 | Southampton | Chelsea | 1 | 1538918100 | 0:3 | 0 | 3 | lost | win | 0:3 | 198 | England | 6.50 | 4.20 | 1.57 | 6.50 | 4.20 | 1.57 |
| 2 | 3263913 | 8002898 | 8002899 | Fulham | Arsenal | 1 | 1538910000 | 1:5 | 1 | 5 | lost | win | 1:5 | 198 | England | 5.00 | 4.40 | 1.66 | 5.00 | 4.40 | 1.66 |
| 3 | 3263916 | 8002904 | 8002905 | Manchester Utd | Newcastle | 1 | 1538843400 | 3:2 | 3 | 2 | win | lost | 3:2 | 198 | England | 1.40 | 4.75 | 10.00 | 1.40 | 4.75 | 10.00 |
| 4 | 3263911 | 8002894 | 8002895 | Burnley | Huddersfield | 1 | 1538834400 | 1:1 | 1 | 1 | draw | draw | 1:1 | 198 | England | 2.35 | 3.00 | 3.80 | 2.35 | 3.00 | 3.80 |
The data contains everything needed to calculate the return on a bet, but it had to be cleaned and transformed before it could be used.
The following preprocessing steps were applied:
- Removes specified columns
- Converts timestamp to datetime
- Adds a winner column
- Maps the winner column to numeric values
- Converts result columns to integers
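The steps above can be sketched for a single match as follows. This is an illustration in plain Python rather than the polars code actually used; the list of dropped columns and the draw/home/away numbering are inferred from the before and after tables, the timestamps are assumed to be UTC, and `preprocess` is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Assumed numbering, read off the transformed table: draw -> 0, home -> 1, away -> 2
WINNER_TO_NUM = {"draw": 0, "home": 1, "away": 2}

# Columns that appear in the raw table but not in the transformed one
DROP_COLUMNS = {
    "id", "sport-id", "date-start-timestamp", "postmatchResult",
    "country-id", "country-name", "1_max_odds", "x_max_odds", "2_max_odds",
}


def preprocess(row: dict) -> dict:
    """Apply the five preprocessing steps to one raw match record."""
    # 1. Remove the specified columns
    out = {key: value for key, value in row.items() if key not in DROP_COLUMNS}
    # 2. Convert the Unix timestamp to a datetime (UTC assumed)
    out["date"] = datetime.fromtimestamp(row["date-start-timestamp"], tz=timezone.utc)
    # 3. Add a winner column
    if row["home-winner"] == "win":
        winner = "home"
    elif row["away-winner"] == "win":
        winner = "away"
    else:
        winner = "draw"
    out["winner"] = winner
    # 4. Map the winner to a numeric value
    out["winner_num"] = WINNER_TO_NUM[winner]
    # 5. Convert the result columns to integers
    out["home_result"] = int(row["homeResult"])
    out["away_result"] = int(row["awayResult"])
    return out


raw = {
    "id": 3263917, "home": 8002906, "away": 8002907,
    "home-name": "Southampton", "away-name": "Chelsea", "sport-id": 1,
    "date-start-timestamp": 1538918100, "result": "0:3",
    "homeResult": "0", "awayResult": "3",
    "home-winner": "lost", "away-winner": "win", "postmatchResult": "0:3",
    "country-id": 198, "country-name": "England",
    "1_avg_odds": 6.50, "x_avg_odds": 4.20, "2_avg_odds": 1.57,
    "1_max_odds": 6.50, "x_max_odds": 4.20, "2_max_odds": 1.57,
}
clean = preprocess(raw)
print(clean["date"], clean["winner"], clean["winner_num"])
```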
After transforming the data, the table looks like this:
| | home | away | home-name | away-name | result | homeResult | awayResult | home-winner | away-winner | 1_avg_odds | x_avg_odds | 2_avg_odds | date | winner | winner_num | home_result | away_result |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8002902 | 8002903 | Liverpool | Manchester City | 0:0 | 0 | 0 | draw | draw | 2.75 | 3.60 | 2.60 | 2018-10-07 15:30:00 | draw | 0 | 0 | 0 |
| 1 | 8002906 | 8002907 | Southampton | Chelsea | 0:3 | 0 | 3 | lost | win | 6.50 | 4.20 | 1.57 | 2018-10-07 13:15:00 | away | 2 | 0 | 3 |
| 2 | 8002898 | 8002899 | Fulham | Arsenal | 1:5 | 1 | 5 | lost | win | 5.00 | 4.40 | 1.66 | 2018-10-07 11:00:00 | away | 2 | 1 | 5 |
| 3 | 8002904 | 8002905 | Manchester Utd | Newcastle | 3:2 | 3 | 2 | win | lost | 1.40 | 4.75 | 10.00 | 2018-10-06 16:30:00 | home | 1 | 3 | 2 |
| 4 | 8002894 | 8002895 | Burnley | Huddersfield | 1:1 | 1 | 1 | draw | draw | 2.35 | 3.00 | 3.80 | 2018-10-06 14:00:00 | draw | 0 | 1 | 1 |
Betting strategy
I decided to build a class for the betting strategy, so that it can easily be reused and adjusted for different strategies. Comparing different strategies is out of scope for this post, though. Let's focus on the simplest one: always betting on the favourite.
from abc import ABC, abstractmethod

import polars as pl


class BettingStrategy(ABC):
    def __init__(self, df: pl.DataFrame, target_columns: list[str]):
        self.df = df
        self.target_columns = target_columns

    @abstractmethod
    def add_bet(self, df: pl.DataFrame) -> pl.DataFrame:
        """
        Add a column to the dataframe which will be the bet number (1, 0 or 2)
        """
        pass

    def add_bet_won_column(self, df):
        """
        Add a column to the dataframe which will be True if the bet has won.
        """
        return df.with_columns(bet_won=pl.col("winner_num").eq(pl.col("bet")))

    def add_odds_to_use(self, df):
        """
        Add a column to the dataframe which will be the odds to use for the bet.
        """
        # Note: the odds are selected by the actual outcome (winner_num), not by
        # the bet. calculate_return only reads this column when the bet has won
        # (bet == winner_num), so the computed return is not affected.
        return df.with_columns(
            odds_to_use=pl.when(pl.col("winner_num").eq(1))
            .then(pl.col("1_avg_odds"))
            .when(pl.col("winner_num").eq(2))
            .then(pl.col("2_avg_odds"))
            .otherwise(pl.col("x_avg_odds"))
        )

    def calculate_return(self, df):
        """
        Add a column to the dataframe which will be the return on the bet.
        """
        # Profit per unit staked: odds - 1 for a winning bet, -1 for a losing one
        return df.with_columns(
            return_on_bet=pl.when(pl.col("bet_won"))
            .then(pl.col("odds_to_use") - 1)
            .otherwise(pl.lit(-1))
        )

    def get_underdog(self, df):
        """
        Add a column to the dataframe which will be the underdog team (1=home, 2=away, or None)
        """
        return df.with_columns(
            underdog=pl.when(pl.col("1_avg_odds") < pl.col("2_avg_odds"))
            .then(pl.lit(2))
            .when(pl.col("2_avg_odds") < pl.col("1_avg_odds"))
            .then(pl.lit(1))
            .otherwise(pl.lit(None))
        )

    def get_favourite(self, df):
        """
        Add a column to the dataframe which will be the favourite team (1=home, 2=away, or None)
        """
        return df.with_columns(
            favourite=pl.when(pl.col("1_avg_odds") < pl.col("2_avg_odds"))
            .then(pl.lit(1))
            .when(pl.col("2_avg_odds") < pl.col("1_avg_odds"))
            .then(pl.lit(2))
            .otherwise(pl.lit(None))
        )

    def has_favourite_won(self, df):
        """
        Add a column to the dataframe which will be True if the favourite has won.
        """
        return df.with_columns(
            has_favourite_won=pl.col("favourite").eq(pl.col("winner_num"))
        )

    def has_underdog_won(self, df):
        """
        Add a column to the dataframe which will be True if the underdog has won.
        """
        return df.with_columns(
            has_underdog_won=pl.col("underdog").eq(pl.col("winner_num"))
        )

    def apply_strategy(self):
        """
        Apply the strategy to the dataframe.
        """
        prep_df = (
            self.df.pipe(self.get_underdog)
            .pipe(self.get_favourite)
        )
        bet_df = self.add_bet(prep_df)

        # check if the method has been implemented correctly
        if "bet" not in bet_df.columns:
            raise ValueError(
                "The add_bet method has not been implemented correctly. "
                "Please add a column called 'bet'."
            )

        # check if the bet column only contains 1, 0 or 2
        required_bet_values = [0, 1, 2]
        if not bet_df["bet"].is_in(required_bet_values).all():
            raise ValueError(
                "The add_bet method has not been implemented correctly. "
                f"Please add a column called 'bet' with the values {required_bet_values}."
            )

        return (
            bet_df.pipe(self.add_bet_won_column)
            .pipe(self.add_odds_to_use)
            .pipe(self.has_favourite_won)
            .pipe(self.has_underdog_won)
            .pipe(self.calculate_return)
        )
The class I defined is an abstract class, which means that it cannot be instantiated directly, but it can be used as a base class for concrete strategies. Several methods are common to all strategies, such as add_bet_won_column and calculate_return. The add_bet method, however, is abstract and has to be implemented in every subclass. With the help of the abstract class, we can define the BetAlwaysOnFavourite strategy like this:
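class BetAlwaysOnFavourite(BettingStrategy):
    def add_bet(self, df: pl.DataFrame) -> pl.DataFrame:
        # Places the bet on outcome 1 (home win) for every match.
        # To follow the side computed by get_favourite, use pl.col("favourite") instead.
        return df.with_columns(bet=pl.lit(1))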
As you can see, the implementation is straightforward: only the add_bet method needs to be implemented. More complicated strategies might need to override other methods as well.
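For reference, the rule that get_favourite encodes, and what a bet that follows it would look like, can be written out in plain Python. This is only a sketch to make the logic explicit; `pick_favourite` and `bet_won` are illustrative names and not part of the class above, and treating a match without a favourite as "no bet" is my own assumption.

```python
from typing import Optional


def pick_favourite(home_odds: float, away_odds: float) -> Optional[int]:
    """Return 1 if the home side has the lower odds, 2 if the away side has, else None."""
    if home_odds < away_odds:
        return 1
    if away_odds < home_odds:
        return 2
    return None  # equal odds: no favourite


def bet_won(bet: Optional[int], winner_num: int) -> bool:
    """A bet wins only if one was placed and it matches the outcome (0=draw, 1=home, 2=away)."""
    return bet is not None and bet == winner_num


# (1_avg_odds, 2_avg_odds, winner_num) taken from three of the sample rows
matches = [(2.75, 2.60, 0), (6.50, 1.57, 2), (1.40, 10.00, 1)]
for home_odds, away_odds, winner_num in matches:
    bet = pick_favourite(home_odds, away_odds)
    print(bet, bet_won(bet, winner_num))
```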
Applying the strategy to the data:
betting_strategy = BetAlwaysOnFavourite(df, target_columns)
result = betting_strategy.apply_strategy()
We will get the following result:
| | home | away | home-name | away-name | result | homeResult | awayResult | home-winner | away-winner | 1_avg_odds | x_avg_odds | 2_avg_odds | date | winner | winner_num | home_result | away_result | underdog | favourite | bet | bet_won | odds_to_use | has_favourite_won | has_underdog_won | return_on_bet |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8002902 | 8002903 | Liverpool | Manchester City | 0:0 | 0 | 0 | draw | draw | 2.75 | 3.60 | 2.60 | 2018-10-07 15:30:00 | draw | 0 | 0 | 0 | 1.0 | 2.0 | 1 | False | 3.60 | False | False | -1.0 |
| 1 | 8002906 | 8002907 | Southampton | Chelsea | 0:3 | 0 | 3 | lost | win | 6.50 | 4.20 | 1.57 | 2018-10-07 13:15:00 | away | 2 | 0 | 3 | 1.0 | 2.0 | 1 | False | 1.57 | True | False | -1.0 |
| 2 | 8002898 | 8002899 | Fulham | Arsenal | 1:5 | 1 | 5 | lost | win | 5.00 | 4.40 | 1.66 | 2018-10-07 11:00:00 | away | 2 | 1 | 5 | 1.0 | 2.0 | 1 | False | 1.66 | True | False | -1.0 |
| 3 | 8002904 | 8002905 | Manchester Utd | Newcastle | 3:2 | 3 | 2 | win | lost | 1.40 | 4.75 | 10.00 | 2018-10-06 16:30:00 | home | 1 | 3 | 2 | 2.0 | 1.0 | 1 | True | 1.40 | True | False | 0.4 |
| 4 | 8002894 | 8002895 | Burnley | Huddersfield | 1:1 | 1 | 1 | draw | draw | 2.35 | 3.00 | 3.80 | 2018-10-06 14:00:00 | draw | 0 | 1 | 1 | 2.0 | 1.0 | 1 | False | 3.00 | False | False | -1.0 |
The last column of the table shows the return on each bet. The sum of this column is the total return of the strategy. In our case the sum is -300.87, which means that staking 1€ on every match would have left us 300.87€ down over the whole dataset. This is obviously a losing strategy.
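The arithmetic behind the return_on_bet column can be checked by hand on the five sample rows. The sketch below mirrors the rule in calculate_return (odds - 1 for a win, -1 for a loss); the `stake` parameter and the function name are my own additions for illustration.

```python
def return_on_bet(bet: int, winner_num: int, odds: float, stake: float = 1.0) -> float:
    """Profit of a single bet: stake * (odds - 1) if it wins, otherwise -stake."""
    return stake * (odds - 1.0) if bet == winner_num else -stake


# (bet, winner_num, odds of the outcome that was bet on) for the five sample rows
rows = [
    (1, 0, 2.75),  # Liverpool - Manchester City, draw
    (1, 2, 6.50),  # Southampton - Chelsea, away win
    (1, 2, 5.00),  # Fulham - Arsenal, away win
    (1, 1, 1.40),  # Manchester Utd - Newcastle, home win
    (1, 0, 2.35),  # Burnley - Huddersfield, draw
]
returns = [return_on_bet(bet, winner, odds) for bet, winner, odds in rows]
total = sum(returns)
print([round(r, 2) for r in returns], round(total, 2))
```

Four losing bets cost 1€ each and the single winner pays back only 0.40€, which is how one short-priced win fails to cover the losses.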
Furthermore, we can plot the cumulative sum of the returns to see how the strategy performs over time:
There are some periods where the strategy performs better, but the overall trend is negative: over the long run the strategy loses money.
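The series behind such a plot is just a running total of the per-bet returns in chronological order. A small sketch with the five sample rows (the sample table lists matches newest first, so they are sorted by date before accumulating; the variable names are illustrative):

```python
from itertools import accumulate

# (date, return_on_bet) for the five sample rows, in the order they appear in the table
bets = [
    ("2018-10-07 15:30:00", -1.0),
    ("2018-10-07 13:15:00", -1.0),
    ("2018-10-07 11:00:00", -1.0),
    ("2018-10-06 16:30:00", 0.4),
    ("2018-10-06 14:00:00", -1.0),
]

# ISO-formatted timestamps sort chronologically when compared as strings
ordered = sorted(bets)

# Running total of the returns: this is the curve that would be plotted
cumulative = list(accumulate(ret for _, ret in ordered))
print([round(value, 2) for value in cumulative])
```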
Statistics
I was curious to see how the strategy performs when it is run many times, each time on a different part of the data. To do so, I created a function that runs the selected strategy multiple times on randomly selected games.
def run_strategy_n_times(
    strategy: type[BettingStrategy],
    df: pl.DataFrame,
    n_times: int,
    n_games: int,
    target_columns: list[str],
) -> list[float]:
    """
    Run a betting strategy multiple times on randomly sampled subsets of data.

    This function applies a given betting strategy to randomly sampled subsets of the input
    DataFrame multiple times and returns a list of the total returns for each run.

    Args:
        strategy (type[BettingStrategy]): The betting strategy class to be applied.
        df (pl.DataFrame): The input DataFrame containing the full dataset of games and their information.
        n_times (int): The number of times to run the strategy.
        n_games (int): The number of games to sample for each run of the strategy.
        target_columns (list[str]): The target columns to be used in the strategy.

    Returns:
        list[float]: A list containing the total return on bets for each run of the strategy.

    Notes:
        - Each run samples without replacement (the default of DataFrame.sample), so a game
          appears at most once within a run but may appear in several different runs.
    """
    results = []
    for _ in range(n_times):
        # Instantiate the strategy on a fresh random sample and apply it
        result = strategy(
            df.sample(n_games, shuffle=True), target_columns
        ).apply_strategy()
        results.append(result["return_on_bet"].sum())
    return results
Let's run the strategy 100,000 times with 100 random games each:
If we plot the results we can see the distribution of the returns:
The simulation gives an approximately normal distribution with a mean of -4.01. This is an estimate of the expected return after 100 games when always betting on the favourite: with a 1€ stake per game, that is a loss of about 4 cents per bet.
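The resampling idea itself does not depend on polars. Below is a dependency-free sketch that draws games without replacement and summarises the totals; the pool consists only of the five sample returns shown earlier, so the numbers it prints are not the results reported in this post, and `simulate_totals` is a hypothetical helper name.

```python
import random
import statistics
from typing import Optional


def simulate_totals(
    returns: list[float], n_times: int, n_games: int, seed: Optional[int] = None
) -> list[float]:
    """Total return of n_times runs, each summing n_games returns drawn without replacement."""
    rng = random.Random(seed)
    return [sum(rng.sample(returns, n_games)) for _ in range(n_times)]


# Toy pool: the per-bet returns of the five sample rows (not the full dataset)
pool = [-1.0, -1.0, -1.0, 0.4, -1.0]

totals = simulate_totals(pool, n_times=2000, n_games=3, seed=42)
mean_total = statistics.mean(totals)
spread = statistics.stdev(totals)

# The mean of the simulated totals should sit close to n_games * mean(pool)
expected_total = 3 * statistics.mean(pool)
print(round(mean_total, 2), round(spread, 2), round(expected_total, 2))
```

The same relationship holds for the full simulation: the mean total over 100 games is roughly 100 times the average return per bet.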
Conclusion
As expected, always betting on the favourite is a losing strategy. However, I was surprised to see how much we lose on average, and if we keep betting, the losses keep accumulating. With that said, I would not recommend always betting on the favourite.