Why you should not trust the Buy-and-hold [%] in Backtesting.py

quant
trading
Position Sizing
Author

Ray Lai

Published

April 13, 2025

Modified

April 13, 2025


Why you should not trust the Buy and Hold Return [%]

In our previous articles, we backtested strategies by buying or selling 1 unit of stock or 1 futures contract, and we used the Return [%] as our first preliminary benchmark. However, that comparison is not quite right, because different assets have different volatility.

This also makes some of the statistics generated by the backtesting software misleading. For example, the Buy & Hold Return [%] always assumes we invest the whole portfolio, while the Return [%] measures the growth of the account equity.

An Example of testing a strategy for SPY

Let’s illustrate the difference. Suppose we want to test an 80-day trend-following strategy, where we buy when the 80-day SMA is turning up and sell when it is turning down.

  • Buy when the 80-day SMA is turning up: \(MA_t > MA_{t-1} > MA_{t-2}\)
  • Sell when the 80-day SMA is turning down: \(MA_t < MA_{t-1} < MA_{t-2}\)

We will always be in a position. When we buy or sell, we also close the position on the other side. We will start with $10,000 of capital.
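The two rules above can be sketched in plain Python. This is only a minimal illustration, not the strategy code used in this series: the helper names `sma`, `turning_signal` and `target_position` are hypothetical, the prices are made up, and a 3-bar window stands in for the 80-day window to keep the example short.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


def turning_signal(ma: List[Optional[float]]) -> List[int]:
    """+1 when MA_t > MA_{t-1} > MA_{t-2}, -1 when MA_t < MA_{t-1} < MA_{t-2}, else 0."""
    signals = [0] * len(ma)
    for t in range(2, len(ma)):
        a, b, c = ma[t - 2], ma[t - 1], ma[t]
        if a is None or b is None or c is None:
            continue  # not enough history yet
        if c > b > a:
            signals[t] = 1
        elif c < b < a:
            signals[t] = -1
    return signals


def target_position(signals: List[int]) -> List[int]:
    """Carry the last non-zero signal forward, so we stay in the market once entered."""
    pos, out = 0, []
    for s in signals:
        if s != 0:
            pos = s
        out.append(pos)
    return out


# Made-up prices: a rise followed by a fall
prices = [10, 11, 12, 13, 14, 13, 12, 11, 10, 9]
ma = sma(prices, window=3)
signals = turning_signal(ma)
positions = target_position(signals)
print(positions)
```

Here `positions` holds the desired side on each bar (+1 long, -1 short, 0 before the first signal). Flipping from +1 to -1 corresponds to closing the long and opening the short in one step.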

Code
import pandas as pd
import yfinance as yf

# Download 3 years of daily price history for SPY
SPY = yf.Tickers("SPY")
data = SPY.history(period="3Y")

df_SPY = data.loc[:, pd.IndexSlice[:, "SPY"]]  # Select all columns for "SPY"
df_SPY.columns = df_SPY.columns.droplevel(1)  # Drop the "Ticker" level
df_SPY = df_SPY.loc[:, ["Close", "High", "Low", "Open"]]  # Keep only the price columns

Returns of SPY

Let’s get 3 years of data for SPY, and first calculate the return of the asset over the whole period, where \(P_1\) is the first close and \(P_t\) is the last close.

\[ \text{Returns}=(\frac{P_t}{P_1} - 1) \times 100\% \]

Code
print("The returns of SPY is %.2f%%" % 
  float((data["Close"]["SPY"].iloc[-1] / data["Close"]["SPY"].iloc[0] - 1) * 100))
The returns of SPY is 27.28%

We can replicate this figure in Backtesting.py with a do-nothing strategy.

Code
from importlib import reload
import strategy.DoNothingStrategy

reload(strategy.DoNothingStrategy)

from strategy.DoNothingStrategy import DoNothingStrategy
Code
from backtesting import Backtest

bt = Backtest(df_SPY,
              DoNothingStrategy,
              cash=10_000)

bt.run()
bt.plot()
[Figure: interactive Backtesting.py plot of the DoNothingStrategy run]

We can see that the Buy & Hold Return [%] is the same as the figure we calculated above. However, because the strategy never places a trade, its Return [%] is 0%.

Buying 1 share will underestimate the return

If we want to compare the Return [%] fairly against the buy-and-hold return, we cannot simply buy one share. This is because the Return [%] reported by Backtesting.py measures the growth of the account equity, not the return of the trade.

Code
import strategy.BuyOneShareStrategy

reload(strategy.BuyOneShareStrategy)

from strategy.BuyOneShareStrategy import BuyOneShareStrategy
Code
bt = Backtest(df_SPY,
              BuyOneShareStrategy,
              cash=10_000,
              finalize_trades=True)

bt.run()
bt.plot()
[Figure: interactive Backtesting.py plot of the BuyOneShareStrategy run]

Now the Return [%] is only 0.97%, even though all we did was buy one share and hold it for the whole period. This is because the Return [%] metric measures the growth of the whole $10,000 of equity, most of which stayed in cash, rather than the return of the price.
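The arithmetic behind this gap can be checked with a few lines of Python. The numbers below are hypothetical round figures, not values from the SPY backtest, and `equity_return_pct` is an illustrative helper rather than anything provided by Backtesting.py; commissions and margin are ignored.

```python
import math


def equity_return_pct(cash: float, shares: int, entry: float, exit_: float) -> float:
    """Percentage growth of total equity when `shares` are held from entry to exit."""
    final_equity = cash + shares * (exit_ - entry)
    return (final_equity / cash - 1) * 100


# Hypothetical numbers: $10,000 of cash, price moves from 400 to 500
cash, entry, exit_ = 10_000.0, 400.0, 500.0

price_return = (exit_ / entry - 1) * 100              # return of the price itself
one_share = equity_return_pct(cash, 1, entry, exit_)  # only 1 share bought
max_shares = math.floor(cash / entry)                 # largest whole number of shares the cash can buy
fully_invested = equity_return_pct(cash, max_shares, entry, exit_)

print(f"price return:           {price_return:.2f}%")
print(f"equity return, 1 share: {one_share:.2f}%")
print(f"equity return, {max_shares} shares: {fully_invested:.2f}%")
```

With one share, the $100 gain is divided by the full $10,000 of equity, so the equity return is a small fraction of the price return. Only when the cash is (almost) fully invested do the two figures line up.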

The “correct” way to evaluate the buy-and-hold strategy

The correct way to do it is to invest the maximum amount of money available. To compare the returns fairly, the strategy has to commit the same amount of capital that buy-and-hold assumes. This means our position size can no longer be fixed at one unit; it has to be calculated from the equity and the price (and, across assets, adjusted to volatility).

Here we have modified the strategy to put as much of the equity as possible into the position. Below is the code snippet.

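import math


# BaseStrategy is defined elsewhere (not shown in this snippet)
class PositionSizingStrategy(BaseStrategy):
  """
  Position Sizing Strategy Class
  """

  def get_position_size(self, price):
    """
    Get the largest whole number of shares the current equity can buy
    """
    return math.floor(self.equity / price)

  def next(self):
    # Enter once, with the full position size, when we have no open position
    if not self.position:
      self.buy(size=self.get_position_size(self.data.Close[-1]))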
Code
import strategy.PositionSizingStrategy

reload(strategy.PositionSizingStrategy)

from strategy.PositionSizingStrategy import PositionSizingStrategy
Code
bt = Backtest(df_SPY,
              PositionSizingStrategy,
              cash=10_000,
              finalize_trades=True)

stats = bt.run()
Code
stats
Start                     2022-04-12 00:00:00
End                       2025-04-11 00:00:00
Duration                   1095 days 00:00:00
Exposure Time [%]                     99.7344
Equity Final [$]                   12265.0129
Equity Peak [$]                    14290.8912
Return [%]                           22.65013
Buy & Hold Return [%]                 27.2811
Return (Ann.) [%]                     7.07147
Volatility (Ann.) [%]                19.77134
CAGR [%]                              4.81074
Sharpe Ratio                          0.35766
Sortino Ratio                         0.55573
Calmar Ratio                          0.37708
Alpha [%]                            -3.87871
Beta                                  0.97243
Max. Drawdown [%]                    -18.7534
Avg. Drawdown [%]                    -2.29408
Max. Drawdown Duration      422 days 00:00:00
Avg. Drawdown Duration       25 days 00:00:00
# Trades                                    1
Win Rate [%]                            100.0
Best Trade [%]                       23.19708
Worst Trade [%]                      23.19708
Avg. Trade [%]                       23.19708
Max. Trade Duration        1093 days 00:00:00
Avg. Trade Duration        1093 days 00:00:00
Profit Factor                             NaN
Expectancy [%]                       23.19708
SQN                                       NaN
Kelly Criterion                           NaN
_strategy                 PositionSizingSt...
_equity_curve                             ...
_trades                      Size  EntryBa...
dtype: object

One important thing to note is the entry mechanism. By default, Backtesting.py fills an order at the next bar's open. Therefore, if we look closely at the EntryBar column of the trades, we will see that the position is entered after the first bar rather than on it. This is also why the trade duration (1093 days) is shorter than the backtest duration (1095 days).

Even so, this should be a better reflection of the buy-and-hold strategy.
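To see how a next-open fill shifts the result, here is a small sketch with made-up bars. It is a simplified model, not Backtesting.py's internals: the `Bar` type and both helper functions are hypothetical, and commissions are ignored.

```python
from typing import List, Tuple

Bar = Tuple[float, float]  # (open, close)


def close_to_close_return(bars: List[Bar]) -> float:
    """Return from the first close to the last close, in percent."""
    return (bars[-1][1] / bars[0][1] - 1) * 100


def next_open_entry_return(bars: List[Bar], signal_bar: int) -> float:
    """Return, in percent, when an order placed on `signal_bar` fills at the
    next bar's open and the position is closed at the last close."""
    entry_price = bars[signal_bar + 1][0]
    return (bars[-1][1] / entry_price - 1) * 100


# Made-up bars: (open, close)
bars = [(100.0, 100.0), (102.0, 103.0), (104.0, 105.0), (110.0, 120.0)]

benchmark = close_to_close_return(bars)        # measured from the first close
traded = next_open_entry_return(bars, 0)       # order on bar 0, filled at bar 1's open

print(f"close-to-close return:   {benchmark:.2f}%")
print(f"next-open entry return:  {traded:.2f}%")
```

Because the price already moved between the first close and the fill, the traded return differs from the close-to-close figure even though both describe "buy and hold".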

Code
from itables import show

# Display the trades table, including EntryBar and EntryPrice
show(pd.DataFrame(stats._trades))
Size EntryBar ExitBar EntryPrice ExitPrice SL TP PnL ReturnPct EntryTime ExitTime Duration Tag