Introduction to the martingalebot package in R

The martingalebot package provides functions to download cryptocurrency price data from Binance and to perform backtesting and parameter optimization for a single-pair martingale trading strategy, as implemented by single-pair DCA bots on 3commas, Pionex, TradeSanta, Mizar, OKX, Bitget and others.

Downloading price data

There are three different functions to download data from Binance: get_binance_klines(), get_binance_klines_from_csv() and get_binance_prices_from_csv(). The function get_binance_klines() downloads candlestick data directly. The user can specify the trading pair, the start and end time, and the time frame of the candles. For example, to download hourly candles for ETHUSDT from the first of January to the first of March 2025, one could specify:

get_binance_klines(symbol = 'ETHUSDT',
                   start_time = '2025-01-01',
                   end_time = '2025-03-01',
                   interval = '1h')
#>                 open_time    open    high     low   close          close_time
#>                    <POSc>   <num>   <num>   <num>   <num>              <POSc>
#>    1: 2025-01-01 00:00:00 3339.88 3345.98 3328.47 3337.78 2025-01-01 00:59:59
#>    2: 2025-01-01 01:00:00 3337.78 3365.71 3335.84 3363.70 2025-01-01 01:59:59
#>    3: 2025-01-01 02:00:00 3363.69 3366.40 3342.67 3346.54 2025-01-01 02:59:59
#>    4: 2025-01-01 03:00:00 3346.54 3368.42 3346.35 3362.61 2025-01-01 03:59:59
#>    5: 2025-01-01 04:00:00 3362.61 3363.72 3351.00 3355.20 2025-01-01 04:59:59
#>   ---                                                                        
#> 1413: 2025-02-28 20:00:00 2228.12 2238.69 2221.30 2230.15 2025-02-28 20:59:59
#> 1414: 2025-02-28 21:00:00 2230.14 2238.28 2198.51 2216.13 2025-02-28 21:59:59
#> 1415: 2025-02-28 22:00:00 2216.12 2234.60 2210.35 2225.30 2025-02-28 22:59:59
#> 1416: 2025-02-28 23:00:00 2225.31 2231.96 2209.76 2216.58 2025-02-28 23:59:59
#> 1417: 2025-03-01 00:00:00 2216.59 2239.69 2213.57 2237.59 2025-03-01 00:59:59

An advantage of get_binance_klines() is that it can download price data up to the current time. A disadvantage is that the lowest time frame for the candles is 1 minute.

The function get_binance_klines_from_csv() downloads candlestick data via csv files from https://data.binance.vision/. The advantages of this method are that it is faster for large amounts of data and that the lowest time frame for the candles is 1 second. A disadvantage is that it can only provide price data up to 1-2 days in the past, as the csv files on https://data.binance.vision are only updated once per day.

The function get_binance_prices_from_csv() also downloads price data via csv files from https://data.binance.vision/ and thus shares the same advantages and disadvantages, but it downloads aggregated trades instead of candlestick data. This allows for an even finer time resolution, as it returns all traded prices of a coin over time. Knowing the exact price at each point in time is particularly helpful for backtesting martingale bots with trailing buy and sell orders. The function get_binance_prices_from_csv() returns a data frame with only two columns. For example:

get_binance_prices_from_csv('LTCBTC',
                            start_time = '2025-01-01',
                            end_time = '2025-02-01', progressbar = FALSE)
#>                        time    price
#>                      <POSc>    <num>
#>      1: 2025-01-01 00:00:21 0.001105
#>      2: 2025-01-01 00:00:22 0.001104
#>      3: 2025-01-01 00:00:35 0.001104
#>      4: 2025-01-01 00:01:11 0.001105
#>      5: 2025-01-01 00:01:13 0.001105
#>     ---                             
#> 133794: 2025-02-01 23:59:31 0.001174
#> 133795: 2025-02-01 23:59:35 0.001173
#> 133796: 2025-02-01 23:59:38 0.001173
#> 133797: 2025-02-01 23:59:41 0.001172
#> 133798: 2025-02-01 23:59:43 0.001173

Since this function returns very large amounts of data for frequently traded pairs such as BTCUSDT, it is parallelized by default and shows a progress bar. Currently, the functions backtest and grid_search expect the price data to be in the format returned by this function.
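That format is simply a POSIXct time column and a numeric price column. As an illustration, any data frame of that shape should work as input; the sine-wave prices below are synthetic and only meant to show the expected structure:

```r
# Build a toy price series in the two-column format returned by
# get_binance_prices_from_csv(): a POSIXct `time` and a numeric `price`
toy <- data.frame(
  time  = seq(as.POSIXct("2025-01-01 00:00:00", tz = "UTC"),
              by = "1 min", length.out = 10000),
  price = 100 * (1 + 0.05 * sin(seq_len(10000) / 500))
)
head(toy)
# toy |> backtest()  # would backtest the default bot on this series
```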

Visualizing a Martingale Configuration

Before running backtests or parameter scans, you can quickly inspect how a parameter set allocates capital across safety orders, where the orders are placed, and where the take‑profit lands.

# Capital allocation by price level (horizontal stacked bars)
plot_martingale_config(
  starting_price = 100,
  n_safety_orders = 8,
  pricescale = 2.4,
  volumescale = 1.5,
  take_profit = 2.4,
  stepscale = 1.1, 
  plot_type = "allocation"
)

# Timeline view (order sequence with buy amounts and final TP point)
plot_martingale_config(
  starting_price = 100,
  n_safety_orders = 8,
  pricescale = 2.4,
  volumescale = 1.5,
  take_profit = 2.4,
  stepscale = 1.1,
  plot_type = "timeline"
)
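As a rough cross-check of what these plots display, the implied order ladder can be computed by hand. The sketch below assumes the usual DCA-bot conventions: the first safety order sits pricescale percent below the starting price, each further price step is multiplied by stepscale, and each order volume is multiplied by volumescale. The helper martingale_ladder and its uniform base_order_volume handling are illustrative, not part of the package:

```r
# Illustrative order ladder for a martingale configuration (not a package
# function). Assumes deviation steps of pricescale * stepscale^k percent
# and volumes growing by a factor of volumescale per safety order.
martingale_ladder <- function(starting_price = 100, n_safety_orders = 8,
                              pricescale = 2.4, volumescale = 1.5,
                              stepscale = 1, base_order_volume = 10) {
  steps <- pricescale * stepscale^(0:(n_safety_orders - 1))
  deviation <- cumsum(steps)  # percent below starting price
  data.frame(
    order  = 0:n_safety_orders,
    price  = c(starting_price, starting_price * (1 - deviation / 100)),
    volume = base_order_volume * volumescale^(0:n_safety_orders)
  )
}

ladder <- martingale_ladder()
tail(ladder$price, 1)  # lowest covered price
sum(ladder$volume)     # capital needed if every order fills
```

Under these assumptions, with stepscale = 1 the covered deviation is simply 8 * 2.4 = 19.2 %, which matches the covered_deviation that backtest() reports for its default settings later in this vignette.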

Performing a backtest

To perform a backtest of a martingale bot, we first download price data for a specific time period and trading pair with get_binance_prices_from_csv() and then apply backtest to it. The tested martingale bot can be configured with the parameters base_order_volume, first_safety_order_volume, n_safety_orders, pricescale, volumescale, take_profit, stepscale, stoploss, start_asap and compound.

If we don’t specify any of these arguments, the default parameter settings are used. To show the default settings, type args(backtest) or go to the help page with ?backtest.

dat <- get_binance_prices_from_csv('BONKUSDT',
                                   start_time = '2025-03-01',
                                   end_time = '2025-07-01', 
                                   progressbar = FALSE)
dat |> backtest()
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   50.4      174          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>

The backtest function returns the following measures: profit, n_trades, max_draw_down, required_capital, covered_deviation, down_tolerance, max_time, percent_inactive, n_stoploss and n_emergency_stops.

If the argument plot is TRUE, an interactive plot showing the changes in capital and price of the traded cryptocurrency over time is produced. Buys, sells and stop-losses are displayed as red, green and blue dots, respectively.

dat |> backtest(plot = TRUE)
[Interactive plot: capital (top) and price (bottom) over time. Bot profit: 50.4 %; Max draw down: 33.3 %; Safety orders: 8, Price scale: 2.4, Volume scale: 1.5, Take profit: 2.4, Step scale: 1, Stoploss: 0, Start ASAP: TRUE]

Deal start conditions

By default, new trades are started as soon as possible. If the price data set contains a logical column deal_start and the argument start_asap is set to FALSE, new deals are only started where deal_start is TRUE. We can add a deal-start condition, for example based on the Relative Strength Index (RSI), with one of the add_*_filter functions, specifying the time frame of the candles, the number of candles considered and the cutoff for creating the logical column deal_start. In the following example, new deals are only started if the hourly RSI is below 30. The plot shows that there are no more buys (red dots) at peaks of the price curve. However, the performance is slightly worse because there are now fewer trades in total.

dat |>
  add_rsi_filter(time_period = "1 hour", n = 7, cutoff = 30) |>
  backtest(start_asap = FALSE, plot = TRUE)
[Interactive plot: capital (top) and price (bottom) over time. Bot profit: 22.1 %; Max draw down: 8.8 %; Safety orders: 8, Price scale: 2.4, Volume scale: 1.5, Take profit: 2.4, Step scale: 1, Stoploss: 0, Start ASAP: FALSE]

Other useful deal-start filters provided by the package:

dat |>
  add_sma_filter(n = 100, column_name = "deal_start", price_is_above = TRUE) |>
  backtest(start_asap = FALSE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   46.0      160          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_bollinger_filter(time_period = "1 hour", n = 20, cutoff = 0.10,
                       column_name = "deal_start", signal_on_below = TRUE) |>
  backtest(start_asap = FALSE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   25.8       67          8.79             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_macd_filter(time_period = "4 hours", column_name = "deal_start",
                  macd_is_above_signal = TRUE) |>
  backtest(start_asap = FALSE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   30.6      132          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat_regime <- dat |>
  add_rsi_filter(time_period = "1 week", n = 14, cutoff = 40,
                 column_name = "is_bull_regime", rsi_is_above = FALSE) |>
  add_rsi_filter(time_period = "4 hours", n = 14, cutoff = 30,
                 column_name = "is_dip", rsi_is_above = FALSE)

dat_regime[, deal_start := is_bull_regime & is_dip]
dat_regime |>
  backtest(start_asap = FALSE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   4.73       23          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>

Emergency Stop Conditions

Emergency stops are rare, high-conviction exit signals to protect the bot from regime changes (e.g., start of a bear market) or extreme momentum down moves. The following helpers produce a logical column named emergency_stop that backtest(..., use_emergency_stop = TRUE) will honor.

dat |>
  add_rsi_filter(time_period = "1 week", n = 14, cutoff = 40,
                 column_name = "emergency_stop", rsi_is_above = FALSE) |>
  backtest(use_emergency_stop = TRUE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   5.77       19          7.45             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_death_cross_filter(column_name = "emergency_stop") |>
  backtest(use_emergency_stop = TRUE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   50.4      174          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_roc_filter(time_period = "1 day", n = 90, cutoff = -30,
                 smoothing_period = 7, column_name = "emergency_stop",
                 roc_is_below = TRUE) |>
  backtest(use_emergency_stop = TRUE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   50.4      174          33.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_sma_filter(n = 200, column_name = "emergency_stop", price_is_above = FALSE) |>
  backtest(use_emergency_stop = TRUE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1      0        0             0             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>
dat |>
  add_bollinger_filter(time_period = "1 day", n = 20, cutoff = 0.95,
                       column_name = "emergency_stop", signal_on_below = FALSE) |>
  backtest(use_emergency_stop = TRUE)
#> # A tibble: 1 × 10
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   42.5      149          20.3             503.              19.2
#> # ℹ 5 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>

Notes:
- Emergency stops should be infrequent; prefer higher time frames and conservative thresholds.
- You can combine multiple stops with an OR condition, e.g. dat[, emergency_stop := stop1 | stop2].
- Emergency stops are displayed as purple dots in the plot.
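For example, the SMA-based stop and the death-cross stop shown above can be combined into a single emergency_stop column; the intermediate column names stop_sma and stop_cross are illustrative:

```r
# Build two separate stop signals, then OR them into `emergency_stop`
dat_stops <- dat |>
  add_sma_filter(n = 200, column_name = "stop_sma",
                 price_is_above = FALSE) |>
  add_death_cross_filter(column_name = "stop_cross")

dat_stops[, emergency_stop := stop_sma | stop_cross]

dat_stops |> backtest(use_emergency_stop = TRUE)
```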

Parameter optimization

To find the best parameter set for a given time period, we can perform a grid search using the function grid_search. This function takes possible values of the martingale bot parameters, runs backtest with each possible combination of these values and returns the results as a data frame, with one row per parameter combination. Since a grid search can be computationally expensive, grid_search is parallelized by default.

By default, grid_search uses a broad range of parameters. For example, for n_safety_orders, values between 6 and 16 in steps of 2 are tested (see args(grid_search) for the default ranges of all parameters). However, we could also use, for example, values between 4 and 6 by specifying them explicitly:

res <- dat |> 
  grid_search(n_safety_orders = 4:6, progressbar = FALSE)
res
#> # A tibble: 628 × 20
#>    profit n_trades max_draw_down required_capital covered_deviation
#>     <dbl>    <int>         <dbl>            <dbl>             <dbl>
#>  1   51.2      124          32.3             640               18  
#>  2   51.2      124          32.3             640               18  
#>  3   44.0      164          32.5             640               18  
#>  4   44.0      164          32.5             640               18  
#>  5   42.4      204          33.3             423.              18  
#>  6   42.4      204          33.3             423.              18  
#>  7   39.0      251          34.1             273.              18  
#>  8   39.0      251          34.1             273.              18  
#>  9   37.7      161          33.7             423.              15.6
#> 10   37.7      161          33.7             423.              15.6
#> # ℹ 618 more rows
#> # ℹ 15 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>,
#> #   n_safety_orders <int>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, start_asap <lgl>, stoploss <dbl>,
#> #   compound <lgl>

The rows of the returned data frame are ordered by the column profit, so the first row contains the parameter set that led to the highest profit. To plot this best-performing parameter set, we can pass the values from the first row of res as arguments to backtest using purrr::exec(). This function takes a function as its first argument and the arguments to call it with, which we can splice from a one-row data frame.

# Extract the first row of the grid-search results as the best parameter set
best_params <- res |> dplyr::slice(1)
# And pass its columns to backtest using the !!! (big bang) operator
purrr::exec(backtest, !!!best_params, data = dat, plot = TRUE)

Instead of picking the most profitable parameter constellation, we could also pick the one with the best compromise between profit and max_draw_down by replacing the command slice(1) with slice_max(profit - max_draw_down).
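Spelled out, the latter looks as follows (a sketch continuing from the res object above; with_ties = FALSE keeps a single row in case of ties):

```r
library(dplyr)
library(purrr)

# Best compromise between profit and drawdown
best_balanced <- res |>
  slice_max(profit - max_draw_down, n = 1, with_ties = FALSE)

exec(backtest, !!!best_balanced, data = dat, plot = TRUE)
```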

It should be noted that the grid_search function also has arguments that restrict the search space, such as min_down_tolerance, min_covered_deviation and max_required_capital.

This can be handy because we might only want to search for optimal parameter combinations among those that have a minimum down tolerance and thus a certain robustness against sudden price drops. In that case, testing all other parameter combinations would be a waste of computation time.
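For example, to only evaluate parameter combinations with at least 12 % down tolerance (min_down_tolerance is also used in the cross-validation section of this vignette):

```r
# Only evaluate parameter combinations with at least 12 % down tolerance
res_robust <- dat |>
  grid_search(min_down_tolerance = 12, progressbar = FALSE)
```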

Instead of performing a grid search, we can also search for the best parameter combination with built-in optimization helpers.

Differential Evolution (DE)

# Optimize for profit using Differential Evolution
best_de <- de_search(
  data = dat,
  objective_metric = "profit",
  DEoptim_control = list(itermax = 50, NP = 64, trace = FALSE)
)

best_de
#> # A tibble: 1 × 18
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   80.6      126          15.0             315.              26.9
#> # ℹ 13 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   n_safety_orders <dbl>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, stoploss <dbl>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>

# Plot the best configuration found by DE
best_de %>% exec(backtest, !!!., data = dat, plot = TRUE)
[Interactive plot: capital (top) and price (bottom) over time. Bot profit: 80.6 %; Max draw down: 15 %; Safety orders: 13, Price scale: 2.5, Volume scale: 1.1, Take profit: 3.4, Step scale: 1, Stoploss: 31.4, Start ASAP: TRUE]

You can also optimize a custom metric, e.g. a simple risk-adjusted target:

best_de_custom <- de_search(
  data = dat,
  objective_metric = "profit / (1 + max_draw_down)",
  DEoptim_control = list(itermax = 40, NP = 48, trace = FALSE)
)
best_de_custom
#> # A tibble: 1 × 18
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   75.6      151          14.7             259.              27.3
#> # ℹ 13 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   n_safety_orders <dbl>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, stoploss <dbl>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>

Random Search (Latin Hypercube Sampling)

# Random search to explore the space broadly
rand <- random_search(
  data = dat,
  n_samples = 200,
  progressbar = FALSE
)

# Inspect top candidates
rand %>% 
  dplyr::slice_max(profit, n = 5)
#> # A tibble: 5 × 20
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   69.5      133          13.2             440.              24.1
#> 2   59.4       97          32.0             525.              21.3
#> 3   53.4      106          31.8             741.              18.8
#> 4   49.2      171          12.8            2627.              21.2
#> 5   45.0      190          12.8            3115.              21.3
#> # ℹ 15 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>,
#> #   n_safety_orders <dbl>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, stoploss <dbl>, start_asap <lgl>,
#> #   compound <lgl>

# Plot the best one
rand %>%
  dplyr::slice_max(profit, n = 1) %>%
  exec(backtest, !!!., data = dat, plot = TRUE)
[Interactive plot: capital (top) and price (bottom) over time. Bot profit: 69.5 %; Max draw down: 13.2 %; Safety orders: 10, Price scale: 2.9, Volume scale: 1.3, Take profit: 3.5, Step scale: 1, Stoploss: 30, Start ASAP: TRUE]

Setting Parameter Ranges

You can restrict the search space by providing lower/upper bounds (DE) or ranges (random search).

# Differential Evolution with custom bounds
best_de_bounds <- de_search(
  data = dat,
  objective_metric = "profit",
  n_safety_orders_bounds = c(6, 14),
  pricescale_bounds      = c(1.2, 3.2),
  volumescale_bounds     = c(1.0, 2.0),
  take_profit_bounds     = c(1.0, 3.0),
  stepscale_bounds       = c(0.8, 1.2),
  stoploss_bounds        = c(0, 40),
  base_order_volume_bounds         = c(10, 50),
  first_safety_order_volume_bounds = c(10, 50),
  DEoptim_control = list(itermax = 40, NP = 48, trace = FALSE)
)
best_de_bounds
#> # A tibble: 1 × 18
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   80.8      147          14.6            1096.              27.0
#> # ℹ 13 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   n_safety_orders <dbl>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, stoploss <dbl>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>

# Random search with matching ranges and pre-filters
rand_bounds <- random_search(
  data = dat,
  n_samples = 200,
  n_safety_orders_bounds = c(6, 14),
  pricescale_bounds      = c(1.2, 3.2),
  volumescale_bounds     = c(1.0, 2.0),
  take_profit_bounds     = c(1.0, 3.0),
  stepscale_bounds       = c(0.8, 1.2),
  stoploss_values        = c(0, 25, 30, 40),
  min_covered_deviation  = 8,
  min_down_tolerance     = 8,
  max_required_capital   = 10000,
  progressbar = FALSE
)
rand_bounds %>%
  dplyr::slice_max(profit, n = 1)
#> # A tibble: 1 × 20
#>   profit n_trades max_draw_down required_capital covered_deviation
#>    <dbl>    <int>         <dbl>            <dbl>             <dbl>
#> 1   62.1      117          31.8             204.              22.8
#> # ℹ 15 more variables: down_tolerance <dbl>, max_time <dbl>,
#> #   percent_inactive <dbl>, n_stoploss <int>, n_emergency_stops <int>,
#> #   base_order_volume <dbl>, first_safety_order_volume <dbl>,
#> #   n_safety_orders <dbl>, pricescale <dbl>, volumescale <dbl>,
#> #   take_profit <dbl>, stepscale <dbl>, stoploss <dbl>, start_asap <lgl>,
#> #   compound <lgl>

Cross-validation

In the previous examples, we used the same data for training and testing the algorithm. This most likely resulted in over-fitting and an over-optimistic performance estimate. A better strategy is to strictly separate training and testing data by using cross-validation.

We first download a longer time period of price data so that we have more data for training and testing:

dat <- get_binance_prices_from_csv("ATOMUSDT", 
                                   start_time = '2022-01-01',
                                   end_time = '2023-03-03', progressbar = FALSE)

Next, we split our data into multiple training and test periods. We can use the function create_timeslices to create the start and end times of the different splits. Besides the data, it takes the arguments train_months, test_months and shift_months.

For example, if we want to use 4 months for training, 4 months for testing and create training and testing periods every month, we could specify:

slices <- dat |>
  create_timeslices(train_months = 4, test_months = 4, shift_months = 1)
slices
#> # A tibble: 7 × 5
#>   period start_train         end_train           start_test         
#>    <dbl> <dttm>              <dttm>              <dttm>             
#> 1      1 2022-01-01 00:00:00 2022-05-02 18:00:00 2022-05-02 18:00:00
#> 2      2 2022-01-31 10:30:00 2022-06-02 04:30:00 2022-06-02 04:30:00
#> 3      3 2022-03-02 21:00:00 2022-07-02 15:00:00 2022-07-02 15:00:00
#> 4      4 2022-04-02 07:30:00 2022-08-02 01:30:00 2022-08-02 01:30:00
#> 5      5 2022-05-02 18:00:00 2022-09-01 12:00:00 2022-09-01 12:00:00
#> 6      6 2022-06-02 04:30:00 2022-10-01 22:30:00 2022-10-01 22:30:00
#> 7      7 2022-07-02 15:00:00 2022-11-01 09:00:00 2022-11-01 09:00:00
#> # ℹ 1 more variable: end_test <dttm>

Note that these time periods are partially overlapping. If we want to have non-overlapping time periods, we could specify shift_months = 4.
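That non-overlapping variant would be (same arguments as above, only the shift changed):

```r
# Non-overlapping splits: shift by the full training length
slices_nonoverlap <- dat |>
  create_timeslices(train_months = 4, test_months = 4, shift_months = 4)
```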

We can now perform cross-validation by iterating over the rows of slices. At each iteration, we perform a grid search for the best parameter combination using the training data and then apply this parameter combination to the test data. For simplicity, we only return the final performance in the test data.

library(tidyverse)
slices %>% 
  group_by(start_test, end_test) %>% 
  reframe({
    # Get test and training data of the present row / iteration
    train_data <- filter(dat, between(time, start_train, end_train))
    test_data <- filter(dat, between(time, start_test, end_test))
    # Find the best parameter combination in the training data
    best <- train_data |>
      grid_search(progressbar = FALSE) |>
      slice(1)
    # Apply this parameter combination to the test data
    pmap_df(best, backtest, data = test_data)
  })
#> # A tibble: 7 × 12
#>   start_test          end_test             profit n_trades max_draw_down
#>   <dttm>              <dttm>                <dbl>    <int>         <dbl>
#> 1 2022-05-02 18:00:00 2022-09-01 12:00:00 -33.2         13          69.7
#> 2 2022-06-02 04:30:00 2022-10-01 22:30:00  39.3        350          12.8
#> 3 2022-07-02 15:00:00 2022-11-01 09:00:00  49.5        148          10.5
#> 4 2022-08-02 01:30:00 2022-12-01 19:30:00  13.5        119          29.6
#> 5 2022-09-01 12:00:00 2023-01-01 06:00:00  -8.29        96          31.7
#> 6 2022-10-01 22:30:00 2023-01-31 16:30:00  29.1         30          33.7
#> 7 2022-11-01 09:00:00 2023-03-03 03:00:00   0.628       15          39.7
#> # ℹ 7 more variables: required_capital <dbl>, covered_deviation <dbl>,
#> #   down_tolerance <dbl>, max_time <dbl>, percent_inactive <dbl>,
#> #   n_stoploss <int>, n_emergency_stops <int>

We can see that only 3 of the 7 tested time periods were in profit. This is because we only maximized profitability during training, which likely led to the selection of “aggressive”, risky strategies that work well in the training set but poorly in the test set because of their limited robustness against sudden price drops. This is illustrated by the relatively small price down tolerance, which varied between 8.2 and 10.9 % for the selected parameter combinations (see column down_tolerance in the above table). A potential solution to this problem is therefore to restrict the search space to those parameter combinations that have a minimum price down tolerance of, for example, 12 %. We can do this with the argument min_down_tolerance of the grid_search function:

library(tidyverse)
slices %>% 
  group_by(start_test, end_test) %>% 
  reframe({
    train_data <- filter(dat, between(time, start_train, end_train))
    test_data <- filter(dat, between(time, start_test, end_test))
    best <- train_data |>
      grid_search(min_down_tolerance = 12, progressbar = FALSE) |>
      slice(1)
    pmap_df(best, backtest, data = test_data)
  })
#> # A tibble: 7 × 12
#>   start_test          end_test            profit n_trades max_draw_down
#>   <dttm>              <dttm>               <dbl>    <int>         <dbl>
#> 1 2022-05-02 18:00:00 2022-09-01 12:00:00 -28.0         6         67.3 
#> 2 2022-06-02 04:30:00 2022-10-01 22:30:00  13.3       249         28.8 
#> 3 2022-07-02 15:00:00 2022-11-01 09:00:00  13.4       244          9.82
#> 4 2022-08-02 01:30:00 2022-12-01 19:30:00  -6.80      286         29.4 
#> 5 2022-09-01 12:00:00 2023-01-01 06:00:00  16.6       292          7.26
#> 6 2022-10-01 22:30:00 2023-01-31 16:30:00  15.7        42         37.0 
#> 7 2022-11-01 09:00:00 2023-03-03 03:00:00   4.58       37         36.2 
#> # ℹ 7 more variables: required_capital <dbl>, covered_deviation <dbl>,
#> #   down_tolerance <dbl>, max_time <dbl>, percent_inactive <dbl>,
#> #   n_stoploss <int>, n_emergency_stops <int>

Except for the first time period, all time periods are now in profit. However, this more conservative strategy came at the price of slightly lower profits in the second and third time periods.

Alternatively, we could also select the most profitable parameter combination only among those combinations that had little draw down and did not result in “red bags” for extended periods of time. For example, to select the most profitable parameter combination among those combinations that had no more than 30% draw down and that were no longer than 3% of the time fully invested in the training period, we could do:

library(tidyverse)
slices %>% 
  group_by(start_test, end_test) %>% 
  reframe({
    train_data <- filter(dat, between(time, start_train, end_train))
    test_data <- filter(dat, between(time, start_test, end_test))
    best <- train_data |>
      grid_search(progressbar = FALSE) |>
      filter(max_draw_down < 30 & percent_inactive < 3) |>
      slice(1)
    pmap_df(best, backtest, data = test_data)
  })
#> # A tibble: 7 × 12
#>   start_test          end_test            profit n_trades max_draw_down
#>   <dttm>              <dttm>               <dbl>    <int>         <dbl>
#> 1 2022-05-02 18:00:00 2022-09-01 12:00:00 -33.1        26         69.6 
#> 2 2022-06-02 04:30:00 2022-10-01 22:30:00  13.4       448         28.2 
#> 3 2022-07-02 15:00:00 2022-11-01 09:00:00  13.0       431         10.0 
#> 4 2022-08-02 01:30:00 2022-12-01 19:30:00  21.6       336         10.1 
#> 5 2022-09-01 12:00:00 2023-01-01 06:00:00  19.0       270          8.73
#> 6 2022-10-01 22:30:00 2023-01-31 16:30:00  -1.78       74         39.0 
#> 7 2022-11-01 09:00:00 2023-03-03 03:00:00   1.05       25         39.1 
#> # ℹ 7 more variables: required_capital <dbl>, covered_deviation <dbl>,
#> #   down_tolerance <dbl>, max_time <dbl>, percent_inactive <dbl>,
#> #   n_stoploss <int>, n_emergency_stops <int>

Another option would be to select the parameter combination that maximizes a combination of measures, such as profit - max_draw_down - percent_inactive. Note that slice_max keeps tied rows by default, which is why the output below contains more than one row per test period.

library(tidyverse)
slices %>% 
  group_by(start_test, end_test) %>% 
  reframe({
    train_data <- filter(dat, between(time, start_train, end_train))
    test_data <- filter(dat, between(time, start_test, end_test))
    best <- train_data |>
      grid_search(progressbar = FALSE) |>
      slice_max(profit - max_draw_down - percent_inactive)
    pmap_df(best, backtest, data = test_data)
  })
#> # A tibble: 24 × 12
#>    start_test          end_test            profit n_trades max_draw_down
#>    <dttm>              <dttm>               <dbl>    <int>         <dbl>
#>  1 2022-05-02 18:00:00 2022-09-01 12:00:00 -33.2        13          69.7
#>  2 2022-05-02 18:00:00 2022-09-01 12:00:00  22.4       435          37.3
#>  3 2022-05-02 18:00:00 2022-09-01 12:00:00  13.4       435          41.9
#>  4 2022-05-02 18:00:00 2022-09-01 12:00:00  -6.08      419          48.9
#>  5 2022-06-02 04:30:00 2022-10-01 22:30:00  39.3       350          12.8
#>  6 2022-06-02 04:30:00 2022-10-01 22:30:00  39.3       350          12.8
#>  7 2022-06-02 04:30:00 2022-10-01 22:30:00  39.3       350          12.8
#>  8 2022-06-02 04:30:00 2022-10-01 22:30:00  39.3       350          12.8
#>  9 2022-07-02 15:00:00 2022-11-01 09:00:00  49.5       148          10.5
#> 10 2022-07-02 15:00:00 2022-11-01 09:00:00  49.5       148          10.5
#> # ℹ 14 more rows
#> # ℹ 7 more variables: required_capital <dbl>, covered_deviation <dbl>,
#> #   down_tolerance <dbl>, max_time <dbl>, percent_inactive <dbl>,
#> #   n_stoploss <int>, n_emergency_stops <int>

Instead of performing a grid search, we can also run cross‑validation with Differential Evolution (DE) using the built‑in helper:

library(tidyverse)
slices %>% 
  group_by(start_test, end_test) %>% 
  reframe({
    # Split present fold
    train_data <- filter(dat, between(time, start_train, end_train))
    test_data  <- filter(dat, between(time, start_test,  end_test))

    # Optimize on training set
    best <- de_search(
      data = train_data,
      objective_metric = "profit / (1 + max_draw_down)",
      # keep runtime reasonable for vignette
      DEoptim_control = list(itermax = 30, NP = 48, trace = FALSE)
    )

    # Evaluate on test set
    pmap_df(best, backtest, data = test_data)
  })
#> Warning: There were 7 warnings in `reframe()`.
#> The first warning was:
#> ℹ In argument: `{ ... }`.
#> ℹ In group 1: `start_test = 2022-05-02 18:00:00` `end_test = 2022-09-01
#>   12:00:00`.
#> Caused by warning in `DEoptim::DEoptim()`:
#> ! For many problems it is best to set 'NP' (in 'control') to be at least ten times the length of the parameter vector. 
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 6 remaining warnings.
#> # A tibble: 7 × 12
#>   start_test          end_test            profit n_trades max_draw_down
#>   <dttm>              <dttm>               <dbl>    <int>         <dbl>
#> 1 2022-05-02 18:00:00 2022-09-01 12:00:00  10.4       185         21.3 
#> 2 2022-06-02 04:30:00 2022-10-01 22:30:00  13.9       170         32.2 
#> 3 2022-07-02 15:00:00 2022-11-01 09:00:00   6.54      934          4.59
#> 4 2022-08-02 01:30:00 2022-12-01 19:30:00  34.0       155          8.53
#> 5 2022-09-01 12:00:00 2023-01-01 06:00:00  11.2       140         23.8 
#> 6 2022-10-01 22:30:00 2023-01-31 16:30:00  24.1       306         32.0 
#> 7 2022-11-01 09:00:00 2023-03-03 03:00:00   3.08      119         33.6 
#> # ℹ 7 more variables: required_capital <dbl>, covered_deviation <dbl>,
#> #   down_tolerance <dbl>, max_time <dbl>, percent_inactive <dbl>,
#> #   n_stoploss <int>, n_emergency_stops <int>