Invalid daily aggregation of OHLC data with timezone/datetime offset #1002

Open
h0wXD opened this issue Jun 13, 2023 · 5 comments
Labels
question Not a bug, but a FAQ entry

Comments

@h0wXD

h0wXD commented Jun 13, 2023

Expected Behavior

resample('D') should take into account the correct trading day when the dates carry a timezone offset (issue with date parsing?)

Actual Behavior

resample('D') of hourly candles puts the equity sample on the weekend instead of Friday. The position entry was clearly on Friday, so the equity balance should also fall on Friday, not Saturday.

Steps to Reproduce

I added log lines (see below) and ran the sample strategy on AAPL 1H data exported from TradingView. The only way to get the candles and entries plotted correctly in my timezone (+8) is to include the timezone offset in the export. With that, both the plot and the entries/exits match 100% (see the trades table), except that in backtesting.py the last position entry uses the final candle's close instead of its open.
In my C# code, the equity at the end of day on Friday 2022-08-26 is 10077.1. In backtesting.py that value is moved to Saturday, leading to incorrect results on lower timeframes.
I compared a daily ('D') backtest in both my program and backtesting.py and the results are equal, so I think backtesting.py is not taking the datetime offset into account for candles with a lower interval.

    day_returns = np.array(np.nan)
    annual_trading_days = np.nan
    if isinstance(index, pd.DatetimeIndex):
        day_returns = equity_df['Equity'].resample('D').last().dropna().pct_change()
        equity_df['Equity'].to_csv("Equity.csv")
        equity_df['Equity'].resample('D').last().dropna().to_csv("EquityD.csv")
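
# Imports as used in the backtesting.py quick-start example
from backtesting import Backtest, Strategy
from backtesting.lib import crossover
from backtesting.test import SMA


class SmaCross(Strategy):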
    n1 = 50
    n2 = 100

    def init(self):
        close = self.data.Close
        self.sma1 = self.I(SMA, close, self.n1)
        self.sma2 = self.I(SMA, close, self.n2)

    def next(self):
        if crossover(self.sma1, self.sma2):
            self.buy()
        elif crossover(self.sma2, self.sma1):
            self.sell()

bt = Backtest(AAPL1H, SmaCross,
              cash=10000, commission=.00,
              exclusive_orders=True,)

Additional info

AAPL1H.csv
Equity.csv
EquityD.csv
image
Some C# logic I wrote shows the first change in portfolio balance on Friday 2022-08-26
image
backtesting.py logic shows the first change in portfolio balance on Saturday 2022-08-27
image

  • Backtesting version: 0.3.3
  • bokeh.__version__: 3.1.1
  • OS: Win 10
@kernc
Owner

kernc commented Jun 13, 2023

In Equity.csv, the first change occurs:

2022-08-27 01:30:00+08:00,10000.0
2022-08-27 02:30:00+08:00,10028.392  <--
2022-08-27 03:30:00+08:00,10077.1
2022-08-29 21:30:00+08:00,10173.56

In EquityD.csv, this shows as:

2022-08-27 00:00:00+08:00,10077.1

which I guess is reasonable since the two dates match.

Can you use:

df.index = df.index.tz_convert(None)

before passing df to Backtest()?
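
A minimal sketch of why this workaround shifts the daily bucket, using only the four rows quoted from Equity.csv above. The variable names are illustrative and this is not code from backtesting.py. It assumes that pandas bins `resample('D')` by calendar days in the index's own timezone, and that `tz_convert(None)` converts to UTC and then drops the timezone:

```python
import pandas as pd

# Hourly equity samples copied from the Equity.csv excerpt above (+08:00 offset).
equity = pd.Series(
    [10000.0, 10028.392, 10077.1, 10173.56],
    index=pd.to_datetime([
        "2022-08-27 01:30:00+08:00",
        "2022-08-27 02:30:00+08:00",
        "2022-08-27 03:30:00+08:00",
        "2022-08-29 21:30:00+08:00",
    ]),
    name="Equity",
)

# Daily buckets follow local (+08:00) midnight, so the first change lands on Saturday.
local_daily = equity.resample("D").last().dropna()

# Convert to UTC and drop the timezone, as suggested in the thread.
utc_naive = equity.copy()
utc_naive.index = utc_naive.index.tz_convert(None)

# The same samples now fall on the UTC calendar day, which is Friday.
utc_daily = utc_naive.resample("D").last().dropna()

print(local_daily)
print(utc_daily)
```

Both series hold the same values; only the calendar day each sample is assigned to differs.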

@h0wXD
Author

h0wXD commented Jun 14, 2023

@kernc that works perfectly, thanks for the quick response

After running the following before passing the data to Backtest():

AAPL1H.index = AAPL1H.index.tz_convert(None)

the equity results now match my previously shared C# sample:

2022-08-23,10000.0
2022-08-24,10000.0
2022-08-25,10000.0
2022-08-26,10077.1
2022-08-29,10214.5
2022-08-30,10362.1
2022-08-31,10466.5
2022-09-01,10418.5
2022-09-02,10545.7
2022-09-06,10624.3
2022-09-07,10541.5
2022-09-08,10628.5

EquityD.csv
Equity.csv

Do you reckon this should be built-in to backtesting.py?

@kernc
Owner

kernc commented Jun 14, 2023

Do you reckon this should be built-in to backtesting.py?

I'm not too certain. If the user prefers timestamps in TZ-aware UTC time, I'm thinking, why override it? In all respects, the user (should) know what they are doing. And it's a simple enough workaround.

@kernc kernc changed the title from "Invalid handling of OHLC data with datetime offset" to "Invalid daily aggregation of OHLC data with timezone/datetime offset" Jun 14, 2023
@kernc kernc closed this as completed Jun 14, 2023
@kernc kernc added the question Not a bug, but a FAQ entry label Jun 14, 2023
@h0wXD
Author

h0wXD commented Jun 14, 2023

I still think the library should handle this when users supply unintended datetime formats, because a datetime dataset with an offset causes invalid backtest results. This is a date handling issue. The datasets used above are default TradingView exports with only the CSV headers changed to ,Open,High,Low,Close,Volume,VolMa. When the TradingView chart is set to UTC, exported dates are in the format "2022-08-03T16:30:00Z". When exporting from your local timezone, they are in the format "2022-08-04 00:30:00+08:00". If this is not supported or leads to invalid backtest results, it would be nice to at least show a warning message. Thanks for the quick response and the time spent building this amazing library!
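
To illustrate the two export formats, here is a small standard-library sketch. `parse_export_timestamp` is a hypothetical helper, not part of TradingView or backtesting.py. Assuming the two sample strings describe the same candle, they parse to the same instant but carry different calendar dates, and the calendar date is what a daily grouping keys on:

```python
from datetime import datetime


def parse_export_timestamp(text: str) -> datetime:
    """Parse an ISO 8601 timestamp with an offset (hypothetical helper)."""
    # Older Python versions of fromisoformat() reject a trailing "Z",
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


utc_export = parse_export_timestamp("2022-08-03T16:30:00Z")
local_export = parse_export_timestamp("2022-08-04 00:30:00+08:00")

# Timezone-aware datetimes compare by instant, not by wall-clock fields.
same_instant = utc_export == local_export

# The wall-clock calendar date differs between the two representations.
utc_date = utc_export.date()
local_date = local_export.date()

print(same_instant, utc_date, local_date)
```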

@kernc
Owner

kernc commented Jun 14, 2023

using date time dataset with offset causes invalid backtest results

Those results are not invalid! In Greenwich, it was simply already Saturday when the trade closed!

I feel this change would force a behavior which then couldn't be reverted. Maybe we can indeed issue a warning if timezone offset is present somewhere around here:

    if not isinstance(data.index, pd.DatetimeIndex):
        warnings.warn('Data index is not datetime. Assuming simple periods, '
                      'but `pd.DateTimeIndex` is advised.',
                      stacklevel=2)
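
One way such a warning might look, as a rough sketch only. `warn_if_tz_aware` is a hypothetical name and the message wording is an assumption; nothing here is taken from the backtesting.py source. It relies on `DatetimeIndex.tz` being `None` for timezone-naive indexes:

```python
import warnings

import pandas as pd


def warn_if_tz_aware(index: pd.Index) -> bool:
    """Warn when a datetime index carries a timezone (hypothetical helper).

    Returns True if a warning was issued, False otherwise.
    """
    if isinstance(index, pd.DatetimeIndex) and index.tz is not None:
        warnings.warn(
            "Data index is timezone-aware; daily statistics are grouped by "
            "calendar days in that timezone. Use "
            "`df.index = df.index.tz_convert(None)` to group by UTC days.",
            stacklevel=2,
        )
        return True
    return False


aware = pd.to_datetime(["2022-08-27 01:30:00+08:00"])
naive = pd.to_datetime(["2022-08-27 01:30:00"])

aware_warned = warn_if_tz_aware(aware)  # index has a +08:00 offset
naive_warned = warn_if_tz_aware(naive)  # index has no timezone
```

A warning, rather than an automatic conversion, leaves the user's chosen timezone untouched, which matches the concern raised above about forcing a behavior that could not be reverted.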

@kernc kernc reopened this Jun 14, 2023