-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clearsky functions handling tz-aware datetimes as tz-naive, contrary to user guide #2054
Comments
I'm not quite following. You write "Two functions clearsky.lookup_linke_turbidity and irradiance.get_extra_radiation is tz-naive but has been used as if they were tz-aware" but in the clear-sky user guide, the
|
Hi, @cwhanse Yes, that localized For instance, try converting the timezone of the tus = Location(32.2, -111, 'US/Arizona', 700, 'Tucson')
times = pd.date_range(start='2016-07-01', end='2016-07-04', freq='1min', tz=tus.tz)
cs = tus.get_clearsky(times) # ineichen with climatology table by default
cs_utc = tus.get_clearsky(times.tz_convert('utc')).tz_convert(tus.tz) # ineichen with climatology table by default
assert cs.index.equals(cs_utc.index)
assert not ((cs < 0).any().any() or (cs_utc < 0).any().any())
print(len(cs))
df = cs.join(cs_utc, how = 'outer', rsuffix = '_utc')[~ cs.fillna(-1).eq(cs_utc.fillna(-1)).all(axis = 'columns')]
print(df.index.hour.unique(), df.index.minute.nunique())
df
Does that difference conform to an expected behavior? I think On the other hand, you can also see on the user guide another Another example from my analysis above: # times = pd.date_range(start='2015-01-01', end='2016-01-01', freq='1D')
times_h = pd.date_range(start='2015-01-01', end='2016-01-01', freq='1h')
times_h_kst = pd.date_range(start='2015-01-01', end='2016-01-01', freq='1h').tz_localize('utc').tz_convert('Asia/Seoul')
# sites = [(32, -111, 'Tucson1'), (32.2, -110.9, 'Tucson2'),
# (33.5, -112.1, 'Phoenix'), (35.1, -106.6, 'Albuquerque')]
lat, lon, name = 35.1, -106.6, 'Albuquerque'
# for lat, lon, name in sites:
turbidity_h = pvlib.clearsky.lookup_linke_turbidity(times_h, lat, lon, interp_turbidity=False)
turbidity_h_kst = pvlib.clearsky.lookup_linke_turbidity(times_h_kst, lat, lon, interp_turbidity=False).tz_convert(None)
assert turbidity_h.index.equals(turbidity_h_kst.index)
assert not ((turbidity_h < 0).any() or (turbidity_h_kst < 0).any())
print(len(turbidity_h))
df = pd.concat([turbidity_h, turbidity_h_kst], axis = 'columns', join = 'outer')[~ turbidity_h.fillna(-1).eq(turbidity_h_kst.fillna(-1))]
print(df.index.hour.unique(), df.index.minute.nunique())
df
Is this an expected behavior? Isn't To summarize, the user guide has considered those functions as if tz-aware, where any |
Sorry for my poor English. I've thoroughly revised my words. Could you please review them again? @cwhanse |
@yhkee0404 aha, I think I partly understand what's going on here. The Linke turbidity data distributed with pvlib are gridded (location), monthly values. These values (from an outside source) do not have datetime stamps, only a month. pvlib (somewhat arbitrarily) assigns these values to the middle day of each month so that a daily value can be determined by interpolation. That machinery is not shown to the user; it happens here: pvlib-python/pvlib/clearsky.py Line 228 in d53f97e
The examples you provide are comparing the interpolated turbidity at the same location but localized to different timezones. The interpolation is being done using Arguably, pandas had to make a decision how to return a day of year for each location, so I don't think this is a fault in pandas - its behaving as intended (I assume). So: if we want Linke turbidity at a location to be consistent with equal timestamps, we need to work under the hood in pvlib to fix the interpolation.
|
Thank you for your clear explanation! Could I add some details to your example?
pvlib-python/pvlib/clearsky.py Lines 201 to 205 in d53f97e
pvlib-python/pvlib/irradiance.py Lines 90 to 111 in d53f97e
pvlib-python/pvlib/clearsky.py Lines 174 to 193 in d53f97e
Fortunately, we can experimentally confirm that both functions use UTC for dividing days without needing additional references. The following code provides evidence for import pandas as pd
from pvlib.location import Location
import pvlib
tus = Location(32.2, -111, 'US/Arizona', 700, 'Tucson')
times = pd.date_range(start='2016-07-01', end='2016-07-04', freq='30min', tz=tus.tz)
times_utc = times.tz_convert('utc')
dni_extra = pvlib.irradiance.get_extra_radiation(times)
dni_extra_utc = pvlib.irradiance.get_extra_radiation(times_utc).tz_convert(tus.tz)
assert dni_extra.index.equals(dni_extra_utc.index)
assert not ((dni_extra < 0).any() or (dni_extra_utc < 0).any())
print(len(dni_extra))
df = pd.concat([dni_extra, dni_extra_utc], axis = 'columns', join = 'outer')[~ dni_extra.fillna(-1).eq(dni_extra_utc.fillna(-1))]
print(df.index.hour.unique(), df.index.minute.nunique())
print(dni_extra_utc.tz_convert('UTC').groupby(by=lambda x: x.day).unique())
df
|
Describe the bug
Both of the functions
clearsky.lookup_linke_turbidity
andirradiance.get_extra_radiation
are supposed to be tz-aware in accordance with the user guide, but take the tz-aware inputs as tz-naive in their implementations and return incorrect results.Another functions using these two functions may also have been affected by the errors; i.e.
Location.get_clearsky
,clearsky.ineichen
, orclearsky.simplified_solis
.To Reproduce
Refer to my comprehensive analysis on the .ipynb file simulated in the same way as the user guide.
It seems too big for the browser to render because it shows me only this message: An error occurred.
In a similar situation, please download the file or clone my repository to see the content.
Expected behavior
There should be no items searched on the file by the keyword
rows
, each of which means errors subject to timezones.Screenshots
For instance, on the user guide, it prints:
climatological linke_turbidity = 3.329496820286459
.However, after converting the local timezone to utc, it prints on my analysis above another value:
climatological linke_turbidity = 3.3247187176482633
.Versions:
pvlib.__version__
: v0.10.5pandas.__version__
:2.2.2
3.11.9
Additional context
Almost the same as #237
The text was updated successfully, but these errors were encountered: