Size calculation bug #43

@corviday

Description

ORCA seems to sometimes mishandle files between 500 MB and 1000 MB in size: it requests them in chunks larger than the maximum request size (max=500.0 Mbytes in the logs below).

Here's an example request with this problem.

ORCA logs:

syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
syntax error, unexpected WORD_WORD, expecting SCAN_ATTR or SCAN_DATASET or SCAN_ERROR
context: Request^ Too Large: 547.3103 Mbytes, max=500.0
[2025-10-22 01:22:16,622] ERROR in app: Exception on /data/ [GET]
Traceback (most recent call last):
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/flask/app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/flask/app.py", line 882, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
  File "/app/orca/routes.py", line 51, in orc_route
    orc(
  File "/app/orca/compiler.py", line 33, in orc
    file_from_opendap(opendap_url, threshold, outdir, outfile)
  File "/app/orca/requester.py", line 38, in file_from_opendap
    outpath = to_file(dataset, outdir, outfile)
  File "/app/orca/requester.py", line 52, in to_file
    dataset.to_netcdf(outpath)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/dataset.py", line 2298, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/api.py", line 1339, in to_netcdf
    dump_to_store(
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/api.py", line 1386, in dump_to_store
    store.store(variables, attrs, check_encoding, writer, unlimited_dims=unlimited_dims)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/common.py", line 393, in store
    variables, attributes = self.encode(variables, attributes)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/common.py", line 482, in encode
    variables, attributes = cf_encoder(variables, attributes)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/conventions.py", line 795, in cf_encoder
    new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/conventions.py", line 795, in <dictcomp>
    new_vars = {k: encode_cf_variable(v, name=k) for k, v in variables.items()}
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/conventions.py", line 196, in encode_cf_variable
    var = coder.encode(var, name=name)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/coding/times.py", line 972, in encode
    variable.data.dtype, np.datetime64
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/variable.py", line 433, in data
    return self._data.get_duck_array()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 809, in get_duck_array
    self._ensure_cached()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 803, in _ensure_cached
    self.array = as_indexable(self.array.get_duck_array())
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 760, in get_duck_array
    return self.array.get_duck_array()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 630, in get_duck_array
    array = array.get_duck_array()
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/coding/variables.py", line 81, in get_duck_array
    return self.func(self.array.get_duck_array())
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 623, in get_duck_array
    array = self.array[self.key]
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 101, in __getitem__
    return indexing.explicit_indexing_adapter(
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/core/indexing.py", line 987, in explicit_indexing_adapter
    result = raw_indexing_method(raw_key.tuple)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/netCDF4_.py", line 114, in _getitem
    array = getitem(original_array, key)
  File "/root/.cache/pypoetry/virtualenvs/orca-9TtSrW0h-py3.10/lib/python3.10/site-packages/xarray/backends/common.py", line 192, in robust_getitem
    return array[key]
  File "src/netCDF4/_netCDF4.pyx", line 4972, in netCDF4._netCDF4.Variable.__getitem__
  File "src/netCDF4/_netCDF4.pyx", line 5930, in netCDF4._netCDF4.Variable._get
  File "src/netCDF4/_netCDF4.pyx", line 2034, in netCDF4._netCDF4._ensure_nc_success
RuntimeError: NetCDF: Authorization failure

Possible cause: the header size is not included when calculating the request size?
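
To make that hypothesis concrete, here is a minimal sketch of how a fixed per-request overhead could change the number of chunks needed to stay under a size limit. It is not ORCA's actual code: `chunk_count`, `chunk_sizes`, `overhead_bytes`, and the 200 kB header figure are hypothetical names and values chosen for illustration, and the assumption that the server counts decimal megabytes is unverified.

```python
import math

MB = 1_000_000  # assumption: the limit is expressed in decimal megabytes


def chunk_count(payload_bytes: int, max_request_bytes: int, overhead_bytes: int = 0) -> int:
    """Smallest number of chunks such that each chunk plus overhead fits the limit."""
    usable = max_request_bytes - overhead_bytes
    if usable <= 0:
        raise ValueError("overhead leaves no room for data")
    return max(1, math.ceil(payload_bytes / usable))


def chunk_sizes(payload_bytes: int, max_request_bytes: int, overhead_bytes: int = 0) -> list[int]:
    """Split the payload into near-equal chunks that respect the limit."""
    n = chunk_count(payload_bytes, max_request_bytes, overhead_bytes)
    base, extra = divmod(payload_bytes, n)
    # The first `extra` chunks carry one additional byte each.
    return [base + (1 if i < extra else 0) for i in range(n)]


limit = 500 * MB
payload = 499_900_000   # hypothetical payload just under the limit
header = 200_000        # hypothetical header size

# Ignoring the header, one request looks sufficient...
naive = chunk_count(payload, limit)
# ...but payload + header is 500.1 MB, so two requests are actually needed.
corrected = chunk_count(payload, limit, overhead_bytes=header)
print(naive, corrected)
```

If the size calculation behaves like the `naive` call, a request that looks acceptable on paper would still be rejected by the server. This only covers requests that land slightly over the limit; it is a way to test the hypothesis, not a confirmed diagnosis.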

Labels

bug: Something isn't working