invalid dtype: 'object' when processing NetCDF files with separate band variables
When attempting to process NetCDF files that contain separate band variables (such as Sentinel-2 data with individual bands like B02, B03, B04, etc.), raster2stac fails with a TypeError: invalid dtype: 'object' during the COG generation process.
Environment
- raster2stac version: 0.0.8
- Python version: 3.11
- Operating System: Linux
- xarray version: 2024.3.0
- rioxarray version: 0.17.0
Steps to Reproduce
- Download or use a NetCDF file with separate band variables (example: Sentinel-2 L2A data)
- Run the following code:
from raster2stac import Raster2STAC

rs2stac = Raster2STAC(
    data="path/to/S2_L2A_sample.nc",
    collection_id="SENTINEL2_L2A_SAMPLE",
    collection_url="https://stac.eurac.edu/collections/",
    output_folder="SENTINEL2_L2A_SAMPLE_STAC"
).generate_cog_stac()
Expected Behavior
The library should successfully process the NetCDF file and generate COG files and STAC metadata for each band and timestep.
Actual Behavior
Traceback (most recent call last):
  File "test_stac.py", line 7, in <module>
    ).generate_cog_stac()
    ^^^^^^^^^^^^^^^^^^^
  File "raster2stac/raster2stac.py", line 1429, in generate_cog_stac
    ].to_dataset(name=band).rio.to_raster(
    ^^^^^^^^^^
  File "rioxarray/raster_dataset.py", line 539, in to_raster
    return data_array.rio.set_spatial_dims(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "rioxarray/raster_array.py", line 1135, in to_raster
    return RasterioWriter(raster_path=raster_path).to_raster(
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "rioxarray/raster_writer.py", line 279, in to_raster
    with rasterio.open(self.raster_path, "w", **kwargs) as rds:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "rasterio/env.py", line 463, in wrapper
    return f(*args, **kwds)
    ^^^^^^^^^^^^^^^^
  File "rasterio/__init__.py", line 254, in open
    raise TypeError(f"invalid dtype: {dtype!r}")
TypeError: invalid dtype: 'object'
Root Cause Analysis
The issue occurs because:
- Data Structure Mismatch: raster2stac expects the input data to be an xarray.DataArray with a bands dimension
- NetCDF Structure: Many NetCDF files (especially Earth observation data) store bands as separate data variables in an xarray.Dataset rather than as a single DataArray
- Dtype Conversion: When the library tries to process individual bands from the Dataset structure, it encounters dtype incompatibilities that result in 'object' dtypes
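A possible workaround on the user side, sketched below, is to inspect the dtypes of the Dataset's variables and stack the band variables into a single DataArray with a bands dimension before handing the data to the library. This is a minimal sketch, not a confirmed fix: the in-memory Dataset stands in for the real file, and the dimension names (t, y, x), the scalar crs variable, and the band values are all assumptions made for illustration.

```python
import numpy as np
import xarray as xr

# Hypothetical stand-in for a Sentinel-2 style NetCDF file: one data variable
# per band, plus a scalar "crs" variable as often found in CF-style files.
coords = {
    "t": np.array(["2022-06-01", "2022-06-06"], dtype="datetime64[ns]"),
    "y": np.arange(3.0),
    "x": np.arange(4.0),
}
data_vars = {
    band: (("t", "y", "x"), np.full((2, 3, 4), i, dtype="float32"))
    for i, band in enumerate(["B02", "B03", "B04"])
}
data_vars["crs"] = ((), np.int32(0))
ds = xr.Dataset(data_vars, coords=coords)

# Diagnostic: list each variable's dimensions and dtype to spot anything
# that is not a plain numeric raster.
for name, var in ds.data_vars.items():
    print(f"{name}: dims={var.dims}, dtype={var.dtype}")

# Keep only numeric variables that span the full (t, y, x) grid, so that
# auxiliary variables such as "crs" are not treated as bands.
band_names = [
    name
    for name, var in ds.data_vars.items()
    if var.dims == ("t", "y", "x") and np.issubdtype(var.dtype, np.number)
]

# Stack the selected variables along a new "bands" dimension.
da = ds[band_names].to_array(dim="bands").transpose("t", "bands", "y", "x")
print(da.dims, da.dtype)
```

With a real file, ds would come from xr.open_dataset("path/to/S2_L2A_sample.nc") instead. Whether Raster2STAC accepts the resulting DataArray directly through its data argument is an assumption based on the analysis above and has not been verified here.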