gdalinfo on Sentinel-2 SAFE archive does not return all rasters matching SUBDATASET_4_NAME=SENTINEL2_L2A:...:TCI #9066

Closed
MathewNWSH opened this issue Jan 12, 2024 · 9 comments · Fixed by #9067

@MathewNWSH

Expected behavior and actual behavior.

After running

gdalinfo /Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml

on a Sentinel-2 product metadata file, the output includes the following subdataset information:

Subdatasets:
  SUBDATASET_1_NAME=SENTINEL2_L2A:/Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:10m:EPSG_32635
  SUBDATASET_1_DESC=Bands B2, B3, B4, B8 with 10m resolution, UTM 35N
  SUBDATASET_2_NAME=SENTINEL2_L2A:/Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:20m:EPSG_32635
  SUBDATASET_2_DESC=Bands B5, B6, B7, B8A, B11, B12, AOT, CLD, SCL, SNW, WVP with 20m resolution, UTM 35N
  SUBDATASET_3_NAME=SENTINEL2_L2A:/Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:60m:EPSG_32635
  SUBDATASET_3_DESC=Bands B1, B9, AOT, CLD, SCL, SNW, WVP with 60m resolution, UTM 35N
  SUBDATASET_4_NAME=SENTINEL2_L2A:/Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:TCI:EPSG_32635
  SUBDATASET_4_DESC=True color image, UTM 35N
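
For readers who want to handle these names programmatically, here is a minimal sketch that splits a subdataset name into its parts. It assumes the `PREFIX:PATH:SELECTOR:CRS` layout visible in the listing above; the class and function names are hypothetical and not part of GDAL.

```python
from typing import NamedTuple


class Sentinel2Subdataset(NamedTuple):
    driver_prefix: str  # e.g. "SENTINEL2_L2A"
    metadata_path: str  # path (or /vsicurl/ URL) of MTD_MSIL2A.xml
    selector: str       # "10m", "20m", "60m" or "TCI"
    crs: str            # e.g. "EPSG_32635"


def parse_subdataset_name(name: str) -> Sentinel2Subdataset:
    """Split PREFIX:PATH:SELECTOR:CRS, allowing colons inside PATH."""
    prefix, rest = name.split(":", 1)
    # Split from the right so a path like /vsicurl/https://... stays intact.
    path, selector, crs = rest.rsplit(":", 2)
    return Sentinel2Subdataset(prefix, path, selector, crs)


def subdataset_names(gdalinfo_text: str) -> list[str]:
    """Collect the values of SUBDATASET_<n>_NAME= lines from gdalinfo text."""
    names = []
    for line in gdalinfo_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.startswith("SUBDATASET_") and key.endswith("_NAME"):
            names.append(value)
    return names
```

Splitting from the right matters for remote paths, because a `/vsicurl/https://...` path contains a colon of its own.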

What seems odd to me is that the TCI (pre-processed true color composite) is defined as a separate subdataset. Running gdalinfo on the TCI subdataset returns metadata only for the 10m raster, even though TCI files exist at two other resolutions.

root@7c6e8e3b6807:/# gdalinfo SENTINEL2_L2A:/Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:TCI:EPSG_32635
Driver: SENTINEL2/Sentinel 2
Files: /Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml
       /Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/GRANULE/L2A_T35TMJ_A041903_20230701T090005/MTD_TL.xml
       /Desktop/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/GRANULE/L2A_T35TMJ_A041903_20230701T090005/IMG_DATA/R10m/T35TMJ_20230701T085601_TCI_10m.jp2
Size is 10980, 10980
Coordinate System is:
PROJCRS["WGS 84 / UTM zone 35N",
    BASEGEOGCRS["WGS 84",
        DATUM["World Geodetic System 1984",
            ELLIPSOID["WGS 84",6378137,298.257223563,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4326]],
    CONVERSION["UTM zone 35N",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",0,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",27,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at natural origin",0.9996,
            SCALEUNIT["unity",1],
            ID["EPSG",8805]],
        PARAMETER["False easting",500000,
            LENGTHUNIT["metre",1],
            ID["EPSG",8806]],
        PARAMETER["False northing",0,
            LENGTHUNIT["metre",1],
            ID["EPSG",8807]]],
    CS[Cartesian,2],
        AXIS["easting",east,
            ORDER[1],
            LENGTHUNIT["metre",1]],
        AXIS["northing",north,
            ORDER[2],
            LENGTHUNIT["metre",1]],
    ID["EPSG",32635]]
Data axis to CRS axis mapping: 1,2
Origin = (399960.000000000000000,4900020.000000000000000)
Pixel Size = (10.000000000000000,-10.000000000000000)
Metadata:
  AOT_QUANTIFICATION_VALUE=1000.0
  AOT_QUANTIFICATION_VALUE_UNIT=none
  AOT_RETRIEVAL_ACCURACY=0.0
  AOT_RETRIEVAL_METHOD=SEN2COR_DDV
  BOA_QUANTIFICATION_VALUE=10000
  BOA_QUANTIFICATION_VALUE_UNIT=none
  CLOUDY_PIXEL_OVER_LAND_PERCENTAGE=2.075189
  CLOUD_COVERAGE_ASSESSMENT=2.059663
  CLOUD_SHADOW_PERCENTAGE=1.385088
  DARK_FEATURES_PERCENTAGE=0.00927
  DATATAKE_1_DATATAKE_SENSING_START=2023-07-01T08:56:01.024Z
  DATATAKE_1_DATATAKE_TYPE=INS-NOBS
  DATATAKE_1_ID=GS2A_20230701T085601_041903_N05.09
  DATATAKE_1_SENSING_ORBIT_DIRECTION=DESCENDING
  DATATAKE_1_SENSING_ORBIT_NUMBER=7
  DATATAKE_1_SPACECRAFT_NAME=Sentinel-2A
  DEGRADED_ANC_DATA_PERCENTAGE=0.0
  DEGRADED_MSI_DATA_PERCENTAGE=0
  FORMAT_CORRECTNESS=PASSED
  GENERAL_QUALITY=PASSED
  GENERATION_TIME=2023-07-01T15:08:57.000000Z
  GEOMETRIC_QUALITY=PASSED
  GRANULE_MEAN_AOT=0.087159
  GRANULE_MEAN_WV=1.534707
  HIGH_PROBA_CLOUDS_PERCENTAGE=0.294961
  L2A_QUALITY=PASSED
  MEDIUM_PROBA_CLOUDS_PERCENTAGE=0.953114
  NODATA_PIXEL_PERCENTAGE=78.239954
  NOT_VEGETATED_PERCENTAGE=28.415212
  OZONE_SOURCE=AUX_ECMWFT
  OZONE_VALUE=310.257418
  PREVIEW_GEO_INFO=Not applicable
  PREVIEW_IMAGE_URL=Not applicable
  PROCESSING_BASELINE=05.09
  PROCESSING_LEVEL=Level-2A
  PRODUCT_DOI=https://doi.org/10.5270/S2_-znk9xsj
  PRODUCT_START_TIME=2023-07-01T08:56:01.024Z
  PRODUCT_STOP_TIME=2023-07-01T08:56:01.024Z
  PRODUCT_TYPE=S2MSI2A
  PRODUCT_URI=S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE
  RADIATIVE_TRANSFER_ACCURACY=0.0
  RADIOMETRIC_QUALITY=PASSED
  REFERENCE_BAND=B4
  REFLECTANCE_CONVERSION_U=0.96764643507388
  SATURATED_DEFECTIVE_PIXEL_PERCENTAGE=0.0
  SENSOR_QUALITY=PASSED
  SNOW_ICE_PERCENTAGE=0.0
  SPECIAL_VALUE_NODATA=0
  SPECIAL_VALUE_SATURATED=65535
  THIN_CIRRUS_PERCENTAGE=0.811588
  UNCLASSIFIED_PERCENTAGE=0.146695
  VEGETATION_PERCENTAGE=67.267752
  WATER_PERCENTAGE=0.716322
  WATER_VAPOUR_RETRIEVAL_ACCURACY=0.0
  WVP_QUANTIFICATION_VALUE=1000.0
  WVP_QUANTIFICATION_VALUE_UNIT=cm
Image Structure Metadata:
  COMPRESSION=JPEG2000
  INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  (  399960.000, 4900020.000) ( 25d44'49.27"E, 44d14'47.56"N)
Lower Left  (  399960.000, 4790220.000) ( 25d46' 2.91"E, 43d15'29.34"N)
Upper Right (  509760.000, 4900020.000) ( 27d 7'20.12"E, 44d15'12.07"N)
Lower Right (  509760.000, 4790220.000) ( 27d 7'12.94"E, 43d15'53.02"N)
Center      (  454860.000, 4845120.000) ( 26d26'21.31"E, 43d45'27.90"N)
Band 1 Block=128x128 Type=Byte, ColorInterp=Red
  Description = B4, central wavelength 665 nm
  Overviews: 5490x5490, 2745x2745, 1373x1373, 687x687
  Metadata:
    BANDNAME=B4
    BANDWIDTH=30
    BANDWIDTH_UNIT=nm
    BOA_ADD_OFFSET=-1000
    SOLAR_IRRADIANCE=1512.06
    SOLAR_IRRADIANCE_UNIT=W/m2/um
    WAVELENGTH=665
    WAVELENGTH_UNIT=nm
Band 2 Block=128x128 Type=Byte, ColorInterp=Green
  Description = B3, central wavelength 560 nm
  Overviews: 5490x5490, 2745x2745, 1373x1373, 687x687
  Metadata:
    BANDNAME=B3
    BANDWIDTH=35
    BANDWIDTH_UNIT=nm
    BOA_ADD_OFFSET=-1000
    SOLAR_IRRADIANCE=1823.24
    SOLAR_IRRADIANCE_UNIT=W/m2/um
    WAVELENGTH=560
    WAVELENGTH_UNIT=nm
Band 3 Block=128x128 Type=Byte, ColorInterp=Blue
  Description = B2, central wavelength 490 nm
  Overviews: 5490x5490, 2745x2745, 1373x1373, 687x687
  Metadata:
    BANDNAME=B2
    BANDWIDTH=65
    BANDWIDTH_UNIT=nm
    BOA_ADD_OFFSET=-1000
    SOLAR_IRRADIANCE=1959.66
    SOLAR_IRRADIANCE_UNIT=W/m2/um
    WAVELENGTH=490
    WAVELENGTH_UNIT=nm

Information about all available resolutions should be returned, or the TCI composite metadata should be included in the subdataset of the corresponding resolution.
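
To make the gap concrete, the following sketch reports which TCI resolutions are missing from a gdalinfo report. It assumes the report was produced with `gdalinfo -json`, whose output contains a top-level `files` array, and that TCI files follow the `..._TCI_<res>.jp2` naming seen in the output above. The function names and the expected resolution list are illustrative assumptions.

```python
import json
import re

# Matches file names such as T35TMJ_20230701T085601_TCI_10m.jp2
TCI_FILE = re.compile(r"_TCI_(\d+m)\.jp2$")


def tci_resolutions_in_files(gdalinfo_json: str) -> set[str]:
    """Return the TCI resolutions present in the "files" list of the report."""
    info = json.loads(gdalinfo_json)
    found = set()
    for path in info.get("files", []):
        match = TCI_FILE.search(path)
        if match:
            found.add(match.group(1))
    return found


def missing_tci_resolutions(gdalinfo_json: str,
                            expected=("10m", "20m", "60m")) -> list[str]:
    """List the expected resolutions that the report does not reference."""
    missing = set(expected) - tci_resolutions_in_files(gdalinfo_json)
    # Sort numerically by the metre value ("10m" -> 10).
    return sorted(missing, key=lambda res: int(res[:-1]))
```

For the TCI subdataset shown in this report, where only the 10m file appears under `Files:`, such a check would flag 20m and 60m as unreported.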

Steps to reproduce the problem.

gdalinfo SENTINEL2_L2A:/vsicurl/https://s3.waw3-2.cloudferro.com/swift/v1/Tomkralidis/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/MTD_MSIL2A.xml:TCI:EPSG_32635

sample data:
https://s3.waw3-2.cloudferro.com/swift/v1/Tomkralidis/

Operating system and GDAL version

sudo docker run -it -v /home/eouser/Desktop:/Desktop ghcr.io/osgeo/gdal:ubuntu-small-latest
GDAL 3.9.0dev-bf5289c78039babb6694851ea9be57116ae088aa, released 2024/01/04
@jratike80
Collaborator

Do you mean that there should be different subdatasets for each resolution?

...TCI_10m.jp2
...TCI_20m.jp2
...TCI_60m.jp2

@MathewNWSH
Author

> Do you mean that there should be different subdatasets for each resolution?
>
> ...TCI_10m.jp2
> ...TCI_20m.jp2
> ...TCI_60m.jp2

Yes, or every available resolution in one subset if possible. As it stands, the user does not receive information about the two existing rasters (TCI_20m and TCI_60m) at all.

@jratike80
Collaborator

jratike80 commented Jan 12, 2024

Have you tried studying the metadata XML files? I am trying, but this is the first time I have ever worked with such files. I guess that TCI is defined somehow differently from, for example, bands B2, B3, B4, and B8, which are collected into a multiband layer sharing the same resolution.
I tried to read the MTD_MSIL2A.xml file and this other one, https://s3.waw3-2.cloudferro.com/swift/v1/Tomkralidis/S2A_MSIL2A_20230701T085601_N0509_R007_T35TMJ_20230701T150857.SAFE/GRANULE/L2A_T35TMJ_A041903_20230701T090005/MTD_TL.xml, but they have not yet shown me how the granules are grouped.

Do you know whether the XML files are correct, and whether some other software can read the 10m, 20m, and 60m TCI bands separately?

@jratike80
Collaborator

jratike80 commented Jan 12, 2024

I think that I will give up now. The main XML document refers to XML schema https://psd-14.sentinel2.eo.esa.int/PSD/User_Product_Level-2A.xsd but it is not there anymore. Probably the schema is now somewhere here https://sentinel.esa.int/web/sentinel/user-guides/sentinel-2-msi/data-formats/xsd but I believe that I would not really understand it anyway.

@MathewNWSH
Author

MathewNWSH commented Jan 12, 2024

From what I know, these are the official XML files provided by ESA. I am not familiar with other metadata tools that extract S2L2A information. I was looking for a way to transfer S2 metadata to STAC, and that is how I came to open this ticket.

This is a nice metadata tool, but a rule for S2L2A would still need to be written. You can find more information here: https://github.com/dlr-eoc/EOmetadataTool/tree/main

The content of MTD_MSIL2A.xml shows that the TCI files at the different resolutions are clearly listed:

[image: screenshot of the file list in MTD_MSIL2A.xml]

From what I understand, the difference between the TCIs and the other rasters is that the TCIs are 3-band rasters, while the others are single-band. From the XML definition's perspective there is no difference between, for example, SCL and TCI.
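
The listing in the product metadata can also be inspected with a short script. This is only a sketch: it assumes the files are listed in `IMAGE_FILE` elements whose text ends in `<layer>_<resolution>` (for example `..._TCI_10m`), which is how the entries appear in products I have seen but is not confirmed anywhere in this thread. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def resolutions_by_layer(xml_text: str) -> dict[str, list[str]]:
    """Group IMAGE_FILE entries by layer name (B02, TCI, SCL, ...)."""
    root = ET.fromstring(xml_text)
    layers: dict[str, list[str]] = {}
    for elem in root.iter():
        # Drop any "{namespace}" prefix from the tag before comparing.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name != "IMAGE_FILE" or not elem.text:
            continue
        stem = elem.text.strip().rsplit("/", 1)[-1]
        parts = stem.rsplit("_", 2)  # [tile_and_date, layer, resolution]
        if len(parts) != 3:
            continue
        _, layer, resolution = parts
        layers.setdefault(layer, []).append(resolution)
    return layers
```

With a product like the one in this issue, the `TCI` entry would be expected to list all three resolutions, in the same way a band such as SCL lists the resolutions at which it is delivered.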

@jratike80
Collaborator

The TCI files are listed, but in the same way that, for example, the B2 files are listed, and I did not find what groups B2 with bands B3, B4, and B8 into the 10m resolution subdataset. There must be some additional information somewhere.
SUBDATASET_1_DESC=Bands B2, B3, B4, B8 with 10m resolution, UTM 35N
And did you notice that B2 does not appear in the 20m and 60m resolution subdatasets?
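
The observation about B2 can be checked directly against the `SUBDATASET_n_DESC` strings. The sketch below assumes the `Bands ... with <res> resolution` wording shown in the gdalinfo output earlier in this issue; the function name is made up for illustration.

```python
import re

# Matches e.g. "Bands B2, B3, B4, B8 with 10m resolution, UTM 35N"
DESC_PATTERN = re.compile(r"^Bands (?P<bands>.+) with (?P<res>\d+m) resolution")


def bands_by_resolution(descriptions: list[str]) -> dict[str, list[str]]:
    """Map each resolution to the band names in its description."""
    result: dict[str, list[str]] = {}
    for desc in descriptions:
        match = DESC_PATTERN.match(desc)
        if match is None:
            continue  # e.g. "True color image, UTM 35N" has no band list
        bands = [band.strip() for band in match.group("bands").split(",")]
        result[match.group("res")] = bands
    return result


descriptions = [
    "Bands B2, B3, B4, B8 with 10m resolution, UTM 35N",
    "Bands B5, B6, B7, B8A, B11, B12, AOT, CLD, SCL, SNW, WVP with 20m resolution, UTM 35N",
    "Bands B1, B9, AOT, CLD, SCL, SNW, WVP with 60m resolution, UTM 35N",
    "True color image, UTM 35N",
]
grouped = bands_by_resolution(descriptions)
print({res: "B2" in bands for res, bands in grouped.items()})
```

Run against the four descriptions from this report, B2 shows up only in the 10m group.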

@MathewNWSH
Author

I haven't seen it. I just found something interesting in GDAL docs, namely:
[image: screenshot from the GDAL documentation]

You are right, it's a matter of more than one subdataset.
The driver is supposed to find 4 subdatasets:

  • one for the 4 native 10m bands, and L2A specific bands (AOT and WVP) - in this case AOT and WVP are skipped
  • one for the 6 native 20m bands, plus the 10m bands, except B8, resampled to 20m, and L2A specific bands (AOT, WVP, SCL, CLD and SNW) - the resampled 10m bands are skipped
  • one for the 3 native 60m bands, plus the 10m&20m bands, except B8, resampled to 60m, and L2A specific bands (AOT, WVP, SCL, CLD and SNW) - the resampled 10m and 20m bands are skipped
  • one for a preview of the R,G,B bands at a 320m resolution - here the 10m TCI is listed instead of the quicklook

TCI rasters are not even mentioned in the docs.

@jratike80
Collaborator

To me it seems that the driver works as designed. I think I will close this issue because there is no bug involved. Please write to the gdal-dev mailing list with a reasoned suggestion for how to improve the driver, and open a discussion there.

@rouault
Member

rouault commented Jan 12, 2024

Mostly documentation adjustments in #9067. Sentinel2 is a bit messy due to the various levels L1B/L1C/L2A and the different "formulations" of it within the same level (original formulation, "safe_compact"...)
