So if I'm understanding correctly, this is using AI to fancily upscale Sentinel-2 data, essentially guessing what it's seeing, and then suggesting the output of that should be used for making new products/decisions/models. Sounds a bit like CSI Zoom Enhance stuff...
The super-res is surprisingly usable for making sense of land use changes. With OpenStreetMap editing, one common challenge is that out of the usable (license-wise) imagery, the high-def ones are old and the new ones from Sentinel are low-def. A lot of switching, squinting, and guessing is required to understand what's going on, even when most of the work is as basic as trying to spot an old road in the blurry new image. This super-res seems to do that well enough. It doesn't have enough information to guess the exact shape of buildings, and that's okay.
They also do some object recognition, which is useful if you're an electric infrastructure nut. It spotted some solar fields in Shanghai which I'd never heard of before -- a look at the same coordinates (30.753, 121.392) on Google sure shows the expected blue.
The models we use to extract the geospatial data (like solar farm and offshore platform positions) from Sentinel-2 imagery are currently separate from the Sentinel-2 upscaling model, which is a more exploratory project.
We report the accuracy of the data at [1]; the Satlas project is quite new and we're aiming to improve accuracy as well as add more categories over time.
We expect the geospatial data will be useful for certain applications, but I agree that the upscaled super-resolution output has more limited uses, especially outside the US in its current state, since it is trained using NAIP imagery that is only available in the continental US. We're exploring methods to quantify and improve the accuracy of the upscaled imagery.
Note that the model weights, training data, and generated geospatial data can all be downloaded at [2].
Does Satlas currently use any bands other than Sentinel's visible RGB? I imagine that the near-IR bands can be very useful for plant-related tasks and (as a long stretch) potentially help with object discrimination by adding an extra band.
The marine infrastructure (offshore platform and offshore wind turbine) and super-resolution models only use the RGB bands (B04, B03, B02), while the solar farm, onshore wind turbine, and tree cover models use 9 Sentinel-2 bands (RGB plus B05, B06, B07, B08, B11, and B12). With enough high-quality labels, the extra bands do provide slightly improved performance (a 1-2% gain in our accuracy metric, e.g. from 89% to 91%), but we don't have a detailed comparison or analysis at this time.
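To make that band split concrete, here's a minimal sketch (my own illustration, not Satlas code) of assembling the two input variants, assuming each band has already been loaded and resampled to a common grid as a 2D numpy array:

    import numpy as np

    # Band subsets described above (names follow Sentinel-2 conventions).
    rgb_bands = ["B04", "B03", "B02"]                        # red, green, blue
    extra_bands = ["B05", "B06", "B07", "B08", "B11", "B12"]

    def stack_bands(band_arrays, band_names):
        """Stack the requested bands into a (channels, height, width) array."""
        return np.stack([band_arrays[b] for b in band_names], axis=0)

    # Hypothetical loaded scene: {band name: HxW array}.
    scene = {b: np.random.rand(512, 512).astype(np.float32)
             for b in rgb_bands + extra_bands}

    marine_or_superres_input = stack_bands(scene, rgb_bands)               # 3 channels
    solar_or_treecover_input = stack_bands(scene, rgb_bands + extra_bands) # 9 channels
    print(marine_or_superres_input.shape, solar_or_treecover_input.shape)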
Also, all of the models input three or four images of the same ___location (captured within a few months), with max temporal pooling used at intermediate layers to enable the model to synthesize information across the images. This helps a lot, definitely when one image has a section obscured by clouds (so the model can use the other images instead), and maybe also when different images provide different information (e.g. shadows going in different directions due to slightly different times of day).
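For anyone curious what max temporal pooling can look like in code, here's a minimal PyTorch sketch under my own assumptions (a toy one-layer encoder standing in for the real backbone, which isn't shown here): each image is encoded with shared weights and the features are max-pooled over the time dimension, so each spatial feature keeps the strongest response across the co-registered images.

    import torch
    import torch.nn as nn

    class TemporalMaxBackbone(nn.Module):
        def __init__(self, in_channels=3, feat_channels=64):
            super().__init__()
            # Shared per-image encoder (stand-in for a real backbone).
            self.encoder = nn.Sequential(
                nn.Conv2d(in_channels, feat_channels, 3, padding=1),
                nn.ReLU(),
            )

        def forward(self, x):
            # x: (batch, time, channels, height, width)
            b, t, c, h, w = x.shape
            feats = self.encoder(x.view(b * t, c, h, w)).view(b, t, -1, h, w)
            # Max over the time dimension: a cloud-obscured pixel in one image
            # can be "outvoted" by clear views of the same spot in the others.
            return feats.max(dim=1).values

    model = TemporalMaxBackbone()
    images = torch.randn(2, 4, 3, 128, 128)   # 4 images of the same ___location
    print(model(images).shape)                # torch.Size([2, 64, 128, 128])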
We plan to eventually add some real paid high-res imagery to the map just as a comparison, but for now you would need to look at the map at https://satlas.allen.ai/map (select Super Resolution) and compare it to a source of aerial imagery like Google Maps or Bing Maps at the same spot.
Sounds good. It's been a little while since I've touched anything GIS-related, but it was kind of fun, if also stressful, for me as a junior developer at the time. I'm definitely curious how insanely accurate AI upscaling will become with stuff like this, at least in terms of getting a good amount of the terrain correct.
I'm curious about the applications and implications of using a generative model for comparative analysis, where incorrect results or even a slight error in a map can lead to incorrect conclusions and impact policy.
This observation isn't specific to the Satlas project; medical image analysis is also out there (though maybe the FDA can drive some regulation there).
Broader question: how should we think about generative modeling for applications that are more than entertainment and cannot be corrected/verified by a person (like the user in the case of ChatGPT)?
I fully agree that errors in extracted data can lead to making incorrect decisions/policies. Even for applications where accuracy is paramount, though, I think error-prone models still have their uses:
- For applications that only need summary statistics over certain geographies, analyzing small samples of data can yield correction factors and error estimates (see the toy sketch after this list).
- The data could also be combined with manual verification to improve existing higher-precision but lower-recall datasets (e.g. OpenStreetMap where features are more likely to be correct but also have less overall coverage).
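A toy sketch of the correction-factor idea from the first bullet (the detection list and verification numbers below are made up for illustration):

    import random

    detections = list(range(5000))           # hypothetical solar-farm detections
    sample = random.sample(detections, 200)  # small manually reviewed subset

    # Suppose manual review confirms 180 of the 200 sampled detections.
    confirmed = 180
    precision = confirmed / len(sample)      # estimated precision = 0.9

    # Scale the raw count down to correct for the estimated false positives.
    corrected_count = len(detections) * precision
    print(f"raw: {len(detections)}, corrected estimate: {corrected_count:.0f}")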
Is an 'ELI5' possible, explaining how the Super Resolution or upscaling of satellite imagery takes place, in broad concept?
My first assumption is that it takes existing datasets of high- and low-res imagery covering the same areas, and builds a complex understanding of the extrapolation between the two. A sort of reverse-engineering guidebook. It can then be fed low-res imagery alone, and refer to the guidebook it's built up in an attempt to extrapolate high-res output.
"The example data suggests, at X.X% likelihood, that this particular pattern of low res pixels resolves into a high res shape of these particulars : ".
Yes, I think this is part of it -- when the model sees a new low-res image, it compares it to patterns that it has already seen to estimate what that ___location might look like at high-res.
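As a toy illustration of that "guidebook" intuition (a nearest-neighbour patch lookup of my own, not how the actual model works):

    import numpy as np

    rng = np.random.default_rng(1)

    def downsample(patch):             # 2x average-downsample an 8x8 patch to 4x4
        return patch.reshape(4, 2, 4, 2).mean(axis=(1, 3))

    # Build the "guidebook" from example high-res patches.
    hi_examples = rng.random((500, 8, 8))
    lo_examples = np.array([downsample(p) for p in hi_examples])

    def super_resolve(lo_patch):
        # Find the stored low-res patch that best matches the query...
        dists = ((lo_examples - lo_patch) ** 2).sum(axis=(1, 2))
        # ...and return its paired high-res patch as the guess.
        return hi_examples[dists.argmin()]

    query = downsample(rng.random((8, 8)))   # a "new" low-res observation
    print(super_resolve(query).shape)        # (8, 8) guessed high-res patch

A real model learns a much richer mapping than a lookup table, of course, but the flavour of "this low-res pattern usually came from that kind of high-res structure" is similar.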
The other important part is that the model inputs many low-res images (up to 18, i.e. about three months of images) to produce each high-res image. If you were to down-sample an image by 2x via averaging, then offset the image by one pixel to the right and down-sample it, then repeat for two more offsets (down, and down-and-right), then across the four down-sampled images you should have enough information to reconstruct the original image. We want our ML model to attempt a similar reconstruction, but with actual low-res images.

The idea breaks down in practice since pixel values from a camera aren't a perfect average of the light reflected from that grid cell, and there are seasonal changes, clouds, and other dynamic factors, but with many aligned low-res captures (with sub-pixel offsets) an ML model should still be able to somewhat accurately estimate what the scene looks like at 2x or 4x higher res (the Satlas map shows a 4x attempt). The current model we've deployed does this far from perfectly, so there are open problems, like figuring out where the model might be making mistakes and enabling the model to make the best use of the many low-res input images, and we're actively exploring how to improve on these.
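Here's a toy numpy illustration of that offset argument (my own construction, not Satlas code): four 2x average-downsampled copies at one-pixel offsets jointly contain every overlapping 2x2 block mean, and given the first row and column as side information (a stand-in for the priors a real model would learn from data), the original can be recovered exactly.

    import numpy as np

    rng = np.random.default_rng(0)
    hi = rng.random((8, 8))            # pretend "true" high-res scene

    def averaged_blocks(img, dy, dx):
        """2x average-downsample of img, with the block grid shifted by (dy, dx)."""
        crop = img[dy:dy + 6, dx:dx + 6]
        return crop.reshape(3, 2, 3, 2).mean(axis=(1, 3))

    # Four low-res "captures" at one-pixel offsets.
    lows = {(dy, dx): averaged_blocks(hi, dy, dx) for dy in (0, 1) for dx in (0, 1)}

    # Merge them into block sums: block_sum[i, j] = sum of the 2x2 block at (i, j).
    block_sum = np.zeros((6, 6))
    for (dy, dx), low in lows.items():
        block_sum[dy::2, dx::2] = 4.0 * low

    # Unwind the sums; the known first row/column stands in for learned priors.
    recon = np.zeros((7, 7))
    recon[0, :] = hi[0, :7]
    recon[:, 0] = hi[:7, 0]
    for i in range(6):
        for j in range(6):
            recon[i + 1, j + 1] = (block_sum[i, j]
                                   - recon[i, j] - recon[i, j + 1] - recon[i + 1, j])

    print(np.allclose(recon, hi[:7, :7]))   # True: the offsets carry sub-pixel detail

In the real setting the offsets are sub-pixel, the averaging is imperfect, and the scene changes between captures, which is why this becomes an estimation problem for an ML model rather than exact algebra.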
Could you do this for arctic ice? And hallucinate reasonable super-resolution forecasts for ships? For example, the icebreaker Healy is up there right now: https://www.cruisemapper.com/?imo=9083380
Just a heads up: the super-resolution example absolutely spams the history API in your browser if you move the map even slightly. You could add a bit of delay (a debounce) before saving the new ___location in the URL.
Or, how about not saving the ___location in the URL at all when moving the map? I guess there must be a reason why none of the mainstream map webapps does something like this...