Can a Model Trained on Satellite Data Really Find Brambles on the Ground?
Source: toao.com
Topics: Remote Sensing, Machine Learning, Ecology
A model trained on satellite data was used to identify bramble locations, but commenters question its accuracy and methodology, highlighting concerns about false positives and the need for ground truth validation.
Story posted: Sep 25, 2025 at 3:28 PM EDT
First comment: Sep 25, 2025 at 4:23 PM EDT (56m after posting)
Peak activity: 20 comments in 0-2h
Latest activity: Sep 26, 2025 at 9:55 AM EDT
ID: 45377748 | Type: story | Last synced: 11/20/2025, 6:33:43 PM
For example, figure out what crop someone’s growing and decide how healthy it is. With sufficient temporal resolution, you can understand when things are planted and how well they’re growing, how weedy or infiltrated they are by pest plants, how long the soil remains wet or if rainwater runs off and leaves the crop dry earlier than desired. Etc.
If you’re a good guy, you’d leverage this data to empower farmers. If you’re an asshole, you’re looking to see who has planted your crop illegally, or who is breaking your insurance fine print, etc.
How does using it to speculate on crop futures rank?
Same with insurance… socialized risk for our food supply is objectively good, and protecting the insurance mechanism from fraud is good. People can always bastardize these things.
Even calling this a speculative market is a gross simplification of the truth.
You are very right on the temporal aspect though, that's what makes the representation so powerful. Crops grow and change colour or scatter patterns in distinct ways.
It's worth pointing out that the model and training code are under an Apache2 license and the global embeddings are under a CC-BY-A. We have a Python library that makes working with them pretty easy: https://github.com/ucam-eo/geotessera
Video of the notebook in action https://crank.recoil.org/w/mDzPQ8vW7mkLjdmWsW8vpQ and the source https://github.com/ucam-eo/tessera-interactive-map
Downstream classifiers are really fast to train (seconds for small regions). You can try out a notebook in VSCode to mess around with it graphically using https://github.com/ucam-eo/tessera-interactive-map
The berries were a bit sour, summer is sadly over here!
The easiest way to test is to try out the interactive notebook and drop some labels in known areas.
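To give a feel for why a downstream classifier on fixed embeddings can train in seconds, here is a minimal sketch. It is not the notebook's code and does not use the geotessera library: the embeddings are synthetic random vectors, the labels are invented, and a nearest-centroid rule stands in for whatever classifier the notebook actually uses.

```python
import random

DIM = 128  # embedding size mentioned in the thread


def make_embedding(center, rng, noise=0.1):
    """Synthetic stand-in for one pixel's embedding (NOT real TESSERA data)."""
    return [center + rng.gauss(0.0, noise) for _ in range(DIM)]


def centroid(vectors):
    """Mean vector of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(labelled):
    """labelled: list of (embedding, label) pairs. Returns {label: centroid}."""
    by_label = {}
    for emb, label in labelled:
        by_label.setdefault(label, []).append(emb)
    return {label: centroid(vs) for label, vs in by_label.items()}


def predict(model, emb):
    """Return the label whose centroid is closest to the embedding."""
    return min(model, key=lambda label: squared_distance(model[label], emb))


rng = random.Random(0)
# Hypothetical labels: a handful of "bramble" and "other" pixels.
train = [(make_embedding(1.0, rng), "bramble") for _ in range(5)]
train += [(make_embedding(-1.0, rng), "other") for _ in range(5)]
model = fit(train)

print(predict(model, make_embedding(1.0, rng)))
```

Training here is just averaging a few vectors per class, which is why a handful of dropped labels is enough to get a first map to look at.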
What I mean is that a vein is usually a few meters wide but can be hundreds of meters long, so ten meter resolution is probably not very helpful unless the embeddings can encode some sort of pattern that stretches across many cells.
The downside of that approach is that you need to spend valuable labels on learning the spatial feature extraction during training. To fix that we're working on building some pre-trained spatial feature extractors that you should only need to minimally fine-tune.
Hyperspectral in the SWIR range is what you really want for this, but that's a whole different ball game.
Are there any hyperspectral surveys with UAVs etc instead of satellites?
We're hoping to try it with a few different things for our next field trip, maybe some that are much harder to find than brambles.
What detail was in the satellite images, was it taking signals of the type of spaces brambles are in, or was it just visually identifying bramble patches?
In the UK you get brambles in pretty much every non-cultivated green space. I wonder how well the classifier did?
Interesting project.
When it comes to the satellite images, the model actually used TESSERA (https://arxiv.org/abs/2506.20380), which is a model we trained to produce embeddings for every point on Earth that encode the temporal-spectral properties over a year.
Think of it like a compression of potentially fifty or a hundred observations of a particular point on Earth down to a single 128-dimensional vector.
Happy to answer any other questions.
https://www.pnas.org/doi/10.1073/pnas.2407652121
It would be interesting to overlay TESSERA data there, although the resolution is of course very different.
There is the issue of just how visible truffles are from space though, if they grow under cover. That said, it may still work because you can find habitats that are very likely to have truffles. We've had some promising results looking at fungal biomass.
No. As the researcher puts it, "However, it is obvious that most of the generated findings aren’t brambles", so obviously no.
All the model did was think they followed roads, all roads.
If it were oil and gas, where people put in effort and their results are checked, rather than universities, where meaningless citations matter and results are never confirmed, it would be more believable.
What they are asking is impossible. Increasing the likelihood without silly hacks (like excluding rivers or the tops of buildings) is an interesting problem, but out of scope for academics.
As I mentioned in one of the other comments, the model is also only pixel-wise. That is, it is not using spatial information for predictions.
For the "However, it is obvious that most of the generated findings aren’t brambles"
Show us the bee!
Cue dowsers, who successfully find water... but who would also find it anywhere else, because underground water isn't the underground river or pocket that people imagine, so random chance by itself has a high probability of finding water.
For a proper evaluation you would need to be more methodical, but as a sanity check we were very happy with it.
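One way to make the sanity check more rigorous, and to answer the dowsing objection, is to compare the hit rate at model-chosen sites against the hit rate at randomly chosen sites in the same area. The sketch below shows the arithmetic only. The counts are invented for illustration and are not results from the field trip.

```python
def precision(hits, visited):
    """Fraction of visited sites where the target was actually found."""
    if visited == 0:
        raise ValueError("no sites visited")
    return hits / visited


def lift(model_hits, model_visited, random_hits, random_visited):
    """How many times better the model-chosen sites are than random ones."""
    base = precision(random_hits, random_visited)
    if base == 0:
        raise ValueError("baseline found nothing; lift is undefined")
    return precision(model_hits, model_visited) / base


# Hypothetical counts, invented for illustration only.
model_rate = precision(9, 10)     # brambles at 9 of 10 model-chosen sites
baseline_rate = precision(6, 10)  # brambles at 6 of 10 random sites
ratio = lift(9, 10, 6, 10)        # about 1.5

print(model_rate, baseline_rate, ratio)
```

If brambles really are in most uncultivated green space, the baseline rate will be high, and a lift close to 1 would mean the model adds little over walking to a random hedge.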
One other thing to point out about the bramble model is that it is pixel-wise. That is, each prediction is based exclusively on what is within the 10 metre pixel (give or take the georeferencing error).
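A small sketch of what "pixel-wise" means in practice: the same rule is applied to each pixel's embedding on its own, so moving pixels around changes nothing about any individual prediction. The linear rule, the 4-dimensional vectors, and the random weights are all assumptions for illustration, not the bramble model itself.

```python
import random


def classify_pixel(embedding, weights, bias=0.0):
    """Linear score on one pixel's embedding; True means 'bramble'."""
    score = sum(w * x for w, x in zip(weights, embedding)) + bias
    return score > 0.0


def classify_grid(grid, weights):
    """Apply the same per-pixel rule to every cell independently."""
    return [[classify_pixel(px, weights) for px in row] for row in grid]


rng = random.Random(1)
dim = 4  # tiny stand-in for the 128-dimensional embeddings
weights = [rng.uniform(-1, 1) for _ in range(dim)]
grid = [[[rng.gauss(0, 1) for _ in range(dim)] for _ in range(3)]
        for _ in range(3)]

original = classify_grid(grid, weights)
flat_original = [p for row in original for p in row]

# Shuffle pixel positions: each pixel keeps its prediction, because no
# neighbouring pixel ever enters the computation.
flat = [px for row in grid for px in row]
order = list(range(len(flat)))
rng.shuffle(order)
shuffled_preds = [classify_pixel(flat[i], weights) for i in order]

print(shuffled_preds == [flat_original[i] for i in order])
```

A model that used spatial context (a convolution over neighbouring pixels, say) would not have this property, which is the trade-off discussed elsewhere in the thread.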
https://github.com/ucam-eo/geotessera has an image showing our embedding coverage at the moment. Blue areas we have complete coverage for 2024, green areas we cover 2017-2024. We're slowly trying to populate everything 2017-2024 but the constraint is GPU and storage at the moment - each year takes ~20k GPU/200k CPU hours and requires storing and serving 200 terabytes of data. The world is big!
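As a rough back-of-the-envelope check on those figures, the sketch below multiplies the quoted per-year costs across 2017-2024. It assumes every year costs the same as the quoted figure and ignores whatever has already been processed, so treat it as an upper-bound estimate rather than the team's actual budget.

```python
# Per-year figures quoted in the comment above.
GPU_HOURS_PER_YEAR = 20_000
CPU_HOURS_PER_YEAR = 200_000
STORAGE_TB_PER_YEAR = 200

years = list(range(2017, 2025))  # 2017 through 2024 inclusive
n_years = len(years)             # 8

# Assumption: each year of global embeddings costs the same.
total_gpu_hours = GPU_HOURS_PER_YEAR * n_years    # 160_000
total_cpu_hours = CPU_HOURS_PER_YEAR * n_years    # 1_600_000
total_storage_tb = STORAGE_TB_PER_YEAR * n_years  # 1_600 TB
total_storage_pb = total_storage_tb / 1000        # 1.6 PB (decimal units)

print(n_years, total_gpu_hours, total_cpu_hours, total_storage_pb)
```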
If there is an area you would like prioritised, there's an issue template on the geotessera github repo which we can use to move regions around in the processing queue.
If you want to go further you can export the GeoJSON and then run it through any machine learning pipeline you like.
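For anyone taking the GeoJSON route, here is a minimal sketch of reading labelled points with the Python standard library. The exact structure the notebook exports is not described in the thread, so the `label` property name and the sample coordinates are hypothetical. The only thing taken from the GeoJSON format itself is that point coordinates are stored as [longitude, latitude].

```python
import json

# Made-up example export; the "label" property name is an assumption.
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [0.1218, 52.2053]},
     "properties": {"label": "bramble"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [0.1300, 52.2100]},
     "properties": {"label": "other"}}
  ]
}
"""


def load_labelled_points(text, label_key="label"):
    """Return a list of (lon, lat, label) tuples from a FeatureCollection."""
    collection = json.loads(text)
    if collection.get("type") != "FeatureCollection":
        raise ValueError("expected a GeoJSON FeatureCollection")
    points = []
    for feature in collection.get("features", []):
        geometry = feature.get("geometry") or {}
        if geometry.get("type") != "Point":
            continue  # skip polygons, lines, etc. in this sketch
        lon, lat = geometry["coordinates"][:2]
        properties = feature.get("properties") or {}
        points.append((lon, lat, properties.get(label_key)))
    return points


points = load_labelled_points(geojson_text)
print(points)
```

From there, each (lon, lat) pair can be matched to its embedding and fed to whatever pipeline you prefer.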
Plants are a way different and more difficult ballgame (they like to mess up my satellite data) so as I read I am not surprised to see that this didn't really give proper results.
> In every place we checked, we found pretty significant amounts of bramble.
[Shocked Pikachu face]