Beyond Sensor Data: Foundation Models of Behavioral Data From Wearables

Posted4 months agoActive4 months ago

brandonb

230 points

54 comments

arxiv.orgResearchstoryHigh profile

calmmixed

Debate

60/100

Wearable TechnologyHealth Data AnalysisAI in Healthcare

Key topics

Wearable Technology

Health Data Analysis

AI in Healthcare

A research paper from Apple presents a foundation model for analyzing behavioral data from wearables, sparking discussion on its potential applications and limitations, including concerns about data accessibility and model performance.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

29m

Peak period

0-6h

Avg / period

Comment distribution54 data points

Loading chart...

Based on 54 loaded comments

Key moments

01Story posted
Aug 21, 2025 at 10:39 AM EDT
4 months ago
Step 01
02First comment
Aug 21, 2025 at 11:08 AM EDT
29m after posting
Step 02
03Peak activity
37 comments in 0-6h
Hottest window of the conversation
Step 03
04Latest activity
Aug 24, 2025 at 11:40 AM EDT
4 months ago
Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (54 comments)

Showing 54 comments

brandonbAuthor

4 months ago

4 replies

I worked on one of the first wearable foundation models in 2018. The innovation of this 2025 paper from Apple is moving up to a higher level of abstraction: instead of training on raw sensor data (PPG, accelerometer), it trains on a timeseries of behavioral biomarkers derived from that data (e.g., HRV, resting heart rate, and so on.).

They find high accuracy in detecting many conditions: diabetes (83%), heart failure (90%), sleep apnea (85%), etc.

puppymaster

4 months ago

1 reply

reminds me of Jim Simons of Renaissance advise when it comes to data science - sort first, then regress.

dinobones

4 months ago

2 replies

Not sort in the literal way right?

https://stats.stackexchange.com/questions/185507/what-happen...

clickety_clack

4 months ago

1 reply

The guy was sorting the X separately from y? That can’t be a real.

falcor84

4 months ago

"Nothing is foolproof to a sufficiently talented fool"

tomrod

4 months ago

Not every day you find pseudo permutation in the wild!

throwaway314155

4 months ago

1 reply

Had the phrase "foundation model" become a term of art yet?

brandonbAuthor

4 months ago

1 reply

By 2018, the concept was definitely in the air since you had GPT-1 (2018) and BERT (2018). You could argue even Word2Vec (2013) had the core concept of pre-training on an unsupervised or self-supervised objective leading to performance on a downstream semantic task. However, the phrase "foundation model" wasn't coined until 2021, to my knowledge.

throwaway314155

4 months ago

I guess I just find the whole "foundation model" phrasing to be designed in a way to pat the backs of the "winners" who would of course be those with the most money. I'm sure there are foundation models from groups that aren't e.g. OpenAI, but the origins felt egotistical and asserting that you made one prior to the phrase's inception only feels more-so.

Had you merely called it an early instance of pretraining, I'd be fine with it.

crorella

4 months ago

2 replies

Insurance and health insurance companies must be super interested in this research and its applications.

jeron

4 months ago

2 replies

I'm sure they're also interested in the data. Imagine raising premiums based on conditions they detect from your wearables. That's why it's of utmost importance to secure biometrics data

brandonbAuthor

4 months ago

1 reply

At least in the US, health insurers can’t raise rates or deny coverage based on pre-existing conditions. That was a major part of the Affordable Care Act.

abenga

4 months ago

1 reply

The ACA will not survive the next couple of years.

daveguy

4 months ago

1 reply

That's what they said in 2016.

abenga

4 months ago

1 reply

This time he has control over all the arms of government + the support of much of the private sector though. The last time there was at least some push-back.

daveguy

4 months ago

They had the exact same thing last time too. Control over all arms + support of much of the private sector. We told them to go fuck themselves then and we will again.

apwell23

4 months ago

how would that work. i pay flat rate through my employer.

autoexec

4 months ago

1 reply

There are so many companies across many industries who are salivating at the thought of everyone using wearables to monitor their "health" and getting their hands on that data. Including law enforcement, lawyers, and other government agencies.

teiferer

4 months ago

It's industry leaders that are salivating the most.

teiferer

4 months ago

1 reply

What is an "accuracy" of 83%? Do 83% of predicted diabetes cases actually have diabetes? Or did 83% of those who have diabetes get diagnosed as such? It's about precision vs. recall. You can improve one by sacrificing the other. Boiling it down to one number is hard.

topaz0

4 months ago

1 reply

They use the area under the receiver operating curve, which is a pretty standard way to boil that down to one number.

teiferer

4 months ago

Ah, thanks for the pointer!

https://en.m.wikipedia.org/wiki/Receiver_operating_character...

So, 83% is actually not that great, given that you can achieve 50% by guessing randomly.

dyauspitr

4 months ago

1 reply

Is there a way to run this on your own data? I’ve been wearing my Apple Watch for years and would love to be able to use it better.

brandonbAuthor

4 months ago

2 replies

Not yet -- this one is just a research study. Some of their previous research has made it into product features.

For example, Apple Watch VO2Max (cardio fitness) is based on a deep neural network published in 2023: https://www.empirical.health/blog/how-apple-watch-cardio-fit...

llm_nerd

4 months ago

1 reply

Apple's VO2Max measures are not based upon that deep neural network development, and empirical seems to be conflating a few things. And FWIW, just finding the actual paper is almost impossible as that same site has SEO-bombed Google so thoroughly you end up in the circular-reference empirical world where all of their pages reference each other as authorities.

Apple and Columbia did recently collaborate on a heart rate response model -- one which can be downloaded and trialed -- but that was not related to the development of their VO2Max calculations.

Apple is very shrouded about how they calculate VO2Max, but it likely is a pretty simple calculation (e.g. how much is your heart responding based upon the level of activity assumed based upon your motion, method of exercise and movements). The most detail they provide is in https://www.apple.com/healthcare/docs/site/Using_Apple_Watch..., which mostly is a validation that it's providing decent enough accuracy.

brandonbAuthor

4 months ago

1 reply

What’s your source on Apple not using the neural network for VO2Max estimation? They’ve been using on-device neural networks for various biomarkers for several years now (even for seemingly simple metrics like heart rate).

FWIW, the article above links directly to both the paper and a GitHub repo with PyTorch code.

llm_nerd

4 months ago

1 reply

>FWIW, the article above links directly to both the paper and a GitHub repo with PyTorch code.

Neat, though the paper and the Github repo have nothing to do with Apple's VO2Max estimations. It's related to health, and touches on VO2Max and health sensors, but the only source claiming any association at all is that Empirical site. And given that this research came out literally years after Apple added VO2Max estimates to their health metrics, it seems pretty conclusive that it is not the source of Apple's calculations. Neat research related to predicting heart rate response to activity (which might come into play for filling in measurement gaps which happen during activity when a device isn't tight enough, etc).

>What’s your source on Apple not using the neural network for VO2Max estimation?

You're asking me to prove a negative. Apple never claims that they do any complex math or deep neural networks to derive VO2Max, and from my own observations of its estimates of mine, it seems remarkably trivial.

Trivial can still be accurate. But it hardly seems complex. Like, guess people's A1c based upon age, body fat percentage, demographic and you'll likely be high-90s accurate with trivial algebra.

>even for seemingly simple metrics like heart rate

Deriving heart rate from a green light imperfectly reflecting off skin, watching for tiny variations in colour change, is actually super complex! Doing it accurately is actually pretty difficult, which is why wearable accuracy is all over the place, though Apple is one of the leaders and has been for years. Guessing a number based upon HR and activity level isn't quite as complex.

pricklyprice

4 months ago

Interesting, there is no mention of neural networks in that 2021 pdf from Apple (https://www.apple.com/healthcare/docs/site/Using_Apple_Watch...). Also, it seems like the only reference of VO2 Max computation with neural net is on empirical.health blog. https://imgur.com/a/2hX7XNc

I wonder now if all of the derived metrics on Garmin (Training readiness, Training load, Training status)are purely statistical algorithms

pricklyprice

4 months ago

Apple was reporting VO2max for a very long time (much before 2023). I wonder what the accuracy was back then? Maybe they should the option for users to re-compute those past numbers based on the latest and greatest algorithm.

fiduciarytemp

4 months ago

1 reply

Has anyone seen the publishing of the weights or even an API release?

brandonbAuthor

4 months ago

In the paper, they say they can't release the weights due to terms of consent with study participants (this is from the Apple Heart and Movement study).

vibecodermcswag

4 months ago

2 replies

i love this because I build in medtech, but the big problem is no open weights, nor open data.

you can export your own apple XML data for usage and processing, but if you want to create an application and request apple XML data from users, that likely crosses into clinical research territory with data security policy requirements and de-identification needs.

pricklyprice

4 months ago

2 replies

what is the best way for non-big tech to buy such data for research and product development?

guzik

4 months ago

1 reply

Some are for free:

- aidlab.com/datasets

- physionet.org

pricklyprice

4 months ago

thanks for sharing. I also found wearable dataset of ~1k users https://cseweb.ucsd.edu/~jmcauley/datasets/fitrec.html

autoexec

4 months ago

data brokers.

piratesAndSons

4 months ago

2 replies

Trusting your health data with AI brothers is... extremely ill-advised.

I don't even trust Apple themselves, which will sell your health data any insurance company any minute now.

kridsdale1

4 months ago

2 replies

What do you base that suspicion on?

autoexec

4 months ago

If a corporation can make money hand over fist by doing something, they will do it. It doesn't matter if it's illegal or unethical. As long as it's still highly profitable it will be done.

evertedsphere

4 months ago

nature abhors an unexploited resource

autoexec

4 months ago

They might not sell "your" data outright, but it doesn't mean they won't sell inferences/assumptions that they make about you using your data.

The reality is that no matter how ethical the company you trust with that data is, you're still one hack or pissed off employee away from having that data leaked, and all of that data is freely up for grabs to the state (whose 3 letter agencies are likely collecting it wholesale) and open to subpoena in a lawsuit.

MangoToupe

4 months ago

1 reply

Can someone explain what "wearable foundation" means?

compiler-guy

4 months ago

It's a "Foundation Model" for wearable devices. So "wearable" describes where it is to be used, rather than describing "foundation".

LPisGood

4 months ago

2 replies

Is anyone else surprised by how poorly performing the results are for the vast majority of cases? The foundation model which had access to sensor data and behavioral biomarkers actually _underperformed_ the baseline predictor that just uses nonspecific demographic data in almost 10 areas.

In fact, even when the wearable foundation model was better, it was only marginally better.

I was expecting much more dramatic improvements with such rich data available.

Herring

4 months ago

I worked with similar data in grad school. I'm not surprised. You can have a lot of data, but sometimes the signal (or signal quality) just isn't present in that haystack, and there's nothing you can do about it.

Sometimes you just have to use ultrasound or MRI or stick a camera in the body, because everything else might as well be reading tea leaves, and people generally demand very high accuracy when it comes to their health.

bumby

4 months ago

I wonder how much of that is driven by poor performing behavioral models. There was a HN article from a few weeks back and it only had an accuracy of about 70% determining if someone was awake or asleep. I would guess that the secondary behavioral data used in this data (like cardiovascular fitness) are much harder to predict from raw sensor data than being awake or asleep.

aanet

4 months ago

2 replies

Thanks for posting this. This looks promising...

I have about 3-3.5 years worth of Apple Health + Fitness data (via my Apple Watch) encompassing daily walks / workouts / runs / HIIT / weight + BMI / etc. I started collecting this religiously during pandemic.

The exported Fitness data is ~3.5GB

I'm looking to do some longitudinal analysis - for my own purposes first, to see how certain indicators have evolved.

Has anyone done something similar? Perhaps in R, Python? Would love to do some tinkering. Any pointers appreciated!

Thanks!!

kridsdale1

4 months ago

2 replies

It might actually be worth writing your analysis in Swift with the actual HealthKit API and visualization libraries.

Bonus: when you’re done, you’ll have an app you can sell.

bob_theslob646

4 months ago

How hard is it to write in swift?

aanet

4 months ago

:thumbs_up.gif:

My sentiments, exactly.

Though I'm looking to scratch my own itch for now...

brandonbAuthor

4 months ago

1 reply

FWIW, we're working on something similar (you wouldn't necessarily need to write R or Python). Feel free to email me at bmb@empirical.health and I can add you to a beta once we have it ready!

aanet

4 months ago

Thanks, I'll reach out.

I am curious to do my own analysis, for two main reasons:

- some data is confidential (I'd hate for it to leave my devices) - wanna DIY / learn / iterate

Will ping you in any case. Thanks

memming

4 months ago

Interesting to see contrastive loss instead of a reconstruction loss.

rsanek

4 months ago

Cool way of integrating the two approaches. For those on mobile, I created an infographic that's a bit more accessible: https://studyvisuals.com/artificial-intelligence/beyond-sens...

View full discussion on Hacker News

ID: 44973375Type: storyLast synced: 11/20/2025, 6:48:47 PM

Want the full context?

Jump to the original sources

Read the primary article or dive into the live Hacker News thread when you're ready.

Open link View on HN