SimpleFold: Folding Proteins Is Simpler Than You Think
Key topics: Protein Folding, AI in Biology, Machine Learning
Apple researchers released a paper and code for SimpleFold, a simplified protein folding model, sparking discussion on its implications, comparison to AlphaFold, and the role of large language models in biology.
Snapshot generated from the HN discussion
Discussion activity: very active; 132 comments, averaging 16.5 per period. [Comment distribution chart: first comment after 19m; peak of 96 comments in the 0-6h window.]
Key moments
- Story posted: Sep 26, 2025 at 2:01 PM EDT
- First comment: Sep 26, 2025 at 2:20 PM EDT (19m after posting)
- Peak activity: 96 comments in the first 6 hours
- Latest activity: Sep 28, 2025 at 8:17 PM EDT
Then why do we need customized LLM models, two of which seemed to require the resources of two of the wealthiest companies on earth (this and Google's AlphaFold) to build?
This doesn't seem like particularly wasteful overinvestment.
Granted, I'm more excited about the research coming out of Arc
It's indeed a large model. But if you know the history of the field, it's a massive improvement. It has progressed from an almost "NP" problem only barely approachable with distributed cluster compute, to something that can run on a single server with some pricey hardware. The smallest model here is only 100M parameters and the largest is 3B parameters; that's very approachable to run locally with the right hardware, and easily within range for a small biotech lab (compared to the cost of other biotech equipment).
It's also (I'd argue) one of the only truly economically and socially valuable AI technologies we've found over the past few years. Every simulated protein fold saves a biotech company weeks of work for highly skilled biotech engineers and very expensive chemicals (in a way that truly only supplements rather than replaces the work). Any progress in the field is a huge win for society.
1: https://www.researchgate.net/publication/361238549_Consumer_...
Like 1980s Sony, they are the top-of-the-line consumer electronics giant of their time. The iPhone is even more successful than the Walkman or Trinitron TVs.
They also sell the most popular laptops, to consumers as well as corporate. Like Sony's VAIO, but again more popular.
Can anyone recommend a good book or article about this?
Frankly, it's a great idea. If you are a small pharma company, being able to do quick local inference removes lots of barriers and gatekeeping. You can even afford to do some Bayesian optimization or RL with lab feedback on some generated sequences.
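A rough sketch of that lab-in-the-loop idea (surrogate-filtered search; a full Bayesian-optimization setup would add an uncertainty-aware acquisition function). `fold_score` and `lab_assay` are hypothetical placeholders here, not any real API:

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq: str) -> str:
    """Propose a single-point mutant of a protein sequence."""
    i = random.randrange(len(seq))
    return seq[:i] + random.choice(AMINO_ACIDS) + seq[i + 1:]

def fold_score(seq: str) -> float:
    """Stand-in for a cheap local folding model's confidence score."""
    return random.random()

def lab_assay(seq: str) -> float:
    """Stand-in for expensive wet-lab feedback on a candidate."""
    return random.random()

def optimize(seed: str, rounds: int = 5, batch: int = 32, lab_budget: int = 4) -> str:
    best = seed
    for _ in range(rounds):
        # Cheap in-silico screen: fold many mutants locally...
        ranked = sorted((mutate(best) for _ in range(batch)),
                        key=fold_score, reverse=True)
        # ...then spend the scarce lab budget only on the top few.
        best = max(ranked[:lab_budget] + [best], key=lab_assay)
    return best
```

The point of cheap local inference is exactly the first step: screening dozens of candidates costs nothing, so the expensive assay is reserved for the shortlist.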
In comparison, running AlphaFold requires significant resources. And IMHO, its use of multiple sequence alignments is a bit hacky: it makes performance worse on proteins without close homologs, and requires tons of preprocessing.
A few years back, ESM from Meta already demonstrated that alignment-free approaches are possible and perform well. AlphaFold has no secret sauce, it's just a seq2seq problem, and many different approaches work well, including attention-free SSMs.
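To make the "just a seq2seq problem" framing concrete, here is a toy shape-level sketch (ours, not ESM's or AlphaFold's architecture): amino-acid tokens in, one (x, y, z) per residue out. A real model would add positional information, geometric invariances, and far more capacity.

```python
import torch
import torch.nn as nn

class ToyFolder(nn.Module):
    def __init__(self, vocab: int = 21, dim: int = 64, layers: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)            # one token per residue
        enc = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.trunk = nn.TransformerEncoder(enc, num_layers=layers)
        self.to_xyz = nn.Linear(dim, 3)                  # one (x, y, z) per residue

    def forward(self, tokens):                           # tokens: (batch, length)
        return self.to_xyz(self.trunk(self.embed(tokens)))  # (batch, length, 3)

coords = ToyFolder()(torch.randint(0, 21, (1, 128)))     # -> (1, 128, 3)
```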
Maybe these are just projects they use to test and polish their AI chips? Not sure.
https://machinelearning.apple.com/
https://arxiv.org/abs/2509.18480
https://github.com/apple/ml-simplefold
I am not trying to defend Apple or Siri by any means. I think the product absolutely should (and will) improve. I am just curious to explore why there is such negativity being directed specifically at Apple's AI assistant.
We, the consumer, have received inferior products because of the vague promise that the company might one day be able to make it cheaper if they invest now.
1. It seems to be actively getting worse. On a daily basis, I see it responding to queries nonsensically, like when I say "play (song) by (artist)" (I have Apple Music) and it opens my Sirius app and puts on a random thing that isn't even that artist. Other trivial commands are frequently just met with apologies or searching the web.
2. Over a year ago, Apple made a flashy announcement full of promises about how Siri would not only do the things that it's been marketed as being able to do for the last decade, but also things that no one has seen an assistant do. Many people believe that announcement was based on fantasy thinking, and those people are looking more and more correct every day that Apple ships no actual improvements to Siri.
3. Apple also shipped a visual overhaul of how Siri looks, which gives the impression that work has been done, leading people to be even more disappointed when Siri continues to be a pile of trash.
4. The only competitor that makes sense to compare is Google, since no one else has access to do useful things on your device with your data. At least Google has a clear path to an LLM-based assistant, since they've built an LLM. It seems believable that Android users will have access to a Gemini-based assistant, whereas it appears to most of us that Apple's internal dysfunction has rendered them unable to ship something of that caliber.
If I could buy a phone without an assistant I would see that as a desirable feature.
And now that we have ChatGPT with voice mode, Gemini Live, etc which have incredible speech recognition and reasoning comparatively, it's harder to argue that "every voice assistant is bad" still.
Meanwhile, people expect perfection from Siri. At this point a new version of Siri will never live up to people’s expectations. Had they released something on-par with ChatGPT, people would hate it and probably file a class action lawsuit against Apple over it.
The entire company isn’t going to work on Siri. In a large company there are a lot of priorities, and some things that happen on the side as well. For all we know this was one person’s weekend project to help learn something new that will later be applied to the priorities.
I've made plenty of hobby projects related to work that weren't important or priorities, but what I learned along the way proved extremely valuable to key deliverables down the road.
It seems like the Folding@Home project is still around!
Compared to SETI or Folding@Home, this would be glacially slow for AI models.
https://www.distributed.net/RC5
https://en.wikipedia.org/wiki/RSA_Secret-Key_Challenge
I wonder what kind of performance I would get on an M1 computer today... haha
EDIT: people are still participating in rc5-72...?? https://stats.distributed.net/projects.php?project_id=8
In other words, it’s a different approach that trades off versatility for speed, but that trade off is significant enough to make it viable to generate protein folds for really any protein you’re interested in - it moves folding from something that’s almost computationally infeasible for most projects to something that you can just do for any protein as part of a normal workflow.
2. The biggest difference between folding@home and alphafold is that folding@home tries to generate the full folding trajectory while alphafold is just protein structure prediction; only looking to match the folded crystal structure. Folding@home can do things like look into how a mutation may make a protein take longer to fold or be more or less stable in its folded state. Alphafold doesn’t try to do that.
I actually really like Alphafold because of that - the core recognition that an amino acid string’s relationship to the structure and function of the protein was akin to the cross-interactions of words in a paragraph to the overall meaning of the excerpt is one of those beautiful revelations that come along only so often and are typically marked by leaps like what Alphafold was for the field. The technique has a lot of limitations, but it’s the kind of field cross-pollination that always generates the most interesting new developments.
Are there any benchmarks for, say, a $3,000 RTX card, etc., vs a nice cluster of M4 Mac Minis?
https://foldingathome.org/papers-results/?lng=en
[1] https://foldingathome.org/2024/05/02/alphafold-opens-new-opp...
and now I'm even more curious why they thought "light aqua" vs "deep teal" would be a good choice
The different colours are for the predicted and 'real' (ground truth) models. The fact that it is hard to distinguish is partly the - as you point out - weird colour choice, but also because they are so close together. An inaccurate prediction would have parts that stand out more as they would not align well in 3D space.
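The "align well in 3D space" check has a standard quantitative counterpart: superpose the predicted structure onto the ground truth (the Kabsch algorithm) and report the RMSD. A minimal sketch, assuming Nx3 numpy arrays of alpha-carbon coordinates:

```python
import numpy as np

def kabsch_rmsd(pred: np.ndarray, true: np.ndarray) -> float:
    # Center both structures on their centroids.
    p = pred - pred.mean(axis=0)
    q = true - true.mean(axis=0)
    # Optimal rotation from the SVD of the 3x3 covariance matrix.
    u, _, vt = np.linalg.svd(p.T @ q)
    d = np.sign(np.linalg.det(u @ vt))        # guard against reflections
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    # Low RMSD after superposition = the two ribbons overlap closely.
    return float(np.sqrt(((p @ rot - q) ** 2).sum() / len(p)))
```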
Doing too many things at once makes methods hard to adopt and makes conclusions harder to draw. So we try to find simple methods that show measurable gains, so we can adapt them to future approaches.
It's a cycle between complexity and simplicity. When a new simple and scalable approach beats the previous state of the art, that just means we discovered a new local-maximum hill to climb.
People often like to say that we just need one or two more algorithmic breakthroughs for AGI. But in reality it's the dataset and environment-based learning that matter. Almost any model would do if you collected the data. It's not in the model; it's outside the model where we need to work.
https://genomely.substack.com/p/simplefold-and-the-future-of...
But as with anything in research, it will take months and years to see what the actual implications are. Predictions of future directions can only go so far!
It’s not like we can throw away all the inductive biases and MSA machinery, someone upstream still had to build and run those models to create the training corpus.
On the other hand, validating a predicted protein structure to a good level of accuracy is much easier (solvent accessibility, mutagenesis, etc.). So having a complex model that can be trained on a small dataset drastically expands the set of accurate protein structure samples available to future models, both through direct predictions and validated protein structures.
So technically yes, this dataset could have been collected solely experimentally, but in practice, AlphaFold is now part of the experimental process. Without it, the world would have less protein structure data, in terms of both directly predicted and experimentally verified protein structures.
It's a research paper. That's not how you communicate to a general audience. Just because the paper is accessible in terms of literal access doesn't mean you're the intended audience. Papers are how scientists communicate to other scientists. More specifically, it is how communication happens between peers. They shouldn't even be writing for just other scientists. They shouldn't be writing for even the full set of machine learning researchers nor the full set of biologists. Their intended audience is people researching computational systems that solve protein folding problems.
I'm sorry, but where do you want scientists to be able to talk directly to their peers? Behind closed doors? I just honestly don't understand these types of arguments.
Besides, anyone conflating "Simpler than You Think" with "Simple" is far from qualified to read such a paper. They'll misread whatever the authors say. Conflating those two is something we'd expect from an elementary-school-level reader who is unable to process comparative statements.
I don't think we should be making that the bar...
I’m not trying to knock the work, I think it’s genuinely cool and a great engineering result. I just wanted to flag that nuance for readers who might not have the time or background to spot it, and I get that part of the "simple/simpler" messaging is also about attracting attention which clearly worked!
Like you suggest, simple can mean many things. I think it's clear that in this context they mean "simple" (not from an absolute sense) in terms of the architectural design. I think the abstract is more than sufficient to convey this.
As an ML researcher who does a lot of work on architecture and efficiency, I think they are. Consider the end of the abstract: to me they are clearly stating that their goal isn't to get the top score on a benchmark. Their appendix shows that the 100M-param model is apples to apples with AlphaFold2 by size but not by compute. Even their 3B model uses less compute than AlphaFold2. So, being someone in a neighboring niche, I don't understand your claim. There's no easy way to make your comparisons "apples to apples" because we shouldn't be evaluating on a single metric. Sure, AlphaFold2 gives better results on the benchmarks, but does that mean people wouldn't sacrifice performance for a 450x reduction in compute? (20x for their largest model. But note that's compute, not memory.)
Yeah, this is an unfortunate thing and I'm incredibly frustrated with it in academia and especially in ML. But it's also why I'm pushing back against you. The problem stems from needing to get people to read your paper. There's a perverse incentive because you could have a paper that is groundbreaking but ends up having little to no impact because it didn't get read. A common occurrence is that less innovative papers get magnitudes more citations by using similar methods but scaling up and beating benchmarks. So unfortunately, as long as we use citation metrics as a significant measure of research impact, marketing will be necessary. A catchy title is a good way to get more eyeballs. But I think you're being too nitpicky here, and there are far more egregious/problematic examples. I'm not going to pick a fight with a title when the abstract is sufficiently clear. Could it be clearer? Certainly. But if the title is all that's wrong, then it's a pretty petty problem, especially if it's only confusing people who are significantly outside the target audience. Seriously, what's the alternative? That researchers write for the general public? For the general technical public? I'm sorry, I don't think that's a good solution. It's already difficult to communicate with people in the same domain (but not niche) within the page limit. It's hard to get them to read everything as it is. I'd rather papers be written strongly for niche peers, with enough generalization that domain experts can get through them with effort. For the general public, that's what science communicators are for.
This model could also have existed from natural data if we had access to enough of it.
Only if you are willing to call a billion years of evolutionary selection a "simple ruleset"
That means whatever evolution created, whether it's wings or brains, however complex it looks now, must be fundamentally simple enough it could be reached by iterating in tiny steps that were useful in isolation. It constrains the space of designs reachable by evolution considerably.
Not true. Learn some genomics before trying to explain evolution.
Does the time matter? A ruleset doesn't change with time.
If you're still unconvinced, get a degree in physics. I'm not sure how you could get through that and still not believe that complexity arises from simplicity, and that you get drops in that complexity, which we call emergence, before things become more complex than before.
But you really do seem to be trying hard to miss the point entirely. Life actually has nothing to do with what I said, did it? And I can assure you, by nature of being one, that physicists are certain that nature follows simple rules, even if we don't know them.
We are also absolutely confident that complexity arises out of simplicity. Go look at anything like fractals, chaos theory, or perturbation theory; you should have run into at least bifurcation diagrams in your differential equations course. If you haven't taken diff eq, then well... perhaps the problem is that your confidence in your result is stronger than your expertise. If not, well... make a real argument, because I'm not going to hold your hand through this any longer.
The thing is, biology is anything but simple.
I just don't understand how someone could even get through an undergraduate degree in physics without seeing this complexity arise in E&M. You get four rules that describe everything. Each rule can be written on one short line and contains only a handful of symbols. In other words: those rules are simple. That doesn't mean they're very useful in that form, but you can derive the rest from them. That is exactly what I'm talking about with the Game of Life (sketched below).
How the fuck did you get through differential equations without seeing how complexity arises from simplicity, let alone Jackson or Goldstein!?
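For readers who haven't seen it, the Game of Life invoked above is the textbook demo of complexity from simple rules: one update rule, a handful of symbols, yet gliders and self-sustaining patterns emerge. A minimal sketch (ours, purely illustrative):

```python
import numpy as np

def life_step(grid: np.ndarray) -> np.ndarray:
    """One generation of Conway's Game of Life on a wrapping grid."""
    # Count the eight neighbors of every cell by summing shifted copies.
    neighbors = sum(np.roll(np.roll(grid, dy, 0), dx, 1)
                    for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                    if (dy, dx) != (0, 0))
    # Birth on exactly 3 neighbors; survival on 2 or 3. Everything else dies.
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)

# A glider: five live cells whose pattern crawls across the grid forever.
grid = np.zeros((16, 16), dtype=int)
for y, x in [(1, 2), (2, 3), (3, 1), (3, 2), (3, 3)]:
    grid[y, x] = 1
for _ in range(8):
    grid = life_step(grid)
```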
Idk man, either you're lying or being disingenuous. You're the only one who said biology is simple. No one even implied that! If you're not lying about your degree you're willfully misinterpreting the comments. Why? For what purpose?
You can use molecular dynamics. Maybe, if you are lucky and have the computational resources to do so.
You might want to relate molecular dynamics to "simple rules", but you would be deluding yourself. Molecular dynamics typically uses classical force fields parameterized on data and some quantum simulations. It is not based on first principles.
Proteins fold in patterns generated over millennia of natural selection. It is not simple.
My rough understanding of the field is that a "rough" generative model makes a bunch of decent guesses, and more formal "verifiers" ensure they abide by the laws of physics and geometry. The AI reduces the unfathomably large search space so the expensive simulation doesn't need to do so much wasted work on dead ends. If the guessing network improves, then the whole process speeds up.
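A sketch of that division of labor, with hypothetical `guess` and `verify` callables standing in for the learned proposer and the physics/geometry check (neither is a real API):

```python
from typing import Callable, List

def propose_and_verify(sequence: str,
                       guess: Callable[[str], object],
                       verify: Callable[[object], float],
                       n_samples: int = 64,
                       threshold: float = 0.9) -> List[object]:
    """Sample many cheap guesses; keep only those the expensive verifier accepts."""
    survivors = []
    for _ in range(n_samples):
        candidate = guess(sequence)         # fast learned proposal (cheap)
        if verify(candidate) >= threshold:  # physics/geometry check (expensive)
            survivors.append(candidate)
    return survivors
```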
- I'm recalling the increasingly complex transfer functions in recurrent networks,
- The deep pre-processing chains before skip connections,
- The complex normalization schemes before ReLU,
- The convoluted multi-objective GAN networks before diffusion,
- The complex multi-pass models before fully convolutional networks.
So basically, i'm very excited by this. Not because this itself is an optimal architecture, but precisely because it isn't!
Using MSAs might be a local optimum. ESM showed good performance on some protein problems without MSAs. MSAs offer a nice inductive bias and better average performance. However, the cost is doing poorly on proteins where MSAs are not accurate. These include B and T cell receptors, which are clinically very relevant.
Isomorphic Labs, Oxford, MRC, and others have started the OpenBind Consortium (https://openbind.uk) to generate large-scale structure and affinity data. I believe that once more data is available, MSAs will be less relevant as model inputs. They are "too linear".
> We largely adopt the data pipeline implemented in Boltz-1 [1] (Wohlwend et al., 2024), which is an open-source replication of AlphaFold3
[1] https://github.com/jwohlwend/boltz
I believe the story here is largely that they simplified the architecture and scaled it to 3B parameters while maintaining leading results.
However, it seems like anyone can download the parameters for AlphaFold V2: https://github.com/google-deepmind/alphafold?tab=readme-ov-f...
Predicting the end-result from the sequence of protein directly is prone to miss any new phenomenon and would just regurgitate/interpolate the training datasets.
I would much prefer an approach based on first principles.
In theory folding is easy: just run a simulation of your protein surrounded by some water molecules for the same number of nanoseconds nature does.
The problem is that this usually takes a long time, because evolving the system requires computing its energy as a function of the positions of the atoms, which is a complex problem involving quantum mechanics. It's mostly due to the behavior of the electrons; because they are much lighter, they operate on a faster timescale. You typically don't care about them, only about the effect they have on your atoms.
In the past, you would use various Lennard-Jones potentials for pairs of atoms when the atoms are unbonded, and other potentials when they are bonded, and it would get very complex very quickly. But now there are deep-learning-based approaches to computing the energy of the system with a neural network (see (GROMACS) Neural Network Potentials: https://rowansci.com/publications/introduction-to-nnps). You train these networks to learn the local interactions between atoms from trajectories generated with ab-initio theories. This gives you a faster simulator that approximates the more complex physics. In a sense, it just tabulates, in a neural network, the effect the electrons would have in a specific atomic arrangement, according to the theory you have chosen.
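A minimal sketch of that idea: plug a learned potential into a classical integrator. `nnp_energy` is a hypothetical stand-in for a trained neural network potential; forces here come from finite differences, where a real NNP would supply analytic gradients via autodiff.

```python
import numpy as np

def numerical_forces(nnp_energy, pos, eps=1e-4):
    """Force = -dE/dx, estimated by central finite differences."""
    forces = np.zeros_like(pos)
    for idx in np.ndindex(*pos.shape):
        step = np.zeros_like(pos)
        step[idx] = eps
        forces[idx] = -(nnp_energy(pos + step) - nnp_energy(pos - step)) / (2 * eps)
    return forces

def velocity_verlet(nnp_energy, pos, vel, mass, dt, n_steps):
    """Integrate Newton's equations with the learned potential supplying forces."""
    forces = numerical_forces(nnp_energy, pos)
    for _ in range(n_steps):
        vel = vel + 0.5 * dt * forces / mass
        pos = pos + dt * vel
        forces = numerical_forces(nnp_energy, pos)
        vel = vel + 0.5 * dt * forces / mass
    return pos, vel

# Toy stand-in energy: harmonic wells pulling every atom toward the origin.
toy_energy = lambda p: 0.5 * float((p ** 2).sum())
pos, vel = velocity_verlet(toy_energy, np.random.randn(10, 3),
                           np.zeros((10, 3)), mass=1.0, dt=0.01, n_steps=100)
```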
At any time, if you have some doubt, you can always run the slower simulator in a small local neighborhood to check that the effective-field neural network approximation holds.
Only then, once you have a simulator that is able to fold, can you generate a dataset of ("protein sequence", "end of trajectory") pairs to learn the shortcut, like AlphaFold/SimpleFold do. And when in doubt, you can go back to the slower, more precise method.
If you had enough data and could perfectly train a model with sufficient representational power, you could theoretically infer the correct physics just from the correspondence between initial and final arrangements. But if you don't have enough data, it will just learn some shortcut, and you accept that it will sometimes be wrong.
No, the environment is important. Also, some proteins fold while they are still being synthesized.
Folding can also take minutes in some cases, which is the real problem.
> which is a complex problem involving Quantum Mechanics
Most MD simulations use classical approximations, and I don't see why folding is any different.
Speeding up the folding is not the real problem; knowing what happens is. One way to speed up the process is just to minimize the free energy of the configuration (or some other quantity you derive from the neural network potential). (That's what the game Foldit was about: minimizing the Rosetta energy function.) Another way would be to use a generative method, like a diffusion model, to generate a plausible full trajectory (but you need some training dataset to bootstrap the process). Or work with key configuration frames: the simulation can take a long time, but it goes through specific arrangements (the transitions between energy plateaus), and you learn these key points.
The simulator can also be much faster because it doesn't have to consider all pairs of atoms: naive O(n^2) behavior becomes O(n), with n the number of atoms (and the bigger constant of running the neural network hidden inside the O notation).
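The usual trick behind that O(n) claim is a cell list: bin atoms into cells the size of the interaction cutoff, so each atom only checks the 27 nearby cells rather than all n-1 partners. A rough sketch (ours, not from the thread):

```python
from collections import defaultdict
import numpy as np

def neighbor_pairs(pos: np.ndarray, cutoff: float):
    """All atom pairs within `cutoff`, via cell binning: O(n) for bounded density."""
    cells = defaultdict(list)
    for i, p in enumerate(pos):
        cells[tuple((p // cutoff).astype(int))].append(i)
    pairs = []
    for (cx, cy, cz), atoms in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    for j in cells.get((cx + dx, cy + dy, cz + dz), ()):
                        for i in atoms:
                            # i < j visits each unordered pair exactly once.
                            if i < j and np.linalg.norm(pos[i] - pos[j]) < cutoff:
                                pairs.append((i, j))
    return pairs

atoms = np.random.rand(1000, 3) * 10.0   # 1000 atoms in a 10x10x10 box
print(len(neighbor_pairs(atoms, cutoff=1.0)))
```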
The simulations are classical, but fundamentally they rely on the shape of the electron clouds. The electron density can deform (that's what bonding is), providing additional degrees of freedom and allowing the atomic configuration to slide more easily against itself and avoid getting stuck in local optima. Fortunately, all this mess is nicely encapsulated inside the neural network potential, and we can work without worrying about the electrons, their shape being implicitly defined by the current positions of the atoms (the implicit function theorem makes abstracting their behaviour sound, because of the faster timescales).
Potential != free energy. Entropy is a driving force behind folding.
> The simulations are classical but fundamentally they rely on the shape of the electron clouds.
This is not what is meant by classical.
https://www.youtube.com/watch?v=P_fHJIYENdI