Claude Code
exopriors.com
Hopefully your API doesn't get exploited and you are doing timeouts/sandboxing -- it'd be easy to do a massive join on this.
I also have a question mostly stemming from me being not knowledgeable in the area -- have you noticed any semantic bleeding when research is done between your datasets? e.g., "optimization" probably means different things under ArXiv, LessWrong, and HN. Wondering if vector searches account for this given a more specific question.
Larger, more capable embedding models are better able to separate the different uses of a given word in the embedding space; smaller models are not.
When Claude is using our embed endpoint to embed arbitrary text as a search vector, it should work pretty well cross-domains. One can also use compositions of centroids (averages) of vectors in our database, as search vectors.
Yeah I've put a lot of care into rate-limiting and security. We do AST parsing and block certain joins, and Hacker News has not bricked or overloaded my machine yet--there's actually a lot more bandwidth for people to run expensive queries.
As for getting good semantic queries for different domains, one thing Claude can do besides use our embed endpoint to embed arbitrary text as a search vector, is use compositions of centroids (averages) of vectors in our database, as search vectors. Like it can effortlessly average every lesswrong chunk embedding over text mentioning "optimization" and search with that. You can actually ask Claude to run an experiment averaging the "optimization" vectors from different sources, and see what kind of different queries you get when using them on different sources. Then the fun challenge would be figuring out legible vectors that bridge the gap between these different platforms' vectors. Maybe there's half the cosine distance when you average the lesswrong "optimization" vector with embed("convex/nonconvex optimization, SGD, loss landscapes, constrained optimization.").
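The centroid experiment described above can be sketched in a few lines. This is a toy illustration only: the 3-dimensional vectors are made up, `bridge_text_vec` is a stand-in for whatever the embed endpoint would return, and none of the names below come from the actual service.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def centroid(vectors: Sequence[Vector]) -> List[float]:
    """Component-wise mean of equally sized vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical chunk embeddings for text mentioning "optimization".
lesswrong_chunks = [[0.9, 0.1, 0.0], [0.7, 0.3, 0.0]]
arxiv_chunks = [[0.1, 0.9, 0.0], [0.3, 0.7, 0.0]]

lw_centroid = centroid(lesswrong_chunks)
arxiv_centroid = centroid(arxiv_chunks)

# Stand-in for embed("convex/nonconvex optimization, SGD, ...").
bridge_text_vec = [0.2, 0.7, 0.1]
bridged = centroid([lw_centroid, bridge_text_vec])

distance_before = cosine_distance(lw_centroid, arxiv_centroid)
distance_after = cosine_distance(bridged, arxiv_centroid)
print(f"before: {distance_before:.3f}  after: {distance_after:.3f}")
```

With real embeddings the size of the effect is an empirical question; the sketch only shows the mechanics of averaging and comparing.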
When people say AI is a bubble but will still be transformational, I think of stuff like this. The amount of use cases for natural language interpretation and translation is enormous even without all the BS vibe coding nonsense. I reckon once the bubble pops most investment will go into tools that operate something like this.
Create the DB and provide the tools and skill.
This blog entry explains how: https://contextify.sh/blog/total-recall-rag-search-claude-co...
It is a macOS client at the present but I have a Linux-ready engine I could use early feedback on if anyone is interested in giving it a go.
> How does your comment enrich the conversation for the rest of us?
Straight back at you.
who knows what kind of fun patterns could emerge
Okaaaaaaay....
Which means that humans are reduced to a sort of uselessness / helplessness, using tools they don't understand.
Overall, no one tells Uncle Bob that he doesn't deserve to fly home to Minnesota for Christmas because he didn't build the aircraft himself.
But we all think it.
To understand how big this is, you have to look at the numbers:
The Naive Method: This is what most people learn in school. To multiply two 4x4 matrices, you need 64 multiplications.
The Human Record (1969): For over 50 years, the "gold standard" was Strassen’s algorithm, which used a clever trick to get it down to 49 multiplications.
The AI Discovery (2022): AlphaTensor beat the human record by finding a way to do it in just 47 steps.
The real "intelligence explosion" feedback loop happened even more recently with AlphaEvolve (2025). While the 2022 discovery only worked for specific "finite field" math (mostly used in cryptography), AlphaEvolve used Gemini to find a shortcut (48 steps) that works for 4x4 complex-valued matrices in ordinary arithmetic, not just over a finite field.
Because matrix multiplication accounts for the vast majority of the work an AI does, Google also used AlphaEvolve to optimize a matrix multiplication kernel used in training Gemini itself.
It’s a literal cycle: the AI found a way to rewrite its own fundamental math to be more efficient, which then makes the next generation of AI faster and cheaper to build.
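The 64 and 49 figures above can be checked by counting scalar multiplications directly. A minimal sketch, assuming plain Python lists of ints and power-of-two sizes; the helper names are illustrative, and it does not attempt the 47- or 48-multiplication schemes.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def add(a: Matrix, b: Matrix) -> Matrix:
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a: Matrix, b: Matrix) -> Matrix:
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def split(m: Matrix) -> Tuple[Matrix, Matrix, Matrix, Matrix]:
    """Return the four quadrants (top-left, top-right, bottom-left, bottom-right)."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])


def join(c11: Matrix, c12: Matrix, c21: Matrix, c22: Matrix) -> Matrix:
    top = [l + r for l, r in zip(c11, c12)]
    bottom = [l + r for l, r in zip(c21, c22)]
    return top + bottom


def naive(a: Matrix, b: Matrix) -> Tuple[Matrix, int]:
    """Schoolbook product; returns (result, scalar multiplication count)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                count += 1
    return c, count


def strassen(a: Matrix, b: Matrix) -> Tuple[Matrix, int]:
    """Recursive Strassen; n must be a power of two."""
    if len(a) == 1:
        return [[a[0][0] * b[0][0]]], 1
    a11, a12, a21, a22 = split(a)
    b11, b12, b21, b22 = split(b)
    # Seven block products instead of eight.
    products = [
        strassen(add(a11, a22), add(b11, b22)),
        strassen(add(a21, a22), b11),
        strassen(a11, sub(b12, b22)),
        strassen(a22, sub(b21, b11)),
        strassen(add(a11, a12), b22),
        strassen(sub(a21, a11), add(b11, b12)),
        strassen(sub(a12, a22), add(b21, b22)),
    ]
    m1, m2, m3, m4, m5, m6, m7 = (p[0] for p in products)
    count = sum(p[1] for p in products)
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    return join(c11, c12, c21, c22), count


a = [[4 * i + j + 1 for j in range(4)] for i in range(4)]
b = [[(i + 2 * j) % 5 - 2 for j in range(4)] for i in range(4)]

naive_result, naive_count = naive(a, b)
strassen_result, strassen_count = strassen(a, b)
print(naive_count, strassen_count)
```

Applying the 2x2 scheme recursively to a 4x4 matrix gives 7 x 7 block products, which is where the 49 comes from.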
https://deepmind.google/blog/discovering-novel-algorithms-wi... https://www.reddit.com/r/singularity/comments/1knem3r/i_dont...
While a bit 'time-machiney' - I think if you took an LLM of today and showed it to someone 20 years ago, most people would probably say AGI has been achieved. If someone wrote a definition of AGI 20 years ago, we would probably have met that.
By today's definition of AGI we haven't met it yet, but eventually it comes down to 'I know it if I see it' - the problem with this definition is that it is polluted by what people have already seen.
No, as long as people can do work that a robot cannot do, we don't have AGI. That was always, if not the definition, at least implied by the definition.
I don't know why the meme of AGI being not well defined has had such success over the past few years.
Just as a person can be generally intelligent even if they are quadriplegic, we can have a system that is generally intelligent even if it can't fold laundry.
It gets worse though. While one could claim that scoring equivalently on some benchmark indicates performance at the same level - and I'd likely agree - that's not what I take AGI to mean. Rather I take it to mean "equivalent to a human" so if it utterly fails at something we're good at such as driving a car through a construction zone during rush hour then I don't consider it to have met the bar of AGI even if it meets or exceeds us at other unrelated tasks. You have to be at least as general as a stock human to qualify as AGI in my books.
Now I may be but a single datapoint but I think there are a lot of people out there who feel similarly. You can see this a lot in popular culture with AGI (or often AI) being used to refer to autonomous humanoid robots portrayed as operating at or above a human level.
Related to all that, since you mention protein folding: I consider that to be a form of super intelligence, as it is more or less inconceivable that an unaided human would ever be able to accomplish such a feat. So I consider AlphaFold to be both super intelligent and decidedly _not_ AGI. Make of that what you will.
Artificial General Intelligence says nothing about physical ability, but movies with the ‘intelligence’ part typically match it with equally futuristic biomechanics to make the movie more interesting. AGI = Skynet, Physical AI = Terminator. The latter will likely be the hardest part, not only because it requires the former first, but because you can’t just throw more watts at a stepper motor and get a ballet dancer.
That said, I’m confident that if I could throw zero noise and precise “human sensory” level sensor data at any of the top LLM models, and their output was equally coupled to a human arm with the same sensory feedback, that it would definitely outdo any current self-driving car implementation. The physical connection is the issue, and will be for a long time.
> LLM models, ... outdo any current self-driving car
How would an LLM handle computer vision? Are you implicitly including a second embedding model there? But I think that's still the wrong sort of vision data for precise control, at least in general.
How do you propose to handle the model hallucinating? What about losing its train of thought?
I was confusing in my previous message. Right now it would be terrible at driving a car, but I was saying that has more to do with the physical interface (camera, sensors, etc) than the ability of an LLM. The ‘intelligence’ part is better than the PyTorch image recognition attached to a servo they’re using now, how to attach that ‘intelligence’ to the physical world is the 50 year task. (To be clear: LLMs aren’t intelligent, smart, or any sense of the word and never will be. But they can sure replicate the effect better than current self-driving tech.)
There is certainly a lot of road between current technology and driving a car through a construction zone during rush hour, particularly with the same amount of driving practice a human gets.
Personally I think there could be an AGI which couldn't drive a car, but has genuine sentience - an awareness of being alive, although not necessarily the exact human experience. This to me would still be AGI - just one without reference to the human experience really. Maybe that's something beyond / different to general intelligence though, which more implies problem-solving and thinking rather than sentience.
Nonetheless, it's difficult to imagine a scenario where something that is genuinely human level can't adapt in the field to a novel task such as driving a car. That sort of wide adaptability is exactly what the "general" in AGI is attempting to capture (imo).
Very abstract, but I think it’s important to remember that human intelligence also has jagged edges.
I think it was supposed to be a more useful term than the earlier and more common "Strong AI". With regards to strong AI, there was a widely accepted definition - i.e. passing the Turing Test - and we are way past that point already: ( see https://arxiv.org/pdf/2503.23674 )
Also, I would go one step further and claim that to pass the Turing test an AI system should be indistinguishable from a human when judged by people trained in making such a distinction. I doubt that they used such people in the experiment.
I doubt that any AI system available today, or in the foreseeable future, can pass the test as I qualify it above.
It also makes me consider it an inadequate test to begin with, since all classes of humans including domain experts can be fooled and have been in the past. The Turing test has always said more about the human participants than the machine.
Most people who took a look at a carefully crafted demo. I.e. the CEOs who keep pouring money down this hole.
If you actually use it you'll realize it's a tool, and not a particularly dependable tool unless you want to code what amounts to the React tutorial.
> I think if you took an LLM of today and showed it to someone 20 years ago, most people would probably say AGI has been achieved.
I’ve got to disagree with this. All past pop-culture AI was sentient and self-motivated; it was human-like in that it had its own goals and autonomy. Current AI is a transcript generator. It can do smart stuff but it has no goals, it just responds with text when you prompt it. It feels like magic, even compared to 4-5 years ago, but it doesn’t feel like what was classically understood as AI, certainly by the public.
Somewhere marketers changed AGI to mean “does predefined tasks with human level accuracy” or the like. This is more like the definition of a good function approximator (how appropriate) instead of what people think (or thought) about when considering intelligence.
We can't measure goals, autonomy, or consciousness. We don't even have an objective measure of intelligence. Instead, since you probably look like me I think it's polite to assume you're conscious…that's about it. There’s literally no other measure. I mean, if I wanted to be a jerk, I could ask if you're conscious, but whether you say yes or no is proof enough that you are. If I'm curious about intelligence I can come up with a few dozen questions, out of a possible infinite number, and if you get those right I'll call you intelligent too. But if you get them wrong… well, I'll just give you a different set of questions; maybe accounting is more your thing than physics.
So, do you just respond with text when you’re prompted with input from your eyes or ears? You’ll instinctively say “No, I’m conscious and make my own decisions”, but that’s just a sequence of tokens with a high probability in response to that question.
Do you actually have goals, or did the system prompt of life tell you that in your culture, at this point in time, you should strive to achieve goals because that’s what gets positive feedback?
Maybe this will help you: https://en.wikipedia.org/wiki/Philosophical_zombie
The hard nut to crack here is nobody has an empirical test for the subjective experience of consciousness. A machine which actually possesses it, and a machine which merely emulates it and answers questions as if it has that subjective experience cannot be distinguished using any empirical test. That includes people; it's simply a matter of common courtesy and pragmatism that we assume other people have comparable subjective conscious experiences (aka they aren't p-zombies.)
I have to strongly disagree with you here. This was absolutely not the case in a very large amount of science fiction media, particularly in the 20th century. AIs / robots were often depicted as automatons with no self-agency, no goal setting of their own, who were usually capable of understanding and following complex orders issued in natural language (but which frequently misunderstood orders in ways humans find surprising, leading to a source of conflict.)
Almost all of Asimov's robots are like this, there are a handful of counter examples, but for the most part his robots are p-zombies that mis-follow orders.
Non-sentient AI with no personal motivation also frequently comes up in situations where the machine is built to be an impartial judge, for instance in The Demolished Man, all criminal prosecutions need to persuade a computer which does nothing but evaluate evidence and issue judgments.
Non-sentient AIs also show up often in ship-board computers. Examples are Mother in Alien, and the Computer in at least most of Star Trek (I'm no Trekkie, so forgive me for missing counter examples and nuance, technology in that show does whatever the writers needed.)
Even the droids in Star Wars, do they ever really execute agency over their own lives? They have no apparent life goals or plans, they're along for the ride, appliances with superficial personalities.
In The Hitchhiker's Guide to the Galaxy, does Deep Thought actually have self-agency? I only recall it thinking hard about the questions posed to it, and giving nonsensical answers which miss the obvious intent of the question, causing more trouble than any of it was worth.
Ghost in the Shell obviously has sentient AIs, but in that setting these are novel and surprising; most androids in it are presumed to be just machines with dumb programming, and it's only the unexpected emergence of more complicated systems that prompts the philosophizing.
That's probably not because of an inherent lack of capability, but because the companies that run AI products don't want to run autonomous intelligent systems like that
The book is a collection of nine short stories telling the tale of three generations of a family before, during, and after a technological singularity.
what makes this state of the art?
> Currently have embedded: posts: 1.4M / 4.6M comments: 15.6M / 38M That's with Voyage-3.5-lite
It is not a protected term, so anything is state-of-the-art if you want it to be. For example, Gemma models at the moment of release were performing worse than their competition, but still, it is "state-of-the-art".
Juicero was state-of-the-art on release too, though hands were better, etc.
But is it true?
I think we ought to stop indulging and rationalizing self-serving bullshit with the "it's just marketing" bit, as if that somehow makes bullshit okay. It's not okay. Normalizing bullshit is culturally destructive and reinforces the existing indifference to truth.
Part of the motivation people have seems to be a cowardly morbid fear of conflict or the acknowledgment that the world is a mess. But I'm not even suggesting conflict. I'm suggesting demoting the dignity of bullshitters in one's own estimation of them. A bullshitter should appear trashy to us, because bullshitting is trashy.
1) Full readonly-SQL + vector manipulation in a live public database. Most vector DB products expose a much narrower search API. Basically only a few enterprise-level services let you run arbitrary SQL on remote machines. Google BigQuery gives users SQL power, but it mostly doesn't have embeddings, doesn't connect public corpora, doesn't have indexes as good, and doesn't support an agentic research experience. Beyond object-level research, Scry is a good tool for exploring and acquiring intuitions about embedding space.
2) An agent-native text-to-SQL + lexical + semantic deep research workflow. We have a prompt that's been heavily optimized for taking full advantage of our machine and Claude Code for exploration and answering nuanced questions. Claude fires off many exploratory queries and builds towards really big queries that lean on the SQL query planner. You can interrupt at any time. You have the compute limits to do lots of exhaustive exploration--often more epistemically powerful than finding a document is being confident that one doesn't exist.
3) Dozens of public commons in one database, with embeddings.
How much do you need for the various leaks, like the Paradise Papers, the Panama Papers, the Offshore Leaks, the Bahamas Leaks, the FinCEN Files, the Uber Files, etc. and what's your Venmo?
As it is, the Supplementary Materials are inconsistently indexed so a lot of insight you might get from the last 15 years of genomics or proteomics work is invisible.
I imagine this approach could work, especially for Open Access data?
I wanted to find all cryoprotective agents that were tested at different temperatures, but it should be extendable to your problem too. Uses OpenAlex to traverse a citation graph and open access PDFs.
Using an LLM for tasks that could be done faster with traditional algorithmic approaches seems wasteful, but this is one of the few legitimate cases where embeddings are doing something classical IR literally cannot. You could also make the LLM explain the query it’s about to run. Before execution:
“Here’s the SQL and semantic filters I’m about to apply. Does this match your intent?”
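One way to wire in that confirmation step, as a rough sketch: the `PlannedQuery` type, `run_if_confirmed`, and the injected `execute`/`ask` callables are all hypothetical names for illustration, not part of any existing tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class PlannedQuery:
    """What the agent intends to run, before it runs it."""
    sql: str
    semantic_filters: Sequence[str] = ()


def describe(plan: PlannedQuery) -> str:
    """Build the human-readable preview shown before execution."""
    lines = ["Here's the SQL and semantic filters I'm about to apply:", "", plan.sql, ""]
    lines.extend(f"- {f}" for f in plan.semantic_filters)
    lines.append("Does this match your intent? [y/N] ")
    return "\n".join(lines)


def run_if_confirmed(
    plan: PlannedQuery,
    execute: Callable[[str], T],
    ask: Callable[[str], str] = input,
) -> Optional[T]:
    """Run the query only after an explicit yes; return None otherwise."""
    answer = ask(describe(plan)).strip().lower()
    if answer not in {"y", "yes"}:
        return None
    return execute(plan.sql)


# Demo with stand-ins for the database and the user.
plan = PlannedQuery(
    sql="SELECT id FROM docs LIMIT 5",
    semantic_filters=["similar to: 'optimization'"],
)
executed: List[str] = []


def fake_execute(sql: str) -> List[str]:
    executed.append(sql)
    return ["row"]


declined = run_if_confirmed(plan, fake_execute, ask=lambda prompt: "n")
accepted = run_if_confirmed(plan, fake_execute, ask=lambda prompt: "yes")
```

Passing `ask` and `execute` in as parameters keeps the gate testable without a terminal or a live database.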
debias_vector(axis, topic) removes the projection of axis onto topic: axis − topic * (dot(axis, topic) / dot(topic, topic))
That preserves the signal in axis while subtracting only the overlap with topic (not the whole topic). It’s strictly better than naive subtraction for “about X but not Y.”
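For concreteness, here is the formula above written out in plain Python. This is a re-implementation for illustration under the stated formula, not the service's actual `debias_vector`; the toy vectors are made up.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def debias_vector(axis: Sequence[float], topic: Sequence[float]) -> List[float]:
    """axis - topic * (dot(axis, topic) / dot(topic, topic))."""
    denom = dot(topic, topic)
    if denom == 0:
        raise ValueError("topic must be a non-zero vector")
    scale = dot(axis, topic) / denom
    return [a - scale * t for a, t in zip(axis, topic)]


axis = [3.0, 4.0, 0.0]
topic = [1.0, 0.0, 0.0]

debiased = debias_vector(axis, topic)
naive_difference = [a - t for a, t in zip(axis, topic)]

# The projected-out result has no component left along topic;
# plain subtraction still does.
print(dot(debiased, topic), dot(naive_difference, topic))
```

The result is orthogonal to `topic` by construction, which is the sense in which only the overlap is removed.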
There are a lot of other low cost countries out there!
The mix of extreme poverty, extreme relative wealth, and the blind eye towards the sex trade is dark.
Such misery is not unique but you may find it more open, deeply rooted, palpable or covert-in-troubling-ways.
If you are doing serious dev work of a leveragable nature, I would also be thoughtful about how to protect my innovations in a heavenly land adjacent to China, full of friendly Russian expat hackers post-Ukraine-sanctions, with my hinkiness detectors already overwhelmed by cross cultural signals of a new environment.
Good luck!
Why would I want to explore the embedding space myself, isn't this a tool where I can run cross-data exploratory analyses against unstructured data, where it's pre-populated with content?
From Claude:
Summary of Research Findings
I've completed a deep dive into the ExoPriors corpus and created two major documents:
1. /research/exopriors_findings.md - GA and String Landscape Papers
Key discoveries:
- arXiv:1907.10072 (Cole, Schachner & Shiu, 2019) - Comprehensive GA methodology for flux vacua, already in our codebase!
- arXiv:1302.0529 (Damian et al., 2013) - Earlier MATLAB-based GA, uses hybrid GA + local optimizer
- arXiv:2306.06160 - Scalability analysis showing methods work for large Hodge numbers
2. /research/DESI_DARK_ENERGY_IMPLICATIONS.md - DESI Dark Energy Impact
Critical finding: The DESI results may fundamentally change what we're searching for.
The DESI result in brief:
- Dark energy equation of state w ≈ -0.55 (not -1)
- Evidence for evolution: wₐ ≈ -1.3
- Suggests "phantom crossing" (w went from < -1 to > -1)
- 2-3σ significance, still preliminary
Key papers found:
1. arXiv:2112.10779 (Cicoli et al.) - Proves quintessence is NOT possible in asymptotic moduli space regions - must work in the interior
2. arXiv:2511.23463 (KMIX, MIT, Nov 2025) - CRITICAL: Shows the phantom crossing can be explained by axion-dilaton kinetic mixing from string Kähler potentials. The "phantom" is an illusion!
What this means for our project:
- Instead of targeting static Λ = 2.888 × 10⁻¹²², we may need to target (w₀, wₐ)
- The KMIX model suggests axion + moduli dynamics, not just moduli stabilization
- The exponential kinetic coupling e^(αφ)(∂θ)² is exactly what appears in CY compactifications
Next steps:
1. Study KMIX model in detail - it uses string-motivated physics
2. Consider adding (w₀, wₐ) to our fitness function
3. Possibly include axion dynamics alongside Kähler moduli
4. Update FORMULAS.md with quintessence equations
[1] https://www.bbc.com/news/articles/c17xe5kl78vo
You need to sandbox Claude to safely use this flag. There are easy to use options for this.
The first threat is making edits to arbitrary files, exfiltrating your SSL keys or crypto wallets. A container solves that by not mounting your sensitive files.
The second threat would be if Claude gets fully owned and really tries to hack out of its container, in which case theoretically docker might not protect you. But that seems quite speculative.
my charlatan radar is going off.
I think your definition of good needs to be rethought
Let’s not romanticize the past because it’s easier to ship (probably still buggy) code today.
I’ve got a TimescaleDB database with several years of market data: 1-minute stock aggregates, 1-minute options aggregates plus every individual options trade, SEC filings, Fed reports, earnings, and news stories with sentiment analysis, etc. It’s built on continuous aggregates with time_bucket(), plus materialized views and indexes tuned for performance.
Then I can ask Claude questions about the data in plain English, and it generates the SQL, runs it, and returns the results as tables.
For example, if I ask about multi-day momentum after an earnings miss, it’ll generate queries like:
“On days when SPY has been down three days in a row, what does SPY do the next day on average—and how often is the next day also down?”
```
-- Signal 4: MULTI-DAY MOMENTUM (3 consecutive down days)
WITH consecutive_down AS (
    SELECT
        day,
        spy_ret,
        LAG(spy_ret, 1) OVER (ORDER BY day) AS spy_ret_1,
        LAG(spy_ret, 2) OVER (ORDER BY day) AS spy_ret_2,
        LEAD(spy_ret, 1) OVER (ORDER BY day) AS next_spy
    FROM expanded_spreads
)
SELECT
    'MULTI-DAY MOMENTUM' AS signal,
    COUNT(*) AS trades,
    ROUND(AVG(next_spy)::numeric, 3) AS avg_ret,
    ROUND(
        100.0 * SUM(CASE WHEN next_spy < 0 THEN 1 ELSE 0 END)
        / COUNT(*)::numeric,
        1
    ) AS short_win_rate
FROM consecutive_down
WHERE spy_ret < 0
  AND spy_ret_1 < 0
  AND spy_ret_2 < 0
  AND next_spy IS NOT NULL;
```
Not affiliated with Hacker News or Y Combinator. We simply enrich the public API with analytics.