Things I've Learned in My 7 Years Implementing AI

Posted3 months agoActive2 months ago

jampa

156 points

55 comments

jampa.devTechstoryHigh profile

calmmixed

Debate

60/100

AILlmsProduct DevelopmentStartups

Key topics

Llms

Product Development

Startups

The author reflects on 7 years of implementing AI, sharing lessons learned and observations on the current state of AI development, sparking a discussion on the limitations and potential of AI as a product or tool.

Snapshot generated from the HN discussion

Discussion Activity

Very active discussion

First comment

22m

Peak period

Day 1

Avg / period

Comment distribution55 data points

Loading chart...

Based on 55 loaded comments

Key moments

01Story posted
Oct 15, 2025 at 2:27 PM EDT
3 months ago
Step 01
02First comment
Oct 15, 2025 at 2:49 PM EDT
22m after posting
Step 02
03Peak activity
51 comments in Day 1
Hottest window of the conversation
Step 03
04Latest activity
Oct 28, 2025 at 1:02 PM EDT
2 months ago
Step 04

Generating AI Summary...

Analyzing up to 500 comments to identify key contributors and discussion patterns

Discussion (55 comments)

Showing 55 comments

nh23423fefe

3 months ago

1 reply

nit: space bar doesn't scroll

this list isnt learnings.

sodapopcan

3 months ago

2 replies

It doesn't. It's a jump scare, and a stupid one at that.

jampaAuthor

3 months ago

3 replies

I'm not sure what you mean. This is the vanilla Substack layout.

EDIT: Removed the video because a bug in Substack causes the space bar to play the video instead of scrolling down. Sorry for the unintentional jumpscare.

stronglikedan

3 months ago

1 reply

It means scrolling with the space bar is janky and broken, instead of smooth and working properly.

jampaAuthor

3 months ago

Removed the video because a bug in Substack causes the space bar to play the video instead of scrolling down. Sorry for the unintentional jumpscare.

sodapopcan

3 months ago

Oh I didn't even see my video. I honestly thought it was a sick joke, lol. My speakers were turned way up.

CamperBob2

3 months ago

Seems to make noise (I closed the page ASAP so not sure what it actually did.)

SeanAnderson

3 months ago

1 reply

holy shit lol. I should've trusted you but I hit spacebar and was NOT expecting that hahaha.

jay_kyburz

3 months ago

Oh, I wanted to see what you guys are talking about, but I just get a scroll. I wonder if they changed it already.

wppick

3 months ago

3 replies

The one issue is the accuracy of these AI models, which is that you can't -really- trust them to do a task fully, so that makes it hard to fully automate things with them. But the other is cost. Anyone using these models to do something at scale is paying maybe 100X over would it would cost in compute to run deterministic code to do the same thing. So in cases where you can write deterministic code to do something, or build a UI for a user to do it themselves, that still seems to be the best way. Once AI gets to the point where you can fully trust some model, then we've probably already hit AGI and at that point we're probably all in pods with a cable in our brainstems, so who cares...

frankc

3 months ago

3 replies

The thing is that I don't use AI to replace things I can do deterministically with code. I use it to replace things I cannot do deterministically with code - often something I would have a person do. People are also fallible and can't be completely trusted to do the thing exactly right. I think it works very well for things that have a human in the loop, like codeing agents where someone needs to review changes. For instance, I put an agent in a tool for generating aws access policies from english descriptions or answering questions about current access (where they agent has access to tools to see current users, buckets policies etc). I don't trust the agent to do it exactly right so it just proposes the policies and I have to accept or modify them before they are applied, but its still better than writing them myself. And it's better than having a web interface do it because that is lacking context.

I think it's a good example of the kind of internal tools the article is talking about. I would not have spent the time to build this without claude making it much faster to build stand-alone projects and I would not have the agent to do the english -> policy output with LLMs.

enraged_camel

3 months ago

1 reply

>> The thing is that I don't use AI to replace things I can do deterministically with code. I use it to replace things I cannot do deterministically with code - often something I would have a person do.

Nailed it. And the thing is, you can (and should) still have deterministic guard rails around AI! Things like normalization, data mapping, validations etc. protect against hallucinations and help ensure AI’s output follows your business rules.

Terr_

3 months ago

> Things like normalization, data mapping, validations etc. protect against hallucinations

And further downstream: Audit trails, human sign-offs, operations which are reversible or have another workflow for making compensating actions to fix it up.

asdff

3 months ago

1 reply

Or, you could make a tool that can generate this stuff deterministically every time the exact same way. At least with that situation you can audit the tool and see if it is correct or not. You still leave the point of failure on the user in your situation, even higher because they could get complacent with the llm output and assume it is correct or mistakenly think it is correct.

In my mind you are trading potentially a function that always evaluates the same for a given f(x) for one that might not evaluate the same and requires oversight.

Bratmon

3 months ago

So how would you implement "generating aws access policies from english descriptions" using deterministic code in a way that doesn't require human oversight?

teleforce

3 months ago

> I think it works very well for things that have a human in the loop, like codeing agents where someone needs to review changes

This is the best case for AI, it's not very different from the level 3 autonomous car with driver in the loop instead of fully autonomous level 5 vehicle that probably requires AGI level of AI.

The same applies to medicine where limited number specialists (radiologist/cardiologist/oncologist/etc) in the loop are being assisted by AI for activities that probably require too much time for experts manually looking at laborious evidences especially for non-obvious early symptom detection (X-ray/ECG/MRI) for the modern practice of evidence based medicine.

dayvid

3 months ago

The article kind of addresses that in identifying what are the best type of problems AI can solve

bsder

3 months ago

> Anyone using these models to do something at scale is paying maybe 100X over would it would cost in compute to run deterministic code to do the same thing

That's fine if the person wouldn't be able to write the code otherwise.

There are lots and lots of people in positions that are "programming adjacent". They use computers as their primary tool and are good at something (like CAD), but can't necessarily sling code. So, a task like: "We're about to release these drawings to an external client. Please write a script to check that all the drawings have author, project, and contract number that matches what they should for this client and flag any that don't." is good AI bait. Or "Please shovel this data from X, Y, and Z into an Excel Spredsheet" is also decent AI bait.

Programmers underestimate how difficult it is to synthesize code from thin air. It is much easier to read a small script than to construct it.

isoprophlex

3 months ago

3 replies

> Creating AI models is hard, but working with them is simple

I'm not disagreeing with the overall post, but from closely observing end users of LLM-backed products for a while now, I think this needs nuance.

The average joe, be it a developer, random business type, a school teacher or your mum, is very bad at telling an llm what it should do.

- In general people are bad at expressing their thoughts and desires clearly. Frontier LLMs are still mostly sycophantic, so in absence of clear instructions they will make up things. People are prone to treating the LLM as a mind reader, without critically assessing if their prompts are self-contained and sufficiently detailed.

- People are pretty bad at estimating what kind of data an LLM understands well. In general data literacy, and basic data manipulation skills, are beneficial when the use case requires operating on data besides natural language prompts. This is not a given across user bases.

- Very few people have a sensible working model of what goes on in an autoregressive black box, so they have no intuition on managing context

User education still has a long way to go, and IMO is a big determining factor in people getting any use at all from the shiny new AI stuff that gets slathered onto every single software product these days

dayvid

3 months ago

1 reply

Prompt engineering is a transitory phase. Embedding it into existing tools so the 80-90% of regular prompt patterns can be worked into the UI (or contextual UI designed around how a user uses the product) is the next step

isoprophlex

3 months ago

1 reply

Yeah if you mean that LLMs are used with UIs on top that allow "too much magic", I agree.

Free form chat is pretty terrible. People just want the thing to (smartly) take actions. One or two buttons that do the thing, no prompting involved, is much less complicated.

gubicle

3 months ago

1 reply

The whole point is the prompt (+ a static set of (system)prompts). If your whole function as a human is clicking one of a set of buttons to trigger an AI action, then you are automate-able in a few lines of code (and the AI is better than you at deciding which button to click anyway (supposedly)).

There are like thousands wrappers around LLMs masquerading as AI apps for specialized usecases, but the real performance of these apps is really only bottlenecked by the LLM performance, and their UIs generally only get in the way of the direct LLM access/feedback loop.

To work with LLMs effectively you need to understand how to craft good prompts, and how to read/debug the responses.

dayvid

3 months ago

1 reply

I mean if you’re building for a consumer and you know what most of them may prompt, you can interface it with the UI so it’s not a game of hope you’re good at prompting because if not your experience isn’t going to be good. You could still offer a text panel if it fails

gubicle

3 months ago

1 reply

What does 'interface it with the UI' mean though? How does adding buttons make it easier for the user to work with the AI? The whole point is that users can control it using the most natural and ubiquitous way possible - through natural language.

Yeah, it often makes sense to adjust the user's prompt, add system/wrapper prompts, etc. But that's not really related to UI..

dayvid

2 months ago

A lot of people don't know how to ask for what they want or ask it in different ways. If you can normalize this, you can normalize results. When consistent results are more important, introducing guardrails via UI or a guided flow is more relevant

wppick

3 months ago

1 reply

The higher your executive function the more use you will get out of LLMs. This is really the skill you should be testing for in interviews now. Not letting a candidate use AI for their interview is not a useful evaluation anymore. I want to see how you use it. Do you prompt well, how much are you trusting and verifying what it outputs

procaryote

3 months ago

2 replies

Is "executive function" the thing you mean here? You need a lot of self control to write good prompts?

arthurcolle

3 months ago

1 reply

Executive functions just means exerting control over things. It can be self-control, but that can mean control over one's focus

procaryote

3 months ago

1 reply

Yeah, I'm just struggling a bit to see how that is the key thing to write a good prompt

wppick

3 months ago

https://en.wikipedia.org/wiki/Executive_functions

"Executive functions include basic cognitive processes such as attentional control, cognitive inhibition, inhibitory control, working memory, and cognitive flexibility. Higher-order executive functions require the simultaneous use of multiple basic executive functions and include planning and fluid intelligence (e.g., reasoning and problem-solving)."

Executive function is not just emotional control. It's higher levels of thinking. It's the E in CEO. AI is really just an amplifier, just like how a calculator in one person's hand could be much powerful than in someone else's

kilna

3 months ago

It takes good self control to not go down rabbit holes writing the wrong prompts, or prompts that produce interesting or pleasing results without necessarily solving the problem you intended to solve. Sycophantic LLMs are an addiction engine, in addition to being a guess-based autocomplete for thoughts.

dgfitz

3 months ago

1 reply

Is this a “you’re holding it wrong, you idiot” post?

isoprophlex

3 months ago

No no definitely not meant to be condescending; our ui paradigms don't align with users that just want to get stuff done. See what dayvid writes, too.

and everyone's being sold on this tech being super magic but to some questions there is an irreducible complexity that you have to deal with, and that still takes effort.

random3

3 months ago

1 reply

I like the engineering part at the top but projecting AI perspectives blindsided through the lens of LLMs is effectively "looking backwards".

So this is nice

> productionizing their proof-of-concept code and turning it into something people could actually use.

because it's so easy to glamorize research, while ignoring what actually makes ideas products.

This is also the problem. It's a looking back perspective and it's so easy to be miss the forest from the trees when you're down in the weeds. I'm talking from experience and it's a feeling I get when reading the post.

In the grand scheme of things our current "AI" will probably look like a weird detour.

Note that a lot of these perspectives are presented (and thought) without a timeline in mind. We're actually witnessing timelines getting compressed. It's easy to see the effects of one track while missing the general trend.

This take is looking at (arguably "over") LLM timeline, while missing everything else that is happening.

only-one1701

3 months ago

Ok, so if projecting through the lens of LLMs is looking backwards...what would you have preferred to see?

cadamsdotcom

3 months ago

2 replies

> What I do see is a boom in internal tools.

It’s easy now to get something good enough for use by you, friends, colleagues etc.

As it’s always been, developing an actual product is at least one order of magnitude more work. Maybe two.

But both internal tools and full products are made one order of magnitude easier by AI. Whole products can be made by tiny teams. And that’s amazing for the world.

zwnow

3 months ago

2 replies

> Everything got one order of magnitude easier thanks to AI.

No. Not at all. Many things maybe got easier but a lot of things got magnitudes harder. Maintaining bug bounty programs for example, or checking the authenticity and validity of written content on blogs.

Calling LLMs are a huge win for humanity is incredibly naive given we dont know the long term effects these tools are having on creativity in online spaces, authenticity of user bases, etc etc.

cadamsdotcom

3 months ago

1 reply

A thing doesn’t have to be universally good to be good.

zwnow

3 months ago

1 reply

Obviously, but id refrain from calling something good without knowing the extend of its damage.

reppap

3 months ago

Agreed. We don't know if AI is another asbestos or not, yet.

leptons

3 months ago

>we dont know the long term effects these tools are having on creativity in online spaces, authenticity of user bases, etc etc.

Some artificial flavorings and artificial coloring have been proven to be cancerous. I doubt artificial intelligence is going to end up being much different.

asdff

3 months ago

Whole products could always be made by tiny teams.

blobbers

3 months ago

2 replies

"There’s a reason we’re not seeing a “Startup Boom” AI skeptics ask, “If AI is so good, why don’t we see a lot of new startups?” Ask any founder. Coding isn’t even close to the most challenging part of creating a startup."

-- uhhh... am I the only one seeing a startup boom??? There are a bajillion kids working on AI start ups these days.

at-fates-hands

3 months ago

1 reply

Nope, you're spot on.

AI deals continued to dominate venture funding during the third quarter. AI companies raised $19 billion in Q3, according to Crunchbase data. That figure represents 28% of all venture funding.

The fourth quarter of 2024 has been no less busy for these outsized rounds. Elon Musk’s xAI raised a behemoth $6 billion round, one of seven AI funding rounds over $1 billion in 2024, in November. That’s just months after OpenAI raised its $6.6 billion round.

https://techcrunch.com/2024/12/20/heres-the-full-list-of-49-...

blobbers

3 months ago

Yeah - it's basically like the days when everyone added "Deep" to their name to get an extra 0 on their valuation.

What do you do? Oh we do DeepCoffee brewing. It's a coffee machine powered by Deep Learning to brew the perfect cup. Keurig and Starbucks are Yahoo, and we're Google. (now people probably say those guys are google and we're openai but I digress)

asdff

3 months ago

Startups optimize for things that secure initial seed money not things that stand up on their own two feet. That is why most fail. What VCs invest in and what succeeds is not really correlated.

personjerry

3 months ago

I stopped reading at "AI skeptics ask, “If AI is so good, why don’t we see a lot of new startups?”" because what???

apf6

3 months ago

> AI as a product isn’t viable

Yeah I don't know about that, the model providers like OpenAI, Anthropic, etc, literally sell intelligence as a product. And their business model is looking a lot more stable in the long term than all the startups built on top.

Nevermark

3 months ago

> AI as a product isn’t viable: It’s either a tool or a feature

This correlates with the natural world. Intelligence isn’t a direct means of survival for anything. It isn’t a requirement for physical health.

It is an indirect means, I.e. a tool.

basisword

3 months ago

>> What I do see is a boom in internal tools.

This has been my main use case for AI. I have lots of ideas for little tools to take some of the drudgery out of regular work tasks. I'm a developer and could build them but I don't have the time. However, they're simple enough that I can throw them together in a basic script form really quickly with Cursor. Recently I built a tool to analyse some files, pull out data, and give me it in the format I needed. A relatively simple python script. Then I used Cursor to put it together with a simple file input UI in an electron app so I could easily share it with colleagues. Like I say, I've been developer for a long time but never written python or packaged an electron app and this made it so easy. The whole thing took less than 20mins and it was quick enough that I could do it as part of the task I was doing anyway rather than additional work I needed to find time to do.

Zababa

3 months ago

> LLMs won’t get much better, but that is okay

> The last releases were unimpressive. Does anyone know a real application where ChatGPT 5 can do something that o3 could not?

Yeah, GPT-5 is way better at coding in codex, especially for longer tasks. Opus 4.1 is pretty good too. Gemini 3 is dropping soon. GPT-5 was more about reducing costs/having a router so when your doctor asks a question to chatgpt he's routed to the thinking version from what I understand.

> The good news is that what we have is enough for most people.

This is true.

Also stop using LMArena as an indicator of anything, it hasn't meant much for more than 6 months.

aDyslecticCrow

3 months ago

This article aligns very well to my frustration to the current view of AI in media and discussion.

> AI tools like KNNs are very limited but still valuable today.

I've seen discussions calling even feed-forward CNNs, monte-carlo chains, or GANs "antiquated" because transformers and diffusion have surpassed their performance on many domains. There is a hyper-fixation on large transformers and a sentiment that it somehow replaces everything that came before in every domain.

It's a tool that unlocks things we could not do before. But it doesn't do everything better. It does plenty of things worse (at-least taking power and compute into account). Even if it can do algebraic now (as is so proudly proclaimed in the benchmarks), wolfram alpha remains and will continue to remain far more suited to the task. Even if it can write code; it does NOT replace programming languages as I've seen people claim in very recent posts on here on HN.

groby_b

3 months ago

Many good points in the article, but I'd caveat that if you judge performance only by ELO score, you are not applying the best criteria.

SWEBench performance moved from 25% to 70% solved since beginning of 2025, and even with the narrowest possible lens from 65% to 70% since May. ARC-AGI2 keeps rapidly climing. We have experimental models able to (maybe) hold their ground at IMO gold. As well as performing at IPhO gold level.

And that leaves out the point that LMArena is a popularity contest. Not "was this correct", but "which answer did you like better". The thing that brought us glazing. A leveling ELO (or a very slowly climbing one) is kind of expected, and is really saying nothing about progress in the field.

Still doesn't mean "THE MACHINE GODS ARE COMING", but I would expect to see continued if slowing improvement. (I mean, how much better can you get at math if you already can be a useful assistant to Terence Tao and win IMO gold? And wouldn't we expect that progress to slow?)

But more than "how hard of a problem can you solve", I expect we'll see a shift to instead looking at missing capabilities. E.g. memory - currently, you can tell Claude 500 times to use uv, not pip, and it will cheerfully not know that again the 501st time. That's much more important than "oh, it now solves slightly harder problems most of us don't have". And if you look at arxiv papers, a lot are peeking in that direction.

I'd also expect work on efficiency. "can we not make it cost about the amount of India's budget to move the industry forward, every year" is kind of a nice idea. No, I'm not making the number up, or taking OAIs fantasy numbers - 2025 AI industry capex is expected to be $375B. We'll likely need that efficiency if we want to get significantly better at difficulty level or task length, too.

sleazebreeze

3 months ago

Contrary to the OP, there is a useless chatbot on the Amazon homepage ("Rufus" sparkle button).

asdev

3 months ago

You can't rely on them directly for business value, but you can rely on them indirectly is the summary of this post

View full discussion on Hacker News

ID: 45596602Type: storyLast synced: 11/20/2025, 6:45:47 PM

Want the full context?