Product Launch
anonymous
22 points
14 comments
Posted about 2 months ago, active about 2 months ago
Show HN: A business SIM where humans beat GPT-5 by 9.8x
AI, business simulation, LLM
Discussion (14 comments)
about 2 months ago
It feels like we are pretty far away from LLMs running a concession stand (see Andon Labs), so I'm not surprised it would struggle here. Still, the failure modes are super interesting, and having benchmarks seems to be the starting point for domain-specific improvements.
about 2 months ago
Saving this for the next time I get over-caffeinated and try to convince my friends that economically viable AI will make their CPG business irrelevant.
about 2 months ago
I'm kinda curious how a VLM would do -- better spatial reasoning but worse planning? I don't use an AI web browser, but I'd be curious to know what happens if you throw something like OpenAI Atlas at the game's webpage.
about 2 months ago
Have you talked to Alex Duffy from Good Start Labs? I'd recommend reaching out.
10 more comments available on Hacker News