MCP Apps just dropped (OpenAI and Anthropic collab) and I think this is huge
Honestly, I think the biggest friction for MCP adoption has been how user-unfriendly it is. It's great for devs, but not for the average user. Users don't always want to chat; sometimes they just want to click a button or adjust a slider. This feels like the answer to that problem.
Full disclosure, I'm partial here because of our work at https://usefractal.dev. We were early adopters when MCP first came out, but we always felt like something was missing. We kept wishing for a UI layer on top, and everyone said it was gonna take forever for the industry to adopt, maybe months, maybe years.
I can't believe the adoption came so quickly. I think this is gonna be huge. What do you guys think?
This is an oxymoron.
MCP servers’ tools are literally just function calls. It’s the LLM MCP client that’s not deterministic, not the MCP server.
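Concretely, a tool handler in the official TypeScript SDK (@modelcontextprotocol/sdk) is just an ordinary, deterministic function; a minimal sketch, with the tool name and handler purely illustrative:

```typescript
// The handler below is a plain function: same inputs, same output.
// Only the LLM's decision to call it is probabilistic.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "demo", version: "1.0.0" });

server.tool(
  "add",
  { a: z.number(), b: z.number() },
  async ({ a, b }) => ({
    content: [{ type: "text", text: String(a + b) }],
  })
);

await server.connect(new StdioServerTransport());
```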
In the real world, where it is intractable (at least given our current state of overall programming language tooling, and the existence of physics) to prove all eventualities and the absence of side effects of executed code, determinism is indeed a spectrum.
If we want to be specific here, I would say the "pretty deterministic" is equal to "as deterministic as your typical non-LLM REST API call", which still spans a big range of determinism.
google.com (1998-present)
[I'm feeling lucky]

Things like ChatGPT are remarkably limited from a UX/UI point of view. The product can do amazing things, but the UI is nothing special. The Mac version currently has a bug where option+shift+1 opens a chat window but doesn't give it focus. When I do that from VS Code it adds the editor window. But it's completely blind to any browser tab on which I do that. I'm sure there are good reasons for all that. But it strikes me a bit as a work in progress that a good product owner would spot.
With apps some of the more powerful capabilities (llms driving UIs directly, doing things in agentic loops, tool and API usage) are going to require much deeper integrations than are currently there. We get hints of what is possible and nice technology demos. But it's still hard to build more complicated workflows around this. Unless you build your own applications.
We've been staring at this from the point of view of automating some highly tedious stuff that we currently do manually in our company. For example, working with ChatGPT seems to involve a lot of copy-paste and manually doing things that it can't really do by itself. Even something as simple as working on a document: it will do alright work on the text but then make a complete mess of the formatting. I spent an hour a few days ago iterating on a document where I was basically just fixing bullets and headers. Most alternatives I've tried aren't any better at this.
None of this seems particularly hard; it's just a lot of integration work that just hasn't happened yet. We have a bunch of lego bricks, not a lot of fully mature solutions. MCP isn't a full solution, it's a pretty lego brick. Mostly even OpenAI and Anthropic aren't getting around to doing much more than simplistic technology demos with their own stuff. IMHO their product teams are a lot less remarkable than their AI teams.
If one of the vendors manages to get their protocol to become the target platform (e.g. OpenAI and the Apps SDK), that is essentially their vendor lock-in to become the next iOS/Android.
Private APIs or EEE strategies are gonna be something to keep an eye on, and I wish regulators would step in to prevent them before it's too late.
Having a chatbot that drives websites inside of it is such an attempted monopolist play. Having a system agent that can interact with apps via API without being connected to the app is the pattern that's both elegant and preserves freedom.
> Having a system agent that can interact with apps via API without being connected to the app is the pattern that's both elegant and preserves freedom
Does this prevent anyone from doing that?
The post title is quite editorialized.
I also think this is pretty big. I think a problem we collectively have right now is that getting MCP closer to real user flows is pretty hard and requires a lot of handholding. Ideally, most users of MCP wouldn't even know that MCP is a thing - the same way your average user of the web has no idea about DNS/HTTP/WebSockets. They just know that the browser helps them look at puppy pictures, connect with friends, or get some work done.
I think this is a meaningful step in the direction of getting more people who'll never know or care about MCP to get value out of MCP.
So, WSDL?
It never actually tried to figure out which API call you needed based on what the user asked, or how to handle it in real time.
I mean it could, but WSDL was already superseded by REST.
MCP is incredibly vibe-coded. We know how to make APIs. We know how to make two-way communications. And yet "let's invent new terminology that makes little sense and awkward workarounds on top of unidirectional protocols and call it the best thing since sliced cheese".
And it does so in a standard way so that my client is available across AI providers. My users can install my client from an ordained URL without getting phished, or the LLM asking them to enter an API key.
What’s the alternative? Providing a sandbox to execute arbitrary code and make API calls? Having an LLM implement OAuth on the fly when it needs to make an API call?
MCP has a place.
It does just provide an API. Your client may have a way to talk to something via the MCP protocol.
> And it does so in a standard way so that my client is available across AI providers.
As in: it's an API on a port, with a schema, that a certain subset of software understands.
> What’s the alternative? Providing a sandbox to execute arbitrary code and make API calls?
MCP is a protocol. It couldn't care less what you do with your "tool" calls. Vast majority of clients and servers don't run in any sandbox at all. Because MCP is a protocol, not a docker container.
> Having an LLM implement OAuth on the fly when it needs to make an API call?
Yes, MCP also has a bolted-on authorisation that they didn't even think of when they vibe-coded the protocol. And at least there was finally some adult in the room who said "perhaps you should actually use a standardised way to do this".
The combination of these things turns into an ecosystem.
Why?
Do you refer to REST APIs or GraphQL as a whole? There are servers "adhering to the protocol" and "clients adding support" for these.
These are literally APIs.
Or are you confusing this with the LLM / MCP client invoking the tools being non-deterministic?
Turns out the approach works well for integrating web apps with LLMs. I have a payroll company using it in their stack to replace MCP and they’re reporting lower token usage and a better end result.
I am waiting for Excel CLI…
I love being able to type "make an iptables rule that opens 443" instead of having to dig out the man page and remember how to do that. IMO the next natural extension of this is giving the LLM more capability to generate user interfaces so I can interact with stuff exactly bespoke to my task.
This on the other hand seems the other way round, it's like bolting a static interface onto the LLM, which could defeat the purpose of the LLM interface layer in the first place right?
for free: not.
but I'm certain the race has just begun: big service providers and online retailers are currently implementing widgets enabling the purchase of their services and goods directly within the ChatGPT or Claude chat windows.
I'd imagine the same thing will happen here: It will prove more flexible to not push the model (and user) towards a UI that may not match what the user is trying to accomplish.
To me this seems like something I categorically don't want unless it is purely advisory.
Something as simple as correlating a git SHA to a CI build takes tens of seconds and some number of tokens if Claude is utilizing skills (making API calls to the CI server and GitHub itself). If you have an MCP server that Claude feeds a SHA into and gets back a bespoke, organized payload that adds relevant context to its decision-making process (such as a unified view of CI, diffs, et al.), then MCP is a win.
MCP shines as a bespoke context engine and fails as a thin API translation layer, basically. And the beauty/elegance is you can use AI to build these context engines.
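A hedged sketch of that SHA-to-context pattern; the CI and GitHub endpoints, repo name, and response fields here are hypothetical stand-ins:

```typescript
// One tool takes a git SHA and returns a single organized payload, instead
// of making the model stitch together several raw API calls itself.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "ci-context", version: "0.1.0" });

server.tool("sha_context", { sha: z.string() }, async ({ sha }) => {
  const [build, diff] = await Promise.all([
    fetch(`https://ci.example.com/builds?sha=${sha}`).then((r) => r.json()),
    fetch(`https://api.github.com/repos/acme/app/commits/${sha}`).then((r) => r.json()),
  ]);
  // Collapse both sources into one compact, pre-organized context block.
  const summary = {
    sha,
    ciStatus: build.status,
    failedSteps: build.steps?.filter((s: any) => s.failed) ?? [],
    filesChanged: diff.files?.map((f: any) => f.filename) ?? [],
  };
  return { content: [{ type: "text", text: JSON.stringify(summary, null, 2) }] };
});
```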
MCPs are especially well suited for cases that need a permanent instance running alongside the coding agent, for example to handle authentication or some long-lived service that is too cumbersome to launch every time the tool is called.
It really shines in custom implementations coupled to projects. I've got a Qt desktop app, and my MCP server allows the agents to run the app in headless mode, take screenshots, execute code like in Playwright, inspect widget trees, and send clicks/text/etc. with only six tools and a thousand tokens or so of instructions. Took an hour to build with Claude Code, and now it can run end-to-end acceptance tests before committing the code.
I don't think we can generate anywhere close to this kind of UI just yet.
We built https://usefractal.dev/ to make it easier for people to build ChatGPT Apps (they are technically MCP Apps), so I have seen the use cases. For most of these use cases, an LLM cannot generate the UI on the fly.
UIs should be fully remix-able and not set by the datasource/SaaS. So we built out a system to allow users to use the standard UI or remix apps as they want. Like Val.town, but with a flexible UX/workspace layer. Come check us out!
It’s basically a “web App Store”, and we sidestep the existing app stores (and their content guidelines, security restrictions, and billing requirements) because it’s all done via a mega app (the MCP client).
How could it go wrong?
If only someone had done this before, we wouldn't be stuck in Apple's (et al.) walled gardens…
Seriously though; honest question: this is literally circumventing platform requirements to use the platform app stores. How do you imagine this is going to be allowed?
Is ChatGPT really big enough they can pull the “we’re gonna do it, watcha gonna do?” to Apple?
Who’s going to curate this app store so non technical users (the explicitly stated audience) can discover these MCP apps?
It feels like MCP itself; half baked. Overly ambitious. “We’ll figure the details out later”
I don’t see this being the future state. We’d be talking about a world where any and all apps exist inside of fucking ChatGPT and that just sounds ridiculous.
I'm processing thousands of files using Copilot, and even 20 at a time, it usually skips a couple. Sometimes, when skipping, it merges the data from one file into the next without applying anything to the second file; other times it applies the data parsed from one file wholesale to the second. Not a big deal, since I'm reviewing each operation manually, but the only reason the error rate is acceptable is that the files are so inconsistent that normal techniques weren't working.
Is there an equivalent to "double-keying" where two different LLMs process the same input and it only moves forward if both match perfectly?
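The double-keying gate itself would only be a few lines; a hedged sketch, where the two extractor functions are placeholders for calls to two different LLMs:

```typescript
// Run two independent extractors over the same file and only accept the
// result when both agree; disagreements get routed to manual review.
type Extractor = (fileText: string) => Promise<string>;

async function doubleKey(
  fileText: string,
  modelA: Extractor,
  modelB: Extractor
): Promise<{ ok: boolean; value?: string }> {
  const [a, b] = await Promise.all([modelA(fileText), modelB(fileText)]);
  // Normalize trivial whitespace differences before comparing.
  const norm = (s: string) => s.trim().replace(/\s+/g, " ");
  return norm(a) === norm(b) ? { ok: true, value: a } : { ok: false };
}
```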
The LLM is working well enough for my needs (and I'm using a locked-down computer which installing/running development environments/scripts on is awkward), and it's a marked improvement over the previous technique of opening 50 files at a time, noting the Invoice ID, closing the file, typing the Invoice ID as a name, then quitting Adobe Acrobat and re-launching it for the next 50 (if that was not done, eventually Acrobat would reach a state where it would close a file and despite the name having been typed, not save it), then using a .bat file made using concatenation in an Excel column.
It would be nice if it were perfect, but each check has to be manually entered, and the filename updated to match the entry by hand.
Sigh.
The whole surface of the MCP specification is already pretty big, and barely any server implements anything beyond the core parts.
With elicitation there was already a lightweight version of this in place in the standard, and I'm not sure I've ever encountered a server or client implementation of it in the wild, and elicitation is an order of magnitude simpler to integrate on a conceptual level.
I fear that this has a significant risk of splintering the MCP ecosystem further (it's already pretty strained due to the transport protocol iterations), and there isn't really a reason to create an official extension (yet), which may in the worst case also require multiple iterations to get things right.
I think MCP-UI in its current state can fill that role very well, and that's not the only lens through which to view a SEP like this.
From what I can tell the main thing that this SEP does is put the official blessing of the MCP project on the existing underlying protocol that MCP-UI is already using. I think the better way would be to let MCP-UI and competing implementations cook for a little bit longer, rather than trying to get an officially sanctioned extension out the door and then iterate on that, which will cause a lot more churn.
For a non-trivial feature like this, I would expect an SEP that highlights more alternative usage scenarios and potentially operates on a more technology-agnostic layer (e.g. it's very JS+iframe bound, so what about mobile or terminal UIs?), similar to the main MCP spec. Especially against the backdrop of the main MCP specification, which feels much more well-rounded and thoroughly considered (though in a few areas still incomplete), this SEP does not seem to meet the same bar.
Reading this from the perspective of someone who builds an LLM chat UI[0] (and aims to be able to maintain it long-term) that builds heavily on MCP as an interoperability concept, seeing what is proposed here and how prescriptive it is in many aspects does not spark a lot of joy.
[0]: https://erato.chat
---
EDIT: From what I can tell I'm replying to one of the co-creators of MCP. So let me just ask quite directly: Do you think that the risk of proprietary sprawl in terms of MCP-UI alternatives is currently greater than prematurely standardizing a potentially incomplete version?
In my mind, what we are doing is building a common place to evolve the pattern. I wouldn't really call it “standardize”, since in the end, in this space, adoption matters. (Writing a standard is worth nothing if nobody is using it.) I expect MCP Apps to evolve and iterate independently of the main spec for a while. You are right, it's early and premature to say “this is done”. That's the goal with extensions.
I do believe that, as a developer, I'd rather have commonalities between a Claude.ai and a ChatGPT (I'm not sure that's true if I were mostly looking at it from a product perspective). I also think you will see chat providers iterate on top of it; MCP Apps is more a common core than the full thing one can use on every platform.
A great example is GitHub: it's a significantly better dev experience having Claude Code call out to the gh CLI for actions than trying to invoke the MCP.
We've known for decades that it's useful for APIs to be self-documented and for responses to use schemas to define the shape of the data.
XML can be verbose, and I understand why people preferred JSON for ease of use. Had we stuck with REST for the last 20 years, though, we'd be way ahead on that front, both in syntax and tooling.
We are where we are because there (sadly) hasn't been a reasonable business case to advance REST API documentation beyond the point of badly-documented OpenAPI schemas where the main utility is in generating type-safe API wrappers across different programming languages.
With MCP, there is at least a name for a new movement to build self-describing APIs, as with the advent of LLMs there is now enough utility for it. All other pushes in that direction died out ~10 years ago.
I do think the problem stems from business concerns, though, and they are a clear predictor that MCP will fail. Coming out of the dotcom bubble, those left standing wanted to build moats and walls, not APIs that any third party could easily discover and use. The need for REST only shows up when a new player makes a move to effectively gobble up the role of being the gatekeeper to the internet. With scale, other companies will follow, but begrudgingly and only for a short while.
I'm using a REST API to read your comment and submit my response — I didn't need to reference external HN API docs. The interface here is fully self-describing.
In a REST approach, I'd expect an LLM to need some kind of initial entry point, much like needing the initial URL for the HN home page. From there the LLM should be able to parse and discover possible actions, call those actions, and understand the response only by parsing the results and any schemas provided.
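A hedged sketch of that discovery loop, assuming a HAL-style `_links` convention; the entry URL and link names are made up, and the point is that nothing beyond the entry point is hardcoded:

```typescript
// Fetch the entry point, enumerate the actions it advertises, then follow
// one of the advertised links instead of a hardcoded URL.
async function discover(entryUrl: string) {
  const root = await fetch(entryUrl).then((r) => r.json());
  // e.g. root._links = { items: { href: "/items" }, search: { href: "/search" } }
  console.log("available actions:", Object.keys(root._links ?? {}));

  const itemsHref = root._links?.items?.href;
  if (itemsHref) {
    const items = await fetch(new URL(itemsHref, entryUrl)).then((r) => r.json());
    return items; // each item again carries its own _links to follow
  }
}
```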
In any case, we're on the same page about REST. OpenAPI was mentioned though and I don't think that's ever actually compatible with that type of usage.
I don't think I've ever encountered an API following OpenAPI that was completely self describing and could be discovered and navigated from a single entry URL. They're always paired with some swagger docs or other external content. That instantly makes them not RESTful.
But you're totally right that my comment didn't fit. Might have meant it for the grandparent comment? Not sure.
Anyway, the REST movement served its purpose: it killed SOAP and forced everyone back to simpler HTTP APIs without tons of over-engineered XML layers, so it did well.
SOAP was a pain, XML wasn't doomed though and we didn't need to throw that baby out with the bath water.
I make tons of little REST APIs for my agents to use and in the AGENTS.md there's just a list of API entry points with descriptions on what they offer. Agents drive them with `curl` and it all works great.
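For illustration, such an AGENTS.md section might look like this (the endpoints and descriptions are made up):

```
## Internal APIs (drive with curl)

- http://localhost:8301/ - invoice index; follow _links for search and detail
- http://localhost:8302/ - CI status; GET /builds?sha=<sha> for a build summary
- http://localhost:8303/ - customer lookup; self-describing, start at the root
```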
https://www.anthropic.com/engineering/code-execution-with-mc...
The agent discovers tools by exploring the filesystem: listing the ./servers/ directory to find available servers (like google-drive and salesforce), then reading the specific tool files it needs (like getDocument.ts and updateRecord.ts) to understand each tool's interface. This lets the agent load only the definitions it needs for the current task. This reduces the token usage from 150,000 tokens to 2,000 tokens—a time and cost saving of 98.7%.
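A sketch of the file layout that post describes: one tool per file, discoverable with a plain directory listing and read only when needed. Here `callMCPTool` is a stand-in for however the harness proxies the actual MCP call.

```typescript
// ./servers/google-drive/getDocument.ts
// The agent reads this file only when it needs the tool, instead of loading
// every tool definition into context upfront.
import { callMCPTool } from "../../client"; // hypothetical helper

export interface GetDocumentInput {
  documentId: string;
}

export async function getDocument(input: GetDocumentInput): Promise<string> {
  return callMCPTool("google_drive__get_document", input);
}
```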
I'm mostly working with MCP servers in the context of enterprise/organizational usage for connecting internal data sources and workflows/internal APIs through a standardized interface. There is no equivalent of a "ls or AGENTS.md" there, as there is no file system and no shell in those contexts.
What concretely do you mean by definitions?
If you mean the configuration of the entry points similar to a .mcp.json for Claude Code, they can exist in any form, but ultimately, that only contains the endpoints of MCP servers (the entry points for discovery for each of the MCP servers).
If you mean the definitions of what tools are part of what MCP server (= the meat of what is involved in tool calling), those definitions are part of the MCP server and are only retrieved at runtime.
> how are the agents aware of them
Same as in Claude Code, the agent (= the control loop for the LLM) contacts the list of MCP endpoints, gathers all of the available tools on each MCP server, and then exposes those tools to the LLM (and handles the interactions between LLM and MCP server).
> There has to be a file somewhere.
Not in any way that's meaningful for an `ls` operation.
But it certainly also differs in how they are typically consumed. In what software do you provide your consuming client with just an OpenAPI schema URL to be retrieved at runtime? In my experience, none.
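Concretely, with the official TypeScript SDK the runtime gathering step looks roughly like this (the endpoint URL is illustrative):

```typescript
// No file on disk is involved: the tool list is fetched at runtime and then
// surfaced to the LLM as function definitions by the agent loop.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "agent", version: "1.0.0" });
await client.connect(
  new StreamableHTTPClientTransport(new URL("https://mcp.internal.example.com/mcp"))
);

const { tools } = await client.listTools();
console.log(tools.map((t) => `${t.name}: ${t.description ?? ""}`));
```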
You can also have an LLM generate a client from an OpenAPI spec
vs. OpenAPI: there are some more advanced agent-specific concepts in MCP, but primarily, I would say, it's convention and optimizing for the client. Existing OpenAPI specs will often have not-great descriptions, or be a huge mess, or just be huge. Making an MCP server requires you to rethink the UX and tools to be optimized for the access patterns of an agent, and also makes you test it with them.
That said, Claude Skills[0] is something more in the vein of a CLI with --help (really, with a markdown guide), embracing what is already there with a bit of instructions, as opposed to building from scratch.
My personal opinion, having built multiple MCP servers, is that long-term they will not be the primary approach to tools, and that skills for instance are a better approach for most use-cases. But they do have their use-cases.
If you don't see this as useful, that is ok, but it is not the same as a command line executable.
Given that LLMs can generate complex frontend code, why is it so difficult for Anthropic / OpenAI to prompt their chat applications to create UI on the fly that matches their chat applications 100%?
I know this is possible because this is how we do it.
The LLM generates some text that we know how to interpret and we render it on the screen.
Besides, this is exactly how their canvas thing works (both ChatGPT and Claude) when rendering documents on the side.
Not only is this approach more specific to the application where the UI is supposed to render, but it also opens the door for the user to customise how the UI should work by adding their own prompts.
The approach that OpenAI has taken is not better - it is simply a mechanism to create some sort of app store - something they have attempted to do many times in the past as well.
I know because we are getting 3-4 spam messages per day on our discord server from various developers with previous experience in crypto now looking for opportunities in the agentic AI development space.
To me, this looks less like UI interactions and more like the MCP equivalent of maintaining state. You start your program and “click” buttons until you get the desired result, maintaining a constant state between interactions. Isn’t that currently possible if you passed through something like a session-id back to the LLM?
Am I missing something? I’m struggling to see what a UI makes possible that the current workflow does not.
I also generally see/use MCP in terms of remote access to programs through servers. Perhaps that’s where I’m getting lost. Is this exclusively something for local MCP?
# 1
- Hey Claude, recommend me the next book I could read
- sure, what about '1984'? Here is a button to send it directly to your Kindle; it will be $4.99.
# 2
- I am looking for a hotel for two in Vienna this weekend
- here are some offers still available (displays a grid view generated by MCP UI, handled by booking.com; each item has a button to "book now" directly)
So, basically -- it's a method for the MCP server to display a UI element to the user, allowing the user to perform some kind of action. Currently, this UI will be an HTML iframe.
I think I got confused by their barchart example. I thought this would have been just as easily supported by the MCP server sending back a PNG/PDF of a barchart for the main interface to display (I also generally think of barcharts as static content). But, the idea of adding specific user interaction elements (buttons) helps to make the concept more clear to me.
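For a concrete picture, a hedged sketch with @mcp-ui/server (treat the exact API shape as approximate): the tool returns a ui:// resource the host renders in a sandboxed iframe, and the button posts an intent back to the host.

```typescript
import { createUIResource } from "@mcp-ui/server";

// Interactive HTML instead of a static PNG: the refresh button messages the
// host, which can translate that into a follow-up tool call.
const barChartResource = createUIResource({
  uri: "ui://charts/bar/1",
  content: {
    type: "rawHtml",
    htmlString: `
      <div id="chart">[bar chart rendered here]</div>
      <button onclick="parent.postMessage({ type: 'tool',
        payload: { toolName: 'refresh_chart', params: {} } }, '*')">
        Refresh
      </button>`,
  },
  encoding: "text",
});

// Inside a tool handler you'd return it as part of the content array:
// return { content: [barChartResource] };
```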
I wouldn't underestimate it - it's the hammer that can break up the information silos we've built up around websites/apps.
Why prompt Gertrude(tm) on Ramazon for a specific thing you need, if you can ask ChatGPT to find you said thing along with UIs to purchase it across all e-commerce platforms that agree to play in this market?
It is a requirement for banks to implement this and the FSA will be on your neck if you are not compliant with the standard.
[1] https://finance.ec.europa.eu/regulation-and-supervision/fina...
2015 WeChat mini program
...
2025 MCP-UI
I'm tired.
You need to:
1. Spin up a server that returns UI components.
2. Hand-write a bunch of JSON schemas + tool wiring
So we open-sourced a high-level MCP Server SDK that basically lets you have both the MCP server and React components in the same place:
- Every React component you put in your resources/ folder is automatically built and exposed as an MCP resource + tools. No extra registration boilerplate.
- We added a useWidget hook that takes the tool args and maps them directly into your component props, so the agent effectively “knows” what data the widget needs to render. You focus on UI + logic, the SDK handles the plumbing
Docs for that flow here: https://docs.mcp-use.com/typescript/server/creating-apps-sdk...
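To give a rough feel for the flow, a hypothetical widget based on the description above; the import path and the useWidget signature are guesses on my part, so check the linked docs for the real API.

```tsx
// A React component dropped into resources/; the SDK builds and exposes it
// as an MCP resource, and useWidget maps the calling tool's args into props.
import { useWidget } from "mcp-use/react"; // import path is a guess

export default function WeatherCard() {
  const { city, tempC } = useWidget<{ city: string; tempC: number }>();
  return (
    <div>
      <h3>{city}</h3>
      <p>{tempC}°C</p>
    </div>
  );
}
```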
We also shipped an MCP Inspector to make the dev loop much less painful: you can connect your MCP server, test UI components from tools (with auto-refresh), and debug how it behaves with ChatGPT/agents as you iterate. https://docs.mcp-use.com/inspector/debugging-chatgpt-apps
Both the SDK and the Inspector are open-source, and any contributions are very welcome :)
Those are the repos:
- SDK: https://github.com/mcp-use/mcp-use
- Inspector: https://github.com/mcp-use/mcp-use/tree/main/libraries/types...
E.g. you present a "display-graph-chart" tool as an MCP tool, and the agent calls it; it doesn't need to adhere to any protocol except the basic existing MCP protocol, and the UI that's used to interact with the agent would know the best presentation (show it as an embedded HTML graph if in a web UI, show it as an ASCII chart if in a terminal, etc.)?
Is the idea just to standardize the "output format" of the tool so that any agent UI could display stuff in the same way? so that one tool could work with any agent display?
I think I have to be missing what is huge here?
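For what it's worth, the plain-MCP pattern described above is expressible today; a hedged sketch, with tool and field names illustrative and structuredContent per recent spec revisions:

```typescript
// The tool returns structured data and lets each host pick the presentation:
// an HTML graph in a web UI, an ASCII chart in a terminal.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "charts", version: "0.1.0" });

server.tool(
  "display_graph_chart",
  { series: z.array(z.object({ label: z.string(), value: z.number() })) },
  async ({ series }) => ({
    // Text fallback for hosts that only render text...
    content: [{ type: "text", text: JSON.stringify(series) }],
    // ...and structured data for hosts that want to draw their own chart.
    structuredContent: { series },
  })
);
```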
From my perspective the challenges for vendors and SaaS providers are [1] discovery [2] monetization [3] disintermediation
I think it's less of a concern if you're Shopify or those large companies that have existing brand moats.
But if you're a startup, I don't think MCP as a channel is a clear-cut decision. Maybe you can get distribution but monetization is not defined.
Also, I'm sure the model providers will capture usage data and could easily disintermediate you, especially if your startup is just a narrow set of prompts and a UX over a specific workflow.
The Reforge guys have been talking about a channel shift and this being it, but until incentives are clear, I'm not sure this is it yet. Maybe an evolution of this.
I'm building an AI coach for job seekers / early stage professionals (Socratify) and while I'd love more distribution from MCP UI integration I think at this point risk is higher than reward...
> If you want a focused comparison next - for example, benchmarks on coding/math, token-cost examples for a typical session, or API usage differences - I can produce a compact table with sources and numbers.
--> can be answered with yes, so please add a yes button. A no button is not needed.
The MCP community is just reinventing, but yes, improving, what we've done before in the previous generation: Microsoft Bot Framework, Speaktoit aka Google Dialogflow, Siri App Shortcuts / Spotlight.
And interactive UIs in chats go back at least 20 years, maybe not with an AI agent attached...
The next thing that will be reinvented is the memory/tool combination, aka a world model.
I'm not confident the current AI craze will be net positive for humanity. But one possible good outcome could be that if many people prefer a simple chat UI to interact with services, most companies will have to adopt them and will be forced to provide simple, straight, no-nonsense content instead of what they want to sell, while LLMs are just a commodity, so unnamed Chinese companies can provide models as good as the one from the most VC-funded company and they can't enshittify the UX.
Trying to create custom agent APIs to embed apps in chat is a very "monopolist frontier lab" thing to try and do.
What I am imagining is something like a meta UI tool call that just creates a menu. The whole MCP server's purpose might be to add this menu creation capability to the chat user interface. But what you are selecting from isn't known ahead of time, it's the input to the UI.
When they select something I assume it would output a tool call like menuItemSelected('option B'). I suppose if you want your server to do anything specific with this then you would have to handle that in the particular server. But I guess you could also just have a tool call that just sends the inputs to the agent. This could make for what is a very slow to respond but extremely flexible overall UX.
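A sketch of that generic "meta UI" tool: the menu's contents are just tool input, so nothing is known ahead of time, and the user's pick would come back as an ordinary tool result (the menuItemSelected('option B') shape above). All names here are illustrative, and the selection is stubbed rather than wired to a real host.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";

const server = new McpServer({ name: "meta-ui", version: "0.1.0" });

server.tool(
  "show_menu",
  { title: z.string(), options: z.array(z.string()) },
  async ({ title, options }) => {
    // Stand-in for blocking on the user's actual click in the rendered menu.
    const selected = options[0];
    return { content: [{ type: "text", text: `${title}: selected "${selected}"` }] };
  }
);
```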
I guess this is not the intended use, but suppose you give your agent generic MCP UI tools for showing any menu, showing any data table, showing a form, etc. So the inputSchemas would be somehow (if this is possible) quite loosely defined.
I guess the purpose is probably more about not having to go through the LLM, rather than giving it the ability to dynamically put up UI elements and react to each individual interaction with them.
But maybe one of the inputs to the dataTable are the query parameters for its data, and the table has a refresh button. Maybe another input is the URI for the details form MCP UI that slides over when you click a row.
Maybe there is an MCP UI for layout that allows you to embed other MCP UIs in a specific structure.
This might not make sense, but I am wondering if I can use MCP Apps as an alternative to always building custom MindRoot plugins (my Python/web components agentic app framework) to provide unique web pages and UI for each client's agentic application.
I think I may have gotten the MCP Apps and MCP UI a bit conflated here so I probably need to read it again.