Claude Skills Are Awesome, Maybe a Bigger Deal Than MCP
Posted 3 months ago · Active 3 months ago
simonwillison.net · Tech story · High profile
Sentiment: calm/mixed · Debate: 70/100
Key topics: AI, LLMs, Claude Skills, MCP
The article discusses Claude Skills, a new feature that allows users to create and manage context for AI tasks, sparking a discussion on its comparison to MCP and its potential impact on AI workflows.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion · First comment: 5m after posting
Peak period: 134 comments in 0-6h · Avg per period: 22.9
Comment distribution: 160 data points (based on 160 loaded comments)
Key moments
1. Story posted - Oct 17, 2025 at 1:40 PM EDT (3 months ago)
2. First comment - Oct 17, 2025 at 1:45 PM EDT (5m after posting)
3. Peak activity - 134 comments in 0-6h, the hottest window of the conversation
4. Latest activity - Oct 20, 2025 at 4:15 PM EDT (3 months ago)
ID: 45619537 · Type: story · Last synced: 11/22/2025, 11:47:55 PM
I do not understand this. cli-tool --help outputs still occupy tokens, right?
Does anybody have a good SKILLS.md file we can study?
Furthermore, with all the hype around MCP servers and simply the amount of servers now existing, do they just immediately become obsolete? It's also a bit fuzzy to me exactly how an LLM will choose an MCP tool over a skill and vice versa...
if you're running an MCP file just to expose local filesystem resources, then it's probably obsolete. but skills don't cover a lot of the functionality that MCP offers.
I also think "skills" is a bad name. I guess it's a reference to the fact that it can run scripts you provide, but the announcement really seems to be more about the hierarchical docs. It's really more like a selective context loading system than a "skill".
What bugs me: if we're optimizing for LLM efficiency, we should use structured schemas like JSON. I understand the thinking about Markdown being a happy medium between human/computer understanding but Markdown is non-deterministic for parsing. Highly structured data would be more reliable for programmatic consumption while still being readable.
*I use a TUI to manage the context.
Over time I would systematically create separate specialized docs around certain topics and link them in my CLAUDE.md file, but notably without using the "@" symbol, which to my understanding always causes Claude to ingest the linked files, unnecessarily bloating your prompt context.
So my CLAUDE.md file would have a header section like this:
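The actual header was elided above, but a hypothetical sketch of the pattern being described might look like this (doc names invented for illustration), using plain Markdown links rather than "@" references so nothing is ingested until Claude decides it needs it:

```markdown
## Specialized docs

Consult these only when the task calls for it:

- [Database conventions](docs/database.md)
- [Testing guidelines](docs/testing.md)
- [Deployment runbook](docs/deploy.md)
```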
It seems like this is less of a breakthrough and more an iterative improvement towards formalizing this process from an organizational perspective.
[1] https://www.anthropic.com/engineering/a-postmortem-of-three-...
https://github.com/anthropics/skills/blob/main/document-skil...
There are many edge cases when writing / reading Excel files with Python and this nails many of them.
Search and this document base pattern are different. In search the model uses a keyword to retrieve results, here the model starts from a map of information, and navigates it. This means it could potentially keep context better, because search tools have issues with information fragmentation and not seeing the big picture.
MCP gives the LLM access to your APIs. These skills are just text files with context about how to perform specific tasks.
Depends on who the user is...
A difference/advantage of MCP is that it can be completely server-side. Which means that an average person can "install" MCP tools into their desktop or Web app by pointing it to a remote MCP server. This person doesn't want to install and manage skills files locally. And they definitely don't want to run python scripts locally or run a sandbox vm.
Now whether they're able to convert that house of cards into a solid foundation, or it eventually spectacularly falls over, will have to be seen over the next decade.
RAG was originally about adding extra information to the context so that an LLM could answer questions that needed that extra context.
On that basis I guess you could call skills a form of RAG, but honestly at that point the entire field of "context engineering" can be classified as RAG too.
Maybe RAG as a term is obsolete now, since it really just describes how we use LLMs in 2025.
Calling the skill system itself RAG is a bit of a stretch IMO, unless you end up with so many skills that their summaries can’t fit in the context and you have to search through them instead. ;)
I think vector search has shown to be a whole lot more expensive than regular FTS or even grep, so these days a search tool for the model which uses FTS or grep/rg or vectors or a combination of those is the way to go.
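To make that concrete: a grep-style search tool for an agent can be tiny. A minimal stdlib sketch (the doc corpus here is invented for illustration; a real tool would walk a directory instead of a dict):

```python
import re

def grep_docs(docs: dict[str, str], pattern: str) -> list[tuple[str, int, str]]:
    """Return (doc_name, line_number, line) for each line matching the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = []
    for name, text in docs.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if rx.search(line):
                hits.append((name, lineno, line))
    return hits

# Hypothetical mini corpus standing in for a directory of skill docs.
docs = {
    "excel.md": "Use openpyxl for .xlsx files.\nWatch out for merged cells.",
    "pdf.md": "Use a PDF library to split pages.",
}
print(grep_docs(docs, "merged"))
# -> [('excel.md', 2, 'Watch out for merged cells.')]
```

No embeddings, no index maintenance; the model just supplies a keyword and reads the hits, which is the cost argument being made above.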
"Skills work through progressive disclosure—Claude determines which Skills are relevant and loads the information it needs to complete that task, helping to prevent context window overload."
So yeah, I guess you're right. Instead of one humongous AGENTS.md, just packaging small relevant pieces together with simple tools.
And, this is why I usually use simple system prompts/direct chat for "heavy" problems/development that require reasoning. The context bloat is getting pretty nutty, and is definitely detrimental to performance.
If we're considering primarily coding workflows and CLI-based agents like Claude Code, I think it's true that CLI tools can provide a ton of value. But once we go beyond that to other roles - e.g., CRM work, sales, support, operations, finance; MCP-based tools are going to have a better form factor.
I think Skills go hand-in-hand with MCPs, it's not a competition between the two and they have different purposes.
I am interested, though, in when the Python code in Skills can call MCPs directly via the interpreter... that is the big unlock (something we have tried and found to work really well).
We're also at the point where the LLMs can generate MCP servers, so you can pretty much generate completely new functionalities with ease.
You can drive one or two MCPs off a model that happily runs on a laptop (or even a phone). I wouldn't trust those models to go read a file and then successfully make a bunch of curl requests!
I hate how we are focusing on just adding more information to lookup maps, instead of focusing on deriving those maps from scratch.
Rather than defining skills and execution agents, let a meta-planning agent determine the best path based on objectives.
I don't mean to be unreasonable, but this is all about managing context in a heavy and highly technical manner. Eventually models must be able to augment their training / weights on the fly, customizing themselves to our needs and workflow. Once that happens (it will be a really big deal), all of the time you've spent messing around with context management tools and procedures will be obsolete. It's still good to have fundamental understanding though!
Browser engines could've been simpler; web development tools could've been more robust and powerful much earlier; we would be able to rely on XSLT and invent other ways of processing and consuming web content; we would have proper XHTML modules, instead of the half-baked Web Components we have today. Etc.
Instead, we got standards built on poorly specified conventions, and we still have to rely on 3rd-party frameworks to build anything beyond a toy web site.
Stricter web documents wouldn't have fixed all our problems, but they would have certainly made a big impact for the better.
Note that they don't actually suggest that the XML needs to be VALID!
My guess was that JSON requires more characters to be escaped than XML-ish syntax does, plus matching opening and closing tags makes it a little easier for the LLM not to lose track of which string corresponds to which key.
<instructions>
...
...
</instructions>
can be much easier than
{
"instructions": "..\n...\n"
}
especially when there are newlines, quotes and unicode
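The escaping overhead is easy to demonstrate: `json.dumps` must turn every newline and quote into a two-character escape sequence (and, by default, escape non-ASCII too), while the same text between XML-ish tags can stay mostly verbatim:

```python
import json

text = 'First line\nShe said "hello"'
print(json.dumps(text))
# Each newline and quote becomes a backslash escape the model must emit exactly.

# By contrast, the XML-ish form carries the text unchanged:
# <instructions>First line
# She said "hello"</instructions>
```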
I would suspect that a single attention layer won't be able to figure out to which token a token for an opening bracket should attend the most to. Think of {"x": {y: 1}} so with only one layer of attention, can the token for the first opening bracket successfully attend to exactly the matching closing bracket?
I wonder if RNNs work better with JSON or XML. Or maybe they are just fine with both of them because a RNN can have some stack-like internal state that can match brackets?
Probably, it would be a really cool research direction to measure how well Transformer-Mamba hybrid models like Jamba perform on structured input/output formats like JSON and XML and compare them. For the LLM era, I could only find papers that do this evaluation with transformer-based LLMs. Damn, I'd love to work at a place that does this kind of research, but guess I'm stuck with my current boring job now :D Born to do cutting-edge research, forced to write CRUD apps with some "AI sprinkled in". Anyone hiring here?
Just look at HTML vs XHTML.
Similarly, my experience writing and working with MCPs has been quite underwhelming. It takes too long to write them and the workflow is kludgy. I hope Skills get adopted by other model vendors, as it feels like a much lighter way to save and checkout my prompts.
But I suppose yeah, why not just write clis and have an llm call them
- Writing manifests and schemas by hand takes too long for small or iterative tools. Even minor schema changes often require re-registration or manual syncing. There’s no good “just run this script and expose it” path yet.
- Running and testing an MCP locally is awkward. You don’t get fast iteration loops or rich error messages. When something fails, the debugging surface is too opaque - you end up guessing what part broke (manifest, transport, or tool logic).
- There’s no consistent registry, versioning, or discovery story. Sharing or updating MCPs across environments feels ad hoc, and you often have to wire everything manually each time.
With Skills you need none of that - instruct it to invoke a tool and be done with it.
yes there is:
https://github.com/modelcontextprotocol/registry
and here you have frontends for the registry https://github.com/modelcontextprotocol/registry/blob/main/d...
Everything is new so we are all building it in real time. This used to be the most fun times for a developer: new tech, everybody excited, lots of new startups taking advantage of new platforms/protocols.
E.g., I don't know where to put a skill that can be used across all projects.
You can drop the new markdown files directly into your ~/.claude/skills directory.
Which kind of sounds pointless: if Claude already knows what to do, why create a document?
My examples - I interact with ElasticSearch and Claude keeps forgetting it is version 5.2 and we need to use the appropriate REST API. So I got it to create a SKILL.md about what we used and provided examples.
And the next one was getting it to write instructions on how to use ImageMagick on Windows, with examples and troubleshooting, rather than it trying to use the Linux versions over and over.
Skills are the solution to the problems I have been having. And they came just at the right time, as I already spent half of last week making similar documents!
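A skill like the Elasticsearch one described above might look roughly like this (the file contents and example query are invented; the frontmatter `name`/`description` fields follow Anthropic's published SKILL.md format):

```markdown
---
name: elasticsearch-5-2
description: How to query our Elasticsearch 5.2 cluster via its REST API. Use when searching or indexing documents.
---

# Elasticsearch 5.2 REST API

We run Elasticsearch 5.2, not a current release. Use the index/type
URL scheme; do not use features from 6.x or later.

## Example search

    curl -XGET 'http://localhost:9200/myindex/mytype/_search' \
      -d '{"query": {"match": {"title": "error"}}}'
```

The point of writing it down is exactly the failure mode described: without the doc, the model keeps "forgetting" the version and reaching for newer API shapes.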
There’s a fundamental misalignment of incentives between publishers and consumers of MCP.
Asking for snacks would activate Klarna for "mario themed snacks", and even the most benign request would become a plug for the Mario movie
https://chatgpt.com/s/t_68f2a21df1888191ab3ddb691ec93d3a
Found my favorite for John Wick, question was "What is 1+1": https://chatgpt.com/s/t_68f2bc7f04988191b05806f3711ea517
The former is a step function change. The latter is just a small improvement.
As is often the case, every product team is told that MCP is the hot new thing and they have to create an MCP server for their customers. And I've seen that customers do indeed ask for these things, because they all have initiatives to utilize more AI. The customers don't know what they want, just that it should be AI. The product teams know they need AI, but don't see any meaningful ways to bring it into the product. But then MCP falls on their laps as a quick way to say "we're an AI product" without actually having to become an AI product.
Agentic LLMs are, in a way, an attempt to commoditize entire service classes, across the board, all at once.
Personally, I welcome it. I keep saying that a lot of successful SaaS products would be much more useful and ergonomic for end users if, instead of webshit SPA, they were distributed as Excel sheets. To that I will now add: there's a lot more web services that I'd prefer be tool calls for LLMs.
Search engines have already been turned into features (why ask Google when o3 can ask it for me), but that's just an obvious case. E-mails, e-commerce, shopping, coding, creating digital art, planning, managing projects and organizations, analyzing data and trends - all those are in-scope too; everything I can imagine asking someone else to do for me is meant to eventually become a set of tool calls.
Or in short: I don't want AI in your product - I want AI of my choice to use your product for me, so I don't have to deal with your bullshit.
- bundled instructions, covering complex interactions ("use the id from the search here to retrieve a record") for non-standard tools
- custom MCPs, the ones that are firewalled from the internet, for your business apis that no model knows about
- centralized MCP services, http/sse transport. Give the entire team one endpoint (ie web search), control the team's official AI tooling, no api-key proliferation
Now, these trivial `npx ls-mcp` stdio ones, "ls files in any folder" MCPs all over the web are complete context-stuffing bullshit.
But MCP has at least 2 advantages over cli tools
- Tool calling LLM combined w/ structured output is easier to implement as MCP than CLI for complex interactions IMO.
- It is more natural to hold state between tool calls in an MCP server than with a CLI.
When I read the OT, I initially wondered if I indeed bought into the hype. But then I realized that the small demo I built recently to learn about MCP (https://github.com/cournape/text2synth) would have been more difficult to build as a cli. And I think the demo is representative of neat usages of MCP.
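A toy illustration of that second point (plain Python, not the actual MCP SDK; it just shows why a long-lived server process holds state across tool calls more naturally than a fresh CLI invocation, which would have to reload everything each time):

```python
class SynthSession:
    """Stand-in for a long-lived tool server: state persists across calls."""

    def __init__(self):
        self.patch = {}

    # Each method plays the role of one exposed tool.
    def set_param(self, name: str, value: float) -> dict:
        self.patch[name] = value
        return dict(self.patch)

    def render(self) -> str:
        return f"rendering patch with {len(self.patch)} params"

session = SynthSession()
session.set_param("cutoff", 0.7)     # first tool call
session.set_param("resonance", 0.3)  # second call sees the first one's effect
print(session.render())
# -> rendering patch with 2 params
```

A CLI equivalent would need to serialize the patch to disk between invocations, which is exactly the friction the comment is pointing at.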
MCP lets agents do stuff. Skills let agents do stuff. There's the overlap.
If I learned how to say "hello" in French today and also found out I have stage 4 brain cancer, they are completely different things but one is a bigger deal than the other.
> Where to get US census data from and how to understand its structure
Reminds me of my first time using Wolfram Alpha, when I got blown away by its ability to use actual structured tools to solve the problem, compared to a normal search engine.
In fact, I tried again just now and am still amazed: https://www.wolframalpha.com/input?i=what%27s+the+total+popu...
I think my mental model for Skills would be Wolfram Alpha with custom extensions.
If you mean that it all breaks down to if/else at some level then, yeah, but that goes for LLMs too. LLMs aren't the quantum leap people seem to think they are.
The whole point of algorithmic AI was that it was deterministic and - if the algorithm was correct - reliable.
I don't think anyone expected that soft/statistical linguistic/dimensional reasoning would be used as a substitute for hard logic.
It has its uses, but it's still a poor fit for many problems.
We're still at the stage of eating pizza for the first time. It'll take a little while to remember that you can do other things with bread and wheat, or even other foods entirely.
Lisp was the AI language until the first AI Winter took place, and also took Prolog alongside it.
Wolfram Alpha basically builds on them, to put in a very simplistic way.
Doesn't need the craziest math capability but standard symbolic math stuff like expression reduction, differentiation and integration of common equations, plotting, unit wrangling.
All with an easy to use text interface that doesn't require learning.
https://maxima.sourceforge.io/
I used it when it was called Macsyma running on TOPS-20 (and a PDP-10 / Decsystem-20).
Text interface will require a little learning, but not much.
- Mathematica
- Maple
- MathStudio (mobile)
- TI-89 calculator (high school favorite)
Others:
- SageMath
- GNU Octave
- SymPy
- Maxima
- Mathcad
We only call it AI until we understand it.
Once we understand LLMs more and there's a new promising poorly understood technology, we'll call our current AI something more computer sciency
Funnily enough, this was the result: `6.1% mod 3 °F (degrees Fahrenheit) (2015-2019 American Community Survey 5-year estimates)`
I wonder how that was calculated...
If you told the median user of these services to set one of these up I think they would (correctly) look at you like you had two heads.
People want to log in to an account, tell the thing to do something, and the system figures out the rest.
MCP, Apps, Skills, Gems - all this stuff seems to be tackling the wrong problem. It reminds me of those youtube channels that every 6 months say "This new programming language, framework, database, etc is the killer one", they make some todo app, then they post the same video with a new language completely forgetting they've done this already 6 times.
There is a lot of surface level iteration, but deep problems aren't being solved. Something in tech went very wrong at some point, and as soon as money men flood the field we get announcements like this. Push out the next release, get my promo, jump to the next shiny tech company, leaving nothing in their wake.
As the old adage goes: "Don't hate the player, hate the game?"
To actually respond: this isn't for the median user. This is for the 1% user to set up useful tools to sell to the median user.
If I had to guess, it would be because greed is a very powerful motivator.
> As the old adage goes: "Don't hate the player, hate the game?"
I know this advice is a realistic way of getting ahead in the world, but it's very disheartening and long term damaging. Like eating junk food every day of your life.
There is no problem to solve. These days, solutions come in a package which includes the problems they intend to solve. You open the package. Now you have a problem that jumped out of the package and starts staring at you. The solution comes out of the package and chases the problem around the room.
You are now technologically a more progressed human.
And the problem being solved is, LLMs are universal interfaces. They can understand[0] what I mean, and they understand what those various "solutions" are, and they can map between them and myself on the fly. They abstract services away.
The businesses will eventually remember that the whole point of marketing is to prevent exactly that from happening.
--
[0] - To a degree, and conditioned on what one considers "understanding", but still - it's the first kind of computer systems that can do this, becoming a viable alternative to asking a human.
My fairly negative take on all of this has been that we’re writing more docs, creating more apis and generally doing a lot of work to make the AI work, that would’ve yielded the same results if we did it for people in the first place. Half my life has been spent trying to debug issues in complex systems that do not have those available.
Haha, just kidding you tech bros, AI's still for you, and this time you'll get to shove the nerds into a locker for sure. ;-)
Programming was always a tool for humans. It’s a formal “notation” for describing solutions that can be computed. We don’t do well with bit soup. So we put a lot of deterministic translations between that and the notation that we’re good with.
Not having to do programming would be like not having to write sheet music because we can drop a cat from a specific height onto a grand piano and have the correct chord come out. Code is ideas precisely formulated while prompts are half formed wishes and prayers.
I’m attracted to this theory in part because it applies to me. I’m a below average coder (mostly due to inability to focus on it full time) and I’m exceptionally good at clear technical writing, having made a living off it much of my life.
The present moment has been utterly life changing.
For consumers, yes. In B2B scenarios more complexity is normal.
It might be superficial but it's still state of the art.
https://ampcode.com/news/toolboxes
Those are nice too — a much more hackable way of building simple personal tools than MCP, with less token and network use.
How are skills different from the SlashCommand tool in claude-code, then?
Claude Skills
https://news.ycombinator.com/item?id=45607117
https://modelcontextprotocol.io/docs/getting-started/intro
Basically the way it would work is, in the next model, it would avoid role playing type instructions, unless they come from skill files, and internally they would keep track of how often users changed skill files, and it would be a TOS violation to change it too often.
Though I gave up on Anthropic in terms of true AI alignment long ago, I know they are working on a trivial sort of alignment where it prevents it from being useful for pen testers for example.
The question is whether the analysis of all the Skill descriptions is faster or slower than just rewriting the code from scratch each time. Would it be a good or bad thing if an agent created thousands of slightly varied skills?
https://github.com/microsoft/amplifier
I really enjoyed seeing Microsoft Amplifier last week, which similarly has a bank of different specialized sub-agents. These other banks of markdowns that get turned on for special purposes feels very similar. https://github.com/microsoft/amplifier?tab=readme-ov-file#sp... https://news.ycombinator.com/item?id=45549848
One of the major twists with Skills seems to be that Skills also have a "frontmatter YAML" that is always loaded. It still sounds like it's at least somewhat up to the user to engage the Skills, but this "frontmatter" offers… something, that purports to help.
> There’s one extra detail that makes this a feature, not just a bunch of files on disk. At the start of a session Claude’s various harnesses can scan all available skill files and read a short explanation for each one from the frontmatter YAML in the Markdown file. This is very token efficient: each skill only takes up a few dozen extra tokens, with the full details only loaded in should the user request a task that the skill can help solve.
I'm not sure what exactly this does but conceptually it sounds smart to have a top level awareness of the specializations available.
I do feel like I could be missing some significant aspects of this. But the mod-launched paradigm feels like a fairly close parallel?
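Concretely, the always-loaded part is just the few lines of frontmatter; everything below it stays on disk until the skill is triggered. A hypothetical example (names and wording invented):

```markdown
---
name: pdf-form-filling
description: Fill in PDF forms programmatically. Use when the user asks to complete or edit a PDF form.
---

# Everything below this line is only read if the skill is triggered.
...
```

Those two frontmatter lines are the "few dozen tokens" per skill mentioned in the quote; the body can be arbitrarily long without costing anything up front.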
210 more comments available on Hacker News