Skills Officially Comes to Codex
Key topics
The AI world is buzzing as Codex officially rolls out "Skills," a feature that lets users extend its capabilities with task-specific instructions and resources. Some commenters poke fun at the concept while others dive in, sharing their own experiences and creations, like custom skills for back-testing services. The conversation also draws parallels to Anthropic's similar "Agent Skills" feature, sparking curiosity about the overlap and potential applications in agentic workflows. As users explore the new functionality, the discussion reveals a mix of excitement and skepticism: some rave about the possibilities while others joke about the "particular set of skills" required to make the most of it.
Snapshot generated from the HN discussion
Discussion Activity
- Very active discussion
- First comment: N/A
- Peak period: 101 comments (0-12h)
- Avg / period: 21.7
- Based on 130 loaded comments
Key moments
- 01 Story posted: Dec 20, 2025 at 3:09 AM EST (21 days ago)
- 02 First comment: Dec 20, 2025 at 3:09 AM EST (0s after posting)
- 03 Peak activity: 101 comments in 0-12h (the hottest window of the conversation)
- 04 Latest activity: Dec 24, 2025 at 8:02 PM EST (16 days ago)
Skills are available in both the Codex CLI and IDE extensions.
I also like to make skills for more niche tools, like marimo (a very nice Jupyter replacement). The model probably does know some things about it, but not enough, and the agent could find enough online or in context7, but it would waste a lot of time and context figuring it out every time. So instead I have a deep-thinking agent do all that research up front and build a skill from it. I might customize it to be more specific to my environment, but it's mostly the condensed research of the agent, so I don't need to redo that every time.
That said, for many tasks (summaries and data extraction) I do use Gemini 2.5 Flash, as it's cheap and fast. Excited to try Gemini 3 Flash as well.
Just the format would be. There's no rigid structure that gets any preferential treatment by the LLM, even if it did accept one. In the end it's just instructions, no different in any way from the prompt text.
And nothing stops you from making something "parameterized and normalized to some agreed-upon structure" and passing it directly to the LLM as skill content, or parsing it and dumping it as a skill's regular text content.
The issue isn't the LLM; it's that verification is actually the hard part. In any case, it's typically called "evals", and you can probably craft a test harness to evaluate these if you think about it hard enough.
With Skills, however, you just selectively append more text to the prompt and pray.
The non-deterministic, statistical nature of LLMs means it's inherently an "inevitably unverifiable process" to begin with, even if you pass it some type-checked, linted skills file or prompt format.
Besides, YAML or JSON or XML or free-form text: for the LLM it's all just tokens.
At best you could parse the more structured docs with external tools more easily, but that's about it; there's not much difference when it comes to LLM consumption.
There you go, you're welcome.
[1]: https://news.ycombinator.com/item?id=46338371
Easy to author (at its most basic, just a markdown file), context efficient by default (only preloads yaml front-matter, can lazy load more markdown files as needed), can piggyback on top of existing tooling (for instance, instead of the GitHub MCP, you just make a skill describing how to use the `gh` cli).
Compared to purpose-tuned system prompts they don't require a purpose-specific agent, and they also compose (the agent can load multiple skills that make sense for a given task).
Part of the effectiveness of this is that AI models are heavy enough that running a sandboxed VM for them on the side is likely irrelevant cost-wise, so now the major chat UI providers all give the model such a sandboxed environment. That means skills can also contain Python and/or JS scripts, which is again much simpler, more straightforward, and more flexible than e.g. requiring the target to expose remote MCPs.
Finally, you can use a skill to tell your model how to properly approach using your MCP server - which previously often required either long prompting, or a purpose-specific system prompt, with the cons I've already described.
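For the "skill instead of the GitHub MCP" example above, a minimal sketch of what such a skill file might look like (front-matter keys follow the common name/description convention; details vary by harness):

```markdown
---
name: github-cli
description: Use the `gh` CLI for GitHub tasks (PRs, issues, CI status) instead of an MCP server.
---

# GitHub via `gh`

- Check auth first: `gh auth status`
- Open a PR: `gh pr create --fill`
- Inspect CI on a PR: `gh pr checks`

Prefer `gh` subcommands over raw API calls; fall back to `gh api <endpoint>` for anything not covered.
```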
I'm having a hard time figuring out how I could leverage skills in a medium-size web application project.
It's Python, PostgreSQL, Django.
Thanks in advance.
I wonder if skills are more useful for non-CRUD-like projects. Maybe data science and DevOps.
But if it's something more involved or less frequently used (perhaps some debugging methodology, or designing new data schemas), skills are probably a good fit.
- listTables
- getTableSchema
- executeQuery (blocks destructive queries, like anything containing DROP, DELETE, etc.)
I wouldn't trust textual instructions to prevent LLMs from dropping a table.
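One way to make that block real rather than textual is to enforce it in the tool itself. A minimal sketch (hypothetical executeQuery implementation; assumes a psycopg-style connection):

```python
import re

# Deny-list of statement keywords the executeQuery tool refuses outright.
DESTRUCTIVE = re.compile(r"\b(DROP|DELETE|TRUNCATE|ALTER|UPDATE|INSERT)\b", re.IGNORECASE)

def execute_query(conn, sql: str):
    """Run a read-only query; reject anything that looks destructive."""
    if DESTRUCTIVE.search(sql):
        raise PermissionError(f"Refusing potentially destructive statement: {sql!r}")
    with conn.cursor() as cur:
        cur.execute(sql)
        return cur.fetchall()
```

Even then, a deny-list is easy to sidestep (e.g. via functions or DO blocks), so connecting with a read-only database role is the more robust fix.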
The key here is “on demand”. Not every agent or conversation needs to know kung fu. But when they do, a skill is waiting to be consumed. This basic idea is “progressive disclosure”, and it composes nicely to keep context windows focused. E.g. I have a Metabase skill to query analytics. Within that, I conditionally refer to how to generate authentication if they aren't authenticated. If they are authenticated, that information need not be consumed.
Some practical “skills”: writing tests, fetching Sentry info, using Playwright (a lot of local MCPs are just flat-out replaced by skills), submitting a PR according to team conventions (e.g. run lint, review code for X, check that the title matches the format, etc.).
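On disk, the Metabase example might look like this (a guess at one common layout; only SKILL.md's front-matter is preloaded, the rest is read on demand):

```
skills/
  metabase/
    SKILL.md           # front-matter + top-level query instructions (always visible)
    authentication.md  # referenced from SKILL.md, loaded only when not yet authenticated
```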
Maybe you have a custom auth backend that needs an annoying local proxy setup before it can be tested—you don’t need all of those instructions in the primary agents.md bloating the context on every request, a skill would let you separate them so they’re only accessed when needed.
Or if you have a complex testing setup and a multi-step process for generating realistic fixtures and mocks: the AI maybe only needs some basic instructions on how to run the tests 90% of the time, but when it’s time to make significant changes it needs info about your whole workflow and philosophy.
I have a django project with some hardcoded constants that I source from various third party sites, which need to be updated periodically. Originally that meant sitting down and visiting a few websites and copy pasting identifiers from them. As AI got better web search I was able to put together a prompt that did pretty well at compiling them. With a skill I can have the AI find the updated info, update the code itself, and provide it some little test scripts to validate it did everything right.
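The "little test scripts" part can be as simple as a sanity check the skill tells the agent to run after updating (module and constant names here are hypothetical):

```python
# check_constants.py: run after the agent refreshes the hardcoded identifiers.
# Assumes they live in myapp/constants.py as THIRD_PARTY_IDS (hypothetical names).
from myapp.constants import THIRD_PARTY_IDS

assert THIRD_PARTY_IDS, "identifier list must not be empty"
assert all(isinstance(i, str) and i.strip() for i in THIRD_PARTY_IDS), \
    "every identifier must be a non-empty string"
print(f"OK: {len(THIRD_PARTY_IDS)} identifiers look sane")
```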
Poor man's "skills" is just manually adding different .md files to the context.
Also every time you instruct the agent to do something correctly that it did incorrectly before, you ask it to revise a relevant .md file/"skill", so it has that correction from now on.
Compared to MCPs, this is a much faster and more approachable flow to add "capabilities" to your agents.
I think that would be a really really interesting thing to do on a bunch of different tasks involving developer tooling (e.g. git, jj, linters, etc.)
The path to recursive self-improvement seems to be emerging.
On the other hand, from a purely functional-coding angle, new skills that don't leak roles can be more atomic and efficient in the long run. Both have their pros and cons.
Otherwise, why not just keep the password in a .env file, and state “grab the password from the .env file” in your Postgres skill?
Why not the filesystem?
I would create a local file (e.g. .env) in each project using postgres, then in my postgres skill, tell the agent to check that file for credentials.
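A sketch of how that instruction might read in the skill (the PG* names are the standard libpq environment variables; the front-matter convention is assumed):

```markdown
---
name: postgres
description: How to query this project's Postgres database.
---

Connection credentials live in the project-root `.env` file as
PGHOST / PGPORT / PGDATABASE / PGUSER / PGPASSWORD. Read them from there;
never ask the user to paste a password into the chat.
```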
I have many "folders"... each with a README.md, a scripts folder, and an optional GUIDE.md.
Whenever I arrived at some code that I knew could be reused easily (for example: a clerk.dev integration that spans both frontend and backend), I used to create a "folder" for it.
When needed, I used to just copy-paste all the folder content using my https://www.npmjs.com/package/merge-to-md package.
This has worked flawlessly for me up until now.
Glad we are bringing such capability natively into these coding agents.
It’s also interesting to see how instead of a plan mode like CC, Codex is implementing planning as a skill.
(To clarify, I meant that some engineers mostly use CC while others mostly use Codex, as opposed to engineers using both at the same time.)
Some paths are emerging popular, but in a lot of cases we’re still not sure even these are the long term paths that will remain. It doesn’t help that there’s not a good taxonomy (that I’m aware of) to define and organize the different approaches out there. “Agent” for example is a highly overloaded term that means a lot of things and even in this space, agents mean different things to different groups.
For LLMs, we're just about at the stage where we've realized we can jam a sharp thing in the spinny part and use it to cut things. The race is on not only to improve the motors (models) themselves, but to invent ways of holding and manipulating and taking advantage of this fundamental thing that feel so natural that they seem obvious in hindsight.
Tools are useful so the AI can execute commands, but beyond that it's just ways to help you build the context for your prompt: either pulling in premade prompts that provide certain instructions or documentation, or providing more specialized tools for the model to use, along with instructions on using those tools.
More like a gallery than a marketplace
Not ranked with comments, but I'd expect solid quality from these, and they should “just work” in Codex etc.
- you will be getting a TON of spam. Just look at all the MCP folks, and how they're spamming everywhere with their Claude-vibed MCP implementations of something trivial.
- the security implications are enormous. You'd need a way to vet stuff, moderate, keep track of things and so on. This only compounds with more traffic, so it'd probably be untenable really fast.
- there's probably zero money in this. So you'd have to put a lot of work into maintaining a platform that attracts a lot of abuse/spam/prompt kiddies, while getting nothing in return. This might make sense for some companies that can justify the cost, but at that point you'd wonder what's in it for them, and what control they exert over moderation/curation, etc.
I think the best we'll get in this space is from "trusted" entities (i.e. recognised coders / personalities / etc), from companies themselves (having skills in repos for known frameworks might be a thing, like it is with agents.md), and maybe from the token providers themselves.
Imagine having Skills available that implement authentication systems, multi-tenancy, etc. in your codebase without having to know all the details of how to do this securely and correctly. This would probably boost code quality a lot and prevent insecure, buggy vibe-coded products.
A lot of the things we want continuous learning for can actually be provided by the ability to obtain skills on the fly.
I have this mental map:
- Front-matter <---> name and arguments of the function
- Text part of the skill .md <---> description field of the function
- Code part of the skill <---> body of the function
But the function wouldn't look as organised as the .md; also, a skill can have multiple function definitions.
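Concretely, the analogy might look like this (illustrative only; a skill isn't literally compiled to a function):

```python
def query_metabase(question: str, dashboard: str) -> str:  # front-matter: name + arguments
    """Fetch analytics from Metabase for a natural-language question."""  # text part: description
    ...  # code part: the scripts bundled with the skill
```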
So if you have subtle logic in a Skill that’s not mentioned in a description, or you use the skill body to describe use-cases not obvious from the front-matter, it may never be discovered or used.
Additionally, skill descriptions are all essentially prompt injections, whether relevant/vector-adjacent to your current task or not; if they nudge towards a certain tone, that may apply to your general experience with the LLM. And, of course, they add to your input tokens on every agentic turn. (This feature was proudly brought to you by Big Token.) So be thoughtful about what you load in what context.
See e.g. https://github.com/openai/codex/blob/a6974087e5c04fc711af68f...
1. Open-Skills: https://github.com/BandarLabs/open-skills
This is really an agentic-harness issue, not an LLM issue per se.
In 2026, I think we'll see agentic harnesses much more tightly integrated with their respective LLMs.
Close enough, welcome back index.htm, can't wait to see the first ads being served in my skills
Obviously they are empowering Codex and Claude etc, and many will be open source or free.
But for those who have commercial resources or tools to add to the skills choice, is there documentation for doing that smoothly, or a pathway to it?
I can see at least a couple of ways - skills requiring API keys or other similar approaches, but this adds friction to an otherwise smooth skill integration process.
Having instead a transparent commission on usage sent to registered skill suppliers would be much cleaner but I'm not confident that would be offered fairly, and I've seen no guidance yet on plans in that regard.
What do "skills" look like, generically, in this framework?
<Skills>
  (one entry per skill: name + description + file path)
</Skills>

The harness then may periodically resend this notification so that the LLM doesn't "forget" that skills are available. Because the notification is only name + description + file, this is cheap token-wise. The harness's ability to tell the LLM "IMPORTANT: this is a skill, so pay attention and use it when appropriate", and then to periodically remind it of this, is what differentiates a proper Anthropic-style skill from just sticking "If you need to do postgres stuff, read skills/postgres.md" in AGENTS.md. Just how valuable is this? Not sure. I suspect that a sufficiently smart LLM won't need the special skill infrastructure.
(Note that the skill name is not technically required; it's just a vanity/convenience thing.)
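For illustration, such a notification might render along these lines (hypothetical attribute names; the actual wrapper format is harness-specific):

```xml
<Skills>
  <Skill name="postgres"
         description="How to query this project's Postgres database"
         file="skills/postgres/SKILL.md"/>
</Skills>
```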
... And do we know how it does that? To my understanding there is still no out-of-band signaling.
So it's basically a standard way to bring prompts/scripts into the LLM's context, with support from the tooling directly.
One difference is the model might have been trained/fine-tuned to be better at "read all the front-matters from a given folder on your filesystem and then decide..." compared to a model with those instructions only in its context.
I see there might be advantages. The manual alternative could be tweaked further though. For example you might make it hierarchical.
Or you could create a "howTo" MCP with more advanced search capabilities (or a grandma MCP to ask for advice after a failure).
Interesting topic. I guess nobody has found a real best practice yet; everybody is still exploring.
Anthropic: https://www.anthropic.com/engineering/equipping-agents-for-t...
Copilot: https://github.blog/changelog/2025-12-18-github-copilot-now-...
can we use notepad or something free and not proprietary?
that's all there is to it.
If you want to go deeper, then Skills are dynamically unfolding prompts.
If you want a large library of skills and don't want to fill up your context window, then check out opencode-skillful.
As of this week, this also applies to Hacker News.