I Spent a Year Making an Asn.1 Compiler in D
Posted 2 months ago · Active 2 months ago
bradley.chatha.dev · Tech · story · High profile
calm · mixed
Debate: 40/100
Key topics
Asn.1
D Programming Language
Compiler Development
The author spent a year developing an ASN.1 compiler in D and shares their experience, sparking a discussion about the complexities and challenges of working with ASN.1.
Snapshot generated from the HN discussion
Discussion Activity
Very active discussion
First comment: N/A
Peak period: 64 comments in 0-6h
Avg / period: 20
Comment distribution: 160 data points
Based on 160 loaded comments
Key moments
- 01 Story posted: Oct 23, 2025 at 8:47 AM EDT (2 months ago)
- 02 First comment: Oct 23, 2025 at 8:47 AM EDT (0s after posting)
- 03 Peak activity: 64 comments in 0-6h (hottest window of the conversation)
- 04 Latest activity: Oct 25, 2025 at 2:17 PM EDT (2 months ago)
ID: 45681200 · Type: story · Last synced: 11/20/2025, 8:32:40 PM
So I threw a bunch of semi-related ramblings together and I'm daring to call it a blog post.
Sorry in advance since I will admit it's not the greatest quality, but it's really not easy to talk about so much with such brevity (especially since I've already forgotten a ton of stuff I wanted to talk about more deeply :( )
It was hilarious because clearly none of the people who were in favor had ever used ASN.1.
[1] https://news.ycombinator.com/user?id=cryptonector
[2] https://github.com/heimdal/heimdal/tree/master/lib/asn1
I especially like XDR, though maybe that's because I worked at Sun Microsystems :)
"Pointers" in XDR are really just `OPTIONAL` in ASN.1. Seems so silly to call them pointers. The reason they called them "pointers" is that that's how they represented optionality in the generated structures and code: if the field was present on the wire then the pointer is not null, and if it was absent the then pointer is null. And that's exactly what one does in ASN.1 tooling, though maybe with a host language Optional<> type rather than with pointers and null values. Whereas in hand-coded ASN.1 codecs one does sometimes see special values used as if the member had been `DEFAULT` rather than `OPTIONAL`.
The main problem is that to work with the data you need to understand the semantics of the magic object identifiers, and while things like the PKIX module can be found easily, the definitions for other, more obscure namespaces for extensions can be harder to locate, as they are scattered across documentation from various standardization organizations.
So, protobuf could very well have been transported in DER; the issue was probably more one of Google not seeing any value in interoperability and wanting to keep it simple (or worse, clashes from oblivious users re-using the wrong, less well documented namespaces).
You are truly a masochist and I salute you.
Completely ignoring the ASN.1 support for complicated structures, with more than one CVE linked to incorrect parsing of these text fields.
We _are_ using subject DNs for linking certs to their issuers, but though that's "free-form", we don't parse them, we only check for equality.
With wildcards.
The problems with PKI/PKIX all go back to terrible, awful, no good, very bad ideas about naming that people in the OSI/European world had in the 80s -- the whole x.400/x.500 naming style where they expected people to use something like street addresses as digital names. DNS already existed, but it seems almost like those folks didn't get the memo, or didn't like it.
What people forget is that you do not have to use the whole set of schema attributes.
And encrypted and/or signed email? That too, though very poorly, but the issue there is key management, and DAP/LDAP don't help because in the age of spam public directories are not a thing. Right now the best option for cryptographic security for email is hop-by-hop encryption using DANE for authentication in SMTP, with headers for requesting this, and headers for indicating whether received email transited with cryptographic protection all the way from sender to recipient.
As for the "ability to address groups in flexible ways", I'm not sure what that means, but I've never seen group distribution lists not be sufficient.
As for group addressing, distribution lists are pitiful in comparison especially on discovery side.
Anyway, ultimately the big issue is that the DAP schema is always presented as "oh you need all the details", when... you don't. And we never got to the point of really implementing things well outside the more expected use case where people do not, actually, use them directly but pick by name/function from a directory.
Oh I can't remember. Binary attachments have worked since I started using them long long ago. It worked at least in the mid-90s. Back then I was using both, Internet email and x.400 (HP OpenMail!), and x.400 was a massive pain (for me especially since I was one of the people who maintained a gateway between the two). I know what you're referring to: it took a long time for email to get "8-bit clean" / MIME because of the way SMTP works, but MIME was very much a thing by the mid-90s.
So it took a while if you count the days of UUCP email -- round it to two decades. But "by the mid-90s" was plenty good enough because that's when the Internet revolution hit big companies. Lack of binary attachments wasn't something that held back Internet adoption. As far as the public and corps are concerned the Internet only became a thing circa 1994 anyways.
> As for group addressing, distribution lists are pitiful in comparison especially on discovery side.
Discovery, meaning directories. Those are nice inside corporate networks, which is where you need this functionality, so I agree, and yes people use Exchange / Exchange 365 / Outlook for this sort of thing, though even mutt can do LDAP-based discovery (poorly, but yes). Outside corporate networks directories are only useful within academia and governments / government labs. Outside all of that no one wants directories because they would only encourage the spammers.
Really, I do.
In particular I like:
- that ASN.1 is generic, not specific to a given encoding rules (compare to XDR, which is both a syntax and a codec specification)
- that ASN.1 lets you get quite formal if you want to in your specifications
For example, RFC 5280 is the base PKIX spec, and if you look at RFCs 5911 and 5912 you'll see the same types (and those of other PKIX-related RFCs) with more formalisms. I use those formalisms in the ASN.1 tooling I maintain to implement a recursive, one-shot codec for certificates in all their glory.
- that ASN.1 has been through the whole evolution of "hey, TLV rules are all you need and you get extensibility for free!!1!" through "oh no, no that's not quite right is it" through "we should add extensibility functionality" and "hmm, tags should not really have to appear in modules, so let's add AUTOMATIC tagging" and "well, let's support lots of encoding rules, like non-TLV binary ones (PER, OER) and XML and JSON!".
Protocol Buffers is still stuck on TLV, all done badly by comparison to BER/DER.
(However, like many other formats (including JSON, XML, etc), ASN.1 can be badly used.)
If it was being used in homogenous environments the way protocol buffers typically are, where the schemas are usually more reasonable and both read and write side are owned by the same entity, it might not have gotten such a bad rap...
:) Glad to see someone else who's gone down this road as well.
[1] https://en.wikipedia.org/wiki/Aeronautical_Telecommunication...
(I worked in telecommunications when ASN.1 was a common thing)
For example, when I inherited a public key signature system (mainly retrieving certificates and feeding them to cryptographic primitives and downloading and checking certificate revocation lists) everything troublesome was left by dismissed consultants; there were libraries for dealing with ASN.1 and I only had to educate myself about the different messages and their structure, like with any other standard protocol.
The various WIP features, and switching focus of what might bring more people into the ecosystem, have given way to other languages.
Even C#, Java and C++ have gotten many of the features that were only available in D when Andrei Alexandrescu's book came out in 2011.
I imagine that the scope of its uses has shrunk as other languages caught up, and I don't think it's necessarily a good language for general enterprise stuff (unless you're dealing with C++), but for new projects it's still valid IMO. I think that the biggest field it could be used in is probably games too, especially if you're already writing a new engine from scratch. You could start with the GC and then ease off of it as the project develops in order to speed up development, for example. And D could always add newer features again too, tbh.
Another thing that Java and C# got to do since 2011 is that AOT is also part of the ecosystem and free of charge (Java has had commercial compilers for a while), so not even a native binary is the advantage you imagine.
First D has to finish what is already there in features that are almost done but not quite.
Do you have any other ideas about how D could stand out again?
That is why outside startups selling a specific product, most IT departments are polyglot.
For D to stand out, there must be a Rails, Docker like framework, something, that is getting such a buzz that makes early adopters want to go play with D.
However I don't see it happening in the LLM age, where at the whim of a prompt thoughts can be generated in whatever language, which is only a transition step until we start having agent runtimes.
C++ modules, well, they should have more closely copied D modules!
Worse is better approach tends to always win.
And still today, the first thought that comes to mind when I think D is "that language with proprietary compilers", even though there has apparently been some movement on that front? Not really worth looking into now that we have Go as an excellent GC'd compiled language and Rust as an excellent C++ replacement.
Having two different languages for those purposes seems like a better idea anyway than having one "optionally managed" language. I can't even imagine how that could possibly work in a way that doesn't just fragment the community.
> There are a lot of loud people in the D community who freak out and whine about the GC, and there are plenty more quiet ones who are happily getting things done without making much noise
This sounds like an admission that the community is fractured, except with a weirdly judgemental tone towards those who use D without a GC?
"Are you saying that if I'm using Rust in the Linux kernel, I can use any Rust library, including ones written with the assumption they will be running in userspace? If not, how does that not fracture the community?"
"Are you saying that if I'm using C++ in an embedded environment without runtime type information and exceptions, I can use any C++ library, including ones written with the assumption they can use RTTI/exceptions? If not, how does that not fracture the community?"
You can make this argument about a lot of languages and particular subsets/restrictions on them that are needed in specific circumstances. If you need to write GC-free code in D you can do it. Yes, it restricts what parts of the library ecosystem you can use, but that's no different from any other language that has wide adoption in a wide variety of applications. It turns out that in reality most applications don't need to be GC-free (the massive preponderance of GC languages is indicative of this) and GC makes them much easier and safer to write.
I think most people in the D community are tired of people (especially outsiders) constantly rehashing discussions about GC. It was a much more salient topic before the core language supported no-GC mode, but now that it does it's up to individuals to decide what the cost/benefit analysis is for writing GC vs no-GC code (including the availability of third-party libraries in each mode).
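As a concrete illustration of the no-GC mode mentioned above: D's `@nogc` attribute is the mechanism that makes the compiler statically reject anything that might allocate on the GC heap, including calls into libraries that aren't themselves `@nogc`. A minimal sketch (the function is invented for illustration):

```d
// @nogc: the compiler rejects GC allocations in here, including calls
// to functions that are not themselves @nogc.
@nogc nothrow pure
int sumSquares(const(int)[] xs)
{
    int total = 0;
    foreach (x; xs)
        total += x * x;
    return total;
}

// Neither of these would compile inside a @nogc function:
//   auto copy = xs.idup;        // GC allocation
//   auto buf  = new int[](10);  // GC allocation
```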
> If you need to write GC-free code in D you can do it.
This seems correct, with the emphasis. Plenty of people make it sound like the GC in D is no problem because it's optional, so if you don't want GC you can just write D without a GC. It's a bit like saying that the stdlib in Rust is no problem because you can just use no_std, or that exceptions in C++ are no problem because you can just use -fno-exceptions. All these things are naïve for the same reason; they lock you out of most of the ecosystem.
That's not what I'm saying, and who cares if it's fractured or not? Why should that influence your decision making?
There are people who complain loudly about the GC, and then there are lots of other people who do not complain loudly and also use D in many different interesting ways. Some use the GC, some don't. People get hyper fixated on the GC, but it isn't the only thing going on in the language.
Because, if I want to write code in D without the GC, it impacts me negatively if I can't use most of the libraries created by the community.
I will say there are a larger and larger number of no-GC libraries. Phobos is getting an overhaul which is going to seriously reduce the amount that the GC is used.
It is probably also worth reflecting for a moment why the GC is causing problems for you. If you're doing something where you need some hard-ish realtime guarantee, bear in mind that the GC only collects when you allocate.
It's also possible to roll your own D runtime. I believe people have done things like replace the GC with allocators. The interface to the GC is not unduly complicated. It may be possible to come up with a creative solution to your problem.
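For context on the "collects only when you allocate" point: collections can also be paused and triggered explicitly through `core.memory.GC`. A rough sketch of the usual pattern for latency-sensitive sections (the function and loop are illustrative only):

```d
import core.memory : GC;

void latencySensitiveLoop()
{
    GC.disable();                // no collections will run in this section
    scope (exit)
    {
        GC.enable();
        GC.collect();            // pay the collection cost at a time of our choosing
    }

    foreach (frame; 0 .. 1_000)
    {
        // per-frame work; avoid allocating, or allocate knowing that
        // nothing will be collected until after the loop
    }
}
```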
I'm happy with C++ and Rust for those tasks. For other tasks where I want a GC, I'm perfectly happy with some combination of Go, Python and Typescript. I'm realistically never going to look into D.
I think you're really overstepping with "AI is eaten by Python". I can imagine an AI stack without Python: llama.cpp (for inference, not training... isn't completely that, but most of the core functionality is not Python, and not GC'd at all). I cannot imagine an AI stack without CUDA + C++. Even the premier Python tools (PyTorch, vLLM) would be non-functional without these tools.
While some very common interfaces to AI require a GC'd language, I think if you deleted the non-GC parts you'd be completely stuck and have years of digging yourself out, but if you deleted the GC parts you could end up with a usable thing in very short order.
Thus now you can use PTX directly from Python, and with the new cu Tiles approach, you can write CUDA kernels in a Python subset.
Many of these tools get combined because that is what is already there, and the large majority of us don't want, or don't have the resources, to spend bootstrapping a whole new world.
Until there is some monetary advantage in doing so.
Wait, so are you, or are you not, saying that a GC-less D program can use libraries written with the assumption that there's a GC? The statement "there's nothing stopping [you] from not using the GC" implies that all libraries work with D-without-GC, otherwise the lack of libraries written for D-without-GC would be stopping you from not using the GC.
It's not all or nothing with the GC, as I explained in another reply. There are many libraries that use the GC, many that don't. If you're writing code with the assumption that you'll just be plugging together libraries to do all the heavy lifting, D may not be the right language for you. There is a definite DIY hacker mentality in the community. Flexibility in attitude and approach are rewarded.
Something else to consider is that the GC is often more useful for high level code while manual memory management is more useful for low level code. This is natural because you can always use non-GC (e.g. C) libraries from GC, but (as you point out) not necessarily the other way around. That's how the language is supposed to be used.
You can use the GC for a GUI or some other loose UI thing and drop down to tighter C style code for other things. The benefit is that you can do this in one language as opposed to using e.g. Python + C++. Debugging software written in a mixture of languages like this can be a nightmare. Maybe this feature is useful for you, maybe not. All depends on what you're trying to do.
It looks like you work in a ton of different domains. I think based on what I've written in response to you so far, it should be easy to see that D is likely a good fit for some things you work on and a bad fit for others. I don't see what the problem is.
The D community is full of really nice and interesting people who are fun to interact with. It's also got a smaller number of people who complain loudly about the GC. This latter contingent of people is perceived as being maybe a bit frustrating and unreasonable.
I don't care whether you check D out or not. But your initial foray into this thread was to cast shade on D by mentioning issues with proprietary compilers (hasn't been a thing in years), and insinuating that the community was fractured because of the GC. Since you clearly don't know that much about the language and have no vested interest in it, why not leave well enough alone instead of muddying the waters with misleading and biased commentary?
Stop making excuses or expecting others to do your work for you.
Nothing is preventing you.
Unfortunely those that cry about GCs are still quite vocal, at least we can now throw Rust into their way.
If you understand how it's hooked into, it's very easy to work with. There is only one area of the language related to closure context creation that can be unexpected.
I honestly think if Walter Bright (or anyone within D) invested in having a serious web framework for D, even if it's not part of the standard library, it could be worth its weight in gold. Right now there's only Vibe.d that stands out, but I have not seen it grow very much since its inception; it's very slow moving. Give me a feature-rich web framework in D comparable to Django or Rails and all my side projects will shift to D. The real issue is it needs to be batteries included, since D does not have dozens of OOTB libraries to fill in gaps with.
Look at Go as an example: built-in HTTP server library, production ready, it's not ultra fancy but it does the work.
There are plenty of people who aren't interested in using languages with proprietary toolchains. Those people typically don't use C#. The people who don't mind proprietary toolchains typically write software for an environment where D isn't relevant, such as .NET or the Apple world.
Currently we need to get a stackless coroutine into the language, actors for windowing event handling, reference counting and a better escape analysis story to make the experience really nice.
This work is not scheduled for PhobosV3 but a subset such as a web client with an event loop may be.
Lately I've been working on some exception handling improvements and start on the escape analysis DFA (but not on the escape analysis itself). So the work is progressing. Stackless coroutine proposal needs editing, but it is intended to be done at the start of next year for approval process.
One of the issues I've seen in the community is just that there aren't enough people in the community with enough interest and enough spare time to spend on a large project. Everyone in the core team is focused on working on the actual language (and day-jobs), while everyone else is doing their own sort of thing.
From your profile you seem to have a lot of experience in the field and in software in general, so I'd like to ask you if you have any other advice for getting the language un-stuck, especially with regards to the personnel issues. I think I'd like to take up your proposal for a web framework as well, but I don't really have any knowledge of web programming beyond the basics. Do you have any advice on where to start or what features/use case would be best as well?
I'd rather give that to languages like C# with Native AOT, or Swift (see chapter 5 of the GC Handbook).
D only lacks someone like Google to push it into mainstream no matter what, like Go got to benefit from Docker and Kubernetes.
But that's only the reference compiler, DMD. The other two compilers were fully open source (including gcc, which includes D) before that.
Fully disagree on your position that having all possibilities with one language is bad. When you have a nice language, it's nice to write with it for all things.
The main advantage of ASN.1 (specifically DER) in an HTTPS/PKI context is that it's a canonical encoding. To my understanding Protobuf isn't; I don't know about Thrift.
(A lot of hay is made about ASN.1 being bad, but it's really BER and other non-DER encodings of ASN.1 that make things painful. If you only read and write DER and limit yourself to the set of rules that occur in e.g. the Internet PKI RFCs, it's a relatively tractable and normal looking serialization format.)
You can parse DER perfectly well without a schema, it's a self-describing format. ASN.1 definitions give you shape enforcement, but any valid DER stream can be turned into an internal representation even if you don't know the intended structure ahead of time.
rust-asn1[1] is a nice demonstration of this: you can deserialize into a structure if you know your structure AOT, or you can deserialize into the equivalent of a "value" wrapper that enumerates/enforces all valid encodings.
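To illustrate the schema-less point, here is a minimal D sketch (not the author's compiler) of walking a DER stream purely from the TLV framing; it assumes low-tag-number form and only prints tag and length, since the semantics still require a schema:

```d
import std.stdio;

// Walk a DER byte stream without a schema, printing each TLV's tag and length.
// Assumes low-tag-number form (tag numbers < 31); DER always uses definite lengths.
void walk(const(ubyte)[] data, size_t depth = 0)
{
    while (data.length >= 2)
    {
        immutable ubyte tag = data[0];
        immutable bool constructed = (tag & 0x20) != 0;
        size_t len = data[1];
        size_t hdr = 2;
        if (len & 0x80)                          // long-form length
        {
            immutable size_t n = len & 0x7F;
            if (data.length < 2 + n) return;     // truncated input
            len = 0;
            foreach (b; data[2 .. 2 + n]) len = (len << 8) | b;
            hdr = 2 + n;
        }
        if (data.length < hdr + len) return;     // truncated input
        foreach (_; 0 .. depth) write("  ");
        writefln("tag 0x%02X (%s), length %s", tag,
                 constructed ? "constructed" : "primitive", len);
        if (constructed)
            walk(data[hdr .. hdr + len], depth + 1);  // descend into SEQUENCEs etc.
        data = data[hdr + len .. $];
    }
}
```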
> which is that this ends up being complex enough that basically every attempt to do so is full of memory safety issues.
Sort of -- DER gets a bad rap for two reasons:
1. OpenSSL had (has?) an exceptionally bad and permissive implementation of a DER parser/serializer.
2. Because of OpenSSL's dominance, a lot of "DER" in the wild was really a mixture of DER and BER. This has caused an absolutely obscene amount of pain in PKI standards, which is why just about every modern PKI standard that uses ASN.1 bends over backwards to emphasize that all encodings must be DER and not BER.
(2) in particular is pernicious: the public Web PKI has successfully extirpated BER, but it still skulks around in private PKIs and more neglected corners of the Internet (like RFC 3161 TSAs) because of a long tail of OpenSSL (and other misbehaving implementation) usage.
Overall, DER itself is a mostly normal looking TLV encoding; it's not meaningfully more complicated than Protobuf or any other serialization form. The problem is that it gets mashed together with BER, and it has a legacy of buggy implementations. The latter is IMO more of a byproduct of ASN.1's era -- if Protobuf were invented in 1984, I imagine we'd see the same long tail of buggy parsers regardless of the quality of the design itself.
> rust-asn1[1] is a nice demonstration of this: you can deserialize into a structure if you know your structure AOT, or you can deserialize into the equivalent of a "value" wrapper that enumerates/enforces all valid encodings.
Almost. The "tag" of the data doesn't actually tell you the type of the data by itself (most of the time at least), so while you can say "there is something of length 10 here", you can't say if it's an integer or a string or an array.
Could you explain what you mean? The tag does indeed encode this: for an integer you'd see `INTEGER`, for a string you'd see `UTF8String` or similar, for an array you'd see `SEQUENCE OF`, etc.
You can verify this for yourself by using a schemaless decoder like Google's der-ascii[1]. For example, here's a decoded certificate[2] -- you get fields and types, you just don't get the semantics (e.g. "this number is a public key") associated with them because there's no schema.
[1]: https://github.com/google/der-ascii
[2]: https://github.com/google/der-ascii/blob/main/samples/cert.t...
So yeah, in that instance you do need a schema to make progress beyond "an object of some size is here in the stream."
Kerberos uses EXPLICIT tagging, and it uses context tags for every SEQUENCE member, so these extra tags and lengths add up, but yeah, dumpasn1 on a Kerberos PDU (if you have the plaintext of it) is more usable than on a PKIX value.
PER lacks type information, making encoding much more efficient as long as both sides of the connection have access to the schema.
In my experience it does tell you the type, but it depends on the schema. If implicit types are used, then it won't tell you the type of the data, but if you use explicit, or if it is neither implicit nor explicit, then it does tell you the type of the data. (However, if the data type is a sequence, then you might not lose much by using an implicit type; the DER format still tells you that it is constructed rather than primitive.)
(I don't think variable-length-lengths are that big of a deal in practice. That hasn't been a significant hurdle whenever I've needed to parse DER streams.)
This has the fun side effect that DER essentially allows you to process data ("give me the 4th integer and the 2nd string of every third optional item within the fifth list") without knowing what you're interpreting.
If the schema uses IMPLICIT tags then - unless I'm missing something - this isn't (easily) possible.
The most you'd be able to tell is whether the TLV contains a primitive or constructed value.
This is a pretty good resource on custom tagging, and goes over how IMPLICIT works: https://www.oss.com/asn1/resources/asn1-made-simple/asn1-qui...
> Because of OpenSSL's dominance, a lot of "DER" in the wild was really a mixture of DER and BER
:sweat: That might explain why some of the root certs on my machine appear to be BER encoded (barring decoder bugs, which is honestly more likely).
The wikipedia page on serialization formats[0] calls ASN.1 'information object system' style formalisms (which RFCs 5911 and 5912 make use of, and which Heimdal's ASN.1 makes productive use of) "references", which I think is a weird name.
[0] https://en.wikipedia.org/wiki/Comparison_of_data-serializati...
You need to populate a string? First look up whether it's a UTF8String, NumericString, PrintableString, TeletexString, VideotexString, IA5String, GraphicString, VisibleString, GeneralString, UniversalString, CHARACTER STRING, or BMPString. I'll note that three of those types have "Universal" / "General" in their name, and several more imply it.
How about a timestamp? Well, do you mean a TIME, UTCTime, GeneralizedTime, or DATE-TIME? Don't be fooled, all those types describe both a date _and_ time, if you just want a time then that's TIME-OF-DAY.
It's understandable how a standard with teletex roots got to this point, but it doesn't lead to good implementations when there is that much surface area to cover.
Implementing GeneralString in all its horror is a real pain, but also you'll never ever need it.
This generality in ASN.1 is largely due to it being created before Unicode.
> You need to populate a string? First look up whether it's a UTF8String, NumericString, PrintableString, TeletexString, VideotexString, IA5String, GraphicString, VisibleString, GeneralString, UniversalString, CHARACTER STRING, or BMPString.
They could be grouped into three groups: ASCII-based (IA5String, VisibleString, PrintableString, NumericString), Unicode-based (UTF8String, BMPString, UniversalString), and ISO-2022-based (TeletexString, VideotexString, GraphicString, GeneralString). (CHARACTER STRING allows arbitrary character sets and encodings, and does not fit into any of these groups. You are unlikely to need it, but it is there in case you do need it.)
IA5String is the most general ASCII-based type, and GeneralString is the most general ISO-2022-based type. For decoding, you can treat the other ASCII-based types as IA5String if you do not need to validate them, and you can treat GraphicString like GeneralString (for TeletexString and VideotexString, the initial state is different, so you will have to consider that). For the Unicode-based types, BMPString is UTF-16BE (although normally only BMP characters are allowed) and UniversalString is UTF-32BE.
When making your own formats, you might just use the most general ones and specify your own constraints, although you might prefer to use the more restrictive types if they are known to be suitable; I usually do (for example, PrintableString is suitable for domain names (as well as ICAO airport codes, etc) and VisibleString is suitable for URLs (as well as many other things)).
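A rough D sketch of that three-way grouping, keyed by the universal tag numbers (12 = UTF8String, 18 = NumericString, 19 = PrintableString, 20 = TeletexString, 21 = VideotexString, 22 = IA5String, 25 = GraphicString, 26 = VisibleString, 27 = GeneralString, 28 = UniversalString, 30 = BMPString); the enum and function are invented for illustration:

```d
enum StringFamily { ascii, unicode, iso2022, other }

StringFamily classify(uint universalTag)
{
    switch (universalTag)
    {
        case 18, 19, 22, 26: return StringFamily.ascii;    // ASCII-based
        case 12, 28, 30:     return StringFamily.unicode;  // Unicode-based
        case 20, 21, 25, 27: return StringFamily.iso2022;  // ISO-2022-based
        default:             return StringFamily.other;    // e.g. CHARACTER STRING (tag 29)
    }
}
```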
> How about a timestamp? Well, do you mean a TIME, UTCTime, GeneralizedTime, or DATE-TIME?
UTCTime probably should not be used for newer formats, since it is not Y2K compliant (although it may be necessary when dealing with older formats that use it, such as X.509); GeneralizedTime is better.
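For concreteness, the same instant written in both forms (illustrative values):

```d
// UTCTime has a two-digit year (hence the Y2K caveat), GeneralizedTime a four-digit one.
immutable utcTime         = "251223142500Z";    // YYMMDDHHMMSSZ
immutable generalizedTime = "20251223142500Z";  // YYYYMMDDHHMMSSZ
```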
In all of these cases, you only need to implement the types you are using in your program, not necessarily all of them.
(If needed, you can also use the "ASN.1X" that I made up which adds some additional nonstandard types, such as: BCD string, TRON string, key/value list, etc. Again, you will only need to implement the types that you are actually using in your program, which is probably not all of them.)
You can parse DER without using a schema (except for implicit types, although even then you can always parse the framing even if the value cannot necessarily be parsed; for this reason I only use implicit types for sequences and octet strings (and only if an implicit type is needed, which it often isn't), in my own formats). (The meaning of many fields will not be known without the schema, but that is also true of XML and JSON.)
I wrote a DER parser without handling the schema at all.
That's not really the problem. The problem is that DER is a tag-length-value encoding, which is quite redundant and inefficient and a total crutch that people who didn't see XDR first could not imagine not needing, but yeah, they really didn't need it. That crutch made it harder, not easier, to implement ASN.1/DER.
XML is no picnic either, by the way. JSON is much much simpler, and it's true you don't need a schema, but you end up wanting one anyways.
However, to have a sane interface for actually working with the data you do need a schema that can be compiled to a language specific notation.
There should be no need for a canonical encoding. 40 years ago people thought you needed that so you could re-encode a TBSCertificate and then validate a signature, but in reality you should keep the encoding as-received of that part of the Certificate. And so on.
no, not at all
they share some ideas, that doesn't make it "pretty much ASN.1". It's only "pretty much the same" if you argue all schema-based general purpose binary encoding formats are "pretty much the same".
ASN.1 also isn't "file" specific at all; its main use case is, and always has been, message exchange protocols.
(Strictly speaking ASN.1 is also not a single binary serialization format but 1. one schema language, 2. some rules for mapping things to some intermediate concepts, 3. a _docent_ different ways how to "exactly" serialize things. And in the 3rd point the difference can be pretty huge, from having something you can partially read even without schema (like protobuff) to more compact representations you can't read without a schema at all.)
At the implementation level they are different, but when integrating these protocols into applications, yeah, pretty much. Schema + data goes in, encoded data comes out, or the other way around. In the same way YAML and XML are pretty much the same, just different expressions of the same concepts. ASN.1 even comes with multiple expressions of exactly the same grammar, both in text form and binary form.
ASN.1 was one of the early standardised protocols in this space, though, and suffers from being used mostly in obscure or legacy protocols, often with proprietary libraries if you go beyond the PKI side of things.
ASN.1 isn't file specific, it was designed for use in telecoms after all, but encodings like DER work better inside of file formats than Protobuf and many protocols like it. Actually having a formal standard makes including it in file types a lot easier.
Yes, JOSE is still infinitely better than XmlSignatures and the canonical XML madness to allow signatures _inside_ the document to be signed.
- huge breaking change to the whole cert infrastructure
- this question was asked of the people who chose ASN.1 for X.509, and AFAIK they said today they would use protobuf. But I don't remember where I have that from.
- JOSE/JWT etc. aren't exactly that well regarded in the crypto community AFAIK, or designed with modern insights about how to best do such things (too much header malleability, too much crypto flexibility, too little deterministic encoding of JSON, too many imprecisely defined corner cases related to JSON, too much encoding overhead for keys and similar (which for some pq stuff can get into the 100KiB range)), and the argument of it being readable with a text editor falls apart if anything you care about is binary (keys, etc.) and often encrypted (producing binary). (And IMHO the plain-text argument also falls apart for most non-crypto stuff: if you add a base64 encoding anyway you already need dev tooling to read it, and whether your debug tooling does a base64 decode or a (maybe additional) data decode step isn't really relevant; same for viewing in an IDE, which can handle binary formats just fine, etc., but that's an off-topic discussion)
- if we look at some modern protocols designed by security specialists/cryptographers and have been standardized we often find other stuff (e.g. protobuf for some JWT alternatives or CBOR for HSK/AuthN related stuff).
That is true, but it's also true that JWT/JOSE is a market winner and "everywhere" today. Obviously, it's not a great one and not without flaws, and its "competition" is things like SAML which even more people hate, so it had a low bar to clear when it was first introduced.
> CBOR
CBOR is a good mention. I have met at least one person hoping a switch to CWT/COSE happens to help somewhat combat JWT bloat in the wild. With WebAuthN requiring CBOR, there's more of a chance to get an official browser CBOR API in JS. If browsers had an out-of-the-box CBOR.parse() and CBOR.stringify(), that would be interesting for a bunch of reasons (including maybe even making CWT more likely).
One of the fun things about CBOR though is that is shares the JSON data model and is intended to be a sibling encoding, so I'd also maybe argue that if CBOR ultimately wins that's still somewhat indirectly a "JSON win".
- ASN.1 is a set of a docent different binary encodings
- ASN.1's schema language is IMHO way better designed than Protobuf's, but also more complex, as it has more features
- ASN.1 can encode many more different data layouts (e.g. things where in Protobuf you have to use "tricks"), each being laid out in the output differently depending on the specific encoding format, annotations on the schema, and options during serialization
- ASN.1 has many ways to represent things more "compact" which all come with their own complexity (like bit mask encoded boolean maps)
overall the problem with ASN.1 is that it's absurdly over-engineered, leading to you needing to know many hundreds of pages across multiple standards documents just to implement one single encoding of the docent existing ones, and even then you might run into ambiguous, unclear definitions where you have to ask on the internet for clarification
if we ignore the schema languages for a moment most senior devs probably can write a crappy protobuf implementation over the weekend, but for ASN.1 you might not even be able to digest all relevant standards in that time :/
Realistically, if ASN.1 weren't as badly over-engineered and had shipped only with some of the more modern of its encoding formats, we probably would all be using ASN.1 for many things, including maybe your web server responses, and this probably would cut non image/video network bandwidth by 1/3 or more. But then the network is overloaded by image/video transmissions and similar, not other stuff, so I guess who cares???!???
ASN.1 has many encoding standards, but you don't need to implement them all, only the specific one for your needs.
ASN.1 has a standard and an easy to follow spec, which Protobuf doesn't.
In sum: I could cobble together a working ASN.1 implementation over a weekend. In contrast, getting to a clean-room working Protobuf library is a month's work.
Caveat: I have not had to deal with PKI stuff. My experience with ASN.1 is from LDAP, one of the easiest protocols to implement ever, IMO.
> to follow spec, which Protobuf doesn't.
I can't say so personally, but from what I heard from the coworker I helped, the spec isn't always easy to follow, as there are many edge cases where you can "guess" what they probably mean but they aren't precise enough. Though they had a surprising amount of success getting clarification from authoritative sources (I think some author or maintainer of the standard, but I'm not fully sure, it was a few years ago).
In general there seems to be a huge gap between writing something which works for some specific usage(s) of ASN.1 and something which works "in general/for all of the standard" (for the relevant encodings (mainly DER, but also at least part of non-DER BER, as far as I remember)).
> Protobuf doesn't.
yes, but its wire format is relatively simple and documented (not as a specification, but documented anyway), so getting something going which can (de-)serialize the wire format really isn't that hard, and the mapping from that to actual data types is also simpler (though it also won't work for arbitrary types due to dumb design limitations). I would be surprised if you needed a month of work to get something going there. Though if you want to reproduce all the tooling ecosystem or the special ways some libraries can interact with it (e.g. in-place edits etc.) it's a different project altogether. What I mean is just a (de-)serializer for the wire format with appropriate mapping to data types (objects, structs, whatever the language you use prefers).
But for Protobuf you kinda have to. Needing to parse the .proto files and conform to Google's codegen ideas is implied.
For ASN you just need to follow a spec, not a concrete implementation.
Yes, but you can use a subset of ASN.1. You don't have to implement all of x.680, let alone all of x.681, x.682, and x.683.
ASN.1 was not over-engineered in 1990. The things that kept it from ruling the world are:
- the ITU-T specs for it were _not_ free back then
- the syntax is context dependent, so using a LALR(1) parser generator to parse ASN.1 is difficult, though not really any more than it is to parse C with a LALR(1) parser generator, but yeah if it had had a LALR(1)-friendly syntax then ASN.1 would have been much easier to write tooling for
- competition from XDR, DCE/MS RPC, XML, JSON, Protocol Buffers, Flat Buffers, etc.
The over-engineering came later, as many lessons were learned from ASN.1's early years. Lessons that the rest of the pack mostly have not learned.
I had to look up https://www.merriam-webster.com/dictionary/docent
[1] the somewhat ironic part is that when it was discovered that using just passwords for authentication is not enough, the so-called "lightweight" LDAP got arguably more complex than X.500. The same thing happened to SNMP (another IETF protocol using ASN.1) being "Simple", for similar reasons.
CBOR is self-describing like JSON/XML meaning you don’t need a schema to parse it. It has better set of specific types for integers and binary data unlike JSON. It has an IANA database of tags and a canonical serialization form unlike MsgPack.
With the top level encoding solved, we could then go back to arguing about all the specific lower level encodings such as compressed vs uncompressed curve points, etc.
[1] https://datatracker.ietf.org/doc/rfc9804
If it were created today it would look a lot like OAuth JSON Web Tokens (JWT) and would use JSON instead of ASN.1/DER.
[1] https://github.com/PADL/ASN1Codable
[2] https://github.com/heimdal/heimdal/tree/master/lib/asn1
> which can transform ASN.1 into a much more parseable JSON AST
The sign of a person who's been hurt, and doesn't want others to feel the same pain :D
What I disagree is on the disdain being veiled. Seems very explicit to me.
Anyway, yeah, I hadn't heard about it before either, and it's great to know that somebody out there did solve that horrible problem already, and that we can use the library.
> ASN.1 is a... some would say baroque, perhaps obsolete, archaic even, "syntax" for expressing data type schemas, and also a set of "encoding rules" (ERs) that specify many ways to encode values of those types for interchange.
for me expressing disdain for ASN.1. On the contrary: I'm saying those who would say that are wrong:
> ASN.1 is a wheel that everyone loves to reinvent, and often badly. It's worth knowing a bit about it before reinventing this wheel badly yet again.
:)
(I worked with asn1c (not sure which fork) and had to hack in a custom allocator and 64-bit support. I shiver every time something needs attention in there)
Honestly any compiler project in pure C is pretty hardcore in my eyes, ASN.1 must amplify the sheer horror.
One memcpy made it like 30% faster overall.
* unit tests anywhere, so I usually write my methods/functions with unit tests following them immediately
* blocks like version(unittest) {} make it easy to exclude/include things that should only be compiled for testing (see the sketch after this list)
* enums, unions, asserts, contract programming are all great
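A minimal sketch of the inline-unittest pattern described in the list above (the function and helper are invented for illustration):

```d
int clamp(int value, int lo, int hi)
{
    return value < lo ? lo : (value > hi ? hi : value);
}

unittest
{
    assert(clamp(5, 0, 10) == 5);
    assert(clamp(-1, 0, 10) == 0);
    assert(clamp(42, 0, 10) == 10);
}

version(unittest)
{
    // Compiled only when building with -unittest.
    immutable int[] samples = [-1, 0, 5, 10, 42];
}
```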
I would say I didn't have to learn D much. Whatever I wanted to do with it, I would find in its docs or asked ChatGPT and there would always be a very nice way to do things.
From a philosophical/language-design standpoint, it ticks so many boxes. It had the potential to be wildly popular, had a few things gone differently.
If the language tooling and library ecosystem were on par with the titans of today, like Rust/Go, it really would be a powerhouse language.
Still much better than gccgo, which is kind of useless for anything beyond Go 1.18; no one is updating it any longer, and it may as well join gcj.
For regular common code that is indeed not an issue.
67 more comments available on Hacker News