The Cost of a Closure in C

22 days ago

5 replies

[delayed]

juvoly

22 days ago

2 replies

That sounds cool, but this quickly gets complicated. Some aspects that need to be addressed:

- where does the automatically defined struct live? Data segment might work for static, but doesn't allow dynamic use. Stack will be garbage if closure outlives function context (i.e. callback, future). Heap might work, but how do you prevent leaks without C++/Rust RAII?

- while a function pointer may be copied or moved, the state area probably cannot. It may contain pointers to stack objects or point into itself (think Rust's pinning)

- you already mention recursion, compilation

- ...

fuhsnn

22 days ago

1 reply

IMO the C way is to allow users to explicitly manage context area, along the lines of posix ucontext.h or how the author's closure proposal handles closure allocation[1]. [1] https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Functio...

20 days ago

Yes that's what I'm thinking. Essentially a stateful function definition defines both a function, and a struct containing the state. I think there needs to be two ways of invoking a stateful function f: (1) if you invoke f within another stateful function g, each call site in g that calls f automatically gets a distinct state instance that becomes part of g's state, on the other hand, (2) if you want to invoke f in a regular (non-stateful) function, you need to manually manage the state and explicitly pass it in. That would be one purpose of the statetype(f) operator: to allow you to explicitly declare a state instance. Manual state management would also be used when you want to invoke f with the same state multiple times (e.g. from within a loop).
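For illustration only, here is roughly what case (2) looks like when emulated in today's C. The names `counter_state` (standing in for what `statetype(f)` might yield) and `counter_next` are hypothetical, and the proposed extension has no agreed syntax, so treat this as a sketch of manual state management rather than the feature itself.

```c
#include <assert.h>

/* Hypothetical stand-in for statetype(counter_next): the state that a
   "stateful function" would implicitly own. */
struct counter_state {
    int calls;   /* how many times the function has run */
    int total;   /* running sum of the arguments seen so far */
};

/* The "stateful function": the caller passes the state in explicitly. */
int counter_next(struct counter_state *st, int value) {
    st->calls += 1;
    st->total += value;
    return st->total;
}

/* A regular (non-stateful) function that declares one state instance and
   reuses it across several calls, here from within a loop. */
int sum_first_n(int n) {
    assert(n >= 0);                      /* precondition for this sketch */
    struct counter_state st = {0, 0};
    int result = 0;
    for (int i = 1; i <= n; i++) {
        result = counter_next(&st, i);
    }
    return result;
}
```

In case (1), the compiler would presumably embed one such struct per call site inside the caller's own state, which is the part that cannot be written by hand without knowing every call site.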

20 days ago

In C I don't think the copy/move thing is an issue. It has the same hazards as copying struct instances. And yes I am thinking of this as a C extension.

Another complication is that it would be beneficial to be able to optimize state storage in the same way that stack frame resources are optimized, including things like coalescing equal values in conceptually distinct state instances. This would (I think) preclude things like sizeof(statetype(f)) which you really want for certain types of manual memory management, or it would require multiple compiler passes.

tyushk

22 days ago

1 reply

Would this be similar to how Rust handles async? The compiler creates a state machine representing every await point and in-scope variables at that point. Resuming the function passes that state machine into another function that matches on the state and continues the async function, returning either another state or a final value.
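To make the comparison concrete, here is a hand-written C sketch of the kind of state machine described above: one enum value per suspension point, plus the locals that must survive across it. The names (`sum_task`, `sum_task_resume`) are invented for this example, and it is not what any particular compiler emits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Which suspension point the task is parked at. */
enum sum_stage { STAGE_START, STAGE_GOT_FIRST, STAGE_DONE };

/* The state machine: current stage plus the variables that are live
   across a suspension. The task adds two values that arrive one at a time. */
struct sum_task {
    enum sum_stage stage;
    int first;    /* first input, kept alive until the second arrives */
    int result;   /* valid once stage == STAGE_DONE */
};

void sum_task_init(struct sum_task *t) {
    assert(t != NULL);
    t->stage = STAGE_START;
    t->first = 0;
    t->result = 0;
}

/* Resume with the next input; returns true once the task has finished. */
bool sum_task_resume(struct sum_task *t, int input) {
    assert(t != NULL);
    switch (t->stage) {
    case STAGE_START:
        t->first = input;
        t->stage = STAGE_GOT_FIRST;
        return false;
    case STAGE_GOT_FIRST:
        t->result = t->first + input;
        t->stage = STAGE_DONE;
        return true;
    case STAGE_DONE:
        return true;   /* already finished; nothing more to do */
    }
    return true;
}
```

The caller owns the `struct sum_task`, which is the same "who manages the state area" question raised earlier in the thread.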

20 days ago

[delayed]

fuhsnn

22 days ago

I dreamed up a similar idea[1] upon reading the author's closure proposal, it's also really close to async coroutines.

[1] https://github.com/ThePhD/future_cxx/issues/55#issuecomment-...

1f60c

22 days ago

> a "state" keyword for declaring variables in a "stateful" function

Raku (née Perl 6) has this! https://docs.raku.org/language/variables#The_state_declarato...

vintagedave

22 days ago

Yes, though it was a remarkably brief mention. I believe Borland tried to standardise it back in 2002 or so,* along with properties. (I was the C++Builder PM, but a decade and a half after that attempt.)

C++Builder’s entire UI system is built around __closure and it is remarkably efficient: effectively, a very neat fat pointer of object instance and method.

[*] Edit: two dates on the paper, but “bound pointer to member” and they note the connection to events too: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n13...

mgaunard

22 days ago

1 reply

I feel the results say more about the testing methodology and inlining settings than anything else.

Practically speaking all lambda options except for the one involving allocation (why would you even do that) are equivalent modulo inlining.

In particular, the caveat with the type erasure/helper variants is precisely that it prevents inlining, but given everything is in the same translation unit and isn't runtime-driven, it's still possible for the compiler to devirtualize.

I think it would be more interesting to make measurements when controlling explicitly whether inlining happens or the function type can be deduced statically.

22 days ago

1 reply

Given a Sufficiently Good™ compiler, yes, after devirtualization and heap elision all variants should generate exactly the same code. In practice it is more complicated. Devirtualization needs to run after (potentially interprocedural) constant propagation, which might be too late to take advantage of other optimization opportunities, unless the compiler keeps rerunning the optimization pipeline.

In a simple test I see that GCC has no problem with completely removing the overhead of std::function_ref, but plain std::function is a huge mess.

Eventually we will get there [1], but in the meantime I prefer not to rely on devirtualization, and heap elision is more of a party trick.

[1] for example 25 years ago compilers were terrible at removing abstraction overhead of the STL, today there is very little cost.

mgaunard

22 days ago

You can just write the benchmark in such a way that the optimizations are not possible.

nesarkvechnep

22 days ago

2 replies

I'm thinking of using C++ for a personal project specifically for the lambdas and RAII.

I have a case where I need to create a static templated lambda to be passed to C as a pointer. Such thing is impossible in Rust, which I considered at first.

pornel

22 days ago

4 replies

Yeah, Rust closures that capture data are fat pointers { fn*, data* }, so you need an awkward dance to make them thin pointers for C.

    let mut state = 1;
    let mut fat_closure = || state += 1;
    let (fnptr, userdata) = make_trampoline(&mut &mut fat_closure);

    unsafe {
        fnptr(userdata);
    }

    assert_eq!(state, 2);

    use std::ffi::c_void;
    fn make_trampoline<C: FnMut()>(closure: &mut &mut C) -> (unsafe fn(*mut c_void), *mut c_void) {
        let fnptr = |userdata: *mut c_void| {
            let closure: *mut &mut C = userdata.cast();
            (unsafe { &mut *closure })()
        };
        (fnptr, closure as *mut _ as *mut c_void)
    }

It requires a userdata arg for the C function, since there's no allocation or executable-stack magic to give a unique function pointer to each data instance.

nesarkvechnep

22 days ago

1 reply

I know about this technique but it uses too much unsafe for my taste. Not that it's bad or anything, just a personal preference.

pornel

22 days ago

1 reply

[delayed]

nesarkvechnep

21 days ago

Yes but my problem wasn’t with the user data pointer but the fact that I needed a STATIC generic lambda. Static because the C library then forks and continues to call the lambda in the new process.

eqvinox

22 days ago

If Rust has a stable ABI on where the data* is in the function arguments (presumably first?), you don't need to do anything if it matches the C code's expected function signature including the user context arg.

Unfortunately a lot of existing C APIs won't have the user arg in the place you need it, it's a mix of first, last, and sometimes even middle.

MindSpunk

22 days ago

This is a problem for all capturing closures though, not just Rust's. A pure fn-ptr arg can't have state, and if there's no user data arg then there's no way to make a trampoline. If C++ was calling a C API with the same constraint it would have the same problem.

skavi

22 days ago

> Rust closures that capture data are fat pointers { fn, data }

This isn’t fully accurate. In your example, `&mut C` actually has the same layout as usize. It’s not a fat pointer. `C` is a concrete type and essentially just an anonymous struct with FnMut implemented for it.

You’re probably thinking of `&dyn FnMut` which is a fat pointer that pairs a pointer to the data with a pointer to a VTable.

queuebert

22 days ago

In Rust, could you instead use a templated struct wrapping a function pointer along with #[repr(C)]?

22 days ago

3 replies

I think local functions (like the GNU extension) that behave like C++ byref(&) capturing lambdas make the most sense for C.

You can call the local functions directly and get the benefits of the specialized code.

There's no way to spell out this function's type, and no way to store it anywhere. This is true of regular functions too!

To pass it around you need to use the type-erased "fat pointer" version.

I don't see how anything else makes sense for C.

nutjob2

22 days ago

1 reply

The price you pay for GCC nested (local) functions is an executable stack with 'trampolines'.

I'm a fan of nested functions but don't think the executable stack hack is worth it, and using a 'display' is a better solution.

See the Dragon Book or Compiler Construction: Principles and Practice (1984) by Louden

22 days ago

1 reply

You misunderstood my comment. GNU local function syntax, C++ [&] lambda behavior (i.e., a hidden struct).

nutjob2

22 days ago

1 reply

I really did, my comment is specific to C.

LegionMammal978

22 days ago

1 reply

The only reason that GCC needs executable trampolines is for the program to be able to create an ordinary function pointer and have all the captured data come along with it. The proposal is to reuse the syntax of nested functions, but change the semantics so that they are no longer callable via ordinary function pointers, but rather "fat pointers" that reference the captured data. This method is effectively used by C++ and Rust and does not need trampolines.
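A minimal sketch of that fat-pointer shape in plain C, written out by hand. The struct and function names are made up for the example, and a real language feature would generate the environment struct for you.

```c
#include <assert.h>

/* A "fat pointer": a code pointer plus a pointer to the captured data. */
struct int_closure {
    int (*call)(void *env, int arg);
    void *env;
};

/* Captured variables of one particular closure (hypothetical example). */
struct add_env {
    int *accumulator;   /* captured by reference, like a C++ [&] lambda */
};

static int add_to_accumulator(void *env, int arg) {
    struct add_env *e = env;
    *e->accumulator += arg;
    return *e->accumulator;
}

/* A consumer that takes the fat pointer by value. No executable stack or
   trampoline is involved, because the environment travels with the call. */
int apply_twice(struct int_closure f, int arg) {
    f.call(f.env, arg);
    return f.call(f.env, arg);
}

int demo_fat_pointer(void) {
    int total = 10;
    struct add_env env = { &total };
    struct int_closure f = { add_to_accumulator, &env };
    int last = apply_twice(f, 5);
    assert(last == total);   /* the closure mutated the captured variable */
    return total;
}
```

The cost is that `struct int_closure` is not an ordinary function pointer, so it cannot be handed to an API like `qsort` that only accepts `int (*)(const void *, const void *)`.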

21 days ago

1 reply

Meaning something that generates code similar to what I have in this comment?

https://news.ycombinator.com/item?id=46243298

https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Functio...

21 days ago

Yeah something like that, though built-in to the compiler.

__phantomderp

20 days ago

1 reply

For what it's worth, that is the primary feature of the proposal linked in the blog post. It's just not talked about in the post because that post is about... performance!

18 days ago

That actually goes a bit further than my suggestion, since it allows the closure to be returned with its unique type. I'm not a fan of introducing these "unnamable types" to C since it means the closure producing function cannot be declared in a header.

I do like the trampoline trick in 3.2.4, however, neat alternative to a fat pointer!

22 days ago

> There's no way to spell out this function's type, and no way to store it anywhere. This is true of regular functions too!

well regular functions decay to function pointers. You could have the moral equivalent of std::function_ref (or similarly, borland __closure) in C of course and have closures decay to it.

Rochus

22 days ago

1 reply

The benchmark demonstrates that the modern C++ "Lambda" approach (creating a unique struct with fields for captured variables) is effectively a compile-time calculated static link. Because the compiler sees the entire definition, it can flatten the "link" into direct member access, which is why it wins. The performance penalty the author sees in GCC is partly due to the OS/CPU overhead of managing executable stacks, not just code inefficiency. The author correctly identifies that C is missing a primitive that low-level languages perfected decades ago: the bound method (wide) pointer.

The most striking surprise is the magnitude of the gap between std::function and std::function_ref. It turns out std::function (the owning container) forces a "copy-by-value" semantics deeply into the recursion. In the "Man-or-Boy" test, this apparently causes an exponential explosion of copying the closure state at every recursive step. std::function_ref (the non-owning view) avoids this entirely.

22 days ago

2 replies

Even if you never copy the std::function the overhead is very large. GCC (14 at least) does not seem to be able to elide the allocation, nor inline the function itself, even if used immediately after use and the object never escapes the function. Given the opportunity, GCC seems to be able to completely remove one layer of function_ref, but fails at two layers.

boris

22 days ago

1 reply

GCC (libstdc++), like all other major C++ runtimes (libc++, MSVC), implements the small object optimization for std::function where a small enough callable is stored directly in std::function's state instead of on the heap. Across these implementations, you can rely on being able to capture two pointers without a dynamic allocation.

22 days ago

You would think so, but it actually doesn't. Last time I checked, libstdc++ could only optimize std::bind closures. A trivial test with a stateless lambda shows this is still the case in GCC14 and 15. In fact I can't even seem to trigger the library optimization with bind.

Differently from GCC14, GCC15 itself does seem to be able to optimize the allocation (and the whole std::function) in trivial cases though (independently

Rochus

22 days ago

This is exactly right, and the "Man-or-Boy" benchmark hits the worst-case scenario for libstdc++ specifically. The optimization fails here. My "copy-by-value" comment refers to the ownership semantics. Since std::function owns its storage, and the Man-or-Boy recursion passes the closure into the next layer (often by value or by capturing it into a new closure), we trigger the copy constructor. If the SBO limit is exceeded, that copy constructor performs a new heap allocation and a deep copy of the state.

unwind

22 days ago

6 replies

This was very interesting, and it's obvious from the majority of the text that the author knows a lot about these languages, their implementation, benchmarking corners, and so on. Really!

Therefore it's very jarring with this text after the first C code example:

This uses a static variable to have it persist between both the compare function calls that qsort makes and the main call which (potentially) changes its value to be 1 instead of 0

This feels completely made up, and/or some confusion about things that I would expect an author of a piece like this to really know.

In reality, in this usage (at the global outermost scope level) `static` has nothing to do with persistence. All it does is make the variable "private" to the translation unit (C parlance, read as "C source code file"). The value will "persist" since the global outermost scope can't go out of scope while the program is running.

It's different when used inside a function, then it makes the value persist between invocations, in practice typically by moving the variable from the stack to the "global data" which is generally heap-allocated as the program loads. Note that C does not mention the existence of a stack for local variables, but of course that is the typical implementation on modern systems.
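A small sketch of the two meanings, with made-up variable names. At file scope `static` only changes linkage, while at block scope it changes how long the object lives.

```c
#include <assert.h>

/* File scope: `static` gives internal linkage (private to this translation
   unit). The object has static storage duration with or without it. */
static int file_private_counter = 0;
int file_visible_counter = 0;   /* external linkage, same lifetime */

/* Block scope: here `static` changes the storage duration, so the value
   survives between calls instead of being re-created on every call. */
int next_id(void) {
    static int id = 0;
    return ++id;
}

/* Both file-scope counters persist across calls in exactly the same way. */
int bump_counters(void) {
    file_private_counter++;
    file_visible_counter++;
    assert(file_private_counter == file_visible_counter);
    return file_private_counter;
}
```

Dropping `static` from `file_private_counter` would not change what `bump_counters` returns. It would only make the symbol visible to other translation units.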

kreco

22 days ago

1 reply

That's a very weird comment, you're spreading your knowledge and not really addressing what could have been changed in the article.

If I follow your comment, you mean that he could have used a non-static global variable instead and avoided using the "static" keyword afterward?

unwind

22 days ago

1 reply

Oh! Thanks, I was not being as concrete as I imagined. Sorry.

Yes, the `static` can simply be dropped, it does no additional work for a single-file snippet like this.

I tried diving into Compiler Explorer to examine this, and it actually produces slightly different code for the with/without `static` cases, but it was confusing to deeply understand quickly enough to use the output here. Sorry.

22 days ago

1 reply

I see exactly the same assembly from x86-64 GCC 15.2 with -O2 for the first example in the article both as is and without `static`, which makes sense. The two do differ if you add -fPIC, as though you’re compiling a dynamic library, and do not add -fvisibility=hidden at the same time, but that’s because Linux dynamic linking is badly designed.

Chabsff

22 days ago

1 reply

TU-level concepts dissolve during the linking stage. You need to compile with -c to generate an object file to see the distinction.

Also, the difference manifests in the symbols table, not the assembly.

22 days ago

To clarify, I was talking about Compiler Explorer-cleaned disassembly, same as the comment I was replying to.

pjmlp

22 days ago

1 reply

The author contributes to ISO C and ISO C++ working groups, and his latest contribution was #embed.

steveklabnik

22 days ago

1 reply

Not just that, the author is the Project Editor for WG14.

This doesn’t mean that it’s impossible to make mistakes, but still.

22 days ago

1 reply

It means he can edit LaTeX. Of course, JeanHeyd is very qualified, but being project editor for an ISO standard does not require this.

steveklabnik

22 days ago

1 reply

I mean, you're closer to the committee than I am, but while that is true in a literal sense, I'd assume that you all would not let someone who knew how to edit LaTeX but not know anything about C hold that position.

22 days ago

Assuming we have some choice. Not many people volunteer their time to this work, which is quite a lot and not much fun. Companies also do not invest a lot of resources into C.

gldrk

22 days ago

>This uses a static variable to have it persist between both the compare function calls that qsort makes and the main call which (potentially) changes its value to be 1 instead of 0

The only misleading thing here is that ‘static’ is monospaced in the article (this can’t be seen on HN). Other than that, ‘static variable’ can plausibly refer to an object with a static storage duration, which is what the C standard would call it.

sfpotter

22 days ago

I had a completely different response reading the sentence. I've been programming in C for 20+ years and am very familiar with exactly the problem the author is discussing. When they referred to a "static variable", I understood immediately that they meant a file static variable private to the translation unit. Didn't feel contrived or made up to me at all; just a reflection of the author's expertise. Precision of language.

debugnik

22 days ago

[delayed]

22 days ago

I'm finding myself in a weird position now, because I disagree with a whole lot of things in the blog post (well, the parts I was willing to read anyways), but calling that variable static for the sake of persistence was correct.

The fact that you are questioning the use of the term shows that you are not familiar with the ISO C standard. What the author alludes to is static storage duration. And whether or not you use the "static" keyword in that declaration, the storage duration of the object remains "static". People mostly call those things "global variables", but the proper standardese is "static storage duration". In that sense, the author was right to use "static" for the lifetime of the object.

22 days ago

3 replies

Thread locals do solve the problem. You create a wrapper around the original function. You set a global thread local user data, you pass in a function which calls the function pointer accepting the user data with the global one.

srcreigh

22 days ago

1 reply

Yep. Thread locals are probably faster than the other solutions shown too.

It’s confusing to me that thread locals are “not the best idea outside small snippets” meanwhile the top solution is templating on recursion depth with a constexpr limit of 11.

22 days ago

The method of having static variables to store state in functions is used heavily in the ANSI C book. It’s honestly a beautiful technique when used prudently.

22 days ago

1 reply

reentrancy.

22 days ago

1 reply

It doesn’t store state for later. It’s literally impossible to tell it’s happening.

Quekid5

21 days ago

1 reply

Imagine a comparison function that needs to call sort() as part of its implementation. You could argue that's probably a bad idea, but it would be a problem for this case.

(You could solve that with a manually maintained stack for the context in a thread local, but you'd have to do that case-by-case)
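One way to sketch that "manually maintained stack" is to let the C call stack do the work: save the previous thread-local slots in locals, install the new ones, and restore them on the way out. All names here (`sort_with_ctx`, `compare_ctx_fn`, and so on) are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef int (*compare_ctx_fn)(const void *a, const void *b, void *ctx);

/* Comparator and context currently installed for this thread. */
static _Thread_local compare_ctx_fn current_compare = NULL;
static _Thread_local void *current_ctx = NULL;

/* Two-argument shim that qsort can call; forwards to the installed pair. */
static int forward_compare(const void *a, const void *b) {
    assert(current_compare != NULL);
    return current_compare(a, b, current_ctx);
}

/* Save the previous slots, install ours, sort, then restore. A comparator
   that itself calls sort_with_ctx no longer clobbers the outer call. */
void sort_with_ctx(void *base, size_t count, size_t width,
                   compare_ctx_fn compare, void *ctx) {
    compare_ctx_fn saved_compare = current_compare;
    void *saved_ctx = current_ctx;
    current_compare = compare;
    current_ctx = ctx;
    qsort(base, count, width, forward_compare);
    current_compare = saved_compare;
    current_ctx = saved_ctx;
}

/* Example comparator: ctx points to an int flag selecting descending order. */
int compare_ints(const void *a, const void *b, void *ctx) {
    int x = *(const int *)a, y = *(const int *)b;
    int descending = *(const int *)ctx;
    int order = (x > y) - (x < y);
    return descending ? -order : order;
}
```

This still only works while `sort_with_ctx` is on the stack, so it does not help with storing a closure and calling it later.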

21 days ago

1 reply

That is true. It can be protected against with assert.

I think the times you need to do this are few. And this version is much more prudent.

Quekid5

21 days ago

1 reply

Assert what, exactly?

Anyway, the larger point is that a re-entrant general solution is desirable. The sort example might be a bit misguided, because who calls sort-inside-sort[0]? Nobody, realistically, but these types of issues are prevalent in the "how to do closures" area... and in C every API does it slightly differently, even if they're even aware of the issues.

[0] Because there's no community that likes nitpicking like the C (or C++) community. I considered preempting that objection :). C++ has solved this, so there's that.

20 days ago

> Assert what, exactly?

That you do not call it recursively by checking that the thread local is nil before invocation.

> a re-entrant general solution is desirable.

I know what you mean, but I just don't know why you want to emulate that in C. There is a real problem of people writing APIs that don't let you pass in data with your function pointer - the thread local method can solve 99% of those without changes to the original API.

But if you really want to do all kinds of first class functions with data, do you want to use C?

sparkie

22 days ago

1 reply

Thread locals don't fully solve the problem. They work well if you immediately call the closure, but what if you want to store the closure and call it later?

    #include <stdlib.h>
    #include <string.h>
    #include <stddef.h>

    typedef int (*comp)(const void *untyped_left, const void *untyped_right);

    /* thread_local is a keyword in C23; earlier standards need <threads.h> or _Thread_local */
    thread_local int in_reverse = 0;

    __attribute__((noinline))
    int compare_impl(const void *untyped_left, const void *untyped_right, int in_reverse) {
        const int* left = untyped_left;
        const int* right = untyped_right;
        return (in_reverse) ? *right - *left : *left - *right;
    }

    comp make_sort(int direction) {
        in_reverse = direction;
        int compare(const void *untyped_left, const void *untyped_right) {
            return compare_impl(untyped_left, untyped_right, in_reverse);
        }
        return compare;
    }

    int main(int argc, char* argv[]) {

        int list[] = { 2, 11, 32, 49, 57, 20, 110, 203 };

        comp normal_sort = make_sort(0);
        comp reverse_sort = make_sort(1);

        qsort(list, (sizeof(list)/sizeof(*list)), sizeof(*list), normal_sort);
            
        return list[0];
    }

Because we create `reverse_sort` between creating `normal_sort` and calling it, we end up with a reverse sort despite clearly asking for a normal sort.

22 days ago

1 reply

The answer is you wrap it and you don’t return until the thing stored in the thread local is not needed

sparkie

22 days ago

2 replies

So we can't return the closure?

Then it's clearly only half a solution.

The example I gave above should work fine in any language with first-class closures.

21 days ago

1 reply

> The closure problem can be neatly described by as “how do I get extra data to use within this qsort call?”

    #include <stdlib.h>

    /* Per-thread slots holding the current comparator and its user data. */
    _Thread_local void *_qsort2_userData = NULL;
    _Thread_local int (*_qsort2_compare)(const void *a, const void *b, void *userData);

    /* Two-argument comparator for qsort; forwards to the stored three-argument one. */
    static int _qsort2_helper(const void *a, const void *b) {
        return _qsort2_compare(a, b, _qsort2_userData);
    }

    void qsort2(void *base, size_t elements, size_t width,
                int (*compare)(const void *a, const void *b, void *userData), void *userData)
    {
        _qsort2_userData = userData;
        _qsort2_compare = compare;
        qsort(base, elements, width, _qsort2_helper);
    }

21 days ago

1 reply

you also need to restore the _qsort2_closure when done. But again you are reinventing dynamic scoping with all its advantages and disadvantages.

21 days ago

> you also need to restore the _qsort2_closure when done

No I do not. It will be reassigned on the next call.

> But again you are reinventing dynamic scoping

No. I’m not reinventing anything. I’m using the existing feature of thread local variables.

The usage of such is entirely an implementation detail of qsort2 with the exception of recursion.

Dynamic scoping typically refers to defining variables which have scope outside of their call stack. No usage of this API requires it.

Can you just try to learn something new?

21 days ago

1 reply

This is the classic lexical vs dynamic scoping. Dynamic scoping works great until it doesn't.

21 days ago

Don’t use C then? It sounds like you want JavaScript, Python, or Lisp.

Once again, the caller of the API does not declare any variables so there is no dynamic scoping.

hyperbolablabla

22 days ago

1 reply

Stewart Lynch in his 10x VODs mentions his custom Function abstraction in C++. It's super clean and explicit, avoiding auto/type erasure of C++ lambdas. Its use looks something akin to:

    // imagine my_function takes 3 ints, the first 2 args are captured and curried.
    Function<void(int)> my_closure(&my_function, 1, 2);
    my_closure(3);

I've never implemented it myself, as I don't use C++ features all too much, but as a pet project I'd like to someday. I wonder how something like that compares!

spacechild1

22 days ago

Isn't this basically the same as passing the function to std::bind_front and storing it in a std::function or std::function_ref?

22 days ago

2 replies

[delayed]

fuhsnn

22 days ago

1 reply

> Heap-allocated trampolines have an obvious deallocation problem; it would be interesting to see what strategy is used for that.

With -ftrampoline-impl=heap, GCC automatically inserts[1] pairs of constructor/destructor routines from libgcc which were built around mmap/munmap.

[1] https://godbolt.org/z/7s5nooMPz

22 days ago

[delayed]

21 days ago

1 reply

Why oh why isn’t 'uecker still pushing his GCC patch enabling -fno-trampolines for C. I know it’s an ABI break, but it would be so nice :(

19 days ago

(I think this is my personal record wrt the amount of errors in a short code snippet. You get the idea, and I’m frankly afraid to try and post a fixed version at this point :) )

22 days ago

2 replies

[delayed]

22 days ago

1 reply

That's all there is to it. I don't understand the whole obsession with closures.

I've used lambdas extensively in modern C++. I hate them with a passion.

I've also used OCaml. An awesome language where this stuff is super natural and beautiful.

I don't understand why people want to shoehorn functional programming into C. C++ was terrible already, and is now worse for it.

> we’re going to be focusing on and looking specifically at Closures in C and C++, since this is going to be about trying to work with and – eventually – standardize something for ISO C that works for everyone.

Sigh. My heart sinks.

Asooka

22 days ago

2 replies

It does seem needlessly complex. I think a better idea is to just have a type that is a pair of pointer-sized words. That pattern crops up again and again - context pointer and function pointer, array and its size, memory allocation and effective size, etc. The problem with having both pieces in separate variables is that it is very easy to lose track of what is where. If you have it in a single bundle it is a lot simpler to use. The exact design needs a lot more consideration for sure, because I would like something simpler than writing anonymous structs everywhere (which I can already do), but at the same time flexible enough for most use cases.

22 days ago

[delayed]

fithisux

21 days ago

betterC is all you need

https://dlang.org/spec/betterc.html

eqvinox

22 days ago

Honestly all I would want from C closures is a better syntax for doing exactly that. Whatever it ends up being (if it goes in), it better have some way to interop with "legacy" function pointer + user context APIs. (Including some way to tell the compiler where the user context pointer goes on the closure's arguments.) Otherwise it's just completely useless.

22 days ago

1 reply

BTW: I wrote why the lambda design does not fit C well here:

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3654.pdf

(and I am not impressed by micro benchmarks)

22 days ago

1 reply

From the introduction, your paper seems like a counterproposal: support closures, just not the way others propose. But the paper seems to accept that closures / nested functions, supported at the language level directly, are a "good thing" for C specifically. I disagree with that. When and how has it become the consensus?

22 days ago

1 reply

At the moment there is no consensus for anything related to this. But most people agree that closures are a good thing because they make certain reoccurring programming patterns safer and easier to write. Why do you disagree?

21 days ago

1 reply

Thanks for the answer.

I disagree because I've seen closures shine (in OCaml) and suck terribly (in C++). Same concept, extremely different programming experience and debuggability. Syntax matters, language psychology matters. Closures naturally and obviously suit programming languages that are genuinely functional, or at least manage memory transparently for you. C is neither, and so "alloca", "defer", the cleanup attribute, closures -- all stick out sorely. The whole selling point of C is explicitness. The tedium of C is the price we pay for the great control that C offers us with a relatively simple vocabulary. I couldn't be more content with that deal.

C++ is impossible for any single person to learn, there is so much insanely complicated implicit behavior in it. C can mostly be learned by a single (persistent) person, but it's been getting harder.

I don't want C to be fashionable, or attractive. I want it to remain minimal. If someone feels hamstrung by it, there are so many other languages to choose from. I simply want the particular tradeoff that C offers (or used to offer) to remain in existence. And that is what's been going away, with each issue of ISO C being the only official standard (obsoleting/superseding all earlier issues of the standard).

Why give people closures or "defer" or ... whatever ... when they can't even remember the concept of the usual arithmetic conversions? Which has been standard since C89? Have you met a "practitioner" (= any C programmer with no particular interest in the standard proper) that could explain the effective type rules? Why make it more complicated?

I apologize -- I guess this is just my semi-diplomatic way to say, "please, get off my lawn". (Not to you personally, of course!) I'm very sorry.

20 days ago

2 replies

> I disagree because I've seen closures shine (in OCaml) and suck terribly (in C++).

Well, I also do not want to see C++ style closures in C and I fully agree about your point regarding control and explicitness. I also agree that some of the initiatives we see now are regrettably motivated by the attempt to make C fashionable, and sometimes by poorly adopting C++ features.

Yet, I think nested functions fit C perfectly well and I have used them for a long time in some projects. They exist in very similar languages (PASCAL, Ada, D, ...) and even in C's ancestor ALGOL. This also shows that this type of nested function is not a functional programming concept. There is not really anything to learn, as syntax and semantics follow very naturally from all existing rules and the improvement in code quality for callbacks or higher-level iteration over data types is very real.

The usual arithmetic conversions have seen unfair criticism in my opinion. Effective type rules are a mess, to some degree also because compilers invented their own rules or simply ignore the C standard. But this is a different topic. From a programmer's point of view, the rule that each variable you access should have one type that is used consistently is enough to know to stay on the safe side.
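For readers who have not run into them, two of the classic cases behind the complaints about the usual arithmetic conversions, as a small sketch with made-up function names:

```c
#include <assert.h>
#include <limits.h>

/* int vs unsigned int: the int operand is converted to unsigned int, so -1
   becomes UINT_MAX and the comparison yields 0 (false). */
int minus_one_less_than_one_unsigned(void) {
    int i = -1;
    unsigned int u = 1u;
    assert((unsigned int)i == UINT_MAX);
    return i < u;
}

/* Operands narrower than int are promoted to int first (on typical
   platforms where int can hold every unsigned char value), so the sum
   does not wrap around at 256. */
int sum_of_unsigned_chars(void) {
    unsigned char a = 200, b = 100;
    return a + b;
}
```

Compilers can flag the first case (GCC and Clang have -Wsign-compare), which is part of why the rules are manageable in practice.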

20 days ago

1 reply

The other example of nested functions which you've not mentioned was in Metaware High C.

There they allowed nested functions, but also what they termed "full function values", being a form of fat pointer. Certainly I came across it in High-C v1.7 in 1990, and the full manual for an earlier version (1.5?) from around '85 can be found on Bitsavers.

It had a syntax like:

    extern void Quick_sort(
      int Lo, int Hi, int Compare(int a, int b)!,
      void Swap(int a,int b)!
    );

    static Sort_private_table() {
      Entry Entries[100];
      int Compare(int a,int b) {
        return Entries[a] < Entries[b];
      }
      void Swap(int a,int b) {
        Entry Temp = Entries[a];
        Entries[a] = Entries[b];
        Entries[b] = Temp;
      }
      ...
      Quick_sort(1,100,Compare,Swap);
    }

The above is an extract from their language reference, which you can find here:

https://archive.org/download/Yoshizuki_UnRenamed_Files__D-V/...

20 days ago

1 reply

Note - as far as I can see, it has similar behaviour to what you propose with _Closure() and a wide pointer. Just that it is existing practice, from 40 years ago.

I believe the High-C compiler with this support is still available, for modern embedded CPUs.

19 days ago

1 reply

Yes, I know about High-C although I did not know that it still exists. Thanks!

18 days ago

I ran across it recently. From a quick search now, possibly this lot:

https://www.synopsys.com/dw/ipdir.php?ds=arc-metaware-mx

20 days ago

Thanks!

psyclobe

22 days ago

c++ for the win!! finally!!

trgn

22 days ago

i wish JS gurus understood this before jumping all in on hooks and bloating the runtime footprint of every web app out there

zzo38computer

21 days ago

Something I had thought of (which does not fully solve the problems mentioned there, but would allow GNU nested functions to work in a way that can be implemented without trampolines, so that it can work in standard C and with the standard ABI), is to allow a nested function to optionally be defined with the "static" and/or "register" keywords.

With "static", it is implemented as an ordinary function, but the name is local to the function that contains it; it cannot access stuff within the function containing it unless those things are also declared as "static".

With "register", the address of the function cannot be taken, and if the function accesses other stuff within the function that contains it then the compiler will add additional arguments to the function so that its type does not necessarily match the type which is specified in the program.

This is not good enough for many uses though, so having the other extensions would also be helpful (possibly including implementing Apple Blocks in GCC).