and pass that to a function that is silently expecting a length and other stuff in front of it. Calling these things "pointers" is a bit of a promotion from what they really are. Naming them "char*" or whatever is pretty dangerous if you ever do pointer movement.
Seriously, if you want descriptors and typed blobs, just make up a struct for what you want and pass it around. Say what you mean. Lying like this is going to hurt you.
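Something like this, for example (a hypothetical struct, the names are made up):

#include <stddef.h>

/* A descriptor that says what it means, instead of hiding a length
   behind a char*. Purely illustrative. */
struct blob {
    void  *data;   /* payload */
    size_t len;    /* number of bytes in data */
    int    type;   /* caller-defined type tag */
};

/* The callee sees exactly what it is given. */
void consume(const struct blob *b);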
Head is not checked for null. This will segfault when heap allocations fail. [Preemptive reply to folks used to high-level languages and the Linux overcommit behavior: yes, heap allocations do fail, and yes, it is possible to handle such a failure well.]
That's going to crash when you try to access the T on many non-x86 architectures. On x86 you will have subtle problems like atomic ops failing to work.
I have seen cello come up on HN before. Seems like a cute hobby project based on some flawed ideas of what "C" is. If people interested in learning C are reading, I suggest learning C "for realsies" and avoiding this thing.
Fully agreed about "what C is". Many, many sins have been committed in the name of saving keystrokes or making things temporarily easier. That little voice in the back of your head telling you that something might be a bad idea? Listen to it, it's probably right.
On handling out-of-memory conditions:
- Yes, you can do it
- It's also hard to get right, in the general case, for large systems. It's often a Pyrrhic victory
Most systems I've worked with have simply restarted, rather than risk getting complicated recovery logic wrong (and winding up in a worse situation -- corrupting persistent data, or giving wrong answers -- than if they simply crashed). A few systems have a 'reserve tank' that they can use to do a controlled crash, saving important state and whatnot before quitting. OOM can get pretty wicked.
There are large projects that handle OOM and it isn't that hard to do.
Imagine every function in your call stack handles errors consistently. In most cases these functions will bubble up all errors to the caller. In many cases they will perform allocations themselves and free those allocations when they fall out of scope due to either success or error.
The function at the top of the stack hits a malloc error. It will bubble its status up to its caller, which will do the same for its caller, etc. The chain of functions will free the intermediate allocations they made along the way. By the time you get to some top-level or near-top-level function, you can react to the error, and you likely even have quite a bit of heap space once the rest of the stack frees its work. But if you don't happen to have the heap space, then you can structure that top-level error handler so that it performs very few allocations, does its allocations up front at initialization time, etc.
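As a rough sketch of the style I mean (invented names, the usual goto-cleanup pattern):

#include <stdlib.h>

/* Each function frees its own intermediate allocations and bubbles
   the status code up to its caller. */
int build_report(char **out)
{
    int   rc  = -1;                 /* assume failure until everything works */
    char *buf = malloc(4096);
    int  *tmp = NULL;

    if (buf == NULL)
        goto done;

    tmp = malloc(1024 * sizeof *tmp);
    if (tmp == NULL)
        goto done;

    /* ... do the work, filling buf from tmp ... */

    *out = buf;
    buf  = NULL;                    /* ownership passes to the caller */
    rc   = 0;
done:
    free(tmp);                      /* intermediate allocation always freed */
    free(buf);                      /* freed only on the error path */
    return rc;                      /* the caller bubbles rc up the same way */
}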
I don't think any of this is hard and I've seen it work well in practice. It's sad to me when I see the opposite, some unreasonable allocation quite reasonably fails, and it takes down the entire process because whoever wrote that code thought it was too hard to do otherwise.
I totally disagree. I've done a lot of operational and low-level systems work and handling obscure OOM scenarios is /really/, /really/ hard, and not at all worth the engineering effort.
But, even assuming you handle every failed page allocation perfectly and your stack does the right thing, you can still screw yourself /trivially/ by blowing out the network stack, or because the writeback daemon can't keep up with your filesystem writes. So now we're talking about clamping kernel resources with userspace logic that does things like periodic fsyncs and userspace tcp acks before attempting to send more -- good luck.
The reason it's hard is that everything you do has to become a "transaction" with the ability to roll back. Say you send five messages in a function: now you have to preallocate all of them before sending any, so you can cancel them all if any allocation fails. But this preallocation tends to break abstraction barriers (every API might need separate "prepare" and "fire" calls). It doesn't sound that complicated at first, but it gets that way in a hurry. Almost every function can fail, every operation needs the ability to roll back midstream... it makes a mess fast.
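To make that concrete, here's a sketch of the prepare/fire split with a made-up API:

/* Hypothetical API, purely to illustrate the two-phase shape. */
struct msg;
struct msg *msg_prepare(const char *text);   /* can fail: it allocates */
void        msg_cancel(struct msg *m);       /* rollback: frees, sends nothing */
void        msg_fire(struct msg *m);         /* cannot fail: only sends */

int send_five(const char *texts[5])
{
    struct msg *m[5] = {0};

    /* Phase 1: do everything that can fail, before any side effect. */
    for (int i = 0; i < 5; i++) {
        m[i] = msg_prepare(texts[i]);
        if (m[i] == NULL) {
            for (int j = 0; j < i; j++)
                msg_cancel(m[j]);            /* roll back what was prepared */
            return -1;                       /* nothing was sent */
        }
    }

    /* Phase 2: side effects only; nothing here can fail. */
    for (int i = 0; i < 5; i++)
        msg_fire(m[i]);
    return 0;
}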
It's also a LOT of extra code, really material bloat.
If you haven't written the test harness to test almost every malloc failing, you might think this is easier than it really is.
Adding 30-40% more code to your code base that will almost never get tested except maybe by your unit tests ... no thanks, not if it's possibly avoidable for a given application.
I find it odd to see a reply written as if I haven't written in a transactional style and seen it working well. But ignoring that for a second: your blog post says the error handlers did the wrong thing 5% of the time. Can I read from that that they did the right thing 95% of the time? And you will dismiss the technique for that?
Not to mention that there are coding styles that make the transactional approach less difficult. (OK, so reverting your work gets hairy in the presence of certain side effects. In many cases I would rather choose some behavior and stick with it than take down an entire process dereferencing a null pointer.)
All I'm saying is that "not hard," as you put it, and "30% more code, transactions, and a complicated test harness" don't go together for me. If they do for you, then enjoy :-)
A pointer you get from calloc() is guaranteed to be properly aligned to not cause SIGBUS on RISCs and some such. You cannot just add random sizeof(Header) to it and call it a general purpose allocation. struct Header needs to be a bit more elaborate than what they have. In fact, it shouldn't be a struct, but a union -
union Header
{
    var type;
    max_align_t foo;
};
whereby you get max_align_t from your standard libraries or do something like this -
typedef union max_align
{
    int i;
    long l;
    double d;
    void * p;
    ...
} max_align_t;
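For illustration, here's a rough sketch (using C11's max_align_t from <stddef.h>; the names are made up, this isn't Cello's actual code) of how such a header keeps the data behind it aligned:

#include <stddef.h>
#include <stdlib.h>

/* Padding the header with max_align_t makes sizeof(union header) a
   multiple of the strictest alignment, so the data that follows the
   header is suitably aligned for any type. */
union header {
    size_t      len;   /* the metadata we want to stash in front */
    max_align_t pad;   /* C11: forces worst-case size and alignment */
};

void *alloc_with_header(size_t size)
{
    union header *h = calloc(1, sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->len = size;
    return h + 1;      /* h + 1 inherits the header's alignment */
}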
Are you sure this is not wasteful? I always thought you could align on any word (32 bits) on a 32-bit x86 CPU, but your max_align_t would be 64 bits long (because it includes a double as well).
The Windows heap functions align to 8 bytes on 32-bit. I once worked on code that depended on this.
You want 8-byte alignment. For one, you could have doubles. (Though I just googled this, and apparently gcc will 4-byte align those by default on x86.) Another possibility is that you could have a data structure that relies on "lock cmpxchg8b".
Actually, you can align on any byte on x86 and it will still work; the access is just slower (and you lose some other guarantees). That said, lots of malloc implementations on 32-bit x86 actually return 8-byte aligned addresses, because you can have 64-bit ints and such, which benefit from 8-byte alignment.
While there are some cases where this is an issue (writes that cross 4KB page boundaries), this generally isn't true for any x86/x64 processor made in the last decade. There may be legitimate portability reasons for avoiding unaligned access, but performance on x86 is probably not a good reason.
I generally agree with what you've said, but a performance penalty is definitely still possible by crossing a cache-line boundary - I would assume unaligned accesses are just as fast as aligned ones only when the value itself is still contained within the same cache-line. But with that said, I think you're right in that the performance here isn't really the big deal - the portability reasons alone are enough to justify making sure the returned pointer is correctly aligned.
8.1.1 Guaranteed Atomic Operations
The P6 family processors (and newer processors since) guarantee that the following additional memory operation will always be carried out atomically:
• Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line
The first P6 was Pentium Pro, which came out in 1995. It does go on to say that although modern processors will guarantee atomic operations that cross cache lines, it's a bad practice that can badly hurt performance.
These aren't fat pointers. A fat pointer is, well, fat: it's twice the size of a normal pointer, because it has both a memory address and a length. But Cello's 'fat pointers' are just regular pointers that happen to point to memory preceded by a length. Cello's 'fat pointers' don't solve the array slice problem.
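For comparison, here's roughly what an actual fat pointer looks like (illustrative types, not anything from Cello):

#include <stddef.h>

/* The address and the length travel together, so the thing is twice
   the size of a plain pointer. */
struct fat_ptr {
    char  *ptr;
    size_t len;
};

/* Slicing is just arithmetic on the pair: no allocation, no hidden header. */
struct fat_ptr slice(struct fat_ptr p, size_t start, size_t count)
{
    struct fat_ptr s = { p.ptr + start, count };
    return s;
}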
It is called a struct. It holds an arbitrary collection of data of any type while enabling the lazy programmer to pass only one parameter around. It is also safe (as much as anything can be in C), unlike pointers that hide data behind their back and are just asking for trouble.
I will ignore all the other problems and point out the worst offender: the stack allocation example. That is not how you allocate arbitrary data on the stack; an automatic char array is not malloc. In fact, in C you can't do it without using special platform-specific functions. Anything else will cause undefined behavior.
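For what it's worth, a sketch of the platform-specific route (alloca is non-standard; glibc/BSD declare it in <alloca.h>, MSVC spells it _alloca in <malloc.h>):

#include <alloca.h>
#include <string.h>

struct thing { double d; int tag; };

void use_stack_thing(void)
{
    /* alloca returns stack memory suitably aligned for any object,
       unlike a plain automatic char array; it is released on return. */
    struct thing *t = alloca(sizeof *t);
    memset(t, 0, sizeof *t);
    t->d = 1.0;
}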
I think the reason for this is that they're trying to manipulate C's syntax to get a certain kind of intuitive style. They seem to want it to be concise and C-like (without resorting to macros), but they want to add their own functionality without actually extending the language. In this case, they want you to just be able to cast a Cello "var" pointer to a specific type pointer using the normal cast syntax.
I'm not saying it's the right choice. Just that I understand why they did it that way.
Still, I think a macro would be a perfectly reasonable way to do it.
If I understand this correctly, these are not fat pointers as used in e.g. Rust, where instead of a pointer you'd pass as an argument a struct with two pointers: one to the object and the other to the implementation of the interface/typeclass.
Instead, they're more akin to vtables in C++, i.e. placing a pointer to the interface at the beginning of the object, before the data.
I need to read the source (or generated code) to fully understand how they implemented support for multiple interfaces/typeclasses.
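Roughly, in C terms (invented names, not Cello's actual layout), the two approaches look like this:

/* Rust-style trait object: the "fat pointer" is a pair of pointers that
   travels as the argument; the object itself is untouched. */
struct interface_vtable;                     /* table of function pointers */
struct trait_ref {
    void                          *object;
    const struct interface_vtable *vtable;
};

/* vtable-in-front style: the pointer stays thin, and the type information
   sits in memory just before the object's data. */
struct obj_header {
    const struct interface_vtable *vtable;
    /* object data follows immediately after this header */
};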
I think you might be misunderstanding the intentions of this. AFAICT it's not a mechanism for supporting multiple interfaces, just a bit of light trickery to get a length passed into functions within the limitations of array-to-pointer decay in C. It also helps avoid cases where a miscalculated n is passed, by 'encapsulating' the length with the data.
Edit: I stand corrected. Just had a look over the source and this does a lot more than the OP link indicates. The Github README is more informative.
Anybody who has written a medium-sized project in C is aware of such a trick. And also of the load of problems that comes with it! (They are not really pointers anymore.) Please stop trying to make C look like JavaScript; just create a new language from scratch. The pile of syntax tricks you're playing with is bound to collapse.
A subset of these kinds of tricks is usually embedded in the source code of any project of decent size or larger. But that sort of implementation usually comes with big disclaimers that spell out the hacky parts in very loud terms. I'm kind of wary of a library trying to glue new behaviour onto C at a general level. There are inevitably too many corner cases which break the newly built abstraction. Those corner cases can be explained away for one specific codebase that already comes with its own set of practices, rules of thumb, and coding patterns that the programmers are already forced to learn. But a library as general as this is easily used liberally, even casually, and without a complete map of the shortcomings and broken corners I can see lots of potential hair-pulling coming right up when things are no longer what they seem to be.
Neat punning on C pointers with a minimal increase in memory footprint.
Some people have suggested putting array sizes in structs. But that will increase the level of indirection and hurt performance. Also it will require rewriting most applications.
With Cello's approach you just have to avoid pointer arithmetic. Everything should work.
This is the same as Pascal strings, popular among Windows programmers in the form of the BSTR type. BSTR also uses the same trick of putting the length before the string data so that the BSTR pointer can be treated as a char pointer.
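Roughly the same trick in plain C (a sketch with invented names; real BSTRs are allocated with SysAllocString, hold UTF-16, and keep a 4-byte byte count in front):

#include <stdlib.h>
#include <string.h>

/* Store the length just before the characters and hand out a pointer
   to the characters, so it can go wherever a char* is expected. */
char *lenstr_new(const char *src)
{
    size_t  len = strlen(src);
    size_t *box = malloc(sizeof *box + len + 1);
    if (box == NULL)
        return NULL;
    *box = len;
    char *s = (char *)(box + 1);
    memcpy(s, src, len + 1);
    return s;
}

size_t lenstr_len(const char *s)
{
    return *((const size_t *)s - 1);   /* read the length stored in front */
}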
I like the idea of breathing more life into C. Look at what's happened to C++. These days, when I need the efficiency, I prefer coding in C to C++. But I'm not sure Cello is the right way to make C better. If you're starting a new project, what's wrong with a struct that has an array and length of the array? Cello seems to break the simplicity of C and its memory model.