A Review of the Zig Programming Language (Using Advent of Code 2021) (duskborn.com)
341 points by mkeeter on Dec 27, 2021 | 270 comments



Computer language inventors are torn between a voice whispering “use the language Luke” and a more gravelly “let your feelings for the compiler grow, embrace the syntax side.”

I did a two-day sprint through Zig a month ago and really really liked it. It has some quirks that I would have done differently, but overall I think it’s pretty cool. I just need a good small scale project to try it out on now.

My favorite example of the “use the language” ethos is the way Zig does generics. I have hated generics/templates in every language I use them in. They are the Gordian knot of confusion that most languages in pursuit of some sort of unified type theory impale themselves on (yes, I’m mashing metaphors here). But Zig just uses its own self to define user types and generics come along for free. It resonates (with me at least) as a realization of what Gilad Bracha has shared about how to reify generics.

[1] https://gbracha.blogspot.com/2018/10/reified-generics-search...


The way zig does generics is brilliant, and made me think "this is how every language should do it". One of those things that seems obvious in retrospect: just pass the types as normal parameters, and for generic data structures just represent their type as a function that receives those parameters and returns a new type. Man that's just beautiful. Only downside is, I suppose, that the types can never be inferred, so they always need to be explicitly passed. But being explicit seems to be their thing anyway.
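Roughly, the pattern looks like this (a minimal sketch with made-up names; the fixed 16-slot buffer is just to keep it short):

    fn Stack(comptime T: type) type {
        return struct {
            items: [16]T = undefined,
            len: usize = 0,

            const Self = @This();

            pub fn push(self: *Self, item: T) void {
                self.items[self.len] = item;
                self.len += 1;
            }

            pub fn pop(self: *Self) T {
                self.len -= 1;
                return self.items[self.len];
            }
        };
    }

    // The "instantiation" is just a function call with a type argument:
    const IntStack = Stack(i32);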


Zig's approach isn't unique, it's basically what dependently typed languages do. The problem is that if you don't do it right, it's too expressive and the type checker can loop forever.

Of course, this is also true of C++ templates since they're Turing complete, but it's not true of generics in most languages.


This isn't a gripe directed at you specifically.

I've noticed that when somebody says they like a growing language (Rust, Zig, ...) for this or that feature, people often come out of the woodwork to claim that it isn't unique, that some research project did it 5 years ago, etc. etc.

First, that's not even what they were saying. They like a feature of the language, they weren't making a claim about the novelty of it.

Second, even if they did erroneously make a claim about the feature's novelty, I think the theory type of people dismiss these sorts of languages too readily. Yes, somebody did it before. But it is usually very hard to bring these features into the mainstream or even adjacent to it.

It's just annoying when we're appreciating a language and the work that went into it and somebody pops their head in and says "Ackshually, Joe Gringoff published a paper in '95 detailing that exact thing, so it isn't anything new." Like I'm trying to enjoy "Samson and Delilah", I'm not really thinking about who did that kind of lighting first, so why are you using the lack of novelty to diminish the effort put into the lighting? If you want to say, "Fun fact, Giorno Capucelli was the first one to popularize that kind of lighting!" Then that's cool, but instead these people always use these facts to diminish something else instead of enhancing it. Just let me enjoy their handiwork!


I'm going to bring it back to the comment you are responding to. Its saving grace is that it introduces to the thread the idea that "it's basically what dependently typed languages do". That has a couple of benefits: the commenter he's responding to can find out if that's true - https://en.wikipedia.org/wiki/Category:Dependently_typed_lan... (Zig isn't on there; I see others are debating the point). If it does turn out to be true, the person might find a whole class of languages that share a feature he seems to really like. So while I kinda get your frustration, at least in this case it seems useful.


I argue to the contrary that it's quite disappointing to see features developed without referencing previous work. The annoying "Actually, so-and-so" is really just a symptom of the general lack of citations.

Maybe it's possible to design a language de novo without being aware of previous designs in the space, as Zig seems to be doing, but it seems quite dangerous - one wrong step and you've locked in a bad design. Whereas using proven designs (as Rust claims to be doing in their FAQ) is at least making use of some form of validation.


and what was the proof of these existing designs? peer review?

There is nothing new under the sun. What matters is what is /useful/ and that seems to be the focus for Zig, to be useful. The proven design it tries to improve on is C.


Really? You're using the Bible to argue with me?

Computing is still a young field, only 80 or so years old at this point. There are many areas that haven't been explored. For example polyhedral optimization only became a thing in the past 20 years. Zig only gets away with being shallow because LLVM does all the heavy lifting. If you follow LLVM you will find a steady flow of academic papers (it even started as an academic project).


When someone says "I like feature F in system S", I for one appreciate the response "F is actually called F' and exists in a purer form in system S'". It broadens my view of F in two ways! There might be nicer ways to phrase the response (along with links, maybe), but I for one appreciated the direct and very googleable approach the comment you're responding to took.


Comments like that provide context that you could use as search terms for finding more documents from which you could learn more about the feature.

If you're interested in programming languages in particular, the earlier papers could tell you about the philosophy and rationale about how and why the feature came about in the first place.


While I understand your reaction, I wasn't really diminishing anything. The OP I was responding to was saying other languages should do generics like Zig, and I basically just replied that other languages have done it that way (or rather, that Zig did it their way), and also pointed out some common pitfalls.

Zig may of course have found a better set of tradeoffs to make this feature more ergonomic, but I think this is more due to comptime than generics specifically. comptime would restrict the expressiveness that would normally lead to undecidability when reifying types as values, if I understand its semantics correctly.


> Second, even if they did erroneously make a claim about the feature's novelty, I think the theory type of people dismiss these sorts of languages too readily. Yes, somebody did it before. But it is usually very hard to bring these features into the mainstream or even adjacent to it.

It is, because people are unwilling to use them in existing languages. So it's extremely frustrating when what feels like the same people who've been trashing Haskell for ten years get all excited about how their new trendy language has this brilliant new feature... that is the same thing they were complaining about being pointless academic complication when Haskell did it.



Most of that cleverness requires generics + something else, usually some kind of subtyping. This is the case at least for Java, Haskell, Scala and Rust. Generics alone typically do not admit Turing complete expressions.

In Zig and dependently typed languages where types are ordinary values, you typically have the regular looping/recursion available, and so Turing completeness follows naturally unless you take steps to mitigate this. Zig makes a hard distinction between runtime and comptime which might solve this, and dependently typed languages have termination checkers which extend this sort of distinction in more flexible ways.


Dependent types go beyond what Zig does by removing the distinction between comptime variables and runtime variables (so types can depend on runtime variables). Zig goes beyond dependent types in the sense that the comptime/runtime distinction allows Zig to handle all comptime values at compile time, which is important for efficiency. It would be interesting to combine the two approaches, via partial evaluation or staging.


To my understanding, comptime is a form of partial evaluation. PE is typically done by a "binding time analysis"; comptime declarations are explicit binding time declarations.

You can also see Zig as a three stage language (comptime+compile time+runtime), where most compiled languages are two stage languages (compile time+runtime). There's a lot of overlap here.
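A minimal sketch of those explicit binding times (scale is a made-up name):

    // `n` has an early binding time (comptime): it is specialized away during compilation.
    // `x` has a late binding time (runtime): it remains a variable in the generated code.
    fn scale(comptime n: u32, x: u32) u32 {
        return n * x; // each comptime n produces its own specialized multiply-by-constant
    }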


> the type checker can loop forever

Oversimplifying: Zig gives you "compiler branching tokens" so your type system can't loop forever, if I'm not mistaken


Kind of how Ethereum uses "gas" to make sure that bad/lazy actors won't waste CPU cycles when running smart contracts.


Sure, but you aren't going to run out of compiler tokens unless you are trying to do something really crazy, like generate a precompiled table of prime numbers, as I have done. And I don't think you use them up in general, just when your compiler is taking certain branching operations (I don't actually know what the rules are)... I believe you can have a long program that consumes as many tokens as a short program, if their type systems are the same and you don't ever use compile-time branches in the functions you're writing.
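For reference, the mechanism behind those "tokens" is the @setEvalBranchQuota builtin, which raises the limit on backward branches during compile-time execution (the default quota is 1000). A rough sketch of the prime-table situation - the quota figure here is just a guess at generous headroom:

    test "comptime prime table" {
        const primes = comptime blk: {
            @setEvalBranchQuota(200000); // the default would be exhausted by the loops below
            var sieve = [_]bool{true} ** 10000;
            sieve[0] = false;
            sieve[1] = false;
            var i: usize = 2;
            while (i * i < sieve.len) : (i += 1) {
                if (!sieve[i]) continue;
                var j = i * i;
                while (j < sieve.len) : (j += i) sieve[j] = false;
            }
            break :blk sieve;
        };
        try @import("std").testing.expect(primes[9973]); // 9973 is the largest prime below 10000
    }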


Do you have a preferred (favorite) implementation of dependent types?

Quickly scanning the list on wiki... Ada is the only imperative language listed.

Corecursive has had a few episodes about dependent types. Here's two:

https://corecursive.com/015-dependant-types-in-haskell-with-...

https://corecursive.com/023-little-typer-and-pie-language/

Learning more is on my to-do list. To noob me, the descriptions remind me of Eiffel's constraints (asserts, pre/post-conditions). And maybe user defined typedefs, like constraining dayOfWeek to values 0..6, a bit like enums.


It looks very nice and I look forward to using it.

A downside for package authors (once Zig gets to the point of having a package ecosystem) will be that type errors are checked only with the comptime parameters that are actually used at call sites. To maintain compatibility, authors of public APIs will need tests with commonly-used comptime parameters.
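A hypothetical illustration (Pair and total are made-up names):

    fn Pair(comptime T: type) type {
        return struct {
            a: T,
            b: T,

            // Fine for numeric T; for a struct or slice T this only becomes a
            // compile error at a call site that instantiates Pair with such a T
            // and actually references total().
            pub fn total(self: @This()) T {
                return self.a + self.b;
            }
        };
    }

A library test that exercises Pair(f64).total() says nothing about, say, Pair([]const u8).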

It seems like a good way to avoid lots of type-level complexity in the language, though.


I've grown used to the idea that generics (data structure macros, C++ templates...) aren't that useful. If I find myself in a situation where I'm thinking of a solution that involves generics, I stop and ponder what is actually the essence of the repeated stuff. It rarely is on the syntactic level, often it runs deeper. Probably the commonalities can be distilled into a data structure.

Simple example: Intrusive linking headers (e.g. Linux kernel list.h). While those can benefit from an optional very very thin generics layer on top, essentially the code for linking e.g. nodes in a tree should be the same regardless of the data structure where the headers are embedded.

Getting this right simplifies the code but can also speed up compile times.
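A rough Zig rendition of that idea, with made-up names (@fieldParentPtr is shown with its 2021-era three-argument form):

    const Node = struct {
        prev: ?*Node = null,
        next: ?*Node = null,
    };

    const Task = struct {
        id: u32,
        link: Node = .{},
    };

    // All link manipulation is written (and compiled) once, against *Node,
    // regardless of which struct the header is embedded in.
    fn insertAfter(anchor: *Node, new_node: *Node) void {
        new_node.prev = anchor;
        new_node.next = anchor.next;
        if (anchor.next) |next| next.prev = new_node;
        anchor.next = new_node;
    }

    // Recovering the containing Task from its embedded header:
    fn taskOf(node: *Node) *Task {
        return @fieldParentPtr(Task, "link", node);
    }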


How do you write a generic quicksort function without generics?

There are two ways I know of: (1) throw type and memory safety out the window and use void pointers plus size/alignment like qsort(3) does; (2) require that users manually write an interface with a swap(int i, int j) function, as pre-generics Go does. Both solutions are really bad.
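For contrast, a minimal sketch of the generic version in Zig - a hand-rolled insertion sort with made-up names, not the stdlib API - where the element type and comparator are compile-time parameters:

    fn sortAsc(comptime T: type, items: []T, comptime lessThan: fn (T, T) bool) void {
        var i: usize = 1;
        while (i < items.len) : (i += 1) {
            var j = i;
            while (j > 0 and lessThan(items[j], items[j - 1])) : (j -= 1) {
                const tmp = items[j];
                items[j] = items[j - 1];
                items[j - 1] = tmp;
            }
        }
    }

    fn lessThanU32(a: u32, b: u32) bool {
        return a < b;
    }

    // usage: sortAsc(u32, my_slice, lessThanU32);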


Void pointers also add an unfortunate layer of dereferencing.

In C# for example, generics are important to keep values on the stack instead of the heap, not only avoiding garbage collection but improving data locality.


Well, a modern optimizing compiler can get rid of the indirection if it inlines qsort() and then inlines the comparison function. Of course, it needs to be able to see the source of qsort() to do so (which might be a problem if it's dynamically linked).


If I'm understanding correctly, you're saying a compiler will optimize out "void*" to instead be the value put into it, so there's no pointer dereference to get to it?

That's what I am talking about. I don't know much about modern C optimizations.


There isn't any pointer to optimize out really, I don't think - at the machine level, a pointer is a pointer is an integer number regardless of its type at the source code level.

What can be measurable though is function call overhead - if you write a sort function that takes a function pointer and your sort function calls that, there is a lot of work for saving the local state, preparing all the arguments for the function, etc. That work can potentially be avoided if the compiler can inline the function call into (a copy of) your sort function.


In the case of qsort, void* isn't a problem, since sort() would always take a pointer, but there absolutely is a difference at the machine level between void* and int - one is a value, the other is the address of a value. One is used for calculations, the other is used for calculations OR loading data from memory.


At the machine level there is absolutely a difference. One means your data is right here, the other means the data is elsewhere. That poor data locality can cause cache misses and always consumes extra cycles.


What is "right here"? If the computer is to sort an array that is located in main memory, it needs to know the elements' addresses (pointers) to even load them. There is no way around that.


It depends on how many times removed it is.

Let's say it's four 8-bit ints. They can sit sequentially and directly in memory like:

  stack: [0x01, 0x02, 0x03, 0x04]
Each int is directly next to the others in memory. But it could instead be pointers to the ints, like:

  stack: [0x07002412, 0x0700241A, 0x07002424, 0x070036A0]

  heap (or stack, really):
  0x07002412: 0x01
  0x0700241A: 0x02
  0x07002424: 0x03
  0x070036A0: 0x04
The CPU needs to dereference that pointer, and the cache is getting all screwed up.

That is what "void*" is doing. You're using a pointer instead of a value.

In C# this is what generics allow you to avoid.

  // This is like my first memory layout above.
  struct Foo<T> { public T MyField; }
vs:

  // This is like the second layout above.
  struct Foo { public object MyField; }


Among other things, yes, compilers will do that.


That seems strange, because you'd think you might need a guarantee that the pointer remains a pointer for ABI purposes? Or does it only do this in very limited circumstances? Do you know the term I can use to Google more info about this?


It only works if you use link-time optimization or the qsort() definition is otherwise available for the compiler to see.


The similarity of this optimization and what is typically thought of as "generic programming" is interesting.


I rarely ever do need a sorting algorithm for what I do. I might not have done even one sort call in 2021. Usually the data I have is sorted by construction, or doesn't need to be sorted in any particular order.

When I do need to sort, I just use qsort from libc. It's also easy to write my own version of qsort.

I once measured qsort vs std::sort on plain ints (basically the most pessimistic case for qsort vs std::sort) and the difference was like 2x. If this becomes a bottleneck then it's worth investigating how to take advantage of additional context to speed up the sorting much more than any generic sorting algorithm could do anyway. Simple example: bucket sort. (I'm 100% confident sorting performance hasn't ever been noticeable in my career so far, but I've done simple optimizations like that once in a while for fun).

> Both solutions are really bad.

Due to what I described above, I'm actually in favour of qsort - no new code is generated, just a simple library call that takes a pointer to the comparison function. Really really easy to use.


Maybe it’s lack of imagination but I cannot see how qsort is easier to use than std::sort.

In particular because std::sort is templated you can use anything callable including a lambda or function pointer as the comparator. But if you already have operator< defined for the type you don’t even need to define a comparator.

It seems like std::sort is shorter and simpler to write, safer, faster and generic across container types (including c arrays).


I _believe_ std::sort used to be more complicated because you had to define a class with a compare method for custom comparison functions, or something like that. Looking here https://en.cppreference.com/w/cpp/algorithm/sort , I cannot be arsed to find out what works since which version of C++. Come on, look at that link and tell me with a straight face you can't grok the qsort() signature much more easily than that mess.

Other than that, the STL is generally slower compiling compared to only including plain C headers, which is also taxing on the "ease of use" department.


`sort(begin(foo), end(foo))`

Has worked in C++ for ~25 years as long as `foo` is a container of items with an `operator<`. It has never required defining a separate class. It will also optimize better than qsort when qsort fails to inline.


Your main point is correct, but std::begin and std::end were added in C++11. You would have used pointers or begin/end member functions for most of those 25 years.


qsort is a horrible function, and infamously so. First, since it's part of the C stdlib, it is almost universally dynamically linked, precluding any chance for it to be inlined, and even worse, it calls a function pointer.

Even ignoring performance, it has no support for checking the types, which in C means it can easily mess up memory badly. For example, I can call `qsort(arrayOfInts, num, sizeof(float), funcThatComparesTwoStrings)` and I'll get a really bad time, with absolutely no way for the compiler to tell me I'm doing something obviously wrong.


It does not matter. Your concerns are entirely academic. Simplicity wins.

> I can call `qsort(arrayOfInts, num, sizeof(float), funcThatComparesTwoStrings)`

Don't do this. Do "qsort(a, n, sizeof *a, compare)"

> with absolutely no way for the compiler to tell me I'm doing something obviously wrong.

In the event that it does not work, two things are very likely true. 1) it was a really dumb mistake. 2) you will notice immediately and can fix the mistake.

The idea that everything must be caught by the compiler is narrow-minded and ultimately can't be achieved. Stop worrying about hypothetical problems that aren't problems in practice. There are enough real problems to work on.


> you will notice immediately

No, you won't notice immediately. You will only notice when you run your code, because it will crash. And it might not crash right away, because the bad code path is not being exercised, so you have a crash waiting to happen at some time in the future, which might have been avoided with a competent type system.

Users of statically typed languages have a different understanding of "immediately" than you do. If the compiler knows my code will crash, it should simply refuse to compile it until I fix it.

> The idea that everything must be caught by the compiler is narrow-minded and ultimately can't be achieved.

Ah, the famous fallacy "We can't reach 100% so let's just be content with 0%".

That's not how the world works. Static types provide not just no-crash guarantees but plenty of other advantages (automatic refactorings, performance, code that's easier to read and maintain, etc...).

You have just gotten used to a crappy developer experience because you are using a dynamically typed (or very badly typed) programming language. Get out of that boiling pot and demand more from your programming language; there are some very good ones available today which will make you go back to your statements above in a few years and think "Yeah... that was a pretty silly thing to say, but I know better now".


> You will only notice when you run your code

How did you infer that that's not immediately?

> If the compiler knows my code will crash, it should simply refuse to compile it until I fix it.

Sure, but what if it doesn't know that? Then I won't complain about the next best opportunity, which is pressing F5 once and seeing that the code I just typed doesn't work as intended.

> Static types provide not just non crash guarantees

How about no?

> Ah, the famous fallacy "We can't reach 100% so let's just be content with 0%".

If I use a void-pointer parameter once in a blue moon for some convenience (a callback function pointer), how is that 0% and not 99%? I recognize I've written too many comments, seeing as the replies are starting to get extremely poor. Ignoring the rest.


Since type safety is entirely academic, you should take a look at void (https://github.com/kyouko-taiga/void-lang), it seems like the perfect programming language for you.


You should take a look at what I wrote, because I did not make any such claim.


I've taken a look at pretty much all of your comments on this story, and that seems to be the logical conclusion of your stance on generics. I'm not really sure how you could think otherwise and write the comment I replied to, and lots of others in here.

To someone that said "Even ignoring performance, it has no support for checking the types, which in C means it can easily mess up memory badly." you replied "It does not matter. Your concerns are entirely academic. Simplicity wins.". Type safety is a real concern, and lack of it has consequences in the real world, often in the form of bugs or vulnerabilities, but you seem deaf to any arguments so I don't see the point of arguing with you. Instead, I've linked you something that you might like.


My point is there is no need to be so preoccupied with a rarely used function that requires giving up a little safety (in a place with very little "code entropy" even). Doing so does not equal giving up type safety completely, not even 1% of it. There are way bigger issues to address even in the best of codebases.

> Type safety is a real concern, and lack of it has conséquences in the real world, often in the form of bugs or vulnerablities

To me type safety is first and foremost a usability feature - already a primitive type system like C's saves tons of time when changing code. But is it a great concern with regards to bugs and vulnerabilities? Type safety is static, once you've run a particular section of code, you can be relatively sure that there are no type issues on that path. (Granted that to run all code sections even once can be lots of work). That is contrary to other types of bugs which might only show up on rare occasions, and possibly require malicious intent.


Type safety is a soundness guarantee and can be, in some implementations, a usability feature.


I don't know what you do, but for our codebase, we run transactions against the database in most of the integration tests. We can't generally be sure the database won't return rows in some order it likes that aren't the rows we expect -- so if we don't have a generic sorting function we will have flaky tests.


If you can't get the database to return the values in a well-defined order, absolutely go ahead and sort it! No need for _generics_ though, qsort() will do just fine.

(C probably isn't great for DB interop in any case).


You may not need to sort, but std::rotate? std::partition? std::all_of? The list goes on. Without templates, these wouldn’t provide the value they do. (Thanks Stepanov!)


Use Smalltalk?


Don't you mean "If you use a late bound language, you don't need to chase down the type rabbit hole and encounter the nasty teeth that are generics along the way"?

An example of which is Smalltalk, but could also include Dart 1, Python, Elixir, CLOS/Lisp, JavaScript, etc.

Smalltalk is an interesting exhibition of a similar principle though. If you did add types to Smalltalk, you'd likely solve this containment parameterization problem the same way you do nearly everything in Smalltalk: by binding some new behavior (adding a method) to some data (a class object). You'd add a message to the MetaClass types that could create flyweight class types parameterized on the fly. And when you compiled, you'd send messages to the instance of the class itself to manufacture that synthetic on-the-fly type. Hand wave, hand wave, hand wave.

The "let's see where navel-gazing on a unifying principle goes" approach employed in Zig is to use the language itself to recurse on itself when defining itself, types included.

What happens in more traditional/mainstream/Algol derived languages is that each part of the system gets a sort of sub-department of the language. You have a set of keywords/syntax for expressing run time behavior. And then someone says, but how do we actually bring data to life--a sort of "Creatio ex nihilo" problem--and the answer is we need a new department in the language, so we get keywords/syntax for constructors. When behavior/construction needs to be parameterized across compilation paths, we get another set of keywords/syntax. And another for type definition. And yet another for scope management. And then again with meta things. And maybe a preprocessor.

I do think there's a trick to finding a sweet spot in these "turtles all the way down" systems. In Smalltalk, I was happy with the "don't know what's going on? just follow the message sends" approach, while it still provided enough structure/composition that you don't have to be reflecting/introspecting all the time. Forth is even simpler, but I found I had to constantly recompile things in my head to resolve them. I'm relatively new on the Elixir curve, and am beginning to worry this happens with macros there. Figuring out how Plug works really messes with my brain. I haven't done enough with Zig to know where its comptime ends up in this balance.


Generics are the way that useful semantics can find their way into useful libraries. A collection of useful libraries is what is needed to make a language useful.

This is why C++ usage is still growing fast: almost every new feature makes it possible to write more powerful libraries that get, thereby, easier to use.


> This is why C++ usage is still growing fast: almost every new feature makes it possible to write more powerful libraries that get, thereby, easier to use.

LOL no, it's growing because people can't get the performance and battery life they want out of other languages and it's a hugely entrenched player with a gigantic existing ecosystem and is able to seamlessly call into all existing C code as well. Language features after C++11 (arguably C++98) have very little to do with new growth except keeping interest by showing it's alive and spurring book sales and conference tickets. Most libraries still target old versions of the standard and don't use most new features. C++14 vs C++17 vs C++20 makes very little difference for 99% of users. Most people are not using polymorphic allocators to deserialize Unicode inside their constexpr DSL parser.


C++ pre-11 was in decline. Since 11, it has exploded. Every ISO meeting, 3 times/year, had more attendees than any previous meeting. It is impossible to explain that change except by the extra features that showed up in 11, and subsequent Standards. Many of the important new features are in the standard library, that are useful building blocks for other libraries.

Most of the use of new features has been to make libraries more powerful, and easy and safe to use. So, users of those libraries don't need to use the new features directly to benefit from them.

Which was my point.


This is a blanket assertion. Why should useful semantics require generics? Why can't they come with simple fixed data structures? If you can provide a nice counterexample to the example that I gave, this would be more convincing.

Absent the absolute necessity of generics to support essential semantics, I'm probably in favour of a simpler generics-less version. I don't see why "more powerful libraries" get automatically easier to use. It could often be the opposite.

There is a tradition in some C++ subcultures where they try to cram in as many invariants as possible into types, concepts, templates, etc. at all costs. I tend to think that if all this heavy machinery is needed, the functionality might be too complicated from the start.


If you were writing a library, and a language feature made it possible to make your library easier to use safely, why would you not?

You might as well say, "I don't see why a more expensive dinner has to be tastier."

It is always easy to make libraries that are hard to use, in any language, but those do not become popular if there are better alternatives.

We need generics because users have types that they need libraries to work with. Your own example, of an intrusively-linked list, illustrates this: without generics, you could not write a linked-list library component that would be usable for that. This is why C programs are crammed with so many custom one-off hash tables: You cannot code a useful hash table library in C.

Libraries start simple, and accumulate your "heavy machinery" as they are made more useful and usable for their growing family of users. A library that lacks users does not.


It's more like, "I don't see why a tastier dinner has to be more expensive". It's a tradeoff situation, and while I might be willing to buy a great $50+ meal from time to time instead of just a $20 one, I don't have any inclination to pay $1000 even if the meal is a tiny little bit better than the just-great one.

It's a matter of tradeoffs. Library design is a balancing act.

> We need generics because users have types that they need libraries to work with.

This is right where I become sceptical. Most libraries shouldn't care for the user's types at all. They should expose their own types so you can work with the library - not the other way around.

Please provide an actual use case where the library has to "know" about the user's types (and please don't mention std::sort. It's as common an example as "Dog :: Animal" is for arguing class inheritance, and just as irrelevant).

> Your own example, of an intrusively-linked list, illustrates this: without generics, you could not write a linked-list library component that would be usable for that.

As I mentioned, linked lists can tangentially benefit from a very very thin generics layer on top of a fixed data structure implementation. The layer does nothing more than "instantiate" the types, but there is no code generated. But then again, the added convenience / safety is minimal; I once wrote a C++ implementation that I was quite happy with, and never used it.

> This is why C programs are crammed with so many custom one-off hash tables

There are probably not huge issues with a design like "HashTable ht = { .hash = &foo_key_hash, .equals = &foo_key_equals }" (if quick results are what you're after), but yes it is nice to have a few lines of type-safe wrapper generated over some generic container interface.

An alternative reason why you'll find a good amount of custom hashtables in C code bases is, I suppose, that there are a lot of different ways to implement hash tables. Also writing a little code here might not be so bad. I've had 1 use case for a "hash" table in 2021 (glyph hash table for a new iteration of my font rendering code) and I've gotten away without even implementing it - the for-loop I've written has never shown up in any performance profile.

Considering that container data structures are probably the highest profile application for generics, I'm still not convinced that generics are needed in a systems programming language...


> This is right where I become sceptical. Most libraries shouldn't care for the user's types at all. They should expose their own types so you can work the library - not the other way around.

This is an utterly bizarre statement. Should people convert collections of Xs from library 1 to other kinds of collections of Ys to work with library 2?

A very basic use case that shows up all the time everywhere, especially in systems programming, is that I have an array of items and I want to pass it to some library. But, to work with an array, you need to know the size of elements of the array, so that arr[x] knows how to compute the address of x. There are exactly 3 ways to achieve this:

1. The library only accepts arrays of some specific type (e.g. int[]). If you have some other type, you have to find some way of converting to the type known in the library - in O(n) time.

2. The library expects arrays of some fixed-size type that can hold any value (e.g. void*[]). This typically adds huge overhead, since now the elements of the array are not together in memory, they are scattered all over the place; and the library must fetch them from memory before doing anything with them.

3. Generics - the library accepts T[], and can easily do sizeof(T) to know how to access arr[x].

Even qsort is actually an example of C's convoluted support for generic arrays: you're explicitly passing the array base and its element size (a number of bytes) to qsort. C's extremely basic type system essentially identifies types with byte sizes in practice anyway, so it's not obvious that qsort works like this (and of course, it's less safe, since instead of passing an int, you have to pass void* + sizeof(int), potentially getting them mixed up).
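A minimal sketch of option 3 in Zig terms (sum and the numeric element type are just for illustration): the slice []const T keeps the elements contiguous, and the compiler knows the element size, so indexing is direct with no void* erasure.

    fn sum(comptime T: type, items: []const T) T {
        var total: T = 0;
        for (items) |x| total += x;
        return total;
    }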


> A very basic use case that shows up all the time everywhere, especially in systems programming, is that I have an array of items and I want to pass it to some library.

My thinking is this. Either a library function cares about what type of data you pass to it because it is designed to do some operation on it - then it will accept only a specific type, like "void func(Foo *foos, int count)". Or, the function does not care about the type of data but just wants to send it somewhere else. In this case, you can pass a void-pointer + size to it.

Interfaces like sorting functions can be seen as exceptions under the umbrella term "container library". Like the qsort() example, usually all you need is a few metrics for the data type - size, maybe alignment - and at most a few "method calls" - equality, simple get/set operations - and that's it. For all I can say these situations are not very common in systems programming, at least not for what I do. This stuff tends to come up in application level programming. It's definitely very common in my Python scripts. When it does come up in systems occasionally, a vtable struct is a good way to deal with the situation generically. In the qsort() case, the function pointer can be passed directly.

> 1. The library only accepts arrays of some specific type (e.g. int[]). If you have some other type, you have to find some way of converting to the type known in the library - in O(n) time.

Considering how computers work, I don't think it is much of a constraint to assume that arrays are of the form "Foo *foos". Coincidentally C _does_ have builtin "generics" for arrays of things, and real machines do have assembly instructions to quickly compute addresses based on hardcoded element size.

> (and of course, it's less safe, since instead of passing an int, you have to pass void* + sizeof(int), potentially getting them mixed up).

There is little real chance of getting them mixed up. If you actually run the code you will notice.


> void-pointer + size to it

How do I make sure they always stay in sync? (adding/removing elements)

Because the ever-increasing number of CVEs where brilliant programmers screw this up says doing this manually is prone to error.

Sure, you could do everything without generics. But it's a lot more error-prone. Generics, in combination with other language constructs, can provide better safety.

Generics are also a tradeoff: library code becomes more complicated, but its usage is easier to read.


You don't. A pointer is a pointer, an int is an int. Memory is memory. Dynamic arrays that support "adding or removing" elements are a high-level feature that doesn't mesh well with direct memory manipulation. If you are really really concerned you can try to code in Rust with its ownership tracking system. Otherwise, try to write simple enough code and test it as well as you can. You can use a language or compiler switch that at least supports "fat pointers" for implicit bounds checking, but this won't deal with reallocations.

But in general "systems programming" means "buffer processing", i.e. a lot of "streaming", and most buffers are short-lived - they are never overwritten and you drop them soon after they were created - or they never change in size. (Reallocation incurs unnecessary copies, and there are many good reasons to chunk large data sets up into fixed-size pieces.) All this makes "stay in sync" very much a non-problem. I haven't had to worry about this in years. It's almost an academic exercise that only exists to prove that we need garbage collection ;-)

> Because the ever increasing number of CVE's where brilliant programmers who screw this up, says doing this manually is prone to error.

If you code in a memory unsafe language like C, things can go horribly wrong and your code can be exploited. If you code in a memory safe language, you can end up with horribly slow and unmaintainable code and you might still be exploited.

> Generics can in addition with other language construct provide better safety.

Better than using a hash table with generics, is not using a hash table at all...


Or, best, not even programming at all?

Or, in practice, hiring somebody who actually understands how to program, understands what makes a library or language a good choice, and understands how to get the most value from both, without insane blinders.


You miss the point. You can get more money for the dinner only if you deliver more. Your $20 meal delivers more than your $2 meal. If you can only deliver the $2 meal, you aren't getting $20 for it.

I have already cited two separate examples, which you have studiously ignored. Look online and find them in their thousands. It is hard to find even one C++ library that is not made better by its ability to integrate with users' choices of types.


Updated to not ignore your examples.


If std::vector, std::array or std::unique_ptr could somehow have been implemented better without the use of templates in the C++ standard library, I think you'll find that they would have been implemented that way.

I don't know what you mean by "essential semantics", but the fact remains, as ncmncm says, that templates are the feature that allows new types of type safe generic abstractions to be built and added to the arsenal of C++ programmers. It's the feature that, most of all, has kept the C++ language alive and thriving over the past 20 years, in my opinion. Without it, the language would probably be dead.


> This is a blanket assertion. Why should useful semantics require generics? Why can't they come with simple fixed data structures?

What do you mean by "fixed data structures"?

> If you can provide a nice counterexample to the example that I gave

I may be missing something but I don't understand your example. The idea of a list is that we want it to hold any data, so the API to a list by necessity becomes generic. Unless you cast everything to void*.

---

I have another counterexample from userspace. Our internal APIs all use the same conventions, which makes working with them predictable and the same from any language. An API returns:

  {
     result: [a list of data],
     nextPageToken: 
  }
Different APIs will of course have different data. Some API will return a list of contracts. Another API will return a list of media. A third API will... etc.

I had the misfortune of working with these APIs in Go before generics. Welcome to copy-pasting code that deals with this. Or cast from interface{}/void* to proper data in runtime. Any sane language of course lets you write something like GetAllData<T>()


> the code for linking e.g. nodes in a tree should be the same regardless of the data structure where the headers are embedded.

Why should that be so? It's particularly NOT true for one of the simplest and most efficient data structures in any program: the array. The code that needs to work with an array needs to know exactly the size of each element of the array, so you basically can't have an efficient language that doesn't support generic arrays. Haskell maybe sometimes gets by with a very smart compiler and heavy use of laziness, but even C has support for generic arrays.


You can totally make array iteration generic, just pass the element size. Not saying it is necessarily a good idea.

I agree that there are cases where the "there should be a single central piece of code" argument is not a strong one, because the generic code is so simple that it should be inlined into the usage code. Array iteration is one example - but note that sorting code, for instance, while the sorting algorithm might be generic, is more than just generic array iteration.

Doubly linked lists already profit from shared code in some ways, and definitely I'd find it a bad idea to replicate for example tree mutation code (RB tree or similar) for each node type. Embedding the header structure and keeping the links just between these embedded headers, to be able to reuse a common (compiled) piece of code is a very good idea here IMO.


> Initializing arrays is weird in Zig. Let's say you want to have a 0 initialized array, you declare it like [_]u8{0} ** 4 which means I want an array, of type u8, that is initialized to 0 and is 4 elements long. You get used to the syntax, but it’s not intuitive.

Alternatively:

  var some_array = std.mem.zeroes([4]u8);
Though as mentioned later in the article the standard library documentation is not very good, making this not as obvious as it could be.

> Everything in Zig is const x = blah;, so why are functions not const bar = function() {};?

Good question, there's an accepted proposal to fix this: https://github.com/ziglang/zig/issues/1717

> The builtin compiler macros (that start with @) are a bit confusing. Some of them have a leading uppercase, others a lowercase, and I never did work out any pattern to them.

In idiomatic zig, anything that returns a type is uppercased as though it were itself a type. Since only a few builtins return types, the vast majority of builtins will start with a lowercase letter. I think it is only `@Type`, `@TypeOf`, `@This`, and `@Frame` that don't.


Any recommendations on learning more about what constitutes idiomatic Zig? This is an issue I have with learning any new language - it’s kind of hard for me to figure out what writing idiomatic code in that language looks like. I usually go looking for popular/high-quality projects and reading that code, but it takes away from the experience of actually just toying around, not to mention it being hard to know what a high-quality project is. Thanks in advance!



I don't think of this as an idiomaticity guide.


It's unfortunately the only comprehensive reference I know of besides the official docs.

There is "Ziglings", but those are a collection of small exercises with answers rather than a full guide.

https://github.com/ratfactor/ziglings

If you know of better resources than these two, please do share (not being passive aggressive here).


The style guide in the language reference explains the accepted naming conventions [0].

[0]: https://ziglang.org/documentation/master/#Style-Guide


Oh cool, I hadn’t seen this either! Thanks!

Also thanks for the other links, ziglearn is great.


oh man! I'd missed this! Thanks


Honestly, the standard library. But there should be an idiomaticity guide. At least for capitalization patterns, any other naming conventions, etc. (things that zig fmt can't capture). I don't know that this officially exists anywhere yet. Here is an example of what I'm talking about in my $DAYJOB lang:

https://hexdocs.pm/elixir/naming-conventions.html#content

edit: see sibling comment, apparently I never noticed the style guide in the standard docs


I don't know zig. Is one a constant initializer while the other is not?


It looks like the value returned by std.mem.zeroes will end up being a compile-time constant, but I'm only like 90% on this.


If you need to import a package to clear an array, something went very wrong somewhere...


In Zig zero initialization is not idiomatic. Unless you have an active reason to do so (and during AoC you need zero init a lot more than normal IME), you should just set the array to undefined like so:

    var foo: [64]usize = undefined;


Why?


Because it's trivial - it's like assigning a value to an integer; it shouldn't require a package.


Depending on your perspective, it's not trivial. It's significantly more expensive than assigning a value to an integer. Zeroing a [4096]u64 would require several thousand times more operations than zeroing a u64. In the areas that zig targets, this can be quite important.


I disagree, it's just backward to need to import a package.


I also did AoC 2021 in Zig: https://github.com/avorobey/adventofcode-2021

One thing the OP didn't mention that I really liked was runtime checks on array/slice access and integer under/overflow. Because dealing with heap allocation is a bit of a hassle, I was incentivized to use static buffers a lot. I quickly figured out that I didn't have to worry about their sizes much, because if they're overrun by the unexpectedly large input or other behavior in my algorithms, I get a nice runtime error with the right line indicated, rather than corrupt memory or a crash. Same thing about choosing which integer type to use: it's not a problem if I made the wrong choice, I'll get a nice error message and fix easily. This made for a lot of peace of mind during coding. Obviously in a real production system I'd be more careful and use dynamic sizes appropriately, but for one-off programs like these it was excellent.
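A small sketch of what those checks look like (made-up functions; the exact panic messages vary by version):

    fn get(buf: []const u8, i: usize) u8 {
        return buf[i]; // Debug/ReleaseSafe: panic "index out of bounds" instead of reading garbage
    }

    fn bump(x: u8) u8 {
        return x + 1; // Debug/ReleaseSafe: panic "integer overflow" when x == 255
    }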

Overall, I really enjoyed using Zig while starting out at AoC problem 1 with zero knowledge of the language. To my mind, it's "C with as much convenience as could be wrung out of it w/o betraying the low-level core behavior". That is, no code execution hidden behind constructors or overloads, no garbage collection, straight imperative code, but with so much done right (type system, generics, errors, optionals, slices) that it feels much more pleasant and incomparably safer than C.

(you can still get a segmentation fault, and I did a few times - by erroneously holding on to pointers inside a container while it resized. Still, incomparably safer)


> (you can still get a segmentation fault, and I did a few times - by erroneously holding on to pointers inside a container while it resized. Still, incomparably safer)

This is a severe problem, and I predict that this is going to cause real security issues that will hurt real people if Zig gets used in production before it gets production-ready memory safety. This exact pattern (pointers into a container that resized, invalidating those pointers) has caused zero-days exploited in the wild in browsers.


> This is a severe problem, and I predict that this is going to cause real security issues

That is a nasty problem, particularly in larger projects with different subsystems interacting (like say an xml parser and another).

I suspect it's worse in some ways, as Zig has good marketing as being a "safer" language despite still having the same fundamental memory flaws as C/C++. In the worst case that could lull programmers into complacency. I mean, it looks "modern" so it's safe, right? Just do some testing and it's all good.

I'm skeptical Zig will get production-ready memory safety. Currently the options are GCs or linear/affine types, and Zig doesn't appear to be pursuing either. Aliased pointers aren't something that's properly handled by ad hoc testing IMHO.


FWIW, "safe" doesn't appear anywhere on the Zig homepage. I've been trying out Zig for the past couple weeks, and while I love it so far, it gives anything but the feeling of safety. I would say there's guardrails, but those are optionally disabled in the compiler for faster execution.

It seems to be that Zig is really not trying to be a replacement for all programming, but fill its niche as best it can. If your niche requires memory safety as a top priority because it accepts untrusted input, Rust would probably be a better choice than Zig.


Reminds me of modern C++, where the language and standard library features all work together to increase safety. And then it backfires because you feel like it's safe, but in reality it's not, and bugs still happen, they just catch you off-guard.


Some sort of pointer tagging system, like 128-bit pointers where the upper word is a unique generation ID, might be the simplest approach to eliminate security problems from use-after-free, but it's going to have some amount of runtime overhead (though new hardware features may help to reduce it).

Alternately, use a GC.


Another option is something like Type-After-Type (make allocations use type-specific regions, so use-after-free is still fully type safe at least):

https://download.vusec.net/papers/tat_acsac18.pdf


Yes, something like that may work. Note that this approach also has time and memory overhead quoted in the paper. There's no free lunch.


If you write tests in zig you will probably find this using the testing allocator. Yes, I get that some people really don't like writing tests.


Many of the highest-profile memory safety security issues are in very well-tested codebases, like browsers.


What's your point? You're comparing apples to oranges.


The point is that "write tests" has empirically not been a satisfactory solution to this class of vulnerability.


I think you don't get it. This isn't "write tests to make sure the vulnerability doesn't exist", this is "as you're testing, all of your code is automatically scanned for these vulnerabilities".

For a big project like a browser, I would imagine the tests would include property tests, fuzzing, etc.

This is obviously strictly less powerful than a proof assistant, which, yes, Rust has, but we don't empirically know what the delta and the risk factor is between something like what Zig gives you and something like what Rust gives you... Moreover, I think it's likely that something like a proof assistant will be developed to track resources based off of Zig's AIR. This is something that would be costly, but you could write it as a "linter" that blocks commits as a part of CI.


> "as you're testing, all of your code is automatically scanned for these vulnerabilities".

For browsers, that's been done for years and years, probably even a decade at this point. Tooling for memory safety has gotten incredibly good.


Yes, when I invariably had to debug the first UAF in Zig I did pause for a bit and pondered my rust. It's definitely an argument against Zig that is unlikely to go away anytime soon.


Zig is not memory safe on purpose. So when you need or want that, you don't use Zig.


Zig apparently has valgrind support. Maybe it’s not turned on by default?


A better way to put it is "valgrind integration". It is enabled by default (when compiling for a target that has Valgrind support). Mainly it integrates with the `undefined` language feature which helps catch branching on undefined values. The nice thing you get beyond what C gives you, is that in Zig you can set things to undefined when you are done with them. Meanwhile in C, Valgrind is only aware of undefined for uninitialized variables.

But as others have pointed out, although Valgrind is a nice debugging tool, you would not run your application in it under normal circumstances. It's also not available on some important targets, such as macOS and Windows.
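A tiny sketch of that re-assignment to undefined (made-up function; Debug builds also re-fill such memory with 0xaa bytes):

    fn demo() void {
        var scratch: [256]u8 = undefined;
        scratch[0] = 42; // ... use the buffer ...
        // Done with it: the region is marked undefined again, so the Valgrind
        // integration can report any later branch that depends on it.
        scratch = undefined;
    }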


I don't think Zig has any particular Valgrind support, it's just a binary after all. In order to properly utilize valgrind though you're going to have to change from the GPA or whatever allocator you're using to the libc one so that Valgrind can trace memory allocations correctly via preloading.


Here is some kind of valgrind API [1] and here is a report from someone who tried using valgrind [2]. Yes, it doesn’t sound all that special.

[1] https://github.com/ziglang/zig/blob/master/lib/std/valgrind.... [2] https://dev.to/stein/some-notes-on-using-valgrind-with-zig-3...


Valgrind support is cool but it's not a solution to the problem.


"runtime checks on array/slice access and integer under/overflow"

I'm probably missing something. I feel like you'd get this and a lot of the other benefits you list if you just compile C/C++ with Debug options - or run with Valgrind or something. Are you saying you get automatic checks that can't be disabled in Zig? (that doesn't sound like a good thing.. hence I feel I'm missing something :) )


You're correct: you do get virtually all of the safety benefits of Zig by using sanitizers in C++. (Not speaking to language features in general, obviously.) In fact, C++ with sanitizers gives you more safety, because ASan/TSan/MSan have a lot of features for detecting UB.

Especially note HWASan, which is a version of ASan that is designed to run in production: https://source.android.com/devices/tech/debug/hwasan


The runtime safety checks are enabled in Debug and ReleaseSafe modes, but disabled in ReleaseFast and ReleaseSmall modes. They can be enabled (or disabled) on a per-scope basis using the `@setRuntimeSafety` builtin.
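For example (a made-up hot loop):

    fn sumUnchecked(items: []const u32) u32 {
        @setRuntimeSafety(false); // this scope skips bounds/overflow checks even in Debug
        var total: u32 = 0;
        var i: usize = 0;
        while (i < items.len) : (i += 1) total += items[i];
        return total;
    }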


What "Debug options" are you imagining will provide runtime checks for overflow and underflow in C and C++ - languages where this behaviour is deliberately allowed as an optimisation?

In C it's simply a fact that incrementing the unsigned 8-bit integer 255 gets you 0. Even though this defies what your arithmetic teacher taught you about the number line, it's just how C works, so a "Debug Option" that says "no, now that's an error" isn't so much a debug option as a different programming language.


> What "Debug options" are you imagining will provide runtime checks for overflow and underflow in C and C++ - languages where this behaviour is deliberately allowed as an optimisation?

-fsanitize=undefined.

> In C it's simply a fact that incrementing the unsigned 8-bit integer 255 gets you 0 even though this defies what your arithmetic teacher taught you about the number line it's just how C works, so a "Debug Option" that says no, now that's an error isn't so much a "Debug Option" as a different programming language.

Yes, but this happens to be defined behavior, even if it’s what you don’t want most of the time. (Amusingly, a lot of so-called “safe” languages adopt this behavior in their release builds, and sometimes even their debug builds. You’re not getting direct memory corruption out of it, sure, but it’s a great way to create bugs.)


That’s a distinction without a difference. Yes it’s defined behavior. No, there isn’t a strictness check in C++ nor a debug option that will catch it if it causes a buffer overwrite or similar bug. Your comment is basically “no need to watch out for these bugs, they are caused by a feature”.


Did you read the same comment that I wrote? The very first thing I mentioned is a flag to turn on checking for this. And I mentioned the behavior for unsigned arithmetic is defined, but then I immediately mentioned that this behavior is probably not what you want and that other languages are adopting it is kind of sad.


People read the comment that you wrote, in which you, in typical "real programmer" fashion, redefined the question so that it matched your preferred answer: you mentioned a flag that does not, in fact, check for overflow, and then clarified that you've decided to check for undefined behaviour, not for overflow.

[ saagarjha has since explained that in fact the UBSan does sanitize unsigned integer overflow (and several other things that aren't Undefined Behaviour) so this was wrong, left here for posterity ]

Machines are fine with the behaviour being whatever it is. But humans aren't and so the distant ancestor post says they liked the fact Zig has overflow checks in debug builds. So does Rust.

If you'd prefer to reject overflow entirely, it's prohibited in WUFFS. WUFFS doesn't need any runtime checks, since it is making all these decisions at compile time, but unlike Zig or indeed C it is not a general purpose language.


I would personally prefer a stronger invariant: overflows checked in release builds as well. Compile time checks are nice in the scenarios where you can make them work, of course, but not feasible for many applications.


> -fsanitize=undefined.

As you yourself almost immediately mention, that's not checking for overflow.

Was the goal here to show that C and C++ programmers don't understand what overflow is?

> Yes, but this happens to be defined behavior, even if it’s what you don’t want most of the time

The defined behaviour is an overflow. Correct. So, checking for undefined behaviour does not check for overflow. See how that works?


Sorry, perhaps I assumed a bit too much with my response. Are you familiar with -fsanitize=unsigned-integer-overflow? Your response makes me think you might not be aware of it and I wanted you to be on the same footing in this discussion.


I was not. So, UBSan also "sanitizes" defined but undesirable behaviour from the language under the label "undefined". Great nomenclature there.

It also, by the looks of things, does not provide a way to say you want wrapping if that's what you did intend, you can only disable the sanitizer for the component that gets false positives. I don't know whether Zig has this, but Rust does (e.g. functions like wrapping_add() which of course inline to a single CPU instruction, and the Wrapping<> generic that implies all operations on that type are wrapping)
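(Zig does appear to cover this, for what it's worth: the %-suffixed operators are the explicit wrapping forms, and there is a builtin that reports overflow rather than trapping. A minimal sketch, going by the 0.9-era language reference:)

    const std = @import("std");

    test "explicit wrapping arithmetic" {
        const x: u8 = 255;
        std.debug.assert(x +% 1 == 0); // wrapping add: always defined, never a safety panic

        var result: u8 = undefined;
        // reports whether the addition overflowed (builtin signature as of Zig 0.9)
        const overflowed = @addWithOverflow(u8, x, 1, &result);
        std.debug.assert(overflowed and result == 0);
    }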

But you are then correct that this catches such overflows. Thanks for pointing to -fsanitize=unsigned-integer-overflow.

Since we're on the topic of sanitizers. These are great for AoC where I always run my real input under Debug anyway, but not much use in real systems where of course the edge case will inevitably happen in the uninstrumented production system and not in your unit tests...


> It also, by the looks of things, does not provide a way to say you want wrapping if that's what you did intend

This would be something for C/C++ to add, which they (for reasons unknown to me) failed to make progress on. I applaud Rust for having them; they're table stakes at this point.

> Since we're on the topic of sanitizers. These are great for AoC where I always run my real input under Debug anyway, but not much use in real systems where of course the edge case will inevitably happen in the uninstrumented production system and not in your unit tests...

Right, they are not perfect. They're a bandaid; a valiant effort but even then not a particularly great bandaid. As I've described elsewhere, I don't actually think this situation is going to get any better :(


Runtime checks for signed overflow can be enabled with -ftrapv in GCC and clang. Having this option open is why some people prefer to use signed integers over unsigned.


C unsigned integers are completely well behaved: they do arithmetic modulo 2^n, and I hope you had a teacher that exposed you to that. C has many problems but that isn't one of them: overflow of unsigned is designed and documented to wrap around.


> C unsigned integers are completely well behaved: they do arithmetic modulo 2^n

Sadly, one rarely finds an excuse to work in the field Z_(2^32) or Z_(2^64), so while that behavior is well-defined, it's rarely correct for whatever your purpose is.


It is usually correct for my purposes (electronic design automation). When it isn't I need to figure out how to handle overflow. There is no automatic solution that is right in all cases, and a trap certainly isn't.


Array indices should arguably be unsigned (and struct/type sizes), so I'd say it's a lot more common than you imply.


I would have argued this too, until I learned that Ada not only allows enum-indexing into arrays (compiler handled), but it also allows non-zero-based indexing.

Example: #1

    -- You just index into this using 100 .. 200 and let the compiler handle it.
    type Offsetted_Array is array (Positive range 100 .. 200) of Integer;
Example: #2

    -- Indexing using an enumeration (it's really just a statically sized map)

    -- An enumeration.
    type c_lflag_t is (ISIG, ICANON, XCase, ... etc.

    -- Create an array which maps into a single 32-bit integer.
    type Local_Flags is array (c_lflag_t) of Boolean
        with Pack, Size => 32;


Yes, Ada is pretty flexible in this regard, but I'm not sure how useful this actually is.


It's actually super useful, especially since you effectively get a statically sized map. Also, you can iterate over enums, and move forward ('Succ) or backwards ('Pred) or to 'First or 'Last. You can also return VLA arrays, which means fewer "allocate just to return" problems (GNAT uses a second stack per thread allocated ahead of time).


What I meant was, how useful non-zero indexing is in general. The utility of indexing by enum is clear, as you say.


I've only used it a few times but IIRC it was contiguous value ranges of grouped values (I think it was error codes coming from C code) anchored to the middle of a range. e.g. an enum which goes from 0 .. N, but values 10-30 were some specific set of logical values and I didn't care about the rest. It was nice that Ada automatically did all range checks for me and I didn't have to remember to subtract to check the correct array index.

The most common thing I've seen it for is that most arrays (and containers) in Ada are written as 1 .. N, but if you're sharing index information with C code, you want 0 .. N-1 indexing.


And exactly how is silent wraparound useful or even sane for that use case? You just proved the point of the one you responded to.


Wrapping is more sensible than negative indices.


It is still dogshit though. The reasonable behaviour would be an error.


And you can raise the error if that index is actually out of bounds. I don't see why the wrapping specifically is the problem here, the only unsafety is indexing operation itself.


Sure, Rust for example will let you do that (although in debug builds it will panic unless you explicitly said this is what you intended). However from a correctness point of view, it is extremely unlikely that things[n] is doing what you intended if n wrapped.

Most likely you thought you were harmlessly increasing n, after all it's an unsigned integer, and you added something to it. But when it wrapped, adding something to it made it decrease dramatically and you probably didn't consider that.

This can be punished by bad guys: where you expected a value like 10 or maybe 420, the bad guys instead provide a huge number, you do some arithmetic with their huge number, and you wrap the offset to the very start of your data structure. Now it's inside the bounds, but not where you expected at all.

This is why people talk about "hoping you get a segfault" in languages like C++ because the alternative is much worse.

If you need to care about this (fiddling with files somebody provided e.g. by uploading or emailing them to you is an obvious place this comes up in web services) you should use WUFFS to do that. You can't make this mistake in WUFFS.


I agree that ___domain-specific ranged types as found in Ada are close to ideal. Unbounded integers or naturals are second best. Wrapping and checked arithmetic are distant thirds, but I don't think either is intrinsically superior to the other in terms of safety. It depends on the program's specific design IMO, but if we're talking about a C-like language where checked arithmetic is not common, I still think it's clear that indexing should be unsigned. Not the approach I'd take in a new language of course.

The pointer arithmetic you describe is the real source of most unsafety. The reason most C/C++ programmers prefer segfaults is because such arithmetic lacks bounds checking.

Thanks for the reference to WUFFS though, looks cool.


It’s useful when working with bits and bytes and stuff. Aside from that, I fully agree.


I think the programmer should be able to specify what happens on overflow.

Maybe they're bit twiddling and silent wrapping is expected. Maybe they want the program to hard fault. Both are valid.


Perhaps you'd like Rust, where all the choices are offered, as functions on integers such as:

carrying_add (separate carry flag on input and output)

checked_add (result is None if it would overflow)

unchecked_add (explicitly unsafe, assumes overflow will never occur)

overflowing_add (like carrying_add but does not provide carry flag input)

saturating_add (the integer "saturates" at its maximum or, in the opposite direction, minimum - useful for low-level audio code)

wrapping_add (what C does for unsigned integers)

Rust also has variants that handle potentially confusing interactions e.g. "I have a signed integer, and I want to add this unsigned integer to it". With 8-bit integers, adding 200 to -100 should be 100, and Rust's provided function does exactly what you expected, whereas in C you might end up casting the unsigned integer to signed and maybe it works or maybe it doesn't. Likewise for "What's the magnitude of the difference between these two unsigned integers?" Rust provides a function that gets this right, without needing to consult a textbook for the correct way to tell the compiler what you want.

If you can't afford to ever get it wrong, WUFFS simply forbids overflow (and underflow) entirely, WUFFS programs that could overflow aren't valid WUFFS programs and won't compile.


Right, but in almost all languages one of the possible options is chosen by default, because people want "+" to do something instead of having to specify each time. My personal opinion is that "+" should trap by default, with the various other behaviors (which 'tialaramex lists below as examples of what Rust provides) available via some other mechanism. Some languages (C, C++) do it yet another wrong way in that "+" does a thing and there is no other way to do addition, and it's even worse because they picked one of the bad ones to use as the default.


-fsanitize=address,undefined,etc

There's even threadsanitizer which will tell you about deadlocks and unjoined threads.


Defaults matter a lot. Just because something is possible doesn't mean it is likely to happen.

Are most people going to enable asan, run their programs through valgrind extensively, or just do the easy thing and not do any of that?

This is also why neovim is being actively developed and successful and vim is slowly decaying. The path of least resistance is the path most well travelled.


Any project with a decent test coverage and CI can easily set up an ASAN / Valgrind run for their tests. I know I've had this on the last few C++ codebases I've worked with.


I would say that keeping the checks in runtime for release builds is the smart default. For most usages, removing the checks in release builds only adds security holes without measurable impact on performance.


Slices allow catching a lot of bounds errors that you can't reliably catch when using raw pointers.


For what it's worth, I find a lot of Zig code benefits from switching to u32/u64 indexes into an array instead of using pointers. This is only really doable if your container doesn't delete entries (you can tombstone them), but the immediate benefit is you don't have pointers, which eliminates the use-after-free errors you mentioned.

The other benefit is that you can start to use your ID across multiple containers to represent an entity that has data stored in multiple places.

See [1] for a semi-popular blog post on this and [2] for a talk by Andrew Kelley (Zig creator) on how he's rebuilding the Zig compiler and it uses this technique.

[1] https://floooh.github.io/2018/06/17/handles-vs-pointers.html [2] https://media.handmade-seattle.com/practical-data-oriented-d...
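A minimal Zig sketch of the idea (names and layout are just illustrative, not from the talk; assumes the 0.9-era std where allocators are passed by value):

    const std = @import("std");

    // Entities are referred to by a u32 id (an index into the arrays below)
    // instead of by pointer. Removal tombstones the slot rather than freeing it,
    // so ids handed out earlier never dangle the way pointers would.
    const Monsters = struct {
        hp: std.ArrayList(u16),
        alive: std.ArrayList(bool),

        fn init(allocator: std.mem.Allocator) Monsters {
            return .{
                .hp = std.ArrayList(u16).init(allocator),
                .alive = std.ArrayList(bool).init(allocator),
            };
        }

        fn add(self: *Monsters, hp: u16) !u32 {
            const id = @intCast(u32, self.hp.items.len);
            try self.hp.append(hp);
            try self.alive.append(true);
            return id;
        }

        fn remove(self: *Monsters, id: u32) void {
            self.alive.items[id] = false; // tombstone; other ids stay valid
        }
    };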


> Everything in Zig is const x = blah;, so why are functions not const bar = function() {};?

This may or may not happen: https://github.com/ziglang/zig/issues/8383

> Fixing the standard library documentation would be my biggest priority if I worked on Zig, because I think that is the only thing holding back general usage of the toolchain.

This is a valid concern, but I believe the zig team is deliberately holding off on improving the std lib documentation, because they are expecting (potentially huge, maybe not? who knows) breaking changes down the line. The "stdlib is not documented" is a deliberate choice to signal "use at your own risk, especially with respect to forwards compatibility".

> there are still quite a few bits of syntatic sugar hiding the real cost of certain operations (like the try error handling, there is implicit branches everywhere when you use that...

I dunno, that's like saying that `if` hides branching. It's a language-level reserved word, you're expected to understand how they work under the hood.


yes, our first priority is stage2, after that, we might deal with stdlib. Andrew is going to go through the stdlib before the 1.0 release.


it's super reasonable to expect language-level stability before shoring up the stdlib. I know 'gatekeeping' is a bad word sometimes here, but this is soft-gatekeeping, and imo, a good thing (for now) to help focus the language.


unfortunately yes, it somewhat is, but the devs try to maintain extremely readable source. Not the best situation, but I think it's really good and important because the stdlib is the best example of good Zig code and might teach you a thing or two - I learnt how to write saner and better code from it.

And the stdlib breaks sometimes, so it's better not to put a lot of effort into docs yet.


> Try and read a moderately complex Rust crate and it can be mind boggling to work out what is going on.

I do that all the time, even reading the source of the std, something that I cannot do sanely in C++. IME Rust code is easy to read, with symbols that are always either defined in the current file, imported, or referred to by their full path.


Agreed. This was my first AOC, and I did every day in rust (except for one that I did by hand).

Multiple times I'd go look at the source of a data structure and it reads very easily. I'd even share my code with friends and coworkers who weren't familiar with Rust (so we could compare... they were most familiar with Python). Not only could they easily grok my code, I showed them how docs.rs lets you easily see source. All of those that looked could read it easily with some explanation from me on traits, pattern matching and generics.

I think it's obviously a subjective thing...but I very much disagree with the author that idiomatic Rust is difficult to read or comprehend.

In fact, I find Rust easier to grok, because I need to keep less in my head at any given time. Function bodies become almost self contained, without me having to think about lots of details like errors and return validity etc...


To be fair, the thing that makes a working C++ standard library unreadable is also a hazard in understanding Rust's std. Macros. The macros in a C++ standard library are horrible, because it is here that essential compliance and compatibility are squirreled away, and because the C++ macros aren't hygienic they're bigger than they'd otherwise need to be (e.g. you mustn't call it foo, say __x5_foo instead). But while they're far more readable on their own terms, the Rust macros littering std do mean it's harder to see how say, a trivial arithmetic Trait is implemented on u32 because a macro is implementing that trait for all integer types.

A macro-pre-processed std might be easier for the non-expert rustacean to grok even though it isn't the canonical source.

The symbol thing is pure insanity: machines have no problem knowing what symbol8164293 refers to, but humans can't work with that, and programming languages, including in theory C++, are intended for humans to read and write.


The thing that makes the C++ standard library source difficult to understand in my experience is heavy usage of templates and very deep inheritance chains.


The _Weird_identifier_naming_convention that the STL has to use to avoid colliding with potential user-defined macros doesn't help either.


Remember that expanding macros includes things like `println!()`. I'm not sure beginners will find the following particularly easy to read:

    {
        ::std::io::_print(::core::fmt::Arguments::new_v1(
            &["Hey ", "!\n"],
            &match (&name,) {
                _args => [::core::fmt::ArgumentV1::new(
                    _args.0,
                    ::core::fmt::Display::fmt,
                )],
            },
        ));
    };
Although, to be honest, I don't think there are many usages of these macros in std.


Good point. I hadn't considered panic!() in particular which is used in std, and the Debug implementations in std won't make a huge amount of extra sense after macro-pre-processing either.


Macros are extremely hard to grok, and so many use such short variable names that it looks like absolute gibberish.

They also look so different than normal Rust code. Python metaprogramming still looks exactly like Python, for example.


Macro definitions can be hard to grok, but that's not usually what you look on.

Macro uses can be hard, but macros are not used commonly in Rust (I mean, there are not many macros - but those that exist are very common). And they also look very much like Rust: `vec![a, b, c]` vs. `[a, b, c]`, `zip!(a, b, c)` (itertools) vs. `zip(a, b)` (`std::iter::zip()`), `#[tokio::main] async fn main() {}` vs `fn main() {}`...

Attribute procedural macros only accept valid Rust syntax, and most of them are derive macros that just derive some trait.


Oh yeah. I don't know if I've ever tried to implement a macro. The macro_rules syntax is hard to read and it doesn't feel like there are many examples explaining how it works.


The Little Book of Rust Macros - https://veykril.github.io/tlborm/introduction.html.


Agreed, when I see comments like this I tend to think they haven't spent much time using the language. It takes a while, but after a month or so you can read just about any Rust code. Honestly, feels like a much simpler language in day to day usage than say a language like Scala (just an example) to me.


I also agree with this sentiment, although there are some examples of really weird meta programming that remains opaque to me. For instance, I’m able to use `warp` as a framework, but the use of things like type level peano arithmetic is mostly incomprehensible to me at the moment. I also find that I run into Higher Rank Trait Bounds so rarely that I have a poor grasp of it (which might be as intended). All that to say that there are some odd corners of the language, given that I’ve been using it for five years now and as my main professional language for three years.


I love Rust, but e.g. macros, lifetimes, generic trait parameters etc are all very difficult to parse for the uninitiated. Of course, I'd bet on the readability of rust over cpp template wizardry any day of the week.


The part about making things easy to type is interesting, because this generally only works with a single international keyboard layout (usually US English), e.g. making things easy to type on the US keyboard layout may make it harder on an international layout.

It's an old problem though, for instance the {}[] keys are terribly placed on the German keyboard layout, requiring the right-Alt-key which was enough for me to learn and switch to the US keyboard layout, and not just for coding.

I think a better approach for a programming language would be to use as few special characters as possible.

PS: Zig balances the '|' problem by using 'or' instead of '||' ;)


It's a bit bizarre to complain about the pipe symbol IMNHO (as a user of Norwegian kbd layout, where åæø/ÆØÅ takes up prime estate) - without pipe you can't use a posix shell at all - so if you're on a layout without pipe, it's not like you likely could use any languages outside Smalltalk/Self, or possibly Pascal...

That said, yes, I think there's room for languages with very limited use of special characters. But I think they'd always be somewhat specialized.

Like Markdown.


I've heard this before but personally I've never had a problem with {}[], I just use the right thumb for shifting to the ancient greek layer.


>PS: Zig balances the '|' problem by using 'or' instead of '||' ;)

I wish Rust had made that decision as well.


While the standard library documentation is non existent, using grep on it and just reading through it is very easy, compared to almost any other language I have used.

I would actually say this is preferred: it's early days, so the documentation can't go out of sync because it doesn't exist, and library maintainers are incentivized to write understandable code, which most people who are getting into the language are forced to read, creating a consensus of what is considered idiomatic in the community.


Yep, and we also encourage this, if you open the (incomplete, buggy) autogenerated doc for the standard library, you get a banner at the top that links you to a wiki page that explains how the standard library is structured.

https://github.com/ziglang/zig/wiki/How-to-read-the-standard...


I've also found the tests for the standard library pretty useful when digging around trying to figure out how to use stuff.


> For loops are a bit strange too - you write

  for (items) |item| {}
>, which means you specify the container before the per-element variable. Mentally I think of for as for something in many_things {} and so in Zig I constantly had to write it wrong and then rewrite.

That does feel like the syntax is missing an "each" or a "with", as in "for each somethings as some do" or "with each somethings as some" - or in a similar terse/compact syntax:

  each (items) |item| {}
I'm surprised there's no mention of the (lack of a) string type - considering the domain (advent of code). I've not found the time to actually work on aoc this year, but I also had a brief look at starting with Zig - and quickly met a bit of a wall between the terse documentation on allocators, and the apparent lack of standard library support for working with strings.

I think the documentation will improve as the language stabilizes, and there are likely to be more tutorials that work with text (both trivial, like sort/cat/tac in zig, and more useful, like http or dns clients and servers etc).


I found the standard library's support for strings was plenty fine, doing AoC problems in zig tests it out thoroughly. Tokenize[1], split[2] and trim[3] were the most common ones I used.

Was there something in particular you were looking for and didn't find?

[1]: https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...

[2]: https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...

[3]: https://github.com/ziglang/zig/blob/master/lib/std/mem.zig#L...
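For example, a typical AoC-style parse with those functions looks something like this (a sketch against the 0.9-era std.mem API, where the element type is the first argument):

    const std = @import("std");

    test "parse a comma-separated line of integers" {
        const line = " 1,2,3,42 ";
        const trimmed = std.mem.trim(u8, line, " ");
        var it = std.mem.tokenize(u8, trimmed, ",");
        var sum: u32 = 0;
        while (it.next()) |tok| {
            sum += try std.fmt.parseInt(u32, tok, 10);
        }
        std.debug.assert(sum == 48);
    }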

* After I read my own comment, I'd note that AoC tests out string manipulation pretty thoroughly but things like unicode handling not at all, so []const u8 as a string may be more annoying in the real world than in AoC answers and I haven't used zig's unicode facilities at all


In AoC it's completely fine to conflate a character and a byte. Neither your daily input nor the provided tests will have anything beyond ASCII.

Which is fine for AoC, good choice, but it means the language needn't get this right, or even provide any help to programmers who need to get it right, in a similar way to how "big" numeric answers in AoC will fit in a 64-bit signed integer, never testing whether your chosen language can do better if the need arises.


> Was there something in particular you were looking for and didn't find?

Unicode handling. Treating a string as a byte array is all fine and dandy if you're only processing english latin alphabet data, but it's a PITA as soon as you start using e.g. extended characters (math symbols, fancy quotes, ...) other languages, or emojis.


There is the `std.unicode` module[1] which provides standard unicode functions (encode, decode, length, iteration over code points), so I don't think it's fair to say that the language's library lacks strings in any real sense.

I will re-emphasize that I've not used it, so I cannot speak for its quality.

[1]: https://github.com/ziglang/zig/blob/master/lib/std/unicode.z...
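Going by the module's source (again, not something I've exercised much myself), iterating over code points looks roughly like this:

    const std = @import("std");

    test "count code points, not bytes" {
        const s = "héllo"; // 6 bytes, 5 code points
        var count: usize = 0;
        var it = (try std.unicode.Utf8View.init(s)).iterator();
        while (it.nextCodepoint()) |_| count += 1;
        std.debug.assert(count == 5);
    }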


Also here is a blog post that gives more info on how to deal with unicode in Zig.

https://zig.news/dude_the_builder/unicode-basics-in-zig-dj3


What do you need to do with them? All my data is UTF-8, and low level code is generally parsing, which doesn't involve any of those characters. It generally just works with all special characters (e.g. on my blog).

I think Unicode on the server or CLI is very different than Unicode on the desktop/GUIs.

Since Zig interfaces well with C, it should be set up well for the harder desktop case, because the "real" Unicode libraries handling all the corner cases are written in C (or have C interfaces). I don't think even Python's unicode support is up to the task for desktop apps.


> What do you need to do with them? All my data is UTF-8, and low level code is generally parsing, which doesn't involve any of those characters.

For example, file names on MacOS are Unicode. Depending on how low level your code is and what parsing it does, you will run into issues because UTF-8 will not save you.

> I think Unicode on the server or CLI is very different than Unicode on the desktop/GUIs.

This is a non-sensical statement. It's the same Unicode. Yes, you won't have all the same use cases as a GUI, but Unicode is the same.

The best example I have for when things are not handled is not strictly Unicode-related. WiFi SSIDs are 32 octets. That is any byte value at all. Recent standard modifications allow devices to specify whether the SSID is in UTF-8.

And yet. Too many devices assume that, basically, "All my data is UTF-8, and low level code is generally parsing, which doesn't involve any of those characters", and the Internet is full of people asking questions like "cannot connect to WiFi with non-English SSID"


Yeah filenames on OS X are a pain.

However, Python 3 has the opposite problem! All the file system APIs were changed to assume that filenames are strings (sequences of code points), when they're actually BYTES in POSIX! (which may or may not be utf-8 encoded, but commonly are)

So my point is: when you ask for "unicode" support in a language, you have to be careful what you ask for, or it can create more problems. Having less in the core language is good.

For the specific case of filenames, I think you just need basename() and dirname() routines that find '/' (or \) regardless of encoding. You don't need a separate string type.

I should have said "localization on the server" vs. "localization on desktop/GUIs". What people really want is localized apps, not just unicode, and having a string type in a language doesn't get you that far. Because localization is inherently tied to the operating system, not the language. The filename example shows that, and there are also several other issues like fonts, left-to-right, etc.


Woe be to they who attempt to save Japanese mp3 files on a Linux NAS and read them from a Mac.


Yeah that's true, but this has nothing to do with the programming language, and everything to do with the file system implementation in the kernel, and the OS APIs.

The programming language can only do so much about this problem, and doing too much is harmful (e.g. Python 3).

The default OS X file system isn't POSIX compliant, which creates a portability issue. In POSIX file systems, filenames are bytes, and file contents are bytes. But that isn't true on OS X.


Agreed all around. "Strings are byte arrays" is imo the right approach to living in a hellworld where programs like filesystems expect you to send them all sorts of different precise arrangements of bytes.


> Was there something in particular you were looking for and didn't find?

Mostly dealing with allocation - ie dynamic string handling, reading strings from a file, passing a string to a function, and returning a different string to another function and so on.

Ed: this was pretty much before solving concrete aoc problems, just figuring out how to read "suffix" from standard input, passing it to a function "prefix_suffix", getting a string "prefix suffix" back and outputting that string; trivial manipulations, but lots of dynamic allocation.
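For reference, the pattern Zig pushes you toward is roughly this (a sketch assuming the 0.9-era allocator interface, where allocators are passed by value; prefixSuffix is a made-up stand-in for the function described above):

    const std = @import("std");

    // Caller supplies the allocator and owns the returned string.
    fn prefixSuffix(allocator: std.mem.Allocator, suffix: []const u8) ![]u8 {
        return std.fmt.allocPrint(allocator, "prefix {s}", .{suffix});
    }

    pub fn main() !void {
        var gpa = std.heap.GeneralPurposeAllocator(.{}){};
        defer _ = gpa.deinit();
        const allocator = gpa.allocator();

        const out = try prefixSuffix(allocator, "suffix");
        defer allocator.free(out);
        std.debug.print("{s}\n", .{out});
    }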


I thought this syntax was weird too, but one cool thing is that you can ask for the index too

    for (items) |item, i| {
    }
I guess Go does something similar, but it's a little weirder IMO because it has maps, while Zig doesn't.

I suppose it could have been

    for i, item in items {
    }
But I guess coming from Python that feels like tuple unpacking, which I don't think Zig has, but could make sense?


Lua also has

    for i, v in ipairs (t) do
        -- i is the index, v is the value
    end
Not getting the index in C++ easily was very frustrating. Often in C++ I fall back on C-style for-loops just because the functional stuff is hard to remember.

In Rust it doesn't give you the index by default, but the `enumerate` adapter adds an index anywhere in a functional pipeline.


In "The Good" section, the author says there are only while loops and no for.. but apparently there is a for, now I'm unsure what it means. Is `for` a function?


`for` is a foreach. If you want to increment a number through a range like a typical C `for`, you have to use a `while` loop. I don't really see the draw.
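i.e. the counting loop you'd write with a C-style for becomes a while with a continue expression:

    var i: usize = 0;
    while (i < 10) : (i += 1) {
        // loop body
    }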


man if I were andrew I'd just rename for to "foreach", because this is a huge complaint and source of confusion.


Seems fine to me. Rust has a "foreach" which is named for, working ok. Of course Rust has ranges as iterators, so it's not necessarily noticed that "there is no numerical for" but it works.


> Rust has a "foreach" which is named for, working ok.

Also ruby / python.

Don't see the point of a C-style for loop in a modern language, reusing the keyword is perfectly sensible (even more so as many which have both use the same keyword).

C# actively weirds me out every time from having both for and foreach, complete waste of brainspace.


It seems fine to just delete it if it's only ever going to work on slices and arrays


for strings, these might be good (haven't tried them yet): https://github.com/jecolon/zigstr https://github.com/jecolon/ziglyph


It saddens me that not having a proper string type is a hill the zig folks are willing to die on. The language would have _felt_ a lot better if they had (at least) just copied what rust did with strings.

I understand the argument that in most situations a byte array might be what you actually want, but in practice it feels very dirty to be passing byte arrays around instead of expressing the underlying meaning of that byte array as a type (in this instance a String type).

Having a string type also makes any standard library functions on strings infinitely easier to discover.


> One nugget of knowledge I’ve worked out though - Zig is not a replacement for C. It is another replacement for C++.

I hope this isn't the case, since I see Rust as the C++ replacement, and another replacement isn't very interesting to me. The main reason I've been interested in Zig is because I thought it was a replacement for C, which is an interesting idea.


I don't understand where that came from. It's really a replacement for C. The place where complexity comes from in zig is pretty much the comptime type system, which emerges from the idea of replacing C's irregular constant-evaluation rules and its preprocessor macros.

I would say that Zig is:

C - {make, autoconf, etc., preprocessor, UB[0]} + {*defer, !, ?, catch/try, "async"[1], alignment, comptime}

I don't think that rises to the level of "C++ replacement". Maybe it's that comptime lets you do generics a la C++ templates?

[0] by default, in zig you can have UB for performance

[1] in quotes because async is not actually async, it's a control flow statement that is usually and most usefully used to do async things.


Hint: Rust will not be replacing C++. C++ and Rust will coexist indefinitely. At some point in the future, it is possible that more Rust coders will be using it daily in their work than the number who pick up C++ for the first time in any given week, who will go on to use it professionally. Or, that might not happen, and Rust will join Ada and so many other languages that never got their miracle.

Even if Zig doesn't fizzle like the overwhelming majority of languages, it won't replace, or displace, C, never mind C++. Everybody willing to move on from C already did a long time ago. People still using C today like it for its failings, so a language that fixes them is exactly what they don't want. It doesn't give C++ users any of the things they need.

The only real advance in systems languages in the last 50 years is the destructor, so it is frankly weird to find a new language without it. The Drop trait is all that makes Rust a viable prospect for its own miracle.


Oh I definitely meant "replacement" as in "replacement for me and many people", not that C++ would vanish. C and C++ are not going anywhere.


Is a C replacement (which is not also a C++ replacement) really what anybody wants? Like with no generics, no dedicated error handling, and no automatic cleanup? I get that everyone enjoys a simple language, but these features feel like table stakes now.


Exception handling and garbage collection are two features that feel superfluous, if not outright noxious, to a number of programmers; in particular those doing systems, embedded, or real time programming. That is the crowd that still uses C. There's a cost associated with the execution environment taking control away from the program itself, and by proxy the programmer. It's not something you want when you design your code with the assumption that it is a more or less accurate representation of its execution flow.

Long story short : those who want that sort of programming language know why.


Agreed about GC and exceptions, but I wanted to focus on other approaches. Rust and Go are good examples of doing error handling through regular return values. Similarly Rust and C++ do cleanup with destructors, and Go does at least some of it with deferred function calls. There are ways to do these things that are suitable for low-level programming. But doing nothing no longer seems viable to me.

To the extent that Zig's features in these areas (and also generics) make it "not a real C replacement", that's when I question whether a real C replacement is actually what anybody wants.


> Go does at least some of it with deferred function calls.

For what it's worth, Zig has a very similar deferred function call mechanism for deferred cleanup, and something like Rust's Result type. See https://ziglang.org/documentation/master/#defer and https://ziglang.org/documentation/master/#Error-Union-Type .
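A rough sketch of what that looks like in practice (file name and size limit are made up; assumes the 0.9-era std):

    const std = @import("std");

    // The file is closed on every exit path, including the early return
    // that `try` performs if reading fails.
    fn readConfig(allocator: std.mem.Allocator, path: []const u8) ![]u8 {
        const file = try std.fs.cwd().openFile(path, .{});
        defer file.close();
        return file.readToEndAlloc(allocator, 1024 * 1024);
    }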


Zig has all of those things, if you consider defer to be a form of automatic cleanup.


Which it is not. It is harder to forget it, but it's possible.


For what it's worth, I think it makes sense to include defer in the "automatic cleanup" category. If we don't, then I think we end up suggesting that the only popular languages that support automatic cleanup for anything other than memory are C++ and Rust. That does have some truth to it, but if the goal is to describe "languages that are viable today", it seems pretty clear that e.g. Python and Go are viable.


True but most languages need something like it to manage non-memory resources. Stuff like Python's with block or C#'s using block. So Zig is on par with those for non-memory stuff and much much better than C. Having everything accept an allocator also makes managing memory a lot simpler since you can use a simple arena allocator for temporary heap allocations.


That's the one part in which I really disagreed and the author does a bad job of explaining why they think that.


The optionals story in Zig seems a bit weak to me, because it has dedicated syntax to support conditional unwrapping:

   if(optional) |captured_optional| { ... }
if actually is three different syntaxes:

   if(expression) {} else {}

   if(optional) |captured| {}
   if(optional) |captured| {} else {}

   if(errunion) |result| {} else |err| {}
The latter is kinda awkward because it looks exactly like the optional syntax until the else and you have to know the type of the variable to know which is which. Capturing doesn't allow shadowing, which makes the optional case awkward.

This is one area that e.g. Kotlin has done better by checking if the expression of any if statement implies non-nullity of variables and then implicitly unwrapping them, as they can't ever be null:

   if(optional != null) { use optional directly }
This works much better for multiple optionals:

   if(optA != null && optB != null) { can use both optA and optB directly }
You can write this in Zig as well, but it results in a sea of unchecked .?, while Kotlin will give you compile errors if you use an optional without unwrapping that was not implied to be non-null.

Or you go multiple levels deep, as the if-optional syntax only allows one optional:

   if(optA) |capturedOptA| {
      if(optB) |capturedOptB| {
      }
   }
The error union story is fairly sound so far but one major annoyance is that while it composes well for returning errors, it doesn't compose well for error handling. You can't do:

   someComplexThing(file1, file2) catch |err| switch(err) {
      CryptoErrorSet => handle_crypto_error(err);
      FileErrorSet => ...
   }
as switch does not support error sets for branches, only error values. This seems to me like it incentivizes you to have either something like this:

   someComplexThing(file1, file2) catch |err| {
       if(cryptoErrorToString(err)) |errdescription| {
          // ...
       }
       if(ioErrorToString(err)) |errdescription| {
          // ...
       }
   }
Or just a huge handleAllTheErrorsPls thing.

Errors are also just a value - if you want some extra information/diagnostics to go along with an error, you'll have to handle that yourself out-of-band.

On errors, Zig doesn't seem to have a strerror for std.os errors - awkward.


> you have to know the type of the variable to know which is which.

That's not correct. The error version differs from optional by the capture on the else branch. The optional version can't have it, and all error versions must have both captures. You can always tell which case it is just by looking at the code, without having to know the types involved.


Yes, I meant if you're looking at the if() part - you either have to know if that's an optional or an error, or go looking for the else to see if that captures an error.

> because it looks exactly like the optional syntax until the else


Consider the difference between an optional and an error union:

An error union is a value or error.

An optional is a value or null.

The if/else capture syntax follows: "if" captures the value. "else" captures the error if there can be one, or doesn't capture anything if there can't. That is, you don't need (and therefore don't want) a difference between error union and optional in the if part of the syntax.


That's a fair point but doesn't really distract from the syntax being awkward for optionals. The if/else for errors matters less, at least to me, as there are already dedicated control-flow primitives for errors. [1]

[1] Which reminds me that the syntax for blocks-returning-values (named blocks) is really awkward. That might be the ugliest bit of syntax in the entire language:

    const something = if(bar) foo else someblock: {
        // stuff
        break :someblock value_to_return;
    };


The part I don't get with block expressions is that it requires using a label, but the label name is arbitrary/meaningless... It looks like the label serves two purposes, (1) to signal that the block returns a value and (2) so the statement that returns a value can identify the block it's returning a value for. I guess maybe three purposes, where the third is it lets the "break" keyword be overloaded for the purpose of returning a value from a block since it distinguishes it from other usages of "break".

You could, e.g., drop the arbitrary label, just leave the : and it all still works unless you want to nest block expressions and return for the outer block from an inner block (which is pretty iffy, IMO, but could be supported by making the label optional). I'm not sure whether or not colons lying around is a good way to do this, just making the point that the label is arbitrary. (Maybe = or maybe <= before the block and => after break, something like that.)


write shorter blocks so you can see the else? If the ifs get too nested, encapsulate them in functions? Buy an extra monitor and mount it vertically (or diagonally)?


> One nugget of knowledge I’ve worked out though - Zig is not a replacement for C. It is another replacement for C++.

While comptime is a potential source of complexity, I sort of think C++ developers won't accept a replacement that has no RAII or automatic invocation of destructors.


Meh. Not trying to start a language war but I was grateful to switch from C++ to Go when the price was to lose generics and a few other things in exchange for the language's simplicity and clarity.


I think there are so many corners where people are using C++ that making generalizations about them is likely to fail.


That’s why I referred exclusively to my preferences. It is obviously the more comprehensive and versatile language.


I was mostly referring to gp's broad statements, comment was in support of your experience, which is good feedback to hear.


There's an open issue to add some kind of function annotation+errors for functions which require you to call a cleanup function.

The discussion has had a lot of back and forth and they haven't really settled on a desirable solution yet, but it's something they're hoping to add.

https://github.com/ziglang/zig/issues/782

I work in games with C++ and we already do so much manual management and initialization+teardown functions that lack of RAII isn't a deal-breaker. Though I'd definitely prefer it if there was something either well-enforced or automatic.


This sounds good. While I don't have much preference about "being explicit" vs having automatically-invoked dtors, it will be nice to be nudged when I actually forget to clean up.


I got into Rust by working on Advent of Code 2021. The problems seem arbitrary, repetitive and sometimes unnecessarily hard. But they are well-designed for starting on a new language. We are forced to repeatedly use basic concepts of a language, so that is useful to get a few reps in on a new language. We are also forced to build utils that can be used a few times.

And if you challenge yourself to solve the problem as quickly as possible so as to see where the story leads, you can stay motivated to work thru the problems. Helps if you have a friendly competition going with a few friends.


> I wanted something more like Rust’s cargo test that’d find all tests and run them. Maybe Zig does have this but I just didn’t find it?

Try `zig build test`

https://ziglang.org/documentation/master/#Zig-Build-System



Now you can really move zig.

For great justice.


If dereferencing the pointer is through ptr.*, wouldn't it be more consistent to take the address with myvar.& instead of &myvar?


The one thing I personally love about Zig, from an outsiders perspective, is the relatively clear scope and the "no, we won't add each and every feature we can imagine"-stance.

It also seems rather elegant.


the world has learnt from the horrors of C++, the same mistakes should not be repeated lol.


I don't see that the world has learned.

Take Rust for example. Rust is a language that does a lot right. It's currently my favorite language.

Still there is feature creep. And a lot of it.


swift, too


> and so making getting at heap allocations harder by explicitly getting them through an allocator is a great thing.

Did not follow this logic.


I think the idea is that heap allocations are costly, so making them explicit exposes their cost clearly, and promotes careful thought about strategies for heap allocation.


How is that any more explicit or intentional than calling malloc though? You have to tell it exactly how many bytes you want on the heap.


If I call some function foo() I have no idea if it does any heap allocation internally.

In zig I always know because all functions that allocate memory explicitly request an allocator. That also allows me to control which allocator they use.


I see, I didn’t understand that’s what they meant about zig.


It's more explicit to pass an allocator to things that will allocate than not to, in the same way that other local variables are more explicit than other global variables. By cultural convention, you'll be able to bring your own allocators to almost any library, so you won't have to let your libraries decide when your program performs heap allocations.


For many types of programs it's really, really important to have control over allocation behaviour which malloc doesn't provide (for instance a per-frame bump allocator, an allocator which reuses memory blocks but still works within a pre-allocated memory chunk, a specialized allocator for GPU memory blocks, etc...).


Zig makes it supremely easy to use different allocators for different pieces so you can do much better than just calling malloc everywhere. This enables very easy and straightforward use of arena allocators in particular.
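A minimal sketch of the arena pattern (assuming the 0.9-era allocator interface, where allocators are passed by value):

    const std = @import("std");

    pub fn main() !void {
        // Everything allocated from the arena is freed in one shot by deinit();
        // handy for per-frame or per-request scratch memory.
        var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
        defer arena.deinit();
        const allocator = arena.allocator();

        const scratch = try allocator.alloc(u8, 4096);
        _ = scratch; // ... use the scratch memory; no individual free needed ...
    }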


Can someone more knowledgeable than me in Zig explain this:

> var x = try foo(); means x is equal to the result of foo() unless there was an error in the result.

> If there was an error, return from the function with the error now.

> This meant that you don’t have the messy littering of if conditionals after every function that you

> typically get in C, but you also don’t have the complete disaster that is exceptions in C++/C#.

How is this different from exceptions?

Exiting a function immediately in case a function call fails sounds exactly like an exception.


> Exiting a function immediately in case a function call fails sounds exactly like an exception.

No, this would be an early return. The specificity of exceptions is that their default behavior is to bubble up through the entire stack. try in Zig only bubbles up to the calling function.
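Concretely, `try` is just sugar for an early return of the error to the immediate caller; roughly:

    fn mightFail() !u32 {
        return error.Oops;
    }

    fn caller() !u32 {
        // `try mightFail()` expands to the explicit form below
        const x = mightFail() catch |err| return err;
        return x + 1;
    }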


Ah, ok.

So... you have to do the bubbling up manually if the caller cannot handle the early exit.

Maybe I'm missing something but it still feels like a solution that looks like exceptions but is inferior to them.


I think you are missing something, at least on the "ideological" aspect of things: it's part of the values/guarantees of Zig that all control flow is going to be explicit. On the landing page of Zig (https://ziglang.org/), "No hidden control flow." is the first bullet point. So "try" is the way they found to make control flow explicit, but also less painful than if (...) { return something; }. It's also a solution that's checked by the type system; exceptions usually aren't.

I don't know if it's the best solution, but I can understand where they come from.


Because it's a part of the type signature and cannot be accidentally ignored. Every possible error case must be handled when unwrapping the error value.


So... like checked exceptions (e.g. Java needs to declare those in the signature, and they are an integral part of said signature).

The more I hear about how Zig implements it, the more it is exactly like exceptions.


> The more I hear about how Zig implements it, the more it is exactly like exceptions.

It's exactly like exceptions aside from all the problematic bits of exceptions:

* it is very explicit

* it is much easier to interact with

* it can be abstracted over (since a "result" is a reified value of a known type)

* it supports but is not over-optimised for happy path or split-path scenarios

* it doesn't require separate allocations or RTTI

* it has a much more uniform cost model

That doesn't necessarily mean it's the right system for what you usually work with, but it's a really nice way to do error handling.

For an expansion of this, Bryan Cantrill has a paean to this error handling style in his "falling in love with rust"[0]; I can't direct-link the section, but it's "1. Rust's error handling is beautiful".

[0] http://dtrace.org/blogs/bmc/2018/09/18/falling-in-love-with-...


It is different in implementation, for better or worse. Instead of stack unwinding, it's a part of the return value.


There is an open issue for something like `cargo test` https://github.com/ziglang/zig/issues/10018


All it needs to be actually useful is destructors. And, constructors.

I am always amazed when a new language omits destructors. Sure, CS assignments don't need them, but out here we have real resources, not just memory, to manage.


The application I'm working on has about 8K lines of bare bones C + Win32 + some optional OpenGL currently. It probably has about 5 lines of repetitive cleanup code that can't be easily folded into a single place (so might benefit from RAII style cleanup). If I want to properly release even "static" resources so the app can be used as a library (which is not necessary), that number might grow to 20 lines.

It's not something I'm sweating about, and I'm happy about all the time I saved by not doing "proper" RAII design prematurely, which has more ramifications and constraints than one might think.


Most of the problems that C++ added on top of C are caused by RAII though (e.g. a too rigid coupling of code and data). I think Zig made the right decision with the 'defer' keyword, even if it requires some change of perspective.


By "problems" I guess you meant usefulness: "Most of the usefulness that C++ added on top of C is a product of RAII." Because usefulness is why C++ is used so much.

And, "too rigid" meaning maximally flexible, likewise.


That's what I mean with "change of perspective". POD structs are entirely fine if a library API is designed from the ground up for them. It only becomes a problem when trying to write C++ code in C (or Zig in this case).


POD structs are fine if your wants are limited. Most usually, they are more error-prone and hard to use, and often slower. Limiting yourself to PODs is, first and foremost, limiting yourself.

There are reasons why, in the highest-paid programming jobs, you would be laughed from the room if you suggested coding something in C. Dull knives, cracked pots, and filthy ingredients make a bad stew.


There's plenty of great languages with those features. Zig brings a different mindset to programming, so you need to empty your cup of tea before being able to enjoy Zig.

https://ashidakim.com/zenkoans/1acupoftea.html


Destructors are good. Though if the language does not have unwinding, catchable panics you could be even better served with linear types.

Constructors, though, are solely an attractive nuisance.

For the constructors themselves you're better off with a factory function whose semantics match your needs (fallibility, or the lack thereof, being a big one), and for other protocols (e.g. copy, move, conversion, ...) you're better off either not having them or having dedicated protocols which are clearly and explicitly opt-in, depending on language semantics.

> I am always amazed when a new language omits destructors.

Destructors require deterministic destruction, so that only works for non-managed languages. That's why managed languages try to find alternative solutions to the resource problem (except for languages used to research substructural typing I guess, as well as languages which mandate refcounting as part of their semantics).


How does it compare to other C-replacement languages, like Beef?


Zig is definitely a C replacement in my opinion.

I agree about the array and pointer syntax being hard to remember.

I’ve high hopes for the language.


> For loops are a bit strange too - you write for (items) |item| {}, which means you specify the container before the per-element variable. Mentally I think of for as for something in many_things {} and so in Zig I constantly had to write it wrong and then rewrite. Also you use the | character in Zig quite a lot, and while this may just be a problem with Apple UK keyboards, actually getting to the | character on my laptop was uncomfortable. When doing C/C++ or Rust, you use the | character much less and so the pain of writing the character was something I never noticed before. Jonathan Blow has gone on the record to say that with his language, Jai, he spent a lot of time working out how easy it would be to type common things, such that the more common an operation in the language, the easier it would be to type.

I've written much more "Jai" than Zig but this is one of the things that stuck out to me the most in Zig's syntax as being strange. In "Jai", for loops iterate over arrays, ranges (0..10), or anything else that has a for_expansion defined for it. Implicitly, "it" is the iterator and "it_index" is the index. To for-loop over an array, you simply write

    foos: [..] int;
    for foos { /*...*/ }
if you don't want to use it and it_index, likely because you're nesting loops, you write

    for foo: foos { }
    for foo, foo_index: foos { }
this has some very nice properties in addition to relative terseness: when you want to iterate over something, which is something you do all the time in all kinds of contexts, you just write "for foos do_something_to_foo(it);". Suddenly you find you need to use something other than the implicit it/it_index, so you just change it to "for foo: foos do_something_to_foo(foo);". Maybe when you're "sketching out" your program code, "foos" is just an array of ints, but as you flesh things out further, you realize you want it to be a custom data structure with additional information that can nonetheless be iterated over as if it were still an array. You simply write a for_expansion for the new data structure:

    Foo_Storage :: struct {
        items: [..] int;
        additional_info: string;
    }
    for_expansion :: (using storage: *Foo_Storage, body: Code, flags: For_Flags) #expand {
        for `it, `it_index: items {
            #insert body;
        }
    }
    foos: Foo_Storage;
    for foos { /*...*/ } // the loop interface remains unchanged
I completely agree with the author here in that I appreciate this approach as opposed to Zig's, with regards to making it as easy as possible to write basic constructs ("loop over some stuff") that you're going to be writing a lot, in a lot of different places, in a lot of different contexts, all the time, constantly. this is the one area in which this language and the design ethos behind it is completely different from Zig and other contemporaries—it balances power and simplicity with developer ergonomics quite nicely.


Helpful article.


I get this feeling it's a one / two / three man "I want to write a compiler" phase. I've seen dozens of these languages over the years. I looked at the language, and without something revolutionary, this will die a slow death. I think it is experiencing that - the main developer(s) are using crowd funding and nothing has really been "wow".

I was right about dart and flutter being the next big one. However, zig is a dead language in a graveyard of 100s. It tries to revolutionize coding without changing the methodology.


To me Zig doesn't seem like a zombie.

* It is actively developed, has frequent releases and it is steered toward a stable 1.0

* The crowd funding model proves that some users want that language.

* It doesn't try to revolutionize coding, it tries to be a better C, hence the wowlessness

Nim¹ and Crystal² have probably more chances of eventually becoming dead languages but I hope they don't as they are both fun languages.

1- https://nim-lang.org/ : Nim is a statically type checked compiled Pythonesque language with a type system residing somewhere between Pascal and Ada.

2- https://crystal-lang.org/ : Crystal is a statically type checked compiled Ruby-like language.


I've been using Nim for a few years now. You're right, it is fun, by syntax alone. The performance is great, too. The development of Nim has been slower than some would like, but this has not been a problem for me.


> main developer(s) are using crowd funding and nothing has really been "wow"

It's funny: I think these are both positive signs.

The weight of history tells us that the overwhelming majority of new languages will die. So, in that sense, I agree zig probably will too. However, I think it's worth separating languages into ones that have essentially zero chance to survive long-term, and those that have a reasonable non-zero chance to succeed (and those in the middle).

To me, zig is the most promising "better C" (which is why I'm one of those people crowd-funding zig). Since I think we really need a better C, I think zig is on the side of those with a reasonable non-zero chance to succeed.

Some things that might get it over the hump:

Zig aspires not just to be a better C, but a better C compiler. It includes a first-class C compiler (Clang/LLVM), and adds first class cross-compiling support.

Along with native C integration and a language designed to appeal to C programmers, it's potentially low-friction to adopt in places where one would go to C now.

That is, it seems to me there is a viable, incremental path from C to Zig. (Incremental is important. E.g., even if zig succeeds, there won't be a RIIZ movement because there's no need. People will write or rewrite as it makes sense for their project, not their programming language.)

I get that zig is a work-in-progress, and that it may well not succeed. Just that it looks like it has the best chance to me (or maybe it's just that I like the approach it is taking).


Were you right about Dart being the next dead language or the next language propped up by Google?

We're having a great time using Zig in production. Thanks.


nice! where are u using zig in production?


Anecdotal, but I had a strong and immediate "wow" feeling when I first looked at Zig which I didn't have with most of the other "better C" languages before. Also, not having a big company behind is a good thing, not a disadvantage (because it means that bad design decisions can't be forced on users just because of the "it's made by Google or Apple, so it must be good" effect - as a real world example for this problem, see WebAudio).

PS: Dart being the next big thing, or the next big flop? Because from my bubble, Dart and Flutter don't look all that popular.


I'd say comptime and having enough features to be better than C at using and (cross-) compiling C libraries is somewhat revolutionary.


Dart has a lot of really neat features that should have caught on in other languages by now. In particular, the Dart ".." operator, implicit interfaces on every declared class, and mixins really should have made their ways to C# and Java by now.



