The Egison Programming Language (egison.org)
99 points by ubavic on April 29, 2023 | 70 comments



Here is my brief explanation, gleaned from the thesis paper, of the example given on the website, to clear up some confusion. Admittedly, this language does require knowledge of Haskell to really comprehend:

   -- Extract all twin primes from the infinite list of prime numbers with pattern matching!
   def twinPrimes :=
      matchAll primes as list integer with
         | _ ++ $p :: #(p + 2) :: _ -> (p, p + 2)

"matchAll is composed of an expression called target, matcher, and match clause, which consists of a pattern and body expression."

In the example, `primes` is a list of primes, as in Haskell: `[2,3,5,7..]`. It is the "target", and the "matcher" `list integer` may be thought of as a Haskell type like `[Int]`, so you could roughly write `primes :: [Int]` and mean the same thing. This notion of a "Haskell list" is important because, in the "match clause", the "pattern" is a combination of concatenation, using the operator `++`, and cons from Lisp, using the operator `::`. Note that this deviates from Haskell syntax in a somewhat confusing way: Haskell uses `:` for cons and `::` for typing.

The integer list is deconstructed according to concatenation first, in every possible way, i.e. `[] ++ [2,3,..]`, `[2] ++ [3,5,..]`, etc., then according to cons'ing, with the head stored in the variable `$p`. The "rest" of the list is then matched against the pattern `x :: y :: _`. The `#p` notation simply means to reuse the previously bound value of `p` to create a literal match, so the second element must be exactly 2 greater than the first. For the first split, `p` is 2 and `#(p + 2)` is 4, so the pattern becomes "2 followed by 4 followed by the rest", which doesn't match (the element after 2 is 3), and that split is discarded.

Finally, whenever a match does exist, a value is constructed according to the "body expression", in this case a pair, and all of the results are kept in a list. Therefore the type of this value is

   twinPrimes :: [(Int, Int)]
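To make the semantics concrete, here is a rough Python analogue of the `matchAll` above (a sketch of my own, not Egison): every split point of the list is tried, the element after the split is bound as `p`, and the next element is checked against the literal value `p + 2`.

```python
def twin_primes(primes):
    """A hand-rolled analogue of the Egison matchAll over `list integer`."""
    results = []
    for i in range(len(primes) - 1):   # `_ ++` tries every possible prefix
        p = primes[i]                  # `$p` binds the element after the split
        if primes[i + 1] == p + 2:     # `#(p + 2)` demands a literal match
            results.append((p, p + 2)) # body expression builds the pair
    return results

primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
print(twin_primes(primes))  # [(3, 5), (5, 7), (11, 13), (17, 19), (29, 31)]
```

Unlike the Egison original, this sketch only works on a finite prefix of the primes; the point is just to show how the pattern decomposes the list.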


I really hope that

         | _ ++ ($p :: #(p + 2) :: _) -> (p, p + 2)
is valid syntax. That would've cleared the whole thing up for me.


I really like this language. Very intuitive and simple syntax (once you understand the fundamentals). And very powerful too! The pattern matching examples get addictive after you've read a few.

Egison can be used to build a symbolic math backend and do all kinds of pattern matching. But it's a really niche requirement, and it has never really occurred to me in all this time: "hey, this might be a good time to use Egison"

I wonder when it might be a good idea to use Egison and if there are some current users in production.


Maybe I'm just a dummy who should stick to scripting, but I had exactly the opposite reaction: based on everything I've seen, this is wholly uninteresting. I can't imagine a single practical application that isn't more comprehensibly written in an existing language. But I feel the same way about Haskell, too.

The older I get the more I value code that is easy to understand and debug by either myself or someone else months or years down the line when the original context might be forgotten. Ultra-dense syntax like this only makes that almost superhumanly difficult.


try implementing symbolic differentiation in a language like haskell or egison, and then again in a language like python or js, and your perspective on what is 'easy to understand and debug' is very likely to change


Do you have an example of such an implementation in haskell?


googling i found https://gist.github.com/amalex5/541dc739201bbd8e26037b6738cd... although it's missing the type declaration

http://5outh.blogspot.com/2013/05/symbolic-calculus-in-haske... is more complete and also includes an explanation and a bunch of algebraic simplifications you'd usually want along with the symbolic differentiation itself


Related:

The Egison Programming Language - https://news.ycombinator.com/item?id=17524239 - July 2018 (67 comments)

Egison – Pattern-matching-oriented Programming Language - https://news.ycombinator.com/item?id=7992661 - July 2014 (33 comments)

Egison: A Lisp Written in Haskell with Advanced Pattern Matching - https://news.ycombinator.com/item?id=7924168 - June 2014 (27 comments)


It looks interesting, but I don't understand the syntax. The only documentation is a doctoral thesis, which contains the usual padding:

> Interestingly, this basic definition of map has been almost unchanged for 60 years since McCarthy first presented the definition of maplist in [62].

but it jumps from what looks like standard Haskell, to the new syntax, without explanation.


“Egison makes programming dramatically simple!”



This comment originally said that the link was useless. Actually, this is a nearly-complete explanation of Egison's syntax. The only difference I've noticed so far is : for Egison's ::.


Edit: glad it helped!

(This comment originally suggested attempting to reverse-engineer the syntax after learning the semantics)


    def twinPrimes :=
      matchAll primes as list integer with
        | _ ++ $p :: #(p + 2) :: _ -> (p, p + 2)
why do people do this? this is unreadable to me. what is the second line, a comment? then they somehow found a way to cram 11 symbols in a single line after that. bravo?


>why do people do this

do what? write programming languages that you don't already know?


This is not about "already know". This is about "in how many ways you can plausibly parse the chunk of text you are looking at". Haskell and Forth come to mind as the worst languages when it comes to humans being able to parse them. The language in OP is, well, a close relative of Haskell. It's awful to read and work with in general because human readers cannot parse the structure of the program easily.

Another aspect / problem: humans vocalize (well, adults, usually, using their "inner voice") anything they read. This is why, for example, people deaf from birth struggle more with reading comprehension. So, when programmers read programs they use some made up language to read it out loud to themselves. When a language is just a string of punctuation / non-typographical symbols it's a struggle for any human to read it to themselves in any way that they could process it.

Try it yourself:

Pipe underscore increment dollar pie colon colon hash open paren...

Probably, with some experience, it will be more like

Pipe unimportant list-add variable pie namespace hashtable with arithmetic expression?

In other words, this is not a language anyone would use in a setting where they just need a computer to write useful programs. This is a crypto-arithmetic puzzle for people with too much time on their hands.


Given that a few people can master high-level math that is impenetrable to most, and programming itself is impenetrable to some people, it might be that some languages require minimal cognitive level to get into. That is, easy for some, challenging yet possible for others, and unlikely at all for others. But sometimes it sounds like struggling to learn something only takes place in school, and once we're out of high school it's just easy things thereafter.

The question of "useful" is good. Useful for what? There are lots of different needs and purposes in the world.


I wouldn't bring math into this. Math is genuinely hard for everyone. Some people just have enormous amounts of dedication and sheer xxx will. The hardships extend far beyond learning the language.


> Another aspect / problem: humans vocalize (well, adults, usually, using their "inner voice") anything they read.

Not necessarily, e.g.:

> Hurlburt … estimates that inner monologue is a frequent thing for 30 to 50 per cent of people. [https://www.cbc.ca/news/canada/saskatchewan/inner-monologue-...]

For myself, I rarely read out Haskell code ‘literally’ (to the extent that I read any code literally at all; I’m not sure I have much of an internal monologue). Instead I tend to perceive it at a higher level: say, ‘modify a state variable s to fromMaybe s (f s)’ for ‘modify $ \s -> fromMaybe s (f s)’. Reading it out character-by-character would be as pointless as reading an ordinary word that way, like reading ‘cee-aye-tee’ instead of ‘cat’.


I forgot the name of the researcher; it was one of Chomsky's students who also researched how people (learn to) understand music. He also researched reading comprehension in people who lost hearing at different stages of their lives.

His research convinced me that any tool that wants to parse any spoken language absolutely has to incorporate vocalization information. Also, it convinced me that while it's possible that some people (esp. deaf from birth) can learn alternative ways of reading that don't involve vocalization, these are unnatural for humans, hard to learn and have poorer yield.

So, yeah, maybe some people don't read their code out loud to themselves, but most do, and for most of those who do this makes things easier. So, even if there's a way around it, it's not worth considering when we are talking about ease. People who don't do it are "playing with a handicap", they aren't in the competition for the easiest way to do things.

Also, I'm not convinced that having an internal monologue is the same as reading text using an inner voice. Most people learn to read by first reading out loud, and only later learn to use an inner voice. Only a few people continue on to learn advanced reading techniques, which allow them to skip selectively and perform other mental manipulations on the text; even these techniques often involve some internal vocalization, though it wouldn't produce the equivalent of speech. It's more like how people who have to control their computer with voice commands learn all kinds of shortcuts / ways to invoke functionality that is hard to describe using regular speech elements, with clicking noises, whistles, etc.


Investing so much creativity into obfuscation of meaning.


>>> why do people do this

>> do what? write programming languages that you don't already know?

> Investing so much creativity into obfuscation of meaning.

that's just saying the same thing in different words. of course the meaning is obfuscated to you, you don't know how to read it.


I don't know Ruby, and I'd argue it has dissimilar syntax to many other languages I use regularly (that's changed over the years but whatever, my point still stands).

I can still read ruby code, because it's still readable. Same with Lisp, which I never bothered to sit down and learn until about a year ago. Prior to that, I still had no problem reading and understanding it (perhaps not as intuitively as someone who writes it regularly).

The quoted example is, for all intents and purposes, gibberish to me. And I have worked with Coq.


Well, to me this syntax makes sense intuitively. And to most beginner programmers, all code looks very hard to understand. I'm really bad at understanding pgsql; I don't like it at all. People just have different experiences.

Ruby still has an Algol-like structure, like most common languages. And basic Lisp syntax is "normal" function calls with the parens moved around.

Of course, some things are objectively more complex than others, but making that argument here requires a bit more evidence.


I'm with you that all code is difficult for beginners. As for language designers, they should decide who they want to address with their language and how popular they want it to be. If they want to become mainstream and they don't have a way to force people to use it (example: Apple with Objective-C and Swift), then they should make a language with a syntax similar to something already mainstream. Example: Elixir got reasonably popular, and in its early days it was common to think of it as a functional Ruby. It is not; anybody using it would find it very different from Ruby by the end of the first day. However, there is at least a correspondence between many Module.function / Class.method names in the standard library, and the same do / end general structure of code blocks.

My advice to any language designer aiming to mainstream status is to copy some curly braces language (Java, JS, C++) or Python. Then do any weird thing to them, but don't scare away millions of developers with your hello world.


Coq is quite readable imho. I work mostly with k, Ocaml and Haskell (trading); I find ruby unreadable. I find it extremely ugly and my brain simply sees it as line noise, while k (and Egison by the way) reads fine. Each their own, I guess.


regex much? perl does it too.


Just got it in my head after wondering the same. So we have some prefix of the list we don't care about (_) appended (++) to a variable ($p) consed before p + 2 (#(p + 2)) consed before a suffix we don't care about (_), then produce the pair (p, p + 2) for each match.


Yeah... and nobody knows what happens first. It's impossible to follow the execution of this gibberish because nobody knows operator precedence, and even if that was somehow hard-wired into your brain, having to re-scan the same line multiple times and sort out the priorities still takes a lot of time and will fail for large-ish (5+ elements) problem instances because people cannot hold that many elements in active memory.


I'm starting to see the allure of plain Lisp. All it has is functions and functions have names which convey a meaning. Plus some syntax for creating macros.

Now CAR and CDR don't convey a lot of meaning I agree. But that's just a matter of poor naming. They should (in my opinion) be called 'first' and 'rest', or something similarly meaningful and descriptive.

Of course short-hands and aliases are good for most often used functions. But I would prefer those to be aliases, with a proper descriptive name available as well.

And operator precedence, that makes code in a different language really hard to understand. Parenthesis make it explicit. Keep it simple. That makes code more verbose but then you can use macros to make it less so.


> Now CAR and CDR don't convey a lot of meaning I agree. But that's just a matter of poor naming. They should (in my opinion) be called 'first' and 'rest', or something similarly meaningful and descriptive.

Some Lisps do that, but I think it's important to remember what CAR and CDR actually do, which isn't always related to lists: They access what's being pointed to by the pointers in a cons cell, which can be used to create multiple data structures in a given Lisp, so keeping the names abstract prevents the code from having the wrong names all over the place. For example, if I'm using cons cells to construct a key-value store, CAR isn't "head of list" it's "key" and CDR isn't "tail of list" it's "value" especially if the cons cell looks like this: (a . b) such that the CDR isn't a list of any kind.

In Common Lisp, of course, there are more efficient data structures for a lot of what older Lisps used cons cells for, so this is less of a concern, but I still think it's important to prevent confusion.
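A tiny sketch of the point above, in Python for neutrality (names and cell encoding are mine): the same abstract accessors serve both the "list" reading and the "key-value" reading of a cons cell, which is exactly why mnemonic names would lie half the time.

```python
# Cons cells as plain two-element tuples.
def cons(a, d): return (a, d)
def car(c): return c[0]
def cdr(c): return c[1]

pair = cons('name', 'egison')  # dotted pair (a . b): car is the key, cdr the value
lst = cons(1, cons(2, None))   # same cells as a list: car is the head, cdr the tail

print(car(pair), cdr(pair))     # name egison
print(car(lst), car(cdr(lst)))  # 1 2
```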


Good to understand. So maybe key() and value() would be good, more descriptive names for them then.

But I do think using a natural-language name like CAR even if very specialized, is better than using single-char symbols as function names.


> So maybe key() and value() would be good, more descriptive names for them then.

My point is that they can be used in many different ways, and are, such that any mnemonic name would be misleading in some contexts.


They could have multiple aliases, and you could pick the one that fits the context


By my estimation, it seems you've now spent about ∞× more time making suggestions how to improve Lisp than actually working in it.

Without working in in Lisp solidly for a good 6 months, you don't know where the actual pain points are.


What in your estimation are the actual pain points of Lisp?


It depends exactly on which implementation and dialect we are talking about.

Mainly people sometimes report things lacking in the ecosystem. The number of people doing any kind of Lisp are relatively small.

If you're doing FOSS, you might find it hard to get collaborators and pull it solo.



car and cdr are deeply entrenched in the Lisp culture: in the systems, historic and present, and in volumes of literature. Criticizing these always reads like someone stepping off a plane on their first trip to some country and telling the locals that their words are confusing, and maybe they should speak English.

When you see car, cdr and other related functions in Lisp code, it communicates that the code has strictly a physical interpretation, using well-known, decades-old idioms. You know exactly what it's doing with what kind of structure, and you know not to try to read any meaning into it. As to what that structure represents, what it's being used for, you have to look elsewhere; those names don't give that away. But to a Lisp hacker, spots in a program littered with car/cdr/cons/caddr/... are comfort zones.

It's good to have one set of names for this "just structure, no meaning" stuff and stick with them, than to go inventing new ones. That's more important than their specific choice. There is no end to what a cons-cell-based structure can mean, and so no end to the potential sets of better names for its parts; it's best to leave that to individual programs.

It's good that the names are short. McCarthy saved us from awful ones. Before Lisp, there was a project called FLPL: Fortran List Processing Language. The names came from that, but in that project, for some inexplicable reason, they had XC ... RF prefixes and suffixes: XCARF, XCDRF, ... McCarthy had the presence of mind to shorten the insane naming scheme, though not all the way down to just A, D.

Mainstream Lisp dialects like Common Lisp have first and rest. Plus other names like second, third, as well as numeric indexing into a list. These names are still meaningless with regard to what a list is used for. first is no more informative than car about what is expected to be first; it's only clearer to complete neophytes who haven't gone through the 15 minute tutorial on car and cdr.


People do it because they work mostly in isolation or small groups of experts on things that compute values once from command line.


| begins a pattern match. ++ is list concatenation. :: means "thing-on-the-left is of type thing-on-the-right". That's all standard Haskell syntax.

Egison seems to introduce some kind of advanced pattern-matching syntax with $ and #. I can't figure out what it is, or how it works; but I imagine it's quite simple once you actually know what it means.


Interpreting :: as a typing judgement makes no sense here. What does make sense is interpreting :: as a list "cons", i.e. as : in Haskell.


"Quite simple" is the same thing as "left as an exercise for the reader", i.e. when someone couldn't be bothered to do things right / didn't know how to do things right, but made it work by shuffling and other kinds of rearrangements before it sorta worked.

Maybe this language is simple to write if you know the rules. It won't be easy to read, ever, not even if this is the only language you have ever learned and practiced for decades.


> I imagine it's quite simple once you actually know what it means.

The first part isn't notable. Olympic athletes, for example, perform the most bafflingly intricate movements with grace and simplicity. It's the amount of effort and context needed to get to that understanding that's at play.

Maybe this syntax is intuitive for a mathematician, formal logician or some other specialist, but as just a mere every day programmer, it looks like nothing I'm familiar with.

This is fine. Just let's not pretend that it's low effort to get onboard


Nah, this mess isn't intuitive to anyone. There also isn't anything like an agreed upon mathematical or logical notation. Everything in this field beyond the basics of algebra / set theory is awful one-off NIH languages with awful / inconsistent grammar that nobody beside the "inventor" can understand, while the inventor has perhaps a 50/50 chance of understanding their own writing.

The lack of requirement of mechanically verifiable proofs and adherence to common standards of expression of mathematical concepts is a huge drawback of what happens in this area of academia.

People who come up with languages like Haskell (which this one certainly is) are people who want the absurdity of languages created w/o system or any general design to spill over into the programming land. Solely based on the fact that they've been exposed to this cuneiform during their apprenticeship and learned to associate it with the actually valuable stuff it's meant to convey.


it's probably low effort for a 'mere every day' haskell programmer; it's just a question of what you're familiar with, not actually deep math

(minor correction, as tromp points out, :: is evidently list construction, as in ocaml or haskell, not a type annotation)

you asked what `matchAll primes as list integer with` meant ('what is the second line, a comment?') but apparently the person who answered you didn't understand that you didn't understand. it means 'evaluate the expression `primes`, which should have the type `list integer`, and then attempt to match the value resulting from that evaluation against each of the following expressions in order'

deep math isn't simple to do even when you know what it means; it isn't just a matter of learning what all the symbols mean. this is just a matter of learning what all the symbols mean, like reading english instead of chinese. so to me your complaint reads like someone saying (in chinese) 'just a mere every day reader of novels, it looks like nothing I'm familiar with' because some text is written in english

however, the situation is not quite so symmetric as with chinese and english. with a good notation, cramming lots of symbols onto a line is a really good lever to empower your reasoning ability. consider trying to explain how to play the seventeenth measure of pachelbel's canon in words. or writing assembly instead of c, even though they're both at pretty much the same level of abstraction

pattern-matching really significantly improves the clarity of certain kinds of code, and you're missing out if you don't know what that's like

i'm guessing that #() (the only weird part) is analogous to ${} in `-strings in javascript or #{} in ""-strings in ruby: it embeds an expression to be evaluated in a context where you wouldn't normally expect expression evaluation, in this case a pattern for pattern-matching

(disclaimer, i don't actually know haskell, though i've implemented my own programming language featuring pattern-matching)


Thanks. As a correction, the comment you are directly replying to is the only one that I personally made.

I just happened to agree with the sympathies of the first person and was trying to rephrase things in a more productive and positive manner.

I appreciate the response not only for myself but for others who may be reading it as well. Thank you for taking the time.


i appreciate both the correction and the appreciation


>or writing assembly instead of c, even though they're both at pretty much the same level of abstraction

They aren't.


it must be simple because this code is, i presume, runnable.


That's not how simple works. Malbolge's "Hello World" is a single line!

(=<`#9]~6ZY327Uv4-QsqpMn&+Ij"'E%e{Ab~w=_:]Kw%o44Uqp0/Q?xNvL:`H%c#DD2^WV>gY;dts76qKJImZkj

Look at how compact that is! Must be simple!

You can't claim simplicity by using character or line count.


there are an awful lot of separate entities in that line


> That's all standard Haskell syntax

It's the reason Haskell has remained a niche language.

> I can't figure out what it is, or how it works; but I imagine it's quite simple once you actually know what it means.

You can't figure it out but you're sure it must be simple?

Newsflash: if you, a 1-in-500 programmer who knows Haskell syntax, cannot figure it out just by context, it's nowhere near simple.

Simple is when 9 out of 10 programmers understand the thing.


You're confusing simple and easy.


Related: the great and timeless talk called "Simplicity Matters" by Rich Hickey - https://www.youtube.com/watch?v=rI8tNMsozo0

(By the way, any talk by Rich Hickey is worth watching)


For the rare occasion when the author doesn't mean easy when they say simple, they should probably clarify that they mean non-complex yet difficult.


words have different meanings; the root sense of 'simple' is 'one-fold', that is, only having one layer rather than two or four, and that sense is still present in phrases like the spanish hoja simple, meaning 'one-ply' (toilet paper, for example), but it's had multiple meanings for thousands of years including 'mentally disabled', 'honest', 'harmless', 'pure', 'unadorned', and, as you point out, 'easy'

it's true that using a word with multiple meanings gives rise to ambiguity, and avoiding that by choosing a different word is desirable

unfortunately there isn't a better term for 'simple' in the sense of 'not possessed of much detail' or 'composed of very few parts'; saying 'non complex' doesn't really help because 'complex' is often used to mean 'difficult' for the same reason 'simple' is often used to mean 'easy'

except in hucksters' advertising brochures, i don't agree with your implicit assertion that all the polysemic complexity of 'simple' is merely historical, leaving only 'easy' as a live meaning; i think all the meanings i listed above except for 'harmless' have some currency today, even merely in english

https://en.wiktionary.org/wiki/simple#Adjective lists nine main meanings of which seven are current, though i admit i just added the 'easy' one myself; it wasn't listed previously (i think because of confusion induced by the polysemy of 'complicated')


> except in hucksters' advertising brochures, i don't agree with your implicit assertion that all the polysemic complexity of 'simple' is merely historical, leaving only 'easy' as a live meaning; i think all the meanings i listed above except for 'harmless' have some currency today, even merely in english

Look at this comment thread and count all the instances where people are complaining that 'simple' is being used to describe difficult things:

https://news.ycombinator.com/item?id=35759449

I'm not the only one who thinks that 'simple' should mean, in technical docs, 'easy'.


git gud


I do sometimes think there are other words that might be more appropriate for the context: parsimony or consistency are related concepts that could be more precise.

But there are some things that are just simple, like a Bankers Box.


'parsimonious' is a pretty decent word here

we could try defining a new word for specifically this sense of 'simple'. for example, 'fivous', 'blorsy', 'bonatchy', 'jatomurous', or 'wislous'. the problem is that if the word gets adopted, it's likely to undergo the same sense shifts as its synonym 'simple'


It sounds like you're confusing familiar (and popular) with simple.


Another Haskell-style cuneiform where a ten-lines-long function will give you a massive headache for days before you can decipher it.


> Egison makes programming dramatically simple!

that doesn't mean anything, it should be "Egison makes programming dramatically simpler!"


This comment makes you patronisingly incorrecter.


This is not hard to read. The main novelty is pattern matching itself, which is making its way into languages that are used (java, javascript, etc) from languages that are interesting (Haskell, lisp, etc)

Take a look at this example from the text, which contains an obvious ___domain modeling error while demonstrating cool things:

  def suit := algebraicDataMatcher
    | spade
    | heart
    | club
    | diamond
  
  def card := algebraicDataMatcher
    | card suit (mod 13)
   
  def poker cs :=
    match cs as multiset card with
    | [card $s $n, card #s #(n-1), card #s #(n-2), card #s #(n-3), card #s #(n-4)]
      -> "Straight flush"
This matches a new kind of poker hand called the "wrap around straight flush," where a straight can wrap around Q, K, A, 2, 3.

  assertEqual "poker hand 1"
    (poker [Card Spade 3, Card Spade 2, Card Spade 1, Card Spade 0, Card Spade 12])
    "Straight flush"
  TRUE
IOW, in their eagerness to demonstrate a really cool match on a mod 13 expression (something I haven't seen before), the author models the ___domain incorrectly.

It's also somewhat confusing that the card ranks are shifted down by one: ace=0, 2=1, 3=2, etc. I tried for about 10 minutes to fix it, but the only notation documentation I found is math that is way over my head.
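A Python sketch of the modeling error described above (the hand encoding and helper names are mine; note also that Egison matches the hand as a multiset, trying orderings for you, while this sketch assumes the ranks already come in descending order):

```python
def is_straight_flush(hand):
    """hand is a list of (suit, rank) with ranks in the thread's 0-based
    encoding, sorted descending. Mimics the Egison `(mod 13)` pattern."""
    suits = {s for s, _ in hand}
    ranks = [n for _, n in hand]
    if len(suits) != 1:
        return False
    # the pattern's demand: n, n-1, n-2, n-3, n-4 -- all taken mod 13,
    # which is exactly what lets the hand wrap around the ace
    n = ranks[0]
    return all(r % 13 == (n - i) % 13 for i, r in enumerate(ranks))

# 3, 2, A(1), ace-low 0, and rank 12 wrapping around: accepted as a
# "straight flush", which real poker rules would reject
print(is_straight_flush([('S', 3), ('S', 2), ('S', 1), ('S', 0), ('S', 12)]))  # True
```

A fix would need an extra guard that the run does not cross the 12/0 boundary, i.e. that `n >= 4` in this encoding, which is awkward to express inside the mod-13 matcher.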


For systems use, binary manipulation is essential. Elixir/Erlang's binaries are nearly optimal in this regard. I may be wrong, but it doesn't seem far removed to include such features.


I’ve had the chance to chat with the author of Egison a while back. A brilliant character.


if i wanted abstractions for the problems this language aims to solve i would personally prefer to either find suitable libraries in lisp or write some myself than learn an esoteric language where i don't know how it works under the hood


> if i wanted abstractions for the problems this language aims to solve i would personally prefer to either find a suitable libraries in lisp

Here you go:

- As a Common Lisp library: https://github.com/zeptometer/egison-common-lisp

- As a Scheme library: https://github.com/egison/egison-scheme



