Also, I've previously looked into Ometa, the predecessor of Ohm, and I found it to be possibly the cleanest parsing solution available. Check it out if you ever feel parsing-monious.
A colleague of mine did some very impressive work with Ometa; it seems to be not just a parser but a system for building all sorts of compiler-like things.
It's a metaprogramming language. The idea is that you can describe both new ways to express a program and the program itself. The most well-known form of this is probably DSLs, which let you express a solution concisely in a given ___domain. Kay and his people do that extensively at VPRI.
OMeta is/was a pleasure to work with. I used it for parsing s-expressions [1]. I also chose it because there were (experimental) C# implementations. Now ANTLR 4 also supports C# but OMeta has less friction for learners.
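For a sense of what an s-expression parser has to handle (atoms, nested lists, whitespace), here is a hand-rolled, stdlib-only JavaScript reader. This is a toy sketch for illustration, not the OMeta version referenced above:

```javascript
// Minimal s-expression reader: "(a (b c) d)" -> ['a', ['b', 'c'], 'd'].
function readSexpr(src) {
  let pos = 0;
  const skipWs = () => { while (/\s/.test(src[pos])) pos++; };
  const readForm = () => {
    skipWs();
    if (src[pos] === '(') {
      pos++; // consume '('
      const items = [];
      skipWs();
      while (src[pos] !== ')') {
        if (pos >= src.length) throw new Error('unclosed list');
        items.push(readForm());
        skipWs();
      }
      pos++; // consume ')'
      return items;
    }
    const start = pos;
    while (pos < src.length && !/[\s()]/.test(src[pos])) pos++;
    if (pos === start) throw new Error(`unexpected character at ${pos}`);
    return src.slice(start, pos); // an atom
  };
  const form = readForm();
  skipWs();
  if (pos !== src.length) throw new Error('trailing input');
  return form;
}

console.log(readSexpr('(a (b c) d)')); // [ 'a', [ 'b', 'c' ], 'd' ]
```

A PEG grammar for the same language is only a few lines, which is part of why s-expressions make a nice first exercise for tools like OMeta or Ohm.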
Yeah, the only implementation right now is in JS. They do want it to be language-independent, though I'm not seeing that happening anytime soon x]
Ohm is language-independent, but our reference implementation is in JS. There is also a Smalltalk-based implementation that was done by Patrick Rein at HPI: https://github.com/hpi-swa/Ohm-S
We're really happy that HN is interested in our work, but apart from Ohm and the Ohm Editor, most of the projects listed are in an early stage and are not yet ready for public release. It's too early to say which ones will become "real" in the way that Ohm has, but hopefully we can at least release them in some form, as we have with the older projects linked at the bottom.
Is it possible to contribute to YCR without being a permanent member? Not looking to be paid but I am working on related technologies and prefer to be connected to a cohesive group.
I'm also interested in this -- I've been working on something similar in spirit to these and it would be really awesome if there was some way to share ideas and get feedback.
Though I also understand that they probably want to work as deeply as possible on these things without the distraction of an open source/contributor community.
Code release isn't always the most important thing. Bret Victor never released code for any of his prototypes, but the ideas conveyed in his interactive essays have inspired many hackers to come up with their own solutions, or to adapt his ideas into their systems. e.g.,
Not sure if this is intentional or not, but at first I thought this was something to do with Flex (the lexer generator, as in one half of Flex and Bison). That's even more confusing since the first graphic on the page contains a BNF document and talks about compiler generation.
If Flex was a business with a trademark this would count as infringement without a doubt. Same type of product, same name.
The name "flex" itself was based on the name of the original Unix lexer generator "lex", just as its sibling parser generator "bison" was a play on lex's parser-generator sibling "yacc" (which stands for "yet another compiler compiler"). Fun fact: one of the original authors of lex was Eric Schmidt, now executive chairman of Alphabet.
Yeah, but how many of them are involved in compiler development? I think this is in poor taste. I can't imagine that they're not trying to capitalize on flex's fame.
Alan Kay's Flex quickly became Smalltalk, while Flex (the lexer generator) has spent the last 30 years doing its thing under that name. It might not be deliberate, but it's definitely confusing, and they definitely should have realized it would clash within the compiler-development namespace. It's like making an HTML parser and calling it Chrome.
I appreciate the demos for each of these applications, but the descriptions could be more specific. I'm interested in what Trainee _does_ but it's hard to understand. There are many questions left unanswered by this page, such as:
How are problems with unknown variables displayed?
How are more sophisticated games displayed, such as two-player games (Connect 4), imperfect information games (Minesweeper), etc?
I wish the Ohm editor allowed custom grammars for describing the grammar.
In fact, I wish more projects based on Ometa and Ohm allowed this so there's a better chance they can bootstrap each other. Of course, some semantics might still be missing but at least parsing would work.
Hi, I'm one of the developers of the Ohm Editor. Can you elaborate a bit on what you mean? Or maybe you'd just like to file an issue at https://github.com/harc/ohm-editor and we can discuss it there?
So right now, I can write the grammar G1 for a language L1 in Ohm and parse strings written in L1 by putting them in examples.
If L1 is a language for describing grammars (such as Ohm with a slightly different syntax), one of the "examples" could be the grammar G2 for a language L2. But then I can't give the editor examples of L2 to be parsed!
(Of course if L1 has extra semantics instead of just syntax, it wouldn't be easy to change anyway, except by editing the Ohm Editor's source.)
I could, of course, rewrite this input in Ohm's grammar syntax just so I can visualize the parse in the Ohm Editor, but it would be faster if I only needed to write G1 (that is, G2 translated to Ohm, but matching the same language as L2 = L1) and make minor adjustments to the input.
A full translation of this "input" to Ohm would also make it harder for me to port bugs fixed in the Ohm version back to the original version.
The next step would be to make this input the grammar G3 and give it inputs which can be visualized in the editor.
I'm not sure if you intend to have the Ohm editor used like this or not though.
Very nice project, by the way. I really like how you show the parts of the input that match, with the option of expanding rules.
So basically Ohm is Ian Piumarta's peg compiling to JavaScript instead of C. And the Ohm Editor is the visualizer for when you go down the operator-precedence rabbit hole, which is the only PEG problem compared to a conventional back-recursive parser generator. One advantage of PEGs is that they don't need a flex lexer. Strange name, then. I would have called it pegjs.
> Like its older sibling OMeta, Ohm supports object-oriented grammar extension. One thing that distinguishes Ohm from other parsing tools is that it completely separates grammars from semantic actions. In Ohm, a grammar defines a language, and semantic actions specify what to do with valid inputs in that language. Semantic actions are written in the host language -- e.g., for Ohm/JS, the host language is JavaScript. Ohm grammars, on the other hand, work without modification in any host language. This separation improves modularity, and makes both grammars and semantic actions easier to read and understand.
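As a stdlib-only illustration of that separation (this is the idea only, not Ohm's actual API; all names below are invented): the parser knows only the grammar and builds a structure, and each "operation" over that structure is defined independently.

```javascript
// Grammar: Sum -> digit ("+" digit)*
// The parser builds a CST that carries structure but no meaning.
function parseSum(input) {
  let pos = 0;
  const digits = [];
  const readDigit = () => {
    if (!/[0-9]/.test(input[pos])) throw new Error(`expected digit at ${pos}`);
    digits.push(input[pos++]);
  };
  readDigit();
  while (input[pos] === '+') { pos++; readDigit(); }
  if (pos !== input.length) throw new Error(`unexpected input at ${pos}`);
  return { type: 'Sum', digits };
}

// Semantic actions live apart from the grammar, one function per operation:
const evalOp = (cst) => cst.digits.reduce((a, d) => a + Number(d), 0);
const lispOp = (cst) => `(+ ${cst.digits.join(' ')})`;

const tree = parseSum('1+2+3');
console.log(evalOp(tree)); // 6
console.log(lispOp(tree)); // "(+ 1 2 3)"
```

Because neither operation touches parsing state, you can add a third (say, a pretty-printer) without changing the grammar at all, which is the modularity claim in the quote above.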
In the "Selected Past Work" section there's "Natural Language Datalog", which I would be interested in learning more about, but the link seems to be wrong. Does anyone know where to find more info on it? My google skills are failing me.
Interesting work. I'm particularly curious to see how the "Natural Language Datalog" will evolve; it seems to have various potential use cases, from conversation in social and occupational settings to specialized project tasks, learning and building across various sectors, etc.
It reminds me of some previous roles where my key task was "translating" abstracted information and creating interfaces for less technical audiences to interact with technical information.
I just added a link to a GitHub repo and live demo for the Natural Language Datalog project. (You can see them if you hover over the image, on the Flex website.)
Where is the chorus project hosted? Was browsing around on github for related devs and YC but couldn't find it. Have these tools been officially released yet?
I have been following the Edwards projects for years; I find them (and his views) mostly very interesting, but they move extremely slowly. This one seems the most interesting by far, but I too would be very interested to see a repository / download. Maybe it is there somewhere, but...
Hi Jonathan, "Subtext 3 and 4 were successive experiments in a new semantics of mutable state I describe as synchronously updateable views." What does this mean? Is Chorus a database?
edit: the answer from his paper: "Typical applications are built from at least three distinct technology stacks: a database, a programming language, and a UI framework. Each of these technologies has its own semantics, and much of the complexity of application programming stems from the need to glue them together. Chorus instead provides a single unified model built upon our prior work on Subtext (Edwards 2004-2014). Our statically typed tree structures are effectively databases: the types are schemas, collections serve as tables, and references serve as relationships. All data is persistent, and all execution is performed in concurrent transactions."
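A toy JavaScript sketch of that analogy (all names invented for illustration; this is not Chorus code): a plain typed tree where a collection plays the role of a table and object references stand in for foreign-key relationships.

```javascript
// "Collections serve as tables, references serve as relationships."
const artists = [];                    // collection ~ table
const albums = [];

const miles = { name: 'Miles Davis' }; // record ~ row
artists.push(miles);
albums.push({ title: 'Kind of Blue', artist: miles }); // reference ~ relationship

// A "query" is just a walk over the tree structure:
const byArtist = (name) => albums.filter((a) => a.artist.name === name);
console.log(byArtist('Miles Davis')[0].title); // "Kind of Blue"
```

The paper's claim is that once the tree is statically typed and persistent, this single model replaces the separate database / language / UI stacks it describes.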
Together with Bret Victor's Inventing on Principle, I usually show this https://vimeo.com/140738254 to people when explaining my daily existence and the unfairness of it (as in: some of us have these ideas, and yet my CPU cores + memory are overloaded doing trivial DB queries for code completion and rendering HTML, because that is what we write GUIs, including code editors, in now :).
I sympathize with the frustration! It's debilitating to know that there is a better way to program something yet you can't use it because it hasn't been implemented yet. I mitigate the pain by spending some time every week working on my own ideas of how computers / programming should work. Some of these ideas are very similar (at first glance) to subtext / chorus. I knew I couldn't be the only one thinking like that.
I still love programming with computers (and have for the past 34 years); the ideas set forward in these videos and implementations do give the feeling we might get somewhere. I myself only write tools for specific projects. I hope I get a chance to distill something more generic from that some day, but that might simply be too hard. As for Alarming Development, I much liked Subtext & Coherence (where did that site go? [0]); from what I have seen, Chorus is a more high-level approach, and I'm very curious to try it out even though we have not seen much yet.
Would love to see more about the past work projects, especially the bottom-right one. Is this research group opening a line of communication with the public? Gitter, Slack, a Google group?
> The Flex group uses technology to improve the range and fluidity of human expression. We invent new concepts and representations that amplify people’s ability to create, connect, and understand. We create tools that blur the line between using and creating, in order to provide a conversational medium for thinking and doing.
What the hell does this even mean? Usually I don't mind marketing-speak at all, but this is literally comical. I was expecting some art or creative related .. something. What I got was tools for building programming languages.
It's what we honestly do (and aspire to do) as best explained in a couple of sentences. But as always, judge us by what we do rather than the vision that we profess.
> At flex we believe that good interfaces should empower users. We try to blur the line between simply using a user-interface and being a programmer. We do this by thinking about new ways to intuitively represent programming concepts, and endeavor to provide tools that have a smooth learning curve that takes you all the way from just being a user to being a programmer.
Unless I'm horribly misunderstanding your goals. Which really I probably am because you're horribly vague.
Well as I understand them it's not wrong, it's just very limited. It would be like describing Engelbart's projects as "Making interaction with the computer more graphical and efficient such as by employing pointing devices and interactive text", whereas ARC's aim was much more visionary.
I think their ideas go beyond just making programming more accessible (your summary), but more along the lines of "using computers as instruments of enriching thinking".
One of the projects (Chorus) even specifically mentions and rejects programming as a problem-solving tool (for that specific ___domain of problems), instead opting for thought-process enhancement from the other side, basically making spreadsheets on steroids.
I'm interested in and sympathetic to the vision I believe you're trying to express - that we have powerful new opportunities with computers/software to develop and express ideas in more powerful, more intuitive ways, and that we should be explicitly thinking about how we can take advantage of this opportunity to increase the range of ideas that it's possible, and easy, to think about and with.
I agree with the other commenters, though, that the statement of vision is unclear; the meaning of it is hard to pin down to something concrete. What the Flex group is actually trying to do remains ambiguous after reading it. If it matters that people who run across that statement without other context can understand it, you may be leaving value on the table that could be captured with a more intelligible version of it.
I'm still having trouble figuring out why anyone would actually want to make their own language, on a real project and not just BYOSchemeForTheLearningAndLulz or whatever. I guess I'm just really spoiled by Ruby, which makes it so easy to add semantics to a system that I absolutely never want new syntax.
Introducing a parser generator workflow to a project that already has access to Ruby is almost certainly Doing It Wrong. Every time I've tried it, I ended up scrapping it and just did it in Ruby with a DSL. You're going to want to access and control whatever it is you're writing with Ruby, so why not just stay in the language?
I guess if you're coding in something really dull like Go or Java, you can easily get to a point to where the developer experience is constrained by the language so you need a new one.
One very simple reason is just a file format. Let's say you want to make a configuration file that's more expressive than JSON or something similar. You really don't want to make a Ruby DSL for your configuration file because executing a configuration file is "a really bad idea" (TM) :-)
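To make that concrete, here is a toy sketch of the declarative alternative: a parser that accepts only `key = value` lines, so the config file has no way to execute code. (Illustrative only; a real project would reach for an existing TOML/YAML/INI parser.)

```javascript
// Parse a minimal "key = value" config format. Blank lines and
// "#" comments are skipped; anything else is rejected with a line number.
function parseConfig(text) {
  const config = {};
  for (const [i, raw] of text.split('\n').entries()) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue;
    const m = /^(\w+)\s*=\s*(.*)$/.exec(line);
    if (!m) throw new Error(`line ${i + 1}: expected "key = value"`);
    config[m[1]] = m[2];
  }
  return config;
}

const settings = parseConfig('# demo\nhost = localhost\nport = 8080\n');
console.log(settings.host, settings.port); // localhost 8080
```

The whole point is that `parseConfig` can only produce data; an executed Ruby (or JS) config could do anything the interpreter can.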
I've made many parsers in my career. I even wrote a SIP parser using antlr which I'm particularly proud of. There are lots of places where you need to parse.
Moreover, there are many places where people fiddle with hand-written parser code, nested functions evaluating regexes with many holes and unhandled edge cases - where they should have used a parser generator instead.
So my impression is that in general, parser generators are underused, not overused.
Anything that encourages people to use parser generators is IMHO a good thing. Ideally, parser generators should become as ubiquitous as template languages, with a similar race among libraries for ease of use, suitability for the most common cases, and a lower entry barrier.
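A concrete example of the regexes-with-holes failure mode described above (function names are made up for illustration): regexes can't track nesting, so a "quick" regex extractor quietly mis-parses, while even a tiny recursive scan gets it right.

```javascript
// Extract the argument list of f(...) from a call expression.
// The naive regex stops at the first ')', breaking on nested calls.
const naive = (s) => /f\(([^)]*)\)/.exec(s)?.[1];

// Counting parentheses handles nesting correctly.
function balanced(s) {
  const start = s.indexOf('f(');
  if (start < 0) return undefined;
  let depth = 0;
  for (let i = start + 1; i < s.length; i++) {
    if (s[i] === '(') depth++;
    else if (s[i] === ')') {
      depth--;
      if (depth === 0) return s.slice(start + 2, i);
    }
  }
  return undefined; // unbalanced
}

console.log(naive('f(g(x), y)'));    // "g(x" -- wrong
console.log(balanced('f(g(x), y)')); // "g(x), y"
```

A grammar-based parser makes the nesting rule explicit instead of leaving it as a hole to be discovered in production.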
Templating systems, now that might actually be a use case! But again, there are already lots of templating systems out there. Just pick one and go on with your project.
> But again, there are already lots of templating systems out there. Just pick one and go on with your project.
But that's exactly my point: You can do that for template systems, but not for parser generators. (Well, you can, but your choices aren't nearly as nice.)
Well, sure, but you don't just use parser generators for the hell of it. They have to fit into the broader overall context of a system that drives outcomes.
It's easy to see what role a template system can play in such a project; much harder, for me anyway, to see what one would actually use parsing for outside the rarefied context of building a new general-purpose programming language. Anything you'd want to parse, there are already going to be parsers written for. Just pick one and go.
If you're just screwing around, by all means knock yourself out. But in all the real work serving real humans I've ever done, I have never been able to justify building out a parsing system. I can't even look at a regular expression without wanting to rip it out.
I can see if you're maintaining the library that, say, does TOML parsing for Lua. But that's not in the realm of the everyday.
There are lots of configuration formats that already have parsers, why does everyone need their own? TOML, YAML, INI, bash convention, those are just off the top of my head.
Whatever you need out of a configuration file format, it's probably already been done, and someone else has already done the hard work of specifying the language and writing a parser for it, all you have to do is hook into it.