> So instead of [building the browser one feature/spec at a time], we tend to focus on building “vertical slices” of functionality. This means setting practical, cross-cutting goals, such as “let’s get twitter.com/awesomekling to load”, “let’s get login working on discord.com”, and other similar objectives.
Seems similar to how Wine is developed: instead of just going down the list of API functions to implement, the emphasis seems more on "let's get SomeProgram.exe to run" or "let's fix the graphics glitch in SomeGame.exe". Console emulators (especially of the HLE variety) seem to have a similar flow.
I really like this approach because it’s a low-effort way to prioritize development.
I did this with the CPU for my Game Boy emulator: I picked a game I wanted to get working and just kept implementing opcodes each time it crashed with an “opcode not implemented” error.
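A minimal sketch of the shape of that loop (toy code, not the actual emulator; a couple of real Game Boy opcodes filled in, flags and timing omitted):

    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>

    // Toy sketch: just enough CPU state to show the
    // "implement opcodes as the game trips over them" workflow.
    struct CPU {
        uint16_t pc = 0x0100; // Game Boy cartridge entry point
        uint8_t a = 0;
        uint8_t memory[0x10000] = {};

        void step()
        {
            uint8_t opcode = memory[pc++];
            switch (opcode) {
            case 0x00: /* NOP */ break;
            case 0x3C: ++a; break;                                     // INC A (flags omitted)
            case 0xC3: pc = memory[pc] | (memory[pc + 1] << 8); break; // JP a16
            // ...each crash below adds another case here...
            default:
                std::fprintf(stderr, "opcode not implemented: 0x%02X at 0x%04X\n",
                             opcode, static_cast<unsigned>(pc - 1));
                std::abort();
            }
        }
    };

Every abort() is effectively the game telling you what to implement next.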
It's also a great approach for porting games - replace all platform-specific code with assert(false) until it compiles, then fix the asserts as you encounter them until everything works.
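A sketch of what those stubs might look like (hypothetical platform functions, just to show the idea):

    #include <cassert>

    // Hypothetical platform layer for a port in progress: every function is
    // stubbed so the game compiles, and each assert fires the first time the
    // game actually needs that piece of the platform.
    void platform_play_sound(int sound_id)
    {
        (void)sound_id;
        assert(false && "TODO: sound playback not ported yet");
    }

    void platform_read_gamepad(int* buttons)
    {
        (void)buttons;
        assert(false && "TODO: gamepad input not ported yet");
    }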
This approach works better for Wine where the Windows binaries are a fixed target.
On the web, you may get Twitter's feed rendering acceptably, and then two days later they ship an insignificant redesign that happens to use sixteen CSS features you don't have and everything is totally broken again.
The point isn’t that “getting X to work” is a one-and-done job. Rather, you’re using major websites as indicators for what features to target next, because those are largely one-and-done.
Of course you are correct that supporting Twitter or any service is a moving target, so long as it changes. But that doesn't mean specific bugs can't be captured in a test case.
At one point, he deletes half the HTML file to isolate where in the site the problematic code is, in effect doing a kind of binary search. After a few iterations of this, he comes up with a very small case that exhibits the problem he's trying to solve.
It's clear he knows his way around the codebase and where to make changes, but isolating these test cases is probably as important. And presumably if you fixed enough of these issues (while following the specs), 99% of the modern web should work just fine.
> On the web, you may get Twitter's feed rendering acceptably, and then two days later they ship an insignificant redesign that happens to use sixteen CSS features you don't have and everything is totally broken again.
this is not directed at you, but at this attitude which is very common and which I see all the time: everyone is lightning fast to come up with reasons that something won't work.
why?
why do people say things without understanding that almost any given problem has subproblems, and that those can be solved?
in humans, negativity is always just under the surface, and positivity is often buried deeply, and I do not understand this. I don't think I ever will. people just love to be contrarians.
It's really tiring. It's everywhere on HN, I deal with it every day at work, and everywhere else I look.
Instead of it being criticism, the commenter could've seen it as a positive. Every time a site changes, you discover functionality you haven't implemented yet. Over time, you've implemented more and more. It's progress. Progress is good. Choosing the negative interpretation is so endemic and arbitrary and simply unnecessary.
If you have infinite time and resources with perfect communication/understanding you can solve a lot of engineering problems. No one has that. This is where the original quoted claim from the article comes from, "building a web browser is impossible". That's encoding a lot of experience and reality of the Brobdingnagian challenge of building a web browser from scratch on 2023's web.
It's not negativity to point out a downside to an approach to a particular problem. It's potentially useful to get feedback on a development approach. Constructive criticism is very important in engineering projects because it encodes assumptions of limitations we all have.
A positive statement like "oh those sub problems are solvable!" doesn't really provide any help. No shit the problems are solvable in a perfect world. Such statements aren't even necessarily constructive because they don't offer any analysis or advice. It smacks of toxic positivity[0].
not all positivity is toxic positivity, you know. I wasn't even positive, I was just anti-negative. being against negativity is not the same as being positive at all.
spouting out a problem you foresee being revealed after another problem is solved is not constructive criticism, it is reactionary and attention-seeking.
my comment is about comments like yours; unlimited time and energy to mention anything that makes what I say sound bad, improbable or difficult, and zero time or energy to even entertain the idea that my point of view is valid, and worth considering.
It's just an effort to enforce social conformity. A specific case of this type of browbeating may not be helpful, but on the whole it's often positive for the group to be mostly uniform. It's also often negative! More a value-neutral standard human tribal grouping behavior than anything.
With regard to the negative interpretation having positive value: yes, consider working in a company that values ‘not failing in particular cases’ more highly than ‘working in general, with room for improvement’.
For example, in a conservative corporation where a given project requires a ‘go’ from several departments, where the success of the project does not give an immediate advantage to those departments, but a failure will require them to explain why they didn’t ‘catch it in review’.
Sure, but Wine is a tool where if 90% of the stuff a user wants to run in it works, it's still great. Say, if you use it for gaming, you can play most of the games and only boot into Windows every few months when you hit one you can't.
But that's not how you use the web. If 90% of the pages worked in a browser I wouldn't use that browser, ever, because chances are I'd hit one that didn't at least once every few days.
It depends on the types of bugs and inconsistencies that show up in practice. If a page (or worse, the whole browser) crashes, that's a problem. If a column of ads doesn't scale right and gets bumped down to become its own lonely row below the main content, that's kind of ugly but almost a feature.
> But that's not how you use the web. If 90% of the pages worked in a browser I wouldn't use that browser, ever, because chances are I'd hit one that didn't at least once every few days.
This is a fact that Microsoft understood and pushed when they tried to get people to build pages for IE instead of working across both Navigator and IE.
I come across a variety of sites that show unusual behaviour every day. And as others pointed out, the page is mostly degraded, not unusable.
There are many grievances when using the web today: some are down to the lack of a set CSS spec, others to the complete and utter disregard for browser compatibility. I'm not going to tackle the monopoly of Chrome as a browser. However, there are a number of specific uses of this collection of ever-changing specs and implementations that eventually lead to page breakage in every new browser release.
The web is complex to tackle because everyone seems to think that they've a better idea of what a page is. Some of it is fair, some of it unfair. Nevertheless I would take this approach any day.
I completely agree. The people advocating for this type of development style are in another universe. The web is incredibly fragile.
There's a vast difference between a page being degraded by all browsers in a consistent manner per the W3C specs (especially the critical parts of a webpage, such as JS execution or malformed HTML) and the damn thing breaking in such a unique way that the web devs will never be able to fix the page for this new browser while getting it to work the same for the others. The worst case would be security being compromised, and that is a very long list of things to implement in both the HTTP layer and browser behavior before you even get started trying to render a page.
Web devs shouldn't be fixing pages for individual browsers but instead using features conservatively and degrading gracefully where features are not available. Browser bugs should be fixed in the browser.
And for 99% of websites security doesn't matter at all because it's just a one-off visit to read an article or look at some funny pictures without any user account.
Web browsers tend to degrade a webpage rather than fail to load it. Given that every day using Chrome I'll come across a website that is having issues with rendering, some degradation should be acceptable.
No wise old pro I ever met in any field who was actually worth listening to ever says things like "my young Padawan". Saying that makes you look like two kids in a trenchcoat trying to pretend to be an adult.
Which in this case is not inconsistent with this apparent misunderstanding of what moving goalposts means.
To use your own silly words, targeting the features that are the most used is literally one way to choose challenges wisely.
In general the Windows binaries are not really a fixed target for Wine either - many applications (e.g. multiplayer games) have online components that require you to run the latest version. And even for others, people will demand that the latest version work. Fixing things that are actually broken first is still a good way to prioritize - it's not like you stop implementing functionality once one target program works, you just move to another one. And if you run out of interesting programs then you can look at 100% API coverage.
yep, even better in unmaintained consoles, where the popular demand for concrete binaries is pretty much set in stone and you can even aim to complete the entire library of software binaries over time - as they're usually in the hundreds or low thousands rather than millions or billions
however the aim to build a reasonably sized and not ossified-to-previous-spec web browser is very interesting, especially if it's well engineered and made to be portable
> Console emulators (especially of the HLE variety) seem to have a similar flow.
Good insight.
I was going through this same spiel in my head the other day.
It's a flow that, if properly managed, can provide a good feedback system: it gives the developer positive feedback and, at the same time, successful milestones.
Say I'm building an emulator for a simple architecture with a few dozen opcodes...
"Alright. Let's start. Where do I start? How about NOP." So you implement NOP. You write some tests for it. Maybe you build a pretty printer into your opcode and you test it on disassembling a single byte file with a single NOP opcode.
Suddenly you have a working disassembler! It's obviously an artificial toy, but it works.
Maybe next you add an INC instruction. Add some tests. You'll need registers...
Build a simple one-INC-opcode binary file. Maybe add an executor in addition to the disassembler. Suddenly you've got registers working. And if you add another INC opcode byte, you can see your emulator changing behavior based on real external input!
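A rough sketch of where you'd be at that point, for an imaginary two-opcode architecture (made-up opcode numbers; the disassembler and the executor share one decode loop):

    #include <cstdint>
    #include <cstdio>
    #include <vector>

    // Imaginary ISA: 0x00 = NOP, 0x01 = INC r0.
    struct Machine {
        uint8_t r0 = 0;

        void run(const std::vector<uint8_t>& program, bool disassemble_only)
        {
            for (size_t pc = 0; pc < program.size(); ++pc) {
                switch (program[pc]) {
                case 0x00: // NOP
                    if (disassemble_only)
                        std::printf("%04zx: NOP\n", pc);
                    break;
                case 0x01: // INC r0
                    if (disassemble_only)
                        std::printf("%04zx: INC r0\n", pc);
                    else
                        ++r0;
                    break;
                default:
                    std::printf("%04zx: unknown opcode 0x%02X\n", pc,
                                static_cast<unsigned>(program[pc]));
                    return;
                }
            }
        }
    };

    int main()
    {
        Machine m;
        m.run({ 0x00, 0x01, 0x01 }, /*disassemble_only=*/true);  // pretty-print the "ROM"
        m.run({ 0x00, 0x01, 0x01 }, /*disassemble_only=*/false); // actually execute it
        std::printf("r0 = %u\n", static_cast<unsigned>(m.r0));   // r0 = 2
    }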
And so on. It's an interesting flow, you're right.
I don't think that link supports your comment. It says that vertical slices (at least as described by that article) are generally unrealistic in game dev.
You're right! I skimmed the first paragraph, as I was mostly just looking for a description of vertical slices in games. The rest of the content does not in fact validate my statement.
You're right that the "vertical slice" approach has been used very successfully in games development, though. Mark Cerny[0] has been evangelizing[1] the idea that preproduction isn't over until you have a "publishable first playable" (a.k.a., a complete vertical slice) of your game.
Basically, you shouldn't switch from "preproduction" to "production" until you can show (A) here is actual gameplay, (B) it is, in fact, fun, and (C) you know how to actually implement it. Until you can demonstrate those things, how are you supposed to estimate how long it will take to build? Or that, once you're done, whatever gameplay mechanics you dreamed up are actually entertaining to a player?
[0] arcade game programmer, producer / studio exec who got Insomniac and Naughty Dog to scale beyond their founders, & eventual lead architect of the PS4 and PS5
Do we really pretend this is an architecture decision? This is the classic Agile pattern where management wants to show results fast for reward, which leads to a huge tech-debt build-up, as layers are not properly designed and reusable.
It’s neither. There’s no management or promotions or really any incentive to do this other than it’s fun. It’s also not an architectural decision, just how they as a team decide what to work on.
But how does developing a half-baked browser that targets some websites for fun refute that building a browser is impossible? Doesn’t it provide another example that it is impossible, at least for this team?
Well, it takes time to make something big. And, one way to do it is to choose end-to-end functionality. Idk, it doesn’t seem so controversial to me. I’d wait and see before calling it half-baked.
A vertical slice of the properly integrated features needed by some practical use case is certainly more efficient and effective than implementing the whole of a large API (e.g. some fancy recent, unproven and unstable CSS module) "in a vacuum", getting numerous rare cases wrong, and struggling to test the new features.
This is a common misconception. I practice TDD at work and side-projects and find it very productive. TDD is best done with end-to-end tests (or automated integration tests, whatever you wanna call it). You write an end-to-end test (I give input A to the entire system, expect output B), first the test fails (because it's unimplemented) and then you implement and it passes.
It works because then your tests become the spec of the system as well, but if you only write unittests there is no spec of the system, only modules of your code. Which is not useful because if you refactor your code and change this module you need to rewrite the test. Whereas in TDD your tests should never be rewritten unless spec changes (new features added, new realization of bugs etc). This way "refactor == change code + make sure tests still pass".
You're of course free to write unittests as well, when you see fit, and there is no need to target a religious X% coverage rate at all. I think coverage targets are cargo-culted, unnecessary, time-consuming and religious. The crucial thing is, while you're writing new code (i.e. implementing a new feature or solving a bug) you need to write an automated test that says "if I do X I expect Y", see it fail, then see it pass, such that "if I do X I expect Y" is a generic claim about the system that will not change in the future unless the expectation from the system changes.
In other words, the example in this comment chain: "run a game, see 'opcode X doesn't exist', implement X, rinse repeat" is actually how TDD is supposed to work.
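As a sketch (a hypothetical run_program() boundary over an imaginary NOP/INC instruction set, not any real project's API): the test only states "feed in this ROM, expect register value 2", so it fails until INC exists and keeps passing through any later refactor:

    #include <cassert>
    #include <cstdio>
    #include <vector>

    // Hypothetical system boundary: run a tiny ROM (0x00 = NOP, 0x01 = INC)
    // to completion and report the final register value.
    unsigned run_program(const std::vector<unsigned char>& rom)
    {
        unsigned r0 = 0;
        for (unsigned char op : rom)
            if (op == 0x01)
                ++r0; // INC; NOP (0x00) does nothing, anything else would abort
        return r0;
    }

    // End-to-end test, written before INC was implemented: input X, expected Y.
    // It never mentions the internals, so refactoring them can't break it.
    int main()
    {
        assert(run_program({ 0x00, 0x01, 0x01 }) == 2);
        std::puts("end-to-end test passed");
    }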
>Console emulators (especially of the HLE variety) seem to have a similar flow.
And emulators that have gone down this path have all regretted it, because they end up making hacks to make <popular game> work, because everyone simply wants to play <popular game>. Dolphin is still paying the price of that method years down the line. Project64 took years to unfuck themselves, ZSNES is forgotten and overtaken by many others that have done the proper thing.
So, sure, you can get some initial usage. But making a browser isn't about being able to open twitter.com
I feel like you may have misunderstood what kling is saying. He's not saying "we will cut any corner to get Twitter to load", he's saying "we look at what it would take to get Twitter to load, read through the relevant specs, and try our best to implement the required features cleanly and correctly".
Loading Twitter (etc.) is not really the goal, it's more of a prioritisation mechanism for tackling a huge spec. Actually getting Twitter to run is a nice reward for all that hard work though, and a series of such rewards keeps the contributors motivated in the marathon that is building a browser.
No, I understand what Andreas is saying. But the reality is, when you read and implement specs for a specific website, you end up cutting corners, even accidentally. Maybe Twitter relies on a particular behavior of fetch() that was screwed up in Chrome 97 and has had to be kept for backwards compat this entire time. Maybe it uses some CSS that never got properly documented or specified.
By targeting a single website, you end up accidentally writing those site-specific fixes _into_ your implementation. You only realize it's fucked up because you visited Twitter. But maybe it screws up another site. Maybe something else depends on a quarter of that functionality, and you've accidentally broken it.
It seems that GP already addressed your concern: "read through the relevant specs, and try our best to implement the required features cleanly and correctly"
Of course some errors will be made along the way. That's to be expected, regardless of the approach taken.
Maybe so. Hopefully you’d find that out while looking at the spec for that particular function. If not, you may have to rewrite some code as you learn more. Life goes on
> ZSNES is forgotten and overtaken by many more that have done the proper thing.
When ZSNES did its thing, which, just to remind people, was back when Pentium IIs ruled the roost and CPUs topped out at 450 MHz, doing things the proper way was not a choice, because the proper way needed 3x the CPU power of doing things the ZSNES way that worked.
Dolphin became popular because it could actually play games, and I bet if they had instead spent an extra 3 or 4 years working on code that was "correct" without releasing anything, well odds are they wouldn't have such a large following and would not have attracted so many contributors.
Users do not benefit from "perfect code" that they never get to use because it is still in development.
Since “The reckless, infinite scope of web browsers” is depicted at the start of the article, I think it’s worth pointing out that its claim of W3C having 1,217 specs totalling 114 million words is wildly wrong, probably by 2–3 orders of magnitude in the total. The considerable majority of the documents considered were not specs or not web-relevant, and dozens of versions of the same thing were often counted. Source: https://news.ycombinator.com/item?id=22617721.
So the web is maybe only 1.2 specs with 114,000 words? I think it's considerably more than that. If that estimate is off, it's by no more than a factor of 10, IMO. No need to exaggerate.
In the source comment he gives it's more explicit, but the "2-3 orders of magnitude" is referring to "the total", i.e. the 114 million words, not the number of specs.
Given that someone that needs to develop a browser probably needs to hunt and peck through all that trash to find the relevant bits of information, is it not actually more damning that such a vast quantity of irrelevant cruft exists?
Is this not the corporate equivalent of creating a walled garden (perhaps not the right phrase here, gastric moat sounds more apt), by exhausting the resources of all that should choose to attempt to scale this mountain of junk?
That being said, I can't make any suggestions as to how you could shortcut through that other than just having decades of experience in the field.
Isn't that true for any system that's been around for a few decades? Try implementing XMPP; which XEPs do you pick? It's a long list.[1] Try implementing email: there's probably more RFCs to exclude than include at this point, and what do you need and what is optional?
This is in fact one of the big issues with XMPP. Everything is sorta-kinda compatible but not really. And email is getting so complicated that many people are scared of running their own server, let alone programming one.
One thing the web specs do incredibly well is cross-linking. I've found it quite easy to start with a high-level spec (e.g. flexbox) and drill down into the bits I need because anywhere another spec is referenced it's linked to directly.
I believe it’s still wildly wrong. Had arp242 not spoken up at that time, I’d have been saying something similar, because the numbers were to me blindingly obviously extremely unrealistic. The entire HTML Standard (which is somewhat of a misnomer now: it covers much more than just HTML, quite a bit of CSS interaction, other web platform functionality, JavaScript APIs, and the like) is now about half a million words by the most generous of counting methods (it’s written in a fairly verbose style, which is a really really really good thing when you compare it to the average IETF RFC), and I suspect it’s bigger than everything else put together, apart from ECMAScript (around 270,000 words¹ and probably growing at a faster rate than all other specs: it’s written in an even more verbose style, most of which is effectively straight code in prose form, whereas in the HTML Standard “straight code” is only a decent chunk of it).
As for WebGL, the WebGL parts are actually quite small. https://registry.khronos.org/webgl/specs/latest/1.0/ is only about 20,000 words. I gather it defers significantly to GLES20 (PDF, 204 pages, ~60,000 words), and GLES20GLSL (PDF, 119 pages, ~30,000 words), and it has GL32CORE in its references (PDF, 404 pages, ~125,000 words), but doesn’t actually cite it in the text and I don’t know if it’s relevant. There doesn’t look to be anything else significant that wouldn’t already be included.
But really, WebGL is a fairly thin layer atop OpenGL ES 2.0, just removing some functionality and applying some restrictions. I believe you would reasonably expect a browser to use an existing OpenGL ES 2.0 implementation, so I’d be quite content to exclude the 90,000 (or perhaps it’s ~215,000?) words of that, just like it’s common to reuse an existing JavaScript engine (though you also don’t have to). Yet note this: it seems that even if we include it all (and presuming I haven’t missed anything, which I admit I could easily have done, I’m not conversant with these specs like I am with HTML/CSS/JS specs), it’s still under 0.2% of Drew’s massively-inflated figure.
—⁂—
¹ Whew, https://262.ecma-international.org/ took me several minutes to download, despite being only 7MB. Sigh; the trials of being in Australia, where things hosted in the USA are often inexplicably painfully slow—like, sub-256kbps. When already downloaded, it renders in under four seconds, which is really fairly impressive when it’s doing all that layout on a document a million pixels tall—this ain’t a PDF where you can only render one page at a time. The HTML Standard is almost two million pixels tall, and also loads completely in under four seconds—simpler styles, perhaps? I refer to it often enough that I build it locally so I don’t have to download its 13MB all the time, or compromise with the multipage version that you can’t search through as easily.
To underscore something that might get lost in the wall:
The entire premise given in Reckless, Infinite Scope is that the number of words in the specification is positively correlated with the intractability of implementing a given thing. From this foregone conclusion, it tries to quantify how much worse the task of implementing a Web browser is. The problem is that the premise is a bad one; even if it takes more time to read a wordier spec, it is easier to implement one that describes well-defined behavior than a terse one that glosses over things and leaves huge gaps of undefined behavior. This is not just conjecture—it tracks with the development and progress of implementing, say, the HTML parsing algorithm; it is easier to implement a correct and acceptable HTML reader in 2023 armed with only the spec than it was to try to do the same thing in 2003, which involved reading the spec and also reverse engineering how other (esp. proprietary) browsers deal with the pages that you find authors actually publishing in the wild. This is a task that was made easier because the standard got bigger.
The point is that its broken methodology doesn't even matter; we don't have to try to come up with better ways of evaluating whether a spec should be included or not because its whole premise is flawed to begin with. Any attempt to produce an input set that you can then use to run a word count analysis is a moot academic exercise at best that will only tell you how many words it contains.
A more detailed spec might have more concrete definitions, but it also means more actual code for someone to write. With an under-detailed spec you might have a switch with a couple of defined values and an undefined catch-all. A super-detailed spec just adds case statements and requires more code to handle them. The detail in the spec makes for lower cognitive load, but the code still needs to be written, and ideally tests written.
> A more detailed spec [...] means more actual code for someone to write.
No, it doesn't. A detailed spec has the same amount of code to write as a spec for the same thing with less detail; for the types of specs relevant to this discussion, the primary requirement of "does what the other browsers do" exists whether the details are made explicit in the spec or not. More code is a consequence of an increase in requirements, not detail.
In any case, neither circumstance is I/O bound to begin with.
No. If a spec doesn't define a behavior code can jump to some "undefined" handler which could be anything from a no-op to some quirks mode. Unless you're Microsoft writing specs "do what Word 97 does", copying the behavior of existing browsers is not a specification.
Please don't ignore the context. We are talking about Web browsers.
You don't, in reality, have the latitude to do "anything from a no-op to some quirks mode" of your choice. The requirement is absolutely the one stated: to be compatible with what other browsers are doing. If your browser doesn't satisfy that requirement, then you break the Web, regardless of whether the spec is a hundred words or a hundred million. No amount of pointing at a standard and arguing that it doesn't specify clearly defined behavior in some area will ever be enough to teach a site to be able to say, "Oh, I'll just unbreak myself then so you can go ahead and view/use this page on your computer."
Besides that, even if you were right—and to be clear, you aren't—that doesn't change the fact that, again, arguing for underspecification because "a couple defined values" isn't as much "actual code" that "still needs to be written" is an argument that approaches a problem that isn't I/O bound as if it is.
> and I suspect it’s bigger than everything else put together, apart from ECMAScript
The thing is, it's not just the HTML standard. It's also all the standards it references. And all the standards they reference, and all the standards those standards reference, ad infinitum.
For example, HTML 5 references SVG 2 which references CSS 2 which references Unicode and XML 1.1. Or, to go the same route, HTML 5 references SVG 2 which references CSS 2 which references ICC.1:2004-10 (Profile version 4.2.0.0) Image technology colour management which references (normative) ISO/IEC 646:1991, Information technology — ISO 7-bit coded character set for information interchange, IEC 61966-2-1 (1999-10), Multimedia systems and equipment — Colour measurement and management — Part 2-1: Colour management — Default RGB colour space — sRGB and TIFF 6.0 Specification, Adobe Systems Incorporated among other things.
Yes, some of those overlap (as many standards will reference many of the same standards), but the number of those standards is definitely non-trivial. Some of them you can probably pull in as system libraries or external libraries. The question is, how many?
Edit: and some of them are definitely not relevant to the web, but how would you know until you read through the spec that references it, and through the referenced spec to find and understand the relevant bits?
Essentially any specification that includes any kind of image support will include this kind of chain of specifications; just as any system that does networking will eventually end up with TCP, any system that does text ends up with Unicode, etc. Even the simplest possible 1995-esque browser will have to deal with that (support for images was added in 1993, and text and networking were always central).
> Even the simplest possible 1995-esque browser will have to deal with that (support for images was added in 1993, and text and networking were always central).
Making a web browser from scratch is like making a hamburger from scratch: the problem is not the first part, but what you truly mean by "from scratch".
ISO 646 and 61966? I won’t disagree with your annoyance with ISO water torture[1], but ASCII and sRGB are not the examples of needlessly sprawling web of references I would’ve chosen. Even if sRGB is an utter mess[2], it’s a mess you essentially have to use if you’re doing colour on computers.
I just randomly selected some without going too deep into details. But yes, sRGB is also referenced from CSS because, you guessed it, CSS deals with color :)
How do you even start implementing such a spec? I know it's probably a dumb question, but how would one structure their code to check all those boxes? Does it usually involve reading the entire spec, then figuring out the foundational parts and building from there? Does that work when you need multiple specs to "fit" together?
How do you make sure or even check that some code you built for some part of the spec does not interfere with something else?
I implemented a few specs in my short career but nothing even close to that. It's actually mind boggling that we manage to have all those moving parts fit together.
Other than layout and rendering, implementing HTML, ECMAScript and CSS is genuinely easy. There’s a lot of it so that it’ll take you a long time, but it’s very much not hard, because the HTML and ECMAScript specs fully spell out the algorithm, telling you exactly what you must do (or, more precisely, what you must be equivalent to doing: e.g. “implementations must act as if they used the following state machine to tokenize HTML”), so it’s largely mechanical. This is very unusual in specs. I wish it were less so.
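To give a flavor of that "act as if you ran this state machine" style, here is a deliberately tiny toy tokenizer (nowhere near the real thing, which has dozens of states and handles attributes, comments, character references and error recovery), just to show how mechanical the spec's approach is:

    #include <cstdio>
    #include <string>

    // Toy three-state tokenizer: splits text characters from tag names.
    enum class State { Data, TagOpen, TagName };

    void tokenize(const std::string& input)
    {
        State state = State::Data;
        std::string tag_name;
        for (char c : input) {
            switch (state) {
            case State::Data:
                if (c == '<')
                    state = State::TagOpen;
                else
                    std::printf("character: %c\n", c);
                break;
            case State::TagOpen:
                tag_name.assign(1, c);
                state = State::TagName;
                break;
            case State::TagName:
                if (c == '>') {
                    std::printf("tag: %s\n", tag_name.c_str());
                    state = State::Data;
                } else {
                    tag_name += c;
                }
                break;
            }
        }
    }

    int main()
    {
        tokenize("<p>hi</p>"); // tag: p, character: h, character: i, tag: /p
    }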
The question is what exactly is needed for a useful and functional browser. You certainly don't need all features from Chrome, but you do need more than, say, Lynx or Dillo.
Is WebGL needed? I've browsed the web for years with it disabled and have not suffered any inconvenience. I'd probably say it's not needed, but I'm a bit on the fence about it and can understand if people would disagree. All browsers implement XSLT, but is that actually needed for a functional modern browser? Maybe not? I can't remember the last time I've seen it used, but perhaps it is. And do you include HTTP? Or is that too low-level? Do you include PNG and SVG or just PNG? If you include SVG then why not PNG?
There are some obvious "we need this", some obvious "we don't need this", and a lot of unclear and somewhat subjective area. I do know that you can't really say "yes there's bad data, but it probably cancels out against stuff omitted"; if anything, it only underscored my point that the list is not good.
An uncurated or minimally curated document dump is not the correct approach in the first place, if you do that for SMTP you'd end up with a lot of irrelevant documents too simply because the specification is a few decades old and stuff gets superseded, some things never sees real-world implementations, things no one uses any more, etc.
I started making a better list when the article was originally posted, starting from "okay, let's just check what you need for a useful browser normal people can use every day" and ended up with a few dozen things, but I never really posted it as I wasn't quite sure that was fully correct either and because I never really figured out some of the questions above.
I think most of the complexity stems not just from the word count, but rather from the fact that everything interacts with everything else. Consider the relatively new "position: sticky" in CSS. Okay, great. But it doesn't work well with flexboxes, or RTL, or negative margins, or z-index, etc. etc. [1] Adding what seems like a fairly simple feature is quite complex because it interacts with so many things. It's not hard to imagine a fresh new HTML and CSS which allows all the features the current ones do but does so in a much simpler and more orthogonal way, which would of course break backward compatibility and every website.
[1]: In 2020 anyway; I'm not sure on the current state; here are some of the links of my post from 2020 which like most of my posts I never finished:
> I think most of the complexity stems not just from the word count, but rather from the fact that everything interacts with everything else.
And most specifically in layout and rendering. HTML, JavaScript and the parts of CSS that aren’t, y’know, doing anything, are all very straightforward, despite having the significant majority of the word count. If anything, I’d say that in web matters implementation difficulty is inversely proportional to word count, because its verbosity pretty consistently comes from precision (which makes implementation easy). Layout stuff would be much harder to define exhaustively in that fashion, nor is it done so in most places.
> I think most of the complexity stems not just from the word count, but rather from the fact that everything interacts with everything else.
That is definitely the main issue.
And you're completely correct on the needed/non-needed/subjective front. Many of the standards reference (in a recursive manner) a lot of other standards. I listed some here: https://news.ycombinator.com/item?id=35524018 As an outsider it's impossible to know whether the TIFF spec or the ISO 7-bit coded character set for information interchange are relevant and need to be studied, or are there just because they define some minor values referenced in some higher-level spec.
It's not hard. Start with the WHATWG's spec, then incorporate the other specs it references using a reasonable heuristic to determine if a given item should be included or not.
If you don't think the estimate from Reckless, Infinite Scope is wildly off, then you either didn't read the methodology and do a spot-check of the dataset, or you really don't understand the scope of what gets published by W3C and how little much of it has to do with Web browsers or how many revisions of them there are.
The only bar that the heuristic has to pass here is "delivers a result that doesn't suck as bad as the analysis in Reckless, Infinite Scope". The analysis in that article is so bad, however, that your heuristic can literally be, "if you encounter an item that was also in Drew DeVault's input set, then assign an arbitrary probability 0.9 (or whatever) of whether the item should be counted", and it would still give you a more realistic result than what the article says (and that people are actually relying on in their arguments—and that you are defending) here.
Aside from that, given how many logical errors and weird counterconclusions[1] you've managed to stuff into this discussion, though (and to have been able to do so economically[2]), I'm going to go ahead and say this is my last response to you that I spend more than 10 seconds writing out.
Your example is heavily underspecified. In what form are people’s details added? How is the list printed? A spec that’s actually implementable will be a good deal longer. One that defines behaviour completely (what ordering should you use for equal weights?) will be longer still.
The HTML and ECMAScript specs that comprise most of what we’re talking about are very much closer to line-by-line, because they’re designed to be both implementable and completely specified.
Write a program that keeps track of the name and weight of each person added, sorted by weight, lightest to heaviest. The input should be a command line prompt asking for input in 3 fields - first name, last name, weight. If two people have the same weight, order them alphabetically by last name. At the end, when a blank line is entered, print the list with headings first name, last name, weight. Check the input, if it's not 3 sections or empty, print an error explaining the input format.
497 input characters, 1317 output characters.
In the case of a detailed or verbose spec, you're probably right. I'm just replying to the assertion that it generally takes many words of English to equal little code. If that were true, nobody would be using ChatGPT to scaffold.
Now, if you're going to be detailed about -how- each line should look, I'd agree that English would be more verbose than code.
Still seriously underspecified for an interoperable spec. You haven’t defined the input or output forms anywhere near precisely enough, or how to order alphabetically by last name (sorting depends on locale: e.g. is æ equivalent to ae—though that still raises stability questions—a letter after a, a letter after z, something else? I think there are languages that treat it as each of these. Or are you just rejecting anything beyond ASCII letters, which will cause different trouble?), or what to do about two people with the same weight and last name.
Web specs need to consider all of these sorts of things. That’s why they’re verbose—they’re designed to be implementable and complete.
The fact there are better specs doesn’t help if a large part of the work is handling things that are outside the spec. You better show every “buggy” page similar to how the major browsers show them or the new browser will be considered defective. That’s the unfortunate reality of web tech (I wish every page with a JS error or incorrectly closed tag would be a big fat error message, but it isn’t). And that’s still a lot of slow guesswork, I imagine.
> You better show every “buggy” page similar to how the major browsers show them or the new browser will be considered defective.
Used to be true, I doubt that it is anymore.
There are too few (I could find exactly none, to be honest) sites around these days that are unreadable when rendered strictly according to a newish (say, 2019) HTML/JavaScript spec.
The proliferation of front-end frameworks means that almost no site is going out of spec, and because any site that doesn't meet a large portion of the spec is invisible to search engines, having the site be broken when sticking to the various specs is no issue.
In short:
1. With practically all large-traffic sites using a framework, a browser that strictly sticks to the specs and the specs alone is not at a disadvantage.
2. With the importance of SEO, a site that is unreadable on a recent spec is not going to be found anyway by the large body of traffic.
Conclusion: a browser that sticks to the spec and the spec alone has a fighting chance.
It's the complexity and edge cases of an exceptionally large, complicated and self-contradictory spec, with thousands of edge cases when different parts of the spec are combined, that's the problem.
All this "quirks mode" stuff is part of the specification, no? It makes it all a bit more complex than it has to be, but I do believe it's specified.
I'm not really sure if "you need to be bug-compatible" is still true; it probably was 15 years ago, but Chrome, Firefox, and WebKit tend to be pretty decent these days.
Quirks mode is one thing, but most browsers have specific rules for specific websites, a manual process to update and handle those cases. Pretty sure Chrome and Safari have hundreds of these rules.
It's not clear to me if those are due to shortcomings in WebKit, the site, or if it's to be "bug-compatible" with anything else. Either way, 1,600 lines of code doesn't seem a lot to me.
If anything, websites have become way less clean, with more invalid HTML. I remember people, including myself, putting W3C validator icons on websites. Rarely do I see any these days, because of all the invalid HTML and dynamically created websites. Maybe all the tags are closed nowadays, so maybe at least that. But which elements are used inside which other elements, and whether they are used semantically appropriately, is another matter.
One of the ideas behind HTML5 is that while there is some concept of validity and well-formedness, essentially any random stream of bytes describes exactly one DOM tree; in some cases the resulting tree is surprising, but even then it should be the same across all conformant parsers (modulo scripting support).
The end result is that validation is not that interesting anymore, because the idea was that a valid (X)HTML document should parse the same across all browsers (which it mostly did, but that did not say much about how it was actually rendered).
Like most people I gave up on the whole semantic pedantry a long time ago. Correct header ordering, basic semantics like <nav>: sure, that's great. But "no <p> inside <dt> allowed!" just makes no real sense and is exceedingly pedantic.
The validator badges were kind of a backlash against the tag soup of the day; part of the reason for that was that everyone who knew how to program a VCR could get employed as a "webmaster" in those days, but also because the authoring tools for non-tech authors weren't as good. HN sees a lot of posts from non-tech people, often written on WordPress, Medium, or whatnot. 25 years ago it would more likely have been "tag-soup'd" by some non-tech person who just learned a bit of HTML.
Nowadays, HTML parsing is exhaustively defined in the form of a couple of state machines, so it’ll behave the same everywhere. It’s genuinely easy to implement perfectly (though it’ll still take a while because there is quite a bit of it).
> The fact there are better specs doesn’t help if a large part of the work is handling things that are outside the spec
The way that web specs are handled means that better specs actually bring a lot of those things into the spec. i.e. browser implementers will define a new spec that clearly explains the quirk, and then align on the implementation. There is also a huge test suite which can be used to test conformance.
It's not perfect, but it's definitely a significantly better situation than we had.
It looks like they ported Qt to SerenityOS. I saw a package called "qt6-serenity". Perhaps they use the SerenityOS GUI libraries underneath. Does anyone know?
Ladybird is a browser based on SerenityOS technologies that uses Qt as the GUI framework. In SerenityOS, they have their own browser using the same underlying technologies, but a different in-house GUI framework.
This is similar to how WebKit and Blink have their different counterparts, like QtWebEngine or WebKitGTK. The equivalent of WebKit and Blink in SerenityOS is called LibWeb.
Hmm, I was actually asking if SerenityOS released a "fork" of Qt that re-implements the QPainter class to use their native GUI API. I assume yes. QPainter is turtles all the way down to painting pixels on any platform -- macOS, Win32, X Windows, Wayland, Android, iOS, embedded (auto), etc.
The "modern" distinction would be they are building a browser + browser engine, while most fancy new browsers tend to just reuse existing engines, like Edge did with Blink.
I would really like to see a version with a C API capable of being embedded, there's a lot of places where a lightweight HTML renderer would be useful, plus it would make it easier to port to other hobby kernels.
Agree. This market is currently cornered by WebKit, which is easily the most language and UI framework agnostic engine there is. Blink might be similarly easy to embed but I’ve not seen it used that way — Blink embedding tends to be via Electron, CEF, or Qt, whereas you might run into WebKit in some random program written with any number of UI frameworks.
There used to be Gecko as an option here too, but Mozilla decided that it shouldn’t be usable outside of XULRunner and made it effectively unembeddable unless you’re willing to commit to XUL.
I care more about the choice of licence… and sadly enough, they went with a pushover/permissive one. This seems like the kind of project where copyleft is by far the better choice.
Yes, but they are replacing it bit by bit - I mean, they even started Rust for exactly that purpose. So (without being a huge fan of Rust) the decision to start a "greenfield" browser project in a memory-unsafe language is questionable IMHO...
I don't have an over-time series, but if you're willing to take my memory at its word Rust's percentage has hovered at around 10% for a while now. It seems to have actually gone down recently. Combine that with efforts like Servo being wound down and their team being let go, and it makes me wonder what the future of Rust looks like in Firefox.
If anyone can shed some light on this I'd be interested to know.
I think they stopped the rebuild. They were previously building a new browser engine called Servo. Some of that work made it through to Firefox's Gecko. And then the team was gutted.
This chart looks like it could use some filtering for what constitutes a language used to build Firefox. It seems questionable that HTML is used to build 16 % of it. I suspect that is a result of test cases being included in the chart, as it is based on the whole repo. I checked out the repository and it doesn't have the GitHub language bar I see on other repos, so I can't click the HTML bit in it to filter down the HTML files in the repo and see if they are mostly tests or not, but it is hard to imagine they would be anything else. Maybe bits of the browser chrome, but still, that wouldn't be a whole 16 % I think.
> Yes, but they are replacing it bit by bit - I mean, they even started Rust for exactly that purpose. So (without being a huge fan of Rust) the decision to start a "greenfield" browser project in a memory-unsafe language is questionable IMHO...
Maybe, but the speed with which SerenityOS, its programs and the browser has been implemented, with so few man-hours thrown at it kinda displays why C++ was chosen over Rust.
There is no comparable project in Rust that demonstrates just how quick you can go from "nothing" to Full-Fledged OS, with applications, with a browser.
Just from the Serenity project (if you've been following it), it looks like C++ is about 10x faster to write performant and safe code in than Rust.
Sort of... but that could be due to a host of other factors besides which language is better: how good the core developer(s) are at community-building, how committed they themselves are to the project... hell, even the fact that one project used GitHub (which reduces the friction for developers who are already on GitHub to start working on the project), while the other one has its own GitLab might be relevant.
Note that SerenityOS started in 2018, they decided to use C++ for it, and even the newly created language for safer userspace (Jakt) generates C++ as target.
I do mostly Python at my day job, but for low-level side-projects I've gotta say C++ with the C++17 or C++20 standard is way faster to iterate with than, say, Rust or even something like Zig.
For me iteration speed is a big selling point, which (plus the fact that it's easier to find contributors) might also be important for projects like these.
I’m puzzled by your comment. I have been an expert in Rust for almost a decade, but am only mildly conversant in C++, and have no interest in actively learning more C++.
Rust seems to me far easier to learn and get going in due in major part to its incontrovertibly superior standard tooling.
I can’t see any place for any meaningful difference in iteration speed between the two, save that you may well have to iterate more in C++ due to memory safety bugs the compiler doesn’t catch.
As for finding contributors, I get the impression that Rust is considerably more accessible, and thus will increasingly find contributors more easily, as people that just love programming will actively choose to learn Rust far more often than C++. (For the current state of affairs, I think it’ll depend on what sort of contributor you’re looking for, in skill, industry, paidness, &c. Some segments will certainly go one way, and others certainly the other.)
Iteration speed with both Rust and C++ is abysmal. Builds take for fucking ever on large projects and it's just slightly less bad for small-to-medium and medium-sized projects.
With Rust, though, it's as if someone looked at C++ compilation times (not to mention resource requirements) and said, "I think we can find a way to make it worse."
I have a hard time deciding where in this thread to drop this link, but maybe here is a good spot. Andreas has a video about this topic, and I believe it's this one: https://www.youtube.com/watch?v=vAZvTFoSIFU
Last time I tried the browser, most crashes were the typical "not implemented yet" code paths. Some feature wasn't built yet, so necessary flags weren't set, so the program caught the invalid state and died.
I don't think I've ever seen a crash in either Serenity or Ladybird that I could attribute directly to memory management. For volunteer C++ projects, their memory management seems to have been done excellently. Using modern C++ features and things like error return types instead of null seems to be a key part in making the browser this good.
It's also worth mentioning that as far as I know Ladybird doesn't implement a JIT engine, using bytecode to execute Javascript instead. That should also make life significantly easier for memory management.
It's still a young browser and I'm sure there are some nasty memory corruption bugs lurking in the depths, but I haven't seen those yet.
There are some languages that can be formally verified, and have themselves been formally verified.
Formal verification is a complete pain in the ass to do and there's a reason it's mostly done only in the most critical of systems, but if a program passes 100% formal validation, you're as close to crash free as you can possibly be.
I believe Ada and some other lesser-used languages sport well-supported formal verification methods. You won't be able to use C/C++/Java/Rust if you're going for 100% formal verification though. There are attempts to bring the concepts to more commonly used languages (Frama-C, for example) but in my experience they're stuck in PhD-ware hell, great for writing papers but terrible for writing actual software.
Frama-C is used in production to meet normative requirements for critical software, at least at Airbus (DO-178C), THALES (CC EAL6/7), and EDF (IEC 60880). I think that counts as actual software in production.
But there are actually other languages that are better suited to such methods: More or less everything from the functional space is quite well applicable to those.
This post doesn't give reasons to doubt that building a new state-of-the-art browser is basically impossible now. It might well be possible to build a new browser that kind of works on many popular websites, and that would be surprising enough. But the amount of work needed to build something comparable to the rendering engines of Chrome, Firefox, or Safari, something really usable, would probably take decades rather than years. If it is possible to catch up at all. (I remember once seeing a graph which compared software projects by lines of code, and browsers were only topped by a few things like major operating systems.)
The vast majority of the complexity is in the engine. One can whip up reasonable chrome (browser UI) in whatever UI framework one prefers in a few days tops. While there are slightly more involved parts like writing the bookmarks and history systems, those are pretty run of the mill tasks that can be completed in a relatively short period of time.
Yeah, I think servo had the right idea. The main problem with servo's components is that they're severely underdocumented, which has made it harder than it should be for some components (like webrender) to become widely adopted (some of the other ones like html5ever and cssparser are widely used).
I suppose it depends on what the goal was. If the goal was to end up with widely reusable web browser components, the lack of documentation might've been a problem; but if the goal was to improve Firefox, it seems to have been a smashing success.
It has been good at improving Firefox, but if you look at the wider picture you see that Firefox has been falling in usage and struggling to keep up with WebKit/Blink. And IMO a large part of that is because core parts of Firefox (Gecko and SpiderMonkey) are much less widely used (by as many apps/companies) than equivalents like WebKit/Blink/V8/JSC, and this is because they are not easily embeddable and their codebases are harder to work with.
From this perspective, not focussing on documentation and making components usable externally is pretty short sighted.
More specifically, Chrome used Safari's rendering engine, and Safari used Konqueror's rendering engine, because even in 2001, starting a browser engine from scratch seemed like too much work.
I would say that in 2001, starting a browser engine from scratch was more work than today, because (as other commenters have noted) since then the specifications have become more robust and the "tag soup" sites not following the specs have become fewer.
“Invalid HTML” is completely irrelevant. HTML parsing is defined exhaustively; “parse errors” are purely “you probably made a mistake, but I’ll keep going” indications, and all browsers will do the same thing.
And that's why we can't have nice things, er, why we will never have valid HTML on a significant percentage of websites: browsers are historically very lenient with HTML errors (because otherwise they wouldn't be able to show 90% of all sites), and no one uses HTML validators to check if their HTML actually conforms to the spec. It's a chicken and egg problem really: the browsers can't be more strict because there are so many broken sites, and the sites won't be fixed because the browsers aren't strict enough.
I did a quick check, and most of the errors it reports are "unknown attribute" or "element such-and-such not allowed here". Those "errors" would be allowed anyway for forward compatibility, and aren't really a big deal.
IMO the validator's definition of "invalid HTML" is just too strict; it should only count parse errors and completely non-sensible things. And the specification is also too strict at times; on my own website I have "Element style not allowed as child of element div in this context." This is because on some pages it adds a few rules that apply only to that page and this is easiest with Jekyll. I suppose I could hack around things to "properly" insert it in the head, but this works for all browsers and has for decades and why shouldn't it, so why bother?
If the specification doesn't match reality, then maybe the specification should change...
This does not surprise me at all, with all the "must be a web app" thinking, HTML still being treated mostly as a string in many web frameworks, and tags being used in semantically inappropriate ways. It is exactly as I thought: the ratio of invalid HTML has become even worse. Probably most web devs these days do not even check their websites for HTML validity, because achieving it with the frameworks they chose is hard or impossible.
Moreover, tools are better too. Even if one is still using only vim on the terminal, plugins work better, screens are bigger, code compiles faster, and the internet has better resources for everything from programming and communicating with your team, to just finding music that helps you stay productive, for example.
In 2001 the entirety of HTML+CSS spec was probably less than just some of CSS modules like CSS Color.
Today the complexity lies not in the robustness of the specs, but in the sheer number of them, and their many interactions. I mean, just distance units... There are over forty of them.
Guys, I don't mean to sound miserable, but please don't turn this into Reddit comments with puns and jokes. Let's keep the signal-to-noise ratio optimal.
The specs really are drastically better than they used to be. Compare the modern specification for CSS Table Layout (https://www.w3.org/TR/css-tables-3/) with the older CSS2 one (https://www.w3.org/TR/CSS2/tables.html). The older one doesn't even attempt to define the "automatic layout algorithm" at all!
> Not Ready For Implementation
> This spec is not yet ready for implementation. It exists in this repository to record the ideas and promote discussion.
> Before attempting to implement this spec, please contact the CSSWG at [email protected].
It means that the spec is a draft and that it hasn't been finished yet (there may still be bits missing or wrong). But it is clearly already so much better than the old version.
- the web comprises numerous specifications: HTTP, HTML, CSS, JS, SVG. At worst, a regular compiler needs to worry about macros and the language syntax
- each of those specifications has numerous versions which, in some cases, can be significantly different from other versions of the same language or protocol. A language compiler generally only focuses on one version of that language
- A browser needs to support broken websites. A compiler only needs to fail gracefully
- A browser's output is graphical, which is much harder to unit test
In short, you’re dealing with a harder problem across a broader number of specifications. I would liken writing a browser more closely to writing a new graphical OS than writing a compiler.
(“Browser” here means “browser + engine et al” and not just a reskin of Chromium).
A browser rendering engine's output isn't purely graphical, and most things can be tested through other means such as by reading console.log output, looking at the DOM, or looking at computed CSS styles and/or bounding-box information.
In fact there's a good reason to keep graphical tests to a minimum: web specs do not dictate things down to the pixel level, so pixels can shift around from version to version, requiring the occasional golden data rebase.
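Here's a hedged sketch of what a non-pixel layout assertion can look like (the `#subject` fixture and `assertEqual` helper are hypothetical, not any particular engine's harness):

```ts
// Minimal sketch of a non-pixel layout assertion (browser test context).
// "#subject" and the expected values are hypothetical fixtures.
function assertEqual<T>(actual: T, expected: T): void {
  if (actual !== expected) throw new Error(`expected ${expected}, got ${actual}`);
}

const el = document.querySelector<HTMLElement>("#subject")!;
const style = getComputedStyle(el);
const box = el.getBoundingClientRect();

assertEqual(style.display, "flex");            // computed style, not pixels
assertEqual(style.color, "rgb(0, 128, 0)");
assertEqual(Math.round(box.width), 200);       // spec-derived geometry expectation
```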
Fun aside: Chrome's test suite contains a font named ahem.ttf where (almost) every character is an identical black rectangle. This allows tests to include text without relying too much on the details of a particular font.
The main reason it's difficult is that the output criteria seem properly defined but actually aren't at all.
Yeah it's "just" building some parsers and figuring out live updates, but you have to keep in mind that this is ~the internet~. People have been uploading broken, against spec, webpages since forever. Coding a web browser as a serious project (so not as a flight of fancy) borders on the impossible mostly because of that.
The main sites people test against/use aren't the "simple" CSS/JS/HTML sites from the past. Few people will care for a browser whose main job is to be able to render a neocities website. People want their popular sites working - Discord, Facebook, reddit, twitter. All of those are big JS apps.
The real bugbear here is JS though; HTML and CSS are complex but workable. JS is an ever-moving target as spec implementers (mostly Chrome) dump more and more of the jobs a browser was meant to do as the user agent into JS[0]. (And that's without delving into how Widevine became part of the spec, which means it's legally impossible to make a fully spec-compliant browser.)
Polyfills can offer a lot of fallback/lenience, but polyfills are a moving target too (see the sketch below): older browsers get deprecated and polyfills get removed for performance/optimization reasons, so your baseline spec for functional JS keeps growing unless you somehow convince the people making popular JS libraries that your browser project is important enough to keep the necessary polyfills around for.
[0]: Presumably so that Google can take away the User part from the browsers job as the User Agent, but typically covered up as a poorly defined "privacy problem".
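To make the polyfill point concrete, here's a minimal sketch of the feature-detect-then-patch pattern (`Array.prototype.at` is purely an arbitrary example, not a claim about what any particular site ships):

```ts
// Feature detection: only patch the API if the engine lacks it.
// (Cast to `any` so this sketch compiles regardless of the configured TS lib.)
const arrayProto = Array.prototype as any;
if (typeof arrayProto.at !== "function") {
  arrayProto.at = function (this: unknown[], index: number) {
    const i = Math.trunc(index) || 0;
    const k = i < 0 ? this.length + i : i;
    return k >= 0 && k < this.length ? this[k] : undefined;
  };
}

// After this runs, arr.at(-1) returns the last element whether the method is
// native or polyfilled; once libraries drop the shim because their supported
// baseline moved on, an engine that never implemented it breaks again.
```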
This comment gets some pretty important fundamentals wrong.
> The real bugbear here is JS though, HTML and CSS are complex but workable. JS is an ever-moving target
What you characterize as "JS" is, in reality, more HTML and CSS than JS. JS is a language. The fact that all the behavioral details of the HTML and CSS objects and related host objects have bindings available to JS programs does not make those things "JS"...
Doing a new JS engine from scratch is an order of magnitude easier than doing a browser engine. It is directly analogous to the eminently tractable "building a compiler" problem that the other commenter mentioned.
Fair. I meant JS here as in "fully DOM compatible, as-used-in-your-browser JS". JS engines themselves aren't that hard to make (I think there's about 9 or 10 actively maintained ones?), but to make one that's usable in situations that aren't things like node or as a sub-language in a different project... that's far more difficult.
> "fully DOM compatible, as-used-in-your-browser JS"
That's still wrong.
s/DOM//
s/JS/DOM/
Continuing to say JS when you're really talking about what is, again, still in the land of HTML, CSS, etc just confuses things. Viz:
> to make [a JS engine] that's usable in situations that aren't things like node or as a sub-language in a different project... that's far more difficult
It's really, really not about JS. You don't make a browser that's compatible with the Wild Wild Web by adding stuff to the JS engine. You do it by implementing moar browser.
I think the distinction comes from the fact that an unfinished compiler is just unfinished; as a developer you know that, and either you contribute or you suck it up.
A browser that's unfinished really can't be used by users at all. Either it lacks security, so nobody should use it, or it lacks vital features (of the spec, not end-user features), so nobody can really use it, because every time a website relies on that API something doesn't work.
I don't think anyone has argued that building a browser from scratch is impossible, it's clearly not, just that building a competitive engine from scratch is impossible. SerenityOS is a sort of very cool art project, it's not attempting to justify itself in any specific way. If they make an engine that's 1% as good as Blink, works OK for the sites the authors personally care about and eventually they lose interest, OK, so what, no big deal.
It depends on how much of a browser you want to implement, I guess. Comparing it with a compiler is a skewed comparison, I think; compilers are built as part of many people's college/university education, but are only a small part of turning a programming language into working software. Likewise, I'm sure most developers on here could feasibly write a web browser that can fetch websites and render the HTML.
But that's just one aspect, next you need to add support for CSS [1] and Javascript [2], each of which has had lifetimes of work invested in the standards and implementations.
So yeah, while it's doable to build a new browser, if you want to build a big one that has feature parity or is on-par with the existing browser landscape, you need a large team and many years of work. And that's just the practical aspect, the other one is, would a new browser actually be better? Could it compete with the existing market? So many players have just given up over time.
Big problem is the constant feature churn in the web space. Getting from zero to browser is probably doable. Staying at the mark with the ever shifting CSS standards and the constant deluge of web extensions is hard and expensive.
I don't think that's true at all. We get only a handful new CSS features every year, and features introduced today are much more carefully defined than the ad-hoc features of yesteryear. Implementing them is pretty straightforward. Certainly not more difficult than any of the million other things you have to do when building an OS from scratch.
The difficulty of building a state-of-the-art browser is almost entirely about performance. Everything else is straightforward by comparison.
> features introduced today are much more carefully defined than the ad-hoc features of yesteryear.
Many of them are just as ad-hoc, even if they are better defined, and meant to cover some holes in previous ad-hoc specifications. For example, the entire `subgrid` spec is patching one specific hole which actually has a proper general definition: "These <children> however are independent of the parent and of each other, meaning that they do not take their track sizing from the parent. " [1]
So instead of solving that general problem, we have a hyper-specific patch for a single feature. Which will definitely clash with something else in the future.
I mean, the entire web components saga is browser developers patching one hole after another that exist only because the original implementation was just so appalling.
> The difficulty of building a state-of-the-art browser is almost entirely about performance.
But that performance is directly affected by the number of specs and features.
What is harder, starting with nothing and building the equivalent of 2020 Chromium from scratch, or taking Chromium from 2020 and extending it with the new features to get it up to date with today's Chromium?
The former is 100x harder than the latter, and the prior statement that new CSS/JS features are a burden to keep up with is patently absurd. Because it's a tiny amount of work relative to the total work required to make a browser. (But still hard in the absolute sense, because browser engines are among the most complicated software projects.)
Posing the problem as making just one change hides the problem. The problem is long-term. A constant deluge of externally driven changes inevitably creates a quagmire of technical debt, since you have no choice but to implement them regardless of whether your chosen architecture is suitable for the change.
> the prior statement that new CSS/JS features are a burden to keep up with is patently absurd.
Chrome ships up to 400 new APIs a year (that is JS, CSS etc.)
Safari and Firefox ship 150 to 200 new APIs a year. [1]
Even Microsoft gave up on trying to keep up with browser development and switched to Chromium.
> Because it's a tiny amount of work relative to the total work required to make a browser.
That is, like, the primary work required. And many of those things often don't even have a solution until someone finally figures them out in a performant manner (like CSS's :has)
> I'm finding it weird that unlike other non-trivial projects like OSes or compilers, people often discourage building web browser engine because it is "hard" or something like that like... how is it different from building a compiler?
A conformant C++ compiler is in the same ballpark as a browser, but a naive C compiler is orders of magnitude simpler.
Recreating the Windows OS is in the same ballpark as a browser, but a simple OS that boots and runs command-line apps is orders of magnitude simpler.
People don't casually start new C++ compilers or projects such as WINE, but they do start toy compilers and OSes all the time.
Compilers aren't routinely built anew either, except for new languages. Browsers are rarely written around a new language. (If you did invent a new language to substitute for HTML+CSS+JS, you'd likely want to implement it on the existing stack first, kind of like an equivalent of "compiles to C".)
I've been working on CSS Layout as a library recently[0] (we have Flexbox and CSS Grid support so far). It seems to me that librifying everything (much like is already done with JS engines) could be a good approach to making building a new browser engine more accessible.
Because that way anyone wanting to build a new one can start by pulling in a bunch of libraries (layout, rendering, etc.), and then just customise the bits they need (ideally publishing them as a new interoperable library that others can also use).
> It seems to me that librifying everything (much like is already done with JS engines) could be a good approach to making building a new browser engine more accessible. ... Because that way anyone wanting to build a new one can start by pulling in a bunch of libraries (layout, rendering, etc.), and then just customise the bits they need (ideally publishing them as a new interoperable library that others can also use).
I think that you are right; this is what will be needed. However, that alone won't do, because the libraries also need to be written well: not too slow or inefficient, and not too limited in how much they can be customized.
And things need to be properly separated. (Looking at your examples, it seems properly separated to me. You can define styles independently of parsing them, which improves efficiency as well as allowing other steps to be added in between, such as "meta-CSS" if desirable.)
In some cases, it may be desirable to modify parts of the libraries, although then it may be necessary to maintain a fork of that library, which is not always desirable.
(For example, I may want to add proper support for non-Unicode text, and the ability to customize text layout functions, including all possible text directions (vertical, horizontal, boustrophedon, etc.). Adding other CSS rules might also be needed for some other purposes. And then, we will also need to do accessibility features.)
(I like to program in C; looking at the issues, it looks like they might be added, so that can be good; unfortunately, Rust has a Unicode string type, and this can be problematic even when using C, unless the Rust programming is done very carefully to avoid this problem.)
We are planning to add a C API to Taffy, but tbh I feel like C is not very good for this kind of modularised approach. You really want to be able to expose complex APIs with enforced type safety and this isn't possible with C.
Question about implementation: since you are not building a browser (so I expect HTML is out of your scope), are you able to use an existing test suite to compare your implementation?
We have our own test suite (originally derived from the test suite of Meta's Yoga layout library [0]) which consists of test fixtures that are small HTML snippets [1] and a test harness [2] that turns those into runnable tests, utilising headless Chrome both to parse the HTML and to generate the assertions based on the layout that Chrome renders (so we are effectively comparing our implementation against Chrome). We currently have 686 generated tests (covering both Flexbox and CSS Grid).
We would like to run the Web Platform Tests suite [3] against Taffy; however, these are not in a standard format and many of the tests require JavaScript, so we are not currently able to do that.
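Roughly, the approach looks like this (a simplified sketch using Puppeteer, not our actual harness; the fixture wrapper id is made up):

```ts
import puppeteer from "puppeteer";

// Load a small HTML fixture in headless Chrome and capture the boxes it
// computes; these become the expected values for generated layout tests.
async function captureLayout(fixtureHtml: string) {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setContent(fixtureHtml);

  // "#test-root" is a hypothetical wrapper element present in each fixture.
  const boxes = await page.evaluate(() =>
    Array.from(document.querySelectorAll("#test-root, #test-root *")).map((el) => {
      const r = el.getBoundingClientRect();
      return { tag: el.tagName, x: r.x, y: r.y, width: r.width, height: r.height };
    })
  );

  await browser.close();
  return boxes; // serialize these into the generated test's assertions
}
```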
I recommend listening to the Corecursive podcast interview with Andreas Kling (the leader of that project and the author of the post) to learn about his background working on web browser code:
https://corecursive.com/serenity-os-with-andreas-kling/
Others have already mentioned that performance is one of the hard parts.
Another aspect is that for any complex, large-scale project like a browser, much, if not most, of the effort is actually in the long tail: making 90% or even 99% of websites work is probably about as hard as making the remaining 1% work.
So while the team can probably cruise through the current-gen specs and popular sites like Discord/Twitter, what's left is going to be a nightmare to manage at the end.
But again, nothing is impossible, and I really look forward to having a new browser engine in the wild.
Is this something you know from experience or are you armchair guessing?
If I recall correctly, the work Andreas did at Apple was mostly focused on performance, and Safari has long had a reputation for excellent performance. Maybe you’ve also done that type of work, but otherwise I’ll trust his judgement.
My experience with big projects would confirm this. Getting the basic things implemented is quite fast, but the devil is in the details and they can drag on for years, if you didn't account for all of them in the beginning.
And there are a hell of a lot of details in the platform called the web.
But I would think that in this case they have no intention of going to 100% at all costs, supporting every broken piece of web garbage out there.
The goal is to implement the W3C specs. (They are even working on fixing the specs.)
Oh, and Kling specifically worked on browsers before, so that is a good base.
"I've had the opportunity to work on production browsers for many years (at Apple and Nokia)"
I made a comment about why a browser is hard in general. I didn't in any way suggest or imply that this team would struggle with performance, so I'm not sure why their (amazing) background would be relevant.
Please re-read your own comment. You describe two hard things, performance and long-tail compatibility, and literally state that “the team” will have a nightmare left after an initial cruise.
Having a better team just means they can manage "the hard part/nightmare" better; it doesn't change where the hard part is.
And I mention these two things because the article itself said they're going to "[f]ocus on vertical slices" first while "[d]eferring on performance work". So I was pointing out that it basically means they started with the easy part (nothing wrong with that).
I still fail to see which point of mine you disagree with, other than by argumentum ad verecundiam.
Web browsers are a moving target, just like operating systems. Anyone can 'build their own', but I'd also say that it is close to impossible to build a secure and viable competing web browser that correctly implements the specification better than Chrome.
The two most important keywords in Drew's blog post are *serious* and *security*. There is not one mention of either of those words in this blog post; hence Drew's points still stand unchallenged.
You can try, but so did Servo, which was a 'serious attempt', and not even the Rust hype could convince the masses that it was better than Chrome.
It moves slower than people assume; I installed Opera 12 last year for the craic – the last version built on their Presto engine, released almost ten years ago – and it works surprisingly well with many sites. I did have to use mitmproxy to rewrite some trivial stuff like some CSS prefixes and s/(let|const)/var/ in JS. Flexboxes are supported, but grid isn't so that failed for some sites.
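The rewrite itself is trivial - not my actual mitmproxy script (that one is Python), but roughly this kind of transform, sketched here in TypeScript:

```ts
// Crude downgrade of modern JS so a Presto-era engine can at least parse it.
// A real proxy script would apply this to response bodies; a regex like this
// will happily mangle string literals too, but it was good enough for the
// experiment described above.
function downgradeJs(source: string): string {
  return source.replace(/\b(let|const)\b/g, "var");
}

console.log(downgradeJs("const a = 1; let b = 2;")); // "var a = 1; var b = 2;"
```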
> Drew's points still stands unchallenged
His points are based on a faulty assumption to start with: he counts all sorts of documents, but that count is spectacularly wrong as it counts many things it shouldn't. I mentioned this at the time: https://news.ycombinator.com/item?id=22617721
Is it a large project? Sure, as many software projects are. But "impossible" and "comparable to the Manhattan project"? Certainly not; it's just that there's not a whole lot of money to be made with a new browser engine or other broadly shared motivation.
And Drew himself answered you at the time on the points made in your comment: things you said shouldn't be included that actually should be, and others you said were incorrectly included that were in fact excluded in the first place.
Those replies are handwavy and offer no convincing defence at all. In just a few minutes I was able to reduce the 1,217 URLs to 434 by simply excluding outdated or non-applicable stuff. That's about a third and includes some pretty large documents, and that's just with a quick check. The list is unambiguously categorically wrong and anyone who seriously looks at it and comes to a different conclusion is suffering from serious confirmation bias.
Whether the web is "too complex" is a different matter and open to interpretation, as "too complex" is subjective. But the data is very wrong and therefore the article is wrong. A "correct" conclusion with faulty arguments is just as worthless as an incorrect conclusion: any possible solution depends on a correct understanding of the situation. "Global warming happens because of pornography, therefore we must ban pornography" is just as useless as "global warming is a fake fraud" even though the conclusion of the first is correct.
Why not post this list for the rest of us to see (and check), then? But beforehand: that "outdated" stuff (such as your HTML 3.1 example) may still be applicable, and excluding it means your approach is incorrect right off the bat.
The HTML 3.1 specification is essentially irrelevant for implementing a modern browser; and you certainly don't need HTML 5.0, and HTML 5.1, and HTML 5.2, and HTML 5.3, and HTML 4.0, and HTML 4.01, and HTML 3.2, and XHTML 1.0, and XHTML 1.1, and XHTML 2. These are large documents; possibly the largest in the set.
It takes a minute to spot-check; some specific examples were provided in the previous thread. I don't have the list any more and can't be bothered to recreate it; what value is there if you can just check Drew's list – which is really not that hard? I also have no idea how correct it is, exactly; I suspect the actual number would be even lower still.
Isn't that the real problem here? Nobody cares if your browser fails to render that page, because you strictly adhere to the standard while Chrome just happily deals with broken documents (or worse: Chrome requiring documents to be slightly broken).
Chrome is the sole benchmark. If it works in Chrome, it's fine, if it doesn't the site is broken. Standards never enter the discussion.
I think that there will be a new successful browser one day - and it will be disruptive. But it needs two properties:
1. A unique use case or feature that cannot be easily implemented in the existing browsers. Something that breaks the current architecture and turns the current use cases into afterthoughts. ("oh, yeah, right we actually need to render html somehow at some point, can the intern do it?")
2. A significant breakthrough in software engineering productivity, a major step in terms of abstraction and safety. Something like the combination of a LLM and formal methods.
This browser does not check these two boxes (using C++, albeit hopefully a more modern dialect, and targeting plain old browsing). So it is certainly great for the spec - and should be paid for by the W3C, IMO - and great for the people developing this as an exercise, but it will never dethrone Chrome.
I think a potentially interesting use case for Ladybird is as a "contenteditable" polyfill. With their dependency-free stack I'm guessing it's not out of the realm of possibility to compile it to WASM and HTML canvas.
Having to only target one rendering engine when developing a rich text editor would be much better than the current nightmare it is.
(There would be an accessibility problem to solve, we need some new APIs for screen readers and canvas)
It's a lot of fun to watch them build Ladybird, and a testament to what a small passionate team can do.
Happy to admit it's a crazy idea, and it's not something that I would want to see as a usual way to build sites. But for small areas of web apps where compatibility is difficult it does make sense.
Google Docs used to be contenteditable based, but moved to a custom rendering engine. They are a large enough company to be able to invest in that. Small businesses aren't, and have to rely on content editable.
Ladybird as a contenteditable polyfill would help smaller teams, or single developers, achieve the same, while also building on the existing tooling and APIs for contenteditable.
Isn't this actually proof that creating desktop-grade applications, like a graphical word processor from the late 90s with some collab backend, is still infeasible with web tech?
Doesn't it also prove that it's still easier and faster to create a proper app and GUI toolkit yourself and just render pixels to the screen (as all desktop GUI toolkits do these days) instead of fighting the browser tech's idiosyncrasies, even though a proper app and GUI stack isn't trivial in itself?
> Doesn't it also prove that it's still easier and faster to create a proper app and GUI toolkit yourself and just render pixels to the screen (as all desktop GUI toolkits do these days) instead of fighting the browser tech's idiosyncrasies, even though a proper app and GUI stack isn't trivial in itself?
jmo, but i think so as well....
tho i wonder what the performance/battery-life implications of everything doing that might be... perhaps you could have a 'libhtml' for static sites and documents, and different ones for more interactive apps etc
Maybe not exactly the same. But whatever could have been improved on that base in the last 15 years.
The point is: we had much saner tech. Now it's just complete craziness, and still you can't even build a word processor like the one that ran on Windows 95. This says everything about the state of web tech for application development. (And no, this tech is rotten from the roots, so you can't improve on it. It'll only get more crazy and shitty if you try further.)
As an industry we’ve taken a long and windy path to delivering full applications (with a local cache) on every load to a sandboxed environment that is mostly compatible across environments.
While it’s easy to take shots at the current state, I find it hard to imagine another path to delivering a cross platform application over the network. What we have now is actually pretty rad.
We built it stone by stone, incrementally, and now I don’t have to package my application for N platforms unless I want to meet users in their app stores.
If we spent all the effort we used up on making JavaScript work on Java instead, I’d be typing this from my Moon habitat.
But we instead chose a pig, buried it under layers of lipstick, and strapped a jet engine onto it. Sure, it’s airworthy, but was it the best way to allocate resources?
I wouldn't say we chose a pig and buried it under layers of lipstick. I think it's more accurate to say people realized:
1) You could deliver software via a browser with some clever hacking
2) That delivering software via a browser had some extremely attractive properties compared to other approaches
Those clever hacks turned into real applications and those became real products. As people built applications on top of browsers, the browsers evolved to be a better environment for building applications. It was very organic.
The actual language we ended up using is just a byproduct IMO. I don't think people chose JavaScript, they chose the browser and the browser had JavaScript. And since the browser had JavaScript, we kept pushing the limits of JavaScript because we needed to deliver applications to the browser.
Honestly hard to see this happening efficiently in any other way. All things considered, the web platform is pretty fantastic compared to the software distribution story in every other ecosystem. I'd argue that we allocated resources pretty well on this one!
I didn't miss them - I don't have fond memories of them as either a user or a developer. But you're right, they were there. Which is interesting: Java Applets were available from close to day one right along side JavaScript, and yet...
There are two use cases for the web. There is the document web and the application web. Both are equally valid. It is absolutely an application platform, as evidenced by the fact that people use it to deliver applications. An ever-increasing percentage of desktop software is moving to the browser, to the point where many users only need a web browser. I'd argue the only reason mobile hasn't followed suit is the non-market forces behind the app store model.
It is one of the best application platforms for both users and developers. I can write my software exactly one time and it will run on every platform, instead of separate applications for Android, iOS, Mac, Windows, FreeBSD, Linux, etc. etc.
I can assume my users have a web browser - because they do. For interpreted languages (and VMs) I either have to walk my users through setting up the interpreter or bundle it into the distributable.
Compared to everything else I've worked with, the web as an application delivery platform is great. I write my code, send someone a link, and they are running my app.
> Java Applets were available from close to day one right along side JavaScript, and yet...
And yet, what?
But the actual question is: Why? ;-)
> There is the document web and the application web. Both are equally valid.
No, they aren't. The tech was built to support only a lightweight version of the first one. Everything on top is just a great hack, and pure insanity from the technical viewpoint!
> It is absolutely an application platform, as evidence by the fact people use it to deliver applications.
People do a lot of very stupid things. That's not evidence that doing stupid things is a good idea…
> An ever increasing percentage of desktop software is moving to the browser
Nobody is doing that because web-tech is a great application platform. It's actually exactly the other way around: Most people complain about the extremely crappy tech, but still do it for other reasons.
> I'd argue the only reason mobile hasn't followed suite is the non-market forces behind the app store model.
This makes no sense.
It would be much cheaper for developers not to pay the road toll to the app-store owners (-30%!), and they would at the same time remain in control of their own products, if they "delivered" web apps. But most mobile developers don't do that, for technical reasons: web apps are just crap, and especially on mobile it glaringly shows.
> I can write my software exactly one time and it will run on every platform
You mean, like JVM applications already did 25 years ago?
> Compared to everything else I've worked with, the web as an application delivery platform is great. I write my code, send someone a link, and they are running my app.
What's again the difference here to Java WebStart?
BTW: Installing a JRE is exactly the same kind of one-off effort as installing a web browser…
---
We lost between 20 and 30 years once again just for political reasons!
Only to arrive at the worst rip-off of some concepts which were already almost "working fine".
I admit that's a recurring pattern. It's always the most terrible tech that comes out on top in the end, for completely insane "reasons". The market just always favors the cheapest shit that can be rolled out with the least effort. It was like that, for example, with things like C or UNIX. Now "worse is better" has become a kind of proverb in some circles…
To be of the opinion that the technically best solutions win in the market is IMHO a sign of not much experience in this world. So far it has been the exact opposite in almost all relevant cases, because most of the time the cheapest shit wins on the market.
This is exactly the type of thing that makes me so excited about LLMs. If they make us widely more productive, we can build huge things like this and take on the monopolies in the space. It might lead to a wave of "thought impossible to build" products.
I've been wondering if at some point we could eventually train the models in first-order terms. I.e. input some HTML/JS/CSS + user state (i.e. scroll position, x/y dimensions) and then it outputs a final rasterized frame representing the current state of the viewport. The training data would be fairly obvious and easy to collect.
Failing that, an optimized binary blob that could achieve the same using training data over modern browser specifications. If you go to ChatGPT and start talking about ISO32000-compliant implementations and poke at the edges, you can get it to start writing a PDF engine pretty quickly.
But their product survived and they ended up being wildly successful.
In a lot of projects, early optimization hurt the development speed and maintainability, sometimes killing the product.
It is easier to see performance bottlenecks once a product is widely used than to add optimizations everywhere we suspect a problem might appear later.
Yes. Never been easier to build a browser than today with great open source engines. Writing an engine from scratch is the hard part. Extremely labour-intensive to build and then maintain. People usually conflate both.
I think there is an important aspect of it which Andreas glossed over - do it incrementally. While Ladybird is not yet on par with more mature engines, it already works just fine for a lot of webpages.
I mean, I don't see highly advanced/complicated features like WebGL used in my day-to-day browsing; it's tech demos and the like linked from HN at best. A new browser can do without that for a long time.
I wish there were an auto-migration framework: if one software product, like my browser, is corrupted (aka sells out), it packs my settings into a neutral interface file and automatically migrates to a still-untouched browser, or offers me a list to choose from. Like nomadic herds of animals, hunted by predators, ever elusive, never caught...
Building a web browser is a difficult but attainable task. On the other hand, getting the UX right is a gargantuan task that even big tech struggles with.
Actually, UX is much easier to iterate on. There are countless Chromium-based browsers that are mainly competing on UX. They all render websites just fine; UX is the only thing they compete on. Mostly, there is a lot of imitation and not a lot of innovation in that space. It seems people like tabs at this point, and things like bookmarks and back buttons. There are only so many ways to arrange those features on a screen, and we've seen most of those over the past 20 years.
The real difficulty with browsers is building a better one than the existing ones. If you make a new browser, it does exactly the same thing as the other ones. It's a great technical accomplishment, but it has very low value, which is why nobody bothers at this point.
At this point there are only three browser engines with any audience still worth talking about: Chromium, Safari/WebKit, and Firefox/Gecko. Obviously the first two are related, but they forked so long ago that they are quite different at this point. In terms of what they do there are some minor differences, but they basically render the same websites in more or less the same ways. There is very little point in picking one over the other.
I actually use Firefox and I'm pretty happy with it. I don't think it does a lot better/different than the other two at this point but I like not selling out completely to Apple/Google. Google treats me like a product rather than a user and Apple seems more interested in telling me what I can't do rather than enabling me to do things I want to do. But objectively, both do a fine job of rendering websites and allowing me to browse the web. Just like Firefox does. And given that there is no practical difference, I choose to use Firefox.
The problem isn't the web browser. The problem is that your Javascript engine has to be staggeringly stellar or your web browser will feel like it's encased in molasses.
I wish them luck, though. We could use some real competition in the web browser space again.
honestly the web is so broken, i think it is beyond repair, i just want to run the website through some LLM to get the content out and show it in lynx and be done
i dont want to consent to be tracked, i dont want to login, i dont want a personalized feed, i dont want to subscribe to your newsletter, i dont want your ads, ethical or not
js is still an issue, but maybe a day will come when i can say to the model 'pretend you are a js interpreter; what is the text output of this minified react garbage? show it as markdown' and just pipe it to lynx or w3m or, worst case, eww
If your team's just 5 people, having a strong leader can make the difference. But with a whole business, it's no longer about one strong leader. The more people you have, the more you have to invest in levelling up individual performance. Almost universally that doesn't scale, so instead you work on process, and use that process, not individual leadership, to ensure better results.
This is great. Refreshing to read something which talks about complexity as a real and important issue (not as a positive or neutral aspect of a system). I've been hoping for a browser like this since the day I tried to download the Chromium repo and found out how large it was. Also, I noticed that it had a large number of external dependencies which made it very difficult to actually dig into the code.
Modern HTML browsers have become Swiss Army Spaghetti. Perhaps we need to split the standard into smaller components so one size doesn't have to fit all. I suggest at least 3 sub-standards:
A) Document-oriented standard. Perhaps HTML standards are "good enough" for this?
B) Media/Art/Gaming.
C) Business & Data CRUD/GUI
(And don't link that XKCD cartoon about 15 standards. There are zero for these categories.)
Since there are already plenty of HTML browsers, instead explore an unserved need, such as a stateful GUI markup browser & standard. HTML/DOM is missing many expected GUI idioms, and has an inherent text positioning flaw:
GUI's, desktops, and mice are still needed for biz and productivity. HTML browsers have been a goofy mess for this, requiring bloated buggy JS libraries with long learning curves. Let's Make Gui's Great Again! (No, I'm not a Don fan, BTW, but his trollisms are catchy.)
5000 feet up the mountain: "I'm climbing Everest solo without oxygen even though it's supposed to be impossible. How come I'm making such good progress?!"
I would love to see this adopted and ported as an alternative, lightweight browser for other "esoteric" OSes like Haiku, legacy OS X, and even OpenBSD (which has performance issues with Firefox).
I absolutely love that Andreas Kling doesn't care about it being hard/impossible, and just dives straight in with pure optimism. It's absolutely wonderful to see the joy and positivity.
Andreas, if you're reading this… have you folks thought about building a new ACID-type test that covers the gaps in the existing ones? Seems like it would be incredibly useful.
I know that this is more of a "can we do it?" experiment, so I feel kind of bad for criticising. It's a great feat to get this far.
But I was disappointed that it just crashes on any github page... and SerenityOS github is literally the first link on the Ladybird default homepage :)
edit: oh, it doesn't crash on the github page; it crashes on the github issues page.
He's clearly earned that title. He demonstrably has the experience - and inspiring and coordinating volunteer contributions to a project of this scale is an extraordinarily difficult leadership challenge.
Why learn to play an instrument when you can just buy a CD?
The goal is not to have a browser, but to build one. This browser is affiliated with the SerenityOS project, which is reimplementing an entire desktop OS and all applications from scratch.
> Also, since Ladybird is an offshoot from the SerenityOS project, it shares the same culture of accountability and self-reliance. We avoid 3rd party dependencies and build everything ourselves. In part because it’s fun, but also because it creates total accountability for what goes into our software.