
> So, how exactly is my app/whatever supposed to spin up a parallel process in the OS and then talk to it over IPC?

Usually the very easiest way to do this is to launch the target as a subprocess and communicate over stdin/stdout. (Obviously, you can also negotiate things like shared memory buffers once you have a communication channel, but stdin/stdout is enough for a lot of stuff.)
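
As a rough sketch of what the parent side can look like in Go (the worker binary name and the line-oriented JSON message here are made up for illustration, not anything the TypeScript project has announced):

    package main

    import (
        "bufio"
        "fmt"
        "os/exec"
    )

    func main() {
        // Launch the compiler as a child process and keep its pipes.
        cmd := exec.Command("./tsc-worker") // hypothetical binary name
        stdin, _ := cmd.StdinPipe()
        stdout, _ := cmd.StdoutPipe()
        if err := cmd.Start(); err != nil {
            panic(err)
        }

        // Send one request per line and read one response per line.
        fmt.Fprintln(stdin, `{"command":"compile","file":"main.ts"}`)
        reply := bufio.NewScanner(stdout)
        if reply.Scan() {
            fmt.Println("worker replied:", reply.Text())
        }

        stdin.Close() // closing stdin tells the worker we're done
        _ = cmd.Wait()
    }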

> How do you shut it down when the 'host' process dies?

From the perspective of the parent process, you can go through some extra work to guarantee this if you want; every operating system has facilities for it. For example, on Linux, you can make use of PR_SET_PDEATHSIG. Using that facility properly is a bit tricky (the death signal is tied to the thread that spawned the child, not to the parent process as a whole), but it does work.
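
If the parent happened to be a Go program, a minimal Linux-only sketch might look like this (the worker path is a placeholder):

    package main

    import (
        "os/exec"
        "syscall"
    )

    func main() {
        cmd := exec.Command("./tsc-worker") // hypothetical worker binary
        cmd.SysProcAttr = &syscall.SysProcAttr{
            // Ask the kernel to SIGTERM the child when the parent dies.
            // Caveat: the signal is tied to the OS thread that forked the
            // child, so some programs also pin the spawning goroutine with
            // runtime.LockOSThread() to be safe.
            Pdeathsig: syscall.SIGTERM,
        }
        if err := cmd.Start(); err != nil {
            panic(err)
        }
        _ = cmd.Wait()
    }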

However, since the child process in this case knows that it is a child process, the best way to go about it is to handle shutdown cooperatively. If you're communicating over stdin/stdout, the child's stdin will reach end-of-file when the parent process dies, and the child can then exit. This is portable across Windows and UNIX-likes.
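
A sketch of the child's side of that contract, assuming a simple line-oriented protocol (handleRequest is a made-up stand-in for real work):

    package main

    import (
        "bufio"
        "fmt"
        "os"
    )

    func main() {
        scanner := bufio.NewScanner(os.Stdin)
        for scanner.Scan() {
            // Process one request per line and answer on stdout.
            fmt.Println(handleRequest(scanner.Text()))
        }
        // stdin hit EOF: the parent closed the pipe or died, so shut down.
        os.Exit(0)
    }

    // handleRequest is a stand-in for whatever the real worker would do.
    func handleRequest(req string) string {
        return "ack: " + req
    }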

> How do you do it in environments where that capability (spawning arbitrary processes) is limited? eg. mobile.

On Android, there is nothing special to do here as far as I know. You should be able to bundle and spawn a native process just fine. Go binaries are no exception.

On iOS, it is true that apps are not allowed to spawn child processes, as far as I am aware, so you'd need a different strategy. If you still want a native-code approach, though, it's more than doable. Since you're on iOS, you'll have some native code somewhere, and you can compile Go code into a Clang-compatible static library archive using -buildmode=c-archive. There's a bit more nuance to getting something that will link properly on iOS, but it is supported by Go itself (Go supports iOS and Android in the toolchain and via gomobile). Once the code can be linked into the process space, the old message-passing approach would continue to work, with the semantic caveat that it's not technically interprocess anymore. This approach can also be used in any other situation where you're writing native code, as long as you can link C libraries.
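
As a rough illustration, a c-archive entry point could look something like the sketch below; the function name, its behavior, and the build invocation in the trailing comment are all illustrative, not anything from the actual project:

    package main

    import "C"

    //export CompileFile
    func CompileFile(path *C.char) *C.char {
        // A real build would call into the compiler here; this just echoes.
        // The host code is responsible for freeing the returned C string.
        return C.CString("compiled: " + C.GoString(path))
    }

    // main is required by -buildmode=c-archive but is never called by the host.
    func main() {}

    // Built with something along the lines of:
    //   CGO_ENABLED=1 GOOS=ios GOARCH=arm64 go build -buildmode=c-archive -o libtsc.a
    // which also emits a libtsc.h header to include from the Swift/Obj-C side.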

If you're in an even more restrictive situation, like, I dunno, Cloudflare Pages Functions, you can use a WASM bundle. It comes with a performance hit, but given that the Go port of the TypeScript compiler is already roughly 3.5x faster than the TypeScript implementation, it will probably not be a huge issue compared to today's performance.
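
For what it's worth, a bare-bones sketch of exposing a Go function to a JavaScript host via Go's js/wasm port might look roughly like this (the compile function is made up):

    //go:build js && wasm

    package main

    import "syscall/js"

    func main() {
        // Expose a hypothetical compile() function to the JavaScript host.
        js.Global().Set("compile", js.FuncOf(func(this js.Value, args []js.Value) any {
            return "compiled: " + args[0].String()
        }))
        // Block forever so the exported function stays callable.
        select {}
    }

    // Built with roughly: GOOS=js GOARCH=wasm go build -o tsc.wasm
    // and loaded alongside Go's wasm_exec.js support file.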

> How do you package it so that you distribute it in parallel? Will it conflict with other applications that do the same thing?

There are no particular complexities with distributing Go binaries. You need to ship a binary for each OS and architecture combination you want to support, but Go has relatively straightforward cross-compiling, so this is usually very easy to do. (Rather unusually, it is even capable of cross-compiling to macOS and iOS from non-Apple platforms, though I bet Zig can do this too.) You just include the binaries in your build. If you are using some bindings, I would expect the bindings to take care of this by default, making your resulting binaries "just work" as needed.

It will not conflict with other applications that do the same thing.

> When you look at, for example, a jupyter kernel, it is already a host process launched and managed by jupyter-lab or whatever, which talks via network chatter.

> So now each kernel process has to manage another process, which it talks to via IPC?

Yes, that's right: you would have to have another process for each existing process that needs its own compiler instance, if going with the IPC approach. However, unless we're talking about an obscene number of processes, this is probably not going to be much of an issue. If anything, keeping it out-of-process might help improve matters if it's currently doing things synchronously that could be asynchronous.

Of course, even though this isn't really much of an issue, you could still avoid it by going with another approach if it really were a huge problem. For example, assuming the respective Jupyter kernel already needs Node.js in-process somehow, you could just as well have a version of tsc compiled into a Node-API module and do everything in-process.

> Certainly, there are no obvious performance reasons to avoid IPC, but I think there are use cases where having the compiler embedded makes more sense.

Outside of browsers and edge runtimes, it should be possible to make an embedded version of the compiler if necessary. I'm not sure whether the TypeScript team will maintain such a version on their own; it also remains to be seen exactly what approach they take for IPC.

I'm not a TypeScript compiler developer, but I hope these answers are helpful in some way anyway.




Thanks for chiming in with these details, but I would just like to say:

> It will not conflict with other applications that do the same thing.

It is possible not to conflict with existing parallel deployments, but depending on your IPC mechanism, it is by no means assured when you're not forking and are instead launching an external process.

For example, it could bind a specific default port. This would work in the 'naive' situation where the client doesn't specify a port and no parallel instances are running. ...but if two instances are running, they'll both try to use the same port. Arbitrary applications can connect to that port. Maybe you want to share a single compiler service instance between client apps in some cases?

Not conflicting is not a property of parallel binary deployment and communication via IPC by default.

IPC is, by definition, intended to be accessible by other processes.

Jupyter kernels, for example, are launched with a specified port and a secret passed as CLI arguments, if I recall correctly.

However, you'd have to rely on that mechanism being built into the TypeScript compiler service.

...i.e. it's a bit complicated, right?

Worth it for the speedup? I mean, sure. Obviously there is a reason people don't embed postgres. ...but they don't try to ship a copy of it alongside their apps either (usually).


> Not conflicting is not a property of parallel binary deployment

I fail to see how starting another process under an OS like Linux or Windows can be conflicting. Don't share resources, and you're conflict-free.

> IPC is, by definition, intended to be accessible by other processes

Yes, but you can limit the visibility of the IPC channel to a specific process, in the form of a stdin/stdout pipe between the two processes, which is not shared by any other process. This is enough of a channel to coordinate the creation of a more efficient channel, e.g. a shmem region for high-bandwidth communication, or a Unix ___domain socket (under Linux, an "abstract namespace" UDS lives completely outside of the filesystem tree), etc.
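
A rough sketch of that bootstrap in Go, assuming a hypothetical ./worker binary and Linux's abstract socket namespace:

    package main

    import (
        "fmt"
        "net"
        "os"
        "os/exec"
    )

    func main() {
        // Linux only: a leading "@" requests an abstract-namespace socket,
        // which never appears in the filesystem.
        addr := fmt.Sprintf("@myapp-worker-%d", os.Getpid()) // hypothetical, per-process name
        ln, err := net.Listen("unix", addr)
        if err != nil {
            panic(err)
        }
        defer ln.Close()

        // Start the worker and hand it the socket name over its private
        // stdin pipe; no other process is told the name.
        cmd := exec.Command("./worker") // hypothetical binary
        stdin, _ := cmd.StdinPipe()
        if err := cmd.Start(); err != nil {
            panic(err)
        }
        fmt.Fprintln(stdin, addr)

        // The worker dials back over the negotiated high-bandwidth channel.
        conn, err := ln.Accept()
        if err != nil {
            panic(err)
        }
        defer conn.Close()
    }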

A Unix shell is a thing that spawns and communicates with running processes all day long, and I have yet to hear about any conflicts arising from its normal use.


This seems like an oddly specific take on this topic.

You can get a resource conflict in a shell by typing 'npm start' twice in two different shells, and it'll fail with 'port in use'.

My point is that you can do non-conflicting IPC, but by default IPC is open to conflicts precisely because it is intended to be accessible.

You cannot bind the same port, semaphore, whatever if someone else is using it. That's the definition of having addressable IPC.

I don't think arguing otherwise is defensible or reasonable.

Worrying that a network service might bind the same port as another copy of the same network service, deployed on the same target by another host application, is an entirely reasonable concern.

I think we're getting off into the woods here with an arbitrary 'die on this hill' point about semantics which I really don't care about.

TLDR: If you ship an IPC binary, you have to pay attention to these concerns. Pretending otherwise means you're not doing it properly.

It's not an idle concern; it's a real concern that real actual application developers have to worry about, in real world situations.

I've had to worry about it.

I don't think it's unfair to expect this to be more problematic than the current, very easy, embedded story, and it's a concern that simply does not exist when you embed a library instead of communicating over IPC.


> It is possible not to conflict with existing parallel deployments, but depending on your IPC mechanism, it is by no means assured when you're not forking and are instead launching an external process.

Sure, some IPC approaches can run into issues, such as using TCP connections over loopback. However, I'm describing an approach that should never conflict, since the shared resources are inherited directly and since the binary would be embedded in your application bundle rather than shared with other programs on the system. A similar example is language servers, which often work this way: no need to worry about conflicts between different instances of a language server, different language servers, instances of different versions of the same language server, etc.

There's also some precedent for this approach: as far as I understand it, it's also what the Go-based esbuild tool does[1], which is likewise popular in the Node.js ecosystem (it is used by Vite).

> For example, it could by default bind a specific default port. This would work in the 'naive' situation where the client doesn't specify a port and no parallel instances are running. ...but if two instances are running, they'll both try to use the same port. Arbitrary applications can connect to the same port. Maybe you want to share a single compiler service instance between client apps in some cases?

> Not conflicting is not a property of parallel binary deployment and communication via IPC by default.

> IPC is, by definition, intended to be accessible by other processes.

Yes, although the set of processes that the IPC mechanism is accessible to can be narrowed to just one, and there are cross-platform mechanisms to achieve this on popular desktop OSes. I cannot speak to why one would choose TCP over stdin/stdout, but I don't expect that tsc will pick a method of IPC that is flawed in this way, since it would not follow precedent anyway (e.g. tsserver already uses stdio[2]).

> Jupyter kernels, for example, are launched with a specified port and a secret passed as CLI arguments, if I recall correctly.

> However, you'd have to rely on that mechanism being built into the TypeScript compiler service.

> ...i.e. it's a bit complicated, right?

> Worth it for the speedup? I mean, sure. Obviously there is a reason people don't embed postgres. ...but they don't try to ship a copy of it alongside their apps either (usually).

Well, I honestly wouldn't go as far as to say it's complicated. There's a ton of precedent for how to solve this issue without any conflict. I cannot speak to why Jupyter kernels use TCP for IPC instead of stdio; I'm sure they have reasons why it makes more sense in their case. For example, in some use cases it could be faster, or perhaps just simpler, to have multiple channels of communication, and doing that with multiple pipes to a subprocess is a little more complicated and less portable than stdio. Same for shared memory: you can always have a protocol to negotiate shared memory across some serial IPC mechanism, but you'll almost always need a couple of different shared-memory backends, and it adds some complexity. So that's one potential reason.

(edit: Another potential reason to use TCP sockets is, of course, if your "IPC" is going across the network sometimes. Maybe this is of interest for Jupyter, I don't know!)

That said, in this case, I think it's a non-issue. esbuild and tsserver amply demonstrate that communication over stdio is sufficient for these kinds of use cases.

And of course, even if the Jupyter kernel itself has to speak the TCP IPC protocols used by Jupyter, it can still spawn a theoretical tsc as a subprocess and use stdio-based IPC. Not much complexity to speak of.

Also, unrelated, but it's funny you should say that about postgres, because there have actually been several different projects that deliver an "embeddable" subset of postgres. Of course, the reasons why you would not necessarily want to embed a database engine are quite different from this case, since here IPC is merely an implementation detail, whereas in the database case the network protocol and centralized server are essentially the entire point.

[1]: https://github.com/evanw/esbuild/blob/main/cmd/esbuild/stdio...

[2]: https://github.com/microsoft/TypeScript/wiki/Standalone-Serv...



