Of course you can express anything on top of a relational model. But for graphs ...

thesz · on March 10, 2016

I beg to disagree.

I am part of a team developing a Russian CAD system [0]. It uses what one could consider a hypergraph DB (a relation includes many objects), but that DBMS has queries on par with SQL, and they have proven very useful in the development of the CAD.

What you describe can be explained by development inertia. Most CADs are written in C/C++, and these languages are not well suited to changes that cut through the whole code base (such as a change of storage engine and data model).

I also ran experiments while developing a graph analytics engine at one point in my career. The relational model (actually, a linear algebra model) proved very competitive. It allows for easy distribution of data, the operations over distributed data are close to optimal, and so on.

[0] http://dd.ru/

sklogic · on March 10, 2016

I worked on a (very old) ship-building and factory-building CAD, written mostly in PL/I and Fortran. It was built around a graph database. Throughout its long and turbulent history, there were many attempts to move to relational storage, each one resulting in an orders-of-magnitude drop in performance.

Keep in mind that in such a CAD, designs are huge. Think aircraft-carrier scale of "huge". And it was designed when memory was very limited. Therefore, pretty much all the CAD operations depended on database access.

So, nobody really cared about the queries, they were insignificant. What people cared about was:

* Performance of following an arc

* Transactions

* Data consistency

* Compactness of representation (remember, disk space is also a limited thing when you're building aircraft carriers).

* Nice API (even in a very limited language)

thesz · on March 11, 2016

Has the state of the art changed since their last attempt to move to a relational DB? Has the hardware changed in that time?

You tell us about some old project, written in hard-to-maintain languages, which failed many times to adapt to new tech. This is exactly what one would expect.

I am talking about a relatively modern language (C#) using good DB tech (lagging about seven, maybe five years behind the state of the art). Maybe the story will be different in our case.

sklogic · on March 11, 2016

The last attempt was made in around 2008, AFAIR.

Fundamentally, nothing has changed in relational storage. The follow-a-graph-edge operation is as expensive as it used to be (it involves an index lookup, so it cannot be cheap).

If you know a relational arrangement suitable for a cheap O(1) edge traversal - please share. But I am very skeptical.

And I cannot see how the host language is relevant at all. C#, Haskell, whatever - none can make data access operations cost less than what the data model predicts.
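The cost difference being argued here can be sketched in a few lines. This is only an illustration under stated assumptions: the `Node` class, the sample edges, and both `neighbours_*` helpers are hypothetical, and a sorted Python list searched with `bisect` stands in for a relational index. The graph-style store keeps direct references, so following an arc is a pointer dereference; the relational-style store has to search the index first, which is O(log n) before the first neighbour is reached.

```python
import bisect


class Node:
    """Graph-style storage: each node holds direct references to its neighbours."""

    def __init__(self, name):
        self.name = name
        self.out = []  # outgoing arcs, stored as references to Node objects


def neighbours_graph(node):
    # Following an arc is a reference dereference; no search is involved.
    return [n.name for n in node.out]


def neighbours_relational(edge_index, src):
    # edge_index is a list of (src, dst) tuples sorted by (src, dst),
    # standing in for an index on the edge table (an assumption of this sketch).
    # bisect_left performs an O(log n) binary search for the first row of src.
    i = bisect.bisect_left(edge_index, (src,))
    result = []
    while i < len(edge_index) and edge_index[i][0] == src:
        result.append(edge_index[i][1])
        i += 1
    return result


# Hypothetical sample data: a -> b, a -> c, b -> c
a, b, c = Node("a"), Node("b"), Node("c")
a.out.extend([b, c])
b.out.append(c)

edge_index = sorted([("a", "b"), ("a", "c"), ("b", "c")])
```

Both helpers return the same neighbours for the same node; the difference is only in how the first neighbour is located. The sketch says nothing about constants, caching, or disk layout, which is where the rest of this thread disagrees.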

thesz · on March 12, 2016

Use covering indices for edges; they are vastly cheaper than regular indices. Choose a write-optimized storage layer (fractal tree indices (circa 2007, BTW!), LSM) for the DB. Don't do singular accesses, use bulk accesses (and now we hit the wall of the PL/I and Fortran you mentioned). I can go on.

I cannot help but feel that what you describe is a classic example of technical inertia due to massive technical debt. It cannot prove that relational DBs are bad for CADs.
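A minimal sketch of the first and third suggestions, using SQLite through Python's standard `sqlite3` module purely as a stand-in (the thread does not say which DBMS either side used). The `edges` table, its columns, the index name, and the `neighbours_bulk` helper are all assumptions made for illustration. An index on `(src, dst)` contains every column the neighbour query needs, so the engine can answer from the index without visiting the table rows; the bulk helper fetches the neighbours of many nodes in one statement instead of one query per node.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER, label TEXT)")
# The index holds both src and dst, so it covers queries that only need dst for a given src.
conn.execute("CREATE INDEX edges_src_dst ON edges (src, dst)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [(1, 2, "pipe"), (1, 3, "cable"), (2, 3, "pipe"), (4, 1, "pipe")],
)

# Ask SQLite how it would execute the single-node neighbour query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT dst FROM edges WHERE src = ?", (1,)
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)  # the detail text is the last column


def neighbours_bulk(conn, sources):
    """Fetch the neighbours of many source nodes with a single query."""
    placeholders = ",".join("?" for _ in sources)
    rows = conn.execute(
        f"SELECT src, dst FROM edges WHERE src IN ({placeholders}) ORDER BY src, dst",
        list(sources),
    ).fetchall()
    result = {s: [] for s in sources}
    for src, dst in rows:
        result[src].append(dst)
    return result
```

Printing `plan_text` should show the query being served from the covering index. Note that this is still a B-tree search, so it removes the extra hop to the table row rather than the index lookup itself.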

sklogic · on March 12, 2016

Are any of the indices you're describing O(1)? Very, very unlikely.

As for the bulk operations, they're in most cases totally useless. Rendering - maybe, but most of the other CAD operations require precise edge following.

And why even go to all the trouble of using this totally unsuitable relational representation when a proper graph DBMS is so much easier? Relational religion is so funny, almost as funny as OOP.

thesz · on March 15, 2016

What exactly are these "most CAD operations"? I bet they follow many links from nodes (note the plural!) in most of them. Single-node operations are rare, and even then there are many links to follow. This is true for scheme/PCB editor, and I bet it is even more true for arch design.

You can look at these indices as O(1) operations. Btrees are just like that.

(if you think that memory access is O(1), you are wrong)

Graph DBs, more often than not, are ad hoc bug ridden slow poor implementation of one tenth of relatively complete implementation of relational DB.

sklogic · on March 15, 2016

> What exactly are these "most CAD operations"?

CSG, place and route (pipes, cables, etc.), design constraint checks, all that stuff.

> I bet they follow many links from nodes (note the plural!) in most of them.

Not that many, mostly single-digit numbers.

> This is true for scheme/PCB editor

Which is very, very different from an oil refinery or an aircraft carrier, both in scale and in typical operations.

> You can look at these indices as O(1) operations. Btrees are just like that.

WAT?!? Not even close. O(log n) at best. And the multiplier there is huge.
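To put rough numbers on the O(1) versus O(log n) disagreement, here is a minimal sketch that counts how many levels a lookup descends in a B-tree. The function name `btree_height` and the fanout values are illustrative assumptions, the tree is assumed to be full (each level multiplies capacity by the fanout), and the per-level cost, i.e. the multiplier (a page fetch plus a search inside the page), is deliberately not modelled.

```python
def btree_height(n_keys, fanout):
    """Number of levels a lookup descends in a full tree with the given fanout.

    Integer arithmetic is used instead of math.log to avoid rounding
    errors at exact powers of the fanout.
    """
    height = 1
    capacity = fanout  # keys reachable with a single level
    while capacity < n_keys:
        capacity *= fanout
        height += 1
    return height


billion = 10**9
levels_wide = btree_height(billion, 256)   # wide B-tree node
levels_binary = btree_height(billion, 2)   # binary tree, for comparison
```

With a fanout of 256, a billion keys need 4 levels, against 30 for a binary tree. The lookup is logarithmic rather than constant, but the base of the logarithm is the fanout, so the depth grows very slowly; whether each of those levels is cheap is a separate question.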

> Graph DBs, more often than not, are ad hoc bug ridden slow poor implementation of one tenth of relatively complete implementation of relational DB.

What?

Graph DBs are orders of magnitude simpler than any relational pile of mess. It's really hard to screw them up. Everything is trivial there, including transactions, logging, referential integrity and all that.

thesz · on March 18, 2016

Design constraint checks touch A LOT of stuff: usually things have several classes attached to them, and checks need to be performed on the union or intersection of class-related data. Place and route, even for a 2D PCB, also touches a lot of stuff: it needs to access constraints, at the very least. In our case, the constraint system had to return data to the real-time PCB router in under 100 µs, for several tens of millions of different classes of constraints. I cannot see how you can speak of singular accesses in this context, or of "single-digit" numbers of accesses.

I'll leave the conversation now. We clearly have different views on almost everything, including, but not limited to, "huge multipliers".