tszming · on Dec 9, 2022

So what graph database did you end up with?

PaulHoule · on Dec 9, 2022

It's been a long strange trip.

I don't think a single product is going to satisfy everyone and that's a problem with the category. If by "graph database" you mean you want to do completely random access workloads you are doomed to bad performance.

There was a time when I was regularly handling around 2³² triples with Openlink Virtuoso in cloud environments in an unchanging knowledge base. I was building that knowledge base with specialized tools that involved

   * map/reduce processing
   * lots of data compression
   * specialized in-memory computation steps
   * approximation algorithms that dramatically speed things up (there was a calculation that would have taken a century to do exactly that we could get very close to in 20 minutes)

Another product I've had a huge amount of fun with is ArangoDB; in particular, I have used it for application work on my own account. When I flip a Sengled switch in my house and it lights up a Hue lamp, ArangoDB is a part of it. I am working on a smart RSS reader right now which puts a real UI in front of something like

https://ontology2.com/essays/ClassifyingHackerNewsArticles/

and using ArangoDB for that. I haven't built apps for customers with it, but I did do some research projects where we used it to work with big biomedical ontologies like MeSH, and it held up pretty well.

I came to the conclusion that it wasn't scalable to throw everything into one big graph, particularly if you were interested in inference, and went through many variations of what to do about it. One concept was a "graph database construction set" that would help build multi-paradigm data pipelines like the ones described above. That same conclusion got me interested in systems that work with lots of little graphs.

I got serious and paired up with a "non-technical cofounder" and we tried to pitch something that works like one of those "boxes-and-lines" data analysis tools like Alteryx. Tools like that ordinarily pass relational rows along the lines, but that makes the data pipelines a bear to maintain: people have to set up joins, so what seems like a local operation that could be done in one part of the pipeline requires you to scatter boxes and lines all across a big computation.

I built a prototype that used small RDF graphs like little JSON documents and defined a lot of the algebra over those graphs and used stream processing methods to do batch jobs. It wasn't super high performance but coding for it was straightforward and it was reliable and always got the right answers.
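To make the "small graphs as documents" idea concrete, here is a minimal Python sketch. It is not the prototype described above: the operation names (`union`, `select`, `map_graphs`), the `ex:` vocabulary, and the sample records are all assumptions made up for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Iterator, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
Graph = FrozenSet[Triple]       # one small graph, treated like a document


def union(a: Graph, b: Graph) -> Graph:
    """Merge two small graphs; set semantics make this idempotent."""
    return a | b


def select(g: Graph, predicate: str) -> Graph:
    """Keep only the triples that use the given predicate."""
    return frozenset(t for t in g if t[1] == predicate)


def map_graphs(fn: Callable[[Graph], Graph], stream: Iterable[Graph]) -> Iterator[Graph]:
    """Apply a graph-to-graph operation to each graph as it streams past."""
    for g in stream:
        yield fn(g)


def add_full_name(g: Graph) -> Graph:
    """A local step: derive ex:fullName from ex:first and ex:last in the same graph."""
    first = {s: o for s, p, o in g if p == "ex:first"}
    last = {s: o for s, p, o in g if p == "ex:last"}
    derived = frozenset(
        (s, "ex:fullName", f"{first[s]} {last[s]}")
        for s in first.keys() & last.keys()
    )
    return union(g, derived)


# Hypothetical input: one small graph per record.
records = [
    frozenset({("ex:p1", "ex:first", "Ada"), ("ex:p1", "ex:last", "Lovelace")}),
    frozenset({("ex:p2", "ex:first", "Alan"), ("ex:p2", "ex:last", "Turing")}),
]

enriched = list(map_graphs(add_full_name, records))
names = sorted(o for g in enriched for _, _, o in select(g, "ex:fullName"))
```

Because each record carries everything the step needs, the derivation stays local to one box in the pipeline instead of requiring a join against another stream of rows.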

I had a falling out with my co-founder, but we talked to a lot of people and found that database and processing pipeline people were skeptical about what we were doing in two ways. One was that the industry was giving up even on row-oriented processing and moving towards column-oriented processing, and people in the know didn't want to fund anything different. (Learned a lot about that; I sometimes drive people crazy with "you could reorganize that calculation and speed it up more than 10x" and they are like "no way", ...) Also, I found out that database people really don't like the idea of unioning a large number of systems with separate indexes; they kinda tune out and don't listen until the conversation moves on.

(There is a "disruptive technology" situation in that vendors think their customers demand the utmost performance possible but I think there are people out there who would be more productive with a slower product that is easier and more flexible to code for.)

I reached the end of my rope and got back to working ordinary jobs. I wound up working at a place which was working on something that was similar to what I had worked on but I spent most of my time on a machine learning training system that sat alongside the "stream processing engine". I think I was the only person other than the CEO and CTO who claimed to understand the vision of the company in all-hands meetings. We did a pivot and they put me on the stream processing engine and I found out that they didn't know what algebra it worked on and that it didn't get the right answers all the time.

Back in those days I got on a standards committee involved w/ the semantics of financial messaging and I have been working on that for years. Over time I've gotten close to a complete theory for how to turn messages (say XML Schema, JSON, ...) and other data structures into RDF structures and after I'd given up I met somebody who actually knows how to do interesting things with OWL, I got schooled pretty intensively, and now we are thinking about how to model messages as messages (e.g. "this is an element, that is an attribute, these are in this exact order...") and how to model the content of messages ("this is a price, that is a security") and I'm expecting to open source some of this in the next few months.

These days I am thinking about what a useful OWL-like product would look like with the advantage that after my time in the wilderness I understand the problem.

Jupe · on Dec 9, 2022

Fun read above - very descriptive and interesting... thanks for sharing!

OWL? RDF? Were you an RPI graduate perhaps? (I wasn't, but did visit them once as part of a research project.)

At the end of the day, triple stores (or quad stores with provenance) never quite worked as well as simple property graphs (at least for the problems I was solving). I was never really looking for inference, more like extracting specific sub-graphs, or "colored" graphs, so property attribution was much simpler. Ended up fitting it into a simple relational database and performance was quite good; even better than the "best" NoSQL solutions out there at the time.

And, triple stores just seem to require SO MANY relations! RDF vs Property Graphs feels like XML vs JSON. They both can get the job done, but one just feels "easier" than the other.
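A rough Python sketch of where the extra relations come from: a property-graph edge can carry its own key/value properties, while in plain triples one common workaround is to introduce an intermediate node for the relationship. The `ex:` names and the `edge_to_triples` helper are hypothetical, and this is only one of several modelling patterns.

```python
# Property graph: the edge itself holds properties.
pg_edge = ("ex:alice", "WORKS_AT", "ex:acme", {"since": 2019, "role": "engineer"})


def edge_to_triples(edge_id, src, rel, dst, props):
    """Model one property-graph edge as triples hanging off an intermediate node."""
    triples = {
        (edge_id, "rdf:type", f"ex:{rel}"),
        (edge_id, "ex:source", src),
        (edge_id, "ex:target", dst),
    }
    # Each edge property becomes one more triple about the intermediate node.
    triples |= {(edge_id, f"ex:{key}", str(value)) for key, value in props.items()}
    return triples


job = edge_to_triples("ex:job1", *pg_edge)
for triple in sorted(job):
    print(triple)
```

Under this pattern, one edge with n properties turns into 3 + n triples, which matches the "so many relations" feeling even though both forms hold the same information.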

carterschonwald · on Dec 9, 2022

Where are you based? I'd love to hear more about this set of adventures over tea or coffee sometime.

PaulHoule · on Dec 9, 2022

Check my profile and send me an email.

estro0182 · on Dec 9, 2022

This is interesting. What is OWL?

Jupe · on Dec 9, 2022

I think he's referring to Web Ontology Language; IIRC it is a kind of schema for relations in graphs. It was a big part of the Semantic Web surge from 10+ years ago.

lolive · on Dec 9, 2022

RDF is the JSON of graphs. OWL is the JSON Schema of RDF.

tszming · on Nov 21, 2016

Environments (dev, prod etc) support would be great, currently need to use apex to simulate.

tszming · on Nov 19, 2016

Content farm will find it useful.

tszming · on Oct 6, 2016

As a product that creates so many job opportunities globally and didn't cost you $0.01, I would like to ask: how shameful is it?

tszming · on Sept 30, 2016

(Rant) If your site's JS file is not minified, your script will be blocked (Chrome 99)

tszming · on Sept 26, 2016

For those who want to take up a challenge - fork a version of Chrome Android with extension (aka AdBlock) support.

billyjobob · on Sept 26, 2016

https://play.google.com/store/apps/details?id=org.adblockplu...

tszming · on Sept 19, 2016

For those who are not familiar with email CSS support, Gmail is actually a blocker, not a mover: https://www.campaignmonitor.com/css/

buro9 · on Sept 19, 2016

You say that like it is a bad thing.

I'm really pleased that email cannot set a style header and has limited ability to have the email deviate greatly in presentation from other email I receive.

amelius · on Sept 19, 2016

Why not block styling on web pages too then? I mean, what makes an email different from a web page, other than that it has been sent to you?

Just playing devil's advocate here.

buro9 · on Sept 19, 2016

> what makes an email different from a web page, other than that it has been sent to you?

I think it's to do with perceived ownership of the environment in which the information is consumed.

Your website... feel free to style, brand, make pretty or ugly. You can make the experience consistent within your realm.

My inbox... I get to control my workflows, how I consume things, in which order, etc. I choose to make the experience consistent within my realm.

That holds fairly well as a definition for why I prefer chat clients that grant me the ability to make all messages consistent, and that do not allow the sender to dictate terms. It also holds up with things like Netflix, it's their realm they can knock themselves out on their design.

It does seem to be whether the environment it is presented is "yours" or "mine".

nailer · on Sept 19, 2016

It's your client, but the site's content, in both web and email cases.

You are free to set your client to ignore styling on the content, but the content should be stylable for everyone else that wants it to look good.

ClashTheBunny · on Sept 19, 2016

One is push, the other is pull. I can choose never to go to a URL again, but I can't choose to never receive spam from the same person from a different address and server.

Also, I think many people do that with AdBlock, so it's not unheard of to block styling on webpages of dubious origin.

tszming · on Sept 19, 2016

However Gmail does support inline CSS.

If you want to show your styles to Gmail users, you have to inline your CSS in every HTML tag you want to alter, which is very ugly and makes the email unnecessarily large.
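For anyone wondering what "inlining" means mechanically, here is a minimal Python sketch using only the standard library. It is an assumption-heavy toy: it only understands bare tag selectors passed in as a dict (the `rules` argument is hypothetical), and it drops comments and doctypes. Real inliners handle full selectors and specificity.

```python
from html import escape
from html.parser import HTMLParser


class TagStyleInliner(HTMLParser):
    """Re-emit HTML, copying per-tag CSS declarations into style attributes."""

    def __init__(self, rules):
        super().__init__(convert_charrefs=False)
        self.rules = rules   # e.g. {"p": "color:#333"}
        self.out = []

    def _open(self, tag, attrs, closing=""):
        attrs = dict(attrs)
        if tag in self.rules:
            existing = (attrs.get("style") or "").strip(";")
            # Stylesheet rule goes first so an existing inline style still wins.
            attrs["style"] = ";".join(s for s in (self.rules[tag], existing) if s)
        rendered = "".join(
            f" {k}" if v is None else f' {k}="{escape(v, quote=True)}"'
            for k, v in attrs.items()
        )
        self.out.append(f"<{tag}{rendered}{closing}>")

    def handle_starttag(self, tag, attrs):
        self._open(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        self._open(tag, attrs, closing=" /")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def inline_styles(html, rules):
    """Return html with each matching tag given an inline style attribute."""
    parser = TagStyleInliner(rules)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(inline_styles("<p>Hi <b>there</b></p>", {"p": "color:#333", "b": "font-weight:bold"}))
```

The size complaint falls out directly: every styled tag repeats its declarations, so the markup grows with the number of elements rather than the number of rules.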

chrisscastaneda · on Sept 19, 2016

Most every email client supports inline CSS, and in my experience, inlining your CSS for emails is the safest bet to ensure your styling is consistent across all email clients. What you can't inline ever (on email or web) is CSS media queries. IMHO, I don't think media queries are supported broadly enough to use them extensively, though Gmail supporting them is a step in the right direction.

ianhawes · on Sept 19, 2016

Changelog on this chart indicates it was last updated in 2014.

tszming · on Aug 28, 2016

Why we don't sell ads (2012) https://blog.whatsapp.com/245/Why-we-dont-sell-ads?

espadrine · on Aug 28, 2016

I think the story blew out of proportion.

As far as I can tell, WhatsApp never said they would show ads in their app, and they never said they would sell ads.

On the contrary, they said that they would make it even harder for them to have any data to work with, as messages will be end-to-end encrypted, and so, undecipherable for them.

What they said would happen was telephone and metadata (ie, telephone contacts) cross-references with Facebook's data to improve Facebook's suggestions, which is a roundabout but logical non-automatic contact synchronization scheme.

Reading between the lines, what their blog post was meant to prepare the public for was the arrival of a bot API (as yet secret, and a pure speculation on my part). They want businesses both global and local to communicate with you:

> we want to explore ways for you to communicate with businesses that matter to you too, while still giving you an experience without third-party banner ads and spam

WhatsApp's PR department did a poor job, in my opinion, by talking about Facebook being able to offer better ads, since naturally everyone assumes it means transmitting your telephone number to advertisers, which they awkwardly mention they would not do:

> We won’t post or share your WhatsApp number with others, including on Facebook, and we still won't sell, share, or give your phone number to advertisers.

Adverblessly · on Aug 28, 2016

> As far as I can tell, WhatsApp never said they would show ads in their app, and they never said they would sell ads.

From their update ( https://www.whatsapp.com/legal/#key-updates )

"New ways to use WhatsApp. We will explore ways for you and businesses to communicate with each other using WhatsApp, such as through order, transaction, and appointment information, delivery and shipping notifications, product and service updates,

>>and marketing.<<

For example, you may receive flight status information for upcoming travel, a receipt for something you purchased, or a notification when a delivery will be made.

>>Messages you may receive containing marketing could include an offer for something that might interest you.<<

We do not want you to have a spammy experience; as with all of your messages, you can manage these communications, and we will honor the choices you make."

I interpret the highlighted parts to mean they intend to show ads in the application in some format. It could be ad overlays, it could be adbots sending messages and it could be ad notifications.

Edits: figuring out how to highlight text :X

themartorana · on Aug 28, 2016

There is a non-evil way to interpret this. A lot of small international business is conducted over WhatsApp - for instance, many key sellers in the vintage watch market offer WhatsApp as a communications method. But it requires more work than necessary to initiate the conversation. Allowing these sellers to accept incoming messages without having to use it like an individual would make it significantly easier to use (and might help displace email in many business transactions).

A business account for one-to-one communications with a business would be somewhat revolutionary. It's more private (and conversational) than Twitter, less bulky and in-the-way than email, etc.

espadrine · on Aug 28, 2016

I'm not sure in which dictionary unsolicited electronic messages sent for commercial purposes are not spam…

That formulation is pretty damning. There's the chance that marketing material will only be delivered to users that used a commercial bot, but that would still be unsolicited in my opinion.

nly · on Aug 28, 2016

What did we really expect? That they were going to support a 1 billion user app as a public good? The writing was on the wall when they dropped the 79 pence per year fee

AlexandrB · on Aug 28, 2016

I think that it's a good reminder that all the sentiment in the world is meaningless in the face of changing ownership and/or business realities. See also Instapaper + Pinterest.

Edit - Also a reminder to founders: If you want your vision for the company to survive, don't sell it!

bad_user · on Aug 28, 2016

Founders that sell are very much aware of this fact. If they're selling it's either because they have no choice, or because they don't believe in said vision. In the end, unless you have a strong contract that protects users with the threat of a fork, like being based on a distributed open protocol, or an open source license, it's all just bullshit and marketing, whereas aligning with your users interests is only temporary.

I also think there's an inherent lesson for users here, more than it is for founders: don't trust startups, most of them won't survive and won't have the decency to die either, preferring instead to sell your account and data to the highest bidder.

Freak_NL · on Aug 28, 2016

> If they're selling it's either because they have no choice, or because they don't believe in said vision.

Or just because the price is right. Some may believe that they can do something even greater with the money, and some simply realize that a lot of money is the more attractive option for them. The latter may even have faith that the purchaser will keep the vision alive (and most will be disappointed in that respect).

bogomipz · on Aug 28, 2016

I didn't understand your referencing "Instapaper + Pinterest." Could you explain how these relate?

arthurfm · on Aug 28, 2016

Pinterest acquired Instapaper.

http://blog.instapaper.com/post/149374303661

_phaq · on Aug 28, 2016

I did expect it, but I'd still rather pay with money.

hahooooo · on Aug 28, 2016

Or pay with both, like the NYT (print edition) which costs money _and_ comes with ads, Windows 10 (assuredly not free, the OEM has to pay for it)

Qantourisc · on Aug 28, 2016

I'd love to see some companies offer both. The question then, of course: how can we trust them with the infrastructure in place?

thr0waway1239 · on Aug 29, 2016

The way you framed the question, it seems so obvious that they "had to find a way to monetize". But there were probably some inflection points. This is where the issue of the customer trusting the website/app comes in. Initially, WhatsApp was charging a very small fee ($1 a year?) and making it appear as if they were doing it to avoid selling customer data.

The following post has already been linked many times during the related comment threads, but I am posting it again because I bet most people never actually read the whole thing.

----------

Link: https://blog.whatsapp.com/245/Why-we-dont-sell-ads

Text:

When we sat down to start our own thing together three years ago we wanted to make something that wasn't just another ad clearinghouse. We wanted to spend our time building a service people wanted to use because it worked and saved them money and made their lives better in a small way. We knew that we could charge people directly if we could do all those things. We knew we could do what most people aim to do every day: avoid ads.

No one wakes up excited to see more advertising, no one goes to sleep thinking about the ads they'll see tomorrow. We know people go to sleep excited about who they chatted with that day (and disappointed about who they didn't). We want WhatsApp to be the product that keeps you awake... and that you reach for in the morning. No one jumps up from a nap and runs to see an advertisement.

Advertising isn't just the disruption of aesthetics, the insults to your intelligence and the interruption of your train of thought. At every company that sells ads, a significant portion of their engineering team spends their day tuning data mining, writing better code to collect all your personal data, upgrading the servers that hold all the data and making sure it's all being logged and collated and sliced and packaged and shipped out... And at the end of the day the result of it all is a slightly different advertising banner in your browser or on your mobile screen.

Remember, when advertising is involved you the user are the product.

At WhatsApp, our engineers spend all their time fixing bugs, adding new features and ironing out all the little intricacies in our task of bringing rich, affordable, reliable messaging to every phone in the world. That's our product and that's our passion. Your data isn't even in the picture. We are simply not interested in any of it.

When people ask us why we charge for WhatsApp, we say "Have you considered the alternative?"

-----------

Notice the last line. And ask yourself if you were one of those people who actually believed these words when the blog post was published.

This is why we really need to be outraged. It is surprisingly easy to say the kind of words that con people and then take advantage of them over the long run. And then we make it worse by acting like it is no big deal. Acting like it is no big deal actually emboldens such companies. Now clearly FB lacks this moral compass right at the top (and yes, all my comments make this pretty obvious), but I am starting to wonder if the companies are getting away because there are absolutely no negative sanctions. Soon, this will just become the norm and the accepted practice. If LittleStartupCo pulls a similar bait and switch tomorrow, they will say "Yeah, but WhatsApp did the same thing, and there was a small commotion and people just quickly moved on"

Don't move on. Actually create a ruckus and cause some backlash. If the only problem that the company faces is a few nerds making a small ruckus on a nerd forum, then they will keep doing these things.

Instead, next time you see your friend who works at Facebook/Google/Microsoft/... talk about ethics in any context, mock them for participating in the discussion when they clearly don't have the spine to display the same ethics in their professional work. Just automatically discount their views/doubt their motives in every context, and I bet you will see the message will slowly start moving up to the top of the org.

Even better, stop hiring alumni of these organizations unless they make an open statement that they were sorry for their involvement in organizations which had no regard for privacy. Does it sound drastic? Then how does it feel to have supported the rise of WhatsApp in their days when they badly needed the support, only to see the bait and switch and the oh so casual - "sorry, but we are not really sorry, just FO losers"?

tszming · on Aug 26, 2016

I think most of the providers (e.g. DO, Linode, CloudFlare, etc.) do not check the authority of DNS due to the chicken-and-egg problem. The AWS way of handling this issue is definitely awesome, but the infrastructure required is not worth it for companies that provide a "free DNS service" as an add-on for their existing customers. Anyway, IMO, it is your fault if you point to a nameserver but don't actually use it.

jsmthrowaway · on Aug 26, 2016

The random nameservers are only accidentally a defense against this attack. They're avoiding SPOFs, including TLDs -- you never receive nameservers in the same TLD for example. It's a reliability and scaling consideration with this accidental benefit.

Most admins don't think about a complete TLD failure. Amazon did.
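A toy Python sketch of the "one nameserver per TLD" idea, to show why it also makes a delegation hard to guess. This is not Amazon's actual algorithm; the pool size, hostnames, and function names are all made up for illustration.

```python
import random

# Hypothetical pool: 500 made-up nameserver hostnames under each of four TLDs.
POOL = {
    tld: [f"ns-{i}.dns-example.{tld}" for i in range(1, 501)]
    for tld in ("com", "net", "org", "co.uk")
}


def assign_nameservers(pool, count=4, rng=random):
    """Pick `count` nameservers, each from a different TLD."""
    tlds = rng.sample(sorted(pool), count)
    return [rng.choice(pool[tld]) for tld in tlds]


def tld_of(hostname, pool):
    """Return the pool TLD that this hostname ends with."""
    return next(t for t in sorted(pool, key=len, reverse=True)
                if hostname.endswith("." + t))


delegation = assign_nameservers(POOL)
print(delegation)
```

With these made-up numbers there are 500**4 possible delegation sets, so someone creating a zone on the same service is very unlikely to be handed the exact set that a stale delegation points at, even though the design goal was only to survive a whole-TLD outage.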

tszming · on Aug 26, 2016

>> accidental benefit.

Agree

>> Most admins don't think about a complete TLD failure. Amazon did.

I think companies such as Google or Facebook have thought about that before, but I am not sure why they didn't use this trick.

tszming · on Aug 15, 2016

> Error establishing a database connection

Why you should never use a database

virmundi · on Aug 15, 2016

I know there is a sense of biting irony here, but it really is a valid comment for relatively static sites. Look at the Git-based systems out there. If you can generate your site without a tier, generate it without a tier.
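A minimal Python sketch of what "generate it without a tier" can look like: render every page to a plain HTML file at build time, so nothing has to connect to a database when a visitor arrives. The template, the `posts` structure, and `build_site` are hypothetical; real static site generators do far more.

```python
from html import escape
from pathlib import Path
from string import Template

# Hypothetical page template; real generators use richer templating.
PAGE = Template(
    "<!doctype html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><h1>$title</h1>\n<p>$body</p></body></html>\n"
)


def build_site(posts, out_dir):
    """Write one static HTML file per post and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for slug, post in posts.items():
        path = out / f"{slug}.html"
        html = PAGE.substitute(title=escape(post["title"]), body=escape(post["body"]))
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    posts = {"hello": {"title": "Hello", "body": "No database was harmed."}}
    for path in build_site(posts, "public"):
        print("wrote", path)
```

The output directory can then be served by any plain file server or CDN, which removes the "Error establishing a database connection" failure mode for read-only content.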