
This is absolutely the problem. But there is a line of sight to a solution; namely, combining LLMs with existing semantic data technologies (e.g., RDF).

This is why I'm building a federated query optimizer: we want to let the LLM reason and formulate queries at the ontological level, with query execution operating behind a layer of abstraction.
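To make that concrete (an illustrative sketch only, not our actual stack; the namespace, data file, and rdflib usage here are placeholders): the LLM emits a query written purely in the ontology's vocabulary, and the execution layer decides how it actually gets answered against the underlying sources.

    # Sketch: the LLM writes SPARQL against ontology terms only;
    # the layer below maps those terms to physical sources.
    from rdflib import Graph

    # Assume the ontology and instance data have already been pulled
    # together behind the abstraction layer.
    g = Graph()
    g.parse("ontology_and_data.ttl", format="turtle")  # placeholder source

    # No table or column names appear here, only ontological terms
    # like ex:Report and ex:generatedOn.
    llm_generated_query = """
        PREFIX ex: <http://example.org/ontology#>
        SELECT ?report ?date
        WHERE {
            ?report a ex:Report ;
                    ex:generatedOn ?date .
        }
    """

    for row in g.query(llm_generated_query):
        print(row.report, row.date)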




Unfortunately this doesn't address the problem I'm describing.

My team made these ontologies available to the LLM by providing them in the context window. The queries it produced were ontologically sensible at a surface level, but still wrong.

The problem is that your ontology is rapidly changing in non-obvious and hard-to-document ways, e.g. "this report is only valid if it was generated on a Tuesday or Thursday after 1pm, because that's when the ETL runs; at any other time the data will be incorrect."
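To illustrate (hypothetical code, not anything from the real system): if that rule were ever written down, it would amount to something like

    from datetime import datetime

    def report_is_valid(generated_at: datetime) -> bool:
        # Tuesday is 1 and Thursday is 3 in Python's weekday() numbering;
        # the ETL only runs on those days, finishing by 1pm.
        return generated_at.weekday() in (1, 3) and generated_at.hour >= 13

except that no such check exists anywhere outside the analysts' heads.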


The analysts know where the bodies are buried, so to speak. The execs may not even be aware there are bodies.


This got me curious as to what "queries at the ontological level" means in concrete terms. It's been a good long while since I did anything even remotely data-engineering-like, and back then "AI" could be something like a support vector machine (yay, moving goalposts), so I haven't had to deal with this sort of stuff at all.


Line of sight to a problem-solving architecture, while cool, is nowhere near line of sight to upgrading the existing crappy data that is critically intertwined with literally thousands of apps in a typical enterprise.





