Show HN: Describe SQL using natural language, and execute against real data

bob1029 · on Dec 16, 2021

This demo sent us on a warpath today. We have a fairly clean SQL schema for which we need to craft a lot of queries that handle things like business logic, reporting and configuration.

If we could get even 50% success rate on a reasonable starting point for the generated SQL each time, that would be the biggest value-add our organization has ever seen.

I think our use case is compelling because we have to implement the same SQL targets for every customer. The only variations are typically customer-specific parameters/codes/etc.

We also have a huge corpus of examples to pull from for training data.

We are thinking about initially implementing some higher order views/functions in our SQL dialect to make things easier on ourselves with the GPT model. Complex joins across many tables seems to be something that would still elude these techniques. Most of our joins are of a very particular shape, so we can abstract the super nasty stuff away.

Worst case scenario, this concludes like my cynical mind assumes it will, but I am open to being surprised this time. We aren't going to put everything behind this, more of a "if it works..." kind of 1-2 week experiment.

atom_arranger · on Dec 17, 2021

There are projects out there that do this.

Possibly relevant: https://yale-lily.github.io/spider

I briefly worked on a startup to commercialize this tech, but we decided it wasn't accurate enough to be useful. It was very cool when it actually worked. If you can only produce what you want half the time on simple queries, that doesn't seem very useful to me though.

weepa · on Dec 18, 2021

Can you elaborate on what kind of use cases you were trying to tackle using NL2SQL in that startup? Who was the target audience/persona?

atom_arranger · on Dec 20, 2021

Any company that has a lot of data in an SQL system that they want to make sense of. The idea would be that business intelligence people, analysts, the CEO, anyone who needed answers could ask a question in plain english and hopefully get a response.

The success rate was just not good enough, even for relatively simple queries. You'd probably need to adjust the query 90% of the time, and the other 10% you couldn't even really trust that the answer was correct.

napoleond · on Dec 16, 2021

Shoot me an email if you’d like—I’d be happy to share any learnings from what I built if relevant.

newusertoday · on Dec 17, 2021

Have you considered using templates? can you elaborate more on why they can't be used (I guess there is no way to cleanly separate parameters but i may be wrong)

tillvz · on Dec 16, 2021

Hey that's very cool and works surprisingly well!

At Veezoo (http://www.veezoo.com) we have been tackling this problem for over 5 years now.

Under the hood we are using our own models. With GPT-3 we're a bit worried about the lack of fine-grained control needed for productive use-cases and obv. also lock in.

Will try out against the same dataset and see how it compares!

tluyben2 · on Dec 16, 2021

Nice product, will give it a spin. For the banking market. Do you have an API exposed as well to feed questions and get (json) data? Our clients would definitely need it integrated fully.

tillvz · on Dec 16, 2021

Currently we don't have such an API exposed. Main reason is the focus on offering a complete self-service analytics solution (i.e. offering next to the NL2SQL also important UI/UX components)

Still we have other ways of integrating it with other systems e.g. exporting to CRMs. Also on the branding side it's possible to have it in the companies corporate identity.

And just let me know if you wanna get a quick tour, happy to show you around :)!

mritchie712 · on Dec 16, 2021

not OP, but this is exactly what ThoughtSpot Everywhere is for:

https://www.thoughtspot.com/everywhere

napoleond · on Dec 16, 2021

Nice! Yeah I have no doubt that a specialized model could beat this general one, although I find the output from the general one to be uncanny at times. Would love to hear your expert opinion on how they compare!

cafed00d · on Dec 16, 2021

Wow, this is absolutely brilliant!

How can I extend this to use other datasets? There seem to be quite a few interesting publicly available datasets out there: https://console.cloud.google.com/marketplace/browse?filter=s...

and I'm wondering if: 1. Is there a paid-for version of your app/website where I can plug in a diff dataset? 2. Have you considered sharing the source code for others to recreate and plug in diff datasets? 3. Or, :sweat_smile:, perhaps this is as simple as adding new datasets to the drop-down menu? say, finance data / Bitcoin transactions data?

Regardless, really cool app!

napoleond · on Dec 16, 2021

For other public BigQuery datasets I’d certainly consider just adding them to the demo for free—which specific Bitcoin/finance datasets did you have in mind?

For private datasets, we’re looking at adding that functionality to the core Tabby service (that’s the SaaS this relates to). Please email for info!

mritchie712 · on Dec 16, 2021

We did a similar analysis a while back, works surprisingly well! I OpenAI increases the amount of "training data" you can send in, I think it could get really good at generating SQL.

https://blog.seekwell.io/gpt3

napoleond · on Dec 16, 2021

Cool! Yes, the ability to customize the model (https://openai.com/blog/customized-gpt3/) also seems like it could be really useful for this.

earleybird · on Dec 16, 2021

May be some rough edges - or I have unreasonable expecations. The weather data set has 'begin' and 'end' columns which generates bad sql.

ie, "ERROR: Syntax error: Expected end of input but got keyword END at [49:3]"

napoleond · on Dec 16, 2021

Yeah many rough edges indeed. The generated SQL is the plain output from GPT-3; I have not done anything to customize the model or validate syntax outside it, so the roughness is expected. No idea if folks will find value in this despite that, hence the demo.

earleybird · on Dec 17, 2021

Definitely passes the 'is it interesting' test. Certainly passes the 'should dig deeper' gate. Well done. I like to think I have a solid sql understanding but it's also nice to be able to ask questions in a natural language.

I'm still amazed that I can type a natural language phrase into google and often get sensible results.

napoleond · on Dec 16, 2021

Clickable link: https://app.tabbydata.com/sql-assistant-demo

perl4ever · on Dec 17, 2021

   /*
   Describe the data in the all_geoclustered table.
   */

   SELECT
     COUNT(*) AS total_rows,
     COUNT(*) FILTER (WHERE date IS NOT NULL) AS total_rows_with_date,
     COUNT(*) FILTER (WHERE wban IS NOT NULL) AS total_rows_with_wban,
     COUNT(*) FILTER (WHERE stn IS NOT NULL) AS total_rows_with_stn,
     COUNT(*) FILTER (WHERE year IS NOT NULL) AS total_rows_with_year,
     COUNT(*) FILTER (WHERE mo IS NOT NULL) AS total_rows_with_mo,
     COUNT(*) FILTER (WHERE da IS NOT NULL) AS total_rows_with_da,
     COUNT(*) FILTER (WHERE temp IS NOT NULL) AS total_rows_with_temp,
     COUNT(*) FILTER (WHERE count_temp IS NOT NULL) AS total_rows_with_count_temp,
     COUNT(*) FILTER (WHERE dewp IS NOT NULL) AS total_rows_with_dewp,
     COUNT(*) FILTER (WHERE count_dewp IS NOT NULL) AS total_rows_with_count_dewp,
     COUNT(*) FILTER (WHERE slp IS NOT NULL) AS total_rows_with_slp,
     COUNT(*) FILTER (WHERE count_slp IS NOT NULL) AS total_rows_with_count_slp,
     COUNT(*) FILTER (WHERE stp IS NOT NULL) AS total_rows_with_stp,
     COUNT(*) FILTER (WHERE count_stp IS NOT NULL) AS total_rows_with_count_stp,
     COUNT(*) FILTER (WHERE visib IS NOT NULL) AS total_rows_with_visib,
     COUNT(*) FILTER (WHERE count_visib IS NOT NULL) AS total_rows_with_count_visib,
     COUNT(*) FILTER (WHERE wdsp IS NOT NULL) AS total_rows_with_wdsp,
     COUNT(*) FILTER (WHERE count_wdsp IS NOT NULL) AS total_rows_with_count_wdsp,
     COUNT(*) FILTER (WHERE mxpsd IS NOT NULL
     
     
   /*
   How often do hail and thunder happen together?
   */

   SELECT
     COUNT(*) AS count,
     SUM(IF(hail=1 AND thunder=1, 1, 0)) AS sum_hail_thunder
   FROM
     `fh-bigquery.weather_gsod.all`
   WHERE
     year = 2019
     AND mo = 11
     AND da = 14
     AND stn = '725030'

lswainemoore · on Dec 17, 2021

/* When it rains does it pour? */

SELECT date, wban, stn, year, mo, da, temp, count_temp, dewp, count_dewp, slp, count_slp, stp, count_stp, visib, count_visib, wdsp, count_wdsp, mxpsd, gust, max, flag_max, min, flag_min, prcp, flag_prcp, sndp, fog, rain_drizzle, snow_ice_pellets, hail, thunder, tornado_funnel_cloud, usaf, name, country, state, call, lat, lon, elev, begin, end, point_gis, fake_date FROM `fh-bigquery.weather_gsod.all_geoclustered` WHERE lat IS NOT NULL AND lon IS NOT NULL AND lat != 0 AND lon != 0 AND lat != 1 AND lon != 1 AND lat != -1 AND lon != -1 AND lat != 2 AND lon != 2 AND lat != -2 AND lon != -2 AND lat != 3 AND lon != 3 AND lat != -3 AND lon != -3 AND lat != 4 AND lon != 4 AND lat != -4 AND lon != -4 AND lat != 5 AND lon != 5 AND lat != -5 AND lon != -5 AND lat != 6 AND lon != 6 AND lat != -6 AND lon != -6 AND lat != 7 AND lon != 7 AND lat != -7 AND lon != -7 AND lat != 8 AND lon != 8 AND lat != -8 AND lon

perl4ever · on Dec 17, 2021

   /*
   What time of day does hail usually happen?
   */

   SELECT
     date,
     wban,
     stn,
     year,
     mo,
     da,
     temp,
     count_temp,
     dewp,
     count_dewp,
     slp,
     count_slp,
     stp,
     count_stp,
     visib,
     count_visib,
     wdsp,
     count_wdsp,
     mxpsd,
     gust,
     max,
     flag_max,
     min,
     flag_min,
     prcp,
     flag_prcp,
     sndp,
     fog,
     rain_drizzle,
     snow_ice_pellets,
     hail,
     thunder,
     tornado_funnel_cloud,
     usaf,
     name,
     country,
     state,
     call,
     lat,
     lon,
     elev,
     begin,
     end,
     point_gis,
     fake_date
   FROM
     `fh-bigquery.weather_gsod.all_geoclustered`
   WHERE
     hail = 1
   ORDER BY
     date DESC
   LIMIT
     10

easylearnai · on Dec 17, 2021

Nice idea, but the default natural language query generated a query that didn't run. Perhaps turn down the temperature or tune other parameters. This could be better if you fine-tune it such that anyone could bring their own dataset, with GPT3 just making it available recently. My favorite recent app for no code sql query is trevor.io that has allowed me to build a dashboard so easily. I only knew basic sql query and didn't have to tamper with the actual SQL code; it was all interactive and the results is always in the correct format. Comparing to the GPT3 approach, I would say I prefer trevor over using natural language for the precision. The appeal of using GPT3 exists only if it gets high enough accuracy, imo.

i_like_apis · on Dec 16, 2021

This is pretty cool.

I thought it was funny in that in the weather dataset that "NULL" comes in first for the win for some questions:

> what is the all time rainiest city? > what are the top 5 most dry states?

The query conversion is impressive!

napoleond · on Dec 16, 2021

The weather dataset has a bit of an unorthodox schema IMO, which gives the model more trouble than usual. That’s kind of the point of this demo, though: to what extent is generated SQL like this useful—despite its flaws—in the context of real-world datasets? Jury is still out :)

mjirv · on Dec 16, 2021

Very nice! I’ve been working on something similar recently: https://acolytehq.com.

Would be happy to chat and compare notes if you'd like!

napoleond · on Dec 16, 2021

Awesome! Always happy to chat; my contact info is in my profile (and also at the bottom of the linked demo).

paulfitz · on Dec 16, 2021

For google-trends it translated "Is there anything cat-related that people are excited about?" to something very reasonable that worked (answer: popcat apparently). Nice!

napoleond · on Dec 16, 2021

Really? Wow, I would not have expected that to work. Might be one advantage of using this general model instead of a more specialized one.

talos2110 · on Dec 16, 2021

Nice!

As an interesting test case, check out the very strange and seemingly recursive query generated for “Get the top 10 authors of caching libraries, ranked by commit volume”

napoleond · on Dec 16, 2021

Thanks for noting this! The functionality is definitely not perfect, and the opaque nature of the underlying model does not give much opportunity for tweaking. I suspect slightly altering your input might help drive better output.

331c8c71 · on Dec 16, 2021

Works unexpectedly well! As many others I am wondering how difficult/costly it'd be to have this assistant set up for another dataset.

napoleond · on Dec 16, 2021

Are you looking to set it up for another public dataset or your own private one? Either way, shoot me an email (on the demo page and also in my HN profile) and I’ll see what I can do.

perl4ever · on Dec 17, 2021

   /*
   What is the range of elevations where tornadoes happen?
   */

   SELECT
     elev,
     COUNT(*) AS count
   FROM
     `fh-bigquery.weather_gsod.all`
   WHERE
     tornado_funnel_cloud = 'T'
   GROUP BY
     elev
   ORDER BY
     count DESC

gertjandewilde · on Dec 26, 2021

Great use case. Is the demo broken?

matthewmorgan · on Dec 17, 2021

Why do programmers seem to love the preposition 'against'? 'On' would have been fine