Show HN: Poozle – open-source Plaid for LLMs (github.com/poozlehq)
132 points by harshithmul on Aug 18, 2023 | 39 comments
Hi HN, we’re Harshith, Manoj, and Manik.

Poozle (https://github.com/poozlehq/poozle) provides a single API that helps businesses get accurate LLM responses by supplying real-time customer data from different SaaS tools (e.g. Notion, Salesforce, Jira, Shopify, Google Ads, etc.).

Why we built Poozle: As we talked to more AI companies that need to integrate with their customers’ data, we realised that managing data from all these SaaS tools and keeping it up to date takes a huge amount of infrastructure (ETL, auth management, webhooks and more) before you can take it to production. It struck us: why not streamline this process and let companies prioritise their core product?

How it works: Poozle makes user authorization seamless with our drop-in component (Poozle Link), which handles both API keys and the OAuth dance. After authentication, developers can use our Unified model to fetch data for their LLMs (no need to sync data separately and then normalise it on your end). Poozle keeps data updated in real time while letting you choose sync intervals. Even if the source doesn’t support webhooks, we’ve got you covered.

Currently, our Unified API supports 3 categories: Ticketing, Documentation and Email. You can watch a demo of Poozle (https://www.loom.com/share/30650e4d1fac41e3a7debc212b1c7c2d)...

We just got started a month ago and we’re eager to get feedback and keep building. Let us know what you think in the comments : )




Definitely a difficult problem you're taking on here, but I don't see anything specific to LLMs here? How or why are you marketing towards LLMs?

How do you compare to the larger players already here, Nango[0] and Merge[1]?

I'm curious how you're thinking about data access / staleness? It's great that you're handling the oauth dance, but does that mean every end user of the product has to auth every product they interface with or are you handling this all at the super admin / enterprise level?

Right now I think there's too much emphasis on the "data loading" aspect of LLMs. I expect to see a swing back to using 3rd-party APIs' SDKs. Interested to hear your thoughts on the Google API; it's absolutely massive, and trying to shoehorn that into a unified API scares me.

The only real player that I could see to launch something like this and be successful is Okta.

[0] - https://github.com/NangoHQ/nango
[1] - https://merge.dev/


Hey, I'm one of the co-founders of Poozle. Thanks for the great questions. Let me take them one by one.

<Why LLMs> Our goal is to provide context for LLMs. Our first step is to normalize data and offload syncing, similar to other Unified API providers like Merge. In the future, we also plan to assist with vector embeddings or with storing data directly in a vector DB for a search context API. We are exploring the best solution and believe building in the community will be a big help.

<Competition with large Players> Nango doesn't offer a pre-built Unified API. Merge focuses on B2B SAAS companies looking to build customer-facing integrations. Our goal is to develop tools and infrastructure to support LLMs. This is similar to how Plaid bet on the Fintech industry and built infrastructure and tools around it, starting with a Unified API for banking data.


I don't think the comparison to Plaid is helping as much as you think. You and Plaid are in completely different verticals and as a result have completely different goals and users.

You created a single API for several services.

That's where the comparison with Plaid ends.


Just as Plaid covered the needs of finance products by providing a unified API, we are bringing that approach to multiple verticals.

And with LLM companies as our primary ICP, the solution will evolve further toward their needs.


<Data Access/Staleness>

Currently, every user of the product has to do the auth. In the future, however, we plan to support SSO and SAML for our enterprise customers.

<Google API> You're absolutely right, the array of Google APIs is vast. However, if we approach it from a category perspective, there are typically a couple of key APIs that we need to manage. For instance, in the Documentation category we take the Google Docs API, and for Email we take the Gmail API.


It’s odd to me that people are downvoting you for clearly answering the question.


Would love to understand if there is something missing.


This is a very difficult problem. How do you deal with losing data fidelity? If one service's entity has X field, but all the others don't, do you include it or not?

How do you deal with subtle differences in usage of terminology? If one service uses the word "ticket" slightly differently than others, how do you deal with that?


> If one service's entity has X field, but all the others don't, do you include it or not?

We have a raw field from which you can pick anything integration specific.


Yeah, I think we have 2 things here. Take a "Ticket" model as an example.

1. Dealing with custom fields. We provide a way to map company-specific custom fields into the common model.

2. Dealing with naming differences. GitHub has Issues and Asana has Tasks. We map both into a common Ticket model.

This way, once you integrate with our Ticketing model, you are integrated with all the ticketing platforms we support.
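
To make the mapping concrete, here is a minimal Python sketch. The Ticket class and both mapper functions are hypothetical and are not Poozle's actual schema or code; only the source field names (GitHub's number/title/state, Asana's gid/name/completed) come from those services' public APIs. The raw field keeps the untouched source payload for anything integration-specific.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Ticket:
    """Hypothetical common model; field names are illustrative only."""
    id: str
    title: str
    status: str  # normalised to "open" or "closed"
    source: str
    raw: dict[str, Any] = field(default_factory=dict)  # untouched source payload


def from_github_issue(issue: dict[str, Any]) -> Ticket:
    # GitHub issues expose "number", "title" and "state" ("open"/"closed").
    return Ticket(
        id=str(issue["number"]),
        title=issue["title"],
        status=issue["state"],
        source="github",
        raw=issue,
    )


def from_asana_task(task: dict[str, Any]) -> Ticket:
    # Asana tasks expose "gid", "name" and a boolean "completed".
    return Ticket(
        id=task["gid"],
        title=task["name"],
        status="closed" if task["completed"] else "open",
        source="asana",
        raw=task,
    )
```

Code written against Ticket then works for both sources, while a field that exists in only one service (say, GitHub labels) stays reachable through ticket.raw.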


Maybe ticket isn't a great example. Another general problem I've had with mappings is say, one service uses a one-to-one relationship, and another puts both entities into a single entity. E.g., maybe there's a way to share tickets externally. One allows you to create a shareable link as a separate entity, and another embeds the shareable link as a property of the ticket.

I think, looking at this, I'd still probably prefer to create something custom. What would really help is simplifying having to set up OAuth and api calls, and just being able to do the mapping (or transform, of extract-transform-load) myself, in case there are edge-cases.


Gotcha.

We do the ETL in the background and ensure that the model is complete regardless of whether the source uses a one-to-one relationship or a single combined entity. You would be doing the same if you managed the ETL yourself. You can check our ticketing integrations (https://github.com/poozlehq/poozle/tree/main/integrations/ti...); we would love some feedback there.

Also, if you need more customization, you can build on top of the integrations we already have (https://github.com/poozlehq/poozle/tree/main/integrations/ti...) and plug them into the platform.


If I'm understanding Poozle correctly, you could pull "tickets" from multiple services, join them, then send back a new "ticket" that might be in a different collection?


Yeah, true. We pull all similar tickets from multiple SaaS tools and send them back in a common ticketing model.


This looks neat.

https://github.com/ShishirPatil/gorilla feels like a project with certain similarities here. How would you compare yourselves to them?


We are focused on a unified API with the tooling LLMs need to talk to these SaaS tools, whereas Gorilla is a model that has knowledge about APIs.

Rather than seeing it as similar, we thought we could use Gorilla's API knowledge to arrive at a more generic Unified API across different SaaS tools.

We built an in-house tool (https://www.loom.com/share/ff88f482765d43e49aebcefd3f00df27) trained on REST APIs. We will use it both for API discoverability and to arrive at a Unified API tailored to each company.


Do you have any plans to opensource this or share more technical implementation details? I've been ideating on this recently but haven't started any implementations yet, would be interesting to see how you approached this!


Yeah, we will be open-sourcing this in the coming week. Happy to answer questions. Feel free to join our Slack so we can discuss it there, or shoot me an email at harshith[@]poozle.dev


This sounds very useful. I wonder if you have given thought to users providing their own schema to query rather than using yours (I would assume an LLM or the like should be able to translate back and forth between schemas).

Another question: how does the schema update if, say, a new feature gets added to "tickets"?


We currently expose the common model APIs, and you can apply your own transformations before sending the data into the vector DB. This way you only have to write one transformation for all the SaaS tools in a specific category.

We version both the models and the integrations, which gives us the flexibility to manage schema updates.
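
The "one transformation per category" idea can be sketched roughly as below. This is an illustration under assumptions, not Poozle's API: ticket_to_document is a hypothetical helper, and the common-model field names (id, title, description, source) are made up for the example.

```python
from typing import Any


def ticket_to_document(ticket: dict[str, Any]) -> dict[str, Any]:
    """Turn one common-model ticket into a text chunk plus metadata.

    The field names used here are assumed, not taken from Poozle's schema.
    """
    # A missing or null description collapses to an empty string.
    text = f"{ticket['title']}\n\n{ticket.get('description') or ''}".strip()
    return {
        "text": text,  # what would be embedded
        "metadata": {"source": ticket["source"], "ticket_id": ticket["id"]},
    }


# The same function handles tickets that originated in different tools.
tickets = [
    {"id": "7", "title": "Crash on start", "description": "Stack trace attached", "source": "github"},
    {"id": "123", "title": "Write docs", "description": None, "source": "asana"},
]
documents = [ticket_to_document(t) for t in tickets]
```

Because the input is already normalised, the embedding step never needs to know which tool a ticket came from.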


This looks excellent and very timely. We just started hand rolling our own ETL process with Python and Github Actions to ingest from various SaaS sources to a central DB for context to LLM(s) and traditional BI tools.

Speaking of Plaid, would love to see an integration there for personal finance


We haven't gotten into finance integrations yet, but we would love to understand the use case better and see what we can do. Happy to discuss more in our community Slack.

Also, what kinds of SaaS sources are you currently ingesting?


Hey, very cool work! How does your solution compare to directly building the context in a framework like LangChain?

I’m an AI researcher, so am a little further from this area, but am very curious.


There are 3 major challenges:

1. In most cases the context you want is spread across multiple APIs, and there are also multiple APIs you might want to search, e.g. Users, Tickets, Comments. That is a lot to build from scratch.

2. Addressing this for multiple SaaS tools means learning multiple APIs and writing different code for every tool.

3. Last, and hardest, is keeping all of this close to real-time, which means building the whole webhook infrastructure around it. Also, some SaaS tools like Notion don't have webhooks.
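
One common way to stay reasonably fresh without webhooks is to poll on a sync interval and keep a cursor of the newest edit seen. The sketch below illustrates that general technique only; it is not confirmed to be how Poozle does it, and poll_changes, fetch_all and the last_edited_time field name are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Any, Callable, Iterable

Record = dict[str, Any]


def poll_changes(
    fetch_all: Callable[[], Iterable[Record]],
    last_sync: datetime,
) -> tuple[list[Record], datetime]:
    """Return the records edited after last_sync, plus the new cursor."""
    changed: list[Record] = []
    newest = last_sync
    for record in fetch_all():
        # Timestamps are assumed to be ISO 8601 strings with a UTC offset.
        edited = datetime.fromisoformat(record["last_edited_time"])
        if edited > last_sync:
            changed.append(record)
            newest = max(newest, edited)
    return changed, newest


# One polling cycle against an in-memory stand-in for the source API.
pages = [
    {"id": "a", "last_edited_time": "2023-08-18T10:00:00+00:00"},
    {"id": "b", "last_edited_time": "2023-08-18T12:00:00+00:00"},
]
cursor = datetime(2023, 8, 18, 11, 0, tzinfo=timezone.utc)
changed, cursor = poll_changes(lambda: pages, cursor)
```

Running this on a timer with the returned cursor picks up only new edits each cycle; the trade-off versus webhooks is that freshness is bounded by the sync interval.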


Poozle just sounds like a violation of encapsulation/departmental perimeters at best, and a security nightmare at worst.


Would love to understand why you think so?


That's interesting


Doesn’t Poozle mean something, erm… quite specific?


I guess it's a good idea to check your product name in https://www.urbandictionary.com early


It's most commonly used in NZ. It means to scavenge for collectable objects. That's where we started, but later we realised it also means something else.


Slangonym for vagina. According to The Big Book of Filth (1999), coinage and usage date back to the late 19th century.


Yeah :( looks like we should change the name


well, that escalated quickly '-), sorry if I opened a can of worms by pointing this out, I guess better to sort these things early on than worry about them later in some ways though.


Fwiw, while I haven't heard "poozle" very much as an American, I've heard it enough that my mind instantly went to the unfortunate definition when I saw it here. Meanwhile, I've never heard the NZ definition at all before.


Got it. We might have to think about changing the name.


Funny you mention NZ, kiwi here - back in high school (late 90s, early 00s) it was definitely a known word and had nothing to do with scavenging (or at least I'd hope not!).


We took the first and third meanings (https://www.wordhippo.com/what-is/another-word-for/poozle.ht...) but missed the one in between :(


> It's most commonly used in NZ. It means to scavenge for collectable objects.

Not at all commonly used in NZ. I've lived in NZ my whole life and never heard the word.


Gotcha. We will surely rethink the name.



