Show HN: Fully client-side GPT2 prediction visualizer (perplexity.vercel.app)
153 points by thesephist on Sept 6, 2023 | 11 comments
Hi HN! I've found this visualization tool immensely helpful over the years for getting an intuition for how an LLM "sees" some piece of text, and with a bit of elbow grease I decided to move all of the compute to the client side so I could make it publicly available.

I've found it particularly useful for

- Understanding exactly how repetition and patterns affect a small LM's ability to predict correctly

- Understanding different tokenization patterns and how they affect model output

- Getting a general sense of how "hard" different prediction tasks are for GPT-style models (see the sketch below for what "hard" boils down to numerically)

Known problems (that I probably won't fix, since this was a kind of one-off project)

- Doesn't work well with Unicode grapheme clusters that span multiple GPT-2 tokens (e.g. emoji, smart quotes)

- No support for other models yet (maybe later?)
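
For the curious, the red/green coloring essentially boils down to per-token surprisal: how many bits of "surprise" the model registers when the token that actually comes next is revealed. Here's a minimal TypeScript sketch of that math (an illustration, not the app's actual source; the function names are made up):

    // Illustrative only, not the app's actual code.
    // Given the model's logits at one position and the id of the token that
    // actually appears next, return that token's probability and its surprisal
    // in bits (-log2 p). Low surprisal = easy to predict, high surprisal = hard.
    function tokenSurprisal(
      logits: Float32Array, // one row of logits, length = vocab size (50257 for GPT-2)
      nextTokenId: number   // id of the ground-truth next token
    ): { prob: number; bits: number } {
      // Numerically stable softmax: subtract the max logit before exponentiating.
      let max = -Infinity;
      for (const x of logits) if (x > max) max = x;
      let sum = 0;
      for (const x of logits) sum += Math.exp(x - max);
      const prob = Math.exp(logits[nextTokenId] - max) / sum;
      return { prob, bits: -Math.log2(prob) };
    }

    // Perplexity over a passage is just the exponentiated mean surprisal.
    function perplexity(bitsPerToken: number[]): number {
      const mean = bitsPerToken.reduce((a, b) => a + b, 0) / bitsPerToken.length;
      return 2 ** mean;
    }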




There's a video of a previous version of this tool here, which I found really helped me understand what it was demonstrating: https://twitter.com/thesephist/status/1617747154231259137

It's really neat to see how this sentence:

> The first time I write this sentence, the model is quite confused about what token is about to come next, especially if I throw in weird words like pumpkin, clown, tweets, alpha, teddy bear.

shows that the words pumpkin, clown, etc. are considered really unlikely, but when the sentence is repeated a moment later, all of them become extremely predictable to the model.

Also worth noting: this demo runs entirely in the browser! It loads a 120MB ONNX version of GPT-2 using Transformers.js.
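
For anyone curious what that plumbing might look like, here's a rough sketch of loading GPT-2 with Transformers.js and pulling out the per-position logits. The 'Xenova/gpt2' model id and the exact output shapes are my guesses from the library's docs, not necessarily what this app actually ships:

    import { AutoTokenizer, AutoModelForCausalLM } from '@xenova/transformers';

    // Sketch only: load the quantized GPT-2 ONNX weights and run one forward
    // pass entirely in the browser.
    const tokenizer = await AutoTokenizer.from_pretrained('Xenova/gpt2');
    const model = await AutoModelForCausalLM.from_pretrained('Xenova/gpt2');

    const text = 'The first time I write this sentence, the model is quite confused';
    const inputs = await tokenizer(text); // { input_ids, attention_mask }
    const output = await model(inputs);

    // output.logits has dims [batch, sequence_length, vocab_size]; the logits
    // at position i are the model's prediction for the token at position i + 1.
    const [, seqLen, vocabSize] = output.logits.dims;
    const lastRow = output.logits.data.slice((seqLen - 1) * vocabSize, seqLen * vocabSize);

Each row of logits can then be softmaxed and scored against the token that actually follows, which is presumably what drives the highlighting.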


Really interesting! I wonder how well this syncs up with human intuition and general “information density”. If it’s a close match, maybe you could use this as a tool to help with skimming documents — the red (“hard to predict”) areas might be a good hint to slow down and read more carefully, while the green (“easy to predict”) areas might mean you could skim without losing too much unpredictable information.


If you're set on reading the document as fast as you can, you will skip the "green" bits after having done it a few times. A likely word such as "not" will not stand out. You'd be better off asking a more comprehensive language model for a summary.


This is definitely an interesting idea I've also pondered before. In my experience (just speaking from intuition), what's "easy" for LMs to predict often doesn't line up with our human expectations for what's "obvious". Often LLMs will learn seemingly "low information content" statistical correlations that just help them lower their training loss.


This is beautiful. Needs to be a standard tool for all models.

Great work!


The red and green highlights look very similar to me because I have deuteranopia, which makes this a complete nightmare. You should probably fix that.


I’m sure it’s neat but it shouldn’t start running on load, because some people are browsing on mobile.
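
One cheap fix would be to defer the download until the user explicitly kicks it off. A sketch of what I mean (the '#run' button id and the model id are placeholders, not taken from the actual app):

    import { AutoModelForCausalLM } from '@xenova/transformers';

    // Sketch: only start fetching the ~120MB model once the user clicks "Run",
    // instead of kicking the download off on page load.
    let modelPromise: ReturnType<typeof AutoModelForCausalLM.from_pretrained> | null = null;

    document.querySelector('#run')?.addEventListener('click', () => {
      modelPromise ??= AutoModelForCausalLM.from_pretrained('Xenova/gpt2');
    });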


Any chance of Llama2 support?


Definitely possible if supported by transformers.js. If I see enough folks wanting it I'll likely add it at some point.


It would be interesting to have attention visualized as well, similar to how it's done in BertViz:

https://github.com/jessevig/bertviz


I really enjoyed playing with BertViz. Another similar visual exploration tool I found recently is on Edwin Chen's blog. I'm pretty sure it's the best explanation of LSTMs. I think more tutorials should use this visual approach.

http://blog.echen.me/2017/05/30/exploring-lstms/

There is also https://playground.tensorflow.org/



