I just published the first issue of our digital zine, Forest Friends. The first issue is on "LLM System Evals in the Wild".
Lots of AI engineers are doing vibes-based engineering: just eyeballing the LLM output and saying "LGTM!". This is a good place to start, as we all should look at our data more. But it's best to move on from vibes to system evals.
The first issue covers how to design and build system evals: a systematic way to gauge how well your LLM app is doing. That way, no matter if there are new models, new users, or new queries, you can be sure you're continuously improving rather than allowing regressions.
You can buy the first issue here:
https://issue1.forestfriends.tech/
And if you want to keep abreast of the next issue, you can subscribe here:
https://forestfriends.tech