> Longer context window (1M+)

What's your use case for this? Uploading multiple documents/books?




Uploading large codebases is particularly useful.


Is it?

I've found that I get better results if I cherry pick code to feed to Claude 3.5, instead of pasting whole files.

I'm kind of isolated, though, so maybe I just don't know the trick.


I've been using Cody from Sourcegraph, and it'll write some really great code: business logic, not just tests/simple UI. It does a great job using patterns/models from elsewhere in your codebase.

Part of how it does that is by ingesting your codebase into its context window, so I imagine that a bigger/better context window will only improve it. That's a bit of an assumption, though.


Books, especially textbooks, would be amazing. These things can get pretty huge (1000+ pages) and, in my experience, usually do not fit into GPT-4o or Claude 3.5 Sonnet. I envision the models being able to help a user (a student, say) create study guides and quizzes based on ingesting the entire book. Given the ability to ingest an entire book, I imagine a model could plan how and when to introduce each concept better than a model that has seen only part of the textbook.


Long agent trajectories, especially with command outputs.


Correct


That would make each API call cost at least $3 ($3 being the price per million input tokens). And if you have a 10-message interaction, you are looking at $30+, since the full context is resent with every message. Is that what you would expect?
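
For concreteness, a back-of-envelope version of that arithmetic (a sketch using the numbers above; it ignores output tokens and any prompt-caching discounts a provider might offer):

    # Cost of a chat that resends a ~1M-token context on every turn,
    # at the quoted $3 per million input tokens (output tokens excluded).
    PRICE_PER_M_INPUT = 3.00     # USD per million input tokens
    CONTEXT_TOKENS = 1_000_000   # the book/codebase stuffed into the window
    TURNS = 10

    total = TURNS * CONTEXT_TOKENS / 1_000_000 * PRICE_PER_M_INPUT
    print(f"input cost over {TURNS} turns: ~${total:.2f}")  # ~$30.00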


Gemini 1.5 Pro charges $0.35/million tokens for prompts up to one million tokens, or $0.70/million tokens for prompts longer than that, and it supports a multi-million-token context window.

Substantially cheaper than $3/million, but I guess Anthropic’s prices are higher.


You're looking at the pricing for Gemini 1.5 Flash. Pro is $3.50/million for prompts under 128k tokens, else $7/million.
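
To make the tiering concrete, a small sketch (rates as quoted above; it assumes, as the comment implies, that the whole prompt is billed at one rate depending on its length):

    # Tiered input pricing: $3.50/M for prompts under 128k tokens, $7/M otherwise.
    def pro_input_cost(prompt_tokens: int) -> float:
        rate = 3.50 if prompt_tokens < 128_000 else 7.00
        return prompt_tokens / 1_000_000 * rate

    print(pro_input_cost(100_000))    # $0.35
    print(pro_input_cost(1_000_000))  # $7.00 -- a whole-book prompt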


Ah... oops. For some reason, that page isn't rendering properly on my browser: https://imgur.com/a/XLFBPMI

When I glanced at the pricing earlier, I didn't notice there was a dropdown at all.


It is also much worse.


Is it, though? In my limited tests, Gemini 1.5 Pro (through the API) is very good at tasks involving long context comprehension.

Google's user-facing implementations of Gemini are pretty consistently bad when I try them out, so I understand why people might have a bad impression of the underlying Gemini models.


Maybe they're summarizing/processing the documents in a specific format instead of chatting? If they needed chat, it might be easier to build with RAG?
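
If chat were the goal, a minimal retrieval sketch might look like the following (sentence-transformers, the model name, and the naive fixed-size chunking are all assumptions for illustration, not anything from the thread):

    # Chunk the book, embed the chunks once, retrieve the top matches per
    # question, and prompt a chat model with only those chunks.
    from sentence_transformers import SentenceTransformer, util

    embedder = SentenceTransformer("all-MiniLM-L6-v2")

    book_text = open("textbook.txt").read()  # placeholder source document
    chunks = [book_text[i:i + 2000] for i in range(0, len(book_text), 2000)]
    chunk_vecs = embedder.encode(chunks, convert_to_tensor=True)  # one-time cost

    def retrieve(question: str, k: int = 5) -> list[str]:
        q_vec = embedder.encode(question, convert_to_tensor=True)
        hits = util.semantic_search(q_vec, chunk_vecs, top_k=k)[0]
        return [chunks[hit["corpus_id"]] for hit in hits]

    context = "\n\n".join(retrieve("What does the book say about entropy?"))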


So do it locally after predigesting the book, so that you have the entire KV-cache for it.

Then load that KV-cache and add your prompt.
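
Something like this rough sketch with Hugging Face transformers (the model name is illustrative, and this assumes the book fits in local memory; recent transformers versions mutate the cache in place during generation, so copy or re-crop it before asking a second question):

    # Predigest the book into a KV cache once, then reuse it per prompt.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "meta-llama/Llama-3.1-8B-Instruct"  # any long-context local model
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    ).eval()

    # 1. One expensive pass over the whole book builds the cache.
    book_ids = tok(open("textbook.txt").read(), return_tensors="pt").input_ids.to(model.device)
    with torch.no_grad():
        book_cache = model(book_ids, use_cache=True).past_key_values

    # 2. Per question: only the new prompt tokens need a forward pass.
    prompt_ids = tok("\n\nQ: What does chapter 3 cover?\nA:", return_tensors="pt").input_ids.to(model.device)
    full_ids = torch.cat([book_ids, prompt_ids], dim=-1)
    with torch.no_grad():
        out = model.generate(full_ids, past_key_values=book_cache, max_new_tokens=200)
    print(tok.decode(out[0, full_ids.shape[-1]:], skip_special_tokens=True))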


This might be a case where it's better not to use the API and just pay for the flat-rate subscription.



