
I have a page with some advice on writing searches (https://search.feep.dev/about/query), but I don’t think you did anything wrong here: sometimes my search results are just inexplicably terrible. This definitely falls into that category and is going on my list of test cases that need improvement. There’s a reason I link to Google at the bottom of the results page!

I’m currently using Elasticsearch for ranking, and made a brief effort at tuning it. The problem is that it’s very big and complicated, which makes it hard for me to understand what’s going on under the hood. If I were doing this professionally I’d dive into ES internals and figure it out, but when I can only squeeze in a few hours a week it’s hard to really sink my teeth in. I’d like to switch to something simpler that I can wrap my head around (possibly Lucene, or Bleve); once I’ve done that I should be able to get a better handle on how the ranking works and how to make it more reliable.




I might be wrong, but the page they provided as an example of a correct result doesn't even seem to be in your index. Is that correct, and if so, why? If it is in your index, what is a query that would return it as a top ten result?


I can see it in Kibana when I request it by ID, but I can’t seem to get it via text search no matter what keywords I use, which is bizarre. (“NSMutableURLRequest image” should be pulling it up, but isn’t.) I have no idea what’s going on here, but thanks for bringing my attention to it!
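For anyone chasing the same symptom (a document that is retrievable by ID but never shows up in text search), Elasticsearch's Explain API reports whether one specific document matches a given query and why. Below is a minimal sketch, not the setup actually used here: the node address, the index name `pages`, the field name `body`, and the document ID are all hypothetical, and the URL shape assumes a typeless (7.x or later) Elasticsearch.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a single local node on the default port.
ES_URL = "http://localhost:9200"


def build_explain_request(index, doc_id, field, text):
    """Build the URL and JSON body for an Explain API call.

    The body is an ordinary `match` query; the API evaluates it
    against the one document identified by `doc_id`.
    """
    quoted_id = urllib.parse.quote(doc_id, safe="")
    url = f"{ES_URL}/{index}/_explain/{quoted_id}"
    body = {"query": {"match": {field: text}}}
    return url, body


def explain(index, doc_id, field, text):
    """Send the request and return the parsed JSON response.

    The response carries a boolean `matched` field and, when the
    document matches, an `explanation` tree with the score breakdown.
    """
    url, body = build_explain_request(index, doc_id, field, text)
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Calling something like `explain("pages", "some-doc-id", "body", "NSMutableURLRequest image")` and checking `matched` would at least separate "the document does not match the query at all" (an analysis or mapping problem) from "it matches but ranks too low to appear".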

This sort of thing is part of the reason I want to move off of ES: it’s a big black box and when something goes wrong I have no idea how to diagnose it. (I’m currently researching “unassigned shards” in case that’s the problem, but for all I know that could be a red herring.) Something a lot simpler would be easier for me to hold in my head and easier to figure out when it goes wrong.
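On the "unassigned shards" question, one way to see what is going on is the `_cat/shards` endpoint, which can return one JSON row per shard including its state. The sketch below is an illustration under assumptions (a local node at the default address, and made-up sample rows in the usage note), not a diagnosis of this particular cluster.

```python
import json
import urllib.request

# Assumption: a single local node on the default port.
ES_URL = "http://localhost:9200"
# Ask for JSON output and only the columns relevant to this check.
SHARDS_PATH = (
    "/_cat/shards?format=json"
    "&h=index,shard,prirep,state,unassigned.reason"
)


def fetch_shards():
    """Return one dict per shard, as reported by _cat/shards."""
    with urllib.request.urlopen(ES_URL + SHARDS_PATH) as resp:
        return json.load(resp)


def unassigned(rows):
    """Keep only the rows whose state is UNASSIGNED."""
    return [row for row in rows if row.get("state") == "UNASSIGNED"]


def unassigned_primaries(rows):
    """Keep only unassigned primary shards (prirep == "p")."""
    return [row for row in unassigned(rows) if row.get("prirep") == "p"]
```

On a single node with the default of one replica per shard, the replica copies (`prirep` of `r`) cannot be allocated anywhere and will show as unassigned; that alone should not hide documents from search. An unassigned primary (`prirep` of `p`) would be the more worrying case, since the data in that shard is then not searchable.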


Elasticsearch is distributed Lucene, no?


Yes (well, plus a lot of other features); and it’s the “distributed” part that gives me headaches. I don’t need any of that stuff, since I’m running on a single node, and it means there’s a bunch of abstractions between me and Lucene (which Elastic mostly tries to hide away as an implementation detail).


I don't have much experience with ES, but I remember trying Solr a few years ago and it was relatively simple to get running on a single VM. It is also using Lucene at its core, so it might be worth a try.


My experience with Solr is that it is much more schema-centric than ES. That is both good and bad: ES being all "don't worry about it" is fine until you do have to worry about it, and then it's some holy hell trying to square up your version of the world with what ES thinks of the world.

The Solr search API is also worse, IMHO, although it can likely be fine if you just stick to the simple query string syntax (that goes for both ES and Solr). That said, my experience with ES is just like OP's: keeping the piece of junk alive and healthy is a time-and-a-half job. Combined with their recent license tomfoolery, I hope to never touch it again.

I haven't used any of the new search upstarts in anger enough to know whether they're prime-time or not.



