You act like there is no benefit to entering a relationship with a fellow human’s website. You act like it is somehow difficult or unpleasant to do that assessment. You act like LLMs are stable and reliable.
I don’t understand how there is any comparison. The only way I can assess the reliability of an LLM is by doing hundreds of trials, analyzing each one for correctness: a gargantuan task. And it’s not like OpenAI has ever done such testing! And even if I did that work (which I HAVE done in other contexts while researching LLM reliability) I cannot assume that it will remain reliable when the model is updated.
Meanwhile, there are well-known social forces that encourage humans not to put poison into their recipes. No such force acts on ChatGPT until a kid gets poisoned or some outrageous recipe goes viral.
Sometimes I wonder if certain other people are using some different Internet than I use.