Hacker News | zone411's submissions
1. Public Goods Game Benchmark: Contribute and Punish, a Multi-Agent Benchmark (github.com/lechmazur)
7 points by zone411 46 days ago | past
2. Elimination Game: Multi-Agent LLM Social Reasoning, Strategy, and Deception (github.com/lechmazur)
5 points by zone411 69 days ago | past
3. SWE-Lancer: a benchmark of freelance software engineering tasks from Upwork (arxiv.org)
111 points by zone411 77 days ago | past | 74 comments
4. LLM Hallucination Benchmark: R1, o1, o3-mini, Gemini 2.0 Flash Think Exp 01-21 (github.com/lechmazur)
17 points by zone411 84 days ago | past | 3 comments
5. Multi-Agent Step Race Benchmark: LLM Collaboration and Deception Under Pressure (github.com/lechmazur)
7 points by zone411 3 months ago | past | 1 comment
6. Show HN: LLM Thematic Generalization Benchmark (github.com/lechmazur)
6 points by zone411 3 months ago | past
7. Show HN: LLM Creative Story-Writing Benchmark (github.com/lechmazur)
5 points by zone411 3 months ago | past
8. Show HN: LLM Divergent Thinking Creativity Benchmark (github.com/lechmazur)
8 points by zone411 4 months ago | past
9. Show HN: LLM Deceptiveness and Gullibility Benchmark (github.com/lechmazur)
7 points by zone411 6 months ago | past | 1 comment
10. LLM Confabulation (Hallucination) Leaderboard (github.com/lechmazur)
6 points by zone411 6 months ago | past
11. O1-preview and o1-mini results on NYT Connections (twitter.com/lechmazur)
2 points by zone411 7 months ago | past | 1 comment
12. Grok is an AI modeled after the Hitchhiker’s Guide to the Galaxy (twitter.com/xai)
213 points by zone411 on Nov 5, 2023 | past | 228 comments
13. Can you beat a stochastic parrot? ParrotChess.com (parrotchess.com)
3 points by zone411 on Sept 22, 2023 | past | 4 comments
14. Generative AI while browsing in Chrome (labs.google.com)
3 points by zone411 on Aug 15, 2023 | past
15. Statement on AI Risk (safe.ai)
341 points by zone411 on May 30, 2023 | past | 921 comments
16. Google tells staff it plans to limit publishing AI research (businessinsider.com)
63 points by zone411 on May 5, 2023 | past | 28 comments
17. 4th Gen Intel Xeon Scalable Sapphire Rapids Leaps Forward (servethehome.com)
2 points by zone411 on Jan 10, 2023 | past | 1 comment
18. Fast and Furious Movie Titles by 'Claude' from Anthropic AI (twitter.com/jayelmnop)
2 points by zone411 on Jan 9, 2023 | past
19. SatelliteXplorer (esri.com)
2 points by zone411 on Dec 30, 2022 | past
20. SBF Arrested by Bahamian Authorities (twitter.com/tier10k)
1308 points by zone411 on Dec 12, 2022 | past | 812 comments
21. Large Language Models Can Self-Improve (openreview.net)
3 points by zone411 on Oct 2, 2022 | past | 1 comment
22. America Reached One Million Covid Deaths (nytimes.com)
5 points by zone411 on May 14, 2022 | past
23. Show HN: Catchy melodies made with a diffusion-based neural net assistant (youtube.com)
38 points by zone411 on May 11, 2022 | past | 14 comments
24. Honduras Repeals ZEDEs (laprensa.hn)
47 points by zone411 on April 21, 2022 | past | 10 comments
25. Russian Tech Giant Yandex Says Might Default (barrons.com)
12 points by zone411 on March 4, 2022 | past
26. Maryland hasn't updated Covid case data for 15 days due to a security incident (maryland.gov)
112 points by zone411 on Dec 20, 2021 | past | 56 comments
27. Astronomers want NASA to build a giant space telescope to peer at alien Earths (npr.org)
63 points by zone411 on Nov 4, 2021 | past | 34 comments
28. Have Italian Scholars Figured Out the Identity of Elena Ferrante? (lithub.com)
1 point by zone411 on April 13, 2021 | past
29. Trump tells reporters aboard Air Force One he is banning TikTok (twitter.com/joshnbcnews)
26 points by zone411 on Aug 1, 2020 | past | 17 comments
30. Moravec Transfer (everything2.com)
1 point by zone411 on Jan 18, 2020 | past
