We think we stand out from our competitors in the space because we built first for the enterprise case, with consideration for things like data governance, acceptable use, data privacy, and information security, in a product that can be deployed and managed easily and reliably in customer-managed environments.
A lot of the products today have similar evaluations and metrics, but they either offer a SaaS solution or require some onerous integration into your application stack.
Because we started with the enterprise first, our goal was to get to value as quickly and as easily as possible (to avoid shoulder-surfing over Zoom calls because we don't have access to the service), and we think this plays out well with our product.
We based our hallucination detection on "groundedness", evaluated on a claim-by-claim basis: whether each claim in the LLM response can be cited in the provided context (e.g., message history, tool calls, retrieved context from a vector DB, etc.).
We split the response into multiple claims, determine whether each claim needs to be evaluated (i.e., it isn't just boilerplate), and then check whether the claim is referenced in the context.
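To make the three steps concrete, here is a minimal sketch in Python. It is not our actual implementation: the function names, the boilerplate list, and the 0.7 threshold are all made up for illustration, and plain word overlap stands in for whatever model-based check a real system would use to decide whether a claim is supported.

```python
import re

# Hypothetical boilerplate openers; a real system would likely use a classifier.
BOILERPLATE_PREFIXES = ("i hope this helps", "let me know", "here is")


def split_claims(response: str) -> list[str]:
    """Naively treat each sentence of the response as one claim."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def needs_evaluation(claim: str) -> bool:
    """Skip claims that are just conversational boilerplate."""
    return not claim.lower().startswith(BOILERPLATE_PREFIXES)


def tokens(text: str) -> set[str]:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def is_grounded(claim: str, context: list[str], threshold: float = 0.7) -> bool:
    """Stand-in check: share of claim words found in the best-matching context item."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return True
    best = max(
        (len(claim_tokens & tokens(item)) / len(claim_tokens) for item in context),
        default=0.0,
    )
    return best >= threshold


def evaluate(response: str, context: list[str]) -> list[tuple[str, bool]]:
    """Return (claim, grounded?) for every claim that needed evaluation."""
    return [
        (claim, is_grounded(claim, context))
        for claim in split_claims(response)
        if needs_evaluation(claim)
    ]


context = ["The Eiffel Tower is in Paris and is 330 metres tall."]
response = (
    "The Eiffel Tower is in Paris. It was painted blue in 1999. "
    "Let me know if you need more!"
)
for claim, grounded in evaluate(response, context):
    print(grounded, claim)
```

The point of the shape is that each claim gets its own verdict, so an ungrounded sentence can be flagged without throwing out the whole response.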
You're not wrong, but suffering isn't comparative. Just because it's easier for someone to bounce back or have support in the transition doesn't mean it doesn't still suck.
This would be a shocking opinion, if we weren't in unprecedented times.
I wish we had more empathy and were kinder to people going through rough times, regardless of their wealth or position, or even the duopoly they work for, but it's also hard to completely ignore when the effect and impact is so huge.
Now, if it makes you physically ill, I also hope you either find help or can get out of the situation you're in. Sincerely.
Anthropic does a good job of breaking down some common architecture around using these components [1] (good outline of this if you prefer video [2]).
"Agent" is definitely an overloaded term - the best framing of this I've seen aligns closely with the Anthropic definition. Specifically, an "agent" is a GenAI system that dynamically identifies the tasks ("steps" from the parent comment) without having to be instructed that those are the steps. There are obvious parallels to the reasoning capabilities that we've seen released in the latest cut of the foundation models.
So for example, the "Agent" would first build a plan for how to address the query, dynamically farm out the steps in that plan to other LLM calls, and then evaluate execution for correctness/success.
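A rough sketch of that plan / farm out / evaluate loop, in Python. Everything here is hypothetical: `run_agent`, the prompt prefixes, and `fake_llm` (a canned stand-in so the example runs without any model) are made up for illustration and are not any vendor's API.

```python
from typing import Callable

# Hypothetical interface: a function that takes a prompt and returns text.
LLM = Callable[[str], str]


def run_agent(query: str, llm: LLM, max_steps: int = 5) -> dict:
    """Plan, execute each step with a separate LLM call, then evaluate."""
    # 1. Ask the model for a plan; the steps are not hard-coded by the developer.
    plan_text = llm(f"PLAN: {query}")
    steps = [s.strip() for s in plan_text.splitlines() if s.strip()][:max_steps]

    # 2. Farm each step out to its own LLM call.
    results = [llm(f"EXECUTE: {step}") for step in steps]

    # 3. Ask the model to judge whether the results address the query.
    verdict = llm("EVALUATE: " + query + "\n" + "\n".join(results))
    success = verdict.strip().upper().startswith("PASS")
    return {"steps": steps, "results": results, "success": success}


def fake_llm(prompt: str) -> str:
    """Canned responses so the sketch runs offline."""
    if prompt.startswith("PLAN:"):
        return "look up order\ncheck refund policy"
    if prompt.startswith("EXECUTE:"):
        return "done: " + prompt[len("EXECUTE: "):]
    return "PASS"


outcome = run_agent("Can I get a refund?", fake_llm)
print(outcome["steps"], outcome["success"])
```

The distinguishing bit is step 1: the list of steps comes back from the model at run time instead of being written into the code.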
This sums up as a range, from making multiple LLM calls to build smart features to letting the LLM decide what to do next. I think you can go very far with the former, but the latter is more autonomous in unconstrained environments (like chatting with a human etc.)
Neat article - I know the author mentioned this in the post, but I only see this working as long as a few assumptions hold:
* avg tenure / skill level of team is relatively uniform
* team is small with high-touch comms (eg: same/near timezone)
* most importantly - everyone feels accountable and has agency for work others do (eg: codebase is small, relatively simple, etc)
Where I would expect to see this fall apart is when these assumptions drift and holding accountability becomes harder. When folks start to specialize, something becomes complex, or work quality is sacrificed for short-term deliverables, the folks that feel the pain are the defense folks, and they don't have agency to drive the improvements.
The incentives for folks on defense are completely different than folks on offense, which can make conversations about what to prioritize difficult in the long term.
These assumptions are most likely important and true in our case: we work out of the same room (in fact, we also all live together) and 3/4 of us are equally skilled (I am not as technical).
FWIW, I find the classical chess tournaments with the super GMs to be fairly interesting, if only because the focus of the games is more about the metagame than about the game itself.
The article linked at the bottom of the source is a WSJ piece about how Magnus beats the best players because of the "human element".
A lot of the game today is about opening preparation, where the goal is to out-prepare and surprise your opponent by studying opening lines and esoteric responses (an area where computer play has drastically opened up new fields). Similarly, during the middle/end-games, the best players will try to force uncomfortable decisions on their opponents, knowing which positions their opponents tend to dislike. For example, in round 1 of the Candidates, Fabiano took Hikaru into a position that had very little in the way of aggressive counter-play, effectively taking away a big advantage that Hikaru would otherwise have had.
Watching these games feels somewhat akin to watching generals develop strategies to outmaneuver their counterparts on the other side, taking into consideration their strengths and weaknesses as much as the tactics/deployment of troops/etc.
> When I asked Jody how much of his family’s meat is wild game, he initially said “about half.” Upon reflection, he bumped the number to 70 percent.
Doesn't sound like this is a justification for "culture" or "tradition". Certainly seems a lot more responsible than the average "tradition" of "I got it at the grocery store".
When you hunt for your own food, you are forced to consider the sacrifice of the animal and have to put in the work of preparing for the hunt and cleaning the animal. Things that anyone who's not done this takes for granted when they eat meat.
You might not be the right market (or at least, the marketplace might be different for your demographic).
I'm a parent, and for me, and all my parent friends, Disney+ is the streaming service that generates the most value in our households. Along with all the old/nostalgic Disney animated films, they generate and acquire a lot of the "in" content for kids (Bluey, Mickey Mouse Kids House, etc.)
Before my kids, Disney+ would have been the first streaming service to get cut. But now, it'll be the last.
> Along with all the old/nostalgic Disney animated films, they generate and acquire a lot of the "in" content for kids (Bluey, Mickey Mouse Kids House, etc.)
True for my family too! It's just that my kids already cycle through most, if not all, of the content they want to watch.
Haven't seen anyone mention a non-DOE lab, so figured I'd weigh in.
I interned twice with MIT Lincoln Labs, which, among other things, helped build and deploy radar for WWII, which turned into building/managing the technology for air traffic control, and then turned towards space.
They are primarily a DOD-associated research lab (even located on a US Air Force base), and so most of the projects have some military-oriented mission. Their mission is entrepreneurial-minded (which I found cool), in that they do the "basic research" and prototyping to prove viability, and then the DOD turns the project over to a contractor to make it feasible.
While I was there I worked in their GeoIntelligence and Natural Language groups, doing research which I'd ultimately come to understand as being relevant for Project Maven (year 1) and PRISM (year 2). While I'm sure as an intern my contributions weren't directly related to or otherwise leveraged for these programs, in hindsight it was clear that this was the bigger picture that the work was contributing to. Take from this what you will.
Most of the anecdotes that I've read through in the comments mirror my experience. However, one thing I see missing is how opportunity was "metered" out. Each group I was in was organized like a research lab, and the level of your academic progression limited (or opened) your ability to get access to specific projects/work. Their pay scale was dictated by this as well. So if you have a BS, your ability to "move up" doesn't exist, but it does if you have a PhD.
Ultimately, I was given an offer to work there, but ended up taking a SWE position in the Bay Area because I wasn't interested in continuing my education and felt like my ability to have a career progression at MITLL would have necessitated that.
One highlight of working at LL as a radar software contractor was getting to go to the regular lectures they had on various topics related to the lab's activities. But of course the big downside is not being able to talk about what all the lectures were on and having to avoid some of the topics they were on now.
Yeah - and it wasn't even necessarily on the tech stuff either - the talks are definitely the things that have stuck with me the longest after working there (and it's been 10+ years!)
I have no idea how this plays out in the US national lab system being discussed .. but a PhD is literally "my first independent solo original research project obsessed over and submitted to critical peer review".
It's the apprenticeship for doing independent solo research and, in general, it makes perfect sense that someone have demonstrated a capability for this work prior to being given the reins to resources on the scale of millions, tens of millions, etc.
It's hardly a "glass ceiling" (ie. we profess equality but don't promote women or people of colour or not our religion but never say why) when it's a stated requirement to, say, first pass an apprenticeship prior to becoming a certified electrician and hired to wire a nuclear weapon.
It's an actual formal staging in a meritocracy, the only issue would be if those that might gain a PhD are denied the opportunity to do so ... (a somewhat tangential issue that might have more play in various places).
A defining characteristic of a PhD is that the definitive review is from an external examiner who is paid by a different organization than the candidate, and the review is the personal, professional opinion of that researcher, and not that of their organization.
This is definitely not true in the US, at least in physical sciences. Typically your PhD advisor (who is obviously in the same organization and gets paid with the same funding source as the student) has almost all of the say in a PhD defense. There is another faculty member or two from the same department on the committee (who work on different stuff, possibly funded from somewhere else). And, IME as a mere formality, there is often another faculty member from a separate academic unit (department) who's just along for the ride to give the appearance of oversight but doesn't really know what's going on. They'll ask a softball question to remind everyone they are there.
Sure, anyone in the committee can grill you as a sort of hazing ritual, but the reality is that your PhD advisor won't let you stand for defense unless you are almost 100% sure to pass it.
Source: have attended probably a dozen thesis defenses (including my own).
In the UK you can normally submit without your supervisor's approval if you insist (and have survived the programme long enough to actually have a thesis written). The actual viva will most likely be one internal academic (not your supervisor) and one external and your supervisor won't be present.
That said, it's a pretty crappy idea under nearly all circumstances -- if your supervisor doesn't think the thesis is passable and discourages you from submitting it then it's quite likely the examiners will agree with them. And getting a terminal MPhil isn't exactly a badge of honour...
So there’s no external examiner? That’s very surprising to me, but it certainly refutes my claim.
I agree the supervisor should almost never allow a defense to take place that you won’t pass. But I can’t imagine a school passing a candidate if the external gives an unfavorable report.
I’ve also been involved in many PhD defenses, every one of which had an independent external examiner. Computer Science or closely related, not physical sciences.
Counter-counter-point that poorly paid five year review doesn't have to be poorly paid or take five years - that's a function of countries and their approach to education etc.
Notwithstanding both degree inflation and the notion that "the exception proves the rule" -- a system that would rule out someone like Freeman Dyson would seem flawed.
In that NYPost article Dr Natalie Gosnell isn't discussing the PhD requirement for advancement within the US National Labs .. so you may have to expand your point if you want to make one.
Edit: you've unfortunately been breaking the site guidelines in other places recently too, e.g. with personal attacks and using HN for ideological battle. We end up having to ban accounts that do that. I don't want to ban you, so if you'd please review the rules and use HN as intended, we'd appreciate it.
I was specifically talking about a PhD requirement as a precursor step in a research career.
> Academia is a shocking hive of nepotism and corruption wherever you are.
Also isn't something discussed by Dr Natalie Gosnell who has her Dr. and a position. Her complaint (in the article you linked) is that " astrophysics .. is paralyzed by “systemic racism and white supremacy” ".
There's no obvious mention of nepo babies in that link that you selected.
My own preference is for astro tangential projects such as
It's largely part of the same rotten thing. You think rampant “systemic racism and white supremacy” and sexism have absolutely no relationship to the presence of nepotism or glass ceilings? Or that their presence in one part of academia has no bearing on whether or not they're present in academic-adjacent industry, which could not be expected to have any idea about the problems in academia? Come on.
MIT-LL is way more stringent on degrees no matter what your experience is. With a B.S., you are relegated to technician work for upwards of 90% of your time, no matter how long you have been there. A master's is generally a step up from that, and in engineering that could mean middle of the road in the hierarchy. But a PhD is the only way to really move up beyond that, except in the rare cases of the fabrication group.
MIT main campus has been slowly heading in the same direction over the past decade, at least in the science staff positions. There are three major titles: research specialist (B.S.), research engineer (master's), and research scientist (PhD). Unfortunately, in the past 5 years, they changed the requirements of research engineer to require a PhD. And honestly, I have more recently been involved in the HR/hiring side of main campus, and though with the right department there is some wiggle room... there isn't that much you can do when the office of the VP of research says otherwise.
So unfortunately, to move up with anything but a PhD, you need to shift into some type of admin or adjacent role.
To exemplify this, I had a coworker who had worked in a research group as a research specialist for half a decade, coauthored like 15 papers while there, and had a BS in physics. He said he couldn't get a promotion no matter what he tried, because VPR wouldn't let his boss do it without an advanced degree (PhD). So he left for a different department as some kind of lab manager of a large research group. Which to me is crazy: you have a dedicated and knowledgeable worker who wants to stay and advance the research and the group... But because they didn't spend 6-8 years on a PhD, their only way to advance their career is to go into management of a lab - giving up on any research work...
FWIW, I was a Research Scientist there, a bit over a decade ago, despite having only MS degrees, no PhD.
I knew of a few Research Scientists there who didn't have PhDs, so I didn't think it that unusual at the time I was hired. At one point shortly after arriving, I did suddenly wonder whether some rule had been bent, or there was a hiccup in some process that was supposed to block that, but I still got an appointment renewal after a year.
As the recipient of getting to work for a great PI, and of a title that sounded impressive to my parents, I can't complain about that.
(I might've been lucky that time. I'm not a fan of degree/class/caste ceilings. For one of many reasons... There's the tragic story of a dear friend, who was a lab tech at another research university, in a field that had a degree glass ceiling among the technician ranks. Her supervisor and lab director sounded very supportive, and said she was the best technician in the lab, but their hands were tied on promoting her to a higher technician rank. She couldn't stomach the doctorate-level degree debt load that the supervisor was encouraging (she was poor, already had debt, and no family safety net), though it would've leapfrogged her over the role she sought. So she tried earning affordable transferable credits, for the gatekeeping for the next rank for which she was already qualified, while working full-time and living in lousy conditions. It killed her at around 30. Her lab did a memorial service. I got invited, but I didn't go. Besides being devastated myself, I was sure the supervisor and director already felt awful, and I had nothing to say in a memorial service context to certain cliquish technicians who bullied her for being meticulous about science protocols, and perhaps for aspiring above her station.)