
“We engage in countermeasures to protect our IP, including a careful process for which frontier capabilities to include in released models, and believe . . . it is critically important that we are working closely with the US government to best protect the most capable models from efforts by adversaries and competitors to take US technology.”

The above OpenAI quote from the article leans heavily towards #1 and IMO not at all towards #2. The latter would be an extremely charitable reading of their statement.


What they say explicitly is not what they say implicitly. PR is an art.


NYT claims that OpenAI trained on their material. They allege copyright violation, although I think another argument might be breach of TOS in scraping the material from their website or archive.

The complaint filing has some references to some of the other training material used by OpenAI, but I didn't dig deeply into what all of it was:

https://nytco-assets.nytimes.com/2023/12/NYT_Complaint_Dec20...


What's that got to do with this books claim?


Relevant similar behavior.


I'm great at magic eyes / stereograms and have a ton of posters around my house with them, but I still had trouble with seeing the differences in the test images. I easily locked in my focus on the overlapping cat images but only one difference stood out to me. I eventually got them all but it wasn't that easy (maybe with practice I could get there). The differences are noticeable when I focus right on it, but when I'm looking at the whole image it's harder to tell what is missing from one eye.


Are you able to look around while keeping your "unified vision"?

To me, all the differences appeared to be flashing (probably my brain alternates between the pair of images it attempts to "lock in", or something to that effect).


My local ISP offers VOIP.

For a long time the area was serviced by AT&T, who probably started with phone lines and then progressed over time to dial up and then more modern cable / broadband. They probably bundled in home phone service for many years.

When the local ISP built out all their gigabit fiber infrastructure they probably felt they had to offer some kind of phone service to compete, and went with VOIP since they weren't going to build out a whole telephone network infrastructure. I'd bet most people don't use it, but they need to offer it to be viable for certain older customers that don't want to give up their home phones.

I briefly set up a home phone on the provided VOIP, just for fun and nostalgia, but it was pretty annoying: it sometimes got disconnected and needed a manual power cycle to reconnect, so I stopped using it.


The author says "whiteboard tests" are broken, but it seems like they're arguing that online coding assessments are broken, not in person interviews using an actual whiteboard.

Doing an in-person interview on a whiteboard sidesteps the AI issue. As someone who's done a large number of remote interviews, there are some clear signs that some candidates try to cheat online tech interviews. I wonder if the trend will fuel more of a return to the office, or at least a return to in-person interviewing for more companies.


If your coding assessment can be done with AI and the code that the candidate is expected to write can’t be, doesn’t that by definition mean you are testing for the wrong thing during your coding interview?


Absolutely. We've switched from coding tests (much simpler than HackerRank) to a debugging problem. It relies on an external API, so we get to see the candidate's train of thought, and naive methods of cheating using ChatGPT are trivially detectable.

But this is an arms race, of course. I have no doubt LLMs could solve that problem too (they might already be able to, with the right prompt). And then we'd have to make it even more realistic... How does one fit "here is a 1M-line codebase and a user complaint, fix the problem" into the format of a 1-hour interview? We'll either have to solve this, or switch to in-person interviews and ban LLMs.


Everyone says this, but what is the best objective way to know whether a candidate is good for the position? Leetcode is still the best option IMO.


Give them a simple real-world use case where they're handed existing code and have to fix it by making the unit test pass.
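
A minimal sketch of what such a task could look like (the function and test here are made up for illustration, not taken from any real interview):

    # Candidate is asked to make test_median pass.
    def median(values):
        """Return the median of a non-empty list of numbers."""
        ordered = sorted(values)
        mid = len(ordered) // 2
        # Bug: even-length input should average the two middle values.
        return ordered[mid]

    def test_median():
        assert median([3, 1, 2]) == 2
        assert median([4, 1, 2, 3]) == 2.5  # fails until the bug is fixed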

Never in almost 30 years of coding have I had to invert a binary tree or do anything approaching what LeetCode tests for.

Well, actually, I did have to write DS-type code when writing low-level cross-platform C in the late 90s without a library.

But how many people have to do that today?

And how is LeetCode testing the best way when all it tests for is someone's ability to memorize patterns they don't use in the real world?


Yeah, it's weird, because the whole point of having a hiring system with common questions, rubrics, etc. is that at the end of the day you can either show that scoring well on the interview correlates with higher end-of-year performance reviews, or fail to show that and alter your interview system until it does.

You guys can keep posting these articles that have zero statistical rigor. It's not going to change a process that came about because it had statistical significance.

Do remember, Google used to be known for asking questions like "How many piano tuners are in NYC?" Those questions are gone not because somebody wrote a random article insulting them; they're gone because somebody did the actual math and showed they weren't effective.


Yes, because of Google’s rigorous hiring process they have had many successful products outside of selling ads against search…

I’ve done my stint in BigTech; most developers are not doing anything groundbreaking.


> or at least a return to in-person interviewing for more companies.

This has been broken for a while now, and companies still haven't reset to deal with it. The incentives to the contrary are too large.


The disincentives are huge though. A bad hire is a very expensive problem and hard to get rid of.


Isn't it as simple as a PIP at FAANGs, a short conversation at a founder-led startup, and a few weeks' notice pay?


The process is anything but simple at large companies. Even if the new hire is a complete fraud and can barely write code it'll still take an average manager 6-12 months to be able to show them the door. And it'll involve countless meetings and a mountain of paperwork, all taking away time from regular work. And then it'll take another 6 months to get a replacement and onboard them. That means your team has lost over a year of productivity over a single bad hire.


That comes after the decision that you can't fix the situation, which comes after you discovered that the hire was bad, which comes after a number of visible failures. That's a lot of wasted time/effort, even if the firing itself is simple.


Depends on the country I think - in Australia at least it seems like you can sue for unfair dismissal if you're angry about being kicked out, so HR departments only seem to get rid of someone as a last resort.


The cost of hiring, firing, rehiring approximates the position’s yearly salary.


In my area they just tell you to leave. No warning. No severance. Midwest US.


In which country?

In France, for instance, you have a (typically) 6-month, no-questions-asked window to fire a new hire if they prove a bad employee. Presumably, if you haven't found out within 6 months, you wouldn't find out by changing the interviewing strategy.


Is using AI cheating when it's part of the job now? Is not using AI signalling inexperience in the LLM department?


Copy pasting code from ChatGPT doesn't mean you have any kind of understanding of LLMs.


Yes, obviously. Cheating is subverting the tester's intent and being dishonest about it, not just whatever a lawyer can weasel-word their way around.


It’s not dishonest, it’s just business. I’m under the exact same burden of truth as the company interviewing me; zilch.


Fair enough! In this case, it seems the policy "hard-fail any candidate who cheats, using AI or otherwise" is working as expected. Interviews are supposed to be the candidate's best showing. If that includes cheating, better to fail them fast.


I wonder if OpenAI/Google/Microsoft et al. would hire a developer who leaned heavily on ChatGPT to answer interview questions. Not that I expect them to have ethical consistency when much more important factors (profit) are on the table, but after several years of their marketing pushing the idea that these are ‘just tools’ and that the output is tantamount to anything manually created by the prompter, it would look pretty blatantly hypocritical if they didn't.


Amazon uses Hackerrank and explicitly says not to use LLMs. In that case it would be cheating. However, given that everyone is apparently using it, I now feel dumb for not doing so.


They made tools to make us redundant and are upset we're forced to use those tools to be competitive.


Depends on what kind of developer you are trying to hire, maybe.


It's cheating if you don't say you're using it.


At some point I assume that it’ll be so normal that you’ll almost have to say when you’re not using it.

I don’t need to say that I’m using a text editor instead of punch cards. It’s also quite common to use an IDE instead of a plain text editor these days in coding interviews. When I was a student, I remember teachers saying they considered an IDE cheating, since they wanted to test our ability to remember syntax and keep a mental picture of our code in our heads.


That's actually a valid question. It looks like it was an unpopular one.

Personally, I despise these types of tests. In 25 years as a tech manager, I never gave one, and never made technical mistakes (but I did make a number of personality ones: great technical acumen is worthless if they collapse under pressure).

But AI is going to be a ubiquitous tool, available to pretty much everyone, so testing for people who can use it is quite valid. Results matter.

But don't expect to have people on board who can operate without AI. That may be perfectly acceptable. The tech scene is so complex these days that not one of us can actually hold it all in our head. I freely admit to having powerful "google-fu" when it comes to looking up solutions to even very basic technical challenges, and I get excellent results.


So now there are job applicants not only pretending to know DSA by using ChatGPT, but also claiming they have "experience in the LLM department".

It's not part of the job now, unless you're too inexperienced to estimate how long it takes to find subtle bugs.


My wife worked in a higher-end catering kitchen, and the "recipes" she brought home were crazy to me. They had little instruction other than ingredient ratios and sometimes an extremely rough outline of the technique; I guess a lot of experience is needed to fill in the gaps. As a home cook I would not have been able to follow them without more info.


I wrote a program in high school for my calculator to balance chemical equations. I ran the idea by my teacher, and he said that if I could make it myself, I could use it. It was probably more work in the long run than just studying equation balancing, but I got it working well and then had access to it for my AP Chem test.
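
The core trick, at least as I'd redo it today, is linear algebra: the balanced coefficients are an integer null-space vector of the element-count matrix. A rough sketch in Python with sympy (the original calculator version is long gone; the example reaction is just for illustration):

    from sympy import Matrix, lcm

    # Balance a*H2 + b*O2 -> c*H2O.
    # Rows are elements (H, O); columns are species, with products negated.
    A = Matrix([[2, 0, -2],   # hydrogen atoms per molecule
                [0, 2, -1]])  # oxygen atoms per molecule

    v = A.nullspace()[0]                  # rational solution: [1, 1/2, 1]
    scale = lcm([term.q for term in v])   # clear the denominators
    print([int(term * scale) for term in v])  # [2, 1, 2]: 2 H2 + O2 -> 2 H2O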

I did some others I don't remember as well: definitely some physics, some trigonometry, and other convenience functions for math. I didn't have the cable to plug it into a computer, and it was pretty annoying typing it all in using the calculator keys, but it was my first experience with programming, and I did end up making some stuff that was personally useful for my classes.

I wonder if my old calculator still has all those programs in memory. Not sure if it still works, since it was over ten years ago, but I'm pretty sure it's in a closet box; I saw it when I moved last.


I have played a ton of 3-minute flat blitz games at the local bar using my phone as a clock, and my phone has never been broken, maybe due to a little good luck. It's not as satisfying to tap the screen as slamming a button on a physical clock, but nobody has ever slammed the phone like that, even when they've been drinking.

Seen plenty of liquids spilled on the board though so there is always that hazard...


The Lichess app has had a built-in clock for years that works very well and has all the time settings you would want. You can find it in the menu under the analysis board. I think a lot of people don't know it's included, even if they use Lichess.

I've been using it for at least five years whenever I need a phone-based chess clock on the go. I didn't try yours, but the Lichess clock has all the options for stuff like increments, handicap, stages, etc.


It seems like a large part of the ruling hinges on the fact that Google matched the image hash to the hash of a known child pornography image, but didn't have an employee actually look at the image before reporting it to the police. If they had visually confirmed it was the image they suspected based on the hash, no warrant would have been required, but the judge holds that an image hash match is not equivalent to a visual confirmation of the image. Maybe there's some slight doubt about whether the match could be a hash collision, which depends on the hash method; a collision may be incredibly unlikely (near impossible?) depending on the specific hash strategy.

I think it would obviously be less than ideal for Google to require an employee to visually inspect child pornography identified by image hash before informing a legal authority like the police. So it seems more likely that the remedy to this situation would be for the police to obtain a warrant after getting the tip but before requesting the raw data from Google.

Would the image hash match qualify as probable cause enough for a warrant? On page 4 the judge stops short of setting a precedent either way. It seems likely to me that it would be solid probable cause, but sometimes judges or courts have a unique interpretation of technology that I don't always share, and leaving it open to individual interpretation can lead to conflicting results.


The hashes involved in stuff like this, as with copyright auto-matching, are perceptual hashes (https://en.wikipedia.org/wiki/Perceptual_hashing), not cryptographic hashes. False matches are common enough that perceptual hashing attacks are already in use to manipulate search engine results (see the example in a random paper on the subject: https://gangw.cs.illinois.edu/PHashing.pdf).
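
For a feel of how these differ from cryptographic hashes, here is a minimal average-hash ("aHash") sketch in Python with Pillow. It is far simpler than whatever production system Google actually uses, but the principle is similar:

    from PIL import Image

    def average_hash(path, size=8):
        """Perceptual hash: shrink, grayscale, threshold on the mean."""
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        # One bit per pixel: brighter than average or not.
        return sum(1 << i for i, p in enumerate(pixels) if p > mean)

    def hamming(h1, h2):
        return bin(h1 ^ h2).count("1")

A small Hamming distance between two hashes means "probably the same picture", even after resizing or recompression - and occasionally for two unrelated images, which is exactly the false-match problem above.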


It seems like very relevant information that was not considered by the court. If this were a cryptographic hash, I would say with high confidence that it is the same image, and so Google effectively examined it - there is a small chance that some unrelated file (which might not even be a picture) matches, but odds are the universe will end before that happens, and so the courts can consider it the same image for search purposes. However, because there are many false positive cases with perceptual hashes, there are reasonable odds that the image is legal, and so a higher standard for search is needed: a warrant.
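
For contrast, here is what the cryptographic case looks like - change a single byte and the digest is unrelated (a quick sketch with Python's hashlib; the byte strings are stand-ins for file contents):

    import hashlib

    a = b"the exact original file bytes"
    b = b"the exact original file bytez"  # one byte changed

    print(hashlib.sha256(a).hexdigest())
    print(hashlib.sha256(b).hexdigest())
    # The digests are completely different: a SHA-256 match implies
    # byte-identical files (barring an astronomically unlikely collision),
    # but any edit or re-encode of an image breaks the match entirely.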


>so the courts can consider it the same image for search purposes

An important part of the ruling seems to be that neither Google nor the police had the original image or any information about it, so the police viewing the image gave them more information than Google matching the hash gave Google: for example, consider how the suspect being in the image would have changed the case, or what might happen if the image turned out not to be CSAM, but showed the suspect storing drugs somewhere, or was even, somehow, something entirely legal but embarrassing to the suspect. This isn't changed by the type of hash.


That's the exact conclusion that was reached - the search required a warrant.


The court implied that even a hash without collisions would not count, when it should.


It shouldn't. Google hasn't otherwise seen the image, so the employee couldn't have witnessed a crime. There are reportedly many perfectly legal images that end up in these almost perfectly unaccountable databases.


That makes sense - if they were using a cryptographic hash, people could get around it by making tiny changes to the file. I’ve used some reverse image search tools, which use perceptual hashing under the hood, to find the original source for art that gets shared without attribution (SauceNAO is pretty solid). They’re good, but they definitely have false positives.

Now you’ve got me interested in what’s going on under the hood, lol. It’s probably like any other statistical model: you can decrease your false negatives (images people have cropped or added watermarks/text to), but at the cost of increased false positives.
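
Sketching it out, the whole tradeoff seems to come down to a single distance threshold (a toy example; the threshold value is arbitrary):

    def same_image(hash_a: int, hash_b: int, threshold: int = 10) -> bool:
        """Treat two 64-bit perceptual hashes as a match if they differ
        in at most `threshold` bits (Hamming distance)."""
        distance = bin(hash_a ^ hash_b).count("1")
        # Raising the threshold cuts false negatives (crops and watermarks
        # still match) at the cost of more false positives (unrelated
        # images match); lowering it does the reverse.
        return distance <= threshold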


> what's going on under the hood

Rather simple methods are surprisingly effective [1]. There's sure to be more NN fanciness nowadays (like Apple's proposed NeuralHash), but I've used the algorithms described in [1] to great effect in the not-too-distant past. The HN discussion linked in that article is also worth a read.

[1] https://www.hackerfactor.com/blog/index.php?/archives/432-Lo...


This submission is the first I've heard of the concept. Are there OSS implementations available? Could I use this, say, to deduplicate resized or re-JPEG-compressed images?


Probably, yeah, though there’s a significant tradeoff between how much distortion to accept and the number of false positives.
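
The Python imagehash package (built on Pillow) is one OSS option. A dedup sketch with it, using hypothetical file names and an arbitrary cutoff:

    import imagehash
    from PIL import Image

    h1 = imagehash.average_hash(Image.open("photo.jpg"))
    h2 = imagehash.average_hash(Image.open("photo_resized.jpg"))

    # Subtracting two hashes gives the Hamming distance in bits; resized
    # or re-compressed copies usually land within a few bits of each other.
    if h1 - h2 <= 5:
        print("likely duplicates")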


The hash functions used for these purposes are usually not cryptographic hashes. They are "perceptual hashes" that allow for approximate matches (e.g. if the image has been scaled or brightness-adjusted). https://en.wikipedia.org/wiki/Perceptual_hashing

These hashes are not collision-resistant.


They should be called embeddings.


> Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.

If it were a cryptographic hash (apparently it is not), this mathematical near-certainty would be necessary but not sufficient. As with cryptography used for confidentiality or integrity, the math doesn't at all guarantee the outcome; the implementation is the most important factor.

Each entry in the illegal hash database, for example, relies on some person characterizing the original image as illegal - there is no mathematical formula for defining illegal images - and that characterization could be inaccurate. It also relies on the database's integrity, the user's application and its implementation, even the hash calculator. People on HN can imagine lots of things that could go wrong.

If I were a judge, I'd just want to know whether someone witnessed CP or not. It might be unpleasant, but we're talking about arresting someone for CP, which even sans conviction can be highly traumatic (including time in jail waiting for bail or trial, as a ~child molester) and can destroy people's lives and reputations. Do you fancy appearing at a bail hearing on your CP charge, even if you are innocent? 'Kids, I have something to tell you ...'; 'Boss, I can't work for a couple weeks because ...'.


It seems like there just needs to be case law about what qualifies an image hash to count as probable cause for a warrant. Of course, you could make an image hash arbitrarily good or bad.

I am not at all opposed to any of this "get a damn warrant" pushback from judges.

I am also not at all opposed to Google searching its cloud storage for this kind of content. There are a lot of things I would mind a cloud provider going on fishing expeditions to find, but this I am fine with.

I do strongly object to companies searching content for illegal activity on devices in my possession absent probable cause and a warrant (that they would have to get in a way other than searching my device). Likewise I object to the pervasive and mostly invisible delivery to the cloud of nearly everything I do on devices I possess.

In other words, I want custody of my stuff, and I want the physical possession of my stuff to be protected by the 4th Amendment and not subject to corporate search either. For things I willingly give to cloud providers, which they have custody of, I am fine with the provider doing limited searches and the necessary reporting to authorities. The line is who actually holds the bits.


I think if the hashes were made available to the public, we should just flood the internet with matching but completely innocuous images so they can no longer be used to justify a search.

