It seems this gpt4free was basically hijacking third-party services that use GPT-4, bypassing the official OpenAI APIs in order to avoid paying for inference. Of course, that means the hijacked third parties are the ones footing the bill...
I'm not surprised they have been issued a takedown notice.
It's not clear to me that a DMCA takedown is an applicable legal process for that, but when does that ever stop anyone these days?
What specific US laws do folks think that repo (or running/using the software in that repo) might have been violating? (I agree it seems likely that it's _some_ law, I'm not challenging that just asking if anyone has a legal analysis they want to share).
It's an excellent point. DMCA is for copyright. My take (I have a background in this area, but IANAL): I think they can get away with the copyright claim because of the name/content usage (no one on the opposite end of the request is going to question it; it seems obvious), but I think it's clear to those in the know that that's not WHY they went after this one.
In theory, they could probably use the DMCA to go after anyone using the terms (rightly or wrongly). In practice, they used it as a tool to go after this particular one because they didn't like what they were doing.
See for example, the group arrested for selling devices that allowed people to take control of their own Nintendo Switch systems[1].
Although, after digging into the story, it looks like they may have also operated an illicit app store containing cracked IPs, so that situation is a little murky.
I think jailbreaking is a current exception to the DMCA (according to the copyright office's latest report). An app store full of cracked games is obviously illegal, though.
Hm, good point. I think of "takedown notice" as being about the DMCA, because I never heard that term at all before the DMCA; I think of it as a term of art from the DMCA. But people could be using it differently or misusing it.
However, this is on Github. Github specifically has a "DMCA Takedown Policy" [1]. I don't believe they have any other policy or procedure involving a "takedown notice". But sure, I could be wrong, or the notice on the repo could be not quite right about what's going on.
Other companies, even big ones, will just take down anything a big corporation asks them to, with no written policy, or with a written policy basically saying that's what they'll do, while using language implying the DMCA (like "takedown notice") when that's not what they're doing at all. But Github has actually been pretty good about doing this according to the procedure spelled out in the DMCA, and not just randomly for whatever another big corporation might want. And about being clear about what they're doing, and why, if they're doing something else.
> Xtekky initially told me that he hadn't decided whether to take the repo down or not. However, several hours after this story first published, we chatted again and he told me that he plans to keep the repo up and to tell OpenAI that, if they want it taken down, they should file a formal request with GitHub instead of with him.
> "I believe they contacted me before to pressurize me into deleting the repo myself," he said. "But the right way should be an actual official DMCA, through GitHub."
Yeah what's curious to me is why OpenAI has grounds here vs [the abused 3rd parties]. Maybe they are trying to stand up for the people using their API as a courtesy because they want them to stay in business or something, but it seems the damaged parties are the 3rd party services bankrolling the access, and so they'd need to pursue legal action and/or patch their services.
I do imagine OpenAI has something in their terms where you're not allowed to use their APIs unless you agree to their terms, which includes payment and not using other accounts than your own (fraud). So maybe that's it?
CFAA[0] is one that comes to mind but I also think that has different issues with what might be overly vague terminology. It at least seems more applicable to this, though I am certainly not a legal expert.
There was a somewhat recent supreme court ruling that said just breaking some ToS is not a CFAA violation. Unless the gpt4free repo had straight up stolen credentials the CFAA shouldn't apply.
They should be happy that OpenAI went after them with the DMCA and not for computer hacking and fraud, which is what they technically did by hijacking other people's API keys.
A lawyer must have advised this, as financial fraud likely has a higher burden of proof. They might still proceed with criminal charges (if a DA agrees) or a lawsuit.
They were referring to the fact that everything ChatGPT is built on is other people's work. Beyond the details of actually building the model, there is nothing that ChatGPT owns. All the content they use to train, all of the art they use to train: everything is stolen/used without permission. Obviously there is more to it than that, because you published it on the internet. But that's a different topic.
Everything that anyone has ever built is built on the works of others. This is how we progress as a species. The entire reason why the internet is so revolutionary is that it allows for permissionless innovation.
Then OpenAI should allow us to do some permissionless innovation on their work.
Strangely enough, it's only interested in promoting permissionless innovation when it stands to profit. It plunders the commons, and gives nothing unencumbered back.
The reason intellectual property was invented was to encourage people to go and create new things and share them, the logic being that having a monopoly on your own work by default means you can make money from being creative and therefore people will choose to do it. The reverse is already happening, people are deciding (privately) not to publish things they have created because they rightly assume it will be stolen by an AI, monetised and used to destroy their own job. It is not merely complaining for its own sake. There is a good amount of theft and a bad amount of theft. As theft increases unchecked the amount of new output is poised to decline.
I don't see any world where it matters in the slightest. When it comes to how we deal with currently available training data, nothing will change, first because of politics, but also because people want the LLMs' superpower more than they want to protect the IP of a few individuals. And I firmly believe that no human training data that hasn't already been produced and published by today will play any significant role in future AI development.
The topic's biggest cop-out. Intellectual property doesn't exist in a vacuum. I have limited-to-zero sympathy for corporate entities like Getty images that hoard IP, but our society's social contract says labor isn't free unless people donate it. We need to implement some sort of alternate compensation system before entirely disregarding IP so we don't pull the rug out from under perfectly honest independent creatives with kids and mortgages and medical bills plying their craft in an established system. Until then, taking the fruits of creative labor without permission is theft that is much more consequential and much less morally defensible than what you describe.
I'll bet that if someone outside of our IP jurisdiction figured out a way to reliably and thoroughly reverse engineer the most complex commercial software from binaries, so that people could spit out a working, fully customized copy of a commercial application from a prompt and the entire software development market would soon collapse, the tenor of this conversation would be very different.
Maybe the people with the very ethically defensible stance that private property is theft would be totally fine with OpenAI knocking down your home to build their new headquarters without compensating you? Imagine the progress! (hint: they probably wouldn't be ok with it)
None of this stuff exists in a vacuum. None of it.
You are right, the (potential) negative impact AI training has on what people do will only manifest in the future.
But no matter how I or anyone else feel about the car or how bad it is for the environment, or how much we dislike the noise they impose on us, it's simply not going to bring back the horse.
Another cop-out. No, the horse isn't dead. This technology is at a precipice, and society outside of the tech world hasn't even started to react yet, let alone develop entrenched, immutable norms surrounding it. A good portion of what these algorithms put out isn't even particularly commercially useful... yet. There's a lot of time to change policy, to change corporate norms, to change compensation structures, and to change perspectives. Just because you find that prospect inconvenient compared to throwing up your hands and saying fuck it, likely because you benefit from doing so, doesn't mean you don't have a moral obligation to reduce the harm these behaviors will cause.
Beyond that, the technology is just the catalyst. It's a tool. The problem is what people are doing with it. That's an ongoing behavior that can be changed-- not a bell you can't un-ring.
I won't claim to know what's in your head, but most people I've encountered who rebuff complex topics with idealistic platitudes don't really think the topics are that simple. They're avoiding confronting the negative consequences of a behavior they have no intention of changing to avoid damaging their moral self-image.
"chilling effects" usually refers to when people decide not to share things because of potential legal consequences. For example, if people stop creating or distributing AI art because they don't want to be sued by artists for using their style, that's a chilling effect. Basically the opposite of what you are describing
>Just say 'fuck it' until there's nothing left to scrape other than content also made by AI?
Sounds good to me! There will always be people making free art, and AI will make this much easier.
The thing that I think people are missing is that AI-generated content CAN be used to improve AI models. There is no requirement that the input data is created without AI.
Furthermore, AI-generated content on the internet is not random; it is curated content. Generally speaking people don't post every image they generate with Stable Diffusion, they only post the best images. If you consider engagement metrics and user feedback (upvotes etc), they can be a valuable and useful part of a training set.
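As a toy illustration of that curation idea (the field names and threshold here are invented, not anything the commenter specified), one could admit generated images to a training set only when their net engagement clears a bar:

```python
# Sketch: use upvotes as a crude human-quality filter on generated images.
from dataclasses import dataclass


@dataclass
class Post:
    image_path: str
    upvotes: int
    downvotes: int


def select_training_samples(posts: list[Post], min_score: int = 10) -> list[str]:
    # Net engagement as a proxy for "humans judged this one good".
    return [p.image_path for p in posts if p.upvotes - p.downvotes >= min_score]


posts = [Post("a.png", 120, 4), Post("b.png", 3, 9)]
print(select_training_samples(posts))  # ['a.png']
```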
Two months? The (AI-luddite) preachers have been reciting this litany since the first decent diffusion models were released over a year ago. They haven't slowed down any.
This is an incorrect and unfair statement that would not pass the test in any court of law. ChatGPT uniquely orders information in a way that gives them a competitive advantage in the marketplace. While the source information is public, the ordering of it is proprietary and a trade secret.
Your argument is a reductio ad absurdum: "everything is made of atoms and no one owns atoms, ergo no one owns anything."
I'm sorry but you can't honestly use stolen without permission here. If you publish something and someone else acquires it legally (because you published it for free or because they paid for or otherwise obtained a license to it) then you don't get to control how the work is used after the fact. You only control the terms of them receiving a copy. You can't say "I didn't want my work used for AI training data when I published it so it's all stolen as far as I'm concerned". It just doesn't work that way.
Now that doesn't mean you can't license your work for exclusive use by humans and explicitly forbid AI training data in the license applied to your work, but you'd have to do that when you publish it, not retroactively.
I personally don't think IP has a place in modern society. But I was mostly replying to the author's comment.
My concerns mostly lie with the fact it's owned largely by $MSFT rather than a more "open source" contributing to society entity. But again that's a much different topic.
Block access if you don't want them to access your data. No bills created.
However, if AI ends up being as mainstream as the average HN user is claiming, are you sure you aren't shooting yourself in the foot by not having your brand and product info included in that data set, if it replaces search engines?
Is it any different from a Google crawler? They put ads on your content on the SERPs after crawling it.
Google drives traffic to your website, and generates revenue. It also respects IP and gives credit. AI crawlers don't. Just because content is out in the open doesn't mean there's no license to follow when using it. If they gave credit, respected IP, and drove traffic, then sure.
Whilst I'm not an AI proponent, Bing cites sources in the form of sites, thus driving traffic, so that's not strictly true. Bing seems to be somewhat of a sensation currently with consumers.
I am not against AI, and I am open to any AI tools that give credit or satisfy the requirements of the license of whatever they ingested. AI is good, theft is not. So if Bing plays by the rules, that's great.
The phrase "Who's 'we,' white man?" is a reference to a scene in the 1991 film "Grand Canyon." In the scene, a wealthy white man tries to help a stranded African American man in a dangerous part of town. The African American man responds to the white man's offer of help with the question "Who's 'we,' white man?" which is a critique of the white man's assumption that they are part of the same group. The phrase has since been used in various contexts to challenge assumptions of shared identity or experience.
Curious how much people's bills are inflated by AI crawlers constantly sucking up their data, and how much revenue is lost since traffic is not brought to their websites. And since there's no way to stop this theft (most of them don't honor robots.txt), people are forced to remove content. Perhaps those charged for bandwidth are already losing money right now.
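Since robots.txt is purely advisory, the practical fallback is refusing such requests server-side. A minimal sketch, assuming Flask; the User-Agent substrings are illustrative examples only, the list is nowhere near exhaustive, and a crawler can defeat it entirely by lying about its User-Agent:

```python
# Sketch: deny requests whose User-Agent matches known AI-crawler strings.
from flask import Flask, abort, request

app = Flask(__name__)

# Illustrative examples (e.g. OpenAI's and Common Crawl's bots).
BLOCKED_UA_SUBSTRINGS = ("GPTBot", "CCBot")


@app.before_request
def block_ai_crawlers():
    ua = request.headers.get("User-Agent", "")
    if any(bot in ua for bot in BLOCKED_UA_SUBSTRINGS):
        abort(403)  # refuse to pay bandwidth for scrapers


@app.route("/")
def index():
    return "content for humans and honest crawlers"
```

Crawlers that spoof a browser User-Agent sail right through, so determined operators need IP-range or behavioral blocking on top of this.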
It's the punchline of a vaguely racist old joke involving the Lone Ranger and Tonto. I have to admit that I also often think of it when somebody uses "we" inappropriately to make their opinion or experience appear universal.
(But as seen here, you can't really just drop the punchline into a conversation.)
Why is OpenAI getting involved? They are getting paid either way. The third parties should do the takedown if they are not happy about their endpoint being scraped.
Presumably, they're looking out for their paying users (see: they want to keep those paying users), who would have a terrible experience if and when they found out someone else had been using their APIs and/or API keys.
This project is designed to allow people to use ChatGPT via reverse-engineered private APIs. It's not surprising they went after this.
Here's the project description from the README:
Have you ever come across some amazing projects that you couldn't use just because you didn't have an OpenAI API key?
We've got you covered! This repository offers reverse-engineered third-party APIs for GPT-4/3.5, sourced from various websites. You can simply download this repository, and use the available modules, which are designed to be used just like OpenAI's official package. Unleash ChatGPT's potential for your projects, now! You are welcome ; ).
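For the curious, the pattern such provider modules follow is easy to sketch. Everything below (the class name, endpoint, and payload shape) is invented for illustration; the actual repo targets specific third-party services whose backends were found via browser devtools:

```python
# Hypothetical sketch of a "provider module": wrap a third party's
# undocumented chat endpoint behind an OpenAI-style interface.
import requests


class FreeProvider:
    # Invented URL; a real module would target an actual site's backend.
    API_URL = "https://example-gpt-frontend.example/api/chat"

    @staticmethod
    def create(messages: list[dict]) -> str:
        # Mimic the call shape of the official openai package so callers
        # can swap this in with minimal changes.
        resp = requests.post(
            FreeProvider.API_URL,
            json={"messages": messages},
            headers={"User-Agent": "Mozilla/5.0"},  # pose as a browser
            timeout=30,
        )
        resp.raise_for_status()
        # Invented response field; each proxied site differs.
        return resp.json()["completion"]


# Usage mirrors the official client:
# FreeProvider.create([{"role": "user", "content": "Hello"}])
```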
You are protected in your speech from the government. Commercial law does and will still apply. Arbitrary company decisions happen all the time, and GitHub makes it clear that they won’t refrain from deleting repos for whatever reason.
Not sure if that distinction applies here. I understand that the 1st amendment doesn't protect you from a company refusing to publish what you've said - that makes sense. But this is a case of a company attempting to use commercial law (created by the government) to cause someone else to stop their speech. That seems like a simple violation of the "Congress shall make no law respecting an establishment of religion, or prohibiting the free exercise thereof; or abridging the freedom of speech"
But I don't know, because I'm not a lawyer, and we have copyright and IP laws, so clearly Congress can pass SOME laws that prohibit speech. Free speech absolutism is weird to me.
In this case, it is really GitHub’s company policy which is being applied, right? They aren’t required to host anything in general, and they have a policy of taking down repos based on their interpretation of the DMCA, or some similar law (which might be a misinterpretation).
Any law is ultimately enforced by the government. There isn’t a different type of law to which the constitution doesn’t apply (I mean it doesn’t say a ton about limiting various types of laws—laws around contracts, state law, etc etc—but it still applies, it just doesn’t say much).
However, this seems more like an issue of corporate policy than law.
Github needs to have some policy that ends up with them taking down repos that actually host illegal content; they don't have any legal obligation to host files, so they can respond to takedown notices by just taking down the files. This wouldn't be the government forcing them to take down files; it would be them deciding not to try to parse the law very closely. But this is different from having an area of law where the constitution doesn't apply, and it bears repeating, because the constitution is really important and the idea that there should be some sort of cutout where it doesn't apply is bad for society.
Something we should grapple with as a society is whether poorly written, ambiguous laws should be interpreted as the government taking action, by essentially forcing companies into being overzealous in their corporate policy.
> Any law is ultimately enforced by the government
So? That’s not what the first amendment applies to. You do not have first amendment rights in civil cases. This is not “an idea”. It’s just how it is. See libel.
> > Any law is ultimately enforced by the government
> So? That’s not what the first amendment applies to.
Yes it is.
> You do not have first amendment rights in civil cases.
Yes, you do; that’s why US defamation law is more limited than the common law it derives from, and where Fair Use as a judicial application of the First Amendment came from before it was codified in statute.
> See libel.
Libel is a perfect example of how you do have First Amendment rights in civil cases. Here's a long list of cases applying the First Amendment in the libel/defamation context:
You are actually very protected in documenting security flaws, and even republishing them.
I am unsure of who you think enforces laws... as far as I know OpenAI doesn't have their own police force yet.
They can sue you of course, but they generally can't demand compliance with takedowns in this case without first going to a judge and requesting a court order.
There is no "commercial law" unless you mean the UCC, which doesn't apply here.
Still, there's nothing illegal about GitHub deleting your repo for any reason they choose as long as they're a private entity not owned by the government.
I'm not a lawyer or even an American but that certainly isn't how the DMCA works. The takedown is issued against the hosting company and, if they comply, they have no further liability. If they don't comply, they are liable in court so, of course, they all comply.
There's a difference between being compelled by a court order to take down a repo and choosing to comply with a DMCA takedown notice of dubious validity because you don't want to waste any more time on the issue and are happy to screw your users.
I think an interesting legal case could be made about publishing the code as a web page somewhere that could play around with the speech/action distinction - but I certainly don't think Microsoft is going to host ways for people to backdoor OpenAI under any circumstances - honestly I'm surprised it took an actual DMCA notice
> You are protected in your speech from the government
In theory. In practice we see in the Twitter files, the new rule is that government agencies are free to send takedown requests to social media platforms for speech that disagrees with our (abhorrent) foreign policy.
>In practice we see in the Twitter files, the new rule is that government agencies are free to send takedown requests to social media platforms for speech that disagrees with our (abhorrent) foreign policy.
Where in the Twitter files did it show that social media platforms would be punished with jail time or violence or anything if they refused to obey the government's orders?
Because unless you can demonstrate the government was putting a gun to Twitter's head and would not take no for an answer, that isn't a "new rule" it's literally just the government making a request. Which they and anyone else is and has always been allowed to do. And which social media platforms have sometimes refused without reprisal. I mean, I see speech that disagrees with American foreign policy all the time on social media. No one's being sent to the camps for it. It doesn't even get censored.
In general it's certainly true that private internet companies are not bound by the US First Amendment in that way, and routinely "censor" speech, I agree. Including for things that they think will be bad PR for them, or will be legally risky for them. But.
> GitHub makes it clear that they won’t refrain from deleting repos for whatever reason.
What actions or speech do you think Github has taken that makes that clear?
In general, I have seen Github stick to only taking things down according to the actual DMCA law, more than most companies that take things down pretty much whenever anyone asks them to.
Github has a DMCA Takedown Policy [1] that is better than most companies'. Most companies' policies -- if they even transparently publish them at all, which they often don't -- go well beyond what the DMCA requires in what they will take down. Compare to, e.g., YouTube [2], which isn't really using a DMCA process at all, doesn't really have a transparent policy at all, and does not allow you to counter-notice. Github's policy is way better than most; but maybe there are occasions where they have been known not to follow their own policy, is that what you're saying?
From what I've seen, github has actually made it much more clear than most companies that they won't just randomly take things down for arbitrary reasons, but have a clear and transparent policy based on the DMCA. But maybe there are things I don't know.
In this particular case, though, someone else pointed out to me in another part of this thread that it's not totally clear Github is even involved. From the text on the repo, it seems possible that OpenAI contacted the repo owner directly, the repo owner decided to change the text of the repo README to say that, and that may be all that has happened. If Github had actually done a "takedown" according to their usual procedures, I think the repo wouldn't be there anymore. But it's not really clear what's going on, or even what the repo owner _claims_ is going on, unless we have more info than appears in the linked repo README. It's not currently clear that Github is involved at all.
> What actions or speech do you think Github has taken that makes that clear?
Github has deleted Iranian/Russian/etc. repos before, purely for political reasons. GitHub belongs to Microsoft, and Microsoft has a stake in OpenAI. Seems like a reason to me.
By purely political reasons, do you mean to comply with US law? Those are the only cases I can find [1] [2] [3]. Or if there are other cases, I'd love to see a link to more info!
I don't like the US law much either, personally. I just don't consider a US corporation complying with US law to be a demonstration they will do arbitrary things for "any reason", it's sort of the default I expect from corporations and very predictable. If there's an example of a US corporation that chose to intentionally violate those sanction laws as an ethical stance, I'd love to find out more about that, and the outcome, too! I would imagine they would be penalized by the US government.
Microsoft was in fact recently so penalized [4], but I don't think it had anything to do with github, and I definitely don't think it was an ethical stance, just a mistake/profit-motivated one, or because these laws are a mess and hard to comply with. But I expect US corporations to try to comply with US laws, and don't consider doing so to be arbitrary or unpredictable "any reason".
This is why Microsoft's takeover of Github (and OpenAI for that matter) is so tragic. They weren't required to take this down. It got taken down because Microsoft didn't like it. Microsoft now has their hooks in the open source community and can crush any project who does something they don't like.
I find it extremely sad that nowadays the EFF, FSF and ACLU are so watered down compared to the 90s (when I first read about abuses). With the wave of information and abuses that will come in the next few years due to proprietary LLMs, I wish there were a new person with the drive of Stallman. Humanity desperately NEEDS the new Stallmans, Jon Lech Johansens, Russinovichs and Linuses of these new generations.
If the code in any way includes private API keys, or circumvents protections on another entity's private API keys, then this is intellectual property theft and punishable by law. I'm willing to bet that without those private keys, the repo is worthless.
It doesn't contain private keys; arguably, it contains irresponsible disclosures of various ways some large API users can predictably get their keys hijacked.
Assuming that those third-party services are ones that the public can access via their own web interfaces, such that the only thing unauthorized is the manner in which the APIs are consumed, this would seem (unless I am missing more specific precedent) to fall out of CFAA coverage as a result of the Van Buren v. United States decision.
I remember seeing "Help: FBI criminally charged me with $6MM loss for hotlinking. I didn't do it" on HN earlier this year (https://news.ycombinator.com/item?id=30589489). Was this person lying?
There is no indication of what the charge was, and usually with hotlinking to an asset the legal issue is copyright infringement (which can be criminal as well as civil); that’s very different from suggesting that use of an API endpoint intended to be used by a public web frontend is a CFAA violation.
I remember when this passed, and thinking that it was all the big, incompetent businesses that can afford lawyers on retainer making sure that only big businesses that can afford lawyers on retainer maintain their position of superior power over individuals. Snuffing out any hope that the little guy - who, through sheer talent, can do things on this incredible newfangled equalizing innovation called the Internet - will finally have some real chance at power.
Bank of America used it to make the people who simply changed the account number in their URL bar the criminals, instead of themselves, who were completely incompetent at securing access to their customers' accounts. What previously would arguably have been criminal negligence.
It placed intent above competence - but only for those who can afford lawyers.
> Keeping in mind that "not complying with a corporation's policies" is not the same as breaking the law.
Actually, the law says that the Terms of Service is a legally-binding contract unless you can prove any provision is legally considered unconscionable. However, if that happens, all provisions except that provision still bind. It is illegal to break a legally-binding contract, and you can be sued or taken to arbitration at a minimum in a civil court for "breach of contract." And that's before any Computer Fraud and Abuse Act or Digital Millennium Copyright Act violations.
Yes, corporations don't sue users for "breach of contract" almost... ever. It's expensive, risky, has low compensation for doing so, and is just bad PR. But they legally always can.
> Actually, the law says that the Terms of Service is a legally-binding contract unless you can prove any provision is legally considered unconscionable
No, it doesn’t.
It says they can state the terms of a contract if all the requirements of contract formation have been met, which are more than just the absence of unconscionable terms.
I'm assuming the users of Gpt4free haven't signed up to OpenAI's terms and conditions, even if they do contain language prohibiting use of these private APIs. A corporation can't unilaterally impose their TOS on the entire population (or, at least, one would hope they can't).
In that case though, let's say OpenAI decided to enforce their Terms of Use by potentially suing. The defendant would likely have to show, whether he likes it or not, that he never once signed up for ChatGPT, never once signed up for the official OpenAI API, and managed to perfectly reverse-engineer the API from the outside. Seems unlikely to me.
But then of course... CFAA and DMCA. The DMCA in particular, for example, doesn't consider the strength of the lock in the criminality. DVDs can be cracked with 7 lines of Perl since 2001, but it's still a DMCA violation.
> The defendant would likely have to show, whether he likes it or not, that he never once signed up for ChatGPT, never once signed up for the official OpenAI API, and managed to perfectly reverse-engineer the API from the outside.
These aren’t reverse engineering the OpenAI API, they are reverse engineering the APIs of public services that in turn call the OpenAI API.
I’m not sure under what theory OpenAI would even sue.
> But then of course... CFAA and DMCA. The DMCA in particular, for example, doesn't consider the strength of the lock in the criminality.
The DMCA only applies to technology addressing copyrights, and CFAA seems inapplicable to consuming the backend APIs used by publicly accessible services because that’s just use of authorized access by a different manner, outside of CFAA scope under the Van Buren precedent.
This is almost certainly an instance of Unauthorized Use under the CFAA and therefore criminal in the USA and any jurisdictions with similarly broad anti-hacking laws.
If those are APIs consumed by public sites, then they are APIs the public is authorized to use by way of those sites, and Van Buren v. United States says that if you are authorized to access a system, accessing it a different “manner or circumstances” is not “unauthorized” as that term is used in the CFAA.
I wasn't aware companies could, by fiat, declare certain publicly available endpoints private, thereby compelling everyone by force of law to pretend they don't exist.
My bank's website is publicly available. That doesn't mean anyone is free to access my bank account. Just 'cause something is accessible on the internet doesn't mean you have the right to access it. Case law and statute goes back at least to the 1980s on this point.
Citing convictions overturned on appeal probably isn't the strongest evidence of illegality. (Because they were overturned on threshold issues that didn’t involve inquiry into the substantive merits of the charges, its not evidence against illegality, either, but...)
My point is that people have gone to prison over GET parameters, not the legality of it. The DOJ has the CFAA. Abusing private APIs is flying close to the sun, even if you do get out of prison eventually.
So, if I create a cat GIF API, but announce that it's a private cat GIF API only I am allowed to use, I can sue anyone else who uses it to retrieve a cat GIF?
Knowingly using a private API without authorization can fall under CFAA, contract law, copyright law, trespass to chattel, etc -- and you can issue a C&D and/or sue for whatever is relevant.
These are the exact same “private API”s your browser utilizes when visiting chat.openai.com and require your own API keys granted to you by OpenAI.
Calling it illegal is utterly insane. It’s just a different user-agent and they’d prefer people use their official ones. OpenAI literally controls the keys so if they don’t want someone using an alternate mechanism, they can and will just ban the account.
My website is private. If you visit it I will sue you.
If someone bypasses authentication I understand but if your api is open on the public internet on purpose, you don't get to randomly declare what's private and what isn't.
> mycoolsite.com is the same as mycoolsite.com/api/bb8d4cc4-1453-473b-8594-95db0f41877d/3c9242b8-2394-48c1-9643-618ca38eb13d for which you'll also need these dozen parameters and custom headers for, which there is no public documentation for
If someone I didn't grant access broke (in a very smart way) into my house, turned on the lights for a minute and then left, I'd still be pissed and would call it illegal.
The situations aren't really comparable. We're talking about sending a request from a computer to a publicly available API endpoint that OpenAI would rather you didn't, and then using the data that endpoint sends in response.
(Somewhat tangential, the "networks as a 3D space you travel around in with locations you visit" analogy does more harm than good. It's not what's happening and it results in muddled thinking.)
Something being accessible does not mean you're authorized to access it. Someone's house being unlocked doesn't mean it's okay for you to enter. Authorization is the key part here, and you likely can be convicted under the CFAA[1].
Again, I don't think the analogy holds—no one is entering anything. A better analogy would be someone standing outside your house and asking you to pass a book to them through the open door, which you then voluntarily do.
>projects that you couldn't use just because you didn't have an OpenAI API key?
It's amazing how the repo phrases this like "having an OpenAI API key" is something that's gatekept, rather than something you get by making a free account. (You may not be able to use it, but the more honest phrasing of "don't want to pay for your own API usage" is apparently too transparent for what this is offering.)
It's a project that lets you piggyback off of others' ChatGPT API keys without their permission? If so, then it seems like it would violate both OpenAI's ToS as well as the ToS for any site that is being used as a proxy.
And is this a DMCA takedown? It's not actually specified in the readme update and I would have thought that the repo would have been hidden by now if it was one. Plus I'm not sure what they'd be claiming copyright on here (the API maybe?)
Just like all the code you write is just code you read elsewhere "repackaged". Ok sometimes you come up with what seems to be novel code, but we all know really you're just a sophisticated pattern matcher and you're just typing out the code you think is best at any given moment, based on everything you've seen and learnt from.
That makes no sense. The degree to which your inputs to your coding environment can differ is many orders of magnitude less than what the LLM can generate.
That's my exact point. The compiler is explicitly assigning those rights to the user. It's a legal contract. It's not implicit in the design of the license. Each entity's rights to ChatGPT are exclusively described in the EULA that entity signed to obtain access.
I think the devs behind GCC are going to be really upset when they find out people have been using it to generate binaries. This is different from paintings and novels, where the authors made them for the sole purpose of being a tool. /s
A lot of comments confuse this with a different repo. It has nothing to do with the name. This project is/was a way to use LLM APIs on someone else's dime. It's the equivalent of "S3 4 free" where someone would collect exposed AWS credentials and use them to store their stuff.
This isn't about exposed credentials, though. It would be like an automatic image uploader that could pick an image hosting site such as imgur and upload the image for you and give you a link. Services are offering the ability to host images for you. You aren't stealing imgur's S3 credentials; they just let any user upload images for free, despite the fact that it technically costs them money to host the file for you. Similarly, there are sites offering the ability to serve LLM requests for you for free.
No, the 1:1 analogy you're looking for is realizing someone has a poorly protected api.___domain.com endpoint that uploads images to their S3 bucket and then using that to host your own images in their bucket instead of paying for your own.
Gpt4free uses API vulnerabilities that ultimately proxy to OpenAI's API with someone else's OpenAI credentials so that you don't have to pay for it. That's the whole gimmick.
These API endpoints aren't public service open relays which seems to be what you're trying to claim in your analogy:
>These API endpoints aren't public service open relays which seems to be what you're trying to claim in your analogy:
The whole point of the project is that they are. It's a compilation of public, free APIs that have been found. Those issues you linked are from people who don't understand that it's expensive to run a free relay for a paid service.
No service allows you to upload to some other user's Imgur account. Services like the ones you mentioned usually provide a service and act on the user's behalf, against the user's own account.
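To make the mechanism upthread concrete: the "poorly protected endpoint" pattern boils down to a backend route that forwards any caller's prompt to OpenAI using the site's own key, with no check on who is asking. A hypothetical sketch (Flask assumed; the route and names are invented, though the OpenAI URL is the real REST endpoint):

```python
# Sketch of the vulnerable pattern: an unauthenticated proxy route.
import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OPENAI_KEY = os.environ["OPENAI_API_KEY"]  # the third party's own key


@app.route("/api/chat", methods=["POST"])
def chat():
    # Nothing here checks who the caller is, so anyone who discovers
    # this endpoint can spend the site's OpenAI credits for free.
    r = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {OPENAI_KEY}"},
        json={
            "model": "gpt-3.5-turbo",
            "messages": request.json["messages"],
        },
        timeout=60,
    )
    return jsonify(r.json())
```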
Here is an interesting poem that the repo maintainer committed as a readme, in case anyone doesn't click the link:
We got a takedown request by openai's legal team...
here is a lil poem you can read in the meantime, while I am investigating it:
A little boy sat, in his humble abode.
He tinkered and toyed with devtools galore,
And found himself curious, eager for more.
He copy-pasted requests, with glee and delight,
A personal project, to last him the night.
For educational purposes, and fun it was too,
This little boy's journey had just begun anew.
Now far away, in a tower so grand,
A big company stood, ruling the land.
Their software was mighty, their power supreme,
But they never expected this boy and his dream.
As he played with their code, they started to fret,
"What if he breaks it? What if we're upset?"
They panicked and worried, their faces turned red,
As visions of chaos danced in their head.
The CEO paced in his office so wide,
His minions all scurrying to hide.
"Who is this child?" he cried out in fear,
"Who dares to disrupt our digital sphere?"
The developers gathered, their keyboards ablaze,
To analyze the boy's mischievous ways.
They studied his project, they pored through his code,
And soon they discovered his humble abode.
"We must stop him!" they cried with a shiver,
"This little boy's making our company quiver!"
So they plotted and schemed to halt his advance,
To put an end to his digital dance.
( I did not write it )
discord: https://discord.com/gpt4free
I wonder how long until GitHub acts on the DMCA? I am not familiar with the process.
OpenAI issues DMCA to GitHub, GitHub passes it along to the user, user... has the right to ignore it and leave all of the content up and update the README with a poem?
Github first disables the repo as soon as the report is received, and then waits for a response/appeal from the user before any further action is taken.
Why isn't this repo disabled yet, then? They already appealed and... won? Or do you just have to submit the appeal and they enable your repo again no matter what, while the process plays out?
DMCA counternotice isn’t an appeal that someone has to judge. As soon as you send it to the provider, they can restore access without leaving the safe harbor (if you are infringing, you are still liable, but the host has fulfilled their safe harbor requirements.)
But this may be some other C&D, the repo owner says they got a “takedown” without mentioning DMCA; there is no reason to assume this means Github got a DMCA notice.
They put my site https://cocalc.com, which has chatgpt API integration, into this gpt4free. As a result, I had to modify https://cocalc.com to require sign in before providing the ChatGPT functionality to visitors, and I also explicitly updated our terms of service to clarify how our API can be used. I made a pull request https://github.com/xtekky/gpt4free/pull/461 to Gpt4free to have them remove cocalc. They were respectful, with some discussion back and forth, and they merged the PR. I personally don't think that Gpt4free should be taken down, so long as they respect the explicit requests of projects they proxy. They were certainly respectful with cocalc.
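The mitigation described here (requiring sign-in before exposing the ChatGPT functionality) amounts to putting an authentication check in front of the proxy route from the earlier sketch. A minimal illustration, assuming a Flask-Login stack rather than CoCalc's actual one:

```python
# Sketch: the same proxy route, gated behind a signed-in session.
import os

import requests
from flask import Flask, jsonify, request
from flask_login import LoginManager, login_required

app = Flask(__name__)
app.secret_key = os.environ["SECRET_KEY"]
login_manager = LoginManager(app)


@login_manager.user_loader
def load_user(user_id):
    return None  # look the user up in your own store; stubbed here


@app.route("/api/chat", methods=["POST"])
@login_required  # anonymous visitors get a 401 instead of free tokens
def chat():
    r = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-3.5-turbo",
            "messages": request.json["messages"],
        },
        timeout=60,
    )
    return jsonify(r.json())
```

Per-user rate limits matter too; otherwise a signed-up abuser can still burn through credits.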
Not sure what's justifying the pearl-clutching here... they've openly stated they are basically repurposing actual ChatGPT APIs from OpenAI through some "reverse engineered private APIs" -- uhh..
These analogies don't work. Sending a bunch of data to a computer and receiving a bunch of data in return is in no way analogous to physically entering private property without permission. They are not the same thing, or the same order of thing, or at all comparable.
The only "real" thing about either of those 2 cases is social conventions.
Is it entirely impossible to imagine a culture where walking unbidden into private property is very normal, but pinging someone electronically without a common understanding is an intrusion?
This was ~2 months ago, and I'm fortunate enough to have a direct contact at OpenAI who I complained to. He came back promptly and told me it was a mistake and the takedown notice was retracted. I also changed the twitter bot's logo to be purple instead of green to avoid future issues.
What Exceptions Does DMCA Section 1201 Have To Allow Reverse Engineering?
Section 1201 contains an exception for reverse engineering, as well as security research, encryption research, and the distribution of security tools, all of which may support reverse engineering. However, these exceptions are drafted very narrowly. If your research might implicate section 1201, consult a lawyer to see if you can do your work in a way that is allowed by one of the relevant exceptions or by an exemption periodically granted by the Copyright Office. The following factors are relevant to whether you are entitled to a reverse engineering, research or security exception. However, meeting any or all of these factors will not necessarily protect your work. The list is offered just to give you an idea of the kinds of things that distinguish permissible from impermissible reverse engineering:

- You lawfully obtained the right to use a computer program;
- You disclosed the information you obtained in a good faith manner that did not enable or promote copyright infringement or computer fraud;
- Your sole purpose in circumventing is identifying and analyzing parts of the program needed to achieve interoperability;
- The reverse engineering will reveal information necessary to achieve interoperability;
- Any interoperable program you created as a result of the reverse engineering is non-infringing;
- You have authorization from the owner or operator of the reverse engineered software or the protected computer system to do your research;
- You are engaged in a legitimate course of study, are employed, or are appropriately trained or experienced, in the field of encryption technology;
- You provide timely notice of your findings to the copyright owner.
On the other hand, it contains some traps that can be used to put some limits back in, such as the last lines here: (emphasis mine)
"(15) The unauthorised reproduction, translation, adaptation or transformation of the form of the code in which a copy of a computer program has been made available constitutes an infringement of the exclusive rights of the author. Nevertheless, circumstances may exist when such a reproduction of the code and translation of its form are indispensable to obtain the necessary information to achieve the interoperability of an independently created program with other programs. It has therefore to be considered that, in these limited circumstances only, performance of the acts of reproduction and translation by or on behalf of a person having a right to use a copy of the program is legitimate and compatible with fair practice and must therefore be deemed not to require the authorisation of the rightholder. An objective of this exception is to make it possible to connect all components of a computer system, including those of different manufacturers, so that they can work together. Such an exception to the author's exclusive rights may not be used in a way which prejudices the legitimate interests of the rightholder or which conflicts with a normal exploitation of the program."
You can follow our instructions to try and appease the powers that be but we deserve the right to ignore our rules and go after you anyways. We are the Law.
The bigger issue is that none of the things here is a copyright protection mechanism within the scope of the DMCA to start with, so the DMCA doesn’t even apply.
What are the chances of the conflict of interest (or lack thereof) between OpenAI/Microsoft/Github being an issue here? I'm kind of surprised they even bothered with a takedown request.
We don't even know what type of takedown request they received, my friend. There are many ways to legally request removal, and the linked page explains nothing. Who knows what's going on, based on the current information.
A lot of the comments seem to repeat the "just like how GPT is using our content we created" sort of reasoning. Whether you agree with that or not, I don't see how it justifies the bill for your usage coming out of someone else's (not OpenAI's) pocket.
Looks like it was a “paywall bypass” for GPT-3.5/GPT-4 through vulnerable third parties. DMCA forbids access control circumvention, among other things, so seems like a takedown is expected.
But isn't the DMCA about protecting copyrighted content? And the copyright to ChatGPT responses must belong to the one who asked the question, because ChatGPT is just a tool. Whoever is using the tool should own the copyright on the replies.
IANAL, not even US Person, but 17 USC ss 1201 (a)(1)(A) states: "No person shall circumvent a technological measure that effectively controls access to a work protected under this title. ...".
Is "work" defined anywhere by law or by precedents? I just genuinely don't know. It seems to me that depending on that, the OpenAI API might be considered "work" just like a copyrighted manuscript. I'd also think there must be some other laws forbidding hacking, but DMCA must have a fast track everywhere.
But ChatGPT responses are not copyrighted and cannot be copyrighted by OpenAI. They are the work of whoever asked a question, so in this case the user bypasses the protection to create its own work, not to read someone's else work.
The model is proprietary and owned by OpenAI, therefore the outputs of a proprietary process are at minimum copyrighted by OpenAI. When you sign up, you obtain a license to use that output as prescribed by the license.
No it abuses security vulnerabilities in 3rd party businesses who are using OpenAI. It doesn't get you access to OpenAI's api at OpenAI's expense. It gets you access at [vulnerable 3rd party]'s expense. Bankrupting someone using OpenAI doesn't seem to achieve much in the way of democratization of AI tools, sorry.
It's not bankrupting them, as the author is highly ethical (using only big companies' open APIs, and removing the small ones and any that ask him to be removed).
But as for your comment, I see it rather as an opportunity to make it opt-in only, by the companies themselves.
That way it would actually be a win-win situation for them, for marketing and ads (with a lower price).
Only stealing from people who haven't asked you nicely to stop doesn't scream "highly ethical" to me
Security researchers put a lot of emphasis on responsibly disclosing vulnerabilities. The maintainers of this project could have easily done the same, but they didn't
Of course! If this was opt-in, then the only problem would be between OpenAI and the service providers to decide whether that's an allowable use of OpenAI's APIs, based on the terms of service and whatnot.
It is literally GPT4. This is not the similarly named open source LLM "GPT4All."
GPT4Free is an API reverse-engineering and proxy project which exposes an API to use GPT4 by proxy, through GPT4-based services like the search engine Phind.
Essentially, you are using the reverse-engineered services' OpenAI credits to access GPT4, instead of using your own OpenAI account.
Then these services need to model their threats with more sophistication. The existence of this project indicates there are security vulnerabilities in services that use OpenAI. In any case, I maintain that increased attention to this topic is a good thing rather than a bad thing, which is contrary to what GGP was suggesting when they referenced the Streisand effect.
To be honest, I don't think they care about it being an alternative.
Just that people keep obnoxiously naming their projects after them for visibility.
Imo this is exactly how this kind of polite takedown should be used. If it highlighted the project to you, great: at least you know for sure it's not OpenAI's product.
I suppose, since the substring "gpt4" is there. But, "generative pre-trained transformer -> gpt" seems fair game. Other companies use that acronym, so there's "EinsteinGPT", "BloombergGPT", etc.
They should just call it FreeGPT; that's what FreeBSD did. So did FreeNAS, FreePascal and FreeType.
But it's clear and obvious to me that they saw GPT2, then GPT3, and thought: well, let's pun on it with GPT4.
First line of the wiki you linked:
> Generative pre-trained transformers (GPT) are a family of large language models (LLMs),[1][2] which was introduced in 2018 by the American artificial intelligence organization *OpenAI*
Digression, but why do they call it "pre-trained"? Don't they train it from scratch? Or is the point that they pretrain it and it's intended only for downstream fine-tuning on specific tasks? If so, does ChatGPT use a fine-tuned version? Is the non-fine-tuned version good for anything on its own?
You are correct. This is not the similarly named open source LLM "GPT4All."
GPT4Free is an API reverse-engineering and proxy project which exposes an API to use the real GPT4 by proxy, through GPT4-based services like the search engine Phind.
Yep. It's an affinity scam. It has nothing to do with GPT4. IIRC it's just some model with a GPT-like interface offered for free. Once they change their name to something that isn't trying to catch GPT4's sails in a completely dishonest and scummy manner, we can discuss it further.
To the CrabLang folks: this is why you care about trademarks, so that when someone does this you can protect your project from scammers.
EDIT I might be conflating GPT4all with this… which doesn't make the situation any better and kinda proves my point. This type of scam is confusing and deceptive. And this one seems actively malignant.
Yes but often FOSS projects and their developers do not have the money or desire to: enforce any trademark or license, apply for the trademark itself, or market the trademark in any meaningful way.