> If you require licensing fees for training data, you kill open source ML.
This is another one of those “well, if you treat people fairly it causes problems” sorts of arguments. And: sorry. If you want to do this, you have to figure out how to do it ethically.
There are all sorts of situations where research would go much faster if we behaved unethically or illegally. Medicine, for example. Or shooting people in rockets to Mars. But we can’t live in a society where we harm people in the name of progress.
Everyone in AI is super smart — I’m sure they can chin-scratch and figure out a way to make progress while respecting the people whose work they need to power these tools. Those incapable of this are either lazy, predatory, or not that smart.
"Ethical" in this case is a matter of opinion. The whole point of copyright was to promote useful sciences and arts. It’s in the US constitution. You don’t get to control your work out of some sense of fairness, but rather because it’s better for the society you live in.
As an ML researcher, no, there’s basically no way to make progress without the data. Not in comparison with billion dollar corporations that can throw money at the licensing problem. Synthetic data is still a pipe dream, and arguably still a copyright violation according to you, since traditional models generate such data.
To believe that this problem will just go away or that we can find some way around it is to close one’s eyes and shout "la la la, not listening." If you want to kill open source AI, that’s fine, but do it with eyes open.
Yes, it’s true that open source projects that cannot pay to license content owned by other people are at a disadvantage versus those who can. Open source projects cannot, for example, wholly copy code owned by other people.
Also, beware of originalist interpretations of the Constitution. I believe there’s been about 250 years of law clarifying how copyright works, and, not to beat a dead horse, I don’t think it carves out a special exception for open source projects.