• 0 Posts
  • 147 Comments
Joined 1 year ago
Cake day: June 22nd, 2023


  • The work is reproduced in full when it’s downloaded to the server used to train the AI model, and the entirety of the reproduced work is used for training. Thus, they are using the entirety of the work.

    That’s objectively false. It’s downloaded to the server, but it should never be redistributed to anyone else in full. As a developer, for instance, it’s illegal for me to copy code I find in a Medium article and use it in our software. I’m perfectly allowed to read that Medium article, learn from it, and then write my own similar code.

    And that makes it better somehow? Aereo got sued out of existence because their model threatened the retransmission fees that broadcast TV stations were being paid by cable TV subscribers. There wasn’t any devaluation of broadcasters’ previous performances, the entire harm they presented was in terms of lost revenue in the future. But hey, thanks for agreeing with me?

    And Aereo should not have lost that suit. That’s an example of the US court system abjectly failing.

    And again, LLM training so egregiously fails two out of the four factors for judging a fair use claim that it would fail the test entirely. The only difference is that OpenAI is failing it worse than other LLMs.

    That’s what we’re debating, not a given.

    It’s even more absurd to claim something that is transformative automatically qualifies for fair use.

    Fair point, but it is objectively transformative.



  • You said open source. Open source is a type of licensure.

    The entire point of licensure is legal pedantry.

    No. Open source is a concept. That concept also has pedantic legal definitions, but the concept itself is not inherently pedantic.

    And as far as your metaphor is concerned, pre-trained models are closer to pre-compiled binaries, which are expressly not considered Open Source according to the OSD.

    No, they’re not. Which is why I didn’t use that metaphor.

    A binary is explicitly a black box. There is nothing to learn from a binary, unless you explicitly decompile it back into source code.

    In this case, literally all the source code is available. Any researcher can read through their model, learn from it, copy it, twist it, and build their own version of it wholesale. Not providing the training data is more like saying that Yuzu or any other emulator isn’t open source because it doesn’t provide copyrighted games. It is providing literally all of the parts that it can open source, and then letting the user feed it whatever training data they are allowed access to.


  • LLMs use the entirety of a copyrighted work for their training, which fails the “amount and substantiality” factor.

    That factor is relative to what is reproduced, not to what is ingested. A company is allowed to scrape the web all they want as long as they don’t republish it.

    By their very nature, LLMs would significantly devalue the work of every artist, author, journalist, and publishing organization, on an industry-wide scale, which fails the “Effect upon work’s value” factor.

    I would argue that LLMs devalue the author’s potential for future work, not the original work they were trained on.

    Those two alone would be enough for any sane judge to rule that training LLMs would not qualify as fair use, but then you also have OpenAI and other commercial AI companies offering the use of these models for commercial, for-profit purposes, which also fails the “Purpose and character of the use” factor.

    Again, that’s the practice of OpenAI, but not inherent to LLMs.

    You could maybe argue that training LLMs is transformative,

    It’s honestly absurd to try and argue that they’re not transformative.




  • Making a copy is free. Making the original is not.

    Yes, exactly. Do you see how that is different from the world of physical objects and energy? That is not the case for a physical object. Even once you design something and build a factory to produce it, the first item off the line takes the same amount of resources as the last one.

    Capitalism is based on the idea that things are scarce. If I have something, you can’t have it, and if you want it, then I have to give up my thing, so we end up trading. Information does not work that way. We can freely copy a piece of information as much as we want. Which is why monopolies and capitalism are a bad system of rewarding creators. They inherently cause us to impose scarcity where there is no need for it, because in capitalism things that are abundant do not have value. Capitalism fundamentally fails to function when there is abundance of resources, which is why copyright was a dumb system for the digital age. Rather than recognize that we now live in an age of information abundance, we spend billions of dollars trying to impose artificial scarcity.


  • they did NOT predict generative AI, and their graphics cards just HAPPEN to be better situated for SOME reason.

    This is the part that’s flawed. They have actively targeted neural network applications with hardware and driver support since 2012.

    Yes, they got lucky in that generative AI turned out to be massively popular and required massively parallel computing capabilities, but luck is one part opportunity and one part preparedness. The reason they were able to capitalize is that they had the best graphics cards on the market and then specifically targeted AI applications.



  • Well it is one thing to automate a repetitive task in your job, and quite another to eliminate entire professions.

    No it is not. That is literally how those jobs are eliminated. 30 years ago CAD came out and helped to automate drafting tasks to the point that a team of 20 drafters turned into 1 or 2 drafters and eventually turned into engineers drafting their own drawings.

    What you call “menial bullshit” is the entire livelihood and profession of quite a few people, speaking of taxis for one.

    Congratulations, but looking at it through rose-coloured glasses does not change the fact that it is objectively menial bullshit.

    What are all these people going to do when taxi driving is relegated to robots?

    Find other entry-level jobs. If we eliminate *all* entry-level jobs through automation, then we will need to implement some form of basic income, as there will not be enough useful work for everyone to do. That would be a great problem to have.

    Will the state have enough cash to support them and help them upskill or whatever is needed to survive and prosper?

    Yes, the state has access to literally all of the profits from automation via taxes and redistribution.

    A technological utopia is a promise from the 1950s. Hasn’t been realized yet. Isn’t on the horizon anytime soon. Careful that in dreaming up utopias we don’t build dystopias.

    Oh wow, you’re saying that if human beings can’t create something in 70 years, then that means it’s impossible and we’ll never create it?

    Again, the only way to get to a utopia is to have all of the pieces in place, which necessitates a lot of automation and much more advanced technology than we already have. We’re only barely at the point where we can start to practice biology and medicine in a meaningful way, and that’s only because computers completely eliminated the former profession of computer.

    Be careful that you don’t keep yourself stuck in our current dystopia out of fear of change.


  • Better system for WHOM? Tech-bros that want to steal my content as their own?

    A better system for EVERYONE. One where we all have access to all creative works, rather than spending billions on engineers and lawyers to create walled gardens and DRM and artificial scarcity. What if literally all the money we spent on all of that instead went to artist royalties?

    But tech-bros that want my work to train their LLMs - they can fuck right off. There are legal thresholds that constitute “fair use” - Is it used for an academic purpose? Is it used for a non-profit use? Is the portion that is being used a small part or the whole thing? LLM software fails all of these tests.

    No. It doesn’t.

    They can literally pass all of those tests.

    You are confusing OpenAI keeping their LLM closed source and charging for access to it with LLMs in general. The open source models that Microsoft and Meta publish, for instance, pass literally all of the criteria you just stated.




  • I think that’s a huge risk, but we’ve only ever seen a single, very specific type of intelligence, our own / that of animals that are pretty closely related to us.

    Movies like Ex Machina and Her do a good job of pointing out that there is nothing that inherently means that an AI will be anything like us, even if they can appear that way or pass at tasks.

    It’s entirely possible that we could develop an AI so specifically trained that it would provide the best script editing notes but be incapable of anything else, including self-reflection or feeling loss.




  • We are human beings. The comparison is false on its face because what you all are calling AI isn’t in any conceivable way comparable to the complexity and versatility of a human mind, yet you continue to spit this lie out, over and over again, trying to play it up like it’s Data from Star Trek.

    If you fundamentally do not think that artificial intelligences can be created, the onus is on you to explain why it’s impossible to replicate the circuitry of our brains. Everything in science we’ve seen thus far has shown that we are merely physical beings that can be recreated physically.

    Otherwise, I asked you to examine a thought experiment where you are trying to build an artificial intelligence, not necessarily an LLM.

    This model isn’t “learning” anything in any way that is even remotely like how humans learn. You are deliberately simplifying the complexity of the human brain to make that comparison.

    Or you are overcomplicating yourself to seem more important and special. Definitely no way that most people would be biased towards that, is there?

    Moreover, human beings make their own choices, they aren’t actual tools.

    Oh please do go ahead and show us your proof that free will exists! Thank god you finally solved that one! I heard people were really stressing about it for a while!

    They pointed a tool at copyrighted works and told it to copy, do some math, and regurgitate it. What the AI “does” is not relevant, what the people that programmed it told it to do with that copyrighted information is what matters.

    “I don’t know how this works but it’s math and that scares me so I’ll minimize it!”