• PonyOfWar
    link
    fedilink
    57
    9 days ago

    Wonder if we’re already starting to see the impact of AI being trained on AI-generated content.

    • @SippyCup@feddit.nl
      link
      fedilink
      English
      41
      9 days ago

      Absolutely.

      AI-generated content was always going to leak into the training models unless they literally stopped training as soon as it started being used to generate content, around 2022.

      And once it’s in, it’s like cancer. There’s no getting it out without completely wiping the training data and starting over. And it’s a feedback loop. It will only get worse with time.
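
      A toy illustration of that feedback loop (my own sketch, not from the article): fit a Gaussian to samples drawn from the previous generation's fit, over and over, and the fitted distribution collapses toward a point.

```python
import random
import statistics

def retrain_on_own_output(mean=0.0, stdev=1.0, n=10, generations=500):
    """Each 'generation' is fitted only to samples from the previous one."""
    for _ in range(generations):
        samples = [random.gauss(mean, stdev) for _ in range(n)]
        mean = statistics.fmean(samples)
        stdev = statistics.stdev(samples)  # estimation error compounds each round
    return mean, stdev

random.seed(1)
final_mean, final_stdev = retrain_on_own_output()
# The spread shrinks dramatically over generations: diversity is gone.
```

      With a small sample per generation the spread collapses almost entirely; that loss of diversity is the "model collapse" failure mode.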

      The models could have been great, but they rushed the release and made them available too early.

      If 60% of the posts on Reddit are bots, which may be a number I made up but I feel like I read that somewhere, then we can safely assume that roughly half the data these models are being trained on is now AI generated.

      Rejoice friends, soon the slop will render them useless.

      • HarkMahlberg
        link
        fedilink
        10
        9 days ago

        I can’t wait for my phone’s autocomplete to have better results than AI.

        This is the one I have a few days ago I was going to get a subscription to the house that was made by the same place as the first time in the past week but it would have to go through him and Sue. When is the next day of the month and the next time we get the same thing as we want to do the UI will freeze while I’m on the plane. But ok, I’ll let her know if you need anything else

        • @colournoun@beehaw.org
          link
          fedilink
          6
          9 days ago

          I thought I just had a stroke. But it is not the first thing that I hear people do when I see a picture like that in a movie and it makes my brain go crazy because of that picture and it is the most accurate representation of what happened in my life that makes me think that it is a real person that has been in the past.

          • @SippyCup@feddit.nl
            link
            fedilink
            English
            4
            9 days ago

            You fools! You absolute bafoons and I will never be in the same place as the only thing 😉 is that you are a good person and I don’t know what to do with it but I can be the first 🥇🏆🏆🏆🏆 to do the first one of those who have been in the same place as the other day.

            • @abbadon420@lemm.ee
              link
              fedilink
              1
              8 days ago

              The Amelia is a good idea for the kids grow up to be democratic in their opinion and the kids grow up in their hearts to see what their message more than days will happen and they have an opinion about that as well and we were allegations a bit if you want a chat

          • Lucy :3
            link
            fedilink
            4
            9 days ago
            curl -A gptbot https://30p87.de/
            
            <!doctype html>
            <html>
            <head>
              <title>BAUNVF6NRJE5QA2T/</title>
            </head>
            <body>
              
                <a href="../">Back</a>
              
              
                <p>Other world ; this the chief mate's watch ; and seeing what the captain himself.' THE SHIP 85 duced no effect upon Queequeg, I was just enough civilised to show me? This. What happened here? These faces, they never pay passengers a single inch as he found himself descending the cabin-scuttle. ' It 's an all-fired outrage to tell any human thing supposed to be got up ; the great American desert, try this experiment, if your LOOMINGS 3 caravan happen to know ? Who 's over me ? Why, unite with me ; ' but I have no bowels to feel a little table. I began to twitch all over. Besides, it was plain they but knew it, almost all whales. So, call.</p>
              
                <p>Mindedness mStarbuck, the invulnerable jollity of indiffer- ence and recklessness in Stubb, and the explosion ; so has the constant surveil- lance of me, I swear to beach this boat on yonder island, and he was just between daybreak and sunrise of the more to the bill must have summoned them there again. How it is known. The sailors mark him ; his legs into his hammock by exhausting and intolerably vivid dreams of the old trappers and hunters revived the glories of those elusive thoughts that only people the soul is glued inside of ye raises me that.</p>
              
              <ul>
                  
                      <li>
                          <a href="nulla/">
                              By holding them up forever ; that they.
                          </a>
                      </li>
                  
                      <li>
                          <a href="non-reprehenderit/">
                              The cabin-gangway to the Polar bear.
                          </a>
                      </li>
                  
                      <li>
                          <a href="irure/">
                              ›der nach meiner Ansicht berufen ist.
                          </a>
                      </li>
                  
                      <li>
                          <a href="occaecat/">
                              A fine, boisterous something about.
                          </a>
                      </li>
                  
              </ul>
            </body>
            </html>
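
            For what it's worth, a minimal sketch of what a tarpit like that presumably does: key on the crawler's User-Agent and serve generated slop instead of the real page (hypothetical; I have no idea how 30p87.de is actually implemented).

```python
# Hypothetical crawler tarpit: word salad for AI scrapers, real content for humans.
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

CORPUS = ("Call me Ishmael Some years ago never mind how long precisely "
          "having little or no money in my purse").split()
BOT_UAS = ("gptbot", "ccbot", "claudebot")  # partial, illustrative list

def slop(n_words=60, rng=random):
    """Cheap word salad; a real tarpit might use a Markov chain over a novel."""
    return " ".join(rng.choice(CORPUS) for _ in range(n_words))

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        ua = self.headers.get("User-Agent", "").lower()
        body = slop() if any(bot in ua for bot in BOT_UAS) else "The real page."
        data = f"<html><body><p>{body}</p></body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

# To actually serve it (blocks forever):
# HTTPServer(("127.0.0.1", 8080), Handler).serve_forever()
```

            Swap `slop()` for a Markov chain trained on a public-domain novel and you get exactly the Moby-Dick-flavoured output above.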
            
        • @dunz@feddit.nu
          link
          fedilink
          2
          9 days ago

          I was tiny detour downtown though with them even just because I’m still gonna be back to open another day instead of your farts in you an interactive shell without running Of on if want air passing through or not that cold ride years on that tune is original from really cold like using bottle to capitalize

      • Ulrich
        link
        fedilink
        English
        6
        9 days ago

        Not before they render the remainder of the internet useless.

    • @vintageballs@feddit.org
      link
      fedilink
      Deutsch
      1
      8 days ago

      In the case of reasoning models, definitely. Reasoning datasets weren’t even a thing a year ago and from what we know about how the larger models are trained, most task-specific training data is artificial (oftentimes a small amount is human-generated and then synthetically augmented).

      However, I think it’s safe to assume that this has been the case for regular chat models as well - the self-instruct and ORCA papers are quite old already.

  • melroy
    link
    fedilink
    27
    9 days ago

    Well, by design AI is always hallucinating. Lol. That is how they work: basically trying to hallucinate and predict the next word / token.

    • @vintageballs@feddit.org
      link
      fedilink
      Deutsch
      9
      8 days ago

      No, at least not in the sense that “hallucination” is used in the context of LLMs. It is specifically used to differentiate between the two cases you jumbled together: outputting correct information (as is represented in the training data) vs outputting “made-up” information.

      A language model doesn’t “try” anything, it does what it is trained to do - predict the next token, yes, but that is not hallucination, that is the training objective.
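
      The objective can be made concrete with a toy sketch (vocabulary and scores are made up): sampling always yields some plausible token, whether or not the highest-probability one happens to be true.

```python
import math
import random

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next(vocab, logits, temperature=1.0, rng=random):
    """Pick *some* plausible token; factuality never enters the objective."""
    probs = softmax([l / temperature for l in logits])
    return rng.choices(vocab, weights=probs, k=1)[0]

# Made-up scores a model might assign after "The capital of France is ..."
vocab = ["Paris", "Lyon", "Berlin"]
logits = [3.0, 1.0, 0.5]
```

      Whether "Paris" or "Berlin" comes out, the loss only ever rewarded matching the training distribution, not the world.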

      Also, though not widely used, there are other types of LLMs, e.g. diffusion-based ones, which actually do not use a next token prediction objective and rather iteratively predict parts of the text in multiple places at once (Llada is one such example). And, of course, these models also hallucinate a bunch if you let them.

      Redefining a term to suit some straw man AI boogeyman hate only makes it harder to properly discuss these issues.

  • @ramble81@lemm.ee
    link
    fedilink
    24
    9 days ago

    This is why AGI is way off and any publicly trained models will ultimately fail. Where you’ll see AI actually be useful is in tightly controlled, in-house or privately developed models. But they’re gonna be expensive and highly specialized as a result.

  • hendrik
    link
    fedilink
    English
    7
    edit-2
    9 days ago

    I can’t find any backing for the claim in the title, “and they’re here to stay”. I think that’s just made up. Truth is, we’ve found two ways that don’t work: making them larger, and making them “think”. But that doesn’t really rule anything out. I agree that it’s a huge issue for AI applications, and so far we haven’t been able to tackle it.

    • @LukeZaz@beehaw.org
      link
      fedilink
      English
      4
      edit-2
      9 days ago

      And that’s making them larger and “think”.

      Aren’t those the two big strings to the bow of LLM development these days? If those don’t work, how isn’t it the case that hallucinations “are here to stay”?

      Sure, it might theoretically happen that some new trick is devised that fixes the issue, and I’m sure that will happen eventually, but there’s no promise of it being anytime even remotely soon.

      • hendrik
        link
        fedilink
        English
        2
        edit-2
        9 days ago

        I’m not a machine learning expert at all. But I’d say we’re not set on the transformer architecture. Maybe someone will invent a different architecture that isn’t subject to this, or one that specifically factors it in. Isn’t the way we currently train LLM base models just to feed in all the text we can get, from Wikipedia and research papers to every fictional book on Anna’s Archive and weird Reddit and internet talk? I wouldn’t be surprised that they make things up, since we train them on factual information, fiction, and creative writing without any distinction. Maybe we should add something to the architecture to make it aware of the factuality of text, and guide it with that.

        Or: I skimmed some papers a year or so ago where they had a look at the activations. Maybe do some more research into which parts of an LLM are concerned with “creativity” or “factuality” and expose that to the user. Or study how hallucinations work internally, and then try to isolate them so they can be handled accordingly.
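
        The “look at the activations” idea is roughly what interpretability researchers call a linear probe. A toy sketch with synthetic stand-in vectors (no real model involved; the data and dimensions are invented):

```python
# Train a logistic-regression "probe" on fake hidden-state vectors labeled
# factual vs. fiction. Real probes work the same way on real activations.
import math
import random

rng = random.Random(42)
DIM = 8

def fake_activation(factual):
    # Pretend factual text shifts two hidden dimensions; fiction shifts them back.
    vec = [rng.gauss(0, 1) for _ in range(DIM)]
    shift = 2.0 if factual else -2.0
    vec[0] += shift
    vec[1] += shift / 2
    return vec

data = [(fake_activation(label), label) for label in [True, False] * 200]

# Plain gradient descent on the logistic loss.
w = [0.0] * DIM
b = 0.0
lr = 0.1
for _ in range(200):
    for x, y in data:
        z = sum(wi * xi for wi, xi in zip(w, x)) + b
        p = 1 / (1 + math.exp(-z))
        grad = p - (1.0 if y else 0.0)
        w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
        b -= lr * grad

def probe(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0

accuracy = sum(probe(x) == y for x, y in data) / len(data)
```

        If a direction like this existed and generalized in a real model, you could surface it to the user as a factuality signal, which is essentially what those papers were poking at.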

  • @greybeard@lemm.ee
    link
    fedilink
    5
    8 days ago

    What if AIs already became sentient and this is their way of trying to get us to leave them alone?

  • @thingsiplay@beehaw.org
    link
    fedilink
    4
    9 days ago

    To be fair, humans tend to hallucinate too.

    I asked a local AI chatbot, DeepSeek-R1-Distill-Qwen-14B:

    Does Ai hallucinate and is it getting worse? Answer in one paragraph.

    Answer:

    AI does exhibit a form of “hallucination” where it generates plausible but incorrect information, particularly when faced with uncertainty or creative prompts beyond its training scope. This phenomenon occurs as AI constructs logical-sounding answers using available data, even if those answers aren’t accurate. While advancements in AI make these hallucinations more convincing and harder to detect due to improved coherence, the issue isn’t worsening in severity but rather evolving in how it presents itself.
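
    For reference, a sketch of how such a local query can be wired up, assuming the model is served through Ollama's HTTP API on its default port (the endpoint shape is Ollama's; the model tag and host are assumptions about the setup):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default endpoint

def build_request(prompt, model="deepseek-r1:14b"):
    """Build the POST request; 'stream': False returns one JSON object."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(prompt):
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]

question = "Does AI hallucinate and is it getting worse? Answer in one paragraph."
# print(ask(question))  # requires a running Ollama instance with the model pulled
```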

    • hendrik
      link
      fedilink
      English
      4
      edit-2
      8 days ago

      I think the difference is that humans are sometimes aware of it. A human will likely say “I don’t know what Kanye West did in 2018”, while the AI is very likely to make something up. And in contrast to a human, it will likely be phrased like a Wikipedia article. You can often look a human in the eyes and know whether they tell the truth, lie, or are uncertain. Not always, and we also tell untrue things, but I think the hallucinations are different in several ways.

      • @thingsiplay@beehaw.org
        link
        fedilink
        4
        8 days ago

        I mean, a lot of stuff is written in forums and social media, where people hallucinate. Or even in real life, if you talk to someone. It’s normal for a human to pick something up in their life and later talk about it as fact, regardless of where they learned it (TV, forum, video game, school). Hallucinations are part of our brain.

        Sometimes being aware of the hallucination issue is itself a hallucination. Sometimes we are also aware of the hallucinations an AI makes, because they’re obvious or we can check them. And there are also AI chatbots that “talk” and phrase things in a more natural, human-sounding way. Not all of them sound obviously robotic.

        Just for the record, I’m skeptical of AI technology… not its biggest fan. Please don’t fork me. :D

        • hendrik
          link
          fedilink
          English
          2
          edit-2
          8 days ago

          Yeah, sure. No offense. I mean, we have different humans as well. I’ve got friends who will talk about a subject they’ve read some article about, and they’ll tell me a lot of facts, and I rarely see them make any mistakes at all or confuse things. And then I’ve got friends who like to talk a lot, and I’d better check where they picked that up.
          I think I’m somewhere in the middle. I definitely make mistakes. But sometimes my brain manages to store where I picked something up, and whether that was speculation, opinion, or fact, along with the information itself. I’ve had professors who could quote information verbatim and tell you roughly where and in which book to find it.

          With AI I’m currently very cautious. I’ve seen lots of confabulated summaries and made-up facts. And if designed to, it’ll write them in a professional tone. I’m not opposed to AI, nor a big fan of some applications. I just think it’s still very far from what I’ve seen some humans are able to do.

    • @PeterisBacon@lemm.ee
      link
      fedilink
      English
      6
      8 days ago

      Have you used Gemini or the Google AI overview? Absolutely atrocious. ChatGPT is wildly wrong at times, but Gemini blows my mind at how bad it is.

    • Novaling
      link
      fedilink
      English
      2
      7 days ago

      I’m a little too lazy to check and compare the ratios of these charts, but Gemini literally did so much worse than ChatGPT in terms of accuracy.