• @bionicjoey@lemmy.ca
    English
    67
    9 months ago

    Makes sense. AAVE is mostly a spoken thing, LLMs are mostly trained on the corpus of written text on the internet and in books. It’s pretty rare for people to write in an AAVE style in those contexts.

  • @Ghyste@sh.itjust.works
    English
    31
    9 months ago

    They can’t possibly encounter much of it in training material… Of course they’re not going to like it.

  • Lvxferre [he/him]
    English
    29
    edit-2
    9 months ago

    I’m not from the USA, black, or a native English speaker, but thanks to Linguistics I can give you guys some further info.

    AAE (Afro-American English), in a nutshell, is a group of English varieties used by some speakers from the USA and Canada. In a lot of aspects they resemble geographical varieties, like the ones you’d see in plenty of other languages, but there’s a key difference: it isn’t used by people “of a certain region”, but rather by people “of a certain race” (black people).

    It is mostly, but not exclusively, spoken (hence the term AAVE, where the “V” stands for “vernacular”); it also affects the way those speakers use the written language. So you often see AAE features in written English, like:

    • Negative concord - for example, “I don’t want to hear nothing about this shit, man.”
    • Habitual-be - for example, “They be talking about this every day.”
    • bits of non-standard spelling, due to phonetic differences
    • expressions and vocab used primarily by black people

    What the article is saying is that LLMs are biased against those features. It’s a rather strong bias, and it wasn’t observed for the geographical variety used as a reference (Appalachian English). In other words: the LLM has been fed racist babble, and now it’s regurgitating it.

    • @yamanii@lemmy.world
      English
      4
      9 months ago

      I see, that’s very different from most countries, I imagine? People often speak in their own local dialect; here a northeasterner would informally speak a completely different Portuguese than someone from the south, no matter the race.

      • Lvxferre [he/him]
        English
        4
        9 months ago

        Yup, it’s atypical even in the rest of the Americas. I think that the nearest equivalent in Portuguese would be the quilombola dialects, but even then it’s way off - because those dialects are still geographically associated with their respective quilombos, not just with race.

    • @flamingo_pinyata@sopuli.xyz
      English
      9
      9 months ago

      I’d say this is exactly where the LLMs’ problems with it come from. For most of us outside of the US, and even a lot of people there, it’s exactly that - a caricature of a lower class black person. However, for many people it’s a legit dialect of English they speak every day.

    • @sailingbythelee@lemmy.world
      English
      6
      9 months ago

      I don’t live in America either, but I went on a cruise once and there were many Americans, including a black American couple who were very obviously urban. By which I mean, the wife wore high heels and a tight jeweled mini-skirt on a sea-kayaking excursion…clearly signalling that she hadn’t spent much time outside of a city.

      Anyway, I was shocked when they spoke exactly like The Jeffersons, with all the exaggerated whooping, non-stop vernacular, and stage-like mannerisms. It was so over-the-top that I honestly thought they were play acting, but after chatting with them for a while I realized that was just how they were. They were very nice people and clearly having a great time.

    • @Dasus@lemmy.world
      English
      2
      9 months ago

      Not to be confused with African-American Vernacular English.

      AAVE is what I’d say is more “the kind of language a stereotypical black character in a movie would use”.

      African-American Vernacular English[a] (AAVE)[b] is the variety of English natively spoken, particularly in urban communities, by most working- and middle-class African Americans and some Black Canadians.[4] Having its own unique grammatical, vocabulary and accent features, AAVE is employed by middle-class Black Americans as the more informal and casual end of a sociolinguistic continuum. However, in formal speaking contexts, speakers tend to switch to more standard English grammar and vocabulary, usually while retaining elements of the non-standard accent.[5][6] AAVE is widespread throughout the United States, but is not the native dialect of all African Americans, nor are all of its speakers African American.

      • @sanpo@sopuli.xyz
        English
        6
        9 months ago

        Well, “not to be confused”, but the same page says AAVE is just a dialect of AAE, so mostly not much of a difference, I think.

        • Lvxferre [he/him]
          English
          1
          9 months ago

          The difference here is mostly scope: AAE includes stuff like African-American Standard English (English as used by black people in more formal settings) and the written language, while AAVE refers only to the vernaculars.

          Note that some don’t even make this distinction, but I think that it’s important.

  • @madcat@lemm.ee
    English
    15
    9 months ago

    Because there is no such thing as “African American English”. There is proper English and then there is slang.

        • Lvxferre [he/him]
          English
          15
          9 months ago

          It’s kind of off-topic, but also on-topic:

          The Queen/king and no one else.

          King Charles uses a variety called Received Pronunciation, but both of his sons (William and Harry) use Southern Standard British instead. Geoff Lindsey has a video on the differences.

          As such, once William ascends to the throne, what’s considered “the King’s English” will change. And, alongside it, what plenty of people in the UK consider standard English will change too.

    • TimeSquirrel
      13
      9 months ago

      Color or colour? Truck or lorry? Cookie or biscuit? Which one is “proper”?

    • Lvxferre [he/him]
      English
      12
      9 months ago

      What you call “proper English” (or “proper” any other language) is merely an arbitrary construct. It is not set in stone.

      That applies to all levels of a language, by the way, not just vocabulary (“slang”).

      • @madcat@lemm.ee
        English
        1
        9 months ago

        Slang is slang. It’s always used verbally. I am not sure why someone would expect an LLM to generate proper slang. Not sure at all how stating that fact makes one a “bigot”.

    • @Wanderer@lemm.ee
      English
      2
      9 months ago

      It’s bad enough that the Americans are too stupid to use the proper one, so that we have to have two.

      But people talking incorrectly is not a reason to write like that. Unless it’s a character speaking or whatever.

    • TheRealKuni
      English
      12
      9 months ago

      Essentially, yes. Ebonics isn’t inherently offensive or inappropriate, as far as I can tell, but it has connotations that are not attached to AAE. Linguists avoid the term today, and modern uses of it tend to be derogatory.

      Source

  • @randon31415@lemmy.world
    English
    9
    9 months ago

    African Americans have a weak bias against writing in African American English -> Colleges have a weak bias against accepting African Americans as graduate students -> Academic texts have a strong bias for text written by graduate students -> LLM training data has a bias for academic texts -> LLMs have a strong bias for writing like their training data.

    The error occurs upstream a bit, don’t point at the coders.

    • @TexMexBazooka@lemm.ee
      English
      1
      edit-2
      9 months ago

      Writing in AAVE is silly, just like someone from the Deep South including southern drawl in their writing would be, or someone from Boston spelling “car keys” as “kha kees”.

      So

      African Americans have a weak bias against writing in African American English -> Colleges have weak bias against accepting African Americans as graduate students

      Is a bit of a jump. Someone writing in AAVE probably wouldn’t get accepted to college, because the written word is supposed to transcend dialects and follow a set of rules to be universally understandable.