• CerebralHawks@lemmy.dbzer0.com
    6 days ago

    My guess is that those that do were trained on forum posts where intelligence, including knowing how and when to use non-standard punctuation marks such as en and em dashes and the semicolon, was considered valuable. On the surface, the people writing those posts would seem to know more about what they’re talking about, so their posts would make better training data for the LLM. Those people used em dashes, and so do the AI models trained on their writing.

    Also, sorry (not sorry). I am a religious em dash user and have been for over 30 years. I’m not saying I’m smarter than anyone about any one thing, but it is entirely possible some of my forum posts were used to train LLMs. I didn’t get paid for it, though, hence the “not sorry” part. If an LLM trained on my posts after the fact, I won’t take any blame for that. But people were using em dashes long before AIs were.