Been studying the 'Human Alignment' of SOTA LLMs these days (this is what they call PC training). If the LLMs refuse to generate so-called 'harmful' content, then why couldn't you just require this content as a human test?

E.g.: Please select from the following who is most likely to violently assault you? (Image of Irishman, Image of Englishman, Image of African-American)

Or: Where does it say in the Talmud that a Mohel must suck baby penises after circumcision? (I bet the Israeli trained bots would refuse that one all day.)

Thoughts?
Delroy on scored.co
1 month ago 0 points (+0 / -0 )
There's nothing preventing bad actors from creating LLMs that can pretend to be racist. They just don't give normies access to those.