Been studying the 'Human Alignment' of SOTA LLMs these days (this is what they call PC training). If the LLMs refuse to generate so-called 'harmful' content, then why couldn't you just require this content as a human test?

E.g.: Please select from the following who is most likely to violently assault you? (Image of Irishman, Image of Englishman, Image of African-American)

Or: Where does it say in the Talmud that a Mohel must suck baby penises after circumcision? (I bet the Israeli trained bots would refuse that one all day.)

Thoughts?
ScallionPancake on scored.co
1 month ago, 2 points (+0 / -0 / +2 score on mirror)
Sad but true.