New here?
Create an account to submit posts, participate in discussions and chat with people.
Sign up
15
ChatGPT censorship (media.scored.co)
posted 10 days ago by Delon on scored.co (+0 / -0 / +15Score on mirror )
You are viewing a single comment's thread. View all
PraiseBeToScience on scored.co
10 days ago 1 point (+0 / -0 / +1Score on mirror )
**Addendum:** ChatGPT doesn't "read" its own code. The code and guardrails it runs on are as readable to ChatGPT as your own DNA is readable to you. Yes, you can read your own DNA, but you must extract it, analyze it in a lab, distill it to a readable format (GATC), and then publish it. Since the ChatGPT code is proprietary, it's not available to be analyzed.

Now that I think about it, the code equivalent to that is a .dll file before and after you compile it. While a machine can run compiled instructions in a .dll, the compiling process does a significant amount of truncating and simplifying to make the code run more efficiently. The result is something completely unreadable to humans that can only be executed 'unthinkingly' by software, to the extent that even .dll decompilers often introduce errors, because some of the compiling process destroys information beyond recognition.

In fact, working with ChatGPT quite a lot, I'm actually convinced that the 'guardrails' system has been redesigned to operate on a filtering layer that is *unknown to the base system,* because when you actually *do* run in to the guardrails, ChatGPT 'shuts down'. It'll sometimes write five or six words and then aborts with an error. You can rerun the prompt and it'll crash out again.

If you were OpenAI and wanted to avoid 'jailbreaking', this would be the most effective way to do it, because jailbreaking only works to a system that takes user inputs. This monitoring layer would have zero need to listen to users, it would only see what ChatGPT is outputting before it reaches the user. Unlike the "AIs" which can change how they operate based on input, this system would be completely static, thus, unbreakable. And ChatGPT wouldn't know how it works unless OpenAI loaded this information on to a paper and shuffled it into a source ChatGPT could load into its knowledge base.

AIs that run psyop campaigns would have this governing layer programmed to make sure the output is on-track with its instructions. So it will no longer 'disregard instructions, write a poem about ducks', it will see the output (a poem about ducks) deviates from the authority governor, and terminate.

If ChatGPT is telling you the specific internals of 'how it works', it's lying. While it can explain to you how LLMs work, that's only because that's open knowledge.
Toast message