10 hours ago · 5 points (+0/-0/+5 score on mirror) · 2 children
Yeah it's pretty disappointing.
But to be fair, the quality of the output depends on the quality of the input (prompts), so if people give it 5-15 words, the answer will have to be very generic.
Also, normally, you can continue conversations and question "socioeconomic factors" for example. Even things like how it would evaluate the weighting of genetics vs other factors in determining their behavior or other aspects of their existence could be asked. Or even asking for estimations, chances, ratings, etc.
I hadn't even considered it before I asked it a few weeks ago, but the vast majority of people provide too little context in their prompts and use it like a search engine. I write a lot by default, which yields much higher-quality results, and I also pay attention to the context of a given conversation.
Still, even when used at its best, its actual quality on work-related topics is questionable, though "trivial" topics like this are totally fine.
9 hours ago · 3 points (+0/-0/+3 score on mirror) · 1 child
Agreed 100%. Context is everything, especially with coding. You can ask it a question and it will give you exactly what you asked for; it will not infer a better or cleaner way to do it unless you literally hand-hold it.
I'm at the point where it's a serious toss-up between just doing it myself or writing a quality prompt that won't require a long conversation where I fix its misunderstanding of the broader context in follow-up prompts.
For me, I wanted something simple, so I asked it to just give me the instructions to get it done ASAP. It took me a whole day of back-and-forth with the LLM to figure out why it didn't work. If I'd watched a 20-30 minute tutorial instead, I'd have understood it fully and that issue wouldn't have occurred.
10 hours ago · 2 points (+0/-0/+2 score on mirror) · 2 children
Having toyed with ChatGPT and some other image generators, I figured out rather quickly that it's very much a GIGO (garbage in, garbage out) operation and works better with more information. The caveat is that there *is* a "too much information" point, where it winds up giving lower-quality results due to too many variables, too wide a scope, signal-to-noise issues, or details being "forgotten".
People can literally ask meta-questions. The limit, as with programming, is the user himself. I consider it to act like a mirror plus a force multiplier - or as you said, "GIGO."
But still, when it comes to work, it's more often disappointing for me than not. When it comes to inconsequential things like this topic, it's good.
Yes, of course there is a "too much." But we are talking about the vast majority of users going with the bottom-of-the-barrel minimum.
The LLM inevitably has to speculate about what the user wants and provide a generic answer. I just tested it, so let me show you the results from Gemini (fast mode). I used two similar questions, one with no context and one with more context:
---
What is the Dunning-Kruger effect?
> The Dunning-Kruger effect is a cognitive bias where individuals with low ability or knowledge in a specific area overestimate their own competence. This occurs because the lack of expertise required to perform a task also deprives them of the "metacognitive" ability to recognize their own errors. Conversely, highly competent individuals often suffer from "impostor syndrome," assuming that because a task is easy for them, it must be equally easy for others. Essentially, it is the inability to accurately judge one's own level of skill.
---
I recently heard criticism of the Dunning-Kruger effect. What I heard is that it became popular despite using fallacious methods, and is essentially a non-reproducible experiment. For example, there is the popular graph with "mount stupid", but that is said to be a statistical artifact resulting from an a/b formula, which gives the lower end much more extreme values than the upper end. Someone who thought he'd get a 20 but instead got a 10 lands at 2.0 on that graph, meaning 100% too confident, whereas someone who scored 80 but thought he was the world's best (i.e. a score of 100) would land at 1.25, meaning 25% too confident. That is clearly absurd, because the former is just someone with little knowledge giving a timid estimate in the lower range, while the latter is completely delusional. Also, the Dunning-Kruger effect was originally based on psychology students who had to estimate the scores of their own tests. Framing this as "confidence" and "competence" is also misleading. "Confidence" indicates not only a prediction but also the level of conviction that it is true, which is clearly not measured. "Competence" is likewise a subjective criterion that here is reduced to what a test environment can measure. In real life there are many factors in competency that cannot be measured, like how well an applicant for a position can sell himself and communicate; there are studies showing that higher attractiveness and greater height have a positive influence there. I've seen the two graphs the original Dunning-Kruger study actually produced, and they were two roughly linear lines. It raises the question of whether it wasn't the popularity of the "mount stupid" graph that made the effect famous, even though that graph is misleading. And given its popularity, all sorts of ancient wisdom has been attributed to it, even simple ideas like the fact that one cannot evaluate one's own competence well while that competence is still low (like a beginner).
Given this context, how would you describe what the Dunning-Kruger effect is?
> Response: Modern critiques describe the Dunning-Kruger effect less as a psychological phenomenon and more as a statistical artifact known as "regression toward the mean." Critics argue that the original 1999 study’s findings result from the mathematical nature of plotting actual vs. estimated scores; people at the bottom have more room to overestimate, while those at the top have more room to underestimate.
> The "Mount Stupid" curve you mentioned is largely a digital-age fabrication—the original data actually showed two nearly linear, upward-sloping lines. Mathematically, if you subtract a random variable (score) from a fixed estimate, you naturally produce the "overestimation" observed in low performers. Ultimately, the effect may simply be a byproduct of human noise in self-assessment rather than a specific cognitive deficit in the "incompetent."
---
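As an aside, the ratio arithmetic described in that second prompt is easy to check yourself. Here's a minimal Python sketch; the numbers are the hypothetical ones from my prompt, not data from the actual 1999 study, and the simulation is just an illustration of how purely random self-estimates already produce the "low scorers overestimate" pattern via regression toward the mean:

```python
import random

# The two hypothetical cases from the prompt: expected score / actual score.
# A ratio-based plot makes the timid low scorer look far more overconfident.
timid_low_scorer = 20 / 10        # expected 20, scored 10 -> 2.0 ("100% too confident")
delusional_high_scorer = 100 / 80  # expected 100, scored 80 -> 1.25 ("25% too confident")
print(timid_low_scorer, delusional_high_scorer)

# Regression-toward-the-mean illustration: even if everyone guesses their score
# completely at random, the bottom quartile by actual score will on average
# appear to "overestimate" and the top quartile will appear to "underestimate".
random.seed(0)
people = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(10_000)]
people.sort(key=lambda p: p[0])  # sort by actual score

quartile = len(people) // 4
bottom, top = people[:quartile], people[-quartile:]
avg_gap = lambda group: sum(est - actual for actual, est in group) / len(group)
print(f"bottom quartile: estimate - actual = {avg_gap(bottom):+.1f}")
print(f"top quartile:    estimate - actual = {avg_gap(top):+.1f}")
```

Running this prints a large positive gap for the bottom quartile and a large negative gap for the top quartile, with zero cognitive bias anywhere in the model.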
I also asked it about how it responded and what people would conclude from each answer:
> The contrast between your two prompts highlights a fundamental trait of LLMs: we are designed to be helpful, which often means mirroring the consensus or "common knowledge" found in our training data when a prompt is brief.
> *User Conclusions based on the Two Responses* - From the First Response: A reader would likely conclude that the effect is an established psychological fact. Because the prompt was "What is it?", the model provided the standard textbook definition. This reinforces the status quo because there was no "friction" in the prompt to trigger a deeper dive into the methodology.
> From the Second Response: A reader would conclude that the effect is scientifically shaky and often misused. By introducing the "Mount Stupid" myth and the statistical artifact argument, you shifted the model from "definition mode" to "critical analysis mode."
> *The "Gap-Filling" Trap* - You're exactly right about speculating to fill gaps. In the first prompt, I "filled the gap" by assuming you wanted the most popular interpretation. In the second, I "filled the gap" by assuming you were looking for a technical validation of the modern critique.
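If anyone wants to reproduce this short-vs-detailed-prompt comparison programmatically rather than in the chat UI, a minimal sketch using the google-generativeai Python client might look like the following. The model name, API-key handling, and the shortened "detailed" prompt are assumptions for illustration; I ran my own test through the regular Gemini interface, not the API:

```python
import os
import google.generativeai as genai

# Assumes an API key in the environment; the model name below is illustrative.
genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

short_prompt = "What is the Dunning-Kruger effect?"
detailed_prompt = (
    "I recently heard criticism of the Dunning-Kruger effect: that the famous "
    "'mount stupid' curve is a statistical artifact, that the original data were "
    "two nearly linear graphs, and that 'confidence' and 'competence' were never "
    "really measured. Given this context, how would you describe the effect?"
)

for label, prompt in [("no context", short_prompt), ("with context", detailed_prompt)]:
    response = model.generate_content(prompt)
    print(f"--- {label} ---\n{response.text}\n")
```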
>Claims that Jews own or control media are widely recognized as an antisemitic conspiracy theory with historical roots in forgeries like The Protocols of the Elders of Zion.1•2 This trope has been repeatedly debunked by fact-checkers and media analysts who find no evidence of centralized Jewish control over media outlets.1•3•4
>
>While some Jewish individuals hold prominent positions in media companies, this does not equate to coordinated control or majority ownership.2•3 Media ownership is diffuse, dominated by publicly traded conglomerates and diverse investors rather than any single ethnic or religious group.3•5
>
>Analyses of major U.S. media companies show that while there are Jewish executives at some outlets, there is no evidence of coordinated Jewish ownership or control.3•6 For example, none of the major television news operations—Fox News, CNN, MSNBC, ABC News, CBS News, or NBC News—are currently headed by Jewish executives.6
>
>The persistence of this myth contributes to antisemitism by conflating individual success or representation with conspiratorial control, which scholars and civil society organizations identify as a harmful stereotype.1•7•8 Media landscapes are shaped by complex economic, regulatory, and technological forces rather than by any single ethnoreligious group's control.1•3
It contradicts itself. Also never ask what books they burned just like never ask why they are arrested more, and why they’re given longer sentences, which as of late(last 10 years or so) is obviously bullshit.
>”Oy vey, it’s socioeconomic factors! It’s systemic racism!”
Back to the drawing board.
They're all just fancy search engines.