I tried DeepSeek, and OpenAI is still better because of all the human feedback it has gotten. Both can produce somewhat usable output, but the benchmarks are all artificial. There's a difference between running your product against STEM and math tests and running it against real users.