Immediately after a page transition it works with CSR. Reload your browser to see how it behaves.
Isn't the whole reason for chain-of-thought that the tokens sort of are the reasoning process?
Yes, there is more internal state in the model's hidden layers while it predicts the next token - but that information is gone at the end of that prediction pass. The information that is kept "between one token and the next" is really only the tokens themselves, right? So in that sense, the OP would be wrong.
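To make that concrete, here's a toy sketch (nothing to do with any real model's implementation): an autoregressive decode loop where the only state carried from one step to the next is the token sequence itself, with all "hidden layer" activity living and dying inside each prediction call.

```python
# Toy autoregressive decoding loop (illustrative sketch, not a real model).
# The only state carried between steps is `tokens`: the fake "hidden state"
# exists only inside predict_next() and is discarded after each call.

def predict_next(tokens):
    # Stand-in for a forward pass; internal state vanishes on return.
    hidden_state = hash(tuple(tokens))  # ephemeral internal state
    return hidden_state % 50            # pretend next-token id

def generate(prompt_tokens, n_steps):
    tokens = list(prompt_tokens)
    for _ in range(n_steps):
        # Nothing from the previous call survives except the tokens.
        tokens.append(predict_next(tokens))
    return tokens

print(generate([1, 2, 3], 4))
```

(A real implementation caches key/value activations per token for speed, but those are a deterministic function of the tokens, so the point stands: the tokens are the carried information.)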
Of course, we don't know what kind of information the model encodes in its specific token choices - i.e. the tokens might not mean to the model what they mean to us.
I have no problem with a system presenting a reasonable argument leading to a production/solution, even if that is materially not what happened in the generation process.
I'd go even further and posit that requiring the "explanation" to be not just congruent with but identical to the production process would probably lead either to incomprehensible justifications or to severely limited production systems.
CoT improves results, sure. And part of that is probably because you are telling the LLM to add more material to the context window, which increases the chance of resolving some syllogism latent in the training data: one inference cycle tells you that "man" has something to do with "mortal" and "Socrates" has something to do with "man"; two cycles put both of those into the context window, letting you get statistically closer to "Socrates" having something to do with "mortal". But given that the training/RLHF for CoT revolves around generating long chains of human-readable "steps", the chain can't really be explanatory of a process that is essentially statistical.
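The two-cycle point can be sketched with a toy example (the associations and the one-hop lookup are entirely hypothetical, not how an LLM stores facts): each "cycle" surfaces one association into the context, and the bridge only becomes reachable once both intermediate facts sit in the context together.

```python
# Toy illustration of the two-step syllogism point: each "cycle" appends
# one association to the context. "Socrates -> mortal" is never a direct
# association; it only falls out after two cycles have surfaced both
# intermediate facts. (Hypothetical associations, purely illustrative.)

ASSOCIATIONS = {
    "Socrates": "man",
    "man": "mortal",
}

def one_cycle(context):
    # Surface the one-hop association for the most recent term, if any.
    last = context[-1]
    if last in ASSOCIATIONS:
        context.append(ASSOCIATIONS[last])
    return context

context = ["Socrates"]
one_cycle(context)  # surfaces "man"
one_cycle(context)  # surfaces "mortal"
print(context)      # -> ['Socrates', 'man', 'mortal']
```

One pass gets you "Socrates -> man"; only with the intermediate token in context does the second pass reach "mortal".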