It seems mostly to posit that AI will improve quickly once it’s unleashed on an environment that lets it judge its own reinforcement learning feedback, rather than relying on the mostly text-based human responses used up to this point.
My opinion is that there are probably a few companies capable of building this, but the jury is still out on whether they will succeed meaningfully, and I doubt it will keep using LLMs. Half the time I find Cursor agents work themselves into error loops, though I realize Cursor is a consumer product that is likely losing money, not a research project sponsored by companies with massive capital.