Moreover, they exhibit a counter-intuitive scaling limit: their reasoning effort increases with problem complexity up to a point, then declines despite having an adequate token budget. By comparing LRMs with their standard LLM counterparts under equivalent inference compute, we identify three performance regimes: (1) low-complexity tasks