You Got Commands in My Prompt!
Qwen Adds a Secret Menu
When you start up or prompt an LLM, you get to provide a handful of parameters that influence the way the LLM runs. We discussed some of these in our guide to getting started with local LLMs; they include settings like temperature and max tokens.
The Qwen team added a new parameter recently, with the release of their hybrid thinking Qwen 3 models. By adding an enable_thinking flag, you can toggle reasoning on or off. I love this flexibility, especially in the context of locally running models. It's handy to have both modes while only having to download and store one model.
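Concretely, the flag travels through the chat template. Here's a minimal sketch using Hugging Face transformers, following Qwen's published usage (the checkpoint name and generation settings are just placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B"  # placeholder; any Qwen 3 checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

messages = [{"role": "user", "content": "What is 17 * 24?"}]

# enable_thinking=False tells the template to suppress the reasoning
# pass, so the model answers directly instead of emitting its chain
# of thought first.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)

inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```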
But the Qwen team went a step further and added what they call a "soft switch" to control thinking: by appending "/think" or "/no_think", you can toggle thinking on the fly, as you prompt. Again, it's a big benefit for local models, letting you control this behavior straight from the prompt without fiddling with parameters or restarting the model.
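Per Qwen's usage notes, the soft switch rides along inside the user message itself, so in a multi-turn chat you can flip it per message:

```python
# No parameter changes or model restarts between turns; the switch
# is just text appended to the prompt.
messages = [
    {"role": "user", "content": "Walk me through the tradeoffs of B-trees vs. LSM-trees. /think"},
    {"role": "assistant", "content": "..."},
    {"role": "user", "content": "Now give me the one-line takeaway. /no_think"},
]
```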
I can’t stop thinking about this feature and what it says about the current state of UX for LLMs and prompting in general.
The beauty of LLMs is their natural language interface. We get to control a program by typing or speaking, without any special terms or protocols. And yet, here’s the Qwen team mixing in commands among the natural language.
Why did they fine-tune behavior around a command and not natural language?
Another team might have fine-tuned the model on many different variants of "don't reason" requests, teaching the model to recognize that humans may want to turn thinking on and off. Perhaps the Qwen team found this to perform inconsistently in practice. Perhaps it was cheaper and simpler to wire in this convenience feature using a special token or two, rather than the many possible natural language variants.
Or was it a product decision: did they recognize that toggling thinking should, in most cases, be handled by the application layer, not by a layperson?
The Qwen team has yet to share the implementation details or motivations for this choice.
Will hidden, programmatic controls in prompts catch on?
It feels so awkward to put commands in natural language. True, we've trained a subset of users that "/" commands activate special things (in chat apps, particularly), but it feels at odds with the core promise of LLMs.
But despite this awkwardness, the "/no_think" command is undeniably useful! I want more Slash Commands in my models, letting us control the behavior of a running model from the prompt. Like: /verbose, /concise, /low_think, /high_think, /creative, /code (especially language-specific ones, like /ruby), /no_emoji, /one_sentence, and /bool.
All of these can be handled in the prompt, but it can get verbose and inconsistent from model to model. Anyone who's fought with an LLM to deliver consistent JSON output or respond with only one sentence knows the terrain gets squishy.
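None of those commands exist today, but an application layer could fake them. Here's a hypothetical sketch (every command and instruction string below is invented) of a wrapper that strips slash commands from a prompt and turns them into system-prompt instructions before the model ever sees them:

```python
import re

# Hypothetical slash commands, translated into plain natural-language
# steering by an application-layer wrapper. None of these are real
# model features today.
COMMANDS = {
    "/concise": "Answer in as few words as possible.",
    "/verbose": "Answer thoroughly, with full explanations.",
    "/one_sentence": "Answer in exactly one sentence.",
    "/no_emoji": "Do not use emoji.",
    "/bool": "Answer with only 'true' or 'false'.",
}

def expand_commands(prompt: str) -> tuple[str, str]:
    """Strip known slash commands from a prompt and return
    (cleaned_prompt, extra_system_instructions)."""
    instructions = []
    for cmd, instruction in COMMANDS.items():
        pattern = rf"(?<!\S){re.escape(cmd)}(?!\S)"
        if re.search(pattern, prompt):
            instructions.append(instruction)
            prompt = re.sub(pattern, "", prompt)
    # Collapse the whitespace left behind by removed commands.
    return " ".join(prompt.split()), " ".join(instructions)

cleaned, system_extra = expand_commands("Is 91 prime? /bool /no_emoji")
# cleaned == "Is 91 prime?"
# system_extra == "Do not use emoji. Answer with only 'true' or 'false'."
```

Of course, this wrapper is just the verbose prompt approach with a nicer interface; a switch fine-tuned into the model, like Qwen's, is what would make the behavior consistent.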
For large frontier models and their companies' platforms, these slash commands should (and often do) live in the UI.
But as you get beyond the most used commands (toggling thinking, for example), they’re more likely to get left off the screen to prevent user confusion. For the more niche but valuable commands, perhaps there is a need here. A UX model akin to an In-N-Out Secret Menu…
For local or other self-hosted models, the argument is likely that these things could live in startup or inference parameters. That's true, but as these models keep getting better (see: Qwen 3), I see more and more people running them continuously, where these soft switches would be useful.
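For example, if the model is served behind an OpenAI-compatible endpoint (vLLM and similar servers), Qwen's docs show the hard switch being passed per request rather than at startup. A sketch, assuming a local server at the default port and a Qwen 3 checkpoint:

```python
from openai import OpenAI

# Assumes a local OpenAI-compatible server (e.g., vLLM) hosting Qwen 3.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Qwen/Qwen3-4B",
    messages=[{"role": "user", "content": "Summarize this in one line: ..."}],
    # Per-request toggle, passed through to the chat template.
    extra_body={"chat_template_kwargs": {"enable_thinking": False}},
)
print(response.choices[0].message.content)
```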
I’m torn: I don’t expect slash commands to proliferate, but I hope they do.