I have been reading up on how you make LLM-mediated chatbots that do things like TIPPING CRYPTOCURRENCY not hallucinate important values. It's not actually that hard.
@sun are you looking at the h-neuron paper, the PPO/GRPO for attention fixing, or the self-checking classification stuff? There’s a ton of work in preventing those kinds of things on all levels right now and they’re all awesome
@vii simpler in most cases: I extract certain values from the text before LLM processing and use them to build a whitelist. Then I feed the text into a model via ollama, constraining the output to a JSON command and checking that it only contains whitelisted values.
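A minimal sketch of that whitelist-plus-constrained-output idea, assuming amounts are extracted with a simple regex. The model call here is a hypothetical stand-in; the real setup would constrain generation to JSON via ollama (e.g. its structured-output support) rather than the hardcoded string below:

```python
import json
import re

def extract_amounts(text: str) -> set[str]:
    # Pull numeric values out of the user's text before any LLM sees it.
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def validate_command(raw_json: str, whitelist: set[str]) -> dict:
    # Parse the model's JSON command and reject any amount that was not
    # present in the original text, i.e. a hallucinated value.
    cmd = json.loads(raw_json)
    if str(cmd["amount"]) not in whitelist:
        raise ValueError(f"hallucinated amount: {cmd['amount']}")
    return cmd

text = "tip @alice 5 coins"
whitelist = extract_amounts(text)

# Stand-in for the constrained ollama generation (hypothetical output);
# in practice the model is forced to emit JSON in this shape.
model_output = '{"action": "tip", "recipient": "alice", "amount": "5"}'

cmd = validate_command(model_output, whitelist)
```

Even if the model is forced into valid JSON, the whitelist check is what catches a hallucinated amount like "50" when the user only wrote "5".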
@vii this won't work if, for example, I ask the bot to do some kind of fancy calculation