I have been reading up on how you make LLM-mediated chatbots that do things like TIP CRYPTOCURRENCY not hallucinate important values. It's not actually that hard.
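(One of the simpler ideas here: never let the model emit the critical value at all — parse and validate it deterministically outside the LLM. A minimal sketch; `parse_tip_amount`, the regex, and the `max_tip` policy are all hypothetical illustrations, not any specific bot's implementation.)

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

def parse_tip_amount(message: str, max_tip: Decimal = Decimal("100")) -> Optional[Decimal]:
    """Extract a tip amount deterministically instead of trusting model output.

    The LLM can decide *whether* to tip, but the amount comes from a
    plain regex over the user's own message, then gets policy-checked.
    """
    match = re.search(r"(?:tip|send)\s+(\d+(?:\.\d+)?)", message, re.IGNORECASE)
    if match is None:
        return None
    try:
        amount = Decimal(match.group(1))
    except InvalidOperation:
        return None
    # Reject nonsensical or out-of-policy amounts before any transfer happens.
    if amount <= 0 or amount > max_tip:
        return None
    return amount
```

Anything the parser rejects just doesn't become a transaction, so a hallucinated or garbled number fails closed instead of moving money.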
@sun are you looking at the h-neuron paper, the PPO/GRPO attention-fixing work, or the self-checking classification stuff? There's a ton of work on preventing those kinds of things at every level right now, and it's all awesome
@vii I bookmarked your post to learn more, thanks