Tuesday, May 6, 2025
Show HN: Kevin-32B – how to do multi-turn RL on writing CUDA kernels https://ift.tt/ms3dGqp
Hey, we just published a blog post about Kevin-32B = K(ernel D)evin. To our knowledge, it's the first open-source model trained with RL to write CUDA kernels. Our goal was to demonstrate multi-turn RL using GRPO. We used 180 Python->CUDA conversion tasks from the KernelBench dataset. The results were surprisingly strong! We were able to outperform top reasoning models like o3 and o4-mini. We're sharing our training setup and learnings in the blog post. The model is also on HuggingFace: https://ift.tt/0yh2O8l https://ift.tt/CD7O8up May 7, 2025 at 01:18AM
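For readers unfamiliar with GRPO, here is a minimal sketch of its core idea: several completions are sampled for the same task, and each completion's reward is normalized against the rest of its group instead of against a learned value function. This is the generic single-turn formulation, not Kevin-32B's actual training code. The function name, the example rewards, the use of the population standard deviation, and the `eps` value are all assumptions made for illustration; the multi-turn setup is covered in the blog post itself.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group.

    rewards: scores for several completions sampled for the SAME task.
    eps: small constant (assumed value) that avoids division by zero
         when every completion in the group gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled kernels on one task
# (e.g. 0.0 for a kernel that fails, higher for faster correct kernels).
rewards = [0.0, 0.3, 1.2, 0.3]
advantages = group_relative_advantages(rewards)
```

Completions that beat their group's average get a positive advantage and are reinforced, while below-average ones are pushed down. A group whose completions all score the same contributes no learning signal.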