Tuesday, April 30, 2024
Show HN: I replicated Anthropic's monosemanticity research using just my MacBook https://ift.tt/uVWKnSl
Hi everyone, I've been working on an open-source implementation of Anthropic's research on monosemanticity ("Towards Monosemanticity"). The problem Anthropic is trying to solve is that language models are hard to interpret because individual neurons can be responsible for multiple different things. The research finds that training a small autoencoder on neuron activations can result in "features" which are much easier to interpret.

When I was reading the original research, I got really excited when I realized that the models they used were really small, and I could probably train them from scratch with just my M3 MBP. My models are somewhat undertrained compared to what Anthropic produced, but I think my results are still very compelling. Let me know what you think! https://ift.tt/iySbl1e

April 30, 2024 at 10:56PM
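
For readers who want a concrete picture of "training a small autoencoder on neuron activations", here is a minimal sketch of the forward pass and loss of a sparse autoencoder. It is not the code from this project: the ReLU encoder, the L1 sparsity penalty, the function names, the layer sizes, and the `l1_coeff` value are all assumptions chosen for illustration, and random data stands in for real MLP activations.

```python
import numpy as np


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Encode activations into non-negative features, then reconstruct them."""
    # ReLU keeps features non-negative; most should end up at zero once trained.
    features = np.maximum(0.0, x @ W_enc + b_enc)
    recon = features @ W_dec + b_dec
    return features, recon


def sae_loss(x, features, recon, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty that pushes features toward sparsity."""
    mse = np.mean(np.sum((x - recon) ** 2, axis=1))
    l1 = np.mean(np.sum(np.abs(features), axis=1))
    return mse + l1_coeff * l1


# Illustrative sizes only (assumed): 8 "neurons" expanded into 32 features.
rng = np.random.default_rng(0)
d_mlp, d_feat, batch = 8, 32, 4

# Random stand-in for activations recorded from a language model's MLP layer.
x = rng.normal(size=(batch, d_mlp))

W_enc = rng.normal(scale=0.1, size=(d_mlp, d_feat))
b_enc = np.zeros(d_feat)
W_dec = rng.normal(scale=0.1, size=(d_feat, d_mlp))
b_dec = np.zeros(d_mlp)

features, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, features, recon)
print(features.shape, recon.shape, float(loss))
```

The feature layer is wider than the input, so each feature can specialize in one pattern instead of sharing a neuron with several. In a real run the weights would be updated by gradient descent on this loss over many batches of recorded activations.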