Tuesday, December 26, 2023
Show HN: Made a batching LLM API for a project. Mistral 200 tk/s on RTX 3090 https://ift.tt/KZ9Wzgb
I was running into a vLLM bug that affected multi-GPU setups, and while that bug was getting fixed I needed a stand-in that used the same API format but performed better than the API in text-generation-webui. It's very rough, and I'm not a coder by trade, but it's very fast once you have many simultaneous connections. https://ift.tt/qv2Nm6X December 27, 2023 at 01:22AM
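For readers wondering why throughput would climb with many simultaneous connections, here is a minimal sketch of request batching in Python. It is not the project's code. The names `Batcher`, `fake_generate_batch`, `MAX_BATCH`, and `MAX_WAIT_S` are hypothetical, and the "model" is a stand-in that just upper-cases each prompt. The idea is that requests arriving close together are gathered and handled in one batched call, so the per-call cost is shared.

```python
import asyncio

MAX_BATCH = 8      # hypothetical cap on prompts per batched call
MAX_WAIT_S = 0.01  # hypothetical window for collecting more requests


async def fake_generate_batch(prompts):
    # Stand-in for one batched model call (assumption: a real server
    # would run the LLM here). The cost is paid once per batch.
    await asyncio.sleep(0.05)
    return [p.upper() for p in prompts]


class Batcher:
    def __init__(self):
        self.queue = asyncio.Queue()
        self.batch_sizes = []  # recorded so batching can be observed

    async def submit(self, prompt):
        # Each connection enqueues its prompt and waits on its own future.
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def run(self):
        while True:
            # Block until at least one request arrives.
            items = [await self.queue.get()]
            # Wait briefly so concurrent requests can join this batch.
            await asyncio.sleep(MAX_WAIT_S)
            while len(items) < MAX_BATCH and not self.queue.empty():
                items.append(self.queue.get_nowait())
            outputs = await fake_generate_batch([p for p, _ in items])
            self.batch_sizes.append(len(items))
            # Hand each result back to the request that asked for it.
            for (_, fut), out in zip(items, outputs):
                fut.set_result(out)


async def demo(n=16):
    batcher = Batcher()
    worker = asyncio.create_task(batcher.run())
    results = await asyncio.gather(
        *(batcher.submit(f"prompt {i}") for i in range(n))
    )
    worker.cancel()
    return results, batcher.batch_sizes


if __name__ == "__main__":
    results, sizes = asyncio.run(demo())
    print(len(results), "results, batch sizes:", sizes)
```

With a single connection every batch holds one prompt, so nothing is gained. With many concurrent callers, the same number of model calls serves several times as many requests, which is where the speedup would come from.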