Fast AI
The Chrome team is building AI into Chrome, and it's fast. I made a small React app that submits a new prompt on every single keystroke. The latency is often below 100ms, making it feel nearly instant.
It also seems light on resources, and it uses the GPU when one is available.
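The post doesn't include the app's code, so here is a rough sketch of how prompting on every keystroke could be wired up. Everything named here is an assumption for illustration: `PromptFn` stands in for whatever call actually sends text to Chrome's built-in model (its real API surface isn't shown in this post), and `LatestOnly` and `makeKeystrokeHandler` are hypothetical helpers. The idea is to time each request and drop any response that a newer keystroke has already superseded.

```typescript
// Hypothetical stand-in for the call that sends a prompt to the model.
// The real Chrome API is not shown here; plug in whatever you actually use.
type PromptFn = (text: string) => Promise<string>;

interface PromptResult {
  text: string;      // model output for the latest input
  latencyMs: number; // time from keystroke to response
}

// Tracks which request is the most recent so stale responses can be dropped.
class LatestOnly {
  private current = 0;

  // Registers a new request and returns its id.
  next(): number {
    return ++this.current;
  }

  // True only if no newer request has been registered since `id`.
  isLatest(id: number): boolean {
    return id === this.current;
  }
}

// Builds a handler that sends a prompt on every call and reports
// only the result belonging to the newest keystroke.
function makeKeystrokeHandler(
  prompt: PromptFn,
  onResult: (result: PromptResult) => void,
  now: () => number = () => Date.now(),
): (input: string) => Promise<void> {
  const tracker = new LatestOnly();
  return async (input: string) => {
    const id = tracker.next();
    const start = now();
    const text = await prompt(input);
    if (!tracker.isLatest(id)) return; // a newer keystroke superseded this one
    onResult({ text, latencyMs: now() - start });
  };
}
```

In a React component, the returned handler could be called from an input's `onChange` with the current field value, and `onResult` could set state to show the output and the measured latency.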