I was thinking about this too. Zed officially supports self-hosting Zeta, so one option would be to write a proxy that speaks the Zeta wire format but is backed by llama.cpp (or any other model backend). In the proxy you could configure prompts, context, templates, etc., while still using a production build of Zed. I'll give it a shot if I have time.
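To make the idea concrete, here's a rough sketch of what such a proxy could look like, using only the Python standard library. Treat all of it as assumptions to check against the Zed source: the Zeta field names (`input_events`, `input_excerpt`, `output_excerpt`, `request_id`) are my guess at the wire format, `PROMPT_TEMPLATE` is a made-up template, and the ports are arbitrary. The backend side assumes llama.cpp's bundled server and its `/completion` endpoint, which takes a `prompt` and returns the generated text in `content`.

```python
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: llama.cpp's bundled server is listening on this address.
LLAMA_URL = "http://127.0.0.1:8080/completion"

# Hypothetical prompt template; this is the part you would actually tune.
PROMPT_TEMPLATE = (
    "### Recent edits:\n{events}\n\n"
    "### Excerpt to rewrite:\n{excerpt}\n\n"
    "### Rewritten excerpt:\n"
)


def build_backend_request(zeta_request: dict) -> dict:
    """Turn an (assumed) Zeta-style request into a llama.cpp /completion body."""
    prompt = PROMPT_TEMPLATE.format(
        events=zeta_request.get("input_events", ""),
        excerpt=zeta_request.get("input_excerpt", ""),
    )
    return {"prompt": prompt, "n_predict": 256, "temperature": 0.0}


def build_zeta_response(backend_response: dict, request_id: str) -> dict:
    """Wrap the backend's generated text in an (assumed) Zeta-style response."""
    return {
        "request_id": request_id,
        "output_excerpt": backend_response.get("content", ""),
    }


class ProxyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the editor's request body.
        length = int(self.headers.get("Content-Length", 0))
        zeta_request = json.loads(self.rfile.read(length) or b"{}")

        # Forward the translated request to the model backend.
        body = json.dumps(build_backend_request(zeta_request)).encode()
        req = urllib.request.Request(
            LLAMA_URL, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(req) as resp:
            backend_response = json.load(resp)

        # Translate the backend reply back into the editor-facing shape.
        payload = json.dumps(
            build_zeta_response(backend_response, "local-0")
        ).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host: str = "127.0.0.1", port: int = 9000) -> None:
    """Run the proxy until interrupted (host and port are arbitrary)."""
    HTTPServer((host, port), ProxyHandler).serve_forever()
```

Call `serve()` to start it, then point Zed's self-hosted Zeta setting at the proxy address. Keeping the two translation functions separate from the HTTP handler means the prompt and template logic can be changed and tested without a running editor or model.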