Hacker News

What would be a typical/recommended server setup for using this for RAG? Would you typically have a separate server for the GPUs and the DB itself?


Assuming you are using GPUs for model inference, the best setup would be the DB on one server and a separate server that handles inference requests. Note that we plan to support custom model endpoints on the database side, so you probably won't need the inference server in the future!
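To make the two-tier idea concrete, here is a minimal sketch of the flow: documents are embedded (in production, by POSTing text to the GPU inference server) and stored in the database, then queries are embedded the same way and matched by cosine similarity. Everything here is hypothetical illustration, not this product's API: `embed()` is a toy local stand-in for the inference server, and `VectorDB` is an in-memory stand-in for the database server.

```python
# Sketch of a two-tier RAG setup: a GPU box would serve embeddings, the
# DB box stores vectors. Both tiers are stubbed locally so this runs
# anywhere; swap embed() for an HTTP call to your inference server.
import math

def embed(text: str) -> list[float]:
    # Stand-in for a request to the GPU inference server.
    # Toy embedding: normalized character-frequency vector over a-z.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalized, so dot product = cosine.
    return sum(x * y for x, y in zip(a, b))

class VectorDB:
    # Stand-in for the database server holding document embeddings.
    def __init__(self) -> None:
        self.rows: list[tuple[str, list[float]]] = []

    def insert(self, text: str) -> None:
        # In a real deployment, embedding happens on the inference
        # server and only the vector travels to the DB.
        self.rows.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self.rows, key=lambda r: cosine(q, r[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

db = VectorDB()
db.insert("postgres stores the vectors")
db.insert("the gpu server computes embeddings")
print(db.search("where are embeddings computed?")[0])
```

A database-side model endpoint, as mentioned above, would collapse the `embed()` round-trip into the DB itself: the client sends raw text and the database calls the model internally before running the vector search.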





