You have a small, mostly static corpus (e.g., a few hundred to a few thousand chunks). You want zero‑infrastructure local retrieval with fast, predictable latency. You’re assembling “infinite few‑shot ...
You can either disable the history with the --history-mode=none option, or only keep the previous queries without sending their database result with the --history-mode=queries option. Please note that ...