* Fix llama-cpp-python build for Apple Silicon
* Make ShellCheck happy
* Make gaby happy
---------
Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
* Enhance Signal Handling for Graceful Termination
* Fixed formatting
---------
Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
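As a rough illustration of the graceful-termination pattern referenced above (handler and function names here are my own sketch, not the project's actual code):

```python
import signal
import sys

def handle_term(signum, frame):
    # Placeholder for real cleanup work: flush logs, close the model, etc.
    print(f"Received signal {signum}, shutting down cleanly")
    sys.exit(0)

# Catch both SIGTERM (e.g. `docker stop`) and SIGINT (Ctrl-C)
signal.signal(signal.SIGTERM, handle_term)
signal.signal(signal.SIGINT, handle_term)
```

Registering a handler turns an abrupt kill into a normal `SystemExit`, so `finally` blocks and context managers get a chance to run before the process dies.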
* Feature: add new k-quants q6_K models
* Feature: update llama-cpp-python==0.1.62
* Fix: labels; 7b not using k-quants
* Fix: labels; 7b old model and q6_K
* Fix: labels; 7b old sizes
* Fix: labels; 7b Koala names
---------
Co-authored-by: pabl-o-ce <cye@poscye.com>
* Pin llama-cpp-python to 0.1.54 for GGMLv3 support
* Update to GGMLv3 models
* Reflect current GGMLv3 models
* More readable model names
* Fix file sizes
---------
Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
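For reference, the pin above amounts to a one-line requirements entry (the exact file path in the repo is not shown here):

```
# Last llama-cpp-python release line known to load GGMLv3 models, per the commit above
llama-cpp-python==0.1.54
```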
* Integrate LangChain
Drop MongoDB
Use llama-cpp-python bindings
* Fixed most chat endpoints except posting questions
* Working POST endpoint!
* Everything works except streaming
* Current state
* Streaming as-is
* Dropped the LangChain wrapper for calling the LLM; went back to using the bindings directly
* Working streaming
* Sort chats by time
* Cleaned up styling and added back the loading indicator
* Add persistence support to Redis
* Fixed tooltips
* Fixed default prompts
* Added a link to the API docs (closes #155, "How to use the api")
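The streaming work in the entries above boils down to yielding partial completions as they arrive instead of waiting for the full answer. A minimal stdlib-only sketch, with the model call stubbed out (a real llama-cpp-python streaming call needs a model file, so `fake_llm_stream` is purely illustrative):

```python
def fake_llm_stream(prompt):
    # Stand-in for a streaming LLM call; yields tokens one at a time
    for token in ["Hello", ",", " world", "!"]:
        yield token

def stream_answer(prompt):
    # Accumulate the full answer while emitting each chunk as it arrives --
    # the same shape an HTTP streaming response handler would use
    answer = []
    for token in fake_llm_stream(prompt):
        answer.append(token)
        yield token
    # the joined `answer` could be persisted here once streaming finishes

print("".join(stream_answer("hi")))  # → Hello, world!
```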