24 Commits

Author SHA1 Message Date
Mariusz Kogen
7f6321ae82
Fix llama-cpp-python build for Apple Silicon (#763)
* Fix llama-cpp-python build for Apple Silicon

* Make ShellCheck happy

* Make gaby happy

---------

Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
2023-09-20 08:35:50 -04:00
Mariusz Kogen
e87d0209c8
Enhance Signal Handling for Graceful Termination (#727)
* Enhance Signal Handling for Graceful Termination

* Fixed formatting

---------

Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
2023-09-13 20:29:34 -04:00
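Commit e87d0209c8 (#727) traps termination signals so the service can shut down cleanly. A minimal sketch of that pattern in Python (the handler name and cleanup steps here are illustrative, not the actual Serge code):

```python
import signal
import sys

def handle_shutdown(signum, frame):
    # Placeholder cleanup: a real service would stop workers and
    # close connections here before exiting.
    print(f"received signal {signum}, shutting down")
    sys.exit(0)

# Trap both Ctrl-C (SIGINT) and the SIGTERM sent by `docker stop`,
# so the process exits gracefully instead of being killed.
for sig in (signal.SIGINT, signal.SIGTERM):
    signal.signal(sig, handle_shutdown)
```

With handlers registered like this, stopping the container triggers the cleanup path rather than an abrupt kill.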
Juan Calderon-Perez
0500cb2266
Remove support for DragonflyDB (#684)
2023-09-03 23:37:23 -04:00
Juan Calderon-Perez
53793ca580
Update llama-cpp-python to v0.1.78 (#653)
* Update dev.sh

* Update deploy.sh
2023-08-24 23:31:01 -04:00
Gianni C
5aca2b27d6
Add Kubernetes helm charts for Serge (#500)
Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
2023-08-10 23:02:14 -04:00
Juan Calderon-Perez
12ec7b7f42
Support for DragonflyDB (#598)
2023-08-06 22:54:42 -04:00
Juan Calderon-Perez
20c3dac583
Update llama-cpp-python to v0.1.77
2023-07-29 23:56:49 -04:00
Juan Calderon-Perez
6445c21af0
Update llama-cpp-python to v0.1.70 (#518)
2023-07-09 18:57:34 -04:00
Juan Calderon-Perez
696c2d288c
Fixes to startup scripts and Dockerfiles (#517)
2023-07-09 18:28:33 -04:00
Juan Calderon-Perez
65cfcfbfc3
Support for llama-cpp-python v0.1.69 (#516)
2023-07-09 15:51:07 -04:00
Juan Calderon-Perez
83819b2eba
Update llama-cpp-python to v0.1.66 (#469)
2023-06-26 23:51:33 -04:00
Juan Calderon-Perez
c31c464aec
Update llama-cpp-python to v0.1.65 (#454)
2023-06-20 20:07:16 -04:00
Juan Calderon-Perez
ee27eedeb3
Update llama-cpp-python to v0.1.64 (#441)
2023-06-18 13:15:45 -04:00
Juan Calderon-Perez
75dd5580d9
Update llama-cpp-python to v0.1.63 (#433)
2023-06-16 00:28:24 -04:00
Juan Calderon-Perez
4970865a49
Add support for validating shell scripts (#416)
2023-06-11 20:39:11 -04:00
PΔBLØ ᄃΞ
634fdacc08
Feature: add new k-quants q6_K models (#412)
* Feature: add new k-quants q6_K models

* Feature: update llama-cpp-python==0.1.62

* Fix: labels and 7b not use k-quants

* Fix: labels and 7b not use k-quants

* Fix: labels and 7b old one and q6_K

* Fix: labels and 7b old sizes

* Fix: labels and 7b Koala names

---------

Co-authored-by: pabl-o-ce <cye@poscye.com>
2023-06-10 23:28:05 -04:00
Juan Calderon-Perez
8c211053d4
Update llama-cpp-python to v0.1.61 (#403)
2023-06-09 23:54:56 -04:00
Juan Calderon-Perez
77131da4f1
Update llama-cpp-python to v0.1.59 (#401)
2023-06-09 22:08:44 -04:00
Juan Calderon-Perez
13daab6880
Update llama-cpp-bindings (#377)
2023-06-03 11:04:29 -04:00
Juan Calderon-Perez
c83a30797a
Update python bindings to 0.1.55 (#355)
2023-05-29 18:54:49 -04:00
Nathan Nye
51fae79aa2
GGMLv3 support (#334)
* Pin llama-cpp-python to 0.1.54 for GGMLv3 support

* Update to GGMLv3 models

* Reflect current GGMLv3 models

* More readable model names

* Fix file sizes

---------

Co-authored-by: Juan Calderon-Perez <835733+gaby@users.noreply.github.com>
2023-05-26 00:00:08 -04:00
Juan Calderon-Perez
65ab97bdd4
Pin llama-cpp-python to 0.1.49 (#294)
2023-05-12 10:01:41 -04:00
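Several commits above pin or bump the llama-cpp-python dependency to an exact version. In a pip requirements file, the pin from this commit would look like the following (an illustrative fragment; Serge's actual requirements file lists other packages as well):

```
# requirements.txt (illustrative fragment)
llama-cpp-python==0.1.49
```

An exact `==` pin keeps builds reproducible, which is why each upstream release shows up here as its own version-bump commit.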
Nathan Sarrazin
e512011470
Use python bindings, integrate with LangChain and get rid of MongoDB (#148)
* integrate langchain, get rid of mongodb, use llama-cpp-python bindings

* fixed most chat endpoints except posting questions

* Working post endpoint!

* everything works except streaming

* current state

* streaming as is

* got rid of langchain wrapper for calling llm, went back to using bindings directly

* working streaming

* sort chats by time

* cleaned up styling and added back loading indicator

* Add persistence support to redis

* fixed tooltips

* fixed default prompts

* added link to api docs (closes #155 "How to use the api")
2023-04-23 23:42:20 +02:00
Nathan Sarrazin
b5c423fe59
API Refactor & Model Manager (#101)
* API refactoring

* delete partially downloaded files on startup

* remove unused deps
2023-03-28 23:56:41 +02:00