# Serge - LLaMA made easy ๐Ÿฆ™ ![License](https://img.shields.io/github/license/serge-chat/serge) [![Discord](https://img.shields.io/discord/1088427963801948201?label=Discord)](https://discord.gg/62Hc6FEYQH) Serge is a chat interface crafted with [llama.cpp](https://github.com/ggerganov/llama.cpp) for running Alpaca models. No API keys, entirely self-hosted! - ๐ŸŒ **SvelteKit** frontend - ๐Ÿ’พ **[Redis](https://github.com/redis/redis)** for storing chat history & parameters - โš™๏ธ **FastAPI + LangChain** for the API, wrapping calls to [llama.cpp](https://github.com/ggerganov/llama.cpp) using the [python bindings](https://github.com/abetlen/llama-cpp-python) ๐ŸŽฅ Demo: [demo.webm](https://user-images.githubusercontent.com/25119303/226897188-914a6662-8c26-472c-96bd-f51fc020abf6.webm) ## โšก๏ธ Quick start ๐Ÿณ Docker: ```bash docker run -d \ --name serge \ -v weights:/usr/src/app/weights \ -v datadb:/data/db/ \ -p 8008:8008 \ ghcr.io/serge-chat/serge:latest ``` ๐Ÿ™ Docker Compose: ```yaml services: serge: image: ghcr.io/serge-chat/serge:latest container_name: serge restart: unless-stopped ports: - 8008:8008 volumes: - weights:/usr/src/app/weights - datadb:/data/db/ volumes: weights: datadb: ``` Then, just visit http://localhost:8008/, You can find the API documentation at http://localhost:8008/api/docs ## ๐Ÿ–ฅ๏ธ Windows Setup Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models. ## โ˜๏ธ Kubernetes & Docker Compose Setup Instructions for setting up Serge on Kubernetes can be found in the [wiki](https://github.com/serge-chat/serge/wiki/Integrating-Serge-in-your-orchestration#kubernetes-example). ## ๐Ÿง  Supported Models | Category | Models | |:-------------:|:-------| | **Alpaca ๐Ÿฆ™** | Alpaca-LoRA-65B, GPT4-Alpaca-LoRA-30B | | **Chronos ๐ŸŒ‘**| Chronos-13B, Chronos-33B, Chronos-Hermes-13B | | **GPT4All ๐ŸŒ**| GPT4All-13B | | **Koala ๐Ÿจ** | Koala-7B, Koala-13B | | **LLaMA ๐Ÿฆ™** | FinLLaMA-33B, LLaMA-Supercot-30B, LLaMA2 7B, LLaMA2 13B, LLaMA2 70B | | **Lazarus ๐Ÿ’€**| Lazarus-30B | | **Nous ๐Ÿง ** | Nous-Hermes-13B | | **OpenAssistant ๐ŸŽ™๏ธ** | OpenAssistant-30B | | **Orca ๐Ÿฌ** | Orca-Mini-v2-7B, Orca-Mini-v2-13B, OpenOrca-Preview1-13B | | **Samantha ๐Ÿ‘ฉ**| Samantha-7B, Samantha-13B, Samantha-33B | | **Vicuna ๐Ÿฆ™** | Stable-Vicuna-13B, Vicuna-CoT-7B, Vicuna-CoT-13B, Vicuna-v1.1-7B, Vicuna-v1.1-13B, VicUnlocked-30B, VicUnlocked-65B | | **Wizard ๐Ÿง™** | Wizard-Mega-13B, WizardLM-Uncensored-7B, WizardLM-Uncensored-13B, WizardLM-Uncensored-30B, WizardCoder-Python-13B-V1.0 | Additional weights can be added to the `serge_weights` volume using `docker cp`: ```bash docker cp ./my_weight.bin serge:/usr/src/app/weights/ ``` ## โš ๏ธ Memory Usage LLaMA will crash if you don't have enough available memory for the model: | Model | Max RAM Required | |-------------|------------------| | 7B | 4.5GB | | 7B-q2_K | 5.37GB | | 7B-q3_K_L | 6.10GB | | 7B-q4_1 | 6.71GB | | 7B-q4_K_M | 6.58GB | | 7B-q5_1 | 7.56GB | | 7B-q5_K_M | 7.28GB | | 7B-q6_K | 8.03GB | | 7B-q8_0 | 9.66GB | | 13B | 12GB | | 13B-q2_K | 8.01GB | | 13B-q3_K_L | 9.43GB | | 13B-q4_1 | 10.64GB | | 13B-q4_K_M | 10.37GB | | 13B-q5_1 | 12.26GB | | 13B-q5_K_M | 11.73GB | | 13B-q6_K | 13.18GB | | 13B-q8_0 | 16.33GB | | 33B | 20GB | | 33B-q2_K | 16.21GB | | 33B-q3_K_L | 19.78GB | | 33B-q4_1 | 22.83GB | | 33B-q4_K_M | 22.12GB | | 33B-q5_1 | 26.90GB | | 33B-q5_K_M | 25.55GB | | 33B-q6_K | 29.19GB | | 33B-q8_0 | 37.06GB | | 65B | 50GB | | 65B-q2_K | 29.95GB | | 65B-q3_K_L | 37.15GB | | 65B-q4_1 | 43.31GB | | 65B-q4_K_M | 41.85GB | | 65B-q5_1 | 51.47GB | | 65B-q5_K_M | 48.74GB | | 65B-q6_K | 56.06GB | | 65B-q8_0 | 71.87GB | ## ๐Ÿ’ฌ Support Need help? Join our [Discord](https://discord.gg/62Hc6FEYQH) ## โญ๏ธ Stargazers Stargazers over time ## ๐Ÿงพ License [Nathan Sarrazin](https://github.com/nsarrazin) and [Contributors](https://github.com/serge-chat/serge/graphs/contributors). `Serge` is free and open-source software licensed under the [MIT License](https://github.com/serge-chat/serge/blob/master/LICENSE). ## ๐Ÿค Contributing If you discover a bug or have a feature idea, feel free to open an issue or PR. To run Serge in development mode: ```bash git clone https://github.com/serge-chat/serge.git docker compose -f docker-compose.dev.yml up -d --build ```