Serge - LLaMA made easy 🦙
Serge is a chat interface crafted with llama.cpp for running GGUF models. No API keys, entirely self-hosted!
- 🌐 SvelteKit frontend
- 💾 Redis for storing chat history & parameters
- ⚙️ FastAPI + LangChain for the API, wrapping calls to llama.cpp using the python bindings
🎥 Demo:
⚡️ Quick start
🐳 Docker:
```shell
docker run -d \
  --name serge \
  -v weights:/usr/src/app/weights \
  -v datadb:/data/db/ \
  -p 8008:8008 \
  ghcr.io/serge-chat/serge:latest
```
🐙 Docker Compose:
```yaml
services:
  serge:
    image: ghcr.io/serge-chat/serge:latest
    container_name: serge
    restart: unless-stopped
    ports:
      - 8008:8008
    volumes:
      - weights:/usr/src/app/weights
      - datadb:/data/db/

volumes:
  weights:
  datadb:
```
Then just visit http://localhost:8008. The API documentation is available at http://localhost:8008/api/docs
🌍 Environment Variables
The following environment variables are available:
| Variable Name | Description | Default Value |
|---|---|---|
| SERGE_DATABASE_URL | Database connection string | sqlite:////data/db/sql_app.db |
| SERGE_JWT_SECRET | Key for auth token encryption. Use a random string | uF7FGN5uzfGdFiPzR |
| SERGE_SESSION_EXPIRY | Duration in minutes before a user must reauthenticate | 60 |
| NODE_ENV | Node.js running environment | production |
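The variables above can be overridden with `-e` flags on `docker run`. A minimal sketch, assuming the same volume and port mapping as the quick-start command (the secret value here is a placeholder, not a real key):

```shell
# Override session expiry and the JWT secret when starting the container.
docker run -d \
  --name serge \
  -e SERGE_SESSION_EXPIRY=120 \
  -e SERGE_JWT_SECRET=change-me-to-a-random-string \
  -v weights:/usr/src/app/weights \
  -v datadb:/data/db/ \
  -p 8008:8008 \
  ghcr.io/serge-chat/serge:latest
```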
🖥️ Windows
Ensure you have Docker Desktop installed, WSL2 configured, and enough free RAM to run models.
☁️ Kubernetes
Instructions for setting up Serge on Kubernetes can be found in the wiki.
🧠 Supported Models
| Category | Models |
|---|---|
| Alfred | 40B-1023 |
| BioMistral | 7B |
| Code | 13B, 33B |
| CodeLLaMA | 7B, 7B-Instruct, 7B-Python, 13B, 13B-Instruct, 13B-Python, 34B, 34B-Instruct, 34B-Python |
| Codestral | 22B v0.1 |
| Gemma | 2B, 1.1-2B-Instruct, 7B, 1.1-7B-Instruct |
| Gorilla | Falcon-7B-HF-v0, 7B-HF-v1, Openfunctions-v1, Openfunctions-v2 |
| Falcon | 7B, 7B-Instruct, 40B, 40B-Instruct |
| LLaMA 2 | 7B, 7B-Chat, 7B-Coder, 13B, 13B-Chat, 70B, 70B-Chat, 70B-OASST |
| LLaMA 3 | 11B-Instruct, 13B-Instruct, 16B-Instruct |
| LLaMA Pro | 8B, 8B-Instruct |
| Med42 | 70B |
| Medalpaca | 13B |
| Medicine | Chat, LLM |
| Meditron | 7B, 7B-Chat, 70B |
| Meta-LlaMA-3 | 8B, 8B-Instruct, 70B, 70B-Instruct |
| Mistral | 7B-V0.1, 7B-Instruct-v0.2, 7B-OpenOrca |
| MistralLite | 7B |
| Mixtral | 8x7B-v0.1, 8x7B-Dolphin-2.7, 8x7B-Instruct-v0.1 |
| Neural-Chat | 7B-v3.3 |
| Notus | 7B-v1 |
| Notux | 8x7b-v1 |
| Nous-Hermes 2 | Mistral-7B-DPO, Mixtral-8x7B-DPO, Mistral-8x7B-SFT |
| OpenChat | 7B-v3.5-1210 |
| OpenCodeInterpreter | DS-6.7B, DS-33B, CL-7B, CL-13B, CL-70B |
| OpenLLaMA | 3B-v2, 7B-v2, 13B-v2 |
| Orca 2 | 7B, 13B |
| Phi 2 | 2.7B |
| Phi 3 | mini-4k-instruct, medium-4k-instruct, medium-128k-instruct |
| Python Code | 13B, 33B |
| PsyMedRP | 13B-v1, 20B-v1 |
| Starling LM | 7B-Alpha |
| SOLAR | 10.7B-v1.0, 10.7B-instruct-v1.0 |
| TinyLlama | 1.1B |
| Vicuna | 7B-v1.5, 13B-v1.5, 33B-v1.3, 33B-Coder |
| WizardLM | 2-7B, 13B-v1.2, 70B-v1.0 |
| Zephyr | 3B, 7B-Alpha, 7B-Beta |
Additional models can be requested by opening a GitHub issue. Other models are also available at Serge Models.
⚠️ Memory Usage
LLaMA will crash if you don't have enough available memory for the model.
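A rough rule of thumb for sizing (an approximation, not an official formula): a quantized GGUF model needs roughly `parameters × bits-per-weight / 8` of RAM, plus overhead for the KV cache and runtime. A sketch with illustrative numbers for a 7B model at ~4-5 bits per weight:

```shell
# Rough, integer-arithmetic estimate of RAM needed for a quantized model.
# All values are illustrative assumptions, not measured figures.
params_billions=7    # e.g. a 7B model
bits_per_weight=5    # ~4-bit quantization, rounded up
overhead_gb=2        # KV cache + runtime overhead (assumed)

model_gb=$(( params_billions * bits_per_weight / 8 ))
total_gb=$(( model_gb + overhead_gb ))
echo "Estimated RAM needed: ~${total_gb} GB"
```

If the estimate exceeds your free memory, pick a smaller model or a more aggressive quantization.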
💬 Support
Need help? Join our Discord
🧾 License
Nathan Sarrazin and Contributors. Serge is free and open-source software licensed under the MIT License and Apache-2.0.
🤝 Contributing
If you discover a bug or have a feature idea, feel free to open an issue or PR.
To run Serge in development mode:
```shell
git clone https://github.com/serge-chat/serge.git
cd serge/
docker compose -f docker-compose.dev.yml up --build
```
The development container accepts a Python debugger session on port 5678. Example launch.json for VS Code:
```json
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Remote Debug",
      "type": "python",
      "request": "attach",
      "connect": {
        "host": "localhost",
        "port": 5678
      },
      "pathMappings": [
        {
          "localRoot": "${workspaceFolder}/api",
          "remoteRoot": "/usr/src/app/api/"
        }
      ],
      "justMyCode": false
    }
  ]
}
```