mirror of synced 2024-07-27 07:34:11 +08:00

FauxPilot - an open-source GitHub Copilot server


Brendan Dolan-Gavitt 788edd44f2 Add missing configs for 4gpu setups		2023-05-29 16:00:43 -04:00
.github	Remove the issue autoreplier; just adds clutter	2023-04-27 00:22:12 -04:00
converter	Add missing configs for 4gpu setups	2023-05-29 16:00:43 -04:00
copilot_proxy	fix raise FauxPilotexception (#172 )	2023-03-28 10:06:20 +02:00
documentation	Remove Gitlab VSCode Extension suggestion (#169 )	2023-03-15 11:20:15 +01:00
img	doc: added logo image of FauxPilot	2022-11-04 14:39:00 +09:00
python_backend	Fix segfault issue	2022-11-19 18:32:50 +00:00
tests/python_backend	Dev (#148 )	2023-02-15 09:17:07 +01:00
.editorconfig	Create .editorconfig	2022-10-23 05:03:33 +08:00
.gitignore	Ignore huggingface cache	2022-11-26 22:14:02 +08:00
api.dockerignore	Resolve conflicts	2023-02-13 16:52:49 +01:00
docker-compose.yaml	Dev (#148 )	2023-02-15 09:17:07 +01:00
launch.sh	Fix docker compose invocation	2023-01-24 12:10:48 -06:00
LICENSE	add missing license	2022-08-03 09:42:32 -04:00
proxy.Dockerfile	Reduce docker build context size	2023-02-13 16:05:35 +01:00
README.md	Its not actually github copilot, its an alternative. (#175 )	2023-03-28 10:02:03 +02:00
setup.cfg	Dev (#148 )	2023-02-15 09:17:07 +01:00
setup.sh	fix: fixed an incorreect if statement (#158 )	2023-03-13 10:00:01 +01:00
shutdown.sh	Now that launch.sh runs in the background, add shutdown.sh to stop the server	2022-10-19 17:37:59 -04:00
triton.Dockerfile	Reduce docker build context size	2023-02-13 16:05:35 +01:00
triton.dockerignore	Resolve conflicts	2023-02-13 16:52:49 +01:00

README.md

FauxPilot

This is an attempt to build a locally hosted alternative to GitHub Copilot. It runs the Salesforce CodeGen models inside NVIDIA's Triton Inference Server with the FasterTransformer backend.

Prerequisites

You'll need:

Docker
docker compose >= 1.28
An NVIDIA GPU with Compute Capability >= 6.0 and enough VRAM to run the model you want.
nvidia-docker
curl and zstd for downloading and unpacking the models.
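
As a quick pre-flight check, the command-line prerequisites above can be looked up on PATH before running anything else. This is only a sketch: the executable names `docker`, `curl`, and `zstd` are assumptions about how the tools are installed on your system, and the check does not verify versions, the GPU, or nvidia-docker.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust them if your system uses different ones.
REQUIRED_TOOLS = ("docker", "curl", "zstd")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools from `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` returns an empty list when all three executables are found, and otherwise the names of the ones still to be installed.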

Note that the VRAM requirements listed by setup.sh are total -- if you have multiple GPUs, you can split the model across them. So, if you have two NVIDIA RTX 3080 GPUs, you should be able to run the 6B model by putting half on each GPU.
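
The arithmetic behind splitting a model across GPUs can be sketched as follows. The sketch assumes an even split and ignores any per-GPU overhead, so treat the result as a rough lower bound. The function names are illustrative and not part of FauxPilot; take the total VRAM figure from what setup.sh prints for your chosen model.

```python
from typing import Sequence


def per_gpu_share_gb(total_vram_gb: float, num_gpus: int) -> float:
    """VRAM each GPU must hold when the model is split evenly."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return total_vram_gb / num_gpus


def fits_evenly(total_vram_gb: float, gpu_vram_gb: Sequence[float]) -> bool:
    """True if an even split of the model fits on every listed GPU."""
    share = per_gpu_share_gb(total_vram_gb, len(gpu_vram_gb))
    return all(share <= vram for vram in gpu_vram_gb)
```

For example, with a hypothetical total requirement of 13 GB and two 10 GB cards, `per_gpu_share_gb(13, 2)` gives 6.5 GB per card, so `fits_evenly(13, [10, 10])` is true while `fits_evenly(13, [10])` is false.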

Support and Warranty

lmao

Okay, fine, we now have some minimal information on the wiki and a discussion forum where you can ask questions. Still no formal support or warranty though!

Setup

This section describes how to install a FauxPilot server and clients.

Setting up a FauxPilot Server

Run the setup script to choose a model. It downloads the model from Huggingface/Moyix in GPT-J format and then converts it for use with FasterTransformer.

Please refer to How to set-up a FauxPilot server.

Client configuration for FauxPilot

There are several ways to connect to a FauxPilot server. For example, you can create a client using the OpenAI API, the Copilot plugin, or the REST API.

Please refer to How to set-up a client.
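
To illustrate the REST option, here is a minimal sketch of a client built with only the Python standard library. The server address, port, and endpoint path in `DEFAULT_URL` are assumptions, not values stated in this README, and the payload fields follow the general shape of an OpenAI-style completion request. Check the client set-up guide for the actual values.

```python
import json
import urllib.request

# Assumed address and path; this README does not state them.
DEFAULT_URL = "http://localhost:5000/v1/engines/codegen/completions"


def build_completion_request(prompt, url=DEFAULT_URL, max_tokens=16, temperature=0.1):
    """Build (but do not send) a JSON POST request for a code completion."""
    payload = {
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request to a running server and return the decoded JSON reply."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the payload without a running server. With a server up, `complete("def hello")` would return whatever JSON the server responds with.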

Terminology

API: Application Programming Interface
CC: Compute Capability
CUDA: Compute Unified Device Architecture
FT: FasterTransformer
JSON: JavaScript Object Notation
gRPC: A remote procedure call (RPC) framework developed by Google
GPT-J: A transformer model trained using Ben Wang's Mesh Transformer JAX
REST: REpresentational State Transfer