From 99cf7bcc9a4cb65e20bb672dd59682db5a76508d Mon Sep 17 00:00:00 2001
From: Jesse Johnson
Date: Tue, 4 Jul 2023 15:54:40 +0000
Subject: [PATCH 1/5] Update server instructions for web front end

---
 examples/server/README.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index ba4b2fec9d1df..40093a3442265 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -1,6 +1,6 @@
 # llama.cpp/example/server
 
-This example demonstrates a simple HTTP API server to interact with llama.cpp.
+This example demonstrates a simple HTTP API server and a simple web front end to interact with llama.cpp.
 
 Command line options:
 
@@ -21,6 +21,7 @@ Command line options:
 - `-to N`, `--timeout N`: Server read/write timeout in seconds. Default `600`.
 - `--host`: Set the hostname or ip address to listen. Default `127.0.0.1`.
 - `--port`: Set the port to listen. Default: `8080`.
+- `--public`: path from which to serve static files (default examples/server/public)
 - `--embedding`: Enable embedding extraction, Default: disabled.
 
 ## Build
@@ -59,7 +60,7 @@ server.exe -m models\7B\ggml-model.bin -c 2048
 ```
 
 The above command will start a server that by default listens on `127.0.0.1:8080`.
-You can consume the endpoints with Postman or NodeJS with axios library.
+You can consume the endpoints with Postman or NodeJS with axios library. You can visit the web front end at the same url.
 
 ## Testing with CURL
 

From fa4a48caccfd44868a54206df0b8846fa3e4c7b2 Mon Sep 17 00:00:00 2001
From: Jesse Johnson
Date: Wed, 5 Jul 2023 15:42:26 +0000
Subject: [PATCH 2/5] Update server README

Fix param for setting client path
Add example of front usage
---
 examples/server/README.md | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index 40093a3442265..83903a29a2c3a 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -21,7 +21,7 @@ Command line options:
 - `-to N`, `--timeout N`: Server read/write timeout in seconds. Default `600`.
 - `--host`: Set the hostname or ip address to listen. Default `127.0.0.1`.
 - `--port`: Set the port to listen. Default: `8080`.
-- `--public`: path from which to serve static files (default examples/server/public)
+- `--path`: path from which to serve static files (default examples/server/public)
 - `--embedding`: Enable embedding extraction, Default: disabled.
 
 ## Build
@@ -191,3 +191,27 @@ Run with bash:
 ```sh
 bash chat.sh
 ```
+
+### Extending the Web Front End
+
+The default location for the static files is `examples/server/public`. You can extend the front end by running the server binary with `--path` set to `./your-directory` and importing `/completion.js` to get access to the llamaComplete() method. A simple example is below:
+
+```
+<html>
+  <body>
+    <pre>
+      <script type="module">
+        import { llamaComplete } from '/completion.js'
+
+        llamaComplete({
+            prompt: "Tell me about the Pyramids of Giza",
+            n_predict: 1024,
+          },
+          null,
+          (chunk) => document.write(chunk.data.content)
+        )
+      </script>
+    </pre>
+  </body>
+</html>
+```
\ No newline at end of file

From 775a299a4bf65c5221ceec24a09b5202ef73d6b0 Mon Sep 17 00:00:00 2001
From: Jesse Johnson
Date: Wed, 5 Jul 2023 16:00:22 +0000
Subject: [PATCH 3/5] Remove duplicate OAI instructions

---
 examples/server/README.md | 16 ----------------
 1 file changed, 16 deletions(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index 4af433f2dcaf0..b8172119a9bea 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -208,22 +208,6 @@ openai.api_base = "http://<Your api-server IP>:port"
 
 Then you can utilize llama.cpp as an OpenAI's **chat.completion** or **text_completion** API
 
-### API like OAI
-
-API example using Python Flask: [api_like_OAI.py](api_like_OAI.py)
-This example must be used with server.cpp
-
-```sh
-python api_like_OAI.py
-```
-
-After running the API server, you can use it in Python by setting the API base URL.
-```python
-openai.api_base = "http://<Your api-server IP>:port"
-```
-
-Then you can utilize llama.cpp as an OpenAI's **chat.completion** or **text_completion** API
-
 ### Extending the Web Front End
 
 The default location for the static files is `examples/server/public`. You can extend the front end by running the server binary with `--path` set to `./your-directory` and importing `/completion.js` to get access to the llamaComplete() method. A simple example is below:

From bd08fa95228ac66b94039567b4281e30eb591e5f Mon Sep 17 00:00:00 2001
From: Jesse Johnson
Date: Wed, 5 Jul 2023 16:03:39 +0000
Subject: [PATCH 4/5] Fix duplicate text

---
 examples/server/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index b8172119a9bea..be03215d4353c 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -1,6 +1,6 @@
 # llama.cpp/example/server
 
-This example demonstrates a simple HTTP API server and a simple web front end and a simple web front end to interact with llama.cpp.
+This example demonstrates a simple HTTP API server and a simple web front end to interact with llama.cpp.
 
 Command line options:
 

From 3a47811cbe0a0ede8df3dd631714ce2f2a1b1c90 Mon Sep 17 00:00:00 2001
From: Jesse Johnson
Date: Wed, 5 Jul 2023 16:26:53 +0000
Subject: [PATCH 5/5] Fix error from editorconfig checker

---
 examples/server/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index be03215d4353c..037412d7634c8 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -230,4 +230,4 @@ The default location for the static files is `examples/server/public`. You can e
     </pre>
   </body>
 </html>
-```
\ No newline at end of file
+```
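
The README these patches touch suggests consuming the endpoints "with Postman or NodeJS with axios library". As an illustration only (not part of the patches above), here is a minimal Node.js sketch along those lines; it assumes the server is listening on the default `127.0.0.1:8080` and that the `/completion` endpoint accepts `prompt` and `n_predict` and returns the generated text in a `content` field, so check the README's "Testing with CURL" section for the exact request/response schema of your build.

```js
// Minimal sketch: POST to the server's /completion endpoint from Node.js with axios.
// Assumptions: the server runs on the default 127.0.0.1:8080, and the request/response
// fields (prompt, n_predict, content) match your server version.
const axios = require("axios");

async function main() {
  const { data } = await axios.post("http://127.0.0.1:8080/completion", {
    prompt: "Building a website can be done in 10 simple steps:",
    n_predict: 128,
  });
  // The generated text is expected in the `content` field of the JSON response.
  console.log(data.content);
}

main().catch(console.error);
```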
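
The OpenAI-style instructions that patch 3 de-duplicates can be exercised the same way over plain HTTP. A sketch, with the port and route as assumptions (the `api_like_OAI.py` wrapper picks its own defaults, so check its options before relying on these values): it sends an OpenAI-shaped `chat.completion` request to the wrapper rather than to `server` itself.

```js
// Sketch of calling the api_like_OAI.py wrapper with an OpenAI-style request.
// Assumptions: the wrapper listens on http://127.0.0.1:8081 and serves a
// /chat/completions route that mirrors OpenAI's schema; adjust both to your setup.
const axios = require("axios");

async function chat() {
  const { data } = await axios.post("http://127.0.0.1:8081/chat/completions", {
    model: "llama",
    messages: [{ role: "user", content: "Say hello in one sentence." }],
  });
  // An OpenAI-shaped response carries the reply under choices[0].message.content.
  console.log(data.choices[0].message.content);
}

chat().catch(console.error);
```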
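
Finally, a sketch of the kind of page that could be dropped into the directory passed via `--path ./your-directory`, building on the example patch 2 adds to the README. It assumes the same `llamaComplete(params, controller, callback)` shape and the same `chunk.data.content` callback payload as that example, so treat those names as assumptions and check `/completion.js` in your checkout before copying it.

```html
<!-- Hypothetical page for --path ./your-directory. Assumes the llamaComplete()
     helper and chunk shape shown in the README example above. -->
<html>
  <body>
    <h3>llama.cpp demo</h3>
    <pre id="output"></pre>
    <script type="module">
      import { llamaComplete } from '/completion.js'

      const output = document.getElementById('output')

      // Stream each generated chunk into the <pre> element instead of using document.write.
      llamaComplete({
          prompt: 'Building a website can be done in 10 simple steps:',
          n_predict: 256,
        },
        null,
        (chunk) => { output.textContent += chunk.data.content }
      )
    </script>
  </body>
</html>
```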