From 04b4d98deba25d173292c4ca14bc626e027fe1e7 Mon Sep 17 00:00:00 2001
From: Nick Hale <4175918+njhale@users.noreply.github.com>
Date: Tue, 27 Feb 2024 16:16:57 -0500
Subject: [PATCH 1/2] docs: add basic vision tool cookbook

Signed-off-by: Nick Hale <4175918+njhale@users.noreply.github.com>
---
 docs/docs/101-cookbook/vision.md | 48 ++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)
 create mode 100644 docs/docs/101-cookbook/vision.md

diff --git a/docs/docs/101-cookbook/vision.md b/docs/docs/101-cookbook/vision.md
new file mode 100644
index 00000000..5b04d397
--- /dev/null
+++ b/docs/docs/101-cookbook/vision.md
@@ -0,0 +1,48 @@
+# Vision
+
+In this example, let's say that we want to create a `.gpt` script that uses vision to describe an image on the internet in a certain style.
+
+1. Clone the vision tool repository from GitHub
+
+    ```shell
+    git clone https://github.com/gptscript-ai/vision
+    ```
+
+2. Install the tool's dependencies
+
+    ```shell
+    npm install
+    ```
+
+3. Create a file called `describe.gpt`.
+
+    ```yaml
+    Tools: sys.write, ./vision/tool.gpt
+    Args: url: (required) URL of the image to describe.
+
+    Describe the image at ${url} as if you were the narrator of a Wes Anderson film and write it to a file named description.txt.
+    ```
+
+4. Run the script.
+
+    ```shell
+    gptscript describe.gpt --url "https://github.com/gptscript-ai/vision/blob/main/examples/eiffel-tower.png?raw=true"
+    ```
+
+## Recap
+
+In this example, we have created a GPTScript that leverages the `vision` tool to describe an image at a given URL. We gave it some flexibility by specifiying an argument `url` that the user can provide when running the script. We also specified gave the script a specific style to generate the description with.
+
+Notable things to point out:
+
+#### Tools
+
+The `tools` directive was used here to reference the `vision` and the `sys.write` tools. GPTScript will know the tools available to it and will use them when it sees fit in the script.
+
+#### Args
+
+We used the `args` directive to specify the `url` argument that the user can provide when running the script. This is a required argument and the user must provide it when running the script.
+
+#### Interpolation
+
+The `${url}` syntax can be used to reference the `url` argument in the script. This is a made-up syntax and can be replaced with any other syntax that you prefer. The LLM should be able to understand the syntax and use the argument in the script.

From 99243ba97737b870fd71273ff65cbf974beb8497 Mon Sep 17 00:00:00 2001
From: Nick Hale <4175918+njhale@users.noreply.github.com>
Date: Fri, 1 Mar 2024 09:32:18 -0500
Subject: [PATCH 2/2] Update docs/docs/101-cookbook/vision.md

Co-authored-by: Tyler Slaton <54378333+tylerslaton@users.noreply.github.com>
Signed-off-by: Nick Hale <4175918+njhale@users.noreply.github.com>
---
 docs/docs/101-cookbook/vision.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/101-cookbook/vision.md b/docs/docs/101-cookbook/vision.md
index 5b04d397..9d24a9ee 100644
--- a/docs/docs/101-cookbook/vision.md
+++ b/docs/docs/101-cookbook/vision.md
@@ -31,7 +31,7 @@ In this example, let's say that we want to create a `.gpt` script that uses visi
 
 ## Recap
 
-In this example, we have created a GPTScript that leverages the `vision` tool to describe an image at a given URL. We gave it some flexibility by specifiying an argument `url` that the user can provide when running the script. We also specified gave the script a specific style to generate the description with.
+In this example, we have created a GPTScript that leverages the `vision` tool to describe an image at a given URL. We gave it some flexibility by specifying an argument `url` that the user can provide when running the script. We also gave the script a specific style to generate the description with.
 
 Notable things to point out:
 