2023-05-25-gpt-to-svg - Kruzenshtern Lab

# GPT to SVG

I have been progressively refining a drawing algorithm. This article gives a brief overview of the first stage, along with selected images generated by GPT.

As a recap, I started generating SVG images using GPT-3.5, which you can check out in [[2023-03-28-gpt-voyage|this post]]. These images were rather rudimentary, yet GPT managed to capture the ideas I was trying to illustrate.

The next steps were pretty simple. It is well known that GPT performs better when instructed to think in steps, so that is what I did. I wrote a basic guideline for GPT describing what it means to draw an illustration: break an object down into visual components, create a composition, and then convert everything into SVG elements.

I consider approximately 20% of the generated images successful, meaning they are recognizable or visually appealing. Certainly, that was a subjective judgment, but so is art. My bar was pretty low.

A gallery of 223 SVG illustrations generated during this stage is located in `/files/2023-05-25`. Unfortunately, I didn't save the original prompts and generated descriptions, so I am leaving the gallery as a historical artifact.

I have been running the experiments in fish shell, using `rsvg-convert` to render images directly in the terminal.

## Key findings

- GPT-3.5 often struggles to execute more than three tasks in a single prompt. I needed to break more complex requests down into multiple steps, which adds overhead in tokens.
- GPT-3.5 doesn't understand the ordering of elements. For instance, it may draw an object only to later cover it entirely with a large background. I needed to instruct GPT to draw elements in a specific order.
- While GPT-3.5 can identify that it needs to draw multiple elements (such as a cat's ears), it often places them on the top of the head instead of in symmetrical positions. I managed to design a prompt that occasionally resolves this issue, though it is not reliable and needs further improvement.
- Rarely, GPT-3.5 refuses to generate an SVG image, responding with the infamous phrase "as a large language model...". Few-shot prompting with an example or two fixed this 99% of the time.
- GPT-3.5 can generate invalid SVG files, for example, by using `xlink` attributes such as `xlink:href`. The solution is to explicitly blacklist certain elements and attributes.
- GPT-3.5 may sometimes draw SVG elements without color. The solution is to be explicit about the properties that matter most.
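
Several of these findings can be checked mechanically before an image is rendered. The following is a minimal Python sketch of such a check, not the code used in these experiments. The names `check_svg`, `covers_canvas`, and `SHAPES` are hypothetical, and the heuristics are my assumptions: a shape counts as "without color" if it has no `fill`, `stroke`, or `style` attribute of its own (inherited paint is ignored), and a "background" is a `<rect>` whose width and height are `100%` or equal to those of the root element.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
SHAPES = {"rect", "circle", "ellipse", "line", "polyline", "polygon", "path"}
PAINT_ATTRS = {"fill", "stroke", "style"}


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def covers_canvas(rect: ET.Element, root: ET.Element) -> bool:
    """Heuristic: width and height are '100%' or equal to the root's values."""
    for attr in ("width", "height"):
        value = rect.get(attr)
        if value is None or value not in ("100%", root.get(attr)):
            return False
    return True


def check_svg(svg_text: str) -> list[str]:
    """Return a list of human-readable problems found in an SVG string."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError as exc:
        # An undeclared prefix such as xlink: makes the XML ill-formed.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if local_name(root.tag) != "svg":
        problems.append("root element is not <svg>")

    for elem in root.iter():
        name = local_name(elem.tag)
        # Declared xlink attributes parse fine but are blacklisted here.
        if any(attr.startswith("{" + XLINK_NS + "}") for attr in elem.attrib):
            problems.append(f"<{name}> uses an xlink attribute")
        # Shapes with no paint attribute fall back to renderer defaults.
        if name in SHAPES and not (elem.attrib.keys() & PAINT_ATTRS):
            problems.append(f"<{name}> has no explicit fill or stroke")

    # Document order is paint order, so a full-canvas <rect> that is not
    # the first shape paints over everything drawn before it.
    shapes = [e for e in root.iter() if local_name(e.tag) in SHAPES]
    for elem in shapes[1:]:
        if local_name(elem.tag) == "rect" and covers_canvas(elem, root):
            problems.append("full-size <rect> drawn after other shapes")
    return problems
```

An empty list means none of these particular checks fired; it does not mean the image is recognizable. A non-empty list could be fed back to the model as a correction prompt, or used to discard the image before calling `rsvg-convert`.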