# General Tips
## Just write
- ...no, seriously. Just write. This is the single best piece of advice I can give you.
- Every single model people use for RP nowadays responds well, if not best, to plain text and prose - and always has.
- Worst case, even if you later decide you want a particular format for some benefit or another (usually saving tokens), you've still ended up with a solid set of notes about how you want the bot to act.
- Also, you've practiced writing. Good outcome regardless.
## Write things you're interested in
- Generally, write things you'd personally want to RP with.
- Being invested in any given project helps see it through to completion.
- It's easier to test them if you're interested in them.
- You guarantee one (1) happy user, when all's said and done.
- Obviously, you can still write things for other people (gifts? Themed contest entries?), and I encourage you to do so; it's a nice thing to do for people, and trying things outside of your normal comfort zone is a really good way to improve as a writer.
## Don't write directly in the frontend
- Bot-chatting frontends are good for testing and chatting, but they're usually not optimised for the act of *writing* bots. Whenever I try to write primarily in one, I find myself hopping between several different interfaces for the various elements of the character - description, alternate greetings, lorebook, example dialogue.
- Most web-based frontends don't save the definitions as you go; I've seen far too many people lamenting that the page reloaded, or that they accidentally navigated away from it, and they lost all their work.
- You should find a good notetaking application or organisational solution that works for you, rather than writing things directly in your frontend.
- I use and often shill [Obsidian](https://obsidian.md/), whilst I've seen other people mention using Google Docs, OneNote, Word, or Notion.
- NB: if you're using Word, [disable smart quotes](https://support.microsoft.com/en-us/office/smart-quotes-in-word-and-powerpoint-702fc92e-b723-4e3d-b2cc-71dedaf2f343) or replace them with regular quotation marks; plenty of LLMs can't consistently pair them, and frontends may not properly highlight text if they're mispaired (a quick clean-up sketch follows this list).
- Find something you like or can at least tolerate using, regardless.
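If you'd rather fix the quotes after the fact than fight Word's settings, here's a minimal sketch in Python; the mapping covers the common Word smart-punctuation characters, and you can extend it to taste.

```python
# Normalise curly quotes to straight ones before pasting definitions
# into a frontend.
SMART_QUOTES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark / apostrophe
})

def straighten(text: str) -> str:
    return text.translate(SMART_QUOTES)

print(straighten("\u201cHello,\u201d she said."))  # -> "Hello," she said.
```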
## Don't hold yourself to too-high standards out of the gate
- Your bot likely isn't going to be complete and perfect on your first draft. This is normal; almost all writing (and art in general) goes through multiple drafts before reaching a state the creator is happy with.
- Unless you're me, YOLO-publishing a pair of guides on writing and testing bots - but in that case, I can update as I go. Also, I'm still happy with how these read.
- If you come to think your bot isn't perfect or up to standard after publishing it, don't worry; every bot repository and frontend I've seen has functionality for updating a bot after it's published.
## It isn't programming
- If you're intimidated by anyone calling writing this kind of bot 'coding', or calling the definitions 'code'... don't be. It's **not** programming, and it's not rigidly specific about what input you hand it (well, it kind of is, but all of that is handled by your frontend for you; you **won't need to worry about it** most of the time).
## Formats
- There are plenty of formats out there, and whichever you decide to use, as long as it's human-readable it *should* work well with most LLMs.
- Though it's worth noting that LLMs without a proper system role (NovelAI primarily, though the recent Mistral models are weird about that too) can sometimes pick up some really strange formatting. These are uncommon, though - because they're not as good for RP, fewer users... use them.
- NovelAI loves copying your writing style. It'll pick up on uncapitalised lorebook entries, for example, and be unlikely to capitalise anything itself if they're inserted at too low a depth.
- That said, anyone using a particular LLM should be aware of its own quirks; don't worry too much about accommodating every single user.
- **When in doubt, prose works.**
## Context
- With the advent of workable 16k and 32k context models, saving context space for chat memory isn't as pressing an issue as it was back when the limit was 2k.
- You can probably safely assume most users have 8k context. Under these conditions, I typically recommend a soft limit of 1k permanent tokens. Going over by a few hundred is fine. Just try to make sure you're using tokens well; editing down is as important a skill as writing in the first place, and worth practicing (a quick way to sanity-check your count is sketched after this list).
- There *is* a sweet spot at the tail of the context where the LLM weighs tokens more highly when generating text - that's why the main prompt sits way back there. The shorter your bot's definitions are, the likelier they are to also fit into that sweet spot and stay relevant during generations.
- Ultimately, your bot needs as many tokens as it needs, permanent and temporary.
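If you want numbers rather than vibes, here's a rough sketch using OpenAI's [tiktoken](https://github.com/openai/tiktoken) library. Counts are tokenizer-specific - other models' tokenizers will disagree - and the definition text here is invented, so treat the output as a ballpark:

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the GPT-3.5/GPT-4 tokenizer; other models differ.
enc = tiktoken.get_encoding("cl100k_base")

definition = (
    "Seraphina is a reclusive forest guardian. She speaks softly, "
    "rarely above a whisper, and distrusts strangers carrying iron."
)

print(len(enc.encode(definition)), "permanent tokens")
```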
## Macros
- Familiarise yourself with the macros/placeholders your frontend of choice supports.
- For example, most frontends replace `{{char}}` with the character's name (or their in-chat name, on frontends that support one separate from the bot's name in navigation) before anything is sent to the LLM; likewise, `{{user}}` is replaced with the user's persona name (a toy version of this substitution is sketched after this list).
- These two macros are supported by almost every frontend; use them.
- Important note regarding JanitorAI: `{{char}}` is *apparently* only replaced for display in the frontend, and the literal string `{{char}}` is sent to the LLM.
- For non-JAI users, this is why we sometimes see `{{char}} is John.` constructions in people's bots.
- Outside of JAI, pretty much every frontend just resolves `{{char}} is John.` to `John is John.`; it won't break anything, but it's also a little pointless.
- Whilst `{{char}}` and `{{user}}` are pretty consistently supported everywhere, every single frontend supports a different range of macros; you'll need to refer to your frontend-of-choice's documentation for more detail.
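For the curious, here's a toy version of what a frontend's macro pass might look like. The names and values are hypothetical, and real frontends each have their own (more capable) macro engines; this is only meant to show the shape of the substitution:

```python
import re

# Hypothetical values; a real frontend pulls these from the card
# and the user's persona settings.
MACROS = {"char": "Seraphina", "user": "Anon"}

def expand_macros(text: str) -> str:
    """Replace {{macro}} placeholders, leaving unknown macros as-is."""
    def substitute(match: re.Match) -> str:
        return MACROS.get(match.group(1).lower(), match.group(0))
    return re.sub(r"\{\{(\w+)\}\}", substitute, text)

print(expand_macros("{{char}} waves at {{user}}."))
# -> Seraphina waves at Anon.
```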
## Language
- If you want the bot to output in a particular language, the most certain way to do that is to write the bot's definitions and everything else in that language.
- You might have some success with a main prompt and/or post-history/jailbreak prompt that just says 'use this language', but writing in that language from the get-go is far more reliable.
- This also assumes the LLM powering everything can handle the language. DeepSeek is probably great at Chinese, and Mistral great with French. A lot of Western models are great at English and can be passable at other European languages.
## Writer's Block
- If you're suffering from writer's block or in need of inspiration, take a break; read a book, watch a movie, listen to a podcast.
- Taking your mind off of things might help your ideas percolate for a bit.
- This is a hobby. It's supposed to be fun.
- Nobody wants you to burn out.
## Discouraging Unwanted Behaviour
- A tip regarding writing technique, finally?
- Also prompt-writing, I guess.
- It's hard to prompt behaviour out of an LLM directly. Simply going 'do not X' rarely works well.
- It can work with some models that excel at following directions, but broadly speaking, it's too unreliable across models.
- If your bot has a persistent behaviour you'd rather it not exhibit, it's much more effective to justify *why* the character doesn't or shouldn't do the thing, or to give them an alternative they'll do instead, than to simply state that they don't do it.
- Let's take `{{char}} does not eat grapefruits.` as an example.
- Simply putting `{{char}} does not eat grapefruits.` by itself is likely to be ineffective because, in the act of specifying that, you've put 'grapefruits' into context, so they're likelier to come up - and grapefruits are food, so the training data likely has a bias towards eating them. It might still work, but if the LLM is determined to eat that grapefruit, {{char}} is probably going to eat it.
- `{{char}} does not eat grapefruits; she's *allergic* to them, she might die!` is an example of justification; we've justified not eating grapefruits with an allergy. Allergies are things people don't usually want to trigger (and people also generally don't want to die), and this is likely to have been reflected in the training data.
- `{{char}} does not eat grapefruits; instead, he hucks them at the skulls of his enemies. They're a great projectile weapon.` is an example of redirection; we've given the character something to do with grapefruits instead of eating them. And after the grapefruit's been thrown, hopefully it's left in a less-edible state.
- This redirection might be less effective if {{char}} doesn't have any enemies around, though - or he might save grapefruits for the express purpose of throwing them, should any enemies emerge; some testing is required.
- As a bonus, we've given {{char}} some anti-grapefruit characterisation.
# Formats
## Prose/Plain Text
- Writing the bot's definitions in full prose - the way a book might describe things, or the backstory you might write for a character in a tabletop RPG.
- This isn't really a format so much as just writing, so I don't have too much to say about it. Or, rather, there's far too much to say about it; there are any number of books and guides on how to write creative fiction (or creative non-fiction) out there, and they go over the basics a whole lot better than I can.
- Also, reading as a hobby? If anyone asks, it's research.
- Prose definitions can range from simple, dry descriptions of fact, to text with tone written into it. Written well, prose bot definitions don't necessarily need example dialogue to set the tone of the output.
- One of the tricks you can do with prose bot definitions is writing a caveman/goblin with broken English by... simply writing the entire definition that way.
- Please use line breaks.
- Whilst LLMs might handle the bot fine without them, skipping line breaks makes the bot terrible to read through, should anyone want to.
- Including you.
- Probably. If you can read a solid wall of text without your eyes glazing over, hat's off to you.
- Moderately expensive in terms of tokens, but variable; you can pack a lot of detail into prose, depending on how you write it.
- For me this is the most natural way to write a bot, but it comes down to personal taste.
## Condensed Prose Formats
- [P-list](https://wikia.schneedc.com/bot-creation/trappu/creation), [Statuo's format](https://rentry.co/statuobotmakie#template-for-bot-creation), even [W++](https://rentry.org/PygTips) (and more besides; I don't remember all of them).
- Terse descriptions. Short traits. Compact. Separated by commas, semi-colons, or plusses.
- W++ wraps its little descriptive bits in quotation marks, and also separates them with plusses. This is... kind of overkill; remove the quotation marks and I can almost certainly guarantee the bot won't work any worse (the token-count sketch after this list shows the saving).
- That said, W++ *works* and if it's most comfortable for you to write and read, use it. Those quotation marks aren't *hurting* anything (except the token count).
- The main benefit of condensed prose formats is that they save on tokens; you can generally fit more detail into the same number of tokens than you would writing the definitions in full prose.
- Condensed prose formats tend to be harder to write tone into - they lean more heavily on example dialogue and the greeting to establish how the character speaks.
- You don't *need* to supply example dialogue, but without it LLMs are likelier to resort to model-default speech.
- I tinkered with this kind of format for a bit; it's workable, but too utilitarian for my tastes.
- Also I suck at example dialogue.
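To put a number on the W++ point above, here's the same hedged tiktoken check; the trait list is invented, and counts will vary by tokenizer, but the quotation marks and plusses consistently cost extra:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

w_plus_plus = 'Personality("grumpy" + "loyal" + "sweet tooth")'
stripped = "Personality(grumpy, loyal, sweet tooth)"

print(len(enc.encode(w_plus_plus)))  # more tokens...
print(len(enc.encode(stripped)))     # ...than this, for the same content
```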
## Ali:chat
- ...[ali:chat](https://rentry.co/alichat)
- Essentially, the definition is written as a conversation/example RP between {{char}} and some other character - usually an interviewer, so the LLM doesn't assume this is all stuff {{user}} should know.
- It's almost like example dialogue.
- Very good at encouraging the LLM to use a particular speech pattern or have particular speech tics.
- Very good if you want the LLM to hallucinate details well.
- Very expensive in terms of tokens for the amount of detail you're actually conveying.
- God, I suck at ali:chat.
- I'm bad at example dialogue, too. Sometimes a given bot-writing format just isn't what you're good at writing. It's still worth trying sometimes, but don't worry if you are bad at it. There are as many ways to write bots as there are authors; something will work for you.
## Other Formats
I know less about these. Take everything here with a pinch of salt, but these sections especially.
### JED
- Intended to be modular, apparently, but a lot of people seem to look at it and fill literally everything out, resulting in fairly high-token definitions.
- Seems to work fine, though.
### XML
- I'm not sure this really... works as people think it does. Most models, even up to GPT and Claude, aren't trained on XML for character definitions. GPT and Claude are likely to be able to parse XML as XML, in some respect, but it still amounts to fancy headings for sections more than anything, surely.
- It was popular early on, when people thought the best way to work with an LLM was to hand it stuff that looked like code or machine-readable input, but that's just not how LLMs work.
- Lesser models likely ignore the XML formatting entirely and just treat the tags as headings.
- Seems to work overall, though like W++ you'd probably save a bunch of tokens by just dropping most of the stuff that makes it XML (the sketch below shows the difference).
- NovelAI models probably hate this, formatting-wise.
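The same hedged token check applies to the XML point; the one-liner is invented, but the wrapping tags cost tokens that a bare heading doesn't:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

xml_style = "<personality>grumpy, loyal, sweet tooth</personality>"
plain = "Personality: grumpy, loyal, sweet tooth"

print(len(enc.encode(xml_style)))  # the tags cost extra tokens
print(len(enc.encode(plain)))
```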