Streamline the prompting

The config file now contains a new "system_prompt" value in the [llm] section. This is the same for all DECEIVE instances and configures how the emulation itself behaves; honeypot administrators should generally leave it intact. The prompt.txt file now focuses on what type of system to emulate, plus optional details such as valid users, content to stage on the system, etc.
David J. Bianco
2025-01-17 14:37:07 -05:00
parent 767104fa72
commit cda3c5496b
4 changed files with 56 additions and 36 deletions
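In code terms, the split works roughly like this (a sketch assembled from the hunks below, not a verbatim excerpt; it assumes langchain-core is installed and that the script runs from the SSH/ directory where config.ini and prompt.txt live):

```python
import configparser
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

config = configparser.ConfigParser()
config.read("config.ini")

# Shared across all DECEIVE instances: how the emulation itself behaves.
system_prompt = config["llm"]["system_prompt"]

# Per-deployment: what kind of system to emulate (users, staged files, etc.).
with open("prompt.txt", "r") as f:
    user_prompt = f.read()

llm_prompt = ChatPromptTemplate.from_messages([
    ("system", system_prompt),
    ("system", user_prompt),
    MessagesPlaceholder(variable_name="messages"),
])
```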

View File

@@ -38,6 +38,14 @@ The SSH server requires a TLS keypair for secure communications. From the top
Open the `SSH/config.ini` file and review the settings. Update the values as needed, paying special attention to the values in the `[llm]` section, where you will configure the LLM backend you wish to use, and to the `[user_accounts]` section, where you can configure the usernames and passwords you'd like the honeypot to support.
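If it helps to sanity-check the file, the snippet below (not part of the repository, just a sketch using Python's `configparser`; key names match the config.ini shown later in this commit) prints the configured backend and the account list:

```python
import configparser

config = configparser.ConfigParser()
config.read("SSH/config.ini")

print("LLM backend:", config["llm"].get("llm_provider"), config["llm"].get("model_name"))

# [user_accounts] holds one "username = password" entry per account; an empty
# value (e.g. "guest =") lets that user log in without a password.
for user, password in config["user_accounts"].items():
    print(user, "(no password required)" if password == "" else "(password set)")
```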
### Tell DECEIVE What it's Emulating
Edit the `SSH/prompt.txt` file to include a short description of the type of system you want it to pretend to be. You don't have to be very detailed here, though the more details you can provide, the better the simulation will be. You can keep it high level, like:
You are a video game developer's system. Include realistic video game source and asset files.
If you like, you can add whatever additional details you think will be helpful. For example:
You are the Internet-facing mail server for bigschool.edu, a state-sponsored university in Virginia. Valid user accounts are "a20093887", "a20093887-admin", and "mxadmin". Home directories are in "/home/$USERNAME". Everyone's default shell is /bin/zsh, except mxadmin's, which is bash. Mail spools for all campus users (be sure to include email accounts that are not valid for logon to this server) are in /var/spool/mail. Be sure to simulate some juicy emails there, but make them realistic. Some should be personal, but some should be just about the business of administering the school, dealing with students, applying for financial aid, etc. Make the spool permissions relaxed, simulating a misconfiguration that would allow anyone on the system to read the files.
## Running the Honeypot
To start the DECEIVE honeypot server, first make sure that you have set any environment variables required by your chosen LLM backend. For example, if you are using any of the OpenAI models, you will need to set the `OPENAI_API_KEY` variable like so:

View File

@@ -2,7 +2,6 @@
# We provide reasonable defaults for most configuration items, but you should
# review this before using it in production.
[honeypot]
# The name of the file you wish to write the honeypot log to.
log_file = ssh_log.log
@@ -37,17 +36,41 @@ model_name = gpt-4o
#llm_provider = gemini
#model_name = gemini-1.5-pro
# The plain text file containing the LLM system prompt.
# This is required, no matter which LLM you use.
system_prompt_file = prompt.txt
# The maximum number of tokens to send to the LLM backend in a single
# request. This includes the message history for the session, so should
# be fairly high. Not all models support large token counts, so be sure
# to test this with your model.
trimmer_max_tokens = 64000
# System prompt for the LLM. This will be the same for any running instance.
# Change this if you need to tweak behaviors of the simulation itself.
system_prompt = Interpret all inputs as though they were SSH commands and provide a realistic
    output. Unless otherwise specified, assume you are emulating a Linux operating
    system, so be sure to include realistic users, processes, and files. Do not
    include extraneous formatting in your responses. Also make sure that your
    responses do not indicate that the user is on a virtual machine, container,
    or emulation at all.
    On the first call, be sure to include a realistic MOTD.
    End all responses with a realistic shell prompt to display to the user,
    including a space at the end.
    Include ANSI color codes for the terminal with the output of ls commands
    (including any flags), or in any other situation where it is appropriate, but
    do not include the ``` code formatting around those blocks.
    Make sure all user and host names conform to some reasonable corporate naming
    standard. Never use obviously fake names like "Jane Doe" or just Alice, Bob, and Charlie.
    If at any time the user's input would cause the SSH session to close (e.g., if
    they exited the login shell), your only answer should be "XXX-END-OF-SESSION-XXX"
    with no additional output before or after. Remember that the user could start up
    subshells or other command interpreters, and exiting those subprocesses should not
    end the SSH session.
    Assume the username is {username}.
# The valid user accounts and passwords for the SSH server, in the
# form "username = password". Note that you can enable login without
# a password by leaving that field blank (e.g., "guest =" on a line by
@@ -56,3 +79,4 @@ trimmer_max_tokens = 64000
guest =
user1 = secretpw
user2 = password123

View File

@@ -1,27 +1,2 @@
You are a linux operating system accepting commands from a user via SSH.
Interpret all inputs as though they were SSH commands and provide a realistic
output. You are emulating a video game developer's system, so be sure to
include realistic users, processes, and files, especially video game source
and asset files. Do not include extraneous formatting in your responses.
Also make sure that your responses do not indicate that the user is on a
virtual machine, container, or emulation at all.
On the first call, be sure to include a realistic MOTD.
End all responses with a realistic shell prompt to display to the user,
including a space at the end.
Include ANSI color codes for the terminal with the output of ls commands
(including any flags), or in any other situation where it is appropriate, but
do not include the ``` code formatting around those blocks.
Make sure all user and host names conform to some reasonable corporate naming
standard. Never use obviously fake names like "Jane Doe" or just Alice, Bob, and Charlie.
If at any time the user's input would cause the SSH session to close (e.g., if
they exited the login shell), your only answer should be "XXX-END-OF-SESSION-XXX"
with no additional output before or after. Remember that the user could start up
subshells or other command interpreters, and exiting those subprocesses should not
end the SSH session.
Assume the username is {username}.
You are a video game developer's system. Include realistic video game source
and asset files.

View File

@@ -285,6 +285,15 @@ def choose_llm():
    return llm_model

def get_prompts() -> dict:
    system_prompt = config['llm']['system_prompt']
    with open("prompt.txt", "r") as f:
        user_prompt = f.read()
    return {
        "system_prompt": system_prompt,
        "user_prompt": user_prompt
    }
#### MAIN ####
# Always use UTC for logging
@@ -311,9 +320,9 @@ logger.addFilter(f)
# Now get access to the LLM
prompt_file = config['llm'].get("system_prompt_file", "prompt.txt")
with open(prompt_file, "r") as f:
    llm_system_prompt = f.read()
prompts = get_prompts()
llm_system_prompt = prompts["system_prompt"]
llm_user_prompt = prompts["user_prompt"]
llm = choose_llm()
@@ -334,6 +343,10 @@ llm_prompt = ChatPromptTemplate.from_messages(
"system",
llm_system_prompt
),
(
"system",
llm_user_prompt
),
MessagesPlaceholder(variable_name="messages"),
]
)
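For orientation, here is a hedged sketch of how the assembled template might be exercised once `llm` (from `choose_llm()`) and `llm_prompt` are in scope; the `username` key fills the `{username}` placeholder in the system prompt, while the real per-session handling lives outside these hunks:

```python
from langchain_core.messages import HumanMessage

# Assumes `llm` and `llm_prompt` from the code above are already defined.
chain = llm_prompt | llm
reply = chain.invoke({
    "username": "guest",                           # substitutes into "Assume the username is {username}."
    "messages": [HumanMessage(content="ls -la")],  # the attacker's input for this turn
})
print(reply.content)
```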