This version adds support for Azure OpenAI models. I'm not entirely happy with how each LLM provider has it's own set of params, and am investigating how to make these seem a little more unified, so this support may change in the future.
Also, Azure's content filter flags the "XXX-END-OF-SESSION-XXX" token as "sexual content", so I changed it to use "YYY" instead. I feel so protected!
If all of the necessary options are passed as command line flags, you may no longer even need a config file. in this case, don't complain that a config file wasn't provided. As part of this, allow the user to set the user account(s) using the -u flag.
The config template specified the default server version string as "SSH-2.0-OpenSSH_8.2p1 Ubuntu-4ubuntu0.3" but the SSH module automatically prepends "SSH-2.0-" to the beginning. This gave the version string returned to the client a potential fingerprint that could be used to easily identify DECEIVE honeypots. Updated the default value and added comments to document this behavior.
* 'sensor_name` is an arbitrary string that identifies the specific honeypot sensor that generated the log. Set it in the config.ini file. If not set, it will default to the honeypot system's hostname.
* 'sensor_protocol' identifies the specific protocol this honeypot sensor uses. For SSH, it's always "ssh" but as other protocols are added to DECEIVE in the future, this will have different values for their logs.
Setting a password to be "*" in the config file will cause the server to accept any password the client provides for that account, including an empty password.
SSH servers can take user commands from an interactive session as normal, but users can also include commands on the ssh client command line which are executed on the server (e.g., "ssh <hostname> 'uname -a'"). We now execute these non-interactive commands properly as well.
Also added a new "interactive" flag to all user commands (true/false) to show which type of command execution this was.
For whatever reason, MacOS returns 4 values from conn.get_extra_info('peername') and conn.get_extra_info('sockname'), but Linux systems only return 2. On the Mac, it's only the first two that we need anyway. Now we retrieve them all, no matter how many there are, and just use the first two so it will work on both platforms.
* --prompt-file to specify a file from which to read the prompt.
* --prompt to specify a prompt string on the command line
* --config to specify an alternate config file
The config file now contains a new "system_prompt" value in the [llm] section. This would be the same for all DECEIVE instances, and configures how the emulation itself will act. The honeypot administrator should mostly keep this intact. The prompt.txt file now focuses more on what type of system to emulate, and optional details such as valid users, contents to stage on the system, etc.
* All logging is now in JSON lines format!
* Fixed a bug where the session summary was generated twice for the same session
* Fixed a reversion in the exit handling when the user logged out gracefully.
* Session summaries now occur both at normal session termination (e.g., the user gracefully logs out) or abnormal termination, such as if the client disconnects suddenly.
* Now encode the AI results as UTF-8 instead of ASCII, because it would ocassionally send back non-ASCII characters which caused the server to throw errors