Prompt Injection

Severity: Medium
Test name: Prompt Injection
Summary

Prompt injections occur when carefully constructed prompts bypass filters or manipulate the Language Model (LLM) into disregarding prior commands or performing actions beyond its intended scope. These crafted inputs are engineered with precision to exploit vulnerabilities within the model, potentially resulting in data exposure, unauthorized access, or compromises to security integrity.

Impact

Data Leakage, Legal Issues, Token leaks.

Location

The issue can be found in the server response.

  • Tightened Privilege Management: Restrict the LLM's privileges to the minimum required for its operation, ensuring it cannot modify user settings without explicit authorization.
  • Robust Input Screening: Implement stringent validation and sanitation techniques for inputs to filter out potentially malicious prompts from untrustworthy sources.
  • Isolation and Oversight of External Content Engagement: Delineate and control the LLM's interaction with unverified content. Vigilantly oversee its engagement with external plugins or features that could initiate non-reversible operations or compromise sensitive information.
  • Trust Configuration: Establish definitive boundaries regarding the entities the LLM considers reliable, encompassing external data sources and supplementary functionalities. Operate under the premise that the LLM is not inherently secure, thereby maintaining user sovereignty over critical decisions.
Classifications
  • CWE-20
  • CVSS:3.1/AV:N/AC:H/PR:N/UI:R/S:C/C:H/I:N/A:N/E:F/RL:U/RC:C
References