Feature/ai assistant citations #205586

Draft
wants to merge 20 commits into base: main

KDKHD (Member) commented Jan 6, 2025

Summary

This PR adds citations to the Security AI Assistant. Citations are produced when tools are used, and they appear in the LLM response as numbered superscript elements. Hovering over a number shows a label, and clicking the label opens a new tab that displays the cited data.

Tools that are cited:

  • alert_counts_tool -> cites the alerts page
  • knowledge_base_retrieval_tool -> cites the knowledge base management page with the specific entry pre-filtered
  • open_and_acknowledged_alerts_tool -> cites the specific alert
  • security_labs_tool -> cites the knowledge base management page with the specific entry pre-filtered
  • knowledge_base indices -> cites the knowledge base management page with the specific index pre-filtered

Changes:

  • Tools return a citationElement string, e.g. !{citation[global_threat_report_entry](/app/management/kibana/securityAiAssistantManagement?tab=knowledge_base&entry_search_term=5e981c00-26bb-497b-a3ee-feb8bfe19bfc)}
  • A custom markdown parser extracts the citationElements from the LLM response and renders them in the AI assistant.
  • LLM system prompts were modified to steer the LLM toward including citations whenever possible. Many different prompts were tested, and "few-shot prompting" worked well for this use case.
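To make the citationElement format concrete, here is a minimal sketch of how such a string could be built by a tool and later extracted from an LLM response. It is not the parser added in this PR: the names `Citation`, `buildCitationElement`, and `extractCitations` are hypothetical, the `[n]` placeholder stands in for the rendered superscript, and the regular expression assumes that labels contain no `]` and URLs contain no `)`.

```typescript
// Hypothetical shape for a parsed citation; not the type used in the PR.
interface Citation {
  index: number; // 1-based number displayed to the user
  label: string; // text shown when hovering over the number
  href: string; // page opened when the label is clicked
}

// Builds a citationElement string in the format shown in the PR description.
function buildCitationElement(label: string, href: string): string {
  return `!{citation[${label}](${href})}`;
}

// Matches !{citation[label](href)}.
// Assumption: the label has no "]" and the href has no ")".
const CITATION_PATTERN = /!\{citation\[([^\]]*)\]\(([^)]*)\)\}/g;

// Replaces each citationElement with a numbered placeholder and
// collects the label and link for later rendering.
function extractCitations(response: string): { text: string; citations: Citation[] } {
  const citations: Citation[] = [];
  const text = response.replace(
    CITATION_PATTERN,
    (_match: string, label: string, href: string) => {
      const index = citations.length + 1;
      citations.push({ index, label, href });
      return `[${index}]`;
    }
  );
  return { text, citations };
}

// Usage with the example entry from the PR description.
const element = buildCitationElement(
  'global_threat_report_entry',
  '/app/management/kibana/securityAiAssistantManagement?tab=knowledge_base&entry_search_term=5e981c00-26bb-497b-a3ee-feb8bfe19bfc'
);
const parsed = extractCitations(`The report describes the threat.${element}`);
console.log(parsed.text);
console.log(parsed.citations);
```

Numbering citations in order of appearance keeps the sketch simple; a real renderer might also deduplicate repeated links, which this version does not attempt.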

Considerations:

  • One of the main objectives of this feature was to produce in-text citations that create a great user experience. Multiple approaches were tested to do this reliably. Attempts were made to have the LLM return structured JSON containing the citations; however, this was unreliable with smaller models. Generation post-processing (issuing an additional LLM call to annotate the response with citations) was also explored; however, the second LLM call did not have enough contextual information to create the citations reliably.

Eventually, direct prompting combined with citationElement gave the LLM enough contextual information to produce good citations, with no additional LLM calls required.
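As an illustration of the direct-prompting approach, the sketch below assembles a system prompt that includes few-shot examples showing the model how to carry a citationElement from a tool result into its answer. The prompt wording, the `FewShotExample` type, the `buildCitationSystemPrompt` function, and the `/hypothetical/alerts` URL are all assumptions made for this example; the actual prompts in the PR may differ.

```typescript
// Hypothetical pairing of a tool result with the answer the model should give.
interface FewShotExample {
  toolOutput: string;
  answer: string;
}

// Illustrative instruction text; not the wording used in the PR.
const CITATION_INSTRUCTION =
  'When a tool result contains a citation element such as !{citation[label](url)}, ' +
  'copy it verbatim directly after the sentence it supports.';

// Appends the citation instruction and numbered few-shot examples to a base prompt.
function buildCitationSystemPrompt(basePrompt: string, examples: FewShotExample[]): string {
  const shots = examples.map(
    (example, i) =>
      `Example ${i + 1}\nTool output: ${example.toolOutput}\nAnswer: ${example.answer}`
  );
  return [basePrompt, CITATION_INSTRUCTION, ...shots].join('\n\n');
}

// A single made-up example; the URL is a placeholder, not a real Kibana route.
const exampleShots: FewShotExample[] = [
  {
    toolOutput: 'Open alerts: 3 !{citation[Open alerts](/hypothetical/alerts)}',
    answer: 'There are 3 open alerts.!{citation[Open alerts](/hypothetical/alerts)}',
  },
];

const systemPrompt = buildCitationSystemPrompt('You are a security assistant.', exampleShots);
console.log(systemPrompt);
```

Because the citation travels inside the tool output and the model only has to copy it, no second LLM call is needed to attach citations after generation.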

[Attachments: image, Jan-07-2025.14-33-21.mov]

Checklist

Check that the PR satisfies the following conditions.

Reviewers should verify this PR satisfies this list as well.

  • Any text added follows EUI's writing guidelines, uses sentence case text and includes i18n support
  • Documentation was added for features that require explanation or tutorials
  • Unit or functional tests were updated or added to match the most common scenarios
  • If a plugin configuration key changed, check if it needs to be allowlisted in the cloud and added to the docker list
  • This was checked for breaking HTTP API changes, and any breaking changes have been approved by the breaking-change committee. The release_note:breaking label should be applied in these situations.
  • Flaky Test Runner was used on any tests changed
  • The PR description includes the appropriate Release Notes section, and the correct release_note:* label is applied per the guidelines

Identify risks

Does this PR introduce any risks? For example, consider risks like hard-to-test bugs, performance regressions, or potential data loss.

Describe the risk, its severity, and mitigation for each identified risk. Invite stakeholders and evaluate how to proceed before merging.

elasticmachine (Contributor) commented Jan 6, 2025

🤖 Jobs for this PR can be triggered through checkboxes. 🚧

ℹ️ To trigger the CI, please tick the checkbox below 👇

  • Click to trigger kibana-pull-request for this PR!
  • Click to trigger kibana-deploy-project-from-pr for this PR!

@KDKHD KDKHD force-pushed the feature/ai-assistant-citations branch from e0e6b14 to 7785e37 Compare January 7, 2025 14:18
elasticmachine (Contributor) commented Jan 7, 2025

💔 Build Failed

Failed CI Steps

Metrics [docs]

Module Count

Fewer modules lead to a faster build time

| id | before | after | diff |
| --- | --- | --- | --- |
| integrationAssistant | 466 | 468 | +2 |
| securitySolution | 6488 | 6492 | +4 |
| total | | | +6 |

Public APIs missing comments

Total count of every public API that lacks a comment. Target amount is 0. Run node scripts/build_api_docs --plugin [yourplugin] --stats comments for more detailed information.

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/elastic-assistant-common | 410 | 412 | +2 |

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

| id | before | after | diff |
| --- | --- | --- | --- |
| securitySolution | 21.4MB | 21.4MB | +3.0KB |

Unknown metric groups

API count

| id | before | after | diff |
| --- | --- | --- | --- |
| @kbn/elastic-assistant-common | 447 | 449 | +2 |

3 participants