In this article, we outline a number of reflections on the effective use of Artificial Intelligence models, and in particular Large Language Models (LLMs).
The starting idea is simple: to obtain reliable results, every application that leverages these tools should adhere to the principle of “zero use of prior knowledge”.
By following this principle, the model is prevented from relying on outdated information or on biases acquired during the training phase.
By explicitly providing, at the time of prompting, all the data needed to answer the question correctly, it becomes possible to increase the transparency and reliability of the result.
The structure of use cases: building a framework
To create effective and controlled applications, we propose a framework consisting of three components (which we will call slots), all of them essential for interacting with an LLM.
Slot 1: context (input)
This slot must contain all the information necessary to place the model in the best possible conditions to generate the requested response. A good context includes sufficient and relevant details that allow the model to clearly understand the overall framework in which the question is situated. For example, to obtain an effective summary of a business report, it is necessary to provide the full text or, at the very least, the key points of the document, thereby ensuring an adequate informational basis on which the model can work.
Slot 2: question (input)
This slot is where you state, precisely, clearly, and directly, what is expected from the model. A well-structured question guides the LLM effectively toward the expected result, minimizing the risk of vague or out-of-context answers. For example, a question such as “What are the key points covered in this report?” is far more effective and productive than a generic request such as “Summarize this document.”
Slot 3: answer (output)
This is the result generated by the model on the basis of the context and question previously provided. A correct formulation of the two preceding inputs encourages precise, coherent, and above all relevant answers. A well-structured approach significantly reduces the risk of obtaining ambiguous, incorrect, or off-topic responses.
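The three slots can be sketched as a simple prompt-assembly step. This is an illustrative outline, not any specific LLM API: the function name, layout, and closing instruction are assumptions, chosen to make the “zero use of prior knowledge” principle explicit in the prompt itself.

```python
# Minimal sketch of the three-slot framework: the context and question
# slots are assembled into one prompt string; the answer slot is simply
# whatever the model returns for that prompt.

def build_prompt(context: str, question: str) -> str:
    """Combine the context and question slots into a single prompt."""
    return (
        "Context:\n"
        f"{context}\n\n"
        "Question:\n"
        f"{question}\n\n"
        "Answer using only the context above; "
        "do not rely on any prior knowledge."
    )

prompt = build_prompt(
    context="Q3 revenue grew 12% year over year, driven by the EMEA region.",
    question="What are the key points covered in this report?",
)
```

Keeping the context and question as separate inputs, joined only at the last moment, makes it easy to log, test, and swap each slot independently.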
Use case: internal press review (passive RAG)
Below is a first concrete use case, relating to the management of an internal press review in a large company. In structured business contexts, the amount of available information is often too great for end users to consume in full.
To overcome this difficulty, it is possible to adopt an approach that combines content specific to the user’s business unit with a concise summary of other relevant company news.
Objective
The objective is to support the spread of knowledge within the organization, encouraging collaboration and dialogue between different business units, without overloading users with irrelevant or excessive information.
Phase 1: algorithmic step – preliminary content search
- Preliminary input: Each user has previously saved keywords representing their professional and thematic interests.
- Search: By using a full-text search engine, the system quickly identifies all company news items matching the selected keywords. These news items constitute the informational material, selected both from the user’s own business unit and from other relevant company content.
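The search step above can be illustrated with a deliberately simplified stand-in: an in-memory list of news items and case-insensitive substring matching. A production system would instead use a real full-text engine with its own query language and relevance ranking; the item texts and keywords here are invented for the example.

```python
# Toy version of Phase 1: select every news item that mentions at least
# one of the keywords the user has previously saved.

def search_news(news_items: list[str], keywords: list[str]) -> list[str]:
    """Return every news item matching at least one saved keyword."""
    lowered = [k.lower() for k in keywords]
    return [
        item for item in news_items
        if any(k in item.lower() for k in lowered)
    ]

news = [
    "New logistics hub opened in Rotterdam.",
    "HR announces updated remote-work policy.",
    "Logistics unit signs carbon-neutral shipping deal.",
]
hits = search_news(news, keywords=["logistics"])
# hits contains the two logistics-related items
```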
Phase 2: consult the LLM – generation of the press review
- Context slot (input): this is composed of the full text of the news items identified by the search engine and selected in the previous phase, in order to provide the model with a complete and structured basis to process.
- Question slot (input): here the model is given a clear and specific request: “Using this company news, create a concise press review highlighting the topics of interest to the user.”
- Answer slot (output): the model generates a press review, a clear and concise text that effectively summarizes the topics most relevant to the user. An algorithmic post-processing step then adds links to the original sources, offering a deeper and more personalized way to consume the information.
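Phase 2 can be sketched as filling the two input slots: the retrieved news items become the context and the fixed instruction becomes the question. The helper name and prompt layout are assumptions; the actual call to the model depends on whichever client the deployment uses and is left out.

```python
# Phase 2 sketch: assemble the press-review prompt from the news items
# selected by the full-text search in Phase 1.

def press_review_prompt(news_items: list[str]) -> str:
    """Fill the context slot with the news and the question slot with
    the fixed press-review instruction."""
    context = "\n\n".join(f"- {item}" for item in news_items)
    question = (
        "Using this company news, create a concise press review "
        "highlighting the topics of interest to the user."
    )
    return f"Context:\n{context}\n\nQuestion:\n{question}"

p = press_review_prompt([
    "New logistics hub opened in Rotterdam.",
    "Logistics unit signs carbon-neutral shipping deal.",
])
```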
Passive RAG approach
We can define this method as “passive RAG” because the query used for document retrieval derives exclusively from the keywords chosen by users and is not produced by the LLM itself. This approach makes it possible to combine the precision of full-text search with the generative model’s summarization and analytical capabilities, producing a highly personalized and manageable result.
Use case: automated replies to customer emails (active RAG)
This section presents a second application scenario: the effective handling of customer requests received by email, using the framework illustrated above.
Objective
The objective is to automatically extract the request expressed in a customer email in order to speed up the response process, avoiding repetitive manual activities and reducing the operational workload for the users involved in handling these requests.
Phase 1: identifying the question (first LLM invocation)
- Context slot (input): includes the full text of the email sent by the customer. Since emails may be written in very different formats and styles, the presence of details, even if apparently secondary, helps ensure a correct understanding of the request.
- Question slot (input): the LLM is prompted as follows: “This is an email from a customer, written in free form. Identify the main question and reformulate it as an explicit question to be used as a query for our internal search engine”.
- Answer slot (output): the model returns a concise and precise query that captures the customer’s need.
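The first invocation can be outlined as follows. Only the email text fills the context slot, and the question slot carries the instruction quoted above; the helper name is illustrative and the model call itself is omitted.

```python
# First-invocation sketch: ask the model to distill the customer's email
# into an explicit, search-engine-ready query.

def query_extraction_prompt(email_text: str) -> str:
    """Fill the context slot with the raw email and the question slot
    with the query-extraction instruction."""
    return (
        f"Context:\n{email_text}\n\n"
        "Question:\nThis is an email from a customer, written in free "
        "form. Identify the main question and reformulate it as an "
        "explicit question to be used as a query for our internal "
        "search engine."
    )

q = query_extraction_prompt("Hello, my parcel has not arrived yet.")
```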
Phase 2: intermediate algorithmic step – search in the corporate knowledge base
Without further LLM invocations, the query obtained in the previous phase is run against an internal knowledge base containing relevant company documents and information. From this search, the five most relevant documents are selected (which we might call “answer candidates”).
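This intermediate step can be illustrated with a toy ranking, assuming the knowledge base is just a list of documents scored by word overlap with the query. A real deployment would delegate this to a search engine or vector store; the point here is only the “keep the five best matches” selection, and the document texts are invented.

```python
# Toy version of Phase 2: rank documents by how many words they share
# with the query, then keep the top k as "answer candidates".

def top_answer_candidates(query: str, documents: list[str],
                          k: int = 5) -> list[str]:
    """Rank documents by word overlap with the query, keep the best k."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "Shipping policy for EU customers.",
    "Holiday schedule 2024.",
    "Refund policy overview.",
    "Shipping rates by region.",
    "Office seating plan.",
    "Warranty terms and conditions.",
]
candidates = top_answer_candidates("shipping policy", docs)
```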
Phase 3: generating the response (second LLM invocation)
- Context slot (input): the original email text and the content of the “answer candidate” documents retrieved in the previous algorithmic phase are provided.
- Question slot (input): the new request to the LLM is: “The customer sent this email and we have these company documents that should contain the answer. If you find relevant information in the documents, generate a reply email. If you do not find anything relevant, respond exactly with ‘HUMAN INTERVENTION NEEDED’”.
- Answer slot (output): the LLM directly generates the text of the reply email or clearly indicates the need for human intervention.
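Phase 3 can be sketched in the same style. Both the email and the candidate documents fill the context slot, the instruction carries the explicit fallback sentinel, and checking for that sentinel afterwards is an ordinary string comparison, so the routing decision stays outside the model. The helper names are assumptions.

```python
# Second-invocation sketch: build the reply prompt with the fallback
# sentinel, and route the model's output with a plain string check.

FALLBACK = "HUMAN INTERVENTION NEEDED"

def reply_prompt(email_text: str, candidates: list[str]) -> str:
    """Fill the context slot with the email plus the answer-candidate
    documents, and the question slot with the reply instruction."""
    docs = "\n\n".join(candidates)
    return (
        f"Context:\nCustomer email:\n{email_text}\n\n"
        f"Company documents:\n{docs}\n\n"
        "Question:\nThe customer sent this email and we have these "
        "company documents that should contain the answer. If you find "
        "relevant information in the documents, generate a reply email. "
        "If you do not find anything relevant, respond exactly with "
        f"'{FALLBACK}'."
    )

def needs_human(model_output: str) -> bool:
    """True when the model asked for the human-intervention fallback."""
    return model_output.strip() == FALLBACK
```

Because the sentinel is an exact string, downstream code can branch on it deterministically instead of trying to interpret free-form refusals.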
Active RAG approach
This process is defined as active RAG because, unlike the previous one, the query used in the search is generated directly by the model from the content of the email sent by the customer. This ensures flexibility, adaptability, and the ability to handle heterogeneous requests, making interaction with the end customer more effective.
Conclusions
This method, based on the principle of “zero use of prior knowledge”, guarantees transparency, control, and reliability of results. By effectively integrating generative models and document retrieval, it enables optimized management of both internal and external business communications, offering timely and precise responses to users’ needs.
Companies can assess this framework within their own organizational contexts, analyzing its benefits, critical issues, and opportunities, with the aim of improving their internal processes for communication and information management.