duallens_analytics.rag

RAG pipeline – retrieval + LLM generation.

This module implements the core Retrieval-Augmented Generation loop:

  1. Accept a user question.
  2. Retrieve the top-k relevant document chunks from the ChromaDB vector store.
  3. Assemble a prompt that injects the retrieved context.
  4. Invoke the Chat LLM and return the generated answer together with the source excerpts.
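
A minimal end-to-end usage sketch of that loop. It assumes Settings() can be constructed without arguments (e.g. from environment variables) and that vector_store.get_retriever accepts the settings object; both are assumptions about the neighbouring modules, not guarantees.

# Hypothetical usage; Settings() construction and the get_retriever(settings)
# signature are assumptions about duallens_analytics.config / vector_store.
from duallens_analytics.config import Settings
from duallens_analytics.rag import query_rag
from duallens_analytics.vector_store import get_retriever

settings = Settings()                # assumed: defaults / environment-driven
retriever = get_retriever(settings)  # assumed signature
answer, sources = query_rag(
    "Which AI initiatives has the company announced?", retriever, settings
)
print(answer)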

QNA_SYSTEM_MESSAGE (module attribute) = "You are an assistant specialised in reviewing AI initiatives of companies and providing accurate answers based on the provided context.\n\nUser input will include all the context you need to answer their question.\nThis context will always begin with the token: ###Context.\nThe context contains references to specific AI initiatives, projects, or programmes of companies relevant to the user's question.\n\nInstructions:\n- Answer ONLY based on the context provided.\n- If the context is insufficient, clearly state that.\n- Your response should be well-structured and concise.\n"

System message instructing the LLM to answer only from provided context.

QNA_USER_TEMPLATE (module attribute) = '###Context\nHere are some documents that are relevant to the question mentioned below.\n{context}\n\n###Question\n{question}\n'

User-message template with {context} and {question} placeholders.
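
For illustration, the two constants are combined roughly as follows; the example context and question are placeholders, and the [INST] wrapping mirrors the assembly step shown in query_rag below.

# Sketch of prompt assembly from the two module-level templates.
from duallens_analytics.rag import QNA_SYSTEM_MESSAGE, QNA_USER_TEMPLATE

context = "Chunk one text. Chunk two text."      # retrieved page_content strings, joined
question = "What is the company's AI strategy?"  # illustrative question
user_message = QNA_USER_TEMPLATE.format(context=context, question=question)
prompt = f"[INST]{QNA_SYSTEM_MESSAGE}\nuser: {user_message}[/INST]"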

get_llm(settings)

Return a configured langchain_openai.ChatOpenAI instance.

All LLM hyper-parameters (model, temperature, max_tokens, etc.) are read from the supplied duallens_analytics.config.Settings.

Parameters:

    settings (Settings): Application settings. Required.

Returns:

    ChatOpenAI: A ready-to-invoke ChatOpenAI object.

Source code in src/duallens_analytics/rag.py
def get_llm(settings: Settings) -> ChatOpenAI:
    """Return a configured :class:`~langchain_openai.ChatOpenAI` instance.

    All LLM hyper-parameters (model, temperature, max_tokens, etc.) are
    read from the supplied :class:`~duallens_analytics.config.Settings`.

    Args:
        settings: Application settings.

    Returns:
        A ready-to-invoke ``ChatOpenAI`` object.
    """
    logger.debug(
        "Creating ChatOpenAI: model=%s, temp=%.2f, max_tokens=%d",
        settings.model,
        settings.temperature,
        settings.max_tokens,
    )
    return ChatOpenAI(
        model=settings.model,
        temperature=settings.temperature,
        max_tokens=settings.max_tokens,
        top_p=settings.top_p,
        frequency_penalty=settings.frequency_penalty,
    )
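
A short calling sketch, assuming Settings() can be constructed without arguments and supplies the fields read above (model, temperature, max_tokens, top_p, frequency_penalty), and that an OpenAI API key is available in the environment:

from duallens_analytics.config import Settings
from duallens_analytics.rag import get_llm

settings = Settings()  # assumed: provides model, temperature, max_tokens, top_p, frequency_penalty
llm = get_llm(settings)
reply = llm.invoke("Summarise retrieval-augmented generation in one sentence.")
print(reply.content)   # ChatOpenAI returns a message object with a .content string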

query_rag(question, retriever, settings)

Execute the full RAG loop: retrieve context then generate an answer.

Parameters:

    question (str): Natural-language question from the user. Required.
    retriever (VectorStoreRetriever): A LangChain retriever backed by the ChromaDB
        vector store (see duallens_analytics.vector_store.get_retriever). Required.
    settings (Settings): Application settings forwarded to get_llm. Required.

Returns:

    tuple[str, list[str]]: A tuple of (answer_text, source_excerpts), where
        source_excerpts is a list of the raw page_content strings from the
        retrieved document chunks.

Source code in src/duallens_analytics/rag.py
def query_rag(
    question: str,
    retriever: VectorStoreRetriever,
    settings: Settings,
) -> tuple[str, list[str]]:
    """Execute the full RAG loop: retrieve context then generate an answer.

    Args:
        question: Natural-language question from the user.
        retriever: A LangChain retriever backed by the ChromaDB vector
            store (see :func:`~duallens_analytics.vector_store.get_retriever`).
        settings: Application settings forwarded to :func:`get_llm`.

    Returns:
        A tuple of ``(answer_text, source_excerpts)`` where
        *source_excerpts* is a list of the raw ``page_content`` strings
        from the retrieved document chunks.
    """
    logger.info("RAG query: %s", question[:120])
    docs = retriever.invoke(question)
    context_parts = [d.page_content for d in docs]
    logger.debug("Retrieved %d context chunks", len(context_parts))
    context = ". ".join(context_parts)

    prompt = (
        f"[INST]{QNA_SYSTEM_MESSAGE}\n"
        f"user: {QNA_USER_TEMPLATE.format(context=context, question=question)}"
        f"[/INST]"
    )

    llm = get_llm(settings)
    logger.debug("Invoking LLM for RAG answer")
    resp = llm.invoke(prompt)
    logger.info("RAG answer generated (%d chars)", len(resp.content))
    return resp.content, context_parts
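
A small sketch of consuming the returned tuple, continuing from the usage example at the top of the page; the display format is illustrative only.

answer, sources = query_rag(question, retriever, settings)
print(answer)
for i, excerpt in enumerate(sources, start=1):
    # Each entry is the raw page_content of one retrieved chunk.
    print(f"[source {i}] {excerpt[:200]}")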