Context Management

Mellea manages context using two complementary mechanisms:

1. Components themselves, which generally contain all of the context needed for a single-turn request. MObjects manage context using fields and methods, and Instructions have a grounding_context for RAG-style requests.
2. The Context, which stores and represents a (sometimes partial) history of all previous requests to the LLM made during the current session.

We have already seen how Components can be used to define the context of an LLM request, so this chapter focuses on the Context mechanism. When you call start_session(), you are actually instantiating a MelleaSession with a default inference engine, a default model choice, and a default context manager. The following code is equivalent to calling start_session() with no arguments:

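from mellea import MelleaSession
# OllamaBackend, IBM_GRANITE_3_3_8B, and SimpleContext must be imported as well;
# their module paths are not given in this chapter.

# Build the session explicitly instead of relying on the defaults.
m = MelleaSession(
    backend=OllamaBackend(model_id=IBM_GRANITE_3_3_8B),
    context=SimpleContext(),
)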

The SimpleContext, which is the only context we have used so far, is a context manager that resets the chat message history on each model call. That is, the model’s context is entirely determined by the current Component. Mellea also provides a LinearContext, which behaves like a chat history. We can use the LinearContext to interact with chat models:

# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/context_example.py#L1-L5
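from mellea import start_session
# LinearContext must be imported as well; its module path is not given in this chapter.

# Keep a running chat history across calls.
m = start_session(ctx=LinearContext())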
m.chat("Make up a math problem.")
m.chat("Solve your math problem.")
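
To make the difference between the two context managers concrete, here is a minimal, library-free sketch. ToySimpleContext and ToyLinearContext are hypothetical stand-ins written for this illustration; they are not Mellea's classes and only mimic the behavior described above (reset on every call versus an accumulating chat history). The assistant reply is hard-coded, since no model is involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToySimpleContext:
    """Hypothetical stand-in: forgets everything before each new call."""

    messages: list = field(default_factory=list)

    def begin_call(self, user_msg: str) -> list:
        # Reset the history so only the current request is visible.
        self.messages = [("user", user_msg)]
        return list(self.messages)

    def record_output(self, text: str) -> None:
        # Store the (here: hard-coded) assistant reply.
        self.messages.append(("assistant", text))


@dataclass
class ToyLinearContext(ToySimpleContext):
    """Hypothetical stand-in: appends each call to a running chat history."""

    def begin_call(self, user_msg: str) -> list:
        # Keep earlier turns and add the new request at the end.
        self.messages.append(("user", user_msg))
        return list(self.messages)


def run_two_turns(ctx) -> list:
    """Return the messages a model would see on the second call."""
    ctx.begin_call("Make up a math problem.")
    ctx.record_output("What is 2 + 3?")
    return ctx.begin_call("Solve your math problem.")


simple_view = run_two_turns(ToySimpleContext())
linear_view = run_two_turns(ToyLinearContext())

print(len(simple_view))  # 1: only the second request
print(len(linear_view))  # 3: first request, reply, second request
```

With the resetting context, the second request arrives without the problem it refers to, which is why a chat-style exchange like the one above needs a history-keeping context.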

The Context object provides a few useful helpers for introspecting on the current model context; for example, you can always get the last model output:

# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/context_example.py#L7
print(m.ctx.last_output())

or the entire last turn (user query + assistant response):

# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/context_example.py#L9
print(m.ctx.last_turn())
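
The idea behind these two helpers can be sketched without the library. ToyHistory below is a hypothetical class written for this illustration; its method names mirror last_output() and last_turn(), but the return types (a plain string and a pair of messages) are assumptions, since this chapter does not specify what Mellea's helpers return.

```python
from typing import Optional

Message = tuple[str, str]  # (role, text); an assumed, simplified message shape


class ToyHistory:
    """Hypothetical chat history with two introspection helpers."""

    def __init__(self) -> None:
        self.messages: list[Message] = []

    def add(self, role: str, text: str) -> None:
        self.messages.append((role, text))

    def last_output(self) -> Optional[str]:
        # Scan backwards for the most recent assistant message.
        for role, text in reversed(self.messages):
            if role == "assistant":
                return text
        return None

    def last_turn(self) -> Optional[tuple[Message, Message]]:
        # Scan backwards for the most recent user message that is
        # immediately followed by an assistant message.
        for i in range(len(self.messages) - 1, 0, -1):
            current, previous = self.messages[i], self.messages[i - 1]
            if current[0] == "assistant" and previous[0] == "user":
                return previous, current
        return None


history = ToyHistory()
history.add("user", "Make up a math problem.")
history.add("assistant", "What is 2 + 3?")
history.add("user", "Solve your math problem.")
history.add("assistant", "2 + 3 = 5")

print(history.last_output())
print(history.last_turn())
```

Returning None for an empty history is a design choice of this sketch; a real context may signal the same condition differently.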
