IBM Granite Releases Mellea 0.4.0 Libraries
IBM Granite announced Mellea 0.4.0 and three LoRA-based libraries for RAG, validation, and safety on granite-4.0-micro.
IBM Granite shipped Mellea 0.4.0 and three new Granite Libraries for ibm-granite/granite-4.0-micro on March 20. For developers building structured AI systems, the release matters because it moves common pipeline tasks such as RAG validation, attribution, uncertainty scoring, and safety checks out of prompt templates and into callable LoRA adapters integrated directly into Mellea workflows.
The immediate change is architectural. IBM is packaging narrow LLM operations as reusable components instead of relying on a single general model plus more prompting. If you build RAG systems, agent pipelines, or safety gates, this gives you a more explicit way to compose retrieval, validation, and repair steps around a 3B base model with 128K context.
Release Scope
The March 20 release combines one library update and three adapter collections:
| Component | Scope | Key additions |
|---|---|---|
| Mellea 0.4.0 | Python library for generative programs | Granite Library integration, rejection-sampling repair flows, event-driven observability hooks |
| granitelib-rag-r1.0 | Agentic RAG adapters | 6 adapters for retrieval and answer validation |
| granitelib-core-r1.0 | Verification and explainability adapters | 3 adapters for attribution, requirement checks, and uncertainty |
| granitelib-guardian-r1.0 | Safety and factuality adapters | 4 capabilities for guardrails, factuality, and policy checks |
All three Granite Libraries target ibm-granite/granite-4.0-micro, IBM’s 3B parameter decoder-only dense transformer with a 128K sequence length.
Adapter Design
Each library breaks a larger application concern into smaller typed operations.
granitelib-rag-r1.0 includes six adapters: Query Rewrite, Query Clarification, Context Relevance, Answerability Determination, Hallucination Detection, and Citation Generation. IBM positions these across pre-retrieval, pre-generation, and post-generation stages. This lines up with the broader shift from monolithic RAG to multi-stage retrieval pipelines, similar to the design pressure behind agentic retrieval systems.
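The staging described above can be sketched as a plain-Python pipeline. The stub functions below are hypothetical stand-ins for the granitelib-rag-r1.0 adapters (the real ones are LoRA calls on granite-4.0-micro invoked through Mellea); the point is only where each check sits relative to retrieval and generation.

```python
def query_rewrite(query: str) -> str:
    # Pre-retrieval: normalize the query before it hits the retriever.
    return query.strip().lower()

def context_relevance(query: str, passage: str) -> float:
    # Pre-generation: score each retrieved passage (toy lexical overlap).
    terms = set(query.split())
    return len(terms & set(passage.lower().split())) / max(len(terms), 1)

def answerability(query: str, passages: list[str]) -> bool:
    # Pre-generation: decide whether the kept context can answer at all.
    return len(passages) > 0

def hallucination_check(answer: str, passages: list[str]) -> bool:
    # Post-generation: flag answer tokens unsupported by the context.
    supported = set(" ".join(passages).lower().split())
    return all(tok in supported for tok in answer.lower().split())

def answer_with_validation(query, retriever, generator, threshold=0.3):
    # Compose the stages: rewrite -> retrieve -> filter -> gate -> check.
    q = query_rewrite(query)
    passages = [p for p in retriever(q) if context_relevance(q, p) >= threshold]
    if not answerability(q, passages):
        return {"answer": None, "reason": "unanswerable"}
    answer = generator(q, passages)
    if not hallucination_check(answer, passages):
        return {"answer": None, "reason": "hallucination"}
    return {"answer": answer, "reason": "ok"}
```

Because each stage is a separate callable, a failing check short-circuits with a typed reason instead of producing an unvalidated answer.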
granitelib-core-r1.0 is the verification layer. It includes Context Attribution, Requirement Check, and Uncertainty. The most important detail is that uncertainty returns a calibrated certainty percentage, where answers assigned X percent are intended to be correct about X percent of the time. If you already use LLM evaluation, this gives you a runtime signal that can feed routing, retry, or human-review thresholds instead of serving only as an offline metric.
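A calibrated certainty score makes threshold-based routing straightforward. The thresholds below are illustrative, not values IBM publishes; the sketch only shows how a runtime score can pick between serving, human review, and retry.

```python
def route_by_uncertainty(certainty: float,
                         serve_threshold: float = 0.75,
                         review_threshold: float = 0.40) -> str:
    # certainty is the calibrated score from an Uncertainty-style adapter:
    # an answer scored 0.80 should be correct about 80% of the time.
    if certainty >= serve_threshold:
        return "serve"         # confident enough to return directly
    if certainty >= review_threshold:
        return "human_review"  # plausible, but worth a second look
    return "retry"             # low confidence: re-retrieve or regenerate
```

Because the score is calibrated, the thresholds map directly onto acceptable error rates rather than opaque model logits.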
granitelib-guardian-r1.0 covers Guardian Core, Factuality Detection, Factuality Correction, and Policy Guardrails. Guardian Core evaluates prompts and responses for risks including safety issues, jailbreaking, profanity, violence, sexual content, social bias, unethical behavior, tool-call hallucinations, and RAG-related risks. Outputs are structured JSON, and Policy Guardrails can return an additional “Ambiguous” state.
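Structured JSON verdicts with a three-way outcome suggest a gate like the one below. The field names (`label`, the `"Compliant"` value) are assumptions for illustration; the source only states that outputs are structured JSON and that Policy Guardrails can return an "Ambiguous" state.

```python
import json

def gate_response(verdict_json: str) -> str:
    # Parse a hypothetical Policy Guardrails verdict and map the
    # three-way outcome onto a pipeline action.
    verdict = json.loads(verdict_json)
    label = verdict.get("label")
    if label == "Compliant":
        return "pass"
    if label == "Ambiguous":
        return "escalate"  # route to a human or a stricter check
    return "block"
```

The useful property is that "Ambiguous" becomes an explicit branch instead of being collapsed into allow or deny.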
Mellea 0.4.0 Changes the Execution Model
The library release would be less useful without Mellea 0.4.0. The new version adds native support for these adapters as first-class intrinsics inside structured workflows.
Several additions stand out:
- Guardianlib intrinsics
- `find_context_attributions()`, `requirement_check`, and `uncertainty` as core intrinsics
- hook system and plugin support
- OTLP logging export
- OpenTelemetry metrics support
- configurable OTLP and Prometheus exporters
- token usage metrics
This pushes Mellea closer to an application runtime than a thin orchestration wrapper. Type hints become schemas, requirements can be checked before output leaves a session, and failed checks can trigger repair attempts through rejection sampling. If your team is already working on structured outputs or LLM observability, this release connects both concerns in one stack.
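The repair pattern the paragraph describes can be sketched without the library. This is a plain-Python rejection-sampling loop, not Mellea's actual API: generate a candidate, run it through named requirement checks, and retry until the checks pass or attempts run out.

```python
def rejection_sample(generate, checks, max_attempts=3):
    """Generate, validate against requirement checks, retry on failure.

    generate: callable taking the attempt number, returning a candidate.
    checks: dict mapping a check name to a predicate on the candidate.
    Returns (candidate, attempts_used, failed_check_names).
    """
    last = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(attempt)
        failed = [name for name, check in checks.items() if not check(candidate)]
        if not failed:
            return candidate, attempt, []
        last = (candidate, failed)
    # All attempts failed: surface the last candidate and what it failed.
    return last[0], max_attempts, last[1]
```

Keeping the failed check names in the return value is what makes the loop observable: telemetry can record which requirement forced each retry.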
Practical Tradeoffs
The release is narrowly scoped, which is also the point. These libraries are built only for granite-4.0-micro, not as general adapters across arbitrary base models. If you are standardizing on Granite, the integration is cleaner. If your stack spans multiple vendors, you are choosing a more model-specific workflow abstraction.
The adapter counts are also a useful signal about intent:
| Library | Adapter / capability count |
|---|---|
| granitelib-rag-r1.0 | 6 |
| granitelib-core-r1.0 | 3 |
| granitelib-guardian-r1.0 | 4 |
IBM is defining a catalog of narrowly bounded operations rather than a broad agent framework. This complements orchestration layers more than it replaces them. If you compare agent stacks regularly, this fits closer to specialized skills than to end-to-end frameworks, much like the distinction between agent skills and orchestration rules.
Deployment Implications
The base model choice matters. granite-4.0-micro is a 3B model with 128K context, which keeps the footprint smaller than frontier-scale alternatives while still supporting long-context enterprise workflows. The RAG library page lists 14.4M parameters, reflecting the lightweight adapter approach.
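The small adapter footprint follows from LoRA arithmetic. The dimensions and rank below are illustrative, not claims about granite-4.0-micro's internals; they only show why adapter counts in the millions, not billions, are the expected order of magnitude.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # A rank-r LoRA adapter replaces a full d_in x d_out weight update
    # with two low-rank factors: A (d_in x r) and B (r x d_out).
    return rank * (d_in + d_out)

# Illustrative numbers only: a 2048-wide projection at rank 32 costs
# 32 * (2048 + 2048) = 131,072 parameters per adapted matrix, so even
# ~100 adapted matrices land in the low tens of millions -- the same
# order of magnitude as the 14.4M figure, and far below the 3B base.
```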
For production teams, the real value is operational control. You can separate retrieval rewriting from answerability checks, separate generation from factuality correction, and separate response production from policy gating. Each step can be monitored with telemetry, scored independently, and retried selectively. This is a better fit for systems where context engineering and compliance matter more than squeezing every task through one prompt.
If you run Granite models in production, the next step is straightforward: map your current prompt chain to explicit validation points, then replace the highest-risk stages, usually answerability, hallucination checks, and policy review, with adapter-backed intrinsics you can observe and enforce.