
OpenAI Releases 1.5B Privacy Filter MoE for PII Redaction

OpenAI released an open-weight, 1.5 billion parameter model designed to detect and redact personally identifiable information locally before cloud processing.

OpenAI released OpenAI Privacy Filter on April 22, 2026: an open-weight model built to detect and redact personally identifiable information (PII) in text. As detailed by VentureBeat, the release gives developers a specialized tool for sanitizing sensitive data on-device before it reaches cloud infrastructure. Distributed under the Apache 2.0 license, the model shifts PII detection away from cloud APIs and brittle regex patterns toward local, context-aware processing.

Architecture and Inference Model

The system relies on a Mixture-of-Experts (MoE) architecture designed for high throughput. It contains 1.5 billion total parameters but activates only 50 million per token during inference. This sparse activation keeps memory and compute requirements low while preserving the capacity needed for contextual evaluation, the core advantage of MoE designs.
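Sparse activation can be sketched in a few lines. The toy routing layer below is illustrative only, not OpenAI's implementation: the expert count, dimensions, and top-k value are made up, and real experts are MLP blocks rather than single matrices. The point it shows is that a router scores every expert for each token but executes only the top-k.

```python
import numpy as np

rng = np.random.default_rng(0)

def moe_layer(x, experts, router_w, top_k=2):
    """Sparse MoE forward pass for a single token vector x.

    Only the top_k highest-scoring experts run, so per-token compute
    stays small even when the total parameter count is large.
    """
    scores = router_w @ x                 # one routing score per expert
    active = np.argsort(scores)[-top_k:]  # indices of the experts that will run
    weights = np.exp(scores[active])
    weights /= weights.sum()              # softmax over the chosen experts only
    out = np.zeros_like(x)
    for w, i in zip(weights, active):
        out += w * experts[i](x)          # inactive experts never execute
    return out, active

d, n_experts = 8, 16
# Each "expert" here is a tiny linear map standing in for a full MLP block.
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda x, m=m: m @ x for m in mats]
router_w = rng.normal(size=(n_experts, d))

y, active = moe_layer(rng.normal(size=d), experts, router_w, top_k=2)
```

With 16 experts and top-2 routing, each token touches one eighth of the expert parameters, mirroring (at toy scale) how 1.5B total parameters can yield only 50M active per token.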

OpenAI adapted the model from an autoregressive pretrained checkpoint in the gpt-oss family, transforming it into a bidirectional token-classification stack with a pre-norm transformer encoder. Instead of generating text sequentially, the model labels an entire input sequence in a single forward pass. The decoding step uses a constrained Viterbi procedure to output coherent spans, stabilizing the boundaries around detected entities.
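The constrained decoding step can be illustrated with a toy BIO tag set. This is a minimal sketch under assumed details, not OpenAI's decoder: three hand-picked tags, hard-coded emission scores, and a Viterbi pass that forbids an `I-` tag from following anything except a matching `B-` or `I-` tag, which is what keeps entity boundaries coherent.

```python
import numpy as np

TAGS = ["O", "B-EMAIL", "I-EMAIL"]  # tiny BIO tag set for illustration

def allowed(prev, curr):
    """BIO constraint: I-X may only follow B-X or I-X."""
    if curr.startswith("I-"):
        ent = curr[2:]
        return prev in (f"B-{ent}", f"I-{ent}")
    return True

def viterbi(emissions):
    """Highest-scoring tag sequence that satisfies the BIO constraints."""
    n, t = emissions.shape
    score = np.full((n, t), -np.inf)
    back = np.zeros((n, t), dtype=int)
    for j, tag in enumerate(TAGS):
        if not tag.startswith("I-"):  # a sequence cannot start inside a span
            score[0, j] = emissions[0, j]
    for i in range(1, n):
        for j, curr in enumerate(TAGS):
            for k, prev in enumerate(TAGS):
                cand = score[i - 1, k] + emissions[i, j]
                if allowed(prev, curr) and cand > score[i, j]:
                    score[i, j] = cand
                    back[i, j] = k
    path = [int(np.argmax(score[-1]))]
    for i in range(n - 1, 1 - 1, -1):
        path.append(back[i, path[-1]])
    return [TAGS[j] for j in reversed(path)]

# Per-token tag scores: token 1's raw scores favor I-EMAIL, but the
# constraint forces a legal B-EMAIL to precede it.
em = np.array([
    [1.0, 0.2, 0.9],
    [0.1, 0.3, 2.0],
    [0.1, 1.5, 0.2],
])
result = viterbi(em)
```

Without the `allowed` check, a greedy per-token argmax could emit an `I-EMAIL` with no preceding `B-EMAIL`, producing a dangling span fragment; the Viterbi pass trades a little score for a well-formed labeling.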

The model supports up to 128,000 tokens per pass. For developers handling large document dumps or long-form server logs, this context window eliminates the need to chunk text prior to redaction.

Detection Capabilities and Benchmarks

OpenAI Privacy Filter detects spans across eight specific categories: names, addresses, email addresses, phone numbers, URLs, dates, account numbers, and secrets like API keys.
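Once spans are detected, redaction itself reduces to span replacement. The tuple format and placeholder style below are assumptions for illustration, not the model's actual output schema:

```python
def apply_redactions(text, spans):
    """Replace detected character spans with category placeholders.

    spans: (start, end, label) tuples, assumed to describe the model's
    detected entities as character offsets into `text`.
    """
    out = []
    cursor = 0
    for start, end, label in sorted(spans):
        out.append(text[cursor:start])   # keep text up to the span
        out.append(f"[{label}]")         # drop the PII, keep the category
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

text = "Contact Jane Doe at jane@example.com."
spans = [(8, 16, "NAME"), (20, 36, "EMAIL")]
redacted = apply_redactions(text, spans)
```

Keeping the category label in the placeholder preserves enough structure for a downstream cloud model to reason about the text ("a name contacted at an email address") without ever seeing the values.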

On the PII-Masking-300k benchmark, the model establishes new baselines for precision and recall in local sanitization tasks.

| Metric | Score |
| --- | --- |
| Standard F1 score | 96.00% |
| Precision | 94.04% |
| Recall | 98.04% |
| Corrected F1 score | 97.43% |

The corrected F1 score accounts for known annotation errors in the original benchmark dataset. The model also responds well to domain-specific fine-tuning. In OpenAI’s testing, fine-tuning the base model on a small, specialized dataset increased the F1 score on that domain from 54% to 96%. This adaptability is critical for organizations that use custom formats for internal account numbers or proprietary data structures.

Deployment and Implementation

The model is available on Hugging Face and GitHub. The repository at openai/privacy-filter contains the implementation and configuration files required for runtime control. Developers can tune the precision and recall tradeoffs directly in the configuration based on their application needs.
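The article does not show the configuration schema, but the tradeoff it exposes is easy to demonstrate. In the sketch below, the per-span confidence scores and gold labels are invented: sweeping a single confidence threshold raises precision (fewer false redactions) at the cost of recall (more missed PII), which is the knob a conservative or aggressive deployment would turn.

```python
# Hypothetical per-span confidence scores and gold labels (1 = true PII).
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30]
gold   = [1,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    """Precision/recall if we redact every span scoring >= threshold."""
    pred = [s >= threshold for s in scores]
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum((not p) and g for p, g in zip(pred, gold))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for t in (0.3, 0.5, 0.85):
    p, r = precision_recall(t)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

A compliance-first pipeline would favor a low threshold (redact aggressively, tolerate over-redaction), while a readability-first one would raise it; the repository's configuration presumably exposes an equivalent control.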

The Hugging Face release includes a model card and an interactive demo. The model supports transformers.js, allowing developers to run it entirely within a web browser using WebGPU. This client-side execution path ensures that raw text never leaves the user's device, matching the goals of teams that run models locally for strict privacy compliance.

If you build data pipelines in high-sensitivity sectors like healthcare or finance, OpenAI Privacy Filter provides a robust local sanitization layer. You should integrate it as an initial filtering step rather than a standalone compliance guarantee, ensuring human oversight or secondary validation remains in place for critical systems.
