How to Build Multi-Agent CNC Workflows on AMD MI300X

MachinaCheck is a multi-agent AI system that automates CNC manufacturability analysis, highlighted in the recent AMD Developer Hackathon wrap-up on May 10, 2026. The platform reduces manual Design for Manufacturability (DFM) evaluation time from up to 60 minutes down to roughly 30 seconds. You can replicate this architecture to build hybrid manufacturing workflows that combine deterministic CAD parsing with large language model reasoning.

System Architecture

Building an automated DFM pipeline requires segmenting tasks strictly between predictable code and generative models. Relying on an LLM for exact geometric math will result in failed parts.

The MachinaCheck architecture uses LangChain to orchestrate a multi-agent architecture consisting of five distinct components:

STEP File Parser: A non-LLM, pure Python component that extracts raw geometric data from standard 3D CAD files.
Operations Classifier: An instance of Qwen 2.5 7B that analyzes the extracted geometry to identify necessary machining operations, such as differentiating between drilling and milling.
Tool Matcher: A deterministic Python script that queries a workshop database to find available tools matching the required specifications.
Feasibility Decision Agent: A second Qwen 2.5 7B call that reasons over the combined geometric data and available tooling to determine if the part can be manufactured within the specified tolerances.
Report Generator: A final Qwen 2.5 7B pass that produces a structured manufacturing report, complete with tool lists and risk assessments.

Hardware and Model Serving

Running multiple simultaneous agent calls requires significant memory bandwidth and capacity. The system runs on the AMD Instinct MI300X platform via the AMD Developer Cloud.

The hardware provides 192GB of HBM3 memory, which allows the pipeline to load and run Qwen 2.5 7B without relying on quantization. The model is served using the vLLM stack compiled for ROCm 7.

Because the workflow splits tasks across discrete agents, inference latency defines the total pipeline execution time. The vLLM configuration on the MI300X achieves an average response time of under 3 seconds per agent call.

Designing the Hybrid Workflow

The most critical design decision in this pipeline is the hybrid approach to data processing. The system offloads reasoning to Qwen 2.5 7B while actively preventing the LLM from handling deterministic lookups.

The Tool Matcher component avoids the LLM entirely. When matching a required 4mm hole to an available 4mm drill bit, standard database queries provide 100% accuracy with zero hallucination risk. You must structure your LangChain tools to enforce this separation, passing only the final structured outputs from the Python scripts into the Feasibility Decision Agent’s context window.

To begin testing this architecture, provision an MI300X instance on the AMD Developer Cloud and deploy the ROCm 7 compatible vLLM container.

How to Build Multi-Agent CNC Workflows on AMD MI300X

System Architecture

Hardware and Model Serving

Designing the Hybrid Workflow

Keep Reading

Thousand Token Wood Runs a 5-Agent Economy on Qwen2.5-3B

How to Build Graph-Based Workflows With Google ADK Go 2.0

Slack Gains Shared Autonomous Agents With Claude Tag Beta

A2A v2.0 Adds Zero-Knowledge Proofs to Multi-Agent Handoffs

Two-Agent AMIE Architecture Matches Physicians on 3-Visit Plans