Production
AI orchestration layer for a materials engineering assistant
Intent resolution, hybrid retrieval, and response assembly for a domain-specific AI assistant.
Context
The system helped materials engineers search a large technical corpus through natural language without sacrificing precision. The challenge was keeping the system useful for real work, not just conversationally fluent.
Architecture
I split the pipeline into explicit stages: resolver, query builder, hybrid retrieval, and response assembly. That made intent handling testable and kept retrieval logic separate from generation.
Query flow
User Query
-> Resolver
-> Query Builder
-> Hybrid Retrieval
-> Response Assembly
Key components
- Resolver for intent, scope, and entity extraction
- Query builder for deterministic structured retrieval
- Hybrid retrieval across lexical, vector, and relational sources
- Response assembly layer for context construction and answer generation
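Merging results across lexical, vector, and relational sources requires combining rankings whose scores live on incompatible scales. One common approach, shown here as a hedged sketch rather than a claim about this system's actual merge logic, is Reciprocal Rank Fusion, which uses only each document's rank in each list:

```python
from collections import defaultdict

def rrf_merge(result_lists: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked result lists with Reciprocal Rank Fusion (RRF).

    RRF sidesteps score normalization across heterogeneous backends
    (e.g. BM25 scores vs. cosine similarities) by scoring each
    document as the sum of 1 / (k + rank) over the lists it appears in.
    """
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

Documents ranked well by multiple sources rise to the top, and the constant `k` damps the influence of any single list's top hit.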
Tradeoffs
- Precision versus recall when translating ambiguous language into structured search
- Deterministic routing versus flexible model-driven behavior
- Latency budgets across retrieval, ranking, and generation
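One way to make the latency tradeoff explicit is to run stages under a shared deadline and degrade gracefully, skipping optional stages such as reranking when the budget is nearly spent. The stage names and budgets below are illustrative assumptions, not the system's real configuration:

```python
import time

# Hypothetical per-stage budgets in milliseconds.
STAGE_BUDGET_MS = {"retrieval": 300, "ranking": 150, "generation": 1500}

def run_with_budget(stages, total_budget_ms: float = 2000.0) -> dict:
    """Run (name, fn, optional) stages against a shared deadline.

    Optional stages are skipped when the remaining budget is less
    than their allotted share, so the pipeline degrades rather
    than blowing the overall latency target.
    """
    deadline = time.monotonic() + total_budget_ms / 1000.0
    outputs = {}
    for name, fn, optional in stages:
        remaining_ms = (deadline - time.monotonic()) * 1000.0
        if optional and remaining_ms < STAGE_BUDGET_MS.get(name, 0):
            outputs[name] = None  # degrade gracefully: skip optional stage
            continue
        outputs[name] = fn()
    return outputs
```

Treating the budget as data rather than hard-coded timeouts also makes it easy to tune per deployment.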
Lessons
- Separating retrieval preparation from retrieval execution improved maintainability
- Clear interfaces between stages made relevance tuning much easier
- The architecture became more reliable once generation stopped owning search behavior
Related writing
Hybrid search in practice
Hybrid search becomes a systems design problem once score normalization, query routing, and operational tuning enter the picture.

Why a RAG pipeline needs a resolver
Most RAG stacks move too quickly from prompt to retrieval. A resolver layer improves precision by decomposing intent, extracting entities, and converting ambiguous language into structured search actions.