Production

AI orchestration layer for a materials engineering assistant

Intent resolution, hybrid retrieval, and response assembly for a domain-specific AI assistant.

Context

The system let materials engineers search a large technical corpus in natural language without giving up the precision of structured search. The challenge was keeping it useful for real engineering work, not just conversationally fluent.

Architecture

I split the pipeline into explicit stages: resolver, query builder, hybrid retrieval, and response assembly. That made intent handling testable and kept retrieval logic separate from generation.

Query flow

User Query
  -> Resolver
    -> Query Builder
      -> Hybrid Retrieval
        -> Response Assembly
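
The flow above can be sketched as a chain of plain functions with typed hand-offs between stages. This is a minimal illustration and not the production code: the names ResolvedIntent, StructuredQuery, build_query, and run_pipeline, the stub stages, and the example intent and scope labels are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ResolvedIntent:
    """Output of the resolver stage (field names are hypothetical)."""
    intent: str
    scope: str
    entities: Tuple[str, ...]


@dataclass(frozen=True)
class StructuredQuery:
    """Output of the query builder; the only input retrieval sees."""
    text: str
    filters: Tuple[Tuple[str, str], ...]


def build_query(resolved: ResolvedIntent, user_query: str) -> StructuredQuery:
    # Deterministic: the same resolved intent always yields the same query,
    # so entities are sorted before they become filters.
    filters = [("scope", resolved.scope)]
    filters += [("entity", entity) for entity in sorted(resolved.entities)]
    return StructuredQuery(text=user_query.strip(), filters=tuple(filters))


def run_pipeline(
    user_query: str,
    resolver: Callable[[str], ResolvedIntent],
    retriever: Callable[[StructuredQuery], List[str]],
    assembler: Callable[[str, List[str]], str],
) -> str:
    # Each stage only sees the previous stage's output, so any stage can be
    # replaced by a stub in tests.
    resolved = resolver(user_query)
    query = build_query(resolved, user_query)
    documents = retriever(query)
    return assembler(user_query, documents)


# Hypothetical stand-ins for the model-backed and search-backed stages.
def stub_resolver(user_query: str) -> ResolvedIntent:
    return ResolvedIntent("property_lookup", "datasheets", ("fatigue", "Ti-6Al-4V"))


def stub_retriever(query: StructuredQuery) -> List[str]:
    return [f"doc matching {value}" for key, value in query.filters if key == "entity"]


def stub_assembler(user_query: str, documents: List[str]) -> str:
    return f"{len(documents)} sources for: {user_query}"
```

Because generation only appears in the assembler, the resolver and query builder can be unit tested without calling a model or a search backend.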

Key components

  • Resolver for intent, scope, and entity extraction
  • Query builder for deterministic structured retrieval
  • Hybrid retrieval across lexical, vector, and relational sources
  • Response assembly layer for context construction and answer generation

Tradeoffs

  • Precision versus recall when translating ambiguous language into structured search
  • Deterministic routing versus flexible model-driven behavior
  • Latency budgets across retrieval, ranking, and generation
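
A latency budget can be made explicit by giving the whole request a deadline and letting each stage ask how much time is left. The sketch below is one possible shape, not the system's actual mechanism; the LatencyBudget class, its method names, and the example numbers are hypothetical.

```python
import time
from typing import Callable


class LatencyBudget:
    """Tracks time remaining for one request against a fixed deadline."""

    def __init__(self, total_seconds: float, clock: Callable[[], float] = time.monotonic):
        # The clock is injectable so tests can control time.
        self._clock = clock
        self._deadline = clock() + total_seconds

    def remaining(self) -> float:
        # Never negative: an exhausted budget reports zero.
        return max(0.0, self._deadline - self._clock())

    def share(self, fraction: float) -> float:
        """Time a stage may spend, as a fraction of what is still left."""
        return self.remaining() * fraction

    def exhausted(self) -> bool:
        return self.remaining() == 0.0
```

A caller might create LatencyBudget(2.0) per request, pass budget.share(0.5) as the retrieval timeout, and skip an optional ranking pass when budget.exhausted() is true.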

Lessons

  • Separating retrieval preparation from retrieval execution improved maintainability
  • Clear interfaces between stages made relevance tuning much easier
  • The architecture became more reliable once generation stopped owning search behavior
