Agentic Engineering (AE+X)

A unifying research agenda for configuring, testing, and deploying multi-agent systems for reliable outcomes.

My 3–5 year agenda is Agentic Engineering (AE+X): the goal is not only to use AI/LLMs, but to design, evaluate, and deploy multi-agent architectures that deliver reliable outcomes in real workflows.

Key themes:

  • Evaluation and benchmarking for agentic systems (reliability, calibration, failure analysis)
  • Reproducible pipelines for human/LLM hybrid coding and quality assurance
  • Governance as a first-class output (documentation, auditability, responsible AI)

Selected outputs & working papers (2025+)

  • Hierarchical Error Correction for Large Language Models: A Systematic Framework for Domain-Specific AI Quality Enhancement
    (2025) — Submitted
    Take-home: A systematic, multi-level error-correction workflow that improves the robustness of domain-specific AI/LLM outputs.
  • Automated Quality Assessment for LLM-Based Complex Qualitative Coding: A Confidence-Diversity Framework
    (2025) — Submitted
    Take-home: A practical way to assess (and calibrate) the reliability of LLM-based complex qualitative coding using a confidence–diversity lens.
  • A Confidence–Diversity Framework for Calibrating AI Judgement in Accessible Qualitative Coding Tasks
    (2025) — Revise & resubmit
    Take-home: A confidence–diversity framework for calibrating AI judgement in accessible qualitative coding tasks, balancing accuracy with uncertainty awareness.
  • Visual Orientalism in the AI Era: From West-East Binaries to English-Language Centrism
    (2025) — Submitted
    Take-home: In the AI era, “orientalism” shifts from simple West–East binaries toward a subtler English-language centrism that structures visibility and authority.
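The confidence–diversity idea behind several of these papers can be illustrated with a minimal sketch. This is not the papers' actual method, only a plausible instance of the general technique: an item coded repeatedly by an LLM yields a set of (label, self-reported confidence) pairs, and low mean confidence or high label diversity flags the item for human review. The function name, thresholds, and entropy-based diversity measure are all illustrative assumptions.

```python
import math
from collections import Counter

def confidence_diversity_signal(codings, diversity_threshold=0.5,
                                confidence_threshold=0.7):
    """Illustrative quality signal for repeated LLM codings of one item.

    `codings` is a list of (label, confidence) pairs from independent
    coding runs; confidence is a self-reported value in [0, 1].
    Thresholds are arbitrary placeholders, not values from the papers.
    """
    labels = [label for label, _ in codings]
    confidences = [conf for _, conf in codings]

    mean_confidence = sum(confidences) / len(confidences)

    # Diversity: normalized Shannon entropy of the label distribution
    # (0.0 = all runs agree, 1.0 = labels spread uniformly).
    counts = Counter(labels)
    n = len(labels)
    if len(counts) > 1:
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        diversity = entropy / math.log2(len(counts))
    else:
        diversity = 0.0

    # Route to human review when the runs disagree or hedge.
    needs_review = (diversity > diversity_threshold
                    or mean_confidence < confidence_threshold)
    return {"mean_confidence": mean_confidence,
            "diversity": diversity,
            "needs_review": needs_review}

# Three runs agree with high confidence: no review flag.
result = confidence_diversity_signal(
    [("positive", 0.9), ("positive", 0.85), ("positive", 0.8)])
```

The appeal of this two-signal design is that confidence and diversity fail in complementary ways: a model can be confidently wrong on every run (caught by diversity across runs only if sampling varies) or uncertain but consistent (caught by the confidence term), so combining them is a cheap first-pass filter before human adjudication.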

See also: the full research portfolio with cite/export tools on Research.