Reliable LLM-Assisted Qualitative Analysis

Benchmarks, calibration, and QA for human/LLM hybrid coding in communication research.

I work on reliability and evaluation for LLM-assisted qualitative analysis: how to design benchmarks, calibrate outputs, and run QA for complex coding tasks so results remain defensible in real research workflows.

Representative directions:

  • Coding quality assessment and calibration (complex labels, multi-stage rubrics, error analysis); a minimal human/LLM agreement check is sketched after this list
  • Benchmark design for domain-specific annotation tasks
  • Reproducible pipelines for human/LLM hybrid coding (auditability and responsible use)

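One basic QA check for human/LLM hybrid coding is chance-corrected agreement between a human coder and an LLM coder on the same units. The sketch below is illustrative only: it uses Cohen's kappa, and the label set and codings are placeholder data, not material from the papers listed further down.

```python
"""Minimal QA check for human/LLM hybrid coding: percent agreement and
Cohen's kappa between one human coder and one LLM coder on the same units.
Label names and data are illustrative placeholders, not real study data."""

from collections import Counter


def cohens_kappa(codes_a: list[str], codes_b: list[str]) -> float:
    """Chance-corrected agreement between two coders over the same units."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must label the same non-empty set of units.")
    n = len(codes_a)
    # Observed agreement: share of units where the two coders assign the same label.
    p_observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    # Expected agreement under independence, from each coder's label distribution.
    dist_a, dist_b = Counter(codes_a), Counter(codes_b)
    p_expected = sum((dist_a[label] / n) * (dist_b[label] / n)
                     for label in set(codes_a) | set(codes_b))
    if p_expected == 1.0:  # both coders use a single identical label
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


if __name__ == "__main__":
    # Hypothetical frame labels on ten text units (placeholder data).
    human = ["conflict", "economic", "economic", "morality", "conflict",
             "economic", "human_interest", "conflict", "morality", "economic"]
    llm   = ["conflict", "economic", "morality", "morality", "conflict",
             "economic", "human_interest", "economic", "morality", "economic"]
    agreement = sum(h == m for h, m in zip(human, llm)) / len(human)
    print(f"Percent agreement: {agreement:.2f}")
    print(f"Cohen's kappa:     {cohens_kappa(human, llm):.2f}")
```

For multi-coder or ordinal designs, Krippendorff's alpha would be the more general choice; the point here is only that hybrid pipelines should report chance-corrected agreement, not raw accuracy alone.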
Selected outputs & working papers (2025+)

  • Automated Quality Assessment for LLM-Based Complex Qualitative Coding: A Confidence-Diversity Framework
    (2025, submitted)
    Take-home: A practical way to assess and calibrate the reliability of LLM-based complex qualitative coding using a confidence–diversity lens (see the illustrative sketch after this list).
  • A Confidence–Diversity Framework for Calibrating AI Judgement in Accessible Qualitative Coding Tasks
    (2025, revise & resubmit)
    Take-home: Uses the confidence–diversity lens to calibrate AI judgement in accessible qualitative coding tasks, balancing accuracy with uncertainty awareness.
  • Hierarchical Error Correction for Large Language Models: A Systematic Framework for Domain-Specific AI Quality Enhancement
    (2025, submitted)
    Take-home: A systematic, multi-level error-correction workflow that improves the robustness of domain-specific AI/LLM outputs.

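As a rough intuition for the confidence–diversity lens referenced in the take-homes above (not the measures defined in the papers themselves), one can combine a model's mean self-reported confidence with the diversity of labels it produces across repeated runs: low diversity plus high confidence suggests a unit can be auto-accepted, anything else is routed to a human coder. Every name, threshold, and weighting in the sketch below is a placeholder assumption.

```python
"""Hypothetical illustration of a confidence-diversity style reliability signal
for one coded unit. This is NOT the formula from the papers listed above;
the schema, weighting, and threshold are placeholders."""

import math
from collections import Counter


def label_entropy(labels: list[str]) -> float:
    """Shannon entropy of the label distribution, normalized to [0, 1]."""
    counts = Counter(labels)
    if len(counts) <= 1:
        return 0.0  # all runs agree -> no diversity
    n = len(labels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(len(counts))


def reliability_signal(runs: list[dict]) -> float:
    """Combine mean self-reported confidence with (1 - normalized label diversity).

    `runs` holds repeated codings of the same unit, each a dict like
    {"label": "conflict", "confidence": 0.82} (placeholder schema)."""
    mean_conf = sum(r["confidence"] for r in runs) / len(runs)
    diversity = label_entropy([r["label"] for r in runs])
    return mean_conf * (1.0 - diversity)


if __name__ == "__main__":
    runs = [
        {"label": "conflict", "confidence": 0.85},
        {"label": "conflict", "confidence": 0.80},
        {"label": "conflict", "confidence": 0.90},
        {"label": "economic", "confidence": 0.55},
        {"label": "conflict", "confidence": 0.75},
    ]
    score = reliability_signal(runs)
    # Units scoring below an (arbitrary) threshold go to a human coder.
    verdict = "auto-accept" if score > 0.6 else "human review"
    print(f"Reliability signal: {score:.2f} -> {verdict}")
```

Self-reported confidences are often miscalibrated, so in practice any such threshold would need to be validated against human-coded gold data rather than set a priori.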
See also: the full research portfolio, with cite/export tools, on the Research page.