AbbVie's first step toward AI-assisted authorship: an intelligent SOP comparison tool that helped Quality Operations authors produce higher-quality Lab Investigation Reports, faster.
Lab Investigation Reports (LIRs) are high-stakes quality documents: every one must be accurate, consistent, and compliant with global SOPs. Before this tool existed, authors spent significant time drafting, self-reviewing, and correcting reports before they were even ready for formal review. The process was slow, error-prone, and relied heavily on individual author expertise.
The AI Auditor was designed to change that. The goal: give authors real-time, SOP-aligned feedback as they work, not after the fact, so they could catch issues earlier, reduce rework, and produce consistently higher-quality LIRs.
Scope and scale: We launched the pilot at one North American site and subsequently expanded to four additional sites. The tool also laid the foundation for a broader AI authorship strategy across CAPA, ER, and NCR workflows.
Because this was an internal MVP, we couldn't conduct broad external research before building. Early design decisions were grounded in SME interviews, direct SOP analysis, and tight feedback loops with site teams during the pilot.
The tool also had to operate in a heavily regulated environment: any AI output needed to be clearly framed as assistive (not authoritative), and the design had to maintain author accountability at every step. This was a compliance requirement.
When we received the brief, we didn't jump straight to wireframes. Our first step was to study AI design patterns across similar document-review tools, understanding how they communicate recommendations, structure feedback, and handle the critical question of user trust.
Because the AI Auditor needed to function as a human-in-the-loop assistant, supplementing the author rather than replacing their judgment, we built our initial design around two core principles:
The initial home screen was designed with a single, clear primary action: start an audit. We also introduced a percentage-based compliance score to help authors quickly gauge the level of revision their document might need.
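To make the score concrete, here is a minimal sketch of how a percentage-based compliance score might be derived from a set of SOP checks. The check names, pass/fail logic, and equal weighting are illustrative assumptions, not the production implementation.

```python
# Hypothetical sketch: a compliance score as the share of SOP checks a draft passes.
# Check names and pass/fail results below are illustrative, not the real rule set.

def compliance_score(check_results: dict[str, bool]) -> float:
    """Return a 0-100 score: the percentage of SOP checks that passed."""
    if not check_results:
        return 0.0
    passed = sum(1 for ok in check_results.values() if ok)
    return 100.0 * passed / len(check_results)

# Example: a draft LIR evaluated against three illustrative SOP checks.
results = {
    "root_cause_documented": True,
    "deviation_timeline_complete": False,
    "approvals_section_present": True,
}
print(f"Compliance score: {compliance_score(results):.0f}%")  # -> 67%
```

Even in a simple form like this, the framing matters: a single percentage invites authors to read it as a pass/fail gate, a risk that surfaced later in the project.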
After the first round of feedback from SMEs, ML engineers, and developers, a few things became clear. Here's how the design evolved:
Usage grew sharply after the pilot launch, expanding from one site to five. By late Q3 2024, adoption was both heavier and more geographically distributed, confirming the tool scaled across diverse teams and contexts.
This was one of the most technically and organizationally complex projects I've worked on, and a genuine first for the organization. A few things stood out.
With AI-generated outputs in a regulated environment, trust isn't implicit; it has to be designed in. Transparency, explainability, and clear disclaimers weren't nice-to-haves; they were what made the tool adoptable.
The percentage score felt like good UX until users started making decisions based on it. That taught me to push harder on how metrics will actually be interpreted, not just how they're intended.
Working without broad user research early on forced us to be more deliberate about SME feedback and iterative testing. It made the design process leaner and the decisions more defensible.
Designing for an AI system where outputs are probabilistic, not deterministic, required new patterns and new conversations with the ML team. This project made me a stronger advocate for AI transparency across the entire product.