The work is not in the words. It is in the structure you put under them first.
You have qualitative data. Maybe not as much as you would like, but more than you remember. Engagement and pulse survey verbatims going back several cycles. Exit interview notes from the last two years. Recordings or summaries from focus groups, listening tours, leadership offsites. A handful of one-off executive interviews that ended up in someone's project folder.
The instinct is to throw all of it into your organization's approved deployment of ChatGPT or Copilot and ask for themes. That instinct is reasonable. It will also disappoint you.
What you get when you do this
You get themes that are plausible enough. They are not wrong, exactly. They tend to look like the themes a thoughtful analyst could have pulled from any one of those sources on its own. The AI has not added the cross-cutting picture you were hoping for.
What is missing is not a better model. It is the structure underneath the model. The qualitative material you handed in carries no consistent tags, no shared vocabulary, no source trail. It is a pile of words. The model treats it as a pile of words. The output reflects that.
The broader qualitative picture
Employee listening programs have done sophisticated work for HR over the last decade. Engagement surveys, pulse instruments, and the analysis built around them are not the problem here. They were designed to answer specific, well-scoped questions about how people feel about working at your company, and they answer those questions well.
The AI moment has added a different set of questions to the same desk. How does the work actually get done in this organization? Where do systems quietly fail? What would the people running the function change if they had the authority to do it? Those questions sit outside what listening instruments were built to capture, and they sit outside what any single qualitative source on its own was built to capture.
Your full qualitative picture is larger than your listening program alone. It includes exit interview notes, focus group recordings or summaries, listening tour outputs from a new executive, leadership offsite materials, onboarding survey responses, occasional executive interviews, and yes, notes from past consulting engagements that nobody opened a second time. All of it was collected for different reasons by different teams using different question sets. None of it was ever organized into something a single analyst, human or otherwise, could actually work against as a whole.
This is not a volume problem. Adding more text will not fix it. It is a structure problem.
Where the work actually is
A piece of qualitative data becomes useful when it sits inside a structure that lets you ask things of it. There are four properties that structure has to provide.
The first is categorization. Every statement, whether it came from a survey comment, an exit conversation, or a focus group, should be tagged against a consistent set of dimensions you care about, including function, process area, geography, employee level, system involved, and decision type. Without categorization, you cannot ask narrow questions of broad data, and you cannot connect a finding in one source to a related finding in another.
The second is anonymization that does not destroy attribution. Neither the model nor your team should see the speaker's name. The structure underneath should still know which source the statement came from, what role the person held, and what prompt or question elicited it.
The third is a fact layer that lives separately from the raw text. Long-form qualitative content is hard to query. A clean set of extracted statements, such as "the compensation exception process for sales managers in our APAC region requires manual override by a regional analyst," is straightforward to query.
The fourth is a source trail you can actually follow. Every claim should trace back to a specific source, a specific moment in the original material, and the specific question or prompt that elicited it. Without that, you cannot defend an insight to anyone who asks where it came from.
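To make the four properties concrete, here is a minimal sketch of what one record in such a structure might look like. It is an illustration under stated assumptions, not a description of any particular tool or of how any firm implements this. The field names, the `pseudonym`, `query`, and `trail` helpers, the source IDs, and the second example statement are all hypothetical. Only the first statement is taken from the text above.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass
class Fact:
    """One extracted statement, stored apart from the raw transcript."""
    statement: str    # fact layer: a short, queryable claim
    tags: dict        # categorization: dimension name -> value
    speaker_key: str  # anonymization: stable pseudonym, never the name
    role: str         # attribution that survives anonymization
    source_id: str    # source trail: which document or session
    locator: str      # source trail: where in the original material
    prompt: str       # source trail: the question that elicited it


def pseudonym(name: str, secret: str) -> str:
    """Keyed hash so the same speaker gets the same key in every source."""
    digest = hmac.new(secret.encode("utf-8"), name.encode("utf-8"), hashlib.sha256)
    return "spk_" + digest.hexdigest()[:12]


def query(facts, **criteria):
    """Return the facts whose tags match every given dimension."""
    return [f for f in facts if all(f.tags.get(k) == v for k, v in criteria.items())]


def trail(fact: Fact) -> str:
    """Render the path back to the original material."""
    return f"{fact.source_id} @ {fact.locator} | asked: {fact.prompt}"


# Hypothetical secret; in practice it would live outside this structure.
SECRET = "rotate-me"

# Illustrative records only. Sources, locators, and prompts are invented.
facts = [
    Fact(
        statement=("The compensation exception process for sales managers in our "
                   "APAC region requires manual override by a regional analyst."),
        tags={"function": "sales", "process_area": "compensation_exceptions",
              "geography": "APAC", "employee_level": "manager"},
        speaker_key=pseudonym("Hypothetical Speaker A", SECRET),
        role="sales manager",
        source_id="focus-group-001",
        locator="summary, paragraph 4",
        prompt="Where do you work around the system?",
    ),
    Fact(
        statement="Regional analysts approve compensation exceptions by email.",
        tags={"function": "hr_operations", "process_area": "compensation_exceptions",
              "geography": "APAC", "employee_level": "individual_contributor"},
        speaker_key=pseudonym("Hypothetical Speaker B", SECRET),
        role="regional analyst",
        source_id="exit-interview-017",
        locator="notes, page 2",
        prompt="What would you change about how the work gets done?",
    ),
]

# A narrow question asked of broad data, answered across two source types.
for fact in query(facts, geography="APAC", process_area="compensation_exceptions"):
    print(fact.statement)
    print("   ", trail(fact))
```

The point of the sketch is the shape, not the code. Because both records share the same tag vocabulary, a focus group comment and an exit interview note surface together in one query, and each still carries a trail back to its source. A keyed hash is one possible way to keep a speaker linkable across sources without storing a name; note that role and geography together can still identify someone on a small team, so the dimensions themselves need review.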
This is the work itself. It is not glamorous, it is not the part anyone wants to spend money on, and it is also the entire reason your AI tool produces something useful instead of something hollow.
Why most teams skip this part
There are two reasons. The first is that it looks like infrastructure rather than insight. Spending two months categorizing and structuring qualitative data is hard to put in a board update. Insight has a story arc. Structure does not.
The second reason is that the methodology to do this work well is not common in the average HR function. It pulls from research design, library science, and data engineering, and most HR teams have not staffed for any of those disciplines because they have not needed to until now. The AI moment is the first time the gap is actually expensive.
At Ikona Analytics, the categorization, anonymization, fact extraction, and source trail work is most of what we actually do. The interviews we conduct are the visible part. The structuring is where the time goes, and it is the reason the resulting knowledge asset stays useful six months after the project ends, rather than aging out the way a consulting deck does.
What this means for your next AI move
The next time someone proposes pointing an AI tool at the qualitative data you already have, the right first question is not which model to use. It is what structure you are going to put under the data before you ask the model anything. If the honest answer is "none," you will get plausible-sounding themes regardless of which model you bought.
The model is the easy part. The structure is the work.
Written by
Bennet Voorhees
Bennet Voorhees is a founding partner at Ikona Analytics, bringing deep expertise in workforce intelligence, diagnostic methodology, and HR technology transformation from experience across Fortune 100 organizations.