Text mining techniques have garnered attention for their ability to process and analyze qualitative data systematically, reducing the risk of subjective interpretations. For instance, Ito et al. (2019) demonstrated the efficacy of text mining in evaluating work logs of healthcare providers involved in nutritional programs for the elderly. By applying quantitative analysis techniques to qualitative work logs, the researchers minimized arbitrary interpretations, providing a more objective basis for evaluating the program's effectiveness. Similarly, Lebowitz et al. (2020) utilized statistical text mining to analyze reflective essays written by medical students, allowing for a robust exploration of student experiences during their rural placements. This approach facilitated the examination of larger datasets, providing richer qualitative insights than traditional qualitative methods could yield.
However, one significant drawback of text mining is the potential for oversimplification of complex qualitative data. While text mining can efficiently categorize and analyze large volumes of text, the nuances and contextual meanings often embedded in qualitative content may be lost during the analysis process. For instance, Ekin et al. (2021) noted that while text mining tools can uncover trends and relationships within qualitative data, they may fail to capture the depth of human experiences that qualitative researchers traditionally seek to understand. Thus, there is a credible concern that relying exclusively on text mining could lead to a reductionist interpretation of qualitative data, potentially neglecting essential contextual factors. Another issue relates to the quality and representativeness of the data used in text mining. As discussed by Kaur (2017), the effectiveness of text mining is contingent upon high-quality textual data; if the data are biased or unrepresentative, the resulting analyses may reflect those limitations. This dependence further emphasizes the need for careful data selection and preprocessing in qualitative research utilizing text mining tools (Renganathan, 2017).
To demonstrate the application of text mining in qualitative data analysis, this study re-presents the analysis of a dataset previously utilized in Mammadova et al. (2026), which examined two new instructors’ teaching experiences in a HyFlex environment and interpreted the results from a Technological Pedagogical Content Knowledge (TPACK) framework perspective. The qualitative data were collected through reflexive journaling (Ortlipp, 2008) and interactive interviews (Ellis et al., 2011) over ten weeks, as described in full detail by Mammadova et al. (2026). Data collection was conducted in accordance with approval from the university’s Institutional Review Board.
Frequency analysis. Inductive emergent thematic coding was used for qualitative analysis, yielding five overarching themes. The study then analyzed how frequently each theme appeared across the weekly dataset. Although frequency analysis did not account for deeper contextual meaning or nuanced interpretation beyond patterns of word co-occurrence, it enabled the estimation of the relative proportion of discussion devoted to each theme.
The text mining approach was used to quantify the qualitative data in Python 3.11.4. To prepare the textual data for quantitative analysis, the study implemented a multi-step preprocessing pipeline using standard Natural Language Processing (NLP) techniques (Tabassum & Patil, 2020). First, the textual data were standardized by converting all text to lowercase and removing numerical values and non-alphabetical characters. Next, standard stopwords (e.g., “the,” “and,” and “but”), which are low-semantic-value words, were filtered out using the Natural Language Toolkit (NLTK) library to remove uninformative noise from the data. The stopword list was supplemented with filler words such as “like,” “uh,” “yeah,” “would,” and “hey.” The last step in data preprocessing was lemmatization, which reduced inflected words (e.g., “testing” and “tested”) to their canonical roots (e.g., “test”) using the WordNetLemmatizer module from the NLTK library. This systematic cleaning protocol yielded a streamlined, uniform dataset suited to subsequent computational analysis.
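The preprocessing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `preprocess` and `nltk_tools` are hypothetical, the custom stopword set contains only the five fillers named in the text, and the lemmatizer is called with a verb tag (`pos="v"`) because NLTK's WordNetLemmatizer treats words as nouns by default and would then leave verb forms such as “testing” unchanged. The stopword set and lemmatizer are passed in as arguments so that the sketch also runs without the NLTK corpora installed.

```python
import re
from typing import Callable, Iterable

# Extra fillers named in the text; the complete custom list is not given there.
CUSTOM_STOPWORDS = {"like", "uh", "yeah", "would", "hey"}


def preprocess(text: str,
               stop_words: Iterable[str],
               lemmatize: Callable[[str], str]) -> list[str]:
    """Lowercase, keep letters only, drop stopwords, then lemmatize."""
    blocked = set(stop_words) | CUSTOM_STOPWORDS
    # Replace digits, punctuation, and other non-letters with spaces.
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    tokens = letters_only.split()
    return [lemmatize(tok) for tok in tokens if tok not in blocked]


def nltk_tools() -> tuple[set[str], Callable[[str], str]]:
    """Build the stopword set and lemmatizer from NLTK (imported lazily).

    Assumes the corpora are already installed, e.g. via
    nltk.download("stopwords") and nltk.download("wordnet").
    """
    from nltk.corpus import stopwords
    from nltk.stem import WordNetLemmatizer

    lemmatizer = WordNetLemmatizer()
    # pos="v" is an assumption; the default noun tag keeps "testing" as is.
    return (set(stopwords.words("english")),
            lambda word: lemmatizer.lemmatize(word, pos="v"))


# Small stand-ins so the sketch runs without NLTK data.
demo_stop_words = {"the", "and", "but"}
demo_lemmas = {"testing": "test", "tested": "test"}
tokens = preprocess("Uh, we tested the 3 cameras, and testing failed!",
                    demo_stop_words,
                    lambda word: demo_lemmas.get(word, word))
print(tokens)  # ['we', 'test', 'cameras', 'test', 'failed']
```

With NLTK and its corpora available, the stand-ins can be replaced by `stop_words, lemmatize = nltk_tools()`; the resulting tokens will then depend on NLTK's own stopword list and WordNet lemmas.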
The preprocessed text segments were aggregated within each of the five NVivo-coded themes to construct a distinct, theme-specific list of words (Onwuegbuzie & Teddlie, 2003). These five vocabularies served as the linguistic profiles of their respective thematic categories. In parallel, the transcripts from the weekly interviews were tokenized, breaking the continuous text into constituent words for further analysis.
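A minimal sketch of this step is given below, assuming that a theme's vocabulary is the set of unique tokens found in its coded segments and that a theme's weekly proportion is the percentage of that week's tokens found in the vocabulary. Both definitions are assumptions made for illustration, and the function names and toy data are hypothetical rather than taken from the study.

```python
def build_vocabulary(segments: list[list[str]]) -> set[str]:
    """Union of all preprocessed tokens from the segments coded to one theme."""
    return {token for segment in segments for token in segment}


def theme_share(week_tokens: list[str], vocabulary: set[str]) -> float:
    """Percentage of a week's tokens that appear in a theme vocabulary."""
    if not week_tokens:
        return 0.0
    hits = sum(1 for token in week_tokens if token in vocabulary)
    return 100.0 * hits / len(week_tokens)


# Toy coded segments for two themes (hypothetical data).
coded_segments = {
    "Technology use in teaching": [["camera", "zoom"], ["microphone"]],
    "Class preparation": [["slide", "plan"], ["plan", "zoom"]],
}
vocabularies = {theme: build_vocabulary(segments)
                for theme, segments in coded_segments.items()}

# Toy tokenized transcript for one week (hypothetical data).
week_tokens = ["zoom", "camera", "plan", "student", "student",
               "slide", "talk", "microphone", "zoom", "plan"]
shares = {theme: theme_share(week_tokens, vocab)
          for theme, vocab in vocabularies.items()}
print(shares)
```

Because a word such as “zoom” can belong to more than one vocabulary in this sketch, the shares of different themes need not sum to 100, which is one reason a normalization step may be useful afterwards.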
The proportion of theme-specific lexicons within the weekly dataset was measured longitudinally over the ten-week duration (Figure 1 and Table 1). To determine the relative prominence of each theme within the coded dataset, a normalization procedure was executed. Specifically, the proportion of a given theme for a specific week was divided by the sum of the proportions of all five themes for that same week, yielding a percentage. This calculation was systematically applied to all five themes across the entire ten-week corpus of documents (Figure 2 and Table 2). Ultimately, a higher percentage of theme-associated words in any given week serves as a metric indicating a more pronounced conversational focus on that particular topic during the interview and in the journal (see Mammadova & Topalgokceli, 2025, for Python code).
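The normalization described above amounts to dividing each theme's weekly proportion by the sum of all theme proportions for that week. A minimal sketch follows; the function name `normalize_shares`, the rounding to two decimals, and the input values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def normalize_shares(shares: dict[str, float]) -> dict[str, float]:
    """Rescale one week's theme proportions so they sum to 100 percent."""
    total = sum(shares.values())
    if total == 0:
        # No theme words at all this week; avoid dividing by zero.
        return {theme: 0.0 for theme in shares}
    return {theme: round(100.0 * value / total, 2)
            for theme, value in shares.items()}


# Hypothetical proportions for one week (they sum to 60, not 100).
week_shares = {"theme_a": 18.0, "theme_b": 12.0, "theme_c": 30.0}
week_percentages = normalize_shares(week_shares)
print(week_percentages)  # {'theme_a': 30.0, 'theme_b': 20.0, 'theme_c': 50.0}
```

After normalization, the percentages within a week sum to 100 (up to rounding), so themes can be compared across weeks even when the raw proportions sum to different totals.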
Table 1
The Proportion of Each Coded Overarching Theme-Related Words in the Weekly Interview
Date and number of weeks | Technology use in teaching | Class preparation | Remote & in-person student engagement | Expectations & policy | HyFlex definitions | Total |
Sep 8-Week 1 | 18.64 | 19.13 | 18.15 | 14.49 | 22.37 | 92.78 |
Sep 15-Week 2 | 17.81 | 17.01 | 17.67 | 12.82 | 19.77 | 85.08 |
Sep 22-Week 3 | 19.64 | 18.77 | 20.42 | 15.93 | 22.07 | 96.82 |
Sep 29-Week 4 | 19.32 | 17.71 | 18.95 | 14.21 | 23.13 | 93.32 |
Oct 13-Week 5 | 17.95 | 16.58 | 18.67 | 13.91 | 21.36 | 88.46 |
Oct 20-Week 6 | 19.22 | 18.13 | 19.32 | 14.31 | 22.85 | 93.82 |
Oct 27-Week 7 | 18.72 | 18.57 | 18.75 | 14.75 | 22.04 | 92.84 |
Nov 3-Week 8 | 19.84 | 19.11 | 19.74 | 16.16 | 23.68 | 98.53 |
Nov 17-Week 9 | 19.07 | 18.48 | 18.27 | 14.60 | 22.50 | 92.92 |
Dec 8-Week 10 | 17.77 | 17.67 | 17.75 | 14.56 | 21.60 | 89.35 |
Figure 1
The Proportion of Coded Overarching Theme-Related Words in the Weekly Interview
Table 2
The Percentage of Each Coded Overarching Theme-Related Words in the Weekly Interview (adapted from Mammadova et al., 2026)
Date and number of weeks | Technology use in teaching | Class preparation | Remote & in-person student engagement | Expectations & policy | HyFlex definitions |
Sep. 8 Week 1 | 24.11 | 20.09 | 19.56 | 20.62 | 15.62 |
Sep. 15 Week 2 | 23.23 | 20.94 | 20.77 | 19.99 | 15.07 |
Sep. 22 Week 3 | 22.79 | 20.29 | 21.09 | 19.38 | 16.45 |
Sep. 29 Week 4 | 24.78 | 20.70 | 20.31 | 18.98 | 15.23 |
Oct. 13 Week 5 | 24.15 | 20.29 | 21.10 | 18.74 | 15.72 |
Oct. 20 Week 6 | 24.35 | 20.48 | 20.59 | 19.32 | 15.25 |
Oct. 27 Week 7 | 23.74 | 20.17 | 20.20 | 20.01 | 15.88 |
Nov. 3 Week 8 | 24.03 | 20.14 | 20.04 | 19.39 | 16.40 |
Nov. 17 Week 9 | 24.21 | 20.52 | 19.66 | 19.89 | 15.71 |
Dec. 8 Week 10 | 24.18 | 19.89 | 19.86 | 19.78 | 16.29 |
Total % of each theme | 23.96 | 20.35 | 20.32 | 19.61 | 15.76 |
Figure 2
The Percentage of Each Overarching Theme-Related Words Per Week (adapted from Mammadova et al., 2026)
Over the ten weeks, “Technology use in teaching” dominated the discussions (23.96%), maintaining a steady weekly presence between 22.79% and 24.78%. It was followed by “Class preparation,” the second most discussed theme (20.35%), which ranged from 19.89% to 20.94% across the weekly dataset. The challenge of “Remote and in-person student engagement” ranked third (20.32%), fluctuating between 19.56% and 21.10% from week to week. “Expectations and policy,” which concerns clearly setting and communicating expectations, was the fourth most discussed theme (19.61%), with weekly proportions spanning 18.74% to 20.62%. “HyFlex definitions” received the least attention (15.76%), ranging from 15.07% to 16.45% weekly.
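The weekly ranges can be recomputed directly from the columns of Table 2. The sketch below does so for two of the five themes; the helper name `summarize` is hypothetical, and the mean of the weekly values is used only as a stand-in for the “Total %” row, since the text does not state how that row was calculated.

```python
# Weekly percentages copied from Table 2 (Weeks 1 through 10).
weekly_pct = {
    "Technology use in teaching": [24.11, 23.23, 22.79, 24.78, 24.15,
                                   24.35, 23.74, 24.03, 24.21, 24.18],
    "HyFlex definitions": [15.62, 15.07, 16.45, 15.23, 15.72,
                           15.25, 15.88, 16.40, 15.71, 16.29],
}


def summarize(values: list[float]) -> dict[str, float]:
    """Return the minimum, maximum, their week numbers, and the mean."""
    low, high = min(values), max(values)
    return {
        "min": low,
        "min_week": values.index(low) + 1,   # weeks are numbered from 1
        "max": high,
        "max_week": values.index(high) + 1,
        "mean": round(sum(values) / len(values), 2),
    }


summaries = {theme: summarize(values) for theme, values in weekly_pct.items()}
for theme, summary in summaries.items():
    print(theme, summary)
```

For these two columns, the mean of the weekly values agrees with the corresponding entry in the “Total %” row of Table 2 to two decimals.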
Following the contextual alignment model of Mammadova et al. (2026), the five themes were mapped onto three specific TPACK domains (Figure 3 and Table 3). First, “Remote and in-person student engagement” and “Class preparation” were combined under the Technological and Pedagogical Knowledge (TPK) domain. Second, the cumulative percentages of “Expectations and policy” and “HyFlex definitions” were used to construct the Contextual Knowledge (CK) domain. Finally, the “Technology use in teaching” theme was categorized independently under the Technological Knowledge (TK) domain.
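The mapping from themes to domains is a simple sum of theme percentages within each domain. The sketch below applies it to the Week 1 percentages reported in Table 2; the names `DOMAIN_OF_THEME` and `to_domains` are hypothetical, and the rounding to two decimals is an assumption.

```python
# Theme-to-domain mapping as described in the text.
DOMAIN_OF_THEME = {
    "Class preparation": "TPK",
    "Remote & in-person student engagement": "TPK",
    "Expectations & policy": "CK",
    "HyFlex definitions": "CK",
    "Technology use in teaching": "TK",
}


def to_domains(theme_pct: dict[str, float]) -> dict[str, float]:
    """Sum one week's theme percentages within each TPACK domain."""
    totals: dict[str, float] = {}
    for theme, pct in theme_pct.items():
        domain = DOMAIN_OF_THEME[theme]
        totals[domain] = totals.get(domain, 0.0) + pct
    return {domain: round(value, 2) for domain, value in totals.items()}


# Week 1 percentages copied from Table 2.
week1 = {
    "Technology use in teaching": 24.11,
    "Class preparation": 20.09,
    "Remote & in-person student engagement": 19.56,
    "Expectations & policy": 20.62,
    "HyFlex definitions": 15.62,
}
week1_domains = to_domains(week1)
print(week1_domains)
```

For Week 1 this gives 39.65 for TPK, 36.24 for CK, and 24.11 for TK, matching the first row of Table 3.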
Table 3
The Percentage of Each of the Three TPACK Domains in the Weekly Interview (adapted from Mammadova et al., 2026)
Date and number of weeks | Technological and Pedagogical Knowledge | Contextual Knowledge | Technological Knowledge |
Sep. 8 Week 1 | 39.65 | 36.24 | 24.11 |
Sep. 15 Week 2 | 41.70 | 35.06 | 23.23 |
Sep. 22 Week 3 | 41.38 | 35.83 | 22.79 |
Sep. 29 Week 4 | 41.01 | 34.21 | 24.78 |
Oct. 13 Week 5 | 41.39 | 34.46 | 24.15 |
Oct. 20 Week 6 | 41.07 | 34.57 | 24.35 |
Oct. 27 Week 7 | 40.37 | 35.89 | 23.74 |
Nov. 3 Week 8 | 40.17 | 35.80 | 24.03 |
Nov. 17 Week 9 | 40.19 | 35.60 | 24.21 |
Dec. 8 Week 10 | 39.75 | 36.07 | 24.18 |
Total % of domain | 40.67 | 35.37 | 23.96 |
Figure 3
The Distribution of the Three TPACK Domains in the Weekly Interview
This study showed how integrating text mining and frequency-based analysis with qualitative data can extend the interpretive power of qualitative research. While traditional qualitative approaches prioritize depth, nuance, and contextual meaning (Lindlof & Taylor, 2019), they often underrepresent patterns of emphasis across time and participants. By quantifying the relative frequency of themes, this study introduced an additional analytical layer that reveals what instructors consistently prioritized, thereby offering a different “voice” of the data that complements narrative interpretation.
One main contribution of frequency analysis is its ability to surface salience and priority. The consistent dominance of “Technology use in teaching” (23.96%) across all ten weeks suggests that instructors’ experiences were heavily anchored in technological concerns. While qualitative narratives might describe challenges or successes with technology, the consistent proportional prominence of this theme indicates that it remained a central cognitive and practical focus throughout the teaching period. This interpretation aligns with arguments in content analysis that frequency can serve as an indicator of importance or emphasis within communication (Saldaña, 2013), particularly when interpreted alongside qualitative meaning rather than in isolation.
Another key contribution of frequency analysis is its ability to reveal both the relatively balanced distribution of themes and areas of potential neglect or underdevelopment. The near-equal weighting of the “Class preparation” (20.35%) and “Remote and in-person student engagement” (20.32%) themes suggests that instructors were simultaneously negotiating challenges in instructional design and student interaction, highlighting the dual pedagogical demands of HyFlex instruction. The lower proportion of “HyFlex definitions” (15.76%) suggests that instructors spent less time explicitly conceptualizing the model itself. This may indicate either an assumed understanding or, conversely, a lack of shared conceptual clarity that remains under-discussed. From an instructional design perspective, this gap could signal the need for clearer institutional guidance or shared frameworks to support consistent implementation. Without frequency analysis, such balance and underdevelopment might not be immediately evident, as qualitative reporting often foregrounds more illustrative or extreme cases rather than proportional representation. This contribution of text mining supports what mixed-methods scholars describe as complementarity, in which quantitative indicators enhance qualitative insights (Wu & Guo, 2011).
By examining weekly distributions of themes, this study identifies subtle fluctuations in thematic emphasis over time and provides a time-dependent record to map longitudinal trajectories. Given that human discourse naturally responds to real-time stress and evolving workplace realities, tracking weekly fluctuations in word frequency captures these hidden shifts with empirical granularity. This longitudinal perspective is often difficult to capture through conventional qualitative summaries, which tend to aggregate findings and obscure temporal dynamics. Tracking these patterns enables researchers to identify critical periods where specific challenges intensify or decline.
The aggregation of themes into TPACK domains further demonstrates how frequency analysis can inform theoretical interpretation. The predominance of TPK (40.67%) suggests that instructors were primarily engaged in integrating pedagogy with technology, rather than focusing on technology in isolation (TK = 23.96%). This finding reinforces the complexity of HyFlex teaching, where effective instruction requires simultaneous attention to both delivery mechanisms and pedagogical strategies. Moreover, the substantial proportion of CK (35.37%) indicates that institutional expectations, policies, and definitional clarity are not peripheral but central to instructors’ experiences. Such insights can guide professional development by highlighting that support should extend beyond technical training to include policy clarity and pedagogical integration.
This study demonstrated how AI can be utilized in educational research to analyze complex and large-scale textual data more efficiently and with greater precision. Data analysis with AI helped the researchers reach replicable results within significantly shorter timeframes, enhancing both the reliability and depth of the findings. This presentation advocates for greater adoption of AI as a research method in education, not only as a tool for automation but also as a means of uncovering patterns and perspectives that traditional methods may overlook. For example, the results suggest important implications for the recruitment and hiring process: institutions seeking to implement HyFlex instruction should consider assessing candidates’ baseline technological and pedagogical competencies. Faculty involved in hiring, along with academic leaders, should ensure that instructors possess, or can quickly develop, the core technological skills needed for HyFlex teaching before the start of the semester. Aligning hiring practices with the specific demands of HyFlex can improve instructional quality and reduce onboarding strain.
While text mining represents a powerful tool for qualitative research, offering substantial benefits such as increased efficiency and objectivity in data analysis, researchers must remain cautious of its limitations. As noted in qualitative methodology literature, frequency does not inherently capture meaning, depth, or context (Saldaña, 2013). A theme discussed less frequently may still carry significant conceptual weight or represent critical challenges. Therefore, the strength of this approach lies not in replacing qualitative interpretation but in augmenting it with systematic pattern detection.