EdTech Archives The Journal of Applied Instructional Design, 15(2)

Exploring Educators’ Engagement with Online Professional Development: COVID-Era Insights with Time-Series Clustering and Anomaly Detection

Javier Leung

Abstract

This study, presented as a research poster at the 2025 AECT conference, examines K-12 educators’ engagement with online professional development (PD) from March to December 2020 during the COVID-19 pandemic, leveraging time-series clustering and anomaly detection methods to identify peak access dates and abnormal user engagement patterns. The findings reveal how school districts accessed PD during challenging times and highlight the critical needs of small school districts, particularly in enhancing emergency preparedness planning.

Introduction

Since 2014, the EdHub Library, developed by the University of Missouri’s Network for Educator Effectiveness (NEE), has provided over 500 self-paced activities for K-12 educators, focusing on research-based teaching strategies and the development of classroom observation skills. Educators can navigate topics via the homepage, search engine, or Teacher Indicator sitemaps. The library organizes professional development (PD) materials into a three-level structure, with each module offering three to five independent activities throughout the year. To date, EdHub has supported roughly 38,000 educators in Missouri, Nebraska, Kansas, and internationally, including Vietnam (Leung, 2024a).

Impact of COVID-19 on Professional Development

The COVID-19 pandemic exacerbated disparities in internet access and high-quality PD, increasing the need for remote teaching and digital tool-focused PD (Kim et al., 2021). Educators prioritized immediate challenges, including technology integration and remote classroom management, with resource selection influenced by local policies (Reich et al., 2020). The pandemic highlighted gaps in educators' technology skills, driving demand for virtual classroom management and online teaching pedagogy (Trust & Whalen, 2020). Heavy workloads and disrupted collaborative learning environments hindered PD participation and peer support, as educators faced unprecedented challenges requiring swift instructional and resource decisions (Hill et al., 2013).

Problem Statement

EdHub uses Google Analytics to track user engagement with PD, collecting data on access, location, user type, and time spent, while maintaining privacy. Past research has shown that unusual engagement patterns can indicate behavioral shifts due to external factors, such as policy changes or pandemics (Schildkamp et al., 2020). Given the library's large user base and data, machine learning is more effective than qualitative methods for identifying engagement patterns and irregularities (Aggarwal, 2015; Mao et al., 2024). Machine learning has been vital for EdHub Library in designing and evaluating online educator PD for NEE, including curating emergency preparedness resources (Leung, 2024a), understanding search terms (Leung, 2024b), and personalizing PD for new teachers (Leung, 2025). This study extends these data-informed efforts to analyze educators' engagement with PD during the pandemic.

Professional Capital Theory

Professional Capital Theory offers a framework for analyzing how user engagement with PD unfolded during the COVID-19 pandemic (Hargreaves & Fullan, 2015). Human capital concerns the need for educators to rapidly acquire skills in technology integration and online teaching methods as they pivoted to remote instruction. Social capital was weakened as traditional in-person professional development was disrupted, reducing opportunities for collaboration. Decisional capital was strained, as educators were required to make quick and informed decisions to adapt to remote learning environments.

Purpose of the Study

This study employs time-series clustering and anomaly detection on web analytics data to examine how educators utilized the EdHub Library during the COVID-19 pandemic. We identify engagement patterns based on accessed pages, locations, dates, and time spent on resources. The study addresses the following research questions: RQ1. What are the general user access characteristics during the pandemic? RQ2. Which professional development materials were detected as regular patterns across locations during the pandemic? RQ3. Which professional development materials were detected as anomalies across locations during the pandemic?

Methods

This study used the Python package Tsfresh (Time Series Feature Extraction Based on Scalable Hypothesis Tests) for time-series clustering and anomaly detection to analyze educators' engagement with PD in three parts: (1) exploratory data analysis, (2) cluster pattern recognition, and (3) anomaly or irregular behavior detection. Tsfresh extracts features from noisy time-series data with temporal dependencies while reducing dimensionality (Blue Yonder GmbH, n.d.).

Data Description

The web analytics data were exported from the Google Analytics dashboard into Jupyter Notebooks and cover 10 months, from March 1, 2020, to December 31, 2020. Time-series analyses were conducted using Tsfresh, while Matplotlib was used for visualization (Matplotlib, n.d.). The key variables were Page (the PD resource accessed), City (the access location), Date (the date and time of user engagement), and Time On Page (engagement duration in seconds) (GA4 Analytics Dimensions and Metrics, n.d.). This selection of variables helped identify educators' regular and irregular engagement with resources over the period of analysis.
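As a minimal sketch of this data preparation step, the snippet below loads a small table with the four variables named above and restricts it to the study window. The sample rows, page paths, and the helper name `load_engagement` are invented for illustration; the actual export format used in the study is not described here.

```python
import io

import pandas as pd

# Invented sample rows standing in for a Google Analytics export.
RAW_CSV = """Page,City,Date,Time On Page
/remote-learning/example-a,Columbia,2020-03-16,420
/administrator/example-b,Fresno,2020-08-24,900
/units-of-instruction/example-c,(not set),2020-09-02,95
/remote-learning/example-a,Columbia,2021-01-05,300
"""

STUDY_START = pd.Timestamp("2020-03-01")
STUDY_END = pd.Timestamp("2020-12-31")


def load_engagement(csv_text: str) -> pd.DataFrame:
    """Parse the export and keep only rows inside the study window."""
    df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
    in_window = df["Date"].between(STUDY_START, STUDY_END)
    return df.loc[in_window].reset_index(drop=True)


engagement = load_engagement(RAW_CSV)
print(engagement)
```

The last sample row falls in January 2021 and is dropped by the date filter, leaving three rows.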

Exploratory Data Analysis

In the first part of the study, Tsfresh supported exploratory data analysis that quantified and visualized educators’ access to PD during the pandemic by identifying the number of unique users at monthly peaks and troughs (high and low points).
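One way to summarize monthly peaks and troughs is sketched below with pandas. It assumes a daily table with a hypothetical `Users` column of unique-user counts; both the column name and the values are invented and are not the study's data.

```python
import pandas as pd

# Invented daily unique-user counts (not the study's data).
daily = pd.DataFrame(
    {
        "Date": pd.to_datetime(
            ["2020-03-10", "2020-03-17", "2020-04-14",
             "2020-04-21", "2020-08-18", "2020-08-25"]
        ),
        "Users": [18, 32, 9, 6, 41, 55],
    }
)

# Group by calendar month and record the highest and lowest daily count.
month_key = daily["Date"].dt.strftime("%Y-%m")
monthly = daily.groupby(month_key)["Users"].agg(peak="max", trough="min")
print(monthly)
```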

Tsfresh for Time-Series Clustering

In the second part of the study, Tsfresh was applied unsupervised to time-series data, converting it into a structured format for clustering (Blue Yonder GmbH, n.d.). This process identified user engagement patterns related to resources, locations, access dates, and time spent on PD. Using the k-means algorithm, the optimal number of clusters was determined to be eight, with a high silhouette score of 0.9614, indicating well-separated clusters.
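The general shape of this step can be sketched as follows: each time series is reduced to one row of summary features, and k-means is run on the resulting table. To keep the example dependency-light, a pandas `groupby` stands in for Tsfresh's feature extraction, which is a substitution and not the study's pipeline. The data are synthetic, and the two-cluster setting is chosen only to match the two invented engagement regimes.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic long-format data: 30 hypothetical series (e.g., one per Page/City pair).
rows = []
for series_id in range(30):
    base = 60 if series_id < 15 else 1800  # invented short vs. long visit regimes
    for day in range(20):
        rows.append(
            {
                "id": series_id,
                "day": day,
                "time_on_page": base + rng.normal(0, 0.05 * base),
            }
        )
long_df = pd.DataFrame(rows)

# One row of summary features per series (pandas stand-in for Tsfresh extraction).
features = long_df.groupby("id")["time_on_page"].agg(
    ["sum", "mean", "median", "min", "max", "std"]
)

# Standardize so that no single feature dominates the distance computation.
X = StandardScaler().fit_transform(features)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
score = silhouette_score(X, labels)
print(f"silhouette score: {score:.3f}")
```

A silhouette score close to +1 means that points sit much closer to their own cluster than to the nearest other cluster, which is how the reported value of 0.9614 is read.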

Tsfresh for Time-Series Anomaly Detection

Third, the anomaly detection component employed the Isolation Forest model to identify irregular user engagement with PD resources that deviated from expected behaviors identified by the clustering step. The MinimalFCParameters in Tsfresh extracted key statistical measures for the Time On Page variable, including sum, mean, median, minimum, maximum, and standard deviation (Blue Yonder GmbH, n.d.). The model identified 1,465 anomalies across small school districts in Missouri and Fresno, California, including specific PD resources.
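A minimal sketch of this kind of anomaly detection is shown below using scikit-learn's `IsolationForest`. The per-series statistics are computed with pandas as a stand-in for the Tsfresh feature set listed above. The data, the `contamination` value, and the number of trees are assumptions chosen for illustration, not settings reported by the study.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Invented data: 200 typical series plus 3 with unusually long Time On Page.
typical = rng.normal(loc=480, scale=60, size=(200, 10))
unusual = rng.normal(loc=3000, scale=100, size=(3, 10))
values = np.vstack([typical, unusual])

long_df = pd.DataFrame(
    {
        "id": np.repeat(np.arange(values.shape[0]), values.shape[1]),
        "time_on_page": values.ravel(),
    }
)

# Per-series statistics mirroring those named in the text.
stat_cols = ["sum", "mean", "median", "min", "max", "std"]
features = long_df.groupby("id")["time_on_page"].agg(stat_cols)

# contamination is an assumed share of anomalies, not a value from the study.
forest = IsolationForest(n_estimators=200, contamination=0.03, random_state=0)
predictions = forest.fit_predict(features[stat_cols])  # -1 marks an anomaly
features["anomaly"] = predictions == -1

print(features.loc[features["anomaly"], ["median", "max"]])
```

Isolation Forest flags points that can be separated from the rest with few random splits, so series with extreme summary statistics tend to be marked first.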

Evaluation Metrics for Clustering

The evaluation of cluster pattern recognition involved three main steps: (1) testing the Tsfresh features with k-means models for candidate cluster counts up to 20 to observe differences in cluster outputs (silhouette scores are defined only for two or more clusters); (2) applying the elbow method to determine the ideal number of clusters based on silhouette scores, which range from -1 to +1; and (3) visualizing the clusters in a principal component analysis (PCA) space to assess their formation (Blue Yonder GmbH, n.d.).
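The three evaluation steps can be sketched as below. The feature matrix is synthetic, with four invented groups, and the example simply selects the cluster count with the highest silhouette score, which is an assumption about how the selection was made. The range of candidate counts is shortened to keep the example fast.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score

# Synthetic stand-in for an extracted feature table: 4 invented groups, 6 features.
centers = np.array(
    [
        [0, 0, 0, 0, 0, 0],
        [10, 10, 0, 0, 0, 0],
        [0, 0, 10, 10, 0, 0],
        [0, 0, 0, 0, 10, 10],
    ],
    dtype=float,
)
X, _ = make_blobs(n_samples=400, centers=centers, cluster_std=0.5, random_state=0)

# Steps 1 and 2: fit k-means for each candidate k and record the silhouette score.
scores = {}
for k in range(2, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(f"best k = {best_k}, silhouette = {scores[best_k]:.3f}")

# Step 3: project onto two principal components for a scatter plot of clusters.
best_labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X)
coords = PCA(n_components=2).fit_transform(X)
```

The `coords` array can be passed to a Matplotlib scatter plot, colored by `best_labels`, to inspect how compact or spread out each cluster appears.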

Findings

RQ1. What are the general user access characteristics during the pandemic?

In the exploratory component, Figure 1 shows that the highest usage occurred from August to October 2020, as educators prepared for remote instruction, and during mid-March 2020, when online classroom observation training took place. Unique users ranged from 157 to 545 during the high-activity periods (August–October). In contrast, the lowest usage was noted from April to July 2020 and November to December 2020, with fewer than 206 unique users.

Figure 1

Peaks in User Engagement From March 1, 2020, Through December 31, 2020

RQ2. Which professional development materials were detected as regular patterns across locations during the pandemic?

In the cluster pattern recognition section, Figure 2 displays eight clusters of small school districts based on the City variable, with access to the Administrator, Units of Instruction, and Remote Learning modules. Many clusters also included a "not set" location, as educators often used proxies and virtual private networks (VPNs) for privacy purposes. Clusters 0 and 1 are denser, indicating similar data points, while Clusters 3 and 7 are more spread out, suggesting higher variability. Clusters nearer to the origin have lower PCA component values, while those farther out may contain outliers in the Time On Page variable. Figure 3 details the characteristics of the clusters, and Figure 4 summarizes the grouping patterns of access to PD resources and cities.

Figure 2

Clusters of PD Resources in PCA Space

Figure 3

General Characteristics of Clusters

Figure 4

Description of Pages and Locations from Clusters

RQ3. Which professional development materials were detected as anomalies across locations during the pandemic?

Figure 5 illustrates the anomaly detection of City, Page, and Time On Page variables. For instance, Ste. Genevieve, MO, had the longest Time On Page at 3,196 seconds (about 53 minutes), and Fresno, CA's Classroom Observation Training had a Time On Page of 2,470 seconds (about 41 minutes).

Figure 5

Detected Anomalies by Median Time On Page Variable (in seconds)

Figure 6 presents the descriptive statistics for the Time On Page variable, revealing substantial variability in user engagement with PD resources during the pandemic. The large standard deviations and the wide gaps between the minimum and maximum values suggest that some educators engaged with PD far longer than others. The mean (540.61 seconds) exceeds the median (477.09 seconds) because outliers inflate the average, indicating a right-skewed distribution. The distribution also appears bimodal, with a large group of users showing low engagement and a smaller group showing very high engagement.
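The relationship between mean, median, and right skew can be illustrated with a small worked example. The durations below are invented, not taken from the study, and Pearson's second skewness coefficient is used here only as a simple indicator of the direction of skew.

```python
import statistics

# Invented Time On Page values in seconds, with one very long session.
durations = [30, 45, 60, 60, 90, 120, 2400]

mean_value = statistics.mean(durations)
median_value = statistics.median(durations)
stdev_value = statistics.stdev(durations)

# Pearson's second skewness coefficient: positive values indicate right skew.
pearson_skew = 3 * (mean_value - median_value) / stdev_value

print(f"mean = {mean_value:.2f}, median = {median_value}")
print(f"Pearson skewness = {pearson_skew:.2f}")
```

A single long session pulls the mean well above the median, while the median stays near the typical visit length.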

Figure 6

Anomaly Metrics Summary for Time On Page (in seconds)

Discussion

The study highlights the key role of online professional development (PD) in assisting educators during crises, especially in smaller districts in Missouri. Time-series clustering and anomaly detection identified general behaviors and moments of significant engagement, showing how the pandemic affected learning needs. The findings are vital for enhancing educators’ human, social, and decisional capital by tailoring online programs to their contexts.

Implications

The study highlights practical implications for program managers to tackle disparities in access, skills, and collaboration that hinder educators’ professional growth. Future materials should prioritize targeted support for resource-constrained school districts by promoting virtual collaboration, tailoring content to their professional needs, and investing in digital literacy initiatives to enhance educators’ preparedness for future crises.

Conclusion

This study examined educators’ engagement with online PD during the pandemic through time-series clustering and anomaly detection. Key findings showed that peak engagement periods coincided with school year preparations and the onset of the pandemic, reflecting a surge in demand for remote instruction training. Cluster analysis revealed consistent patterns of resource access, particularly for the Administrator Library, Units of Instruction, and Remote Learning modules. Notable anomalies included high engagement with webinars and training materials. The results underscore the need for targeted support to reduce disparities and strengthen future PD initiatives.

References

  1. Aggarwal, C. C. (2015). Data mining: The textbook. Springer. https://doi.org/10.1007/978-3-319-14142-8
  2. Blue Yonder GmbH. (n.d.). tsfresh/tsfresh at main · blue-yonder/tsfresh. GitHub. Retrieved January 23, 2025, from https://github.com/blue-yonder/tsfresh/tree/main/tsfresh
  3. GA4 Analytics Dimensions and Metrics. (n.d.). Retrieved January 23, 2025, from https://support.google.com/analytics/table/13948007?hl=en
  4. Hargreaves, A., & Fullan, M. (2015). Professional capital: Transforming teaching in every school. Teachers College Press.
  5. Hill, H. C., Beisiegel, M., & Jacob, R. (2013). Professional development research: Consensus, crossroads, and challenges. Educational Researcher, 42(9), 146–154. https://doi.org/10.3102/0013189X13512674
  6. Kim, L. E., Leary, R., & Asbury, K. (2021). Teachers’ narratives during COVID-19 partial school reopenings: An exploratory study. Educational Research, 63(2), 244–260. https://doi.org/10.1080/00131881.2021.1918014
  7. Leung, J. (2024a). Supporting K-12 educators’ professional development continuity: A forecasting approach for curating online resources for emergency preparedness planning. The Journal of Applied Instructional Design, 13(2). https://doi.org/10.59668/1269.15622
  8. Leung, J. (2024b). Improving educators’ search engine experience: A quantitative analysis of search terms. IEEE Access, 12, 69076–69086. http://dx.doi.org/10.1109/ACCESS.2024.3393423
  9. Leung, J. (2025). Insights into online educator professional development: Revealing resource recommendations with association rule mining. The Journal of Applied Instructional Design, 14(2). https://doi.org/10.59668/2222.21504
  10. Mao, S., Zhang, C., Song, Y., Wang, J., Zeng, X. J., Xu, Z., & Wen, Q. (2024). Time series analysis for education: Methods, applications, and future directions. arXiv. https://arxiv.org/abs/2408.13960
  11. Matplotlib. (n.d.). Retrieved January 23, 2025, from https://matplotlib.org/
  12. Reich, J., Buttimer, C. J., & Thompson, M. (2020). Remote learning guidance for educators during the COVID-19 pandemic: A first look. Journal of Learning Sciences, 29(4), 573–581. https://doi.org/10.35542/osf.io/437e2
  13. Schildkamp, K., Wopereis, I., Kat-De Jong, M., Peet, A., & Hoetjes, I. (2020). Building blocks of instructor professional development for innovative ICT use during a pandemic. Journal of Professional Capital and Community, 5(3-4), 281–293. https://doi.org/10.1108/JPCC-06-2020-0034
  14. Trust, T., & Whalen, J. (2020). Should teachers be trained in emergency remote teaching? Lessons from the COVID-19 pandemic. Journal of Technology and Teacher Education, 28(2), 189–199. https://doi.org/10.70725/307718pkpjuu