Research interest in digital game-based second language learning (DGBL2L) has existed for over two decades (e.g., Coleman, 2002; Emde et al., 2001) and has continued to grow in the past ten years (Huang & Schmidt, 2022). A review of 16 peer-reviewed systematic reviews, scoping reviews, and meta-analyses published between 2011 and 2020, together with a systematic review of empirical studies in the field of DGBL2L over the same period, revealed various benefits digital games could potentially bring to English language learners. For example, motivation is widely accepted by second language researchers and practitioners as an important factor in the success of second language acquisition (Dörnyei, 1998; Ebrahimzadeh & Alavi, 2017). Digital game-based learning may influence motivation by bringing an authentic second language sociocultural context (e.g., Jabbari & Eslami, 2019; C. Wang et al., 2020; Yaşar, 2018), which is typically remote to learners in the physical world, into an immersive learning environment (e.g., Blume, 2020; Neville et al., 2009; Rankin et al., 2009). Furthermore, evidence suggests that digital game-based learning promotes second language motivation by providing challenge, a sense of control, and rewards (Jackson & McNamara, 2013; Laine & Lindberg, 2020; Sandberg et al., 2014). Digital games may also enhance learner engagement (Chen & Kent, 2020; Hung & Young, 2015), persistence (e.g., Chen et al., 2018; Eltahir et al., 2021; Sung et al., 2017), and enjoyment (Gellar-Goad, 2015; Hartfill et al., 2020; Lingwati, 2017), and promote social interaction (e.g., Jabbari & Eslami, 2019; Poole & Clarke-Midura, 2020; Yudintseva, 2015). Digital games can also benefit vocabulary learning, communicative skills, creativity, and writing skills development (Poole & Clarke-Midura, 2020; C. Wang et al., 2020; Xu et al., 2020; Yaşar, 2018).
While digital games show promising potential to support English language learning, researchers have pointed out a range of limitations in prior empirical research and practice, two of which relate to the design and development of digital learning games. First, prior review articles noted a lack of studies on second language skills beyond vocabulary learning (Hung et al., 2018; Poole & Clarke-Midura, 2020; Xu et al., 2020). This was also evidenced in a systematic review I conducted across 10 academic databases (Huang & Schmidt, 2022): of 209 empirical studies, 52% focused on enhancing vocabulary learning alone. A potential solution to this limitation is to intentionally design and develop language learning games that promote language skills such as listening, speaking, reading, and writing. Second, prior review articles pointed out a need to better integrate second language pedagogies into game mechanics (Acquah & Katz, 2020; C. Wang et al., 2020; Xu et al., 2020; Yudintseva, 2015). Aligned with the Common European Framework of Reference for Languages (Council of Europe, n.d.) and the American Council on the Teaching of Foreign Languages Proficiency Guidelines 2012 (ACTFL, 2012), task-based language teaching is grounded in a competence-based communicative approach (Littlewood, 2004) and aims to promote the development of learners’ communicative skills (e.g., Cai & Lv, 2019; Ellis, 2000). Bryfonski and Mckay’s (2019) meta-analysis suggested that task-based language teaching can be efficacious in promoting language communication skills. Therefore, intentionally designing digital games that integrate task-based language teaching pedagogy with game mechanics could be a potential solution to this second limitation.
Digital games for learning are not easy to design effectively due to their interdisciplinary nature (Bellotti et al., 2013), especially the link between pedagogy and game design (Abbott, 2020). Digital learning games are likely to fail without a pedagogical and learner-focused foundation (Lepe-Salazar, 2015; Westera, 2019). Unfortunately, most prior empirical studies on DGBL2L did not unpack the design of the learning games before testing their impact on language learning outcomes (Rankin & Edwards, 2017). Just as a quantitative study’s validity diminishes when it employs an unvalidated instrument, a study based on a designed intervention is weakened when the design has not been scrutinized with the intended learners. To address weaknesses in the design of digital game-based learning interventions, such as the link between pedagogy and game design, Abbott (2020) proposed a learner experience design (LXD) approach to developing digital learning games as a potential solution.
As part of a multi-phase, mixed-methods, iterative LXD project, the purpose of this paper is to present the iterative design and formative evaluation of a 3D virtual world game for English language learning, The Future Writers. The paper showcases how the LXD approach may be used to systematically design and evaluate learner experience, which may benefit researchers and practitioners interested in LXD-related research and practice in both DGBL2L and the broader field of learning/instructional design and technology (LIDT). To enhance the understanding of LXD, this paper also demonstrates that multiple usability dimensions (technological, pedagogical, and sociocultural; Jahnke et al., 2020) not only coexist but also intersect in a product designed for learners as users. The overarching research questions that guided this inquiry were:
RQ1: What usability problems are present, and what design improvements might be needed, when participants formatively evaluate this intervention?
RQ2: How do intended learners rate the intervention’s usability?
Gray (2020) defines design thinking as a user-centered approach to design, one that has been instantiated substantially across disciplines in user experience design (UXD). UXD can be broadly applied to any design (a chair, a cell phone app, a vacation package, etc.). On the research side, in the field of LIDT, there has been increasing interest in adopting or adapting UXD methods and processes to design, develop, and evaluate interventions for the purpose of learning (e.g., Chang & Kuwata, 2020; Quintana et al., 2020; Schmidt, Earnshaw, et al., 2020). For example, Cavignaux-Bros and Cristol (2020) adopted participatory design and co-design methods in the iterative design and development of a massive open online course (MOOC) on public innovation. Stefaniak and Sentz (2020) also suggest that UXD can benefit LXD by helping designers take an empathetic and pragmatic approach to aligning interventions with learners’ actual and contextual needs. On the practitioner's side, there is evidence of rapid adoption of the LXD approach across multiple industries (Waight et al., 2023; X. M. Wang et al., under review).
Because LXD is an emerging phenomenon in the field of LIDT, researchers and practitioners are still working to gain a better understanding of it (e.g., Gray, 2020; Quintana et al., 2020; Schmidt, Tawfik, et al., 2020). For example, some pioneering researchers have made efforts to define LXD (Ahn, 2018; Chang & Kuwata, 2020; Stefaniak, 2020; Vann & Tawfik, 2020). Among these efforts, Schmidt and Huang (2021) assert that LXD is “a human-centric, theoretically-grounded, and socio-culturally sensitive approach to learning design, intended to propel learners towards identified learning goals, and informed by UXD methods” (p. 1).
As a close sibling of UXD, the LXD approach shares many common UXD methods, such as (1) heuristic evaluation and cognitive walkthrough (Schmidt, Earnshaw, et al., 2020), (2) empathy interviews (Schmidt et al., 2022), (3) personas (Schmidt, Earnshaw, et al., 2020), (4) participatory design (Cavignaux-Bros & Cristol, 2020), and (5) task-based think-aloud usability studies (Lu et al., 2022). However, these methods need to be adapted for the purpose of learning. For example, Nielsen’s heuristic evaluation (Nielsen, 1994), perhaps the most widely used usability instrument, gauges only technological usability; the pedagogical and sociocultural perspectives are missing from this instrument. To address this gap, Jahnke and colleagues (2020) proposed a multi-dimensional usability framework (Figure 1). Findings from the present study show that not only do multiple dimensions of usability exist in technologies designed specifically for learning, but some usability considerations also live at the intersections of the three dimensions.
Figure 1
Multi-Dimensional Usability Framework for Learner Experience (Jahnke et al., 2020; used with permission)
Two research questions guided this study. RQ1 asked what usability problems are present, and what design improvements might be needed, when participants formatively evaluate this intervention. This question focused on gaining a deeper understanding of the usability problems that could be identified by experts and intended learners, and the corresponding improvements needed. RQ2 asked how intended learners rate the intervention’s usability. This question focused on using the System Usability Scale (SUS; Brooke, 1996) to quantitatively analyze how intended learners rated the intervention’s usability.
The study used a mixed-methods research design (see Figure 2) and was approved by the institutional review board at the University of Florida. The study consisted of three phases in which various LXD methods were used to support the design iteration that occurred after each round of formative evaluation (e.g., four iterations during the Phase 1 expert evaluation). Phase 1 focused on front-end analysis and early-stage prototyping; Phase 2 focused on empathy interviews, formative evaluation, and iterative design improvements; and Phase 3 focused on the game’s functionality in terms of how well it facilitated collaborative language learning. This paper reports only the first two phases.
Figure 2
A Multi-Phase LXD Approach to Design and Develop a DGBL2L Intervention
I created, designed, and developed The Future Writers, a browser-based 3D virtual world game. The tools I used to develop this environment are publicly available and free of cost: (1) Mozilla Hubs [Software app], which functions like a virtual world playroom, and (2) Mozilla Spoke [Software app], the toolbox used to build virtual spaces. The Mozilla Hubs environment provided affordances for learners to use multiple language communication skills such as reading, writing, listening, and speaking, including: (1) a multimedia playback function, (2) a voice chat function, and (3) a text chat function.
This intervention is composed of six virtual spaces: (1) the pre-game lobby area (Figure 3), where everyone meets and learners watch a short overall introduction video; (2) Future Land (Figure 4), where the two learners enter the game world on either side of a partition rail and watch an introduction video together to learn the game premise and immediate quests; (3) two Evidenceverse spaces, one for Tom and the other for Jenny (Figure 5), each of which only one learner may enter to watch an introduction video on the game/learning quests in that space plus three videos related to the story character (Tom or Jenny), so that the player can collect information they think could be helpful to share with their partner in the next space; (4) Magicverse (Figures 6 and 7), the space where the two learners reunite to watch a game/learning quests video, discuss their findings, draw a conclusion about what tragedy might happen to the two characters (Tom and Jenny), and then write a story to change their future; and (5) the post-game meeting area (Figure 8), where learners and researchers meet to discuss their learning and gameplaying experiences.
Figure 3
Pre-Game Lobby Area
Figure 4
Future Land
Figure 5
Evidenceverse for Jenny
Figure 6
Entering Magicverse, Where Learners First Reunite to Watch Another Introduction Video and Learn About Game/Learning Quests
Figure 7
Learners Compose a Future Story in Magicverse
Figure 8
Post-Game Meeting Area
I used a convenience sampling method to recruit experts for the expert evaluation. Specifically, two immersive learning experts (with at least three years of immersive learning technology research or teaching experience) performed a heuristic evaluation, and two subject-matter experts (English as a second language researchers or instructors) performed a cognitive walkthrough (Schmidt, Earnshaw, et al., 2020).
Expert evaluation methods may be applied in any design phase throughout an LXD process (Schmidt, Earnshaw, et al., 2020); however, given the scarcity of prior empirical studies that unpack the “black box” of how DGBL2L interventions were designed to foster language learning, it is helpful to have experts evaluate early-stage prototypes to avoid potential major flaws.
The heuristic evaluation was performed by the two immersive learning experts, who examined the interface and shared their professional opinions using a heuristic evaluation checklist (Jahnke et al., 2021). The cognitive walkthrough was performed by the two subject-matter experts following Spencer’s (2000) streamlined cognitive walkthrough guidelines. Each data collection session was conducted on Zoom, video-recorded, and transcribed.
To iteratively improve the design of the intervention, after each data collection cycle I prioritized the problems identified by the experts by assigning severity levels between 0 and 4 using Nielsen’s (1994) usability severity rankings. All Level 3 and Level 4 problems and most Level 2 problems were resolved before the next expert evaluation; Level 1 problems were addressed when time permitted; and Level 0 problems were not addressed because these issues were not considered usability problems.
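As a minimal sketch of this triage rule (written in Python purely for illustration; the problem records and field names are hypothetical and not part of the study materials, though two descriptions loosely paraphrase problems reported in the findings), the prioritization logic can be expressed as follows:

```python
# Hypothetical usability problem records; "severity" follows Nielsen's (1994) 0-4 rankings.
problems = [
    {"description": "Navigation after the Magicverse instructions is confusing", "severity": 4},
    {"description": "Dome 1 video volume is too low", "severity": 3},
    {"description": "Minor wording issue on a label", "severity": 1},
    {"description": "Purely aesthetic preference", "severity": 0},
]

# Triage rule used in this study: resolve Levels 3-4 (and most Level 2 problems)
# before the next evaluation round, address Level 1 as time permits, and treat
# Level 0 as not being a usability problem.
resolve_before_next_round = [p for p in problems if p["severity"] >= 3]
resolve_most = [p for p in problems if p["severity"] == 2]
resolve_if_time_permits = [p for p in problems if p["severity"] == 1]
not_usability_problems = [p for p in problems if p["severity"] == 0]

print(len(resolve_before_next_round), len(resolve_if_time_permits))  # 2 1
```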
The problems coded between level 1 and level 4 were used as the data source to answer RQ1. Through thematic coding, technological, pedagogical, and sociocultural usability problems (Jahnke et al., 2020) were identified.
I used a convenience sampling method to recruit intended learner participants. The inclusion criteria for the empathy interview required that each participant: (1) be an international student studying in the United States, (2) be officially enrolled in English Language Institute programs at the University of Florida, and (3) be at a beginner to intermediate proficiency level according to the English Language Institute's standards. Four participants (two female and two male) participated in the empathy interviews.
The interviews were conducted using Zoom online meeting software. Webcam video and audio were recorded. I also took interview notes.
Empathy interview transcripts were analyzed using empathy mapping methods (Siricharoen, 2021; Thompson et al., 2016). An empathy map is composed of four major parts: what each participant (1) says, (2) thinks, (3) does, and (4) feels. After four empathy maps were generated, they were compared to identify themes across the maps. Figure 9 illustrates a typical empathy interview data analysis process, including the empathy interview, empathy map, and persona development (Ferreira et al., 2015; Schmidt et al., 2022). Personas are fictitious users who represent typical users and who might employ the technology within their specific usage context (Miaskiewicz & Kozar, 2011). In an LXD approach, personas may be used throughout the entire iterative design process to help situate learning within learners’ lived experiences (Robinson & Harrison, 2017; Schmidt & Tawfik, 2022).
Figure 9
Process of Empathy Interviews, Empathy Mapping, and Development of Patient Personas (Schmidt et al., 2022; used with permission)
Participants were recruited through a convenience sampling method. The inclusion criteria remained largely the same as for the empathy interviews, except that the English language proficiency requirement was refined to the intermediate level. Five qualified learners participated in the usability study: two female and three male students (ages 19 to 27; three from Asian countries and two from Arabic-speaking countries).
Each usability study session lasted around two hours. Upon completion of a Mozilla Hubs training game, the participant took a short break. Then, I facilitated the usability study with the participant. During the usability study, a task-based concurrent think-aloud approach (Krug, 2010) was used, followed by administration of the SUS and then a self-developed semi-structured interview. The task-based think-aloud protocol and the semi-structured interview questions were created using Krug’s (2010) guidelines and are provided in the appendix.
In addition to data collection approaches used in the empathy interviews, screen recordings from the participants’ game-playing experience were also captured using OBS (2022).
Usability problems were identified through four different approaches: (1) participants’ task completion status, (2) participants’ verbal expressions of problems or preferences, (3) conversations between a participant and me that revealed a usability problem, and (4) my observations of existing or potential usability problems. The problems were then thematically coded to identify evidence of multi-dimensional usability problems.
Participant ratings of usability were determined via analysis of SUS data. All participants’ SUS results were calculated according to Brooke's (1996) guidelines. Specifically, the calculation has two steps: (1) for odd-numbered (positively worded) items, subtract one from the score, and for even-numbered (negatively worded) items, subtract the score from five; then (2) multiply the sum of all item scores by 2.5 to normalize the result to a 0–100 scale.
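To illustrate this two-step calculation, the following minimal Python sketch (illustrative only; the function and variable names are not part of the study materials) computes a single participant's aggregated SUS score from the ten item responses:

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten item responses on a 1-5 scale.

    Odd-numbered (positively worded) items contribute (score - 1);
    even-numbered (negatively worded) items contribute (5 - score).
    The summed contributions are then multiplied by 2.5.
    """
    assert len(responses) == 10
    total = 0
    for item_number, score in enumerate(responses, start=1):
        total += (score - 1) if item_number % 2 == 1 else (5 - score)
    return total * 2.5

# Example: Participant 1's item responses reported in Table 2 yield an aggregated score of 67.5.
print(sus_score([4, 2, 4, 3, 4, 2, 4, 2, 4, 4]))  # 67.5
```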
In this study, usability problems were identified via expert evaluation and usability testing; therefore, the results from both are presented together. This section is composed of three subsections: (1) findings from the empathy interview, and specifically, the empathy map and persona, (2) identified usability problems across two phases, and (3) findings from the SUS.
Four learners participated in empathy interviews. Based on the interview transcripts and video recordings, four empathy maps were developed (see Figure 10 for an example).
Figure 10
An Example Empathy Map Based on an Empathy Interview Performed in This Study
Empathy maps were then used to inform the development of learner personas. Figure 11 provides an example persona, which was used to help situate my design in the target learners’ lived experiences.
Figure 11
An Example Persona Developed for This Study
Four experts identified 75 usability problems in Phase 1, and five intended learners identified 62 usability problems in Phase 2. Table 1 shows the stratified usability problems across three usability dimensions in both phases.
Table 1
The Stratified Usability Problems in Phase 1 and Phase 2
Category | Phase 1 Count | Phase 1 Percentage | Phase 2 Count | Phase 2 Percentage
Technological | 54 | 72% | 40 | 65%
Pedagogical | 19 | 25% | 20 | 32%
Sociocultural | 2 | 3% | 2 | 3%
Total | 75 | 100% | 62 | 100%
If evaluated only on the total number of identified usability problems, it might appear that the game design did not improve much over the iterative process. However, Figure 12 depicts the identified problems by severity level, which shows clear improvement: the intended learners identified fewer high-severity usability problems than the experts had.
Figure 12
Usability Problems in Two Phases by Severity Levels
In Phase 1, examples of Level-4 technological usability problems include: (1) the introduction video in the Future Land space contained too much information, causing the player cognitive overload, which was a learnability problem according to ISO 25010 (International Organization for Standardization, 2011), and (2) the player could not continue the game quests after reading the instructions in Magicverse because the navigation system was confusing, which was an operability problem (International Organization for Standardization, 2011).
In Phase 2, the Level-4 technological usability problem was a 3D virtual space floor plan problem. It occurred because revisions made in the Mozilla Spoke project file accidentally blocked the walkable area. While it was a catastrophic problem, the fix was easy. The Level-3 technological usability problems fell into four subcategories: (1) multimedia object settings (e.g., the Dome 1 video volume was too low), (2) human-computer interaction problems (e.g., it was unclear in which media frame to place the text chat message), (3) unclear game instructions (e.g., it was unclear which player could use which yellow glass panel to post notes), and (4) 3D space technological problems (e.g., a participant’s avatar moved under the sand because he was looking down). Some of the problems I identified were triangulated by the post-usability-testing interviews. For example, one participant commented: “In the game, sometimes I was not clear; should I click on this, should I press the link on the crystal? This is the only part that I was a bit confusing.”
In Phase 1, an example pedagogical problem identified by an immersive learning expert was that he could not figure out why he had to listen to the videos and take notes in Evidenceverse until he reached the Magicverse. This problem revealed that the language learning task integrated into the game needed to be better explained. An example pedagogical problem identified by a subject-matter expert was that, in Magicverse, when learners were asked to compose the final story, there was no clear indication of how the five blue glass panels differed in terms of what to write on each. He suggested providing prompting questions to guide learners’ story development, considering their level of English proficiency.
In Phase 2, the ten Level-2 pedagogical usability problems were further grouped into four subcategories: (1) the narration speed of the videos in the game was not appropriate (in three instances it was considered too slow, and in one too fast), (2) the story videos without closed captions were somewhat hard to understand, (3) it was unclear how to use the writing glass panels collaboratively, and (4) participants did not fully understand the writing prompt questions (e.g., it was unclear what the word “overview” meant on the first blue glass panel). An example improvement: when one subject-matter expert suggested adding closed captions to the story videos, I designed videos with and without closed captions and performed an A/B test during the first two usability study sessions with intended learners to verify this pedagogical design feature. Both participants preferred videos with closed captions. Therefore, closed captions were added to all story videos and tested in the following three usability study sessions, and the results showed a unanimous preference for closed captions.
This intervention is built on a jigsaw-type task-based language learning activity; therefore, ensuring communication and interaction between the two players/learners is fundamental. When I coded the usability problems, any problem that could potentially hinder communication and interaction between the two learners was coded as a sociocultural problem, regardless of its cause. In addition, any culture-related problems, such as demographically unrepresentative or culturally misrepresentative content, would be coded as sociocultural problems; in this study, no such problems were identified. In Phase 1, there were two sociocultural problems. First, one immersive learning expert pointed out that the last game instruction said players would share evidence in Magicverse, but because he had already typed the evidence into the chat, he believed he had already shared it with the other player. This problem could potentially hinder the collaborative learning activity the game was designed to foster. Second, although discussion was encouraged through game prompts in Evidenceverse, a subject-matter expert worried that learners might not discuss with each other unless something was unclear. This is a potentially valid point that I was not able to address within the scope of this study. However, it suggests a future research direction: comparing learners who watch the Evidenceverse videos together versus separately in terms of desired language learning behaviors.
In Phase 2, there were also two sociocultural usability problems. First, a participant correctly understood that he was by himself in the Evidenceverse; however, this was based on his guess rather than on the game instructions. Second, a participant thought he could not talk to his partner in Magicverse because the microphone on the Mozilla Hubs interface was muted. Both sociocultural problems led to one improvement: a game world map indicating the means of communication available in each space (see the game world map in Figure 6) was added to the introduction video for every game space. In addition, a game instruction was added to the Evidenceverse introduction to remind players: “you can’t see each other in Evidenceverse, but you can still chat.”
To answer RQ2, the SUS results collected in Phase 2 from the five intended learners across all ten items, with aggregated statistics, are reported in Table 2. Each participant's aggregated SUS score was calculated by subtracting one from each odd (positively oriented) item score, subtracting the original score from five for each even (negatively oriented) item, and then multiplying the sum of all scores by 2.5 to normalize it to a scale of 100. The mean of the aggregated scores across the five participants was then calculated as the final SUS rating. According to Lewis and Sauro (2018), the benchmark score for a system with average usability is 68. The participants rated this intervention with high overall perceived usability (M = 83, SD = 10.67).
Table 2
Quantitative Analysis From the SUS Survey Results
Item | P1 | P2 | P3 | P4 | P5 | Mean | Min | Max | SD
1. I think that I would like to use this space frequently. | 4 | 5 | 5 | 4 | 4 | 4.4 | 4 | 5 | 0.55
2. I found the space unnecessarily complex. | 2 | 2 | 1 | 2 | 3 | 2 | 1 | 3 | 0.71
3. I thought the space was easy to use. | 4 | 4 | 5 | 5 | 4 | 4.4 | 4 | 5 | 0.55
4. I think that I would need the support of somebody to be able to use this space. | 3 | 4 | 1 | 1 | 2 | 2.2 | 1 | 4 | 1.30
5. I found the various functions in this space were well integrated. | 4 | 5 | 5 | 5 | 4 | 4.6 | 4 | 5 | 0.55
6. I thought there was too much inconsistency in this space. | 2 | 1 | 1 | 2 | 2 | 1.6 | 1 | 2 | 0.55
7. I would imagine that most people would learn to use this space very quickly. | 4 | 4 | 4 | 5 | 5 | 4.4 | 4 | 5 | 0.55
8. I found the space very awkward to use. | 2 | 1 | 1 | 1 | 1 | 1.2 | 1 | 2 | 0.45
9. I felt very confident using the space. | 4 | 5 | 5 | 3 | 5 | 4.4 | 3 | 5 | 0.89
10. I needed to learn a lot of things before I could start using this space. | 4 | 1 | 1 | 3 | 1 | 2 | 1 | 4 | 1.41
Aggregated Score | 67.5 | 85 | 97.5 | 82.5 | 82.5 | 83 | | | 10.67
While the intervention’s design outcome was largely positive, this study has several limitations. First, the majority of the participants were around the same age. Second, because this study was based on my dissertation, all data analysis was performed by me, and no additional researchers were involved to conduct member checks. To account for this limitation, the mixed-methods study was designed with multiple approaches that enabled data triangulation.
As discussed earlier, usability evaluation is a widely practiced methodology in user experience research and practice (Jahnke et al., 2020). However, in the context of LXD, pedagogical and sociocultural usability considerations have been lacking. In this study, the multiple dimensions of usability, namely the technological, pedagogical, and sociocultural dimensions, were clearly evidenced in the design and evaluation of this intervention through both the expert evaluation and the usability study. As presented in the findings, all usability problems (except Level 0 problems, which were coded as non-usability problems) were stratified into one of the three dimensions, and most of the problems were solved with corresponding solutions.
Based on the above findings, the three dimensions may seem mutually exclusive, which appears to contradict Jahnke and colleagues’ (2020) multi-dimensional usability framework. However, I would argue that: (1) the multi-dimensional usability framework serves well as a guiding framework when viewing the usability of a learning technology at the macro level; and (2) when examining usability problems and solutions at the micro level, it is better to decompose usability problems into single dimensions so that solutions can be designed more specifically, often differently, and potentially more effectively.
At the macro level, an LXD product or intervention is composed of many design objects. Some design objects have only one type of usability (e.g., the game navigation system has technological usability; prompts for learners to discuss video content with partners have pedagogical usability), some sit at the intersection of two usability dimensions (e.g., writing prompts for posting key information), and some sit at the intersection of all three dimensions (e.g., the story videos). Table 3 provides examples of design objects and the corresponding design considerations from each usability dimension. This provides evidence that, when a product or intervention is viewed at the macro level, usability in LXD not only has multiple dimensions but these dimensions also intersect.
Table 3
Design Informed From Multiple Usability Perspectives
Dimension | Design Object | Example Usability Considerations |
Technological | Game navigation system | Do players understand that the moving particles system indicates the direction they should move in the game world? |
Pedagogical | Future writing prompts | Do the writing prompts provide affordances for extended and creative writing opportunities, which is essential to promote language learning? |
Sociocultural | The game environment | The game premise is science fiction with a futuristic feeling; does the game environment (e.g., the surrounding view) align with the players’ perception of futuristic culture? |
Techno-pedagogical | Writing prompts for posting key information | Technological: Do players know which yellow glass to use to post their information? Do players know how to post information in the 3D game world? Pedagogical: Do players know what to write and post? |
Sociotechnical | Post-game meeting area | Sociocultural: Do the environment and 3D objects in the environment create a comfortable feeling to elicit discussion? Technological: Are there technological affordances in this area to support players’ preferred means of communication (e.g., text chat, voice chat)? |
Socio-pedagogical | The game quest that asks players to discuss the disaster | Sociocultural: Does the quest instruction elicit discussion between players? Pedagogical: Does the task type provide affordance to a meaningful discussion? |
Sociotechnical-pedagogical | Story videos | Sociocultural: Are the images for the story characters (e.g., Tom, Jenny) culturally inclusive and non-discriminating? Technological: Are the videos easy to interact with for playing, pausing, and rewinding? Can the videos play normally? Pedagogical: Is the wording in the video stories slightly above players’ language proficiency level so that they experience a pleasant challenge? Is the audio narration speed appropriate for the target learners? |
At the micro level, the examples in Table 3 also reveal that, because each design object resides in one or more usability dimensions, it is beneficial for designers to intentionally develop usability design considerations and corresponding solutions along each dimension when designing a product or intervention. The same practice extends to the analysis of identified usability problems, where decomposing problems into single-dimension usability problems supported more focused solutions. For example, in Phase 1, one expert was not able to accomplish the writing task at the end of the game because she was not clear on the game instructions. This problem was broken down into two usability problems: (1) a technological usability (learnability) problem: the participant thought she literally needed to write on the glass; and (2) a pedagogical usability problem: the three instructions above the five blue glass panels made her think she needed to write three parts, whereas the original design intention was to leave the structure open while encouraging learners to write as much as possible. After the two usability problems were identified, I created different solutions to address them: for the technological problem, I added an instruction on how to post text into the 3D game world; for the pedagogical problem, I added writing prompts above each of the five glass panels. This is one of many examples from this study showing that, in LXD practice, when designing an intervention at the micro level it is helpful to decompose usability problems into single dimensions and address them accordingly.
As mentioned above, this paper reports two phases of a three-phase study. Following Phase 2, Phase 3 playtesting was conducted with four dyads of intended learners to find evidence of language learning while they played the online game and to determine whether the learners were satisfied with their learning experience. Findings revealed positive results on both counts.
Grounded in the entire three-phase study, several future research directions emerged. For example, other pedagogical design approaches that might also support language learning opportunities could be integrated into the game mechanics, such as comparing language learning outcomes between collaboratively watching the story videos and the current model (a jigsaw-type task). Another direction could be the use of natural language processing to support on-demand language learning.
Most of the prior empirical studies on DGBL2L did not unpack the design of the interventions before testing their impact on learning outcomes (Rankin & Edwards, 2017). Just as a quantitative study would be weakened by using a non-validated instrument, so is a design study weakened by using designs that have not been vetted with intended learners. This study showcased how the LXD approach may be used to design and evaluate learner experiences. In addition, this paper provides evidence of how the multi-dimensional usability framework can guide LXD design practice at the macro level. This study also proposed the application of multi-dimensional usability at the micro level, which researchers and practitioners could apply and further examine in future studies.
Hi, [insert participant’s name]. My name is [Tammy Huang]. I am a doctoral candidate majoring in Educational Technology and minoring in Human-Computer Interaction. My dissertation study focuses on designing and developing a 3D virtual world English language learning game for international students in America. As part of my dissertation study, today, I am facilitating this usability study session.
Before we continue, I would like to send you a link so you can read the consent form. If you agree to participate, please make the selection at the end of the form and let me know when you are done.
https://ufl.qualtrics.com/jfe/form/SV_aYpoK0lJMmgBGom
[START OBS RECORDING]
[START ZOOM RECORDING]
Great, thanks! May I ask, what is your next step plan after finishing the ELI program?
Today, we have two activities. First, you will go through a training game to learn how to use the game platform. Next, you will do the usability study of this game. I will explain what a usability study is and what you will do later.
For now, I am sending you a link for the training game: https://hubs.mozilla.com/gj2mehh/peaceful-witty-place
When you go through this training game, I will help you when you have any questions. Are you ready to get started?
Go ahead and click on it, then wait for it to load.
Once the page is fully loaded:
Before we visit the room, let’s get you signed in.
On the lower right corner there is a button that says “More” (an icon with three dots). Go ahead and click on it, then choose “Sign In” from the menu.
Use the username I just sent in Chat.
Username: altstudio@coe.ufl.edu
Input the username and click on Next. I will authorize you on my end.
Ask participant to mute Hubs upon entry
You did a great job in the training game. Now, do you want to take a break before we move on to the next activity of the day?
Before we begin this activity, I would like to give you a brief introduction to how we do this usability study and how you can help me improve this game. I’m going to read it to make sure that I cover everything.
Today, you are helping me do a usability study on this learning game. I would like to see how well the game works from your perspective. You can evaluate it from three perspectives: technology, language learning, and interaction.
This meeting should take about 1 hour. There will be a pre-game meeting area, three game areas, and a post-game meeting area. In each area, I would like to first get your overall impression, and then, you’ll be asked to complete tasks in each area by using different features. When you do the tasks, I will ask you to try to think out loud as much as possible: to say what you’re looking at, what you’re trying to do, and what you’re thinking. This will be a big help for me to understand how you, as a learner, think when playing this game. In the end, I will also ask you some questions about what you thought of your experience.
The first thing I want to clarify right away is that I am testing the design of this virtual world English learning game – not you. You can’t do anything wrong here. If you find yourself confused or unable to understand, that means it is exactly what I need to work on to improve. So, don’t worry about hurting my feelings because you are helping me make this learning game better for you and other learners like you.
If you have any questions as we go along, just ask them at any time. I may not answer them right away since we’re interested in how people do when they don’t have someone who can help. But if you still have any questions when we’re done, I’ll try to answer them then.
And if you need to take a break at any point, just let me know.
Do you have any questions so far?
Are you feeling comfortable and ready to begin?
********
OK, great. Before we get into the learning game, I would like you to share your screen, and I would like to video record the remaining time of our meeting. This video is for the research purpose only, and it will be kept confidential and only accessible to the research team. Is it OK with you?
Thank you very much!
*********
I just pasted a link to the learning game in Zoom’s chat.
https://hubs.mozilla.com/k757sST/the-future-writers-lobby
Facilitator link:
https://hubs.mozilla.com/k757sST/the-future-writers-lobby#Facilitator
Please click on it. Good. Now, please click on “Join Room.” You are doing great.
Overall Impression (2 minutes)
Good. Now, you entered the Pre-Game Meeting Area. In this area, the two players will meet with the teacher to get to know each other and learn the basic information about this game.
Now, I want you to look around the space and tell me what you think of it.
Usability Task
Now, please watch the video. After finishing, I’d like to ask several questions.
Note: Allow the participant to proceed from one task to the next until you don’t feel like it’s producing any value, or the participant is unable to complete the task. Repeat until the participant has provided sufficient feedback on the task, the task is completed, or until time runs out.
Overall Impression (2 minutes)
Good. Now, you have entered the Future Land area of this virtual world. First, I will ask you to look around this space by pressing the Q or E key on your keyboard and tell me what you think of it.
Usability Task
Now I’m going to ask you to do whatever you think you should do based on your understanding of the message and videos you read or watch in this game world. Again, as much as possible, it will help me if you can try to think out loud as you go along.
When facilitating
Provide verbal encouragement to motivate the participant to continue, “Good job,” “Fantastic,” “You are doing great,” etc.
Observe
● Is the system-level instruction helpful for the participant to know what to do on this page? (T usability)
● Is the pedagogical-level instruction clear to the participant? (P usability)
● Does the pedagogical-level instruction prompt them to do the learning task? (P usability)
Upon completion, ask the following questions:
● What do you think about the instructions provided by Robert Finch? Did you find anything not clear? (TP usability; Design principle: appropriate scaffolding)
● What do you think about the collaborative discussion prompts in this quest? Do you feel they are helpful in prompting you to discuss English language points you are not certain about with your game partner? (S usability, potential to elicit LREs)
● Do you have any immediate ideas or suggestions on how to improve this part you just experienced?
Overall Impression (2 minutes)
Good. Now, you have entered the Evidence Verse. First, I will ask you to look around this space by pressing the Q or E key on your keyboard and tell me what you think of it.
Usability Task
Now I’m going to ask you to do whatever you think you should do based on your understanding of the message and videos you read or watch in this game world. Again, as much as possible, it will help me if you can try to think out loud as you go along.
Observe
● Is the system-level instruction helpful for the participant to know what to do on this page? (T usability)
● Is the pedagogical-level instruction clear to the participant? (P usability)
● Does the pedagogical-level instruction prompt them to do the learning task? (P usability)
Upon completion, ask the following questions:
● If two players are playing this game, do you think you will see your partner in this current area?
● Tell me whose information you are collecting in this area.
● What do you think about the instructions provided by Robot Finch? Did you find anything not clear? (TP usability)
● What do you think about the collaborative discussion prompts in this task? Do you feel they are helpful in prompting you to discuss English language points you are not certain about with your game partner? (S usability, potential to elicit LREs)
● Do you have any immediate ideas or suggestions on how to improve this part you just experienced?
Usability Task
Now, you entered the Magic Verse. I’m going to ask you to do whatever you think you should do based on your understanding of the message and videos you read or watch in this game world. Again, as much as possible, it will help me if you can try to think out loud as you go along.
Observe
● Is the system-level instruction helpful for the participant to know what to do on this page? (T usability)
● Is the pedagogical-level instruction clear to the participant? (P usability)
● Does the pedagogical-level instruction prompt them to do the learning task? (P usability)
Upon completion, ask the following questions:
● What do you think of the overall design of this area? Does it feel like a game?
● What do you think about the instructions provided by Robert Finch? Did you find anything not clear? (TP usability)
● What do you think about the questions or requirements for writing? (P usability)
● What do you think about the requirements for the two learners to work together? Would you like to discuss English language points you are not certain about with your game partner? (S usability, potential to elicit LREs)
● Do you have any immediate ideas or suggestions on how to improve this part you just experienced?
Overall Impression of Magic Verse (2 minutes)
Great, you have entered the Post-Game Meeting Area. Now you have finished this game. I have some questions for you. Could you please tell me your overall impression of the last game area, the Magic Verse?
Evaluation Interview (15-20 minutes)
Great! Thank you for sharing your thoughts and testing this game. Now that you have experienced the entire game, I would like to ask you some questions about your overall experience.
1. Overall experience (overall satisfaction):
a. How would you describe your overall satisfaction with this game?
b. Was there anything you found frustrating or confusing? If so, please describe it.
2. Experience with the learning game (technical usability):
a. What did you expect the learning game to be like?
b. How was the game world similar to or different from what you expected?
c. What do you think of the video quality and sound quality?
3. The interaction and potential to support social interaction (social usability):
a. How do you feel about setting up your avatar’s name and appearance? (Design principle: personalized identity)
b. How do you feel about the social interaction functions, such as chatting via text message, sharing content in the virtual world space, voice chatting via microphone? (Design principle: support text and voice chat)
4. Experience with language learning (pedagogical usability):
a. What is your overall impression of this game in terms of helping you improve your English communication skills?
b. What would make this language learning game more helpful for you?
c. How do you feel about the level of English language used in the game in relation to your current English proficiency? (Design principle: appropriate language level)
5. What questions do you still have?
OK, as the last part of our meeting, I have a survey form for you to fill out. I am going to share the survey form link in Zoom chat. Please click on it and finish the survey. If you need any help explaining the survey questions, please feel free to ask me.
https://ufl.qualtrics.com/jfe/form/SV_7TDFviBhRfeqaJE
Please select the number that reflects your immediate response to each statement. Don’t think too long about each statement. Make sure you respond to every statement.
Table C-1.
System Usability Scale Used in This Study.
Item | 1 (Strongly disagree) | 2 | 3 | 4 | 5 (Strongly agree)
1. I think that I would like to use this space frequently. | 1 | 2 | 3 | 4 | 5
2. I found the space unnecessarily complex. | 1 | 2 | 3 | 4 | 5
3. I thought the space was easy to use. | 1 | 2 | 3 | 4 | 5
4. I think that I would need the support of somebody to be able to use this space. | 1 | 2 | 3 | 4 | 5
5. I found the various functions in this space were well integrated. | 1 | 2 | 3 | 4 | 5
6. I thought there was too much inconsistency in this space. | 1 | 2 | 3 | 4 | 5
7. I would imagine that most people would learn to use this space very quickly. | 1 | 2 | 3 | 4 | 5
8. I found the space very awkward to use. | 1 | 2 | 3 | 4 | 5
9. I felt very confident using the space. | 1 | 2 | 3 | 4 | 5
10. I needed to learn a lot of things before I could start using this space. | 1 | 2 | 3 | 4 | 5
11. Overall, I would rate the user-friendliness of this space as:
Worst Imaginable | Awful | Poor | Ok | Good | Excellent | Best Imaginable |