Analysis of a serious game “Urban Science” based on IDAF assessment principles
- Mohsen Haghighatpasand
- Aug 31, 2019
Updated: Sep 1, 2019
IDAF lens on the game
1. Assessment aims and context of Urban Science
1.1.Aims: Education built on facts and formulas and accompanied by standardized tests makes it difficult for students to apply what they learn to real problems (Gee, 2004). Urban Science attempts to design a form of assessment that is not separate from learning (unlike, for example, a month of training judged by a two-hour test) and that is built around central problems in an academic domain or a real-world profession, so that thinking is put to work on complex real-world problems (Gee & Shaffer, 2010). In this game, the aim is for players to investigate, analyze, understand, and communicate about scientific issues: local species, their life cycles, and their habitats; the role of wetlands in the local ecological system; and specific pollutants, their sources, and their impacts (Gee & Shaffer, 2010). There are no tests that produce scores to be used and interpreted; the assessment is part of the learning. Completing the game means solving many problems, which takes creativity and knowledge that can be applied directly to realistic problems. Teacher observations can be part of the evaluation, which happens during the game as formative rather than summative assessment.
1.2.Scoring Inference: The researchers look at what urban planners say and do in their work, identify the relevant skills, knowledge, identity, values, and epistemology, and create a model of the way planners think about problems. Once that model is built, they compare players against it, although they do not specify how the model is built. Through Epistemic Network Analysis (ENA) (Shaffer et al., 2009) they measure the similarities and differences between the players’ and the planners’ ways of thinking.
1.3.Generalization Inference: Nothing is said about the generalizability of the game results, but because the problems are designed to be very realistic, it can be inferred that the results generalize to similar measures.
1.4.Extrapolation Inference: The results of the game indicate how well a player can solve real-world problems similar to those in the game, because the game’s problems are designed to resemble real ones. The results can therefore be extended to the broader universe of possible performances.
1.5.Decision Inference: As the game is not yet used for consequential purposes, it is not currently used to make high-stakes decisions about students. However, the designers of the game look toward the future of assessment and learning through serious games.
1.6.Context: The game was originally designed by Kelly Beckett and was expanded by Elizabeth Sowatzke, researchers at the University of Wisconsin. The game was not designed as an assessment; rather, the assessment was designed into the game. Later, Gee and Shaffer (2010) studied the game to understand how effective it is as an assessment. They designed a model to determine to what extent progress in an epistemic game resembles progress in a comparable real-life situation.
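To make the comparison described in 1.2 more concrete, here is a minimal Python sketch of the general idea of comparing a player's pattern of thinking with a planner's. It is not the ENA method of Shaffer et al. (2009), whose model-building steps are not specified in the material reviewed here. It assumes that each segment of talk has already been hand-coded with the five elements named in 1.2, counts how often pairs of elements occur together, and compares the two profiles with cosine similarity. All function names and the sample data are hypothetical.

```python
from itertools import combinations
from math import sqrt

# The five elements named in 1.2; how talk gets coded with them is assumed.
CODES = ["skill", "knowledge", "identity", "values", "epistemology"]
PAIRS = list(combinations(CODES, 2))  # every unordered pair of codes


def cooccurrence_vector(segments):
    """Count how often each pair of codes appears in the same segment,
    then scale the counts to unit length so profiles are comparable."""
    counts = {pair: 0 for pair in PAIRS}
    for segment in segments:
        present = set(segment)
        for a, b in PAIRS:
            if a in present and b in present:
                counts[(a, b)] += 1
    vector = [counts[pair] for pair in PAIRS]
    norm = sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


def cosine_similarity(u, v):
    """Return a value near 1.0 for the same pattern of connections and
    0.0 when the two profiles share no connections."""
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


# Hypothetical coded segments (each inner list is one stretch of talk).
planner_segments = [["skill", "knowledge"], ["knowledge", "values"]]
player_segments = [["skill", "knowledge"], ["identity", "epistemology"]]

similarity = cosine_similarity(
    cooccurrence_vector(planner_segments),
    cooccurrence_vector(player_segments),
)
print(f"Similarity between player and planner profiles: {similarity:.2f}")
```

A sketch like this only shows why coded observations can be compared at all; a real analysis would also need a defensible coding scheme and a way to check the coders' agreement.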
2. Elements foundational in the assessment design:
2.1.Construct: The construct of the assessment appears to be stable across social and racial contexts: students of different races took part in the research on the game. However, the students in this study were all inhabitants of the city they were planning. Students who do not live in that city and are not familiar with North American cities and lifestyles may encounter problems that affect construct validity.
2.2.Content validity: Possible observations in the game come from the interactions among players, between players and the instructor, and between players and the game. All the information about a player’s progress comes from observing these sources. The results show the very high content validity of the assessment. The players struggle to keep the ecological balance of the city, and in doing so they form a complex definition of this concept. According to Shaffer (2006), before the game fewer than 10 percent of players could explain what “ecology” is, but after the game 80 percent had very complex ideas of it.
2.3.Intended construct: The assessment, which is embedded in the game, is designed to give a clear picture of the players’ decisions about urban planning. The players use their knowledge to make a decision, and the results in the game show how effective that decision was. So it can be said that the game has a good level of construct validity.
2.4.Fairness: This assessment seems to provide an equal opportunity for everyone to show their knowledge of urban planning. As the assessment happens in a small class, the students probably share a similar culture, background and financial level, so the assessment is likely to be similarly complex for all of them. The fact that they have followed the same curriculum with the same teacher and syllabus suggests that they have had a similar opportunity to learn before the assessment.
2.5.Low-stakes: The decisions made on the basis of this small-scale assessment concern only the students of this class and their understanding of the course content. The game has the potential to inform more important decisions, for example when hiring experts or as a job requirement: candidates can be asked to solve a problem to show their understanding and problem-solving ability.
2.6.Criterion-referenced: This is a fully criterion-referenced assessment because the students’ performance is measured against a fixed set of criteria. The students are observed to see how capable they are of understanding urban problems and managing them. Although the designers of the game do not offer a clear table of criteria, it is not difficult to come up with a list. However, all games, especially video games, can be highly competitive, so they can be used for both criterion-referenced and norm-referenced assessment.
2.7.Accountability: Students’ results can indicate how effective the whole education system is. But this kind of assessment cannot give quick, to-the-point answers to questions about accountability, because it looks at the students’ future rather than at their success at the end of the course. The students develop more problem-solving abilities in their field of study and will demonstrate them in the long term. Long-term observation of this system can be a good yardstick of the accountability of an education system.
2.8.Assessment as inquiry: The results can shed light on where the students should be led next. Students can also see their own progress and the areas they need to work on, and give feedback to their peers. The picture of the city they build is also a very good form of descriptive feedback that they automatically provide to themselves, and they can progressively modify it into a better city.
2.9.Consequential validity: Students gradually get into the habit of learning in order to use knowledge rather than to pass a test. They also see the power of knowledge in practice, and they understand this before getting into real jobs and facing failure there.
2.10. Washback: The positive washback of this system can be less stress and pressure for teachers and students. Teachers will also not focus only on areas that are likely to appear in the exam while ignoring practical parts. The negative washback can be students not taking the course seriously because they think they can learn while they are playing, although this can be controlled to some extent.
2.11. Rubric: The designers of this game have not explained the rubrics for analytic scoring. The assessment seems to rely on holistic scoring, through which students can understand their overall performance. This can be a limitation, as reliability seems to be sacrificed for validity, although further studies could address it.
2.12. Reliability: This assessment does not seem to have systematic or random error. However, a more detailed study of the assessment would give a clearer picture.
3. Appraise the assessment program:
The method used to collect assessment data is very different from common traditional methods. There are no separate testing instruments; all the feedback is given during play, by the instructors and by the game itself. The students are more motivated and do not complain about the result of their work. The more they know about urban planning and the more they think about it, the better the results they can get.
This method can have limitations like any other method. Many of the players may not be good at video games or may show no interest in them, which can affect their success in accomplishing the tasks. The instructions given inside the game can be complicated to understand and can cause construct-irrelevant variance, although the designers have not mentioned this. To avoid it, however, the instructors were always present to help when need be.
4. Appraise the scoring system:
4.1. There is no traditional scoring system, as the learning tool is not based on facts and figures. The assessment is intertwined with many concepts that are all in play at the same time: understanding systems, order, and organization; evolution and equilibrium; form and function in natural systems; and the land-use models that urban planners work with, combined with geographic features. All of these serve the purpose of solving real-world problems, not answering multiple-choice questions (MCQs). The final result of this assessment cannot be shown in quantitative form; it is qualitative. It can be seen in the process and in the quality of the final construction, and the logic behind it can be given in a paragraph. Here is an example of what one player said about his city:
Jobs would mean more people. More people would mean more pollution and more crime. When the city grows and the city has more people it’s going to need like buildings to house them. If there are more buildings, there will be more traffic. You can’t really change one thing without changing another (as cited in Shaffer, 2006).
As you can see, the assessment is more qualitative. It therefore needs inter-rater and intra-rater reliability checks to ensure the reliability of the results, but the designers have not discussed these aspects. The scoring criteria are not clearly explained, but this learning tool has high potential to be developed into a reliable, fair and valid assessment.
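One common way to check inter-rater reliability for qualitative judgments like these is Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below is only an illustration of that statistic, not something the designers of Urban Science report using. The rating labels and the two raters' scores are invented for the example.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)

    # p_o: share of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # p_e: agreement expected by chance, from each rater's label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    labels = set(freq_a) | set(freq_b)
    expected = sum(freq_a[label] * freq_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label, so chance agreement is total.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical holistic ratings of four players' written explanations.
teacher_1 = ["complex", "complex", "simple", "simple"]
teacher_2 = ["complex", "simple", "simple", "simple"]

print(f"Cohen's kappa: {cohen_kappa(teacher_1, teacher_2):.2f}")
```

The same function can serve as an intra-rater check if the two lists are one teacher's ratings of the same explanations on two different occasions.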
5. The assessment results:
5.1.Scoring inference: The scoring procedures for Urban Science seem appropriate, but there is not enough information to say clearly whether they are administered correctly.
5.2.Generalization inference: As there is no rubric, it is difficult to judge how well the game’s results can be generalized.
5.3.Extrapolation inference: The Urban Science assessment is in line with what students need to be able to manage in real life and with what real planners deal with in their profession.
5.4. Decision inference: This assessment system can be a good reference to understand the students’ ability in problem solving and creativity. In the realm of urban planning, the result of this assessment can be used to see how prepared the learners are to get into a real profession.
6. The assessment consequences:
6.1.As the assessment is integrated into the learning tool, I think the purpose is well served. The audience, being students, show a high degree of interest and involvement in the assessment without the negative aspects of pressure and stress. The teachers are also more interested and enjoy teaching and assessing at the same time. They are under less pressure and less overwhelmed by designing, administering and scoring papers.
6.2.The intended consequences are to provide students with formative assessment and to have them use their knowledge practically in many problem-solving situations. This system prepares students for the skills they need in the 21st century better than traditional assessment systems do. Creativity is fostered and social interaction in solving problems is encouraged. Unintended consequences can concern the reliability of the assessment and the acceptability of certificates awarded on the basis of this system. Many universities and companies do not accept a qualitative transcript. They expect a graded, ranked and tangible assessment, and making such assessments more reliable takes more raters, time, thought and energy. A clear system should be devised to make this kind of assessment more reliable.
Second Part: Final Project
In the second part of this project, to be submitted as the final project, I will focus on other potential uses of the epistemic serious game Urban Science for assessment. I will discuss how serious games can help newer assessment concepts such as Ungrading and Grading for Mastery take root in education. I will use the table designed by Gareis and Grant (2015, p. 13) to compare standardized assessments with classroom assessments embedded in serious games.
Table 1. Comparison between standardized and game-based assessment.

| Standardized assessments | Epistemic serious game assessments |
| --- | --- |
| Designed for a large population, such as the 9th grade students of the United States. | Designed for one specific region, such as a city in Wisconsin in the Urban Science game. |
| Assessment happens at one specific time in the year and tries to assess what students have learned during one school year. | Assessment happens during learning. Students can self-assess because their progress is continuously demonstrated in the game. |
| Validity is aligned with general standards and general expectations of the students of a large region. | Validity is in accordance with what the game offers. Expectations are not overestimated. However, this can be affected by the students’ gaming abilities. |
| Reliability is increased at the cost of neglecting students’ differences, abilities and interests. | Authenticity is increased through very contextualized assessment in the game context. The game provides the context for very authentic debates, meetings, presentations, etc. to make decisions for the city. |
| Summative assessment | Formative assessment |
| High-stakes | Low-stakes |
As Table 1 shows, unlike standardized assessment, which encourages rote learning, epistemic serious game assessment provides a very authentic context for open-ended assessment. The rich context of the game allows teachers to add other forms of assessment in the real classroom, on top of the assessment that happens inside the game. Teachers can ask students to draw a map of the future of their city, show the areas with potential for crime, and offer solutions. The answers are open-ended and can be submitted in different ways. As the game is the crucial component of the class, students’ intrinsic motivation can be enhanced.
In terms of ungrading, this form of assessment supports different ways of giving feedback to students. Self-assessment, peer assessment and the automatic assessment received from the game can all show the students’ strengths and weaknesses. A four-point scale offered by Robert Marzano (2006) can be used to show the students’ level of mastery of urban planning skills. The scale can be embedded in the game or tailored by the teacher according to the students’ gaming ability. As some students are better gamers than others, which can affect the validity of the assessment, there can be a separate scale on which students rate the gaming skills they believe they have; this tells the teacher in advance that a student’s poor performance might be due to their skill or interest in playing video games. However, a lack of interest does not mean that students should be given another form of assessment; the game provides the context for learning and assessment, and students can show their understanding in different forms that satisfy their interests.
What I have discussed so far was mainly focused on one epistemic game, Urban Science. In what follows, I will study the three main learning theories and the games designed on the basis of them, and I will discuss how they can help assessment.
References
Ackermann, E. (2001). Piaget’s constructivism, Papert’s constructionism: What’s the difference. Future of learning group publication, 5(3), 438.
Bagley, E., & Frank, K. (2009). Epistemic network analysis: A prototype for 21st century assessment of learning.
Caperton, I. H. (2010). Toward a theory of game-media literacy: Playing and building as reading and writing. International Journal of Gaming and Computer-Mediated Simulations, 2(1).
Chen, S., & Michael, D. (2005). Proof of learning: Assessment in serious games. Retrieved October 17, 2008.
Ertmer, P. A., & Newby, T. J. (2013). Behaviorism, cognitivism, constructivism: Comparing critical features from an instructional design perspective. Performance Improvement Quarterly, 26(2), 43-71.
Filsecker, M., & Bündgens-Kosten, J. (2012). Behaviorism, Constructivism, and Communities of Practice: How pedagogic theories help us understand game-based language learning. In Digital games in language learning and teaching (pp. 50-69). Palgrave Macmillan, London.
Gareis, C. R., & Grant, L. W. (2015). Teacher-made assessments: How to connect curriculum, instruction, and student learning (2nd ed.). New York, NY: Routledge.
Gee, J. P. (2003). What video games have to teach us about learning and literacy. New York: Palgrave/Macmillan.
Gee, J. P. (2004). Situated language and learning: A critique of traditional schooling. London: Routledge.
Gee, J. P., & Shaffer, D. W. (2010). Looking where the light is bad: Video games and the future of assessment. Phi Delta Kappa International EDge, 6(1), 3-19.
Papert, S., & Harel, I. (1991). Situating constructionism. Constructionism, 36(2), 1-11.
Rupp, A. A., Gushta, M., Mislevy, R. J., & Shaffer, D. W. (2010). Evidence-centered design of epistemic games: Measurement principles for complex learning environments. The Journal of Technology, Learning and Assessment, 8(4).
Savery, J., & Duffy, T. (1996). Problem based learning: An instructional model and its constructivist framework. In B. Wilson (Ed.), Constructivist learning environments: Case studies in instructional design (pp. 135–148). Englewood Cliffs, NJ: Educational Technology Publications.
Shaffer, D. W. (2006). How computer games help children learn. Macmillan.
Shaffer, D. W., Hatfield, D., Svarovsky, G., Nash, P., Nulty, A., Bagley, E., et al. (2009). Epistemic Network Analysis: A prototype for 21st century assessment of learning. The International Journal of Learning and Media, 1(2), 33-53.
Shepard, L. A. (2000). The Role of Assessment in a Learning Culture. Educational Researcher, 29(7), 4. http://doi.org/10.2307/1176145
Slomp, D. (2016). An integrated design and appraisal framework for ethical writing assessment. The Journal of Writing Assessment, 9(1), 1-14.
Weintrop, D., Holbert, N., Wilensky, U., & Horn, M. S. (2012). Redefining constructionist video games: Marrying constructionism and video game design. In Proceedings of the Constructionism 2012 Conference. Athens, Greece.
Wu, W. H., Hsiao, H. C., Wu, P. L., Lin, C. H., & Huang, S. H. (2012). Investigating the learning‐theory foundations of game‐based learning: a meta‐analysis. Journal of Computer Assisted Learning, 28(3), 265-279.

