Baker & Yacef (2009) review trends in the field of Educational Data Mining (EDM), focusing on the methodological profile of its research and on shifts within the research community.
EDM is defined as being concerned with developing methods for exploring data from educational settings, and with using these methods to understand students and the settings in which they learn. The authors suggest that EDM differs from general data mining because educational data has a multi-level hierarchy and is non-independent (my note: this is where I am reminded of Carolyn Rose telling statistically-challenged learners not to worry about the stats).
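My own illustration, not something taken from the paper: one practical consequence of that non-independence is that, when a model is evaluated, all rows from the same student should stay on the same side of a train/test split. A minimal sketch follows; the function name `split_by_student` and the toy log are hypothetical.

```python
import random


def split_by_student(rows, test_fraction=0.25, seed=0):
    """Split rows so each student's rows land entirely in train or entirely in test."""
    students = sorted({row["student"] for row in rows})
    rng = random.Random(seed)
    rng.shuffle(students)
    # Hold out at least one student.
    n_test = max(1, round(len(students) * test_fraction))
    test_students = set(students[:n_test])
    train = [r for r in rows if r["student"] not in test_students]
    test = [r for r in rows if r["student"] in test_students]
    return train, test


# Hypothetical interaction log: several rows per student.
log = [
    {"student": "s1", "item": "q1", "correct": 1},
    {"student": "s1", "item": "q2", "correct": 0},
    {"student": "s2", "item": "q1", "correct": 1},
    {"student": "s2", "item": "q2", "correct": 1},
    {"student": "s3", "item": "q1", "correct": 0},
    {"student": "s3", "item": "q2", "correct": 0},
    {"student": "s4", "item": "q1", "correct": 1},
    {"student": "s4", "item": "q2", "correct": 0},
]

train, test = split_by_student(log)
print(len(train), len(test))
```

Splitting on rows instead of students would let the same learner appear in both sets, which tends to make a model look better than it is.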
One categorisation of the methods EDM employs, by Romero & Ventura (2007), splits them into two groups: a) statistics and visualisation, and b) web mining. Since this categorisation was originally devised for web-based data, Baker (2009) suggests his own taxonomy of EDM methods: a) prediction; b) clustering; c) relationship mining; d) distillation of data for human judgement; e) discovery with models. He explains that the first three categories are inspired by Moore's well-known categorisation (2006). In relation to the fourth category, i.e. the distillation of data for human judgement, Baker & Yacef refer to the theoretical discussions around EDM. Finally, discovery with models is shown to be gaining popularity as a method in EDM, although the paper does not make explicit whether that is the only reason it is given a category of its own.
Baker & Yacef give a broad overview of possible applications for EDM research, e.g. modelling individual differences between students; discovering and improving models of a domain’s knowledge structure; studying pedagogical support to discover which types are most effective (studies by Beck & Mostow, 2008, and Pechenizkiy et al., 2008, are referred to); and looking for empirical evidence to refine and extend educational theories and well-known educational phenomena.
Before outlining the trends in EDM research, the article lists some of the most-cited applied papers in the field. It then highlights how research interests within EDM have shifted away from the original themes: relationship mining moved from being of utmost interest to researchers to 5th place, while prediction moved up from a lower position to become the number 1 research interest. Discovery with models is also shown to be gaining popularity, coming second only to prediction. Baker & Yacef refer to the prominence of such modelling frameworks as Item Response Theory, Bayes Nets and Markov Decision Processes (which is just some non-scary jargon). At the end of the section, the emergence of public data and public data collection tools is described as a promising trend.
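To take some of the scare out of the jargon, here is a minimal sketch of the simplest Item Response Theory model, the one-parameter (Rasch) model: the probability of a correct answer depends only on the gap between a student's ability and an item's difficulty. This is not taken from Baker & Yacef; the function name `rasch_probability` and the numbers are illustrative assumptions.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the one-parameter (Rasch) IRT model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical values, chosen only for illustration.
print(rasch_probability(0.0, 0.0))  # ability equals difficulty -> 0.5
print(rasch_probability(1.0, 0.0) > rasch_probability(-1.0, 0.0))  # -> True
```

A stronger student on the same item, or the same student on an easier item, gets a higher probability of answering correctly.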