Translating Learning into Numbers: A Generic Framework for Learning Analytics

The following is a summary of and commentary on Greller and Drachsler (2012). I come to this via the JISC/CETIS report I summarised yesterday.


I liked this paper because it serves a purpose for me. A purpose that I think may well be useful to a number of others. It gives a framework that seems to cover most of the factors to be considered when designing the use of learning analytics (LA). Though I will need some more reflection and experimentation to consider how complete it is. The paper mentions most of the important limitations or questions over LA that are often overlooked and provides recommendations for areas for future research. Importantly, the framework offers a foundation/lens through which to compare and contrast different approaches. All useful features for a new area that has a touch of the fad about it.


The abstract of Greller and Drachsler (2012) is:

With the increase in available educational data, it is expected that Learning Analytics will become a powerful means to inform and support learners, teachers and their institutions in better understanding and predicting personal learning needs and performance. However, the processes and requirements behind the beneficial application of Learning and Knowledge Analytics as well as the consequences for learning and teaching are still far from being understood. In this paper, we explore the key dimensions of Learning Analytics (LA), the critical problem zones, and some potential dangers to the beneficial exploitation of educational data. We propose and discuss a generic design framework that can act as a useful guide for setting up Learning Analytics services in support of educational practice and learner guidance, in quality assurance, curriculum development, and in improving teacher effectiveness and efficiency.

Furthermore, the presented article intends to inform about soft barriers and limitations of Learning Analytics. We identify the required skills and competences that make meaningful use of Learning Analytics data possible to overcome gaps in interpretation literacy among educational stakeholders. We also discuss privacy and ethical issues and suggest ways in which these issues can be addressed through policy guidelines and best practice examples.

Promises lots of relevant reading.


Some general background on the growth of big data.

Argues that electronic data mining makes gathering information easy and is producing data that is comparable to observational data gathering. Quotes Savage and Burrows (2007) suggesting it will “enhance our understanding and highlight possible inconsistencies between user behaviour and user perception”.

LMS and other systems gather data, but observes (p. 43)

exploitation of the data for learning and teaching is still very limited. These educational datasets offer unused opportunities for the evaluation of learning theories, learner feedback and support, early warning systems, learning technology, and the development of future learning applications.

Critical dimensions of learning analytics

Questions for research from LA

  • Technically-focused
    • compatibility of educational data sets
    • comparability and adequacy of algorithmic and technological approaches
  • “softer” issues – defined as “challenges that depend on assumptions being made about humans or the society in general”
    • data ownership and openness
    • ethical use and dangers of abuse
    • demand for new key competences to interpret and act on LA results

The authors identify six critical dimensions (soft and hard) as a descriptive framework, hoping to later develop it into a domain model or ontology. The dimensions were deduced from discussions in the research community through a general morphological analysis approach. The framework is intended “to be a guide as much as a descriptor of the problem zones” and is thus labelled a design framework. Each of the six critical dimensions has a number of instantiations, and the presented instantiations are not exhaustive. The framework is presented graphically in the paper; the following is a tabular representation.

These are critical dimensions since it is proposed that any fully formulated LA design should have at least one instantiation for each dimension. The paper goes on to discuss each dimension in more detail.

Dimensions and their instantiations:

  • Stakeholders – institution, learners, teachers, other (researchers, service providers and government agencies)
  • Objectives – reflection, prediction
  • Data – open, protected
  • Instruments – technology, conceptual instruments
  • External constraints – conventions, norms
  • Internal limitations – competences, acceptance

In the paper, the authors illustrate the use of the design framework by describing the SNAPP tool and its use. They suggest the framework can be used:

  1. As a checklist when designing a purposeful LA process;
  2. as a shareable description framework to compare context parameters with similar approaches in other contexts, or for replication of the scientific environment.
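That checklist role can be made concrete in code. The sketch below is my own illustration, not anything from the paper: it encodes the six dimensions and some of their instantiations as a Python dict, and flags any dimension a proposed LA design leaves without an instantiation (the paper's rule that any fully formulated LA design should have at least one instantiation per dimension).

```python
# Hypothetical sketch: the six critical dimensions with some instantiations,
# used as a completeness checklist for an LA design. The structure and
# function are my own; the dimension names follow the paper's framework.
FRAMEWORK = {
    "stakeholders": {"learners", "teachers", "institution", "other"},
    "objectives": {"reflection", "prediction"},
    "data": {"open", "protected"},
    "instruments": {"technology", "algorithm", "theory"},
    "external_constraints": {"conventions", "norms"},
    "internal_limitations": {"competences", "acceptance"},
}

def missing_dimensions(design: dict) -> list:
    """Return the dimensions for which the design has no recognised instantiation."""
    return [
        dim for dim, allowed in FRAMEWORK.items()
        if not set(design.get(dim, [])) & allowed
    ]

# Example: an LA design that forgot to consider internal limitations.
design = {
    "stakeholders": ["teachers"],
    "objectives": ["reflection"],
    "data": ["protected"],
    "instruments": ["technology"],
    "external_constraints": ["conventions"],
}
print(missing_dimensions(design))  # ['internal_limitations']
```

The same dict doubles as the "shareable description framework": two studies that each publish their design in this form can be compared dimension by dimension.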


Stakeholders

The framework splits stakeholders into two roles:

  • Data clients – beneficiaries of the LA process who are entitled and meant to act upon the outcome.
  • Data subjects – suppliers of the data through their browsing and interaction behaviour.

As shown in the above table, the identified stakeholders are learners, teachers, other and institutions.

I wonder if “institution” is too broad a category. Who is the institution? I think there is perhaps a range of different stakeholders here: the academic in charge of a program, the head of school/line manager of a group of academics, the Dean/head of faculty, the head of the central L&T division, the head of IT, etc. Each of these stakeholders will have slightly different requirements/purposes behind their use of analytics. For example, an IT manager might want to use analytics to decide whether a particular piece of software should continue to be supported.

Presents a triangle diagram representing a hierarchy of stakeholders and the information flow between them. Emphasises the importance of self-reflection at each level and the possibility of research across the hierarchy. Does mention peer evaluation as another type of information flow (other than hierarchical).


Objectives

“The main opportunities for LA as a domain are to unveil and contextualise so far hidden information out of the educational data and prepare it for the different stakeholders”

  1. Reflection
    The data client reflecting on and evaluating their own data to obtain self-knowledge. Linked to the “quantified self”. Can also be reflection based on the datasets of others, e.g. a teacher reflecting on their teaching style. An interesting quote:

    Greatest care should however be taken not to confuse objectives and stakeholders in the design of a LA process and not to let, e.g., economic and efficiency considerations on the institutional level dictate pedagogic strategies, as this would possibly lead to industrialisation rather than personalisation.

    and another one

    LA is a support technology for decision making processes. Therein also lies one of the greatest potential dangers. Using statistical analytic findings is a quantitative not a qualitative support agent to such decision making. We are aware that aligning and regulating performance and behaviour of individual teachers or learners against a statistical norm without investigating the reasons for their divergence may strongly stifle innovation, individuality, creativity and experimentation that are so important in driving learning and teaching developments and institutional growth.

  2. Prediction
    Modelling and predicting learner activities can lead to interventions and adaptive services/curricula. The source of much hope for efficiency gains “in terms of establishing acts of automatic decision making for learning paths, thus saving teacher time for more personal interventions”.

    I do wonder how much management see the efficiency gains as a way to reduce teacher time (full stop)?

    Raises the ethical problems – “judgements based on a limited set of parameters could potentially limit a learner’s potential” – and the risk of re-confirming established prejudices. Mentions the problem of the diversity of learning making it difficult to make judgements between different learners.

    Analytics itself is seen as pedagogically neutral, but specific technologies are not.

Educational data

LA uses LMS and other data. Institutions already have a lot of this. Linking these can be useful. Most data in institutions is protected. Researchers are finding it hard to get access to data to test their methods.

Anonymisation is seen as one way to get “open data”. Verbert et al (in press) present a review of existing educational datasets. How open data should be requires wider debate.
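As a concrete (entirely hypothetical) illustration of one step anonymisation might involve, the sketch below replaces student identifiers with keyed-hash pseudonyms before a dataset is released; the key, function names and record fields are all invented. Note this only removes direct identifiers – linkage through quasi-identifiers such as course codes or timestamps remains a risk, which is why anonymisation alone does not settle the open-data question.

```python
import hashlib
import hmac

# Hypothetical sketch: pseudonymise student identifiers with a keyed hash,
# so the same student maps to the same token within a released dataset,
# but the raw ID cannot be recovered without the secret key.
SECRET_KEY = b"institution-held-secret"  # invented; never released with the data

def pseudonymise(student_id: str) -> str:
    """Return a stable, non-reversible token for a student identifier."""
    digest = hmac.new(SECRET_KEY, student_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# Invented example records standing in for LMS activity data.
records = [
    {"student": "s1234567", "logins": 42},
    {"student": "s7654321", "logins": 3},
]
released = [{**r, "student": pseudonymise(r["student"])} for r in records]

# The mapping is deterministic, so longitudinal analysis still works.
assert pseudonymise("s1234567") == released[0]["student"]
```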

Existing data initiatives include

  • dataTEL challenge – challenge research groups to release data
  • PSLC dataShop open data repository of educational data sets from intelligent tutoring systems.
  • Linked education – open platform to promote the use of data for educational purposes.

The authors suggest it’s somewhat bizarre that commercial companies automatically assume ownership of user data when you register on their site, while educational institutions operate on the default that everything is protected from virtually everyone. I’m not sure about this comparison, but it’s a consideration.

The question is which employees of the institution are included in the data contract between a learner and the institution. The paper suggests this is not yet resolved, which is a constraint on inner-institutional research and wider institutional use. Other issues with LA datasets:

  • Lack of common data formats.
  • Need for version control and a common reference system.
  • Methods to anonymise and pre-process data for privacy and legal protection rights. (Drachsler et al, 2010)
  • Standardised documentation of datasets.
  • Data policies to regulate how data sets can be used.
  • The problem that data sets are not context-free and not free of error (e.g. teachers setting up test student accounts).
  • Related to the last point is “enmeshed identities” e.g. where users are sharing a device for access.
  • Perhaps most important is the “on-going challenge to formulate indicators from the available datasets that bear relevance for the evaluation of the learning process”.


Instruments

LA relies on information retrieval technologies including (refs in original): educational data mining, machine learning, statistical analysis techniques and other techniques such as social network analysis and natural language processing.
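To make one of those techniques concrete: a tool like SNAPP builds a social network from forum interactions. The following is a minimal sketch of that idea under my own invented assumptions (the participant names and reply pairs are made up, and real tools use far richer measures than a raw degree count).

```python
from collections import Counter

# Hypothetical sketch of forum-based social network analysis: each pair is
# (author_of_reply, author_replied_to); all names are invented.
replies = [
    ("alice", "bob"), ("carol", "bob"), ("bob", "alice"),
    ("dave", "alice"), ("carol", "alice"),
]

degree = Counter()
for source, target in replies:
    degree[source] += 1   # replies written (out-degree)
    degree[target] += 1   # replies received (in-degree)

# Participants with no interactions at all never appear here – in an LA
# context those absences can matter as much as the counts themselves.
print(degree.most_common(2))  # [('alice', 4), ('bob', 3)]
```

Even this toy version shows why the instrument embodies assumptions: choosing "a reply" as the unit of connection is already a reductive modelling decision of the kind Hildebrandt warns about below.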

They also include conceptual instruments: theoretical constructs, algorithms or weightings. i.e. to some extent these provide ways to “translate” raw data into information

Quotes Hildebrandt (2010) that “invisible biases, based on … assumptions … are inevitably embodied in the algorithms that generate the patterns” and makes the related point “LA designers and developers need to be aware that any algorithm or method they apply is reductive by nature in that it simplifies reality to a manageable set of variables (cf. Verbert et al., 2011).”

External constraints

External constraints broken up into

  • Conventions – ethics, personal privacy and similar socially motivated limitations.
  • Norms – restrictions due to laws or specific mandated policies or standards.

They don’t elaborate on the ethics aspects, leaving that to Bollier (2010) and Siemens (2012), but they then get into the following.

Apparently data gathered about a person (before it is anonymised) currently belongs to the owner of the data collection tool and also the data client and beneficiary. The problem is that increasingly information is being stored about individuals without their approval or awareness. The ethical principle of “informed consent” is under threat (AoIR, 2002)

Returns to the problem that institutions have lots of data collected about learners, but it is managed by different groups within the institution and sharing between these groups is questionable.

Also mentions the ethical issues associated with post-analytic decision making, i.e. the decisions (which can be diverse) taken based on data have ethical implications. This problem also arises when LA is used by institutions to quality-assure the performance of teaching staff. Similarly, the application of such algorithms can limit/hurt innovations that “diverge from the statistical mean”.

Internal limitations

Limitations of a more human origin

  • Competencies
    Survey of LA experts found that only 21% of 111 respondents felt learners would have the competency to interpret LA results and figure out what to do (Drachsler & Greller, 2012). I don’t imagine the percentage for teachers would be much greater?

    Therefore, the optimal exploitation of LA data requires some high level competences in this direction, but interpretative and critical evaluation skills are to-date not a standard competence for the stakeholders, whence it may remain unclear to them what to do as a consequence of a LA outcome or visualisation

    Makes the point that the trend toward visualisation of LA outcomes can “delude the data clients away from the full pedagogic reality” and that “superficial digestion of data presentations can lead to the wrong conclusions”. Data not included in the LA approach can be equally, if not more, important than what is included, e.g. relying solely on LMS quantitative data.

  • Acceptance
    If there’s no acceptance, any insight can simply be rejected. There needs to be more focus on empirical evaluation methods for LA tools (Ali et al, 2012) and on advanced technology acceptance models (e.g. TAM). Suggests TAM “could be an interesting approach to evaluate the emergent analytic technologies for all stakeholders described in our framework and also the needed implementation requirements to guarantee successful exploitation”.

The place of pedagogy in the learning analytics framework

LA promise is to offer new methods and tools to “diagnose learner needs and provide personalised instructions to better address these needs”. But whether it does this or simply clusters people into behaviouristic “learner models” is not yet clear.

More empirical evidence is required on which pedagogical theory LA serves best. Pardo & Kloos (2011) give a critical reflection. LA still infers only indirectly about active cognitive processes.

Pedagogy is not included in the framework; rather it is implicitly contained in the input datasets used. LA sees the pedagogy through the data. Pedagogy is also explicitly addressed in the goals and objectives of the LA designer.

Conclusion and outlook

All six dimensions need to be considered for LA designs. More work on ethics, data curation and ownership needs to happen at universities and in legislation to reduce the risks associated with the application of LA.

Use of the framework would allow scientific comparison, replication, external analysis, and alternative contextualisation.

The obvious next research step: evaluate a number of LA research descriptions and test for consistency in the descriptions.


We therefore believe that it will be of critical importance for its acceptance that the development of LA takes a bottom-up approach focused on the interests of the learners as the main driving force.

The relationship between LA and theories of learning, teaching, cognition and knowledge remains open and requires more research.


References

AoIR (Association of Internet Researchers) Ethics Working Committee, & Ess, C. (2002). Ethical decision-making and Internet research: Recommendations from the AoIR Ethics Working Committee. Retrieved from the AoIR website.

Drachsler, H., Bogers, T., Vuorikari, R., Verbert, K., Duval, E., Manouselis, N., … Wolpers, M. (2010). Issues and considerations regarding sharable data sets for recommender systems in technology enhanced learning. Procedia Computer Science, 1(2), 2849–2858.

Greller, W., & Drachsler, H. (2012). Translating Learning into Numbers: A Generic Framework for Learning Analytics. Educational Technology & Society, 15(3), 42–57.

Pardo, A., & Kloos, C. D. (2011). Stepping out of the box: Towards analytics outside the learning management system.
Proceedings of the 1st International Conference on Learning Analytics and Knowledge (pp. 163–167). New York, NY: ACM.

Siemens, G., et al. (2012). Learning analytics: Guidelines for ethical use. Shared effort of the learning analytics research community. Retrieved March 23, 2012.
