Approaches for literature analysis

One of the research projects we have underway (really just starting up) is an analysis of the learning analytics literature. The following is an ad hoc record of a search for different approaches to literature analysis, with the aim of informing that work. Essentially, it is a summary of some readings.

Origins in Information Systems

I’m from the IS discipline originally so I’m aware of some of this type of work there.

Arnott and Pervan (2005) analysed the Decision Support Systems (DSS) literature. Their approach was a content analysis of 1000+ papers, reading each one and applying a data collection protocol. Two authors and a research assistant coded the papers using the same protocol, and the results were entered into SPSS.

As the use of SPSS suggests, the “Article Coding Protocol” was a series of questions with numeric answers that characterised an article. Questions covered topics such as:

  • Research type including research stage, epistemology, article type.
  • DSS factors.
  • Judgement and decision making factors.
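To make the idea of a protocol of questions with numeric answers concrete, here is a minimal sketch of how one coder's answers for one article might be recorded and exported for a statistics package. The question names, codes and labels are hypothetical illustrations, not the actual Arnott and Pervan (2005) protocol.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Hypothetical codebook: question -> {numeric answer: meaning}.
# Illustrative only; the real protocol's questions and codes will differ.
CODEBOOK = {
    "research_stage": {1: "theory building", 2: "theory testing"},
    "article_type": {1: "conceptual", 2: "empirical", 3: "review"},
}


@dataclass
class CodedArticle:
    """One coder's numeric answers for one article."""

    article_id: str
    coder: str
    research_stage: int
    article_type: int

    def validate(self) -> None:
        # Reject any answer that is not listed in the codebook for its question.
        for question, allowed in CODEBOOK.items():
            answer = getattr(self, question)
            if answer not in allowed:
                raise ValueError(f"{question}={answer!r} is not a valid code")


def to_csv(records) -> str:
    """Serialise validated records as CSV text, one row per coded article."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(CodedArticle)])
    writer.writeheader()
    for record in records:
        record.validate()
        writer.writerow(asdict(record))
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv([CodedArticle("A001", "coder1", 1, 2)]))
```

Keeping the codebook separate from the records means every coder is validated against the same set of allowed answers, which is roughly what sharing a single protocol is meant to achieve.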

Arnott and Pervan (2005) use the phrase “literature analysis” for their work, but mention other terms from the IS literature: ‘review and assessment of research’ (Robey, Boudreau and Rose, 2000), ‘literature review and analysis’ (Alavi and Leidner, 2001) and ‘survey’ (Malone and Crowston, 1994). None of those provide any interesting pointers to further literature.

Content analysis

Hsieh & Shannon (2005) look at qualitative content analysis, but start with the development of content analysis more broadly. It was initially used as a quantitative method “with text data coded into explicit categories and then described using statistics” (p. 1278).

They describe qualitative content analysis as follows:

Qualitative content analysis goes beyond merely counting words to examining language intensely for the purpose of classifying large amounts of text into an efficient number of categories that represent similar meanings (Weber, 1990). (p. 1278)

Another nice quote is that the goal of content analysis is “to provide knowledge and understanding of the phenomenon under study” (Downe-Wamboldt, 1992, p. 314), which leads to their definition:

qualitative content analysis is defined as a research method for the subjective interpretation of the content of text data through the systematic classification process of coding and identifying themes or patterns.

They identify three distinct approaches – conventional, directed or summative – which differ on coding schemes, origins of codes, and threats to trustworthiness.

Directed content analysis seems relevant, as it draws on existing theory for the initial coding scheme and its goal is “to validate or extend conceptually a theoretical framework or theory”. The paper identifies its limitations and suggests strategies for addressing them.

Seven classic steps of the analytical process underpinning all qualitative content analysis:

  1. Formulating the research questions to be answered;
  2. Selecting the sample to be analysed;
  3. Defining the categories to be applied;
  4. Outlining the coding process and the coder training;
  5. Implementing the coding process;
    “The basic coding process in content analysis is to organize large quantities of text into much fewer content categories” (p. 1285).
  6. Determining trustworthiness;
  7. Analysing the results of the coding process.
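As a rough illustration of the coding step in a directed approach, the sketch below starts from a predefined coding scheme and tallies how many text segments fall into each category. Everything here is an assumption for illustration: the categories and keywords are made up, and a deliberately naive keyword matcher stands in for the human coder, who would normally make these judgements.

```python
from collections import Counter

# Hypothetical initial coding scheme, as if derived from an existing framework.
# Category names and keyword fragments are illustrative only.
CODING_SCHEME = {
    "prediction": ["predict", "forecast", "at-risk"],
    "visualisation": ["dashboard", "visuali"],
}


def code_segment(text, scheme=CODING_SCHEME):
    """Return every category whose keywords appear in the text segment."""
    lowered = text.lower()
    codes = [
        category
        for category, keywords in scheme.items()
        if any(keyword in lowered for keyword in keywords)
    ]
    # Segments that fit no existing category are flagged for later review.
    return codes or ["uncoded"]


def tally(segments, scheme=CODING_SCHEME):
    """Reduce many text segments to counts over a few content categories."""
    counts = Counter()
    for segment in segments:
        counts.update(code_segment(segment, scheme))
    return counts


if __name__ == "__main__":
    sample = [
        "We predict at-risk students.",
        "A dashboard for teachers.",
        "Ethics of data use.",
    ]
    print(tally(sample))
```

The "uncoded" bucket is the interesting part for a directed analysis, since it collects the material that the starting framework does not account for.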

Julien (2008, n.p.) defines content analysis as

the intellectual process of categorizing qualitative textual data into clusters of similar entities, or conceptual categories, to identify consistent patterns and relationships between variables or themes.

Apparently it is a method “independent of theoretical perspective or framework”, though it originated as a quantitative method. Julien (2008) suggests quantitative content analysis helps in answering “what” questions, while qualitative content analysis helps in answering “why” questions and analysing perceptions.

Using multiple coders is a common way to improve trustworthiness. 60% agreement between coders is apparently considered acceptable (Julien, 2008).
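For two coders, the simplest reading of "agreement" is the proportion of items given the same code. A minimal sketch, assuming each coder's decisions are stored as a list in the same article order (the function name and the example codes are mine, not Julien's):

```python
def percent_agreement(coder_a, coder_b):
    """Proportion of items to which two coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same items")
    if not coder_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


if __name__ == "__main__":
    # Hypothetical codes for five articles; the coders agree on three of them.
    coder_a = [1, 1, 2, 3, 2]
    coder_b = [1, 2, 2, 3, 1]
    print(f"{percent_agreement(coder_a, coder_b):.0%}")
```

Note that this raw figure takes no account of the agreement two coders would reach by chance, so on its own it can flatter the result.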

Krippendorff (2010) suggests content analysis is “a scientific tool” that “can provide new kinds of understanding of social phenomena or inform decisions or pertinent actions”, and that it is teachable.

Krippendorff (2010) also suggests that content analysis uses abduction, rather than the induction/deduction used by observational methods. He offers three criteria for judging results:

  1. reliability – can the process be replicated? Human coding is the most unreliable aspect of the process. There are agreement coefficients, e.g. Scott’s pi and Krippendorff’s alpha.
  2. plausibility – of the path taken from texts to results. Apparently a dig at computer coders who think they’ve solved reliability. Can’t hide behind obscure algorithms.
  3. validity – various forms outlined.
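The two agreement coefficients named under reliability can be sketched for the simplest case: two coders, nominal categories and no missing codes. Both correct the observed agreement for what would be expected by chance. This is only a sketch under those assumptions; Krippendorff's alpha in general also covers more coders, missing data and other levels of measurement, none of which is handled here. The function names and example data are mine.

```python
from collections import Counter


def _check(coder_a, coder_b):
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty lists of codes")


def scotts_pi(coder_a, coder_b):
    """Scott's pi: (observed - expected agreement) / (1 - expected agreement)."""
    _check(coder_a, coder_b)
    n_items = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n_items
    # Expected agreement uses category proportions pooled over both coders.
    pooled = Counter(coder_a) + Counter(coder_b)
    expected = sum((count / (2 * n_items)) ** 2 for count in pooled.values())
    if expected == 1:
        raise ValueError("undefined when only one category is ever used")
    return (observed - expected) / (1 - expected)


def krippendorff_alpha_nominal(coder_a, coder_b):
    """Krippendorff's alpha for two coders, nominal data, no missing values."""
    _check(coder_a, coder_b)
    n_values = 2 * len(coder_a)  # every item contributes two coded values
    observed_disagreement = sum(a != b for a, b in zip(coder_a, coder_b)) / len(coder_a)
    pooled = Counter(coder_a) + Counter(coder_b)
    # Chance of drawing two different values, without replacement, from the pool.
    expected_disagreement = (
        n_values**2 - sum(count**2 for count in pooled.values())
    ) / (n_values * (n_values - 1))
    if expected_disagreement == 0:
        raise ValueError("undefined when only one category is ever used")
    return 1 - observed_disagreement / expected_disagreement


if __name__ == "__main__":
    # Hypothetical codes for five articles from two coders.
    coder_a = [1, 1, 2, 3, 2]
    coder_b = [1, 2, 2, 3, 1]
    print(scotts_pi(coder_a, coder_b))
    print(krippendorff_alpha_nominal(coder_a, coder_b))
```

On this made-up data the coders agree on 60% of the articles, yet both coefficients come out well below 0.6, which shows how much of the raw agreement chance alone could explain.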


Arnott, D., & Pervan, G. (2005). A critical analysis of decision support systems research. Journal of Information Technology, 20(2), 67–87. doi:10.1057/palgrave.jit.2000035

Hsieh, H.-F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288. doi:10.1177/1049732305276687

Julien, H. (2008). Content analysis. In L. M. Given (Ed.), The SAGE Encyclopedia of Qualitative Research Methods (pp. 121–123).

Krippendorff, K. (2010). Content analysis. In N. J. Salkind (Ed.), Encyclopedia of Research Design (pp. 234–239). Thousand Oaks, CA: Sage Publications.
