Counting the uncountable – NGL participation

The following documents the writing of a script to perform simple counts of what the NGL participants have been doing on their blogs. Another post on the course blog will offer an explanation of the emails that will be sent to participants real soon now.


There are 10+ participants in NGL. The indicators of participation being looked for are

  • Number of posts.
  • Average word count per post.
  • % of posts with links to blog posts from other participants.
  • % of posts with links to other online resources.
  • % of links in the blog’s posts that appear on that blog first (before any other participant’s blog).
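
As a rough illustration of the first few indicators, here is a minimal sketch that works on a made-up list of posts. The CONTENT and LINKS field names and the basicStats function are assumptions for the example, not the names the EDC3100 script uses, and a "word" is taken to be any whitespace-separated token.

```perl
use strict;
use warnings;

# Hypothetical post records: CONTENT is the post's plain text,
# LINKS is the list of URLs found in it.
my @posts = (
    { CONTENT => 'one two three four', LINKS => ['http://example.com/a'] },
    { CONTENT => 'five six',           LINKS => [] },
);

# Return the number of posts, the average words per post and the
# percentage of posts that contain at least one link.
sub basicStats {
    my ($posts) = @_;
    my %stats = ( NUM_POSTS => scalar @$posts, AVG_POST_LENGTH => 0, POSTS_WITH_LINKS => 0 );
    return \%stats unless $stats{NUM_POSTS};    # avoid dividing by zero

    my ( $words, $withLinks ) = ( 0, 0 );
    for my $post (@$posts) {
        # Count whitespace-separated tokens as words.
        my @tokens = split ' ', $post->{CONTENT};
        $words += scalar @tokens;
        $withLinks++ if @{ $post->{LINKS} };
    }
    $stats{AVG_POST_LENGTH}  = $words / $stats{NUM_POSTS};
    $stats{POSTS_WITH_LINKS} = 100 * $withLinks / $stats{NUM_POSTS};
    return \%stats;
}

my $stats = basicStats( \@posts );
printf "%d posts, %.1f words/post, %.0f%% with links\n",
    @{$stats}{qw(NUM_POSTS AVG_POST_LENGTH POSTS_WITH_LINKS)};
```

Splitting links into "other participants" and "other online resources" would need the list of participant blog URLs, which this sketch leaves out.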

Starting point

Will start with the EDC3100 script and modify from there. That script currently calculates the following relevant statistics:

  • Posts per week – not needed, but total posts will be available
  • average word count
  • # of links
  • # of links to other participants


Remove activity completion

Get Moodle user information. Are we only including currently enrolled students? We are now.

What about blog posts? Yep.

Calculate the stats for each participant

  • NUM_POSTS – done.
  • (AVG_)POST_LENGTH – done.
  • POSTS_WITH_LINKS – done.
  • LINKS_HERE_FIRST – to do.

    This is the more difficult task. The requirement here is for each link (not to another participant blog) made in a blog, check to see if it’s the first time the link has appeared in a participant post.

    At the moment the function counting links does have the timepublished for the blog post. It also creates an array containing a hash for each link. That array covers all links, but maybe that doesn’t matter.

    What we need here is probably a hash with key on the link and the value being a reference to the hash about the post (which has timepublished).

    With each student object having this, BlogStatistics object can then generate stats for LINKS_HERE_FIRST.

    DoTheLinks updated to do this in

    — See below —

Assign a standard and show the report


Currently the report only assigns percentages for each stat; these need to be translated into a mark for the assignment. This would have to:

  • average the percentage for each descriptor for a criterion.
    The current descriptors/criteria relationship is

    • Posts (10 marks)
      • # posts
      • # words per post
    • Connections (5 marks)
      • % posts with links to other participant blog posts
    • Other links (5 marks)
      • % posts with links to other resources
      • % of posts where links occur first – not calculated yet
  • calculate the mark per criterion

    The above are stored in a hash where the key is the unique id for the descriptor

    • LENGTH = # words per post
    • NUM_POSTS = # posts
    • LINKS = % posts with links to other resources
    • STUDENT_LINKS = % posts with links to other participant blog posts
  • add them up
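
The three steps above might look something like the following sketch. It assumes the stats hash already holds a percentage (0 to 100) for each descriptor id, and it leaves out the "links occur first" descriptor because that is not calculated yet. The %criteria layout, the criterion names and the calculateMark function are made up for the example.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Marks available for each criterion and the descriptor ids that are
# averaged for it (hypothetical layout, based on the list above).
my %criteria = (
    POSTS       => { MARKS => 10, DESCRIPTORS => [qw(NUM_POSTS LENGTH)] },
    CONNECTIONS => { MARKS => 5,  DESCRIPTORS => [qw(STUDENT_LINKS)] },
    OTHER_LINKS => { MARKS => 5,  DESCRIPTORS => [qw(LINKS)] },
);

sub calculateMark {
    my ($stats) = @_;
    my $total = 0;
    for my $name ( sort keys %criteria ) {
        my $criterion = $criteria{$name};
        # Average the descriptor percentages; a missing stat counts as 0.
        my @percents = map { $stats->{$_} // 0 } @{ $criterion->{DESCRIPTORS} };
        my $average  = sum(@percents) / @percents;
        # Scale the average to the marks available, then add it up.
        $total += $criterion->{MARKS} * $average / 100;
    }
    return $total;
}

# Made-up percentages for one participant.
my %stats = ( NUM_POSTS => 80, LENGTH => 100, STUDENT_LINKS => 50, LINKS => 60 );
my $mark = calculateMark( \%stats );
printf "Mark: %.1f out of 20\n", $mark;
```

With these made-up numbers the criteria contribute 9, 2.5 and 3 marks respectively.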

Calculating first blogs

The task here is, for each student, to calculate the percentage of links included in their blog posts that appear there first (amongst all the other student blogs).

What we need here is probably a hash with key on the link and the value being a reference to the hash about the post (which has timepublished).

There is a function createBlogMapping that loops through each post for each student and creates a hash ref MAPPING that maps out who links to who.

What’s needed is a similar function that only works on external links (or perhaps all links) and uses the timepublished to create the necessary hashref.

Perhaps something like

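  # EARLIEST: unix timestamp of the earliest post containing the link
  # POST: reference to that post's hash in the data structure
  $whenShared->{$link} = { EARLIEST => "unix timestamp when published",
                           POST => $link_to_blog_post_in_data_structure };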

This hash would allow a loop for each student that would count the number of times the POST value is one of the user’s posts.

Exclude any link that isn’t to the student’s actual post; the first link to another student’s post is counted the same as a link elsewhere.
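
Building the hash described above could be sketched as follows. The %students layout (POSTS, URL, LINKS) is a made-up stand-in for the real data structure, no links are excluded, and a tie on timepublished simply goes to whichever post is seen first.

```perl
use strict;
use warnings;

# Hypothetical layout: each student has a list of posts, each post has a
# unix timestamp (timepublished) and the list of links it contains.
my %students = (
    alice => { POSTS => [
        { URL => 'http://alice.example/p1', timepublished => 200,
          LINKS => [ 'http://example.org/x', 'http://example.org/y' ] },
    ] },
    bob => { POSTS => [
        { URL => 'http://bob.example/p1', timepublished => 100,
          LINKS => [ 'http://example.org/x' ] },
    ] },
);

sub constructWhenShared {
    my ($students) = @_;
    my $whenShared = {};
    for my $name ( sort keys %$students ) {
        for my $post ( @{ $students->{$name}{POSTS} } ) {
            for my $link ( @{ $post->{LINKS} } ) {
                my $seen = $whenShared->{$link};
                # Keep only the earliest post that shares this link.
                next if $seen && $seen->{EARLIEST} <= $post->{timepublished};
                $whenShared->{$link} = {
                    EARLIEST => $post->{timepublished},
                    POST     => $post,
                };
            }
        }
    }
    return $whenShared;
}

my $whenShared = constructWhenShared( \%students );
for my $link ( sort keys %$whenShared ) {
    print "$link first shared in $whenShared->{$link}{POST}{URL}\n";
}
```

Storing a reference to the post hash, rather than a copy, means a later pass can check whether a post "owns" a link by comparing references.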

So we’re looking at two methods

  1. constructWhenShared – create the hash ref
  2. calculateEarliestPercent – add to {MARKING}->{STATS} the percentage of links first here.
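
The second method might be sketched like this. The $whenShared hash ref is built by hand here so the example stands alone, the POSTS and LINKS layout is an assumption, and LINKS_HERE_FIRST is borrowed from the stat name used earlier. Every link in every post is counted, so a link repeated across a student's posts is counted each time.

```perl
use strict;
use warnings;

# Hypothetical posts, each with the links it contains.
my $alicePost = { LINKS => [ 'http://example.org/x', 'http://example.org/y' ] };
my $bobPost   = { LINKS => [ 'http://example.org/x' ] };

my %students = (
    alice => { POSTS => [$alicePost] },
    bob   => { POSTS => [$bobPost] },
);

# Hand-built stand-in for the hash ref: link => earliest post sharing it.
my $whenShared = {
    'http://example.org/x' => { EARLIEST => 100, POST => $bobPost },
    'http://example.org/y' => { EARLIEST => 200, POST => $alicePost },
};

sub calculateEarliestPercent {
    my ( $student, $whenShared ) = @_;
    my ( $links, $first ) = ( 0, 0 );
    for my $post ( @{ $student->{POSTS} } ) {
        for my $link ( @{ $post->{LINKS} } ) {
            $links++;
            my $entry = $whenShared->{$link};
            # Comparing references tests whether this is the very same post.
            $first++ if $entry && $entry->{POST} == $post;
        }
    }
    # A student with no links gets 0 rather than a division by zero.
    my $percent = $links ? 100 * $first / $links : 0;
    $student->{MARKING}{STATS}{LINKS_HERE_FIRST} = $percent;
    return $percent;
}

for my $name ( sort keys %students ) {
    my $percent = calculateEarliestPercent( $students{$name}, $whenShared );
    printf "%s: %.0f%% of links appeared here first\n", $name, $percent;
}
```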