Course Data Dump

by Richard Olsen -
Number of replies: 6
Given that this course is about analytics and data, has anyone considered whether a course data dump would be desirable and possible?

I'm thinking of how sites like Stack Overflow offer their complete data as a dump for anyone to use as they see fit, within the bounds of the Creative Commons license: http://blog.stackoverflow.com/2009/06/stack-overflow-creative-commons-data-dump/
In reply to Richard Olsen

Re: Course Data Dump

by George Siemens -
Hi Richard - thanks for raising this topic - I'd like to see a similar CC license adopted for LAK11 (i.e. all user contributions are CC-licensed). We will certainly be doing some analysis of discussions and interactions as the course progresses.

Comments from others?
In reply to George Siemens

Re: Course Data Dump

by Hans de Zwart -

I really like Richard's idea (and Stack Overflow's stated ambitions and intentions).

So for me it would certainly be a plus if all the posts here in Moodle could be CC-licensed. I guess that should be possible, since the course/site has a clear owner. After that, it is indeed important to offer the post data in a format that makes sense for people to use.

Maybe it is a good idea to look at Wikimedia's Terms of use for the Wikipedia project?
http://wikimediafoundation.org/wiki/Terms_of_Use

In reply to Hans de Zwart

Re: Course Data Dump

by George Siemens -
Hi Hans, Richard,

During our session on Friday, the question of privacy and research came up. I stated that, given the conversation is on the open web, course participants should expect some sort of analysis to be conducted based on social networks, language, conceptual development, etc. The individual who does the analysis (especially if it is intended for publication) should clear ethics within her institution.

A CC license on content is a bit different, as the intent of CC licensing is to permit reuse and sharing. Reusing/sharing vs. analyzing is a small, but I think important, distinction.

In reply to George Siemens

Re: Course Data Dump

by Richard Olsen -
The CC part of it isn't that important to me, although it does give people the freedom to be creative with it while guarding against others simply reposting it chock full of ads.

The interesting thing would be if, at the end of the course, there were a data file, maybe broken up into resources, posts (from here), blog posts (scraped from Netvibes), and links (scraped from Delicious and Diigo), probably in XML.
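
To make that concrete, here is a rough sketch of how such a file could be laid out, assuming Python and its standard xml.etree.ElementTree module. The element names (course_dump, resources, posts, blog_posts, links) and the per-item fields are made up for illustration; nothing here is an agreed format, and the single sample post is just the opening post of this thread.

```python
import xml.etree.ElementTree as ET


def build_dump(resources, posts, blog_posts, links):
    """Build one XML tree with a section per content type.

    Each argument is a list of dicts; every dict key becomes a child
    element of the item. All tag names are hypothetical.
    """
    root = ET.Element("course_dump", {"course": "LAK11"})
    sections = [
        ("resources", "resource", resources),
        ("posts", "post", posts),
        ("blog_posts", "blog_post", blog_posts),
        ("links", "link", links),
    ]
    for section_tag, item_tag, items in sections:
        section = ET.SubElement(root, section_tag)
        for item in items:
            element = ET.SubElement(section, item_tag)
            for key, value in item.items():
                ET.SubElement(element, key).text = str(value)
    return root


# Sample data: only the first post of this thread; the other sections are empty.
dump = build_dump(
    resources=[],
    posts=[{"author": "Richard Olsen", "subject": "Course Data Dump"}],
    blog_posts=[],
    links=[],
)
xml_text = ET.tostring(dump, encoding="unicode")
print(xml_text)
```

Keeping each content type in its own section would let someone load only the posts, say, without parsing the scraped blogs and links.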

Not sure that anyone would do anything with it but at the very least it would be a nice precedent?
In reply to Richard Olsen

Re: Course Data Dump

by Deleted user -
Maybe the two things could be separated. One is making a data set with all the activity available; the other is the right to re-use the post content.

As for the data set, I think there will always be somebody who can run some analysis of the types of comments, the activity, and the resources that users touched. It would be a great bootstrapping process for the learning analytics community, right?
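
As a small illustration of the kind of analysis I mean, here is a sketch in Python that counts activity from nothing more than (author, replied-to) pairs. The pairs are the reply structure of this very thread; the variable names and the choice of counts are only an assumption about what someone might want to look at first.

```python
from collections import Counter

# (author, author being replied to); None marks the opening post.
posts = [
    ("Richard Olsen", None),
    ("George Siemens", "Richard Olsen"),
    ("Hans de Zwart", "George Siemens"),
    ("George Siemens", "Hans de Zwart"),
    ("Richard Olsen", "George Siemens"),
    ("Deleted user", "Richard Olsen"),
    ("Richard Olsen", "Deleted user"),
]

# How many posts each participant wrote.
posts_per_author = Counter(author for author, _ in posts)

# How many replies each participant received.
replies_received = Counter(parent for _, parent in posts if parent is not None)

# Directed edges (who replied to whom), usable as input for a social network graph.
reply_edges = Counter(
    (author, parent) for author, parent in posts if parent is not None
)

for author, count in posts_per_author.most_common():
    print(f"{author}: {count} posts, {replies_received[author]} replies received")
```

With a full course dump, the same counts could be run over every forum instead of one thread.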

In reply to Deleted user

Re: Course Data Dump

by Richard Olsen -
Maybe we should think about this for a week and then map out the content and metadata that we want to capture.

Around week three we could start a few test scrapes, with the aim of building the tools to scrape the data at the end of the course, unless someone asks us to stop.