Community-Building and Infrastructure Design for Data-Intensive Research in Computer Science Education

Overview

Online educational systems, and the large-scale data streams that they generate, have the potential to transform education as well as our scientific understanding of learning. Computer Science Education (CSE) researchers are increasingly making use of large collections of data generated by the click streams coming from eTextbooks, interactive programming environments, and other smart content. However, CSE research faces barriers that slow progress:

Collections of computer science learning process and outcome data generated by one system are not compatible with those from other systems.
Data on computer science problem solving and learning (e.g., open-ended coding solutions to complex problems) are quite different from the type of data (e.g., discrete answers to questions or verbal responses) that current educational data mining focuses on.

The project goal is to build community and capacity among CSE researchers, data scientists, and learning scientists in order to reduce these barriers and realize the full potential of data-intensive research on learning and on improving computer science education. We are bringing together CSE tool-building communities with learning science and technology researchers to work toward a software infrastructure that supports scaled and sustainable data-intensive research in CSE and that contributes to the basic science of how humans learn complex problem solving. This goal is being pursued through a set of community-building and infrastructure capacity-building activities that aim to develop and disseminate infrastructure facilitating three aspects of CSE research:

development and broader re-use of innovative learning content that is instrumented for rich data collection,
formats and tools for analysis of learner data,
best practices to make large collections of learner data and associated analytics available to researchers in CSE, data science, or learning science.

We engage a large community of researchers to define, develop, and use critical elements of this infrastructure to address specific data-intensive research questions. We are hosting workshops, meetings, and online forums that leverage existing communities and build new capacities toward significant research outcomes and lasting infrastructure support.

This is a collaborative project with Carnegie Mellon University and Virginia Tech teams.

Current State

This community-building project has now been expanded into a full-scale infrastructure development project, "An Infrastructure for Sustainable Innovation and Research in Computer Science Education." All updates will be posted to the new project page. Also visit the project Web site, where you can find more information about the project as well as various project materials and resources.