Chapter 8: Learning Analytics for Self-Regulated Learning

The Winne-Hadwin model of self-regulated learning (SRL) [27], elaborated by Winne’s [16, 18, 28] model of cognitive operations and motivation, provides a framework for conceptualizing key issues concerning kinds of data and analyses of data for generating learning analytics about SRL. Trace data are recommended as observable indicators that support valid inferences about metacognitive monitoring and metacognitive control constituting SRL. Characteristics of instrumentation are described for gathering ambient trace data via software learners use to carry out everyday studying. Critical issues are discussed: what to trace about SRL, attributes of instrumentation for gathering ambient trace data, computational issues arising when analyzing trace data alongside complementary data, scheduling and delivering learning analytics, and kinds of information to convey in learning analytics intended to support productive SRL.

In phase 1, a learner surveys resources and constraints the learner predicts may affect work, the probability specific actions lead to particular results, and consequences of those actions. Factors external to the learner include access to information, characteristics of sources of information, software tools designed to support learning in various ways, and time allowed for work. Examples of factors internal to the learner include knowledge, misconceptions, biases for ways of working, topical interests, and a disposition to interpret slow progress as a signal of low ability versus need to apply more effort (see Winne [22,18]).
Having identified resources and constraints, a learner sets goals and plans how to approach them in phase 2. Goals are standards for the workflow and the products of work. Ipsative goals compare current results to earlier ones; they measure personal growth or decline. Criterion-referenced goals compare ideal to actual process-related features (e.g., effort, pace) or achievements. Norm-referenced goals compare products to a peer's or a group's. Goals and what they reference may be framed by the learner, an instructor, or another person. Many goals concern content studied: additions to knowledge, errors corrected, or misconceptions replaced. Learners also set goals for learning processes. Which study tactic is most straightforward, more likely to succeed, or more familiar (practiced)? Topics of goals may concern motivation and emotion, such as curiosity satisfied or anxiety avoided. Goals may refer to external properties such as number of pages read or written, deadlines for assignments, and opportunity to impress others.
In phase 3, the learner engages with the task by enacting and making minor course corrections to plans. Working on a task inherently generates feedback updating the task's conditions across the task's timeline. Feedback may originate outside the learner when software beeps or a peer comments on a post to an online discussion. Or, feedback may arise internally as the learner monitors pace, effort, and certainty about knowledge (judgments of learning; see Dunlosky and Tauber [6], Part 3). For example, a search query may be deemed unproductive because results were not what was expected or don't satisfy the standards for particular information. Goals can be updated as tasks progress.
Phase 4 is when the learner disengages from the task as such, monitors properties of phases 1 to 3, and elects to make a large-scale adjustment. Examples might be a learner suspending work on a problem and returning to assigned readings with a revised goal to repair major gaps in knowledge. Or, if re-studying is not predicted to be successful, the learner may seek help from the instructor. Changes may be applied immediately, reshaping the task's multivariate profile in a major way. Or, plans for adaptations may be filed for future tasks, effecting forward-reaching transfer.
PG 78 | HANDBOOK OF LEARNING ANALYTICS
A 5-slot schema frames events throughout these phases of SRL. It is summarized by a first-letter acronym, COPES [21]. C refers to conditions, factors bearing on whether and how an event unfolds. Time allocated, resources available, and exposure to scrutiny by peers or the instructor are common external conditions. Internal conditions are psychological features the learner brings to the task. Examples are previously developed knowledge, beliefs about the topic, a toolset of tactics for learning, and motivational and affective descriptions of one's self.
O in the COPES schema is operations learners use to manipulate information. Like conditions, operations are external and internal. External operations are a learner's observable behaviors, such as posing a question or copying information from an instrument readout into an online document. Internal operations manipulate information in the learner's working memory. I posit that five primitive cognitive operations transform information in ways that cannot be further decomposed: searching, monitoring, assembling, rehearsing, and translating; the SMART operations [16]. Table 1 describes each with examples of traces, observable behaviour tightly coupled to the unobservable cognitive operation [20]. More complex descriptions of cognition, study tactics and learning strategies, are modelled as patterns of SMART operations [17]. An example study tactic is: Highlight every sentence containing a definition. An example learning strategy is: Survey headings in an assigned reading, generate a key question about each, then, after completing the entire reading assignment, go back to answer each question to test understanding.
The P slot in the COPES schema represents products created by operations. A product can be simple, such as an ordered list of British monarchs; or it can be complex, for example, an argument about privacy risks in social media or an explanation of catalysis. Some products are unforeseen because the learning environment is not completely predictable. E is a monitoring operation that generates a special product, an evaluation comparing a product to standards, S. Standards for a product equate to the goal for that product.
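The five COPES slots can be pictured as fields of a single record. The following is a minimal sketch, not a data model from the chapter: the class name `CopesEvent`, its field names, and the equality-based `evaluate` function are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CopesEvent:
    """One learning event framed by the COPES slots (names are illustrative)."""
    conditions: dict   # C: external and internal factors bearing on the event
    operation: str     # O: for example, one of the SMART operations
    product: str       # P: what the operation created
    standards: dict    # S: the goal for the product, as named criteria
    evaluation: dict = field(default_factory=dict)  # E: filled in by monitoring


def evaluate(event, observed):
    """Monitoring: compare observed attributes of the product to each standard."""
    event.evaluation = {
        name: observed.get(name) == target
        for name, target in event.standards.items()
    }
    return event.evaluation


event = CopesEvent(
    conditions={"time_allocated_min": 30, "prior_knowledge": "partial"},
    operation="assembling",
    product="ordered list of British monarchs",
    standards={"ordered": True, "complete": True},
)
result = evaluate(event, {"ordered": True, "complete": False})
print(result)
```

Here the evaluation reports that the list meets the "ordered" standard but not the "complete" one, which is the kind of discrepancy a learner might act on.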
Three more characteristics of SRL are significant for learning analytics. First, SRL is observable only when a learner adjusts conditions, operations, or standards. Such observations require data gathered across time and showing change. Second, learners are agents. They regulate learning based on conditions and standards they judge to matter. As agents, learners always and intrinsically have choices. Therefore, learning analytics are recommendations, not dictates. A learner may think, "I did it because I had to." But this learner elected to do what they did because they forecast that negative consequences for doing something else outweighed costs of doing what they did. Goals reflect decisions that weigh costs against benefits. For example, learners sometimes are not provided standards for evaluating a product because instructors expect learners already have knowledge or skill to evaluate a product. A learner bereft of learning objectives might search for examples against which to compare their products. It can be inferred the learner has a goal to develop standards by analyzing (disassembling) examples. In the classroom, this learner may withdraw and wait for classmates to offer examples. Online, this learner may search the internet using whatever knowledge they have and evolving successively more relevant queries. Third, the COPES model identifies classes of data for developing learning analytics about SRL and suggesting targets for adaptation.
This chapter centers on self-regulated learning (SRL) in which learners are the prime actors amidst others, human and algorithmic. All reciprocally shape conditions within which each learner forges self-regulated learning. Notably, SRL is risky because it may have productive or counterproductive results.
The next section overviews characteristics of learning analytics. Then four main classes of data are distinguished by their origin: traces, learner history, reports, and materials studied. Then computations and reporting formats for learning analytics relating to SRL are described. Together, these sections sketch an architecture for learning analytics designed to support SRL. In a final section, several challenges are raised to designing these learning analytics.

LEARNING ANALYTICS
Four descriptions of learning analytics guide the field. Siemens [14] described learning analytics as "the use of intelligent data, learner-produced data, and analysis models to discover information and social connections, and to predict and advise on learning." The website for the 1st International Conference on Learning Analytics and Knowledge posted this account: "the measurement, collection, analysis and reporting of data about learners and their contexts, for purposes of understanding and optimizing learning and the environments in which it occurs." Educause [8] defined learning analytics as "the use of data and models to predict student progress and performance, and the ability to act on that information." Building on Eckerson's [7] framework, Elias [9] noted "learning analytics seeks [sic] to capitalize on the modelling capacity of analytics: to predict behaviour, act on predictions, and then feed those results back into the process in order to improve the predictions over time" (p. 5).
These descriptions beg fundamental questions. What data should be gathered for input to methods used to generate learning analytics? Answering this question bounds and shapes two questions: First, what are approaches to computations underlying analytics? Second, what can analytics say about phenomena? For instance, if data are not ordinal, A cannot be described as greater than B, nor are transitive statements valid: if A > B and B > C, then A > C. Also, ordinal (rank) data preclude arithmetic operations on them, such as addition or division.
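The constraint that a scale of measurement places on computation can be sketched as a guard that refuses a statistic the scale cannot support. This is an illustration only: the scale labels "nominal" and "interval" are standard measurement terms that the text above does not use, and the function `summarize` and the table `PERMITTED` are hypothetical names.

```python
import statistics

# Which summaries are meaningful at each scale (assumed labels, see lead-in).
PERMITTED = {
    "nominal": {"mode"},                      # no ordering, no arithmetic
    "ordinal": {"mode", "median"},            # ordering, but no arithmetic
    "interval": {"mode", "median", "mean"},   # ordering and arithmetic
}


def summarize(values, scale, statistic):
    """Return the requested statistic only if the scale supports it."""
    if statistic not in PERMITTED[scale]:
        raise ValueError(f"{statistic} is not meaningful for {scale} data")
    if statistic == "mode":
        return statistics.mode(values)
    if statistic == "median":
        # median_low returns an actual data value, so no averaging of ranks
        return statistics.median_low(values)
    return statistics.mean(values)


ranks = [1, 2, 2, 3, 5]
print(summarize(ranks, "ordinal", "median"))
```

Requesting `summarize(ranks, "ordinal", "mean")` raises a `ValueError`, mirroring the point that rank data preclude addition and division.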

What bearing do properties of data have on the validity of interventions based on learning analytics developed from those data [20]? For example, if a learner's age, sex, or lab group predicts outcomes, intervening without other data is not warranted. None of these data classes is a direct, proximal (i.e., sufficient) cause of outcomes. More-

Features of Traces.
Four features describe ideal trace data gathered for learning analytics to support SRL. First, the sampling proportion of observed traces to cognitive operations should approach unity. Ideally, but not realistically, every operation is traced throughout each learning session. Second, information operated on is identified. Third, traces are time stamped. Fourth, the product(s) of operations is (are) recorded. Data having this 4-tuple structure would permit an ideal playback machine to read trace data and output a nearly perfect rendition of every learning event and its product(s) across the timeline of the learning session. With 4-tuple trace data, raw material is available to generate rich learning analytics.
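The 4-tuple structure and the idea of a playback machine can be sketched in a few lines. This is a hedged illustration, not the instrumentation the chapter has in mind: the `Trace` class, its field names, and the `playback` function are hypothetical, and the sample records are invented.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Trace:
    """One trace as a 4-tuple (field names are illustrative)."""
    operation: str       # the operation traced
    information: str     # the information operated on
    timestamp: datetime  # when the operation occurred
    product: str         # what the operation produced


def playback(traces):
    """Replay a session by rendering traces in timestamp order."""
    ordered = sorted(traces, key=lambda t: t.timestamp)
    return [
        f"{t.timestamp:%H:%M:%S} {t.operation} on '{t.information}' -> {t.product}"
        for t in ordered
    ]


session = [
    Trace("monitoring", "definition of catalysis",
          datetime(2024, 1, 15, 10, 5, 0), "highlight tagged 'important'"),
    Trace("searching", "query: catalysis",
          datetime(2024, 1, 15, 10, 2, 30), "list of 4 resources"),
]
lines = playback(session)
for line in lines:
    print(line)
```

Because every record carries a timestamp, the playback recovers the order of events even when records arrive out of order, as they do in `session`.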
In reality, every trace datum has some degree of imperfection and unreliability [20]. For example, a highlighting event traces a monitoring operation and generates a product: the mark plus the content marked. At a future time, the mark facilitates locating information. What is vague about this trace is the standards the learner used to identify the marked content. Better designed traces can fill this gap. If learners are invited to tag content they highlight (interesting, important, unclear, project1, tellMike, etc.), the tag exposes the standard used to metacognitively monitor the highlighted information. Some tags reveal a strong signal about a plan, e.g., use this content in project1, or in the next chat tellMike about this information.

Learner History
Instruments for tracing the history of a learner's activities are available in at least three environments: paper systems, learning management systems (LMS), and systems offering learners tools for studying "on the fly."
Paper Systems. In a paper-based environment, examples of traces are content highlighted, notes, marginalia such as !, ?, and √ added to the whitespace of textbook pages, a pile of books or papers stacked in order of use (e.g., the topmost was most recently used), and multicolored post-it tabs attached to pages in a notebook.
Consider the ? symbol written in the margin of a textbook page. This trace signals the learner metacognitively monitored the meaning of nearby content and judged it confusing or needing more information to understand it. A further inference is available. Why would the learner spend effort to write ? in the margin? The metacognitive judgment does not require recording a symbol. It's likely the learner is motivated to and plans to resolve a gap in understanding. The ? marks where that resolution should be applied.
Tracing in a paper-based environment is easy for learners but gathering and preparing paper-based trace data to generate learning analytics is massively labour intensive. In software-supported environments, this burden is greatly eased.
Learning Management Systems. Modern LMSs seamlessly record various time-stamped records of learners' work. Examples include: logging in and out of the LMS, resources viewed and downloaded, assignments uploaded, quiz items attempted, and forum posts identifying intended recipients. Some data allow inferences about goals. For example, clicking a button labelled "practice test" traces a learner's judgment that recall is below a confidence threshold. Other trace data could describe (a) learners' preferred work schedules that mildly support inferences about procrastination, (b) resources learners judge are more relevant or appealing, (c) motivation to calibrate judgments of learning and efficacy, and (d) value attributed to contributing, acquiring, or clarifying by exchanging information with peers.
Data gathered across time can mark when learners first study a resource, if and when they review it, if and when they choose to self test, and when they take a test for marks. Coupled with other data about factors such as credit hours completed or characteristics of peers with whom information is exchanged, data like these provide raw material for building models about how learners self-regulate managing time in a study-review-practice-test cycle [1,4,5].
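Milestones in a study-review-practice-test cycle can be read off a time-stamped log along these lines. The sketch assumes a hypothetical log of `(timestamp, event_type, resource)` tuples with event types `"view"`, `"practice_test"`, and `"graded_test"`; no particular LMS exports exactly this. Treating a second view as a review is also an assumption, since a view by itself does not show that the learner studied the content.

```python
from datetime import datetime


def cycle_milestones(log, resource):
    """Find first study, first review, first self-test and graded test times."""
    def times(kind):
        return sorted(ts for ts, k, res in log if res == resource and k == kind)

    views = times("view")
    practice = times("practice_test")
    graded = times("graded_test")
    return {
        "first_study": views[0] if views else None,
        # Assumption: the second view of a resource counts as a review.
        "first_review": views[1] if len(views) > 1 else None,
        "first_self_test": practice[0] if practice else None,
        "graded_test": graded[0] if graded else None,
    }


log = [
    (datetime(2024, 3, 1, 9, 0), "view", "unit3"),
    (datetime(2024, 3, 4, 20, 15), "view", "unit3"),
    (datetime(2024, 3, 5, 8, 30), "practice_test", "unit3"),
    (datetime(2024, 3, 6, 13, 0), "graded_test", "unit3"),
    (datetime(2024, 3, 2, 10, 0), "view", "unit4"),
]
milestones = cycle_milestones(log, "unit3")
review_lag = milestones["first_review"] - milestones["first_study"]
print(review_lag.days)
```

Intervals such as `review_lag` are the sort of raw material that could feed a model of how a learner manages time across the cycle.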
When students use an LMS, costs are slight to collect and prepare ambient data for input to computations generating learning analytics. However, LMSs rarely gather trace data about operations learners carry out as they study and review, and particular information on which they operate. A time-stamped datum describing a file downloaded provides no information about whether the learner studied that content or how the learner studied it.
Software Tools for Studying. Data about motivation, metacognition, and SRL are "raw material for engineering the bulk of an account about why and how learners develop knowledge, beliefs, attitudes and interests" [26, p. 1]. Developing these data requires attention to three factors: operationalizing indicators, gathering data to trace these constructs, and filtering noise that obscures signals about constructs (see also [13,20]).
Operationalizing indicators to trace COPES calls for imaginative interfaces that encourage learners to use software tools without overly perturbing currently preferred work habits. Table 2 illustrates opportunities to gather trace data when a learner uses software tools to:
• Search a repository of resources provided by an instructor and for artifacts the learner creates (e.g., terms, notes, concept maps).
• Select content in a resource to highlight, tag or annotate it.
• Make a note guided by a schema, e.g., a TERM NOTE: term, definition, example, see also . . .; a DEBATE NOTE: claim, evidence, warrant, counterclaim, my position.
• Organize artifacts, e.g., in a directory of folders.
Phase 4, strategically revising learning tactics and strategies, was excluded from Table 2. This phase is addressed in the section on Learning Analytics for SRL.

The Learner's Reports
Paper-based questionnaires (surveys) and oral reports recording ideas "thought aloud" while a learner studies or interviews after studying are common methods for gathering data about learning. In both, learners are prompted to describe features of COPES. The prompt given is critical because the learner uses it to set standards for deciding what to report. A thorough review is beyond the scope of this chapter; see Winne and Perry [30] and Winne [17,19]. In general, prompts for questionnaire items present conditions too generally (e.g., When you study . . .). Also, all self-report data suffer loss, distortion, and bias due to frailties of human memory. Consequently, self-report data may correspond weakly to how a learner goes about learning in a particular study session and how learning varies (is self-regulated) as learning conditions vary. Self-report data are important, however. They reflect general beliefs learners hold about COPES. Beliefs shape what learners attend to about tasks, themselves, and standards they set.

Materials Studied
Materials learners work with can be sources of data about conditions that shape SRL. Texts can be described by various analytics including readability and cohesion (e.g., Coh-Metrix). Content can be indexed for opportunity to learn it plus characteristics of what a learner learned previously. Materials a learner studies also can be indexed by rhetorical features such as examples and summaries; and media, such as a quadratic expression described in words (semantic), an equation (symbolic), and a graph (visualization).

LEARNING ANALYTICS FOR SRL
Learning analytics to support SRL have three facets: calculation, delivery factors, and recommendation(s). The calculation (e.g., observing presence/absence, count, proportion, duration, probability) is based on traces of operations performed during one or multiple study sessions [13]. Delivery factors fall into two main groups: timing and characteristics of the delivered analytic, for example, as text ("You created 3 notes on average per website."), a table, or a visualization (e.g., a radar chart with axes labeled by website titles and markers representing the number of notes at each website). Table 3 illustrates trace data that might be mirrored about a learner's engagements.
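The calculation and the text form of delivery can be sketched together using the notes-per-website example. The trace format, a list of `(kind, site)` pairs with kinds `"visit"` and `"note"`, is an assumption for illustration, as are the function names.

```python
from collections import Counter


def notes_per_website(traces):
    """Count notes at each visited website and the mean across websites."""
    visited = {site for kind, site in traces if kind == "visit"}
    notes = Counter(site for kind, site in traces if kind == "note")
    counts = {site: notes.get(site, 0) for site in sorted(visited)}
    mean = sum(counts.values()) / len(counts) if counts else 0.0
    return counts, mean


def as_text(mean):
    """Deliver the analytic as a sentence mirrored back to the learner."""
    return f"You created {mean:g} notes on average per website."


traces = (
    [("visit", "A")] + [("note", "A")] * 4
    + [("visit", "B")] + [("note", "B")] * 2
    + [("visit", "C")] + [("note", "C")] * 3
)
counts, mean = notes_per_website(traces)
message = as_text(mean)
print(counts)
print(message)
```

The per-site `counts` are what a table or radar chart would display, while `message` is the text form of the same calculation.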
A "simple" history of trace data mirrored back to a learner may be conditioned or contextualized by other data: features of materials such as length or a readability index, demographics describing the learner (e.g., prior achievement, hours of extracurricular work, postal code), or other characterizations such as disposition to procrastinate, degree in a social network (the number of people with whom this learner exchanged information) or context for study (MOOC vs. face-to-face course delivery, opportunity to submit drafts for peer review).
The third facet of a learning analytic, the recommendation, updates conditions the agentic learner may attend to by describing what the learner might change. The recommended change may be supplemented by guidance about effecting the change and a rationale for change. Changes recommended are limited to four learner-controllable facets of COPES: some conditions, operations, triggers for making an evaluation, and standards [23]. Products are only indirectly controllable because their characteristics are a function of (a) conditions a learner can change and then chooses to change, particularly information the learner selects to be operated on; and (b) operation(s) the learner chooses for manipulating information. Rationale for recommendations may be grounded in "common sense," theory, findings mined from data, and results of empirical research in learning science.
When recommendations are operationally defined as how a learner uses tools in software, for example, "highlight more selectively" (meaning highlight fewer words and more relevant content) or "open and review notes not viewed for 5 days", the learner's uptake and the degree of match between recommendations and the learner's behavior can be tracked.
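Tracking uptake of the recommendation "open and review notes not viewed for 5 days" might look like the sketch below. It assumes a hypothetical mapping from note identifiers to the time each was last viewed and a set of notes opened after the recommendation was delivered; measuring uptake as a simple proportion is one choice among many.

```python
from datetime import datetime, timedelta


def stale_notes(last_viewed, now, days=5):
    """Notes whose most recent view is at least `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return {note for note, seen in last_viewed.items() if seen <= cutoff}


def uptake(recommended, opened_after):
    """Proportion of recommended notes the learner went on to open."""
    if not recommended:
        return None  # nothing was recommended, so uptake is undefined
    return len(recommended & opened_after) / len(recommended)


now = datetime(2024, 1, 10, 12, 0)
last_viewed = {
    "n1": datetime(2024, 1, 1, 9, 0),
    "n2": datetime(2024, 1, 9, 9, 0),
    "n3": datetime(2024, 1, 3, 9, 0),
}
recommended = stale_notes(last_viewed, now)
rate = uptake(recommended, opened_after={"n1", "n2"})
print(sorted(recommended), rate)
```

In this invented case two notes are recommended and the learner opens one of them, so the match between recommendation and behavior is one half.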

CHALLENGES FACING LEARNING ANALYTICS ABOUT SRL
As software systems gathering trace data evolve, they are being distributed across widening spans of learners' ages, subjects studied, and learners' whereabouts. Using these systems to advance research must respect learners' preferences and legislated boundaries regarding the distribution and uses of data. Hopefully, learners will embrace a social responsibility to improve learning science, a stance that clearly depends on how learning science gathers data and uses learning analytics.

Learning is a Multiplex of Skills
Self-regulating learners choose how they operate on information under particular conditions. If characteristics of operations, e.g., efficiency or effort, and products are substandard, they strive to adapt skills or, as may be possible, remove or reconfigure conditions that bear on applying skills. A useful model here is a production, IF-THEN-ELSE [18].
Skill in selecting and sequencing operations for learning develops when learners gain useful feedback (e.g., internal feedback and external analytics) about practice over successive trials. Two categories of feedback are distinguished. Knowledge of results feedback describes accuracy or correctness. Because skills are operations applied conditionally (IF), knowledge of results feedback has two dimensions: Were conditions appropriate for choosing a particular skill and was the skill executed correctly [18]? When skills are operationally defined as patterns of traces [23], describing whether a skill is executed correctly is straightforward.
A pattern of traces is the skill in operation. Algorithms are available to generate knowledge of results feedback when learning skills are operationalized as traces. A remaining challenge is engineering tools learners work with that generate traces with a strong coupling to cognitive, metacognitive, and motivational constructs in learning science. This recommends fusing designs for learning analytics with findings from research in learning science [10].
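The two dimensions of knowledge of results feedback can be illustrated with a skill modelled as a condition (IF) plus a model pattern of traces (THEN). This is a sketch under stated assumptions: the dictionary layout, the trace labels, and the requirement that the pattern appear as a contiguous run are all illustrative choices, and the ELSE branch of a production is left out.

```python
def contains_pattern(observed, pattern):
    """True if `pattern` occurs as a contiguous run within `observed`."""
    n = len(pattern)
    return any(observed[i:i + n] == pattern
               for i in range(len(observed) - n + 1))


def knowledge_of_results(skill, context, observed):
    """Report on both dimensions: fit of conditions and correct execution."""
    return {
        "conditions_appropriate": skill["if"](context),
        "executed_correctly": contains_pattern(observed, skill["then"]),
    }


# The example learning strategy described earlier, as a hypothetical skill.
survey_strategy = {
    "if": lambda ctx: ctx["has_headings"],
    "then": ["survey_headings", "generate_question", "read", "answer_question"],
}

observed = ["open_reading", "survey_headings", "generate_question",
            "read", "answer_question"]
feedback = knowledge_of_results(survey_strategy, {"has_headings": True}, observed)
print(feedback)
```

A learner who skipped the question-generating step would receive `executed_correctly: False` even though the conditions suited the strategy, which separates the two dimensions.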
In the context of achievement testing, feedback can elaborate knowledge of results by adding information intended to help a learner understand why a given answer was correct or incorrect and, if incorrect, what the correct answer is and why it is correct. When traces of learning skills are tightly coupled to constructs in learning science, elaborated feedback has a different form. Beyond describing differences between a learner's multiplex of traces and a model pattern (strategy) for learning, theory borrowed from learning science can help form explanations for self-regulating learners about why adapting skills has utility. The question of whether learners act on learning analytics therefore relates to motivation (see [28]).
Across successive learning sessions, each learner tests the main and side effects of recommendations supplied by learning analytics. Across a multitude of learners, today's software systems are positioned to analyze big data about which learning analytics are offered, learners' uptake of recommended adaptations, and the effects of adaptations. This sets a stage for learning science and learning analytics to form a scientifically and practically progressive symbiotic system [24,25].

Time
Other research issues arise because developing skills requires practice. How should analytics be adapted to help learners develop multiplex learning skills? Should learning analytics be delivered just-in-time or just-in-case? If just-in-case, what is the optimal delay between learning events in which traces are gathered and when learning analytics are delivered? Modeling skills in IF-THEN-ELSE form, how should context (IF) be reinstated? Are particular kinds of learning skills more productively served by schedules for delivering analytics? Questions of these kinds further commend a union of learning science and learning analytics.
Learning science has researched how achievement covaries with time spans between studying, reviewing, and test taking sessions [4], forgetting as a function of time [11], and knowledge lost over summer holidays [3]. Otherwise, time data have been underused. Traces and other data available for composing learning analytics commonly are timestamped. New research should investigate how time and timing matter in supporting progressive SRL. This requires identifying patterns in COPES events across time [29]. Vexing questions here are how to define boundaries for time windows and how to determine which events should be filtered out (see [31]).
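One common way to draw boundaries for time windows is to start a new window whenever the gap between consecutive timestamps exceeds a threshold. The sketch below shows that approach only as an example; the 30-minute default is an arbitrary assumption, and choosing it well is exactly the open question raised above.

```python
from datetime import datetime, timedelta


def split_into_windows(timestamps, max_gap=timedelta(minutes=30)):
    """Group timestamps into windows separated by gaps longer than max_gap."""
    windows = []
    for ts in sorted(timestamps):
        if windows and ts - windows[-1][-1] <= max_gap:
            windows[-1].append(ts)   # close enough: same window
        else:
            windows.append([ts])     # gap too long: start a new window
    return windows


stamps = [
    datetime(2024, 2, 1, 10, 0),
    datetime(2024, 2, 1, 10, 10),
    datetime(2024, 2, 1, 11, 30),
    datetime(2024, 2, 1, 11, 40),
]
windows = split_into_windows(stamps)
print(len(windows), [len(w) for w in windows])
```

Rerunning with a different `max_gap` changes how many windows appear, which makes the sensitivity of any downstream pattern to this choice easy to inspect.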

More Data and New Systems
Learning analytics are accounts about how learners work and of relations between conditions, forms of learners' work, and products. Operationally defining data needed for these purposes is challenging [20]. Bootstrapping successively more refined and more effective learning analytics can profit from big data [24]. In turn, this recommends designing and widely distributing ensemble software to gather these data. As such learning systems come online, the field of learning analytics will be positioned to replicate what productively self-regulating learners do. At the same time, learners will be afforded regularly upgraded learning analytics to guide self-regulating their learning.

Table 1: SMART Cognitive Operations [2,20]

age and sex can't be manipulated; and, changing lab group may be impractical (e.g., due to scheduling conflicts with other courses or a job). Finally, because prediction is insufficient to establish causality, it is unknown whether changing any of these characteristics will have any effect.

Who generates data? Who receives learning analytics grounded in which data? Learning ecologies are populated by multiple actors. Authors of texts, videos, and webpages vary cues they intend to guide learners about how to study; font styles and formats, such as bullet lists and sidebars that translate text to graphics, are examples. Instructional designers and front-line instructors augment authors' content, for example, by setting goals for learning and adding content to the author's. Instructors also set schedules for learning and control most opportunities for feedback to learners. Learners study solo, form and disengage from online cliques or face-to-face study groups where they exchange topical information, announce beliefs about topics, and share products of learning activities (e.g., questions, notes). Their educational institution provides a multifaceted infrastructure intended to elevate motivation and promote wellness. Each category of actors adds data and may be a legitimate candidate to receive learning analytics.

standards should be used to gauge uptake and benefit? Suppose after receiving learning analytics about scheduling work on assignments, a learner starts work on projects sooner, spends more time on tasks, but achievement remains unchanged. Is this a benefit?

DATA FOR LEARNING ANALYTICS ABOUT LEARNING AND SRL

Traces
As learners work, they generate ambient data (or accretion data; [15]). For example, clicking a URL to open a web resource creates data about a learner's cognition and motivation. Based on context (perhaps the title of the resource), the learner forecast this URL might contain information of sufficient value to motivate examining it. This click is a trace, an ambient datum affording relatively strong inferences about one or more cognitive, affective, metacognitive, and motivational states and processes (CAMM processes; [2,20]). Following are two further examples of traces and inferences developed with an explicit caveat: inferences are probabilistic, not certain.

particular note traces motivation to repair some deficiency in knowledge. If the learner highlights information in the reviewed note, that traces which information the learner monitored and judged deficient.

Table 2: Illustrative Traces and Inferences about Phases of SRL

Table 3: Sample Analytics Describing COPES Facets in SRL
Conditions
• Presence/absence of a particular (set of) condition(s) within a learning session
• Onset/offset along the timeline of one study session or across a series
Operations
• Frequency of SMART operations (see Table 1)
• Sequence, pattern, conditional probability relating multiple SMART operations
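Two of the operations analytics listed in Table 3, frequency and conditional probability, can be computed from an ordered list of traced operations. The sketch below assumes each trace has already been labelled with one SMART operation; it estimates the probability that one operation immediately follows another, which is only one of several ways to relate operations.

```python
from collections import Counter, defaultdict


def transition_probabilities(ops):
    """Estimate P(next operation | current operation) from adjacent pairs."""
    pair_counts = Counter(zip(ops, ops[1:]))
    from_counts = Counter(ops[:-1])  # the last operation has no successor
    probs = defaultdict(dict)
    for (current, following), n in pair_counts.items():
        probs[current][following] = n / from_counts[current]
    return dict(probs)


ops = ["searching", "monitoring", "assembling", "monitoring",
       "rehearsing", "searching", "monitoring"]
frequency = Counter(ops)
probs = transition_probabilities(ops)
print(frequency["monitoring"])
print(probs["monitoring"])
```

In this invented sequence monitoring always follows searching, while monitoring itself is followed equally often by assembling and rehearsing.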