1 A Motivating Example

The purpose of this package is to assist an instructor in grading R code. We therefore begin with a simple example. We describe the assignment, solution file, and two student scripts in the subsequent sections, and then demonstrate how the package can help.

1.1 Download Files

For the purpose of this exercise, you can download the files from this zip file. It contains

  • 1 x question paper (sample_questions_01.Rmd),
  • 1 x solution template (soln_template_01.Rmd), and
  • 2 x sample student scripts (qn01_scr_01.R, qn01_scr_02.R), which are in the folder student_scripts.

Download and unzip the file into your working directory before proceeding. The rest of the code assumes this directory structure - the template and question documents are directly accessible, while the student scripts are one level down, in student_scripts.

In this document, we are going to utilise sample_questions_01.Rmd to demonstrate how to use the main functions within the package. Imagine that sample_questions_01.Rmd is a worksheet that you assign to your students. The pdf that you give out would look something like this.

1.2 Sample Question

Consider the following probability density function (pdf) over the support \((0,1)\):

\[\begin{equation*} f(x) = \begin{cases} 4x^3 & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \end{equation*}\]

Write a function called rf that generates i.i.d. observations from this pdf. It should take exactly one argument, \(n\), that determines how many random variates to return. For instance:

set.seed(33)
rf(n = 5)
#> [1] 0.8171828 0.7925983 0.8339701 0.9790711 0.9584520

Now generate 10,000 random variates from this pdf and store them in a vector named X. Your script must generate a function named rf and a vector named X.

1.3 Solution Template 01

If this is an elementary course in R, or if it is just the first assignment of a class that uses R, the instructor may wish to test just a few details in each student’s solution.

  1. Is the number of formal arguments for rf equal to 1?
  2. Is the length of the vector X equal to 10,000?
  3. Compute the mean and standard deviation of the values in the X vector.
    • The theoretical mean and s.d. are 0.8 and approximately 0.163 (that is, \(\sqrt{2/75}\)) respectively.
  4. Has a for loop been used within the function definition of rf?
    • It shouldn’t be used, because most operations in R are vectorised.
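Checks 1 to 3 above can be sketched in base R. The model solution used here is an assumption for illustration (inverse-transform sampling, since the CDF is \(F(x) = x^4\) on \((0,1)\)); it is not necessarily what soln_template_01.Rmd contains, and the seed is arbitrary.

```r
# One possible vectorised model solution (an assumption, not taken from the package files).
# The CDF is F(x) = x^4 on (0, 1), so the inverse CDF is u^(1/4).
rf <- function(n) runif(n)^(1/4)

set.seed(1)  # arbitrary seed, for reproducibility only
X <- rf(10000)

# Check 1: rf has exactly one formal argument.
n_args_ok <- length(formals(rf)) == 1

# Check 2: X has length 10,000.
len_ok <- length(X) == 10000

# Check 3: compare sample moments with the theoretical values.
theory_mean <- 4 / 5          # E[X]   = integral of 4x^4 over (0, 1)
theory_sd   <- sqrt(2 / 75)   # Var[X] = 2/3 - (4/5)^2 = 2/75
c(mean = mean(X), sd = sd(X))
```

With 10,000 draws, the sample mean and s.d. should land close to the theoretical values, which is why they serve as a cheap correctness signal.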

1.4 Student Scripts

Consider the two student submissions that are included with the package. Student 1 (qn01_scr_01.R) has a model solution. However, student 2 (qn01_scr_02.R) has made a few mistakes:

  • A for loop has been used within the function,
  • The algorithm itself is wrong, and
  • The length of the X vector is wrong.

1.5 Package Output

Here is how the autoharp package can be used to assess the scripts.

library(autoharp)

# populate solution environment
s_env <- populate_soln_env("soln_template_01.Rmd", pattern="test")
   
stud_script_names <- list.files("student_scripts", full.names = TRUE)

# run autoharp function "render_one" on student scripts.
corr_out <- lapply(stud_script_names, render_one, out_dir = "student_out",
                   knit_root_dir = getwd(), soln_stuff = s_env)

do.call("rbind", corr_out)
#> # A tibble: 2 x 10
#>   fname time_stamp run_status run_time run_mem for_loop mean.X  sd.X lenX  lenfn
#>   <chr> <chr>      <chr>         <dbl>   <dbl> <lgl>     <dbl> <dbl> <lgl> <lgl>
#> 1 stud… 2021-05-3… SUCCESS       0.603 3733200 FALSE     0.801 0.163 TRUE  TRUE 
#> 2 stud… 2021-05-3… SUCCESS       0.541 3667624 TRUE      0.477 0.121 FALSE TRUE

The 6th column checks if a “for” loop had been used in the function definition. The next two columns contain the mean and sd of the X vectors from the respective students. The last two columns assess if the length of the created X is 10,000, and if the rf function has only a single argument.

As we can see, the package correctly detected that student 1 did not use a for loop, whereas student 2 did. Student 2’s mean and s.d. (0.477 and 0.121) are also far from the theoretical values, and lenX is FALSE because the X vector has the wrong length.
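As a rough sketch of how such a table could be turned into pass/fail flags, the snippet below rebuilds the relevant columns by hand (values copied from the output above) and applies a tolerance. The tolerance of 0.01, the `script` labels and the `*_ok` column names are hypothetical choices, not part of autoharp.

```r
# Values copied from the output table above; `script` is a hypothetical label column.
results <- data.frame(
  script   = c("student 1", "student 2"),
  for_loop = c(FALSE, TRUE),
  mean.X   = c(0.801, 0.477),
  sd.X     = c(0.163, 0.121),
  lenX     = c(TRUE, FALSE),
  lenfn    = c(TRUE, TRUE)
)

tol <- 0.01  # hypothetical tolerance for the moment checks

# Compare the reported moments with the theoretical mean and s.d.
results$mean_ok <- abs(results$mean.X - 0.8) < tol
results$sd_ok   <- abs(results$sd.X - sqrt(2 / 75)) < tol

# A script passes only if every individual check passes.
results$all_ok <- with(results, !for_loop & mean_ok & sd_ok & lenX & lenfn)
results[, c("script", "all_ok")]
```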

2 What Can This Package Do?

This package would be useful to an instructor who runs a class that requires students to submit short to medium length assignments in R. In those cases, it can assist the instructor in the following ways:

  1. To run all the scripts submitted by students, regardless of what packages they used. It generates an HTML page of thumbnails presenting all the images generated by students, with links to their actual HTML output. (add an example…. )
  2. To generate features from each script that has been submitted. These features fall under three categories:
    • Runtime statistics: How long did it take to run the script and how much memory did the final set of objects take?
    • Correctness statistics: Do the objects generated achieve a basic level of correctness? For instance, the mean of the vector above should be very close to 0.8.
    • R code quality: By representing the R expressions in the script as trees, the package allows one to detect coding styles or expressions with quite a bit of flexibility - hence the ability to detect for loops within the function defined.
  3. To provide a shiny server interface for students to check their code against before they submit it (detailed in another vignette). This allows them to ensure that their code will run on the instructor’s machine and generate the correct output. It helps them avoid problems such as
    • using incorrect relative path specifications,
    • using incorrect names for objects, e.g. case-sensitive mismatches, and
    • accidentally running their solution based on objects in their local environments.
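To give a feel for the code-quality features in item 2, here is a simplified base R stand-in for detecting a for loop inside a function definition. It relies on `all.names()`, which flattens an expression into the names it contains; it is not how autoharp itself represents code, and `uses_for_loop`, `rf_vectorised` and `rf_loop` are hypothetical names.

```r
# Hypothetical helper: TRUE if the body of `fn` contains a call to `for`.
uses_for_loop <- function(fn) {
  "for" %in% all.names(body(fn))
}

# A vectorised definition (no loop).
rf_vectorised <- function(n) runif(n)^(1/4)

# A loop-based definition of the same generator.
rf_loop <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) out[i] <- runif(1)^(1/4)
  out
}

uses_for_loop(rf_vectorised)
uses_for_loop(rf_loop)
```

A tree representation, as used by the package, allows more targeted queries than this flat list of names, for example restricting the search to a particular function definition.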

3 The autoharp Framework

In this section, we detail the framework that the autoharp uses to achieve the tasks above. Before that, we provide an overview of what we envision the instructor has to do. Ideally, he/she simply has to prepare a question paper and a solution template. The autoharp should do the rest.

[Figure: Overview]

3.1 Elements of the Framework

3.1.1 The Question Paper

The question paper details what the students need to create within their submission, which could be a plain R script, or an Rmd file. The required objects could be any R object, such as a data frame, a vector, a list, or a function. The question paper should clearly state the name of the object(s) and their key attributes.

For instance, if a function is to be created, the question paper should specify its name, number of arguments, names of formal arguments and the return value.

3.1.2 The Solution Key or Template

The solution template must be an Rmd file. It is where you specify the things that should be checked about the student script. First of all, it should generate the correct versions of the objects. These “model” objects can then be used to check against student-created objects.

The autoharp package defines two knitr hooks, for use within the solution template:

  • autoharp.objs: This is a character vector of names of objects that are going to be used as the “model” solution. They could be correct values of something the students were asked to evaluate from their data, or a dataset, or a function.
  • autoharp.scalars: This is a character vector of names of objects that will be generated within the chunk. These objects will typically be computed from student-created objects. For example, one could be the output from running a student-written function.

To make things concrete, let’s take a look at lines 19 to 22 of soln_template_01.Rmd:

```{r test01, autoharp.objs=c("rf", "X"), autoharp.scalars=c("lenX", "lenfn")}
lenX <- (length(X) == length(.X))
lenfn <- (length(fn_fmls(rf)) == length(fn_fmls(.rf)))
```

The first hook communicates to the autoharp that the objects rf and X (created in the previous chunk) are to be used as reference objects - we may wish to compare the student versions with these later on. In preparation, the autoharp duplicates them as .rf and .X.

The second one informs the autoharp that the subsequent code, when run on student-created X and rf, should yield two values: lenX and lenfn. These should be returned as part of the “correctness checks” for each student.

Here is another example of a chunk that will return autoharp scalars:

```{r testxx, autoharp.scalars=c("max_X", "min_X")}
max_X <- max(X)
min_X <- min(X)
```

In short, these chunks contain normal R code that utilises the objects created by students. However, they could also contain autoharp code that analyses the structure of student R code. For instance, the following autoharp-specific code would extract the number of calls made to mutate in the student script:

```{r testxxx, autoharp.scalars=c("f1", "mutate_count")}
f1 <- rmd_to_forestharp(.myfilename)
mutate_count <- fapply(f1, count_fn_call, combine = TRUE, pattern="mutate")
```

rmd_to_forestharp, count_fn_call and fapply are autoharp functions. We shall see more about them soon. The variable .myfilename is hard-coded to contain the path to the current student script. It allows the autoharp to access the student file from within the solution script.

Here is a visual summary of what the solution template should contain:

[Figure: Solution template]

Chunks from the solution template will be extracted and stored as a separate temporary R script. This script will be run in a separate process in the student environment, so it should load any libraries that it needs. The chunks that will be copied out are those whose chunk labels begin with the “test” prefix. If you take a look at soln_template_01.Rmd, you will notice that the chunk where libraries are loaded is named test00. This ensures that the chunk is copied out to the temporary script.
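The extraction idea can be illustrated with a few lines of base R. This is only a sketch of selecting chunks by label prefix: `extract_chunks` is a hypothetical helper, the toy Rmd content is made up, and autoharp's own extraction may work differently.

```r
# Build the chunk fence programmatically to keep this example easy to read.
fence <- strrep("`", 3)

# A toy Rmd document, as a character vector of lines (made-up content).
rmd_lines <- c(
  paste0(fence, "{r test00}"),
  "library(stats)",
  fence,
  "Some narrative text.",
  paste0(fence, "{r solution}"),
  "X <- runif(10)",
  fence,
  paste0(fence, "{r test01, autoharp.scalars=c(\"lenX\")}"),
  "lenX <- (length(X) == 10)",
  fence
)

# Hypothetical helper: return the code lines of chunks whose label starts with `pattern`.
# Assumes a well-formed document in which every chunk has a closing fence.
extract_chunks <- function(lines, pattern = "test") {
  starts <- grep(paste0("^", fence, "[{]r[[:space:]]+", pattern), lines)
  ends_all <- which(lines == fence)
  out <- character(0)
  for (s in starts) {
    e <- min(ends_all[ends_all > s])            # first closing fence after the header
    if (e - s > 1) out <- c(out, lines[(s + 1):(e - 1)])
  }
  out
}

test_code <- extract_chunks(rmd_lines, pattern = "test")
test_code
```

Only the code from test00 and test01 is kept; the chunk labelled solution is skipped because its label does not start with the pattern.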

In order to be sure that all solution objects are generated correctly, we should ensure that the solution template can knit. If we need to, this can be done with:

populate_soln_env("soln_template_01.Rmd", pattern="test", render_only=TRUE )

3.1.3 The Student Scripts

The student script can be an R script or an Rmd file. This is where most of the uncertainty enters the process. Student scripts can go wrong in a multitude of ways: they could contain infinite loops, use obscure packages, overwrite your own datasets(!), or call interactive functions such as View.
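A minimal sketch of the defensive idea, in base R: evaluate a script's code in a fresh environment, catch errors, and record how long it took and how much memory the resulting objects occupy. `run_safely` is a hypothetical helper and the "FAIL" label is an assumption. This in-process sketch does not guard against infinite loops or interactive calls, which is one reason to run student files in a separate process.

```r
# Hypothetical helper: run code in a fresh environment and summarise the outcome.
run_safely <- function(code_text) {
  env <- new.env()
  timing <- system.time(
    result <- try(eval(parse(text = code_text), envir = env), silent = TRUE)
  )
  obj_names <- ls(env)
  # Total size in bytes of the objects the code left behind.
  mem <- sum(vapply(
    obj_names,
    function(nm) as.numeric(object.size(get(nm, envir = env))),
    numeric(1)
  ))
  data.frame(
    # "FAIL" is an assumed label for illustration.
    run_status = if (inherits(result, "try-error")) "FAIL" else "SUCCESS",
    run_time   = unname(timing["elapsed"]),
    run_mem    = mem,
    stringsAsFactors = FALSE
  )
}

good <- run_safely("X <- runif(100)")
bad  <- run_safely("stop('something went wrong')")
rbind(good, bad)
```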

3.2 How the Elements Work Together

The job of the autoharp is to run test chunks from the solution script in the student environment. Here are the detailed steps.

  1. First, the solution environment has to be populated. This is where the populate_soln_env function from autoharp comes into play. The inputs to this function are

    • the solution script,
    • a pattern that identifies which chunks are test chunks, and
    • the directory to knit the solution script in.

    The function will first run all the code within the solution script and store the objects in an environment. Let’s call this the soln_env.

    If an object name is listed in the autoharp.objs hook, then a copy of it, with a period prefixed to its name (for example, .X for X), is also placed in the soln_env. These objects can be used in test code. All the chunks whose labels are prefixed with “test” will be extracted and placed in a solution script within a temporary directory. Within this script, all chunks that contain the hook autoharp.scalars will be wrapped within a try expression. The names of all objects listed in the autoharp.scalars hooks will be stored in the solution environment too.

    The return object from this function is a list of length 2, containing the solution environment, and a path to the solution script.

    The items in the solution environment can be listed with this code:

ls(s_env$env, all.names=TRUE)
#>  [1] ".myfilename"      ".rf"              ".scalars_to_keep" ".X"              
#>  [5] "f1"               "for_loop"         "lenfn"            "lenX"            
#>  [9] "mean.X"           "rf"               "sd.X"             "X"

The contents of the solution script s_env$test_fname are:

library(autoharp)
library(rlang)
try_out <- try({
lenX <- (length(X) == length(.X))
lenfn <- (length(fn_fmls(rf)) == length(fn_fmls(.rf)))
})
try_out <- try({
mean.X <- mean(X)
sd.X <- sd(X)
})
try_out <- try({
f1 <- rmd_to_forestharp(.myfilename)
for_loop <- fapply(f1, detect_for_in_fn_def, fn_name = "rf", combine=TRUE, 
                   combiner_fn = function(x) any(unlist(x)))
})

  2. The next step is to render the student script or Rmd into an HTML file. This is done within the render_one function of autoharp. Once the student file has been successfully rendered, the objects it generates are stored in the student environment.

    Rendering of student files is carried out in a separate R process, so that paths are reset for each student. At this point, run-time statistics would already have been generated.

  3. The next step is to prepare for the correctness check. The “model” objects from the solution environment are copied into the student environment. Remember, these should not conflict with what is in the student environment, because their names are prefixed with a period. For instance, the student environment will now contain .X (from the solution template) and X (from the student).

  4. Correctness is assessed by running the temporary solution script (from step 1) within the student environment. The autoharp.scalars are then appended to the runtime statistics to generate a data frame with one row for each student script.
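The copying and checking steps above can be mimicked with plain R environments. This is a sketch under stated assumptions, not autoharp's implementation: `soln_env` and `student_env` are hand-built stand-ins, the student's X is deliberately given the wrong length, and the test code is a shortened version of the chunks shown earlier.

```r
# Stand-in for the solution environment: a period-prefixed model object
# and the names of the scalars to report.
soln_env <- new.env()
soln_env$.X <- numeric(10000)
soln_env$.scalars_to_keep <- c("lenX", "mean.X")

# Stand-in for the environment produced by rendering a student script.
student_env <- new.env()
student_env$X <- runif(5000)^(1/4)   # wrong length, on purpose

# Copy the model objects across; the period prefix avoids name clashes.
for (nm in c(".X")) {
  assign(nm, get(nm, envir = soln_env), envir = student_env)
}

# Run the test code inside the student environment, wrapped in try()
# so that an error does not abort the whole run.
test_code <- quote({
  lenX <- (length(X) == length(.X))
  mean.X <- mean(X)
})
try_out <- try(eval(test_code, envir = student_env), silent = TRUE)

# Collect the requested scalars as a one-row data frame (NA if missing).
scalars <- mget(soln_env$.scalars_to_keep, envir = student_env,
                ifnotfound = list(NA))
as.data.frame(scalars)
```

Binding such one-row data frames together, one per student script, gives a table of the kind shown in the motivating example.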

    This figure contains a more detailed breakdown of an instructor’s workflow when using the package. The instructor essentially needs to prepare a question paper and the solution template; ideally, populate_soln_env and render_one should do the rest.

[Figure: Instructor workflow]

This final figure zooms in on the tasks that render_one accomplishes: