Tuesday, December 26, 2017

What we learned in CS 222, Largent's Fall 2017 edition

My colleague Paul Gestwicki designed our CS 222: Advanced Programming course and first taught it during the fall 2010 semester. Most semesters since then, he has blogged about an exercise performed as a final exam to identify what the students felt they learned during the course. You may review his fall 2017 blog entry, if you wish. I've had the pleasure of teaching the course since fall 2015. Having just started to blog this fall, I now have my first opportunity to create a blog post similar to Paul's. So here goes...

The final exam process for CS 222 consists of three parts. First, the students are asked to list everything they can think of that they learned--directly or indirectly--because of their participation in CS 222 this semester. Second, once the consolidated list is compiled (and cleaned of duplicates), each student casts a limited number of votes to identify the items they feel are the most significant. Lastly, each student selects one of the items that received the most votes and writes a reflective essay about how they learned it.

The consolidated list contained 103 items (after removing duplicates). Each student had seven votes to cast for the items they felt were the most significant for them. Because of a tie for seventh place, we included eight items in our "Top N List," which ended up including these items:
  • Clean code techniques (15)
  • TDD (15)
  • Stack Overflow (14)
  • It’s easier to take the time to write cleanly than to write dirty code and fix it later. (8)
  • Code is like art. Anyone can look at code and call it beautiful, but it’s rare to find someone who can tell you why. (7)
  • Time-management is a key factor in a lot of things (7)
  • Consider what the needs of the end user are when developing functions, rather than developing functions based on satisfying your own goals for your application. (5)
  • Navigation of GitHub (5)
The students in Paul's fall 2017 section of the course developed this list:
  • TDD (13)
  • Good names (12)
  • Refactor vs. Redesign (7)
  • Self-reflection (7)
  • git (6)
  • Objects vs. Data Structures (6)
  • Time management (6)
Comparing the two lists, I see many similarities. Both include references to clean code, TDD, Git, and time management. As Paul mentions in his blog post, clean code and TDD are major themes of the course, and both are completely new concepts to the students; as such, they struggle with them. Similarly, the use of Git and GitHub is new to virtually all students and causes its own challenges until they take the time to figure it out. Time management is a common issue for students (and many of us!); seldom are they able to accurately estimate how much time it will take to develop their projects. Because of the nature of the course (with some limited scaffolding, we "throw them in the deep end of the pool and ask them to swim"), it is not surprising that students would feel the use of Stack Overflow is significant.

I found the "Code is like art." item particularly interesting. One of the course's essential questions is "What is software craftsmanship?" Perhaps the discussions we held about that throughout the semester struck a chord with some of the students more than I realized. User needs versus your own project goals is an item I haven't seen show up before, but it may reflect a few of the groups' experiences when they performed user acceptance testing.

There were a few items that I found interesting but that didn't receive enough votes to make the "Top N List."
  • How much I’ve learned since CS 120
  • Reflecting on yourself can prove to be useful
  • Must make an effort to learn from past mistakes the first time I make them
  • Asking for help and needing help are two different animals
These items all focus on learning and our human tendencies: what has been, is being, and will be learned, and our inclination to avoid doing what we need to do to learn. That a few students observed this in themselves, and that a few even felt it was significant (as evidenced by their votes), is a good thing.

Comparing this semester's course grades to those of the spring semester, the grades appeared to drop, though not by what I'd consider a significant amount. There were fewer As, more Bs and Cs, and fewer Ds.

Grade   Spring 2017   Fall 2017
A       52%           42%
B       41%           48%
C       3.5%          10%
D       3.5%          0%
F       0%            0%

I did change how grades were calculated this semester, having switched to specifications grading, but the assessable items did not change--only how they were evaluated. I reflected on my experiences with this transition in my previous blog entry. Some of the overall grade change might be attributed to this switch, and the new grades may be a slightly more accurate representation of learning and effort than those from the previous point system, which tended to "average" my assessment of their learning. The students did show evidence of learning and growth in their final exam reflective essays.

Sunday, December 24, 2017

According to specs…


During the fall 2016 semester, I participated in a multi-session faculty learning community that explored what specifications grading is, how it might be implemented, and what it might do for our students and us. In her book Specifications grading: Restoring rigor, motivating students, and saving faculty time, Linda Nilson claims that specifications grading can make grading faster, clearer, and more precise, allowing the instructor to focus more on promoting improvement in students, rather than worrying about justifying a grade.

The basic tenets of specifications grading are to provide a very clear, detailed description of what is expected of the student for a given assessment (the specification), and then to evaluate their submitted work as complete or incomplete against that specification—they either met the specification, or they did not. Simply providing clear expectations can make a significant difference in a student’s ability to submit what you are looking for. It almost seems too good to be true.

Historically, I have assigned points to everything students do for a course and then determined their course grade based on their total earned points compared to what was possible to earn. This means a student can do poorly on some assignments (or not do them at all), do great on others, and then depend on the “extra” points from the great assignments to offset the lack of points on the poor ones. This effectively generates an “average grade,” but does not necessarily reflect what the student accomplished. Students were also inclined to beg for partial credit.

I decided that if even one of the claims Nilson made in her book title was true, it would be worthwhile to implement specs grading. My first opportunity was a discussion-based course (CS 200: Computers and Society) during the spring 2017 semester. I converted my evaluation of all assessments to complete/incomplete—they met the detailed specifications I provided, or they did not. The one exception I made was for the exams, although even here I still altered my approach. Previously, I often awarded partial points for a response that contained some, but not all, of the answer I was looking for. For this course, I evaluated each question on a complete/incomplete basis—full points or zero. Since most of the assessments were all-or-nothing, I did offer the students limited opportunities to resubmit incomplete assignments. In the absence of points to add up, I determined the final course grade based on how many assessments in each category the student completed. For example, if a student successfully completed at least five of six quizzes, that earned them an A, provided they had an A in all other assessment categories. If they only passed four, they dropped to a B, and so on.
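The category-threshold scheme described above can be sketched in code. This is only an illustrative sketch: the category names ("quizzes", "essays") and the threshold counts are hypothetical placeholders, not the actual course rubric, and the real syllabus surely had more categories and rules.

```python
# Hypothetical sketch of a specifications-grading scheme. Category names
# and thresholds are illustrative, not the actual CS 200 rubric.

# For each category: the minimum number of completed items required
# to earn each letter grade (checked from best to worst).
THRESHOLDS = {
    "quizzes": {"A": 5, "B": 4, "C": 3, "D": 2},
    "essays":  {"A": 4, "B": 3, "C": 2, "D": 1},
}
GRADE_ORDER = ["A", "B", "C", "D", "F"]

def category_grade(category: str, completed: int) -> str:
    """Best letter grade whose threshold the completion count meets."""
    for grade in GRADE_ORDER[:-1]:
        if completed >= THRESHOLDS[category][grade]:
            return grade
    return "F"

def course_grade(completed: dict) -> str:
    """The final grade is the lowest grade earned across all categories,
    since an A requires an A in every category."""
    grades = [category_grade(cat, n) for cat, n in completed.items()]
    # The "largest" index in GRADE_ORDER is the worst grade earned.
    return max(grades, key=GRADE_ORDER.index)

# A student who completes 5 of 6 quizzes (A level) but only 3 essays
# (B level) earns a B overall--no "averaging" across categories.
print(course_grade({"quizzes": 5, "essays": 3}))  # B
```

Note how this differs from a point system: a strong category cannot buy back a weak one, which is exactly the "no averaging" property described above.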

I found it critical to clearly (and repeatedly) explain to the students why I was grading this way, and what the benefits were for them and their learning. I made minor adjustments, based on my experiences and student feedback, and implemented specs grading for the same course during the summer 2017 semester as well. I have just completed the fall 2017 semester, during which I implemented specs grading in two more courses. One of these courses (CS 239: Social and Professional Issues) I implemented the same way I did the CS 200 course during the spring 2017 semester. For the second course (CS 222: Advanced Programming), I used specs grading on all assessments except the programming projects. For the projects, I continued to assign points based on how close to a correct solution they were. The projects were their own category of assessment items, with a different minimum number of points the student had to accumulate for each letter grade.

Reflecting on my experiences teaching courses utilizing specifications grading, I observed a variety of positive outcomes, including the following.
  • Students did not beg for missed points, although there were a few (very few!) times when a student asked if I would consider passing a given assignment, rather than the failing evaluation I had provided.
  • As a whole, final course grades stayed substantially the same in the specifications-graded courses, compared to before I implemented specifications grading. I believe, however, that the grades more accurately reflect the level of course content mastery. That is, if a student received a course grade of “B”, they had mastered almost all of the course material at a “B” level or better, as opposed to some material at an “A” level and other material at a “C” level.
Not all was perfect, however. My experiences and student feedback did highlight some areas that needed adjustment, including the following.
  • I needed to provide a better explanation and justification for using specifications grading at the start of the semester.
  • I did not originally provide enough detail in a few of the specifications. Others contained minor ambiguities.
  • Specifications grading has not significantly reduced my grading time. In large part, this is because I want to provide the student with as much feedback as possible as soon as possible, rather than stopping my evaluation as soon as I find a specification they did not meet.
  • Some students became discouraged and gave up when they failed too many assignments.
Thus far, students tend to either like or dislike the use of specifications grading; there is not much middle ground. Some student quotes from the end-of-semester course evaluation may serve to illustrate this. Students in the CS 222 course offered these comments about specifications grading (I list the positives, and then the negatives):
“The grading system was very clear. Everyone knew what needed to be done for a specific grade.”
“Unique grading system, which might seem daunting at first, but is overall a worthwhile system.”
“Grading system is a complete mess, I know it's newish, but it really doesn't have a place in an advanced programming class.”
“Specification grading encouraged me to not complete more achievements because I knew that I could not get a grader higher than a C because of my project scores. In that regard, I think standard weighted grading systems are better as every assignment counts. In this system, students will always have incentives to work hard on every remaining assignment because they all impact the student's grade.”
The student comments in the CS 239 course were nearly all on the negative end of the scale.
“The fact that you could have one hundred percent on everything throughout the semester, but then get a B on the final and it would bring your final grade down to a B was kind of ridiculous, especially for a one credit hour class”
“This course has a horrible grading system. Pass/Fail for sections would have been alright, if they were then averaged.”
“The grading system needs revision.”
“I hate this grading scale. People are naturally going to be better in certain areas and it seems unfair for their skill in those areas not matter if they don't do well in another area. Because of the specification grading I did only the bare minimum on all of the assignments because I knew there was less weight placed on my actual skill on that assignment.”
“This course has a clear grading system.”
Interestingly, if we look at the student responses to the Likert-scale questions that are part of the course evaluation, we get a much more positive response for both courses. The response rate for CS 222 was 20 of 31 students, or 64.5%, and for CS 239 was 22 of 30, or 73.3%. The questions related to specifications grading were the following:
  • The use of specification grading in this course encouraged me to complete more of my work well, rather than depending on a good grade on one item to cover lesser performance on another item.
  • The use of specification grading in this course encouraged me to attend class regularly, since I knew it would have a direct impact on my final course grade.
  • I felt like I could determine my final course grade (or my path towards it) more accurately at any time during the semester because of the use of specification grading.
  • Once I got used to it, I wish more instructors used specification grading for their courses.
All of these questions had a combined strongly agree and agree response beyond 65%, with many of them at 80% or better. Apparently, those who disliked specifications grading were more inclined to provide comments.