Friday, March 6, 2026

But why specifications grading? What's wrong with points?

I use Specifications Grading for the courses I teach. This grading approach does not assign any points to assessable items, so there are no points to add up at the end of the semester to determine the final course grade. Each assessable item is either determined to have met the specifications I provided for it (in which case the item is marked “complete”) or to have not met them (in which case the item is marked “incomplete”). The assessable items are grouped into a variety of categories (depending on the course). To earn a particular final course grade, a learner must “complete” a specified number of items in each category, with lower grades typically requiring fewer “complete” items in each category.
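As a rough sketch, the end-of-semester bookkeeping might look something like the following Python snippet. The category names and thresholds here are made up for illustration, not taken from any actual course of mine:

```python
# Hypothetical per-grade minimums: how many "complete" items a learner
# needs in each category. Categories and counts are illustrative only.
GRADE_THRESHOLDS = {
    "A": {"quizzes": 10, "projects": 4},
    "B": {"quizzes": 8, "projects": 3},
    "C": {"quizzes": 6, "projects": 2},
}

def final_grade(completed: dict[str, int]) -> str:
    """Return the highest grade whose per-category minimums are all met.

    `completed` maps a category name to the number of items the learner
    has marked "complete" in that category.
    """
    for grade in ("A", "B", "C"):
        needs = GRADE_THRESHOLDS[grade]
        if all(completed.get(cat, 0) >= n for cat, n in needs.items()):
            return grade
    return "F"

print(final_grade({"quizzes": 9, "projects": 3}))  # B
```

Note that nothing is averaged: a learner who completes 9 quizzes and 3 projects earns a B because they cleared every B threshold, not because some weighted sum landed in a B range.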

But why!?

I sometimes get asked why I use specifications grading. This is usually followed with the question, “What’s wrong with points? That’s what we’ve been using since I was in school, and what today’s students are familiar with.” Let's explore these questions a bit.

Bucket of points

With a traditional point-based grading system, there is no way to _guarantee_ that a course’s student learning outcomes (SLOs) are met, unless the _only_ things that are _ever_ assessed are the SLOs. And even in that scenario, there is no (practical) way to know whether a learner met _all_ of the SLOs (and if not, which ones they did meet), unless they earn 100% of the possible points for the semester. Because all points go into the same “point bucket” to determine the final course grade, they get mixed together and “averaged,” and the ability to identify what the learner has shown evidence of learning is lost. Areas the learner does well in (earning lots of points) help mask areas they may not have learned at all, because the “extra” points from the good areas fill in the voids. As such, it is possible that we have learners completing a course with a grade of C having fully met very few (maybe none?) of the SLOs. Further, if they earned a B or C (or maybe even an A), we have no way of knowing which, or how many, of the SLOs were met.

Just throw them in the bucket...

Consider the following table showing scores for three learners. Do you consider all three of these learners B students? They all ended the semester with 82.5% of the possible points.

A table showing scores for three learners who all have earned 825 out of 1000 points. However, one of the learners is consistent, getting about 80% of the points for every item. The other two learners have erratic scores on the various items. One does well, but misses some items completely. The other does poorly at the start of the semester, but finishes strong.
As I look at this data, I see three very different learners. On the one hand, Benny has pretty consistent scores on everything, in each case earning 80-83% of the possible points. Hoton, on the other hand, starts out very well, but then misses a couple of quizzes, and earns marginal scores on Exam 1 and the Final Exam. On the third hand, Dee doesn't do very well on the first few quizzes and Exam 1, but then does very well towards the end of the semester. 

I suspect we can agree that Benny is a B student. We may not be able to agree about Hoton and Dee, however. If we just look at the Final Exam score, it's not clear that Hoton has learned the course material at a B level, since they only earned 71% of the possible points. Did they not learn the material, or did they just have a bad test day? However, again looking at the Final Exam score, Dee did better than Benny, earning 90% compared to Benny's 82%. Could that mean Dee simply needed more time to master the content, and maybe her Final Exam score indicates she should be considered an A student? Following similar logic, is Hoton a C student? Or do we declare all of them B students because they all earned 825 points? Because all of the points are dumped into the same bucket, we have an "average" of the semester, and lose the details that might be beneficial in making decisions.

Let's fly across the country...

A plane in mid air. There is a heading at the top that reads "Would you get on a plane if..."
Before I describe this scenario to you, let me assure you that I do know this is not how airplanes are designed. But please hear me out, and just consider it an example, however contrived it may be.

Let's imagine there is a school you can attend where you take four courses and are then considered fully qualified to design an entire airplane, with no minimum grade needed to pass a course. Further, let's imagine two particular engineers have taken these four courses, and both graduated from the program with identical 3.0 GPAs. Upon graduation, each engineer designed an airplane, and airplanes have been built to each engineer's specs. Does it matter to you which airplane you get on to fly across the country? Based on each engineer's GPA--the only thing you can base your decision on--your answer is likely that it does not matter.

But what if I told you that Engineer 1 earned course grades of A, A, A, and F, and Engineer 2 earned all Bs? Does that added information sway you one way or the other? Many people choose the airplane that Engineer 2 designed, even though they did not earn any As, because Engineer 1 earned an F. But now, what if you knew what the four courses were? Could that make a difference in your decision? If Engineer 1's F was in wing design, engine design, or cockpit design, you'd almost certainly choose Engineer 2's plane. But if Engineer 1's F was in a class about seat and interior design, you might opt for their plane, since you'd know that the engine, wings, and cockpit should be based on A-level work. You just might have an uncomfortable seat. Being able to know those details is important. Having only the average grade (the GPA) of the four courses hides the information you need to make the best decision.

Binned countable items

Specifications grading can provide a solution to the “problems” I just described. If the final course grade specifications are structured to do so, a particular final course grade can tell you exactly which, or how many, SLOs were met by a learner. For example, let’s assume a course has 5 SLOs. The specs could be configured such that a final course grade of C means the learner met at least 2 of the 5 SLOs, a B means at least 4 of the 5, and an A means they met all 5. If structured even more carefully, the specs could even be configured such that a grade of C would indicate exactly which 2 SLOs were met. I see this as a benefit! It allows a final course grade to mean something specific. Using the example I presented above, if a learner earns a B we can know they showed proficiency in at least 4 of the 5 SLOs, and maybe even know what they did not show proficiency in. For a points-based-graded course, all we can say is they earned at least 800 out of 1,000 points, with little idea of what they actually learned.
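The count-based version of this mapping is simple enough to express in a few lines of Python. The thresholds below follow the hypothetical 5-SLO example above, not any specific course:

```python
def course_grade(slos_met: int, total_slos: int = 5) -> str:
    """Map a count of SLOs met to a final course grade.

    Uses the hypothetical example thresholds: A means all SLOs were met,
    B means at least 4 of 5, and C means at least 2 of 5.
    """
    if slos_met == total_slos:
        return "A"
    if slos_met >= 4:
        return "B"
    if slos_met >= 2:
        return "C"
    return "F"

print(course_grade(4))  # B
```

A learner's grade here answers "how many SLOs did they meet?" directly; the more careful specific-SLO variant would instead check membership in a required set of SLOs for each grade.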

For me, points just don't add up!

Ever since I realized that a point-based grading system hides deficiencies as well as strengths in an average, I've been very uncomfortable using it. Fortunately, I discovered specifications grading's benefits many years ago, and have been using it ever since. There is no going back for me, because points just don't meet my specs.

3 comments:

  1. David, how does SG show where the deficit is if all one sees is a letter grade at the end?

    1. Fair question. It depends on how the specs are set up that define a given course grade.

      Using the example from my post, where there are 5 SLOs, let's assume that...
      -- C is defined as having met (at least) SLOs 1 & 2,
      -- B is defined as having met SLOs 1-4, and
      -- A means all 5 SLOs were met.

      Given this setup, if a learner ends up with a course grade of B, we know they met SLOs 1-4, but not 5; otherwise they'd have an A. If a learner earns a C, we know they met SLOs 1 & 2 but did not meet at least one of SLOs 3 and 4, maybe neither. The C is a bit ambiguous about which SLOs were not met, but we know that at least one of them was not met.

  2. This is a very interesting approach. It is very thoughtfully considered. It clearly attempts to balance "Mastery of Material" and analytical assessment. I've been doing volunteer tutoring for many years. I am amazed at how HS students are allowed late assignments and test "retakes" with little consequence. I understand it is to help their mastery of the subject. I like the way your assessment strikes a balance between the "old" and "new" ways of assessment.
