Autograding the structure of code assignments for CS Education and code bootcamps using CodeGrade and Semgrep without an AST.
April 4, 2022

Webinar: Autograding Code Structure using Semgrep

In 30 seconds...

In this webinar we discussed:

  • Why you should start autograding code structure;
  • What the tool Semgrep is;
  • How Semgrep works together with CodeGrade;
  • The basics of Semgrep patterns and rules;
  • Three step-by-step examples of autograding code structure in CodeGrade.

Learn all about it in this article or watch the webinar here!

In our latest webinar, we tell you everything you need to know about autograding code structure using Semgrep in CodeGrade, including many practical step-by-step examples. This webinar was part of our monthly CodeGrade Webinars series; it was recorded live on April 1st, 2022 and is now available on demand.

Semgrep and CodeGrade

Traditional linters, like pylint for Python or eslint for JavaScript, are easy to use in CodeGrade and work well for enforcing general language standards, but they are not designed for checking specific code structures. Semgrep is a static analysis tool that checks the structure of code against simple patterns you provide. Originally designed to find security vulnerabilities, Semgrep is an open-source tool by the software security company r2c (originally developed at Facebook). It supports many programming languages, including Go, Java, JavaScript, Python and Ruby, with languages like PHP and C currently being beta-tested.

With Semgrep, you write simple YAML configuration files containing patterns that look for specific structures in code. In the webinar, Devin goes over the basics of these patterns and rule files. You can also find more information in Semgrep's official documentation: https://semgrep.dev/docs/. Using these configuration files is far easier and more portable than writing your own script that parses the AST (Abstract Syntax Tree) each time you want to assess code structure.
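
To make the comparison concrete, here is a minimal sketch of what the hand-written alternative could look like for a single check (does a submission import pandas?), using Python's standard `ast` module. The function name `imports_module` is made up for illustration; it is not part of CodeGrade or Semgrep.

```python
import ast


def imports_module(source: str, module: str) -> bool:
    """Return True if `source` imports `module` (hypothetical helper)."""
    tree = ast.parse(source)  # parses only; the submission is never executed
    for node in ast.walk(tree):
        # Plain imports: `import pandas` or `import pandas as pd`
        if isinstance(node, ast.Import):
            if any(alias.name.split(".")[0] == module for alias in node.names):
                return True
        # From-imports: `from pandas import DataFrame`
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like `from . import x`
            if node.module and node.module.split(".")[0] == module:
                return True
    return False


submission = "import pandas as pd\nprint(pd.__version__)\n"
print(imports_module(submission, "pandas"))  # True
```

Even for this one check, the script has to know about several node types, and it only works for Python. A Semgrep rule expresses the same intent in a few lines of YAML.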

Finally, as mentioned in the webinar, a great place to try out your patterns is Semgrep's Playground, which can be found here: https://semgrep.dev/playground.

CodeGrade has built-in support for Semgrep in its Unit Test step and has made Semgrep into an education-ready tool. Specifically for education, we have added the `match-expected` field to the rule YAML, which you can use to look for both wanted and unwanted structures.
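
One way to think about `match-expected` is sketched below in plain Python. This is an assumed reading based on how the field is used in the examples in this article, not CodeGrade's actual implementation, and the function name `check_passes` is hypothetical.

```python
def check_passes(pattern_found: bool, match_expected: bool) -> bool:
    """Assumed semantics: a check passes when Semgrep's result agrees
    with what the rule says it expects."""
    return pattern_found == match_expected


# Wanted structure (match-expected: true): passes only if the pattern is found.
print(check_passes(pattern_found=True, match_expected=True))    # True
# Unwanted structure (match-expected: false): passes only if it is absent.
print(check_passes(pattern_found=True, match_expected=False))   # False
print(check_passes(pattern_found=False, match_expected=False))  # True
```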

Step by Step Examples

Below, you can find the example YAML configuration files that we used for the three examples in the webinar:

Example 1, checking for imports:

```yaml
rules:
- id: pandas-import
  match-expected: false
  pattern: import pandas
  message: You are not allowed to use pandas in this simple assignment!
  severity: INFO
  languages:
    - python
```

The file above, called `import.yml`, can be uploaded as a fixture and used in your Python assignments right away.

Example 2, checking for for-loops:

In the webinar, something went wrong during the live demonstration of example 2. We later found out that this was caused by a bug in Semgrep, not by a typo. We are currently upgrading our Semgrep installation in the hope that this will resolve it. The bug was specific to Java, however, and the Python YAML configuration below works for the same purpose.

```yaml
rules:
- id: for-loop
  match-expected: true
  pattern: |
    for $EL in $LST:
        ...
  message: A for-loop was used
  severity: INFO
  languages:
    - python

- id: no-while-loop
  match-expected: false
  pattern: |
    while $COND:
        ...
  message: No while-loop was used
  severity: INFO
  languages:
    - python
```
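
To illustrate what these two patterns look for, here are two hypothetical student submissions (the function names are made up for this example). The first contains a for-loop and no while-loop, so it has the structure both rules ask for; the second computes the same result with a while-loop, which the `no-while-loop` pattern would find.

```python
def total_length(words):
    """Sum the lengths of all words using a for-loop."""
    total = 0
    for word in words:  # fits `for $EL in $LST:` with $EL=word, $LST=words
        total += len(word)
    return total


def total_length_while(words):
    """Same result, but written with a while-loop."""
    total, i = 0, 0
    while i < len(words):  # fits `while $COND:` with $COND=i < len(words)
        total += len(words[i])
        i += 1
    return total


print(total_length(["semgrep", "codegrade"]))        # 16
print(total_length_while(["semgrep", "codegrade"]))  # 16
```

Both functions return the same output, so an input/output test alone cannot tell them apart; the structural check is what distinguishes them.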

Example 3, checking for function and variable names:

In my opinion, this is one of the easiest yet most effective use cases of Semgrep in CodeGrade. In any assignment that requires students to use specific names for your (unit) tests to work, you can add a Semgrep check before those tests to verify that the naming is correct. As a result, students who make a naming mistake get a clear and helpful message from Semgrep instead of a complicated error message from your (unit) test, preventing confusion past the deadline.

```yaml
rules:
- id: function-name
  match-expected: true
  pattern: |
    def calculate_weight(...):
      ...
  message: You are using the function called calculate_weight().
  severity: INFO
  languages:
    - python

- id: variable-name
  match-expected: true
  pattern: bestsellers = $X
  message: You used the right variable name.
  severity: INFO
  languages:
    - python
```
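
As a rough illustration of why the naming check helps, consider the hypothetical submission and unit test below (the data and the test are invented for this sketch). The test only works because the student used exactly the names `calculate_weight` and `bestsellers`.

```python
# Hypothetical student submission
bestsellers = [("Dune", 1.2), ("Emma", 0.4)]  # fits `bestsellers = $X`


def calculate_weight(books):  # fits `def calculate_weight(...):`
    """Return the total weight in kg of (title, weight) pairs."""
    return sum(weight for _title, weight in books)


# Hypothetical unit test that depends on the names above
def test_calculate_weight():
    assert abs(calculate_weight(bestsellers) - 1.6) < 1e-9


test_calculate_weight()
print("ok")
```

If the student had named the function `calc_weight` instead, the test would fail with a `NameError`, which says little about what the assignment expected. Running the Semgrep check first lets the student see a message about the naming itself.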

Want to read more about Semgrep? You can also take a look at our Help Center article here!
