Step 1: Import libraries

Step 2: Setup BERT

Step 3: Load the ICNALE Corpus data

Step 4: Divide the data up into training, validation, and test sets

Step 5: Preprocessing Function

Step 6: Set up the metrics we use for model training and evaluation

Step 7: Display the distribution of CEFR scores in the training, validation, and test sets

Step 8: Set up the training scheme

Step 9: Run the Trainer

Step 10: Evaluate the final model on the held-out test dataset

Step 11: Examine the confusion matrix for this model

Step 12: Reset model to run a regression

Step 13: Redefine the preprocess function

Reset the data using the new preprocess function

Define metrics for regression

The metrics we use for regression are different from the metrics we use for classification. The error for each example is the predicted value minus the actual value. MSE (mean squared error) squares each error and takes the average, which penalizes larger errors more heavily. MAE (mean absolute error) takes the absolute value of each error and averages those, so large errors are not penalized as severely. R2 (R squared) is the proportion of variance in the true scores that the predictions account for; for a simple linear fit it equals the squared correlation between predicted and true values, so it drops as the relation between actual and predicted scores weakens.
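The three metrics described above can be sketched in plain Python as follows. This is an illustration only, not the notebook's actual metric code: the function name `regression_metrics` and the sample scores are hypothetical, and R2 is assumed to be computed as one minus the ratio of the residual sum of squares to the total sum of squares. In practice, scikit-learn's `mean_squared_error`, `mean_absolute_error`, and `r2_score` would likely be used instead.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, and R2 for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    n = len(y_true)

    # Error for each example: predicted value minus actual value.
    errors = [p - t for p, t in zip(y_pred, y_true)]

    # MSE: average of squared errors (large errors dominate).
    mse = sum(e * e for e in errors) / n

    # MAE: average of absolute errors (large errors count linearly).
    mae = sum(abs(e) for e in errors) / n

    # R2 (assumed definition): 1 - SS_res / SS_tot.
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mse": mse, "mae": mae, "r2": r2}


# Hypothetical scores: three exact predictions and one that is off by 2.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

With these made-up scores the single error of 2 gives an MSE of 1.0 but an MAE of only 0.5, which shows how squaring weights a large miss more heavily than taking its absolute value.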