After looking at George Leckie's recent exploration of the relationship between the DFE's new measure that includes the average grade based on the best 8 GCSE's I too thought I would have a look. Have a read here:

Click on the links below, to see the results and to follow the methodology for yourself.


To do this manually, I would firstly need to download the current performance tables from here:

The file when opened looks like this:

Then I converted it into an Excel file. If you are not familiar with converting CSV files to Excel, you can download the Excel file.

To find the information that I would need to use, I also would need to download the metadata file from here:

I then identified that I need two columns of data from the KS4 performance tables. These are 'Percentage of pupils achieving 5+ A*-C or equivalents including A*-C in both English and mathematics GCSEs' or the data label 'PTAC5EM' at column 52 to be precise. In addition, I need to use data in the 'Average grade per pupil capped at the best 8 qualifications' column (data label: AVGRDPEPPCP or column 415).

As I want to create a scatter plot to show this relationship, I need to convert every item of data in the AVGRDPEPPCP column from the performance tables into a point score. To do this I used the following table:

In converting the 'Average grade per pupil capped at the best 8 qualifications' for each school, I was then able to plot this against the Percentage of pupils achieving 5+ A*-C or equivalents including A*-C in both English and mathematics GCSEs data for each school.


The visualisation created resembled George Leckie's diagram (left) with a few exceptions (see right).

Diagram 3 from (accessed 13/2/14) Diagram from (accessed 13/2/14)

To remove some of the discrepancies, I then adjusted the filter to include only average grade score of 30+ which resulted in a significantly closer visualisation (as shown below) to that produced by Leckie (as above left). I did not remove other schools from this view (Leckie identifies that 8 outliers were removed in total and based on the representation below, I have only removed 1- however, it is quite clear which were removed and it is easy to speculate on the reasons why).


With the above representation including all schools, schooldata could be used to identify an individual school using this representation or even groups of schools. Having completed the activity using Excel it took over an hour to reproduce the above (18 minutes to download the performance tables, 5 minutes to identify which columns to use, 25 minutes to organise the data, remove columns in excel and run the scatter plot, 15 minutes to make adjustments to the visual representation of the scatter plot and to include a macro to allow me to view the names of schools). The following chart shows this comparison:

To provide a balanced evaluation, the chart produced by George Leckie does include grades as opposed to point scores along the vertical axis. However, the visualisation in does allow you to identify a particular school or make comparisons against groups of schools.
In addition, it may take some people less time to do the above (either using Excel or other tools), I nevertheless hope it gives a clear understanding of what can do.

In total, the above visualisation took only a few minutes using schooldata. Watch the real-time video to see how allows this to be done easily. It is only about 4 minutes long and includes the rationale for my doing this visualisation.

Using school data - video

To keep on top of current developments, access some free tools and find out more, create an account by following this link.

You can also follow me on twitter (@philbourne77).