11 April 2020 | on Dejan Lovren: the SQL

We return to the important matter of considering the goalscoring exploits of Liverpool and Croatia central defender Dejan Lovren.

In my previous blog post, I referred to the team top scorers charts on the BBC Sport website, noting that the statistics include minutes played (and assists provided) in only those competitions in which a player has scored. For players such as Lovren who have scored in a competition in which they have played relatively few minutes (the Champions League) and also played many minutes in at least one other competition (the Premier League) without scoring, this creates a misleading impression of prolificness.

I advised BBC Sport and Opta, which supplies its data, of this apparent error. Neither responded, and the data calculation has not changed, so I assume they are content with the way things are.

But according to these statistics, Lovren is currently (courtesy of his single goal) having a more prolific season, in terms of minutes per goal, than players including teammate Roberto Firmino (11 goals) and Kevin De Bruyne, Riyad Mahrez, James Maddison and Dele Alli (9 each).

I decided that looking at Lovren's figures again would be a good way to practice basic SQL commands (specifically, PostgreSQL). Here's the data I entered, relating to all of his appearances this season until the coronavirus-inspired suspension (competitions are League Cup, Premier League, Champions League and FA Cup):

Dejan Lovren 2019-20 stats by match

Summing Lovren's total minutes, goals and assists for the season so far gives:

Dejan Lovren 2019-20 sums

Here's a reminder of the BBC Sport figures:

BBC Sport Liverpool Top Scorers, 11 December 2019

As previously noted, the difference is due to the BBC Sport statistics including only competitions in which a player has scored; thus its total number of minutes is much smaller (and the assist is missing).

Grouping Lovren's sums by competition clarifies:

Dejan Lovren 2019-20 sums by competition

Lovren played 233 minutes in the Champions League, and his assist came in the Premier League (in which he has not scored).

At the 2018 World Cup, Lovren was (in)famously quoted as saying, "I think people should recognise that I'm one of the best defenders in the world." Perhaps he can now also claim to be a goal machine.




11 December 2019 | on Dejan Lovren's missing minutes

Congratulations to Liverpool Football Club on qualifying for the knockout stages of this season's UEFA Champions League with a victory over Red Bull Salzburg yesterday.

Looking at the Top Scorers statistics for the club on the BBC Sport website, some of the figures didn't look right to me. A good example relates to the Croatian defender Dejan Lovren:

BBC Sport Liverpool Top Scorers, 11 December 2019

According to BBC Sport, he has played 233 minutes this season, i.e. the equivalent of just over two and a half full matches.

But BBC Sport's own individual match reports state that he played the full 90 minutes in each of 10 matches as well as parts of two further matches.

The Premier League's official statistics page states that he has played 670 minutes this season in the Premier League alone:

Premier League Dejan Lovren minutes played stats, 11 December 2019

What is going on with the numbers?

The only goal that Lovren has scored so far this season came in the Champions League match against Napoli on 27 November. He played the full 90 minutes in that match and also against Genk (23 October) as well as 53 minutes against Red Bull Salzburg yesterday, his only other appearances in the competition so far. 90 + 90 + 53 = 233 minutes played in Champions League matches.

It seems that the statistics count minutes played only in competitions in which the player has scored. Evaluation of other players' figures seems to confirm this. It seems a strange restriction to make; I assume it is an error.
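
A minimal Ruby sketch of that hypothesis, using only the figures quoted in this post. It is an illustration of the suspected rule, not BBC Sport's or Opta's actual calculation. The helper names (`total`, `scoring_competitions`) are made up for the example, the Champions League assist count of zero is an assumption, and competitions for which no minutes are quoted here are left out.

```ruby
# Per-competition totals as quoted in this post (11 December 2019).
# Assumption: no Champions League assists. Other competitions are omitted
# because the post gives no minutes for them.
COMPETITIONS = [
  { name: "Champions League", minutes: 90 + 90 + 53, goals: 1, assists: 0 },
  { name: "Premier League",   minutes: 670,          goals: 0, assists: 1 }
].freeze

# Sum one field (e.g. :minutes) over a list of competition rows.
def total(rows, field)
  rows.sum { |row| row[field] }
end

# Suspected rule: keep only competitions in which the player has scored.
def scoring_competitions(rows)
  rows.select { |row| row[:goals] > 0 }
end

filtered = scoring_competitions(COMPETITIONS)

puts "Scoring competitions only: #{total(filtered, :minutes)} minutes, " \
     "#{total(filtered, :assists)} assists"
puts "Both competitions:         #{total(COMPETITIONS, :minutes)} minutes, " \
     "#{total(COMPETITIONS, :assists)} assists"
```

Under that rule the filtered total comes to the 233 minutes and no assists shown by BBC Sport, while the two competitions together already account for 903 minutes and one assist.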

As well as omitting minutes played in competitions in which the player has not scored, the statistics appear to omit assists: Lovren is not credited in the BBC Sport statistics with the impressive pass leading to Divock Origi's second goal in the recent Merseyside derby.

The Premier League official statistics confirm that he has made one assist in the Premier League this season:

Premier League Dejan Lovren stats inc assists, 11 December 2019

This omission seems a bit harsh, given the quality of the pass and the fact that it was only his fourth-ever assist in 160 Premier League games.

If BBC Sport would like to employ an additional proofreader or data assistant, I should be happy to be considered. :)




4 August 2019 | on searching for steam locomotives

Since childhood I have admired the sleek elegance of the LNER Class A4 steam locomotives. Thirty-five of the locomotives were built between 1935 and 1938, and they became famous for their aerodynamic design, distinctive sloping front and impressive speed. In 1938 No. 4468 Mallard reached 126 mph, still the world record for steam locomotives.

I decided to add a wordsearch in their honour to my JavaScript collection.

LNER Class A4 locomotives wordsearch

Some of the locomotives changed names during their career and I had to choose which to use. I selected the current name for the six preserved locomotives and the original name for the others (other than No. 4495 Golden Fleece as its original name Great Snipe was also used for No. 4462). For the most part I prefer the earlier bird-referencing names.

I did most of the programming for the wordsearch app in March 2019 and have gradually built up the collection of wordsearches since. Due to the largish quantity of items to find (35) and the longish names involved ("Commonwealth of Australia" is 23 letters long), the grid for the A4 locomotives is rather larger than any of the previous 40 or so wordsearches.
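
To illustrate how the longest name sets a floor on the grid size, here is a small Ruby sketch (the app itself is JavaScript, and this is not its code). It assumes that spaces are dropped when a name is placed in the grid, which is what the 23-letter figure implies; the helper `letter_count` and the short list of names are only for illustration.

```ruby
# Count the letters a name would occupy in the grid.
# Assumption: spaces and punctuation are not placed in the grid.
def letter_count(name)
  name.count("A-Za-z")
end

# A few of the names mentioned in this post, not the full list of 35.
names = ["Mallard", "Golden Fleece", "Great Snipe", "Commonwealth of Australia"]

longest = names.max_by { |name| letter_count(name) }

# A straight-line placement of the longest name needs at least this many
# cells along one side of the grid.
min_side = letter_count(longest)

puts "Longest name: #{longest} (#{min_side} letters)"
```

The real grid also has to leave room for the other 34 names, so this is only a lower bound on its side length.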

I thought it would be fun to add a smaller wordsearch paying tribute to some smaller locomotives. The result was a new addition referencing the 16 steam locomotives of the narrow gauge Isle of Man Railway, which I have fond memories (sight, sound, smell and motion) of riding as a child.

Isle of Man Railway steam locomotives wordsearch

I think this one works better, being a more manageable size and with a more balanced spread of orientations of words.

Until recently all my wordsearches had the same colour scheme, but for a recent addition marking the 2019 Ashes cricket series I added some code to enable variation and presented it in green and white to reference the playing surface and the players' clothes. In the two new wordsearches I have picked colours to reference the railway liveries: LNER garter blue with yellow lettering, and IoMR Indian red and holly green.

In 2008 I found LNER No. 4468 Mallard and in 2009 IoMR No. 4 Loch.

Edward Kinley and Mallard Edward Kinley and Loch

Please feel free to have a go at finding some yourself!




5 July 2019 | on marbles

I decided to make a JavaScript app based on the classic board game solitaire (sometimes called peg solitaire). Physical instances of this game often comprise a wooden board with marbles for the moving pieces. I wanted to photograph suitable items to create images to use in the app.

When I first moved to Edinburgh I was struck by the quantity and character of steps in the city centre, many in its famous closes. One set of steps that was practically useful but somewhat unattractive (being somewhat dilapidated and malodorous) was the Scotsman Steps.

A few years ago, however, the Scotsman Steps were given a new lease of life by artist Martin Creed and each step now bears a different marble. I thought these would make an attractive and varied set of marbles for my game.

Scotsman Steps from outside Scotsman Steps in various marbles stone railings at Scotsman steps

For the wooden board I used images of my old neglected classical guitar. I am glad that it has at last participated in something at least slightly artistic or creative.

guitar strings and sound hole guitar wood close-up guitar wood close-up

I wanted the game design to be simple and uncluttered, inspired by some of the elegant physical sets. I sought to make the app intuitive to use, and I hope I have succeeded. I am pleased with the results.

Solitaire app Solitaire app close-up of marbles

I enjoyed the project design, programming and image choosing. Please feel free to play the game or view the code. Thanks for reading!




16 May 2019 | on Scottish migration statistics, 1951-2018

I spotted a dataset recently updated by National Records of Scotland (NRS), Total Migration to or from Scotland. I thought it looked interesting and decided to have an exploratory look, using the statistical software environment R to analyze and display the data.

The data I am considering cover the period 1951 to 2018 and include net migration to or from Scotland relative to the rest of the UK, to overseas, and the total (sum of the former two).

(I will not be making any political comments, e.g. on whether certain levels of migration are (dis)advantageous; just looking to find and display trends and any other interesting characteristics of the data.)

First, here is the total net migration over time, the yellow line indicating zero net migration:

R code for total net migration
graph for total net migration

The increasing trend is clearly seen and appears to be fairly linear. Adding a linear regression line emphasizes this further:

R code for total net migration linear regression line
graph for total net migration with linear regression line
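
For readers unfamiliar with how a regression line is obtained, here is a minimal Ruby sketch of an ordinary least-squares fit. The analysis in this post was done in R, so this is an equivalent calculation rather than the code used. The function name `linear_fit` is hypothetical, and the four data points are made-up placeholder values chosen to lie exactly on a line; they are not NRS figures.

```ruby
# Ordinary least-squares fit of y = intercept + slope * x.
def linear_fit(xs, ys)
  n = xs.length.to_f
  mean_x = xs.sum / n
  mean_y = ys.sum / n
  # Sum of cross-deviations and sum of squared x-deviations.
  sxy = xs.zip(ys).sum { |x, y| (x - mean_x) * (y - mean_y) }
  sxx = xs.sum { |x| (x - mean_x)**2 }
  slope = sxy / sxx
  { slope: slope, intercept: mean_y - slope * mean_x }
end

# Placeholder series (NOT real data): net migration in thousands per year.
years = [2000, 2001, 2002, 2003]
net   = [-3.0, -1.0, 1.0, 3.0]

fit = linear_fit(years, net)
puts "Slope: #{fit[:slope]} thousand per year"
```

A positive slope corresponds to the upward trend described above; with the real series, the slope would estimate the average yearly change in net migration.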

I was interested to note that net migration was consistently negative until about 2000 and has been consistently positive since (I detect no effect of the 2016 referendum on European Union membership), and also that the recent net immigration has not yet matched the quantity of the earlier net emigration, the net effect over the whole time period being emigration of more than half a million (data in units of 1000):

R code for total total migration

How much of the net migration is to or from the rest of the UK, and how much is to or from overseas?

R code for UK and overseas net migration
graph for UK net migration graph for overseas net migration

The trend is similar for each component. Plotting them together with the total shows how they reinforce each other:

R code for UK, Overseas and Total net migration overlay plot
UK, Overseas and Total net migration overlay plot

It appears to me that the total trend may be influenced a little more by the overseas migration than by the UK migration. Comparison of the magnitude of the net annual amounts shows that overseas migration contributed more to the total than did UK migration:

R code for UK, Overseas and Total net migration overlay plot

And overseas migration net amounts varied more:

R code for UK, Overseas and Total net migration overlay plot

But there was only weak evidence of different underlying variances; not enough to pass a formal statistical test (p-value 0.16):

R code for UK, Overseas and Total net migration overlay plot

One thing that particularly surprised me was the fact that Scotland's net rest-of-UK migration has been positive for the last 18 years straight:

R code for net migration levels, 1998-2018

If I were to study these and related data further, a few lines of enquiry seem of interest:

- Consideration of the actual emigration and immigration quantities, rather than the net amounts only. (The NRS dataset contains some such data, but for recent years only.) There might be considered to be a big difference between, say, a net migration of 10,000 caused by 5,000 leaving and 15,000 arriving and the same net amount caused by, say, 80,000 leaving and 90,000 arriving.

- Statistics for other parts of the UK. If Scotland has for the past two decades had consistent positive net migration from the rest of the UK, which parts have had negative migration?

- Statistics for the total population. What effect have rates of births and deaths had relative to migration patterns?

- Statistics for home-building over the period of net migration. Have homes been built at the same rate as the population has grown?

- Other aspects of the NRS dataset, including breakdown by age and sex.
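
The first point in the list above can be made concrete with the two scenarios it quotes. This is a small Ruby sketch; the method name `migration_summary` and the term "gross" (arrivals plus departures) are choices made for the example, not terminology taken from the NRS dataset.

```ruby
# Net migration is arrivals minus departures; gross flow is their sum.
def migration_summary(leaving:, arriving:)
  { net: arriving - leaving, gross: arriving + leaving }
end

low_turnover  = migration_summary(leaving: 5_000,  arriving: 15_000)
high_turnover = migration_summary(leaving: 80_000, arriving: 90_000)

puts "Low turnover:  net #{low_turnover[:net]}, gross #{low_turnover[:gross]}"
puts "High turnover: net #{high_turnover[:net]}, gross #{high_turnover[:gross]}"
```

Both scenarios give the same net figure of 10,000, but the second involves 170,000 people moving rather than 20,000, which is the difference the net amounts alone cannot show.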

Altogether I enjoyed having a look at the data and some of its trends and statistics. No doubt it would repay further study in some depth, but I am glad to have gained some knowledge of the quantities and patterns present. Thank you to NRS.




27 May 2018 | on coasting along

Several days ago I was coasting along in my thirties, when something unusual happened.

Something I'd never experienced before.

My fortieth birthday.

My cousin gave me as a birthday gift this very stylish coaster:

It does its job rather well and looks good with a variety of drinking utensils:

However, I was suspicious. Were the numbers correct? I decided to write some Ruby code to investigate...

(After some consideration it seemed apparent that 365.2425 had been used for the number of days in a year. Given that the regular leap day has not been missed since 1900 and won't be until 2100, I would have thought 365.25 would have been apt, but what do I know? My life is only just beginning.)
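
For illustration, here is a minimal sketch of the kind of calculation involved. It is not the original script: the method name `age_in_units` is hypothetical, it works from a whole number of years rather than an exact date of birth, and how the coaster rounds its figures is unknown.

```ruby
# Mean length of the Gregorian year in days, the value the coaster appears to use.
DAYS_PER_YEAR = 365.2425

# Convert a number of years into days, hours, minutes and seconds.
def age_in_units(years, days_per_year = DAYS_PER_YEAR)
  days = years * days_per_year
  {
    days: days,
    hours: days * 24,
    minutes: days * 24 * 60,
    seconds: days * 24 * 60 * 60
  }
end

age_in_units(40).each do |unit, value|
  puts format("%-8s %.1f", unit, value)
end

# For comparison: the same age with a 365.25-day year.
gap_in_days = age_in_units(40, 365.25)[:days] - age_in_units(40)[:days]
puts format("Extra days with 365.25: %.1f", gap_in_days)
```

Over forty years the two year lengths differ by only about a third of a day, so the choice matters for the hours, minutes and seconds figures far more than for the days.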

Anyway, here is the output of the Ruby code together with the coaster:

So the days, hours and seconds match precisely while the minutes figures are out by one.

The coaster has taken one minute of my life. (But not the equivalent 60 seconds.)

What happened to that minute? I don't remember it. Was I hypnotized? Did I commit some heinous crime? Did I do something fun that I am destined never to enjoy recalling?

Where is that minute? Has it gone to the same place as that missing question mark?

Is a question without a question mark a question?

Is a rhetorical question a question?

Is this a question?

Is this an answer?

One of the unexpected pleasures of learning coding is the fact that it frequently raises interesting philosophical questions. I'll drink to that.

(Vanilla milkshake, I think.)