Tag Archives: Moneyball

Forecasting France Euro 2016

I have a work colleague who is not only a tremendous negotiator and aircraft seller but also has a great sense of humor, and who manages in his free time, late at night, to set up a contest for office staff to guess the winners, match scores, top scorers, etc., of major international soccer competitions. Euro 2016 in France, which starts this afternoon, could not be missed. Nacho managed to set up the contest in time.

In this post I am going to explain how I went about forecasting the results of the UEFA Euro 2016.

“When in doubt, build a model”, Nate Silver.

The readers of this blog may already know how much I like to build models to produce forecasts, guesstimates, etc. As for forecasting this UEFA Euro 2016, there is some background that has shaped my thinking on the subject in recent years; let me give you some hints:

Having shared this background, you may understand that I tried to remove all the beauty of guessing and my football knowledge from the forecasting process (1).

Instead, I made use of:

  • The ESPN Soccer Power Index (SPI) ranking, introduced by the economist Nate Silver. I used its offensive and defensive scores plus a weight for each of the scores, based on a tip indicating that in competitive matches the defensive factor tends to be slightly more important (see “A Guide to ESPN’s SPI rankings”) (2).
  • The frequency of different scores in the group phases of Euro 2012 and the 2010 World Cup, and then in the round of 16, quarter-finals and semi-finals.

Frequency

  • A few simple rules about how to allocate results given the difference between SPI ratings of the two nations playing each match. (3)
  • The total number of goals during the group phases of the latest Euro and World Cup, in order to cross-check that the total number of goals my forecast yielded was in line with previous competitions.

It may sound very complex. It is not. It requires a bit of reading (most of which I did years ago), retrieving the latest ratings, giving it a bit of thought to set up the model and then, without even looking at the names of the teams, you go about allocating the scores based on raw figures. Let’s see how my forecast fares this time! (4)
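A minimal sketch of this kind of rule-based allocation (see footnote 3). The weights, thresholds and function names are hypothetical placeholders and not the values actually used; the sketch also assumes that the defensive score counts goals conceded, so lower is better.

```python
def combined_index(offensive: float, defensive: float,
                   w_off: float = 0.45, w_def: float = 0.55) -> float:
    """Blend the two scores into a single figure (higher is better).

    Assumption: the defensive score measures goals conceded, so it is
    subtracted. The weights are placeholders that merely give the
    defensive side slightly more importance.
    """
    return w_off * offensive - w_def * defensive


def forecast_margin(index_a: float, index_b: float,
                    thresholds=(0.2, 0.6, 1.2)) -> int:
    """Turn the gap between two combined indices into a goal margin.

    Returns 0 for a draw, a positive margin if team A wins and a
    negative one if team B wins. The thresholds are illustrative only.
    """
    gap = index_a - index_b
    # Count how many thresholds the gap clears: 0 -> draw, 1 -> one goal...
    size = sum(abs(gap) >= t for t in thresholds)
    return size if gap >= 0 else -size
```

With rules like these the team names never enter the picture: only the gap between the two indices decides the margin.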

Porra Euro 2016

“Les grandes personnes aiment les chiffres” (5), The Little Prince.

(1) In fact I have not watched a single national team football match from any country since the World Cup in Brazil in 2014.

(2) See here the blog post I published yesterday in which I made a more thorough review of the ESPN SPI index.

(3) I set up rules like “if the difference of the combination of indices of the two nations is below this threshold, I take it as a draw, if it is between x and y as victory by 1 goal, if higher…”, etc.

(4) This way of forecasting allowed me to finish 4th out of 47 in 2010 and 15th out of 87 in 2014. As it removes biases, it lets you do better than the average, though it prevents you from guessing outliers, following gut feelings, etc.

(5) “Adults love figures”.

Note: In the blog post from yesterday I mentioned that the latest complete ranking from the ESPN SPI that I could retrieve dated from October 2015. That is the one I have used; therefore, Germany comes out as the winner. The latest ranking, covering the top 25 nations, includes only 13 of the 24 countries competing at Euro 2016. I could have set up a hybrid ranking, taking the latest rankings and ratings from June for those 13 and using the October figures for the other 11 teams. I decided to go on with a single set of data. Had I used the hybrid ranking, the main changes would have come from the semi-finals onwards: France would have appeared as winner instead of Germany. We’ll see if that was a good decision.

Filed under Sports

Forecasting 2014 FIFA World Cup Brazil

I have a work colleague who is not only a tremendous negotiator and contract drafter but also has a great sense of humor, and who manages in his free time, late at night, to set up a contest for office staff to guess the winners, match scores, top scorers, etc., of major international soccer competitions. The 2014 FIFA World Cup in Brazil, which will start tomorrow, could not be missed. Nacho managed to set up the contest in time.

Some background on how I have approached the game of forecasting this World Cup:

  • I had written a review of the book “Soccernomics”, which among other things advocates the use of data to make decisions about the football transfer market, forecasting, etc. This book relies somewhat heavily on “Moneyball”, another book which I read some months ago, with a similar scope but with baseball as the sport in question.
  • When the draw of the World Cup took place last December, I wrote a couple of blog posts discussing which was the so-called “group of death”, basing the analysis on the FIFA and ESPN rankings.
  • During the last year, I read a couple of books on how we make decisions and how to remove different kinds of biases from the thought processes behind them: “Thinking, Fast and Slow” (by Daniel Kahneman, the 2002 winner of the Nobel Prize in Economics) and “Seeking Wisdom”.
  • Finally, last year I followed the open course “A Beginner’s Guide to Irrational Behavior” by Dan Ariely (though I missed the last exam due to my honeymoon and could not get credit for it).

Having shared this background, you may understand that I tried to remove all the beauty of guessing and my football “knowledge” from the forecasting process. Instead, I made use of the ESPN Soccer Power Index (SPI) ranking, introduced by the economist Nate Silver. I used its offensive and defensive scores plus the tip indicating that in competitive matches the defensive factor tends to be slightly more important (see “A Guide to ESPN’s SPI rankings”).

Once I plugged in the numbers from the index and applied the aforementioned tip on the defensive side, I built a simple model to guess each of the World Cup matches. Once you take this approach you will find that the model gives you plenty of results such as Nigeria 1.32 – 1.53 Bosnia… What to do with them? When the result was very tight I resolved it as a draw; otherwise, as a victory for the team with the higher score.
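A minimal sketch of that last step, assuming a hypothetical cut-off of 0.25 goals for what counts as “very tight” (the post does not state the actual value, and the function name is made up for illustration):

```python
def resolve(team_a: str, score_a: float,
            team_b: str, score_b: float,
            tight: float = 0.25) -> str:
    """Turn two decimal model scores into a forecast outcome.

    `tight` is a placeholder threshold: gaps smaller than it are
    called a draw, anything else goes to the higher-scoring team.
    """
    if abs(score_a - score_b) < tight:
        return "draw"
    return team_a if score_a > score_b else team_b


# The sample figures quoted above, run through the placeholder threshold.
outcome = resolve("Nigeria", 1.32, "Bosnia", 1.53)
```

With this placeholder cut-off the 0.21 gap in the sample figures falls below the threshold; a stricter cut-off would hand the match to Bosnia instead.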

In very few instances did I forecast that a team would score 3 or more goals in a match. I bore in mind that in the 2010 World Cup 80% of the matches ended with scores of 1-0 (26% of the matches), 2-1 (15%), 0-0, 1-1 or 2-0 (13% each). A team scoring more than 3 goals in a match will certainly happen in some games, but I did not bother to guess in which ones: the odds are against it.
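As a quick check of the arithmetic behind that 80% figure, using only the percentages quoted above (the variable names are just for illustration):

```python
# Share of 2010 World Cup matches ending with each score, in percent,
# as quoted in the paragraph above.
score_share = {"1-0": 26, "2-1": 15, "0-0": 13, "1-1": 13, "2-0": 13}

# Combined share of these five low-scoring results.
covered = sum(score_share.values())

# Share left over for every other scoreline.
remaining = 100 - covered
```

Five scorelines cover four matches out of five, which is why sticking to low scores is the safer bet.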

The prize pot of the game organized by this colleague is not particularly big (a few hundred euros). The main point of the game is enjoying the chit-chat with work colleagues. My second main point is putting this rational approach to work and seeing how it fares.

Finally, what did I forecast?

A World Cup won by Brazil against Argentina in the final, with Spain beating Germany for third place (on penalties). For my English readers: England defeated by Colombia in the round of 16. For those from the USA: it does not make it past the group phase. We will see over this month how well I fare.

2014 FIFA World Cup Brazil forecast.

Filed under Sports

Brazil 2014 FIFA World Cup: “group of death”? (using ESPN ranking)

In a previous blog post I used the FIFA world rankings to see which was the “group of death” of the upcoming Brazil 2014 World Cup finals.

I received some comments questioning the FIFA ranking based on the position of some specific countries: Switzerland, Portugal, Argentina, Colombia, Chile… I am sure that anyone who looks at how each country is playing will believe that this or that country plays much better than another placed higher in the ranking. But the virtue of the ranking is that it removes perceptions from the process and simply establishes a set of rules by which all teams are measured. It then computes teams’ results throughout the year, and the positions in the ranking are established, for better or worse.

In one of the comments I received the suggestion to use the ESPN Soccer Power Index (SPI) ranking instead. I was all the more attracted to that hint as the ESPN SPI was introduced by the world-famous economist Nate Silver, whom many readers will know from his forecasts of recent elections in the USA (check his blog FiveThirtyEight).

In a post from 2009, when the SPI was introduced, just before the 2010 World Cup, he explained how the index was computed (“A Guide to ESPN’s SPI rankings”). As he explained, the process had 4 main steps:

  • Calculate competitiveness coefficients for all games in database
  • Derive match-based ratings for all international and club teams
  • Derive player-based ratings for all games in which detailed data is available
  • Combine team and player data into a composite rating based on current rosters; use to predict future results.

ESPN SPI ranking at the end of Nov 2013.

The main difference compared with the FIFA ranking algorithm is that the SPI takes player-based ratings for those players who play in clubs in the Big Four leagues (England, Spain, Italy, Germany) and the UEFA Champions’ League. The player-based rating is merged into the national team coefficient. The player-based rating weighs heavily in national teams with many players playing in the main leagues (e.g. the England or Spain national teams) and less heavily in other nations whose roster is composed of many players not playing in clubs of the 4 main leagues (e.g. Russia).

Other details of the ESPN’s approach are similar to those used by FIFA: e.g. giving weights to results depending on the opponent, measuring the competitiveness of the match, the different confederations, etc.

You can see the top-ranked countries in the picture above.

Without getting into whether this or that country is far better placed in one or the other ranking based on perceptions, one simple yardstick to compare them is to see how many of their top 32 countries are not among the 32 countries qualified for the World Cup:

  • FIFA ranking: 7 teams among the top 32 are not in the World Cup: Ukraine (18), Denmark (25), Sweden (27), Czech Republic (28), Slovenia (29), Serbia (30) and Romania (32). All come from Europe and did not qualify for the World Cup due to the limited number of places for UEFA countries (they all placed 2nd or 3rd in their groups).
  • ESPN SPI ranking: 6 teams among the top 32 are not in the World Cup: Paraguay (19), Serbia (20), Ukraine (21), Peru (27), Sweden (29) and Czech Republic (30). 4 countries from Europe and 2 from South America, out for the same reason. Here, however, Paraguay is still placed 19th despite finishing last in the CONMEBOL qualifying.

With the information from the ESPN SPI ranking I produced the same table:

Brazil 2014 groups heat map based on ESPN SPI ranking.

And then, the same analysis as in my previous post follows.

The most difficult groups in terms of total ratings are:

  1. B (Spain, Netherlands, Chile, Australia) with 327.
  2. D (Uruguay, Costa Rica, England, Italy) with 323.
  3. G (Germany, Portugal, Ghana, USA) with 322.

Looking at the average ranking, the most difficult groups are:

  1. D (Uruguay, Costa Rica, England, Italy) with 14.
  2. G (Germany, Portugal, Ghana, USA) with 15.25.
  3. B (Spain, Netherlands, Chile, Australia) with 17.5.

And excluding the rating of the favorite team (pot 1) in each group, which is the favorite facing the toughest group?

  1. Uruguay in group D, facing 239.
  2. Spain in group B, facing 238.
  3. Germany in group G, facing 234.

Then, combining the 3 approaches, the toughest group is either B (in terms of combined ratings) or D (in terms of average ranking and from the favourite’s point of view).
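The comparison across the three criteria can be sketched as follows. The figures are the ones listed above for groups B, D and G only (the other groups are left out because their numbers are not quoted here); the dictionary layout and function name are assumptions made for illustration.

```python
# Figures quoted above for the three toughest groups.
groups = {
    "B": {"total_rating": 327, "avg_ranking": 17.5, "facing": 238},
    "D": {"total_rating": 323, "avg_ranking": 14.0, "facing": 239},
    "G": {"total_rating": 322, "avg_ranking": 15.25, "facing": 234},
}


def toughest_first(metric: str, lower_is_tougher: bool = False) -> list:
    """Order the groups from toughest to easiest on one criterion."""
    return sorted(groups, key=lambda g: groups[g][metric],
                  reverse=not lower_is_tougher)


by_total = toughest_first("total_rating")
# A lower average ranking position means stronger teams.
by_average = toughest_first("avg_ranking", lower_is_tougher=True)
by_facing = toughest_first("facing")

# Which criteria does each group top?
tops = {}
for name, order in [("total_rating", by_total),
                    ("avg_ranking", by_average),
                    ("facing", by_facing)]:
    tops.setdefault(order[0], []).append(name)
```

Group B tops one criterion and group D tops the other two, while group G tops none.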

Using the ESPN ranking, group G would definitely not be the toughest one, but the 3rd toughest.

I would understand ESPN journalists calling group B or D the toughest one. What strikes me is why the FIFA website content editors call group B the “group of death” when by their own ranking that would be group G!

It will be interesting to see how one ranking fares against the other when it comes to predicting the actual development of the Brazil 2014 World Cup.

Filed under Sports

Brazil 2014 FIFA World Cup: “group of death”?

The draw of the groups for the final phase of the football World Cup, to take place in Brazil from June 2014, was held today. As always, it drew much attention and, right afterwards, lots of speculation, especially about which one will be the so-called “group of death”.

I read in the Spanish sports press that Group B, where Spain is placed, is being called “lethal”. I thought to myself: “playing the victim before the competition”. Then I read on the FIFA website:

Spain, the Netherlands, Chile and Australia will make up the proverbial ‘group of death’ at the 20th FIFA World Cup™, while Uruguay, Italy, England and Costa Rica will comprise another intriguing pool.

Well, no.

Take a look at the groups in the picture. What would be your guess as to the most difficult or the easiest group?

Brazil 2014 World Cup groups.

FIFA ranking end Nov 2013

I then decided to take a quantitative approach using precisely the FIFA world rankings, a classification built from the points each country earns every month for its results.

FIFA uses a formula to compute those points:

M x I x T x C = P

M: winning, drawing or losing a match

I: importance of the match

T: strength of opposing team

C: confederation strength weights

P: points for a game
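As a minimal sketch, the formula is just a product of the four factors. The sample values below are hypothetical placeholders chosen to show the mechanics; they are not taken from FIFA’s tables, which this post does not reproduce.

```python
def points_for_game(m: float, i: float, t: float, c: float) -> float:
    """P = M x I x T x C for a single match.

    m: result of the match (win, draw or loss)
    i: importance of the match
    t: strength of the opposing team
    c: confederation strength weight
    """
    return m * i * t * c


# Placeholder inputs, for illustration only.
example_points = points_for_game(m=3, i=2.5, t=150, c=1.0)
```

Since the factors multiply, a result against a strong opponent in an important match is worth several times the same result in a minor game against a weak one.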

Take a look at the picture on the right to see the FIFA rankings at the end of November, just before the draw took place. You will see Spain in the top spot with 1,507 points, well ahead of Germany, Argentina, etc. Most of the countries in the top 23 that you can see in the picture are represented in the World Cup, with the exception of Ukraine. See the whole ranking here.

With this information I built the following table, attaching to each country in the different groups its current ranking and points. Then, I calculated the average ranking of each group and its total points. I also summed the points per group excluding the favourite in each group, showing in that way which is the most difficult or the easiest group for the favourite countries (those placed in pot 1 of the draw). Finally, I coloured the results in a heat map: the redder, the more difficult. Which is then the “group of death”?
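The three figures computed per group can be sketched like this. The function name and data layout are assumptions made for illustration, and the sample group uses made-up placeholder numbers rather than real ranking data.

```python
from statistics import mean


def group_metrics(teams: dict, favourite: str) -> dict:
    """Compute the three difficulty figures for one group.

    `teams` maps each country to (ranking position, ranking points);
    `favourite` is the pot 1 country in that group.
    """
    rankings = [rank for rank, _ in teams.values()]
    total_points = sum(points for _, points in teams.values())
    return {
        "average_ranking": mean(rankings),
        "total_points": total_points,
        # What the favourite is up against: everyone's points but its own.
        "points_facing_favourite": total_points - teams[favourite][1],
    }


# Placeholder group with made-up figures, only to show the mechanics.
example_group = {
    "Favourite": (1, 1500),
    "Second": (10, 1000),
    "Third": (20, 800),
    "Fourth": (50, 500),
}
metrics = group_metrics(example_group, favourite="Favourite")
```

Using only figures quoted in this post, group B’s total of 4,191 points minus Spain’s 1,507 gives the 2,684 points that Spain faces.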

FIFA 2014 groups heat map.

As you can see the most difficult groups in terms of total points are:

  1. G (Germany, Portugal, Ghana, USA) with 4,358.
  2. B (Spain, Netherlands, Chile, Australia) with 4,191.
  3. D (Uruguay, Costa Rica, England, Italy) with 4,031.

Looking at the average ranking, the most difficult groups are:

  1. G (Germany, Portugal, Ghana, USA) with 11.25.
  2. D (Uruguay, Costa Rica, England, Italy) with 14.25.
  3. C (Colombia, Greece, Côte d’Ivoire, Japan) with 20.25.

And excluding the points of the favorite team (pot 1) in each group, which is the favorite facing the toughest group?

  1. Germany in group G, facing 3,040.
  2. Uruguay in group D, facing 2,899.
  3. Spain in group B, facing 2,684.

Then, combining the 3 approaches, to me it becomes clear that the toughest group is G, with Germany, Portugal, Ghana and USA, by the total amount of points, ranking of the teams and in relation to what Germany will face.

Then, I would say that the second most difficult group is D, both looking at ranking and from the point of view of Uruguay. The third being group B (though between D and B, depends on the approach).

On the other hand, for the Netherlands, Chile and Australia (the worst team of the competition) it is clear that group B is the most difficult, as from their point of view their group has the most points excluding themselves (mainly thanks to the 1,507 of Spain).

Finally, after having done the analysis and seeing the direction that conversations on the groups’ difficulty are taking, I realize how few people have read “Soccernomics” or “Moneyball”… just like with stock markets; at least this is just football.

Filed under Sports

Soccernomics

If you love football (soccer) and have read one of the books of the “Freakonomics” saga or any book by Malcolm Gladwell, then “Soccernomics”, by Simon Kuper and Stefan Szymanski (430 pgs.), will be a great read for you.

The book is written in the same style as the other books mentioned above: using economic techniques, plenty of data and statistics, and citing several papers, studies, etc., in order to bring to light overlooked issues in football or to refocus attention on others. Some examples:

  • Mastering the transfer market. Starting from the example of Billy Beane in baseball, described in “Moneyball”, by Michael Lewis (of which a movie was also made, starring Brad Pitt), the authors show how pouring money into the transfer market doesn’t bring titles. The key issue is to have a balanced net investment (sales/acquisitions). In soccer the main example would be Olympique Lyon, which will “sell any player if a club offers more than he is worth”, for which each player is previously assigned a price (much like value investing).
  • The more money paid to players in salaries, the better. Instead of buying new expensive players, it seems to make more sense to pay well and ensure the adaptation of the stars already playing for the team.
  • The market for managers is not yet very open (e.g. no black coaches in the main European teams), thus many of them do not make a real difference. There was even an English team, Ebbsfleet United, which dispensed with the coach and allowed subscribed fans to vote on the player selection for each match.
  • The book, written at the beginning of 2012, forecast that teams from big European capitals would soon win the Champions’ League, those capitals being London, Paris, Istanbul and Moscow. A few months later Chelsea won its first one; let’s see about the others.
  • The main factors for the success of football national teams seem to be the experience (international games played by the national team), wealth and population.
  • The authors attribute much of Western Europe’s dominance of football to the interconnectedness of continental Europe, and explain the rise of Spain in the ’90s and ’00s by its growth in population, its improved economy since joining the EU, more experience and exchanges of styles with coaches from other countries.
  • The authors claim that future national football will be dominated by countries such as Iraq, USA, Japan or China.

As you can see there are many different topics, all with some data to support them (even if sometimes you doubt the consistency of their claims, e.g. their statements on industrial cities dominating football, dictatorships, etc.). I marked many pages with anecdotes or papers that I would like to read.

One final anecdote: the tips given to clubs and teams in knockout competitions in case they face a penalty shoot-out. In the Champions’ League final of 2008, Chelsea and Manchester United went to penalties. An economist had given Chelsea a study of Manchester United’s goalkeeper and penalty takers. Once you read the book and the tips the economist provided (“Van der Sar tends to dive to the kicker’s natural side”, “most of the penalties that Van der Sar stops are mid-height, thus it is better to shoot low or high”, “if Cristiano Ronaldo stops half-way in the run-up to the ball chances are 85% that he shoots to his natural side”…), it is quite interesting to actually watch that penalty shoot-out and see how the different players acted.

[Pay special attention to Van der Sar’s reaction at 09’40”, when he seems to realize that Chelsea had been tipped off]

I definitely recommend this book to football fans. I also recommend two other books which I wrote about in the blog some time ago: “How soccer explains the World” and “Historias del fútbol mundial”.

Filed under Books, Sports