Degree: Doctorate

Specialization: Economics

Degree: Doctor.

Specialization: Decision-making theory. Research group: Analysis of performance and efficiency of social, economic and educational factors

The Summer Olympic Games in Rio 2016 were the biggest and most important sport event of 2016. Athletes' performance at the Olympics is always of high interest and serves as a basis for analyses. Many countries have started programs of increased sport funding to improve their athletes' performance; a prominent example is Great Britain and its extensive sport funding program. In this article, we present an econometric analysis of the quantitative and qualitative impact of Gross Domestic Product (GDP) on sport performance, with regard to corruption and other social and demographic factors. Our results show that the best explanatory model of the medal ranking at the Summer Olympic Games in Rio 2016 includes qualitative GDP, corruption and Economic Active Population. Therefore, Olympic performance is not explained by the basic population-GDP theory alone; other social and demographic factors complete the relation.


The Summer Olympics in Rio 2016 were the biggest and most important sport event (or event in general) of 2016. More than 11,000 athletes from 205 countries (including, for the first time, Kosovo, South Sudan, and the Refugee Olympic Team) competed in 306 events in 28 different sports (Rio,

The importance of a sport event can be measured against several criteria that evaluate its influence. Considering the classification developed by Müller (

Therefore, sport performance at the Olympics is of high importance. Sport performance at the international level is usually measured with respect to population and GDP (the traditional population-GDP theory of Olympic success). These two factors are recognized as the two most important economic and demographic factors (

Apart from the main economic and demographic factors, other factors are important for sport performance. For example, a corrupt environment has a negative impact on public sector efficiency and performance (Transparency International,

Moreover, other economic, demographic or social indicators can be used to analyze sport performance (such as the inflation rate, unemployment, the number of athletes, etc.). The objective of this article is to analyze and evaluate the economic, demographic and social effects related to sport performance at the Summer Olympic Games in Rio 2016.

The article is organized as follows: Section 1 briefly describes the econometric theory. Section 2 outlines the data and variables employed in this study, whereas Section 3 describes the econometric model of this article. Results are presented in Section 4, and Section 5 offers a discussion of the achieved results.

The main purpose of the article is to apply econometric modelling to the relations between the results of the Olympic Games in Rio 2016 and economic, social and demographic variables. To accomplish this goal, it is necessary to analyze both the functional relationship and the probabilistic properties of the proposed model. When an econometric analysis is made and the importance of variables must be shown, we must consider two parts: 1) an intuitive part reflecting the expected theoretical relationship, and 2) a statistical part dealing with statistical significance, which shows the minimal explanatory error of the independent variable over the dependent variable. Regarding the intuitive part, it is necessary to define a correct functional form. The statistical part, on the other hand, relates to the methodology of hypothesis testing. The main purpose of the presented analysis is the theoretical and statistical justification of the proposed econometric models.

The simplest representation that captures the economic changes of a dependent variable with respect to an independent variable can be expressed as

which shows how the price of a product has a direct impact on the level of an individual's purchase Q. However, there are more factors that have a direct impact on a purchase decision, such as the price of a substitute (P_{S}), the price of a complement (P_{C}), the individual's income (

However, the constructed model need not be linear. For example, model (1) can be expressed as

which captures a more realistic economic situation. Further, we can transform (3) into a linear form, such as

This expression has the same form as (1), but the variables are expressed as logarithms. Furthermore, coefficient b captures an elasticity rather than a slope, i.e. instead of measuring unit impacts, the coefficient measures percentage impacts. Therefore, if the model consists of more than one variable and is not linear, then we should express the model in Cobb-Douglas form, such as

Equations of this type can, in economic terms, represent utility functions

which has a form similar to (2), but with two independent variables whose coefficients are elasticities. It must then be decided whether each original variable is maintained or needs to be transformed by logarithms or some other expression. For this we must

i. Provide a graphical analysis of each independent variable against the dependent variable and examine the relation between them, in order to suggest a transformation for each variable.

ii. Know what the independent and dependent variables represent, as well as their domains, to see whether any of them needs a transformation.

In this article, we work with ii. For example, if the dependent variable represents the life expectancy of a population, which depends, among other variables, on the income level, then we can consider the following linear expression

where

Based on the above, the objective is not to start from a predetermined functional form, but to infer and compare both the impacts and the relationships between independent and dependent variables, and to support these impacts statistically.
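The logarithmic transformation discussed above can be illustrated with a minimal sketch. It assumes a power-form demand relation Q = a·P^b; the constants a = 100 and b = -1.5, the price grid and the helper `fit_line` are hypothetical and serve only to show that an ordinary least-squares fit on the logged variables recovers b as an elasticity.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical, noise-free data generated from Q = a * P**b.
a_true, b_true = 100.0, -1.5
prices = [1.0, 2.0, 4.0, 8.0, 16.0]
quantities = [a_true * p ** b_true for p in prices]

# Taking logs gives ln Q = ln a + b ln P, which is linear in (ln a, b).
log_a, elasticity = fit_line(
    [math.log(p) for p in prices],
    [math.log(q) for q in quantities],
)

print(f"estimated elasticity b = {elasticity:.4f}")
print(f"estimated constant a   = {math.exp(log_a):.4f}")
```

In the logged model the slope is read as a percentage response: a 1% change in P is associated with approximately a b% change in Q, which is the elasticity interpretation referred to in the text.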

In principle, a complete representation, linear or non-linear, can be constructed. However, not all variables can be observed. There are other variables, such as crises, strikes, wars, inflation, etc., which influence the model. Although these variables are known to some extent, we cannot control them, as they occur with a certain probability. Therefore, these variables are random. Such variables are called disturbances and follow a known distribution.

A model that includes disturbances is an econometric model, whose representation can be expressed as

where is a dependent variable,

The

The disturbances Ui must be independent, i.e. if we seek to explain sales of a company in terms of observed variables

Moreover, these variables var


The last point is important in the inference analysis of the explanatory variables. For example, the presence of heteroscedasticity is inherited towards the explanatory variable

In addition, with respect to the interpretive part of (8), the effect of changes of _{1},Y_{2,}…,Y_{k}

which represents a percentage effect of

We can estimate values of the coefficients _{j }(j=1,2,...,k)_{1},Y_{2,}…,Y_{k, }

The explanatory variables in (4) are quantitative. However, in some cases it is of interest to introduce variables of a qualitative character, such as income differences between genders or the different sizes of countries or regions. To identify the effects of qualitative variables, we must introduce them into the econometric model, such as

where D is a dichotomous variable indicating the presence or absence of a quality. For example,

Thus, if we would like to estimate the average income of a woman, then (11) becomes

where the average income of a woman is . Similarly, if we would like to estimate the average income of a man, then (11) becomes

where the average income of a man is

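The qualitative variable can also differentiate the effect of the explanatory variable on the dependent variable, as

where the additional coefficient captures the differential effect of the explanatory variable on the dependent variable, depending on the quality represented by the dichotomous variable. In models such as (11) the effects are fixed, whereas in models such as (14) the effects are random.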
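The role of a dichotomous variable can be sketched in the simplest possible case, where D is the only regressor. The incomes, the coding (D = 1 for women, D = 0 for men) and the function `fit_dummy_model` are hypothetical and chosen for illustration; the model in the text may contain further regressors, which this sketch deliberately omits.

```python
from statistics import mean

# Hypothetical incomes (arbitrary units); D = 1 for women, D = 0 for men.
# The coding of D and all numbers are illustrative assumptions.
observations = [
    (1, 42.0), (1, 38.0), (1, 40.0),
    (0, 45.0), (0, 47.0), (0, 43.0),
]

def fit_dummy_model(data):
    """OLS fit of income = a + d * D when D is the only regressor.

    With a single dichotomous regressor, the OLS intercept equals the mean
    of the D = 0 group and the dummy coefficient equals the difference
    between the two group means.
    """
    base = mean(y for d, y in data if d == 0)
    treated = mean(y for d, y in data if d == 1)
    return base, treated - base

a_hat, d_hat = fit_dummy_model(observations)
print(f"average income, D=0: {a_hat:.2f}")
print(f"average income, D=1: {a_hat + d_hat:.2f}")
print(f"dummy coefficient:   {d_hat:.2f}")
```

Setting D = 0 or D = 1 in the fitted expression returns the average of the corresponding group, which mirrors the substitution described in the text.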

Finally, one of the objectives in econometric modelling is to obtain the best model, i.e. one with more explanatory variables that are significant. However, the inclusion of variables must be statistically justified. Thus, consider a model with k explanatory variables such as

and suppose that we add g-k variables, obtaining

Then, to determine whether the addition of these new variables is significant, we test the following hypothesis

To reject H_0 it is necessary that the residual sum of squares of the augmented model (RSSA_{(g)}) is smaller than the residual sum of squares of the reduced model (RSS_{(k)}), and the test statistic is

So, if

then we reject H_{0} and, thus, conclude that the addition is statistically significant.
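For reference, the textbook form of the test statistic for nested linear models is sketched below. Here RSS_{(k)} denotes the residual sum of squares of the reduced model and RSSA_{(g)} that of the augmented model. The symbol n (number of observations) and the degrees-of-freedom convention (an intercept plus g regressors in the augmented model) are assumptions, since they are not stated in the text above, and the notation may differ from the authors' original expression.

```latex
F \;=\; \frac{\bigl(RSS_{(k)} - RSSA_{(g)}\bigr)/(g-k)}{RSSA_{(g)}/(n-g-1)}
\;\sim\; F_{\,g-k,\;n-g-1} \quad \text{under } H_0,
\qquad \text{reject } H_0 \text{ if } F > F_{\alpha;\,g-k,\;n-g-1}.
```

When a single variable is added (g-k = 1), this statistic equals the square of the t statistic of the added variable.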

In total, 204 nations participated at the Summer Olympic Games in Rio 2016. However, we had to make some corrections due to the availability of data. First, we excluded Independent Olympic Athletes and Refugee Olympic Athletes (both participating in Rio 2016) from further analysis, as neither is an actual country and, therefore, no economic and demographic data are available.

To express the economic and demographic power of each participating nation, we use GDP in US dollars as the economic indicator and Economic active population (population aged 15-64 as a share of total population) as the demographic indicator. Both indicators were obtained from the World Bank database (

Second, we eliminated the following 19 countries due to missing data on either GDP or Economic active population: American Samoa, Andorra, Bermuda, British Virgin Islands, Cayman Islands, Chinese Taipei, Cook Islands, Dominica, Liechtenstein, Marshall Islands, Monaco, Nauru, Netherlands Antilles, Palau, Palestine, Saint Kitts & Nevis, San Marino, Tuvalu, US Virgin Islands. On the other hand, we could find data regarding GDP for North Korea

We use the World Bank’s classification by income (World Bank, 2016) as the third factor of the analysis. This factor is treated as a dichotomous variable: 1 – low-income economies (Gross National Income (GNI) per capita of $1,025 or less), 2 – lower middle-income economies (GNI per capita between $1,026 and $4,035), 3 – upper middle-income economies (GNI per capita between $4,036 and $12,475), and 4 – high-income economies (GNI per capita of $12,476 or more)[6]. Further, we use World Bank data for the inflation rate (INF), measured as the GDP deflator (annual %).

Further, we use the Transparency International (Transparency International, 2016a) Corruption Perception Index (CPI) as a factor describing the level of corruption in the nations participating at the Olympic Games. CPI data were not available for Belize, Antigua & Barbuda, Grenada, Solomon Islands, Tonga, and Vanuatu. To keep these countries in the analysis, we extrapolated their CPI considering geographical location: for Belize (31.375) as the average of the CPI results of El Salvador, Guatemala, Honduras and Nicaragua; for Antigua & Barbuda and Grenada (69.556) as the average of Barbados, St. Lucia and St. Vincent & the Grenadines; and, finally, for Solomon Islands, Tonga and Vanuatu (45.375) as the average of Fiji and Samoa.

Finally, the last factor of the analysis is the medal ranking. Usually, analyses of the medal ranking consider the gold, silver and bronze medal ranking (Li et al., 2008; Wu, Liang, and Yang, 2009). However, to be able to analyze more countries, in this article we consider the first 8 places in each discipline. To give higher importance to gold, silver and bronze medals and to higher places, we use the IAAF methodology, assigning 8 pts to a gold medal, 7 pts to a silver medal, etc., down to 1 pt for 8th place in each discipline. Data were obtained from the official website of the Rio 2016 Summer Olympic Games (

The data cover the period from 2011 to 2015, as preparation for the Olympic Games is commonly based on 4-year cycles. Moreover, we treat all factors as averages over this period. The data used in this article can be found in Flegl and Andrade (2016).
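The weighting scheme described above (8 points for first place down to 1 point for eighth place) can be sketched as follows. The function name `weighted_ranking` and the input format are hypothetical; the two example combinations are the ones discussed later in the article.

```python
# Points for places 1..8: 8 for gold, 7 for silver, ..., 1 for 8th place.
POINTS = {place: 9 - place for place in range(1, 9)}

def weighted_ranking(placings):
    """Sum the points for a country's top-8 finishes.

    `placings` maps a finishing place (1-8) to the number of times the
    country achieved it; places outside 1-8 earn no points.
    """
    return sum(POINTS.get(place, 0) * count for place, count in placings.items())

# One bronze medal, one 4th place and two 5th places.
regular = weighted_ranking({3: 1, 4: 1, 5: 2})
# Two gold, three silver, two bronze medals and one 4th place.
good = weighted_ranking({1: 2, 2: 3, 3: 2, 4: 1})
print(regular, good)  # 19 54
```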

Our variables for constructing the model are as follows:

Y – Weighted medal ranking of the first eight positions;

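X_{1} – Gross domestic product (GDP) in US dollars;

X_{2} – Economic active population (EAP);

X_{3} – Corruption Perception Index (CPI);

X_{4} – Countries’ income classification regarding Gross national income (GNI), treated as dichotomous variable (see (14));

X_{5} – Inflation rate (INF), GDP deflator (annual %).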
With these variables, our general model is as follows

Models as complete as (18) are difficult to obtain, either because a variable takes the wrong sign or because not all variables are significant. Therefore, it is advisable to begin with a simple model, such as (1) or (2), and enter the variables one by one until we reach the best model. This process requires an econometric analysis, which we explain in the next section.

The variables in (18) are of a social character (such as the corruption level and the countries’ income level), as well as of an economic and demographic character (gross domestic product, economic active population and inflation). We apply a log-transformation to some variables to make the magnitudes of the regressors and the response variable as comparable as possible.

One of the most important variables influencing the medal ranking at Summer Olympics is GDP. Therefore, considering (12), the first model is as follows:

from where we get the following result

where SE corresponds to the standard error of the estimators, t corresponds to the values of the t-test, and RSS refers to the residual sum of squares.

We can conclude that the level of GDP is very important for performance at the Olympic Games, considering the positive coefficient of lnX_{1i}. Any country without GDP would have a negative performance (negative intercept, -932.6314) and, thus, would not participate at the Olympic Games. Given the estimated coefficient of 40.7059, a growth of GDP by 1% would be reflected in a growth in the medal ranking of 40.7059 points. For example, a percentage increase would result in winning 5 gold medals, or in a combination of one bronze medal, four 5th places, three 6th places, two 7th places and one 8th place.

In addition to the achieved results in (19), we can divide the analyzed countries into the following four groups according to their Olympic performance (considering the weighted medal ranking):

Considering this division, a country’s participation resulting in one bronze medal, one 4th place and two 5th places (19 points in total) can be seen as regular. On the other hand, a country with two gold medals, three silver medals, two bronze medals, and one 4th place (54 points in total) can be seen as having a good performance.

Therefore, a country must reach at least 150 points in the medal ranking to have an efficient participation at the Summer Olympics (Table 1). Considering (19), we get the following data

Thus, a country with a GDP lower than 12,893.39 million dollars is expected to perform badly at the Olympics. Moreover, a country with a regular performance would need to increase its GDP, on average, by 2.36 times to achieve a good performance at the Olympics, and by 27.56 times to reach an excellent performance.
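The GDP figures above can be approximately reproduced by inverting the fitted relation Y = -932.6314 + 40.7059·lnX_{1}. The sketch below makes two assumptions that are not stated in the text: GDP enters in US dollars, and the boundaries between the bad, regular and good groups are 15 and 50 points (only the 150-point boundary is given explicitly; the other two are inferred from the reported figures). The function names are hypothetical.

```python
import math

# Estimated coefficients quoted in the text for Y = B0 + B1 * ln(GDP).
B0 = -932.6314
B1 = 40.7059

def gdp_for_points(points):
    """GDP (assumed to be in US dollars) at which the fit predicts `points`."""
    return math.exp((points - B0) / B1)

def gdp_multiplier(points_from, points_to):
    """Factor by which GDP must grow to move between two predicted scores."""
    return math.exp((points_to - points_from) / B1)

# ASSUMPTION: 15 and 50 are inferred boundaries; 150 is stated in the text.
BAD_LIMIT, GOOD_LIMIT, EXCELLENT_LIMIT = 15, 50, 150

bad_threshold_musd = gdp_for_points(BAD_LIMIT) / 1e6
to_good = gdp_multiplier(BAD_LIMIT, GOOD_LIMIT)
to_excellent = gdp_multiplier(BAD_LIMIT, EXCELLENT_LIMIT)

print(f"GDP at the lower boundary: {bad_threshold_musd:,.0f} million USD")
print(f"multiplier to reach 'good': {to_good:.2f}")
print(f"multiplier to reach 'excellent': {to_excellent:.2f}")
```

Because the relation is linear in the logarithm of GDP, moving up by a fixed number of points always requires multiplying GDP by the same factor, regardless of the starting level.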

The previous results proved statistical significance of GDP on the medal ranking (

In Rio 2016, Great Britain won 2 medals more than in London 2012, resulting in 14 more weighted-ranking points. Considering the growth of public funding by 11%, this means that a growth of public funding by 1% resulted in a weighted medal growth of 1.2727. Thus, we can test whether the increased funding was reflected in the performance in Rio 2016. We therefore test the following hypothesis

Using the confidence level 95%, _{1})

Therefore, the confidence interval according to the data in (20) is (32.97, 48.43), which means that the hypothesized value 1.2727 does not belong to this interval and, thus, H_{0} is rejected. Great Britain’s growth of sport funding was not reflected in higher performance in Rio 2016. On average, each medal at Rio 2016 cost GBP 5.5 million (approximately USD 6.95 million). The results of the analysis show decreasing returns to scale of GDP to sport performance. Thus, GDP is not the only important factor that affects sport performance. Therefore, it is necessary to analyze the effect of other economic, demographic and social factors.
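The arithmetic behind this test can be sketched with the figures quoted above. The variable names are hypothetical; the sketch only checks whether the observed points gained per 1% of additional funding fall inside the reported 95% confidence interval.

```python
# Figures quoted in the text.
extra_points = 14                 # additional weighted-ranking points, Rio vs London
funding_growth_pct = 11           # growth of public funding, in percent
ci_low, ci_high = 32.97, 48.43    # reported 95% confidence interval

# Points gained per 1% of additional funding (the value under test).
points_per_pct = extra_points / funding_growth_pct

inside = ci_low <= points_per_pct <= ci_high
print(f"points per 1% of funding: {points_per_pct:.4f}")
print(f"inside the interval: {inside}")
```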

Including the countries’ income classification (GNI) in (17), we get

The idea is to verify whether the income classification has an impact on the medal ranking. Variable X4 represents a country’s status and, thus, can be expressed as a dichotomous variable, as follows:

where

Using (22), we get the following result

where SE corresponds to the standard error of the estimators, t corresponds to the values of the t-test, and RSSA refers to the residual sum of squares of the augmented model.

The effect of lnX1i is positive and statistically significant (t=9.17). On the other hand, the effect of D is negative and not significant (t=-0.01). The negative effect of D seems illogical, considering the definition of D: a richer country should perform better and rank higher in the weighted medal ranking. However, we can run a test of joint significance:

The obtained test statistic results in rejection of H0, i.e. the variables together are statistically significant in explaining the weighted medal ranking at the Summer Olympic Games. Further, as we observe that variable D is insignificant, there must be some relation between lnX1 and D. In this case, the information explained by D is contained in lnX1, and vice versa. To verify this, we run the following estimation

and we get the following result

The high correlation between lnX1 and D is logical, since the status is determined by the income of each country, which is related to the level of GDP.

If we observe multicollinearity between regressors, then the first solution is to separate the impacts, i.e. run a regression only for lnX1, as in (14), and run another regression as

from where we get the following result

As a result, in (24) the effect of D_{i} is positive and significant (

We can conclude that if

So far, we have analyzed the qualitative and quantitative effect of GDP on the performance at Rio 2016. Furthermore, we can evaluate the effect of other variables on the weighted medal ranking. Therefore, we can include the level of corruption (CPI) to find out whether there is a statistically significant effect, as in the case of GDP and GNI. Thus, we include X_{3} in (19) and (24) to measure this significance. We get the following expression

and we get the following result

Both variables, GDP and CPI, are statistically significant. For the level of corruption, tobs=1.99>1.9736=ttab, which is significant at the 95% confidence level. Further, we can run a test of joint significance for GDP and the level of corruption as follows

The obtained test statistic results in rejection of H_{0}, i.e. both variables together are statistically significant in explaining the weighted medal ranking at the Rio 2016 Summer Olympic Games.

In addition, as RSSA_{(25)}=2,379,825.53 is lower than RSS_{(19)}=2,432,965.44, the inclusion of the level of corruption in the model is statistically significant. Thus, we get a more complete model explaining the performance at the Summer Olympics in terms of GDP and CPI.
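The comparison of the two residual sums of squares can be sketched as a partial F statistic. The two RSS values are taken from the text; the residual degrees of freedom (177) are an assumption, since the number of observations is not stated here, and the function `partial_f` is a hypothetical helper implementing the standard nested-model formula.

```python
def partial_f(rss_reduced, rss_augmented, added, df_resid):
    """F statistic for adding `added` regressors to a nested linear model."""
    return ((rss_reduced - rss_augmented) / added) / (rss_augmented / df_resid)

RSS_19 = 2_432_965.44    # reduced model (19), from the text
RSSA_25 = 2_379_825.53   # augmented model (25), from the text
DF_RESID = 177           # ASSUMPTION: residual degrees of freedom, not stated

f_stat = partial_f(RSS_19, RSSA_25, added=1, df_resid=DF_RESID)
t_equiv = f_stat ** 0.5  # with one added regressor, F equals t squared

print(f"F = {f_stat:.2f}, sqrt(F) = {t_equiv:.2f}")
```

With a single added regressor, the square root of F can be compared directly with the reported t statistic for the level of corruption; under the assumed degrees of freedom the two are close.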

With respect to the achieved results in (25), the effect of corruption on Olympic performance is positive. Considering the interpretation of CPI (

with the following result

Although we got for the countries’ income
classification (_{obs}
_{tab}

We obtain

resulting in rejection of H_{0}, and both variables together are statistically significant in explaining the weighted medal ranking at the Summer Olympic Games. In addition, as RSSA_{(26)}=3,476,398 is lower than RSS_{(24)}=3,589,166.85, the inclusion of the level of corruption in the model is statistically significant. Thus, we get a more complete model explaining the performance at the Summer Olympics with regard to a country’s income classification and the level of corruption.

With respect to the achieved results in (25), the effect of corruption on Olympic performance is positive. Therefore, every improvement of the CPI by one point would result in an increase of the weighted medal ranking by 1.518 points. For example, if a country with CPI=50 improves its score by 20 points, up to CPI=70, then this country could expect an increase in the weighted medal ranking of 30.36 points (e.g. 2 more gold medals and 2 more silver medals).

Further, we analyze the effect of the economic active population (EAP) on the weighted medal ranking. Thus, we include EAP, X2, in (25).

We get the following result

Although we get a positive coefficient (13.2018) for EAP, which is logical, this variable is not statistically significant for the medal ranking (t_{obs}

where the first slope coefficient captures the effect of GDP per capita (in this case, per EAP) on sport performance at the Summer Olympics in Rio 2016. We use GDP per EAP because we seek to measure performance relative to the population that best describes a nation’s potential to generate participants for the Olympic Games. We get the following result

Although we get a positive coefficient for GDP per EAP (significant, as t=3.18), the effect of corruption becomes nonsignificant (t=0.21). Therefore, in this combination we lose the significance of corruption for sport performance that we obtained in (25).

Moreover, we can also consider the same combination using GDP per EAP in logarithmic form, as follows

where the first slope coefficient shows the percentage effect of GDP per EAP on sport performance, and the estimation is

Similarly, the effect of corruption in this last regression loses its significance (

remains significant (

Fortunately, we have another model that includes both variables, GDP and corruption, in this case with GDP in qualitative form (30). Therefore, we add the EAP variable to this model and obtain

with the following result

In this case, the EAP is statistically significant (

where RSSA_{(30)}=2,408,523.53 and RSSA_{(26)}=3,476,398. That is, the residual sum of squares of the augmented model, RSSA_{(30)}=2,408,523.53, was significantly reduced.

Therefore, the inclusion of EAP makes the model more complete in explaining the performance at the Summer Olympics, i.e. model (30) represents a relation between the performance at the Summer Olympic Games and economic effects (GNI), demographic effects (EAP) and corruption (CPI).

It is important to emphasize the way in which the EAP variable enters (30). Although we do not formally derive the functional form to be applied (this is not the intention of the work), the idea of taking the logarithm is to make the regressors more comparable and thus to avoid problems of heteroscedasticity and significance (see section 1.2).

Finally, we can analyze the effect of the inflation rate on the weighted medal ranking. Thus, we incorporate the inflation rate X5 into (25)

We get the following result

In this case, INF is not statistically significant for explaining the medal ranking at Olympics (_{tab}

Similarly, we can analyze the effect of inflation rate together with GNI and CPI. We get the following model

We get the following result

As in model (31), the combination of inflation rate together with GNI and CPI is not statistically significant (

Model (30), where the weighted medal ranking is explained by GNI, CPI and EAP, can be considered the most representative. In this case, an increase of EAP by 1% would result in a gain of 4.29 points in the weighted medal ranking, improving the corruption level by 1 point would result in a gain of 2.476 points, and if a country improves in the GNI classification (for example, from low-income to lower middle-income economies), then this country would gain 29.853 more points in the weighted medal ranking. Theoretically, this is the most suitable combination for improving sport performance at the Olympic Games.

This result seems logical, as EAP relates to the population aged 15-64 as a share of total population. Therefore, this variable describes a country’s potential to generate participants (athletes) for the Olympic Games. As Lozano et al. (

Result (24) shows that countries ranked as low-income and lower middle-income economies are expected to gain zero points in the weighted medal ranking (result is significant,

First, it is important to mention that the GNI classification changes over time (most recently in July 2015). Therefore, some countries previously classified as low-income economies are now listed as lower middle-income economies (and, consequently, some have moved from lower middle-income to upper middle-income economies). Second, Li et al. (

Models (25), (26), (27) and (30) indicate a positive impact of the CPI score on the medal ranking. Figure 1 summarizes the relation between CPI and the weighted medal ranking by GNI classification. We can see that most high-income economies have lower perceived corruption (a higher CPI score). Moreover, with lower perceived corruption, these countries achieve a better medal ranking. Similarly, upper middle-income economies achieve a better medal ranking than lower middle-income economies, etc. Therefore, we can conclude that an improvement in corruption perception leads to a higher probability of achieving a better medal ranking.

This result is in contradiction to Potts (

Finally, different variables can be used to explain the performance at the Summer Olympics (regarding the weighted medal ranking). For example, Churilov and Flitman (

The main objective of the article was to analyze which economic, demographic and social indicators affect sport performance at the Summer Olympic Games in Rio 2016. For this purpose, we used a set of 5 indicators: Gross domestic product, Economic active population, Corruption perception index, Gross national income, and Inflation rate. This set covers the most important economic and demographic indicators commonly used in sport performance analysis (GDP and EAP), as well as the social indicator CPI. Performance is represented by the weighted medal ranking, which includes the first 8 positions in each discipline, weighted using the IAAF methodology.

Model (30), which includes the GNI, CPI and EAP indicators, can be regarded as the best explanatory model of the medal ranking. In this model, all parameters are statistically significant. Thus, Olympic performance can be explained more completely than by the classical population-GDP theory alone. In addition, the model has a correct economic interpretation, as all three parameters have positive coefficients. Therefore, growth in each of these parameters would result in better performance at the Olympic Games. In detail, improving the level of corruption by 1 point (on the Transparency International CPI) would result in a gain of 2.476 points in the weighted medal ranking, an increase of EAP by 1% would result in a gain of 4.29 points, and a move up by one GNI class would result in a gain of 29.853 points.

In this article, we have not included other economic, demographic or social factors, such as the level of education, the relation between sport performance and population health, or the impact of governmental policies on sport performance. Future analysis will therefore proceed in this direction.

The authors would like to thank La Salle University in Mexico City, Mexico, for the support in carrying out this work, which was done under university grant projects.

It is important to emphasize that the objective of the study is not to make an in-depth analysis of econometrics, but rather to justify “linear” relations and introduction of explanatory variables, through the assumptions mentioned in the model (8).

Trading Economics. 2016. Available at

Kosovo Agency of Statistics, available at

Detailed methodology is available at:

Residuals are simply estimates of the disturbances (errors); since the disturbances are unknown, they are estimated.

The levels of performance efficiency were chosen by the authors considering the statistical distribution of the data set, and thus are subjective, but logical.

In particular, the estimator has a normal distribution.

We can also analyze through a test of hypothesis whether the aggregation of to a model is significant or not. However, sufficient observing , which does not differ much from . Therefore, the aggregation is not significant.