Using a microsimulation model to get things right: a wage equation for Poland.
Not for quotation
1Leszek Morawski, Warsaw University
Michał Myck, DIW - Berlin
Anna Nicińska, Warsaw University
Keywords: wage equation, sample selection bias, return to education
JEL classification: J24, J30
We present an application of the Polish microsimulation model SIMPL to the estimation of a wage equation on data from the Polish Household Budget Survey for 2005. Research in Poland has so far failed to address two key issues in this area. First, because all household-level surveys collect information on net incomes, some papers have analysed net rather than gross wages. Second, all estimations of the wage equation on Polish data known to us have failed to control for selection, which since Heckman’s seminal paper (1979) has been recognised as a crucial correction for the appropriate identification of the parameters of the wage equation. SIMPL allows us to do both: we can compute gross incomes on the basis of the net income information in the data, and we can use income in the out-of-work scenario generated by the model (together with a set of demographic factors) as an instrument in the selection-corrected equation. The paper demonstrates the degree of bias induced by the approaches taken in the past, presents the first set of selection-corrected wage equation parameters and stresses the importance of simulated out-of-work incomes as an instrument in controlling for selection.
1 Acknowledgement: Financial support from the EFS by the Polish Ministry of Labor and Social Policy in a project entitled “Model mikrosymulacyjny jako narzędzie wspierające polityki rynku pracy” (“Microsimulation model as an instrument supporting labor market policies”) is gratefully acknowledged. Data provided by the Ministry of Economy, the Ministry of Finance, the Central Statistical Office and the Social Insurance Institution are gratefully appreciated. The usual disclaimer applies.
The tradition of estimating microeconomic wage equations dates back to Mincer (1974), who provided both the theoretical background and empirical results. Such analyses were popular in the 1970s and 1980s (e.g. Willis, 1986). However, this hardly applies to Poland. At that time Poland was not a free market economy and wages were set arbitrarily by a planner, so wages under socialism may be treated as a separate field of economics. On the other hand, it is also possible that, despite central planning, wages were determined in a way similar to that in free market economies (Rutkowski, 1994). Once the economic system in Poland was transformed, there was less discussion of the determinants of expected wages, in the Polish economy or elsewhere, than in the 1980s, since many countries had already investigated the issue in detail.
In order to properly capture the decisions that people make on the labor market one needs to model not only what happens within the labor force, but also what makes people enter or leave the labor market. For this reason one needs to be able to predict a wage for everybody, regardless of whether she or he reports a positive wage. Moreover, knowledge of the potential out-of-work income of those who work is necessary. Both conditions can be satisfied if there exists a microsimulation model that enables estimation of a wage equation corrected for sample selection and simulates incomes from social benefits and assistance when not working. A microsimulation model for Poland (SIMPL, see footnote 2) has recently been developed in cooperation among researchers from Warsaw University, IZA-Bonn and DIW-Berlin (see Bargain et al. (2006)) with financial support from the Polish Ministry of Labour and Social Policy, the Ministry of Economy and the Ministry of Finance. An additional advantage of SIMPL is that it distinguishes between gross wages, income taxes and benefits arising from tax and benefit policies. This makes estimation of gross wages possible, so that the impact of the distribution of wealth within the economy is controlled for and income taxation does not affect the expected wage rate.
The way in which wages depend on individual characteristics matters for both microeconomic and macroeconomic policies, including those concerning inflation, unemployment and labor force participation. Knowledge of the way in which wages in Poland are determined is fundamental for understanding the processes that are taking place at the moment and their future consequences. This knowledge seems to be insufficient.
2 The model is currently being extended in a project called "Microsimulation model as an instrument supporting labor market policies" financed from the EFS by the Polish Ministry of Labour and Social Policy (see www.simpl.pl).
This paper aims to provide results of wage estimation using the most adequate methodology, which to our knowledge has not been employed in such analysis in Poland, and to present basic studies on wages together with the most important publications on Poland. The first part of the paper briefly describes the tradition and methodology of estimating the wage equation. Then studies concerning Poland in this field are presented. The following chapter describes the methodology and data employed in the research. Then empirical results are provided and interpreted. The final chapter concludes.
1. Review of the most important studies concerning wages
According to microeconomics, the wage is the price of labor, which is a factor of production, and so the wage is equal to the marginal productivity of labor. Keeping in mind all reservations connected with the uniqueness of the production factor supplied by human beings, the wage reflects one’s productivity, which is a function of one’s human capital. There are other factors affecting the level of wages, such as the inflation rate, trade unions and regulations such as a minimum wage, but they are not captured directly by human capital theory. The level of accumulated human capital is taken to be the result of a decision made by a student who can choose between continuing studies for another year, at the cost of forgone earnings, and quitting schooling. It is assumed that all individuals build up their human capital in the process of full-time schooling and that the cost of each year of schooling relative to the foregone earnings from that period is constant over time and equal to one for everyone (Becker and Chiswick, 1966). The opportunity cost sets a budget constraint for a student maximizing her stream of lifetime earnings conditional on her human capital. The wage after S years of schooling, assuming a constant interest rate, is determined in the following way:
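A common way of writing the schooling relation described here, in the spirit of Becker and Chiswick (1966), is sketched below. The symbols are assumptions made for this sketch and need not match the authors' notation: W_0 is the wage with no schooling, r the constant rate of return, and K the ratio of the cost of a year of schooling to the earnings foregone in that year.

```latex
\begin{aligned}
W_S &= W_0 \,(1 + rK)^{S} \\
\ln W_S &= \ln W_0 + S \ln(1 + rK) \;\approx\; \ln W_0 + rKS \\
K = 1 \;&\Rightarrow\; \ln W_S \approx \ln W_0 + rS
\end{aligned}
```

With K equal to one, the slope of log wages in years of schooling can be read as the rate of return to a year of education.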
The above equation was used by Mincer (1974) in his empirical research for the US, keeping the theoretical log-linear relation between earnings and schooling (Heckman and Polachek, 1974). The assumption that the opportunity cost is the sole cost of education can be relaxed by adding direct costs and the cost of student loans (K greater than one) or subtracting tax remissions for students (lowering K) (Chiswick, 1997). Moreover, the way in which human capital is generated might be broadened by the inclusion of skills developed at work after finishing formal education. Mincer (1974) extends the functional form proposed by human capital theory with potential experience, measured by the time potentially spent at work (one’s age above the age of finishing formal education), according to the following equation:
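In its textbook form, the earnings function with potential experience is usually written as below. This is a sketch in assumed notation: X denotes potential experience, A age, A_S the age of finishing formal education, and the coefficients \beta and the error term \varepsilon are labels chosen here.

```latex
\ln W = \beta_0 + \beta_1 S + \beta_2 X + \beta_3 X^{2} + \varepsilon,
\qquad X = A - A_S
```

The quadratic term allows the return to experience to decline over the working life, which corresponds to \beta_3 < 0 in typical estimates.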
The paper discussed in such detail above started a stimulating discussion. The most controversial issue was raised by Heckman in his papers on sample selection bias as an omitted variable problem (1976, 1979). He argued that the estimates might be biased, as the expected wage was estimated on a selected sample consisting only of those who report a positive wage. It is very likely that, owing to greater unobservable ability, those who work are endowed with more human capital than those who fail to find a job, so that selection into employment is not random. If so, the sample on which the study is conducted is not random and the estimates are inconsistent as a consequence of an omitted variable. Heckman (1979) extends the wage equation model with a participation equation that includes an instrumental variable, allowing for sample selection correction. His method is connected with the concept of the reservation wage: if one faces a wage that is positive but lower than her reservation wage, one is not going to enter the labor market and her wage will not be observed. Nonrandom selection is revealed by a positive, statistically significant correlation between the random terms of the two equations.
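The structure of the selection model can be summarized as follows. This is the standard textbook formulation rather than the authors' exact specification; the symbols x_i, z_i, \beta, \gamma, \rho, \sigma_u and \lambda are notation assumed here, with jointly normal error terms (u_i, v_i) and the variance of v_i normalized to one.

```latex
\begin{aligned}
\ln w_i &= x_i'\beta + u_i
  && \text{(wage equation, observed only if } p_i^{*} > 0\text{)} \\
p_i^{*} &= z_i'\gamma + v_i
  && \text{(participation equation)} \\
E[\ln w_i \mid p_i^{*} > 0] &= x_i'\beta + \rho\,\sigma_u\,\lambda(z_i'\gamma),
  && \lambda(c) = \frac{\phi(c)}{\Phi(c)}
\end{aligned}
```

Here \rho is the correlation between u_i and v_i. When \rho = 0 the correction term vanishes and least squares on the sample of workers is consistent; otherwise the inverse Mills ratio \lambda plays the role of the omitted variable.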
A proper instrumental variable, partially correlated with the participation decision but insignificant for the wage, is crucial for the identification of the omitted variable in the subsample of workers and for the consistency of the whole model (Heckman, 1979). There are not many variables that affect the decision to enter or leave a job without having any impact on the wage. However, there are a number of recognized and widely employed instruments, such as the non-labor income of an individual, the income of a spouse, household wealth or having children, the latter especially for women (Puhani, 2000a). New techniques have been developed, and the two-step procedure in sample selection models, even though still used (e.g. Shonkwiler and Yen, 1999), may be replaced by full-information or limited-information maximum likelihood methods (e.g. Jensen et al., 2001), which are more efficient but require large samples (Puhani, 2000a). The maximum likelihood methods are more sensitive than the two-step procedure to the assumption of a linear relationship between the random terms in the wage and participation equations. However sophisticated the described methods are, it is still easier to predict the expected wages of those who work (the so-called two-part model, see Duan et al. 1983, 1984a, 1984b) than the expected wages for the whole population, including those with unobserved wages (Hartman, 1991). The most recent methodological studies of non-random selection improve the estimation of the self-selection correction (Cunha and Heckman, 2007), while others develop non-parametric methods (Blundell and Powell, 2004).
The discussion has raised not only methodological issues. Many doubts concern the linearity of the relation between human capital and earnings. There is still no unanimity on this topic, as there are arguments for (Heckman and Polachek, 1974; Card and Krueger, 1992) and against linearity (Heckman et al. 1996; Trostel, 2005). The other way of accumulating human capital, which is experience gained at work, seems to be related to wages in a much more complex way, as both the third and sometimes even the fourth power of potential experience are statistically significant (Lemieux, 2004). This has to do with the imperfection of the measure employed by Mincer as the proxy for potential experience, which relies on many assumptions criticized in the literature, namely the homogeneity of skills (Willis and Rosen, 1979), of the efficiency of schooling institutions (Psacharopoulos, 1989), of life histories and of the probability of being unemployed over a lifetime, all of which in fact differ. Thus the costs of education (Chiswick, 1997) and consequently the rate of return to education may be treated as a variable that changes over a lifetime (Murphy and Welch, 1992). The question of whether education is a proper metric of human capital has been raised often (e.g. Kroch and Sjoblom, 1994). Furthermore, there is controversy about the way in which schooling should be measured, as the marginal return to different levels of education need not be constant across generations (Connelly and Gottschalk, 1995) nor over time for a given cohort; thus the highest level of education obtained has become a more popular metric of education than years in full-time education (e.g. Trostel, 2005).
Studies on Polish wages
The price mechanism responsible for equilibrium on factor markets was not officially in operation in the nationalized and centrally planned economy that existed in Poland from the Second World War until 1989. There are few papers concerning the determinants of wages under central planning, a question raised by Rutkowski (1994). He claims that, despite the low variation of wages, their determination did not differ much from that under free market rules. However, his study is not based solely on human capital theory, as it includes a number of additional explanatory variables such as the value of fixed assets per employee, firm size, the structure of the firm’s industry and the ratio of final profit to sales revenue (Rutkowski, 1994).
In 1989 trade was liberalized and the free market was successfully introduced in Poland. The economy was subject to few regulations; wages could develop freely. The private sector grew rapidly from 29% of GDP in 1989 to 60% of GDP in 1995 (Keane and Prasad, 2002), which was the result of immediate and full liberalization. Transition was a period of many deep adjustments in the economy. However interesting these were from the economists’ point of view, there was little tradition (Rutkowski, 1994) and no urgent need to investigate the determinants of expected wages.
However, the transformation of the Polish economy has been recognized as relatively successful, and the processes that took place between 1989 and 1996 (when macroeconomic indicators reached a stable level) have been of interest to many researchers. Among studies concerning the Polish transformation, some focus on wage dynamics and the structure of earnings differentiation (e.g. Goh and Javorcik, 2005; Newell and Socha, 2005; Puhani, 2000b). Most of them are recent and usually approach the topic from a macroeconomic perspective. The direct result of the introduction of the free market was an increase in the polarization of households’ earnings, which was on average greater in the private sector than in the public sector, by 10% and 20% for small and medium firms respectively (Keane and Prasad, 2002). The stronger impact on private enterprises is explained by the fact that public firms were less exposed to free market competition, modern management and marketing.
The estimation by Keane and Prasad (2002) was an extended version of the standard Mincer equation, controlling for levels of education, rural or urban area, sex, industry and experience in quadratic form. However, the research was based on cross-sections of individual net income over the periods 1985-1992 and 1994-1996, without sample selection correction, as no proper instrument was available. The most recent estimation of the Mincerian equation for Poland explains the monthly gross wages of full-time workers in companies employing 9 or more workers, controlling for gender, years in formal education and on-the-job training (Rogut and Roszkowska, 2007). The empirical evidence confirms the predictions of human capital theory and indicates that for men work experience has the greatest impact on wages, while for women it is formal education (Rogut and Roszkowska, 2007).
The analyses of wage inequality by Newell and Socha (2005, 2007) are based on individual Poles’ net incomes from work. The research investigates hourly wage rates and states that their differentiation has not changed much after the transition (Newell and Socha, 1998). It is claimed that the increase in the differentiation of household earnings was caused not by a change in the hourly wage rate, but above all by the drop in the labor supplied by a household due to unemployment (in the case of older adults) and the decision to continue education (in the case of younger adults). This reasoning has been confirmed by Newell (2001) and is in line with the empirical evidence showing a significantly greater return to higher education after the transition than before it (Keane and Prasad, 2002). Moreover, the return to education was greater in the private sector than in the public one in Poland in 1995 (Socha and Weisberg, 2002). The issue of the return to education is still being investigated, and there are numerous cross-national studies aiming to compare cultural and national differences between educational systems (e.g. Trostel, 2005). One of them estimates an extended Mincer equation for male full-time, full-year workers and self-employed aged 30-55 from a number of countries, including Poland, based on data on annual earnings gross of taxes and employees’ social insurance contributions (Hartog et al., 2004). The definition of the correct measure of education is crucial for that comparative research. Note also that the sample covers a very limited part of the population, so it is not random, which means that the results should not be generalized to the whole society.
The determination of expected wages is a field in which discrimination might be investigated. Many concepts have been developed to explain why gender matters for wages, and the gender pay gap has been widely examined (e.g. Plasman and Sissoko, 2004), also for Poland (Grajek, 2001). Empirical studies of the net incomes of full-time employees show that there was a decline in female employment outcomes, while the average gender gap remained stable at the 22-23 per cent level between 1993 and 1998 in Poland (Adamchik and Bedi, 2003). The paper by Latuszyński and Woźny (2006) applies the Oaxaca (1973) decomposition method in order to find the determinants of the hourly net wages of employees working in 221 private firms in Poland. Region, actual experience (measured in years at the enterprise and in the industry of the enterprise), level of education and number of children are controlled for. The study focuses solely on the decomposition, so it concerns not what is explained by qualifications, experience, industry and regional differentiation, but just the opposite: what cannot be explained by observed individual characteristics other than gender. According to Latuszyński and Woźny (2006), the main reason for lower average female wages is the structure of labor supply, in which women usually work in low-skill occupations and men in high-skill occupations; the gender gap in earnings vanishes in marketing and in research and development.
Another question raised in the literature is the relation between unemployment and wages. Such an analysis for Poland must take into account regional differences in the unemployment rate and in the structure of labor demand, as is done in the estimation of the Polish wage curve for the period 1991-1996 (Duffy and Walsh, 2001). According to the concept of the non-accelerating inflation rate of unemployment (Stiglitz, 1997), such an unemployment rate would not affect the level of wages. This macroeconomic perspective adds further agents to the standard microeconomic framework, such as a central bank with its inflation policy and trade unions with their wage rigidity, and aims to verify whether the rate of unemployment has reached the natural level, neutral for inflation, or whether its nature is structural, difficult to remove and unfavorable for the economy as a whole. Results of the Granger causality test suggest that the Polish wage level depended on the unemployment rate, so unemployment in Poland between 1993 and 2004 was of the latter kind (Gaweł, 2006). Although macroeconomic analyses of inflation, wages and unemployment are common (e.g. Commander and Coricelli, 1992; Welfe and Majsterek, 2002), the functional form of the estimated wage equation in such studies is not derived directly from human capital theory and does not tell much about the individual characteristics affecting the marginal productivity of labor.
The research presented in the following chapters is unique in the light of the achievements of Polish labor market analysis so far. None of the earlier studies covers such a wide sample of part-time and full-time individuals of both genders employed by all types of companies, uses their gross incomes, or has a proper instrumental variable enabling sample selection correction.
2. Methodology and the data
According to empirical studies concerning selection into the labor market, the decision whether to work or not is logically and statistically significantly correlated with the level of expected earnings from that work. For this reason the Heckman two-step procedure is a proper method of dealing with the bias of the estimators. Linear estimates are also provided in order to allow adequate comparisons. The dependent variable is the logarithm of the monthly wage rate in gross and net terms.
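The two-step procedure can be illustrated on simulated data. The following is a minimal sketch, not the estimation code used with SIMPL: the data-generating process, all parameter values and the variable names (`school`, `out_of_work_income`) are made up for the illustration, and only the Python standard library is used. Step one fits a probit for participation that includes the instrument; step two runs least squares on workers only, adding the inverse Mills ratio as a regressor.

```python
import random
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def probit(rows, d, iterations=25):
    """Probit maximum likelihood by Fisher scoring."""
    k = len(rows[0])
    gamma = [0.0] * k
    for _ in range(iterations):
        score = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for r, di in zip(rows, d):
            index = sum(g * v for g, v in zip(gamma, r))
            p = min(max(N01.cdf(index), 1e-10), 1 - 1e-10)
            f = N01.pdf(index)
            for i in range(k):
                score[i] += f * (di - p) / (p * (1 - p)) * r[i]
                for j in range(k):
                    info[i][j] += f * f / (p * (1 - p)) * r[i] * r[j]
        gamma = [g + s for g, s in zip(gamma, solve(info, score))]
    return gamma


def heckman_two_step(x_rows, z_rows, works, log_wage):
    """Step 1: probit for participation. Step 2: OLS with the inverse Mills ratio."""
    gamma = probit(z_rows, works)
    x_sel, y_sel = [], []
    for x, z, d, y in zip(x_rows, z_rows, works, log_wage):
        if d == 1:
            index = sum(g * v for g, v in zip(gamma, z))
            mills = N01.pdf(index) / N01.cdf(index)
            x_sel.append(x + [mills])
            y_sel.append(y)
    return ols(x_sel, y_sel)  # the last coefficient belongs to the Mills ratio


# Simulated data; every number below is an assumption made for the illustration.
rng = random.Random(0)
x_rows, z_rows, works, log_wage = [], [], [], []
for _ in range(6000):
    school = rng.uniform(8, 18)
    out_of_work_income = rng.gauss(0, 1)      # instrument, excluded from wages
    v = rng.gauss(0, 1)                       # participation error
    u = 0.5 * v + rng.gauss(0, 0.3)           # wage error, correlated with v
    x_rows.append([1.0, school])
    z_rows.append([1.0, school, out_of_work_income])
    works.append(1 if -3.0 + 0.25 * school - out_of_work_income + v > 0 else 0)
    log_wage.append(6.0 + 0.08 * school + u)  # used only where works == 1

naive = ols([x for x, d in zip(x_rows, works) if d],
            [y for y, d in zip(log_wage, works) if d])
corrected = heckman_two_step(x_rows, z_rows, works, log_wage)
print("naive OLS return to schooling:", round(naive[1], 3))
print("two-step return to schooling: ", round(corrected[1], 3))
```

In this simulated setting the true return to schooling is 0.08, and the errors of the two equations are positively correlated, so least squares on workers alone understates it. The exclusion of the instrument from the wage equation is what identifies the correction term beyond the nonlinearity of the Mills ratio.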
The instrumental variables used for model identification are: the family’s disposable income if not working, other household members’ equivalized income and whether a household contains more than one family. There are multifamily households, and the wealth of household members who do not belong to a family may affect the participation decision within this family. In order to distinguish between zero income of other families in the household due to lack of income and due to lack of other families, we control for whether a household consists of one or more families. The disposable income of a family if not working is generated by SIMPL as a weighted sum of all benefits and social assistance that the family is entitled to, given its financial and demographic situation. It is expected that the greater the non-labor income, the lower the probability of entering the labor market. The disposable income if not working is calculated at the family level, as a spouse’s income affects the participation decision. Moreover, eligibility for some benefits depends on the number of non-working partners, so non-labor income is generated for all possible scenarios (both not working, one working, both working) and the value of the instrument consistent with the observed situation is chosen for a given family. The non-labor income for one-family households captures all benefits that whole households are entitled to, such as housing benefit. Other household members’ disposable income when the family under consideration is not working might be more important for two-generation families, where grandparents’ income should also be taken into account when making the labor market participation decision; that is why the number of families in one household is also included. All these instrumental variables are assumed to be uncorrelated with individual marginal productivity.
The dataset comes from the Household Budget Survey (BBGD) 2005 provided by the Central Statistical Office (GUS). The gross wages are generated by SIMPL according to the Polish income tax rules for 2005. The grossed-up wages include income tax and the employee’s social and health insurance contributions (Morawski, 2007). It is assumed that all part-time workers are exactly half-time workers. The sample contains persons who may be treated as part of the labor force, so children, pensioners and persons unable to work due to health conditions are not taken into account. The self-employed are also excluded from the sample, as their income from employment cannot be distinguished from their profits, so their earnings should not be treated as wages. The sample is thus composed of individuals aged 18-59 who are neither retired nor unable to work for health reasons nor self-employed. However, all disabled persons who are able to work are included in the sample, regardless of their disability level.
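The sample restrictions can be expressed as a simple filter. The sketch below is illustrative only: the record layout and field names are hypothetical, since the actual BBGD variable names are not given in the text.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record layout, not the actual BBGD variables.
    age: int
    retired: bool
    unable_to_work: bool
    self_employed: bool
    disabled: bool = False
    part_time: bool = False


def in_sample(p: Person) -> bool:
    """Aged 18-59, not retired, not unable to work for health reasons, not self-employed."""
    return (18 <= p.age <= 59
            and not p.retired
            and not p.unable_to_work
            and not p.self_employed)


def hours_fraction(p: Person) -> float:
    """All part-time workers are treated as exactly half-time workers."""
    return 0.5 if p.part_time else 1.0


people = [
    Person(age=35, retired=False, unable_to_work=False, self_employed=False),
    Person(age=62, retired=True, unable_to_work=False, self_employed=False),
    Person(age=40, retired=False, unable_to_work=False, self_employed=True),
    Person(age=45, retired=False, unable_to_work=False, self_employed=False,
           disabled=True, part_time=True),
    Person(age=17, retired=False, unable_to_work=False, self_employed=False),
]
sample = [p for p in people if in_sample(p)]
print(len(sample), [hours_fraction(p) for p in sample])
```

Note that disability status does not enter the filter: a disabled person who is able to work stays in the sample, in line with the definition above.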