Analysis of the Left Censored Data from the Generalized Exponential Distribution
Sharmishtha Mitra & Debasis Kundu
Department of Mathematics & Statistics
Indian Institute of Technology Kanpur
The generalized exponential distribution proposed by Gupta and Kundu (1999) is an important lifetime distribution in survival analysis. In this paper, we consider maximum likelihood estimation of the parameters of the generalized exponential distribution when the data are left censored. We obtain the maximum likelihood estimators of the unknown parameters and also obtain the Fisher Information matrix. Simulation studies are carried out to observe the performance of the estimators in small samples.
KEYWORDS: Fisher Information, generalized exponential distribution, left censoring, maximum likelihood estimator.
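The generalized exponential (GE) distribution (Gupta and Kundu, 1999) has the cumulative distribution function (CDF)
$$F(x; \alpha, \lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha},$$
with the corresponding probability density function (PDF) given by
$$f(x; \alpha, \lambda) = \alpha \lambda e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha - 1}, \quad \text{for } x > 0.$$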
Here $\alpha$ and $\lambda$ are the shape and scale parameters respectively. The GE distribution with the shape parameter $\alpha$ and the scale parameter $\lambda$ will be denoted by $GE(\alpha, \lambda)$. It is known that the shape of the PDF of the two-parameter GE distribution is very similar to the corresponding shapes of gamma or Weibull distributions. It has been observed in Gupta and Kundu (1999) that the two-parameter $GE(\alpha, \lambda)$ can be used quite effectively in analyzing many lifetime data, particularly in place of two-parameter gamma or two-parameter Weibull distributions. The two-parameter $GE(\alpha, \lambda)$ can have increasing and decreasing failure rate depending on the shape parameter. The readers are referred to Raqab (2002), Raqab and Ahsanullah (2001), Zheng (2003) and the references cited there for some recent developments on GE distribution.
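As an illustration of the definitions above, the following minimal Python sketch evaluates the CDF and PDF of $GE(\alpha, \lambda)$ and draws random values by inverting the CDF. The function names (`ge_cdf`, `ge_pdf`, `ge_quantile`, `ge_sample`) are hypothetical and chosen only for this example; the quantile formula $x = -\ln(1 - u^{1/\alpha})/\lambda$ follows from solving $F(x; \alpha, \lambda) = u$.

```python
import math
import random


def ge_cdf(x, alpha, lam):
    """CDF of GE(alpha, lam): (1 - exp(-lam * x)) ** alpha for x > 0."""
    if x <= 0:
        return 0.0
    return (-math.expm1(-lam * x)) ** alpha


def ge_pdf(x, alpha, lam):
    """PDF of GE(alpha, lam), the derivative of ge_cdf with respect to x."""
    if x <= 0:
        return 0.0
    return alpha * lam * math.exp(-lam * x) * (-math.expm1(-lam * x)) ** (alpha - 1)


def ge_quantile(u, alpha, lam):
    """Inverse CDF: solves (1 - exp(-lam * x)) ** alpha = u for 0 < u < 1."""
    return -math.log1p(-(u ** (1.0 / alpha))) / lam


def ge_sample(n, alpha, lam, rng):
    """Draw n values by inverse-transform sampling; rng is a random.Random."""
    return [ge_quantile(rng.random(), alpha, lam) for _ in range(n)]
```

For instance, `ge_sample(50, 2.0, 1.0, random.Random(0))` returns 50 simulated lifetimes; with $\alpha = 1$ the functions reduce to the ordinary exponential distribution.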
Although several papers have already appeared on the estimation of the parameters of the GE distribution for the complete sample case (see, for example, the review article of Gupta and Kundu (2006b)), not much attention has been paid to the censored sample case. The main aim of this paper is to consider the statistical analysis of the unknown parameters when the data are left censored from a GE distribution. We obtain the maximum likelihood estimators (MLEs) of the unknown parameters of the GE distribution for left censored data. It is observed that the MLEs cannot be obtained in explicit form, and the MLE of the scale parameter can be obtained by solving a non-linear equation. We propose a simple iterative scheme to solve the non-linear equation. Once the MLE of the scale parameter is obtained, the MLE of the shape parameter can be obtained in explicit form. We have also obtained the explicit expression of the Fisher information matrix, and it has been used to construct the asymptotic confidence intervals of the unknown parameters. An extensive simulation study has been carried out to observe the behavior of the proposed methods for different sample sizes and for different parameter values, and it is observed that the performances of the proposed methods are quite satisfactory.
There is a widespread application and use of left-censoring or left-censored data in survival analysis and reliability theory. For example, in medical studies patients are subject to regular examinations. Discovery of a condition only tells us that the onset of sickness fell in the period since the previous examination and nothing about the exact
date of the attack. Thus the time elapsed since onset has been left censored. Similarly, we have to handle left-censored data when estimating functions of exact policy duration without knowing the exact date of policy entry; or when estimating functions of exact age without knowing the exact date of birth. A study on the “Patterns of Health Insurance
Coverage among Rural and Urban Children” (Coburn, McBride and Ziller, 2001) faces
this problem due to the incidence of a higher proportion of rural children whose spells were "left censored" in the sample (i.e., those children who entered the sample uninsured), and who remained uninsured throughout the sample. Yet another study (Danzon, Nicholson and Pereira, 2004), which used data on over 900 firms for the period 1988-2000 to estimate the effect on phase-specific (phases 1, 2 and 3) biotech and pharmaceutical R&D success rates of a firm's overall experience, its experience in the relevant therapeutic category, the diversification of its experience across categories, the industry's experience in the category, and alliances with large and small firms, saw that
the data suffered from left censoring. This occurred, for example, when a phase 2 trial was initiated for a particular indication where there was no information on the phase 1 trial. Applications can also be found in econometric models, for example, for the joint determination of wages and turnover. Here, after the derivation of the corresponding likelihood function, an appropriate dataset is used for estimation. For a model that is designed for a comprehensive matched employer-employee panel dataset with fairly detailed information on wages, tenure, experience and a range of other covariates, the raw dataset may contain both completed and uncompleted job spells. A job duration might be incomplete because the beginning of the job spell is not observed, which is an instance of left censoring (Bagger, 2005). For some further examples, one may refer to Balakrishnan (1989), Balakrishnan and Varadan (1991), Lee et al. (1980),
The rest of the paper is organized as follows. In Section 2 we derive the maximum likelihood estimators of the parameters of $GE(\alpha, \lambda)$ in the presence of left censoring. In Section 3, we
provide the complete enumeration of the Fisher Information matrix and discuss certain issues on the limiting Fisher information matrix. Simulation results and discussions are provided in Section 4.
2. MAXIMUM LIKELIHOOD ESTIMATION
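In this section, maximum likelihood estimators of the parameters of $GE(\alpha, \lambda)$ are derived in the presence of left censored observations. Let $X_{(r+1)}, \ldots, X_{(n)}$ be the last $n - r$ order statistics from a random sample of size $n$ following the $GE(\alpha, \lambda)$ distribution. Then the joint probability density function of $X_{(r+1)}, \ldots, X_{(n)}$ is given by
$$f(x_{(r+1)}, \ldots, x_{(n)}; \alpha, \lambda) = \frac{n!}{r!} \left[F(x_{(r+1)}; \alpha, \lambda)\right]^{r} \prod_{i=r+1}^{n} f(x_{(i)}; \alpha, \lambda) = \frac{n!}{r!} \, \alpha^{n-r} \lambda^{n-r} \left(1 - e^{-\lambda x_{(r+1)}}\right)^{\alpha r} e^{-\lambda \sum_{i=r+1}^{n} x_{(i)}} \prod_{i=r+1}^{n} \left(1 - e^{-\lambda x_{(i)}}\right)^{\alpha - 1}. \qquad (2.1)$$
Then the log likelihood function denoted by $L(x_{(r+1)}, \ldots, x_{(n)}; \alpha, \lambda)$ (or simply, $L(\alpha, \lambda)$) is
$$L(\alpha, \lambda) = \ln n! - \ln r! + (n - r) \ln \alpha + (n - r) \ln \lambda + \alpha r \ln\left(1 - e^{-\lambda x_{(r+1)}}\right) + (\alpha - 1) \sum_{i=r+1}^{n} \ln\left(1 - e^{-\lambda x_{(i)}}\right) - \lambda \sum_{i=r+1}^{n} x_{(i)}. \qquad (2.2)$$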
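The log-likelihood of a left censored GE sample can be evaluated with a few lines of code. The sketch below is only an illustration, assuming the standard joint density of the $n - r$ largest order statistics, $\frac{n!}{r!} [F(x_{(r+1)})]^{r} \prod_{i=r+1}^{n} f(x_{(i)})$; the function name `left_censored_loglik` and its arguments are hypothetical.

```python
import math


def left_censored_loglik(alpha, lam, observed, r):
    """Log-likelihood of GE(alpha, lam) when the r smallest values are left censored.

    observed holds the n - r largest order statistics; r counts the censored ones.
    Assumes the joint density n!/r! * F(x_(r+1))^r * prod f(x_(i)).
    """
    xs = sorted(observed)
    m = len(xs)  # m = n - r observed values
    n = m + r
    # ln(1 - exp(-lam * x)) for each observed value
    log_terms = [math.log(-math.expm1(-lam * x)) for x in xs]
    return (
        math.lgamma(n + 1) - math.lgamma(r + 1)   # ln(n!) - ln(r!)
        + m * (math.log(alpha) + math.log(lam))
        + alpha * r * log_terms[0]                # censored mass below x_(r+1)
        + (alpha - 1.0) * sum(log_terms)
        - lam * sum(xs)
    )
```

With `r = 0` the function reduces to the log of the joint density of all $n$ order statistics, i.e. $\ln n!$ plus the sum of the log densities.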
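The normal equations for deriving the maximum likelihood estimators become
$$\frac{\partial L}{\partial \alpha} = \frac{n - r}{\alpha} + r \ln\left(1 - e^{-\lambda x_{(r+1)}}\right) + \sum_{i=r+1}^{n} \ln\left(1 - e^{-\lambda x_{(i)}}\right) = 0 \qquad (2.3)$$
and
$$\frac{\partial L}{\partial \lambda} = \frac{n - r}{\lambda} + \alpha r \, \frac{x_{(r+1)} e^{-\lambda x_{(r+1)}}}{1 - e^{-\lambda x_{(r+1)}}} + (\alpha - 1) \sum_{i=r+1}^{n} \frac{x_{(i)} e^{-\lambda x_{(i)}}}{1 - e^{-\lambda x_{(i)}}} - \sum_{i=r+1}^{n} x_{(i)} = 0. \qquad (2.4)$$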
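The normal equations (2.3) and (2.4) can be cross-checked numerically. The following sketch, with hypothetical function names, codes the two partial derivatives and compares them with central finite differences of the log-likelihood (the constant $\ln(n!/r!)$ is dropped because it does not affect the derivatives). It is a sanity check under these assumptions, not part of the estimation procedure itself.

```python
import math


def loglik(alpha, lam, xs, r):
    """Left censored GE log-likelihood without ln(n!/r!); xs sorted ascending."""
    logs = [math.log(-math.expm1(-lam * x)) for x in xs]
    return (len(xs) * (math.log(alpha) + math.log(lam)) + alpha * r * logs[0]
            + (alpha - 1.0) * sum(logs) - lam * sum(xs))


def score(alpha, lam, xs, r):
    """Left-hand sides of (2.3) and (2.4): dL/d(alpha) and dL/d(lam)."""
    m = len(xs)
    logs = [math.log(-math.expm1(-lam * x)) for x in xs]
    # x e^{-lam x} / (1 - e^{-lam x}) rewritten as x / (e^{lam x} - 1)
    w = [x / math.expm1(lam * x) for x in xs]
    d_alpha = m / alpha + r * logs[0] + sum(logs)
    d_lam = m / lam + alpha * r * w[0] + (alpha - 1.0) * sum(w) - sum(xs)
    return d_alpha, d_lam


def numeric_score(alpha, lam, xs, r, h=1e-6):
    """Central finite differences of loglik, used only as a cross-check."""
    d_alpha = (loglik(alpha + h, lam, xs, r) - loglik(alpha - h, lam, xs, r)) / (2 * h)
    d_lam = (loglik(alpha, lam + h, xs, r) - loglik(alpha, lam - h, xs, r)) / (2 * h)
    return d_alpha, d_lam
```

Calling `score` and `numeric_score` with the same arguments, for example `(1.7, 0.8, [0.5, 1.2, 2.0], 2)`, should give values that agree up to finite-difference error.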
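From (2.3), we obtain the maximum likelihood estimator of $\alpha$ as a function of $\lambda$, say $\hat{\alpha}(\lambda)$, where
$$\hat{\alpha}(\lambda) = -\frac{n - r}{r \ln\left(1 - e^{-\lambda x_{(r+1)}}\right) + \sum_{i=r+1}^{n} \ln\left(1 - e^{-\lambda x_{(i)}}\right)}. \qquad (2.5)$$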
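Because (2.5) gives $\hat{\alpha}(\lambda)$ in closed form, the estimation problem reduces to a one-dimensional search over $\lambda$. The sketch below illustrates this with a generic golden-section search on the log-likelihood evaluated at $(\hat{\alpha}(\lambda), \lambda)$. This is a swapped-in numerical technique used only for illustration, not the iterative scheme proposed in this paper. The function names, the search bracket `[lo, hi]` and the tolerance are assumptions, and the search presumes the objective has a single maximum inside the bracket.

```python
import math
import random


def log1mexp(t):
    """ln(1 - exp(-t)) for t > 0, using whichever form is more accurate."""
    if t <= math.log(2.0):
        return math.log(-math.expm1(-t))
    return math.log1p(-math.exp(-t))


def alpha_hat(lam, xs, r):
    """Equation (2.5): the alpha solving (2.3) for fixed lam; xs sorted ascending."""
    logs = [log1mexp(lam * x) for x in xs]
    return -len(xs) / (r * logs[0] + sum(logs))


def profile_loglik(lam, xs, r):
    """Log-likelihood at (alpha_hat(lam), lam), dropping the constant ln(n!/r!)."""
    a = alpha_hat(lam, xs, r)
    logs = [log1mexp(lam * x) for x in xs]
    return (len(xs) * (math.log(a) + math.log(lam)) + a * r * logs[0]
            + (a - 1.0) * sum(logs) - lam * sum(xs))


def fit_left_censored_ge(xs, r, lo=1e-3, hi=50.0, tol=1e-10):
    """Golden-section search for lam on [lo, hi] (assumed bracket); returns (alpha, lam)."""
    xs = sorted(xs)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    fc, fd = profile_loglik(c, xs, r), profile_loglik(d, xs, r)
    while b - a > tol:
        if fc > fd:  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - ratio * (b - a)
            fc = profile_loglik(c, xs, r)
        else:        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + ratio * (b - a)
            fd = profile_loglik(d, xs, r)
    lam = 0.5 * (a + b)
    return alpha_hat(lam, xs, r), lam
```

A possible usage: simulate $n$ values from $GE(\alpha, \lambda)$, sort them, discard the $r$ smallest, and pass the remaining $n - r$ values together with `r` to `fit_left_censored_ge`. The default bracket suits data whose scale is of order one; for other scales it should be adjusted.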
Putting $\hat{\alpha}(\lambda)$ in (2.2) we obtain the profile log-likelihood on $\lambda$ as