
The Jungle that is Object Oriented Metrics

    Ralph Johnson

    Info 630

    Summer 2006

    Submitted: September 6, 2006

    Professor Glenn Booker

    Drexel University


Table of Contents

    The Pioneers

    Recent Literature

    Real Life

    References


    It would be preferable to have a minimum set of OO metrics. [1]

    Why is it dangerous to go into the jungle between 2 and 4 in the afternoon?

    Because the elephants are jumping out of the trees.

    Why are pygmies so small?

    Because they went in the jungle between 2 and 4 in the afternoon. [2]


    This paper investigates some of the literature about object-oriented metrics. Depending on which authors one reads, the core measures and methods of OO metrics may or may not yet

    have been established. Generally, one can say that many existing OO metrics have been

    empirically validated, and sometimes proven effective with case studies. Whether they are

    actually used beyond studies, effectively or not, is difficult to ascertain. If there is a central

    observation here, it is that an inordinate number of researchers base their metrics on faulty

    premises, sometimes infer causal direction when only correlation has been established, or even

    claim correlation when it has not been rigorously established.

    The Pioneers

    The original impetus for the creation of OO metrics was the object-oriented revolution of the late '80s and early '90s, which changed the world of those in object-oriented circles forever.

    The OO approach centers around modeling the real world in terms of its objects,

    which is in stark contrast to older, more traditional approaches that emphasize a

    function oriented view that separated data and procedures. Given the

    [1] Subramanian, G. & Corbin, W. (2001). An empirical study of certain object-oriented software metrics. The Journal of Systems and Software, 59.1, pp. 57-63.

    [2] Elephant joke. Apologies to pygmy readers.


    fundamentally different notions inherent in these two views, it is not surprising to
    find that software metrics developed with traditional methods in mind do not
    direct themselves to notions such as classes, inheritance, encapsulation and
    message passing. Therefore, given that current software metrics… are easily seen
    as not supporting key OO concepts, it seems appropriate to develop a set, or suite
    of new metrics especially designed to measure unique aspects of the OO
    approach. [3]

    The Lorenz and Kidd metrics (1994) [4], the Abreu metrics (1993) [5], and the Chidamber and Kemerer metrics (1994) [6] were among the first attempts to address this problem, and a host of others followed over the next five years. [7][8][9][10][11][12][13] Rather than attempt to catalog the individual measures presented by these researchers, it seems more feasible to present categories and types of OO metrics into which many can be roughly pigeonholed.

    Most metrics can be classified as measuring (or being some indication of) system size; class or method size; coupling and inheritance (the degree to which classes are connected, or not, laterally or vertically); or class or method complexity. Various types of measures include:

    total function calls in the system

    [3] Chidamber, S. R. & Kemerer, C. F. (1991). Towards a metrics suite for object oriented design. OOPSLA '91 Proceedings, pp. 197-211.

    [4] Lorenz, M. & Kidd, J. (1994). Object-Oriented Software Metrics. Prentice Hall PTR, 1st edition.

    [5] Abreu, F. B. (1993). Metrics for Object-Oriented Environment. Proceedings of the Third International Conference on Software Quality, Lake Tahoe, Nevada, October 4-6, pp. 67-75.

    [6] Chidamber, S. R. & Kemerer, C. F. (1994). A metrics suite for object-oriented design. IEEE Transactions on Software Engineering, 20(6), pp. 476-493.

    [7] Bellin, D., Tyagi, M., & Tyler, M. (1994). Object-oriented metrics: an overview. Proceedings of the 1994 Conference of the IBM Centre for Advanced Studies on Collaborative Research, Toronto, Ontario, Canada.

    [8] Balasubramanian, N. V. (1996). Object-Oriented Metrics. Third Asia-Pacific Software Engineering Conference (APSEC '96), p. 30.

    [9] Tang, M., Kao, M., & Chen, M. (1999). An Empirical Study on Object-Oriented Metrics. Proceedings of the 6th International Symposium on Software Metrics (METRICS 1999), IEEE Computer Society, Washington, DC, p. 242.

    [10] Marchesi, M. (1998). OOA metrics for the Unified Modeling Language. Proceedings of the Second Euromicro Conference on Software Maintenance and Reengineering, pp. 67-73.

    [11] Harrison, R., Counsell, S., & Nithi, R. (1998). Coupling metrics for object-oriented design. Proceedings of the Fifth International Software Metrics Symposium (Metrics 1998), pp. 150-157.

    [12] Bansiya, J. & Davis, C. (1999). Class Cohesion Metric For Object-Oriented Designs. Journal of Object-Oriented Programming, vol. 11, no. 8, pp. 47-52.

    [13] Li, W. & Henry, S. (1993). Object-Oriented Metrics that Predict Maintainability. Journal of Systems and Software, vol. 23, no. 2, pp. 111-122.


    number of classes in the system

    function calls per class or per method, to and from

    number of methods, or public methods, per class

    number of attributes (or "variables") per class, public and private

    number of methods inherited by a subclass

    number of methods overridden by a subclass

    number of methods added by a subclass

    average method size [11][14]

    class re-use count

    depth of inheritance tree

    number of children per class

    Most of the Abreu metrics are compound metrics, in other words, some ratio of one

    metric to another. There are, as cited above, many other forms of similar metrics, which in total

    would seem too great an undertaking to enumerate, much less classify (although someone has

    attempted this, as we shall see). Further there are traditional metrics which can be and are applied

    to OO software, such as the cyclomatic complexity of classes or methods, “comment

    percentage," forms of function point counts, and, more obviously, LOC. [15]
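Many of the size-type measures listed above can be collected statically from source code. As a minimal sketch (not drawn from any of the cited suites; the class names are invented and the unit-weight reading of "number of methods" is an illustrative assumption), Python's ast module can count methods, public methods, and declared parents per class:

```python
import ast

# Hypothetical module used as input for the metric collector.
SOURCE = """
class Account:
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount

    def _audit(self):
        pass

class SavingsAccount(Account):
    def add_interest(self, rate):
        self.deposit(self.balance * rate)
"""

def class_metrics(source):
    """Collect simple size metrics for each class in a module."""
    tree = ast.parse(source)
    metrics = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            methods = [n for n in node.body if isinstance(n, ast.FunctionDef)]
            metrics[node.name] = {
                "methods": len(methods),  # cf. WMC with all weights set to 1
                "public_methods": sum(1 for m in methods
                                      if not m.name.startswith("_")),  # cf. L&K PM
                "parents": len(node.bases),  # a rough inheritance signal
            }
    return metrics

print(class_metrics(SOURCE))
```

Deeper measures such as depth of inheritance tree or class re-use counts would need the whole system's class graph, not a single module, which hints at the collection-cost issues discussed later.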

    The intent of most of these metrics, as with traditional metrics, is to convey information

    about the quality of the software development process, or about the quality of the particular

    software product being developed. Most often the particular qualities sought are process

    improvement (budgeting, schedule, control) and functional correctness, i.e., the absence of defects. [16]


    As is also the case with traditional metrics, many OO metrics can be used as static,

    descriptive measures, or as dynamic, inferential ones. It is often pointed out in the OO literature

    [14] Schroeder, M. (1999). A Practical Guide to Object-Oriented Metrics. IT Professional, IEEE, vol. 1, no. 6.

    [15] Rosenberg, L., Stapko, E., & Gallo, A. (1991). Applying object oriented metrics. ISSRE, NASA Goddard Space Flight Center. Accessed August 30, 2006.

    [16] Card, D., El-Emam, K., & Scalzo, B. (2001). Measurement of Object-Oriented Software Development Projects. Software Productivity Consortium NFP.


    that historical data on the same or similar metrics (preferably with the same software language, IDE, domain, organization, staff, office furniture, climate control, well, you get the idea) is of paramount importance. [14][16] This data enables optimum predictive capability for quality estimation and risk management. [17]


    In some ways, posterity has been quite kind to OO metrics pioneers. No matter how

    unreliable their metrics prove to be, no matter how poorly designed their studies were, no matter

    how extravagant their conclusions seem in retrospect, they will always be the pioneers. This is

    not to say that even now anyone has conclusively proved them wrong, or inept or malevolent

    (people have tried) (well, maybe not malevolent). But it will be a long time before the names of the

    founding fathers will be dropped from the citations of the latest OO metrics research.

    Of course, being citable is akin to being Johnny Ringo in "The Gunfighter" [18]: anyone looking to make a name for themselves will be coming after you. Along with the OO metrics

    growth came the OO metrics critique and commentary literature which, if this paper's references

    can be counted as evidence, far outnumbers the abundant metrics-producing literature. Some of

    this later literature is extremely critical, pointing out deficiencies in metrics, models, empirical

    validation, and a dearth of real-life case study support. Some of it is quite constructive in its attempts

    to convey an understanding of the state of OO metrics or even to synthesize past efforts into a

    cohesive meta-model. Without examining the self-citations, one would not necessarily know that

    many of the critical authors have their own metrics they purport to be better than those they are

    [17] El Emam, K. (2001). A Primer on Object-Oriented Measurement. Proceedings of the 7th International Symposium on Software Metrics, IEEE Computer Society, Washington, D.C.

    [18]


critiquing. Conversely, some of the authors have worked with, presumably are on good terms

    with, and perhaps completely agree with the authors of the original articles.

    There are of course examples of early, encouraging study results, although they are

    sometimes presented tentatively. Basili et al (1996) "investigated" the CK metrics and conclude that five of them "appear to be useful to predict class fault-proneness during the high- and low-level design phases of the life-cycle" (emphasis added). They add that the CK metrics are better predictors than the best set of traditional code metrics, which can only be collected during later phases of the software development process. [19]

    One of the earlier comprehensive papers is Harrison, Counsell, and Nithi, "An Overview of Object-Oriented Design Metrics" (1997), which describes and critiques the Chidamber and Kemerer (hereafter CK), Lorenz and Kidd (hereafter L&K), and Abreu metrics. They state their

    perspective plainly: "Various shortcomings emerge when we begin to consider criteria important in designing, using and interpreting object-oriented metrics for real systems." [11] These criteria

    vary somewhat between studies, but Harrison et al is typical in its desire to be presented with, or

    to establish an explicit validation of each metric in a suite. They want to know “what attributes of

    software we are measuring, and how we go about measuring those attributes… A metric must

    measure what it purports to measure." [11] This and many other papers are also concerned with "empirical evaluation," investigating the performance of such technologies and the quality of the resulting object-oriented (OO) software products. [20]

    [19] Basili, V. R., Briand, L. C., & Melo, W. L. (1996). A Validation of Object-Oriented Design Metrics as Quality Indicators. IEEE Transactions on Software Engineering, vol. 22, no. 10.

    [20] Briand, L., Arisholm, E., Counsell, S., Houdek, F., & Thévenod-Fosse, P. (2000). Empirical Studies of Object-Oriented Artifacts, Methods, and Processes: State of the Art and Future Directions. Empirical Software Engineering, vol. 4, no. 4, pp. 387-404.


    After describing the "suites," Harrison et al run through specific problems with the validity of various metrics. For example, they evaluate the CK metric Weighted Methods per Class (WMC), the number of methods in a class, which is intended to measure complexity; greater complexity would increase the potential for defects, and therefore increase development effort. They observe: "We can not view WMC as an indicator of the effort to develop a class, since a class containing a single large method may take as long to develop as a class containing a large number of small methods. The same can be said of the L&K Public Methods (PM) metric." [11]

    The CK Response for a Class (RFC) ("the set of all methods which can be invoked in response to a message to an object of the Class"), the L&K Number of Methods Inherited by a subclass (NMI), and the L&K Number of times a Class is Reused (NCR) metrics are all criticized for being vaguely specified: "In each, there is some ambiguity in exactly what the respective designers meant the metric to measure. This forces the user to guess what they think the metric was intended to measure." [11]
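Harrison et al's WMC objection is easy to demonstrate concretely. In the sketch below (hypothetical classes, with all method weights fixed at 1, one common reading of WMC), two functionally similar designs receive very different scores even though the development effort would be comparable:

```python
import ast

# Two designs for the same job: one big method versus many small ones.
ONE_BIG = """
class Parser:
    def parse(self, text):
        a = text.strip()
        b = a.lower()
        c = b.split()
        d = [w for w in c if w]
        return d
"""

MANY_SMALL = """
class Parser:
    def strip(self, t): return t.strip()
    def lower(self, t): return t.lower()
    def split(self, t): return t.split()
    def drop_empty(self, ws): return [w for w in ws if w]
    def parse(self, t): return self.drop_empty(self.split(self.lower(self.strip(t))))
"""

def wmc(source):
    """WMC with all method weights set to 1, i.e. a plain method count."""
    tree = ast.parse(source)
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))
    return sum(1 for n in cls.body if isinstance(n, ast.FunctionDef))

print(wmc(ONE_BIG), wmc(MANY_SMALL))  # 1 vs 5 for comparable functionality
```

The fivefold difference in WMC here says little about effort or defect potential, which is exactly the validity gap Harrison et al point to.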

    As an example of the need for empirical evaluation they present Abreu's coupling factor (CF), "the number of inter-class communications." CF

    “is claimed to increase complexity and reduce both encapsulation and potential

    reuse. The thesis regarding increased complexity would be supported more

    strongly if an empirical evaluation were performed to identify any correlation

    between CF and, perhaps, a subjective measure of the complexity of each class

    (provided by the system designer)." [11]
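CF is usually stated as the ratio of actual client-supplier relationships to the number of possible ones. A rough sketch follows; the class names and the uses table are invented for illustration, and couplings due to inheritance, which Abreu's definition treats separately, are ignored here:

```python
from itertools import permutations

# Hypothetical system: which class references which other classes.
uses = {
    "Order":    {"Customer", "Invoice"},
    "Invoice":  {"Customer"},
    "Customer": set(),
    "Report":   {"Order"},
}

def coupling_factor(uses):
    """Actual client/supplier pairs divided by all possible ordered class pairs."""
    classes = list(uses)
    possible = list(permutations(classes, 2))
    actual = sum(1 for client, supplier in possible if supplier in uses[client])
    return actual / len(possible)

print(round(coupling_factor(uses), 3))  # 4 couplings out of 12 possible pairs
```

The number itself is easy to compute; Harrison et al's point is that without an empirical study correlating it with something like perceived class complexity, the value carries no validated meaning.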

    Other suggestions for empirical evaluation are quite simple. Because Chidamber and

    Kemerer claim the Number of Children (NOC) metric indicates the amount of testing a

    class will require, Harrison et al suggest an empirical investigation of the relationship between

    NOC values and the testing times for each class. [11]


    The authors note more practical problems with data collection for some of the metrics:

    “For large systems, collection of the more involved metrics becomes prohibitively time-

    consuming. For example, calculation of the Lack of Cohesion in Methods (LCOM) metric (C&K)

    requires careful consideration of the use of variables in a class, and so is only practical for

    systems with a small number of classes.” They say this would also apply to the CK metric

    Coupling Between Objects (CBO). Conversely, they drolly observe, "in systems with no inheritance, the collection of metrics such as C&K's, L&K's and Abreu's becomes relatively trivial. Although this makes the metrics collection easier, our understanding of the system being analysed is then limited by a large number of metrics with values of zero." [11]

    In conclusion, they take the authors to task for being vague in their definitions of how their metrics relate to quality:

    “The metrics described give suggestions as to what aspects of quality object-

    oriented software they would be useful for measuring. Quality factors such as

    reusability, maintainability and testability are frequently quoted. However, no

    concrete notion of what constitutes quality is provided by the designers of the

    metrics studied." [11]
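Harrison et al's earlier practicality complaint about LCOM is also easy to see in a sketch. One common reading of CK's LCOM is the number of method pairs sharing no instance attributes minus the number sharing at least one, floored at zero. The formula itself is trivial; the expensive part is producing the attribute-usage table below, which for a real system requires exactly the "careful consideration of the use of variables" they describe. The method and attribute names here are hypothetical:

```python
from itertools import combinations

# Hand-built table: which instance attributes each method of a class touches.
attrs_used = {
    "deposit":  {"balance"},
    "withdraw": {"balance"},
    "rename":   {"owner"},
}

def lcom(attrs_used):
    """|method pairs sharing no attribute| - |pairs sharing at least one|, floored at 0."""
    pairs = list(combinations(attrs_used.values(), 2))
    disjoint = sum(1 for a, b in pairs if not (a & b))
    shared = len(pairs) - disjoint
    return max(disjoint - shared, 0)

print(lcom(attrs_used))  # 2 disjoint pairs - 1 sharing pair = 1
```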

    They also take exception to the purported ability of the proposed metrics to capitalize on

    the OO development model. In other words, since the metrics were designed specifically for OO,

    they should be applicable to earlier phases of the design process, and be less closely tied to the code itself:


    Many of the metrics outlined are simply code metrics in the sense that they are

    measures of the code's characteristics…. Yet, the three sets of metrics studied

    claim to be high-level design metrics, and to indicate features of object-oriented

    systems design. It would be more useful to have metrics which measured the

    quality of the design at a much higher level of abstraction. [11]


    One would not expect the pioneers to sit still for this sort of treatment, and indeed

    Chidamber, Darcy and Kemerer return in 1998 with an empirical study that suggests that several

    of their metrics "provide significant explanatory power for variations in [cost, quality, and productivity], over and above that provided by traditional measures, such as size in lines of code." [21] They also make some observations which anticipate criticisms to come. They cite a study that showed that CK metrics "explained additional variance in maintenance effort beyond that explained by traditional size metrics," and they note that several of their metrics (WMC, RFC, CBO) exhibit "multicollinearity," which they suggest indicates that only one of these should be used at a time in a given situation. [21]
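The multicollinearity observation can be checked with nothing more than pairwise correlations among the metric values. The per-class numbers below are fabricated for illustration; the point is only that when RFC tracks WMC this closely, the two predictors carry nearly the same information and should not both enter one model:

```python
import statistics as st

# Hypothetical per-class values for three CK metrics across ten classes.
wmc_vals = [3, 8, 5, 12, 7, 4, 9, 11, 6, 10]
rfc_vals = [7, 15, 11, 24, 14, 9, 18, 22, 12, 20]   # tracks WMC closely
cbo_vals = [2, 1, 4, 3, 2, 5, 1, 3, 4, 2]           # mostly unrelated

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    mx, my = st.mean(xs), st.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(pearson(wmc_vals, rfc_vals), 2))  # near 1.0: redundant predictors
print(round(pearson(wmc_vals, cbo_vals), 2))  # weak: adds independent information
```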

    Despite the contentions that emerged over the following years, one must give these

    authors credit for making an honest effort, and for acknowledging the limitations of their

    research: “It cannot be overemphasized that, given the nascent state of OO metrics research, the

    analysis presented in this paper is exploratory in nature, and it would be unwise to overgeneralize

    from the empirical estimates presented…" [21]

    Chronologically the next major study covered here, and seemingly the most inflammatory,

    is El Emam et al. "The Confounding Effect of Class Size on the Validity of Object-Oriented

    Metrics" (2000). [22] The authors do an extensive survey of OO metrics literature. They begin by

    citing a quote from J. Munson and T. Khoshgoftaar which epitomizes a popular rationale for OO

    metrics: "'There is a clear intuitive basis for believing that complex programs have more faults in them than simple programs'. However, an intuitive belief does not make a theory." The

    [21] Chidamber, S. R., Darcy, D. P., & Kemerer, C. F. (1998). Managerial use of metrics for object-oriented software: an exploratory analysis. IEEE Transactions on Software Engineering, 24, pp. 629-639.

    [22] El Emam, K., Benlarbi, S., Goel, N., & Rai, S. (2000). The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics. To appear in IEEE Transactions on Software Engineering.
