I originally wrote this article in October 2007 for a course in Software Maintenance and am now making it available on the web. The intended audience is software engineers and software project managers who are faced with the task of estimating software size and cost.
Planning for a software development (or maintenance) project involves estimating the effort in terms of cost and time to establish a budget and a schedule. The estimated effort grows with the size of the project: larger projects require more time and money. Therefore, a size estimate (scope, or amount of work) must be made to serve as the basis for the effort estimate (cost and time).
In software projects, lines of code (LOC) immediately come to mind as a measure of size. However, estimating in terms of lines of code can be problematic because there is no consistent definition of what a line of code is. Does whitespace count as a line of code? Is each line in an if statement a line of code, or does the entire if statement count as one line of code? In addition, lines of code depend on the programming language being used: a 100-line program written in Perl may take 500 lines in C (Kernighan & Pike, 1999). Finally, if concise code is a design goal, a count of the lines of code may not reflect the effort required to develop a concise design that satisfies a requirement. Function points provide an alternative unit of size that addresses some of these shortcomings.
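To make the ambiguity concrete, here is a minimal sketch that counts the same four-line snippet under three different counting rules. The snippet and the three function names are made up for illustration; none of them is a standard definition of LOC.

```python
# A small C-like snippet used only as sample input for the counters below.
SNIPPET = """\
if (x > 0) {

    y = x;
}
"""


def physical_lines(source: str) -> int:
    """Count every line, including blank ones."""
    return len(source.splitlines())


def non_blank_lines(source: str) -> int:
    """Count only lines that contain something other than whitespace."""
    return sum(1 for line in source.splitlines() if line.strip())


def non_brace_lines(source: str) -> int:
    """Count non-blank lines, ignoring lines that hold only a brace."""
    return sum(
        1 for line in source.splitlines()
        if line.strip() and line.strip() not in {"{", "}"}
    )


if __name__ == "__main__":
    print(physical_lines(SNIPPET))   # 4
    print(non_blank_lines(SNIPPET))  # 3
    print(non_brace_lines(SNIPPET))  # 2
```

The same code is 4, 3, or 2 lines long depending on the rule chosen, which is exactly why a size estimate in lines of code needs an agreed counting convention before it means anything.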
Function points are a measure of the functions (as in functionality) of a system, independent of how the software requirements are specified (natural language or formal notations), how the software is designed (structured or object-oriented), and how the software is implemented (programming language). The number of function points in a system depends on five parameters described by Albrecht and Gaffney (1983): external input types, external output types, logical internal file types, external interface file types, and external inquiry types. In the more modern terminology used by Futrell, Shafer, and Shafer (2002), these parameters are referred to as inputs, outputs, data structures (files), interfaces, and inquiries. The following list provides brief definitions for each of the parameters.
- An external input is any object (data or control) that is provided to the system, regardless of whether the source is a user or another system.
- An external output is any object that the system provides to an external recipient, be it a user or another system. Reports produced for a user, messages displayed to a user, and messages sent to another system are examples of outputs.
- A data structure or internal file type is a structure maintained within the system to manage some logical grouping of information. A data structure may be some object in memory that maintains the data to be manipulated by the system.
- An interface is any data that crosses the system boundary: a file that resides outside the application or data structures received from or sent to other systems.
- An inquiry is an input-output pair where some request is made of the system and that request triggers a response by the system. One can think of an inquiry as an event triggered by a user (command) or another system (request) that results in an immediate response.
Once the parameters in the system have been identified, each one is classified as simple, average, or complex. For example, if the system has ten outputs, each of the ten outputs is individually classified as simple, average, or complex. The classification determines the weight of each parameter. After counting and classifying the parameters in the system under consideration, a weighted sum of the parameters is taken to obtain the raw function point count.
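The weighted sum can be sketched in a few lines of Python. The article does not list the weights, so the table below uses the commonly cited Albrecht-style values as an assumption; the parameter names and the example counts are hypothetical.

```python
# Weight of one item, by parameter type and complexity class.
# ASSUMPTION: commonly cited Albrecht-style weights, not taken from this article.
WEIGHTS = {
    "input":     {"simple": 3, "average": 4,  "complex": 6},
    "output":    {"simple": 4, "average": 5,  "complex": 7},
    "inquiry":   {"simple": 3, "average": 4,  "complex": 6},
    "file":      {"simple": 7, "average": 10, "complex": 15},
    "interface": {"simple": 5, "average": 7,  "complex": 10},
}


def raw_function_points(counts: dict) -> int:
    """Return the weighted sum of classified parameters.

    `counts` maps (parameter, complexity) pairs to the number of items
    that were identified and placed in that class.
    """
    return sum(
        WEIGHTS[parameter][complexity] * n
        for (parameter, complexity), n in counts.items()
    )


# Hypothetical system: the counts below are invented for illustration.
example_counts = {
    ("input", "simple"): 6,
    ("input", "complex"): 2,
    ("output", "average"): 10,
    ("inquiry", "simple"): 4,
    ("file", "average"): 3,
    ("interface", "complex"): 1,
}

if __name__ == "__main__":
    print(raw_function_points(example_counts))  # 132
```

For the example counts the sum is 18 + 12 + 50 + 12 + 30 + 10 = 132 raw function points.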
The raw function point count serves as the basis for the final function point count, but it must first be adjusted for environmental factors, which can affect the size of the system by up to 35 percent (Jalote, 1997) due to increased complexity. There are 14 environmental factors that are taken into consideration: data communications, distributed processing, performance objectives, operation configuration load, transaction rate, online data entry, end user efficiency, online update, complex processing logic, reusability, installation ease, operational ease, multiple sites, and desire to facilitate change (Jalote, 1997). Each of these environmental factors is evaluated, and together the ratings determine the complexity adjustment factor (CAF). The number of raw function points multiplied by the CAF gives the final function point count, also referred to as the number of adjusted function points or delivered function points. Futrell, Shafer, and Shafer (2002) summarize the steps for determining the number of (adjusted) function points in a system as follows:
- Count the number of functions (parameters) in each category.
- Apply complexity weighting factors.
- Apply environmental factors.
- Calculate the complexity adjustment factor.
- Compute the adjusted function points.
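The last three steps can be sketched as follows. The article does not give the CAF formula, so this sketch assumes the commonly used form in which each factor is rated from 0 (no influence) to 5 (strong influence) and CAF = 0.65 + 0.01 × (sum of ratings); with 14 factors that yields a range of 0.65 to 1.35, which is consistent with the 35 percent figure above. The raw count of 132 and the ratings are hypothetical.

```python
# The 14 environmental factors, as listed in the article (after Jalote, 1997).
FACTORS = [
    "data communications", "distributed processing", "performance objectives",
    "operation configuration load", "transaction rate", "online data entry",
    "end user efficiency", "online update", "complex processing logic",
    "reusability", "installation ease", "operational ease",
    "multiple sites", "desire to facilitate change",
]


def complexity_adjustment_factor(ratings: dict) -> float:
    """Compute the CAF from one rating per factor.

    ASSUMPTION: ratings run from 0 to 5 and CAF = 0.65 + 0.01 * sum(ratings).
    """
    missing = [f for f in FACTORS if f not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {missing}")
    if any(not 0 <= ratings[f] <= 5 for f in FACTORS):
        raise ValueError("each rating must be between 0 and 5")
    return 0.65 + 0.01 * sum(ratings[f] for f in FACTORS)


def adjusted_function_points(raw_fp: float, ratings: dict) -> float:
    """Multiply the raw count by the CAF to get adjusted function points."""
    return raw_fp * complexity_adjustment_factor(ratings)


if __name__ == "__main__":
    # Hypothetical project: every factor rated 3, raw count of 132.
    ratings = {factor: 3 for factor in FACTORS}
    print(round(complexity_adjustment_factor(ratings), 2))   # 1.07
    print(round(adjusted_function_points(132, ratings), 2))  # 141.24
```

Rating every factor at 3 gives a sum of 42 and a CAF of 1.07, so 132 raw function points become about 141 adjusted function points.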
As an optional last step, one can convert adjusted function points to lines of code. There are known conversion factors that estimate the number of lines of code needed to implement one adjusted function point in a particular programming language. Although it might seem counterintuitive to convert function points into lines of code after noting all of the deficiencies of size estimation by lines of code, this last step is often performed to provide additional insight into the software size estimate. An estimate in lines of code gives developers a sense of the size of the system, as it is often easier to visualize scale in lines of code than in function points. In addition, cost estimation models like the widely known Constructive Cost Model (COCOMO) use lines of code as the unit for their size input. Note, however, that there is a distinct difference between estimating a system’s size directly in lines of code and estimating a system’s size in function points and then converting function points into lines of code. Estimating size in lines of code via function point analysis is less subjective than estimating size ad hoc in lines of code.
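The conversion itself is a single multiplication, as the sketch below shows. The ratios in the table are illustrative placeholders that are not taken from this article; published conversion tables differ by source and by how a line is counted, so substitute the values from whichever table your organization uses.

```python
# Lines of code per adjusted function point, by language.
# ASSUMPTION: illustrative placeholder ratios, not authoritative values.
LOC_PER_FP = {
    "C": 128,
    "Java": 53,
}


def estimate_loc(adjusted_fp: float, language: str) -> int:
    """Convert adjusted function points to an estimated line count."""
    if language not in LOC_PER_FP:
        raise KeyError(f"no conversion factor for {language!r}")
    return round(adjusted_fp * LOC_PER_FP[language])


if __name__ == "__main__":
    # Hypothetical system of 100 adjusted function points.
    print(estimate_loc(100, "C"))     # 12800
    print(estimate_loc(100, "Java"))  # 5300
```

The same 100 function points map to very different line counts depending on the language, which illustrates why function points are the more stable unit and lines of code are only a derived view.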
Estimating software size through function point analysis provides a methodical and consistent means of gathering input for software project planning, particularly in determining an initial budget and schedule. Function point counts also allow progress to be tracked as the system is developed, in terms of the number of function points satisfied. Even during the maintenance phase of the software lifecycle, function point analysis can be an invaluable tool for understanding the scale of the system to be maintained and for evaluating the impact of changes on the system, so that maintenance efforts can be planned effectively.
Note to readers: Function point analysis is one method of estimating software size, but not the only one. This article is meant as an introduction to function point analysis, which is commonly mentioned in the literature and is one of the more prevalent size estimation methods, often used in conjunction with COCOMO to produce a cost estimate.
References
Albrecht, A. J., & Gaffney, J. E. (1983). Software function, source lines of code, and development effort prediction: A software science validation. IEEE Transactions on Software Engineering, 9(6), 639-648.
Futrell, R. T., Shafer, D. F., & Shafer, L. I. (2002). Quality Software Project Management. Upper Saddle River, NJ: Prentice Hall PTR.
Jalote, P. (1997). An Integrated Approach to Software Engineering (2nd ed.). New York, NY: Springer-Verlag.
Kernighan, B. W., & Pike, R. (1999). The Practice of Programming. Reading, MA: Addison-Wesley.