Calibrating Parametric Estimation Models

Parametric estimation models express an output as a function of one or more inputs. A model that expresses the cost of undertaking a project as a function of several variables is one example. Such models derive their coefficients from historical data. Whether the underlying equation is linear or nonlinear, the coefficients reflect the conditions under which that data was collected, so the model has to be recalibrated before it is used under different conditions.

Parametric estimation models often use regression, typically fitted by the method of least squares, to estimate a relationship such as Project Cost = function(complexity, number of resources, size of the project). The data used to arrive at the coefficients are specific to certain conditions, so the equation is valid only under conditions similar to those represented in the data used to fit it. As another example, estimation professionals use COCOMO, a model in which software development effort is expressed as a function of several variables such as complexity, size of the project, and skill level of the resources. However, the original COCOMO model was calibrated on data from software projects at TRW, an aerospace contractor. The same model may not be valid if used to estimate projects done in a manufacturing plant or in an offshore development center in India.
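To make the least-squares idea concrete, here is a minimal sketch in Python using NumPy. The project data, the linear form of the cost equation, and the names fit_cost_model and predict_cost are all assumptions made for illustration; the costs are generated from made-up coefficients so that the fit can be checked, and they are not real project data.

```python
import numpy as np

# Hypothetical historical projects (illustrative numbers, not real data).
# Columns: complexity score, number of resources, size in KLOC.
features = np.array([
    [1.0, 4.0, 10.0],
    [2.0, 6.0, 25.0],
    [3.0, 5.0, 40.0],
    [2.0, 8.0, 30.0],
    [4.0, 10.0, 60.0],
    [5.0, 12.0, 80.0],
])

# Assumed "true" relationship used only to generate the sample costs:
# cost = 5 + 3*complexity + 2*resources + 1.5*size
assumed_coeffs = np.array([5.0, 3.0, 2.0, 1.5])
costs = np.column_stack([np.ones(len(features)), features]) @ assumed_coeffs


def fit_cost_model(x, y):
    """Return [intercept, b_complexity, b_resources, b_size] by least squares."""
    design = np.column_stack([np.ones(len(x)), x])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs


def predict_cost(coeffs, x):
    """Apply fitted coefficients to one or more feature rows."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    design = np.column_stack([np.ones(len(x)), x])
    return design @ coeffs


coeffs = fit_cost_model(features, costs)
estimate = predict_cost(coeffs, [3.0, 7.0, 45.0])
```

The fitted coefficients describe only the six sample projects. Applying them to a project from a different environment assumes that environment behaves the same way, which is the assumption this article questions.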

The outputs predicted by such models should be verified against data collected from the project's own context. That data has to be current and exhaustive, covering the full breadth of the project across all its aspects and scenarios. In practice, this again comes down to building regression equations or machine learning models that predict the output from the inputs.
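One simple way to carry out such a check is to compare a model's predictions with the actuals recorded for completed local projects. The sketch below uses the mean magnitude of relative error (MMRE), a metric commonly used in estimation work; the choice of metric, the function name mmre, and the effort figures are assumptions for illustration only.

```python
import numpy as np


def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(actual - predicted) / actual))


# Hypothetical completed projects from the organization's own context.
actual_effort = [100.0, 200.0, 400.0]      # person-months actually spent
packaged_estimate = [80.0, 250.0, 300.0]   # what an uncalibrated model predicted

error = mmre(actual_effort, packaged_estimate)
```

A large error on local projects suggests that the model's coefficients do not fit the local context and need recalibration.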

This raises the question of whether packaged estimation models used in industry (COCOMO, SLIM, etc.) can be adopted by an organization without recalibrating the coefficients they use to predict output from input variables. The answer is no.

If an organization decides to use these models, it is best either to collect context-specific data and fit its own statistical regression or, as an alternative, to recalibrate the coefficients of the packaged models using that context-specific data.
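As a rough sketch of what recalibration could look like, consider the Basic COCOMO form Effort = a * (size in KLOC)^b. Taking logarithms gives ln(Effort) = ln(a) + b * ln(size), which is a straight line and can be fitted by ordinary least squares. In the Python example below, the local project data, the coefficients used to generate it, and the function name calibrate_power_law are hypothetical; 2.4 and 1.05 are the published Basic COCOMO coefficients for organic-mode projects and are shown only for comparison.

```python
import numpy as np


def calibrate_power_law(size_kloc, effort_pm):
    """Fit effort = a * size**b by least squares on the log-log form."""
    log_size = np.log(np.asarray(size_kloc, dtype=float))
    log_effort = np.log(np.asarray(effort_pm, dtype=float))
    # polyfit with degree 1 returns [slope, intercept] = [b, ln(a)].
    b, log_a = np.polyfit(log_size, log_effort, 1)
    return float(np.exp(log_a)), float(b)


# Hypothetical local projects. Effort is generated from assumed local
# coefficients (a=4.0, b=1.10) purely so the fit can be verified.
local_size = np.array([5.0, 12.0, 30.0, 55.0, 90.0])
assumed_a, assumed_b = 4.0, 1.10
local_effort = assumed_a * local_size ** assumed_b

a_fit, b_fit = calibrate_power_law(local_size, local_effort)

# Compare estimates for a 40 KLOC project.
published_estimate = 2.4 * 40.0 ** 1.05        # Basic COCOMO, organic mode
recalibrated_estimate = a_fit * 40.0 ** b_fit  # locally calibrated coefficients
```

With real data the points will not lie exactly on the fitted line, so the recalibrated coefficients should themselves be checked against projects that were held out of the fit.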