Neural Network-Based Model for Supporting the Expert Driven Project Estimation Process in Mold Manufacturing

One of the crucial activities for running a successful mold manufacturing business is project estimation. The estimation process is an early project activity which is usually handled by highly skilled, in-house experts. One of the most important parameters affecting the estimation process is the volume of manufacturing hours (VMH) to produce the mold. This article suggests how to address the problem of estimating the volume of manufacturing hours by using the support of an artificial neural network (ANN) model, and its inclusion into the expert driven project estimation process. Based on the histogram of ANN estimations the percentage of unwanted underestimations of the VMH can be estimated as well and decreased by an introduced safety factor. The developed model-based estimation enables an expert to improve project estimation by using easily obtainable input data.


INTRODUCTION
The mold making industry is project driven, and as such it has to cope with the characteristics of an individual production process. One of the major sources of risk in project management is the inaccurate forecast of project costs, demand, and other impacts [1]. In the mold production process it is crucial to minimize uncertainty in the early project estimation phase. The estimation phase is commonly a human expert driven activity which is sensitive to the expert's bias. This bias can lead to an underestimation of project resources when the estimator is overconfident, or to an overestimation of project resources when the estimator does not have sufficient confidence that all aspects of the project can be properly covered. Both scenarios have a negative impact on future business. In the case of underestimation, the project will bring economic loss, and in the case of overestimation, it will most likely be assigned to a competing supplier. The estimator's key competence is to properly collect and evaluate all significant information needed for a successful project estimation. The contradiction lies in the fact that the estimator should spend as little time as possible on the estimation activity, since usually less than 10% of all offers turn into orders in the mold making industry, as stated in [2] to [4].
Estimations in the mold manufacturing business still rely heavily on intuitive methods, which are subjective and prone to reliability and repeatability problems. These problems are addressed in this article through the development of a supported expert driven project estimation process.
In the project estimation process the volume of manufacturing hours represents one of the most important pieces of information. It reflects the majority of costs in the final project price, and it most significantly shapes the project schedule. The research objective is to develop an ANN-supported, expert driven project estimation process to improve the estimation of the volume of manufacturing hours in mold production. In addition to the development of a reliable estimation model, it is also very important to properly position the supporting model in the expert driven estimation process. Therefore, in addition to model building, the problem of properly positioning the supporting model will be addressed in the paper.
Following these aims, first an overview of the estimation process is given. Then, the solution for the problem of proper placement of an estimation support model is addressed. Furthermore, the proposed ANN-based model for estimation of the volume of manufacturing hours is presented. Finally, the results of ANN modelling are presented and discussed.

THE PROJECT ESTIMATION PROCESS
A major challenge of the project estimation process in general is to achieve sufficient project estimation reliability with minimal time spent on this operation. Estimation reliability is directly related to the amount and quality of the data available at the moment the estimation process takes place. As shown in Fig. 1, the availability of data differs during different project stages. As we move along the timeline of the project the availability of data increases. Consequently, estimation uncertainty and risks decrease, so more accurate and reliable results can be expected. Estimation methods differ in accordance with the project stage in which they are used [3], [5] and [6] and are divided into:
• intuitive,
• analogical,
• parametric,
• analytical.
Intuitive estimation methods are based on the human expert's prior knowledge and experience. A major downside of these methods is that results are very susceptible to many different subjective factors. As a result, the obtained results suffer from reliability and repeatability problems. These problems can be reduced to a certain extent by applying methods that use more than one estimator or estimation method [7]. A major benefit of these methods is moderate time consumption. They are usually applied in early project stages.
Analogical estimation methods are based on finding successful projects with characteristics similar to those of the project being estimated. On the basis of detected similarities, corresponding values are assigned to the estimated project. These methods become applicable when the basic product shape is defined. They are also considered conditionally reliable methods, since the relations between similarities are usually estimated by an expert [7]. Their main strengths are the transparency of the results and the ability to reach a solution rapidly. These methods strongly rely on the database of previous projects, and become unreliable if a proper mapping of similar characteristics cannot be obtained.
Parametric estimation methods are used to make estimations on the basis of parameters that are able to directly translate the properties of the product or project into an estimated value. These methods are built on databases of past projects. Estimations are obtained by collecting input parameters and processing them into an estimate. These methods are usually seen as 'black box' solutions. A major challenge is in defining a proper set of input parameters. These methods offer both speed and sufficient reliability if used properly. By keeping the database of past projects open and adding the data of new projects, the model gains the ability to adapt and learn, which is particularly pronounced when it is used properly with ANN platforms. Parametric methods can use both parametric models (e.g. multiple regression models) and nonparametric models (e.g. ANN models), all of which were found to give acceptable estimates.
Analytical estimation methods are usually applicable in the later stages of the product life cycle, when both product data and manufacturing technology are defined in detail. The estimation is made on a detailed breakdown of the complete process into elementary tasks [8]. For every task the relations between inputs and corresponding outputs are analytically determined. These methods are usually rigid and relations between parameters are not easily modified. They do not have adaptation ability [8]. The results obtained give the most accurate estimations.
From the literature it is evident that the majority of research activities related to the problems of project estimation are focused on defining estimation models that should be able to define the link between geometric characteristics of the product and the price/cost of the product/project. By focusing on these economic values, the estimating process is contaminated with influences that do not possess technical and technological characteristics of the manufacturing process. These are actually influences of the market, reflecting supply and demand, and have very little to do with technological issues.
Articles which are the most significant for this research are related to product complexity [9] to [11], and the implementation of ANN in the mold production estimation process [9] and [12]. All these approaches give quite accurate estimates only when used for very specific types of products.
The above mentioned articles approach the complex estimation process by using a single estimation model, with all its limitations and benefits. The idea presented in this article is to develop an ANN-supported project estimation process, which combines benefits from both intuitive methods and ANN estimation models.

ANN-SUPPORTED ESTIMATION PROCESS
In general, expert estimations can represent a very broad solution space (see Fig. 2). This is mainly due to limited information availability, the expert's limited capability to process multiple pieces of information simultaneously, and the expert's bias. By using a supported expert estimation process the solution space gets narrower and the risk of underestimating or overestimating is reduced.

Fig. 2. Expert estimation solution space
The estimation process in the mold making business most commonly relies on human intuitive methods [5], or a combination of intuitive and analogical estimation methods. Mold makers put major emphasis on obtaining an accurate project estimation with minimal time consumption, because a large number of quotations have to be processed in order to achieve a sufficient order load. The reason for that lies in a very moderate success rate of all submitted offers. In order to achieve a sufficient level of result credibility, the estimation process has to be systematically approached. With this aim, a detailed step-by-step, expert driven, and ANN-supported estimation approach has been developed, as shown schematically in Fig. 3. In the IDR phase all input data necessary for completing the estimation is collected and evaluated. Having all the prescribed input data (a 3D CAD model of the product, a part drawing, and technical requirements for mold design) at disposal is a necessary condition for moving to the next phase.
In the CDPMV phase an expert defines the basic mold concept, starting with: proper part orientation; undercut area definition; basic mold dimension definition; and mold subsystems definition. To support his/her decisions in this phase the expert usually uses a set of design rules, decision trees, and a past mold design database. In the CDPMV phase the expert also verifies product manufacturability for the prescribed manufacturing technology, in this case injection molding. For this step commercial CAE software is available.
In the REP phase an expert is faced with the estimation of proper resources for a complete project. This is a crucial phase of the estimation process. To formulate an estimation in this phase the expert usually relies on information from a mold material database, a post-calculation database, and a manufacturing technology database. The REP phase is followed by the ECP phase in which the estimation is translated into corresponding financial values.
In the REP phase experts usually use intuitive estimation methods, which have the aforementioned reliability disadvantages. To minimize the problem regarding the reliability of the estimation results, it is proposed to place an estimation supporting model here. The position of the supporting model in the estimation process shown in Fig. 3 is denoted in red colour. The estimation support can be achieved by different modelling methods like regression, ANN, support vector machines, etc. By applying the estimation supporting model the unsupported estimation process is upgraded to a supported estimation process (see Fig. 2).
In this article the focus is on the most influential factor in the project estimation process: the volume of manufacturing hours (VMH). VMH is defined by:

$$\mathrm{VMH} = \sum_{P} \sum_{OP} \left( t_m + t_l + t_u \right) \tag{1}$$

and represents the total of all machining hours spent to complete all parts (P) of the mold. Only the hours when machines are actually occupied are taken into account. This means that at each operation (OP) the machining time ($t_m$), the loading time ($t_l$) and the unloading time ($t_u$) are taken into account.
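The VMH definition can be illustrated with a minimal Python sketch, assuming VMH is the sum of machining, loading and unloading time over every operation of every part, as the text describes. The `Operation` class, the part names and all hour values are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass
class Operation:
    """Machine-occupied time of one operation, in hours (hypothetical structure)."""
    machining_h: float   # t_m
    loading_h: float     # t_l
    unloading_h: float   # t_u


def volume_of_manufacturing_hours(parts):
    """Sum t_m + t_l + t_u over every operation (OP) of every part (P)."""
    return sum(
        op.machining_h + op.loading_h + op.unloading_h
        for operations in parts.values()
        for op in operations
    )


# Made-up example: a mold with two parts and three operations in total.
example_mold = {
    "cavity_insert": [Operation(12.0, 0.5, 0.5), Operation(6.0, 0.25, 0.25)],
    "core_insert": [Operation(9.0, 0.5, 0.5)],
}
example_vmh = volume_of_manufacturing_hours(example_mold)
```

Idle time between operations is deliberately absent from the data structure, since only the hours in which a machine is occupied count towards the VMH.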
To support the estimation of the VMH, the ANN-based model is used, which is described in the following section.

ARTIFICIAL NEURAL NETWORK MODEL
ANNs are recognized as universal function approximators and can be efficiently used to model high dimensional and nonlinear relations [14]. They represent a valid alternative, especially when relationships are not known in either a parametric or an analytical form. An ANN is an empirical model that learns from past examples and generalizes the solution for new cases.
In our case, the purpose of the ANN is to generate a mapping from selected input data into a corresponding estimation of the VMH, based on learning from empirical data without any prior knowledge of the mapping function. The ANN output retrieved from the model serves as an evaluation indicator for the expert to confirm their estimation or to re-evaluate and correct it accordingly.
The methodology for the implementation of an ANN-based estimation of the VMH consists of three major phases: input variable definition; ANN architecture definition and training; and model validation, as shown in Fig. 4. After the ANN architecture and input variables are optimized, and the ANN model performance is approved by an expert, it is ready for implementation as a support in the estimation process as presented in Fig. 3.

Input Variable Definition
When implementing an ANN model for the VMH estimation one of the most vital steps is to define an appropriate set of input variables that are presumably related to the VMH. In our case the VMH is mostly influenced by (see Fig. 5):
• micro and macro part geometry and quality requirements (MMPGQR), prescribed with a 3D CAD model, part drawing, and special technical requirements [28],
• technical requirements for the injection mold (TRFIM) that define the environment in which the mold will operate in serial production (molding facility),
• mold design principles/rules (MDP/R).
Production environment characteristics in which mold manufacturing takes place (mold shop equipment, organization, technology utilization, corporate culture, etc.) can also be used as ANN input variables. However, these characteristics are more applicable for estimations used in later project stages, when mold design is already completed. In the case when a cumulative variable like the VMH is observed, it can be presumed that the production environment influence is already captured within the expected ANN output. These are the outputs that are collected through the samples described in Section 2.3.
When the selection of ANN input variables was considered, an expert opinion was taken into account. Based on this, 22 input variables were used, of which 11 describe the MMPGQR, five describe the TRFIM, and six describe the MDP/R characteristics. Names and the corresponding variable value type are shown in Table 2.

ANN Architecture Definition
To model a multivariable relation between the 22 selected input variables and the corresponding VMH value a multi-layer feed-forward network is used. For ANN training a Levenberg-Marquardt learning rule is applied. It is a method which is fast and most appropriate for training moderate-sized, feed-forward neural networks [29].
As a performance function for feed-forward networks the mean square error (MSE) has been used, which defines the average squared difference between the network outputs and the target VMH outputs.
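For reference, the performance function and the textbook form of the Levenberg-Marquardt weight update can be written out as follows. The notation is introduced here for illustration and is not taken from the article: $t_i$ and $y_i$ are the target and network output for sample $i$, $N$ is the number of samples, $\mathbf{w}_k$ is the weight vector at iteration $k$, $\mathbf{e}$ is the vector of errors $y_i - t_i$, $\mathbf{J}$ is the Jacobian of $\mathbf{e}$ with respect to the weights, and $\mu$ is a damping parameter. The exact variant used by a particular training routine may differ in detail.

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( t_i - y_i \right)^2$$

$$\mathbf{w}_{k+1} = \mathbf{w}_k - \left( \mathbf{J}^{\top} \mathbf{J} + \mu \mathbf{I} \right)^{-1} \mathbf{J}^{\top} \mathbf{e}$$

For large $\mu$ the step approaches gradient descent with a small step size, and for small $\mu$ it approaches the Gauss-Newton step, which is why the method is well suited to networks of moderate size.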
The initial ANN architecture is shown in Fig. 6. In addition to the 22 units in the input layer it consists of 10 neurons with a sigmoid activation function in the hidden layer, and a single output neuron with a linear activation function. The ANN structure is implemented in a MATLAB environment.
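The forward pass of such a 22-10-1 network can be sketched in plain Python. This is an illustration under stated assumptions rather than the MATLAB implementation used in the study: the logistic function is assumed as the sigmoid, the weights are random placeholders, the function names are hypothetical, and no training step (Levenberg-Marquardt or otherwise) is included.

```python
import math
import random

N_INPUTS, N_HIDDEN = 22, 10  # layer sizes of the initial architecture


def sigmoid(z):
    """Logistic sigmoid (assumed; the exact sigmoid variant is not stated)."""
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_inputs=N_INPUTS, n_hidden=N_HIDDEN, seed=0):
    """Create placeholder weights; a real model would obtain them by training."""
    rng = random.Random(seed)
    return {
        "w_hidden": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                     for _ in range(n_hidden)],
        "b_hidden": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w_out": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b_out": rng.uniform(-0.5, 0.5),
    }


def forward(net, x):
    """Sigmoid hidden layer followed by a single linear output neuron."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w_hidden"], net["b_hidden"])
    ]
    return sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"]


def mean_squared_error(targets, outputs):
    """Average squared difference between targets and network outputs."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)


# One untrained estimate for a made-up, already scaled input vector.
example_output = forward(init_network(), [0.5] * N_INPUTS)
```

Inputs are assumed to be scaled to a small numeric range before being passed to `forward`; how the 22 variables are encoded is described by Table 2 and is not reproduced here.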

Validation of the ANN Model
Training and validation of the ANN model relies on a large number of samples comprised of ANN input and the corresponding target output data. Obtaining a large number of samples in an individual production, such as mold manufacturing, represents a certain obstacle, because companies hold this information as internal know-how. In our case 105 samples were obtained from a mid-sized mold shop. The samples were taken from automotive industry projects where the injection mold typically holds mirrored part geometry. These are usually referred to as 1+1 cavity molds (see Fig. 7). By narrowing the research to a certain type of molds, improved results are expected, and a narrower and denser decision space is achieved.
In order to overcome the obstacle of a restricted number of samples in ANN performance validation a multifold cross-validation procedure was used. For this purpose, a set of 105 input samples was randomized and divided into five subsets, each containing 21 samples. For each training the assigned subset was selected as a testing subset. The remaining four subsets were used for training. For statistical relevance of the ANN performance the ANN training and testing with the defined subsets was repeated five times. An average value of the output error was used as a measure of ANN performance for assigned testing and training subsets. Through an iteration process the number of neurons in the hidden layer was optimized, keeping in mind the fundamental ANN rules of minimizing the output error and keeping the network small. The final ANN architecture consists of four neurons in the hidden layer with a sigmoid activation function and one neuron with a linear activation function in the output layer. As ANN inputs in our case all 22 variables presented in Table 2 were used.
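The five-fold split described above can be sketched as follows. This is a minimal illustration in Python with hypothetical function names and an arbitrary random seed; it only produces the index sets and leaves the training and error averaging to the caller.

```python
import random


def make_folds(n_samples=105, n_folds=5, seed=0):
    """Shuffle sample indices and cut them into equally sized subsets."""
    if n_samples % n_folds != 0:
        raise ValueError("this sketch assumes n_samples is divisible by n_folds")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    fold_size = n_samples // n_folds
    return [indices[k * fold_size:(k + 1) * fold_size] for k in range(n_folds)]


def train_test_splits(folds):
    """Yield (train, test) index lists; each fold is the test subset once."""
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, test_idx


folds = make_folds()                      # five subsets of 21 samples
splits = list(train_test_splits(folds))   # five (84 train, 21 test) pairs
```

Averaging the test error over the five splits gives one cross-validation estimate; repeating the procedure with different seeds would correspond to repeating the training and testing several times.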

ANN MODEL ESTIMATION RESULTS
An example of the comparison between network outputs and target outputs is shown in Fig. 8.
From the figure, a low scatter and an acceptable correlation between the target value and the corresponding estimation of the VMH, with a correlation coefficient of 0.92545, are evident. Although this result is very encouraging, it can also be deceptive.
To further analyze the ANN model performance, additional indicators are used. ANN model performance was characterized by the relative percentage error (RPE) and the mean absolute percentage error (MAPE) of the estimated VMH, defined by Eqs. (2) and (3):

$$\mathrm{RPE}_i = \frac{y_i - t_i}{t_i} \cdot 100\% \tag{2}$$

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y_i - t_i}{t_i} \right| \tag{3}$$

Fig. 8. Scatter plot of network outputs vs. target outputs

In Eqs. (2) and (3), $t_i$ and $y_i$ denote the target value and the ANN-estimated value of the VMH, and $N$ denotes the number of input samples. Of the errors defined above, the MAPE is used for statistical characterisation of ANN performance, whereas the RPE has an additional practical interpretation, as negative and positive RPE correspond to underestimation and overestimation of the VMH, respectively. While overestimation represents either profit or, in the worst case, a non-competitive offer, underestimation means a very dangerous non-profitability of the project.
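Both indicators are straightforward to compute. The sketch below assumes that the RPE of sample i is 100 (y_i - t_i) / t_i, so that a negative value means underestimation, and that the MAPE is the mean of |y_i - t_i| / t_i expressed as a fraction, which is consistent with the sign convention and the magnitude of the values reported in the text. The function names and the three sample values are made up for illustration.

```python
def relative_percentage_errors(targets, outputs):
    """RPE per sample in percent; negative values mean underestimation."""
    return [100.0 * (y - t) / t for t, y in zip(targets, outputs)]


def mean_absolute_percentage_error(targets, outputs):
    """MAPE as a fraction (0.10 corresponds to 10%)."""
    return sum(abs((y - t) / t) for t, y in zip(targets, outputs)) / len(targets)


# Made-up target and estimated VMH values in hours.
example_targets = [1000.0, 2000.0, 1500.0]
example_outputs = [900.0, 2200.0, 1500.0]

example_rpe = relative_percentage_errors(example_targets, example_outputs)
example_mape = mean_absolute_percentage_error(example_targets, example_outputs)
```

In this example the first sample is underestimated by 10% and the second overestimated by 10%; the two cancel in a mean of signed errors but not in the MAPE, which is why both indicators are reported.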
The RPE for each sample i is shown in Fig. 9 and the corresponding histogram is presented in Fig. 10. From Fig. 10 it is evident that the majority (89.5%) of the predicted VMH values have an RPE in the range between -25% and +25%. However, the fact that in 4.8% of the predicted VMH values the corresponding RPE is below -25% should not be overlooked. In the most extreme case, the underestimation shows an RPE of -38.1%. An estimator should keep in mind the level of underestimation that can be expected from using the ANN model.
The results of the ANN model performance for a particular validation subset are shown in Table 3. In addition to RPE and MAPE, the minimum and maximum ANN output of the VMH are also given, indicating the ANN output range. The overall network output based on the performed cross-validation using five subsets yields a MAPE of 0.133. These results show that additional guidance is needed in order to apply the results gained from the ANN model in the estimation process.

Fig. 10. RPE sample histogram and cumulative distribution
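The cumulative distribution in Fig. 10 is what makes a safety margin quantifiable: if every RPE is shifted by s percentage points towards overestimation, only the samples whose original RPE was below -s remain underestimated. A minimal Python sketch of this reading follows. The function names and the ten RPE values are made up for illustration and are not the study data, and the sketch models only the shift of the histogram, not how a margin would be added to an individual new estimate.

```python
def shift_rpe(rpe_values, safety_factor_pct):
    """Shift every RPE (in percent) towards overestimation by the safety factor."""
    return [r + safety_factor_pct for r in rpe_values]


def underestimation_share(rpe_values):
    """Fraction of samples with a negative RPE, i.e. underestimated VMH."""
    return sum(1 for r in rpe_values if r < 0) / len(rpe_values)


# Made-up RPE sample in percent.
sample_rpe = [-30.0, -20.0, -10.0, -5.0, 0.0, 5.0, 10.0, 15.0, 20.0, 25.0]

share_before = underestimation_share(sample_rpe)
share_after_15 = underestimation_share(shift_rpe(sample_rpe, 15.0))
share_after_25 = underestimation_share(shift_rpe(sample_rpe, 25.0))
```

A larger safety factor lowers the remaining share of underestimates at the cost of pushing more estimates into overestimation, so the choice of factor is a trade-off between project profitability and offer competitiveness.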
For an expert it is important to have sufficient confidence in estimations given by the ANN model.
For this purpose, the RPE shown in Fig. 9 was reshaped into histogram form together with the corresponding cumulative function, as shown in Fig. 10. The RPE sample histogram gives better information regarding ANN model behaviour. The information gained from this diagram is the basis for proposing a practical safety-factor approach. The goal of this approach is to give the expert guidance on how to interpret and use the ANN estimations in order to reach a conservative decision in real-life application. For the purpose of the practical safety-factor approach, the 80/20 Pareto principle was applied. From the cumulative distribution it can be seen that 20% of all outputs have an RPE of -15% or less. This gives a basis for defining the safety-factor approach. To achieve the 15% safety factor on the gained ANN model output, we artificially shift the obtained histogram into the overestimating interval. It can be expected that by applying this factor approximately 20% of all cases will fall in a safer underestimation interval, as shown in Fig. 11. To achieve even more conservative decisions a higher, 25% safety factor is advised. In this case it can be expected that only around 4.8% of all cases will fall in an underestimation interval.

CONCLUSIONS
This paper proposes an implementation of an ANN-based model that can be used as expert support in the project estimation process. The proposed supported project estimation process defines a bridge between expert-driven intuitive models and data-driven models. As an example, an ANN model to estimate the VMH is considered. The results show that the presented ANN model fulfils the requirements of relevancy, simplicity, and reliability. A major benefit of the ANN is the ability to model multivariable relations, but on the other hand the model showed in some cases output deviations that should not be neglected in real-life application of the model.
By implementing a safety-factor approach, guidance is given to the expert on how to handle the network output in order to decrease the probability of unwanted project underestimation and to achieve acceptable confidence in the estimation.
The following benefits can be expected by applying the proposed supported estimation approach:
• lowered risk of underestimating the complexity of the project,
• embedded repeatability and stability in the decision making process,
• improvement in expert estimation reliability,
• a significantly shorter estimation process,
• allowing an enterprise to foresee sufficient manufacturing resources in the early project stage,
• by adapting input data specific to the estimator's environment this model can be applied in any mold shop,
• it can be used as a learning assistant for novice estimators.
The major limitation of the proposed model-based, supported project estimation process is the limited number of samples. In addition, it cannot be neglected that, with a limited number of input parameters, the information is incomplete from a wider perspective. As a result, in decision making processes, experts frequently rely on information that is incomplete. To overcome this obstacle, future research activities will consider implementation and development of a specially tailored expert elicitation model.

Fig. 3. The systematic, expert driven project estimation process supported by ANN

Fig. 7. Example of a typical injection mold for the automotive industry holding geometry for mirrored parts (left and right side of the vehicle)

Table 1. Literature overview

Table 2. ANN input and output variables with corresponding value type and encoding

Table 3. Network output indicators