Error Prediction for a Large Optical Mirror Processing Robot Based on Deep Learning

Predicting the errors of a large optical mirror processing robot (LOMPR) is essential when studying a feedforward control error compensation strategy to improve the motion accuracy of the LOMPR. Therefore, an end trajectory error prediction model of a LOMPR based on a Bayesian optimized long short-term memory (BO-LSTM) network was established. First, the batch size, number of hidden neurons, and learning rate of the LSTM were optimized using a Bayesian method. Then, the established prediction model was used to predict the errors in the X and Y directions of the spiral trajectory of the LOMPR, and the prediction results were compared with those of a back-propagation (BP) neural network. The experimental results show that the training time of the BO-LSTM was reduced to 21.4 % and 15.2 % of that of the BP neural network in the X and Y directions, respectively. Moreover, relative to the BP neural network, the MSE, RMSE, and MAE of the prediction error in the X direction were reduced to 9.4 %, 30.5 %, and 31.8 %, respectively, and those in the Y direction were reduced to 9.6 %, 30.8 %, and 37.8 %, respectively. The results verify that the BO-LSTM prediction model improves not only the accuracy of the end trajectory error prediction of the LOMPR but also the prediction efficiency, which provides a research basis for improving the surface accuracy of an optical mirror.


INTRODUCTION
With the development of the information age, modern optical systems are rapidly developing towards large apertures, high precision, high resolution, and high power. The requirements for LOMPS are also increasing. To keep up with the development trends of advanced optical systems, LOMPS suitable for large-scale, aspherical, efficient, and precise optical mirror processing have become a key technology that urgently needs to be developed. Optical mirror processing includes four steps: rough grinding, milling forming, fine grinding, and polishing [1] and [2]. The large optical mirror processing robot (LOMPR) follows a fixed trajectory in these four steps, commonly a grid trajectory, concentric circle trajectory, or spiral trajectory. The dynamic characteristics and control parameters of the LOMPR under a specific trajectory also show periodic changes [3] and [4]. Moreover, the processing of optical mirrors requires multiple iterations. The process of fabricating a large finished optical mirror from a blank can last from several days to several months [5] to [7]. Characterizing this long-term periodic movement aids the statistical analysis of the non-linear error and the prediction of the direction and size of future errors, which allows countermeasures to be taken in advance based on the predicted results [8] and [9]. Therefore, it is necessary to predict the trajectory error of the LOMPR based on the previous processing parameters to compensate for the uncertainty of the model and improve the surface accuracy and processing efficiency of the optical mirror [10] to [12].
Deep learning has developed rapidly in recent years, and many of its algorithms are suitable for non-linear fuzzy data processing. Many scholars have applied deep learning and other intelligent analysis methods to the field of traditional robots. Dai et al. [13] applied a long short-term memory (LSTM) model to vehicle trajectory prediction. Mici et al. [14] predicted the future instructions of a motor using a neural network structure to compensate for the delay error. To solve the trajectory tracking problem of a wheeled mobile robot under non-holonomic constraints and in the presence of model uncertainties, Mirzaeinejad et al. [15] derived and optimized a control law by minimizing a pointwise quadratic cost function for the predicted tracking errors. Wang et al. [16] proposed an adaptive Jacobian controller in which the adaptation of the kinematic and dynamic parameters is driven by the prediction error. Las Casas et al. [17] established a fully connected feedforward artificial neural network with supervised learning to predict the error between the welding parameters of a welding robot. To predict the motion direction of a robot's end tool path in real time, Wang et al. [18] proposed generating a trajectory predictor by training an LSTM off-line on end tool paths guided by a human. Zhang et al. [19] predicted the torque of a six-degrees-of-freedom (6-DOF) cooperative robot using a cooperative robot dynamic model compensated by an LSTM. Yang et al. [20] predicted the fault of a robot in a dynamic working state based on deep learning. Chebbi et al. [21] carried out a sensitivity analysis and positioning error limit prediction for a 3-DOF translational parallel robot. Zatout et al. [22] analysed the optimal output problem of a fuzzy logic controller for quadrotor attitude stabilisation using particle swarm optimization, the bat algorithm, and cuckoo search, and found that the bat algorithm gave the best performance. The above scholars applied deep learning and intelligent analysis methods to robot parameter optimization, trajectory prediction, fault diagnosis, and robot positioning, and achieved good results. However, few scholars have conducted relevant research on the error prediction of the end execution point of a hybrid robot with large load and coupling characteristics [23] to [25]. The trajectory error of the end execution point of a robot can be predicted using deep learning, which can provide feedforward compensation data for the servo control system.
However, not all deep learning models can be applied to different engineering fields. To cope with the characteristics of different engineering states and to make deep learning algorithms fit the application better, many scholars have improved or optimized them. Li et al. [26] proposed an LSTM training method based on evolutionary attention and a competitive random search, which was applied to multivariate time series forecasting. Ahmad et al. [27] proposed a predictive control strategy for a four-degree-of-freedom half-car model in the presence of an active pneumatic surface to improve the attitude control ability of the vehicle. Ullah et al. [28] proposed model predictive control and H∞ control of a 6-DOF manipulator to improve the trajectory accuracy of the robot. Ali et al. [29] used a particle swarm optimization algorithm to optimize the gains of the proportional-integral-derivative controller, the weighting matrices of the linear quadratic regulator, and their ratio of contributions, so as to improve the performance of the controller and achieve accurate trajectory tracking. Bai et al. [30] analysed the performances of linear model predictive control, linear error model predictive control, non-linear model predictive control, and non-linear error model predictive control for mobile robot path tracking. To solve the time delay compensation problem for a non-linear teleoperation system, Shen et al. [31] proposed a motion prediction method based on a cascaded structure state observer. Tang et al. [32] predicted the wind speed range through particle swarm optimization of a deep learning network. Wang et al. [33] designed a high-performance detection model based on a superimposed LSTM to realize real-time fault detection for unmanned aerial vehicles using a statistical threshold. Mei et al. [34] proposed a vibration-based terrain classification method using a one-dimensional convolutional LSTM to learn both the spatial and temporal characteristics of dampened vibration signals. Wu et al. [35] predicted the health state of an aviation turbofan engine using a vanilla LSTM. Focusing on the unique attributes of the engineering field studied, the above scholars improved existing deep learning algorithms, optimized search algorithms and parameters, and adjusted deep learning frameworks to predict data, achieving ideal results. This background provides a reliable theoretical basis for predicting the error of the end execution point of a LOMPR using a Bayesian optimized long short-term memory (BO-LSTM).
Effectively predicting the end execution trajectory error of a LOMPR is very important when compensating for the robot's feedforward error and improving the surface accuracy of an optical mirror. This study established an end trajectory error prediction model for a LOMPR based on Bayesian hyperparameter optimization of an LSTM. The model can effectively predict the errors of the LOMPR in the X and Y directions under different processing trajectories. The remainder of this paper is structured as follows. Section 1 introduces the structure of the LOMPR and the tool path of the robot when a computer controlled optical surfacing (CCOS) grinding system is used. Section 2 shows how the LSTM deep learning model was constructed based on Bayesian optimization. Section 3 shows how the effectiveness of the prediction model under the spiral trajectory was verified using an experimental analysis. Finally, Section 4 gives the conclusion of this paper.

LARGE OPTICAL MIRROR PROCESSING ROBOT
Based on the requirements of the optical mirror grinding process, the LOMPR needs at least five degrees of freedom (5-DOF). The LOMPR used in this study was a 5-DOF hybrid robot developed using a 3-DOF parallel (UPS+UP) manipulator combined with a 2-DOF serial manipulator. The topology of the LOMPR is shown in Fig. 1. In the parallel part, the moving platform and the fixed platform were connected by three UPS driving branches and one UP constrained branch. A world coordinate system (O-XYZ) was established, where O was the intersection of the horizontal axis of the fixed platform centre point and the vertical axis of the optical mirror. The X-axis pointed in the negative direction of point $O_1$, the Z-axis was vertically downward, and the Y-axis was determined according to the right-hand rule. The rotation axis of the primary rotary head was collinear with the UP branch chain, and the rotation axis of the secondary rotary head was perpendicular to that of the primary rotary head and parallel to the Y-axis.
In the local coordinate system of each driving branch, $O_{Ai}$ is the central hinge point of the Hooke hinge on the fixed platform, the $z_{Ai}$-axis points downward along the driving branch chain, the $y_{Ai}$-axis is perpendicular to the driving branch chain along the internal axis of the Hooke hinge, and the $x_{Ai}$-axis follows the right-hand rule. The grinding tool is a CCOS grinding system, which consists of a rotating motor and a revolving motor. A pneumatic suspension device is mounted above the rotation axis. When the LOMPR produces a position error in the Z-axis direction, the device moves the suspension axis up or down to compensate for the error. Therefore, the error of the end trajectory of the LOMPR in the Z-axis direction does not affect the surface processing accuracy of the optical mirror.
The position error of the end execution point of the LOMPR results from the coupling of multiple influencing factors. The serial manipulator of the LOMPR has a relatively stable structural size, low motion speed, and small stroke, so the error it causes is very small. Here, only the position error caused by the UPS branch chains of the parallel module is considered. The main factors affecting the error of a single UPS branch chain are the friction of the moving pair, elastic deformation, and servo system error, so the error of a single UPS branch chain can be expressed as:
$$e_{li} = e_{fi} + e_{di} + e_{ci} + e_{oi}$$
where $e_{li}$ is the comprehensive error of UPS branch chain $i$, $e_{fi}$ is the error caused by friction of the moving pair, $e_{di}$ is the error caused by force deformation of the branch components, $e_{ci}$ is the error of the servo system, and $e_{oi}$ represents other errors. The friction of the moving pair of the mechanism is mainly the friction generated during the movement of the Hooke hinge, composite ball hinge, and ball screw of the UPS, which results in insufficient driving force.
Moreover, the force state of each motion pair changes with the posture of the mechanism in the motion space, so the friction of the motion pair of the UPS is also related to its spatial posture. Therefore, the error caused by the friction of the moving pair can be expressed using $F_{fi}$, the conversion function between the friction and the position error of the moving pair; $M_{fi}$, the friction of the moving pair; and $(x, y, z)$, the spatial coordinate of the LOMPR. The kinematic pair friction is analysed by introducing viscous friction based on the Coulomb friction model. The friction of the UPS is then obtained from terms that include the friction torques of the composite ball hinge about the $x_{Ai}$, $y_{Ai}$, and $z_{Ai}$ axes; the friction force $F^{p}_{fiz}$ of the ball screw along the $z_{Ai}$ axis; and the friction torque $M^{p}_{fiz}$ of the ball screw about the $z_{Ai}$ axis. Here, $\alpha$, $\beta$, and $\gamma$ represent the roll-pitch-yaw (RPY) angles of the moving platform relative to the fixed platform.
The position error caused by the elastic deformation of the UPS can be expressed through the kinematic constraints of a parallel mechanism under elastic deformation, using the kinematic constraint matrix $J_{di}$ and the displacement and rotation $q_i$ of the UPS under elastic deformation.
When the speed feedforward control model is adopted, the servo system error function of the UPS can be expressed in terms of quantities that include $J$, the load moment of inertia of the motor; $B$, the viscous resistance; $K_{pv}$, the proportional coefficient of the speed loop; $K_T$, the torque coefficient; and $K_{sv}s$, the speed feedforward transfer function.

LSTM Deep Neural Network
The LSTM was developed from recurrent neural networks (RNNs) and is mainly used to solve the problem of long-term data dependence in RNNs.
An LSTM can retain information and its correlations over a long period of time and recall them easily, so that the information does not decay. Training an LSTM consists of three steps: a state forward calculation, error back propagation, and a weight update.

1) State forward calculation
The input values of an LSTM are the network input $X_t$ at the current time and the network state $S_{t-1}$ at the previous time. These flow through the forgetting gate F, input gate I, hiding unit H, and output gate O.
(1) Forgetting gate
$$F_t = g(U_F X_t + W_F S_{t-1} + b_F)$$
where $U_F$, $W_F$, and $b_F$ represent the input connection matrix, feedback connection matrix, and offset term of the forgetting gate, respectively, and $g$ is the sigmoid function.
(2) Input gate
$$I_t = g(U_I X_t + W_I S_{t-1} + b_I)$$
where $U_I$, $W_I$, and $b_I$ represent the input connection matrix, feedback connection matrix, and offset term of the input gate, respectively.
(3) Hiding unit
$$H_t = \tanh(U_H X_t + W_H S_{t-1} + b_H)$$
where $U_H$, $W_H$, and $b_H$ represent the input connection matrix, feedback connection matrix, and offset term of the hiding unit, respectively.
(4) Storage element status update
$$C_t = F_t \odot C_{t-1} + I_t \odot H_t$$
where $C_t$ is the information of the storage element and $\odot$ denotes the element-wise product.
(5) Output gate
$$O_t = g(U_O X_t + W_O S_{t-1} + b_O)$$
where $U_O$, $W_O$, and $b_O$ represent the input connection matrix, feedback connection matrix, and offset term of the output gate, respectively.

(6) LSTM hidden layer output
The tanh transfer function is applied to the information, $C_t$, of the storage element, and the result is multiplied by the output gate to obtain the LSTM hidden layer output:

$$S_t = O_t \odot \tanh(C_t)$$
(7) Network output
The hidden layer output of the LSTM is sent to the softmax output layer through connection matrix $V$ to obtain the actual output of the network.
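The forward calculation in steps (1) to (6) can be sketched in a few lines of plain Python. This is a minimal illustration that assumes the standard LSTM gate equations (sigmoid gates, a tanh hiding unit, element-wise products). The names `affine`, `lstm_step`, and `zero_matrix`, the layout of the `params` dictionary, and the all-zero example weights are hypothetical and are not taken from the paper's implementation. The softmax output layer of step (7) is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(U, x, W, s, b):
    """Return U @ x + W @ s + b for matrices and vectors stored as plain lists."""
    return [
        sum(u * xi for u, xi in zip(U_row, x))
        + sum(w * si for w, si in zip(W_row, s))
        + b_i
        for U_row, W_row, b_i in zip(U, W, b)
    ]


def lstm_step(x_t, s_prev, c_prev, params):
    """One forward step. params maps 'F', 'I', 'H', 'O' to a tuple (U, W, b)."""

    def unit(name, activation):
        U, W, b = params[name]
        return [activation(v) for v in affine(U, x_t, W, s_prev, b)]

    f_t = unit("F", sigmoid)    # forgetting gate
    i_t = unit("I", sigmoid)    # input gate
    h_t = unit("H", math.tanh)  # hiding unit (candidate information)
    o_t = unit("O", sigmoid)    # output gate
    # Storage element update: keep part of the old state, add part of the candidate.
    c_t = [f * c + i * h for f, c, i, h in zip(f_t, c_prev, i_t, h_t)]
    # Hidden layer output: output gate times tanh of the storage element.
    s_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return s_t, c_t


def zero_matrix(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


# Hypothetical example: 1 input, 2 hidden units, all weights and offsets zero.
params = {name: (zero_matrix(2, 1), zero_matrix(2, 2), [0.0, 0.0]) for name in "FIHO"}
s_t, c_t = lstm_step([1.0], [0.0, 0.0], [1.0, -2.0], params)
print(c_t)  # every gate equals 0.5 and the candidate is 0, so c_t = [0.5, -1.0]
```

With zero weights each gate evaluates to 0.5, so the storage element simply halves, which makes the role of the forgetting gate easy to check by hand.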

2) Error back propagation
Based on the actual network output obtained by the forward calculation, an error function is defined with respect to $Y_t$, the annotation signal from the data. The downstream error at a time step $t$ can be divided into two parts: the error flowing back from the later time step, $t+1$, and the error of the same time step, $t$. In the expressions for these two parts, $\delta$ denotes the neural node error. The sigmoid transfer function and the hyperbolic tangent function are expressed as follows:
$$g(x) = \frac{1}{1 + e^{-x}}, \qquad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \qquad (16)$$
Further calculation then gives the errors of the individual gates.
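Back propagation through the gates relies on the derivatives of the sigmoid and hyperbolic tangent functions. The sketch below states the two well-known closed forms, $g'(x) = g(x)(1 - g(x))$ and $\tanh'(x) = 1 - \tanh^2(x)$, and checks them against a central finite difference. The helper names are illustrative and do not come from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_prime(x):
    """Closed-form derivative of the sigmoid: g(x) * (1 - g(x))."""
    g = sigmoid(x)
    return g * (1.0 - g)


def tanh_prime(x):
    """Closed-form derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def central_difference(f, x, h=1e-6):
    """Numerical derivative used only as a sanity check."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x = 0.7  # arbitrary test point
sigmoid_gap = abs(sigmoid_prime(x) - central_difference(sigmoid, x))
tanh_gap = abs(tanh_prime(x) - central_difference(math.tanh, x))
print(sigmoid_gap < 1e-6, tanh_gap < 1e-6)
```

Because both derivatives can be computed from the forward-pass outputs, the gate activations stored during the forward calculation can be reused when the errors are propagated backwards.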

3) Weight update
Based on the errors of the forgetting gate, input gate, and output gate in each time step, combined with the output values of each neuron obtained in the forward calculation, the connection weight parameters associated with these components can be updated.
The connection parameters of the forgetting gate are updated using the learning rate $\beta$ over the iterations $n$.
Corresponding update formulas apply to the connection parameters of the input gate, the hidden layer, and the output gate.
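For orientation, a generic gradient descent update can be written as below, shown here for the forgetting gate parameters. This is a sketch under the assumption that plain gradient descent on the error function is used. The symbol $E$ for the error function and the superscript iteration index are introduced here for illustration, and the paper's exact expressions in terms of the node errors $\delta$ are not reproduced.

```latex
\begin{aligned}
U_F^{(n+1)} &= U_F^{(n)} - \beta \, \frac{\partial E}{\partial U_F^{(n)}}, \\
W_F^{(n+1)} &= W_F^{(n)} - \beta \, \frac{\partial E}{\partial W_F^{(n)}}, \\
b_F^{(n+1)} &= b_F^{(n)} - \beta \, \frac{\partial E}{\partial b_F^{(n)}}.
\end{aligned}
```

Under this assumption, a larger learning rate $\beta$ takes larger steps per iteration $n$, which is one reason the learning rate is among the hyperparameters tuned later.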

Bayesian Optimized Hyperparameters
During the training of the LSTM, the model has many hyperparameters that must be chosen by the user, including the data length, number of iterations, learning rate, number of hidden neural units, and forgetting coefficient. Appropriate hyperparameters are vital to improving the performance of the model. Bayesian optimization can handle continuous hyperparameter values and has a high search efficiency and accuracy. It uses an approximate global optimization algorithm, based on Bayes' theorem, to find the hyperparameter combination that minimizes the objective function:
$$p(f \mid D) = \frac{p(D \mid f)\, p(f)}{p(D)}$$
where $f$ represents the objective function; $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$ is the data set; $p(f \mid D)$ and $p(f)$ are the a posteriori probability and a priori probability, respectively; $p(D \mid f)$ is the likelihood distribution of $y$; and $p(D)$ is the marginal likelihood distribution of $f$. Given a set of hyperparameter combinations, $X = (x_1, x_2, \ldots, x_n)$, each combination is evaluated, and the evaluation result is $f(x_n)$. Then, the optimal hyperparameter combination is found as follows:
$$x^{*} = \arg\min_{x \in X} f(x)$$

1) Probabilistic agent model
The probabilistic agent (surrogate) model adopts a Gaussian process, which can be expressed as follows:
$$f(x) \sim \mathcal{GP}\big(\mu(x),\, k(x, x')\big)$$
where $\mu(x)$ is the mean function, $k(x, x')$ is the positive definite covariance function, and $f(x)$ is the average absolute error.
According to the nature of the Gaussian process, $f_t$ and $f_{t+1}$ obey a joint Gaussian distribution, where $K$ is the covariance matrix.

2) Acquisition function
The acquisition function can prevent the Bayesian optimization from falling into a local optimal solution. The expected improvement (EI) acquisition function finds the next sample point with the greatest expected improvement. The EI function is expressed in terms of $\Phi$, the standard normal cumulative distribution function; $\varphi$, the standard normal probability density function; and $f(x^{+})$, the existing maximum. The flow of the BO-LSTM is shown in Fig. 2.
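The EI criterion can be illustrated with a short sketch. It uses the common closed form for a Gaussian prediction with mean `mu` and standard deviation `sigma`, written for maximization to match the "existing maximum" $f(x^{+})$, with the conventional roles of the cumulative distribution function and the density. This form is an assumption, since the paper's own expression is not reproduced here, and the candidate values are made up for illustration. For a minimized objective such as a prediction error, the same function can be applied to the negated objective.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, f_best):
    """E[max(f - f_best, 0)] for f ~ N(mu, sigma^2), i.e. EI for maximization."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(mu - f_best, 0.0)
    z = (mu - f_best) / sigma
    return (mu - f_best) * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical surrogate predictions (mean, standard deviation) at two candidates.
f_best = 1.0
candidates = {"A": (0.9, 0.05), "B": (0.7, 0.5)}
scores = {name: expected_improvement(mu, sd, f_best) for name, (mu, sd) in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # "B": a lower mean but much larger uncertainty gives a higher EI
```

Candidate B wins even though its predicted mean is lower, which shows how EI trades off exploitation against exploration and thereby helps the search leave local optima.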

EXPERIMENTAL ANALYSIS
The prototype of the LOMPR is shown in Fig. 3. In mirror processing, grid, concentric circle, and spiral trajectories are commonly used. The curvature of the spiral trajectory changes constantly, which severely tests the dynamic characteristics of the LOMPR, and the end trajectory error generated by the processing robot is also relatively large. Therefore, the spiral trajectory is used as a representative case to predict the errors in the X and Y directions of the end trajectory of the LOMPR.
Fig. 3. Prototype of LOMPR
Because the optical mirror needs to complete an iterative cycle before surface detection, it is impossible to collect the error of the mirror surface in real time for prediction. Therefore, the training data are the end execution point position parameters calculated by substituting the collected motion branch chain parameters into the forward kinematics model. The actual trajectory error collected from the experimental mirror is then compared with the prediction results. On the one hand, this verifies the effectiveness of the BO-LSTM prediction model; on the other hand, it verifies the accuracy of the forward kinematics model. The technical route is shown in Fig. 4.
The experiment used the Keras deep learning library, written in Python, to complete the BO-LSTM model construction, data loading, training, and testing. First, the data were pre-processed, including time stitching, missing data processing, abnormal data processing, and data normalization. In data prediction, the learning rate, batch size, and number of hidden neurons directly affect the convergence speed and prediction accuracy of the model. These hyperparameters were therefore optimized, and the appropriate values were input into the model for training. To assess the effectiveness of this hyperparameter optimization, a set of default values based on engineering experience was used for comparison: a batch size of 1500, 35 hidden neurons, and a learning rate of 0.01. Considering the influence of the hyperparameters on the model performance and the interaction between hyperparameters, the candidate values were (200, 600, 1000, 1500) for the batch size, (5, 15, 35, 50) for the number of hidden neurons, and (0.1, 0.01, 0.001) for the learning rate; the optimizer was "Adam", and the activation function was "ReLU".
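The size of this search space can be made concrete with a short sketch. The dictionary keys and variable names below are hypothetical; only the candidate values and the default combination come from the text. Exhaustive enumeration is used here purely to count the combinations and is not the search strategy of the paper, which uses Bayesian optimization.

```python
from itertools import product

# Candidate values as listed in the text; the key names are illustrative.
search_space = {
    "batch_size": (200, 600, 1000, 1500),
    "hidden_neurons": (5, 15, 35, 50),
    "learning_rate": (0.1, 0.01, 0.001),
}

# Default combination based on engineering experience, as stated in the text.
default_config = {"batch_size": 1500, "hidden_neurons": 35, "learning_rate": 0.01}

# Every combination of the candidate values.
all_configs = [
    dict(zip(search_space, values)) for values in product(*search_space.values())
]

print(len(all_configs))               # 4 * 4 * 3 = 48 combinations
print(default_config in all_configs)  # True: the default lies inside the space
```

Each of the 48 combinations would require a full training run to evaluate, which is why a sample-efficient search is attractive even for a space of this size.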
Bayesian optimization was used to search the above hyperparameters, and the following optimal combination was obtained: a batch size of 1000, 15 hidden neurons, and a learning rate of 0.001. The experimental data were imported into the established LSTM error prediction model, with 90 % used as training samples and 10 % as test samples. The end execution point errors of the optical mirror machining robot under different trajectories were then predicted using the BO-LSTM. To verify the effectiveness of the BO-LSTM model in predicting the optical mirror processing trajectory error, a BP neural network prediction model was also used for prediction and analysis, and its prediction results were compared with those of the BO-LSTM.
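The 90 %/10 % division and the normalization step can be sketched as follows. Two points are assumptions made for illustration, since the text does not specify them: the split is taken chronologically (no shuffling), and the normalization is min-max scaling with bounds fitted on the training part only. The synthetic series and all function names are hypothetical.

```python
import math


def chronological_split(samples, train_percent=90):
    """Split a time-ordered sequence without shuffling (assumed, not stated in the text)."""
    cut = len(samples) * train_percent // 100
    return samples[:cut], samples[cut:]


def fit_min_max(train):
    """Scaling bounds are taken from the training part only."""
    return min(train), max(train)


def apply_min_max(values, low, high):
    span = high - low
    if span == 0:
        raise ValueError("constant series cannot be min-max scaled")
    return [(v - low) / span for v in values]


# Synthetic error signal with growing amplitude; not data from the experiment.
series = [0.001 * k * math.sin(2.0 * math.pi * k / 250.0) for k in range(1000)]

train, test = chronological_split(series)
low, high = fit_min_max(train)
train_scaled = apply_min_max(train, low, high)
test_scaled = apply_min_max(test, low, high)  # may fall outside [0, 1]
print(len(train), len(test))  # 900 100
```

Fitting the bounds on the training part keeps information from the test samples out of the pre-processing, at the cost that scaled test values can leave the unit interval when the error amplitude keeps growing.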
The LOMPR was controlled to process continuously along the spiral trajectory defined by its trajectory equation, with a motion period of 40 s and a sampling period of 40 ms. The motion data of each rotating axis were collected, and the trajectory error of the end execution point was calculated. The prediction results of the BO-LSTM prediction model and the BP neural network prediction model obtained from the collected data are shown in Fig. 5.
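The error measures used to compare the two models (MSE, RMSE, MAE, and the integral criteria IAE, ISE, ITAE, and ITSE) can be sketched with their standard definitions. The rectangle-rule discretization of the integrals, the function names, and the example residuals are assumptions for illustration; only the 40 s motion period and 40 ms sampling period come from the text.

```python
import math

SAMPLING_PERIOD_S = 0.04           # 40 ms sampling period
SAMPLES_PER_PERIOD = 40_000 // 40  # 40 s motion period in ms / sampling period in ms


def mse(residuals):
    return sum(r * r for r in residuals) / len(residuals)


def rmse(residuals):
    return math.sqrt(mse(residuals))


def mae(residuals):
    return sum(abs(r) for r in residuals) / len(residuals)


def integral_criteria(residuals, dt=SAMPLING_PERIOD_S):
    """Rectangle-rule approximations of the IAE, ISE, ITAE, and ITSE integrals."""
    times = [k * dt for k in range(len(residuals))]
    return {
        "IAE": sum(abs(r) for r in residuals) * dt,
        "ISE": sum(r * r for r in residuals) * dt,
        "ITAE": sum(t * abs(r) for t, r in zip(times, residuals)) * dt,
        "ITSE": sum(t * r * r for t, r in zip(times, residuals)) * dt,
    }


# Hypothetical residuals of two predictors; model_b errs twice as much as model_a.
model_a = [0.1, -0.2, 0.1, 0.0]
model_b = [2.0 * r for r in model_a]

mse_ratio = mse(model_a) / mse(model_b)
rmse_ratio = rmse(model_a) / rmse(model_b)
print(SAMPLES_PER_PERIOD, mse_ratio, rmse_ratio)
```

In this example the MSE ratio is 0.25 and the RMSE ratio is 0.5, since a ratio of RMSE values is the square root of the corresponding MSE ratio. The figures in the abstract are consistent with this relation: the square root of 9.4 % is about 30.7 %, close to the reported 30.5 % for the X direction.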
The data collected during the experiment and the trajectory error prediction results of the spiral trajectory in the X and Y directions are shown in Fig. 5. As the spiral trajectory is a trajectory with gradually increasing curvature, the errors of the LOMPR in the X and Y directions accumulate, so that the trajectory errors in the X and Y directions vary periodically in the form of a sinusoidal curve with gradually increasing amplitude. The prediction parameters of the BP neural network are shown in …
Fig. 4. Technical route of error prediction
… the two models are analysed through the error integration criterion. It can be seen that, compared with the prediction error of the BP neural network, the IAE, ISE, ITAE, and ITSE of the prediction error of the BO-LSTM in the X direction are reduced to 56.44 %, 33.33 %, 53.63 %, and 30.47 %, respectively, and those in the Y direction are reduced to 57.87 %, 37.84 %, 54.79 %, and 34.27 %, respectively. In summary, the prediction accuracy of the established BO-LSTM end trajectory error prediction model is significantly higher than that of the traditional BP neural network, which further verifies the accuracy and effectiveness of the established model.
… .51 %, respectively. When using the BO-LSTM prediction model for trajectory error prediction, the training times in the X and Y directions are reduced to 9 min and 7 min, respectively. Further analysis of the prediction results shows that the MSE, RMSE, and MAE of the prediction error in the X direction were 0.09 %, 2.99 %, and 2.32 %, respectively; the MSE, RMSE, and MAE of the prediction error in the Y direction were 0.10 %, 3.14 %, and 2.46 %, respectively. Moreover, through the predicted trajectory error data of 14.6 s, the prediction results of