1 s2.0 S0888327023008804 Main
Communicated by Y. Lei

It has become a general consensus that nacelle-mounted LiDAR can be used to calibrate the yaw misalignment or drive the
real-time yaw motions of wind turbines, which improves the power-generation efficiency. The advantage of LiDAR is that it
greatly improves the accuracy of inflow wind measurement, while its disadvantages are that the cost remains high and the data
validity is not sufficiently high. In this paper, an efficient machine learning method for estimating LiDAR measurements is
developed to establish a real-time yaw calibration framework and sustain the rolling utilization of LiDAR. Firstly, the
correlation of the LiDAR measurement with SCADA features is analyzed in order to estimate the LiDAR measurement using only
SCADA data. Secondly, several machine learning algorithms are compared in terms of performance, and the dependence of each
algorithm on data size is also analyzed. Experimental results show that the proposed XGBoost algorithm has high accuracy,
requires less data, and can quickly calibrate the yaw misalignment. Finally, a field test is carried out on a commercial
2 MW wind turbine to verify the effectiveness. The field-test results show that the proposed method is feasible for industrial
applications and can improve the annual theoretical power generation by 3.66% compared to the situation without calibration,
which also provides an executable and economical solution for LiDAR replacement planning.

Keywords: Wind turbines; Yaw misalignment; Machine learning; LiDAR; Field test
1. Introduction
Wind energy is a representative form of renewable energy, known for its high capacity density, efficiency, cleanliness, and
controllability. According to the Global Wind Energy Council (GWEC), the world’s total installed wind energy capacity has reached
approximately 906 GW, with 78 GW of new capacity added in 2022 [1]. As wind energy has expanded, operating and maintenance
costs have also risen steadily. Therefore, enhancing wind turbine (WT) performance is important for mitigating these
operating expenses.
The yaw control system, a typical controller in a wind turbine, is critical in the wind energy conversion process. Its key
task is to drive the yaw motor to ensure that the rotor faces the inflow wind direction [2]. For wind turbines in nominal operation,
the wind vane measures the direction of the inflow wind and sends the digitized direction information to the control module, so that
the central axis of the nacelle is kept parallel to the inflow wind direction. In practice, there are always deviations in
the mechanical structure due to external causes [3]. Relevant research shows that even a minor yaw misalignment can
significantly reduce WT production [4].
∗ Corresponding author.
E-mail address: [email protected] (Z. Lin).
https://doi.org/10.1016/j.ymssp.2023.110972
Received 22 February 2023; Received in revised form 20 October 2023; Accepted 22 November 2023
Available online 30 November 2023
0888-3270/© 2023 Elsevier Ltd. All rights reserved.
P. Chen et al. Mechanical Systems and Signal Processing 208 (2024) 110972
Accurate wind alignment is a prerequisite for yaw control. There are several ways to reduce yaw misalignment. Pei,
Yan et al. [5] proposed a yaw misalignment analysis and detection framework based on a SCADA data-driven method. Song, Dongran
et al. [6] proposed a yaw control solution to maximize turbine power using predicted wind direction data, and analyzed the effect of
yaw misalignment on the turbine yaw control system. Bao, Yunong et al. [7] proposed a compensation approach to overcome the yaw
misalignment. Astolfi et al. [8] presented a method for systematic yaw error detection by analyzing the yaw error–rotor speed
curve. Yang, Jian et al. [9] proposed a method incorporating environmental impacts into zero-point shifting diagnosis
based on SGPR and SciForest for data preprocessing. Qu, Chenzhi et al. [10] presented a power-oriented method of calibrating
yaw misalignment, and carried out a field test on six commercial WTs. In general, the cost of steady-state yaw calibration is low,
and the calibration effect can be clearly observed for wind turbines with poor performance. However, the benefits of steady-state
calibration are inherently limited because it relies on long-term statistics, and they gradually decrease as the wind turbine
operates year after year. Hence, real-time calibration of the yaw misalignment is urgently needed, which motivates this study.
With the continuous development of remote sensing wind measurement technology, using LiDAR to calibrate the yaw
misalignment of wind turbines has become a relatively mature technology in recent years. Wagner et al. [11] presented a
standardized method of LiDAR measurement and analysis, and a procedure for performing a power curve measurement with
nacelle-mounted LiDAR. Fleming et al. [12] showed how nacelle-mounted LiDAR can be used to improve wind turbine power capture by
reducing yaw misalignment in field tests. Furthermore, while the use of LiDAR may eliminate the yaw misalignment caused by the
wind vane [13], the high costs of the equipment and the associated maintenance remain major barriers [14]. Zhang, Le et al. [15]
proposed a tabulation method based on LiDAR measurements to correct the static yaw misalignment. LiDAR provides
high accuracy for yaw calibration, but it is expensive and limited in availability, so it is necessary to maximize the value
obtained from LiDAR while balancing cost and benefit.
In addition to LiDAR in wind turbines, the application of machine learning models in wind power control is also in full swing.
Most literature uses machine learning models for wind turbine condition monitoring (such as blade failure detection or generator
temperature monitoring) [16,17]. Based on the Random Forests and XGBoost, Zhang, Dahai et al. [18] proposed an efficient WT
fault detection method. Of course, machine learning models are also widely used to predict parameters such as wind speed and
direction [19,20]. Ouyang, Tinghui et al. [21] presented a data-driven approach to minimize yaw error using a model that predicts
wind direction. Khosravi et al. [22] presented three machine learning methods to predict the wind speed, wind
direction and output power of a WT. Saénz-Aguirre et al. [23] proposed an artificial neural network (ANN)-based reinforcement
learning (RL) yaw angle control strategy for a wind turbine, and simulated different wind scenarios in TurbSim. In a subsequent
study, Saénz-Aguirre et al. [24] proposed a method to improve the performance and reduce the mechanical loads in the yaw system
of this ANN-based RL yaw control strategy. Some recent studies have applied machine learning algorithms to yaw calibration.
In [25], the offset method is used to correct the static yaw misalignment, while machine learning algorithms are used to
estimate the dynamic yaw misalignment, using both ground-based and nacelle-mounted LiDAR. In [26], a framework is proposed that
detects and corrects yaw errors by directly forecasting the dynamic yaw error with a machine learning model, reducing the
total yaw error by up to 85% (71% on average).
To implement real-time yaw misalignment calibration, this paper proposes a data-driven method for estimating
LiDAR measurements. The SCADA operational data are first analyzed for correlation with the LiDAR measurements, and then four machine
learning algorithms are compared to determine the most suitable one. Taking a commercial 2 MW wind
turbine as the research object, a field test is carried out to verify the effectiveness and applicability of the proposed method.
Moreover, a preliminary LiDAR-replacement plan is given to balance the LiDAR investment cost and the yaw-calibration benefit.
The main contributions are as follows:
• A machine learning-based yaw misalignment calibration model is established, which can sustainably estimate the
nacelle-mounted LiDAR measurement based on its correlation with SCADA features. This method is suitable for low-cost rolling
utilization of LiDAR.
• A field test is carried out with three strategies, namely conventional wind vane control, LiDAR-assist control and machine
learning-based control, and the field-test results are verified through the power output characteristics.
• The cost and benefit are estimated to discuss the feasibility of nacelle-mounted LiDAR replacement, which provides a
basic analytical framework for industrial applications.
The structure of this manuscript is as follows. In Section 2, the yaw control system of the wind turbine and its
misalignment are introduced, and the LiDAR-assist control is developed to reduce the misalignment. Using a data-driven
methodology, Section 3 develops a machine learning-based yaw misalignment calibration model. Section 4 presents the results of
LiDAR replacement via yaw misalignment calibration on real datasets. In Section 5, the setup of the field test and the
verification through the power output characteristics are introduced and analyzed. The results show that the proposed real-time
yaw misalignment calibration method is efficient. Section 6 proposes an executable economical solution for LiDAR replacement
planning. Finally, Section 7 concludes the paper.
The yaw control system of the wind turbine is an automatic control system, and its composition is shown in Fig. 1. The working
principle of the yaw control system can be summarized as follows: The wind vane sensor collects the yaw error as the system input
signal and transmits it to the yaw controller. After data processing, the yaw controller outputs the yaw command and yaw angle
according to the logic operation of the central controller, and the yaw action changes the nacelle position. When the rotor axis
is aligned with the wind direction, the yaw motor stops, the nacelle faces the wind, and this round of the yaw control process ends.
Fig. 2. Diagram of yaw misalignment (top view of nacelle). The wind direction (in front of the rotor) represents the direction of incoming wind. The measured
wind direction (behind the rotor) represents the wind direction actually measured by the wind vane mounted behind the rotor.
The wind direction sensor collects the wind direction data and transmits it to the yaw controller. After processing the wind
direction data, the yaw controller compares it with the reference standard and issues instructions on whether to yaw and on the
corresponding yaw direction and angle, so that the nacelle finally faces the wind. To reduce the gyroscopic torque
during yaw, the motor speed is reduced by the coaxially connected reducer, and the yaw torque then acts on the large gear of
the rotating body to drive the nacelle to yaw into the wind.
For the yaw system, the cable device is used as a protection device, generally consisting of a control switch and a contact
mechanism. The yaw counter records the real-time twist angle of the cable during the yaw process. If the WT yaws continuously
in the same direction, the cable becomes twisted. To ensure the safe operation of the WT, the main controller performs an
automatic cable-untwisting operation based on its internal logic.
Fig. 2 presents a schematic diagram of the yaw misalignment. In an ideal situation, the yaw error 𝜎 is the angle between the nacelle
central axis and the direction of the incoming wind (the wind direction in front of the rotor). However, in practice, the yaw error
measurement can be influenced by two factors, the static yaw misalignment 𝛥𝛾 and the dynamic yaw misalignment 𝛥𝜃, so that the wind
vane measurement value becomes 𝜑.
• The static yaw misalignment 𝛥𝛾 is the misalignment between the nacelle central axis (the theoretical wind vane zero position)
and the actual zero position of the wind vane, which is also referred to as the zero shift angle. The causes of the static misalignment
𝛥𝛾 are numerous and include inadequate human operation, mechanical wear and tear, as well as harsh conditions during wind
vane installation and usage.
• The dynamic yaw misalignment 𝛥𝜃 occurs primarily because the wind vane is installed behind the rotor. The rotation of the rotor
causes the upstream and downstream air to rotate as well, forming the wake behind the rotor plane and the turbulence around the
blade roots. This error changes with the wind conditions.
For the above reasons, there is a difference between the actual yaw error and the theoretical yaw error, which is the yaw
misalignment and could be expressed as
𝛥𝜎 = 𝜎 − 𝜑 = 𝛥𝛾 + 𝛥𝜃. (1)
The presence of yaw misalignment 𝛥𝜎 results in incorrect measurement by the wind vane, which degrades the performance of
the yaw control. According to the Betz principle, the power captured by the wind turbine is
\[ P = \frac{1}{2} \rho C_{\mathrm{P}}(\lambda, \beta) A v^{3} (\cos\sigma)^{3}, \tag{2} \]
where 𝜌 is the air density, 𝐶P(𝜆, 𝛽) is the power coefficient as a function of the tip-speed ratio 𝜆 and the blade pitch angle 𝛽,
𝐴 is the rotor swept area, and 𝑣 is the inflow wind speed.
It shows that the power of the wind turbine is affected by the yaw misalignment: the greater the absolute value of the misalignment,
the smaller the power generated by the wind turbine. If there is yaw misalignment 𝛥𝜎, the power lost by the wind turbine amounts
to 𝑃 × (1 − cos³(𝜑)). Therefore, accurate information on the inflow wind direction supports maximizing wind energy capture.
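To give a feel for the size of this effect, the cubic cosine law of Eq. (2) can be evaluated for a few yaw errors. The following is a minimal sketch; the function name `relative_power_loss` and the sample angles are illustrative assumptions, not values from the field test.

```python
import math


def relative_power_loss(yaw_error_deg: float) -> float:
    """Fraction of captured power lost at a given yaw error, using the cos^3 law of Eq. (2)."""
    return 1.0 - math.cos(math.radians(yaw_error_deg)) ** 3


if __name__ == "__main__":
    # Illustrative angles only (assumption); they are not taken from the paper.
    for angle in (0.0, 5.0, 10.0, 15.0):
        print(f"{angle:5.1f} deg -> {100.0 * relative_power_loss(angle):.2f} % loss")
```

Because the cosine is even, the loss depends only on the magnitude of the yaw error, not on its sign.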
LiDAR wind measurement is more accurate than wind vane measurement, which can significantly improve the accuracy of yaw
alignment with the wind, and there is no static measurement error when the LiDAR is installed as standard. Therefore, it is
reasonable to assume that the yaw error measured by the LiDAR is 𝜎, so that 𝛥𝜎 can be assumed to be zero when a nacelle-mounted
LiDAR is deployed. However, LiDAR is less stable than the wind vane, and its data may be invalidated by adverse factors such as
extreme weather.
Because LiDAR combines high accuracy with low availability, previous work has mainly focused on providing
a statistically static calibration by analyzing historical SCADA data and offline LiDAR measurements. However, a static correction
cannot reflect the real-time misalignment, which limits its effectiveness. So far, LiDAR has not been fully exploited, and the
returns remain low compared to the high investment.
A baseline LiDAR-assist yaw control strategy could be designed as follows. Similar to the conventional wind vane control, the
real-time yaw position could be driven by the LiDAR measurement as shown in Fig. 3. During the start-up and shut-down
procedures, the accuracy of the wind direction is not as important as in the maximum power point tracking (MPPT) stage, so the
conventional wind vane is still adopted for reliability (usually two vanes are configured). The LiDAR-assist control is essentially
wind vane control compensated with the real-time yaw misalignment. The process of validity checking and LiDAR-assist control is as follows:
The realization process of LiDAR-assist yaw control is depicted in Fig. 4. At time $t_{now}$, the inputs are the wind vane measurement
value $\theta_{now}$ and the LiDAR measurement value $\tilde{\theta}_{now}$, and the output is the yaw angle value $\theta$. The input
data undergo low-pass filtering to obtain $\theta_{now}^{ts}$ and $\tilde{\theta}_{now}^{ts}$. Furthermore, when the LiDAR fails to
provide sufficient valid data during a specific period, the wind vane measurement for that time may be used instead. This safety
check is implemented to prevent abnormal yawing of the turbine due to invalid LiDAR measurements. $Cond_i$ is the operating condition
of the LiDAR at the $i$th moment, $T_{error}$ records the time during which the LiDAR measurement value is invalid, and
$\theta_{pre}^{ts}$ is the LiDAR data of the previous second.
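The safety check described above can be sketched as a small per-sample selector. This is a hedged illustration rather than the logic of Fig. 4 itself: the first-order low-pass filter, the smoothing factor `alpha`, the tolerance `max_invalid_s`, and the rule of holding the previous LiDAR value during short dropouts are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class YawSourceSelector:
    """Chooses between filtered LiDAR and filtered wind vane yaw error, sample by sample."""
    alpha: float = 0.1            # assumed smoothing factor of a first-order low-pass filter
    max_invalid_s: float = 30.0   # assumed tolerance before falling back to the wind vane
    dt: float = 1.0               # sample period in seconds
    vane_filt: Optional[float] = None
    lidar_filt: Optional[float] = None
    t_error: float = 0.0          # time for which the LiDAR has been invalid (T_error)

    def _lowpass(self, prev: Optional[float], new: float) -> float:
        return new if prev is None else prev + self.alpha * (new - prev)

    def step(self, vane: float, lidar: float, lidar_valid: bool) -> float:
        """Return the yaw error to act on for the current sample."""
        self.vane_filt = self._lowpass(self.vane_filt, vane)
        if lidar_valid:
            self.lidar_filt = self._lowpass(self.lidar_filt, lidar)
            self.t_error = 0.0
            return self.lidar_filt
        self.t_error += self.dt
        # Short dropout: hold the previous filtered LiDAR value (assumption).
        if self.lidar_filt is not None and self.t_error <= self.max_invalid_s:
            return self.lidar_filt
        # Long dropout or no LiDAR history: fall back to the wind vane.
        return self.vane_filt
```

In use, `step` would be called once per SCADA sample with the current vane value, the current LiDAR value, and a validity flag derived from the LiDAR operating condition.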
Based on the above discussion, the difference between LiDAR-assist control and wind vane control can be represented by
the real-time yaw misalignment of the wind direction measurements. Consequently, a calibration method is proposed to estimate LiDAR
measurements and enable sustained yaw corrections. Once the real-time misalignment can be reconstructed by the data-driven
method, the LiDAR can be moved to the next wind turbine, realizing low-cost rolling utilization.
In this section, the yaw misalignment calibration method is proposed based on ML algorithms. The overall framework of the
proposed method is shown in Fig. 5, including data preprocessing, correlation analysis, establishment of the yaw misalignment
calibration model based on ML algorithms, and evaluation of the yaw misalignment calibration model. The specific process is as follows:
(1) Remove the invalid data, condition-specific data, and data for other abnormal conditions from the SCADA and LiDAR data.
After that, the SCADA and LiDAR data need to be time-aligned.
(2) The dependency of yaw misalignment on SCADA features is analyzed and the correlation between the LiDAR measurement
and SCADA features is quantified. Then, the features with a high correlation with the LiDAR measurement value are selected
as the input of the prediction model to improve the efficiency and accuracy.
(3) The yaw misalignment calibration model is established based on four machine learning algorithms, namely Linear Regression,
Gradient Boosting, Random Forests and XGBoost.
(4) Four ML models are compared and analyzed according to error-related performance metrics, and the influence of different ML
models on calibration results is analyzed.
Table 1
Description and identification criterion of anomaly.
Resource | Type               | Discriminant criterion
SCADA    | Unrated conditions | $v_i < v_{cut\_in}$ or $v_i > v_{rated}$
         |                    | $r_i > r_{rated}$
         |                    | $\theta_i \neq 0$
         |                    | $P_i < 0$ or $P_i > P_{rated}$
A vast quantity of abnormal data is generated during wind farm operation owing to factors such as WT start and stop,
communication noise, and equipment failure. If these data are used directly without processing, the wind turbine analysis results
will be affected. Thus, invalid data, data for pitch conditions, and data for other abnormal conditions must be removed from the
SCADA and LiDAR data. The details of the data cleaning are given in Table 1, where 𝑣𝑖 , 𝑟𝑖 , 𝜃𝑖 , 𝑃𝑖 , 𝑃𝑆𝑒𝑡𝑝𝑜𝑖𝑛𝑡 , and 𝐶𝑜𝑛𝑑𝑖 are the
wind speed, rotor speed, blade pitch angle, active power, active power setpoint and operating condition at the 𝑖th moment,
respectively.
In this paper, only the operating data below the rated wind speed are studied, because the WT performs variable-pitch
constant-power control above the rated wind speed. When 𝑣𝑖 is greater than 𝑣𝑟𝑎𝑡𝑒𝑑 , the blade pitch angle 𝜃𝑖 is adjusted
dynamically to different positive positions. According to the criteria in Table 1, the corresponding operating data are eliminated.
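The SCADA criteria of Table 1 amount to a simple per-sample filter. The sketch below is one possible reading: the limits are taken from the turbine parameters in Table 3, while the `ScadaSample` container, its field names, and the pitch tolerance `PITCH_TOL` are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScadaSample:
    wind_speed: float   # v_i, m/s
    rotor_speed: float  # r_i, r/min
    pitch_angle: float  # theta_i, deg
    power: float        # P_i, kW


# Limits from the turbine parameters (Table 3).
V_CUT_IN, V_RATED = 3.0, 10.5   # m/s
R_RATED = 13.55                 # r/min
P_RATED = 2000.0                # kW
PITCH_TOL = 1e-6                # assumed numerical tolerance for "pitch angle equals zero"


def is_rated_condition(s: ScadaSample) -> bool:
    """True if the sample violates none of the anomaly criteria of Table 1."""
    return (V_CUT_IN <= s.wind_speed <= V_RATED
            and s.rotor_speed <= R_RATED
            and abs(s.pitch_angle) <= PITCH_TOL
            and 0.0 <= s.power <= P_RATED)


def clean(samples):
    """Keep only the samples recorded under the studied operating conditions."""
    return [s for s in samples if is_rated_condition(s)]
```

A real pitch signal is rarely exactly zero, so in practice the tolerance would probably need to be widened to match the sensor resolution.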
In addition, the data measured by the LiDAR must be time-aligned with the data measured by SCADA before the data are
analyzed.
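One simple way to perform this alignment is to average the higher-rate LiDAR samples within each whole second and join them with the SCADA samples on the timestamp. This is a sketch under assumptions: the tuple layout, the function name `align_by_second`, and the choice of averaging are not specified in the paper.

```python
import math
from collections import defaultdict


def align_by_second(scada, lidar):
    """Join 1 Hz SCADA samples with LiDAR samples averaged over the same second.

    scada: iterable of (timestamp_s, value); lidar: iterable of (timestamp_s, value),
    where LiDAR timestamps may be fractional (e.g. a 4 Hz stream).
    Returns a list of (second, scada_value, mean_lidar_value) for seconds present in both.
    """
    buckets = defaultdict(list)
    for t, value in lidar:
        buckets[math.floor(t)].append(value)   # group LiDAR samples by whole second
    aligned = []
    for t, value in scada:
        second = math.floor(t)
        if second in buckets:                  # drop SCADA samples without LiDAR data
            samples = buckets[second]
            aligned.append((second, value, sum(samples) / len(samples)))
    return aligned
```

Plain averaging is adequate for small yaw errors around zero; absolute wind directions that wrap around 360 deg would need a circular mean instead.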
The basic idea of real-time yaw misalignment calibration is to use the SCADA features to estimate the wind direction measurement
value of the LiDAR through the ML algorithm, based on the high accuracy of the wind direction measured by the LiDAR. On the other
hand, to realize the rolling utilization of nacelle-mounted LiDAR, there will be specific requirements for the data-driven methods,
such as higher accuracy with fewer samples. Herein, four ML algorithms are applied for the corresponding performance comparison,
namely Linear Regression, Random Forests, Gradient Boosting and XGBoost. Note that SCADA data are structured feature data, and
the XGBoost algorithm is well suited to processing structured feature data. Therefore, the XGBoost model is chosen as the
primary model to construct the relationship between the SCADA features and the LiDAR wind direction measurement.
XGBoost, the extreme gradient boosting algorithm, works very well for classification and regression problems. It uses decision
trees as base learners to construct multiple weak learners. In the iterative learning process, the model is continuously trained in
the direction of the negative gradient, and a second-order Taylor series is used to expand the loss function. To find the overall
optimal solution, a regularization term is added to the objective function to balance the accuracy and complexity of the model [28].
This method has the advantages of resistance to overfitting, high speed, and multi-threaded parallel processing [29]. The core of the
algorithm can be summarized as follows:
\[ \hat{y}_i = \sum_{1}^{k} f_k(x_i), \quad f_k \in F, \tag{4} \]
Table 2
Algorithm comparison.
Algorithm                          | Linear regression | Gradient boosting | Random forests | XGBoost
Classification                     | –                 | Boosting          | Bagging        | Boosting
Training time                      | Very short        | Long              | Very long      | Short
Prediction accuracy                | Low               | High              | High           | High
High dimensional data adaptability | High              | Low               | High           | High
Interpretability                   | Very high         | Low               | Low            | Low
Overfitting occurs                 | Less              | Frequently        | Less           | Less
where, 𝑦̂𝑖 is the predicted value of the model; 𝑥𝑖 is the 𝑖th sample; 𝑘 is the number of decision trees; 𝑓𝑘 is an independent function
in the function space; and 𝐹 is the function space, which is composed of decision trees.
Since the objective is optimized over the model itself (a set of trees) rather than over numerical parameters, it cannot be
optimized in Euclidean space using traditional optimization methods, so the model is trained in an additive manner. This means
that the trees already learned are kept fixed, and a new tree is added in each round to minimize the objective function. The process is presented as follows:
\[
\begin{aligned}
\hat{y}_i^{(0)} &= 0 \\
\hat{y}_i^{(1)} &= \hat{y}_i^{(0)} + f_1(x_i) \\
\hat{y}_i^{(2)} &= \hat{y}_i^{(1)} + f_2(x_i) \\
&\;\;\vdots \\
\hat{y}_i^{(t)} &= \hat{y}_i^{(t-1)} + f_t(x_i),
\end{aligned}
\tag{5}
\]
where, $\hat{y}_i^{(t)}$ is the predicted value obtained in the $t$th round, $\hat{y}_i^{(t-1)}$ represents the prediction result of
the previous round $(t-1)$, and $f_t(x_i)$ represents the newly added regression tree for residual fitting.
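The additive scheme of Eq. (5) can be illustrated with a deliberately small example. The sketch below uses one-split regression trees (stumps) on a single input feature with the squared loss, which is plain gradient boosting rather than XGBoost's regularized second-order procedure; the function names `fit_stump` and `boost` are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree (stump) to the residuals by least squares."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # the largest value would leave the right side empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # all inputs identical: predict the mean residual
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, rounds=20):
    """Additive training: y_hat^(t) = y_hat^(t-1) + f_t(x), starting from y_hat^(0) = 0."""
    trees = []
    preds = [0.0] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        f_t = fit_stump(xs, residuals)   # new tree fitted to the current residuals
        trees.append(f_t)                # earlier trees stay fixed
        preds = [p + f_t(x) for p, x in zip(preds, xs)]
    return lambda x: sum(f(x) for f in trees)
```

Each round leaves the earlier trees untouched and only adds one new tree, which is why the training error on the fitted samples cannot increase from one round to the next.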
After building the model, we need to give the model an optimization objective, so that the learned parameters make the
predicted value $\hat{y}_i$ as close to the true value $y_i$ as possible. The objective function combines a loss term and a
regularization term, where $l(y_i, \hat{y}_i^{(t)})$ represents the loss function, usually the square loss or logistic loss, which
describes the difference between the predicted value of the target and the true value of the target. The regularization term is
calculated as follows:
\[ \Omega(f_t) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^{2}, \tag{9} \]
where, $T$ represents the number of leaves in the tree, $w_j$ is the weight of the $j$th leaf, and $\gamma$ and $\lambda$ are
hyperparameters: $\gamma$ penalizes the number of leaves, and $\lambda$ keeps the leaf weights from becoming too large. Both
prevent the model from overfitting.
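For orientation, the way the second-order Taylor expansion and the regularization term of Eq. (9) combine can be written out as follows. This is the standard XGBoost formulation, given here as an assumption about what the method relies on; the symbols $g_i$, $h_i$, $I_j$, $G_j$ and $H_j$ are introduced only for this sketch.

```latex
% Objective at round t, expanded to second order around the previous prediction
\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \left[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \right] + \Omega(f_t) + \text{const},
\qquad
g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}},
\quad
h_i = \frac{\partial^{2} l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^{2}}.

% With I_j the set of samples falling in leaf j, G_j = \sum_{i \in I_j} g_i and H_j = \sum_{i \in I_j} h_i,
% the optimal leaf weight and the resulting objective value are
w_j^{*} = -\frac{G_j}{H_j + \lambda},
\qquad
\mathrm{Obj}^{*} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^{2}}{H_j + \lambda} + \gamma T.
```

The closed-form leaf weight makes the role of the two hyperparameters visible: a larger $\lambda$ shrinks every leaf weight towards zero, and a larger $\gamma$ makes each additional leaf more costly.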
A basic comparison between XGBoost and the other tested ML algorithms, following [30–32], is given in Table 2.
A detailed comparison and analysis based on experimental field studies is made in Section 4, where the dependence of each
ML algorithm on the amount of data is also analyzed.
In order to measure and compare the different ML models more comprehensively, several error-related performance metrics are
adopted, including the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the root mean square error
(RMSE). The corresponding values of MAE, MAPE and RMSE are calculated as follows:
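\[ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| \sigma_l - \sigma_e \right|, \tag{10} \]
\[ \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{\left| \sigma_l - \sigma_e \right|}{\sigma_l} \right) \times 100\%, \tag{11} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \sigma_l - \sigma_e \right)^{2}}, \tag{12} \]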
where, 𝜎𝑙 represents the LiDAR measurement, 𝜎𝑒 represents the estimated LiDAR measurement.
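Eqs. (10)–(12) translate directly into code. The sketch below follows the equations literally, so the MAPE divides by the measured value itself; it therefore assumes that the measured values are nonzero (and positive, if the result is to be read as a percentage error). The function names are illustrative.

```python
import math


def mae(measured, estimated):
    """Mean absolute error, Eq. (10)."""
    return sum(abs(l - e) for l, e in zip(measured, estimated)) / len(measured)


def mape(measured, estimated):
    """Mean absolute percentage error in percent, Eq. (11); measured values must be nonzero."""
    return 100.0 * sum(abs(l - e) / l for l, e in zip(measured, estimated)) / len(measured)


def rmse(measured, estimated):
    """Root mean square error, Eq. (12)."""
    return math.sqrt(sum((l - e) ** 2 for l, e in zip(measured, estimated)) / len(measured))
```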
Table 3
Parameters of wind turbine and Molas NL LiDAR.
Turbine parameter Value
Rated power 2000 kW
Rotor diameter 115 m
Hub height 80 m
Rated rotor speed 13.55 r/min
Cut-in wind speed 3 m/s
Rated wind speed 10.5 m/s
Cut-out wind speed 20 m/s
Yaw deviation threshold 15 deg
LiDAR parameter Value
Installation position On nacelle
Measuring frequency 4 Hz
Measuring distance 50–400 m
Numbers of beams 4
Numbers of measuring section 10
In addition, the computation time and memory usage of different ML models are also considered as performance indicators,
which are important for industrial implementation.
To verify the proposed real-time yaw misalignment calibration method, the actual operational data of a commercial WT installed
with LiDAR is selected for testing in this section. On the one hand, the accuracy and reliability of the proposed calibration method
are verified. On the other hand, the data volume requirements of different ML models are analyzed.
For this experiment, a UP2000-115 wind turbine located in a wind farm in eastern China was utilized. A nacelle-mounted LiDAR
system was installed on the wind turbine between October 2020 and December 2021. The parameters of the tested WT
and the LiDAR are given in Table 3. The LiDAR can measure the wind conditions from 50 m to 400 m in front of the rotor. To reduce
the influence of mountain topography and ensure that the wind condition measured by the LiDAR is as close as possible to that at
the wind vane, the data closest to the rotor were selected as the LiDAR measured value. The LiDAR was mounted on the top of the nacelle,
as shown in Fig. 6.
The SCADA system collects, adjusts, calculates, and controls the state parameters of wind turbines, enabling the monitoring
and automatic control of the operating status of all wind turbines in the entire wind farm through data sharing and real-time
communication. Fleming et al. [12,13] described two field tests in which LiDAR was used for wind direction measurements to
directly control the yaw position of wind turbines (NREL CART2 and NREL CART3). The LiDAR sampling rate was set to 2 Hz
in [12] and 10 Hz in [13] for real-time control. In this study, the resolution of SCADA is on the order of seconds. To maintain
synchronization, the resolution of LiDAR is also set on the order of seconds, which is sufficient for the further implementation of
real-time yaw calibrating control owing to the large-inertia design of the wind turbine.
In this experiment, the computer is configured with GPU: NVIDIA GeForce RTX 2080, CPU: Intel(R) Core(TM) i7-10700 CPU @
2.90 GHz, RAM: 16 GB.
Table 4
Performance comparison of different ML algorithms.
Algorithm          | MAE   | MAPE  | RMSE  | Computation time | Memory usage
Linear regression  | 3.639 | 1.495 | 4.779 | 1.42 s           | 2 kB
Gradient boosting  | 3.311 | 1.188 | 4.197 | 527.92 s         | 174 kB
Random forests     | 3.217 | 1.143 | 4.059 | 1987.15 s        | 15.8 GB
XGBoost regression | 3.148 | 1.118 | 3.958 | 3.78 s           | 757 kB
The yaw misalignment is calibrated in real time using the data collected from the wind turbine, following the process outlined in
Fig. 5. First, the raw SCADA data and LiDAR data are cleaned according to Table 1. Then the LiDAR wind direction measurements and
the wind vane wind direction measurements are mapped one to one in time, and the data are analyzed. Fig. 7 shows the comparison
between the LiDAR measurement and the wind vane measurement. There is no obvious static difference between the two measurements, and
the difference appears random. A difference of this kind cannot be removed by a fixed offset, which makes it a good opportunity for ML.
The correlation of yaw misalignment with rotor speed, wind speed, air temperature and wind direction is shown in
Fig. 8. In Fig. 8(a), the yaw misalignment can be observed to change with the rotor speed from the SCADA data in the bins
from 8 rpm to 11 rpm. In Fig. 8(b), the yaw misalignment is unstable and fluctuates greatly at low wind speed. When the wind
speed is between 7.5 m/s and 11 m/s, the increase in yaw misalignment with increasing wind speed is relatively obvious. There is no
apparent correlation between yaw misalignment and temperature in Fig. 8(c). In Fig. 8(d), the yaw misalignment varies with
absolute wind direction, but shows no clear regularity.
In conclusion, the correlation between yaw misalignment and individual SCADA features is not very obvious. Therefore, the
correlation between the LiDAR measurement and the SCADA features is analyzed and quantified directly.
The wind speed (ws), wind vane 1 measurement (wd1), wind vane 2 measurement (wd2), nacelle position (np), active power
(ap), air temperature (temp), rotor speed (rs) and filtered wind vane 1 measurement (filter-wd1) among the SCADA features are selected
together with the LiDAR wind direction measurement (wd-LiDAR) for correlation analysis, and the correlation coefficient diagram
is shown in Fig. 9.
In Fig. 9, each grid value is the correlation coefficient between the two variables corresponding to
the row and column. The highest correlation is observed between the filtered wind vane 1 measurement and the LiDAR measurement,
with a coefficient of 0.6, indicating a strong correlation. The Pearson coefficients between the LiDAR measurement and the wind
vane 1 measurement, wind vane 2 measurement, and wind speed are 0.48, 0.35, and 0.11, respectively, indicating a moderate correlation. On
the other hand, the correlation between air temperature and LiDAR is weak, with a coefficient of only 0.01. Nacelle position, active
power, and rotor speed show negative correlations with the LiDAR measurement. To enhance the model’s efficiency and minimize
interference from irrelevant data, the input variables for the estimation model include filtered wind vane 1 measurement, wind vane
1 measurement, wind vane 2 measurement, and wind speed.
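The feature selection described above can be sketched as a Pearson correlation followed by a cut-off. The paper does not state a numerical threshold; the value 0.1 used below is an assumption chosen because it reproduces the reported selection (coefficients 0.6, 0.48, 0.35 and 0.11 kept; 0.01 and the negative coefficients dropped). The function names are illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def select_features(features, target, threshold=0.1):
    """Return the names of features whose correlation with the target reaches the threshold.

    features: dict mapping feature name -> list of values; target: list of LiDAR values.
    The threshold (assumption) is applied to the signed coefficient, so negatively
    correlated features are dropped, mirroring the selection reported in the text.
    """
    scores = {name: pearson(values, target) for name, values in features.items()}
    selected = [name for name, r in scores.items() if r >= threshold]
    return selected, scores
```

Strongly negative correlations can also carry predictive information, so a threshold on the absolute value would be a reasonable alternative design.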
In this subsection, three classical ML models, namely Linear Regression, Gradient Boosting and Random Forests, are presented for
comparison with XGBoost in order to analyze the influence of different ML models on the calibration results. Each model is
evaluated with standard 𝑘-fold cross-validation: the dataset is randomly divided into 𝑘 disjoint subsets, and each time 𝑘 − 1
subsets are taken as training data while the remaining subset is used for testing. In this experiment, 𝑘 is taken as 10.
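The splitting step of 𝑘-fold cross-validation can be sketched with the standard library alone. The function name `k_fold_indices` and the fixed random seed are assumptions for illustration; in practice a library routine would normally be used.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds.

    The samples are shuffled once and split into k disjoint subsets; each subset
    serves as the test set exactly once while the other k - 1 subsets form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, nearly equal-sized subsets
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

The error metrics are then computed on each test fold and averaged over the 𝑘 folds.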
Fig. 8. Correlation of yaw misalignment with rotor speed, wind speed, air temperature and wind direction. The gray dotted lines represent a mean yaw misalignment
value of 6.51 deg. The size of each dot indicates the number of samples, and each vertical line indicates the range of the data.
Fig. 9. Pearson correlation coefficients between LiDAR measurement and SCADA features.
In Fig. 10, it can be seen that the predicted values of the four models are significantly correlated with the true values. Fig. 11
shows the comparison of the prediction results of each model in the time domain. The specific quantification results are presented
in Table 4. On the one hand, the XGBoost regression algorithm outperforms the other algorithms in terms
of the error-related performance metrics. The MAE, MAPE, and RMSE of XGBoost are 2.14%,
2.19%, and 2.49% lower, respectively, than those of the second-best-performing algorithm (Random Forests). On the other hand, Table 4
also shows that Linear Regression has clear advantages in terms of computation time and memory usage. In general, the longer the training
time, the longer the model takes to make predictions. Although the computational cost of Linear Regression is small, its
prediction accuracy is the lowest of the four models. Therefore, XGBoost regression is the more suitable model, as it combines
the best prediction accuracy with a low computational cost.
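The quoted percentage reductions follow from the Table 4 values for XGBoost regression and Random Forests, as the short check below illustrates. The dictionary layout and the function name `relative_reduction` are illustrative.

```python
# Error metrics of the two best-performing algorithms, copied from Table 4.
TABLE_4 = {
    "Random forests": {"MAE": 3.217, "MAPE": 1.143, "RMSE": 4.059},
    "XGBoost regression": {"MAE": 3.148, "MAPE": 1.118, "RMSE": 3.958},
}


def relative_reduction(baseline, candidate):
    """Percentage by which the candidate's error is lower than the baseline's."""
    return 100.0 * (baseline - candidate) / baseline


if __name__ == "__main__":
    for metric in ("MAE", "MAPE", "RMSE"):
        reduction = relative_reduction(TABLE_4["Random forests"][metric],
                                       TABLE_4["XGBoost regression"][metric])
        print(f"{metric}: {reduction:.2f} % lower")
```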
Fig. 10. Comparison of estimations from different algorithms with real LiDAR measurements.
The ultimate purpose of yaw misalignment calibration is to make the rotor face the incoming wind as much as possible within
the rated operating conditions, so as to improve the power generation of the WT. Therefore, by quickly constructing an accurate
and reliable yaw misalignment calibration model with fewer sample data, the calibration can be completed in advance, and the
installation time of the LiDAR can be shortened, thereby reducing the cost of power generation. To this end, the estimation results
of different ML models on datasets with different amounts of data are compared and analyzed.
For ML algorithms, the required dataset size cannot be calculated directly. Because a large amount of historical
data is available in this paper, a learning curve is constructed to estimate the size of the dataset required
for training by the different ML algorithms.
All datasets were divided into ten groups in chronological order according to the cross-validation method; eight groups were used to train the model, and the remaining two groups were used for testing and validation. The training dataset was then divided into 100 groups, and the number of training dataset groups 𝑚 (𝑚 = 1, 2, … , 100) used to train the model was varied, so that the RMSE on the validation set changed accordingly. The resulting relationship between the validation RMSE and the number of training dataset groups is the learning curve. Fig. 12 shows the learning curve for each algorithm. As the amount of training data increases, the RMSE generally decreases at first, but once it has dropped to a certain level it no longer decreases significantly even as 𝑚 increases.
The minimum amount of training data required is defined as the number of training samples at which the RMSE comes within ±0.1 deg of the algorithm's minimum RMSE. On this basis, the amount of data is converted into days at 86,400 samples per day, and the proportion of data removed by cleaning is taken into account to estimate the minimum number of days of data required by each algorithm. Table 5 shows the minimum amount of data in days for training each algorithm. Linear regression requires the least data but is less accurate. XGBoost requires a moderate amount of data while achieving relatively high accuracy.
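The rule for reading the minimum data requirement off a learning curve can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, "within ±0.1 deg of the minimum" is taken to mean the first 𝑚 at which the validation RMSE is no more than 0.1 deg above the curve minimum, and the cleaning correction is modelled as a single retained fraction of the raw samples.

```python
import math


def min_training_groups(rmse_by_group, tol_deg=0.1):
    """Return the smallest m (1-based) whose RMSE is within tol_deg of the curve minimum."""
    best = min(rmse_by_group)
    for m, rmse in enumerate(rmse_by_group, start=1):
        if rmse - best <= tol_deg:
            return m
    raise ValueError("empty learning curve")


def groups_to_days(m, samples_per_group, retained_fraction, samples_per_day=86_400):
    """Convert m groups of cleaned samples into whole days of raw recording."""
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    # Cleaning discards part of the raw data, so more raw samples are needed.
    raw_samples = m * samples_per_group / retained_fraction
    return math.ceil(raw_samples / samples_per_day)
```

For a curve such as `[3.0, 2.0, 1.25, 1.05, 1.0, 1.02]`, the first function returns 4, because the fourth point is the first that lies within 0.1 deg of the minimum of 1.0.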
Fig. 12. Relationship between the amount of training data and RMSE for different algorithms.
Table 5
Minimum amount of data in days for training of different ML algorithms.
Algorithm             Minimum amount of data for training (days)
Linear regression     13
Gradient boosting     39
Random forests        60
XGBoost regression    26
To sum up, XGBoost performs well overall in terms of both accuracy and dependence on the amount of data.
Therefore, the XGBoost algorithm can quickly establish the yaw misalignment calibration model,
correct the yaw misalignment in real time, and improve the performance of WTs.
In this section, the field test scheme and field implementation are presented, and then the test results are analyzed to verify the
effectiveness of the proposed calibration method.
For this field implementation, the yaw controller was modified from the one used in [12]. The modified yaw control logic
diagram is shown in Fig. 13. A timer with a three-hour cycle is added to the yaw control system so that the yaw
error measurement can be switched among the XGBoost model estimation, the wind vane measurement, and the LiDAR measurement
with validity judgment for LiDAR-assisted control. The first hour uses the XGBoost model estimation, the second hour uses the wind-vane
yaw error measurement, and the third hour uses the LiDAR measurement with validity judgment. Each of
the three strategies runs for an hour, and the whole operation mode is repeated in a loop. The yaw error passes through
two low-pass filters, yielding a rapidly changing measurement value ($er_{fast}$) with a time constant of 25 s and a slowly changing
measurement value ($er_{slow}$) with a time constant of 60 s. The timer starts counting when $er_{slow}$ reaches the threshold. When $T_{start}$
reaches the start delay time, the yaw actuator starts to act.
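The switching and triggering logic described above can be sketched as follows. This is not the field controller: the class and function names are hypothetical, the threshold and start delay are left as parameters because their values are not given in the text, the filters are assumed to be discrete first-order low-pass filters, and the timer is assumed to reset whenever the slow-filtered error falls below the threshold.

```python
from dataclasses import dataclass

# Order of the three strategies within each three-hour cycle, as stated in the text.
SOURCES = ("xgboost", "wind_vane", "lidar")


def active_source(elapsed_s):
    """Return the yaw-error source for the current hour of the three-hour cycle."""
    return SOURCES[int(elapsed_s // 3600) % 3]


@dataclass
class YawTrigger:
    threshold_deg: float      # hypothetical value, not given in the text
    start_delay_s: float      # hypothetical value, not given in the text
    tau_fast_s: float = 25.0  # time constant of the fast filter
    tau_slow_s: float = 60.0  # time constant of the slow filter
    er_fast: float = 0.0
    er_slow: float = 0.0
    timer_s: float = 0.0

    def step(self, yaw_error_deg, dt_s=1.0):
        """Advance one sample; return True when the yaw actuator should act."""
        # Discrete first-order low-pass: y += dt / (tau + dt) * (x - y).
        self.er_fast += dt_s / (self.tau_fast_s + dt_s) * (yaw_error_deg - self.er_fast)
        self.er_slow += dt_s / (self.tau_slow_s + dt_s) * (yaw_error_deg - self.er_slow)
        if abs(self.er_slow) >= self.threshold_deg:
            self.timer_s += dt_s
        else:
            self.timer_s = 0.0  # assumption: the timer resets below the threshold
        return self.timer_s >= self.start_delay_s
```

The fast-filtered value is computed here only for completeness; the text does not state how it enters the decision, so the sketch triggers on the slow-filtered value alone.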
The real-time yaw misalignment calibration experiment was officially launched on 20 December 2021 and continued for
a week. Because the WT operated in down-regulation during some periods owing to transient transmission limits of the power
grid, data for a continuous period from 22 to 25 December 2021 were extracted and used for evaluation. In all, about
66 h of operation data were collected, with about 22 h under each strategy. The blue bars in Fig. 14 show the
distribution of wind resources drawn from the original microsite selection report of the wind farm. According to the report, wind speed is
mainly distributed between 3.0 m/s and 10.0 m/s, with a proportion of 83.10%, and the average wind speed is 6.17 m/s. Against this
reference, the wind speed distribution of the collected datasets is analyzed and shown by the orange bars in Fig. 14. The collected datasets
cover the majority of wind speeds, especially above 3 m/s. In the collected field test data, the proportion of
wind speeds from 3.0 m/s to 10.0 m/s is 81.28% and the average wind speed is 5.70 m/s. Therefore, the adopted datasets
cover sufficiently representative conditions for evaluation.
Finally, the statistical results of wind speed and wind direction over the operation time are shown in Figs. 15(a) and 15(b).
The statistical distribution of wind speed under the three yaw control modes is approximately the same. However, the limited amount
of data collected means that the distribution of wind direction does not cover the full 0° to 360° range. Despite this, the distribution
of wind direction for each mode is very similar. In other words, the data collected during the respective runs of the three strategies
are comparatively balanced. As a consequence, the results of the three yaw control modes can be evaluated and compared.
Fig. 13. Modified yaw control logic diagram (the three yaw control strategies are switched hourly through the measurement toggle switch).
Fig. 14. Comparison of historical wind frequency distribution with collected dataset wind frequency distribution. Blue bars represent the group in the original
microsite selection report, orange bars represent the group in the field test.
The wind direction results over the entire field test were compared. Fig. 16 depicts the performance of wind
direction estimation by the XGBoost model over 66 consecutive hours. The results show that most of the wind direction
estimates of the XGBoost model fall within a good range, whereas the LiDAR data have low validity in many periods.
In this section, to quantify whether the power of the WT is improved under the XGBoost model estimation strategy, the wind speed-power
curves of the WT in the three operating modes must be fitted. However, the SCADA system contains many abnormal data,
which need to be eliminated in combination with the data of the corresponding time period so as not to affect the accuracy of the
evaluation results. Three data preprocessing methods for removing outliers are proposed here. The first is the conventional method
(the quartile method in Table 6). The second removes abnormal points by reference to the standard power curve of the WT. The third
uses the Isolation Forest anomaly detection algorithm to eliminate outliers. The three methods yield different data processing
results and different fitted wind speed-power curves, but the qualitative conclusions are the same.
where $V_i$ represents the average wind speed in the $i$th bin, $v_{n,i,j}$ represents the wind speed value under the standard air density in
the $i$th bin, $P_i$ represents the average active power in the $i$th bin, $p_{i,j}$ represents the active power value of the sample point in the
$i$th bin, and $N_i$ represents the number of data sample points in the $i$th bin.
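The symbol definitions above imply that each point of the power curve is a bin average, i.e. the mean wind speed and mean active power of the $N_i$ samples in the $i$th bin. A minimal sketch of that binning follows. The function name, the 0.5 m/s bin width, and the left-aligned bin edges are assumptions for illustration; the inputs are assumed to be wind speeds already normalized to the standard air density.

```python
from collections import defaultdict


def bin_power_curve(wind_speeds, powers, bin_width=0.5):
    """Return {bin_index: (V_i, P_i, N_i)} computed from per-bin averages."""
    buckets = defaultdict(list)
    for v, p in zip(wind_speeds, powers):
        # Bin i covers [i * bin_width, (i + 1) * bin_width); an assumed layout.
        buckets[int(v // bin_width)].append((v, p))
    curve = {}
    for i, samples in sorted(buckets.items()):
        n_i = len(samples)
        v_i = sum(v for v, _ in samples) / n_i  # average wind speed in bin i
        p_i = sum(p for _, p in samples) / n_i  # average active power in bin i
        curve[i] = (v_i, p_i, n_i)
    return curve
```

Running this once per strategy on the cleaned samples yields three binned curves that can be compared bin by bin.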
Fig. 17. Comparison of scatter plots before and after outlier elimination using the first method. Blue points denote the samples before treatment, while orange
points denote the samples after treatment.
Based on the above calculation, the wind speed-active power data could be obtained, and the power boost ratio of each wind
speed section can be calculated. The calculation formula is given as follows:
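$$imp_i^{\mathrm{L}} = \frac{P_i^{\mathrm{L}} - P_i^{\mathrm{V}}}{P_i^{\mathrm{V}}} \times 100\%, \tag{15}$$
$$imp_i^{\mathrm{X}} = \frac{P_i^{\mathrm{X}} - P_i^{\mathrm{V}}}{P_i^{\mathrm{V}}} \times 100\%, \tag{16}$$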
where $imp_i^{\mathrm{L}}$ is the active power boost percentage in the $i$th wind speed bin when the LiDAR strategy is put into operation, and $imp_i^{\mathrm{X}}$
is the active power boost percentage in the $i$th wind speed bin when the XGBoost model is put into operation. $P_i^{\mathrm{L}}$, $P_i^{\mathrm{X}}$, and $P_i^{\mathrm{V}}$ represent the
output power values corresponding to the $i$th wind speed bin of the wind speed-active power curve when the LiDAR strategy, the XGBoost model,
and the wind vane strategy are put into operation, respectively.
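Eqs. (15) and (16) share one form, so a single helper covers both strategies. The sketch below assumes the binned powers are held in dictionaries keyed by bin index; the function name and the choice to skip bins that are missing from either curve, or whose wind vane power is not positive, are illustrative assumptions.

```python
def boost_percent(p_strategy, p_vane):
    """Per-bin power boost relative to the wind vane strategy, in percent."""
    boost = {}
    for i, p_ref in p_vane.items():
        # A bin needs both curves and a positive reference power to be compared.
        if i in p_strategy and p_ref > 0:
            boost[i] = (p_strategy[i] - p_ref) / p_ref * 100.0
    return boost
```

Passing the LiDAR curve gives $imp_i^{\mathrm{L}}$, and passing the XGBoost curve gives $imp_i^{\mathrm{X}}$.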
Below the rated wind speed, an increase in power output means an increase in power generation under normal operating
conditions. Above the rated wind speed, the power output of the WT is held at the rated power, so the power and power
generation remain unchanged. Therefore, the share of the annual power generation contributed by each wind speed bin is
calculated from the historical operation data, which quantifies the power generation weight of the WT in each wind
speed bin. The power boost ratio of each bin is multiplied by its power generation weight to obtain the theoretical power generation increase ratio of
that bin, and these ratios are summed over the bins below the rated wind speed
to obtain the theoretical power generation increase ratio of the test WT, as shown below.
$$Imp^{\mathrm{L}} = \sum_{i=1}^{N} G_i \times imp_i^{\mathrm{L}}, \tag{17}$$
$$Imp^{\mathrm{X}} = \sum_{i=1}^{N} G_i \times imp_i^{\mathrm{X}}, \tag{18}$$
where $Imp^{\mathrm{L}}$ represents the proportion of theoretical power generation increase over all wind speed bins when the LiDAR-assisted
control strategy is put into operation, and $Imp^{\mathrm{X}}$ represents the corresponding proportion when the XGBoost model is put into operation.
$G_i$ represents the ratio of the energy generated by the WT in the $i$th
wind speed bin to the annual electricity generation, and $N$ represents the number of
wind speed bins included in the calculation.
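The weighting in Eqs. (17) and (18) can be sketched as follows. The function names and the dictionary layout are hypothetical. The weights $G_i$ are assumed to be computed from the historical energy produced in each bin, and the caller is assumed to pass boost values only for bins below the rated wind speed, as the text requires.

```python
def generation_weights(energy_by_bin):
    """G_i: share of the annual energy produced in each wind speed bin."""
    total = sum(energy_by_bin.values())
    if total <= 0:
        raise ValueError("total energy must be positive")
    return {i: e / total for i, e in energy_by_bin.items()}


def annual_gain_percent(boost_by_bin, weight_by_bin):
    """Sum of G_i * imp_i over the bins present in both mappings."""
    return sum(
        weight_by_bin[i] * imp
        for i, imp in boost_by_bin.items()
        if i in weight_by_bin
    )
```

With boosts of 10% and 5% in two bins that carry 20% and 30% of the annual energy, the weighted gain is 0.2 × 10 + 0.3 × 5 = 3.5%.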
Fig. 18. Power scatter plot-based performance evaluation results using the first method.
After the above operations, the wind speed-power scatter plots before and after outlier elimination for
the LiDAR strategy, the XGBoost model, and the wind vane strategy were obtained, as shown in Fig. 17. This method eliminates few
outliers, so the integrity of the original dataset is well preserved. Fig. 18 compares
the wind speed-power scatter plots when the three strategies are put into operation. Finally, the data remaining after outlier elimination
are fitted according to the power curve model described in Section 5.3.1, and the fitted wind speed-power
curves for the three strategies are compared in Fig. 19.
As can be observed from the comparison results, the distribution of the LiDAR samples and wind vane samples is relatively
similar, while the XGBoost samples are more concentrated in the upper part.
Fig. 20. Comparison of scatter plots before and after outlier elimination using the second method. Blue points denote the samples before treatment, while orange
points denote the samples after treatment.
Fig. 21. Power scatter plot-based performance evaluation results using the second method.
Fig. 23. Comparison of scatter plots before and after outlier elimination using the third method. Blue points denote the samples before treatment, while orange
points denote the samples after treatment.
Fig. 24. Power scatter plot-based performance evaluation results using the third method.
When 𝐸(ℎ(𝑥)) is close to 0, 𝑠 is close to 1, that is, the sample 𝑥 is judged to be abnormal when the anomaly score of 𝑥 is close
to 1. When 𝐸(ℎ(𝑥)) is close to 𝑛 − 1, 𝑠 is close to 0, and the sample 𝑥 is judged to be normal.
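The limiting behaviour described above follows from the standard Isolation Forest anomaly score, $s(x, n) = 2^{-E(h(x))/c(n)}$, where $c(n)$ is the average path length of an unsuccessful binary search tree lookup over $n$ points. The sketch below evaluates that standard definition; it is not the authors' implementation, and the function names are chosen for illustration.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant


def average_path_length(n):
    """c(n): normalizing constant for a sub-sample of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # approximation of H(n - 1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def anomaly_score(mean_path_length, n):
    """s(x, n) = 2 ** (-E(h(x)) / c(n)); values near 1 indicate anomalies."""
    if n < 2:
        raise ValueError("n must be at least 2")
    return 2.0 ** (-mean_path_length / average_path_length(n))
```

A sample that is isolated immediately (mean path length 0) scores exactly 1, a sample whose mean path length equals $c(n)$ scores 0.5, and a sample with a path length near $n - 1$ scores close to 0.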
The IF outlier proportion is set to 20%, the number of trees to 100, and the branches to 256. Fig. 23 compares
the wind speed-power scatter points before and after outlier elimination for the LiDAR strategy, the XGBoost model, and the wind
vane strategy. This method eliminates data that deviate far from the main distribution, but normal data at the
high-wind-speed edge are also easily isolated and deleted by mistake. Fig. 24 compares the wind speed-power
scatter plots when the three strategies are put into operation, and Fig. 25 compares the fitted wind speed-power
curves. The power curve in the higher wind speed bins cannot be fitted
with the Isolation Forest method. Comparing the available fitted curves shows that the samples of the XGBoost model are more concentrated
in the upper part.
Table 6
Data cleaning effects of different methods.
Elimination method                Operation strategy    Percentage of deleted data
Quartile method                   Wind vane             24.54%
Quartile method                   LiDAR                 26.74%
Quartile method                   XGBoost               26.45%
Theoretical power curve method    Wind vane             49.42%
Theoretical power curve method    LiDAR                 55.15%
Theoretical power curve method    XGBoost               60.02%
Isolation forest                  Wind vane             18.82%
Isolation forest                  LiDAR                 27.36%
Isolation forest                  XGBoost               20.67%
Table 7
Statistical table of power generation improvement.
Operation strategy    Quartile method    Theoretical power curve method    Isolation forest
LiDAR                 1.08%              1.47%                             1.71%
XGBoost               9.65%              3.66%                             9.70%
6. Economical solution
Since the main purpose of this work is to learn and replace LiDAR measurements, economic benefit is the focus. This
section provides an executable and economical solution for LiDAR replacement planning, i.e., it determines how many nacelle-mounted
LiDARs should be purchased for an in-service wind farm.
The economic feasibility assessment uses Net Present Value (NPV) as a measure of the cost trade-off of using LiDAR. The NPV
method takes the project investment as the calculation basis (that is, it assumes that the project investment consists entirely of self-owned
funds), and the NPV is the difference between the present value of future cash inflows and the present value of future cash outflows. If the
NPV is positive, the investment proposal is acceptable. Conversely, if the NPV is negative, the investment proposal is unacceptable.
The higher the NPV, the better the investment scenario. According to the above definition, we have
$$NPV = \sum_{t=0}^{n} (CI - CO)_t (1 + i)^{-t}, \tag{22}$$
where, 𝐶𝐼 represents the value of the cash inflow, 𝐶𝑂 represents the value of the cash outflow, 𝑖 represents the benchmark discount
rate. Income and investments are the cash flows of the model, discounted at 7% annually.
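Eq. (22) translates directly into a few lines of code. In this sketch the function name is hypothetical, each list element is the net cash flow $(CI - CO)_t$ of year $t$, and the first element corresponds to $t = 0$, so it is not discounted.

```python
def npv(net_cash_flows, rate=0.07):
    """Net present value of yearly net cash flows, with index 0 as year t = 0."""
    return sum(cf * (1.0 + rate) ** -t for t, cf in enumerate(net_cash_flows))
```

For example, an outlay of 100 at $t = 0$ followed by an income of 107 at $t = 1$ has an NPV of zero at a 7% discount rate, which is the break-even case between an acceptable and an unacceptable proposal.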
Suppose a wind farm has 𝑇 WTs in service, all of the same size and model, and all subject to yaw misalignment. To
improve the overall revenue, each WT implements yaw calibration based on LiDAR replacement. If only one LiDAR is purchased, the
yaw misalignment of the remaining WTs will last longer and the economic benefits will be reduced. Conversely, an excessive number of LiDAR devices in
a wind farm means that a significant part of the LiDAR investment sits idle. Therefore, the number of LiDAR devices that
maximizes the return on investment of the wind farm needs to be determined.
The one or more LiDARs stay on a WT for a period of time, during which the wind direction data measured by the LiDAR
are collected and used to train an ML model that estimates the LiDAR measurements. After the stay period, the LiDAR is
transferred to another WT. Once every WT in the wind farm has built its own machine learning model, the overall
improvement in economic benefits is achieved.
Table 8
Cost parameter presetting.
Turbine parameter               Value      Unit
Remaining service life          15         year
Rated power                     2000       kW
Annual hours of utilization     1900       h
Power generation improvement    3.66%      unit
Discount                        7%         per year
Local electricity price         0.61       CNY/kWh
LiDAR price                     200 000    CNY/unit
LiDAR migration                 10 000     CNY/unit-event
Initial implementation          10 000     CNY/unit
Operation and maintenance       5000       CNY/unit
Fig. 26. Number of LiDAR equipment versus NPV for a wind farm with 40 × 2 MW in-service wind turbines.
To sum up, the optimal LiDAR replacement plan for wind farms can be roughly given by comprehensively considering the
equipment acquisition cost and additional cost, combined with local electricity price and electricity generation estimation.
As an example, consider a wind farm with 40 × 2 MW WTs in operation in eastern China. Table 8 shows the cost parameters
required for the LiDAR replacement plan for yaw misalignment calibration in this wind farm. In this example, the power generation improvement
ratio of 3.66% from Section 5.3.5 is selected for the calculation. Because the field test period was short, it is more realistic to choose the
generation improvement ratio calculated by the second method, which is closer to the theoretical power curve of the WT. It should be
noted that some of the costs in the LiDAR replacement plan are based on necessary subjective assumptions,
such as initial implementation, LiDAR migration, operation and maintenance, etc.
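To give a sense of scale, the yearly revenue gain of one calibrated turbine can be estimated from the Table 8 parameters. This is a back-of-the-envelope sketch, not the paper's NPV model: it assumes that the baseline annual energy equals rated power times annual utilization hours and that the 3.66% improvement applies to all of it, and it ignores the LiDAR purchase, migration, implementation, and maintenance costs. The function name is hypothetical.

```python
def annual_revenue_gain_cny(rated_kw, utilization_h, improvement, price_cny_per_kwh):
    """Extra yearly revenue of one turbine under the stated simplifying assumptions."""
    baseline_kwh = rated_kw * utilization_h  # assumed baseline annual energy
    extra_kwh = baseline_kwh * improvement   # energy gained through calibration
    return extra_kwh * price_cny_per_kwh


# Table 8 values: 2000 kW, 1900 h, 3.66 %, 0.61 CNY/kWh.
gain_per_turbine = annual_revenue_gain_cny(2000, 1900, 0.0366, 0.61)
```

Under these assumptions the gain is about CNY 84,839 per turbine per year (139,080 kWh of additional energy at 0.61 CNY/kWh), before discounting and before any LiDAR-related costs are subtracted.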
Fig. 26 shows the NPV curve of the above plan applied to the wind farm. The 𝑋-axis is the number of LiDAR devices and the 𝑌-axis
is the NPV for the corresponding number of LiDAR devices. The results show that the NPV of the LiDAR replacement plan for
yaw misalignment calibration first increases and then keeps decreasing. The maximum NPV is CNY 11.28 million,
and the corresponding number of LiDAR devices is 4.
Through the above calculation, an executable and optimal investment scheme could be obtained for wind-farm LiDAR rolling
utilization and replacement.
7. Conclusion
Because a WT operates under yaw control for long periods, yaw misalignment has a substantial impact on its performance.
In this paper, a real-time yaw calibration method was proposed based on a machine learning model that estimates the
LiDAR measurement, which resolves the conflict between the LiDAR measurement characteristics (high accuracy but low availability)
and the objective of low-cost utilization. The main conclusions are summed up as follows:
• The proposed method has been tested and verified on the historical datasets of a WT. At the same time, the dependence
of different machine learning algorithms on the data volume was analyzed. Experimental results indicate that: the proposed
XGBoost algorithm has high accuracy, requires less data, and can quickly calibrate the yaw misalignment.
• A short-term field test was held on a 2 MW commercial WT to reach more objective conclusions. Because
the calculated power generation increase may be affected by the wind speed-power dataset,
three different outlier elimination methods were used to clean the test result dataset, and the annual theoretical
power generation was then calculated. The results calculated by the three methods all show that the dynamic
correction ability of the machine learning algorithm can eliminate a large part of the error, so that the power
generation of the WT can be effectively increased.
• An economic analysis framework was provided to decision-makers to select the optimal amount of LiDAR devices to maximize
profits from real-time yaw-misalignment calibration.
In future work, the optimization and transferability of the proposed method need to be considered for further improvement,
and longer-term field tests are needed for more comprehensive verification.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared
to influence the work reported in this paper.
Data availability
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 61973114), and the Research on the
cooperative control technology through the wake redirection of Guodian New Energy Technology Research Institute Co., Ltd. (No.
GJNY-19-87).
References
[1] H. Mark, Z. Feng, G.W.E. Council, Global wind report 2023, 2023, https://gwec.net/globalwindreport2023/.
[2] L.Y. Pao, K.E. Johnson, Control of wind turbines, IEEE Control Syst. Mag. 31 (2) (2011) 44–62.
[3] P.M. Gebraad, F. Teeuwisse, J. Van Wingerden, P.A. Fleming, S. Ruben, J. Marden, L. Pao, Wind plant power optimization through yaw control using a
parametric model for wake effects—a CFD simulation study, Wind Energy 19 (1) (2016) 95–114.
[4] A. Bowen, N. Zakay, R. Ives, The field performance of a remote 10 kW wind turbine, Renew. Energy 28 (1) (2003) 13–33.
[5] Y. Pei, Z. Qian, B. Jing, D. Kang, L. Zhang, Data-driven method for wind turbine yaw angle sensor zero-point shifting fault detection, Energies 11 (3)
(2018) 553.
[6] D. Song, J. Yang, X. Fan, Y. Liu, A. Liu, G. Chen, Y.H. Joo, Maximum power extraction for wind turbines through a novel yaw control solution using
predicted wind directions, Energy Convers. Manage. 157 (2018) 587–599.
[7] Y. Bao, Q. Yang, A data-mining compensation approach for yaw misalignment on wind turbine, IEEE Trans. Ind. Inform. 17 (12) (2021) 8154–8164.
[8] D. Astolfi, F. Castellani, M. Becchetti, A. Lombardi, L. Terzi, Wind turbine systematic yaw error: Operation data analysis techniques for detecting it and
assessing its performance impact, Energies 13 (9) (2020) 2351.
[9] J. Yang, L. Wang, D. Song, C. Huang, L. Huang, J. Wang, Incorporating environmental impacts into zero-point shifting diagnosis of wind turbines yaw
angle, Energy 238 (2022) 121762.
[10] C. Qu, Z. Lin, P. Chen, J. Liu, Z. Chen, Z. Xie, An improved data-driven methodology and field-test verification of yaw misalignment calibration on wind
turbines, Energy Convers. Manage. 266 (2022) 115786.
[11] R. Wagner, R.L. Rivera, I. Antoniou, S. Davoust, T.F. Pedersen, M. Courtney, B. Diznabi, Procedure for wind turbine power performance measurement
with a two-beam nacelle lidar, DTU Wind Energy Rep. (2013).
[12] P.A. Fleming, A. Scholbrock, A. Jehu, S. Davoust, E. Osler, A.D. Wright, A. Clifton, Field-test results using a nacelle-mounted lidar for improving wind
turbine power capture by reducing yaw misalignment, in: Journal of Physics: Conference Series, Vol. 524, IOP Publishing, 2014, 012002.
[13] A.K. Scholbrock, P.A. Fleming, A. Wright, C. Slinger, J. Medley, M. Harris, Field test results from lidar measured yaw control for improved power capture
with the NREL controls advanced research turbine, in: 33rd Wind Energy Symposium, 2015, p. 1209.
[14] R. Bakhshi, P. Sandborn, Maximizing the returns of LIDAR systems in wind farms for yaw error correction applications, Wind Energy 23 (6) (2020)
1408–1421.
[15] L. Zhang, Q. Yang, A method for yaw error alignment of wind turbine based on LiDAR, IEEE Access 8 (2020) 25052–25059.
[16] A. Stetco, F. Dinmohammadi, X. Zhao, V. Robu, D. Flynn, M. Barnes, J. Keane, G. Nenadic, Machine learning methods for wind turbine condition monitoring:
A review, Renew. Energy 133 (2019) 620–635.
[17] R. Pandit, D. Infield, T. Dodwell, Operational variables for improving industrial wind turbine yaw misalignment early fault detection capabilities using
data-driven techniques, IEEE Trans. Instrum. Meas. 70 (2021) 1–8.
[18] D. Zhang, L. Qian, B. Mao, C. Huang, B. Huang, Y. Si, A data-driven design for fault detection of wind turbines using random forests and XGboost, IEEE
Access 6 (2018) 21020–21031.
[19] Z. Tang, G. Zhao, T. Ouyang, Two-phase deep learning model for short-term wind direction forecasting, Renew. Energy 173 (2021) 1005–1016.
[20] A. Khosravi, L. Machado, R. Nunes, Time-series prediction of wind speed using machine learning algorithms: A case study Osorio wind farm, Brazil, Appl.
Energy 224 (2018) 550–566.
[21] T. Ouyang, A. Kusiak, Y. He, Predictive model of yaw error in a wind turbine, Energy 123 (2017) 119–130.
[22] A. Khosravi, R. Koury, L. Machado, J. Pabon, Prediction of wind speed and wind direction using artificial neural network, support vector regression and
adaptive neuro-fuzzy inference system, Sustain. Energy Technol. Assess. 25 (2018) 146–160.
[23] A. Saenz-Aguirre, E. Zulueta, U. Fernandez-Gamiz, J. Lozano, J.M. Lopez-Guede, Artificial neural network based reinforcement learning for wind turbine
yaw control, Energies 12 (3) (2019) 436.
[24] A. Saenz-Aguirre, E. Zulueta, U. Fernandez-Gamiz, A. Ulazia, D. Teso-Fz-Betono, Performance enhancement of the artificial neural network–based
reinforcement learning for wind turbine yaw control, Wind Energy 23 (3) (2020) 676–690.
[25] D. Choi, W. Shin, K. Ko, W. Rhee, Static and dynamic yaw misalignments of wind turbines and machine learning based correction methods using lidar
data, IEEE Trans. Sustain. Energy 10 (2) (2018) 971–982.
[26] L. Gao, J. Hong, Data-driven yaw misalignment correction for utility-scale wind turbines, J. Renew. Sustain. Energy 13 (6) (2021).
[27] B. Zhou, X. Ma, Y. Luo, D. Yang, Wind power prediction based on LSTM networks and nonparametric kernel density estimation, IEEE Access 7 (2019)
165279–165292.
[28] T. Chen, C. Guestrin, XGBoost: A scalable tree boosting system, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining, 2016, pp. 785–794.
[29] J. Li, X. An, Q. Li, C. Wang, H. Yu, X. Zhou, Y.-a. Geng, Application of XGBoost algorithm in the optimization of pollutant concentration, Atmos. Res.
(2022) 106238.
[30] T. Hastie, R. Tibshirani, J.H. Friedman, The elements of statistical learning: data mining, inference, and prediction, vol. 2, Springer, 2009.
[31] J.H. Friedman, Greedy function approximation: a gradient boosting machine, Ann. Statist. (2001) 1189–1232.
[32] L. Breiman, Random forests, Mach. Learn. 45 (1) (2001) 5–32.
[33] Z. Lin, X. Liu, M. Collu, Wind power prediction based on high-frequency SCADA data along with isolation forest and deep learning neural networks, Int.
J. Electr. Power Energy Syst. 118 (2020) 105835.