While the preceding comparisons evaluated domain adaptation methods against CNN baselines and fusion domain adaptation methods against general domain adaptation methods, variations in parameter settings, as well as differences in training and test set allocations for each method, may still affect diagnostic efficiency, accuracy, and the fairness of the comparisons. The following section discusses the impact of parameter settings on the fault diagnosis accuracy of each method.
4.3.1. Dropout-Rate Parameters
A Dropout layer was added after the fully connected layer in each domain adaptation method to enhance the model’s performance when generalizing to the target domain. The most critical hyperparameter of the Dropout layer is the Dropout rate. A rate that is too high discards too many neurons, leading to unstable classification and underfitting; a rate that is too low may cause the model to rely too heavily on the source domain training data. Furthermore, adding Dropout improves generalization only on the premise that the diagnostic accuracy of the model trained on the source domain dataset remains unaffected.
From the experimental results, as the Dropout rate increases, the test set accuracy gradually becomes slightly higher than that of the training set, indicating an improvement in the model’s generalization ability. Once the Dropout rate reaches 0.5, although the test set continues to exhibit good generalization performance, the accuracies of both the training and test sets begin to decline. This suggests that the proportion of discarded neurons has started to degrade the model’s diagnostic performance, an effect that intensifies as the Dropout rate increases further. Therefore, to maintain the model’s performance in the source domain, the Dropout rate is set to 0.4.
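As a concrete illustration, a minimal PyTorch sketch of such a classifier head is shown below; the feature dimensions and number of fault classes are assumptions for illustration, not the paper’s actual architecture.

```python
import torch.nn as nn

# Minimal sketch of a classifier head with Dropout after the fully
# connected layer; the 512/256 feature sizes and 10 fault classes
# are illustrative assumptions, not the paper's actual architecture.
classifier_head = nn.Sequential(
    nn.Linear(512, 256),  # fully connected layer
    nn.ReLU(),
    nn.Dropout(p=0.4),    # Dropout rate selected from the experiments
    nn.Linear(256, 10),   # output layer over the fault classes
)
```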
4.3.2. Training–Testing Set Ratio
Analysis of the table shows that the ratio of training to testing data does not significantly affect training accuracy in the target domain. However, the overall trend indicates that diagnostic accuracy decreases as the proportion of the testing set increases, and the highest diagnostic accuracy is observed when the testing set comprises 20% of the entire dataset. Therefore, this study uses a 20% testing set as the experimental reference.
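A minimal sketch of the 80/20 split, assuming scikit-learn and placeholder arrays standing in for the vibration samples and fault labels:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the preprocessed samples and labels.
X = np.random.randn(1000, 1024)
y = np.random.randint(0, 10, size=1000)

# 20% testing set used as the experimental reference; stratification
# keeps the fault-class proportions identical in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
```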
4.3.3. Parameters of Each Method
- (1) AdaBN: Influence of Training Parameters
From the comparison experiments on BN layer update batch sizes, it is observed that as the batch size increases beyond 64, the accuracy of the source domain test set begins to decrease gradually. This finding is consistent with the results of the CNN batch-size experiments. An excessively large batch size reduces the frequency of updates to the BN layer’s statistical parameters in the target domain, leading to a corresponding decrease in target domain test accuracy. Conversely, an excessively small batch size makes the BN layer’s statistical parameters more susceptible to outliers, reducing accuracy in the target domain. Based on the experimental results, a batch size of 64 is identified as optimal for batch-wise BN layer updating.
The comparative results also show that whether the BN layer is updated batch-wise or with the entire target domain dataset at once has no significant impact on the diagnostic results of the source domain test set. However, the diagnostic results in the target domain, which reflect the model’s domain adaptation capability, do differ: with a batch size of 64, the stable accuracy in the target domain is 74.47%, while with full-data updates it is 72.71%. Updating the BN layer in batch mode therefore demonstrates stronger adaptation to the target domain than full-data updating, as illustrated in the sketch below.
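A minimal sketch of the batch-wise AdaBN update, assuming a PyTorch model and a target-domain DataLoader built with a batch size of 64; the learned weights stay fixed and only the BN running statistics are re-estimated:

```python
import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_statistics(model, target_loader):
    """AdaBN-style adaptation (sketch): re-estimate BatchNorm running
    statistics on target-domain batches while all weights stay fixed."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.reset_running_stats()
            m.momentum = None  # cumulative moving average over all batches
    model.train()  # BN running stats update only in train mode
    for x, _ in target_loader:  # loader assumed to use batch_size=64
        model(x)
    model.eval()
    return model
```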
- (2) Feature Mapping Training Parameters
The domain adaptation capability of the feature mapping method is directly related to the mapping distance metric function, which in turn depends on the accuracy of domain distance calculation and the parameters involved in this process. Specifically, MK-MMD and JMMD are significantly influenced by the type of mapping kernel function (kernel type) and the number of kernels (kernel size).
The choice of kernel function type directly affects the model’s nonlinear mapping capability and domain adaptation generalization performance. Five kernel functions are commonly used: linear, polynomial, sigmoid, Laplacian, and Gaussian radial basis function (RBF), each with its own characteristics. The Gaussian RBF kernel is the most widely used and versatile in practice, making it the most effective choice of kernel function for both the MK-MMD and JMMD mapping methods.
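For reference, the Gaussian RBF kernel takes the standard form

$$k(x, y) = \exp\left(-\frac{\|x - y\|^{2}}{2\sigma^{2}}\right),$$

where the bandwidth $\sigma$ controls how quickly similarity decays with distance; in a multi-kernel setting, several bandwidths are combined.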
According to the experimental results, the performance of both MK-MMD and JMMD is influenced by the kernel size, with MK-MMD the more sensitive of the two: its accuracy first increases and then decreases as the kernel size grows. For MK-MMD, the highest diagnostic accuracy in the target domain is achieved with a kernel size of 5, whereas for JMMD it is attained with a kernel size of 6; however, the difference in accuracy between kernel sizes 5 and 6 is only 0.53%. A sketch of the multi-kernel MMD computation follows.
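A minimal sketch of a multi-kernel Gaussian MMD, using a common geometric bandwidth heuristic that is an assumption here, not necessarily the paper’s exact implementation; `kernel_num=5` matches the best MK-MMD setting reported above:

```python
import torch

def mk_mmd(source, target, kernel_num=5, kernel_mul=2.0):
    """Multi-kernel MMD sketch with Gaussian RBF kernels: the kernel_num
    bandwidths are spread geometrically around the mean pairwise distance."""
    total = torch.cat([source, target], dim=0)
    # Pairwise squared Euclidean distances between all samples.
    dist = torch.cdist(total, total, p=2) ** 2
    # Base bandwidth centered so the kernels bracket the mean distance.
    bandwidth = dist.mean() / (kernel_mul ** (kernel_num // 2))
    kernels = sum(torch.exp(-dist / (bandwidth * kernel_mul ** i))
                  for i in range(kernel_num))
    n = source.size(0)
    xx, yy = kernels[:n, :n], kernels[n:, n:]
    xy, yx = kernels[:n, n:], kernels[n:, :n]
    return xx.mean() + yy.mean() - xy.mean() - yx.mean()
```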
- (3) Domain Adversarial Discriminator Network Structure Parameters
From the experimental results, the DAN discriminator performs well with a single fully connected layer of 256 neurons, achieving effective feature alignment between domains during domain adversarial training. In contrast, the JAN discriminator demonstrates the strongest adaptation to the target domain with two fully connected layers of 128 neurons each. This improved performance may be attributed to JAN’s use of the joint distribution of outputs and inputs, which increases the dimensionality and complexity of the adversarial features; the discriminator therefore benefits from deeper fully connected layers that enhance its nonlinear expression capability. In summary, a single 256-neuron layer for DAN and two 128-neuron layers for JAN yield the best fault diagnosis results.
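A minimal sketch of the two discriminator configurations, paired with a standard gradient reversal layer; the 512-dimensional adversarial feature input is an assumption for illustration:

```python
import torch
import torch.nn as nn

class GradientReversal(torch.autograd.Function):
    """Reverses gradients during backprop so the feature extractor learns
    to fool the domain discriminator (standard adversarial training trick)."""
    @staticmethod
    def forward(ctx, x, alpha=1.0):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.alpha * grad_output, None

# Discriminator structures matching the reported optima; the 512-d
# input is an illustrative assumption.
dan_discriminator = nn.Sequential(  # single layer, 256 neurons
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 1),              # domain logit: source vs. target
)

jan_discriminator = nn.Sequential(  # two layers, 128 neurons each
    nn.Linear(512, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, 1),
)

# Usage: logits = dan_discriminator(GradientReversal.apply(features, 1.0))
```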
Source: Chao Zhang, www.mdpi.com

