Research Article  Open Access
Zhun Cheng, Zhixiong Lu, "A Novel Efficient Feature Dimensionality Reduction Method and Its Application in Engineering", Complexity, vol. 2018, Article ID 2879640, 14 pages, 2018. https://doi.org/10.1155/2018/2879640
A Novel Efficient Feature Dimensionality Reduction Method and Its Application in Engineering
Abstract
In the engineering field, excessive data dimensions affect the efficiency of machine learning and analysis of the relationships between data or features. To render feature dimensionality reduction more effective and faster, this paper proposes a new feature dimensionality reduction approach combining a sampling survey method with a heuristic intelligent optimization algorithm. Drawing on feature selection, this method builds a featurescoring system and a reduceddimension lengthscoring system based on the sampling survey method. According to feature scores and reduceddimension lengths, the method selects a number of features and reduceddimension lengths that are ranked in the front with high scores. This feature dimensionality reduction method allows for indepth optimal selection of features and reduceddimension lengths with high scores using an improved heuristic intelligent optimization algorithm. To verify the effectiveness of the dimensionality reduction method, this paper applies it to road roughness timedomain estimation based on vehicle dynamic response and geneselection research in bioengineering. Results in the first case show that the proposed method can improve the accuracy of road roughness timedomain estimation to above 0.99 and reduce measured data of the vehicle dynamic response, reducing the experimental workload significantly. Results in the second case show that the method can select a set of genes quickly and effectively with high disease recognition accuracy.
1. Introduction
The curse of dimensionality is a common problem in engineering. Ineffective and unreasonable feature dimensionality reduction will compromise machinelearning efficiency, pattern recognition accuracy, and datamining efficiency while increasing the workload of measured data experiments to some extent. A set of highdimension features possesses the following problems: too many features but few samples, too many features with few or no relations to the mining task, and excessive redundancy among features [1–4].
Feature dimensionality reduction methods can be classified into two types: feature selection and feature projection. Feature projection is also called feature reparameterization and transforms the original feature space into one with a smaller dimensionality and more independent dimensions [5–7]. Feature projection generally uses principal component analysis (PCA), kernel principal component analysis, or other improved principal component analysis methods for linear or nonlinear transformations. Although these methods can realize feature dimensionality reduction to some extent, they require extensive calculations and high computational complexity. These dimensionality reduction methods also cannot reduce necessary experimental data and are inconvenient when identifying features heavily correlated to mining tasks. By contrast, feature selection obtains lowdimension features more effectively compared to machine learning or data mining and carries low computational complexity [8–10]. However, existing feature selection methods are still timeintensive with respect to calculations, and some data dimensionality reduction problems in engineering remain difficult to solve using these methods.
The road roughness timedomain estimation of vehicle dynamic response is taken as an example in this paper, capturing feature dimensionality reduction in multivariate time series. Acquiring road roughness information based on vehicle dynamic response is an economical and practical method, but most researchers have only improved the precision of road roughness estimation by using different neural networks or enhanced estimation methods [11–14]; they have thus far failed to evaluate the types and quantities of dynamic responses to be assessed and the positions of points to be measured in the vehicle. Measurement points can change positions freely in the vehicle, and each position can offer three dynamic responses (vertical vibration acceleration, speed, and displacement); as such, the example has theoretically infinite feature dimensions. In this case, to improve the accuracy and speed of road roughness estimation based on neural networks, key features must be selected. Additionally, the subsequent experiment needs to test and measure the selected features, which guides the experiment and reduces the workload substantially.
For instance, in geneselection research on cancer, the mining of gene expression profile data can identify cancer types. The gene expression profile has many data dimensions but small samples. For instance, leukemia has 7129 gene dimensions but only 72 samples for analysis, which increases the challenges of feature dimensionality reduction. Various methods exist for gene selection, but they are computationally costly and complex [15–18]. Many researchers have introduced information to facilitate gene selection, but these methods cannot properly reflect the length of a set of gene features, namely, the number of dimensions after dimensionality reduction. Overall, engineering projects in road roughness timedomain estimation based on vehicle dynamic response and biological gene selection suffer from a curse of dimensionality with an extensive number of features. In addition to their computational cost and complexity, common dimensionality reduction methods (e.g., PCA and cluster analysis) fail to effectively reduce dimensionality (i.e., experimental data) and thus cannot improve estimation accuracy. If researchers use feature selection, uncertainty persists regarding the number of features selected such that the length of the final feature set is unclear. Several variables must be measured to determine which are uncertain.
To solve the problem of excessive dimensions, this paper proposes a new feature selection method based on a sampling survey method and heuristic intelligent optimization algorithm. The general process is as follows: first, we build a featurescoring system and a reduceddimension lengthscoring system based on the sampling survey method; second, we sort features and reduceddimension lengths according to their scores and select the features and reduceddimension lengths that are ranked in the front; third, to apply selection optimization to features and reduceddimension lengths simultaneously, we redefine the meaning of the information carried by individuals in the heuristic intelligent optimization algorithm population and improve the algorithm accordingly. To better explain and verify the effect of the dimensionality reduction method, this paper conducts feature dimensionality reduction using road roughness timedomain estimation of vehicle dynamic response and geneselection research in bioengineering. Results show that the new feature dimensionality reduction method proposed can select features quickly and effectively: the accuracy of road roughness timedomain estimation was generally higher than 0.99 and effectively reduced vehicle dynamic response measurement data, and a reasonable number of genes was selected.
2. New Feature Dimensionality Reduction Method Based on Sampling Survey Method and Improved Heuristic Intelligent Optimization Algorithm
2.1. Scoring System Based on Sampling Survey Method
The sampling survey method takes part of the population as samples for analysis [19]. In a typical dimensionality problem, given a vast number of features, the final number remains unknown after dimensionality reduction; therefore, using enumeration to analyze all samples requires an intensive workload and may fail. By employing the sampling survey method and a certain sample scale, this paper analyzes the features and number of them after dimensionality reduction. All samples must have the same chance of being selected. To obtain useful information for feature dimensionality reduction, we propose a featurescoring system and a scoring system for reduceddimension length.
The featurescoring system operates as follows: Step 1Draw sets of features with any number of dimensions taken randomly from features with equal probability. Calculate the fitness of each sample based on classification recognition accuracy, datamining efficiency, or neural network estimation accuracy.Step 2Set a threshold value and select all samples with fitness larger than ; these samples comprise the set of valid feature scores.Step 3Score each feature based on the set of valid feature scores. The score of each feature is composed of the following two parts:where is the first part of the score (Score 1) of a feature, is the number of sets of valid feature scores, is the fitness ranking by the ^{th} set of valid feature scores, is the minimum of all Score 1s, and is the maximum of all Score 1s. where is the second part of the score (Score 2) of a feature; is a piecewise function such that if the ^{th} set of valid feature scores contains the feature, then and otherwise; is the minimum of all Score 2s; and is the maximum of all Score 2s.
The total score of a feature is represented by the following formula: where is the total feature score.
According to Formulas (1) and (2), and , strictly speaking, each belong to the range of [0, 1]. reflects the degree of contribution of a feature to fitness. reflects the frequency of a feature in a set of valid feature scores. A feature with a higher or can realize a better fitness value. Step 4Rank features according to their total scores and choose the first features for the set of valid features.
The scoring system for reduceddimension length is as follows: Steps 1 and 2These steps are the same as in the featurescoring systemStep 3Score the length of each reduced dimension according to the set of valid feature scores. The score of each dimension length is as follows:where is the total reduceddimension length score, is the number of sets with the same dimension length in a set of valid feature scores, and is the fitness of each sample. Step 4Rank each dimension length according to , select those with a proportion of cumulative gross scores greater than 80% out of the total score of all dimension lengths, and identify reasonable dimension lengths after dimensionality reduction.
To verify that a scoring system based on the sampling survey method can be used to effectively select features with the greatest influence on results, this paper uses a calculation example for simple mathematical analysis: the classical problem of Hald cement. Hald cement contains four chemical components: 3CaO·Al2O3 (), 3CaO·SiO2 (), 4CaO·Al2O3·Fe2O3 (), and 2CaO·SiO2 (). Table 1 lists data on relations between heat (calorie) released by each gram of cement and the content of these four components.

In statistics, variance reflects the degree to which two or more data points are different. Then, using the idea of calculating the degree of contribution in PCA, this paper takes the proportion of variance of one variable from the total variance of all variables to represent the degree of influence of that variable: where is the degree of influence of the ^{th} feature, is the variance of the ^{th} feature, and is the total variance of all features.
Formula (5) can be used to calculate the degrees of influence of four components of Hald cement (, , , and ), which are 0.0579, 0.4050, 0.0686, and 0.4686, respectively. and have high degrees of influence, has less influence, and has the least influence. To further reflect the degree of influence of these variables on heat , we employed the radial basis function neural network (RBFNN) and considered several parts of the four components of Hald cement as input features to estimate , the heat released by each gram of each component. If a feature does not have typical representativeness, then the estimation accuracy of RBFNN will be low. Therefore, we can rank the four features in terms of their influence degree according to the estimation accuracy after RBFNN machine learning. Considering a single feature as input into the neural network and the coefficient of determination as the evaluation index for the degree of closeness of heat ’s predicted value to the actual value, we obtained coefficients of determination for , , , and as 0.7765, 0.9580, 0.8454, and 0.9495, respectively. The results are similar to the variance analysis; therefore, we can conclude that and exert great influences on heat , has less influence, and has the least influence.
According to Step 1 of the scoring system based on the sampling survey method proposed in this paper, we must also calculate the fitness of each sample by considering the accuracy of classification recognition, datamining efficiency, or neural network estimation accuracy as the fitness to analyze the four features. To better compare their degrees of influence, we considered the output approximation degree of RBFNN when using the scoring system; Table 2 lists the results.

When ranking the four features by degree of influence in descending order, we obtained , , , and . The results of dimensionality length show that when the dimensionality is 2, we can estimate heat using the RBFNN with zero error. In this case, two features input into the neural network must contain either or . The above analysis confirms that the proposed scoring system based on the sampling survey method can be used to evaluate the degree of feature importance to select an ideal featureset length.
2.2. Improving the Heuristic Intelligent Optimization Algorithm to Enhance Algorithm Execution Efficiency
This paper uses an artificial fish swarm algorithm (AFSA) and particle swarm optimization (PSO) to explain and verify the new feature dimensionality reduction method [20–24]. To improve the rate of convergence and precision of the algorithm, we refine AFSA and PSO.
AFSA is improved as follows: (1)Add the memory function to all AFSA behaviors and store historical optimal positions and corresponding optimal values of individuals and the entire fish swarm(2)Introduce the speed and position calculation formulas of PSO into swarm foraging behavior and update and calculate the movement speed and position of the nextgeneration fish swarm(3)According to the idea of variation in a genetic algorithm, if the global optimum remains the same in multiple records, we shall disturb multiple fish in the swarm with a certain probability
The PSO is improved as follows: (1)To avoid the prematurity phenomenon in PSO, introduce the probability judgment Metropolis rule of a simulated annealing (SA) algorithmwhere is the probability of accepting a poor solution, is the difference in fitness between two generations, and is the temperature variable in SA, which declines as the number of iterations increases. (2)Change the update criterion of temperature variable towhere is the value of of the ^{th} iteration, is the initial value of , is the final value of , and is the total number of iterations. (3)Taking the nonuniform mutation idea of a genetic algorithm as reference, disturb learning parameter of flight speed in PSO to intensify the local search ability. The following formula calculates learning parameter :where is the value of of the ^{th} iteration, is the value of of the ^{th} iteration, and is a constant, such that in this paper.
To theoretically confirm the excellent optimum searching performance of the improved heuristic intelligent optimization algorithm, this paper analyzes convergence of the improved PSO and improved AFSA. Solis and Wets [25] studied the stochastic optimization algorithm and provided convergence criteria for stochastic optimization. For a type of stochastic optimization algorithm to converge to the global optimal solution, the following two conditions must be satisfied: (1) Condition H1: , and if , then . In this condition, is the feasible solution space, is the stochastic optimization algorithm, is the result of the ^{th} iteration, , and is the solution searched by the algorithm in this iteration.
Condition H1 ensures that the fitness in each iteration of the stochastic optimization algorithm is nonincremental. The stochastic optimization algorithm with convergence ability can get sequence with a probability of 1, and the sequence converges to a certain value. A stochastic optimization algorithm satisfying condition H1 has convergence ability; although the final convergence value is not always a globally optimal solution, it may be locally optimal. (2) Condition H2: For ’s arbitrary Borel subset , if its measure >0, then . is the probability measure of the algorithm’s ^{th} iteration result in set .
Condition H2 indicates that if a stochastic optimization algorithm can search and find the globally optimal solution, the algorithm goes through the globally optimal solution at least once when the number of iterations approaches infinity; that is, the algorithm has ergodicity. The convergence of the improved PSO is confirmed below.
In the PSO, we define each particle by updating its speed and position based on the following formulas: where is the speed of a particle in the ^{th} iteration, is the inertia weight factor, and are learning factors, is the historical best position of an individual, is the historical best position of the swarm, and is the position of a particle in the ^{th} iteration.
The PSO stores the historical optimal position of a swarm, , thus ensuring that the optimization result obtained in each step of the algorithm is nonincremental; hence, the PSO satisfies condition H1.
Then, we judge the convergence of particle speed and position according to convergence in probability. With an inertia weight and the proposed learning factor calculation formula, when the number of iterations approaches infinity, we have , where is an arbitrarily small positive value. Suppose is a point of convergence in the PSO; then, we have .
If the number of iterations is sufficiently large, then the particle’s position and update speed will converge to and 0, respectively, with a probability of 1.
Additionally, we analyze and confirm the convergence of PSO using the difference equation [26]. Although and are multidimensional variables, the dimensions are mutually independent; therefore, we simplify algorithm analysis to be a 1D case. To simplify the calculation, we suppose that , the position of optimal solution of a particle, and , the position of the optimal solution of the whole swarm, are constant.
Then, according to Formulas (9) and (10), we get
Substituting Formulas (9)–(11) into Formula (12), we get
It is a secondorder constantcoefficient nonhomogeneous difference equation, and this paper uses the characteristic equation method to solve it.
Equation (13)’s characteristic equation is . A quadratic equation with one unknown quantity can be divided into the following three cases: , , and ().
If , then we have . In this case, we obtain the following equation: where is the initial value of the algorithm iteration and is the initial speed of the iteration.
If , then we have and . In this case, we get the following equations:
If , then we have and . ’s calculation formula is the same as the solution in the case of .
If has a limit when approaches infinity, then requires convergence in all three cases. We get and . Essentially, the PSO’s convergence conditions are , , and .
We propose an improved PSO to improve learning factors and . With an increase in the number of iterations, and approach 0. Therefore, the improved PSO surely satisfies the above convergence conditions only if the inertia factor has a proper value.
We prove the convergence of the improved PSO proposed using the three methods above. However, if the algorithm is required to converge to the global optimal solution, it must satisfy condition H2. In this case, the algorithm goes through the global optimal solution at least once when its number of iterations approaches infinity; that is, the particle in the PSO should demonstrate ergodicity. Apparently, standard PSO cannot guarantee identification of the global optimal value, resulting in the aforementioned prematurity phenomenon. We introduce SA into the evolutionary criteria of PSO because SA has been theoretically proven to converge to the global optimal solution with a probability of 1 in certain conditions [27].
Next, this paper proposes modifying the learning factor to provide the algorithm with a fast flying speed in early iterations and then obtaining a large search scope in the search space to search and find the global optimal solution when the search scope declines as the search proceeds. According to the derivation of reference [28], the standard AFSA is convergent. Proposed improvements introduce the speed and position calculation formulas of PSO into the foraging behaviors of the fish swarm. This way, in addition to clustering and tailgating behaviors, the entire fish swarm can share information about the historical optimal position of the swarm and the optimal positions of individuals during foraging behavior. Moreover, with the first improvement to AFSA (i.e., storing the historical optimal positions of individuals and the whole fish swarm and corresponding optimal values) and the convergence analysis of PSO, we can ensure that the improved AFSA satisfies the convergence judging condition H1 in the stochastic optimization algorithm.
Because we use the speed and position update formulas of PSO in the fish swarm’s foraging behaviors where the fish swarm may also exhibit clustering and tailgating behaviors, all of which move in the direction with higher fitness, we suppose that is the convergence point of AFSA. Then, we have where is an arbitrarily small positive value; therefore, the AFSA possesses convergence.
Similarly, to satisfy condition H2, the position of the fish swarm should have ergodicity in AFSA. Like PSO, the standard AFSA also suffers from the prematurity phenomenon. If a super individual appears in the early iterations of the algorithm, then the whole fish swarm will converge quickly to the value. Therefore, we propose a third AFSA improvement: if the historical optimal solution does not change in multiple iterations, then the algorithm randomly changes some individuals in the fish swarm. When the iteration approaches infinity, the fish swarm position will take all possible values in the definition domain with a theoretic probability of 1. Essentially, the improved AFSA is globally convergent with a certain probability.
To verify the computational complexity of the improved PSO and improved AFSA, we choose the following test function [29]: where is an independent variable within the domain of [−1, 2].
For analysis, we used a Lenovo computer with an Intel(R) Core(TM) i54590 processor, CPU at 3.30 GHz, and 8 GB memory. The PSO and AFSA used 20 particles over 30 iterations. Figure 1 presents the algorithm test results.
The standard AFSA and standard PSO were trapped in the area of the local optimal value in early iterations and slowly identified the global optimal value as the number of iterations increased, whereas the improved AFSA and improved PSO quickly identified the area of global optimal value in early iterations and finally converged to the global optimal value. The improved PSO and improved AFSA spent only 0.0904 s and 0.0719 s, respectively, converging to the global optimal value, whereas the standard PSO and standard AFSA took 0.0990 s and 0.5321 s, respectively.
2.3. Improving the Heuristic Intelligent Optimization Algorithm Based on the Need for Feature Dimensionality Reduction
To apply the heuristic intelligent optimization algorithm to optimal feature selection, we improved the algorithm based on classification recognition accuracy, datamining efficiency, or neural network estimation precision as the target value of algorithm optimization. We redefined the individuals of each population in the intelligent optimization algorithm (i.e., redefined information carried by either the fish swarm in AFSA or each particle in PSO). By imitating the transcription and translation process of genetic information in biology, information carried by individuals is defined as shown in Figure 2.
Here, is the first information of an individual within a value range of [0, 1]. is the total number of elements in the valid feature set. Table 3 lists reasonable dimension lengths after dimensionality reduction.

In this case, , , and represent reasonable dimension lengths after the first, second, and ^{th} dimensionality reduction (i.e., the number of reasonable features after dimensionality reduction), respectively. Parameters are , , , , , and . is equal to the proportion of the total score of out of the total score of all reasonable dimension lengths. Similarly, is equal to the proportion of the total score of out of the total score of all reasonable dimension lengths. The following is the calculation formula of : where and are the upper and lower limits , respectively, of the ^{th} reasonable dimension length; is the total score of , and is the total number of reasonable dimension lengths.
If the value of is within the interval of the ^{th} line in Table 1, then the corresponding reasonable dimension length is selected. Moreover, the heuristic intelligent algorithm will extract information from to according to the value of , and the information will be used to calculate the target value of the algorithm.
According to continuous iterative calculation of the heuristic intelligent optimization algorithm, the information carried by each individual will evolve in the direction of the optimal target value to obtain the most reasonable dimension length after dimensionality reduction and feature selection.
3. Road Roughness TimeDomain Estimation Based on Proposed Dimensionality Reduction Method
Researchers generally measure contacting or noncontacting road roughness to determine roughness; however, this method returns measurement results slowly and requires expensive equipment. A method combining a dynamic response and neural network can quickly estimate road roughness. Yet little research exists about the types and quantity of dynamic response and the points to be measured in the vehicle. Because measurement points can change positions freely in the vehicle, and each position can offer three dynamic responses (i.e., vertical vibration acceleration, speed, and displacement), this example theoretically has infinite feature dimensions.
3.1. Building a Vehicle Vibration Model with Freely Changing Measurement Points in the Vehicle
Assume the vehicle is symmetric along the longitudinal axis, and the left and right wheels are under the same road conditions while driving. Given these assumptions, we simplify the vehicle model into a halfvehicle model and the vehicle body into a rigid rod. The vibration model in this paper only considers degrees of freedom: the vehicle body’s vertical vibration, pitchangle vibration, vertical vibration of the seat system, vertical vibration of front wheel, and vibration of the rear wheel (Figure 3). We consider the points in the vehicle, of which the distance to the center of mass is , as the dynamic response measurement points and build the vibration equation based on Lagrange’s second equation.
In Figure 3, , , , , and denote the vertical vibration displacements (m) of the front wheel, rear wheel, seat system, center of mass of the vehicle body, and pitchangle displacement of the vehicle body (rad), respectively; , , , , and are the mass of the front wheel, rear wheel, seat system, vehicle body (kg), and rotational inertia of the vehicle body’s mass center around the horizontal axis (kg·m^{2}), respectively; , , , and represent the distance from the front axle, back axle, seat system, and measurement point to the center of mass (m), respectively; , , , , and are the damping of the front suspension, rear suspension, seat system, front wheel, and rear wheel (N·s/m), respectively; , , , , and are the rigidity of the front suspension, rear suspension, seat system, front wheel, and rear wheel (kN/m), respectively; and and are the vertical displacement of road roughness to the front wheel and rear wheel (m), respectively.
The following is Lagrange’s second equation: where denotes the generalized coordinates; represents the difference between the expression of kinetic energy and potential energy ; is the dissipated energy, referring to energy consumed by the damping element; and is the generalized force.
Because the pitch angle has a smaller vibration, the paper takes . We assume the vertical vibration displacement of the measurement point to be and the distances from the measurement point to the front axle and the rear axle to be and , respectively.
The kinetic energy of the system is
The potential energy of the system is
The dissipated energy of the system is
Substituting Equations (19)–(21) into Equation (18), we obtain the vehicle vibration differential equation in the example. Figure 4 illustrates the simulation model built in Matlab/Simulink.
According to the vibration simulation model, we can obtain dynamic response data conveniently. The vibration model uses the filtered white noise method to produce the road roughness signal, meeting experiment requirements, and considers a levelB road as the road to be estimated.
3.2. Research on Feature Dimensionality Reduction in Road Roughness Estimation Using the Proposed Method
This example uses the improved AFSA for optimal feature selection. We number the vertical accelerated speed, vertical speed, and vertical displacement of the positions of measurement point, center of the front wheel, center of the rear wheel, and center of the driver’s seat as shown in Table 4.

Numbers 2, 7, and 12 are the pitchangle accelerated speed, pitchangle speed, and pitch angle of the measurement point, respectively.
The example uses the improved AFSA and assumes information carried by the ^{th} fish to be . where is the value of the first information, within the range of [0, 1]; are positive integers within [1, 15], the values of which represent the corresponding dynamic response numbers; and is the distance from the measurement point to the center of mass, , and in the example.
In the iteration process, the improved AFSA finds the corresponding reasonable dimension length from Table 3 according to the value of and selects the first elements from accordingly.
The example uses the RBFNN for timedomain estimation of road roughness. We used the improved AFSA and conducted 12 total optimization experiments; results appear in Table 5.

In Table 5, is the coefficient of determination. The feature dimensionality reduction method proposed can choose features with great influence on the precision of road roughness estimation. The sets of features selected in Experiments 3 and 4 were applied in Experiments 7 and 9; thus, we used the features selected in Experiments 3 and 4 for timedomain estimation of road roughness. Figures 5 and 6 display the estimation results.
The actual value was close to the estimated value. The coefficient of determination exceeded 0.99 in both tests, indicating that the proposed feature dimensionality reduction method can select a feature set with high contributions to road roughness using only four sets of features; thus, we can accurately estimate road roughness using only four dynamic response values. To further confirm this result, we used the random forest method to estimate road roughness based on the results of Experiments 3 and 4. Estimation results are presented in Figures 7 and 8.
The road roughness estimation results obtained with feature sets from Experiments 3 and 4 based on the random forest method were 0.9489 and 0.9708, respectively. Although the estimation accuracy of the random forest method was slightly lower than that of RBFNN, it was still quite high, suggesting that the set of dynamic response features selected with the proposed method can match other estimation data methods perfectly. When the proposed feature selection method was combined with a prediction method, only a few dynamic response parameters were needed to estimate road roughness accurately.
4. GeneSelection Research in Bioengineering Using the Proposed Dimensionality Reduction Method
The experimental subjects in the example consisted of four open microarray datasets on leukemia, the colon, SRBCT, and brain cancer. Leukemia [30] contained 72 samples, each with 7129 genes. Acute myeloid leukemia (AML) contained 25 samples, and acute lymphoblastic leukemia contained 47. The colon [31] contained 62 samples (40 of tumor tissue and 22 of normal tissue), each with 2000 genes. SRBCT [32] included 83 samples, each with 2308 genes: 29 samples of the Ewing family of tumors, 18 samples of neuroblastoma, 25 samples of rhabdomyosarcoma, and 11 samples of Burkitt lymphomas.
Brain cancer [33] contained 60 samples, each with 7219 genes: 46 samples of patients with classic brain cancer and 14 samples with desmoplastic brain cancer. Although brain cancer only had two classes, the dataset has proven difficult to classify in current research with classification accuracy lower than 85% in much of the literature. Lung cancer [34] included 203 samples, each with 3312 genes. Samples were classified into five subtypes: 139 adenocarcinoma, 20 pulmonary carcinoids, 21 squamouscell lung carcinomas, six smallcell lung carcinomas, and 17 normal lungs. Lymphoma had 58 samples, each with 7129 genes. The samples included two subtypes: 32 cured patients and 26 not cured.
Because each type of tumor disease is related to many genes, we roughly extracted genes using the information index to classification (IIC) [34]. This example uses the improved PSO and assumes information carried by the ^{th} particle to be . where is the value of the first information, within the range of [0, 1]; are positive integers within the range of [1, ], and their values represent corresponding gene numbers; and is the number of roughly extracted genes based on IIC.
The example uses the extreme learning machine (ELM) to recognize tumor subtypes and takes the mean and standard deviations of the 5fold CV accuracy of 100 experiments to verify the accuracy of extracted genes.
To demonstrate the advantage of the proposed method in sample classification accuracy, we chose five methods with high geneselection accuracy, including binary particle swarm optimization BPSOELM, KmeansPSOELM, KmeansBPSOELM, BPSOGCSIELM, and KmeansGCSIMBPSOELM, to compare with the proposed method. Results of 100 independent repeated experiments were taken from [16]; Table 6 shows the comparison results.

BPSO is a PSO algorithm that solves discrete optimization problems using particles formed by the binary system [35]. BPSOELM optimally selects diseaserelated genes using BPSO and evaluates the classification results of genes selected each time using ELM [36]. KMeansPSOELM and KmeansBPSOELM each complete clustering analysis on all diseaserelated genes using Kmeans first followed by optimal selection with standard PSO and BPSO combined with the ELM classifier [37]. GCSI refers to genetoclass sensitivity information; the BPSOGCSIELM method generates an initial particle swarm using the GCSI of each gene and influences the updating and evolutionary process of particles [38]. Combining the above methods, KmeansGCSIMBPSOELM first completes Kmeans clustering analysis on all diseaserelated genes and then introduces the GCSI into the modified BPSO to optimally select genes in combination with the ELM classifier [39].
Because other gene extraction methods have demonstrated good effects on gene extraction of leukemia and SRBCT, this paper does not compare the results of these two types of diseases. Tables 6 and 7 indicate that we can select genes with greater influences on tumor subtype classification using the proposed feature dimensionality reduction method. Gene sets selected with the proposed method exhibited high classification accuracy, substantially higher than other methods. The standard deviation of classification accuracy of the proposed method was smaller than that of the other methods except brain cancer. For the four diseases, the average accuracy of the proposed method was higher than the highest average accuracy of the other methods by 1.25%, 3.84%, 1.57%, and 6.63%, respectively (from left to right in Table 6). For brain cancer and lymphoma, the average classification accuracy improved to approximately 93%; for colon and lung cancer, the average classification accuracy reached approximately 99%. The classification accuracy was fairly high, greatly improving the practicability of genebased disease detection.

To compare the ELM classifier with other classifiers based on disease classification results, we selected the random forest classifier and support vector machine (SVM) classifier to identify tumor diseases based on the gene sets in Table 7. We used the average and standard deviation of 5fold CV accuracy from 100 tests to evaluate the classification results of different classifiers. Table 8 shows the results of the random forest classifier and SVM classifier.

In terms of tumor disease recognition accuracy based on the selected gene sets using the proposed method, the random forest classifier and SVM classifier were generally poorer than the ELM classifier. For lung cancer, however, the SVM classifier’s average accuracy was slightly higher than that of the ELM classifier and its standard deviation was smaller.
5. Conclusions
(1)This paper proposes a new feature dimensionality reduction method based on a sampling survey method and heuristic intelligent optimization algorithm. First, we build a featurescoring system and a reduceddimension lengthscoring system based on the sampling survey method and select features ranking in the front with reasonable dimension lengths. Then, we select features according to the improved heuristic intelligent optimization algorithm. Our method has only two steps with simple operation and low computational complexity.(2)We successfully apply the proposed feature dimensionality reduction method to road roughness timedomain estimation research. Results show that we only need as few as four dynamic responses to estimate road roughness accurately. The experimental results of 12 feature dimensionality reductions reveal the mean accuracy of road roughness timedomain estimation to be 0.9897.(3)We successfully apply the proposed feature dimensionality reduction method to geneselection research in bioengineering. Results show that the gene sets selected with the proposed method have high classification accuracy with mean classification accuracy substantially higher than that of other geneselection methods.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This research was supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX17_0647) and National Key R&D Program of China (2016YFD0701103).
References
 A. Jalilvand and N. Salim, “Feature unionization: a novel approach for dimension reduction,” Applied Soft Computing, vol. 52, pp. 1253–1261, 2017. View at: Publisher Site  Google Scholar
 H. Itoh, A. Imiya, and T. Sakai, “Dimension reduction and construction of feature space for image pattern recognition,” Journal of Mathematical Imaging and Vision, vol. 56, no. 1, pp. 1–31, 2016. View at: Publisher Site  Google Scholar
 W. Zhao and S. Du, “Spectral–spatial feature extraction for hyperspectral image classification: a dimension reduction and deep learning approach,” IEEE Transactions on Geoscience and Remote Sensing, vol. 54, no. 8, pp. 4544–4554, 2016. View at: Publisher Site  Google Scholar
 C. Liu, W. Wang, M. Konan et al., “A new validity index of feature subset for evaluating the dimensionality reduction algorithms,” KnowledgeBased Systems, vol. 121, pp. 83–98, 2017. View at: Publisher Site  Google Scholar
 J. Liu, “Feature dimensionality reduction for myoelectric pattern recognition: a comparison study of feature selection and feature projection methods,” Medical Engineering & Physics, vol. 36, no. 12, pp. 1716–1720, 2014. View at: Publisher Site  Google Scholar
 M. Yu, L. Shao, X. Zhen, and X. He, “Local feature discriminant projection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 9, pp. 1908–1914, 2016. View at: Publisher Site  Google Scholar
 H. X. Vo and L. J. Durlofsky, “Regularized kernel PCA for the efficient parameterization of complex geological models,” Journal of Computational Physics, vol. 322, pp. 859–881, 2016. View at: Publisher Site  Google Scholar
 S. Guo, D. Guo, L. Chen, and Q. Jiang, “A L1regularized feature selection method for local dimension reduction on microarray data,” Computational Biology and Chemistry, vol. 67, pp. 92–101, 2017. View at: Publisher Site  Google Scholar
 J. Liu, Y. Lin, M. Lin, S. Wu, and J. Zhang, “Feature selection based on quality of information,” Neurocomputing, vol. 225, pp. 11–22, 2017. View at: Publisher Site  Google Scholar
 S. Goswami, A. Chakrabarti, and B. Chakraborty, “An efficient feature selection technique for clustering based on a new measure of feature importance,” Journal of Intelligent & Fuzzy Systems, vol. 32, no. 6, pp. 3847–3858, 2017. View at: Publisher Site  Google Scholar
 H. Imine, Y. Delanne, and N. K. M'Sirdi, “Road profile input estimation in vehicle dynamics simulation,” Vehicle System Dynamics, vol. 44, no. 4, pp. 285–303, 2006. View at: Publisher Site  Google Scholar
 A. González, E. J. O'brien, Y. Y. Li, and K. Cashell, “The use of vehicle acceleration measurements to estimate road roughness,” Vehicle System Dynamics, vol. 46, no. 6, pp. 483–499, 2008. View at: Publisher Site  Google Scholar
 Z. Q. Gu, Y. F. Zhu, S. Zhang, and X. K. Ma, “Identification of mining road roughness based on GABP neural network,” China Mechanical Engineering, vol. 25, no. 23, pp. 3232–3238, 2014. View at: Google Scholar
 Y. C. Qin, J. F. Guan, L. Gu, Y. Li, and R. Liu, “Road profile input estimation using adaptiveneuro fuzzy inference system,” Transactions of Beijing Institute of Technology, vol. 35, no. 5, pp. 481–489, 2015. View at: Google Scholar
 W. Sun and F. Han, “Gene selection method based on genetoclass sensitivity information and binary particle swarm optimization,” Application Research of Computers, vol. 31, no. 9, pp. 2648–2664, 2014. View at: Google Scholar
 F. Han, W. Sun, and Q. H. Ling, “A novel strategy for gene selection of microarray data based on genetoclass sensitivity information,” PLoS One, vol. 9, no. 5, article e97530, 2014. View at: Publisher Site  Google Scholar
 P. W. Novianti, V. L. Jong, K. C. B. Roes, and M. J. C. Eijkemans, “Metaanalysis approach as a gene selection method in class prediction: does it improve model performance? A case study in acute myeloid leukemia,” BMC Bioinformatics, vol. 18, no. 1, p. 210, 2017. View at: Publisher Site  Google Scholar
 Y. X. Wang, J. X. Liu, Y. L. Gao, C. H. Zheng, and J. L. Shang, “Differentially expressed genes selection via Laplacian regularized lowrank representation method,” Computational Biology and Chemistry, vol. 65, pp. 185–192, 2016. View at: Publisher Site  Google Scholar
 S. Liu and X. Liu, “A sample survey based linguistic MADM method with prospect theory for online shopping problems,” Group Decision and Negotiation, vol. 25, no. 4, pp. 749–774, 2016. View at: Publisher Site  Google Scholar
 W. Hu and F. Y. Xu, “Selftuning of PID parameters based on improved particle swarm optimization,” Application Research of Computers, vol. 29, no. 5, pp. 1791–1794, 2012. View at: Google Scholar
 J. S. Wang, J. C. Wang, and W. Wang, “Selftuning of PID parameters based on particle swarm optimization,” Control and Decision, vol. 20, no. 1, pp. 73–81, 2005. View at: Google Scholar
 A. H. Halassi, “An attractorbased multiobjective particle swarm optimization,” International Journal of Applied and Computational Mathematics, vol. 3, no. 2, pp. 1019–1036, 2017. View at: Publisher Site  Google Scholar
 C. Y. Wu, “A new improved artificial fish swarm optimization algorithm,” CAAI Transactions on Intelligent Systems, vol. 10, no. 3, pp. 1–6, 2015. View at: Google Scholar
 Q. Duan, M. Mao, P. Duan, and B. Hu, “An improved artificial fish swarm algorithm optimized by particle swarm optimization algorithm with extended memory,” Kybernetes, vol. 45, no. 2, pp. 210–222, 2016. View at: Publisher Site  Google Scholar
 F. J. Solis and R. J. B. Wets, “Minimization by random search techniques,” Mathematics of Operations Research, vol. 6, no. 1, pp. 19–30, 1981. View at: Publisher Site  Google Scholar
 S. Gao, K. Tang, X. Jiang, and J. Yang, “Convergence analysis of particle swarm optimization algorithm,” Science Technology and Engineering, vol. 6, no. 12, pp. 1625–1631, 2006. View at: Google Scholar
 Y. Feng, Research and Application of Simulated Annealing Algorithm, Kunming University of Science and Technology, Kunming, China, 2004.
 G. Huang, J. Liu, and Y. Yao, “Global convergence proof of artificial fish swarm algorithm,” Computer Engineering, vol. 38, no. 2, pp. 204–206, 2012. View at: Google Scholar
 Z. Cheng and Z. Lu, “Research on the PID control of the ESP system of tractor based on improved AFSA and improved SA,” Computers and Electronics in Agriculture, vol. 148, pp. 142–147, 2018. View at: Publisher Site  Google Scholar
 Y. Wang, F. S. Makedon, J. C. Ford, and J. Pearlman, “HykGene: a hybrid approach for selecting marker genes for phenotype classification using microarray gene expression data,” Bioinformatics, vol. 21, no. 8, pp. 1530–1537, 2005. View at: Publisher Site  Google Scholar
 U. Alon, N. Barkai, D. A. Notterman et al., “Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays,” Proceedings of the National Academy of Sciences of the United States of America, vol. 96, no. 12, pp. 6745–6750, 1999. View at: Publisher Site  Google Scholar
 J. Khan, J. S. Wei, M. Ringnér et al., “Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks,” Nature Medicine, vol. 7, no. 6, pp. 673–679, 2001. View at: Publisher Site  Google Scholar
 Y. Y. Leung, C. Q. Chang, Y. S. Hung, and P. C. Fung, “Gene selection for brain cancer classification,” in 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, pp. 5846–5849, New York, NY, USA, September 2006. View at: Publisher Site  Google Scholar
 Y. Li and R. Xiaogang, “Feature selection for cancer classification based on support vector machine,” Journal of Computer Research and Development, vol. 42, no. 10, p. 1796, 2005. View at: Publisher Site  Google Scholar
 J. Kennedy and R. C. Eberhart, “A discrete binary version of the particle swarm algorithm,” in 1997 IEEE International Conference on Systems, Man, and Cybernetics. Computational Cybernetics and Simulation, pp. 4104–4108, Orlando, FL, USA, October 1997. View at: Publisher Site  Google Scholar
 D. Tang, F. Han, and Z. Cheng, “Gene selection method based on gene scoring strategy and improved particle swarm optimization,” Computer Engineering and Design, vol. 39, no. 3, pp. 710–715, 2018. View at: Google Scholar
 S. Yang, F. Han, and J. Guan, “A hybrid gene selection and classification approach for microarray data based on clustering and PSO,” Communications in Computer and Information Science, vol. 375, pp. 88–93, 2013. View at: Publisher Site  Google Scholar
 C. Yang and F. Han, “A gene selection method based on BPSO with prior information,” Software Guide, vol. 14, no. 7, pp. 36–40, 2015. View at: Google Scholar
 F. Han, C. Yang, Y. Q. Wu et al., “A gene selection method for microarray data based on binary PSO encoding genetoclass sensitivity information,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 14, no. 1, pp. 85–96, 2017. View at: Publisher Site  Google Scholar
Copyright
Copyright © 2018 Zhun Cheng and Zhixiong Lu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.