Kernel ridge regression
Kernel Ridge Regression (KRR) is a powerful machine learning technique that combines the principles of ridge regression with kernel functions in order to capture non-linear relationships between inputs and outputs. KRR is widely used in various fields, including regression analysis, classification, and data mining11,12.
In KRR, the aim is to find a function f(x) that maps input data x to the output y, where x denotes a feature vector and y represents a real-valued target variable. The model can be represented as13:
$$f\left(x\right)=\sum_{i=1}^{n}\alpha_{i}\,k\left(x,x_{i}\right)$$
where \(k\left(x,x_{i}\right)\) is a kernel function measuring the similarity between x and the training point \(x_{i}\), and the \(\alpha_{i}\) are coefficients to be learned.
To prevent overfitting, KRR introduces regularization by adding a penalty term based on the \(L_{2}\) norm of the \(\alpha_{i}\) coefficients13:
$$\min_{\alpha}\left(\frac{1}{n}\sum_{i=1}^{n}\left(y_{i}-\sum_{j=1}^{n}\alpha_{j}\,k\left(x_{j},x_{i}\right)\right)^{2}+\lambda\left\|\alpha\right\|_{2}^{2}\right)$$
where \(\lambda\) is the regularization parameter that controls the balance between fitting the training data and the strength of the penalty term.
Once the model is trained by solving the optimization problem, it can be used for making predictions on new data points13:
$$\widehat{y}=f\left(x_{\text{new}}\right)=\sum_{i=1}^{n}\alpha_{i}\,k\left(x_{\text{new}},x_{i}\right)$$
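For illustration, a minimal Python sketch of this procedure is given below. It assumes a Gaussian (RBF) kernel with a hypothetical bandwidth parameter `gamma` (not specified above) and solves the linear system that follows from setting the gradient of the regularized objective to zero; it is a generic sketch, not the specific implementation used in this study.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def krr_fit(X, y, lam=1e-2, gamma=1.0):
    """Estimate the dual coefficients alpha for the objective
    (1/n) * sum_i (y_i - f(x_i))^2 + lam * ||alpha||_2^2,
    with f(x) = sum_j alpha_j k(x_j, x).
    Setting the gradient to zero gives (K K + n*lam*I) alpha = K y."""
    K = rbf_kernel(X, X, gamma)
    n = X.shape[0]
    return np.linalg.solve(K @ K + n * lam * np.eye(n), K @ y)

def krr_predict(X_train, alpha, X_new, gamma=1.0):
    """Prediction rule: y_hat = sum_i alpha_i k(x_new, x_i)."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha
```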
Polynomial regression
Polynomial Regression is a widely used method in machine learning and statistics for modeling the relationship between an independent variable and a dependent variable. It extends the simple linear regression model by fitting a polynomial function to the data, allowing non-linear relationships to be captured.
In Polynomial Regression, we aim to find a polynomial function of degree d that best fits the data. The model can be represented as14:
$$y=\beta_{0}+\beta_{1}x+\beta_{2}x^{2}+\cdots+\beta_{d}x^{d}+\epsilon$$
Here, y denotes the dependent variable, x the independent variable, d the degree of the polynomial, \(\beta_{0},\beta_{1},\ldots,\beta_{d}\) the coefficients to be estimated, and \(\epsilon\) the error term.
To estimate the coefficients \(\beta_{i}\), we typically use the method of least squares, which minimizes the sum of squared errors5,14:
$$\min_{\beta_{0},\beta_{1},\ldots,\beta_{d}}\sum_{i=1}^{n}\left(y_{i}-\left(\beta_{0}+\beta_{1}x_{i}+\beta_{2}x_{i}^{2}+\cdots+\beta_{d}x_{i}^{d}\right)\right)^{2}$$
The optimal values of \(\beta_{i}\) are found through this optimization process.
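As a concrete illustration, this least-squares fit can be computed with an ordinary Vandermonde design matrix. The sketch below is a generic implementation and does not reflect the specific degree d or the data used in this study.

```python
import numpy as np

def fit_polynomial(x, y, d):
    """Least-squares estimate of beta_0..beta_d for y = sum_k beta_k x^k + eps.

    The Vandermonde design matrix has columns [1, x, x^2, ..., x^d];
    np.linalg.lstsq solves the least-squares problem in a stable way.
    """
    X = np.vander(x, N=d + 1, increasing=True)   # shape (n, d+1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict_polynomial(beta, x_new):
    """Evaluate the fitted polynomial at new points."""
    X_new = np.vander(x_new, N=len(beta), increasing=True)
    return X_new @ beta
```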
ν-Support Vector Regression (ν-SVR)
ν-SVR is a powerful extension of traditional Support Vector Regression (SVR) that introduces a parameter \(\nu\) to control the trade-off between model complexity and the number of support vectors15. In ν-SVR, the objective is to find a function f(x) that estimates the target variable y from a set of input features x. The model can be written as16:
$$f\left(x\right)=\sum_{i=1}^{n}\alpha_{i}\,K\left(x,x_{i}\right)+b$$
Here, \(K\left(x,x_{i}\right)\) denotes the kernel function that measures the similarity between the input data point x and the support vectors \(x_{i}\), b is the bias term, and the \(\alpha_{i}\) values are Lagrange multipliers.
The objective in ν-SVR is to minimize a loss function that includes a regularization term and a loss term penalizing deviations from the actual target values. The loss function can be defined as17:
$$\min_{\alpha,b}\left(\frac{1}{2}\left\|\alpha\right\|^{2}+C\sum_{i=1}^{n}\left[\max\left(0,\left|y_{i}-f\left(x_{i}\right)\right|-\epsilon\right)\right]^{2}\right)$$
Here, C stands for a regularization parameter, and \(\epsilon\) controls the width of the \(\epsilon\)-insensitive tube, which allows for a margin of error around the target values.
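A minimal usage sketch with scikit-learn's NuSVR estimator is shown below; the synthetic data and the hyperparameter values (nu, C, kernel) are purely illustrative placeholders, not the settings used in this work.

```python
import numpy as np
from sklearn.svm import NuSVR

# Illustrative data: the study's actual features and targets are not shown here.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sinc(X).ravel() + 0.1 * rng.standard_normal(200)

# nu bounds the fraction of support vectors / margin errors, and C is the
# regularization parameter from the objective above; both values are placeholders.
model = NuSVR(nu=0.5, C=1.0, kernel="rbf", gamma="scale")
model.fit(X, y)
y_hat = model.predict(X)
```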
Fruit-Fly optimization algorithm (FFOA)
FFOA is a metaheuristic search method18 that emulates the foraging behavior of fruit flies. The algorithm draws inspiration from the observation that fruit flies rely on their olfactory senses to locate food sources. When they approach a potential food source, they emit pheromones, which influence the flight paths of other fruit flies and lead them toward the food or toward areas where their fellow flies have gathered19. During flight, the fruit flies continually update their positions and flight directions, optimizing their paths to move closer to the food source; the best position found so far becomes the new starting point for the swarm. The algorithm terminates when either the maximum number of iterations is reached or the attained result meets the specified precision threshold20,21. The steps of the FFOA are shown in the flowchart of Fig. 3.

Fig. 3. Flowchart of the FFOA algorithm.
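To make these steps concrete, the following Python sketch implements the basic fruit-fly optimization scheme described above (random scattering around the swarm location, smell-concentration evaluation, and movement toward the best position). The objective function, swarm size, and iteration count are illustrative assumptions, not the configuration used in this study.

```python
import numpy as np

def ffoa_minimize(objective, n_flies=30, n_iter=100, seed=0):
    """Minimal sketch of the basic Fruit-Fly Optimization Algorithm.

    Flies scatter randomly around the swarm location, the smell
    concentration of each fly is the objective value at the reciprocal
    of its distance to the origin, and the swarm moves to the best
    position found so far.
    """
    rng = np.random.default_rng(seed)
    x_axis, y_axis = rng.uniform(-1, 1, size=2)   # random initial swarm location
    best_val, best_s = np.inf, None

    for _ in range(n_iter):
        # Random search directions and distances around the swarm location.
        x = x_axis + rng.uniform(-1, 1, size=n_flies)
        y = y_axis + rng.uniform(-1, 1, size=n_flies)
        dist = np.sqrt(x**2 + y**2)
        s = 1.0 / (dist + 1e-12)            # smell concentration judgement value
        smell = np.array([objective(si) for si in s])

        i = smell.argmin()                  # best fly in this generation
        if smell[i] < best_val:             # keep the best position found so far
            best_val, best_s = smell[i], s[i]
            x_axis, y_axis = x[i], y[i]     # the swarm flies toward it
    return best_s, best_val

# Example: minimize a simple one-dimensional function of the candidate parameter.
s_opt, f_opt = ffoa_minimize(lambda s: (s - 0.2) ** 2)
```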
Performance metrics
To assess and compare the outcomes of these models, metrics such as the Coefficient of Determination (R²), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE) have been employed22,23. The computation of these metrics is detailed in Table 1.
In Table 1, the following symbols represent key statistics and values:
- n corresponds to the size of the data set.
- \(\bar{y}\) and \(\bar{o}\) represent the mean values of the predicted and actual values, respectively.
- \(o_{i}\) and \(y_{i}\) represent the actual (observed) and predicted values of the output variable, respectively.
- \(o_{max}\) and \(o_{min}\) denote the maximum and minimum values among the actual values.
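For reference, the sketch below computes these metrics using their standard textbook definitions; the exact formulas adopted in this work are those given in Table 1, so minor differences (for example, an R² defined via the correlation coefficient) are possible.

```python
import numpy as np

def evaluate(o, y):
    """Standard definitions of R^2, RMSE, and MAE (cf. Table 1).

    o : array of actual (observed) values o_i
    y : array of predicted values y_i
    """
    o, y = np.asarray(o, dtype=float), np.asarray(y, dtype=float)
    ss_res = np.sum((o - y) ** 2)
    ss_tot = np.sum((o - o.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse = np.sqrt(np.mean((o - y) ** 2))
    mae = np.mean(np.abs(o - y))
    return {"R2": r2, "RMSE": rmse, "MAE": mae}
```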