Ridge and Lasso regression are two powerful regularization techniques, but each comes with its own limitations:
- Ridge Regression (L2): Shrinks coefficients to prevent overfitting but never reduces them to exactly zero, so it does not perform feature selection.
- Lasso Regression (L1): Can shrink some coefficients exactly to zero, effectively selecting features, but it struggles with highly correlated predictors and tends to select only one variable from a group of related ones.
This is where Elastic Net Regression comes in.
Elastic Net is a regularization technique that combines the characteristics of both Lasso and Ridge regression. In this blog, we will discuss Elastic Net regression in R. It provides a balanced, flexible approach that is especially useful when:
- You have many features (possibly more than observations),
- Some predictors are highly correlated, and
- You want both feature selection and regularization.
In short, Elastic Net bridges the gap between Ridge and Lasso, giving us the best of both worlds.
Why Use Elastic Net?
When you’re dealing with high-dimensional data (i.e., more features than observations or features that are highly correlated), traditional linear regression may overfit or behave poorly.
Elastic Net helps in such cases by:
- Handling multicollinearity like Ridge.
- Preventing overfitting through regularization.
- Performing feature selection like Lasso.
Because it can tackle multicollinearity and also perform feature selection, it is called "elastic": it reduces the impact of less useful features without necessarily eliminating all of them.
The Elastic Net Formula
The Elastic Net regression objective is given by:
Elastic Net loss = loss function + λ1 ||w||₁ + λ2 ||w||₂²
- where the loss function measures the difference between the predicted and the actual values,
- λ1 controls the L1 penalty (Lasso),
- λ2 controls the L2 penalty (Ridge),
- ||w||₁ is the sum of the absolute values of the coefficients, and ||w||₂² is the squared magnitude of the coefficients.
(A small R sketch below, after the notes on λ1, makes this objective concrete.)
Here λ1 plays an important role:
- If λ1 = 0, no features are eliminated.
- If λ1 = ∞, all features are eliminated.
- As λ1 increases, bias increases; as λ1 decreases, variance increases.
- In other words, λ1 is directly proportional to bias and inversely proportional to variance.
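To make the objective concrete, here is a minimal R sketch that evaluates it for a given coefficient vector. The function name elastic_net_objective is purely illustrative and is not part of glmnet, which handles all of this internally.

# Illustrative only: evaluate the elastic net objective for a linear model
# loss (sum of squared errors) + lambda1 * L1 penalty + lambda2 * L2 penalty
elastic_net_objective <- function(w, X, y, lambda1, lambda2) {
  residuals <- y - X %*% w
  loss <- sum(residuals^2)            # squared-error loss
  l1_penalty <- lambda1 * sum(abs(w)) # Lasso part: sum of absolute coefficients
  l2_penalty <- lambda2 * sum(w^2)    # Ridge part: squared magnitude of coefficients
  loss + l1_penalty + l2_penalty
}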
Implementation of Elastic Net regression in R
The first step is to install and load the required package.
# install.packages("glmnet") # Run only once
library(glmnet)
The second step is to load the dataset and define the predictor and target variables. Here the mtcars dataset is used for demonstration.
data(mtcars)
# Define response and predictors
x <- as.matrix(mtcars[, -1]) # predictors (excluding 'mpg')
y <- mtcars$mpg # response variable
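As an optional sanity check, you can confirm the shape of the predictor matrix; after dropping mpg, mtcars leaves 32 observations and 10 predictors.

# Optional: check the dimensions of the predictor matrix
dim(x) # 32 rows (observations), 10 columns (predictors)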
The third step is to fit the elastic net model. We’ll use the alpha parameter to control the mix between Lasso and Ridge:
- alpha = 1: Lasso
- alpha = 0: Ridge
- 0 < alpha < 1: Elastic Net
# Fit Elastic Net with alpha = 0.5 (50% Lasso + 50% Ridge)
elastic_net_model <- glmnet(x, y, alpha = 0.5)
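Optionally, you can visualize how the coefficients shrink as lambda grows using glmnet's built-in plot method:

# Optional: coefficient paths across the lambda sequence
plot(elastic_net_model, xvar = "lambda", label = TRUE)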
The fourth step is to perform cross-validation to choose the best lambda.
set.seed(123) # for reproducibility
cv_model <- cv.glmnet(x, y, alpha = 0.5)
best_lambda <- cv_model$lambda.min
print(best_lambda)
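Besides lambda.min, cv.glmnet also reports lambda.1se, the largest lambda whose cross-validation error is within one standard error of the minimum; it is a common, more conservative choice if you prefer a simpler model:

# A more regularized alternative to lambda.min
best_lambda_1se <- cv_model$lambda.1se
print(best_lambda_1se)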
The fifth step is to plot the cross-validation curve, which shows how the error changes with lambda and where the optimal value lies.
plot(cv_model)

The sixth step is to print the coefficients of the final model.
# Coefficients at best lambda
coef(cv_model, s = "lambda.min")
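coef() returns a sparse matrix; if you find that hard to read, you can convert it to an ordinary matrix (optional):

# Optional: convert the sparse coefficient matrix to a regular matrix for readability
as.matrix(coef(cv_model, s = "lambda.min"))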
The final step is to make predictions.
# Predict on new data (here using training data for simplicity)
predictions <- predict(cv_model, s = "lambda.min", newx = x)
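To get a rough sense of fit quality, you can compute the RMSE and R-squared on these predictions. This is only an in-sample sketch; for a proper evaluation you would use a held-out test set.

# In-sample evaluation (illustration only; use a test set in practice)
rmse <- sqrt(mean((y - predictions)^2))
rsq <- 1 - sum((y - predictions)^2) / sum((y - mean(y))^2)
print(c(RMSE = rmse, R_squared = rsq))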
Conclusion
In conclusion, Elastic Net regression combines the strengths of Ridge and Lasso regression, giving us the best of both worlds. By blending the L1 and L2 penalties, it not only performs effective feature selection like Lasso but also handles multicollinearity and grouped variables like Ridge. This makes Elastic Net especially valuable when dealing with high-dimensional data or when predictors are highly correlated.