I want to use Support Vector Regression (SVR) for regression, as it seems quite powerful when I have several features. Since scikit-learn provides a very easy-to-use implementation, that is the one I'm using. My questions below concern this Python package in particular, but if you have a solution in any other language or package, please let me know as well.
So, I’m using the following code:
from sklearn.svm import SVR
from sklearn.model_selection import cross_validate
from sklearn.model_selection import KFold

svr_rbf = SVR(kernel='rbf')
scoring = ['neg_mean_absolute_error', 'neg_mean_squared_error', 'r2']

# 10-fold cross-validation with shuffled folds
scores = cross_validate(svr_rbf, X, y, cv=KFold(10, shuffle=True),
                        scoring=scoring, return_train_score=False)

# the "neg_" scores are negated, so flip the sign to report MAE/MSE
score = -1 * scores['test_neg_mean_absolute_error']
print("MAE: %.4f (%.4f)" % (score.mean(), score.std()))

score = -1 * scores['test_neg_mean_squared_error']
print("MSE: %.4f (%.4f)" % (score.mean(), score.std()))

score = scores['test_r2']
print("R^2: %.4f (%.4f)" % (score.mean(), score.std()))
As you can see, I can easily run 10-fold cross-validation by splitting my data into 10 shuffled folds and getting the MAE, MSE, and R^2 for each fold.
However, my big question is how I can get the p-value, Pearson r, and adjusted R^2 for my SVR model specifically, just as I can with other Python packages such as statsmodels for linear regression.
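The closest I've come is computing r and adjusted R^2 from the out-of-fold predictions, roughly as in the sketch below. This is only my own attempt, not something built into sklearn: cross_val_predict, pearsonr, and r2_score do exist, but the adjusted-R^2 formula I apply is the usual one from linear regression, and I'm not sure the p-value returned by pearsonr means the same thing as a statsmodels-style regression p-value for SVR.

from scipy.stats import pearsonr
from sklearn.metrics import r2_score
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.svm import SVR

# X and y are the same feature matrix and target as above (assumed NumPy arrays)
svr_rbf = SVR(kernel='rbf')

# out-of-fold prediction for every sample
y_pred = cross_val_predict(svr_rbf, X, y, cv=KFold(10, shuffle=True))

# Pearson correlation between predictions and true values, with its p-value
r, p_value = pearsonr(y, y_pred)

# adjusted R^2 from the usual formula, with n samples and p features
# (my assumption that this formula is appropriate here)
n, p = X.shape
r2 = r2_score(y, y_pred)
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)

print("r: %.4f (p = %.4g), adjusted R^2: %.4f" % (r, p_value, adjusted_r2))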
To get these values per fold, I guess I will have to implement the cross-validation with KFold myself, but I don't think that's a big problem. The issue is that I'm not sure how to get these scores from sklearn's implementation of SVR itself.
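For reference, this is roughly the per-fold loop I have in mind. Again, it is only a sketch under my own assumptions: X and y are NumPy arrays so they can be indexed by fold indices, r and its p-value come from scipy.stats.pearsonr, and the adjusted R^2 reuses the standard formula with the number of test samples in each fold, which I'm not sure is the right choice.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import r2_score
from sklearn.model_selection import KFold
from sklearn.svm import SVR

kf = KFold(n_splits=10, shuffle=True)
fold_r, fold_p, fold_adj_r2 = [], [], []

for train_idx, test_idx in kf.split(X):
    model = SVR(kernel='rbf')
    model.fit(X[train_idx], y[train_idx])
    y_pred = model.predict(X[test_idx])

    # Pearson r and its p-value for this fold
    r, p = pearsonr(y[test_idx], y_pred)

    # adjusted R^2 for this fold (n = test samples, k = number of features)
    n, k = len(test_idx), X.shape[1]
    r2 = r2_score(y[test_idx], y_pred)
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

    fold_r.append(r)
    fold_p.append(p)
    fold_adj_r2.append(adj_r2)

print("r: %.4f (%.4f)" % (np.mean(fold_r), np.std(fold_r)))
print("adjusted R^2: %.4f (%.4f)" % (np.mean(fold_adj_r2), np.std(fold_adj_r2)))

But this only gives correlation-based statistics computed from the predictions; nothing here comes from the SVR model itself, which is exactly the part I'm unsure about.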