
XGBoost has some published API documentation and examples, but they are not very good. It is not clear which parameter names should be used in Python, that is, which parameters of the core package they correspond to.

I wasn't able to use XGBoost (at least the regressor) on more than a few hundred thousand samples. At about a million samples it became too slow for my purposes (e.g. days for training or for a simple parameter search).

Let's prepare some data first:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.model_selection import cross_val_score

    X, y = datasets.make_classification(n_samples=10000, n_features=20)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

I will present the scikit-learn interface. If you are familiar with it, these lines should be obvious to you:

    from xgboost.sklearn import XGBClassifier

    xclas = XGBClassifier()  # the classifier

And as I said, since it exposes the scikit-learn API, you can use it like any other classifier:

    cross_val_score(xclas, X_train, y_train)

How to use XGBoost with RandomizedSearchCV

Are you still using classic grid search? Just don't, and use RandomizedSearchCV instead.

Below is an example of how to use scikit-learn's RandomizedSearchCV with XGBoost, with some starting distributions. Of course, you should tweak them to your problem, since some of these are not invariant against the regression loss! I use it for regression problems.

First, prepare the model and parameters:

    from xgboost.sklearn import XGBRegressor

And then just plug it into RS:

    from sklearn.model_selection import RandomizedSearchCV

    gs = RandomizedSearchCV(xgbreg, params, n_jobs=1)

Note that I pass nthreads=-1 to XGBoost and n_jobs=1 to RandomizedSearchCV. IMHO nthreads should also be called n_jobs to follow sklearn conventions, but currently it is not.


