python - Interpreting the DecisionTreeRegressor score?


I am trying to evaluate the relevance of features using DecisionTreeRegressor().

The relevant part of the code is presented below:

    # TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
    new_data = data.drop(['frozen'], axis=1)

    # TODO: Split the data into training and testing sets (0.25) using the given feature as the target
    # TODO: Set a random state.
    from sklearn.model_selection import train_test_split

    x_train, x_test, y_train, y_test = train_test_split(new_data, data['frozen'], test_size=0.25, random_state=1)

    # TODO: Create a decision tree regressor and fit it to the training set
    from sklearn.tree import DecisionTreeRegressor

    regressor = DecisionTreeRegressor(random_state=1)
    regressor.fit(x_train, y_train)

    # TODO: Report the score of the prediction using the testing set
    from sklearn.model_selection import cross_val_score

    # score = cross_val_score(regressor, x_test, y_test)
    score = regressor.score(x_test, y_test)

    print score  # Python 2.x

When I run the print function, it returns the following score:

-0.649574327334

You can find the score function's implementation here, and its explanation below:

Returns the coefficient of determination R^2 of the prediction. ... The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse).

I do not grasp the whole concept yet, so this explanation is not very helpful for me. For instance, I do not understand why the score can be negative and what exactly it indicates (if something is squared, I would expect it can only be positive).


What does this score indicate, and why can it be negative?

If you know of any article (for starters), it might be helpful as well!


You can find the rest of the code here.

You can find the dataset here.

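The article you are following uses cross_val_score with a DecisionTreeRegressor. You may want to take a look at the scikit-learn documentation for DecisionTreeRegressor. Basically, the score you see is R^2, or (1 - u/v). Here u is the residual sum of squares of your prediction, i.e. the sum of (y_true - y_pred)^2, and v is the total sum of squares, i.e. the sum of (y_true - mean(y_true))^2.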

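u/v can be arbitrarily large when you make really bad predictions, while it can only be as small as 0, given that u and v are both sums of squared residuals (>= 0). So the score 1 - u/v is at most 1, and it becomes negative whenever u > v, that is, whenever the model's predictions are worse than always predicting the mean of the target. The name R^2 is a convention; the score itself is not the square of anything.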
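To make the (1 - u/v) formula concrete, here is a minimal sketch in plain Python. The helper `r_squared` and the toy numbers are made up for illustration and are not taken from the question's dataset. It should compute the same quantity that `regressor.score` reports for a regressor.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, computed as 1 - u/v (illustrative helper)."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares of the predictions
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares around the mean of the true values
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - u / v


# Hypothetical toy targets; their mean is 2.5 and v is 5.0
y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r_squared(y_true, [1.0, 2.0, 3.0, 4.0])   # u = 0, so the score is 1.0
baseline = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])  # u = v, so the score is 0.0
bad = r_squared(y_true, [4.0, 3.0, 2.0, 1.0])       # u = 20 > v, so the score is -3.0

print(perfect, baseline, bad)
```

Read this way, a score of -0.65 suggests that on the test set the tree's squared error is about 1.65 times that of a model that always predicts the mean of the test targets.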

