Why am I seeing a negative R^2 value?

R^2 is defined as:  

R^2 = 1 - mean(e_1^2)/var(y_1)

where mean(e_1^2) is the mean of the squares of the residuals, and var(y_1) is the variance of the dependent variable that is being regressed. Despite the notation, R^2 isn't actually the square of any quantity!

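R^2 can take any value from 1 down to -Infinity. It is negative whenever mean(e_1^2) is larger than var(y_1), that is, whenever the model's residuals are larger than they would be for the constant model y_1 ~ m (a fit to a single constant m, whose best value is the mean of y_1).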
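
As a rough illustration of the definition above, here is a minimal Python sketch. The function name r_squared and the sample data are made up for this example, and it assumes var(y_1) is the population variance (dividing by the number of points), which the formula above does not spell out.

```python
from statistics import fmean, pvariance


def r_squared(y, y_hat):
    """Return 1 - mean(e^2) / var(y), where e = y - y_hat.

    Assumption: var(y) is the population variance (divide by N).
    """
    squared_residuals = [(yi - fi) ** 2 for yi, fi in zip(y, y_hat)]
    return 1.0 - fmean(squared_residuals) / pvariance(y)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
y = [1.0, 2.0, 3.0, 4.0, 5.0]

# A perfect model has zero residuals, so R^2 is 1.
r2_perfect = r_squared(y, y)

# Predicting the mean everywhere gives mean(e^2) == var(y), so R^2 is 0.
r2_constant = r_squared(y, [fmean(y)] * len(y))

# A model that is off by 10 everywhere has mean(e^2) = 100 while
# var(y) = 2, so R^2 = 1 - 100/2, which is negative.
r2_shifted = r_squared(y, [yi + 10.0 for yi in y])

print(r2_perfect, r2_constant, r2_shifted)
```

Making the offset in the last case larger pushes R^2 further below zero without limit, because var(y_1) stays fixed while the squared residuals keep growing.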


4 Comments

  • 1
    Xander Camejo

    This post doesn’t make a lick of sense to me.

  • 0
    Steven Sarasin

    @James Rothman, no, actually. For an arbitrary model of y_1 you get arbitrary errors e_1; there is no link between e_1 and y_1. You can have an arbitrarily bad approximation to y_1, and therefore arbitrarily large e_1, while y_1 itself stays fixed, assuming you are comparing different models of the same data.

    @Tobias You didn't read the post

    @Xander Camejo Please imagine a situation where you want to compute the R^2 for any model of a set of data (X_1, Y_1). Your model could even be comically bad. The value of R^2 is 1 minus the ratio of the average of the squared errors in your approximation to the variance of the data you are modeling. The error in your approximation is the difference between what you predicted y should be for a given input and what it actually is in the data set.

    You can imagine an approximation that is purposefully bad. Let y ~ y_1 + n, where n is any positive number. Then every e_1 has size n, so mean(e_1^2) is equal to n^2. Now imagine n gets larger and larger. Since the data you are modeling is unchanged, var(y_1) stays the same (a constant), so only mean(e_1^2) = n^2 grows arbitrarily large in this example. The ratio grows without bound, causing the value of R^2 to tend toward negative infinity (1 - infinity is negative infinity).

  • 0
    Tobias Cunningham

    You can’t do it

  • -1
    James Rothman

    Is mean(e_1^2) not <= the variance of e? If so, it is possible to show that it has to be less than var(y_1) and that R^2 has to be positive.

Article is closed for comments.