Logistic Regression

yueyuan
1 min read · Jul 30, 2020


Part 1: Coefficient

Part 2

Part 3: R-squared and p-value

In linear regression, R-squared and the p-value are calculated using the residuals. In brief, we square the residuals and add them up. We call this SS(fit), the sum of squared residuals around the best-fitting line. We compare that to the sum of squared residuals around the worst-fitting line, a horizontal line at the mean of the y-axis values. We call this SS(mean).

R-squared is the percentage of variation around the mean that goes away when you fit a line to the data.

R² = (SS(mean) − SS(fit)) / SS(mean)

It goes from 0 to 1.
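As a rough illustration, here is a minimal NumPy sketch (with made-up x/y data) that computes SS(mean), SS(fit), and R-squared for a simple straight-line fit:

```python
import numpy as np

# Hypothetical toy data; any paired x/y observations work here.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

# Best-fitting line via least squares.
slope, intercept = np.polyfit(x, y, 1)
y_fit = slope * x + intercept

# SS(fit): squared residuals around the best-fitting line.
ss_fit = np.sum((y - y_fit) ** 2)

# SS(mean): squared residuals around the worst-fitting line, the mean of y.
ss_mean = np.sum((y - y.mean()) ** 2)

# R-squared: fraction of the variation around the mean that goes away
# once we fit the line.
r_squared = (ss_mean - ss_fit) / ss_mean
print(r_squared)
```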

Differences from linear regression

One big difference between linear regression and logistic regression is how the line is fit to the data.

With linear regression, we fit the line using “least squares”. In other words, we find the line that minimizes the sum of the squares of the residuals. We also use the residuals to calculate R² and to compare simple models to complicated models.
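To make “least squares” concrete, the sketch below (again with made-up data) computes the closed-form slope and intercept and checks that any other line gives a larger sum of squared residuals:

```python
import numpy as np

# Hypothetical data; in practice these would be your observed values.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

def ssr(slope, intercept):
    """Sum of squared residuals for a candidate line y = slope * x + intercept."""
    return np.sum((y - (slope * x + intercept)) ** 2)

# Closed-form least-squares solution: the slope and intercept that minimize SSR.
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Nudging the line away from the least-squares solution increases SSR.
print(ssr(slope, intercept), ssr(slope + 0.1, intercept))
```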

For logistic regression, just as for linear regression, we need a measure of a good fit to compare against a measure of a bad fit. Unfortunately, the residuals for logistic regression are all infinite (on the log(odds) axis the raw data points sit at plus or minus infinity), so we can’t use them. Instead, we project the data onto the best-fitting line to get a log(odds) value for each point, translate those log(odds) back into probabilities, and finally calculate the log-likelihood of the data given the best-fitting squiggle.
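Here is a minimal sketch of that last step, assuming a made-up binary dataset and an already-chosen candidate line on the log(odds) axis: project the data onto the line, convert log(odds) back to probabilities with the sigmoid, and sum up the log-likelihood:

```python
import numpy as np

# Hypothetical data: x could be weight, y = 1 or 0 for a binary outcome.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0,   0,   0,   1,   1,   1])

# Assumed candidate line on the log(odds) axis (illustrative values only).
intercept, slope = -7.0, 2.0

# Project each data point onto the line to get its log(odds) ...
log_odds = intercept + slope * x

# ... then translate the log(odds) back into probabilities with the sigmoid.
p = 1.0 / (1.0 + np.exp(-log_odds))

# Log-likelihood of the observed 0/1 outcomes given the fitted squiggle:
# log(p) for the 1s plus log(1 - p) for the 0s.
log_likelihood = np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))
print(log_likelihood)
```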

