Difference between revisions of "Logistic regression"
Jump to navigation
Jump to search
Line 1: | Line 1: | ||
'''Logistic regression''' is a type of curve fitting. It used for | '''Logistic regression''' is a type of curve fitting. It used for categorical outcome variables. Examples of categorical outcomes variables are: (1) pass/fail, (2) dead/alive. | ||
==Special types of LR== | ==Special types of LR== | ||
Line 6: | Line 6: | ||
===Ordered LR=== | ===Ordered LR=== | ||
If the dependent variable is categorical and ordered (e.g. ''Grade 1'', ''Grade 2'', ''Grade 3''), ''ordered logistic regrssion'' is used.<ref>Ordinal Logistic Regression | R Data Analysis Examples. UCLA. URL: [https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/ https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/]. Accessed on: March 24, 2018</ref> | If the dependent variable is categorical and can be ordered in a meaningful way (e.g. ''Grade 1'', ''Grade 2'', ''Grade 3''), ''ordered logistic regrssion'' is used.<ref>Ordinal Logistic Regression | R Data Analysis Examples. UCLA. URL: [https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/ https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/]. Accessed on: March 24, 2018</ref> | ||
==GNU/Octave example== | ==GNU/Octave example== |
Revision as of 16:32, 24 March 2018
Logistic regression is a type of curve fitting. It used for categorical outcome variables. Examples of categorical outcomes variables are: (1) pass/fail, (2) dead/alive.
Special types of LR
Multinomial LR
If the dependent variable is categorical and the categories are mutually exclusive (e.g. benign, suspicious, malignant, insufficient), multinomial logistic regression is used.
Ordered LR
If the dependent variable is categorical and can be ordered in a meaningful way (e.g. Grade 1, Grade 2, Grade 3), ordered logistic regrssion is used.[1]
GNU/Octave example
% ------------------------------------------------------------------------------------------------- % TO RUN THIS FROM WITHIN OCTAVE % run logistic_regression_example.m % % TO RUN FROM THE COMMAND LINE % octave logistic_regression_example.m > tmp.txt % ------------------------------------------------------------------------------------------------- clear all; close all; % Data from example on 'Logistic regression' page of Wikipedia - https://en.wikipedia.org/wiki/Logistic_regression y = [ 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1, 1 ]; x = [ 0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 1.75, 2.00, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 4.00, 4.25, 4.50, 4.75, 5.00, 5.50 ]; y=transpose(y); x=transpose(x); [theta, beta, dev, dl, d2l, gamma] = logistic_regression(y,x,1) % calculate standard error se = sqrt (diag (inv (-d2l))) % create array % [[ alpha se_alpha zvalue_alpha zvalue_alpha^2 P_Wald_alpha ] % [ beta_1 se_beta_1 zvalue_beta_1 zvalue_beta_1^2 P_Wald_beta_1 ] % [ beta_2 se_beta_2 zvalue_beta_2 zvalue_beta_2^2 P_Wald_beta_2 ] % [ ... ... ... ... ... ] % [ beta_n se_beta_n zvalue_beta_n zvalue_beta_n^2 P_Wald_beta_n ]] logit_arr=zeros(size(beta,1)+1,5); logit_arr(1,1)=theta; logit_arr(2:end,1)=beta; logit_arr(1:end,2)=se; logit_arr(1:end,3)=logit_arr(1:end,1)./logit_arr(1:end,2); % zvalue_i = coefficient_i/se_coefficient_i logit_arr(1:end,4)=logit_arr(1:end,3).*logit_arr(1:end,3); % zvalue_i^2 logit_arr(1:end,5)=1-chi2cdf(logit_arr(1:end,4),1) % Wald statistic is calculated as per formula in https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1065119/ % calculate the probability for a range of values studytime_arr= [ 1, 2, 3, 4, 5] len_studytime_arr=size(studytime_arr,2); % As per the manual ( https://www.gnu.org/software/octave/doc/v4.0.3/Correlation-and-Regression-Analysis.html ) GNU/Octave function fits: % logit (gamma_i (x)) = theta_i - beta' * x, i = 1 ... k-1 for ctr=1:len_studytime_arr P_logit(ctr)=1/(1+exp(theta-studytime_arr(ctr)*beta)); end % print to screen output P_logit logit_arr % create plot len_x = size(x,1) for ctr=1:len_x P_fitted(ctr)=1/(1+exp(theta-x(ctr)*beta)); end plot(x,y,'o') hold on; grid on; plot(x,P_fitted,'-') xlabel('Study time (hours)'); ylabel('Probability of passing the exam (-)');
See also
References
- ↑ Ordinal Logistic Regression | R Data Analysis Examples. UCLA. URL: https://stats.idre.ucla.edu/r/dae/ordinal-logistic-regression/. Accessed on: March 24, 2018