Book
Author Harrell, Frank E., author

Title Regression modeling strategies : with applications to linear models, logistic and ordinal regression, and survival analysis / Frank E. Harrell, Jr
Edition Second edition
Published Cham ; New York : Springer, [2015]

Copies

Location Call no. Vol. Availability
 MELB  519.536 Har/Rms 2015  AVAILABLE
 MELB  519.536 Har/Rms 2015  AVAILABLE
Description xxv, 582 pages : illustrations (some color) ; 27 cm
Series Springer series in statistics, 0172-7397
Contents Machine generated contents note: 1.Introduction -- 1.1.Hypothesis Testing, Estimation, and Prediction -- 1.2.Examples of Uses of Predictive Multivariable Modeling -- 1.3.Prediction vs. Classification -- 1.4.Planning for Modeling -- 1.4.1.Emphasizing Continuous Variables -- 1.5.Choice of the Model -- 1.6.Further Reading -- 2.General Aspects of Fitting Regression Models -- 2.1.Notation for Multivariable Regression Models -- 2.2.Model Formulations -- 2.3.Interpreting Model Parameters -- 2.3.1.Nominal Predictors -- 2.3.2.Interactions -- 2.3.3.Example: Inference for a Simple Model -- 2.4.Relaxing Linearity Assumption for Continuous Predictors -- 2.4.1.Avoiding Categorization -- 2.4.2.Simple Nonlinear Terms -- 2.4.3.Splines for Estimating Shape of Regression Function and Determining Predictor Transformations -- 2.4.4.Cubic Spline Functions -- 2.4.5.Restricted Cubic Splines -- 2.4.6.Choosing Number and Position of Knots -- 2.4.7.Nonparametric Regression --
Contents note continued: 2.4.8.Advantages of Regression Splines over Other Methods -- 2.5.Recursive Partitioning: Tree-Based Models -- 2.6.Multiple Degree of Freedom Tests of Association -- 2.7.Assessment of Model Fit -- 2.7.1.Regression Assumptions -- 2.7.2.Modeling and Testing Complex Interactions -- 2.7.3.Fitting Ordinal Predictors -- 2.7.4.Distributional Assumptions -- 2.8.Further Reading -- 2.9.Problems -- 3.Missing Data -- 3.1.Types of Missing Data -- 3.2.Prelude to Modeling -- 3.3.Missing Values for Different Types of Response Variables -- 3.4.Problems with Simple Alternatives to Imputation -- 3.5.Strategies for Developing an Imputation Model -- 3.6.Single Conditional Mean Imputation -- 3.7.Predictive Mean Matching -- 3.8.Multiple Imputation -- 3.8.1.The aregImpute and Other Chained Equations Approaches -- 3.9.Diagnostics -- 3.10.Summary and Rough Guidelines -- 3.11.Further Reading -- 3.12.Problems -- 4.Multivariable Modeling Strategies --
Contents note continued: 4.1.Prespecification of Predictor Complexity Without Later Simplification -- 4.2.Checking Assumptions of Multiple Predictors Simultaneously -- 4.3.Variable Selection -- 4.4.Sample Size, Overfitting, and Limits on Number of Predictors -- 4.5.Shrinkage -- 4.6.Collinearity -- 4.7.Data Reduction -- 4.7.1.Redundancy Analysis -- 4.7.2.Variable Clustering -- 4.7.3.Transformation and Scaling Variables Without Using Y -- 4.7.4.Simultaneous Transformation and Imputation -- 4.7.5.Simple Scoring of Variable Clusters -- 4.7.6.Simplifying Cluster Scores -- 4.7.7.How Much Data Reduction Is Necessary? -- 4.8.Other Approaches to Predictive Modeling -- 4.9.Overly Influential Observations -- 4.10.Comparing Two Models -- 4.11.Improving the Practice of Multivariable Prediction -- 4.12.Summary: Possible Modeling Strategies -- 4.12.1.Developing Predictive Models -- 4.12.2.Developing Models for Effect Estimation -- 4.12.3.Developing Models for Hypothesis Testing --
Contents note continued: 4.13.Further Reading -- 4.14.Problems -- 5.Describing, Resampling, Validating, and Simplifying the Model -- 5.1.Describing the Fitted Model -- 5.1.1.Interpreting Effects -- 5.1.2.Indexes of Model Performance -- 5.2.The Bootstrap -- 5.3.Model Validation -- 5.3.1.Introduction -- 5.3.2.Which Quantities Should Be Used in Validation? -- 5.3.3.Data-Splitting -- 5.3.4.Improvements on Data-Splitting: Resampling -- 5.3.5.Validation Using the Bootstrap -- 5.4.Bootstrapping Ranks of Predictors -- 5.5.Simplifying the Final Model by Approximating It -- 5.5.1.Difficulties Using Full Models -- 5.5.2.Approximating the Full Model -- 5.6.Further Reading -- 5.7.Problem -- 6.R Software -- 6.1.The R Modeling Language -- 6.2.User-Contributed Functions -- 6.3.The rms Package -- 6.4.Other Functions -- 6.5.Further Reading -- 7.Modeling Longitudinal Responses using Generalized Least Squares -- 7.1.Notation and Data Setup -- 7.2.Model Specification for Effects on E(Y) --
Contents note continued: 7.3.Modeling Within-Subject Dependence -- 7.4.Parameter Estimation Procedure -- 7.5.Common Correlation Structures -- 7.6.Checking Model Fit -- 7.7.Sample Size Considerations -- 7.8.R Software -- 7.9.Case Study -- 7.9.1.Graphical Exploration of Data -- 7.9.2.Using Generalized Least Squares -- 7.10.Further Reading -- 8.Case Study in Data Reduction -- 8.1.Data -- 8.2.How Many Parameters Can Be Estimated? -- 8.3.Redundancy Analysis -- 8.4.Variable Clustering -- 8.5.Transformation and Single Imputation Using transcan -- 8.6.Data Reduction Using Principal Components -- 8.6.1.Sparse Principal Components -- 8.7.Transformation Using Nonparametric Smoothers -- 8.8.Further Reading -- 8.9.Problems -- 9.Overview of Maximum Likelihood Estimation -- 9.1.General Notions--Simple Cases -- 9.2.Hypothesis Tests -- 9.2.1.Likelihood Ratio Test -- 9.2.2.Wald Test -- 9.2.3.Score Test -- 9.2.4.Normal Distribution---One Sample -- 9.3.General Case -- 9.3.1.Global Test Statistics --
Contents note continued: 9.3.2.Testing a Subset of the Parameters -- 9.3.3.Tests Based on Contrasts -- 9.3.4.Which Test Statistics to Use When -- 9.3.5.Example: Binomial---Comparing Two Proportions -- 9.4.Iterative ML Estimation -- 9.5.Robust Estimation of the Covariance Matrix -- 9.6.Wald, Score, and Likelihood-Based Confidence Intervals -- 9.6.1.Simultaneous Wald Confidence Regions -- 9.7.Bootstrap Confidence Regions -- 9.8.Further Use of the Log Likelihood -- 9.8.1.Rating Two Models, Penalizing for Complexity -- 9.8.2.Testing Whether One Model Is Better than Another -- 9.8.3.Unitless Index of Predictive Ability -- 9.8.4.Unitless Index of Adequacy of a Subset of Predictors -- 9.9.Weighted Maximum Likelihood Estimation -- 9.10.Penalized Maximum Likelihood Estimation -- 9.11.Further Reading -- 9.12.Problems -- 10.Binary Logistic Regression -- 10.1.Model -- 10.1.1.Model Assumptions and Interpretation of Parameters -- 10.1.2.Odds Ratio, Risk Ratio, and Risk Difference --
Contents note continued: 10.1.3.Detailed Example -- 10.1.4.Design Formulations -- 10.2.Estimation -- 10.2.1.Maximum Likelihood Estimates -- 10.2.2.Estimation of Odds Ratios and Probabilities -- 10.2.3.Minimum Sample Size Requirement -- 10.3.Test Statistics -- 10.4.Residuals -- 10.5.Assessment of Model Fit -- 10.6.Collinearity -- 10.7.Overly Influential Observations -- 10.8.Quantifying Predictive Ability -- 10.9.Validating the Fitted Model -- 10.10.Describing the Fitted Model -- 10.11.R Functions -- 10.12.Further Reading -- 10.13.Problems -- 11.Binary Logistic Regression Case Study 1 -- 11.1.Overview -- 11.2.Background -- 11.3.Data Transformations and Single Imputation -- 11.4.Regression on Original Variables, Principal Components and Pretransformations -- 11.5.Description of Fitted Model -- 11.6.Backwards Step-Down -- 11.7.Model Approximation -- 12.Logistic Model Case Study 2: Survival of Titanic Passengers -- 12.1.Descriptive Statistics --
Contents note continued: 12.2.Exploring Trends with Nonparametric Regression -- 12.3.Binary Logistic Model With Casewise Deletion of Missing Values -- 12.4.Examining Missing Data Patterns -- 12.5.Multiple Imputation -- 12.6.Summarizing the Fitted Model -- 13.Ordinal Logistic Regression -- 13.1.Background -- 13.2.Ordinality Assumption -- 13.3.Proportional Odds Model -- 13.3.1.Model -- 13.3.2.Assumptions and Interpretation of Parameters -- 13.3.3.Estimation -- 13.3.4.Residuals -- 13.3.5.Assessment of Model Fit -- 13.3.6.Quantifying Predictive Ability -- 13.3.7.Describing the Fitted Model -- 13.3.8.Validating the Fitted Model -- 13.3.9.R Functions -- 13.4.Continuation Ratio Model -- 13.4.1.Model -- 13.4.2.Assumptions and Interpretation of Parameters -- 13.4.3.Estimation -- 13.4.4.Residuals -- 13.4.5.Assessment of Model Fit -- 13.4.6.Extended CR Model -- 13.4.7.Role of Penalization in Extended CR Model -- 13.4.8.Validating the Fitted Model -- 13.4.9.R Functions --
Contents note continued: 13.5.Further Reading -- 13.6.Problems -- 14.Case Study in Ordinal Regression, Data Reduction, and Penalization -- 14.1.Response Variable -- 14.2.Variable Clustering -- 14.3.Developing Cluster Summary Scores -- 14.4.Assessing Ordinality of Y for each X, and Unadjusted Checking of PO and CR Assumptions -- 14.5.A Tentative Full Proportional Odds Model -- 14.6.Residual Plots -- 14.7.Graphical Assessment of Fit of CR Model -- 14.8.Extended Continuation Ratio Model -- 14.9.Penalized Estimation -- 14.10.Using Approximations to Simplify the Model -- 14.11.Validating the Model -- 14.12.Summary -- 14.13.Further Reading -- 14.14.Problems -- 15.Regression Models for Continuous Y and Case Study in Ordinal Regression -- 15.1.The Linear Model -- 15.2.Quantile Regression -- 15.3.Ordinal Regression Models for Continuous Y -- 15.3.1.Minimum Sample Size Requirement -- 15.4.Comparison of Assumptions of Various Models -- 15.5.Dataset and Descriptive Statistics --
Contents note continued: 15.5.1.Checking Assumptions of OLS and Other Models -- 15.6.Ordinal Regression Applied to HbA1c -- 15.6.1.Checking Fit for Various Models Using Age -- 15.6.2.Examination of BMI -- 15.6.3.Consideration of All Body Size Measurements -- 16.Transform-Both-Sides Regression -- 16.1.Background -- 16.2.Generalized Additive Models -- 16.3.Nonparametric Estimation of Y-Transformation -- 16.4.Obtaining Estimates on the Original Scale -- 16.5.R Functions -- 16.6.Case Study -- 17.Introduction to Survival Analysis -- 17.1.Background -- 17.2.Censoring, Delayed Entry, and Truncation -- 17.3.Notation, Survival, and Hazard Functions -- 17.4.Homogeneous Failure Time Distributions -- 17.5.Nonparametric Estimation of S and Λ -- 17.5.1.Kaplan--Meier Estimator -- 17.5.2.Altschuler--Nelson Estimator -- 17.6.Analysis of Multiple Endpoints -- 17.6.1.Competing Risks -- 17.6.2.Competing Dependent Risks -- 17.6.3.State Transitions and Multiple Types of Nonfatal Events --
Contents note continued: 17.6.4.Joint Analysis of Time and Severity of an Event -- 17.6.5.Analysis of Multiple Events -- 17.7.R Functions -- 17.8.Further Reading -- 17.9.Problems -- 18.Parametric Survival Models -- 18.1.Homogeneous Models (No Predictors) -- 18.1.1.Specific Models -- 18.1.2.Estimation -- 18.1.3.Assessment of Model Fit -- 18.2.Parametric Proportional Hazards Models -- 18.2.1.Model -- 18.2.2.Model Assumptions and Interpretation of Parameters -- 18.2.3.Hazard Ratio, Risk Ratio, and Risk Difference -- 18.2.4.Specific Models -- 18.2.5.Estimation -- 18.2.6.Assessment of Model Fit -- 18.3.Accelerated Failure Time Models -- 18.3.1.Model -- 18.3.2.Model Assumptions and Interpretation of Parameters -- 18.3.3.Specific Models -- 18.3.4.Estimation -- 18.3.5.Residuals -- 18.3.6.Assessment of Model Fit -- 18.3.7.Validating the Fitted Model -- 18.4.Buckley--James Regression Model -- 18.5.Design Formulations -- 18.6.Test Statistics -- 18.7.Quantifying Predictive Ability --
Contents note continued: 18.8.Time-Dependent Covariates -- 18.9.R Functions -- 18.10.Further Reading -- 18.11.Problems -- 19.Case Study in Parametric Survival Modeling and Model Approximation -- 19.1.Descriptive Statistics -- 19.2.Checking Adequacy of Log-Normal Accelerated Failure Time Model -- 19.3.Summarizing the Fitted Model -- 19.4.Internal Validation of the Fitted Model Using the Bootstrap -- 19.5.Approximating the Full Model -- 19.6.Problems -- 20.Cox Proportional Hazards Regression Model -- 20.1.Model -- 20.1.1.Preliminaries -- 20.1.2.Model Definition -- 20.1.3.Estimation of β -- 20.1.4.Model Assumptions and Interpretation of Parameters -- 20.1.5.Example -- 20.1.6.Design Formulations -- 20.1.7.Extending the Model by Stratification -- 20.2.Estimation of Survival Probability and Secondary Parameters -- 20.3.Sample Size Considerations -- 20.4.Test Statistics -- 20.5.Residuals -- 20.6.Assessment of Model Fit -- 20.6.1.Regression Assumptions --
Contents note continued: 20.6.2.Proportional Hazards Assumption -- 20.7.What to Do When PH Fails -- 20.8.Collinearity -- 20.9.Overly Influential Observations -- 20.10.Quantifying Predictive Ability -- 20.11.Validating the Fitted Model -- 20.11.1.Validation of Model Calibration -- 20.11.2.Validation of Discrimination and Other Statistical Indexes -- 20.12.Describing the Fitted Model -- 20.13.R Functions -- 20.14.Further Reading -- 21.Case Study in Cox Regression -- 21.1.Choosing the Number of Parameters and Fitting the Model -- 21.2.Checking Proportional Hazards -- 21.3.Testing Interactions -- 21.4.Describing Predictor Effects -- 21.5.Validating the Model -- 21.6.Presenting the Model -- 21.7.Problems
Summary This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem-solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap.
Bibliography Includes bibliographical references (pages 539-570) and index
Notes Also issued online
Subject Linear models (Statistics)
Regression analysis.
LC no. 2015942921
ISBN 3319194240
9783319194240