
Indian Journal of Community Medicine

Application of Logistic Regression to Estimate Prognosis in Acute Myocardial Infarction

Author(s): S. V. Kakade (1), N. K. Tyagi (2), R. N. Kadam (1)

Vol. 31, No. 2 (2006-04 - 2006-06)


Abstract

Research Question: How useful is logistic regression for building a statistical model that predicts death after AMI from the demographic and clinical variables recorded on admission to hospital?

Objective: To develop a prognostic model and to compare the results of bivariate and multivariate analysis.

Study Design: Record-based study.

Setting: A teaching and referral hospital in Western Maharashtra.

Participants: Three hundred thirty-one subjects, including 110 (33.2%) who died during treatment of AMI.

Methods: Bivariate analysis, using the χ² test, was performed to assess the effect of each variable on the outcome variable (death or survival). Logistic regression analysis was performed to find the best set of prognostic variables. Odds ratios were obtained by both methods to compare the role of covariates in the prognosis of the outcome variable.

Results: Logistic regression analysis identified age, sex, residence, time gap in initiation of treatment, and hospital stay as the most significant variables in the prediction of death after AMI. A search for hidden associations revealed that sex, smoking and alcoholism were significantly associated with each other.

Conclusion: Early diagnosis and early treatment of AMI patients may enable health providers to give timely medical care and reduce loss of life. However, the model needs further probing because of multicollinearity.

Key Words: Chi-square test, Logistic regression, AMI.


Introduction

Logistic regression is one of the important methods for building statistical models in epidemiological and medical research [1]. It allows investigators to examine the relationship between a binary dependent variable and a set of continuous and/or discrete independent variables [2]. The analysis estimates regression coefficients, which are then converted into standardized (adjusted) odds ratios expressing each individual variable's effect on the outcome [1]. Interpretation in terms of the odds ratio is a key attraction of the logistic regression procedure [2].
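For reference, the standard binary logistic model behind this description can be written as below. The symbols p (probability of the outcome), x_j (independent variables), β_j (regression coefficients) and k (number of variables) are notation introduced here for illustration and do not appear in the article.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{j=1}^{k} \beta_j x_j ,
\qquad
p = \frac{1}{1 + e^{-\left(\beta_0 + \sum_{j=1}^{k} \beta_j x_j\right)}} ,
\qquad
\mathrm{OR}_j = e^{\beta_j}
```

Each OR_j compares the odds of the outcome in one category with the odds in its reference category, holding the other variables in the model fixed.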

To understand the utility and applicability of this method, it was decided to study the in-hospital outcome (survival or death) of cases admitted with acute myocardial infarction (AMI). AMI, a manifestation of cardiovascular or ischaemic heart disease (IHD), is rapidly increasing in India and other developing countries [3]. Cause-specific mortality data indicate that cardiovascular disease is an important contributor to mortality [4].

A steady decline in the mortality rate from AMI has been observed across several population groups since 1960. However, its development is still a fatal event in approximately one third of patients [5]. AMI causes death or disability in many who are still in the active years of life. Its personal and social costs are profound, both for the individuals and families involved and for the countries in which it is common [6]. Hence, the present study attempts to build a model of in-hospital prognosis after AMI using the patient's admission variables, to compare the results obtained by bivariate and multivariate analysis, and to recommend methods that may be useful for further analysis of the data.

Materials and Methods

The records (IPD case papers) of subjects admitted with a diagnosis of AMI to a teaching hospital in Western Maharashtra from April 1996 to March 1999 were reviewed. AMI patients with co-morbidities such as renal failure, pneumonia, neurological disease, jaundice, obstructive respiratory disease or acute gastroenteritis were excluded from the study because their number was not sufficient to draw valid conclusions about these entities. Thus data were collected from the records of 331 cases.

The demographic and risk variables recorded were: hospital discharge status (survived or died), age (years), sex, place of residence (rural/urban), time of onset of disease symptoms (24-hour clock), time elapsed before initiation of treatment (hours), hospital stay (hours), cardiovascular disease in the past (yes/no), diabetes (yes/no), hypertension (yes/no), smoking (yes/no) and alcoholism (no/old/occasional/chronic).

Bivariate analysis was performed to understand the effect of each variable on the outcome variable, hospital discharge status (survived or died). The differences were tested using the χ² test. Unadjusted odds ratios (OR) were computed, and for each variable the category with the least risk of the unwanted outcome (death) was identified for use as the reference category in the multivariate analysis.
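The bivariate step can be sketched in a few lines of Python. The example below uses the sex counts reported in Table I (males 164 survived and 70 died, females 57 survived and 40 died). The article does not say how its confidence intervals were computed; the Woolf (log-based) interval and the uncorrected Pearson χ² used here are assumptions, and the function names are hypothetical.

```python
import math

def odds_ratio_2x2(died_exposed, survived_exposed, died_ref, survived_ref, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    The Woolf interval is an assumption; the article does not name its method.
    """
    odds_ratio = (died_exposed / survived_exposed) / (died_ref / survived_ref)
    se_log = math.sqrt(1 / died_exposed + 1 / survived_exposed
                       + 1 / died_ref + 1 / survived_ref)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Sex rows of Table I: females are compared with males (reference category).
or_sex, lo_sex, hi_sex = odds_ratio_2x2(died_exposed=40, survived_exposed=57,
                                        died_ref=70, survived_ref=164)
chi2_sex = pearson_chi2_2x2(164, 70, 57, 40)

print(f"OR = {or_sex:.2f} ({lo_sex:.2f}-{hi_sex:.2f}), chi-square = {chi2_sex:.2f}")
```

With one degree of freedom, a χ² value above 3.841 corresponds to P < 0.05, which is how the sex difference is reported in the Results. Small differences from the tabulated interval are expected if the authors used a different interval method.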

The probability of death as the prognostic outcome and the standardized odds ratios were estimated by binary logistic regression analysis with forward selection. For each independent variable, the category with the lowest proportion of mortality was taken as the reference category for computing the adjusted odds ratio.

The same data set was used to validate the logistic regression model. When the probability computed from the model was > 0.5, the individual was predicted (expected) to die in hospital during treatment. The sensitivity (the proportion of observed deaths correctly predicted by the model) and specificity (the proportion of observed survivals correctly predicted by the model) were estimated. The prognostic model with the best predictive set of variables was chosen on the basis of a balance between a high R² (the coefficient of determination) and a low -2 log likelihood. The appropriate model was obtained at the 5th step.
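The validation rule described above can be sketched as follows. The probabilities and outcomes in the example are toy values invented for illustration, not study data, and the function names are hypothetical.

```python
def classify(probabilities, cutoff=0.5):
    """Predict death (1) when the model probability exceeds the cutoff, else survival (0)."""
    return [1 if p > cutoff else 0 for p in probabilities]

def sensitivity_specificity(observed, predicted):
    """Return (sensitivity, specificity) for 0/1 outcomes, where 1 means death."""
    tp = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 1)
    fn = sum(1 for o, p in zip(observed, predicted) if o == 1 and p == 0)
    tn = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 0)
    fp = sum(1 for o, p in zip(observed, predicted) if o == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

# Toy data (invented): 1 = died, 0 = survived.
observed = [1, 1, 1, 0, 0, 0, 0, 1]
model_probabilities = [0.9, 0.7, 0.4, 0.2, 0.6, 0.1, 0.3, 0.8]

predicted = classify(model_probabilities)
sensitivity, specificity = sensitivity_specificity(observed, predicted)
print(f"sensitivity = {sensitivity:.2f}, specificity = {specificity:.2f}")
```

Because the model is validated on the data used to fit it, estimates obtained this way tend to be optimistic compared with validation on an independent sample.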

The adjusted odds ratio (OR), i.e. the ratio of the odds of death in a category to the odds of death in the reference category of the same variable, was computed for each variable in the recommended model as e^β, where β is the regression coefficient, to study the magnitude of the impact of each risk category. The adjusted and unadjusted ORs of the risk category of each variable were compared. This comparison made it possible to study the role of covariates in the prognosis of the outcome variable after standardizing for the effect of the other variables.
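As a small worked example of the e^β conversion, the sketch below turns a coefficient and its standard error into an odds ratio with a Wald 95% confidence interval. The inputs are the values reported for females in Table III (B = 1.123, SE = 0.438). The Wald interval with z = 1.96 is an assumption about how the tabulated intervals were obtained, and the function name is hypothetical.

```python
import math

def odds_ratio_from_coefficient(beta, se, z=1.96):
    """Adjusted odds ratio e^beta with a Wald confidence interval e^(beta +/- z*se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Female vs male (reference), coefficient and standard error as reported in Table III.
or_female, lower_female, upper_female = odds_ratio_from_coefficient(1.123, 0.438)
print(f"OR = {or_female:.2f} ({lower_female:.2f}-{upper_female:.2f})")
```

The result is close to the 3.07 (1.30-7.26) reported in Table III; differences in the last digit can arise because the published B and SE are themselves rounded.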


Results

Three hundred thirty-one consecutive subjects, including 110 (33.2%) who died during treatment of acute myocardial infarction (AMI), were analyzed with respect to age, sex, place of residence, time of onset of disease, time taken to initiate treatment, length of hospital stay, alcoholism and smoking habits, and history of cardiovascular disease, hypertension and diabetes.

Table I shows that advancing age (p<0.01) and female sex (p<0.05) were the significant demographic characteristics associated with increased mortality due to AMI. A longer stay in hospital for medical care was associated with significantly higher survival (p<0.001). Place of residence, time of onset of disease symptoms and time elapsed before initiation of treatment did not show a significant association with prognosis. Although mortality was higher among urban residents and among those whose symptoms began between 4.00 PM and 8.00 AM, the differences between those who survived and those who died were not statistically significant. Delay in initiation of treatment did not show any significant increase in mortality.

Table II shows that the presence of diabetes increased the risk of death 1.99 times (p<0.05), whereas hypertension increased the risk of mortality due to AMI 1.72 times, with borderline significance (significant by Fisher's exact test at p=0.031). Unexpectedly, mortality was significantly higher among non-smokers (p<0.05). Similarly, mortality was lower among alcoholics than among non-alcoholics. A history of cardiovascular disease did not emerge as a disadvantage, contrary to what is generally believed.

Table I: Bivariate Analysis of Demographic Variables

Variable                      Subjects  Survived (%)  Died (%)      OR (95% CI)
Total                         331       221 (66.8)    110 (33.2)
Age (years)**
  <50 (R)                     66        55 (83.3)     11 (16.7)     1.00
  50-                         182       124 (68.1)    58 (31.9)     2.34 (1.13-4.85)
  70+                         83        42 (50.6)     41 (49.4)     4.88 (2.24-10.74)
Sex*
  Male (R)                    234       164 (70.1)    70 (29.9)     1.00
  Female                      97        57 (58.8)     40 (41.2)     1.64 (1.00-2.66)
Place of residence
  Rural (R)                   270       182 (67.4)    88 (32.6)     1.00
  Urban                       61        39 (63.9)     22 (36.1)     1.17 (0.65-2.11)
Time of onset of disease symptoms (24-hour clock)
  00-8.00                     105       67 (63.8)     38 (36.2)     1.88 (0.94-3.74)
  8.00-16.00 (R)              69        53 (76.8)     16 (23.2)     1.00
  16.00-24.00                 71        46 (64.8)     25 (35.2)     1.8 (0.86-3.78)
  Not available               86        55 (64.0)     31 (36.0)     1.87 (0.92-3.82)
Time elapsed in treatment (hrs.)
  0- (R)                      149       103 (69.1)    46 (30.9)     1.00
  6-                          41        27 (65.9)     14 (34.1)     1.16 (0.56-2.41)
  12-                         28        19 (67.9)     9 (32.1)      1.06 (0.44-2.53)
  24-                         113       72 (63.7)     41 (36.3)     1.28 (0.76-2.16)
Hospital stay (hrs.)***
  <48                         90        11 (12.2)     79 (87.8)     88.43 (39.2-198.3)
  48-                         28        13 (46.4)     15 (53.6)     14.21 (5.75-34.81)
  96- (R)                     213       197 (92.5)    16 (7.5)      1.00

R - Reference category, * - P < 0.05, ** - P < 0.01, *** - P < 0.001

Table II: Bivariate Analysis of Risk Variables

Variable                      Subjects  Survived (%)  Died (%)      OR (95% CI)
CVD in past
  No history (R)              121       85 (70.2)     36 (29.8)     1.00
  History                     62        42 (67.7)     20 (32.3)     1.12 (0.58-2.16)
  NA                          148       94 (63.5)     54 (36.5)     1.36 (0.82-2.27)
Diabetes
  Absent (R)                  273       190 (69.6)    83 (30.4)     1.00
  Present                     58        31 (53.4)     27 (46.6)     1.99 (1.12-3.56)
Hypertension
  Absent (R)                  264       183 (69.3)    81 (30.7)     1.00
  Present                     67        38 (56.7)     29 (43.3)     1.72 (0.99-2.97)
Smoking
  Non-smoker                  240       151 (62.9)    89 (37.1)     6.19 (1.41-26.95)
  Smoker (R)                  23        21 (91.3)     2 (8.7)       1.00
  Tobacco                     56        41 (73.2)     15 (26.8)     3.84 (0.81-18.47)
  Smoker + Tobacco            12        8 (66.7)      4 (33.3)      5.25 (0.80-34.56)
Alcoholism
  No                          277       180 (65.0)    97 (35.0)     5.93 (0.75-46.62)
  Old                         7         3 (42.9)      4 (57.1)      14.67 (1.17-186.06)
  Occasional (R)              12        11 (91.7)     1 (8.3)       1.00
  Chronic                     35        27 (77.1)     8 (22.9)      3.3 (0.37-29.49)

R - Reference category, * - P = 0.06, ** - P < 0.05, $ - P = 0.051; Fisher Exact P = 0.036, NA - Information not available.

Table III shows the results of the logistic regression model. Age, sex, place of residence, time elapsed before initiation of treatment and hospital stay were the significant variables in estimating the prognosis of death. The sensitivity of the model (its ability to correctly detect deaths) and its specificity (its ability to correctly detect survivals) were 79.1% and 94.1% respectively at a cut-off probability of death greater than or equal to 0.5. The overall correct predictive ability was 89.1%, and the variation explained by the model (R²) was 73.2%. Further, increasing age, female sex, urban residence and delay in admission became more prominent after multivariate logistic regression analysis, since the odds ratios (OR) for all of these variables increased considerably compared with the bivariate analysis.

Table III: Logistic Regression Model for Prognosis as Death

Variable                      B         SE (B)    Sig. (P)  OR (95% CI)
Age (years)
  <50 (R)                     0.00      --        --        1.00
  50-                         1.950     0.659     0.003     7.03 (1.93-25.57)
  70+                         2.969     0.740     0.000     19.47 (4.57-82.99)
Sex
  Male (R)                    0.00      --        --        1.00
  Female                      1.123     0.438     0.010     3.07 (1.30-7.26)
Place of residence
  Rural (R)                   0.00      --        --        1.00
  Urban                       0.997     0.498     0.045     2.71 (1.02-7.20)
Time gap in treatment (hrs.)
  0- (R)                      0.00      --        --        1.00
  6-                          1.402     0.712     0.049     4.06 (1.01-16.39)
  12-                         1.576     0.791     0.047     4.83 (1.03-22.80)
  24+                         1.809     0.529     0.001     6.10 (1.17-17.20)
Hospital stay (hrs.)
  <48                         6.079     0.668     0.001     436.71 (117.83-1618.57)
  48-                         2.842     0.539     0.001     17.15 (5.96-49.33)
  96- (R)                     0.00      --        --        1.00
Constant                      -6.507    0.897     0.001     --

R - Reference category
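To illustrate how the fitted model yields a prognosis, the sketch below combines the constant and the B coefficients of Table III into a predicted probability of in-hospital death. The dictionary layout, the category labels used as keys and the function name are hypothetical choices made for this example; only the numeric coefficients come from the table.

```python
import math

# B coefficients as reported in Table III; reference categories have B = 0.
COEFFICIENTS = {
    "age": {"<50": 0.0, "50-": 1.950, "70+": 2.969},
    "sex": {"male": 0.0, "female": 1.123},
    "residence": {"rural": 0.0, "urban": 0.997},
    "time_gap_hrs": {"0-": 0.0, "6-": 1.402, "12-": 1.576, "24+": 1.809},
    "hospital_stay_hrs": {"<48": 6.079, "48-": 2.842, "96-": 0.0},
}
CONSTANT = -6.507

def death_probability(patient):
    """Predicted probability of death: 1 / (1 + e^-(constant + sum of B))."""
    logit = CONSTANT + sum(COEFFICIENTS[var][level] for var, level in patient.items())
    return 1.0 / (1.0 + math.exp(-logit))

# Patient in every reference category (lowest modelled risk).
low_risk = {"age": "<50", "sex": "male", "residence": "rural",
            "time_gap_hrs": "0-", "hospital_stay_hrs": "96-"}
# Patient in the highest-risk category of every variable.
high_risk = {"age": "70+", "sex": "female", "residence": "urban",
             "time_gap_hrs": "24+", "hospital_stay_hrs": "<48"}

p_low = death_probability(low_risk)
p_high = death_probability(high_risk)
print(f"low-risk profile: {p_low:.4f}, high-risk profile: {p_high:.4f}")
```

Under the > 0.5 rule described in the Methods, the first profile would be predicted to survive and the second to die. Note that length of hospital stay is only known at discharge or death, so a prediction of this kind is retrospective rather than available on admission.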

Alcoholism, smoking, diabetes and hypertension showed a significant effect on mortality in the bivariate analysis, but these effects disappeared in the multivariate analysis. Hence, a search was made for hidden associations among these variables so that an appropriate model could be developed.

Table IV shows that sex, smoking and alcoholism had positive and significant associations with each other. Further, sex has only two groups, which makes it more prominent in the model. Hence, the model needs further probing because of multicollinearity.

Table IV: Association Amongst Sex, Smoking and Alcoholism

Variables                     df        χ²        Contingency coefficient (P)
Sex * Smoking                 1         18.4      0.23 (P<0.001)
Sex * Alcoholism              3         26.8      0.27 (P<0.001)
Smoking * Alcoholism          3         26.5      0.27 (P<0.001)
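The contingency values in Table IV are consistent with Pearson's contingency coefficient, C = sqrt(χ² / (χ² + N)), computed with N = 331. The article does not name the coefficient, so this identification is an assumption; the short check below simply shows that the formula reproduces the tabulated values.

```python
import math

def contingency_coefficient(chi2, n):
    """Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n))."""
    return math.sqrt(chi2 / (chi2 + n))

N_SUBJECTS = 331  # total number of subjects in the study

# Chi-square values as reported in Table IV.
reported_chi2 = {
    "Sex * Smoking": 18.4,
    "Sex * Alcoholism": 26.8,
    "Smoking * Alcoholism": 26.5,
}

coefficients = {pair: contingency_coefficient(chi2, N_SUBJECTS)
                for pair, chi2 in reported_chi2.items()}
for pair, c in coefficients.items():
    print(f"{pair}: C = {c:.2f}")
```

Values of this size indicate associations that are statistically significant but modest in strength.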


Discussion

The mortality of 33.2% observed in the present study was higher than that reported by other workers [7-14]. This may be due to delay in admission, the predominantly rural population, and similar factors. However, 50% of AMI deaths occur within one hour of the event [15]. The observed high mortality may also reflect patients who reached the hospital during this period and then died. Length of hospital stay was the most significant prognostic factor influencing the outcome. Chances of survival increased with increasing length of hospital stay. The mortality in the early period may be due to older age. The odds ratios (OR) for a short stay, i.e. less than 48 hours and 48 to 96 hours, showed that the odds of death were 436.71 and 17.15 times, respectively, those of patients who stayed in hospital for 96 hours or more. These ORs were much higher than the bivariate ORs.

Age and time gap in initiation of treatment were the next most significant prognostic factors. The positive and increasing regression coefficients indicated that the probability of death in hospital increased with age and with the time gap in initiation of treatment [16].

The finding that females are at higher risk of death in hospital after AMI than males [17,18] is not observed in all studies [16]. This indicates that including sex in the prognostic model may give an improper prognosis.

Urban residents were at higher risk of death after AMI than rural residents [16], with an odds ratio of 2.71; this may be due to urban lifestyle.

The logistic regression analysis made it possible to identify the five most important prognostic factors in the prediction of death in hospital after AMI.

The analysis suggests that early diagnosis and early treatment of AMI patients may enable health care providers to give timely medical care and reduce loss of life. Further, rural residence, male sex and younger age were associated with better survival after AMI.

Logistic regression analysis did not select some of the expected risk factors as prognostic variables in the model. This may be due to multicollinearity, which often arises when some of the explanatory variables are highly correlated with one another. Multicollinearity is a feature of the explanatory variables and is independent of the values of the dependent (outcome) variable. Deleting any one of the correlated variables does not affect the prediction of the outcome. A number of interrelated predictor variables also makes it difficult to sort out the meaning of the various regression coefficients [19]. These difficulties may be resolved or simplified by additionally using other statistical methods, such as Principal Component Analysis.

Overall, logistic regression was found useful for reducing a large number of risk variables to a small set of prognostic variables. The method is also useful for determining the set of prognostic variables that could predict the outcome with high reliability. An outcome based on a small set of prognostic variables could support the clinician's judgment and help in managing services accordingly. Studies that would otherwise require data on a large number of variables may instead collect data on a small set of variables determined by applying logistic regression to the data of a pilot study. In this study, obesity, diet and exercise were not studied, as they were not recorded in the case papers.


References

  1. John Concato, Alvan R. Feinstein, Theodore R. Holford. The risk of determining risk with multivariable models. Annals of Internal Medicine 1993;118:201-210.
  2. Ronald N. Forthofer, Eun Sul Lee. Introduction to Biostatistics: A Guide to Design, Analysis and Discovery. Academic Press publication.
  3. Bhat KSS, Understanding ischemic heart disease. Health Administrator 1988; 9:28-30.
  4. Reddy KS, Cardiovascular disease in India. World Health Statist. Quart. 1993; 46:101-107.
  5. Elliott H, Antman M, Braunwald E. Acute Myocardial Infarction. In-Eugene (Volume 2) 1997. W B Saunders Company, Philadelphia.
  6. WHO Prevention of coronary heart disease, TRS No. 678, 1982.
  7. Gil M, Marrugat Sala J, et al. Relation of therapeutic improvements and 28-day case fatality in patients hospitalized with acute myocardial infarction between 1978 and 1993 in the REGICOR study, Gerona, Spain. Circulation 1999; 99:1767- 1773.
  8. Charles Maynard, Nathan R Every, Jenny S Martin, et al. Association of gender and survival in patients with acute myocardial infarction Arch intern Med. 1997; 157:1379-1384.
  9. Ross T Tsuyuki, Koon Ronald M Ikuta, et al. Mortality risk and patterns of practice in 2070 patients with acute myocardial infarction, 1987-92;105:1687-1692.
  10. Khaldoon A Al-Roomi, Abdulrahman O Masaiger, and Abdul-Hai Al-Awadi. Lifestyle and the risk of acute myocardial infarction in a Gulf Arab population. International Journal of Epidemiology 1994; 23:931-939.
  11. Hector Bueno, Teresa Violan, Aureliano Almazan, et al. Influence of sex on the short-term outcome of elderly patients with a first acute myocardial infarction.
  12. William L Hughes, John M Kalbfleisch, Edward N Brandt, et al. Myocardial infarction prognosis by discriminant analysis. Arch Intern Med 1963; 111:338-345.
  13. Geoffrey H Tofler, James E Muller, Peter H Stone, et al. Factors leading to shorter survival after acute myocardial infarction in patients ages 65 to 75 years compared with younger patients. Am J Cardiol 1988; 62:860-867.
  14. Jean L Rouleau, Louise Potvin, Wayne Warnica, et al. Myocardial infarction patients in the 1990s- Their risk factors, stratification and survival in Canada: The Canadian assessment of myocardial infarction [CAMI] study. J Am Coll Cardiol 1996; 27:1119-1927.
  15. Elliot M Antman & Eugene Braunwald. Acute Myocardial Infarction. In: Eugene Braunwald, ed, Heart Disease: A Textbook of Cardiovascular Medicine (Volume 2). 5th Edition. W B Saunders Company, 1997.
  16. Joel C Kleinman, Victor G DeGruttola, et al. Regional and urban-suburban differentials in cardiovascular heart disease mortality and risk factor prevalence. J Chron Dis 1981;34:11-19.
  17. Jiang He, Michael J Klang, et al. Short and long term prognosis after acute myocardial infarction in Chinese men and women. Am J Epidemiol 1994;139:693-703.
  18. Younger N, Cooper R, et al. Cardiovascular risk factors and mortality in Jamaica: significant sexual dimorphism. Abstract No. 172, Abstracts of the 2001 Congress of Epidemiology. In: Am J Epidemiol 2001.
  19. Armitage P, Berry G. Statistical Methods in Medical Research (Third Edition). Blackwell Scientific Publications, 1994.

1. Krishna Institute of Medical Sciences, Karad, Maharashtra.
2. Mahatma Gandhi Institute of Medical Sciences, Sevagram, Maharashtra.
Received: 16-1-2003
