VARIABLE RELATIONSHIPS AND STATISTICAL ANALYSIS:
A GUIDE TO DATA ANALYSIS AND DATA DESK
How to Use:
Use this outline as a decision tree to find the appropriate analysis and how to perform it using Data Desk.
1. First decide among the single-numbered headings (i.e., 1, 2, or 3). Then under the chosen heading, find the appropriate double-numbered heading (e.g., 2.1), triple-numbered heading (e.g., 2.1.2), etc., until you have found the appropriate analysis.
2. Then follow the instructions under the chosen analysis for using Data Desk to perform the analysis.
Needed Concepts:
1. Y and X as names for variables (Y often considered the "dependent" or "response" variable and X the "independent" or "explanatory" variable)
2. categorical & quantitative variables
3. for categorical variables, how many different categories there are (also called "values" and "groups")
Notes:
1. Pages refer to Active Practice of Statistics text
2. Examples refer to variables included in the "final.dsk" datafile.
3. Analyses in italics have not been covered in this course and are provided for future reference. See the following for analyses not covered in course text: Glass, Gene V, & Hopkins, Kenneth D. (1984). Statistical methods in education and psychology (2nd ed.). Englewood Cliffs, NJ: Prentice-Hall.
1. Categorical Y Variable & Categorical X Variable
1.1. Contingency Tables (also called "cross-tabulations" and "two-way tables"); chi-square test of independence (pp. 314-322)
• e.g., relationship between Sex (male or female) and Religion (Protestant, Catholic, Jewish, etc.)
• Data Desk
• select categorical variables Y (e.g., Sex) and X (e.g., Religion) (order unimportant)
• Calc -> Contingency Tables
• set hypermenu options for χ² (chi-square) test and standardized residuals
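As a cross-check outside Data Desk, the chi-square statistic and standardized residuals can be sketched in plain Python. The counts below are hypothetical (not taken from final.dsk), and the residual definition (observed - expected) / sqrt(expected) is an assumption about what Data Desk reports.

```python
# Hypothetical counts, invented for illustration (not from final.dsk).
# Rows: Sex (male, female); columns: Religion (Protestant, Catholic, Jewish).
observed = [
    [20, 15, 5],
    [25, 10, 5],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Standardized residual per cell (assumed definition):
# (observed - expected) / sqrt(expected)
residuals = [
    [(o - e) / e ** 0.5 for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The chi-square statistic is the sum of the squared standardized residuals.
chi_square = sum(res ** 2 for row in residuals for res in row)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(f"chi-square = {chi_square:.3f}, df = {df}")
```

The statistic is then compared with a chi-square table at the printed degrees of freedom; cells with large absolute residuals contribute most to any departure from independence.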
2. Quantitative Y Variable(s) & Categorical X Variable(s) (comparing group means)
2.1. one quantitative Y variable & one categorical X variable with two categories (groups)
2.1.1. independent (not paired) categories (groups)
2.1.1.1. Interval Estimate of difference between 2 independent group means (pp. 277-281)
• e.g., relationship between Height (Y) and Sex (X)
• Data Desk
• select Y quantitative variable (e.g., Height) as Y and categorical (group) variable (e.g., Sex) as X
• Manip -> Split into Variables by Group
• select split quantitative variables (Y - X)
• Calc -> Estimate -> Two-sample t-Interval for µ1 - µ2
• select Confidence
• Show Results
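The interval Data Desk reports can be approximated by hand. The sketch below uses hypothetical heights (not from final.dsk), an unpooled standard error, and the conservative degrees of freedom min(n1, n2) - 1; Data Desk may pool variances or use a different df depending on its options, so the endpoints need not match exactly. The critical value is hard-coded from a t table.

```python
from statistics import mean, variance

# Hypothetical heights in inches, invented for illustration (not from final.dsk).
male = [70, 72, 68, 74, 71]
female = [64, 66, 63, 65, 67]

# Point estimate of µ1 - µ2
diff = mean(male) - mean(female)

# Unpooled standard error of the difference between the two means
se = (variance(male) / len(male) + variance(female) / len(female)) ** 0.5

# Assumption: 95% confidence with conservative df = min(n1, n2) - 1 = 4,
# for which a t table gives a critical value of 2.776.
t_star = 2.776

lower = diff - t_star * se
upper = diff + t_star * se
print(f"difference = {diff:.2f}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes zero suggests the two group means differ at the corresponding significance level.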
2.1.1.2. t-Test of difference between 2 independent group means (pp. 273-277)
• e.g., relationship between Height (Y) and Sex (X)
• Data Desk
• select Y quantitative variable (e.g., Height) as Y and categorical (group) variable (e.g., Sex) as X
• Manip -> Split into Variables by Group
• select split quantitative variables (Y - X)
• Calc -> Test -> 2-Sample t-Test of µ1 - µ2
• select Alpha Level, Ho, & Ha
• Show Results
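The test statistic behind this procedure can be sketched as follows, again with hypothetical heights (not from final.dsk). The sketch assumes Ho: µ1 - µ2 = 0, a two-sided Ha, alpha = .05, and an unpooled standard error; the Welch degrees of freedom are printed for reference, while the decision uses the conservative table value for df = 4. Data Desk's own choices for pooling and df may differ.

```python
from statistics import mean, variance

# Hypothetical heights in inches, invented for illustration (not from final.dsk).
male = [70, 72, 68, 74, 71]
female = [64, 66, 63, 65, 67]

# Per-group variance of the sample mean
v1 = variance(male) / len(male)
v2 = variance(female) / len(female)

# t statistic for Ho: µ1 - µ2 = 0, using the unpooled standard error
t_stat = (mean(male) - mean(female)) / (v1 + v2) ** 0.5

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df = (v1 + v2) ** 2 / (
    v1 ** 2 / (len(male) - 1) + v2 ** 2 / (len(female) - 1)
)

# Assumption: two-sided alpha = .05 with conservative df = 4 (table value 2.776)
t_crit = 2.776
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, Welch df = {welch_df:.1f}, reject Ho: {reject_h0}")
```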
2.1.2. dependent (paired) categories (groups)
2.1.2.1. Interval Estimate of difference between 2 dependent (paired) group means (pp. 252-255)
• e.g., relationship between Opinion (rated 1-10) and Topic (Y = Welfare; X = Foreign_Aid), with same individuals answering both questions
• Data Desk
• select quantitative variables measured on same units or pairs of units (e.g., Welfare, Foreign_Aid; Y-X)
• Calc -> Estimate -> Paired t-Interval for µ(1-2)
• select Confidence
• Show Results
2.1.2.2. t-Test of difference between 2 dependent (paired) group means (pp. 252-255)
• e.g., relationship between Opinion (rated 1-10) and Topic (Y = Welfare; X = Foreign_Aid), with same individuals answering both questions
• Data Desk
• select quantitative variables of each group (e.g., Welfare, Foreign_Aid; Y-X)
• Calc -> Test -> Paired t-Test of µ(1-2)
• select Alpha Level, Ho, & Ha
• Show Results
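Both paired procedures reduce to a one-sample analysis of the within-person differences. The sketch below uses hypothetical ratings (not from final.dsk) and assumes 95% confidence, Ho: mean difference = 0, and a two-sided Ha; the critical value for df = n - 1 = 5 is taken from a t table.

```python
from statistics import mean, stdev

# Hypothetical 1-10 ratings from the same six people (not from final.dsk).
welfare = [6, 7, 5, 8, 4, 7]
foreign_aid = [4, 5, 5, 6, 3, 4]

# Paired analysis works on the differences (Y - X) within each person
diffs = [w - f for w, f in zip(welfare, foreign_aid)]
n = len(diffs)

mean_diff = mean(diffs)
se = stdev(diffs) / n ** 0.5   # standard error of the mean difference
t_stat = mean_diff / se        # tests Ho: mean difference = 0, df = n - 1

# Assumption: 95% confidence / two-sided alpha = .05 with df = 5 (table value 2.571)
t_star = 2.571
lower = mean_diff - t_star * se
upper = mean_diff + t_star * se

print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f}, df = {n - 1}")
print(f"95% interval = ({lower:.3f}, {upper:.3f})")
```

Because the interval and the two-sided test use the same standard error and critical value, the test rejects Ho exactly when the interval excludes zero.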
2.2. one quantitative Y variable & one categorical X variable with three or more categories, with each category (group) independent
2.2.1. analysis of variance (ANOVA)
• e.g., relationship between Weight (Y) & Religion (Catholic, Protestant, Jewish, etc.)
• Data Desk
• select quantitative variable (e.g., Weight) as Y and categorical (group) variable (e.g., Religion) as X (should have more than two categories)
• Calc -> ANOVA -> ANOVA
• observe P-value for the categorical X variable (e.g., Religion)
• to see descriptive statistics for each group (e.g., each Religion) Calc -> Reports -> Summaries -> Reports by Groups
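The F statistic in the ANOVA table can be reproduced by hand from the between-group and within-group sums of squares. The weights below are hypothetical (not from final.dsk), and the group labels are only placeholders; the critical value is taken from an F table for alpha = .05 with (2, 6) degrees of freedom.

```python
from statistics import mean

# Hypothetical weights in pounds for three groups (not from final.dsk).
groups = {
    "Catholic": [150, 160, 170],
    "Protestant": [140, 150, 160],
    "Jewish": [170, 180, 190],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = mean(all_values)

# Between-group sum of squares: group size * squared distance of group mean
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
)

# Within-group sum of squares: squared distance of each value from its group mean
ss_within = sum(
    (v - mean(values)) ** 2 for values in groups.values() for v in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Assumption: alpha = .05; an F table gives 5.14 for (2, 6) degrees of freedom
f_crit = 5.14
print(f"F({df_between}, {df_within}) = {f_stat:.2f}, reject Ho: {f_stat > f_crit}")
```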
2.3. one quantitative Y variable & two or more (n) categorical X variables
2.3.1. n-way ANOVA
• e.g., relationship between BOTH [Religion (e.g., Protestant, Catholic, Jewish, etc.) & Sex (e.g., male, female)] & Weight
• Data Desk
• select quantitative variable as Y (e.g., Weight), select categorical variables as Xs (e.g., Religion & Sex)
• Calc -> ANOVA -> ANOVA or ANOVA With Interactions
• observe p-value for each X variable ("factor") and interactions
2.4. two or more quantitative Y variables & two or more categorical X variables
2.4.1. Multivariate ANOVA (MANOVA)
• e.g., relationship between BOTH [Religion (e.g., Protestant, Catholic, Jewish, etc.) & Sex (e.g., male, female)] & [Weight & Height]
• Data Desk (not included in student version of Data Desk; get full version, or use SAS, SPSS, SYSTAT or other full-featured statistical software)
3. Quantitative Y & Quantitative X
3.1. one quantitative Y variable and one quantitative X variable
3.1.1. Pearson product-moment correlation (pp. 84-88; 97-102)
• e.g., relationship between attitude toward Welfare & Foreign_Aid (measured on 10-point scales); interested in strength and direction of linear relationship
• Data Desk
• select X & Y variables (e.g., Welfare & Foreign_Aid; order unimportant)
• Calc -> Correlations -> Pearson Product-Moment
• observe and interpret correlation coefficient (-1 ≤ r ≤ 1)
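The coefficient itself is the sum of cross-products divided by the square root of the product of the two sums of squares. A minimal sketch, using hypothetical ratings (not from final.dsk):

```python
from statistics import mean

# Hypothetical attitude ratings for five people (not from final.dsk).
welfare = [1, 2, 3, 4, 5]
foreign_aid = [2, 4, 5, 4, 5]

mx, my = mean(welfare), mean(foreign_aid)

# Sums of squares and cross-products of deviations from the means
sxy = sum((x - mx) * (y - my) for x, y in zip(welfare, foreign_aid))
sxx = sum((x - mx) ** 2 for x in welfare)
syy = sum((y - my) ** 2 for y in foreign_aid)

# Pearson product-moment correlation
r = sxy / (sxx * syy) ** 0.5
print(f"r = {r:.4f}")
```

Swapping the two variables leaves r unchanged, which is why the selection order does not matter in Data Desk.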
3.1.2. Regression (pp. 92-102)
• e.g., relationship between attitude toward Welfare & Foreign_Aid (measured on 10-point scales); interested in regression line that predicts Y from X
• Data Desk
• select dependent (response) variable as Y (e.g., Foreign_Aid), independent (explanatory) variable as X (e.g., attitude toward Welfare)
• Calc -> Regression
• observe
• r-square (proportion of Y variance accounted for by X variance, and vice versa)
• Constant coefficient (y-intercept or A) and X-variable coefficient (slope or B)
• p-value of slope (to see if significantly different from zero; p < .05 rejects Ho that population slope β = 0)
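The quantities listed above (r-square, constant, slope, and the t statistic whose p-value tests the slope) can be computed by hand for a small data set. The ratings are hypothetical (not from final.dsk); the p-value itself needs a t distribution, so the sketch stops at the t statistic and its degrees of freedom.

```python
from statistics import mean

# Hypothetical ratings (not from final.dsk): X = Welfare, Y = Foreign_Aid.
welfare = [1, 2, 3, 4, 5]
foreign_aid = [2, 4, 5, 4, 5]
n = len(welfare)

mx, my = mean(welfare), mean(foreign_aid)
sxy = sum((x - mx) * (y - my) for x, y in zip(welfare, foreign_aid))
sxx = sum((x - mx) ** 2 for x in welfare)
syy = sum((y - my) ** 2 for y in foreign_aid)

slope = sxy / sxx                 # X-variable coefficient (B)
intercept = my - slope * mx       # Constant coefficient (A)

# Residual sum of squares and proportion of Y variance accounted for
ss_resid = sum(
    (y - (intercept + slope * x)) ** 2 for x, y in zip(welfare, foreign_aid)
)
r_square = 1 - ss_resid / syy

# Standard error of the slope and t statistic for Ho: population slope = 0
se_slope = (ss_resid / (n - 2) / sxx) ** 0.5
t_slope = slope / se_slope

print(f"Foreign_Aid = {intercept:.2f} + {slope:.2f} * Welfare")
print(f"r-square = {r_square:.2f}, t = {t_slope:.3f}, df = {n - 2}")
```

The t statistic is compared with a t table at n - 2 degrees of freedom, or converted to a p-value by statistical software.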
3.2. one quantitative Y variable & two or more quantitative X variables
3.2.1. Multiple Regression
• e.g., relationship of both Attitude toward UIUC (X1) & Attitude toward Clinton (X2) to Course_Attitude (Y)
• Data Desk
• select dependent (response) variable as Y (e.g., Course_Attitude) & independent (explanatory) variables as Xs (e.g., UIUC & Clinton)
• Calc -> Regression
• observe
• r-square
• Constant coefficient (y-intercept) and X-variable coefficients (slopes)
• p-values of slopes (to see if significantly different from zero; p < .05 rejects Ho that population slope β = 0)
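For readers curious about what the regression command computes, the coefficients are the least-squares solution of the normal equations (X'X)b = X'y. The sketch below solves them with a small hand-written Gaussian elimination; the ratings are hypothetical (not from final.dsk), and the helper name solve is introduced here for illustration only. Tests of the individual slopes are omitted because they need the inverse of X'X and a t distribution.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical 1-10 ratings for six students (not from final.dsk).
uiuc = [3, 5, 6, 8, 9, 4]
clinton = [7, 4, 6, 3, 8, 5]
course_attitude = [5, 6, 7, 7, 10, 5]

# Design matrix rows: constant term, X1, X2
rows = [[1.0, x1, x2] for x1, x2 in zip(uiuc, clinton)]
k = 3

# Normal equations: (X'X) b = X'y
xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * y for row, y in zip(rows, course_attitude)) for i in range(k)]
constant, b_uiuc, b_clinton = solve(xtx, xty)

fitted = [constant + b_uiuc * x1 + b_clinton * x2 for x1, x2 in zip(uiuc, clinton)]
residuals = [y - f for y, f in zip(course_attitude, fitted)]

mean_y = sum(course_attitude) / len(course_attitude)
ss_total = sum((y - mean_y) ** 2 for y in course_attitude)
r_square = 1 - sum(e ** 2 for e in residuals) / ss_total

print(f"constant = {constant:.3f}, UIUC = {b_uiuc:.3f}, Clinton = {b_clinton:.3f}")
print(f"r-square = {r_square:.3f}")
```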
3.3. two or more quantitative Y variables & two or more quantitative X variables
3.3.1. Canonical Correlation
• e.g., relationship between both Attitude toward [Psychics & Life_After_Death] & Attitude toward [Evolution & Abortion]
• Data Desk (not included; use SAS, SPSS, SYSTAT or other full-featured statistical software)
3.4. many quantitative variables, no clear distinction between Xs and Ys (i.e., dependent & independent variables)
3.4.1. Factor Analysis
• e.g., interrelationships among all attitude measures (e.g., Clinton, UIUC, Psychics, ETIQ, Life_after_death, Evolution, Abortion, Death_Penalty, Course_Attitude)
• Data Desk (not included in student version; use professional version of Data Desk or SAS, SPSS, SYSTAT or other full-featured statistical software)