CORRELATION
DATA ANALYSIS
Group 3
Content
1. Pearson’s product moment correlation
2. Spearman rank-order correlation (Rho)
3. Phi coefficient
4. Point biserial correlation
Types of Correlation Coefficients
Which formula should I use?

Correlation coefficient      Types of scales
Pearson’s product moment     Both scales interval
Spearman rank-order          Both scales ordinal
Phi                          Both scales nominal
Point biserial               One interval, one nominal
1. Pearson’s product moment correlation
Pearson's correlation coefficient, when applied to a population, is
commonly represented by the Greek letter ρ (rho) and may be
referred to as the population correlation coefficient or
the population Pearson correlation coefficient. When applied to a sample, it is denoted r.
The formula for r is:
$$ r = \frac{\operatorname{Cov}(X, Y)}{S(x)\,S(y)} $$
Cov(X, Y): the covariance of X and Y
S(x), S(y): the standard deviations of X and Y
• The Mean is the average of the numbers.
• The Standard Deviation is the square root of the Variance.
E.g. The following data relate to the number of hours spent studying
and the number of correct answers.
• The Mean of the hours studied is:
$$ \text{Mean} = \frac{0+1+2+3+5+5+6}{7} = \frac{22}{7} \approx 3.143 $$
• Now we calculate each score's difference from the Mean (3.143).
  The differences are: −3.143, −2.143, −1.143, −0.143, 1.857, 1.857, 2.857.
• The Variance is the mean of the squared differences:
$$ \sigma^2 = \frac{(-3.143)^2 + (-2.143)^2 + (-1.143)^2 + (-0.143)^2 + 1.857^2 + 1.857^2 + 2.857^2}{7} = \frac{30.857}{7} \approx 4.408 $$
• And the Standard Deviation is the square root of the Variance:
$$ \sigma = \sqrt{4.408} \approx 2.10 \approx 2 \text{ (to the nearest score)} $$
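The mean, variance and standard deviation of the hours-studied scores can be checked with a short Python sketch. It assumes the population form (dividing by n, as on the slide); the function names are illustrative only.

```python
import math

# Hours studied, taken from the example on the slide.
hours = [0, 1, 2, 3, 5, 5, 6]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def population_variance(values):
    """Mean of the squared differences from the mean (divides by n)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def population_sd(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

print(round(mean(hours), 3))                 # 3.143
print(round(population_variance(hours), 3))  # 4.408
print(round(population_sd(hours), 2))        # 2.1
```

Dividing by n − 1 instead of n would give the sample variance, which is slightly larger for the same data.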
• If working with raw data, the Pearson product moment
correlation formula is as follows:
$$ r = \frac{n\sum XY - (\sum X)(\sum Y)}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} $$
n: the number of (X, Y) pairs
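A minimal Python sketch of the raw-score calculation, assuming the standard raw-score form r = (nΣXY − ΣXΣY) / √([nΣX² − (ΣX)²][nΣY² − (ΣY)²]). The hours come from the slides; the `correct` scores are hypothetical values made up for illustration, because the original data table is not available.

```python
import math

def pearson_r(xs, ys):
    """Raw-score Pearson r for two paired lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must be paired and hold at least two values")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when a variable has zero variance")
    return numerator / denominator

hours = [0, 1, 2, 3, 5, 5, 6]    # hours studied, from the slides
correct = [1, 2, 2, 4, 5, 6, 7]  # HYPOTHETICAL scores, not from the slides

print(round(pearson_r(hours, correct), 3))  # 0.977
```

Only the five sums and n are needed, which is why this form is convenient for hand calculation.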
E.g.
The Pearson correlation coefficient r is:
[worked calculation not recoverable: it appeared as an image on the original slide]
→ Conclusion: There is a strong, positive correlation between X and
Y. As X increases, Y tends to increase.
Exercise
Find Pearson's coefficient of correlation between the price of
studying facilities and demand from the following data. Then state
your conclusion about their relationship.
2. Spearman rank-order correlation (Rho)
- A measure of the strength and direction of the association
between two ranked variables measured on an ordinal scale.
- Denoted by the symbol r_s (or the Greek letter ρ, pronounced rho), with −1 ≤ ρ ≤ 1.
Assumptions
- The two variables are ordinal, interval or ratio.
- There is a monotonic relationship between the two variables.
- Ranking data
• The score with the highest value is ranked 1, the next highest 2,
and so on down to the lowest score.

English (mark)   Math (mark)
56               66
75               70
45               40
71               60
62               65
64               56
58               59
80               77
76               67
61               63
English (mark)   Math (mark)   English rank (X)   Math rank (Y)
56               66            9                  4
75               70            3                  2
45               40            10                 10
71               60            4                  7
62               65            6                  5
64               56            5                  9
58               59            8                  8
80               77            1                  1
76               67            2                  3
61               63            7                  6
- Ranking data with ties
• The score with the highest value is ranked 1, the next highest 2,
and so on down to the lowest score.
• When two or more values in the data are identical, each receives
the average of the ranks they would otherwise occupy. Here the
English mark 61 appears twice, in positions 6 and 7, so both are
ranked (6 + 7) / 2 = 6.5.

English (mark)   Math (mark)
56               66
75               70
45               40
71               60
61               65
64               56
58               59
80               77
76               67
61               63
English (mark)   Math (mark)   English rank (X)   Math rank (Y)
56               66            9                  4
75               70            3                  2
45               40            10                 10
71               60            4                  7
61               65            6.5                5
64               56            5                  9
58               59            8                  8
80               77            1                  1
76               67            2                  3
61               63            6.5                6
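The ranking rule (highest score ranked 1, tied scores sharing the average of their ranks) can be sketched in Python as follows. The function name `rank_descending` is illustrative; the marks are those of the tied example.

```python
def rank_descending(scores):
    """Rank scores so the highest gets rank 1; ties share the average rank."""
    ordered = sorted(scores, reverse=True)
    # Collect the 1-based positions each distinct value occupies when sorted.
    positions = {}
    for position, value in enumerate(ordered, start=1):
        positions.setdefault(value, []).append(position)
    # A tied value gets the mean of the positions it occupies.
    average_rank = {value: sum(p) / len(p) for value, p in positions.items()}
    return [average_rank[score] for score in scores]

english = [56, 75, 45, 71, 61, 64, 58, 80, 76, 61]
math_marks = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

print(rank_descending(english))
# [9.0, 3.0, 10.0, 4.0, 6.5, 5.0, 8.0, 1.0, 2.0, 6.5]
print(rank_descending(math_marks))
# [4.0, 2.0, 10.0, 7.0, 5.0, 9.0, 8.0, 1.0, 3.0, 6.0]
```

With no ties, each value occupies a single position, so the same function returns whole-number ranks.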
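- Choosing the right formula
(1) Your data does NOT have tied ranks:
$$ \rho = 1 - \frac{6\sum (X - Y)^2}{n(n^2 - 1)} $$
(2) Your data has tied ranks:
$$ \rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} $$
X, Y: the ranks on the two variables; n: the number of pairs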
E.g.1.

English (mark)   Math (mark)   English rank (X)   Math rank (Y)   (X − Y)²
56               66            9                  4               25
75               70            3                  2               1
45               40            10                 10              0
71               60            4                  7               9
62               65            6                  5               1
64               56            5                  9               16
58               59            8                  8               0
80               77            1                  1               0
76               67            2                  3               1
61               63            7                  6               1
Total                                                             54

$$ \rho = 1 - \frac{6\sum (X - Y)^2}{n(n^2 - 1)} = 1 - \frac{6 \times 54}{10(10^2 - 1)} \approx 0.673 $$
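As a check on the no-ties calculation, here is a small Python sketch of ρ = 1 − 6Σ(X − Y)² / (n(n² − 1)). The English ranks are recomputed from the marks (62 is the 6th highest mark, 61 the 7th); the function name is illustrative.

```python
def spearman_no_ties(rank_x, rank_y):
    """Spearman's rho from two lists of untied ranks."""
    n = len(rank_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Ranks for marks 56, 75, 45, 71, 62, 64, 58, 80, 76, 61 (English)
english_rank = [9, 3, 10, 4, 6, 5, 8, 1, 2, 7]
# Ranks for marks 66, 70, 40, 60, 65, 56, 59, 77, 67, 63 (Math)
math_rank = [4, 2, 10, 7, 5, 9, 8, 1, 3, 6]

print(round(spearman_no_ties(english_rank, math_rank), 3))  # 0.673
```

This shortcut is only exact when no ranks are tied; with ties, the second formula should be used.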
$$ \rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} $$

English rank (X)   Math rank (Y)   X²       Y²     XY
9                  4               81       16     36
3                  2               9        4      6
10                 10              100      100    100
4                  7               16       49     28
6.5                5               42.25    25     32.5
5                  9               25       81     45
8                  8               64       64     64
1                  1               1        1      1
2                  3               4        9      6
6.5                6               42.25    36     39
Total: 55          55              384.5    385    357.5

ΣX = 55, ΣY = 55, ΣX² = 384.5, ΣY² = 385, ΣXY = 357.5
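E.g.2.
$$ \rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} = \frac{357.5 - \frac{55 \times 55}{10}}{\sqrt{\left(384.5 - \frac{55^2}{10}\right)\left(385 - \frac{55^2}{10}\right)}} = \frac{55}{\sqrt{82 \times 82.5}} \approx 0.669 $$
→ There was a strong, positive correlation
between English and Math marks.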
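The tied-ranks formula can be sketched in Python in the same way. The function name is illustrative; the ranks are those of the tied example, where the two English marks of 61 share rank 6.5.

```python
import math

def spearman_with_ties(rank_x, rank_y):
    """Spearman's rho via the sums-of-products formula (valid with tied ranks)."""
    n = len(rank_x)
    sum_x, sum_y = sum(rank_x), sum(rank_y)
    sum_xy = sum(x * y for x, y in zip(rank_x, rank_y))
    sum_x2 = sum(x * x for x in rank_x)
    sum_y2 = sum(y * y for y in rank_y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
    return numerator / denominator

english_rank = [9, 3, 10, 4, 6.5, 5, 8, 1, 2, 6.5]
math_rank = [4, 2, 10, 7, 5, 9, 8, 1, 3, 6]

print(round(spearman_with_ties(english_rank, math_rank), 3))  # 0.669
```

This is Pearson's formula applied to the ranks, so it also gives the correct value when there are no ties.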
3. Phi coefficient
A. Definition
B. Formula
C. Example
D. Steps
A. Definition
- The Phi (ϕ) statistic is used when both of the nominal variables
are dichotomous.
- The obtained value for Phi suggests the relationship between the
two variables.
B. Formula
For two dichotomous variables arranged in a 2 × 2 table:

             VARIABLE Y
VARIABLE X   A      B      A+B
             C      D      C+D
             A+C    B+D

$$ \phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} $$
C. Example
E.g. A class of 50 students (Ss) is asked whether they like using the language
lab. The answer is either yes or no. The Ss are from either Japan or
Iran.
The observed values:

        Japan   Iran   Total
Yes     24      8      32
No      6       12     18
Total   30      20     50

Then:
$$ \phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} = \frac{(24)(12) - (8)(6)}{\sqrt{(32)(18)(30)(20)}} = \frac{240}{\sqrt{345600}} = \frac{240}{587.88} \approx 0.41 $$
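A minimal Python sketch of the Phi calculation for a 2 × 2 table with cells A, B (top row) and C, D (bottom row). It assumes the AD − BC ordering in the numerator; with the opposite ordering only the sign of the result flips. The function name is illustrative.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2 x 2 table [[a, b], [c, d]], using AD - BC in the numerator."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return numerator / denominator

# Language lab example: Yes row = (24, 8), No row = (6, 12)
print(round(phi_coefficient(24, 8, 6, 12), 2))  # 0.41
```

Because both variables are nominal, the sign depends only on how the categories are ordered in the table; the magnitude is what indicates the strength of the association.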
D. Steps
D.1. Using the suggested interpretations of measures
of association
1. State the null hypothesis.
2. Determine the Phi coefficient.
3. Use the suggested table to state the conclusion.
Suggested Interpretations of Measures of Association
Values Appropriate Phrases
+.70 or higher Very strong positive relationship.
+.50 to +.69 Substantial positive relationship.
+.30 to +.49 Moderate positive relationship.
+.10 to +.29 Low positive relationship.
+.01 to +.09 Negligible positive relationship.
0.00 No relationship.
-.01 to -.09 Negligible negative relationship.
-.10 to -.29 Low negative relationship.
-.30 to -.49 Moderate negative relationship.
-.50 to -.69 Substantial negative relationship.
-.70 or lower Very strong negative relationship.
Source: Adapted from James A. Davis, Elementary Survey Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1971, 49.
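The table lookup can be sketched as a small Python function. It assumes the coefficient is rounded to two decimal places before it is matched against the bands, since the table lists its values to two decimals; the function name is illustrative.

```python
def describe_association(value):
    """Return the suggested phrase for a measure of association."""
    magnitude = round(abs(value), 2)  # the table bands use two decimals
    if magnitude == 0:
        return "No relationship."
    direction = "positive" if value > 0 else "negative"
    if magnitude >= 0.70:
        strength = "Very strong"
    elif magnitude >= 0.50:
        strength = "Substantial"
    elif magnitude >= 0.30:
        strength = "Moderate"
    elif magnitude >= 0.10:
        strength = "Low"
    else:
        strength = "Negligible"
    return f"{strength} {direction} relationship."

print(describe_association(0.41))   # Moderate positive relationship.
print(describe_association(-0.05))  # Negligible negative relationship.
```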
D.2. Transform the Phi coefficient into Chi-square
1. State the null hypothesis.
2. Choose the alpha level and find the critical Chi-square value for df = 1.
3. Compute the Phi coefficient and convert it to a Chi-square value:
$$ \chi^2 = N\phi^2 $$
4. Compare the Chi-square value with the critical value (equivalently,
compare its p-value with alpha). State the conclusion.
For the language lab example:
$$ \chi^2 = N\phi^2 = (50)(.41)^2 \approx 8.41 $$
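The conversion from Phi to Chi-square and the comparison step can be sketched in Python. This assumes a 2 × 2 table (df = 1) and alpha = .05, for which the critical Chi-square value is 3.841; the p-value uses the fact that a Chi-square variable with 1 degree of freedom is a squared standard normal. Function names are illustrative.

```python
import math

def chi_square_from_phi(phi, n):
    """Chi-square value for a 2 x 2 table from phi and the sample size N."""
    return n * phi ** 2

def p_value_df1(chi_square):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))

CRITICAL_05_DF1 = 3.841  # critical chi-square value for alpha = .05, df = 1

chi2 = chi_square_from_phi(0.41, 50)  # language lab example
reject_null = chi2 > CRITICAL_05_DF1
print(reject_null)               # True
print(p_value_df1(chi2) < 0.05)  # True, the same decision via the p-value
```

Both comparisons lead to the same decision, so either can be reported.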
4. Point biserial correlation
4.1. Definition & Function
4.2. Formula
4.3. Meaning of point-biserial coefficient
4.1. Definition & Function
“When one of the variables in the correlation is nominal, the point
biserial correlation is used to determine the relationship between
the levels of the nominal variable and the continuous variable.”
(Hatch & Farhady, 1982, p. 204)
E.g. the correlation between each single test item and the total test
score:
- Nominal variable: answers to a single test item
- Continuous variable: total test score
- Functions:
o To analyze test items
o To investigate the correlation between some language
behaviors for male/female
o To investigate the correlation between any other nominal
variable and test performance
4.2. Formula
a. By hand
$$ r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{s}\sqrt{pq} $$
$\bar{X}_p$: the mean score on the total test of Ss answering the item right
$\bar{X}_q$: the mean score on the total test of Ss answering the item wrong
$p$: proportion of cases answering the item right
$q$: proportion of cases answering the item wrong
$s$: standard deviation of the total sample on the test
E.g. the correlation between each single test item and the total test score
Table 2. Sample Student Data Matrix (Varma, n.d., p. 4)
E.g. the correlation between test item 1 and the total test score

Student   Item 1   Total test score
Kid A     1        9
Kid B     1        8
Kid C     1        7
Kid D     1        7
Kid E     1        7
Kid F     0        4
Kid G     1        4
Kid H     0        3
Kid I     0        2

$$ \bar{X}_p = \frac{9+8+7+7+7+4}{6} = 7 $$
$$ \bar{X}_q = \frac{4+3+2}{3} = 3 $$
$$ p = \frac{6}{9} = .67 ; \quad q = \frac{3}{9} = .33 $$
$$ \text{Mean} = \frac{9+8+7+7+7+4+4+3+2}{9} = 5.67 $$
$$ s = \sqrt{\frac{(9-5.67)^2 + \dots + (2-5.67)^2}{9-1}} = 2.45 $$
$$ r_{pbi} = \frac{7-3}{2.45}\sqrt{.67(.33)} = .77 $$
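The by-hand steps can be sketched in Python. As in the worked example, s is computed with n − 1 in the denominator; the function name is illustrative, and the item responses are those of the nine students (Kid A to Kid I).

```python
import math

def point_biserial(item, totals):
    """Point-biserial r between a 0/1 item and total scores (s uses n - 1)."""
    n = len(totals)
    right = [t for i, t in zip(item, totals) if i == 1]
    wrong = [t for i, t in zip(item, totals) if i == 0]
    if not right or not wrong:
        raise ValueError("the item must have both right and wrong answers")
    p = len(right) / n  # proportion answering the item right
    q = len(wrong) / n  # proportion answering the item wrong
    mean_all = sum(totals) / n
    s = math.sqrt(sum((t - mean_all) ** 2 for t in totals) / (n - 1))
    mean_p = sum(right) / len(right)
    mean_q = sum(wrong) / len(wrong)
    return (mean_p - mean_q) / s * math.sqrt(p * q)

totals = [9, 8, 7, 7, 7, 4, 4, 3, 2]  # Kid A to Kid I
item_1 = [1, 1, 1, 1, 1, 0, 1, 0, 0]

print(round(point_biserial(item_1, totals), 2))  # 0.77
```

The same function can be applied to any other item column, since the total scores and therefore s stay the same.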
Exercise. Find the correlation between test item 4 and the total test score.

Student   Item 4   Total test score
Kid A     1        9
Kid B     1        8
Kid C     1        7
Kid D     0        7
Kid E     1        7
Kid F     0        4
Kid G     1        4
Kid H     0        3
Kid I     0        2

Answer:
$\bar{X}_p$ = 7 ; $\bar{X}_q$ = 4
$p$ = .56 ; $q$ = .44
$s$ = 2.45
$r_{pbi}$ = .61
4.3. Meaning of point-biserial coefficient
- A high point-biserial coefficient means that students who answer
the item correctly tend to have higher total scores, and students
who answer it incorrectly tend to have lower total scores.
→ The item discriminates between low-performing and high-performing examinees.
- Very low or negative point-biserial coefficients computed after
field testing new items can help identify items that are flawed.
References
BBC. (n.d.). Variation and classification. Retrieved from http://www.bbc.co.uk/bitesize/ks3/science/organisms_behaviour_health/variation_classification/revision/3/
Hatch, E. & Farhady, H. (1982). Research design and statistics for applied linguistics. Rowley, MA: Newbury House.
Lund, A. & Lund, M. (n.d.). Retrieved from https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php
Nominal measure of correlation (n.d.). Retrieved from http://www.harding.edu/sbreezeel/460%20files/statbook/chapter15.pdf
Varma, S. (n.d.). Preliminary item statistics using point-biserial correlation and p-values. Morgan Hill, CA: Educational Data Systems.

Más contenido relacionado

La actualidad más candente

Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionMohit Asija
 
12.4 probability of compound events
12.4 probability of compound events12.4 probability of compound events
12.4 probability of compound eventshisema01
 
Fundamental counting principle powerpoint
Fundamental counting principle powerpointFundamental counting principle powerpoint
Fundamental counting principle powerpointmesmith1
 
Basic Concept Of Probability
Basic Concept Of ProbabilityBasic Concept Of Probability
Basic Concept Of Probabilityguest45a926
 
2.5.4 Hinge Theorem
2.5.4 Hinge Theorem2.5.4 Hinge Theorem
2.5.4 Hinge Theoremsmiller5
 
Probability of Union of Two events
Probability of Union of Two eventsProbability of Union of Two events
Probability of Union of Two eventsJAYHARYLPESALBON1
 
Probability of Simple and Compound Events
Probability of Simple and Compound EventsProbability of Simple and Compound Events
Probability of Simple and Compound EventsJoey Valdriz
 
Reporting Pearson Correlation Test of Independence in APA
Reporting Pearson Correlation Test of Independence in APAReporting Pearson Correlation Test of Independence in APA
Reporting Pearson Correlation Test of Independence in APAKen Plummer
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierAl Arizmendez
 
Spearman Rank Correlation - Thiyagu
Spearman Rank Correlation - ThiyaguSpearman Rank Correlation - Thiyagu
Spearman Rank Correlation - ThiyaguThiyagu K
 
permutations power point
permutations power pointpermutations power point
permutations power pointAldrin Balenton
 
Introduction of Probability
Introduction of ProbabilityIntroduction of Probability
Introduction of Probabilityrey castro
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsBabasab Patil
 

La actualidad más candente (20)

Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
12.4 probability of compound events
12.4 probability of compound events12.4 probability of compound events
12.4 probability of compound events
 
Fundamental counting principle powerpoint
Fundamental counting principle powerpointFundamental counting principle powerpoint
Fundamental counting principle powerpoint
 
Basic Concept Of Probability
Basic Concept Of ProbabilityBasic Concept Of Probability
Basic Concept Of Probability
 
Permutation
PermutationPermutation
Permutation
 
2.5.4 Hinge Theorem
2.5.4 Hinge Theorem2.5.4 Hinge Theorem
2.5.4 Hinge Theorem
 
Probability of Union of Two events
Probability of Union of Two eventsProbability of Union of Two events
Probability of Union of Two events
 
Probability of Simple and Compound Events
Probability of Simple and Compound EventsProbability of Simple and Compound Events
Probability of Simple and Compound Events
 
Reporting Pearson Correlation Test of Independence in APA
Reporting Pearson Correlation Test of Independence in APAReporting Pearson Correlation Test of Independence in APA
Reporting Pearson Correlation Test of Independence in APA
 
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn LottierRegression Analysis presentation by Al Arizmendez and Cathryn Lottier
Regression Analysis presentation by Al Arizmendez and Cathryn Lottier
 
Random variables
Random variablesRandom variables
Random variables
 
REGRESSION ANALYSIS
REGRESSION ANALYSISREGRESSION ANALYSIS
REGRESSION ANALYSIS
 
Spearman Rank Correlation - Thiyagu
Spearman Rank Correlation - ThiyaguSpearman Rank Correlation - Thiyagu
Spearman Rank Correlation - Thiyagu
 
The spearman rank order correlation coefficient
The spearman rank order correlation coefficientThe spearman rank order correlation coefficient
The spearman rank order correlation coefficient
 
Mean for Grouped Data
Mean for Grouped DataMean for Grouped Data
Mean for Grouped Data
 
permutations power point
permutations power pointpermutations power point
permutations power point
 
Introduction of Probability
Introduction of ProbabilityIntroduction of Probability
Introduction of Probability
 
Measures of Variability
Measures of VariabilityMeasures of Variability
Measures of Variability
 
Les5e ppt 03
Les5e ppt 03Les5e ppt 03
Les5e ppt 03
 
Linear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec domsLinear regression and correlation analysis ppt @ bec doms
Linear regression and correlation analysis ppt @ bec doms
 

Destacado

What is a Point Biserial Correlation?
What is a Point Biserial Correlation?What is a Point Biserial Correlation?
What is a Point Biserial Correlation?Ken Plummer
 
Pearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionPearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionAzmi Mohd Tamil
 
Mann Whitney U Test
Mann Whitney U TestMann Whitney U Test
Mann Whitney U TestJohn Barlow
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis pptElkana Rorio
 
Statisticsforbiologists colstons
Statisticsforbiologists colstonsStatisticsforbiologists colstons
Statisticsforbiologists colstonsandymartin
 
Sosiaalinen media projektien johtamisessa
Sosiaalinen media projektien johtamisessaSosiaalinen media projektien johtamisessa
Sosiaalinen media projektien johtamisessaMatti Vesala
 
Applied 40S March 27, 2009
Applied 40S March 27, 2009Applied 40S March 27, 2009
Applied 40S March 27, 2009Darren Kuropatwa
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression AnalysisASAD ALI
 
parametric test of difference z test f test one-way_two-way_anova
parametric test of difference z test f test one-way_two-way_anova parametric test of difference z test f test one-way_two-way_anova
parametric test of difference z test f test one-way_two-way_anova Tess Anoza
 
What is a Kruskal Wallis-Test?
What is a Kruskal Wallis-Test?What is a Kruskal Wallis-Test?
What is a Kruskal Wallis-Test?Ken Plummer
 
Research (kinds, characteristics and purposes)
Research (kinds, characteristics and purposes)Research (kinds, characteristics and purposes)
Research (kinds, characteristics and purposes)Draizelle Sexon
 
Measures of correlation (pearson's r correlation coefficient and spearman rho)
Measures of correlation (pearson's r correlation coefficient and spearman rho)Measures of correlation (pearson's r correlation coefficient and spearman rho)
Measures of correlation (pearson's r correlation coefficient and spearman rho)Jyl Matz
 
What is a phi coefficient?
What is a phi coefficient?What is a phi coefficient?
What is a phi coefficient?Ken Plummer
 

Destacado (20)

What is a Point Biserial Correlation?
What is a Point Biserial Correlation?What is a Point Biserial Correlation?
What is a Point Biserial Correlation?
 
Pearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear RegressionPearson Correlation, Spearman Correlation &Linear Regression
Pearson Correlation, Spearman Correlation &Linear Regression
 
Spearman Rank
Spearman RankSpearman Rank
Spearman Rank
 
Mann Whitney U Test
Mann Whitney U TestMann Whitney U Test
Mann Whitney U Test
 
Correlation
CorrelationCorrelation
Correlation
 
Regression analysis ppt
Regression analysis pptRegression analysis ppt
Regression analysis ppt
 
Statisticsforbiologists colstons
Statisticsforbiologists colstonsStatisticsforbiologists colstons
Statisticsforbiologists colstons
 
Sosiaalinen media projektien johtamisessa
Sosiaalinen media projektien johtamisessaSosiaalinen media projektien johtamisessa
Sosiaalinen media projektien johtamisessa
 
Applied 40S March 27, 2009
Applied 40S March 27, 2009Applied 40S March 27, 2009
Applied 40S March 27, 2009
 
Les5e ppt 11
Les5e ppt 11Les5e ppt 11
Les5e ppt 11
 
Regression Analysis
Regression AnalysisRegression Analysis
Regression Analysis
 
Phi (φ) Correlation
Phi (φ) CorrelationPhi (φ) Correlation
Phi (φ) Correlation
 
Chi square using excel
Chi square using excelChi square using excel
Chi square using excel
 
Kruskal wallis test
Kruskal wallis testKruskal wallis test
Kruskal wallis test
 
parametric test of difference z test f test one-way_two-way_anova
parametric test of difference z test f test one-way_two-way_anova parametric test of difference z test f test one-way_two-way_anova
parametric test of difference z test f test one-way_two-way_anova
 
What is a Kruskal Wallis-Test?
What is a Kruskal Wallis-Test?What is a Kruskal Wallis-Test?
What is a Kruskal Wallis-Test?
 
Normal Curve
Normal CurveNormal Curve
Normal Curve
 
Research (kinds, characteristics and purposes)
Research (kinds, characteristics and purposes)Research (kinds, characteristics and purposes)
Research (kinds, characteristics and purposes)
 
Measures of correlation (pearson's r correlation coefficient and spearman rho)
Measures of correlation (pearson's r correlation coefficient and spearman rho)Measures of correlation (pearson's r correlation coefficient and spearman rho)
Measures of correlation (pearson's r correlation coefficient and spearman rho)
 
What is a phi coefficient?
What is a phi coefficient?What is a phi coefficient?
What is a phi coefficient?
 

Similar a Data analysis 1

Four Methods in testing reliability
Four Methods in testing reliabilityFour Methods in testing reliability
Four Methods in testing reliabilityMelchorJrTuazon1
 
Lecture 9 correlation-manual calcualtion
Lecture 9 correlation-manual calcualtionLecture 9 correlation-manual calcualtion
Lecture 9 correlation-manual calcualtionDr Rajeev Kumar
 
Pearson product moment correlation
Pearson product moment correlationPearson product moment correlation
Pearson product moment correlationSharlaine Ruth
 
4.4 correlation manual calcualtion
4.4 correlation manual calcualtion4.4 correlation manual calcualtion
4.4 correlation manual calcualtionRajeev Kumar
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfRavinandan A P
 
Applied statistics lecture_4
Applied statistics lecture_4Applied statistics lecture_4
Applied statistics lecture_4Daria Bogdanova
 
Lect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionLect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionRione Drevale
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regressionANCYBS
 
Correlational Research : Language Learning / Teaching Attitudes
Correlational Research : Language Learning / Teaching AttitudesCorrelational Research : Language Learning / Teaching Attitudes
Correlational Research : Language Learning / Teaching Attitudesicheekiez
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationJanet Penilla
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regressionjasondroesch
 

Similar a Data analysis 1 (20)

Correlating test scores
Correlating test scoresCorrelating test scores
Correlating test scores
 
Four Methods in testing reliability
Four Methods in testing reliabilityFour Methods in testing reliability
Four Methods in testing reliability
 
Lecture 9 correlation-manual calcualtion
Lecture 9 correlation-manual calcualtionLecture 9 correlation-manual calcualtion
Lecture 9 correlation-manual calcualtion
 
Pearson product moment correlation
Pearson product moment correlationPearson product moment correlation
Pearson product moment correlation
 
4.4 correlation manual calcualtion
4.4 correlation manual calcualtion4.4 correlation manual calcualtion
4.4 correlation manual calcualtion
 
Correlation.pptx
Correlation.pptxCorrelation.pptx
Correlation.pptx
 
Correlation
CorrelationCorrelation
Correlation
 
Unit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdfUnit 1 Correlation- BSRM.pdf
Unit 1 Correlation- BSRM.pdf
 
Applied statistics lecture_4
Applied statistics lecture_4Applied statistics lecture_4
Applied statistics lecture_4
 
Lect w8 w9_correlation_regression
Lect w8 w9_correlation_regressionLect w8 w9_correlation_regression
Lect w8 w9_correlation_regression
 
Correlation continued
Correlation continuedCorrelation continued
Correlation continued
 
Correlation and regression
Correlation and regressionCorrelation and regression
Correlation and regression
 
Pearson Correlation
Pearson CorrelationPearson Correlation
Pearson Correlation
 
Correlational Research : Language Learning / Teaching Attitudes
Correlational Research : Language Learning / Teaching AttitudesCorrelational Research : Language Learning / Teaching Attitudes
Correlational Research : Language Learning / Teaching Attitudes
 
Les5e ppt 09
Les5e ppt 09Les5e ppt 09
Les5e ppt 09
 
A correlation analysis.ppt 2018
A correlation analysis.ppt 2018A correlation analysis.ppt 2018
A correlation analysis.ppt 2018
 
Study of Correlation
Study of Correlation Study of Correlation
Study of Correlation
 
Data Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and CorrelationData Processing and Statistical Treatment: Spreads and Correlation
Data Processing and Statistical Treatment: Spreads and Correlation
 
Correlation
CorrelationCorrelation
Correlation
 
Correlation and Regression
Correlation and RegressionCorrelation and Regression
Correlation and Regression
 

Último

Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxVenkatasubramani13
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introductionsanjaymuralee1
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationGiorgio Carbone
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsThinkInnovation
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxDwiAyuSitiHartinah
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best PracticesDataArchiva
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...PrithaVashisht1
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Vladislav Solodkiy
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxdhiyaneswaranv1
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?sonikadigital1
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructuresonikadigital1
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxFinatron037
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionajayrajaganeshkayala
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Guido X Jansen
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityAggregage
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerPavel Šabatka
 

Último (16)

Mapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptxMapping the pubmed data under different suptopics using NLP.pptx
Mapping the pubmed data under different suptopics using NLP.pptx
 
Virtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product IntroductionVirtuosoft SmartSync Product Introduction
Virtuosoft SmartSync Product Introduction
 
Master's Thesis - Data Science - Presentation
Master's Thesis - Data Science - PresentationMaster's Thesis - Data Science - Presentation
Master's Thesis - Data Science - Presentation
 
Optimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in LogisticsOptimal Decision Making - Cost Reduction in Logistics
Optimal Decision Making - Cost Reduction in Logistics
 
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptxTINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
TINJUAN PEMROSESAN TRANSAKSI DAN ERP.pptx
 
5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices5 Ds to Define Data Archiving Best Practices
5 Ds to Define Data Archiving Best Practices
 
Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...Elements of language learning - an analysis of how different elements of lang...
Elements of language learning - an analysis of how different elements of lang...
 
Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023Cash Is Still King: ATM market research '2023
Cash Is Still King: ATM market research '2023
 
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptxCCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
CCS336-Cloud-Services-Management-Lecture-Notes-1.pptx
 
How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?How is Real-Time Analytics Different from Traditional OLAP?
How is Real-Time Analytics Different from Traditional OLAP?
 
ChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics InfrastructureChistaDATA Real-Time DATA Analytics Infrastructure
ChistaDATA Real-Time DATA Analytics Infrastructure
 
Rock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptxRock Songs common codes and conventions.pptx
Rock Songs common codes and conventions.pptx
 
CI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual interventionCI, CD -Tools to integrate without manual intervention
CI, CD -Tools to integrate without manual intervention
 
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
Persuasive E-commerce, Our Biased Brain @ Bikkeldag 2024
 
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for ClarityStrategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
Strategic CX: A Deep Dive into Voice of the Customer Insights for Clarity
 
The Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayerThe Universal GTM - how we design GTM and dataLayer
The Universal GTM - how we design GTM and dataLayer
 

Data analysis 1

  • 2. Content 1. Pearson’s product moment correlation 2. Spearman rank-order correlation (Rho) 3. Phi coefficient 4. Point biserial correlation
  • 3. Types of Correlation Coefficients Correlation Coefficient Types of scales Pearson’s product moment Both scales interval Spearman rank-order Both scales ordinal Phi Both scales nominal Point biserial One interval, one nominal Which formula should I use?
  • 4. Pearson's correlation coefficient when applied to a population is commonly represented by the Greek letter ρ (rho) and may be referred to as the population correlation coefficient or the population Pearson correlation coefficient. The formula for r is: Cov: covariance S(x), S(y): the standard deviation of X and Y 1. Pearson’s product moment correlation
  • 5. • The Mean is the average of the numbers. • The Standard Deviation is just the square root of Variance. E.g. The following data relates to Number of hours studying and number of correct answers 1. Pearson’s product moment correlation
  • 6. 1. Pearson’s product moment correlation
      • The mean is the average of the numbers: $\text{Mean} = \frac{0 + 1 + 2 + 3 + 5 + 5 + 6}{7} = \frac{22}{7} = 3.142857\ldots$ (taken as 3.142 below).
      • Now calculate each score's difference from the mean. With the mean taken as 3.142, the differences are: −3.142, −2.142, −1.142, −0.142, 1.858, 1.858, 2.858.
  • 7. 1. Pearson’s product moment correlation
      • The variance is: $\sigma^2 = \frac{(-3.142)^2 + (-2.142)^2 + (-1.142)^2 + (-0.142)^2 + 1.858^2 + 1.858^2 + 2.858^2}{7} = \frac{30.857}{7} \approx 4.408$
      • The standard deviation is the square root of the variance: $\sigma = \sqrt{4.408} \approx 2.100 \approx 2$ (to the nearest score)
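The mean, variance and standard deviation steps can be checked with a few lines of Python. This is a minimal sketch, assuming the seven scores 0, 1, 2, 3, 5, 5, 6 from the slide and the population form of the variance (dividing by n = 7, as the slide does); the variable names are illustrative. Because it carries full precision instead of rounded differences, the last digits may differ slightly from hand-rounded values.

```python
import math

# Scores taken from the slide's mean calculation
scores = [0, 1, 2, 3, 5, 5, 6]
n = len(scores)

# Mean: the sum of the scores divided by how many there are
mean = sum(scores) / n

# Each score's difference from the mean
deviations = [x - mean for x in scores]

# Population variance: average squared deviation (divide by n, as on the slide)
variance = sum(d ** 2 for d in deviations) / n

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(f"mean={mean:.3f} variance={variance:.3f} sd={std_dev:.3f}")
```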
  • 8. 1. Pearson’s product moment correlation. If working with raw data, the Pearson product moment correlation formula is as follows:
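One standard raw-score form of the Pearson formula is sketched below. It is written in the same notation as the tied-rank Spearman formula used later in these slides, and it is an assumed reconstruction that may not match the exact layout of the original slide. Here n is the number of paired scores.

```latex
r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}
         {\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)
                \left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}
```

Multiplying numerator and denominator by n gives the equivalent form with $n\sum XY - (\sum X)(\sum Y)$ on top.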
  • 9. 1. Pearson’s product moment correlation E.g.
  • 10. 1. Pearson’s product moment correlation. The Pearson correlation coefficient r is:
  • 11. 1. Pearson’s product moment correlation. Conclusion: There is a strong, positive correlation between X and Y: the higher X is, the higher Y tends to be. Exercise: Find Pearson's coefficient of correlation between the price of studying facilities and demand from the following data. Then draw your conclusion about their relationship.
  • 12. 2. Spearman rank-order correlation (Rho) - A measure of the strength and direction of the association between two ranked variables on an ordinal scale. - Denoted by the symbol r_s (or the Greek letter ρ, pronounced rho). −1 ≤ ρ ≤ 1
  • 13. 2. Spearman rank-order correlation (Rho). Assumptions: - The two variables are ordinal, interval or ratio. - There is a monotonic relationship between the two variables.
  • 14. 2. Spearman rank-order correlation (Rho)
      English (mark)   Math (mark)
      56               66
      75               70
      45               40
      71               60
      62               65
      64               56
      58               59
      80               77
      76               67
      61               63
      - Ranking data
      • The score with the highest value should be labeled "1", the next highest "2", and so on down to the lowest.
  • 15. 2. Spearman rank-order correlation (Rho)
      English (mark)   Math (mark)   English (rank) (X)   Math (rank) (Y)
      56               66            9                    4
      75               70            3                    2
      45               40            10                   10
      71               60            4                    7
      62               65            6                    5
      64               56            5                    9
      58               59            8                    8
      80               77            1                    1
      76               67            2                    3
      61               63            7                    6
  • 16. 2. Spearman rank-order correlation (Rho)
      English (mark)   Math (mark)
      56               66
      75               70
      45               40
      71               60
      61               65
      64               56
      58               59
      80               77
      76               67
      61               63
      - Ranking data
      • The score with the highest value should be labeled "1", the next highest "2", and so on down to the lowest.
      • When two or more values in the data are identical, give each of them the average of the ranks they would otherwise occupy.
  • 17. 2. Spearman rank-order correlation (Rho)
      English (mark)   Math (mark)   English (rank) (X)   Math (rank) (Y)
      56               66            9                    4
      75               70            3                    2
      45               40            10                   10
      71               60            4                    7
      61               65            6.5                  5
      64               56            5                    9
      58               59            8                    8
      80               77            1                    1
      76               67            2                    3
      61               63            6.5                  6
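  • 18. 2. Spearman rank-order correlation (Rho) - Choosing the right formula
      (1) Your data does NOT have tied ranks: $\rho = 1 - \frac{6\sum (X - Y)^2}{n(n^2 - 1)}$
      (2) Your data has tied ranks: $\rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}}$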
  • 19. 2. Spearman rank-order correlation (Rho)
      English (mark)   Math (mark)   English (rank) (X)   Math (rank) (Y)   (X − Y)²
      56               66            9                    4                 25
      75               70            3                    2                 1
      45               40            10                   10                0
      71               60            4                    7                 9
      62               65            6                    5                 1
      64               56            5                    9                 16
      58               59            8                    8                 0
      80               77            1                    1                 0
      76               67            2                    3                 1
      61               63            7                    6                 1
      Sum                                                                   54
      $\rho = 1 - \frac{6\sum (X - Y)^2}{n(n^2 - 1)} = 1 - \frac{6 \times 54}{10(10^2 - 1)} \approx 0.673$
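The no-ties calculation can be reproduced as follows. This is a minimal sketch, assuming the ten pairs of marks from the slide; the helper `rank_descending` is a hypothetical name and only handles data without tied scores.

```python
# Marks from the slide (no tied scores in either list)
english = [56, 75, 45, 71, 62, 64, 58, 80, 76, 61]
math_marks = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

def rank_descending(scores):
    """Rank scores so the highest gets 1 (assumes no tied scores)."""
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

x = rank_descending(english)      # English ranks (X)
y = rank_descending(math_marks)   # Math ranks (Y)
n = len(x)

# Sum of squared rank differences
sum_d2 = sum((a - b) ** 2 for a, b in zip(x, y))

# Formula (1): valid when there are no tied ranks
rho = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(rho, 3))
```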
  • 20. 2. Spearman rank-order correlation (Rho)
      $\rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}}$
      English (rank) (X)   Math (rank) (Y)   X²       Y²     XY
      9                    4                 81       16     36
      3                    2                 9        4      6
      10                   10                100      100    100
      4                    7                 16       49     28
      6.5                  5                 42.25    25     32.5
      5                    9                 25       81     45
      8                    8                 64       64     64
      1                    1                 1        1      1
      2                    3                 4        9      6
      6.5                  6                 42.25    36     39
      Sum: 55              55                384.5    385    357.5
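  • 21. 2. Spearman rank-order correlation (Rho) E.g.2.
      ΣX = 55, ΣY = 55, ΣX² = 384.5, ΣY² = 385, ΣXY = 357.5
      $\rho = \frac{\sum XY - \frac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \frac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \frac{(\sum Y)^2}{n}\right)}} = \frac{357.5 - \frac{55 \times 55}{10}}{\sqrt{\left(384.5 - \frac{55^2}{10}\right)\left(385 - \frac{55^2}{10}\right)}} = 0.669$
      Conclusion: There was a strong, positive correlation between English and math marks.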
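The tied-ranks version can be sketched the same way. The code below assumes the second data set from the slides (English mark 61 occurring twice); `average_ranks` is a hypothetical helper that gives tied scores the average of the rank positions they occupy.

```python
import math

# Second data set from the slides: the English mark 61 appears twice
english = [56, 75, 45, 71, 61, 64, 58, 80, 76, 61]
math_marks = [66, 70, 40, 60, 65, 56, 59, 77, 67, 63]

def average_ranks(scores):
    """Rank highest = 1; tied scores share the average of their positions."""
    ordered = sorted(scores, reverse=True)
    ranks = []
    for s in scores:
        first = ordered.index(s) + 1   # first position occupied by this score
        count = ordered.count(s)       # how many times the score occurs
        ranks.append(first + (count - 1) / 2)
    return ranks

x = average_ranks(english)
y = average_ranks(math_marks)
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_x2 = sum(a * a for a in x)
sum_y2 = sum(b * b for b in y)
sum_xy = sum(a * b for a, b in zip(x, y))

# Formula (2): the Pearson raw-score formula applied to the ranks
numerator = sum_xy - sum_x * sum_y / n
denominator = math.sqrt((sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n))
rho = numerator / denominator

print(sum_x2, sum_y2, sum_xy, round(rho, 3))
```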
  • 22. 3. Phi coefficient A. Definition B. Formula C. Example D. Steps
  • 23. 3. Phi coefficient A. Definition - The Phi (ϕ) statistic is used when both of the nominal variables are dichotomous. - The obtained value for Phi suggests the relationship between the two variables.
  • 24. 3. Phi coefficient B. Formula
                       VARIABLE Y
      VARIABLE X       A        B        A+B
                       C        D        C+D
                       A+C      B+D
      $\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}$
  • 25. 3. Phi coefficient C. Example
      E.g. A class of 50 Ss are asked whether they like using the language lab. The answer is either yes or no. The Ss are from either Japan or Iran. The observed values:
                Japan    Iran     Total
      Yes       24       8        32
      No        6        12       18
      Total     30       20       50
      Then: $\phi = \frac{(24)(12) - (8)(6)}{\sqrt{(32)(18)(30)(20)}} = \frac{240}{\sqrt{345600}} = \frac{240}{587.88} \approx 0.41$
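The language-lab example can be checked with a short function. This is a sketch assuming the cell layout A = 24 (Yes, Japan), B = 8 (Yes, Iran), C = 6 (No, Japan), D = 12 (No, Iran). It uses the common AD − BC ordering in the numerator, which is an assumption about the sign convention and is chosen because it reproduces the positive 0.41 reported in the example. `phi_coefficient` is a hypothetical name.

```python
import math

def phi_coefficient(a, b, c, d):
    """Phi for a 2x2 table [[a, b], [c, d]], using AD - BC in the numerator."""
    numerator = a * d - b * c
    # Product of the two row totals and the two column totals
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator

# Observed values: rows Yes/No, columns Japan/Iran
phi = phi_coefficient(24, 8, 6, 12)
print(round(phi, 2))
```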
  • 26. 3. Phi coefficient D. Steps D.1. Using the suggested interpretations of Measures of Association 1. State the null hypothesis 2. Determine the Phi coefficient 3. Use the suggested table to state the conclusion
  • 27. 3. Phi coefficient: Suggested Interpretations of Measures of Association
      Values            Appropriate phrases
      +.70 or higher    Very strong positive relationship.
      +.50 to +.69      Substantial positive relationship.
      +.30 to +.49      Moderate positive relationship.
      +.10 to +.29      Low positive relationship.
      +.01 to +.09      Negligible positive relationship.
      0.00              No relationship.
      -.01 to -.09      Negligible negative relationship.
      -.10 to -.29      Low negative relationship.
      -.30 to -.49      Moderate negative relationship.
      -.50 to -.69      Substantial negative relationship.
      -.70 or lower     Very strong negative relationship.
      Source: Adapted from James A. Davis, Elementary Survey Analysis. Englewood Cliffs, NJ: Prentice-Hall, 1971, 49.
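The interpretation table can be turned into a small lookup. This is an illustrative sketch only: `interpret_association` is a hypothetical helper, and rounding the coefficient to two decimals before the lookup is an assumption made because the table states its bands to two decimals.

```python
def interpret_association(value):
    """Map a coefficient to the phrase in the suggested-interpretations table."""
    # Assumption: round to two decimals first, matching the table's precision
    rounded = round(value, 2)
    size = abs(rounded)
    if size == 0:
        return "No relationship."
    if size >= 0.70:
        strength = "Very strong"
    elif size >= 0.50:
        strength = "Substantial"
    elif size >= 0.30:
        strength = "Moderate"
    elif size >= 0.10:
        strength = "Low"
    else:
        strength = "Negligible"
    direction = "positive" if rounded > 0 else "negative"
    return f"{strength} {direction} relationship."

print(interpret_association(0.41))
```

For the language-lab example, a Phi of about .41 falls in the +.30 to +.49 band.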
  • 28. 3. Phi coefficient D.2. Transform the Phi coefficient into Chi-square 1. State the null hypothesis. 2. Choose the alpha level. 3. Apply the formula for the Phi coefficient and determine the Chi-square value: $\chi^2 = N\phi^2$ 4. Compare the Chi-square value with the critical value at the chosen alpha level (df = 1), or equivalently compare the p-value with alpha. State the conclusion.
  • 29. 3. Phi coefficient $\chi^2 = N\phi^2 = (50)(.41)^2 = 8.41$
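The Chi-square step can be sketched as below, using N = 50 and the rounded Phi of .41 from the example. The alpha level of .05 is an assumption, since the slides do not state one; 3.841 is the standard Chi-square critical value for df = 1 at that level.

```python
# Values from the language-lab example
n_students = 50
phi_rounded = 0.41  # Phi as rounded on the slide

# Chi-square obtained from Phi
chi_square = n_students * phi_rounded ** 2

# Assumption: alpha = .05; critical value of chi-square with df = 1
critical_value = 3.841

# Reject the null hypothesis of no relationship if chi-square exceeds it
reject_null = chi_square > critical_value

print(chi_square, reject_null)
```

With these numbers the Chi-square value is about 8.4, well above the critical value, so under the assumed alpha the null hypothesis of no relationship would be rejected.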
  • 30. 4. Point biserial correlation 4.1. Definition & Function 4.2. Formula 4.3. Meaning of point-biserial coefficient
  • 31. 4. Point biserial correlation 4.1. Definition & Function “When one of the variables in the correlation is nominal, the point biserial correlation is used to determine the relationship between the levels of the nominal variable and the continuous variable.” (Hatch & Farhady, 1982, p. 204) E.g. the correlation between each single test item and the total test score: - Nominal variable: answers to a single test item - Continuous variable: total test score
  • 32. 4. Point biserial correlation 4.1. Definition & Function - Functions: o To analyze test items o To investigate the correlation between some language behaviors for male/female o To investigate the correlation between any other nominal variable and test performance
  • 33. 4. Point biserial correlation 4.2. Formula a. By hand
      $r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{s}\sqrt{pq}$
      $\bar{X}_p$: the mean score on the total test of Ss answering the item right
      $\bar{X}_q$: the mean score on the total test of Ss answering the item wrong
      $p$: proportion of cases answering the item right
      $q$: proportion of cases answering the item wrong
      $s$: standard deviation of the total sample on the test
  • 34. 4. Point biserial correlation 4.2. Formula E.g. the correlation between each single test item and total test score Table 2. Sample Student Data Matrix (Varma, n.d., p. 4)
  • 35. 4. Point biserial correlation 4.2. Formula E.g. the correlation between test item 1 and total test score
      Student   Item score   Total test score
      Kid A     1            9
      Kid B     1            8
      Kid C     1            7
      Kid D     1            7
      Kid E     1            7
      Kid F     0            4
      Kid G     1            4
      Kid H     0            3
      Kid I     0            2
      $\bar{X}_p = \frac{9+8+7+7+7+4}{6} = 7$
      $\bar{X}_q = \frac{4+3+2}{3} = 3$
      $p = \frac{6}{9} = .67$ ; $q = \frac{3}{9} = .33$
      $\text{Mean} = \frac{9+8+7+7+7+4+4+3+2}{9} = 5.67$
      $s = \sqrt{\frac{(9-5.67)^2 + \ldots + (2-5.67)^2}{9-1}} = 2.45$
      $r_{pbi} = \frac{7-3}{2.45}\sqrt{(.67)(.33)} = .77$
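The item 1 example can be reproduced as follows. This is a minimal sketch of the by-hand formula, assuming the nine students' item scores and total scores from the example, and using the n − 1 standard deviation as the slide does. The variable names are illustrative.

```python
import math

# Item 1 answers (1 = right, 0 = wrong) and total test scores, Kid A to Kid I
item = [1, 1, 1, 1, 1, 0, 1, 0, 0]
total = [9, 8, 7, 7, 7, 4, 4, 3, 2]

# Total scores of students who got the item right / wrong
right = [t for i, t in zip(item, total) if i == 1]
wrong = [t for i, t in zip(item, total) if i == 0]

mean_p = sum(right) / len(right)   # mean total score, item right
mean_q = sum(wrong) / len(wrong)   # mean total score, item wrong
p = len(right) / len(total)        # proportion answering right
q = 1 - p                          # proportion answering wrong

# Standard deviation of all total scores, dividing by n - 1 as on the slide
mean_all = sum(total) / len(total)
s = math.sqrt(sum((t - mean_all) ** 2 for t in total) / (len(total) - 1))

r_pbi = (mean_p - mean_q) / s * math.sqrt(p * q)
print(round(s, 2), round(r_pbi, 2))
```

Swapping in another item's column of 0s and 1s reuses the same total scores, so `s` stays the same and only the group means and proportions change.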
  • 36. 4. Point biserial correlation 4.2. Formula Exercise: the correlation between test item 4 and total test score
      Student   Item score   Total test score
      Kid A     1            9
      Kid B     1            8
      Kid C     1            7
      Kid D     0            7
      Kid E     1            7
      Kid F     0            4
      Kid G     1            4
      Kid H     0            3
      Kid I     0            2
      Answer: $\bar{X}_p = 7$ ; $\bar{X}_q = 4$ ; $p = .56$ ; $q = .44$ ; $s = 2.45$ (the total test scores are the same as in the item 1 example) ; $r_{pbi} = \frac{7-4}{2.45}\sqrt{(.56)(.44)} = .61$
  • 37. 4. Point biserial correlation 4.3. Meaning of point-biserial coefficient - A high point-biserial coefficient means that students who answer the item correctly (incorrectly) tend to be the students with higher (lower) total scores, so the item discriminates between low-performing and high-performing examinees. - Very low or negative point-biserial coefficients computed after field testing new items can help identify items that are flawed.
  • 38. Reference
      BBC. (n.d.). Variation and classification. Retrieved from http://www.bbc.co.uk/bitesize/ks3/science/organisms_behaviour_health/variation_classification/revision/3/
      Hatch, E. & Farhady, H. (1982). Research design and statistics for applied linguistics. Rowley, MA: Newbury House.
      Lund, A. & Lund, M. (n.d.). Retrieved from https://statistics.laerd.com/statistical-guides/spearmans-rank-order-correlation-statistical-guide.php
  • 39. Reference
      Nominal measure of correlation (n.d.). Retrieved from http://www.harding.edu/sbreezeel/460%20files/statbook/chapter15.pdf
      Varma, S. (n.d.). Preliminary item statistics using point-biserial correlation and p-values. Morgan Hill, CA: Educational Data Systems.

Editor's notes

  1. Mean: the average of the scores; standard deviation: a measure of how far the scores typically spread around the mean