Lab 8: Data Analysis 10/11
Objective:
The objective of this lab was to analyze the data collected from the cell counts and observations of the assays of Tetrahymena in a control and treatment media. The objective was also to take the data that was recorded in an Excel spreadsheet and to use the ToolPak add-on and Data Analysis to find Descriptive Analysis, Histogram, F-test, and T-test for each of the sections of data. The assay that I observed was the changing of directions while spinning in the Tetrahymena.
Purpose:
The purpose of this lab was to analyze and to understand the results collected in the last lab. When using the Descriptive Analysis feature, a table is generated that gives an overview of the mean and range for each data set. Histograms show how distributed the data is in your range. Based on the shape of the Histogram, you can observe whether or not the data range is normal. If there is an even distribution around the data’s median, then it is considered to be normal. The F-test is used to test the null hypothesis which suggests that the variances of the two populations are equal. The T-test compares the data to either support or reject the null hypothesis. It also tests the data to determine how much of the difference is due to chance and how much is due to the actual treatment.
Procedure:
Descriptive Analysis
- Organize data in columns and get access to the ToolPak in Excel. Open the Data Analysis Tools.
- Select Descriptive Analysis and press ok.
- Select data range you want to analyze as the Input Range.
- Select a location on the spreadsheet as the Output Range.
- Make sure Summary statistics is checked.
- Click OK.
- Record the Data Analysis Results for each Control group and corresponding Treatment Group.
Histograms
- Look at the range of your data and make a data bin for each section of data. Record it in your spreadsheet.
- Click on the Data Analysis in the Data tab.
- Select Histogram and click OK.
- Select the Input Range for your data.
- Click in the Bin Range box and select the cells that include the bin numbers.
- Click the Output Range option button, click in the Output Range box and select the cell where you want the output to be sent.
- Check Chart Output.
F-Test Two-Sample for Variances
- On the Data tab, click Data Analysis.
- Select F-Test Two-Sample for Variances and click OK.
- Click in the Variable 1 Range box and select the range of you control measurements.
- Click in the Variable 2 Range box and select the range of your treatment.
- Click in the Output Range box and select a cell for the output.
- Click Ok.
- Be sure that the variance of Variable 1 is higher than the variance of variable 2. If it’s reversed, swap your data.
- If F>F Critical one-tail then we reject the null hypothesis since they are not equal.
T-Test Two-Sample Assuming Unequal Variances
- On the data tab click Data Analysis.
- Select the correct t-test for your data.
- Click in the Variable 1 Range box and select the range for your Control data.
- Click in the Variable 2 Range box and select the range for your Treatment data.
- Click in the Hypothesized mean difference and type 0. Remember we are testing the null hypothesis that there is no difference in the 2 means.
- Click in the Output range box and select a cell for your output.
- Click OK.
Data and Observations
Control Cell Count Descriptive Stats |
|
|
Mean |
28583.3333 |
Standard Error |
5714.27348 |
Median |
17250 |
Mode |
24000 |
Standard Deviation |
31298.3649 |
Sample Variance |
979587644 |
Kurtosis |
5.1861979 |
Skewness |
2.24638647 |
Range |
130000 |
Minimum |
4000 |
Maximum |
134000 |
Sum |
857500 |
Count |
30 |
|
-3.974E-66 |
Treatment Cell Count Descriptive Stats |
|
|
Mean |
33208.3333 |
Standard Error |
3202.57678 |
Median |
32000 |
Mode |
27000 |
Standard Deviation |
19215.4607 |
Sample Variance |
369233929 |
Kurtosis |
1.60241155 |
Skewness |
0.96573833 |
Range |
87000 |
Minimum |
4000 |
Maximum |
91000 |
Sum |
1195500 |
Count |
36 |
|
-3.974E-66 |
Control Direction Change Descriptive Stats |
|
|
Mean |
4.325 |
Standard Error |
0.33662636 |
Median |
4 |
Mode |
4 |
Standard Deviation |
2.12901205 |
Sample Variance |
4.53269231 |
Kurtosis |
-0.679693 |
Skewness |
0.3056211 |
Range |
8 |
Minimum |
1 |
Maximum |
9 |
Sum |
173 |
Count |
40 |
|
-3.974E-66 |
Treatment Direction Change Descriptive Stats |
|
|
Mean |
4.125 |
Standard Error |
0.42545225 |
Median |
4 |
Mode |
6 |
Standard Deviation |
2.69079628 |
Sample Variance |
7.24038462 |
Kurtosis |
1.06318895 |
Skewness |
0.89915578 |
Range |
12 |
Minimum |
0 |
Maximum |
12 |
Sum |
165 |
Count |
40 |
|
-3.974E-66 |
Control Spin Time Descriptive Stats |
|
|
Mean |
0.05 |
Standard Error |
0.03489912 |
Median |
0 |
Mode |
0 |
Standard Deviation |
0.22072143 |
Sample Variance |
0.04871795 |
Kurtosis |
17.2853186 |
Skewness |
4.29214522 |
Range |
1 |
Minimum |
0 |
Maximum |
1 |
Sum |
2 |
Count |
40 |
|
-3.974E-66 |
Treatment Spin Time Descriptive Stats |
|
|
Mean |
1.96 |
Standard Error |
0.51130264 |
Median |
0 |
Mode |
0 |
Standard Deviation |
3.2337618 |
Sample Variance |
10.4572154 |
Kurtosis |
1.21779651 |
Skewness |
1.56974047 |
Range |
10 |
Minimum |
0 |
Maximum |
10 |
Sum |
78.4 |
Count |
40 |
|
-3.974E-66 |
Looking at the above histograms, it can be assumed that the data is not normal. The data range is small, so this is sometimes the case. The data is either shifted or there is no curve to the data at all.
F-Test Two-Sample for Variances |
Control & Treatment Cell Counts |
|
Variable 1 |
Variable 2 |
Mean |
33208.3333 |
28583.3333 |
Variance |
369233929 |
979587644 |
Observations |
36 |
30 |
df |
35 |
29 |
F |
0.37692792 |
|
P(F<=f) one-tail |
0.00320686 |
|
F Critical one-tail |
0.55793785 |
|
F-Test Two-Sample for Variances |
Control & Treatment Direction Changes |
|
Variable 1 |
Variable 2 |
Mean |
4.325 |
4.125 |
Variance |
4.53269231 |
7.24038462 |
Observations |
40 |
40 |
df |
39 |
39 |
F |
0.62602922 |
|
P(F<=f) one-tail |
0.07400631 |
|
F Critical one-tail |
0.58669434 |
|
F-Test Two-Sample for Variances |
Control & Treatment Spin Time |
|
Variable 1 |
Variable 2 |
Mean |
1.96 |
0.05 |
Variance |
10.4572154 |
0.04871795 |
Observations |
40 |
40 |
df |
39 |
39 |
F |
214.648105 |
|
P(F<=f) one-tail |
9.9818E-36 |
|
F Critical one-tail |
1.70446507 |
|
For the F-Test of the Control & Treatment cell counts, F (0.37692792) is not greater than F Critical one- tail (0.55793785). This means that for the cell count, we accept the null hypothesis that there is not significant difference between the two data samples.
For the F-Test of the Control & Treatment direction changes, F (0.62602922) is larger than F Critical one-tail (0.58669434). Since this is true, we reject the null hypothesis. This means that there is a significant difference in spin time between the control and treatment Tetrahymena.
For the F-Test of the Control & Treatment spin time, F (214.648105) is much greater than F Critical one-tail (1.70446507). Since this is true, we reject the null hypothesis. This means that there is a significant difference in spin time between the control and treatment Tetrahymena.
t-Test: Two-Sample Assuming Unequal Variances |
Control & Treatment Cell Count |
|
Variable 1 |
Variable 2 |
Mean |
28583.3333 |
33208.3333 |
Variance |
979587644 |
369233929 |
Observations |
30 |
36 |
Hypothesized Mean Difference |
0 |
|
df |
46 |
|
t Stat |
-0.7060499 |
|
P(T<=t) one-tail |
0.24185892 |
|
t Critical one-tail |
1.67866041 |
|
P(T<=t) two-tail |
0.48371784 |
|
t Critical two-tail |
2.0128956 |
|
t-Test: Two-Sample Assuming Unequal Variances |
Control & Treatment Direction Changes |
|
Variable 1 |
Variable 2 |
Mean |
4.325 |
4.125 |
Variance |
4.53269231 |
7.24038462 |
Observations |
40 |
40 |
Hypothesized Mean Difference |
0 |
|
df |
74 |
|
t Stat |
0.36865065 |
|
P(T<=t) one-tail |
0.35672037 |
|
t Critical one-tail |
1.66570689 |
|
P(T<=t) two-tail |
0.71344074 |
|
t Critical two-tail |
1.9925435 |
|
t-Test: Two-Sample Assuming Unequal Variances |
Control & Treatment Spin Times |
|
Variable 1 |
Variable 2 |
Mean |
0.05 |
1.96 |
Variance |
0.04871795 |
10.4572154 |
Observations |
40 |
40 |
Hypothesized Mean Difference |
0 |
|
df |
39 |
|
t Stat |
-3.7268854 |
|
P(T<=t) one-tail |
0.00030689 |
|
t Critical one-tail |
1.68487512 |
|
P(T<=t) two-tail |
0.00061377 |
|
t Critical two-tail |
2.02269092 |
|
For the T-test of the Control and Treatment cell count, t Stat (-0.7060499) is less than the t Critical two-tail (2.0128956). Due to this difference, we reject the null hypothesis. This means that there is a significant difference between the two data sets. But the p value is .48 so this is not convincing enough to reject the null hypothesis.
For the T-test of the Control and Treatment Direction changes, t Stat (0.36865065) is less than the t Critical two-tail (1.9925435). This means that we reject the null hypothesis as there is evidence of significant difference between the two data sets. But the p value is .71 so this is not convincing enough to reject the null hypothesis.
For the T-test of the Control and Treatment Spin times, t Stat (-3.7268854) is very small compared to the t Critical two-tail (2.02269092). This means we reject the null hypothesis. The p value is 0.0006. This means we have convincing enough data to reject the null hypothesis.
Future Goal:
Since it was my first time using these specific analysis tools, it took a while for me to understand how they work and to actually apply the significance to the data that was collected. So a goal that I have for myself is to be able to run these tests for different sets of data with a higher understanding of how they work and with more speed.