Suppose we cannot demonstrate that the bean weights are normally distributed, which is quite possible with samples this small. In that case we must carry out a non-parametric test such as the Wilcoxon Rank-Sum Test. Here we test whether the two samples are the same; the hypothesis becomes: "the samples are the same" (soaking has no effect).

Wilcoxon Test

Rank all the bean weights on a scale from 1 (the lightest) to 20 (the heaviest) and note whether each is wet or dry:
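Rank   Weight    X if dry        Rank   Weight    X if dry
   1   ______    ________          11   ______    ________
   2   ______    ________          12   ______    ________
   3   ______    ________          13   ______    ________
   4   ______    ________          14   ______    ________
   5   ______    ________          15   ______    ________
   6   ______    ________          16   ______    ________
   7   ______    ________          17   ______    ________
   8   ______    ________          18   ______    ________
   9   ______    ________          19   ______    ________
  10   ______    ________          20   ______    ________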

Add up all the ranks for dry seeds:

Rank Sum for Dry Seeds __________

Add up all the ranks for wet seeds:

Rank Sum for Wet Seeds __________

If there were no difference between the sub-samples, we would expect these rank sums to be nearly identical (the ranks 1 through 20 add up to 210, so about 105 each). Since they are not identical, are they different enough to reject the hypothesis of equality?

The Wilcoxon or W-statistic is the lesser of the two rank sums.
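
If you would like to check your hand-ranking, the steps above can be sketched in Python. This is only a sketch: the weights are invented for illustration (use your own 20 measurements), the function names average_ranks and rank_sums are our own, and giving tied weights the average of the ranks they span is an assumption, since the worksheet does not say how to handle ties. A handy check on your arithmetic: the ranks 1 through 20 add up to 210, so your two rank sums should too.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sums(dry, wet):
    """Rank the pooled weights; return (dry rank sum, wet rank sum, W)."""
    ranks = average_ranks(list(dry) + list(wet))
    dry_sum = sum(ranks[:len(dry)])
    wet_sum = sum(ranks[len(dry):])
    return dry_sum, wet_sum, min(dry_sum, wet_sum)


# Hypothetical bean weights in grams, invented for illustration only.
dry = [0.31, 0.33, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43, 0.44, 0.46]
wet = [0.42, 0.45, 0.47, 0.48, 0.50, 0.52, 0.53, 0.55, 0.57, 0.60]

dry_sum, wet_sum, w = rank_sums(dry, wet)
print("Rank sum, dry:", dry_sum)
print("Rank sum, wet:", wet_sum)
print("W-statistic:  ", w)
```

With these made-up weights the dry rank sum is 59 and the wet rank sum is 151, so W would be 59.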

This statistic is compared with a table value for 0.05 error found below. Use the two sample sizes (n1 and n2) for rows and columns to locate the table value.

What is the table value? ___________

Critical values of W

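Rows are n1; columns are n2. A dash (-) means there is no table value for that pair of sample sizes.

n1\n2   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25
    1   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   1   1   1   1   1   1   1
    2   -   -   -   -   3   3   3   4   4   4   4   5   5   6   6   6   6   7   7   7   8   8   8   9   9
    3   -   -   6   6   7   8   8   9  10  10  11  11  12  13  13  14  15  15  16  17  17  18  19  19  20
    4   -   -   -  11  12  13  14  15  16  17  18  19  20  21  22  24  25  26  27  28  29  30  31  32  33
    5   -   -   -   -  19  20  21  23  24  26  27  28  30  31  33  34  35  37  38  40  41  43  44  45  47
    6   -   -   -   -   -  28  29  31  33  35  37  38  40  42  44  46  47  49  51  53  55  57  58  60  62
    7   -   -   -   -   -   -  39  41  43  45  47  49  52  54  56  58  61  63  65  67  69  72  74  76  78
    8   -   -   -   -   -   -   -  51  54  56  59  62  64  67  69  72  75  77  80  83  85  88  90  93  96
    9   -   -   -   -   -   -   -   -  66  69  72  75  78  81  84  87  90  93  96  99 102 105 108 111 114
   10   -   -   -   -   -   -   -   -   -  82  86  89  92  96  99 103 106 110 113 117 120 123 127 130 134
   11   -   -   -   -   -   -   -   -   -   - 100 104 108 112 116 120 123 127 131 135 139 143 147 151 155
   12   -   -   -   -   -   -   -   -   -   -   - 120 125 129 133 138 142 146 150 155 159 163 168 172 176
   13   -   -   -   -   -   -   -   -   -   -   -   - 142 147 152 156 161 166 171 175 180 185 189 194 199
   14   -   -   -   -   -   -   -   -   -   -   -   -   - 166 171 176 182 187 192 197 202 207 212 218 223
   15   -   -   -   -   -   -   -   -   -   -   -   -   -   - 192 197 203 208 214 220 225 231 236 242 248
   16   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 219 225 231 237 243 249 255 261 267 273
   17   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 249 255 262 268 274 281 287 294  30
   18   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 280 287 294 301 307 314 321 328
   19   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 313 320 328 335 342 350 357
   20   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 348 356 364 371 379 387
   21   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 385 393 401 410 418
   22   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 424 432 441 450
   23   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 465 474 483
   24   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 507 517
   25   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   - 552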

Decision Rule:
We reject the hypothesis of equality if the W-statistic is equal to or less than the table value.
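
The decision rule can also be written out as a small lookup. The sketch below copies only four equal-sample-size entries from the table of critical values (it is not the whole table), and the names CRITICAL_W and wilcoxon_decision are hypothetical. The W values in the usage lines are invented.

```python
# A few entries copied from the table of critical values of W,
# keyed by (n1, n2). Add the entry for your own sample sizes if needed.
CRITICAL_W = {(5, 5): 19, (8, 8): 51, (10, 10): 82, (12, 12): 120}


def wilcoxon_decision(w_stat, n1, n2, table=CRITICAL_W):
    """Reject the hypothesis of equality when W <= the table value."""
    table_value = table[(n1, n2)]
    return "rejected" if w_stat <= table_value else "not rejected"


# Invented W-statistics for two samples of 10 beans each.
print(wilcoxon_decision(59, 10, 10))  # W at or below 82
print(wilcoxon_decision(95, 10, 10))  # W above 82
```

Note that a W-statistic exactly equal to the table value still leads to rejection, because the rule says "equal to or less than."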

Decision:
Based on the Wilcoxon Rank-Sum test, the hypothesis:

"Soaking causes beans to expand" is:   rejected     not rejected

Our hypothesis used the term "expand" and our prediction used the term "larger." In our experiment we tested the weight of the soaked beans.

What weight adjective would describe the soaked beans? __________________________

Do I Always Need a Statistical Test?

Observation: Our soaked beans sure do seem larger than the dry beans, but how can we measure the volume of an oddly shaped living bean?

Question: Does soaking beans cause them to expand?

Hypothesis: Soaking does not cause beans to expand. [note alternate!]

Prediction: If soaking does not cause beans to expand, beans which are soaked will not be significantly larger than dry beans.

Experiment: Measure the volume of bean seeds by displacement of water in a graduated cylinder. Calculate the volume per bean by dividing the total volume of beans added by the number of beans added.

                                Soaked Beans       Dry Beans
Final Liquid Level              ______ mL          ______ mL
Starting Level                  14 mL              14 mL
Total Volume of Beans Added     ______ mL          ______ mL
Number of Beans Added           ______ beans       ______ beans
Volume per Bean                 ______ mL/bean     ______ mL/bean
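
The per-bean calculation in the table is a subtraction followed by a division. A minimal sketch, assuming a hypothetical helper volume_per_bean and invented cylinder readings (only the 14 mL starting level comes from this worksheet):

```python
def volume_per_bean(final_ml, start_ml, n_beans):
    """Displaced volume (final level minus starting level) per bean."""
    if n_beans <= 0:
        raise ValueError("need at least one bean")
    total_volume_ml = final_ml - start_ml
    return total_volume_ml / n_beans


# Invented readings: 10 beans of each kind added to 14 mL of water.
soaked = volume_per_bean(final_ml=20.0, start_ml=14.0, n_beans=10)
dry = volume_per_bean(final_ml=17.0, start_ml=14.0, n_beans=10)
print("Soaked:", soaked, "mL/bean")
print("Dry:   ", dry, "mL/bean")
```

Notice that each group yields a single number here, which is worth keeping in mind for the Analysis questions that follow.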

 

The group of dry beans receiving no treatment is the ________________ group.

The group of soaked beans is called the ________________ group.

Analysis:
Examining the volume per bean, there is a striking difference.

Can we perform a T-test or a Wilcoxon test on these data?     Yes       No

If No, why not?__________________________________________________

If we wanted to redo our volume measurements, how could we do them so that we could use a statistical test for our analysis?

______________________________________________________________

We will not make any further measurements, but perhaps we may satisfy our need for significance by recalling that scientists find 5% error acceptable.

Calculate the ratio of the volume per soaked bean to the volume per dry bean. _________

The soaked beans occupy ________% of the volume of the dry beans.

Is there at least a 5% difference between the beans?     Yes       No
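
The ratio and percentage can be sketched in a few lines of Python. The per-bean volumes below are invented, the helper name compare_volumes is hypothetical, and reading "at least a 5% difference" as "the percentage is at least 5 points away from 100%" is our assumption about what the question intends.

```python
def compare_volumes(soaked_ml_per_bean, dry_ml_per_bean):
    """Return (ratio, percent of dry volume, True if at least 5% different)."""
    ratio = soaked_ml_per_bean / dry_ml_per_bean
    percent = 100 * ratio
    # Assumption: "5% difference" means 5 or more points away from 100%.
    differs = abs(percent - 100) >= 5
    return ratio, percent, differs


# Invented per-bean volumes in mL/bean.
ratio, percent, differs = compare_volumes(0.6, 0.3)
print("Ratio:", ratio)
print("Soaked beans occupy", percent, "% of the volume of the dry beans")
print("At least a 5% difference?", "Yes" if differs else "No")
```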

Decision:
Based on a displacement test, the hypothesis:

"Soaking does not cause beans to expand" is:   rejected     not rejected

Why did we choose to rewrite our hypothesis this time to its alternate "no effect" form?

___________________________________________________________

___________________________________________________________

By having our hypotheses rejected, are we poor scientists?   Yes     No

Why did we not have the option to "prove" any of our hypotheses?________________________

What If A Project Has More Than One Outcome?

Sometimes a project has more than one possible outcome. Plant breeding and genetic transformation studies have this feature. We don't have time for either of these, so we'll resort to coin tosses and dice throws to simulate meaningful physiology studies for now. We will use a Chi-squared test to test models that might have more than one outcome.

I'm sure you have seen a coin toss at the beginning of a sporting event. A coin flips over and over in the air and lands on one side. One side is called "heads" (because that side has a face on it) and the other side is called "tails" (because that side is opposite the face). You might guess that the coin is equally likely to land "heads" up as "heads" down. In other words, if you tossed the coin 20 times you would expect "heads" to come up about 10 of those times. That should be true of a "balanced" coin. Of course it might be possible to make an unbalanced coin that would allow some cheating. An unbalanced coin would tend to come up "heads" well over 10 times out of 20 tosses (or perhaps well under 10 out of 20).

You are presented with two coins. The idea is to test whether they are balanced.

Question:____________________________________________________________________

Hypothesis:__________________________________________________________________

Prediction:___________________________________________________________________

Experiment:
Put the outcomes of 20 tosses for each coin in the separate charts below. Mark H for "heads" and T for "tails".
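Single coin               Hybrid coin
__  __  __  __  __        __  __  __  __  __
__  __  __  __  __        __  __  __  __  __
__  __  __  __  __        __  __  __  __  __
__  __  __  __  __        __  __  __  __  __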
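Analysis: Perform X² tests in the table:
                                        Single Coin          Hybrid Coin
                                        Heads     Tails      Heads     Tails
Observed                                _____     _____      _____     _____
Expected                                _____     _____      _____     _____
Observed - Expected                     _____     _____      _____     _____
(Observed - Expected)²                  _____     _____      _____     _____
(Observed - Expected)² / Expected       _____     _____      _____     _____
X² Stat = Sum of
  (Observed - Expected)² / Expected     _______________      _______________
Degrees of Freedom = n - 1
  (n = number of possible outcomes)     _______________      _______________
X² Table Value                          _______________      _______________
Decision                                _______________      _______________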

Decision Rule: If X² Stat is greater than or equal to X² Table Value, then reject model.
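
For readers who want to check their table arithmetic, here is a minimal Python sketch of the Chi-squared calculation for one coin. The observed counts are invented, the function name chi_squared_stat is our own, and the table value 3.841 is the 0.05 entry for 1 degree of freedom from the Chi-square table at the end of this worksheet.

```python
def chi_squared_stat(observed, expected):
    """Sum of (Observed - Expected)^2 / Expected over all outcomes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


TABLE_VALUE_DF1 = 3.841  # 0.05 column, 1 degree of freedom

observed = [14, 6]    # invented counts: 14 heads, 6 tails in 20 tosses
expected = [10, 10]   # a balanced coin: half heads, half tails

stat = chi_squared_stat(observed, expected)
degrees_of_freedom = len(observed) - 1  # two outcomes, so n - 1 = 1

if stat >= TABLE_VALUE_DF1:
    decision = "reject model"
else:
    decision = "do not reject model"

print("Chi-squared stat:", stat)
print("Degrees of freedom:", degrees_of_freedom)
print("Decision:", decision)
```

With these made-up counts the statistic is 16/10 + 16/10 = 3.2, which is below 3.841, so the balanced-coin model would not be rejected.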

Conclusion:___________________________________________________________

What If I Have More Than Two Outcomes?

You are presented with two dice. Try to decide whether each one is "fair" or "loaded."

Question:____________________________________________________________

Hypothesis:__________________________________________________________

Prediction:___________________________________________________________

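Experiment: Record the outcomes of 24 tosses of each die in the charts below. Mark outcomes as 1, 2, 3, 4, 5, or 6.
Black die with White spots        Black die with Silver spots
__  __  __  __  __  __            __  __  __  __  __  __
__  __  __  __  __  __            __  __  __  __  __  __
__  __  __  __  __  __            __  __  __  __  __  __
__  __  __  __  __  __            __  __  __  __  __  __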

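Analysis: Perform X² tests in the tables below:
Black die with White spots:            1      2      3      4      5      6
Observed                             ____   ____   ____   ____   ____   ____
Expected                             ____   ____   ____   ____   ____   ____
Observed - Expected                  ____   ____   ____   ____   ____   ____
(Observed - Expected)²               ____   ____   ____   ____   ____   ____
(Observed - Expected)² / Expected    ____   ____   ____   ____   ____   ____
X² Stat = Sum of
  (Observed - Expected)² / Expected  ________
Degrees of Freedom = n - 1
  (n = number of possible outcomes)  ________
X² Table Value                       ________
Decision                             ________

Black die with Silver spots:           1      2      3      4      5      6
Observed                             ____   ____   ____   ____   ____   ____
Expected                             ____   ____   ____   ____   ____   ____
Observed - Expected                  ____   ____   ____   ____   ____   ____
(Observed - Expected)²               ____   ____   ____   ____   ____   ____
(Observed - Expected)² / Expected    ____   ____   ____   ____   ____   ____
X² Stat = Sum of
  (Observed - Expected)² / Expected  ________
Degrees of Freedom = n - 1
  (n = number of possible outcomes)  ________
X² Table Value                       ________
Decision                             ________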
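
With six outcomes the bookkeeping gets longer, so it can help to tally the raw tosses by machine. The sketch below is one way to do it, not the required method: the 24 tosses are invented (a die that shows far too many sixes), the function name die_chi_squared is hypothetical, and the table value 11.070 is the 0.05 entry for 5 degrees of freedom from the Chi-square table at the end of this worksheet.

```python
from collections import Counter

TABLE_VALUE_DF5 = 11.070  # 0.05 column, 5 degrees of freedom


def die_chi_squared(tosses, faces=(1, 2, 3, 4, 5, 6)):
    """Tally tosses, then return (observed counts, Chi-squared stat, df)."""
    counts = Counter(tosses)
    observed = [counts[face] for face in faces]
    expected = len(tosses) / len(faces)  # a fair die: equal share per face
    stat = sum((o - expected) ** 2 / expected for o in observed)
    return observed, stat, len(faces) - 1


# Invented outcomes of 24 tosses: fourteen sixes and two of everything else.
tosses = [6] * 14 + [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]

observed, stat, df = die_chi_squared(tosses)
decision = "reject model" if stat >= TABLE_VALUE_DF5 else "do not reject model"

print("Observed counts for faces 1-6:", observed)
print("Chi-squared stat:", stat, "with", df, "degrees of freedom")
print("Decision:", decision)
```

For a fair die tossed 24 times, each face is expected 4 times. With these made-up tosses the statistic works out to 30, well above 11.070, so the "fair die" model would be rejected.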

 

  Conclusion:__________________________________________________________

There is more to statistics in plant physiology than just comparison of means (t-test and Wilcoxon) and discrete ratios of outcomes (Chi-squared), but this is a start. Later in the semester we will do a regression test, which is an important one for dose-dependent responses, for example.

Critical Values of the Chi-Square Distribution
Degrees of
Freedom      α = 0.05    α = 0.001
    1          3.841      10.827
    2          5.991      13.815
    3          7.815      16.266
    4          9.488      18.466
    5         11.070      20.515
    6         12.592      22.457
    7         14.067      24.321
    8         15.507      26.124
    9         16.919      27.877
   10         18.307      29.588

When is a Result Meaningful?

Statistically significant simply means p < α: the probability is less than a pre-selected critical value. Please note that this does NOT mean the result is important or interesting (scientifically significant).

Are There Different Levels of Significance?

A test with p=0.001 is not more significant than a test with p=0.02 when your critical value is α = 0.05; both are statistically significant. Many plant physiology articles will show numbers in tables with superscript symbols as found in the key below:
symbol    p         meaning
ns        >0.05     not significant
*         <0.05     significant
**        <0.01     very significant
***       <0.001    extremely significant
Such interpretations are deemed erroneous by statisticians. Here is why:
Type I error: rejecting a true null hypothesis (convicting the innocent)
Type II error: not rejecting a false null hypothesis (failing to convict the guilty)
If you set α (reasonable doubt) to a very low value, your test makes very few Type I errors but makes many Type II errors. If you set α to a very high value, your test makes many Type I errors but very few Type II errors. Obviously some compromise value is needed; convention selects α = 0.05. Conclusion: it is OK to show p values to your audience, but don't use the interpretations under the word "meaning" in the table above.
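
The trade-off can be made concrete with the coin exercise from earlier in this worksheet. The sketch below is our own illustration, not part of the original exercise: it assumes 20 tosses, an invented unbalanced coin that lands heads 75% of the time, and a hypothetical helper rejection_probability. The two table values are the 1-degree-of-freedom entries from the Chi-square table above. It adds up exact binomial probabilities rather than simulating tosses.

```python
from math import comb


def rejection_probability(p_heads, table_value, n_tosses=20):
    """Chance that the Chi-squared test rejects the 'balanced coin' model
    when the coin really lands heads with probability p_heads."""
    expected = n_tosses / 2
    total = 0.0
    for heads in range(n_tosses + 1):
        tails = n_tosses - heads
        stat = ((heads - expected) ** 2 / expected
                + (tails - expected) ** 2 / expected)
        if stat >= table_value:
            # Binomial probability of seeing exactly this many heads.
            total += comb(n_tosses, heads) * p_heads ** heads * (1 - p_heads) ** tails
    return total


for alpha, table_value in [(0.05, 3.841), (0.001, 10.827)]:
    # Type I: a truly balanced coin gets rejected.
    type_1 = rejection_probability(0.5, table_value)
    # Type II: the invented 75%-heads coin is NOT rejected.
    type_2 = 1 - rejection_probability(0.75, table_value)
    print("critical value", alpha,
          "Type I rate:", round(type_1, 4),
          "Type II rate:", round(type_2, 4))
```

Running it, you should see the Type I error rate fall and the Type II error rate rise as the critical value drops from 0.05 to 0.001.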

There are situations when you want to minimize one of the two types of errors and thus would choose a critical value other than 0.05...

You are screening possible new pesticides to control a fungal pathogen of corn plants. The screening tests are rather inexpensive and fast, so you don't care how many Type I errors you make (pesticide is ineffective but you will keep testing it). What you really want to avoid is a Type II error (pesticide is effective but you stop testing it). You want to use a critical value of 0.2 or 0.1. (In final testing of a pesticide for an "incurable" disease, you would need to go the other direction!!)

What do I hand in?

You should prepare an abstract of your activities in this first real exercise. Use the guidelines supplied in the Laboratory Introduction handout (or those on the WWW) to prepare your abstract. This worksheet (completed) will serve as your "amplification." Both are due one week from the completion of the data collection (likely one week from today).

