Spiders or Butterflies? Despite Student Preference, Gender-Biased Lesson Models Do Not Impact Interest, Attitude, and Learning in Biology

Amy N. Buxton
Department of Biology, BYU
Master of Science

Educational research often emphasizes the prevalent gender gap between males and females in science, technology, engineering, and mathematics (STEM) fields. While many studies have found a gender bias when it comes to specific areas of science, little has been done to analyze the effects of how we teach within each of these subjects. In our study, we took a new angle on gender research by specifically considering whether there is a gender gap in how the models (the specific lesson examples/content used to teach a broader biology topic) used to teach biology affect student interest, attitude, and learning. We first created and distributed a survey to kindergarten through sixth grade students to see whether a gender bias concerning lesson models exists, when that gap is most prevalent, and which models exhibit the bias. Based on the findings of that survey, we then created four sets of parallel lesson plans teaching broad topics using juxtaposing lesson models, one of male interest and one of female interest. We designed instruments to measure whether lesson model or presenter gender impacted student interest, attitude, and learning. Our findings show that students do indeed indicate a preference to learn using certain lesson models, but that the lesson model and presenter gender do not impact student interest, attitude, or learning during an active learning biology presentation.


Trends in International Mathematics and Science Study (TIMSS) (Provasnik, 2012).
This pattern does not hold true in all countries. Researchers found that fourth through eighth grade female students in Turkey consistently had higher science success than male students, the difference becoming statistically significant as grade level increased (Bursal, 2013).
Another study suggested that females in Turkey had more positive attitudes towards science than males (Mıhladız, Duran, & Dogan, 2011). In fact, international data from the OECD Programme for International Student Assessment (PISA, 2009) indicate that across all OECD countries, there is no measurable gender gap in science. However, in America, the gender gap is present and is the largest of all countries tested. This suggests that there are no innate differences in ability that would affect females' ability to excel in STEM. Spelke (2005) further confirms that men and women share the cognitive capacities that allow for math and science reasoning. Researchers suggest that among the most prominent factors promoting the apparent gap are the academic environment and perceived gender stereotypes (Hall & Sandler, 1982; Robelen, 2012). Because of the alarming trends indicated by these data, our study further considered the factors leading to the gender bias and what can be done to minimize it.

GENDER BIAS AND AGE
While much of the research on gender stereotypes and gaps within education focuses on high school or college level education, stereotypes begin at a young age (Cvencek, Meltzoff, & Greenwald, 2011; del Rio & Strasser, 2013; Farenga & Joyce, 1999). By 36 months of age most children start to understand that some activities or belongings are more frequently associated with men or with women (Weinraub et al., 1984). Halim and Ruble (2010) explain that a child's belief concerning how each gender should act peaks around five years of age, and becomes more flexible after that time (p. 502). Research also examines specifically how gender bias applies to education. For example, del Rio and Strasser (2013) found that children form stereotypes concerning academic achievement as early as age five. In their research, children expressed the belief that a girl would find math harder, do worse in it, and like it less, compared to language.
Furthermore, Cvencek et al. (2011) suggest that, even by second grade, before students show a gender difference in mathematics performance, they have demonstrated the stereotype that "math is for boys." In addition, Farenga and Joyce (1999) found that both genders of fourth through sixth grade students thought science was more appropriate for males. Their results imply that student belief of whether science is an appropriate field begins before the age of nine (Farenga & Joyce, 1999).
Based on the findings of these studies, we believe it is important to consider gender stereotypes in biology interests starting as young as kindergarten age. While one quantitative study demonstrated no significant gender gap in science interests in first through third graders, it found that the gender gap increased 20-fold by the tenth to twelfth grade level (Baram-Tsabari & Yarden, 2011). Insofar as we can find, it is unclear when exactly the gender gap in biology begins. As such, our research examined when this bias is evident during elementary school.

GENDER BIAS AND PRESENTER GENDER
Our research further tested the importance of male versus female presenters teaching biology in order to consider whether utilizing presenters of a particular gender could be a way of fighting the implicit stereotyping referenced by Nosek et al. (2009). Moè and Pazzaglia (2006) studied male and female performance on the Mental Rotation Test (MRT) when participants were told that their gender was better or worse at the test. When participants were told their gender was better, their performance improved; when told it was worse, their scores decreased. Neuburger, Jansen, Heil, and Quaiser-Pohl (2012) did a similar study with fourth graders taking mental-rotation tests. Although boys did not actually improve when told that boys were better, when both genders were told that girls were better or that there was no gender difference, girls improved and boys did worse. If males and females often perform better when they believe their gender has an aptitude for the task at hand, it seems likely that the gender of the presenter could affect their performance. For example, if female students see a woman leading their biology presentation, they may be likely to identify women as having an aptitude for biology, and thus perform better themselves. The same pattern could hold true for male students with a male presenter. Lockwood (2006) found that same-gender examples in a woman's field of study affect her self-perceptions more than opposite-gender examples, and that women seem to particularly benefit from female role models. Such findings suggest that presenter gender is an important variable to consider in our study. We therefore brought both male and female presenters into the elementary school classroom to test whether presenter gender impacts gender bias.

GENDER BIAS AND DEMOGRAPHIC FACTORS
In reviewing the literature, Witt (1997) suggests that the family is the setting that most influences gender role development, as parents pass on their gender beliefs. In their meta-analysis, Lytton and Romney (1991) found a significant effect in North American studies of parents encouraging sex-typed activities. But how do demographic factors apply to biology education? One study, focusing on Greek secondary school students, found that gender did not appear to influence students' views of biology overall, but that more educated parents correlated with higher intrinsic motivation for biology students (Mavrikaki, Koumparou, Kyriakoudi, Papacharalampous, & Trimandili, 2012). Specifically, children of illiterate parents tended to have low intrinsic motivation in biology, while children of parents with a university education tended to have higher intrinsic motivation. Interestingly, the study also found that parent occupation did not influence a student's views of biology (Mavrikaki et al., 2012). Another study in Turkey found that student science attitudes did not correlate directly with the school's socioeconomic status (Mıhladız et al., 2011). On the other hand, research suggests that household size and parent income and education do indeed contribute to the learning environment of a child (Klebanov, 1994). Indeed, Davis-Kean (2005), in trying to discover how parent education and other factors might indirectly impact children's academic achievement, found that parental education shapes the way parents structure their home and the parent-child interactions that encourage high academic performance. Although the way in which this pattern is carried out differs between various racial groups, parents' academic expectations and parenting behaviors provide an indirect link between their level of education and their children's success in academics (Davis-Kean, 2005).
Various studies have considered such demographic factors in depth, and, likely, more such research is justified. While these demographic factors are not the focus of our research, we acknowledge that they appear to play a major role in academia and may very well differentially impact male and female students. As such, we included mother's education, father's education, income level, and number of children in the family as covariates in our analyses.

GENDER BIAS AND THE MODELS USED TO TEACH BIOLOGY
Many studies have found a gender bias when it comes to specific areas of science, e.g., chemistry, physics, biology, or engineering (Baram-Tsabari & Yarden, 2008; Barmby & Defty, 2006; Farenga & Joyce, 1999; Jones, Howe, & Rua, 2000; Stark & Gray, 1999). However, little has been done to analyze the effects of how we teach within each of these subjects. Each subject area (e.g., physics and biology) is taught to both males and females, and it is therefore appropriate to further consider how we can teach these subjects so that both genders have the greatest opportunity possible of understanding and appreciating them. In our study, we focused not on which subject or topic each gender prefers, but rather on the models used to teach each topic, specifically within the subject area of biology. We define "model" as the specific lesson examples/content used to teach a broader biology topic, such as male peacocks showing off their feathers as a model to teach the topic of evolution.
Since we consider biology to be an important subject, we want to help students of both genders excel in biology. By choosing appropriate models for each topic within biology, teachers may have a positive impact on male and female student interest, attitude, and learning, so that neither gender is neglected. While a problem with overusing male-based models in education has been observed (Riddell, 1989), our research represents the first systematic study of male and female preference in models for biology education. Our study adds to gender-bias research by considering whether certain models within various subtopics of biology are preferable to one gender over the other.
It is important for both male and female students to have good experiences in all subject areas so that they can make informed decisions about their interests and what they would prefer to continue studying. One study in Thailand found that students' choice of occupation depends in part on the value they give to a subject (Koul, Lerdpornkulrat, & Chantara, 2011). We believe that choosing appropriate models for both genders will help students assign more value to biology, thus heightening their career aspirations within the subject. In addition, Nosek et al. (2009) suggest that attempts to boost women's participation and achievement in mathematics and science need to overcome the implicit stereotypes in people's minds. Our research will help determine whether picking appropriate models is an important step toward alleviating that stereotype, as it seems quite possible that we are currently driving students away from certain subjects and thereby supporting gender stereotypes by the way we teach. Specifically, we may be unconsciously driving females from the sciences by the choices we make in the models we choose to present. One study found that women in a classroom containing objects stereotypical of computer science (such as a Star Trek poster, video game boxes, and comics) were less interested in the subject than males, while women in a nonstereotypical classroom showed a similar level of interest to males (Cheryan, Plaut, Davies, & Steele, 2009). If something as simple as stereotypical classroom objects can perpetuate gender stereotypes, it seems likely that the models we use may also influence student interest.
In summary, we took a new angle on gender research by specifically considering whether there is a gender gap in how the models used to teach biology affect student interest, attitude, and learning. We are not so much concerned whether one gender prefers to study a particular topic, such as evolution or ecology, more than the other, as both are significant realms within biology. Instead, we examined whether the specific models used to teach a topic could introduce gender bias.
In order to explore the effects of lesson models on learning within each gender, we first developed an instrument to measure if, when, and what lesson models exhibit gender bias in biology. We next used presentations to examine more explicitly how both gender-biased models and presenter gender affect male and female interest, attitude, and learning within biology. For example, would a female be more interested and consequently gain more knowledge if we looked at diversity using butterflies rather than spiders? Could we perhaps influence a male's perception of form and function if we considered sharks over dolphins? We hypothesized that in many cases the models used would determine the interest of both males and females in the subject matter. We predicted that when the gender-preferred model was used, students of that gender would tend to be more interested and therefore would learn more from the presentation and have more biology-related career aspirations. Our findings showed that students indeed indicated a preference to use gender-matched lesson models in learning about biology. However, in contrast to our prediction, lesson model and presenter gender were not found to impact student interest, attitude, or learning. These findings have several important implications for educators and future research.

METHODS, PART I
PURPOSE
Part I of our study had three main purposes: first, to determine whether elementary school students demonstrate a gender bias in their interest in biology teaching models; second, to determine at which grade level this gender bias is most prevalent; and third, to determine which biology teaching models demonstrate this gender bias.

SURVEY INSTRUMENT
To accomplish these purposes, we created a 26-question survey that we distributed to elementary school students, kindergarten through sixth grade. The first two questions asked for the students' gender and grade level. The next 24 questions focused on lesson models that could be used to teach certain topics in biology. In forming the survey, we first selected eight broad topics covered in most beginning biology courses (science as a process, evolution, animal behavior, ecology, relationship of structure to function, diversity, cells and genetics, and science and society). Next, we created three pairs of juxtaposed lesson models for each topic, each pair consisting of one model that is traditionally of stereotypical male interest and one model that is traditionally of stereotypical female interest. For example, for one of the three lesson model pairs concerning diversity, we asked students whether they would rather learn about the difference between bugs using spiders (male) or butterflies (female), thus testing whether the model (spider versus butterfly) made a difference. Other examples include teaching evolution using animal skulls (male) versus flowers (female), teaching the relationship of structure to function using sharks (male) versus dolphins (female), or teaching ecology using lice (male) versus birds (female). We created a printed color hard copy of the survey in which we depicted the two images (e.g., a shark and a dolphin) side by side. We attempted to find photographs for the model pairs that were similar enough in style and visual appeal so as not to create an extra bias.
See Appendix A for the completed survey.
We developed simplified questions to ask students which lesson model they would prefer studying to learn about a certain topic. For example, the question associated with the spiders and butterflies was, "If we are going to learn about differences between bugs, which would you rather use: spiders or butterflies?" The question therefore effectively asked, "If we are going to learn about a certain topic (diversity), which of the following lesson models (spiders or butterflies) would you rather use?" We also prepared a PowerPoint presentation mirroring the printed survey: each slide portrayed a different pair of lesson models as pictured in the survey. As long as the classroom situation allowed, in addition to handing out the hard copy surveys to each student, we also displayed larger pictures on the PowerPoint slide. Students were instructed to answer the question as the researcher read it, rather than moving ahead. They were asked to circle their preference on their copy of the survey after hearing each question.

SURVEY ADMINISTRATION
Having obtained IRB approval and distributed and collected signed parent permission forms, we administered the survey to kindergarten through sixth grade students at Hobble Creek Elementary School and Mapleton Elementary School in the Nebo School District in Utah. We surveyed students in each grade as follows: 33 kindergarten students (due to a misunderstanding and earlier summer release date, participation was low), 58 first grade students, 58 second grade students, 99 third grade students, 101 fourth grade students, 82 fifth grade students, and 88 sixth grade students (a total of 271 girls and 248 boys). The number of participants in each grade was determined by the willingness and availability of teachers to participate, as well as student parents' decisions to sign and return the parent permission form.

DATA ANALYSIS
To analyze our data, we first assigned "0" to female models and "1" to male models and calculated an average response for each student. We then averaged the frequencies across males and across females in all grade levels. We found that the data violated the assumption of normality (Shapiro-Wilk, p < .001), so we ran a non-parametric independent samples test. To determine at which grade level gender bias was most prevalent, we looked at the average male and female responses and ran further non-parametric independent samples tests for each grade level.
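The coding and comparison steps described above can be illustrated with a minimal sketch. It assumes the non-parametric independent samples test is a Mann-Whitney U test, which the text does not name; the function names and the response lists are hypothetical and are not data from the study.

```python
from statistics import mean


def student_score(answers):
    """Mean of 0/1 codes: the proportion of male models a student chose."""
    return mean(answers)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U_a, U_b) for two independent samples (assumed test)."""
    ranks = average_ranks(list(group_a) + list(group_b))
    n_a, n_b = len(group_a), len(group_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return u_a, u_b


# Made-up answers (0 = female model chosen, 1 = male model chosen).
boys = [student_score(a) for a in ([1, 1, 0, 1], [1, 1, 1, 1], [1, 0, 1, 0])]
girls = [student_score(a) for a in ([0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0])]
u_boys, u_girls = mann_whitney_u(boys, girls)
```

A U statistic near zero for one group means that group's scores almost always rank below the other's. Obtaining a p-value would require a statistics package or exact tables, which this sketch leaves out.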
Finally, to determine which models showed a significant gender bias, we ran a logistic regression with the independent variables being gender and grade, and the dependent variable being whether or not students chose the male model. Statistical significance was set at .05. To select which model pairs showed the greatest gender bias, we considered the classification tables for each question. We found the five lesson model pairs that showed the greatest impact due to gender by taking the overall benefit (i.e., the model with both gender and grade minus the null model) minus the impact due to grade (i.e., the model with grade minus the null model). To narrow it down to the four pairs needed for the second part of our study, we excluded one of those five pairs because the impact of gender alone without taking grade into account (i.e., the model with gender minus the null model) was much lower than that for each of the other pairs. In this way we narrowed our list of 24 statistically significant model pairs down to the 4 pairs showing the greatest gender bias.
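The selection arithmetic in this paragraph can be sketched as follows. The sketch assumes each "model" is summarized by the percentage of correct classifications in its classification table; the function name and all numbers are hypothetical and are not results from the study.

```python
def gender_impact(acc_null, acc_grade, acc_full):
    """Impact due to gender, following the subtraction described in the text.

    acc_null  : classification accuracy (%) of the null model
    acc_grade : accuracy (%) of the model with grade only
    acc_full  : accuracy (%) of the model with both gender and grade
    """
    overall_benefit = acc_full - acc_null  # model with gender and grade minus null
    grade_impact = acc_grade - acc_null    # model with grade minus null
    return overall_benefit - grade_impact


# Hypothetical accuracies for two model pairs: (null, grade only, gender + grade).
pairs = {
    "pair A": (55.0, 57.0, 70.0),
    "pair B": (60.0, 61.0, 66.0),
}

# Rank the pairs from greatest to least impact due to gender.
ranked = sorted(pairs, key=lambda name: gender_impact(*pairs[name]), reverse=True)
```

Because the null-model accuracy cancels in the subtraction, the result equals the accuracy of the full model minus the accuracy of the grade-only model.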

IMPLICATIONS OF PART I
The data revealed three important patterns, which will be further considered in the Results and Discussion sections, but should be mentioned here in order to more fully explain the second part of our methodology. First, the data revealed that there is indeed a gender bias, justifying our continuation of the research. Second, by showing when gender stereotypes for biology models are most prevalent, our data helped us determine that third grade would be the most appropriate focus for Part II of our research. Finally, the data revealed which lesson model pairs present the most prevalent gender bias. Based on our findings, we determined that the most appropriate lesson model pairs (and topics) to pursue in Part II of our study were: ladybugs versus termites (science as a process), flamingos versus eagles (relationship of structure to function), butterflies versus spiders (diversity), and flowers versus animal skulls (evolution).

METHODS, PART II
PURPOSE
The purpose of Part II of our research was to see whether male and female student interest (as indicated in our survey's results) translates into interest, attitude, and learning in the classroom. In addition, Part II takes another variable into account, considering whether the gender of the presenter affects male and female student interest, attitude, and learning in the classroom.

PRESENTATIONS
To accomplish these purposes, we created lesson plans for the eight lesson models indicated in the survey results from Part I to show the most bias. Each lesson pair was designed to teach the same biological topic using two opposing lesson models. For example, our first lesson was composed in order to teach "science as a process" (lesson topic). We made two parallel lesson plans, one using ladybugs as a model to teach science as a process, and the other using termites to teach science as a process. We thus used lesson models that showed opposing gender interest to teach the same topic.
In order to prepare the lesson plans, we wrote out learning objectives and outlined the entire lesson. We applied active learning principles to the lesson, focusing on student participation. We also focused on the learning cycle, allowing the students to "explore" the content before we "explained" the material (Bybee, 1993). In other words, the lesson plans were designed so that students would engage in the "exploration" stage before the "term application" stage of the learning cycle (Lawson, 2002). We attempted to make the lessons as parallel as possible, changing only the model itself insofar as we could manage. In this way, we hoped to isolate the lesson model as the variable of interest.
The other variable we introduced was presenter gender. Ultimately, our female presenter would teach science as a process using ladybugs in one classroom and termites in another classroom. Our male presenter would do the same, teaching science as a process using ladybugs in one classroom and termites in another classroom. Table 1 shows the presentation schedule.
The presenters met each week before the presentations to go over the lesson plans and to practice delivering the material. They each had a hard copy of the lesson plan they could review or reference as needed. We presented to third grade classes at the same two elementary schools where we had done the initial survey. The surveys took place in the spring of 2014, while the presentations took place in the fall of that same year, meaning that the students who initially participated in Part I had graduated to a new grade level by the time we conducted Part II. We presented first to Mapleton Elementary School and then to Hobble Creek Elementary School. This two-part setup allowed us to use the first set of presentations as a way to test our lesson plans and analyze our instruments as we gathered preliminary data. We were prepared to make changes between the first and second sets of presentations so that our lesson plans and instruments would be age-appropriate and accurate.

FIRST ROUND OF PRESENTATIONS
During the first set of presentations (Mapleton Elementary School), we were only able to present to two groups of third grade students (three classes that the teachers requested be split into two groups), rather than the four groups we had initially desired. However, we felt that this setup was still appropriate as a "trial run" for gathering preliminary data and determining any changes necessary. As such, during the first round of presentations, we combined classes to teach only two classes a week. A female presenter would teach using a certain lesson model in one class, while the male presenter would teach using the same lesson model in the other class.
During our first week, for example, the female presenter taught science as a process using ladybugs as the model in classroom one, while the male presenter taught science as a process using ladybugs as the model in classroom two. We found that incorporating one of each set of lesson plans allowed us to see how the students reacted to that type of lesson and to make changes as necessary in parallel lesson plans. After teaching science as a process with ladybugs, for example, we realized a few changes we should make in that lesson plan, making parallel changes in the lesson plan designed to teach science as a process with termites.

INSTRUMENTS
In order to measure student interest, attitude, and learning, we used three instruments.
First, we measured whether/how lesson models and presenter gender affected male and female interest during a presentation. In order to do this, we obtained wide-angle video cameras and tripods, and recorded the students during each presentation. Three researchers (the author and two undergraduate researchers) acted as observers, watching each video recording and filling out an observation sheet for each presentation. The observers were asked to record how many students of each gender made comments (or raised their hands to make comments, even if they were not called upon) or asked questions (or raised their hands to ask questions, even if they were not called upon). They also noted how many students of each gender were disengaged (stopping the recording every five minutes to count), and made additional note of any student body language of interest. We planned to use student participation, engagement, and body language as a way of implying student interest.
Next, to measure attitude, we included two questions measuring if and how student attitude toward biology and science changed during the course of the first lesson. The first of these questions (hereafter referred to as our "talent" question) asked, "Do you think you are good at biology?" asking students to rate themselves either "a. very," "b. sort of," "c. I don't know," "d. not really," or "e. not at all." The second attitude question (hereafter referred to as our "career" question) asked, "When you grow up, would you like to have a job doing science?" This time students answered either "a. yes," "b. probably," "c. maybe," "d. probably not," or "e. no."
Finally, to measure learning, we developed pre- and post-presentation questionnaires (hereafter referred to as pre- and post-questionnaires) for each presentation's content. Students would take the pre-questionnaire within a week before we arrived, and would then take the identical post-questionnaire after we left. The questionnaires were the same for the lesson models used to teach the same topic (i.e., students attending the presentations using ladybugs or termites would take the same questionnaires, as the questionnaires were designed to measure learning about the broad topic, science as a process, not the specific lesson models). In this way we could ultimately measure whether the lesson model itself made a difference in how much male and female students learned about the broader lesson topic.
Each questionnaire consisted of three to five content-based multiple-choice questions designed by the researchers to measure student learning based on our specified learning outcomes. We analyzed each student's change in score for each question (post-questionnaire minus pre-questionnaire).
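As a minimal sketch of the change score described above, the following assumes each multiple-choice answer is scored 1 if correct and 0 if incorrect; the text does not state the scoring scheme. The function name and the answer lists are hypothetical.

```python
def change_scores(pre, post):
    """Per-question change in score: post-questionnaire minus pre-questionnaire."""
    if len(pre) != len(post):
        raise ValueError("pre and post must cover the same questions")
    return [after - before for before, after in zip(pre, post)]


# One hypothetical student's scores on a four-question questionnaire.
pre = [0, 1, 0, 1]
post = [1, 1, 0, 0]
changes = change_scores(pre, post)
```

Under this scoring, a change of 1 marks a question answered correctly only after the presentation, 0 marks no change, and -1 marks a question answered correctly only before it.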

CHANGES AFTER THE FIRST ROUND
We analyzed the data from the first round of presentations to see what changes we needed to make to improve the second round. First, we made several changes in the lesson plans, attempting to make them even more concrete and more closely parallel between the two presenters and the two juxtaposing models in order to isolate the proper variables. For example, we made sure the parallel lesson plans (for the same topic using different models) had the same presenter questions, thereby attempting to provide students the same opportunities for participation. These questions were clearly underlined in the revised lesson plans, more directly indicating to the presenters when they were supposed to ask for student involvement. In this way, we improved the lesson plans so that the parallel lessons would be as similar as possible, changing only the lesson model or the presenter gender as desired. We also changed our evolution lesson plans, the initial versions of which seemed too difficult for a third grade audience. While we used the same topic and models in the second round of presentations, we simplified the material to make it more suited for the age group. See Appendix B for the finalized lesson plans.
Based on the first round of data, we also changed our questionnaire. When we analyzed these data, of the 16 content-based multiple-choice questions, 9 showed improvement, 6 showed a decline, and 1 showed no change. Among those that showed improvement, all changed by less than 20%. Based on these preliminary results, we determined that these multiple-choice questions were too difficult to discriminate among third graders. We therefore decided to rewrite the questionnaires for the second round of presentations. For the first part of the new questionnaires, we kept one content-based multiple-choice question. Specifically, we used the questions that showed the most improvement (between 10% and 20%) for questionnaires 1, 3, and 4. Each of these questionnaires had a question that clearly showed the most student improvement, which was therefore selected for continued use in the second round. Questionnaire 2, however, had one question that showed an approximately 8% increase in score (the highest for that questionnaire), but on closer inspection did not appropriately relate to the lesson material. Therefore, we used another question that initially showed neither improvement nor decline, but rewrote it in an attempt to model it more specifically after what we taught during the presentations. We added a second question to each questionnaire asking the students why they thought their answer to the multiple-choice question was correct, giving them more opportunity to share their thought process.
Finally, we added an entirely new part to the questionnaires in which we asked students to draw and label a picture showing a certain phenomenon. We based this part of our questionnaire on an instrument used by Jensen (2014). We hoped that by allowing students to draw and label their response, students would be able to more fully express their understanding. We did a pilot test of our new questionnaire format by asking a random sample of second, third, and fourth graders who would not be involved in the presentations to fill out the first questionnaire. Their answers indicated that the instrument was at an appropriate level for third graders and that students still had "room for improvement" and could potentially improve their responses after a presentation.
See Appendix E for the finalized questionnaires.
Finally, the first set of presentations revealed the need for several changes in relation to the observations. It was very difficult to tell on the video recordings how many male and female students were in the room and how many of each gender participated. Because of this, we bought two much taller, sturdier tripods and hired two undergraduate students to act as our "cameramen." The cameraman responsibilities included setting up and taking down the cameras for each presentation, making sure that all students were visible in the recording, and filling out a "cameraman information form" (noting how many male and female students were in the room and making a diagram of where students of each gender were seated; see Appendix C for the cameraman information form). Finally, we further trained the observers to ensure that all were recording data following the same protocol (see Appendix D for the observation sheet). These changes significantly increased our confidence in the accuracy of our observational data from the second round of presentations.

SECOND ROUND OF PRESENTATIONS
Our second round of presentations was presented to four third-grade classes at Hobble Creek Elementary School. In this round, we were able to follow our initial planned schedule more exactly (see Table 1). Presenters again met together before the presentations to go over the lesson plans and practice. The cameramen accompanied them for each presentation. The observers received access to the video recordings and the cameraman information forms.
Teachers administered the pre-questionnaires before we arrived and the post-questionnaires right after the presentation. Overall, the teachers seemed to do a good job of distributing and collecting the questionnaires. However, we did have two issues in one of the classrooms. After receiving all the pre- and post-questionnaires, we realized that one teacher had turned in only four completed copies of the pre-questionnaire for topic 3 from her class, and had not turned in any post-questionnaires for topic 4. We asked her class to redo the topic 4 post-questionnaire (which they completed and turned in), but there was no way to make up the missing pre-questionnaires, and we were only able to use the data we had received.

Interest
Two undergraduate researchers were trained to watch the presentation recordings and fill out the observation sheets. Each of them and the author independently completed the observations. Each observation sheet included space to record information about the class and presentation, the total number of female and male questions, the number of female and male student comments after each presenter prompt, the total number of female and male student comments, the number of females and males disengaged (recorded every 5 min and totaled after 25 min), and optionally any notes on female and male body language. By asking the observers to list each presenter prompt followed by the number of student comments, we were able to adjust our data as needed if the presenters did not ask the same number of questions (e.g., if a presenter accidentally asked an additional question). In this way we ensured that the presenter prompts were acceptably similar to those outlined in the lesson plans. Looking specifically at the total number of male and female student comments in the second round of presentations, we determined that any observation showing a difference of 25% or greater between that of the author and an undergraduate researcher would need to be reevaluated. Undergraduate researchers were retrained by the author and asked to re-grade these evaluations.
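The 25% re-evaluation rule can be sketched in a few lines. This is only an illustration: the function name and the example totals are hypothetical, and we assume the difference is measured relative to the author's total, which the text above does not specify.

```python
def needs_reevaluation(author_total, observer_total, threshold=0.25):
    """Return True when the two comment totals differ by `threshold` or more.

    Assumption: the difference is expressed relative to the author's total;
    the baseline is not stated in the text.
    """
    if author_total == 0:
        # With no author-recorded comments, any observer comment is a mismatch.
        return observer_total != 0
    return abs(observer_total - author_total) / author_total >= threshold


# Hypothetical totals for three presentations (not data from the study).
author_totals = [20, 12, 8]
observer_totals = [21, 16, 6]
flags = [needs_reevaluation(a, o) for a, o in zip(author_totals, observer_totals)]
```

Observations flagged `True` would be the ones returned to the undergraduate researcher for re-grading.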
We used the observation sheets to measure student interest. Out of the information collected by the observers, we decided that the total number of female/male student comments was the most accurate and objective measure of student interest. We first assessed inter-rater reliability using Pearson correlations. The Pearson correlation between the author and the first undergraduate researcher was .988 (p<.001) for female comments and .974 (p<.001) for male comments. The Pearson correlation between the author and the second undergraduate researcher was .981 (p<.001) for female comments and .939 (p<.001) for male comments. The Pearson correlation between the two undergraduate researchers was .973 (p<.001) for female comments and .933 (p<.001) for male comments. Because of the high inter-rater reliability, we averaged the total number of male/female comments recorded by each observer for each presentation and used these averages in our analysis. Next, because the classes did not all have an equal number of male and female students, we calculated the average comments per male student and comments per female student for each presentation by dividing the total number of male comments by the number of male students in the class, and the total number of female comments by the number of female students in the class. We used these measures in our analysis in order to account for unequal numbers of male and female students in the classrooms. Finally, we ran a series of analyses of variance (ANOVAs) looking first at female comments, then male comments, then the proportion of female/male comments in relation to model gender and presenter gender.
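As a minimal sketch of the two calculations described above (the correlation between observers and the per-student comment rate), the following uses only the Python standard library. The function names and numbers are hypothetical and are not the software or data used in the study.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two observers' comment totals."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def comments_per_student(observer_totals, n_students):
    """Average the observers' totals, then divide by the number of students
    of that gender in the class."""
    return mean(observer_totals) / n_students


# Hypothetical example: three observers counted 18, 20, and 22 female comments
# in a class with 10 female students.
rate = comments_per_student([18, 20, 22], 10)
```

Dividing by the number of students of each gender is what allows classes with unequal numbers of male and female students to be compared.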

Attitude
The two attitude questions ("talent" and "career" questions) were scored from 1-5, where 5 corresponded to choice "a" ("very" or "yes"), 4 corresponded to choice "b", etc., ending with 1 corresponding to choice "e" ("not at all" or "no"). We calculated talent change and career change for each student by taking their post-questionnaire score for the question and subtracting their pre-questionnaire score for the question. Any student who did not have both a pre- and post-questionnaire score for the question was not included in the analysis. We then ran an analysis of covariance (ANCOVA) for both talent change and career change. We had previously collected certain demographic information for each student to use as covariates to confirm whether our results were indeed based on gender. We decided to use mother's education, father's education, income level, and number of children in the family as covariates in this analysis, based on our literature search and on ease of using the information.
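The scoring and change calculation can be illustrated as follows. The mapping from choices to scores follows the description above; the function name and the sample responses are hypothetical.

```python
# Choice "a" scores 5, down to choice "e" scoring 1, as described in the text.
CHOICE_SCORES = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}


def attitude_change(pre_choice, post_choice):
    """Return post minus pre on the 1-5 scale, or None if either is missing."""
    if pre_choice not in CHOICE_SCORES or post_choice not in CHOICE_SCORES:
        return None
    return CHOICE_SCORES[post_choice] - CHOICE_SCORES[pre_choice]


# Hypothetical (pre, post) responses; None marks an unanswered question.
responses = [("c", "a"), ("b", "b"), (None, "a"), ("a", "d")]
changes = [attitude_change(pre, post) for pre, post in responses]
# Students without both scores are dropped from the analysis.
included = [c for c in changes if c is not None]
```

A positive change indicates a more favorable response after the presentation.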

Learning
Referring back to our learning outcomes and considering several student samples, we created a specific rubric for each of the four questionnaires. Each included the correct answer for question #1 (multiple-choice), minimum requirements for question #2 (short answer), and a detailed rubric for question #3 (drawing/labeling). The questionnaires were graded out of five or six points total. If a student neglected to answer question #1 or question #2, they were given 0 points for the question. However, if they did not answer question #3 (i.e., did not draw anything on the quiz or explain their reasoning), we counted that as missing data, as the student did not complete the questionnaire. Those students were not given a total score. However, any drawing, however small or unrelated it seemed, was counted as an attempt at answering question #3 and was therefore graded according to the rubric. See Appendix F for the grading rubrics.
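The missing-data rule above can be sketched as a small function. The point values here are hypothetical placeholders; the actual rubrics are in Appendix F.

```python
def total_score(q1, q2, q3):
    """Combine the three question scores into a total.

    A skipped question is passed as None. Skipping question #1 or #2 earns
    0 points for that question; skipping question #3 (no drawing at all)
    makes the whole questionnaire missing data, so no total is returned.
    """
    if q3 is None:
        return None
    return (q1 or 0) + (q2 or 0) + q3
```

Note that a drawing graded at 0 points still counts as an attempt, so `total_score(None, None, 0)` yields a total of 0 rather than missing data.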
We trained four undergraduate researchers in how to correctly grade the questionnaires. Two undergraduate researchers and the author were assigned to each of the four questionnaires, and were trained on using the appropriate rubrics. As part of the training, all graders graded a subset of questionnaires together to ensure that everyone understood the rubric and how to apply it to the questionnaire answers. Each researcher then assigned grades to the remaining questionnaires on his/her own. Finally, we met back as a group of three and discussed mismatched scores until we came to full agreement.
We considered both total questionnaire score (based on the multiple-choice question, free response question, and drawing) and multiple-choice score (based solely on the one multiple-choice question) when analyzing the data. We ran a paired samples t-test to determine whether there was overall significant improvement in the total score or multiple-choice score. We furthermore ran an ANCOVA on the total score (post-pre) for each questionnaire, looking at either student gender by model gender or student gender by presenter gender. We again used mother's education, father's education, income level, and number of children in the family as covariates.
Finally, we looked at the multiple-choice scores. We first made a "selection variable" for each questionnaire, in which we added the pre- and the post-multiple-choice scores for each student. Every student could therefore receive a 0, 1, or 2 depending on how he or she answered the multiple-choice question for the pre- and post-questionnaire. We decided that for our analysis, students who received a 2 for the selection variable for a given questionnaire would not be included, as they got the question correct both before and after the presentation, suggesting they may have known the answer before. However, any student with a 0 or a 1 was included.
From there, we could look at how many students got the question correct on the post-questionnaire, suggesting that they learned the material from the presentation. For each analysis, we isolated male or female students as the variable of interest. We collected the descriptive statistics and ran a logistic regression for each of the four questionnaires.
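The selection variable and the resulting filter can be sketched as below. The data are hypothetical, with 1 meaning a correct multiple-choice answer and 0 an incorrect one.

```python
def selection_variable(pre_correct, post_correct):
    """Sum of the pre and post multiple-choice scores: 0, 1, or 2."""
    return int(pre_correct) + int(post_correct)


# Hypothetical (pre, post) multiple-choice scores for four students.
students = [(0, 1), (1, 1), (0, 0), (1, 0)]

# Students scoring 2 answered correctly both times and are excluded.
included = [(pre, post) for pre, post in students
            if selection_variable(pre, post) < 2]

# Among included students, count those correct on the post-questionnaire.
correct_on_post = sum(post for _, post in included)
```

The included students' post-questionnaire outcome (correct or incorrect) is the dichotomous variable that a logistic regression would then model.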

PART I
To determine if there were differences in the average frequency of choosing male models between male and female students, we ran a Mann-Whitney U test. The median frequency was statistically significantly different between males and females, U=55,515.5, p<.001. In other words, on all the questions combined, males select the male model more often than females select the male model (mean for females=36.2% and mean for males=65.6%).
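For readers unfamiliar with the statistic, the following is a minimal sketch of how a Mann-Whitney U value is computed from two samples, using average ranks for ties. It is an illustration with hypothetical data, not the statistical software used in the study, and it does not compute the p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the U statistic for sample_a, using average ranks for ties."""
    pooled = sorted(list(sample_a) + list(sample_b))
    # Assign each distinct value the mean of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_sum_a = sum(rank_of[v] for v in sample_a)
    n_a = len(sample_a)
    return rank_sum_a - n_a * (n_a + 1) / 2


# Hypothetical per-student frequencies of choosing the male model.
males = [0.6, 0.7, 0.8]
females = [0.2, 0.3, 0.4]
u_males = mann_whitney_u(males, females)
```

With this convention, U counts the pairs in which a value from the first sample exceeds a value from the second, with tied pairs counting one half, which is why a U value can end in .5.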
We also ran a Mann-Whitney U test to determine if there were differences in the average frequency of choosing male models between males and females within each grade. In every grade there was a significant difference (see Table 2 and Figure 1) indicating that a gender bias is present as early as kindergarten and persists through sixth grade. The largest gender gap is in third grade, which we therefore selected for Part II of our study. To ascertain the effects of student gender on the likelihood that students selected the male model, we performed a logistic regression for each question. The logistic regression model was statistically significant for every question, as outlined in Table 3.  The four model pairs that showed the greatest gender bias, and which we chose for Part II of our study, were questions 3, 12, 15, and 16. These pairs were ladybugs versus termites (science as a process), flowers versus animal skulls (evolution), flamingos versus eagles (relationship of structure to function), and butterflies versus spiders (diversity), respectively. The classification values for the various statistical models for the four questions leading to this decision are shown in Table 4, while the percent of male and female students selecting the male model for each of the four questions is depicted in Figure 2.  Table 5 and Table 6.

Attitude
Our attitudinal data were taken from two questions ("talent" and "career") on the first questionnaire. We found that student attitude improved overall between the pre- and post-questionnaires, but that model and presenter gender did not differentially impact change in attitude for either of the questions. To test whether student attitude increased overall, we ran a paired t-test on our talent and career questions, both of which were significant in a 2-tailed test (talent p=.001; career p=.032, see Table 7).
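A paired t-test of this kind reduces to a one-sample test on the post-minus-pre differences. The sketch below computes only the t statistic and degrees of freedom with the standard library; the scores are hypothetical, and obtaining the p-value would require a t distribution (for example from a statistics package), which is omitted here.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(pre, post):
    """Return (t statistic, degrees of freedom) for paired scores."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Hypothetical 1-5 attitude scores for four students.
pre_scores = [3, 2, 4, 3]
post_scores = [4, 3, 4, 5]
t_stat, dof = paired_t(pre_scores, post_scores)
```

A positive t statistic here corresponds to higher scores after the presentation than before it.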

Learning
To measure learning, we considered both the total scores (including all three questions) and the multiple-choice scores alone from the questionnaires. As with our attitudinal data, we ran a paired samples t-test to see whether there was an overall improvement in total score or in multiple-choice score alone for each questionnaire. We found that there was no significant improvement in the total score for any of the four questionnaires (see Table 8). However, there was overall improvement in the multiple-choice score for three of the four questionnaires (again, see Table 8). We ran 2 x 2 ANCOVAs on student total score for each of the four questionnaires using student gender and model gender as factors or student gender and presenter gender as factors.
The results showed no overall pattern. For questionnaire 1, neither student gender by model gender nor student gender by presenter gender showed significance, although we saw a significant covariate of mother's education level (p=.019). For questionnaire 2, the model gender was suggestive (p=.056) for the student gender by model gender ANCOVA, suggesting that the male model may have been best for both male and female students. The student gender by presenter gender ANCOVA showed no significance for any of the variables. For questionnaire 3, the covariate of income level was significant for the student gender by model gender ANCOVA (p=.020) and for the student gender by presenter gender ANCOVA (p=.008). Finally, for questionnaire 4, the only variable of significance was the number of children in the family in the ANCOVA for both student gender by model gender (p=.014) and student gender by presenter gender (p=.016). While certain questionnaires demonstrate significance in one covariate or another, or are suggestive in one variable or another, the patterns are not consistent and do not seem to point to an overall trend.
We analyzed the multiple-choice scores only using logistic regression since the score was measured dichotomously (correct or incorrect). The descriptive statistic frequencies are recorded in . We also noted how many students showed improvement from the multiple-choice score only (see Figure 3).

SURVEY
Our survey findings suggest that a gender bias in lesson model preference can be seen as early as kindergarten and persists through at least sixth grade. These results are in line with other studies that suggest that gender stereotypes in academics begin at a young age (Cvencek, Meltzoff, & Greenwald, 2011; del Rio & Strasser, 2013; Farenga & Joyce, 1999), but counter at least one other study that found no statistical significance in gender bias in science interests in first to third graders (Baram-Tsabari & Yarden, 2011). Our findings suggest that teachers as early as kindergarten should be aware of and sensitive to the gender biases that may exist within their classrooms.

INTEREST
Although male and female students select different teaching models that they would prefer to use (as shown in Part I of our study), the model used and the presenter gender do not appear to impact the engagement of male or female students. This finding is in contrast to what we anticipated, as we thought that student interest would increase if we selected appropriate models. Our initial hypothesis was in line with a study showing that female students were less interested in computer science than male students when the classroom was filled with stereotypical objects (Cheryan, Plaut, Davies, & Steele, 2009). Our results, however, indicate that the model and presenter gender do not influence student interest, and thus suggest the need for an alternative hypothesis. We propose that perhaps teaching methods play a larger role than model or presenter gender in third grade student interest as measured through their participation.
Because each of our lessons was designed to be inquiry-based and active learning, students were necessarily involved in the learning process. While male and female students may indicate a different preference on which models they prefer, perhaps an inquiry-based, active learning lesson engages students of both genders, regardless of the model used or the presenter's gender.
This could be a key area of future study; engaging student interest for both genders is particularly important considering that elementary school students' biology interest has declined over the course of a generation, from 1980 to 2011 (Randler, Osti, & Hummel, 2012).
The one significant finding with interest was that the ratio of female to male comments per student increases when a female model is used. In other words, when we taught using a female model, female hand raising would increase when compared with male hand raising. This could have implications for teachers struggling with a classroom dominated by male comments.
In order to "even out" student participation, those teachers might consider implementing more female models into their curriculum.

ATTITUDE
While student attitude (based on questions asking students about their perceived talent in biology or career aspirations in science) improved from pre- to post-presentation, there is no evidence that change in attitude is impacted by the model or presenter gender. Based in part on a study finding that students' occupational choice is partially dependent on the value they give a subject (Koul, Lerdpornkulrat, & Chantara, 2011), we thought that choosing appropriate models for each gender could help the students find more value in biology, and thus increase their career aspirations. We likewise thought that lesson model and presenter gender could be key to overcoming the implicit stereotypes underlined by Nosek et al. (2009)
It is likewise important to note that improvement was not particularly great, and was in fact unseen on the multiple-choice question for one questionnaire and on the total score for all four questionnaires. This brings to mind at least two possible explanations. First, we may have tested the students using a question format to which they were unaccustomed. In support of this possibility, our questionnaire 3 multiple-choice results showed the most improvement out of all questionnaire multiple-choice questions. This particular question is the only one that is at the "recall" level of Bloom's taxonomy (Bloom, 1984). Because third graders show more improvement on this question, it seems reasonable to consider whether this is the question format to which they are most accustomed. In asking for question samples from the teachers before writing the questionnaires, we noticed a trend toward all recall-level questions. It would be interesting to further investigate whether third grade students are ever given science questions utilizing upper levels of Bloom's taxonomy.
Second, our findings suggest that we may be trying to test student learning on a level they have not yet reached. According to Piaget (1985), there are certain developmental stages that precede learning. Third graders, who are often eight to nine years old, could perhaps be preoperational (attributed to ages 2-7), but most would likely be in Piaget's concrete operations stage (ages 8-11). Likely, very few of them would be formal operational (ages 12+). Piaget would suggest that without yet being formal operational, children would struggle with theoretical concepts. Thus, many of the reasoning patterns and the conceptual understanding we were trying to teach and assess may have been beyond their developmental level. Interestingly, our third topic (diversity) dealt with the concrete concept of conservation and was assessed in a more concrete fashion. As noted above, students showed the most improvement on the multiple-choice question for this topic, lending further support to this Piagetian hypothesis. In the future, it would be interesting to see if we could better detect any gender bias in learning at a higher grade level where students are at a higher developmental level. It seems likely, however, that the patterns in gender bias and learning that we saw were an accurate depiction of reality, as they parallel the patterns we saw in interest and attitude.

TEACHER GENDER
We were surprised to find that presenter gender had no significant impact on the number or proportion of student comments, improvement in student attitude, or improvement in student questionnaire scores. Our findings countered what we had expected based on Mental Rotation Test (MRT) studies finding that participant score depends on what they were told about their or the other gender's aptitude for the task (Moè & Pazzaglia, 2006; Neuburger, Jansen, Heil, & Quaiser-Pohl, 2012). Based on these studies, we had predicted that if males and females often perform better when they believe their gender is good at a particular task, then presenter gender might impact student performance as the students would see someone of their own gender with an aptitude for biology. In addition, research suggesting that same-gender role models benefit female college students seemed to support our prediction (Lockwood, 2006). Despite these findings and our initial hypothesis, this is not what we found. Instead, our research shows that student interest, attitude, and learning do not appear to be impacted by presenter gender. This could have implications for schools concerned with an unequal number of male and female teachers. If teacher gender, like presenter gender, is found to have little to no impact on student participation, this may help to mitigate this concern. In fact, one empirical study conducted at an English primary school already suggests such a trend: Carrington, Tymms, & Merrell (2008) found that matching teacher and student gender had no impact for male or female student achievement or attitude toward school. Together with these findings, our study suggests that having equally represented male and female presenters or teachers in the elementary school does not inherently improve student interest, attitude, or learning.

FUTURE RESEARCH
We hope that this project has not only revealed useful information about elementary school biology education, but that it will become a launching point for future research. We envision that such research will include our alternative hypothesis (the importance of teaching  Barmby & Defty, 2006; Farenga & Joyce, 1999; Jones, Howe, & Rua, 2000). For example, one study allowing fourth through sixth grade students to select courses for themselves and for students of the other gender suggested that both genders consider physical science and technology courses more suited for males and life sciences more suited for females (Farenga & Joyce, 1999). These studies suggest that future research could successfully expand on ours to other age levels and STEM fields. We hope that our research in gender bias and lesson models will be replicated across multiple disciplines and age groups to discover the implications within those groups.

Ladybug Instructor Guide (Science as a Process for Third Grade Students)
Learning Outcomes:
• Students make appropriate observations.
• Students make and test hypotheses.
• Students control variables (e.g. one item on each side, same size, etc.)
• Students note their results and draw appropriate conclusions.

Supplies:
• Before the presentation, make a ladybug habitat by placing some leaves, grass, and paper towels (damp with normal water and with sugar water) in a clear container. Add about 50 ladybugs. Cover with layers of saran wrap with small holes for ventilation. Secure saran wrap with rubber bands.

Time:
• 40 minutes

Introduction: (12 minutes)
• Tell the students that you brought some ladybugs to show them. You are going to walk around the classroom showing them the ladybugs. Ask them to pay attention to what they notice: what the ladybugs look like and what they are doing.
• Walk around the class with your ladybug habitat so that all of the students get to look at them.
• "What did you notice about how the ladybugs looked and acted?" Write their comments on the board.
• Explain that what they noticed are their observations (write this word above their observations).
o Note: Hopefully someone will mention the ladybugs eating. If needed, add it as your own observation.
• "One of our observations was that the ladybugs are eating. What do you think ladybugs like to eat?" Write their ideas on the board.
• Next, explain that each of these ideas is a hypothesis (write this word above their hypotheses). A hypothesis is what we think might answer a question about our observations. So, we observed that ladybugs are eating. We asked what ladybugs like to eat. We made hypotheses, ideas that we thought might answer our question about what ladybugs like to eat.
• Once we have a hypothesis we can design an experiment to test it. Today we are going to do an experiment to test our ideas about what ladybugs like to eat.

Instructions: (23 minutes)
• Explain to students that they should work in groups of three to design an experiment to test what ladybugs like to eat (you or the teacher will assign the groups). List the food choices. Each group will get a choice chamber. They will put one possible food choice in one side, and one other possible food choice in the other side.
• "Why is it important to put exactly one food choice in each side?" After accepting a couple of student answers, make sure it is clear that if you have extra food in one side it might be hard to tell what ladybugs really like. It might be hard to know if ladybugs are on one side because they like that food or just because there is more food.
• Allow students to come to the front to choose two food choices per group. When they are ready, add three ladybugs to each choice chamber and remind students to watch what the ladybugs do. After helping all the groups get started, continue walking around and asking what they are discovering. After one test, students are welcome to switch out one food item for another if they have time.
• After about 16 minutes, help the students gently put away the ladybugs, and ask them to return their choice chambers and to dispose of their other supplies.

Discussion: (5 minutes)
• "After doing your experiment, what do you think ladybugs like to eat?" Write student ideas on the board.
• "Do we know for sure what ladybugs like to eat?" After accepting a couple of student answers, make sure it is clear that we don't know for sure what ladybugs like to eat. We could do more tests to feel more sure, but even then we couldn't be positive.

Termite Instructor Guide (Science as a Process for Third Grade Students)
Learning Outcomes:
• Students make appropriate observations.
• Students make and test hypotheses.
• Students control variables (e.g. one item on each side, same size, etc.)
• Students note their results and draw appropriate conclusions.

Supplies:
• For each group of three students (and for the instructor) you need:
o A vial containing about 10 termites
o A petri dish
o A piece of white paper
o A black Bic pen
o A small paintbrush
• Various colors and types of pens, pencils, markers, crayons
• Transparencies, different colors of paper

Time:
• 40 minutes

Introduction: (12 minutes)
• Tell the students you brought some termites to show them. Explain that you are drawing a circle with a black Bic pen and adding the 10 termites from your vial. You are going to walk around the classroom showing them the termites. Ask them to pay attention to what they notice about what the termites are doing.
• Walk around the class with your circle paper and termites so that all of the students get to look at them.
• "What did you notice about how the termites acted?" Write their comments on the board.
• Explain that what they noticed are their observations (write this word above their observations).
• "We observed the termites walking around the circle. Why do you think they are doing this?" Write their ideas on the board.
• Next, explain that each of these ideas is a hypothesis (write this word above their hypotheses). A hypothesis is what we think might answer a question about our observations. So, we observed that the termites were following the circle. We asked why they are doing this. We made hypotheses, ideas that we thought might answer our question about why the termites are following the circle.
• Once we have a hypothesis we can design an experiment to test it. Today we are going to do an experiment to test our ideas about why the termites are following the circle.

Instructions: (23 minutes)
• Explain to students that they should work in groups of three to design an experiment to test why the termites follow the circle (you or the teacher will assign the groups). List the available supplies. Each group will get a vial of 10 termites, a white piece of paper, a black Bic pen, a paintbrush (to help gently move the termites if needed), and a petri dish (to put the termites in when they are done). They will change one thing (such as pen color, line shape, etc.) at a time.
• "Why is it important to change only one thing at a time?" After accepting a couple of student answers, make sure it is clear that if you change more than one thing at a time (such as changing the pen color and the line shape in one test) it is harder to see why the termites acted a certain way (such as knowing if they acted that way because of the pen color or the line shape).
• Allow students to set up their experiments (with all the supplies except the termites). When they are ready, give them a vial of termites and encourage them to gently dump them onto their paper. After helping all the groups get started, continue walking around and asking what they are discovering. After one test, students are welcome to do another test if they have time.
• After about 16 minutes, ask students to gently place the termites in the petri dish and to return all of their supplies to the front of the classroom.

Discussion: (5 minutes)
• "After doing your experiment, why do you think the termites followed the line?" Write student ideas on the board.
• "Do we know for sure why termites followed the line?" After accepting a couple of student answers, make sure it is clear that we don't know for sure why the termites followed the line. We could do more tests to feel more sure, but even then we couldn't be positive.

Flamingo Instructor Guide (Relationship of Structure and Function for Third Grade Students)
Learning Outcomes:
• Students will be able to develop a plausible hypothesis for a given organismal feature, thereby relating structure to function.
• Students will be able to interpret data to draw an appropriate conclusion from a structure/function experiment.

Supplies:
• Marbles (a bag of 20 for each group)
• A Tupperware container for each group
• Plastic spoons (1/2 of the groups need one)
• Plastic knives (1/2 of the groups need one)
• Half of a large plastic cup taped to a long pipe cleaner (1/2 of the groups need one)
• Half of a large plastic cup taped to a short pipe cleaner (1/2 of the groups need one)
• A flamingo PowerPoint slide with 3 flamingo pictures

Time:
• 40 minutes

Instructions:
• Show the students the flamingo pictures and ask, "What do you notice about how flamingos look?" Write their observations on the board. If they do not note the scooped beaks and long necks, add them to the list as your own observations.
• Tell the students we are going to make hypotheses about a couple of these observations; specifically, we want to hypothesize why flamingos have scooped beaks and why they have long necks.
o "Why do you think flamingos have scooped beaks?" Write their ideas on the board. Someone should mention something along the lines of scooping up food. If they don't, then add it to the list as your own idea.
o "Why do you think flamingos have a long neck?" Write their ideas on the board. Someone should mention something along the lines of finding food upside down. If they don't, then add it to the list as your own idea.
• Explain that we are going to do experiments using models to represent the beaks and necks of flamingos. These experiments will test two of our hypotheses.
Experiment #1: Flamingos have scooped beaks to help them scoop up food.
• "First we are going to do an experiment to test our hypothesis that flamingos have scooped beaks to help them scoop up food."
• For the first experiment, assign students to groups of 3 (or ask their teacher to assign them). Groups on one half of the room will represent flamingos with scooped beaks and will be given a rounded plastic spoon as a model for a scooped flamingo beak. Groups on the other half of the room will represent another kind of bird without a scooped beak and will be given a long straight plastic knife as a model for another bird beak.
• Both groups will be given a Tupperware container and 20 glass marbles to put inside (representing food in the water). The students can only use their tools to scoop marbles from the container and place them on their desks. They will have approximately 30 seconds split between the 3 students in the group (yell "switch" for them to switch to the next student after about 10 seconds) to scoop up as many marbles as possible.
• When the time is up, ask students to count how many marbles their group successfully scooped up and placed on their desk. Write "flamingos" and "other birds" on the board and list each group's number under the appropriate heading.
• "What do our results show?" Ultimately students should understand that their results support their hypothesis and it seems that flamingos have scooped beaks to help them scoop up their food.
Experiment #2: Flamingos have long necks to help them find food upside down.
• "Now we will do an experiment to test our hypothesis that flamingos have long necks to help them find food upside down."
• Students will continue working in groups of three. Each group will continue using the Tupperware container and marbles, representing food in the water.
• The other half of the room will represent the flamingos this time (so hopefully each side of the room can "win" once). Each of those groups will get a half cup (that represents a flamingo beak) attached to a long pipe cleaner that represents a flamingo neck. The other groups will get a half cup (that represents a flamingo beak) attached to a short pipe cleaner representing the short neck of another bird.
• Each group will use their contraption to scoop out marbles onto the table. Students must hold the bottom end of their pipe cleaner to the side of the Tupperware container. Students in the group should take turns scooping (yell "switch" occasionally again).
• When the time is up, ask students to count how many marbles their group successfully scooped up and placed on their desk. Write the new numbers under "flamingos" and "other birds" on the board in the appropriate spot.
• "What do our results show?" Ultimately students should understand that their results support their hypothesis and it seems that flamingos have long necks to help them find food upside down.
• Recap that today you made observations about how flamingos look. You then made hypotheses to explain why they looked this way. You tested these hypotheses and saw that your results supported your hypotheses. You discovered that flamingos have a specific shape (such as scooped beaks and long necks) to help them do certain things (such as scoop food and find food upside down).

Eagle Instructor Guide (Relationship of Structure and Function for Third Grade Students)
Learning Outcomes: • Students will be able to develop a plausible hypothesis for a given organismal feature, thereby relating structure to function. • Students will be able to interpret data to draw an appropriate conclusion from a structure/function experiment.

Supplies:
• Salad tongs (each group gets either full or half tongs)
• Cotton balls (a bag of 20 for each group)
• A wooden block wrapped in wrapping paper and tied with ribbon for each group
• A pack of hooks (1/2 of the groups need one)
• A pack of nails (1/2 of the groups need one)
• An eagle PowerPoint slide with 3 eagle pictures

Time:
• 40 minutes

Instructions:
• Show the students the eagle pictures and ask, "What do you notice about how eagles look?" Write their observations on the board. If they do not note the talons and hooked beak, add them to the list as your own observations.
• Tell the students we are going to make hypotheses about a couple of these observations; specifically, we want to hypothesize why eagles have talons and why they have hooked beaks.
  o "Why do you think eagles have talons?" Write their ideas on the board. Someone should mention something along the lines of catching prey; if they don't, add it to the list as your own idea.
  o "Why do you think eagles have hooked beaks?" Write their ideas on the board. Someone should mention something along the lines of tearing food; if they don't, add it to the list as your own idea.
• Explain that we are going to do experiments using models to represent the talons and the beaks of eagles. These experiments will test two of our hypotheses.
Experiment #1: Eagles have talons to help them catch prey.
• "First we are going to do an experiment to test our hypothesis that eagles have talons to help them catch prey."
• For the first experiment, assign students to groups of 3 (or ask their teacher to assign them). Groups on one half of the room will represent eagles with talons and will be given full salad tongs as a model for eagle talons. Groups on the other half of the room will represent another kind of bird without talons and will be given half salad tongs as a model for other bird feet.
• Both groups will be given a bag of 20 cotton balls, which they must spread out on the ground (representing fish in the water). The students can only use their salad tongs (representing bird feet) to pick up the cotton balls one at a time and place them on their desks. They will have approximately 30 seconds split between the 3 students in the group (yell "switch" for them to switch to the next student after about 10 seconds) to pick up as many cotton balls as possible.
• When the time is up, ask students to count how many cotton balls their group successfully picked up and placed on their desk. Write "eagles" and "other birds" on the board and list each group's number under the appropriate heading.
• "What do our results show?" Ultimately students should understand that their results support their hypothesis and it seems that eagles likely have talons to help them catch fish more easily.
Experiment #2: Eagles have hooked beaks to help them tear food.
• "Now we will do an experiment to test our hypothesis that eagles have hooked beaks to help them tear their food."
• Students will continue working in groups of three. Each group will be given a wooden block that has been wrapped and tied. Explain that it represents the food the eagle is trying to eat.
• The other half of the room will represent the eagles this time (so hopefully each side of the room can "win" once). Each of those groups will get a hook that represents an eagle beak. The other groups will get a long nail representing the straight beak of another bird.
• Each group will use their tool to open the present. The eagle group should use the hooked side to open the gift, as that is the side that best represents an eagle beak. Students can only use their hands to hold the gift, not to unwrap it. Students in each group should take turns unwrapping (yell "switch" occasionally again). When a group is done unwrapping the wooden block completely, they should hold it in the air. Time the groups and shout out the times as they finish.
• After all groups have finished, ask "What do our results show?" Ultimately students should understand that their results support their hypothesis and it seems that eagles likely have hooked beaks to help them tear their food.
• Recap that today you made observations about how eagles look. You then made hypotheses to explain why they looked this way. You tested these hypotheses and saw that your results supported your hypotheses. You discovered that eagles have a specific shape (such as talons and curved beaks) to help them do certain things (such as catch prey and tear food).

Butterfly Instructor Guide (Diversity for Third Grade Students)
Learning Outcomes: • Students analyze the importance of biodiversity.
• Students evaluate how ecosystem services impact humankind.
• Students explore conservation opportunities.

Supplies:
• Butterflies
  o 1-3 live painted lady butterflies
  o 2 other pinned butterflies (luna moth and Lycaena butterflies)

Time:
• 40 minutes

Instructions:
• Bring 3 types of butterflies into the classroom (a live painted lady, a pinned luna moth, and a set of three pinned Lycaena butterflies). Give students time to look at all the butterflies and ask them to make observations of the butterflies. (Students should line up in three lines, observe one butterfly, then move to the back of the next line, until they have seen all three butterflies.) (10 minutes)
• Have students sit back down and ask them to share their observations. (5 minutes)
  o "What did you notice about the painted lady?"
  o "What did you notice about the luna moth?"
  o "What did you notice about the Lycaena butterflies?"
  o List their observations on the board under appropriate headings for each type of butterfly.
• Next, go through the PowerPoint, asking students, "How many kinds of butterflies do you think there are?" (about 20,000), and, "Where do you think butterflies are found in the world?" (on every continent except Antarctica).
• Next, ask students to show with their hands how big they think the largest butterfly is (Queen Alexandra's birdwing, about an 11-inch wingspan), and how small they think the smallest butterfly is (Western pygmy blue, about a ½-inch wingspan). (4 minutes)
• Continuing through the PowerPoint, show the other "neat butterfly" examples using the video clips and pictures. As you show each picture and video clip, ask students to share their observations, "What do you notice about this butterfly?" (but don't take the time to list these on the board). Be familiar with a little background information about each butterfly in order to answer questions and supplement discussion. (8 minutes)
• After the PowerPoint, ask "Why is it important to have so many kinds of butterflies?" List ideas on the board (they are pretty, pollination, potential medicines, etc.). (5 minutes)
• Introduce biodiversity and talk about its importance, including ecosystem services and conservation. (8 minutes)
  o Define biodiversity as the variety of life; all the different types of butterflies are an example of biodiversity. Write this term on the board above the butterfly observations. It is important for us to preserve the biodiversity of butterflies and of other types of life.
  o One reason we want to preserve biodiversity is because of ecosystem services (write this term as the heading for their ideas about the importance of butterfly diversity). Ecosystem services are the ways that nature helps humans. Ask students, "What are some 'ecosystem services' (ways nature helps humans)?" (Providing food, fresh water, medicines, cleaning water, pollination, providing recreation, etc.) Add their ideas to the list on the board.
  o Tell students that humans sometimes destroy biodiversity. "How do humans sometimes destroy biodiversity?" (Deforestation, pollution, etc.) Tell them that this can affect ecosystem services, as they may not be as available to us and we may need to do them ourselves.
  o Finally, ask, "How can we protect biodiversity?" Write these ideas on the board (recycle, reuse things, walk, bike, or use public transport, support conservation efforts, don't waste water, etc.; if riding a bike isn't mentioned, write it as your own idea). All of these are examples of conservation, of ways humans help conserve nature (write this term on the board above the student suggestions).

Spider Instructor Guide (Diversity for Third Grade Students)
Learning Outcomes: • Students analyze the importance of biodiversity.
• Students evaluate how ecosystem services impact humankind.
• Students explore conservation opportunities.

Supplies:
• Spiders
  o Tarantulas
  o Orb weaver
  o Wolf spider(s)

Time:
• 40 minutes

Instructions:
• Bring 3 types of live spiders into the classroom (orb weaver, wolf spider, and tarantula). Give students time to look at all the spiders and ask them to make observations of the spiders. (Students should line up in three lines, observe one spider, then move to the back of the next line, until they have seen all three spiders.) (10 minutes)
• Have students sit back down and ask them to share their observations. (5 minutes)
  o "What did you notice about the orb weaver?"
  o "What did you notice about the wolf spider?"
  o "What did you notice about the tarantula?"
  o List their observations on the board under appropriate headings for each type of spider.
• Next, go through the PowerPoint, asking students, "How many kinds of spiders do you think there are?" (over 40,000) and, "Where do you think spiders are found in the world?" (on every continent except Antarctica).
• Next, ask students to show with their hands how big they think the largest spider is (giant huntsman, about a 12-inch leg span), and how small they think the smallest spider is (Patu digua, body the size of a pinhead). (4 minutes)
• Continuing through the PowerPoint, show the other "neat spider" examples using the video clips and pictures. As you show each picture and video clip, ask students to share their observations, "What do you notice about this spider?" (but don't take the time to list these on the board). Be familiar with a little background information about each spider in order to answer questions and supplement discussion. (8 minutes)
• After the PowerPoint, ask "Why is it important to have so many kinds of spiders?" List ideas on the board (they are neat, they eat insects, potential medicines, etc.). (5 minutes)
• Introduce biodiversity and talk about its importance, including ecosystem services and conservation. (8 minutes)
  o Define biodiversity as the variety of life; all the different types of spiders are an example of biodiversity. Write this term on the board above the spider observations. It is important for us to preserve the biodiversity of spiders and of other types of life.
  o One reason we want to preserve biodiversity is because of ecosystem services (write this term as the heading for their ideas about the importance of spider diversity). Ecosystem services are the ways that nature helps humans. Ask students, "What are some 'ecosystem services' (ways nature helps humans)?" (Providing food, fresh water, medicines, cleaning water, pollination, providing recreation, etc.) Add their ideas to the list on the board.
  o Tell students that humans sometimes destroy biodiversity. "How do humans sometimes destroy biodiversity?" (Deforestation, pollution, etc.) Tell them that this can affect ecosystem services, as they may not be as available to us and we may need to do them ourselves.
  o Finally, ask, "How can we protect biodiversity?" Write these ideas on the board (recycle, reuse things, walk, bike, or use public transport, support conservation efforts, don't waste water, etc.; if riding a bike isn't mentioned, write it as your own idea). All of these are examples of conservation, of ways humans help conserve nature (write this term on the board above the student suggestions).

Flower Instructor Guide (Evolution for Third Grade Students)
Learning Outcomes: • Students understand how evolution through natural selection leads to preferred characteristics.

Supplies:
• A red and a yellow rose
• 4 other types of flowers for student analysis (one red alstroemeria, one pink zygocactus, one white carnation, and one orange lily)

Time:
• 40 minutes

Instructions:
• Tell the students that bees often like yellow or blue flowers (they can't see red well), while birds often like red and orange flowers (they see red well). Show the class a yellow rose and a red rose. Ask, "Which flower do you think will be pollinated by bees the most often? Why?" (The yellow rose) Tell students that pollination is what lets plants make seeds.
• Ask, "If we plant a garden and only let bees inside to pollinate (no birds), which flower do you think will make more seeds?" (The yellow rose)
• "When those seeds grow into plants, will their flowers look more like this (the red rose) or this (the yellow rose)?" (The yellow rose) This is because children look a lot like their parents. However, they don't look exactly like their parents, so some of the new yellow roses might be darker or lighter yellow.
• So, imagine we start with some red roses and some yellow roses, but we only let bees in to pollinate them. The bees will usually pollinate the yellow roses, so the yellow roses will make the most seeds. The seeds will drop and new roses will grow. Most of the new roses will be yellow. Now we have lots of yellow roses, and very few red roses. This is called evolution.
• Set the alstroemeria, zygocactus, carnation, and lily up front. Ask the class to form one long line and to file past all four flowers, carefully observing each one. After all the students have seen them, ask, "What did you observe about this flower?" for each flower, and write their observations on the board under the appropriate heading.
• Next, ask them to use their observations and what they have learned to answer the following questions. Following each question, you will circulate the room, showing the students the four flowers again. They should discuss their answer with their neighbor first, and then you will ask them to share their ideas with the class.
• "Which flower would survive the best if they all grew in an area where birds were the only pollinators? Why?" (The bright red alstroemeria or orange lily)
• "Which flower would survive the best if they all grew in an area where cows liked to graze on plants? Why?" (The zygocactus)
• "Which flower would survive the best if they all grew in an area where butterflies were the only pollinators? Why?" (Hint: butterflies like flowers that smell good.) (The carnation)
• Ask the class, "Why do some flowers survive better than others in different circumstances?" Make sure they understand that some flowers fit better into their environment, and therefore survive better. If they survive better, they get to make more seeds during their lives, and there are more plants that look like them when the seeds grow.
Animal Skull Instructor Guide (Evolution for Third Grade Students)
Learning Outcomes: • Students understand how evolution through natural selection leads to preferred characteristics.

Supplies:
• An herbivore skull (antelope) and a carnivore skull (sea otter or fox)
• 4 other types of animal skulls for student analysis (one bobcat, one coyote, one rabbit, and one chimpanzee)

Time:
• 40 minutes

Instructions:
• Tell the students that herbivores often have flat teeth for chewing, while carnivores often have sharp teeth for tearing. Show the class an herbivore skull and a carnivore skull. Ask, "Which animal do you think will eat grass the most often? Why?" (The herbivore) Tell students that eating lets animals stay alive and have babies.
• Ask, "If we capture both types of animals, and keep them in pastures with only grass to eat (no animals to eat), which animal do you think will have more babies?" (The herbivore)
• "When those babies grow up, will they look more like this (the carnivore) or this (the herbivore)?" (The herbivore) This is because children look a lot like their parents. However, they don't look exactly like their parents, so some of the new animals might have slightly more or less flat teeth.
• So, imagine we start with some carnivores and some herbivores, but we only let them eat grass. The herbivores will be better at eating the grass, so the herbivores will have more babies. The babies will grow up. Most of the new babies will be herbivores. Now we have lots of herbivores, and very few carnivores. This is called evolution.
• Set the bobcat skull, the coyote skull, the rabbit skull, and the chimpanzee skull up front. Ask the class to form one long line and to file past all four skulls, carefully observing each one. After all the students have seen them, ask, "What did you observe about this skull?" for each skull, and write their observations on the board under the appropriate heading.
• Next, ask them to use their observations and what they have learned to answer the following questions. Following each question, you will circulate the room, showing the students the four skulls again. They should discuss their answer with their neighbor first, and then you will ask them to share their ideas with the class.
• "Which animal would survive the best if they all lived in an area where it had to hunt in order to eat? Why?" (The bobcat, the coyote, or maybe the chimpanzee)
• "Which animal would survive the best if they all lived in an area where they had to hide from predators? Why?" (The rabbit; eyes on the side)
• "Which animal would survive the best if they all lived in an area where they depended upon their sense of smell to get food? Why?" (The coyote or rabbit)
• Ask the class, "Why do some animals survive better than others in different circumstances?" Make sure they understand that some animals fit better into their environment, and therefore survive better. If they survive better, they get to have more babies during their lives, and there are more animals that look like them when the babies grow up.

Record the number of female/male students in the class (camera man).
Number of Female Students in the Class: Number of Male Students in the Class:

Record student participation through questions and comments.
• For questions, record the number of male and female students who asked questions, INCLUDING those who raised their hands when prompted to ask questions. (Note: if it is unclear whether students raised their hands to ask a question or to make a comment because the prompt was ambiguous and they weren't called on, assume that they were planning to make a comment.) One student may be counted more than once.
• For comments, record the presenter question that led to the student comments (if applicable). Beneath that, record the number of students who shared a comment, INCLUDING students who raised their hands to share a comment. One student may be counted more than once. After you have watched the entire presentation, record the total number of female/male student comments on the short lines provided.
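For coders who keep their tallies electronically rather than on the paper form, the counting rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the instrument used in the study: the `Event` record, the `tally` and `per_student` helpers, and all of the sample values are hypothetical. Dividing by class size is likewise an assumption added here to make classes of different composition comparable; the protocol itself asks only for raw totals.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One coded participation event (hypothetical record layout)."""
    kind: str         # "question" or "comment"
    gender: str       # "F" or "M"
    prompt: str = ""  # presenter question that led to a comment, if any


def tally(events):
    """Count events by (kind, gender); one student may be counted more than once."""
    return Counter((e.kind, e.gender) for e in events)


def per_student(totals, class_sizes):
    """Assumed normalization: divide each total by the class count for that gender."""
    return {
        (kind, gender): count / class_sizes[gender]
        for (kind, gender), count in totals.items()
        if class_sizes.get(gender, 0) > 0
    }


# Hypothetical observations from one recorded presentation
events = [
    Event("question", "F"),
    Event("question", "M"),
    Event("comment", "F", "What do you notice about how flamingos look?"),
    Event("comment", "F", "What do you notice about how flamingos look?"),
    Event("comment", "M", "Why do you think flamingos have scooped beaks?"),
]
class_sizes = {"F": 10, "M": 12}  # hypothetical class composition

totals = tally(events)
rates = per_student(totals, class_sizes)
print("Total female comments:", totals[("comment", "F")])
print("Female comments per female student:", rates[("comment", "F")])
```

Keeping each event as its own record, rather than incrementing a running total, preserves the presenter prompt alongside each comment, which the protocol asks coders to note.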

Questions:
Total Female Student Questions: _____
Total Male Student Questions: _____

4. Note how many students of each gender seem to be disengaged from the presentation and how you know.
• In order to do this, pause the video every five minutes (e.g., 5 minutes into the presentation, then 10 minutes into the presentation, etc.) and count the number of disengaged females and the number of disengaged males at that time.
• Make note of how you can tell they are disengaged.
• After you have watched the entire presentation, record the total number of disengaged female/male students on the short lines provided (note: one student may count more than once in this number).

Learning Objectives
1. Students will be able to develop a plausible hypothesis for a given organismal feature, thereby relating structure to function.
2. Students will be able to interpret data to draw an appropriate conclusion from a structure/function experiment.