The Default Project:

You don’t have to do this, but if you do something else you need my approval. I’ll approve anything within reason. Whatever you do, your topic may not be the same as your 123A project.

To do the default project:
1) Choose some biological, medical, or environmental issue that is related to a known gene or group of genes. For example, nitrogen fixation is related to nifH/K/D/N. COVID-19 vaccines are related to viral spike proteins.
2) Collect at least 25 sequences of the gene, from various closely related species. Call this your “basic data set”.
3) Collect at least 25 sequences of the gene, from a clade that is related to the basic set. Call this your “related data set”.
4) Apply 123A techniques to your data to get a sense of the diversity of your chosen gene(s). Build a multiple-sequence alignment and a phylogenetic tree.
5) Use 80% of the sequences in your basic data set to build a Hidden Markov Model (HMM). Use 80% of the sequences in your related data set to build a second HMM.
6) For each of the 20% of sequences that weren’t used to train an HMM, compute the log-Viterbi score on each HMM.
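
Steps 5 and 6 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the 123B HMM: the two-state model, its probabilities, the toy sequences, and every function name here (split_train_test, to_log, log_viterbi_score) are hypothetical. In a real project the model parameters would be learned from the 80% training sequences, and each held-out sequence would be scored once per HMM.

```python
import math
import random


def split_train_test(sequences, train_fraction=0.8, seed=0):
    """Shuffle a copy of the sequences and split it into (train, test)."""
    shuffled = list(sequences)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def to_log(probs):
    """Convert a dict of probabilities into a dict of natural-log probabilities."""
    return {key: math.log(p) for key, p in probs.items()}


def log_viterbi_score(seq, log_start, log_trans, log_emit):
    """Return the log-probability of the single most likely state path for seq.

    Assumes every symbol in seq has an emission entry in every state.
    """
    if not seq:
        raise ValueError("sequence must not be empty")
    states = list(log_start)
    # best[s] = log-probability of the best path so far that ends in state s
    best = {s: log_start[s] + log_emit[s][seq[0]] for s in states}
    for symbol in seq[1:]:
        best = {
            s: max(best[p] + log_trans[p][s] for p in states) + log_emit[s][symbol]
            for s in states
        }
    return max(best.values())


# Hypothetical hand-set toy model (NOT trained on real data).
LOG_START = to_log({"at_rich": 0.5, "gc_rich": 0.5})
LOG_TRANS = {
    "at_rich": to_log({"at_rich": 0.9, "gc_rich": 0.1}),
    "gc_rich": to_log({"at_rich": 0.1, "gc_rich": 0.9}),
}
LOG_EMIT = {
    "at_rich": to_log({"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4}),
    "gc_rich": to_log({"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}),
}

if __name__ == "__main__":
    # Made-up placeholder sequences standing in for a 25-sequence data set.
    toy_sequences = ["ATGC" * (i % 3 + 1) for i in range(25)]
    train, test = split_train_test(toy_sequences)
    print(len(train), "training sequences,", len(test), "held out")
    for seq in test:
        print(seq, log_viterbi_score(seq, LOG_START, LOG_TRANS, LOG_EMIT))
```

Working in log space turns the products of many small probabilities into sums, which avoids numerical underflow on long sequences. A higher (less negative) score means the sequence fits that model better, so comparing a held-out sequence's score on the two HMMs indicates which data set it resembles more.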

The Written Project Report: Due 11:55 AM on April 18 by file upload to Canvas. Word only, no PDF.

BIOL 123B: At least 15 pages, double-spaced, not including source code or references.

The report must be structured as described below. This structure is required by almost all peer-reviewed (i.e. legitimate) scientific journals. There are 5 sections. Everything you say must go in the proper section and nowhere else.

• Section 1: Background
  o The problem/issue you’re investigating
  o Why it’s important
  o Your plan
    ▪ What data
    ▪ What analysis
    ▪ Expected results
    ▪ Hypothesis (optional)
• Section 2: Methods
  o Data
    ▪ Where it came from
    ▪ How you got it
    ▪ (But don’t put the data itself in this section)
  o Procedure
    ▪ What analysis technique(s) you used
    ▪ What analysis software you used
      • Public web site?
      • Publicly available downloaded software?
      • 123B HMM?
      • Your own code?
    ▪ If there’s software, describe the code
      • Language
      • Number of lines
      • What it does
      • Don’t include any source code. Don’t include screenshots unless your code draws graphics.
• Section 3: Results
  o Raw results, no interpretation
  o Tables are good
  o No figures that are screenshots of text. That’s annoying and hard to read. Text results should be text, either in the body of your doc or in a table.
• Section 4: Discussion
  o Broad summary: I did this to this data, I got these results, which led me to these broad conclusions
  o Interpretation of the data: what conclusions do you want your readers to draw?
  o How you could improve on the methods if you got a do-over
  o If you had the time, how you would do more research into this issue
• Section 5: References
  o If you use a Word plug-in like Zotero, this section is built for you automatically. If you write the References section yourself, follow these rules.
  o When you cite a book, article, or web site in sections 1-4, do it like this:
    ▪ This habitat seems to generally lack UCYN-A1 and has environmental conditions that clearly differ from the tropical/subtropical oligotrophic open ocean during most times of the year (Chavez et al., 2002).
  o Or like this:
    ▪ This habitat seems to generally lack UCYN-A1 and has environmental conditions that clearly differ from the tropical/subtropical oligotrophic open ocean during most times of the year [1].
  o Then in your references section:
    ▪ Chavez FP, Pennington JT, Castro CG, Ryan JP, Michisaki RM, Schlining B et al. (2002). Biological and chemical consequences of the 1997-98 El Nino in central California waters. Progr Oceanogr 54: 205–232.
      • References must appear in alphabetical order by first author. If there are 4 or more authors, just name the first 3, and then write “et al.” in italics, with a period after al. It’s Latin for “and others”.
  o Or like this:
    ▪ (1) Chavez FP, Pennington JT, Castro CG, Ryan JP, Michisaki RM, Schlining B et al. (2002). Biological and chemical consequences of the 1997-98 El Nino in central California waters. Progr Oceanogr 54: 205–232.
      • Citations in the text should be in increasing numerical order.
  o No plagiarism (see below)
  o No cite-cites (see below)


The Oral Presentation:
Presentations will happen on 4/19 through 5/5. Each will be 10 minutes long. The presentation schedule will be randomly generated. Your presentation deck may be PowerPoint (without animations) or PDF; it is due along with the written report (11:55 AM on 4/18). During your presentation, your deck will be screenshared from your instructor’s computer. Say “next slide please” when you need to.

If your project includes code, your presentation should describe the code but should not show the code.

Screenshots of text are not allowed.

Plagiarism is defined here: http://www.sjsu.edu/cs100w/policies/plagiarism.html. All students at SJSU are expected to understand these rules. If you rewrite a substantial amount of someone else’s work, replacing words with synonyms, that’s plagiarism. Example:

Original work:
“But soft, what light through yonder window breaks?
It is the east, and Juliet is the sun.
Arise, fair sun, and kill the envious moon” -Shakespeare, Romeo and Juliet

Definitely plagiarism:
“Ssssh! What’s that light coming through the window over there?
That’s where the sun comes up, and my new girlfriend is the sun.
Get up, beautiful sun, and murder the jealous lunar ball.”

If you copy someone else’s words, even if you cite correctly, that’s still plagiarism. The general rule is that everything has to be in your own words.

The consequence of plagiarism is an F in this course and a report to the university. The university may impose its own consequences, including expulsion.

A “cite-cite” is when you cite a source to support a claim, but that source doesn’t provide the evidence itself; it just cites someone else who (hopefully) does. The source that you cite must provide the evidence directly: it must be a primary source.

Example: In a 2018 article I wrote the following: “In 2002, Hebert proposed cytochrome c oxidase I (COI) as a standard for molecular barcoding of animals⁴.” My reference #4 was the article where Paul Hebert made that proposal. If you want to say that Hebert made that proposal in 2002, cite his article, not mine. This is because if someone wants to find the original work, they shouldn’t have to find my article, figure out where I make my claim about Hebert, look up my reference #4, and so on.

“Cite, don’t cite-cite.” – Jon Zehr, to me
“Cite, don’t cite-cite, and definitely never cite-cite-cite.” – Me, to you


Grading
Your project is worth 25% of your grade. You will be evaluated on the following 250-point scale:

Category (points), with descriptions of Low, Medium, and High performance:

Grammar/spelling (20)
Low: Significant or serious grammar/spelling errors.
Medium: Limited minor grammar/spelling errors.
High: No grammar/spelling errors.

Organization (40)
Low: Writing is haphazard and disjointed with weak organization.
Medium: Organization is for the most part clear and coherent.
High: Organization is consistently clear and coherent.

Scientific accuracy (40)
Low: Much of the information presented is inaccurate.
Medium: Most information presented is accurate.
High: All information presented is accurate, demonstrating a good understanding of the subject.

Use of bioinformatics approaches (60)
Low: Inappropriate bioinformatics approaches were used or bioinformatics approaches were used incorrectly.
Medium: For the most part, appropriate bioinformatics approaches were used and they were used correctly.
High: Appropriate bioinformatics approaches were used and they were used correctly in all cases.

Interpretation of data (50)
Low: Student did not interpret data correctly or showed a lack of understanding of the correct interpretation.
Medium: Student interpreted data correctly for the most part, or interpreted the majority of the data correctly, and displayed an understanding of the correct interpretation of the data.
High: Student consistently interpreted the data correctly and displayed a deep understanding of the correct interpretation of the data.

Oral presentation (40)
Low: Presentation was unclear, did not flow well, and did not observe time limitations.
Medium: Presentation was unclear or did not flow well, or did not observe time limitations.
High: Presentation was clear, flowed well, and observed time limitations.
