HSC Biology Syllabus Notes

Module 5 / Inquiry Question 5

Overview of Week 5 Inquiry Question

Learning Objective #1 – Investigate the use of data analysis from a large-scale collaborative project to identify trends, patterns and relationships, for example:

The use of population genetics data in conservation management
Population genetics studies used to determine the inheritance of a disease or disorder
Population genetics relating to human evolution

Learning Objective #2 – Investigate the use of technologies to determine inheritance patterns in a population using DNA sequencing and profiling

NEW HSC Biology Syllabus Video – Inheritance Patterns in a Population

Week 5 Homework Questions

Week 5 Curveball Questions

Week 5 Extension Questions

Solution to Week 5 Questions

Overview of Week 5 Inquiry Question

Many factors contribute to the frequency of alleles in a population.

First, recall that there are many modes of inheritance through which an individual could inherit an allele depending on the nature of the allele and/or the chromosomes in which the alleles are located (specifically important for sex-linked inheritance).

However, the population size will also have an effect on the frequency of alleles or genotypes observed in the population. Recall that the second-generation offspring from Mendel’s pure-breeding parental lines had a dominant to recessive ratio of 3:1. So, say you examine 1000 unicorns for their hair colour. Suppose there are two hair colours, rainbow or white. If rainbow is dominant, you would expect approximately 750 of the 1000 unicorns (75%) to have rainbow hair and approximately 250 of the 1000 unicorns (25%) to have white hair. However, would your observation change if you only had a smaller sample size of 10 rather than 1000? Yes, it most likely will.

With only 10 unicorns, you may see that all 10 have rainbow hair, or perhaps even that all 10 have white hair. This would seem to disobey Mendelian inheritance. However, this may not be the case: a sample size of 10 is too small, so the results would be unreliable due to the small number of observations.
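The effect of sample size can be sketched with a short simulation. This is only an illustration: it assumes every unicorn independently has a 75% chance of showing rainbow hair, and the function names and the number of repeated samples are made up for the example. For reference, under that assumption the chance that all 10 unicorns in a sample are rainbow is 0.75 to the power of 10, or about 5.6%.

```python
import random


def rainbow_fraction(sample_size, rng, p_rainbow=0.75):
    """Fraction of one random sample that shows the dominant (rainbow) colour."""
    rainbow = sum(1 for _ in range(sample_size) if rng.random() < p_rainbow)
    return rainbow / sample_size


def observed_range(sample_size, trials=2000, seed=0):
    """Smallest and largest rainbow fraction seen across many repeated samples."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    fractions = [rainbow_fraction(sample_size, rng) for _ in range(trials)]
    return min(fractions), max(fractions)


for n in (10, 1000):
    low, high = observed_range(n)
    print(f"sample size {n}: rainbow fraction ranged from {low:.2f} to {high:.2f}")
```

Samples of 10 swing widely around 75%, while samples of 1000 stay close to it, which is why the small sample can appear to break the 3:1 ratio.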

Apart from the sample size of an observational study, which may affect how the frequency of alleles in a species’ population is interpreted, other factors such as mutation (e.g. caused by the environment) and the migration patterns of the population also affect the frequency of alleles observed in a population. If you are unfamiliar with the migration patterns of a population, observations made at a single chosen location may yield inaccurate results, that is, results that deviate from reality.

So, here comes the inquiry question: can population inheritance be predicted with any accuracy?

Learning Objective #1 - Investigate the use of data analysis from a large-scale collaborative project to identify trends, patterns and relationships, for example:

- The use of population genetics data in conservation management
- Population genetics studies used to determine the inheritance of a disease or disorder
- Population genetics relating to human evolution

Population genetics data in conservation management

Purpose of conservation studies and management

The primary objective of conservation genetics is to maintain genetic diversity so that the species of conservation concern is able to adapt to changing selective pressures in the environment over time.

Many studies have shown that the higher the genetic similarity across a species’ total population, the higher the species’ probability of extinction. This is where population genetics plays a role in conservation management.

If you recall from last week’s notes, population genetics involves the study of how the frequency of alleles for one or more genes in a population changes over time due to changes in environmental conditions or species migration.

Here, environmental conditions can refer to changes in either the environmental selection pressures or the physical landscape.
You will see in Module 6 that changes in the physical landscape could increase or decrease gene flow, which has an effect on the population’s allele frequencies.

Conservation management case study: Albany cycad plants

The following case study is sourced from the research titled “Population genetics and conservation of critically small cycad populations: a case study of the Albany Cycad, Encephalartos latifrons”, performed by a team of scientists including Jessica M. Da Silva, John S. Donaldson, Gail Reeves and Terry A. Hedderson.

The Albany cycad was chosen for this study because it exists in very low numbers due to habitat disturbance, habitat loss and being eaten. This puts it at risk of extinction.

Out of the 304 species of cycad plants, more than 28 have fewer than 250 individual plants in their natural habitats!

Purpose of conservation management case study: This case study on cycad plants involves determining their genetic structure and diversity, which helps decide their degree of susceptibility to extinction.

Like all conservation management studies, a recommendation is proposed. Therefore, this study also helps propose viable reproductive methods that could be used to increase the cycad’s population size whilst preserving its genetic identity.

In the study of the population genetics of the Albany cycad, a total of 86 plants were sampled and analysed for their genetic markers, which are known DNA sequences at a particular locus on a chromosome.

The wild Albany cycad plants were sampled in South Africa at all five of the species’ known natural habitats, within a 560 square kilometre area. Albany cycad plants from the Kirstenbosch Botanical Garden in Cape Town were also studied.

A total of 49 wild Albany cycad plants and 37 plants from Kirstenbosch Botanical Garden were used in this population genetics study (86 plants in total). The wild plants are known as the in situ subpopulation and the cycad plants from the Botanical Garden are known as the ex situ subpopulation.

Cycad plants were analysed for their genetic variation using the amplified fragment length polymorphism (AFLP)* characterisation technique, which is able to identify genetic markers.

*Note that polymorphisms in AFLP involve more than one nucleotide being altered. It is therefore a technique used to study more complex polymorphisms rather than simpler ones such as the SNPs we looked at in last week’s notes.

The AFLP technique is a type of DNA fingerprinting (or DNA profiling) technique. It is useful because it can help identify the similarities or differences between the genetic markers of each plant. Thus, AFLP allows the identification of any polymorphisms in the cycad plants’ genomes.

First, one leaf was collected from each cycad plant. These leaves were placed in silica gel to dry, which preserves the DNA in the leaf tissue until it is extracted.

The DNA was then extracted from each leaf using a CTAB method with 2% PVP.

CTAB = Cetyltrimethylammonium Bromide
PVP = polyvinylpyrrolidone

After this, the same restriction enzymes were used to cut the DNA gathered from every plant in each of the two subpopulation groups into fragments containing the genetic markers we are examining.

Note that the corresponding fragments for different plants can differ in size due to differences in DNA sequence from one cycad plant to another. Each plant’s DNA fragments are then amplified or copied through a process known as polymerase chain reaction (PCR).
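The way a sequence difference changes fragment sizes can be sketched in a few lines. This is an illustration only: the notes do not name the enzymes used in the study, so the recognition site below (GAATTC, cut after the first G, as for the enzyme EcoRI) and the two short sequences are assumptions made up for the example.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut a DNA sequence at every copy of a recognition site.

    Returns the fragment lengths in order along the sequence.
    cut_offset is how many bases into the site the cut falls.
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(len(sequence) - start)  # the piece after the last cut
    return fragments


# Hypothetical sequences: plant_b has one changed base inside the second site.
plant_a = "AAGAATTCCCCCGAATTCTT"
plant_b = "AAGAATTCCCCCGAGTTCTT"

print(digest(plant_a))  # [3, 10, 7]
print(digest(plant_b))  # [3, 17]
```

A single changed base removes one cutting site in the second plant, so the same enzyme produces a different set of fragment lengths for it.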

After PCR, gel electrophoresis is used. The plate compartment of the machine holds an agarose gel, and the DNA fragments from each wild and ex situ cycad plant are loaded into separate wells in the gel.

As an electrical current runs through the machine and the agarose gel, the negatively charged DNA fragments (due to the phosphate groups in DNA) move through the gel towards the positively charged side of the machine.

Note that the phosphate groups in DNA are negatively charged in a neutral medium such as agarose gel. This allows the DNA fragments to move through the gel.

DNA fragments of different lengths travel at different paces, with longer fragments travelling more slowly because of their higher molecular mass. By the end of the run, the gel has sorted the fragments according to their size (molecular weight). Fragments of the same size collect together in what are known as DNA bands, which can be viewed under UV light.

This whole process is called DNA profiling or DNA fingerprinting.
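Comparing two DNA profiles comes down to asking how many bands they share. The sketch below uses a simple band-sharing proportion (the Jaccard index) purely as an illustration. It is not the statistic used in the cycad study, and the fragment lengths are made-up values.

```python
def band_sharing(profile_a, profile_b):
    """Proportion of all observed bands that appear in both profiles."""
    bands_a, bands_b = set(profile_a), set(profile_b)
    if not bands_a and not bands_b:
        return 1.0  # two empty profiles are treated as identical
    return len(bands_a & bands_b) / len(bands_a | bands_b)


# Hypothetical fragment lengths (in base pairs) for two plants.
wild_plant = [120, 180, 250, 310, 420]
garden_plant = [120, 180, 250, 330, 420]

# Longer fragments travel less far, so they sit nearest the wells.
print(sorted(wild_plant, reverse=True))
print(round(band_sharing(wild_plant, garden_plant), 2))  # 0.67
```

Four of the six distinct bands are shared here, so these two hypothetical plants would be scored as fairly similar.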

The results of this case study on Albany cycad plants showed that there is little genetic variation between the subpopulations (in situ and ex situ) of the cycad plants. This is because the DNA bands of the plants in the in situ subpopulation are very similar to the DNA bands of the plants in the ex situ subpopulation.

Note that the variations in DNA bands are tested in greater detail in the actual study. The statistical mean and variance, as well as the statistical significance of the results, were examined. However, this is outside the scope of the course.

While there is little genetic variation between the subpopulations, there is large genetic variation among cycad plants within the same subpopulation, i.e. amongst plants within the in situ subpopulation and amongst plants within the ex situ subpopulation.

The fact that most of the genetic variation is found within subpopulations is why the Fst value is low. More specifically, the Fst value was 0.026 between the in situ and ex situ subpopulations.

Note that:

- An Fst value of 0 – 0.05 means that there is little to no genetic variation between subpopulations.
- An Fst value of 0.05 – 0.15 means that there is moderate genetic variation between subpopulations.
- An Fst value of 0.15 – 0.25 means that there is high genetic variation between subpopulations.
- An Fst value greater than 0.25 means that there is very high genetic variation between subpopulations.
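To make the Fst scale more concrete, here is a minimal sketch of one textbook way of calculating it, Fst = (Ht - Hs) / Ht, for a single gene with two alleles in two equal-sized subpopulations. The study’s own value of 0.026 came from AFLP data and may have been calculated differently. The allele frequencies below are hypothetical, and placing a value of exactly 0.05 or 0.15 in the higher band is an assumption, since the ranges above overlap at their edges.

```python
def expected_heterozygosity(p):
    """Expected proportion of heterozygotes when one allele has frequency p."""
    return 2 * p * (1 - p)


def fst_two_subpopulations(p1, p2):
    """Fst = (Ht - Hs) / Ht for two equal-sized subpopulations."""
    hs = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2
    ht = expected_heterozygosity((p1 + p2) / 2)
    if ht == 0:
        return 0.0  # no variation at all, so nothing to partition
    return (ht - hs) / ht


def interpret_fst(fst):
    """Map an Fst value onto the categories listed in the notes."""
    if fst < 0.05:
        return "little to no"
    if fst < 0.15:
        return "moderate"
    if fst <= 0.25:
        return "high"
    return "very high"


print(interpret_fst(0.026))  # the value reported for the cycads
print(round(fst_two_subpopulations(0.5, 0.6), 3))  # similar frequencies, low Fst
print(fst_two_subpopulations(0.0, 1.0))  # completely different, Fst of 1.0
```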

The low Fst value suggests that there is little genetic variation between the in situ and ex situ Albany cycad subpopulations. This in turn suggests that there has been high gene flow in the past and that the two subpopulations were, until recently, a single population. This is supported by the fact that no significant genetic variation between the subpopulations was found when F-statistics were used to test for statistical significance.

AMOVA, short for analysis of molecular variance, is a technique used in the study to examine genetic variation between subpopulations using the identified AFLP genetic markers.

The use of AMOVA allowed researchers to explore how the total genetic variation is distributed among and within subpopulations.

F-statistics are a type of statistical analysis used to test the significance of a hypothesis based on the results obtained from AMOVA.

To test the significance of the F-statistics, researchers studied the 3024 possible ways of arranging AFLP haplotypes (groups of genetic markers) from the 417 AFLP genetic markers within subpopulations for the 86 cycad plants that were studied.

The results showed that there was high repeatability in these 417 AFLP genetic markers between the two subpopulations. Specifically, the in situ subpopulation had 84% genetic similarity to the total population and the ex situ subpopulation had 81% genetic similarity to the total population. This suggests a high genetic similarity between the in situ (wild) and ex situ subpopulations.
This is consistent with AMOVA, which revealed that 95.1% of the Albany cycad population’s genetic variation was from within subpopulations and only 5.9% was from differences between the in situ (wild) and ex situ subpopulations.

Extra stuff from case study:

Something worth noting is that a Bayesian clustering approach (a type of statistical analysis) was used in the study to confirm the observed allele frequencies within plants in each subpopulation. It involves further dividing each subpopulation (i.e. in situ or ex situ cycad plants) into its own unique genotypic subpopulations.

The result of the Bayesian clustering approach in this study was that there is little to no differentiation in the genotypic subpopulations for both the in situ and ex situ subpopulations.

Recommended conservation efforts

The results have established that the AFLP genetic markers successfully inform the researchers about the genetic variation in the Albany cycad plant population, specifically both among and within their subpopulations.

As indicated by the low Fst value of 0.026 between subpopulations, there is large genetic similarity between the cycad plants in Kirstenbosch Botanical Garden and those in the wild (in situ). This indicates that the cycad plants at the Botanical Garden would be sufficient and effective in preserving the genetic variation of the species even if an event were to wipe out all wild cycad plants.

Having an ex situ collection of organisms that have high genetic similarity to the in situ organisms (in this case wild cycad plants) is an important safety measure from a conservation management point of view to ensure that the complete or majority of the genetic variation of a species is conserved.

Another subject matter in conservation management is population size. An objective of population genetics for the purpose of conservation is to increase the population size of species with small populations, to reduce the threat posed when harmful environmental selective pressures are introduced into the species’ habitat.

One method to increase population size is to spread and germinate seeds obtained from plants in the botanical garden. Another method is artificially pollinating wild plants with ex situ plants. As mentioned previously, the low Fst value suggests that the two subpopulations (ex situ and wild plants) were part of a single population quite recently. This implies that there are NO mating restrictions that would lower the fitness of the offspring produced when plants of the two subpopulations are crossed. This means that artificially pollinating wild plants with ex situ plants is feasible!

From the study, in terms of genetic variation, there are four genotypes that are not divided evenly between the two subpopulations. One way to ensure that all genotypes are included in the new cycad offspring in future conservation breeding programmes is to use DNA sequencing.

DNA sequencing should be performed on the new seedlings of the future Albany cycad generation to ensure that all the selected seedlings as a whole would inherit every genotype of the original cycad population.

Limitation of AFLP technique used in conservation studies

The AFLP technique cannot be used to distinguish between homozygous and heterozygous organisms because it uses dominant markers, which examine more than one locus (and thus more than one gene) at the same time.

There are a couple of possible ways to address this limitation in order to determine the allele inheritance patterns of offspring in a new generation. One of these ways is to use pedigree studies to study inheritance patterns, which we have talked about in the previous week’s notes.

Currently, there is research being performed in an attempt to obtain the genotype for every gene detected by dominant markers.

Population genetic studies used to determine the inheritance of a disease or disorder

A possible definition of disease is a health condition that arises as a response to internal or external variables.

A disease can lead to a disorder, which can be defined as a condition that hinders the normal functioning of an individual, either partially (e.g. an arm) or entirely.

Recently, the large-scale collaborative project called the International Genomics of Alzheimer’s Project (IGAP) led to a breakthrough in discovering more than 20 new genes as possible causes of late-onset Alzheimer’s disease (AD), i.e. AD that develops in people after the age of 65.

Each of these possible causes is related to a gene located at a different locus on a chromosome. A total of more than 50,000 patients who have a perceived risk of developing AD OR have developed AD are now part of the IGAP’s database.

The identification of more than 20 new genes as possible causes of late-onset AD was performed using a genome-wide association study (GWAS). A GWAS employs DNA sequencing, which we have discussed, to determine the exact DNA sequence. It is an observational study used to identify variants in DNA sequence or, more specifically, SNPs that differ between people who have AD and people who do not have AD.

As we learnt in the previous week’s notes, certain SNPs occur more frequently amongst affected individuals than amongst individuals who are not affected. A typical GWAS explores thousands of SNPs.
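The idea of a SNP occurring “more frequently” in affected individuals can be sketched as a comparison of allele counts. The example below uses a standard chi-square test on a 2 x 2 table for a single SNP. The counts are hypothetical and are not IGAP data, and a real GWAS repeats this kind of test across thousands of SNPs and applies a much stricter threshold than the one shown.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the table [[a, b], [c, d]].

    Rows: affected / unaffected. Columns: variant allele / other allele.
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one count")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical allele counts for one SNP.
affected = (60, 40)    # variant allele, other allele
unaffected = (45, 55)

statistic = chi_square_2x2(*affected, *unaffected)
# 3.841 is the usual 5% cut-off for a chi-square test with 1 degree of freedom.
print(round(statistic, 2), statistic > 3.841)
```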

After identifying various SNPs that are suspected to be related to late-onset AD, these varied DNA sequences are then tested for statistical significance using meta-analysis.

Upon analysis, the genes that researchers suspect to be related to late-onset AD can be classified into different groups. These genes were grouped based on being responsible for the immune response, for lipid breakdown and synthesis, or for recycling vesicles.

Until researchers can further narrow down the 20 genes that have been identified as associated with late-onset AD, it is difficult to perform epidemiological studies to accurately assess how reducing exposure to certain factors (e.g. toxic chemicals) may reduce the chances of developing AD after the age of 65.

We will explore epidemiological studies in Module 8.

Currently, it is known that mutations in the genes coding for three proteins lead to an autosomal dominant (usually early-onset) form of Alzheimer’s disease in the individual. These proteins are Presenilin 1 and Presenilin 2 (genes PSEN1 and PSEN2) as well as the amyloid precursor protein (APP).

By encouraging the public to take predictive screening (used to detect any genetic mutation that would lead to the development of disease later in life), more data can be gathered to help understand the correlation of certain genes with late-onset AD.

That being said, the remaining uncertainty about the exact genes responsible for late-onset AD means that predictive screening is less effective than desired in assisting patients with their diagnosis as well as in the prevention of late-onset AD.

It is also currently believed that no single gene is completely responsible for late-onset AD, as suggested by the recent discovery of the 20 loci containing genes that may be associated with it.

The REVEAL III study, performed in 2007 – 2009, used the screening results of over 290 patients to classify the risk of developing AD based on different factors such as age, gender and family genetics.

Advancements in next generation sequencing (NGS) can allow more accurate and faster identification of abnormalities in an individual’s whole genome, with less DNA material required than for GWAS, for instance. In the future, perhaps new technologies will be introduced that are capable of running more detailed algorithms to reveal more subtle patterns in the human genome.

Population genetics relating to human evolution

Anthropological genetics is the study of human population genetics.

There are multiple objectives of this type of study. One of these objectives that is of our interest in HSC Biology is to determine the origin of modern human civilisation and the evolutionary pathway leading to modern day human civilisation.

Population genetics attempts to resolve this question by examining the genetic variation and similarity in modern day humans.

There are currently two models or theories regarding the origin and evolutionary pathway leading to modern day human civilisation.

The first model is called the replacement hypothesis (also known as the ‘out of Africa’ model).

This model proposes that Homo erectus (a type of hominid that existed before Homo sapiens evolved) originated in sub-Saharan Africa and migrated out into different parts of the Old World about 2 million years ago. These Homo erectus groups then evolved independently at different locations, giving rise to groups such as the Neanderthals in Europe and western Asia.

However, modern day humans (Homo sapiens) are derived from the group of Homo erectus that STAYED and evolved independently in sub-Saharan Africa, and which LATER dispersed across the world about 100,000 years ago. The group that dispersed across the world 100,000 years ago outcompeted the other Homo erectus groups, such as those in the Middle East and the Neanderthals, and thus occupied most of the world and evolved into modern day humans.

This means that ancient human populations from the Middle East (Western Asia) and the Neanderthals of Europe were NOT direct ancestors of modern human civilisation.

The second model is called the multi-regional hypothesis.

This model also proposes that Homo erectus (a type of hominid that existed before Homo sapiens evolved) originated in sub-Saharan Africa and migrated out into different parts of the Old World about 2 million years ago. From then on, the Homo erectus groups in different parts of the Old World evolved independently. However, there was also high gene flow between the Homo erectus groups residing in different parts of the Old World. This means that the groups from different parts of the world were able to interbreed with each other, and their offspring would have had a mix of their genes.

These offspring with mixed genes were able to evolve independently into modern day humans (Homo sapiens) in different parts of the world. Due to high gene flow, modern humans look very similar despite living in different parts of the world.

The argument against the multi-regional model is that research currently suggests only about 10,000 ancestors are responsible for modern day human civilisation.

This number may not be sufficient to ensure a high enough gene flow amongst the Homo erectus groups situated in different parts of the Old World.

This would mean that convergence of the genetically different Homo erectus ancestors in different parts of the Old World should have taken longer to occur, because the small ancestor population size limits gene flow. The counter-argument (thus supporting the multi-regional theory) is that the estimated number of ancestors of modern human civilisation does not equal the total population size of all Homo erectus. The total population, including those that are NOT ancestors of modern humans, may have been much higher! The 10,000 are ONLY the ancestors of modern day human civilisation and likely NOT the total number of Homo erectus living 2 million years ago, which presumably would have been higher.

Fossil Evidence

The oldest fossil evidence of hominid ancestors exhibiting characteristics of modern humans was found in Africa and dates to 130,000 years ago. This fossil therefore supports the Out of Africa theory: our direct ancestors are said to have left Africa 100,000 years ago, and a fossil with similar characteristics to modern humans was present in Africa 130,000 years ago, i.e. 30,000 years before our direct ancestors left. It does not support the multi-regional theory.

Lucy is the name given to a hominid, believed to be an ancient ancestor of human civilisation, that existed more than 3 million years ago.

Origin of modern human civilisation using mtDNA

We mentioned in previous weeks that, in eukaryotes, DNA also exists in mitochondria in the form of mitochondrial DNA (mtDNA).

Every human’s mitochondrial DNA (mtDNA) is inherited ONLY from their mother and does NOT recombine with mtDNA from the father. This is because the father’s mtDNA is broken down during the fertilisation process. So, the offspring will not have the father’s mtDNA.

Because mtDNA is inherited only through the female parent, it is more susceptible to genetic drift (divergence) following an event such as a mutation. We will explore genetic drift in Module 6.

That being said, researchers are able to identify these mutations by examining individuals’ mtDNA sequences. For any two organisms, the fewer the mutational differences and, thus, the greater the similarity between the two individuals’ mtDNA, the more closely they are related.
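Counting mutational differences between two sequences is straightforward to sketch. The example below counts the positions at which two equal-length sequences differ and then finds the most similar pair. The short sequences and the names are hypothetical (real mtDNA comparisons use far longer, aligned sequences).

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Number of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for base_a, base_b in zip(seq_a, seq_b) if base_a != base_b)


def most_closely_related(sequences):
    """Pair of names whose sequences have the fewest differences."""
    return min(
        combinations(sorted(sequences), 2),
        key=lambda pair: count_differences(sequences[pair[0]], sequences[pair[1]]),
    )


# Hypothetical mtDNA snippets.
samples = {
    "person_1": "ACGTACGTAC",
    "person_2": "ACGTACGTAT",
    "person_3": "TCGAACGTGT",
}

print(count_differences(samples["person_1"], samples["person_2"]))  # 1
print(count_differences(samples["person_1"], samples["person_3"]))  # 4
print(most_closely_related(samples))  # ('person_1', 'person_2')
```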

By comparing mtDNA from people around the world, studies led by a scientist named Rebecca Cann have shown that the most recent common female ancestor of all modern humans lived about 200,000 years ago in Africa. This supports the replacement hypothesis or ‘Out of Africa’ theory, which proposes that the ancestors of modern human civilisation have a more recent African origin than the multi-regional hypothesis proposes.

That being said, this finding by Rebecca Cann and her team does not mean that all modern humans originated from only one female ancestor. It simply shows that one set of mtDNA genes was passed down from one female ancestor. There may have been many other females who passed down their sets of mtDNA, but these lineages were lost.

As we have already seen, the fossil evidence for the origin of modern human civilisation includes much older specimens. For example, Lucy is a female hominid that lived more than 3 million years ago.

Some reasons why these sets of mtDNA were lost and not passed down to modern human civilisation may include global events such as an Ice Age, which led to the decline of the human population to about 15,000 people about 70,000 years ago.

Other possible events include ancient plagues which wiped out a significant portion of the population, so that only the females that survived were able to pass on their unique set of mtDNA genes to new generations. This is known as the bottleneck effect.

This had the consequence of the current modern human civilisation having only one common female ancestor rather than multiple.

The mtDNA from an ancient Neanderthal specimen found in Germany, which lived about 35,000 years ago, differed from that of modern humans at 27 base pairs. This is larger than the typical mtDNA difference (8 base pairs) found between modern day humans.

The mtDNA from another Neanderthal from 29,000 years ago was also examined and differed at 23 base pairs when compared to modern day humans.

This mtDNA evidence therefore does not support the multi-regional hypothesis, which proposes that the Neanderthals of Europe could have evolved to become ancestors of modern human civilisation, and not solely Homo erectus from Africa as per the ‘Out of Africa’ hypothesis.

That being said, the difference in base pairs between the Neanderthal individual that lived about 30,000 years ago and modern humans is approximately the same as the difference between chimpanzees and modern humans.

Since chimpanzees and humans share 96% identical DNA, the Neanderthal individual may have been a race of Homo sapiens.
Perhaps one day we will discover that a modern human carries the mtDNA of a Neanderthal ancestor that we have so far not discovered!

Conclusion of anthropological genetics:

By studying population genetics in anthropological genetics, geneticists found that modern human civilisation is most likely derived from ancestors that evolved in and migrated out of Africa, as proposed by the replacement hypothesis.

That being said, we cannot confirm or reject the multi-regional hypothesis, as some studies have revealed that a high proportion of the mtDNA of some modern humans originated from ancestors from South Asia rather than Africa.

Learning Objective #2 - Investigate the use of technologies to determine inheritance patterns in a population using DNA sequencing and profiling

Up until this point, we have looked at DNA fingerprinting that was used in the case study about the conservation of Albany cycad plants.

DNA fingerprinting is in fact DNA profiling as we have mentioned during our Albany cycad plants case study.

In this learning objective, we will explore what the DNA sequencing technique involves and why it is more useful than DNA fingerprinting for certain* genotyping studies.

* Keep in mind that the definition of ‘usefulness’ will vary and depend on your objective!

What is DNA sequencing?

Sanger sequencing is the basic methodology on which modern computerised DNA sequencing techniques are based.

Sanger sequencing uses the natural process of DNA replication to identify the precise order of nucleotides for the sequence of a DNA segment.

The target DNA segment is first amplified or copied by polymerase chain reaction (PCR) using primers and DNA nucleotides (dNTP).

These DNA segments are then heated to denature the double helix, effectively separating the two DNA strands. In some cases, sodium hydroxide or an acid can be used to separate the strands instead.

DNA polymerase reads the template strand from its 3’ end to its 5’ end and adds complementary nucleotides to the new strand from the 5’ end to the 3’ end.

A DNA primer molecule (a short run of nucleotides complementary to the template DNA strand) is used to help DNA polymerase initiate the complementary base pairing (or replication of the single DNA strand) process. The primer essentially provides a free 3’ end to which the first complementary nucleotide can be joined by a (phosphodiester) bond.

There are modified DNA nucleotides called ddNTP or dideoxynucleotides. The normal DNA nucleotides we explored in DNA replication are called deoxyribonucleotide triphosphate (dNTP).

Both ddNTP and dNTP can carry any one of the four nitrogenous bases. So, similar to how normal DNA nucleotides (dNTP) can be complementary base paired with the DNA template strand, ddNTP can be paired with the DNA template strand as well. The one difference is that when a ddNTP is added to the growing DNA strand, it marks the termination of the DNA replication/synthesis or complementary base pairing process.

This is because ddNTP does not have a 3’ hydroxyl group (it is missing an oxygen) with which to bond to the next nucleotide and continue the DNA replication or synthesis process. Therefore, the DNA replication process is terminated early whenever a ddNTP is paired with the template strand.

In the Sanger method, four reaction vessels are used. Each vessel contains:

- Many identical copies of the DNA template strand containing the sequence of DNA that we want to precisely identify.
- Many primer molecules, each of which is radioactively labelled. The radioactive labelling of the primers becomes significant later in the process.
- DNA polymerase molecules.
- Many of each of the four unmodified (normal) DNA nucleotides (dATP, dCTP, dGTP and dTTP).
- A small amount of one type of ddNTP molecule. Each of the four reaction vessels receives a different type of ddNTP.

Since small amounts of ddNTP is added to each reaction vessel, the DNA synthesis of the complementary base paired DNA strand will occasionally be terminated earlier when the DNA polymerase pairs the ddNTP with the original DNA template strand rather than a normal dNTP.

This means there are different lengths of DNA fragments in each of the reaction vessel as the pairing of the ddNTP and dNTP with the DNA strand is random.

Sometimes the ddNTP is paired as early as possible (close to the 5’ end of the new strand). Sometimes, the ddNTP is paired as late as possible (close to the 3’ end). Because there is a large number of DNA template strands in each of the four reaction vessels, the reactions together produce partially replicated DNA strands that end at EVERY single nucleotide position along the DNA template strand, as well as fully replicated strands.
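The random termination described above can be sketched as a small simulation. This is only an illustration under stated assumptions: the template sequence, the number of copies and the chance of picking a ddNTP (dd_fraction) are made up, and the primer is ignored so that fragment length simply counts the nucleotides added.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesise(template_3_to_5, dd_base, dd_fraction, rng):
    """Build one new strand (5' to 3') along the template.

    Synthesis stops early if a ddNTP is incorporated instead of a dNTP.
    """
    new_strand = []
    for template_base in template_3_to_5:
        base = COMPLEMENT[template_base]
        new_strand.append(base)
        # Only the ddNTP type present in this vessel can terminate the strand,
        # and only at positions where its own base is the one being added.
        if base == dd_base and rng.random() < dd_fraction:
            break
    return "".join(new_strand)

def run_vessel(template_3_to_5, dd_base, copies=1000, dd_fraction=0.2, seed=0):
    """Simulate many template copies in one reaction vessel."""
    rng = random.Random(seed)
    return [synthesise(template_3_to_5, dd_base, dd_fraction, rng)
            for _ in range(copies)]

template = "TACGGATCA"  # hypothetical template, written 3' to 5'
fragments = run_vessel(template, dd_base="A")  # the ddATP vessel
lengths = sorted(set(len(f) for f in fragments))
print(lengths)
```

In this sketch the new strand would be ATGCCTAGT, so the ddATP vessel can only contain fragments that stop at an A (lengths 1 and 7) or strands that were fully replicated (length 9).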

At the end of the DNA synthesis reaction, the DNA fragments from each reaction vessel are loaded into a separate well in the gel electrophoresis machine, where polyacrylamide gel is used to fill up the plate compartment of the machine.

When an electric current is supplied, the DNA fragments will move from the negative to the positive side of the plate, and the gel will slow down and separate the DNA fragments according to their length or molecular weight. The longer the DNA fragment, the larger its molecular weight and the smaller the total distance it travels towards the positively charged side (bottom) of the plate.

The separation of these DNA fragments creates what we call DNA bands that are scattered across the plate.
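To picture how the bands end up arranged, here is a rough text sketch of a gel. It assumes only the ordering of the bands matters (longest fragments near the wells at the top, shortest at the bottom); the function name draw_gel and the band lengths are hypothetical, and real gels do not space bands evenly like this.

```python
def draw_gel(lanes, max_length):
    """Return a text picture of a gel.

    lanes maps a lane label (the ddNTP base in that vessel) to the set of
    fragment lengths found in that vessel.
    """
    labels = sorted(lanes)
    rows = ["len  " + "  ".join(labels)]
    # Longest fragments stay near the wells (top); shortest travel furthest (bottom).
    for length in range(max_length, 0, -1):
        marks = "  ".join("-" if length in lanes[label] else " " for label in labels)
        rows.append(f"{length:>3}  {marks}")
    return "\n".join(rows)

# Hypothetical band lengths for a new strand that reads ATGC (5' to 3').
lanes = {"A": {1}, "T": {2}, "G": {3}, "C": {4}}
print(draw_gel(lanes, max_length=4))
```

Each "-" stands for one DNA band, and each column is one lane (one reaction vessel).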

Because every DNA fragment in the gel contains a radioactively labelled primer, autoradiography can be used to produce an image of the radioactive DNA bands.

In the first lane, suppose that ddATP (a modified dNTP with an adenine base) was added. This lane will consist of all the partially replicated DNA strands whose synthesis terminated at a position where the template strand had a thymine base, which ddATP can pair with. Since ddATP is a modified nucleotide, once it pairs with the template strand it terminates the DNA replication process by preventing further normal dNTPs (DNA nucleotides) from continuing the complementary base pairing.

The complementary DNA nucleotide sequence to the original DNA template strand can be identified by reading from the bottom of the plate to the top of the plate. That is, from the smallest DNA fragment to the largest fragment (going from 5’ to 3’), which is the direction in which complementary nucleotides are added along the original DNA strand.

Using the complementary DNA nucleotide sequence, the original DNA sequence of the sample DNA can be determined.
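Reading the gel and working back to the original strand can be sketched as follows. The band lengths are hypothetical, the primer is ignored, and it is assumed that every fragment length shows up in exactly one lane.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def read_gel(lanes):
    """Read the new (complementary) strand 5' to 3', from the shortest band to the longest.

    lanes maps the ddNTP base used in a vessel to the fragment lengths seen in its lane.
    """
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[length] for length in sorted(base_at_length))

def template_from_new_strand(new_strand_5_to_3):
    """Complement each base and reverse the order so the template is also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(new_strand_5_to_3))

# Hypothetical bands: for example, the ddATP lane has bands of length 1 and 7.
lanes = {"A": {1, 7}, "T": {2, 6, 9}, "G": {3, 8}, "C": {4, 5}}
new_strand = read_gel(lanes)
print(new_strand)                            # ATGCCTAGT
print(template_from_new_strand(new_strand))  # ACTAGGCAT
```

The reversal step is needed because the two strands run in opposite directions, so the 5’ end of the new strand sits opposite the 3’ end of the template.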

We will be uploading a YouTube video to illustrate this process very soon.

DNA Profiling (DNA Fingerprinting)

We have already touched on the DNA fingerprinting technique in the cycad plant case study.

Both DNA sequencing and DNA profiling use gel electrophoresis to observe DNA bands.

The difference between DNA sequencing and DNA fingerprinting (profiling) is that DNA sequencing is more specific, in the sense that the precise order of nucleotides in a DNA segment can be determined.

In DNA sequencing, the DNA bands that are formed can be used to determine the nucleotide sequence of the complementary DNA strand and thus can be used to identify the original DNA sequence.

DNA sequencing can be used when researchers wish to compare the evolutionary relatedness of two organisms (typically from two different species). This technique was used to conclude that crocodiles are more closely related to birds than to lizards.

In DNA fingerprinting or profiling, the DNA bands that are formed CANNOT be used to determine the precise nucleotides that make up the DNA sequence of the complementary DNA strand (and thus the original DNA sequence).

DNA profiling uses DNA from the non-coding regions of the genome (i.e. regions that do not code for a protein). Within these regions, there are repetitive sequences called short tandem repeats (STRs). The number of repeats in these STRs varies across individuals, so DNA fragments of different lengths are generated when the DNA is cut using the same restriction enzyme. These restriction enzymes cut DNA at specific sequences, as we talked about in the cycad plant conservation case study earlier.
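To make the idea of a short tandem repeat concrete, here is a minimal sketch that counts how many times a motif is repeated back to back. The sequences and the GATA motif are made-up examples, and count_tandem_repeats is a hypothetical helper, not part of any laboratory software.

```python
import re

def count_tandem_repeats(sequence, motif):
    """Return the largest number of back-to-back copies of motif found in sequence."""
    pattern = f"(?:{re.escape(motif)})+"
    runs = [match.group(0) for match in re.finditer(pattern, sequence)]
    if not runs:
        return 0
    return max(len(run) // len(motif) for run in runs)

# Two hypothetical individuals with the same flanking DNA but different repeat numbers.
individual_1 = "CCGATAGATAGATATT"
individual_2 = "CCGATAGATAGATAGATAGATATT"
print(count_tandem_repeats(individual_1, "GATA"))  # 3
print(count_tandem_repeats(individual_2, "GATA"))  # 5
```

Because the second individual has more repeats, the same stretch of DNA is longer in that individual, which is why the band ends up in a different position on the gel.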

This means that DNA profiling compares the differences and similarities between individuals based on variations in the short tandem repeats (STRs) in their non-coding regions of DNA, rather than on the precise nucleotide sequence as in DNA sequencing.

This therefore allows the detection of polymorphisms at a particular segment of DNA at a chromosome locus (genetic marker) between individuals in the population. We saw the use of DNA profiling to detect polymorphisms in the Albany cycad plant study.

Another classic example of the use of DNA profiling is in forensic investigation, where the DNA sample from the crime scene is compared against the suspect’s DNA. The more closely the DNA bands match, the more likely it is that the sample came from the suspect.
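The forensic comparison can be sketched by treating each profile as a set of loci, each with two band sizes (given here as repeat numbers). The locus names and all of the values are made up for illustration, and a real forensic comparison also involves statistics on how common each allele is in the population, which this sketch leaves out.

```python
def count_matching_loci(profile_a, profile_b):
    """Return (number of loci with identical bands, number of loci compared)."""
    shared_loci = profile_a.keys() & profile_b.keys()
    matches = sum(
        1 for locus in shared_loci
        if sorted(profile_a[locus]) == sorted(profile_b[locus])
    )
    return matches, len(shared_loci)

# Hypothetical profiles: locus name -> the two band sizes (repeat numbers) seen.
crime_scene = {"locus_1": (12, 14), "locus_2": (8, 9), "locus_3": (15, 15)}
suspect_1 = {"locus_1": (14, 12), "locus_2": (9, 8), "locus_3": (15, 15)}
suspect_2 = {"locus_1": (12, 13), "locus_2": (8, 9), "locus_3": (15, 16)}

print(count_matching_loci(crime_scene, suspect_1))  # (3, 3)
print(count_matching_loci(crime_scene, suspect_2))  # (1, 3)
```

Here the bands of suspect 1 match the crime scene sample at every locus compared, while suspect 2 matches at only one locus.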

Another example of the use of DNA profiling is in paternity testing, where the offspring’s DNA is compared against the DNA of potential fathers. For example, it can help confirm the identity of a lost child who has been found after being missing from the family for many years. The more closely the two individuals’ DNA bands match each other, the more closely related they are.
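A paternity comparison works slightly differently from a forensic one, because a child inherits only one of the two alleles at each locus from the father. The sketch below assumes no mutations and uses made-up locus names and repeat numbers. The function name consistent_with_parent is hypothetical.

```python
def consistent_with_parent(child, alleged_parent):
    """True if the child shares at least one allele with the alleged parent at every locus compared."""
    shared_loci = child.keys() & alleged_parent.keys()
    if not shared_loci:
        return False  # nothing to compare, so no support for a relationship
    return all(
        set(child[locus]) & set(alleged_parent[locus])
        for locus in shared_loci
    )

# Hypothetical profiles: locus name -> the two alleles (repeat numbers) at that locus.
child = {"locus_1": (12, 14), "locus_2": (8, 9)}
father_a = {"locus_1": (14, 15), "locus_2": (9, 11)}
father_b = {"locus_1": (10, 11), "locus_2": (9, 9)}

print(consistent_with_parent(child, father_a))  # True
print(consistent_with_parent(child, father_b))  # False
```

A result of True only means the profiles are consistent with a parent and child relationship at the loci tested. The more loci that are tested, the stronger the support.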

Determining Inheritance patterns using DNA profiling

When gel electrophoresis is performed on DNA from one locus on an individual’s chromosomes, it is possible to determine whether the individual is homozygous or heterozygous for that particular gene!

If the organism is heterozygous, there will be two DNA bands shown.
If the organism is homozygous, there will only be one DNA band shown.
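The two rules above can be written as a short sketch. It assumes a diploid organism and that the two alleles at the locus give bands of different sizes whenever they differ. The band sizes used are hypothetical.

```python
def zygosity(band_sizes):
    """Classify one locus from the band sizes seen in a single lane."""
    distinct_bands = set(band_sizes)
    if len(distinct_bands) == 1:
        return "homozygous"    # both alleles give a band of the same size
    if len(distinct_bands) == 2:
        return "heterozygous"  # two different alleles give two separate bands
    raise ValueError("expected one or two bands for a single locus in a diploid organism")

print(zygosity([120]))       # homozygous
print(zygosity([120, 136]))  # heterozygous
```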
Have a look at the diagram below.