Any experts on evolution here?

Human Genome Bears a Virus Related to HIV-1

“Because of these viral gene insertion events, genetic material from inactive viruses accounts for roughly 3 percent of the human genome. Cullen says that 30-50 copies of HERV-K exist in the human genome, and that some of the copies appear to be active at a low level in normal testicular and placental tissue. The HERV-K genes show even more activity in certain cancers, especially those involving the testes, "but there doesn't seem to be a harmful effect from the activity of these genes," Cullen said.”
Re: reply to Eflex

Originally posted by paulsamuel

Given your responses to examples of purported vestigial anatomical structures, I must ask if it's your contention that all DNA, all morphology is functional?

I know it sounds a little hokey.....

But I am a firm believer in the notion, "As above, so below"

If we study a large system like the universe, we can learn a great deal about smaller systems.

If we study small systems like our mitochondria and ATP production, we can learn a great deal about larger systems.

My point is this:
The universe has this unknown substance named "Dark Matter" that accounts for the lion's share of mass in our universe.

Astronomers don't fully understand Dark Matter but its importance is undeniable.
Some believe that Dark Matter is the stuff that holds the stars in place via gravity....

The human genome has 30,000 - 40,000 known genes; the rest is categorized as "junk DNA".

My supposition is that ALL DNA is functional in some way.
The function may be more subtle than simply coding for protein or RNA, but the function IS there.
reply to scilosopher

Promoter regions appear to be too similar phylogenetically in different species to be derived from junk DNA (unless it occurred in a common ancestor during the origin of eukaryotes). Promoter regions contain highly conserved regions among all eukaryotes, including the TATA box and the Pribnow box.

When you comment on the divergence of these non-functional regions and the difficulty in phylogenetic comparisons due to this divergence, this is my point exactly. Non-functional areas will mutate freely with no selective constraints, therefore divergence (in a DNA sequence) will occur more rapidly. This is evidence that these regions really are non-functional.

Thanks for the bacteria ref. (although these regulatory regions are not promoters, as promoters are strictly a eukaryotic characteristic). Although I have only read the abstract, it appears that, even in bacteria, these regulatory regions have conservation that indicates common ancestry. This is more evidence that these non-coding regulatory regions are not derived from junk DNA.

Thanks also for the C. elegans ref. These results agree with my own research on intron variation in mammals. Most of the sequence conservation between species is in the 5' and 3' ends of the intron, while the middle part of the intron seems to be free to vary. A quote from the abstract represents my view exactly concerning the identification of regulatory (but non-coding) DNA regions, i.e., "The alignment confirms that patterns of conservation can be useful in identifying regulatory regions and rarely expressed coding regions."
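The pattern described above (conserved intron ends, variable middle) is the kind of thing a windowed identity scan would show. Here is a minimal sketch in Python, assuming two sequences that are already aligned and of equal length. The two toy sequences and the function name `window_identity` are made up for illustration; they are not real intron data and not the method used in the C. elegans paper.

```python
# Sketch: fraction of identical bases per window along a pairwise alignment.
# Assumption: the two sequences are pre-aligned and have the same length.

def window_identity(seq_a: str, seq_b: str, window: int) -> list[float]:
    """Return the fraction of matching positions in each non-overlapping window."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0:
        raise ValueError("window must be positive")
    identities = []
    for start in range(0, len(seq_a), window):
        chunk_a = seq_a[start:start + window]
        chunk_b = seq_b[start:start + window]
        matches = sum(1 for a, b in zip(chunk_a, chunk_b) if a == b)
        identities.append(matches / len(chunk_a))
    return identities

# Hypothetical toy "introns": identical ends, fully diverged middle.
TOY_SPECIES_1 = "GTAAGT" + "ACGTAC" + "TTTCAG"
TOY_SPECIES_2 = "GTAAGT" + "TGCATG" + "TTTCAG"

if __name__ == "__main__":
    for index, identity in enumerate(window_identity(TOY_SPECIES_1, TOY_SPECIES_2, 6)):
        print(f"window {index}: identity {identity:.2f}")
```

With real data one would want a proper alignment first and overlapping windows, but the idea is the same: windows with high identity are candidates for functional constraint.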

I think it an intriguing hypothesis that non-coding, and apparently non-functional, DNA can have a functional role as spacers for the attachment of scaffolding proteins such as histones. Are there refs. confirming the requirement of such spacers for the protein attachment?

reply to hamster

You said:
"This hamster guesses it would be easier to show genes crossing species by direct comparison of genomes than by observing the mechanism of transfer."

This is exactly right, and we do it all the time right now when we do phylogenetic comparisons. We would expect, if there were a lot of interspecies genetic transfer, that phylogenies would not be concordant at those genes that were transferred, relative to those genes that are common by descent. In fact, we do see this discordance when examining inter-species hybridization. We have identified viral insertions in genomes using this method. Thus far, we have not seen interspecies genetic transfer via some vector, like a virus, but we have seen insertions of disease organisms' DNA into the host genome.

It is not really surprising to me that functional constraints would keep animal genomes similar. It is what one would expect, but in the non-functional areas of the genome, there's lots of difference. I've done some research on intron variation in mammals and the variation there is quite high, as it is at degenerate third position bases in codons (both introns and these third positions have a similar rate of substitution).

The mechanisms of bacterial interspecific genetic transfer are unlikely to be applicable in eukaryotes, and there is no evidence, thus far, that it occurs. There is really no reason to invoke this type of genetic transfer, as all evidence points to commonality by descent, which is the most parsimonious explanation.

One must also remember that, in sexually reproducing species, gene transfers have to occur in the gametes for them to be of evolutionary significance, i.e., heritable (unless one wants to invoke Lamarckian evolution).

Co-evolution is a commonly accepted hypothesis, though difficult to prove; at least co-speciation has been shown to occur.

I am still not convinced that junk DNA has any potential to become anything more than that.
Reply to hamster

Thanks for the refs.

The bacteria example is well described and is an excellent example of interspecific gene transfer, especially in conferring adaptations to different species. However, these types of transfers can't occur in eukaryotes without some type of vector, like a bacterial or viral disease organism.

The plant ref. is a mechanism for biotechnologists to transfer genes between plant species (agricultural gene therapy). The only natural gene transfer occurs between the disease organism (bacteria) and the host.
reply to Eflex

Thanks for your reply. You've made your position quite clear (excellent). I personally think that your opinion is untenable. It's difficult to show that every base pair of DNA is functional, and there's no evidence to support it. But it's quite easy, and has been done, to show that stretches of DNA sequence are not functional and not necessary. But, good luck.
I'm glad you liked the refs. I have to really disagree about the similarity of promoters dispelling the evolutionary model under discussion. Anything that becomes functional would typically have many features maintained, but it doesn't explain its origin.

In terms of junk DNA and regulatory regions, bacterial regulation is much more compact and the characterization of regulation in bacteria generally fits with the mechanism discussed, but they have only one cell type and much different evolutionary constraints. I was just giving a paper I knew on synteny mapping, though in this case I can see how it might have been misleading. In context the whole discussion came up through the increase in noncoding DNA in eukaryotes and its possible role. I agree that introns aren't a great example either, but the delineation of regions containing functional elements is one of the reasons it is an early example of more large scale sequence comparisons.

In any organism, for a transcription factor to start regulating new genes, new binding sites have to be generated. These are the facts:

The binding sites of many transcription factors are quite fuzzy (i.e., degenerate sequence binding specificity, and there is at least one example of a factor that binds two distinct motifs).

Multiple clustered sites are a common feature of many regulatory regions. Clustering allows cooperative binding, which gives switch-like behavior, and in other cases lets many poor sites compensate for fewer, more exact sites.

These regulatory regions frequently occur in a range of 10kb around the gene and can even have intervening genes, allowing development of useful regulatory modules over a large range of sequence. There is even a cell death regulation gene called reaper in Drosophila that is 65 aa and has a 100kb regulatory region. This requires a certain amount of scaffold and available sequence just to develop.

I never said or thought they wouldn't be conserved; I even directed you to papers that found sequence conservation. But new binding sites have to come from somewhere and the mechanistic constraints of the system are such that it makes a lot of sense. If you also note the increasing amounts of regulatory DNA in higher eukaryotes it would follow that we may have evolved flexible constraints just to allow such increased regulation.

The role of many DNA binding proteins is still poorly understood so there aren't a lot of great refs. Histones are known to be involved in regulating transcription through DNA accessibility, which operates through regulated post-translational modifications such as acetylation/deacetylation by histone acetylases (HATs) and deacetylases (HDACs). Methylation and phosphorylation also occur. These affect the superstructure of histone interactions, changing the local chromatin structure. Chromosomes are also somehow associated with the nuclear periphery, but the proteins which mediate this are unknown. More may be known about centromere proteins that bind the chromosomes. I know INCENP, aurora B, and survivin are all closely associated with the chromosome in anchoring microtubule assembly during mitosis.
You may be right about similarity of promoters not proving evolution by descent, but the alternative is even more unlikely, although possible. Imagine a random stretch of DNA sequence eventually becoming a regulatory region by random substitutions. I can't believe it myself and believe it to be highly unlikely.

I don't know a lot about the evolution of transcription binding sites, but, to me, it's unlikely that they just arise from junk DNA. I suspect that transcription binding sites coevolve with transcription factors. This is an easily testable hypothesis. Anyone ever done it?

I fail to see how the facts about transcription factors and their binding sites are evidence that they arose from junk DNA.
Paul, thanks for the info. This hamster was aware of only a few species' genomes being decoded. Hadn’t realized there had been many phylogenetic comparisons.

One difficulty in evaluating data is that the data may be fit to the answers the experimenter expects to see. Thus the experimenter may declare that the data shows “A” occurred rather than “B”, not realizing that the data really represented the unknown case “C”. In this case the possibilities were “divergence from common ancestral gene”, “divergence from hybrid gene”, “viral code”, and “unknown”. Would the discordance from the “unknown” case be recognized and reported?

Actually, this hamster doesn’t think the idea of genes crossing species is so wild that some experimenters wouldn’t have considered it. Likely they do look for it and the evidence just isn’t there.

Paul posted: “I am still not convinced that junk DNA has any potential to become anything more than that.”

This hamster agrees that a long segment of random base pairs is extremely unlikely to accidentally code for a useful protein. If junk DNA does somehow contribute to new coding DNA, a mechanism more likely to produce useful proteins would be needed.

Deriving a new gene from an existing redundant gene by a series of simple mutations seems similar to a “greedy” optimization algorithm. These algorithms require that the new solution be better than the old at each step. They tend to find local minimums rather than global best solutions.
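To make the "greedy" point concrete, here is a minimal hill-climbing sketch in Python. It is framed as maximizing a fitness score (minimizing a cost is the same thing with the sign flipped). The one-dimensional landscape and the name `greedy_climb` are invented for illustration; real sequence space is of course nothing like a ten-element list.

```python
# Sketch: a greedy climber only accepts steps that improve fitness,
# so where it ends up depends on where it starts.

def greedy_climb(landscape: list[float], start: int) -> int:
    """Walk to the better neighbor until no neighbor is better; return the final index."""
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(landscape)]
        best = max(neighbors, key=lambda i: landscape[i])
        if landscape[best] <= landscape[position]:
            return position  # stuck: every single step makes things worse or no better
        position = best

# Hypothetical fitness values with a low peak (index 2) and a high peak (index 8).
TOY_LANDSCAPE = [1, 3, 5, 4, 2, 1, 4, 7, 9, 6]

if __name__ == "__main__":
    for start in (0, 4, 5):
        end = greedy_climb(TOY_LANDSCAPE, start)
        print(f"start {start} -> stops at {end} (fitness {TOY_LANDSCAPE[end]})")
```

A climber starting at index 0 or 4 stops on the lower peak and never reaches the higher one, because getting there would require passing through worse intermediate steps.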

This hamster wonders if nature might use several optimization schemes in parallel. Bacteria, plants, and animals all seem to use different mechanisms and face different selection pressures favoring those different mechanisms.

Another question is how rapidly non-functional DNA mutates. The link to viral DNA existing in the human genome indicated that some of the viral code is still expressed in certain tissues. Would seem to indicate the viral DNA isn’t changing very rapidly. Presumably the normal DNA repair mechanisms are repairing that viral DNA.
Existing sites most likely do coevolve with the TF. But that doesn't explain where new sites come from. The situation is much different from whether a coding gene could arise by accident, where there are much longer stretches of dependencies.

Worst case scenario - an exact 8 base binding site with equal base probabilities would occur by chance with a probability of about 1.5E-5 at any given position. In a region of 10kb that means about 0.15 sites expected. That's for an exact sequence for one transcription factor.
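For anyone who wants to check the arithmetic, here is a short Python sketch of the estimate above. It assumes equal base frequencies, independent positions, and a single strand, all of which are simplifications. The function names are just illustrative.

```python
# Sketch: chance of one exact 8-base site turning up in 10 kb of random sequence.
# Assumptions: each base has probability 1/4, positions are independent, one strand only.

SITE_LENGTH = 8          # bases in the binding site
REGION_LENGTH = 10_000   # 10 kb region

def per_position_probability(site_length: int) -> float:
    """Probability that a given start position matches one exact sequence."""
    return 0.25 ** site_length

def expected_sites(site_length: int, region_length: int) -> float:
    """Expected number of exact matches across all start positions in the region."""
    positions = region_length - site_length + 1
    return positions * per_position_probability(site_length)

def prob_at_least_one(site_length: int, region_length: int) -> float:
    """Probability of one or more matches, treating start positions as independent."""
    positions = region_length - site_length + 1
    return 1.0 - (1.0 - per_position_probability(site_length)) ** positions

if __name__ == "__main__":
    print(f"per position:      {per_position_probability(SITE_LENGTH):.2e}")
    print(f"expected in 10 kb: {expected_sites(SITE_LENGTH, REGION_LENGTH):.3f}")
    print(f"P(at least one):   {prob_at_least_one(SITE_LENGTH, REGION_LENGTH):.3f}")
```

The expected count comes out near 0.15, and the probability of at least one site is slightly lower (about 0.14), since the two only coincide when the expected count is very small.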

Since multiple transcription factors are typically expressed in an overlapping fashion you also get multiple tries.

Figuring in the fact that binding sites are fuzzy, i.e., don't need to be an exact sequence but just something close, the number of possible sequences shoots way up.
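One rough way to put a number on "fuzzy" is to count every sequence within a few mismatches of the ideal site as a hit. That is a crude stand-in for real binding specificity (which is better described by position weight matrices), and the mismatch model here is an assumption for illustration only.

```python
# Sketch: how allowing mismatches inflates the chance of a site occurring.
# Assumptions: equal base frequencies, any sequence within `max_mismatches`
# of the ideal site counts as a binding site.
from math import comb

def sequences_within(site_length: int, max_mismatches: int) -> int:
    """Count sequences differing from the ideal site at no more than max_mismatches positions."""
    return sum(comb(site_length, k) * 3 ** k for k in range(max_mismatches + 1))

def expected_fuzzy_sites(site_length: int, max_mismatches: int, region_length: int) -> float:
    """Expected number of acceptable sites across all start positions in the region."""
    positions = region_length - site_length + 1
    per_position = sequences_within(site_length, max_mismatches) / 4 ** site_length
    return positions * per_position

if __name__ == "__main__":
    for mismatches in (0, 1, 2):
        count = sequences_within(8, mismatches)
        expected = expected_fuzzy_sites(8, mismatches, 10_000)
        print(f"{mismatches} mismatches: {count} sequences, ~{expected:.1f} expected in 10 kb")
```

For an 8 base site, allowing one mismatch gives 25 acceptable sequences instead of 1, and allowing two gives 277, so the expected number of sites in 10kb goes from well under one to dozens.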

Then there is clustering of many weak sites. This has an effect even if they are spread over 150 bp or more.

I just don't see why this is unbelievable. How do you think additional regulatory regions evolve?
Sorry, didn't mean to confuse. You're right, only a few species' entire genomes have been sequenced, but one doesn't need the entire genome to do phylogenetic analyses. Literally, tens of thousands of phylogenetic analyses have been published. In fact there are a couple of journals devoted entirely to phylogenetic analyses (Systematic Biology, Molecular Phylogenetics and Evolution).

You can be pretty confident that an experimenter would not be allowed to evaluate the published data in any biased manner. All acceptable biology journals are peer reviewed with a scientific board of editors and outside referees. There are certain assumptions associated with these analyses, though. Discordance would be discovered and reported, and, in fact, it has been.

Well, genes crossing species is not so wild an idea that it is dismissed outright, and, in fact, as you have pointed out, we know it happens in bacterial species. The problem with eukaryotes is that there's no mechanism to test. If we get a proposed mechanism for this type of transfer, we can test it.

Redundant genes are probably the source for new genes. In the complex of genes for histocompatibility, this is accepted.

I don't know anything about "optimization algorithms," but your statement, "tend to find local minimums rather than global best solutions," is what has been theorized by Wright's shifting balance theory. In this theory, one can envision an evolutionary landscape with adaptive peaks. A population reaching an adaptive peak (through evolutionary selective processes) would find it difficult to reach another adaptive peak in the landscape, even though it was higher (more adaptive) because selection would preclude crossing a "valley" to reach the new peak. If you came up with this idea independently, I am extremely impressed with your intellect and intuitiveness. Can't say I would have.

I don't know the different optimization schemes of which you speak, but the evolutionary mechanism of natural selection appears to be at work in bacteria, plants and animals. An alternative mechanism has been proposed in the neutral (or nearly neutral) theory of evolution. It appears that both processes are at work in both eukaryotes and prokaryotes.

We have a good idea about the rate of random mutation and it appears to be universal. The maintenance of the viral gene expression in humans could have a number of explanations which Cullen et al. do not go into in their paper, but there has been much theoretical work on the maintenance of genetic variation. It's important to note, however, that by being incorporated into the human genome, this gene is now on a different evolutionary trajectory defined by human selective constraints.
Ok, well maybe the creation of new regulatory binding sites for TFs is less improbable than I originally thought, and perhaps it occurs, but other regulatory regions are more complex and less likely to arise randomly from junk DNA. Can you propose a way in which to test the theory that junk DNA is the source of these binding sites? Has there been any published work on the origin of these sites? We already have a mechanism for the generation of new genes: duplication and selection. I think that this is the most likely mechanism for the generation of new binding sites, new regulatory regions (promoters and enhancers) and coding genes.

Ultimately, I think it unlikely that junk DNA is an adaptive characteristic as a source for new functions (we're back to the original discussion) because evolution doesn't have foresight.
Paul, this hamster can’t make claims to rediscovering Wright’s “shifting balance theory”. The hamster has been focusing on “optimal” proteins rather than optimal adaptation to an environment. (Though as you note the same algorithmic principle applies to total species adaptation.)

This hamster follows aging research. A hot topic is substances to protect against and repair damage in various tissues. Some critters such as bats seem to have far more effective mechanisms than do mice. A natural question is determining what genes confer that advantage and whether transferring those genes to a mouse would increase its lifespan.

Did bats just get lucky? Did bats experience more selective pressure to “design” a better antioxidant? Is it likely that a human biochemist unconstrained to slight modifications of existing genes could produce a much better antioxidant? If one species got very lucky and “discovered” a very effective gene, would it eventually displace other species? Is there any evidence that this has happened? Is such a possibility remote due to the three billion years of cellular evolution that occurred before the first multi-cellular life appeared? (All the good genes are already taken. Sigh.) These questions seem to relate to the biological mechanisms for generating new protein “solutions”.

There are many ways that “errors” introduce genome changes. (You guys know this stuff far better than the hamster.) Calling those mechanisms “optimizing algorithms” rather than “errors” emphasizes their power to accelerate adaptation. Natural selection weeds good changes from bad.

Even though Scilosopher is focusing on gene regulation and this hamster has been focusing on generation of new structural genes, the same optimization issues arise. What biological mechanisms accelerate species adaptation? (Unlike Scilosopher, hamster speculation isn’t hampered by knowing much biology. Hehe.)
reply to Hamster

The assumption regarding aging is that longer life is adaptive. This is not a generally accepted hypothesis, human desirability notwithstanding. May I suggest, re: aging, evolution and selective adaptiveness, these books by Sir Peter Medawar: The Life Science: Current Ideas of Biology, and Induction and Intuition in Scientific Thought.
Paul, this hamster doesn’t assume long life is adaptive. This hamster understands that natural selection is all about maximizing viable offspring. This can lead to longer or shorter lifespan depending on factors such as expected lifespan in the wild. (E.g. possums living on an isolated island vs. on a mainland with predators.) Little reason to invest energy in maintaining an animal that has little chance of rearing more than one litter. Also understand redundancy theory as a reason why animals live past child rearing ages.

(Checked out Medawar’s theory on aging. Already familiar with it.)
Researcher traces gene development in 'last common link'

Traces an invertebrate gene duplicating and taking on new function in vertebrates.

“A field of research has arisen to address what kinds of genetic change over time have occurred in different species to account for so many physical differences despite such genetic similarity. It is called evolutionary development.

"Evo-devo," as Gibson-Brown affectionately refers to this budding discipline, combines the principles of traditional evolutionary and developmental biology in examining the change in gene sequence and regulation that over time lead to the development of new species and eventually new body plans.“
thanks hamster

Thanks for that link. It was extremely pertinent to our discussion. This ref. should have tons of refs. to other examples of the origins of new functions and morphology arising from genes of differing functions.
Bugs enjoy hamster sex

“Waters's results suggest that bacteria try their luck with mammalian cells all the time. But there is little evidence that such matings are fruitful - reports of bacterial genes transferred directly into the human genome are disputed.”

“It now seems that bacteria can mate with any organism with a cell membrane, says Stanley Maloy, a biologist at the University of Illinois, Urbana-Champaign. He says this idea has very profound implications for the debate over the origins of bacterial genes that are present in the human genome but absent in our closest relatives (Science, 8 June, p. 1903): The amount of conjugation Waters detected is high enough to readily explain the possible infiltration of bacterial genes into our DNA, meaning that conjugation could have happened quickly enough to add genes only to humans, in the years since they split from the common ancestor they shared with chimpanzees.”

On the other hand…

So as a strong supporter of interspecies gene transfer one must wonder who and what your parents were ... the evidence is mounting - just how can a hamster be so smart?

I had heard about bacterial genes being found in the human genome which ended up being contamination by the bacteria they were cloned into. This is always a concern when working from unconfirmed sequence. One might also be well advised to PCR genes from human-only cells before believing the sequence.

It's cool that someone has shown that mammalian/bacterial cells can "do it", although I would be curious to know what exactly is in the cell culture media ... one can get cells to do a lot of things in culture they won't in vivo.

I just noticed I dropped the ball when you asked about a way to test junk DNA for binding site generation. I would take a Drosophila mutant for a gene that reduces fertility and rescue with a working copy that has a purported junk region in front of it (one would also have to make sure that there were no enhancers in the introns).

From there I would do some combination of the following: 1) single fly PCR every n generations on the upstream region and sequence it, 2) wait until the fertility rate has come back up significantly and then test regions upstream of the construct for rescue driving the non-mutant gene in the mutant (or just see if it can drive expression of GFP in the right cell types), 3) use biochemical techniques (methylation interference/gel shift/DNase footprinting) to characterize possible binders from those known to be expressed in the right fashion (if they exist - this would also be aided by #2).

A fertility gene should have strong selective pressure on it without much effort. My guess is you would first get expression that rescues the phenotype but is not necessarily very specific. From there I don't know if it would be selected in a fashion that would permit tracking the "evolution" to specific expression.

While I agree that evolution does not design anything (or have foresight) the possibility that more complex organisms have been selected for molecular processes that encourage the generation of varied morphologies and new function is not farfetched.

Look at somatic hypermutation and B-cell maturation. While I'm no immunologist and have never gotten the whole thing straight, taking a semi-specific B-cell antibody and evolving it to recognize an antigen more specifically within the life-span of an organism is pretty remarkable. If biology can take advantage of such processes on one scale it may be able to do so on another. Evolutionary pressure is scale invariant - it doesn't care how things work, just that they do.

You might be interested in reading Mark Ptashne's "Genes and Signals" or "A Genetic Switch". Part of my thinking on this subject matter came from him and he's much more credible and knowledgeable on the subject. "Genes and Signals" especially gets into general issues of protein interactions in evolution and puts forward a model of regulated recruitment. It's pretty interesting stuff.