Genetic libraries are collections of genes carried in some recombinant DNA form so they can be propagated. When people refer to “screening a library,” they usually have a phenotype that they can select or screen for, and they evaluate a large number of library clones to look for a gene that alters that phenotype. People interested in eukaryotic biology usually make cDNA libraries derived from pools of mRNA isolated from an organism of interest. This allows them to isolate DNA fragments encoding proteins or RNAs that are produced from the spliced forms of the RNAs found in the cells. It has been a long time since I have worked with cDNA libraries, so I won’t go into that here (perhaps someone in another group can add a section?).
For example, suppose we have a strain of bacteria that can’t grow on lactose (like Salmonella) and we are interested in finding genes that are needed for lactose metabolism. First, we prepare a plasmid that has been digested with two different restriction enzymes so that it can accept similarly digested DNA fragments. Second, we digest the genomic DNA of an organism that can metabolize lactose (like E. coli) and ligate the fragments into the plasmid. Third, we transform the Salmonella with the recombinant plasmids we have made and look for Salmonella that can grow on lactose as a carbon source. The plasmid that contains the genes responsible for lactose metabolism can then be isolated and sequenced to identify, hopefully, the lac operon of E. coli.
In the above example, a selection was used because only the cells able to use lactose as a carbon source could grow. We could also have screened for the ability to cleave lactose with β-galactosidase by putting X-Gal in the plates and looking through thousands of white colonies for a blue colony.
There are many variations on library creation. An investigator may choose to randomize a small segment of a cloned gene and screen the variants for a mutant with a new phenotype. A whole gene or plasmid can be mutated and then transformed for screening. A screen can be set up for “multi-copy suppressors” that rely on having an excess of a gene to obtain a phenotype. It’s all up to you and your smart noodle to figure out the best way.
I am presenting this strategy to make a library of genomic E. coli fragments in plasmids that replicate in E. coli. In searching for genes or gene clusters, it’s important to keep in mind that any given pair of restriction enzymes will only cover a fraction of the DNA present in the chromosome. Because the E. coli genome is sequenced (at least for several K-12 strains), you can estimate the number of times your genomic DNA prep will be cut by a particular endonuclease. When you are cutting with two enzymes to make library fragments, only fragments with a different enzyme’s end on each side are clonable, and each site of the “least cutting” endonuclease can border at most two such fragments. The number of clonable fragments from a complete digest will therefore be, at most, twice the number of times the “least cutting” endonuclease cuts. Because of this, it is wise to set up several libraries with different endonucleases. Not only will this give better coverage, but it will also allow the identification of a gene that has one of the sites within it (such a gene would never be isolated intact from a library made with that enzyme because it would always be cut).
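The estimate described above can be sketched in a few lines of Python. This is only a rough illustration, not a validated tool: the function names (`find_sites`, `clonable_fragments`) are my own, positions are taken at the start of each recognition site rather than at the exact cut, and a site that happens to span the start/end of the circular sequence string is ignored. Because the recognition sites used here are palindromic, searching one strand is enough.

```python
import re

def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    # The lookahead lets overlapping occurrences be reported too.
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq.upper())]

def clonable_fragments(seq, site_a, site_b, circular=True):
    """List fragments from a complete double digest that carry one end from
    each enzyme. Returns (start, end, approximate_length) tuples."""
    cuts = sorted(
        [(p, "A") for p in find_sites(seq, site_a)]
        + [(p, "B") for p in find_sites(seq, site_b)]
    )
    # Neighbouring cut sites define the fragments of a complete digest.
    pairs = list(zip(cuts, cuts[1:]))
    if circular and len(cuts) > 1:
        pairs.append((cuts[-1], cuts[0]))  # fragment that wraps around the origin
    frags = []
    for (p1, e1), (p2, e2) in pairs:
        if e1 != e2:  # only A/B fragments fit a vector cut with A and B
            frags.append((p1, p2, (p2 - p1) % len(seq)))
    return frags

# Toy "genome": two EcoR I sites (GAATTC) and one Avr II site (CCTAGG).
toy = "GAATTC" + "A" * 10 + "CCTAGG" + "A" * 10 + "GAATTC" + "A" * 10
frags = clonable_fragments(toy, "GAATTC", "CCTAGG")
n_eco = len(find_sites(toy, "GAATTC"))
n_avr = len(find_sites(toy, "CCTAGG"))
print(len(frags), "clonable fragments; upper bound", 2 * min(n_eco, n_avr))
```

With a real genome you would read the sequence from a FASTA file instead of using the toy string, and you would probably also want to filter the fragments by a size range that your vector can comfortably carry.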
It is a good idea to use endonucleases that leave 4 bp overhangs so that the ligation efficiency is high. There are four enzymes that leave the same CTAG overhang (Avr II, Nhe I, Spe I, and Xba I). This is quite useful. You can prepare a single vector preparation cut with, say, EcoR I and Avr II, and ligate four different genomic digests into it (EcoR I and each of the four enzymes that generate complementary overhangs).
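As a quick check of the compatible-overhang idea, here is a minimal Python sketch that derives the 5' overhang from each recognition site and groups enzymes that share it. The table and function names are mine, and the one-line overhang rule assumes a palindromic site cut the same distance in from each end on both strands, which holds for the five enzymes listed but not for enzymes in general.

```python
# Recognition site and top-strand cut offset (number of bases before the cut).
# For example, Avr II cuts C^CTAGG, so its offset is 1.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),  # G^AATTC
    "AvrII": ("CCTAGG", 1),  # C^CTAGG
    "NheI": ("GCTAGC", 1),   # G^CTAGC
    "SpeI": ("ACTAGT", 1),   # A^CTAGT
    "XbaI": ("TCTAGA", 1),   # T^CTAGA
}

def overhang(name):
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    site, k = ENZYMES[name]
    return site[k:len(site) - k]

def compatible_with(name):
    """Return the other enzymes in the table that leave the same overhang."""
    target = overhang(name)
    return sorted(n for n in ENZYMES if n != name and overhang(n) == target)

print(overhang("AvrII"))         # CTAG
print(compatible_with("AvrII"))  # ['NheI', 'SpeI', 'XbaI']
print(compatible_with("EcoRI"))  # []
```

Keep in mind that when an end from one of these enzymes is ligated to an end from a different one (say Avr II to Xba I), the junction no longer matches either recognition site, so the insert cannot be cut back out with those two enzymes.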
When preparing the vector for your library, you want to minimize the background of transformants lacking an insert. A trace amount of singly-cut vector can produce a lot of transformants after ligation. I generally employ one of two strategies to get around this.
- The easiest is to add a third restriction enzyme to your digest that cuts between the two sites of interest. In doing so, vectors that were cut by only one of your library enzymes get secondarily cut to prevent self-ligation. In general cloning of fragments, I find this to be far more effective than using a phosphatase (which reduces overall ligation/transformation efficiency).
- If your vector doesn’t have convenient sites for making the library, or if your vector is a low-copy vector, you can use PCR to amplify the replication origin and drug-resistance gene while appending convenient restriction sites on the ends (see 'Round-the-horn site-directed mutagenesis). If you follow this route, keep in mind that most of your ligated plasmids will have large segments that are not host modified. Therefore, transform the library into cells that have no restriction system. This method greatly reduces “vector-only” transformants.
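For the first strategy, picking the third enzyme comes down to finding one that cuts only in the stretch between your two library sites. The sketch below shows one way to check this in Python. It is hypothetical throughout: `stuffer_cutters` and the toy vector are made up for illustration, and it assumes the vector sequence is written so that the region to be discarded lies between the two library sites without wrapping around the end of the sequence string.

```python
import re

def site_positions(seq, site):
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    return [m.start() for m in re.finditer(f"(?={re.escape(site)})", seq.upper())]

def stuffer_cutters(vector, site_a, site_b, candidates):
    """Return names of candidate enzymes whose sites all fall between the two
    library sites, so they cut the discarded piece but not the backbone."""
    a = site_positions(vector, site_a)
    b = site_positions(vector, site_b)
    if len(a) != 1 or len(b) != 1:
        raise ValueError("each library enzyme should cut the vector exactly once")
    first, second = sorted((a[0], b[0]))
    # The stuffer starts after the first library site and ends before the second.
    start = first + len(site_a if first == a[0] else site_b)
    chosen = []
    for name, site in candidates.items():
        pos = site_positions(vector, site)
        if pos and all(start <= p and p + len(site) <= second for p in pos):
            chosen.append(name)
    return sorted(chosen)

# Toy vector: EcoR I site, short stuffer with a BamH I site, Avr II site,
# then "backbone" containing a Hind III site.
toy_vector = "TTTT" + "GAATTC" + "CC" + "GGATCC" + "CC" + "CCTAGG" + "TTTT" + "AAGCTT" + "TTTT"
candidates = {"BamHI": "GGATCC", "HindIII": "AAGCTT", "PstI": "CTGCAG"}
print(stuffer_cutters(toy_vector, "GAATTC", "CCTAGG", candidates))  # ['BamHI']
```

Here Hind III is rejected because it would cut the backbone, and Pst I is rejected because it does not cut at all. You would still need to confirm that the third enzyme works in the same buffer as your two library enzymes.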