Isolation and sequencing of active origins of DNA replication by nascent strand capture and release (NSCR)

II. Supplementary methods
A. Purification of genomic DNA from cultured cells by SDS Lysis Buffer and Proteinase K
B. Separation of DNA from RNA by using TRIzol Reagent
C. Preparation of sucrose gradient manually
D. Size fractionation of nascent DNA + DNA fragments by 16 ml 5%-30% sucrose gradient
E. 5'-labeling nascent strand DNA + DNA fragments with biotin for one reaction
F. Detection of the cleavage ability of RNase I using oligonucleotides with incorporated rNTP


INTRODUCTION
DNA replication is the once-per-cycle duplication of DNA during S-phase of the cell cycle [1]. It is a multistep process that involves a large number of factors, which are highly coordinated with other cellular processes during the cell cycle, differentiation and development [2][3][4][5]. The first step in DNA replication is the denaturation of DNA to form a "bubble" or "replication eye" (Fig. S1). Primase initiates synthesis of a 5'-RNA primer, which serves as the substrate for extension by DNA polymerase. Although the size of the RNA primers may differ between organisms, mammalian primase usually synthesizes a primer of ~11 nt (9-11 nt for mouse and 11-14 nt for human) [6]. This process results in 5'-RNA-DNA-3' chimeric molecules, which are the actual nascent strands [7,8]. Replication forks initiated from a given origin extend bi-directionally until merging with forks from adjacent origins. In the leading strand, nascent DNA extends continuously 5' to 3' in the direction of fork movement, whereas in the lagging strand 5' to 3' synthesis of DNA occurs opposite to the movement of the replication fork. In the latter case, synthesis proceeds by repeated RNA primer synthesis by primase and short extensions of DNA to form Okazaki fragments. Internal RNA primers are removed and Okazaki fragments are ligated together to extend the length of the lagging nascent DNA strand. Ligation of Okazaki fragments is processive, such that each nascent strand becomes progressively longer while remaining centered at the origin of replication (Fig. S1). Nonetheless, nascent strands retain the 5' RNA primer acquired from ligation of the most recent Okazaki fragment [7,8].
Structure-based isolation exploits the "replication bubble" present at the initiation site. Recently, a modification of the method has been applied to the isolation and sequencing of origins from mammalian cells, where origins are first enriched based on association with the nuclear matrix, followed by purification of replication "bubbles" by trapping in an agarose matrix and retarding their migration during gel electrophoresis [21][22][23]. Limitations of this method include low resolution and sequence biases resulting from a requirement for restriction endonuclease digestion [19]. Additionally, enrichment of putative origin sequences based on attachment to the nuclear matrix may result in a distorted representation of origins across the genome. Furthermore, sequences may also be recovered on the basis of structural properties other than those associated with a replication "bubble", e.g., R-loops or repair intermediates.
Sequence read depth approaches rely on the ability to discriminate regions of the genome that have undergone replication from those that have not. They use the relative copy number of different sequences represented in a sequencing library and the fact that sequences near origins will be duplicated earlier than sequences distal to origins. In an exponentially growing population of cells these differences are small, but recent studies suggest that bioinformatics approaches may be sufficient to identify origins based on this property [11,12].
Strand switching of Okazaki fragments results from bidirectional and semi-discontinuous replication [13][14][15][16]. Taking advantage of this property in next generation sequencing-based approaches requires that the 5'-3' orientation of isolated Okazaki fragments be retained. Such an approach has recently been applied to the identification of replication origins in vaccinia virus [24].
Figure 1. Schematic view of the nascent strand capture and release (NSCR) method. First, nascent strands (green lines with red) are released from total DNA (black lines) by heat denaturation and size selected using sucrose gradient fractionation. Gradient fractions containing fragments between 400 and 2000 nt are pooled. Fragments larger than 400 nt are large enough to exclude Okazaki fragments, and an upper size of 2000 nt ensures that the sequences lie in close proximity to their origins. The pooled fractions contain both nascent strands and broken DNA (black lines), both of which are 5' biotinylated (indicated with B). Biotinylation is performed by first introducing a 5' thiol using T4 polynucleotide kinase (T4-PNK) with ATPγS as the substrate. T4-PNK transfers the thiophosphate group from the γ position of ATPγS to the 5'-hydroxyl terminus of single-stranded DNA and RNA. Second, the thiol is reacted with biotin-maleimide to introduce a 5' biotin. All 5'-biotin-labeled fragments are loaded on streptavidin magnetic beads, taking advantage of the highly specific and very strong non-covalent interaction between streptavidin and biotin. The method continues with extensive washing steps to eliminate non-specific interactions. The final step is release of the SNS by RNase I digestion of the 5' RNA primer (indicated in red), providing a positive selection for nascent strand DNAs.
Short nascent strand (SNS) based methods for mapping replication origins are the most widely used analyses [17]. These methods depend on physical isolation and characterization of newly synthesized SNSs that reflect the beginning of replication [17,25]. Generally, DNA is isolated, denatured by heating to release nascent strands, and size fractionated to recover DNAs larger than Okazaki fragments but small enough to lie near origins of replication. The approach continues with purification of SNSs from contaminating broken DNA and characterization of the recovered sequences. The main experimental obstacle is the separation of SNSs from broken DNA of similar size, which could be a significant part of the fractionated material [17][18][19]. Primarily, two approaches have been used to enrich for SNSs. In one method, replicating DNA is labeled with BrdU and selected using anti-BrdU antibodies. An advantage of this approach is a positive selection for de novo replicated DNA. A second approach for SNS enrichment is the use of lambda-exonuclease (lexo) to degrade and select against broken DNA contamination. Lexo is a 5' to 3' exonuclease that exhibits a strong preference for double stranded DNA (dsDNA) over RNA substrates. This property allows enrichment for chimeric 5'-RNA-DNA-3' fragments through preferential degradation of contaminating single strand DNAs (ssDNA) [26,27]. The activity of lexo on ssDNA is not as high as on dsDNA, and a very high concentration of the enzyme is therefore required to provide significant enrichment [28]. Further, lexo stalls at GC-rich sequences, can be blocked at sites of ribonucleotide incorporation, will not digest circular DNA, and can degrade RNA and hence nascent strands under some conditions, all of which could result in a sequence bias [19,26,29]. A head to head comparison of origins identified from HeLa cells by BrdU selection and by lexo shows that, over the same 30 Mb region, BrdU selection identifies 815 origins and lexo identifies 320 origins, where 49% of the origins identified by lexo are also found by BrdU selection [30].

Development of NSCR and applications of the method and possible modifications
The nascent strand capture and release (NSCR) protocol was developed as an alternative method to enrich SNSs from size-fractionated single strand genomic DNA through a sequence-neutral, positive selection for chimeric 5'-RNA-DNA-3' molecules. As in the lexo method, genomic DNA is isolated, heat denatured and size fractionated to obtain the appropriately sized fragments. The material is then 5'-biotinylated using maleimide chemistry, bound to streptavidin beads, and extensively washed under conditions that minimize DNA-DNA interactions. The method uses a final step of RNase I treatment to release only the RNA-containing fragments (Fig. 1).
The developed protocol has been applied to compare SNSs from wild type and minichromosome maintenance protein 2 (MCM2) deficient mouse embryonic fibroblasts (MEFs) [31]. With minor modification, NSCR could be used to examine Okazaki fragments by selecting smaller DNA from sucrose gradients. One type of information that Okazaki fragments could provide is the location of regions undergoing active replication per se. Another potential application is the identification of sites of ribonucleotide incorporation into DNA.
Ribonucleotide incorporation occurs at a frequency of approximately 1/2,000-1/10,000 nt in normal cells [32,33], and even a single ribonucleotide incorporated into a deoxyribonucleotide oligomer is cleaved by RNase I (Fig. S3). Examination of sequences recovered from non-dividing cells by NSCR could allow the locations of these events to be determined. In this case the sucrose gradient should be omitted or much larger size fractions used. One caveat to this application is that ribonucleotides are replaced with deoxyribonucleotides in non-dividing cells.

Experimental design (Fig. 1)
NSCR begins with purification of genomic DNA. DNA is then purified from the bulk of RNA by using TRIzol reagent because, at a later stage, RNA will compete with nascent strands during 5'-biotin labeling. It is important to avoid using RNase to remove contaminating RNA since the 5' RNA leader in the nascent strand DNA must remain intact. DNA is re-suspended in TE buffer and heat denatured at 95-100ºC for 10 min. At this point, accurate estimates of DNA concentrations can be obtained; a total of approximately 1 mg of RNA-free DNA is recommended prior to proceeding to fractionation on sucrose gradients. For nascent strand isolation, a 5% to 30% sucrose gradient was used, from which fractions containing DNAs ~400 to ~2000 nt long are recovered. DNA of this size is sufficiently large to exclude Okazaki fragments, but small enough to lie near origins of replication. The appropriate fractions are pooled, ethanol precipitated and washed with 70% ethanol to concentrate and desalt the DNA. After dissolving in an appropriate volume, nucleic acids are 5'-biotinylated and bound to streptavidin coated magnetic beads. The next critical step is washing the beads to remove material bound through interactions other than the biotin-streptavidin interaction. We have modified a protocol from David Wilson (published online, http://molbio.mgh.harvard.edu/szostakweb/protocols/biotin-avidin/index.html) for this purpose.
The final step is release of SNSs with RNase I (Fig. 1). RNase I is used since it exhibits little to no sequence specificity [34,35]. The RNase I released fraction constitutes the SNS product and can be assessed by PCR or amplified using whole genome amplification (WGA) in preparation for next generation sequencing. For WGA, we have used the Sigma-Aldrich SeqPlex DNA Amplification Kit (SEQXE), following the instructions provided, including primer removal. Because the kit uses randomized 5'-primers, the amplified material consists of DNA fragments approximately 200 bp in length (Fig. 2).

Limitations and considerations
One limitation of the NSCR method is its efficiency. It is estimated that ~10-20 ng of SNS between 400 and 2000 nt is present in 600 µg of genomic DNA [8]. In comparison, NSCR routinely yields approximately 1-2 ng from 1 mg of total DNA, a yield of less than 10% of the theoretical calculation. Loss of SNSs may occur at several steps in the procedure, including size fractionation, biotinylation, binding of biotinylated DNA to streptavidin beads, wash steps, and RNase I digestion. We observed that a significant proportion of the DNA failed to bind to the streptavidin beads, suggesting that improvements in either the biotinylation or binding steps could significantly improve the yields. Due to the low yields, whole genome amplification (WGA) is necessary to obtain sufficient material for next generation sequencing (NGS). The use of WGA raises several additional concerns. WGA results in truncation and loss of 5' to 3' orientation relative to the initial NSCR isolated SNSs. Additionally, differential amplification of different regions of the genome based on GC-content or secondary structure could alter the representation of sequences during amplification. Further, substantial bias can occur with some kits for library preparation. We obtained satisfactory results with the IntegenX PrepX DNA ChIP Library Prep kit. To control for these potential sources of bias, WGA and library preparation can be performed on genomic DNA that does not contain short nascent strands. DNA from non-dividing cells, or DNA from which short nascent strands have been removed (e.g., the fraction containing the largest DNA during size fractionation on sucrose gradients), can be used for this purpose.
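The yield comparison above can be checked with a short calculation. The following Python sketch scales the literature estimate (ng of SNS per 600 µg of genomic DNA) to a 1 mg input and compares it with the observed NSCR yield. The function names are illustrative, and pairing the lower bounds with each other and the upper bounds with each other is an assumption made only for this example.

```python
def expected_sns_ng(total_dna_ug, sns_ng_per_600ug):
    """Scale the estimate of SNS content (ng per 600 ug genomic DNA) to a given input."""
    return sns_ng_per_600ug * total_dna_ug / 600.0


def recovery_fraction(observed_ng, expected_ng):
    """Fraction of the theoretically available SNS that was actually recovered."""
    return observed_ng / expected_ng


if __name__ == "__main__":
    input_ug = 1000.0  # 1 mg of total DNA, as in the text
    # Assumption: lower bounds are paired together, and upper bounds together.
    for estimate_ng, observed_ng in [(10.0, 1.0), (20.0, 2.0)]:
        expected = expected_sns_ng(input_ug, estimate_ng)
        fraction = recovery_fraction(observed_ng, expected)
        print(f"expected {expected:.1f} ng, observed {observed_ng:.0f} ng, "
              f"recovery {fraction:.0%}")
```

Under these assumptions both pairings give a recovery of about 6%, consistent with the statement that the yield is less than 10% of the theoretical value.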
A second concern, equally applicable to the lexo and NSCR methods, is that ribonucleotides are incorporated into DNA by mechanisms independent of the activity of primase. DNA-dependent polymerases occasionally fail to distinguish between rNTPs and dNTPs (e.g., DNA polymerase delta incorporates one rNTP per 2000 dNTPs) [32]. Inefficient removal of RNA primers from Okazaki fragments can also result in the presence of ribonucleotides at significant distances from the elongating replication fork, although RNase H2 class ribonucleases remove many of the ribonucleotides from DNA in vivo [32,33].
Treatments or mutations that result in an increased frequency of rNTP incorporation may result in a higher rate of inclusion of sequences that are not derived from SNSs in either lexo or NSCR isolated DNA. We tested the activity of RNase I on a single ribonucleotide incorporated into a 63 nt ssDNA oligomer under the same conditions used for NSCR. RNase I activity was indeed observed on a single ribonucleotide flanked by oligo-deoxyribonucleotides (Fig. S3). Importantly, DNA containing an incorporated ribonucleotide will be cleaved by RNase I and released from the streptavidin beads, thereby contaminating the SNS preparation. If the sites of ribonucleotide incorporation are random in the genome, then this contamination will not be a major problem.
NSCR shows good reproducibility between experimental repeats (~90% of the largest 20% of peaks were common to two independent NSCR analyses of MEFs [31]). It may be possible to validate peaks identified by NSCR as origins by determining if peak width correlates with the size of DNAs recovered from sucrose gradients. Additionally, concordance with orthogonal methods (many of which are summarized above) would support the validity of origins identified by NSCR. Although a genome-wide head to head comparison has not been performed, a prior estimate of the number of nascent strand peaks from MEFs within a 60.4 Mbp region of mouse Chr11 has been made using lexo [28] and identified 2231 putative origins. In comparison, NSCR (using a 95% cutoff value) identifies 3922 peaks in this region, where 79% of the peaks identified by lexo overlap peaks identified by NSCR. However, this analysis is complicated by the differing peak widths between the two studies [31].
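The kind of overlap statistic quoted above (the fraction of peaks from one method that overlap peaks from another) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used in the cited studies: the function name, the half-open (start, end) interval convention and the toy coordinates are all assumptions.

```python
def fraction_overlapping(query, reference):
    """Fraction of query intervals that overlap at least one reference interval.

    Intervals are (start, end) tuples on a single chromosome, treated as
    half-open, so intervals that merely touch are not counted as overlapping.
    """
    if not query:
        return 0.0
    hits = 0
    for q_start, q_end in query:
        # Two intervals overlap when each one starts before the other ends.
        if any(r_start < q_end and q_start < r_end for r_start, r_end in reference):
            hits += 1
    return hits / len(query)


if __name__ == "__main__":
    # Hypothetical peak coordinates, for illustration only.
    lexo_peaks = [(100, 200), (500, 600), (900, 950)]
    nscr_peaks = [(150, 400), (590, 700)]
    print(f"{fraction_overlapping(lexo_peaks, nscr_peaks):.0%} of lexo peaks overlap an NSCR peak")
```

The measure is asymmetric (swapping the two peak sets generally gives a different value), and wider peaks in the reference set make overlaps more likely, which is one reason differing peak widths complicate such comparisons.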

POL Scientific
Capture nascent DNA + DNA fragments on streptavidin magnetic beads (Timing ~4 h)
1. Pipet the calculated volume of streptavidin coated magnetic beads into a 1.5 ml tube using a cut tip. Wash the beads by adding 1 ml Buffer A, rinse the tip in the buffer to collect any adhering beads, and apply the magnet for 3-5 min until the supernatant is clear and the beads are all collected into a spot adjacent to the magnet. Remove the supernatant, taking care not to disturb the beads.

NOTE:
The bead concentration is 4 µg/µl and the binding capacity is 1 pmol of nucleic acid 5' ends per 1 µg of beads, i.e., 4 pmol/µl or 4 nmol/ml. Assuming that 5' ends are 100% labeled, 100 ng of 500 nt ssDNA (or RNA) has 0.6 pmol of 5' ends (see Supplemental Information section E). Therefore, theoretically, 1 µl of beads is enough to bind ~600 ng of ssDNA of ~500 nt in length. Starting with ~1-2 mg of DNA (RNA free) for sucrose gradient centrifugation, the typical yield we observe after biotinylation of the 400-2000 base fractions is around 100-200 µg of biotinylated DNA, equivalent to 0.6-1.2 nmol of 5' ends. (The yield will depend on a number of factors, including the proportion of dividing cells in the starting sample and the level of DNA breakage during isolation.) ~150 µl of beads is required per 100 µg of 5' biotinylated DNA. Using these parameters, the bead volume required to provide an excess of binding capacity should be calculated.
If more than one sample is being prepared, use an equal volume of beads for all reactions ensuring that the amount is sufficient for the sample with the greatest amount of DNA.It is important not to overload the bead capacity.
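The bead volume calculation described in this note can be sketched as follows. This is a hedged illustration: the average mass of 330 g/mol per nucleotide is an assumed approximation for ssDNA that is not stated in the protocol, and the function names and the `excess` safety factor are hypothetical.

```python
AVG_NT_MASS_G_PER_MOL = 330.0  # assumption: approximate average mass of one ssDNA nucleotide


def five_prime_ends_pmol(mass_ng, length_nt):
    """pmol of 5' ends in mass_ng of single-stranded fragments of length_nt."""
    # ng divided by g/mol gives nmol; multiply by 1000 to convert to pmol.
    return mass_ng / (length_nt * AVG_NT_MASS_G_PER_MOL) * 1000.0


def bead_volume_ul(dna_ug, length_nt=500, capacity_pmol_per_ul=4.0, excess=1.0):
    """Bead volume (ul) that binds dna_ug of 5'-biotinylated DNA.

    capacity_pmol_per_ul: 4 ug/ul beads x 1 pmol 5' ends per ug, from the note.
    excess: hypothetical safety factor (1.0 means exactly matched capacity).
    """
    ends_pmol = five_prime_ends_pmol(dna_ug * 1000.0, length_nt)
    return excess * ends_pmol / capacity_pmol_per_ul


if __name__ == "__main__":
    print(f"{five_prime_ends_pmol(100, 500):.2f} pmol of 5' ends in 100 ng of 500 nt ssDNA")
    print(f"{bead_volume_ul(100):.0f} ul of beads per 100 ug of biotinylated DNA")
```

With these assumptions the sketch gives about 0.6 pmol of 5' ends per 100 ng and about 150 µl of beads per 100 µg, in line with the figures in the note; setting `excess` above 1.0 would provide the surplus binding capacity the note recommends.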
2. Add 300 µl Buffer A to the beads and mix well.
3. Add 300 µl Buffer A to the biotinylated fraction (from step 19 of the Supplementary Information protocol) and mix well.
5. Incubate for 60 min at room temperature with shaking or rocking.
6. Apply the magnet as described above.
8. Wash twice with 1 ml Buffer A at room temperature.
9. Wash twice with 1 ml Buffer B at room temperature.
10. Wash three times with 1 ml Buffer C in a heating block at +70ºC for 5 min.
11. Wash three times with 1 ml 20× SSC in a heating block at +70ºC for 10 min.
12. Switch off the heating block and perform the rest of the wash steps as the temperature declines.
14. Add 1 ml RNB, re-suspend well and leave on the heating block for 30 min. Apply the magnet and remove the buffer.

NOTE:
Washing the beads to remove non-specifically bound DNA is the most important step of NSCR. The intensive treatment with high salt, low salt, and heating at +70ºC in 4 M urea and 20× SSC reduces the background due to non-specific DNA-DNA and DNA-bead interactions.
CAUTION: All wash steps need to be performed with the appropriate buffer and at the described temperature using 1 ml of buffer. Finger-tap the tubes to be sure that the beads are well dispersed, then incubate at the temperatures described. Apply the magnet for sufficient time to clear the supernatant of beads prior to removing it. The volume of the beads shrinks slightly during the washes. If un-biotinylated DNA is used, the beads will be lost much more rapidly, possibly completely, by the end of the wash cycles.
15. Add 400 µl RNB Buffer to the beads, re-suspend well and incubate for 15 min at room temperature, then apply the magnet and collect the supernatant as the mock RNase I wash control (Fraction W). The wash control serves to estimate the non-specific background in the absence of RNase I release of SNS.
Release the Nascent DNA with RNase I (Timing 1 h)
16. Make a mixture (or master mix) of a 1:500 (= 0.02 U/µl) dilution of RNase I in RNB buffer.
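The master mix volumes for step 16 can be worked out with a small helper. This is a sketch under stated assumptions: the stock concentration of 10 U/µl is inferred from the statement that a 1:500 dilution equals 0.02 U/µl, the 200 µl per sample is the volume of RNase I mixture added to the beads in step 17, and the 10% pipetting overage and the function name are hypothetical.

```python
def rnase_master_mix(n_samples, ul_per_sample=200.0, dilution=500,
                     stock_u_per_ul=10.0, overage=0.1):
    """Volumes for a 1:dilution RNase I master mix in RNB buffer.

    stock_u_per_ul: assumption inferred from 1:500 = 0.02 U/ul.
    overage: hypothetical extra fraction to cover pipetting losses.
    """
    total_ul = n_samples * ul_per_sample * (1.0 + overage)
    rnase_ul = total_ul / dilution
    return {
        "total_ul": total_ul,
        "rnase_ul": rnase_ul,
        "rnb_ul": total_ul - rnase_ul,
        "final_u_per_ul": stock_u_per_ul / dilution,
    }


if __name__ == "__main__":
    mix = rnase_master_mix(n_samples=4)
    print(f"RNase I: {mix['rnase_ul']:.2f} ul, RNB: {mix['rnb_ul']:.2f} ul, "
          f"final {mix['final_u_per_ul']:.2f} U/ul")
```

For small sample numbers the enzyme volume falls below what can be pipetted accurately, so preparing a larger total volume or an intermediate dilution may be preferable.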

PAUSE POINT:
The fractions can be stored at -20°C.

NOTE:
In most cases beads will not be visible, but may nonetheless be present. Store all fractions and the beads at -20ºC.
In addition, if you need to assess the DNA remaining bound to the beads, treat the beads with 0.4 M NaOH (100 µl) for 5 min at room temperature, neutralize with 100 µl of 0.4 M HCl, and spin for 1 min at maximum speed. Collect the supernatant as the bound or column fraction (Fraction C).
PAUSE POINT: DNA can be stored for long periods of time in isopropanol at -20ºC.

Concentrating nascent strands and preparing for whole genome amplification
23. Spin at 14,000 rpm (maximum speed in a microfuge) for at least 45 min at room temperature, look for the pellet, and then discard the supernatant. Wash with 70% EtOH 3 times. Perform the wash steps with caution, taking care not to lose the pellet.
PAUSE POINT: DNA can be stored for long periods of time in ethanol at -20ºC.

NOTE:
Fractions W and R are ready for PCR or for whole genome amplification (WGA). The amount of DNA in these fractions is in the range of picograms per microliter and is difficult to measure directly by absorbance or PicoGreen. PicoGreen interacts with dsDNA but does not detect single stranded nascent DNAs efficiently, except for regions with secondary structure. For higher precision of PicoGreen measurement, ssDNA can be converted to dsDNA using the Klenow fragment of DNA polymerase I. Other methods for measuring the amount of DNA were not tested.
25. Perform a quality control with 50 µl PCR reaction(s) at 60°C annealing temperature with 27-29 cycles for a known strong peak, using only 2 µl from fraction W (wash control) and from fraction R (RNase I release). It is important to observe a significant enrichment of the band signal for fraction R over W, which indicates RNase I dependent release into fraction R. For the mouse genome we used the primers forward F-Chr11_108873.3 (CCTGCCCTTCACAAAGAAAA) and reverse R-Chr11_108873.3 (GTGGAGAAGGGTTCCATGTG). High quality NSCR preparations should show a band at least 3-10 times brighter for the R fraction than for the W fraction. It is typical, however, for the W fraction to give some background amplification. Different sites may show different levels of enrichment.

Whole genome amplification (WGA)
26. We used the SeqPlex Enhanced DNA Amplification Kit (Sigma, "SEQXE-10RXN") including the primer removal step. Other methods or kits for whole genome amplification were not tested. The protocol provided with the kit was followed without deviation except that 29 cycles were used for the amplification (Fig. 2).

Step: Suppl II D, Step 5
Problem:
• Fractionation gel shows second bands with lower molecular weight.
• On the gel, high molecular weight DNA is present in the lower fractions, contaminating the fractions of choice with longer DNA.
Causes:
• The second bands are RNA, which was not removed during the TRIzol step.
• Most likely this results from poor re-suspension of DNA after heating and before loading on the sucrose gradient (e.g., clumps of DNA were present).
• Alternatively, the sucrose gradient may have been overloaded.
Suggestions:
• The TRIzol step did not successfully separate DNA from RNA. Collect only the white interphase pellet, without the aqueous phase. A well formed, tight white pellet most likely indicates good separation of DNA and RNA.
• The DNA needs to be well dissolved after heating.
• Do not load more than 350-400 mg DNA per 18 ml gradient.

Causes:
• The biotinylation level is too low.
• Your input DNA is not as much as you expected.
Suggestions:
• Make sure that you perform the 5' biotinylation correctly with fresh (proven) reagents.
• Use more input DNA.

Step: 25
Problem:
• PCR was performed on fractions W and R for a proven origin in order to demonstrate origin enrichment. However, no significant enrichment in the R fraction over the W fraction was seen.
Causes:
• The wash step was not sufficient.
• The representation of the proven origin as part of SNS after NSCR is so low that it is difficult to detect using a reasonable number (up to 32) of PCR cycles.
Suggestions:
• Be sure that you perform the washes under the conditions described. Make fresh washing buffers, especially the urea containing buffer (Buffer C).
• Perform PCR on another site which usually demonstrates a stronger signal.

ANTICIPATED RESULTS
The result described in Figure 2 demonstrates a sufficiently well represented distribution of amplified SNS on a microgel bioanalyzer. Figure 2 shows that the summit of the size distribution in the gel is at ~300 bp before the primer removal step and at ~200 bp after primer removal. If no significant material is detected after the primer removal step, origin sequences will not be well represented in NGS data from the sample.

Figure 2. Bioanalyzer electrophoresis after whole genome amplification (WGA). The RNase I release fraction is used as the template for WGA, which we have performed using the SEQXE SeqPlex DNA amplification kit from Sigma-Aldrich. The workflow of the kit contains three major steps: pre-amplification, in which specific primers are added to the 5' and 3' ends, amplification, and primer removal. A. Bioanalyzer microgel of duplicate experiments before and after the primer removal step. B. Densitometry of the gels displayed in (A). As shown, the initial fragment size of 400-2000 nt is reduced by random priming in the pre-amplification step to 150-1000 bp with a summit at ~300 bp (lines 1 and 2, before primer removal). After primer removal the size is slightly smaller, 100-700 bp with a summit at ~150-200 bp (lines 3 and 4, after primer removal). This final size is suitable for Illumina sequencing.

22. Perform a phenol extraction of fractions W and R using 100 µl phenol. Re-extract the phenol with an additional 200 µl TE and combine the aqueous phases. Precipitate by addition of 5 µl glycogen, 1/10 volume of 3 M NaAc (mix well) and one volume of isopropanol.

Figure 3 shows wiggle track files aligned to the mm9 genome and plotted with the UCSC web browser for the same region and at the same resolution. Regions of approximately 20 kb and 200 kb are shown and include the scale, NSCR tracks with insufficient or sufficient reads, the peak track and the gene track, as indicated in the figure.
17. Add 200 µl of the RNase I mixture to the beads. Mix well by finger tapping or pipetting up and down several times. Incubate 15 min at room temperature, not longer.
18. Apply the magnet and collect the supernatant.
19. Wash the beads by adding 200 µl of RNB to the beads, mix well and apply the magnet again to collect the supernatant. Combine the first 200 µl with this second 200 µl; together they constitute the RNase I release fraction, Fraction R. Fractions F, W and R may contain residual beads, which could increase the background. To minimize contamination from leftover beads, proceed to the next step.
21. Spin down fractions F, W and R in a microfuge at maximum rpm for 5 min. Collect the entire volume into a new tube except the last few microliters, which may contain residual beads, if any.
CAUTION:
27. Following the primer removal step in 26, free primers are removed from the sample using a PCR clean-up kit (e.g., GenElute or PureLink PCR purification kit, cat. No. K3100-01) with elution in 50 µl. ~1-2 µl of the eluate is run on a Bioanalyzer (High Sensitivity DNA assay) to estimate the size and the amount of amplified fragments (Fig. 2). A preparation is considered good if you observe a peak summit at ~200 bp and a total amount of at least 50 ng.