SAGE: Serial Analysis of Gene Expression?

xanthines · Apr 10, 2007

I'm having a little trouble wrapping my mind around what this process is all about. Does anyone have an explanation of what it is used for, as opposed to microarrays? They both seem to involve identifying mRNAs, but why would you use one over the other? I went to sagenet.com and some site at King's College (in London) but still feel a little uneasy about this process.

I'm supposed to come up with a list of genes for SAGE, but I'd like to know more about it before I look like an idiot in front of my PI!

Thanks!

-X

solitude · Apr 10, 2007

Coincidentally, this was essentially a verbatim question on my genomics exam a few weeks ago. The prompt was "You wish to identify genes responsible for [a given process]. How would you go about doing so?" Here is what I wrote for the procedure (it's not clear to me whether you understand the procedure, so I included it just in case, although it's somewhat detailed):

1) Isolate total RNA from tissue type in experimental and control stages by lysing the cells and purifying total RNA.
2) Isolate mRNA using an oligo-dT column that binds the polyA tail of all mRNAs but lets tRNA, rRNA, and impurities wash through.
3) Perform reverse transcription (first-strand cDNA synthesis) using reverse transcriptase, an oligo-dT primer attached to a bead, and other necessary reagents (dNTPs, buffer, etc.). This will produce an mRNA/cDNA hybrid attached to oligo-dT+bead.
4) Add RNase H to remove the RNA, then reverse transcriptase or DNA pol will fill in the 2nd cDNA strand using the first cDNA strand as template. Now I have ds-cDNA+bead.
5) Add NlaIII, a restriction enzyme that cuts at every CATG site. Only the 3'-most fragment, from the NlaIII site nearest the polyA/T tail to the tail itself, stays attached to the bead. Now I have: Nla - (N)x - bead
6) Ligate a linker containing a BsmFI recognition site to the NlaIII end using DNA ligase to form: Bsm - Nla - (N)x - bead
7) Add BsmFI, a restriction enzyme which cleaves 14 bp away from its recognition sequence, towards the polyA/T end. This releases the tag from the bead. Now I have: Bsm - Nla - (N)10
8) Ligate the tags together tail-to-tail into di-tags to create: Bsm - Nla - (N)10 - (N)10 - Nla - Bsm
9) Add NlaIII, which cleaves at Nla recognition site to yield: Nla(ss) - (N)10 - (N)10 - Nla(ss).
10) Ligate these di-tags (now missing Bsm) together via DNA ligase to create concatemers. A concatemer is a 500-1000 bp stretch of Nla(ss) - (N)10 - (N)10 - Nla(ss) units ligated together.
11) Now sequence each concatemer via Sanger method.
12) Computational analysis of the sequence data: search the genomic sequence for the tags obtained by sequencing the concatemers. Normally 14 bp would be insufficient to uniquely identify a gene, but the program is assisted by the constraint that it only looks at Nla - (N)10 near the polyA tail.
13) Compare quantities of each cDNA tag. The relative number of each cDNA tag reflects the relative abundance of the corresponding mRNA, and hence gene expression.
14) Statistical analysis (t-test, ANOVA) to determine significance of differential expression levels (NORMALIZE DATA).
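
To make steps 11-13 a bit more concrete, here is a minimal Python sketch of pulling tags back out of a sequenced concatemer and counting them. It assumes an idealised read in which every di-tag is exactly two 10 bp tags back to back and is flanked by CATG (the NlaIII site). Real di-tags vary in length, and a tag that happens to contain CATG internally would be split incorrectly here. The function names are my own, not from any SAGE software.

```python
from collections import Counter

ANCHOR = "CATG"  # NlaIII recognition site that separates di-tags
TAG_LEN = 10     # the (N)10 left after the BsmFI cut (assumed fixed here)

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_tags(concatemer, tag_len=TAG_LEN):
    """Split one concatemer read into individual tags.

    Each di-tag is read as tag1 followed by the reverse complement
    of tag2, because the two tags were ligated tail to tail.
    """
    tags = []
    for ditag in concatemer.upper().split(ANCHOR):
        if len(ditag) < 2 * tag_len:
            continue  # empty flank or a piece too short to hold two tags
        tags.append(ditag[:tag_len])
        tags.append(reverse_complement(ditag[-tag_len:]))
    return tags


def count_tags(concatemers):
    """Tally tag occurrences across all concatemer reads of one library."""
    counts = Counter()
    for read in concatemers:
        counts.update(extract_tags(read))
    return counts
```

Calling `count_tags` on the reads from the experimental library and again on the control library gives the two tag-count tables that step 13 compares. With 10 variable bases there are 4^10 = 1,048,576 possible tags, which is why the position constraint in step 12 matters for assigning a tag to a gene.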

Thus, this SAGE analysis has revealed a number of genes that are differentially regulated in experimental vs. control tissue.
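
Since the two libraries will rarely contain the same total number of tags, the "NORMALIZE DATA" part of step 14 has to happen before any comparison. Below is a rough sketch, assuming one tag-count table per condition. The tags-per-million scaling, the pseudocount of 1, and the function names are my own illustrative choices, and it only ranks tags by fold change; it does not do the significance testing itself.

```python
import math


def tags_per_million(counts):
    """Scale raw tag counts so that each library sums to one million."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n * 1_000_000 / total for tag, n in counts.items()}


def log2_fold_changes(experimental, control, pseudocount=1.0):
    """log2(experimental / control) per tag, on normalized counts.

    The pseudocount (an assumption, not part of the protocol) avoids
    dividing by zero for tags seen in only one library.
    """
    exp_tpm = tags_per_million(experimental)
    ctl_tpm = tags_per_million(control)
    changes = {}
    for tag in set(exp_tpm) | set(ctl_tpm):
        e = exp_tpm.get(tag, 0.0) + pseudocount
        c = ctl_tpm.get(tag, 0.0) + pseudocount
        changes[tag] = math.log2(e / c)
    return changes
```

Positive values mean the tag is more abundant in the experimental library, negative values mean it is more abundant in the control. Tags with large changes in either direction are the candidates to carry forward.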

To confirm these results, individually test interesting candidate genes. The method of choice for verifying differential RNA expression is real-time PCR (i.e. quantitative reverse transcription PCR) via the TaqMan method or the SYBR Green method.

Overall: much cheaper than MPSS, but more expensive than a microarray. Sequences need not be known prior to analysis, so great for organisms where whole-genome arrays don't exist, or new ESTs are expected. Requires a lot of sequencing, though.

I hope this helps. I aced that question on the exam!

xanthines · Apr 11, 2007

Nice! I'd give you an A+!

I think I was missing steps 12 and 13 for some reason... Thanks for clearing that up.

-X
