Coding Potential Calculator
This guide gives a quick overview of how to use the CPC online web tools. If you want to run CPC on your own computer, please refer to our Installation Guide.

TERM Fasta
EXPLAIN "A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. (NCBI)" For more information about the FASTA format, refer to the Wikipedia FASTA format description or the NCBI FASTA format description.
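The format described above can be illustrated with a minimal Python sketch. The function name `parse_fasta` and the example records are hypothetical and are not part of CPC; the sketch only shows how description lines (starting with ">" in the first column) are separated from the sequence lines that follow them.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (description, sequence) pairs."""
    records = []
    description = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.rstrip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A ">" in the first column starts a new record.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:].strip()
            chunks = []
        elif description is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line.strip())
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


# Hypothetical input with two records.
example = """>transcript_1 hypothetical example
ATGGCCATTGTAATGGGCCGC
TGAAAGGGTGCCCGATAG
>transcript_2
ATGAAATAG
"""
records = parse_fasta(example)
for description, sequence in records:
    print(description, len(sequence))
```

For real data, an established parser such as the one in Biopython is usually a safer choice than hand-written code.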
TERM Email
EXPLAIN Because calculating coding potential is computationally complex, a calculation may take several minutes or more to finish. By providing an email address here, users can receive the results of such long-running calculations by email. The email address is used only for sending the calculation results; we will not keep it or disclose it to others.
TERM Task ID
EXPLAIN The online coding potential calculator automatically assigns a unique ID to each calculation task. Users can use the Task ID to retrieve their calculation results. For more details, please refer to our Quick Guide.
NOTES: A Task ID may expire 7 days after it is created. The results of an expired task will be removed from the CPC server and can no longer be retrieved.
TERM Hit Num
EXPLAIN The number of BLAST hits found for the input sequence.
TERM Hit Score
EXPLAIN For a true protein-coding transcript the hits are also likely to have higher quality; i.e., the HSPs (High-scoring Segment Pairs) overall tend to have lower E-values. Thus we define the feature HIT SCORE as follows:

$$S_i = \mathrm{mean}_j \{ -\log_{10} E_{ij} \}, \quad i \in \{0, 1, 2\}$$

$$\text{HIT SCORE} = \mathrm{mean}_{i \in \{0, 1, 2\}} \{ S_i \} = \frac{1}{3} \sum_{i=0}^{2} S_i$$

where $E_{ij}$ is the E-value of the jth HSP in frame i, $S_i$ measures the average quality of the HSPs in frame i, and HIT SCORE is the average of $S_i$ across the three frames. The higher the HIT SCORE, the better the overall quality of the hits and the more likely the transcript is protein-coding.
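As a rough illustration, the HIT SCORE can be computed as in the Python sketch below. It assumes that the quality of a single HSP is measured as -log10 of its E-value, so that lower E-values give higher scores. The function names, the clamp applied to E-values of zero, and the value 0.0 used for a frame without HSPs are assumptions made for this example and are not taken from the CPC source code.

```python
import math

# Assumption: E-values reported as 0 are clamped to this floor to avoid log10(0).
E_VALUE_FLOOR = 1e-250


def frame_quality(e_values):
    """S_i: mean of -log10(E_ij) over the HSPs of one frame (0.0 if the frame has none)."""
    if not e_values:
        return 0.0  # assumption: a frame without HSPs contributes zero
    scores = [-math.log10(max(e, E_VALUE_FLOOR)) for e in e_values]
    return sum(scores) / len(scores)


def hit_score(e_values_by_frame):
    """HIT SCORE: average of S_i over the three reading frames 0, 1 and 2."""
    s = [frame_quality(e_values_by_frame.get(i, [])) for i in range(3)]
    return sum(s) / 3


# Hypothetical E-values: two strong HSPs in frame 0, one weak HSP in frame 1.
example = {0: [1e-30, 1e-20], 1: [1e-2], 2: []}
print(hit_score(example))
```

In this hypothetical input the per-frame values S_i are approximately 25, 2 and 0, so the strong hits in frame 0 dominate the score.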
TERM Frame Score
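EXPLAIN For a true protein-coding transcript most of the hits are likely to reside within one frame, whereas for a true noncoding transcript, even if it matches certain known protein sequence segments by chance, these chance hits are likely to scatter across any of the three frames. Thus we define the feature FRAME SCORE to measure the distribution of the HSPs among the three reading frames:

$$\text{FRAME SCORE} = \mathrm{variance}_{i \in \{0, 1, 2\}} \{ S_i \} = \frac{1}{2} \sum_{i=0}^{2} (S_i - \bar{S})^2$$

where $S_i$ measures the average quality of the HSPs in frame i and $\bar{S}$ is the mean of $S_i$ across the three frames.
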
The higher the FRAME SCORE, the more concentrated the hits are and the more likely the transcript is protein-coding.
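The idea of measuring how concentrated the hits are can be sketched in Python as the variance of the three per-frame values S_i. The function name `frame_score` and the input values are hypothetical, and the use of the sample variance (division by 2 for three frames) is an assumption of this sketch.

```python
def frame_score(s):
    """FRAME SCORE sketch: sample variance of the per-frame qualities S_0, S_1, S_2."""
    if len(s) != 3:
        raise ValueError("expected one S_i value per reading frame (three in total)")
    mean_s = sum(s) / 3
    # Assumption: sample variance, i.e. division by (3 - 1).
    return sum((s_i - mean_s) ** 2 for s_i in s) / 2


# Hypothetical S_i values.
concentrated = frame_score([25.0, 0.0, 0.0])  # all hits fall in one frame
scattered = frame_score([9.0, 9.0, 9.0])      # hits spread evenly over the frames
print(concentrated, scattered)
```

Hits concentrated in a single frame give a large variance, while hits of similar quality in all three frames give a variance close to zero.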
TERM FrameFinder ORF Coverage
EXPLAIN The coverage of the ORF predicted by FrameFinder. A large COVERAGE OF THE PREDICTED ORF is an indicator of good ORF quality (Slater, G.S.C. (2000) Algorithms for the Analysis of Expressed Sequence Tags, University of Cambridge, Cambridge). For more information, refer to the Pasteur FrameFinder Man Page.
TERM FrameFinder LOG-ODDS SCORE
EXPLAIN As suggested by FrameFinder's author, the LOG-ODDS SCORE is an indicator of the quality of a predicted ORF: the higher the score, the higher the quality. For more information, refer to the Pasteur FrameFinder Man Page.
TERM FrameFinder ORF Type
EXPLAIN The ORF Type is the INTEGRITY OF THE PREDICTED ORF, which indicates whether an ORF begins with a start codon and ends with an in-frame stop codon.
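The integrity check described above can be sketched in Python as follows. This is not FrameFinder's implementation: the function name `is_complete_orf` is hypothetical, and the sketch assumes the standard genetic code (ATG as the start codon; TAA, TAG and TGA as stop codons) and an ORF given as a nucleotide string read from its first base.

```python
START_CODON = "ATG"                  # assumption: standard start codon
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard stop codons


def is_complete_orf(orf):
    """Return True if the ORF begins with a start codon and ends with an in-frame stop codon."""
    seq = orf.upper().replace("U", "T")  # accept RNA as well as DNA letters
    has_start = seq.startswith(START_CODON)
    # The stop codon is in frame only if the whole ORF is a multiple of three bases long.
    has_in_frame_stop = (
        len(seq) >= 6 and len(seq) % 3 == 0 and seq[-3:] in STOP_CODONS
    )
    return has_start and has_in_frame_stop


# Hypothetical sequences.
print(is_complete_orf("ATGAAATAG"))   # start codon and in-frame stop codon
print(is_complete_orf("ATGAAATA"))    # truncated before the stop codon
print(is_complete_orf("AAATAG"))      # no start codon
```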
TERM Html View
EXPLAIN The HTML view of the Coding Potential Calculator online results.
TERM UTRdb
EXPLAIN UTResource-DB (UTRdb) is a specialized database of 5' and 3' UnTRanslated sequences of eukaryotic mRNAs, cleaned and annotated based on RefSeq. For more details, refer to the UTResource-DB Web Site.
TERM RNAdb
EXPLAIN A comprehensive mammalian noncoding RNA database that includes >800 unique experimentally studied ncRNAs, >1100 putative antisense ncRNAs and almost 20000 putative ncRNAs identified in high-quality murine and human cDNA libraries. For more details, refer to the RNAdb Web Site.