We're here to help.
Check out the topics on this page to get started with analyzing TP53 mutations.

Getting started

Batch analysis

  1. Download the sample files (MAF, VCF and CSV) to your computer.
  2. Go to the batch analysis page.
  3. Upload one of the sample files using the "Select a file" button.
  4. Fill in your email address in the form.
  5. Click the "Start the analysis" button.
  6. You will receive an email with the results of the analysis within a short time.

Single analysis

  1. Go to the single analysis page.
  2. Fill the "Start position" field with "7577538".
  3. Fill the "Mutant allele" field with "A".
  4. Click the "Start the analysis" button.
  5. A display of the results will be shown on the screen.
  6. Click the "Export tables" button if you want a full description of the mutation.

Sample files

Download sample batch analysis files

Download description of variables in batch analysis files


Download Seshat documentation

TP53 gene at a glance

Seshat: Quick start


What is the goal of Seshat?

Seshat performs the following tasks:
  • Checks the quality of mutation nomenclature.
  • Generates a full description of each variant formatted according to HGVS.
  • Generates publication-ready tables.
  • Assesses the pathogenicity of each variant according to either general prediction algorithms (PROVEAN, SIFT, PolyPhen-2, FATHMM, MutationAssessor and 7 other algorithms) or algorithms developed exclusively for TP53.
  • Displays functional and structural data for each TP53 variant.

What is the extent of Seshat analysis? Is it restricted to the 11 "classical" exons?

Seshat can handle all mutations located in the TP53 gene, including introns, exons and alternative exons, using HG18, HG19 or HG38 nomenclature.
The genomic nomenclature and boundary of TP53 can be found at the LRG website.
The reference sequence is located at the NCBI website and includes 32,772 nucleotides.
Build   Start     End
HG18    7536593
HG19    7595868   7563097
HG38    7692550   7659779
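
As a quick consistency check, the HG19 and HG38 boundaries listed above can be compared with the stated length of the reference sequence. The sketch below assumes that both coordinates are inclusive; the helper name span_length is only illustrative. Start is larger than End in each row, which is consistent with TP53 lying on the reverse strand of chromosome 17.

```python
# Start/End coordinates of TP53 copied from the boundaries listed on this page.
# The HG18 row is left out because only one value is given for it.
TP53_BOUNDARIES = {
    "HG19": (7595868, 7563097),
    "HG38": (7692550, 7659779),
}


def span_length(start: int, end: int) -> int:
    """Number of nucleotides in a coordinate range, assuming both ends are inclusive."""
    return abs(start - end) + 1


for build, (start, end) in TP53_BOUNDARIES.items():
    # Each build should cover the 32,772 nucleotides of the reference sequence.
    print(build, span_length(start, end))
```

Both builds give 32772, which matches the length of the reference sequence quoted above.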

What information is included in the Seshat TP53 database?

The Seshat TP53 database is different from the TP53 mutation database available at the TP53 website. We have added a substantial amount of information to improve the analysis of TP53 mutations. Among other things, the Seshat database includes:
  • Natural SNP data extracted from various databases, such as dbSNP, ExAC or NHLBI (more than 1,000 variants).
  • Prediction data for all TP53 single nucleotide substitutions.
  • Functional data, such as apoptosis, growth arrest or DNA binding (among others) for many TP53 variants.

What type of data is necessary to use Seshat?

Seshat works with both txt files and VCF or MAF files generated by NGS. Variants can be described using protein, cDNA or genomic nomenclature. Read the documentation for more information.
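
Before uploading a large VCF file, it can be convenient to keep only the records that fall inside TP53. The sketch below is one possible way to do this and is not part of Seshat: it assumes HG19 coordinates, uses the HG19 boundaries given on this page, and relies on the standard tab-separated VCF column order (CHROM, POS, ID, REF, ALT, ...). The function name tp53_records and the sample records are hypothetical; the alleles in the sample are placeholders and have not been checked against the reference genome.

```python
# HG19 boundaries of TP53 as given on this page; TP53 is located on chromosome 17.
TP53_CHROM = "17"
TP53_LOW, TP53_HIGH = 7563097, 7595868


def tp53_records(vcf_lines):
    """Yield (pos, ref, alt) for VCF data lines located inside the TP53 boundaries."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta-information lines and the header line
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        if chrom.startswith("chr"):
            chrom = chrom[3:]  # accept both "17" and "chr17" naming
        if chrom == TP53_CHROM and TP53_LOW <= pos <= TP53_HIGH:
            yield pos, ref, alt


# Illustrative input only: placeholder alleles, not real calls.
sample_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr17\t7577538\t.\tC\tA\t.\tPASS\t.",
    "chr17\t1000\t.\tG\tT\t.\tPASS\t.",
    "1\t7577538\t.\tC\tA\t.\tPASS\t.",
]

for record in tp53_records(sample_vcf):
    print(record)
```

Only the first data line is kept: the second lies outside the gene boundaries and the third is on another chromosome.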

I have a few synonymous variants in the patients that I have analyzed

This is an important issue. Several databases have removed all synonymous variants (sSNV) from their data. This is a serious error, as it is now well known that many sSNV are pathogenic, with possible defects in RNA splicing, RNA stability or protein folding. See the following publication for more information:
  • Synonymous Somatic Variants in Human Cancer Are Not Infamous: A Plea for Full Disclosure in Databases and Publications
  • Soussi T., Taschner P.E., & Samuels Y.
  • Human Mutation (in press)
  • 2016
Seshat handles and analyzes both sSNV and nSNV and provides a full report of their potential pathogenicity including functional data for many well-known TP53 sSNV.

I have identified a novel TP53 variant not included in the database

This is an important issue, particularly if this variant is a germline variant.
  • Frameshift variant (germline or somatic)
    • If the mutational event has been verified*, finding a novel deletion or insertion is not infrequent, and the variant can be considered pathogenic.
  • Missense variant (germline or somatic)
    • Analysis of the database shows that the discovery of novel missense variants has decreased considerably over recent years, as most deleterious TP53 variants have already been identified. The discovery of a novel missense variant can therefore raise a number of questions, particularly in the case of a germline variant.
    • Germline variant: Peripheral blood lymphocytes
      • Sequencing larger numbers of individuals will lead to the identification of extremely rare, novel non-pathogenic SNPs. Until the pathogenicity of the variant has been clearly demonstrated (segregation with the disease or experimental analysis), these SNPs should be considered to be Variants of Unknown Significance.
      • Various tools can be used to predict the pathogenicity of the mutation, but they have poor sensitivity and specificity for TP53 and must be used with caution. The effect on RNA translation or splicing is not included in predictive software.
      • You can contact us for further discussion on this variant.
    • Somatic variants**: Frozen tissues
      • In the absence of any functional information, there is no way to define whether this variant is a driver or passenger mutation.
    • Somatic variants**: Paraffin-embedded tissues
      • DNA sequencing from paraffin-embedded tissues is known to be associated with a high number of artifactual single-nucleotide changes (C:G>T:A). Careful control is necessary to validate this variant.
      • In the absence of any functional information, there is no way to define whether this variant is a driver or passenger mutation.
* Assuming that sequencing of the genetic material has been carefully performed and controlled.
** Assuming that the somatic origin of this variant has been validated by sequencing normal DNA from the same individual.
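
The C:G>T:A artifact class mentioned above for paraffin-embedded tissues can be flagged automatically so that these calls receive extra scrutiny. The sketch below is a minimal illustration and not Seshat's own procedure; the function name is_ffpe_like is hypothetical. A C:G>T:A change appears as C>T on one strand and G>A on the other, so both are checked. A flagged variant is not necessarily an artifact; it only marks a call that deserves careful validation.

```python
# A C:G>T:A change reads as C>T on one strand and G>A on the other.
FFPE_LIKE_CHANGES = {("C", "T"), ("G", "A")}


def is_ffpe_like(ref: str, alt: str) -> bool:
    """Return True when a single-nucleotide change belongs to the C:G>T:A class."""
    return (ref.upper(), alt.upper()) in FFPE_LIKE_CHANGES


for ref, alt in [("C", "T"), ("G", "A"), ("C", "A"), ("T", "C")]:
    label = "check carefully" if is_ffpe_like(ref, alt) else "not in the C:G>T:A class"
    print(f"{ref}>{alt}: {label}")
```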

Prediction of TP53 mutation pathogenicity: how accurate is this prediction and can it be used in clinical practice?

The pathogenicity prediction provided by Seshat for TP53 variants is among the most accurate currently available. In contrast with all other TP53 mutation databases, including the COSMIC or IARC databases, the Seshat TP53 mutation database has been highly curated to remove all artifactual data linked to sequencing errors. See the paper by Edlund et al. for more information:
  • Data-driven unbiased curation of the TP53 tumor suppressor gene mutation database and validation by ultradeep sequencing of human tumors
  • Edlund K., Larsson O., Ameur A., Bunikis I., Gyllensten U., Leroy B., Sundström M., Micke P., Botling J., & Soussi T.
  • Proceedings of the National Academy of Sciences, 109(24), 9551-9556
  • 2012
Pathogenicity has been predicted using specific algorithms developed exclusively for TP53 variants taking three types of non-redundant parameters into account:
  • Database parameters, such as frequency in the database and occurrence in cell lines or in the germline, among others. These parameters are very accurate and irrefutable, as they are based on more than 70,000 observations.
  • Exploratory parameters derived from experimental data on TP53 mutant loss of function, drawn from the analysis of more than 500 publications.
  • Predictive parameters based on the use of multiple predictive algorithms.
All popular prediction software applies similar algorithms to every protein and does not take into account the specificity of the protein and/or the disease. This leads to a loss of specificity, which is unacceptable for clinical applications. TP53 is a multifunctional protein that plays an important role in cancer protection, but it also has other properties, some of which are not linked to tumor suppression. Therefore, some evolutionarily conserved residues that are important for TP53 can be altered without giving rise to a TP53 driver mutation, even when the mutation is predicted to be strongly pathogenic by most predictive software.

For example, TP53 codon 23 is highly conserved in all vertebrates and is important for TP53 regulation. Mutations at codon 23 impair TP53 binding to MDM2, which can lead to defects in cellular growth, and these mutations will therefore be counter-selected. A close examination of TP53 data shows that mutations at several highly conserved residues (including codon 23) are totally absent from the database. Unfortunately, such cold spot mutations are always predicted to be pathogenic by predictive software. The use of three types of non-redundant parameters avoids all of these problems.
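
The reasoning about cold spot mutations can be illustrated with a toy example. The sketch below is not Seshat's algorithm: the record type, its field names, the rule and the two sample records are all hypothetical, and the values are made up for illustration. It only shows how a generic prediction that is not supported by database or experimental evidence can be singled out instead of being accepted at face value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariantEvidence:
    """Hypothetical record; the field names are illustrative, not Seshat's schema."""
    name: str
    database_count: int                # observations of the variant in a curated database
    loss_of_function: Optional[bool]   # experimental result, None when untested
    predicted_pathogenic: bool         # verdict of a generic prediction algorithm


def needs_caution(variant: VariantEvidence) -> bool:
    """True when only the generic prediction points towards pathogenicity."""
    return (
        variant.predicted_pathogenic
        and variant.database_count == 0
        and not variant.loss_of_function
    )


# Made-up records: a frequently observed variant and a cold spot variant.
variants = [
    VariantEvidence("frequent_variant", 1500, True, True),
    VariantEvidence("cold_spot_variant", 0, None, True),
]

for variant in variants:
    status = "prediction alone, use with caution" if needs_caution(variant) else "supported"
    print(variant.name, "->", status)
```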

As a result of these stringent criteria, we are fairly confident that TP53 variants classified as pathogenic correspond to true driver mutations.

I have unpublished mutations that could be useful for the database

Send us a full description of the mutations using the official nomenclature and we will include your data.

I have used your database for my work. How should I cite Seshat in our publication?

Please cite our most recent publication and include the URL of the website:
  • Recommendations for Analyzing and Reporting TP53 Gene Variants in the High‐Throughput Sequencing Era
  • Soussi, T., Leroy, B., & Taschner, P. E.
  • Human Mutation, 35(6), 766-778
  • 2014
Thank you!

I work in an academic lab and would like to use your database in our analytical pipeline for the analysis of TP53 mutations. Can I obtain Seshat data?

Unfortunately, Seshat data are not publicly available.

What is the future of Seshat?

Seshat data will be regularly updated. Keep an eye on the history file.

Why is this website called Seshat?

Seshat was the Ancient Egyptian goddess of wisdom, knowledge, and writing. Her name, which means "she who scrivens" (i.e. she who is the scribe), is perfectly suited to the goal of this site.

On which database is Seshat based?

Seshat is based on the 2018 release of the UMD TP53 database.