
Gene-set Activity Datasets

The benchmark gene-set activity datasets are available for download here for anyone interested in using them to evaluate novel algorithms for gene-set activity analysis.

The gene-set activity datasets available for downloading are listed below:

Dataset | Accession no. | Publication | Samples | Options
--- | --- | --- | --- | ---
Breast Cancer | GSE5764 | Turashvili et al. 2007 [1] | normal: 20 samples; tumor: 10 samples | GSA method; Gene-set collection
Breast Cancer | GSE7904 | Richardson et al. 2006 [2] | normal: 19 samples; tumor: 43 samples | GSA method; Gene-set collection
Colorectal Cancer | GSE4107 | Hong et al. 2007 [3] | normal: 10 samples; tumor: 12 samples | GSA method; Gene-set collection
Colorectal Cancer | GSE8671 | Sabates-Bellver et al. 2007 [4] | normal: 32 samples; tumor: 32 samples | GSA method; Gene-set collection
Lung Cancer | GSE4115 | Spira et al. 2007 [5] | normal: 90 samples; tumor: 97 samples | GSA method; Gene-set collection
Lung Cancer | GSE10072 | Landi et al. 2008 [6] | normal: 49 samples; tumor: 58 samples | GSA method; Gene-set collection
MCLung Cancer | GSE43580 | Tarca et al. 2013 [7] | AC stage1: 41 samples; AC stage2: 36 samples; SCC stage1: 34 samples; SCC stage2: 39 samples | GSA method; Gene-set collection


References
1. Turashvili G, Bouchal J, Baumforth K, Wei W, Dziechciarkova M, Ehrmann J, Klein J, Fridman E, Skarda J, Srovnal J, Hajduch M, Murray P, Kolar Z, Novel markers for differentiation of lobular and ductal invasive breast carcinomas by laser microdissection and microarray analysis, BMC Cancer, 7:55, 2007.
2. Richardson AL, Wang ZC, De Nicolo A, Lu X, Brown M, Miron A, Liao X, Iglehart JD, Livingston DM, and Ganesan S, X chromosomal abnormalities in basal-like human breast cancer, Cancer Cell, 9:121-132, 2006.
3. Hong Y, Ho KS, Eu KW, Cheah PY, A susceptibility gene set for early onset colorectal cancer that integrates diverse signaling pathways: implication for tumorigenesis, Clinical Cancer Research, 13:1107-1114, 2007.
4. Sabates-Bellver J, Van der Flier LG, de Palo M, Cattaneo E, Maake C, Rehrauer H, Laczko E, Kurowski MA, Bujnicki JM, Menigatti M, Luz J, Ranalli TV, Gomes V, Pastorelli A, Faggiani R, Anti M, Jiricny J, Clevers H, and Marra G, Transcriptome profile of human colorectal adenomas, Molecular Cancer Research, 5:1263-1275, 2007.
5. Spira A, Beane JE, Shah V, Steiling K, Liu G, Schembri F, Gilman S, Dumas YM, Calner P, Sebastiani P, Sridhar S, Beamis J, Lamb C, Anderson T, Gerry N, Keane J, Lenburg ME, Brody JS, Airway epithelial gene expression in the diagnostic evaluation of smokers with suspect lung cancer, Nature Medicine, 13:361-366, 2007.
6. Landi MT, Dracheva T, Rotunno M, Figueroa JD, Liu H, Dasgupta A, Mann RE, Fukuoka J, Hames M, Bergen AW, Murphy SE, Yang P, Pesatori AC, Consonni D, Bertazzi PA, Wacholder S, Shih JH, Caporaso NE, Jen J, Gene expression signature of cigarette smoking and its role in lung adenocarcinoma development and survival, PLoS One, 3:e1651, 2008.
7. Tarca AL, Lauria M, Unger M, Bilal E, Boue S, Kumar Dey K, Koeppl H, Martin F, Meyer P, Nandy P, Norel R, Peitsch M, Rice JJ, Romero R, Stolovitzky G, Talikka M, Xiang Y, Zechner C, IMPROVER DSC Collaborators, Strengths and limitations of microarray-based phenotype prediction: lessons learned from the IMPROVER Diagnostic Signature Challenge, Bioinformatics, 29:2892-2899, 2013.

Simple Gene Expression Analysis

GAT provides simple tools to visualize gene expression data and to identify differentially expressed genes as a preliminary study step.
***Not available yet!

Gene-set Activity Transformation

Instead of analyzing gene expression levels directly, much work has been done on converting them into another form, namely pathway activity or gene-set activity, by integrating them with pathway or gene-set data. Pathway/gene-set activity has been successfully applied to disease classification. Here, we provide several gene-set activity transformation methods, including CORR-based, NCFS-i, and AFS, among others.
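To make the idea of a gene-set activity concrete, the following is a minimal Python sketch, not the toolbox's own code. It assumes the simplest possible transformation: each gene is z-scored across samples, and the activity of a gene set in a sample is the mean z-score of its member genes. This is a generic illustration only and is not the CORR-based, NCFS-i, or AFS method; the function names and the toy data are hypothetical.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Standardize one gene's expression values across samples."""
    mu = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        # A gene with constant expression carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def gene_set_activity(expression, gene_sets):
    """Collapse gene-level expression into one activity value per gene set.

    expression: dict mapping gene name -> list of per-sample values
    gene_sets:  dict mapping gene-set name -> list of member gene names
    Returns a dict mapping gene-set name -> list of per-sample activities.
    Assumption: activity = mean z-score of the member genes that were measured.
    """
    z = {gene: zscore(vals) for gene, vals in expression.items()}
    activity = {}
    for name, members in gene_sets.items():
        present = [z[g] for g in members if g in z]
        if not present:
            continue  # skip gene sets with no measured member genes
        # zip(*present) iterates sample by sample over the member genes.
        activity[name] = [fmean(column) for column in zip(*present)]
    return activity


# Toy data (hypothetical): three genes measured in three samples.
expression = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [5.0, 5.0, 5.0],
}
gene_sets = {"S1": ["A", "B"], "S2": ["C", "X"], "S3": ["X"]}
activity = gene_set_activity(expression, gene_sets)
```

The resulting samples-by-gene-sets values can then be used as features in place of the individual gene expression levels.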

Machine Learning

This tool is integrated with WEKA (a data mining tool) to maximize learning effectiveness from the data. It covers feature selection, classification, clustering, and model validation with a variety of methods.
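WEKA reads its data in the ARFF text format. As an illustration of what a gene-set activity matrix looks like in that form, here is a minimal Python sketch that writes one. It is an assumption about how one might prepare such data by hand, not a description of how this tool passes data to WEKA. The function name, relation name, and values are hypothetical, and feature names are assumed to contain no quote characters.

```python
def to_arff(relation, feature_names, rows, labels, classes):
    """Render a samples-by-features matrix as ARFF text.

    relation:      name for the @relation header (assumed to contain no spaces)
    feature_names: one name per gene set (assumed to contain no quote characters)
    rows:          one list of numeric activity values per sample
    labels:        one class label per sample
    classes:       the allowed class labels, in the order to declare them
    """
    lines = ["@relation " + relation, ""]
    for name in feature_names:
        # Quote names so that spaces or punctuation in gene-set names are safe.
        lines.append("@attribute '" + name + "' numeric")
    lines.append("@attribute class {" + ",".join(classes) + "}")
    lines += ["", "@data"]
    for values, label in zip(rows, labels):
        lines.append(",".join(repr(float(v)) for v in values) + "," + label)
    return "\n".join(lines) + "\n"


# Toy example (hypothetical values): two samples, two gene-set features.
arff_text = to_arff(
    "gsa",
    ["S1", "S2"],
    [[0.5, -1.25], [1.0, 0.0]],
    ["normal", "tumor"],
    ["normal", "tumor"],
)
```

Writing `arff_text` to a file with an `.arff` extension gives a dataset with the class label as the last attribute.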

Interpretation of Results

After the machine learning process is done, the results from those steps are more valuable if their relationship to the phenotype outcome can be annotated. In this version, users can only link and interpret their results using the KEGG pathway database via KEGG Mapper.

***Currently under construction.

Data Repository

The benchmark gene-set activity datasets are available to download for anyone interested in using them to evaluate novel algorithms for gene-set activity analysis. The gene-set activity datasets are available >> here <<.


© 2015 Gene-set Activity Toolbox, School of Information Technology, King Mongkut's University of Technology Thonburi. Contact person: Assoc. Prof. Dr. Jonathan H. Chan (jonathan (at) sit.kmutt.ac.th)