Recently I posted a blog entry on using miRNA profiling as biomarkers for cancer. Protein profiling is another potential tool for hunting biomarkers. Traditional analyses of microarray data assume that each individual protein contributes independently to the clinical outcome. Because the results are inconsistent across different data sets, we have started to think that not all proteins should be weighted equally. The first step is to identify the protein sets or pathways that are significantly involved in disease, that is, the so-called protein biomarkers.
Several algorithms have been developed to approach this problem. Some use interaction structures, such as protein-protein interactions (PPIs), protein-DNA interactions, or regulatory pathways, as the input information for the analysis (Chuang et al., 2007). The limitation of this kind of approach is that it does not consider the protein network structure within the cell.
The next improvement was to introduce network-constrained regularization procedures, which can weight the importance of a particular protein, into linear regression analysis (Li & Li, 2008). Clinical outcomes, however, are often not continuous quantities that relate linearly to the relevant proteins. Rather, they are binary (alive/dead, metastasis/non-metastasis), which calls for a classifier rather than a regression model.
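To make the idea of a network constraint concrete, here is a minimal sketch in plain Python. It assumes the penalty is the quadratic form beta^T L beta on the normalized graph Laplacian L of the protein network, combined with an L1 term, which is how I read Li and Li (2008). The function names, the toy star-shaped network, and the parameters `lam1` and `lam2` are my own illustrative choices, not taken from the paper.

```python
import math

def normalized_laplacian(edges, n):
    """Normalized Laplacian of an undirected, unweighted graph with n nodes.

    Assumes no self-loops and no duplicate edges.
    L[u][u] = 1 if node u has at least one neighbour (0 if isolated),
    L[u][v] = -1 / sqrt(deg(u) * deg(v)) for every edge (u, v).
    """
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    lap = [[0.0] * n for _ in range(n)]
    for u in range(n):
        if deg[u] > 0:
            lap[u][u] = 1.0
    for u, v in edges:
        w = -1.0 / math.sqrt(deg[u] * deg[v])
        lap[u][v] = w
        lap[v][u] = w
    return lap

def network_penalty(beta, lap):
    """Quadratic form beta^T L beta: small when linked proteins have
    similar degree-scaled coefficients, large when they disagree."""
    n = len(beta)
    return sum(beta[i] * lap[i][j] * beta[j] for i in range(n) for j in range(n))

def penalized_loss(X, y, beta, lap, lam1, lam2):
    """Squared error + lam1 * L1 norm + lam2 * network penalty (illustrative)."""
    sq_err = sum(
        (yi - sum(b * x for b, x in zip(beta, xi))) ** 2 for xi, yi in zip(X, y)
    )
    l1 = sum(abs(b) for b in beta)
    return sq_err + lam1 * l1 + lam2 * network_penalty(beta, lap)

# Toy network (hypothetical): protein 0 is a hub linked to proteins 1, 2 and 3.
star_edges = [(0, 1), (0, 2), (0, 3)]
lap = normalized_laplacian(star_edges, 4)

same_sign = network_penalty([1.0, 1.0, 0.0, 0.0], lap)
opposite_sign = network_penalty([1.0, -1.0, 0.0, 0.0], lap)
print(same_sign, opposite_sign)
```

In this toy network, giving two linked proteins coefficients of opposite sign costs more than giving them the same sign, which is the sense in which the network "constrains" the regression.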
A new algorithm was proposed recently to address this (Chen et al., 2011). The idea is to combine the advantages of the previous algorithms. It takes gene expression data and protein-protein interaction information as input, builds a network-constrained support vector machine (which returns a binary classifier), and then predicts the outcome of new samples (see flowchart). Significant proteins or subnetworks can then be detected through a significance test. When the algorithm was tested on breast cancer data, the results showed that the contribution of hub proteins was significantly enhanced, even when their expression levels were not very different between cancerous and non-cancerous cells.
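The following is a rough sketch of what a network-constrained linear SVM could look like, not the implementation from Chen et al. (2011). It assumes the objective is the usual hinge loss plus 0.5 * ||w||^2 plus an extra term 0.5 * lam * w^T L w, where L is a graph Laplacian of the PPI network, and it minimises that objective with plain subgradient descent. The parameter names (`lam`, `C`, `lr`, `epochs`), the three-protein network, and the toy expression values are all made up for illustration.

```python
def train_net_svm(X, y, lap, lam=1.0, C=1.0, lr=0.01, epochs=1000):
    """Subgradient descent on
    0.5 * w.w + 0.5 * lam * w^T L w + C * sum(hinge losses).

    X: list of samples (lists of expression values), y: labels in {-1, +1},
    lap: symmetric p x p graph Laplacian of the protein network.
    """
    p = len(X[0])
    w = [0.0] * p
    b = 0.0
    for _ in range(epochs):
        # Gradient of the two regularisers: w + lam * L w
        grad_w = [
            w[j] + lam * sum(lap[j][k] * w[k] for k in range(p)) for j in range(p)
        ]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:  # sample violates the margin: hinge subgradient
                for j in range(p):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Binary outcome (+1 / -1) for each sample."""
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1 for xi in X]

# Hypothetical network: proteins 0 and 1 interact, protein 2 is isolated.
lap = [[1.0, -1.0, 0.0],
       [-1.0, 1.0, 0.0],
       [0.0, 0.0, 0.0]]

# Toy expression data: only protein 0 clearly separates the two classes.
X = [[2.0, 0.1, 0.0], [1.8, -0.1, 0.2], [-2.0, -0.1, 0.0], [-1.8, 0.1, -0.2]]
y = [1, 1, -1, -1]

w_plain, b_plain = train_net_svm(X, y, lap, lam=0.0)  # ordinary linear SVM
w_net, b_net = train_net_svm(X, y, lap, lam=5.0)      # with network constraint
print(w_plain, w_net)
print(predict(X, w_net, b_net))
```

With the network term switched on, protein 1 receives a clearly larger weight than in the plain SVM because it interacts with the informative protein 0, even though its own expression barely differs between the classes. That is a small-scale version of the effect described above, where network neighbours gain importance despite modest expression differences.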
Overall, the results suggest that this new algorithm can identify important proteins that may serve as biomarkers for human cancer.
Chuang, H., Lee, E., Liu, Y., Lee, D., & Ideker, T. (2007). Network-based classification of breast cancer metastasis. Molecular Systems Biology, 3. DOI: 10.1038/msb4100180
Li, C., & Li, H. (2008). Network-constrained regularization and variable selection for analysis of genomic data. Bioinformatics, 24 (9), 1175-1182. DOI: 10.1093/bioinformatics/btn081
Chen, L., Xuan, J., Riggins, R., Clarke, R., & Wang, Y. (2011). Identifying cancer biomarkers by network-constrained support vector machines. BMC Systems Biology, 5 (1). DOI: 10.1186/1752-0509-5-161