Unsupervised Feature Extraction Applied to Bioinformatics
Author: Y-h. Taguchi
Publisher: Springer Nature
Total Pages: 321
Release: 2019-08-23
Genre: Technology & Engineering
ISBN: 3030224562

This book proposes applications of tensor decomposition to unsupervised feature extraction and feature selection. The author posits that although supervised methods, including deep learning, have become popular, unsupervised methods have their own advantages. He argues that unsupervised methods are easy to learn because tensor decomposition is a conventional linear methodology. The book starts from very basic linear algebra and reaches cutting-edge methodologies applied to difficult situations in which there are many features (variables) but only a small number of samples. The author includes advanced descriptions of tensor decomposition, including Tucker decomposition using higher-order singular value decomposition as well as higher-order orthogonal iteration, and tensor train decomposition. The author concludes by showing unsupervised methods and their application to a wide range of topics. The book allows readers to analyze data sets with small samples and many features; provides a fast algorithm, based upon linear algebra, to analyze big data; and includes several applications to multi-view data analyses, with a focus on bioinformatics.
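
As a rough illustration of the Tucker decomposition via higher-order singular value decomposition mentioned in this description, the following is a minimal NumPy sketch, not the book's own code. The helper names (`unfold`, `mode_product`, `hosvd`, `reconstruct`) and the toy tensor shape are assumptions made for this example only.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize `tensor`: rows index the chosen mode, columns index all other modes."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along `mode` (the n-mode product)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)

def hosvd(tensor):
    """Untruncated HOSVD: one orthonormal factor per mode plus a core tensor."""
    factors = []
    for mode in range(tensor.ndim):
        # Left singular vectors of each unfolding give that mode's factor matrix.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

def reconstruct(core, factors):
    """Rebuild the tensor from the core and the factor matrices."""
    out = core
    for mode, u in enumerate(factors):
        out = mode_product(out, u, mode)
    return out

# Hypothetical toy data: 5 features x 4 samples x 3 conditions.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4, 3))
core, factors = hosvd(x)
x_hat = reconstruct(core, factors)
```

Because no singular vectors are discarded here, `x_hat` matches `x` up to floating-point error. Keeping only the leading columns of each factor would give a low-rank approximation instead.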

Feature Extraction
Author: Isabelle Guyon
Publisher: Springer Science & Business Media
Total Pages: 765
Release: 2006-07-20
Genre: Computers
ISBN: 3540354875

This book is both a reference for engineers and scientists and a teaching resource, featuring tutorial chapters and research papers on feature extraction. Until now there has been insufficient consideration of feature selection algorithms, no unified presentation of leading methods, and no systematic comparisons.

Applying Machine Learning for Automated Classification of Biomedical Data in Subject-Independent Settings
Author: Thuy T. Pham
Publisher: Springer
Total Pages: 114
Release: 2018-08-23
Genre: Technology & Engineering
ISBN: 3319986759

This book describes efforts to improve subject-independent automated classification techniques using a better feature extraction method and a more efficient classification model. It evaluates three popular saliency criteria for feature selection, showing that they share common limitations, including reliance on time-consuming and subjective manual practice as the de facto standard, and that existing automated efforts have been used predominantly in subject-dependent settings. It then proposes a novel approach for anomaly detection, demonstrating its effectiveness and accuracy for automated classification of biomedical data, and arguing for its applicability to a wider range of unsupervised machine learning applications in subject-independent settings.

Data Analytics in Bioinformatics
Author: Rabinarayan Satpathy
Publisher: John Wiley & Sons
Total Pages: 544
Release: 2021-01-20
Genre: Computers
ISBN: 1119785618

Machine learning techniques are increasingly being used to address problems in computational biology and bioinformatics. Novel machine learning techniques for analyzing high-throughput data in the form of sequences, gene and protein expression, pathways, and images are becoming vital for understanding diseases and for future drug discovery. Machine learning techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their ability to handle randomness and uncertainty in noisy data and to generalize. Machine Learning in Bioinformatics compiles recent approaches in machine learning methods and their applications to contemporary problems in bioinformatics, such as classification and prediction of disease, feature selection, dimensionality reduction, gene selection, and classification of microarray data.

Unsupervised Discovery of Biological Features from Gene Expression Data
Author: Jie Tan
Publisher:
Total Pages: 352
Release: 2017
Genre:
ISBN:

Rapid advances in high-throughput measurement technologies have led to a data rich era. Such big data provide opportunities to discover more biological phenomena but also challenge our ability to efficiently and effectively analyze them. While supervised machine learning algorithms are powerful in studying a known phenomenon, they cannot make novel discoveries. Therefore, unsupervised approaches that can directly extract knowledge from big data without requiring metadata annotation are needed. We adapted a neural network-based approach, denoising autoencoders (DAs), to fulfill this need. We first applied DAs to breast cancer gene expression data. DAs built features descriptive of important molecular characteristics of breast cancer, and the features can be robustly generalized to an independent dataset. We next developed ADAGE, A̲nalysis using D̲enoising A̲utoencoders for G̲ene E̲xpression, a DAs-based unsupervised feature extraction approach that integrates a complete gene expression compendium of an organism. We demonstrated ADAGE in the bacterial pathogen Pseudomonas aeruginosa. ADAGE constructed features that reflect biological states, such as strain variations and oxygen abundance in the environment, and achieved better resolution than the traditional feature extraction approaches PCA and ICA. Aiming for a more robust model, we next developed eADAGE, an ensemble of ADAGE models. eADAGE consolidates 100 individual ADAGE models into one model with improved robustness, less noise, and better concordance with known biology. We found that features in eADAGE models can recapitulate expert-curated biological pathways, and they obtained higher pathway coverage and better pathway resolution than ADAGE, PCA, and ICA. Using eADAGE, we compared 78 different medium types used in the P.a. compendium and identified media-specific patterns. Finally, to put eADAGE features into researchers’ data analysis workflow, we developed the ADAGE signature analysis pipeline. 
This pipeline aims to identify perturbed biological processes in an input dataset. Unlike traditional pathway analysis, which relies on expert-curated processes, we leverage features extracted from big data that mimic biological processes. This pipeline is especially useful for less well-annotated organisms with few pathway resources available. As data continue to grow, we expect that unsupervised knowledge extraction methods like ADAGE will become mainstream in big data analysis.
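
To make the denoising autoencoder idea in this abstract concrete, here is a minimal NumPy sketch under stated assumptions. It is not the ADAGE implementation: the network size, masking-noise rate, learning rate, squared-error loss, and two-pattern toy data are all hypothetical choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Encode to hidden features `h`, then decode to a reconstruction `y`."""
    w1, b1, w2, b2 = params
    h = sigmoid(x @ w1 + b1)
    y = sigmoid(h @ w2 + b2)
    return h, y

def train_da(x, n_hidden=2, corruption=0.1, lr=1.0, epochs=500, seed=0):
    """Full-batch gradient descent on squared error between the clean input
    and the reconstruction of a randomly masked copy (assumed settings)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    params = [rng.normal(scale=0.1, size=(d, n_hidden)), np.zeros(n_hidden),
              rng.normal(scale=0.1, size=(n_hidden, d)), np.zeros(d)]
    for _ in range(epochs):
        # Masking noise: zero out roughly `corruption` of the entries.
        x_noisy = x * (rng.random(x.shape) >= corruption)
        h, y = forward(x_noisy, params)
        # Backpropagate the per-sample-averaged squared error through both sigmoids.
        dz2 = 2.0 * (y - x) * y * (1.0 - y) / n
        dz1 = (dz2 @ params[2].T) * h * (1.0 - h)
        grads = [x_noisy.T @ dz1, dz1.sum(axis=0), h.T @ dz2, dz2.sum(axis=0)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params

def reconstruction_error(x, params):
    """Mean squared error over all entries, measured on the clean input."""
    _, y = forward(x, params)
    return float(np.mean((y - x) ** 2))

# Hypothetical toy "expression" matrix: 40 samples x 8 genes, two sample groups.
rng = np.random.default_rng(1)
pattern_a = np.array([0.9] * 4 + [0.1] * 4)
x = np.vstack([np.tile(pattern_a, (20, 1)), np.tile(pattern_a[::-1], (20, 1))])
x = np.clip(x + rng.normal(scale=0.05, size=x.shape), 0.0, 1.0)

untrained = train_da(x, epochs=0)   # initial random weights only
trained = train_da(x)
features, _ = forward(x, trained)   # hidden activations serve as extracted features
```

In this sketch the rows of `features` play the role of the extracted features. Comparing `reconstruction_error(x, untrained)` with `reconstruction_error(x, trained)` gives a quick check that training helped.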

UnFEAR
Author:
Publisher:
Total Pages:
Release:
Genre:
ISBN:

Handbook of Machine Learning Applications for Genomics
Author: Sanjiban Sekhar Roy
Publisher: Springer Nature
Total Pages: 222
Release: 2022-06-23
Genre: Technology & Engineering
ISBN: 9811691584

Machine learning currently plays a pivotal role in the progress of genomics, and its applications are helping researchers understand emerging trends and the future scope of the field. This book provides comprehensive coverage of machine learning applications such as DNN, CNN, and RNN for predicting the sequence of DNA- and RNA-binding proteins, gene expression, and splicing control. In addition, the book addresses multiomics data analysis of cancers using tensor decomposition, machine learning techniques for protein engineering, CNN applications in genomics, challenges of long noncoding RNAs in human disease diagnosis, and how machine learning can be used as a tool to shape the future of medicine. More importantly, it gives a comparative analysis and validates the outcomes of machine learning methods on genomic data against functional laboratory tests or formal clinical assessment. The topics of this book will be of interest to academics and practitioners working in functional genomics and machine learning. The book will also serve as a comprehensive guide for graduate, postgraduate, and Ph.D. scholars working in these fields.