A Dimension Reduction Technique to Preserve Nearest Neighbors on High Dimensional Data

Author: Christos Nestor Chachamis
Publisher:
Total Pages: 74
Release: 2020
Genre:
ISBN:

Dimension reduction techniques are widely used for various tasks, including visualization and data pre-processing. In this project, we develop a new dimension reduction method that helps with the problem of Approximate Nearest Neighbor Search on high-dimensional data. It uses a deep neural network to reduce the data to a lower dimension while preserving nearest neighbors and local structure. We evaluate the performance of this network on several synthetic and real datasets and, finally, compare our method against other dimension reduction techniques, such as t-SNE. Our experimental results show that this method preserves the local structure sufficiently well in both the training and test data. In particular, we observe that most of the distances of the predicted nearest neighbors in the test data are within 10% of the distances of the actual nearest neighbors. Another advantage of our method is that it can easily be applied to new and unseen data without having to fit the model from scratch.
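One way to read the "within 10%" criterion above is sketched below. This is an illustrative interpretation, not the author's evaluation code: the function names (`euclidean`, `nearest_index`, `fraction_within`) are hypothetical, the neighbor search is brute force over a single point set, and the "reduced" points are hand-made stand-ins for the output of a trained network.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_index(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: euclidean(points[i], points[j]))

def fraction_within(original, reduced, tolerance=0.10):
    """Fraction of points whose predicted neighbor is nearly as close as the true one.

    The predicted neighbor is found in the reduced space, but its distance
    is measured in the original space and compared with the true
    nearest-neighbor distance. Assumes no duplicate points.
    """
    hits = 0
    for i in range(len(original)):
        true_nn = nearest_index(original, i)
        predicted_nn = nearest_index(reduced, i)
        true_dist = euclidean(original[i], original[true_nn])
        predicted_dist = euclidean(original[i], original[predicted_nn])
        if predicted_dist <= (1.0 + tolerance) * true_dist:
            hits += 1
    return hits / len(original)

# Toy data (assumption): four 3-D points whose third coordinate is always 0.
original = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 0)]
good = [p[:2] for p in original]   # drops the constant coordinate
bad = [p[2:] for p in original]    # keeps only the constant coordinate

good_score = fraction_within(original, good)
bad_score = fraction_within(original, bad)
print(good_score, bad_score)
```

Dropping the constant coordinate leaves every distance unchanged, so the first score is 1.0. Keeping only the constant coordinate collapses all points together, and the score falls because the neighbor picked in the reduced space is no longer reliable.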

Geometric Structure of High-Dimensional Data and Dimensionality Reduction

Author: Jianzhong Wang
Publisher: Springer Science & Business Media
Total Pages: 363
Release: 2012-04-28
Genre: Computers
ISBN: 3642274978

"Geometric Structure of High-Dimensional Data and Dimensionality Reduction" adopts data geometry as a framework to address various methods of dimensionality reduction. In addition to introducing well-known linear methods, the book emphasizes recently developed nonlinear methods and presents applications of dimensionality reduction in many areas, such as face recognition, image segmentation, data classification, data visualization, and hyperspectral imagery data analysis. Numerous tables and graphs are included to illustrate the ideas, effects, and shortcomings of the methods. MATLAB code for all dimensionality reduction algorithms is provided to help readers implement them. The book will be useful for mathematicians, statisticians, computer scientists, and data analysts. It is also a valuable handbook for other practitioners who have a basic background in mathematics, statistics, and/or computer algorithms, such as internet search engine designers, physicists, geologists, electronic engineers, and economists. Jianzhong Wang is a Professor of Mathematics at Sam Houston State University, U.S.A.

Proceedings of the Fifth International Conference on Fuzzy and Neuro Computing (FANCCO - 2015)

Author: V. Ravi
Publisher: Springer
Total Pages: 365
Release: 2015-11-24
Genre: Computers
ISBN: 3319272128

These proceedings bring together contributions from researchers in academia and industry reporting the latest cutting-edge research in the areas of Fuzzy Computing, Neuro Computing, and hybrid Neuro-Fuzzy Computing within the paradigm of Soft Computing. The FANCCO 2015 conference explored new application areas and the design of novel hybrid algorithms for solving different real-world application problems. After a rigorous review of the 68 submissions from all over the world, the referees' panel selected 27 papers to be presented at the conference. The accepted papers have a good, balanced mix of theory and applications. The techniques covered include fuzzy neural networks, decision trees, spiking neural networks, self-organizing feature maps, support vector regression, adaptive neuro-fuzzy inference systems, extreme learning machines, fuzzy multi-criteria decision making, machine learning, web usage mining, the Takagi-Sugeno inference system, the extended Kalman filter, Gödel-type logic, fuzzy formal concept analysis, and biclustering. The applications covered include social network analysis, Twitter sentiment analysis, cross-domain sentiment analysis, information security, the education sector, e-learning, information management, climate studies, rainfall prediction, brain studies, bioinformatics, structural engineering, sewage water quality, and the movement of aerial vehicles.

A Non-linear Dimensionality Reduction Method for Improving Nearest Neighbour Classification [microform]

Author: Renqiang Min
Publisher: Library and Archives Canada = Bibliothèque et Archives Canada
Total Pages: 164
Release: 2005
Genre:
ISBN: 9780494021743

Learning in high dimensional spaces is computationally expensive because of the curse of dimensionality. Consequently, there is a critical need for methods that can produce good low dimensional representations of the raw data that preserve the significant structure in the data and suppress noise. This can be achieved by an autoencoder network consisting of a recognition network that converts high-dimensional data into low-dimensional codes and a generative network that reconstructs the high dimensional data from its low dimensional codes. Experiments with images of digits and images of faces show that the performance of an autoencoder network can sometimes be improved by using a non-parametric dimensionality reduction method, Stochastic Neighbour Embedding, to regularize the low-dimensional codes in a way that discourages very similar data vectors from having very different codes.

Nonlinear Dimensionality Reduction

Author: John A. Lee
Publisher: Springer Science & Business Media
Total Pages: 316
Release: 2007-10-31
Genre: Mathematics
ISBN: 038739351X

This book describes established and advanced methods for reducing the dimensionality of numerical databases. Each description starts from intuitive ideas, develops the necessary mathematical details, and ends by outlining the algorithmic implementation. The text provides a lucid summary of facts and concepts relating to well-known methods as well as recent developments in nonlinear dimensionality reduction. Methods are all described from a unifying point of view, which helps to highlight their respective strengths and shortcomings. The presentation will appeal to statisticians, computer scientists and data analysts, and other practitioners having a basic background in statistics or computational learning.

Computational Genomics with R

Author: Altuna Akalin
Publisher: CRC Press
Total Pages: 462
Release: 2020-12-16
Genre: Mathematics
ISBN: 1498781861

Computational Genomics with R provides a starting point for beginners in genomic data analysis and also guides more advanced practitioners to sophisticated data analysis techniques in genomics. The book covers topics from R programming, to machine learning and statistics, to the latest genomic data analysis techniques. The text provides accessible information and explanations, always with the genomics context in the background. The book also contains practical and well-documented examples in R so readers can analyze their data by simply reusing the code presented. As the field of computational genomics is interdisciplinary, it requires different starting points for people with different backgrounds. For example, a biologist might skip sections on basic genome biology and start with R programming, whereas a computer scientist might want to start with genome biology. After reading: You will have the basics of R and be able to dive right into specialized uses of R for computational genomics such as using Bioconductor packages. You will be familiar with statistics and with supervised and unsupervised learning techniques that are important in data modeling and exploratory analysis of high-dimensional data. You will understand genomic intervals and operations on them that are used for tasks such as aligned read counting and genomic feature annotation. You will know the basics of processing and quality checking high-throughput sequencing data. You will be able to do sequence analysis, such as calculating GC content for parts of a genome or finding transcription factor binding sites. You will know about visualization techniques used in genomics, such as heatmaps, meta-gene plots, and genomic track visualization. You will be familiar with analysis of different high-throughput sequencing data sets, such as RNA-seq, ChIP-seq, and BS-seq. You will know basic techniques for integrating and interpreting multi-omics datasets.
Altuna Akalin is a group leader and head of the Bioinformatics and Omics Data Science Platform at the Berlin Institute of Medical Systems Biology, Max Delbrück Center, Berlin. He has been developing computational methods for analyzing and integrating large-scale genomics data sets since 2002. He has published an extensive body of work in this area. The framework for this book grew out of the yearly computational genomics courses he has been organizing and teaching since 2015.

Database and Expert Systems Applications

Author: Stephane Bressan
Publisher: Springer Science & Business Media
Total Pages: 977
Release: 2006-08-29
Genre: Computers
ISBN: 3540378715

This book constitutes the refereed proceedings of the 17th International Conference on Database and Expert Systems Applications, DEXA 2006. The book presents 90 revised full papers together with 1 invited paper. The papers are organized in topical sections on XML, data and information, data mining and data warehouses, database applications, WWW, bioinformatics, process automation and workflow, knowledge management and expert systems, database theory, query processing, and privacy and security.

Nonlinear Dimensionality Reduction Techniques

Author: Sylvain Lespinats
Publisher: Springer Nature
Total Pages: 279
Release: 2021-12-02
Genre: Computers
ISBN: 3030810267

This book proposes tools for the analysis of multidimensional and metric data by reviewing the state of the art of existing solutions and developing new ones. It mainly focuses on visual exploration of these data by a human analyst, relying on a 2D or 3D scatter plot display obtained through dimensionality reduction. Performing diagnosis of an energy system requires identifying relations between observed monitoring variables and the associated internal state of the system. Dimensionality reduction, which allows a multidimensional dataset to be represented visually, is a promising tool to help domain experts analyse these relations. This book reviews existing techniques for visual data exploration and dimensionality reduction, such as t-SNE and Isomap, and proposes new solutions to challenges in that field. In particular, it presents the new unsupervised technique ASKI and the supervised methods ClassNeRV and ClassJSE. Moreover, MING, a new approach for local map quality evaluation, is also introduced. These methods are then applied to the representation of expert-designed fault indicators for smart buildings, I-V curves for photovoltaic systems, and acoustic signals for Li-ion batteries.

The Random Projection Method

Author: Santosh S. Vempala
Publisher: American Mathematical Soc.
Total Pages: 120
Release: 2005-02-24
Genre: Mathematics
ISBN: 0821837931

Random projection is a simple geometric technique for reducing the dimensionality of a set of points in Euclidean space while preserving pairwise distances approximately. The technique plays a key role in several breakthrough developments in the field of algorithms. In other cases, it provides elegant alternative proofs. The book begins with an elementary description of the technique and its basic properties. Then it develops the method in the context of applications, which are divided into three groups. The first group consists of combinatorial optimization problems such as maxcut, graph coloring, minimum multicut, graph bandwidth and VLSI layout. Presented in this context is the theory of Euclidean embeddings of graphs. The next group is machine learning problems, specifically, learning intersections of halfspaces and learning large margin hypotheses. The projection method is further refined for the latter application. The last set consists of problems inspired by information retrieval, namely, nearest neighbor search, geometric clustering and efficient low-rank approximation. Motivated by the first two applications, an extension of random projection to the hypercube is developed here. Throughout the book, random projection is used as a way to understand, simplify and connect progress on these important and seemingly unrelated problems. The book is suitable for graduate students and research mathematicians interested in computational geometry.
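The basic idea in the first sentence of the description can be illustrated with a short sketch. It is not taken from the book. It uses one common construction, a matrix of independent Gaussian entries scaled by 1/sqrt(k), and the dimensions, seed, and function names are illustrative assumptions.

```python
import math
import random

def random_projection_matrix(d, k, rng):
    """Return a k x d matrix with independent N(0, 1/k) entries."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(d)] for _ in range(k)]

def project(matrix, point):
    """Multiply the k x d matrix by a d-dimensional point."""
    return [sum(r * x for r, x in zip(row, point)) for row in matrix]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

rng = random.Random(0)        # fixed seed so the run is repeatable
d, k, n = 500, 200, 10        # original dimension, target dimension, point count
points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]

R = random_projection_matrix(d, k, rng)
projected = [project(R, p) for p in points]

# Ratio of projected to original distance for every pair of points.
ratios = [
    euclidean(projected[i], projected[j]) / euclidean(points[i], points[j])
    for i in range(n)
    for j in range(i + 1, n)
]
print(f"distance ratios range from {min(ratios):.3f} to {max(ratios):.3f}")
```

With these settings the ratios typically stay close to 1, which is the "approximately preserving pairwise distances" property. A smaller target dimension k gives a wider spread.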