Boosting for Generic 2D/3D Object Recognition

Author: Doaa Abd al-Kareem Mohammed Hegazy
Release: 2009

Generic object recognition is an important function of the human visual system. For an artificial vision system to emulate human perceptual abilities, it should also be able to perform generic object recognition. In this thesis, we address the generic object recognition problem and present different approaches and models which tackle different aspects of this difficult problem. First, we present a model for generic 2D object recognition from complex 2D images. The model exploits only appearance-based information, in the form of a combination of texture and color cues, for binary classification of 2D object classes. Learning is accomplished in a weakly supervised manner using Boosting. However, we live in a 3D world, and the ability to recognize 3D objects is very important for any vision system. Therefore, we present a model for generic recognition of 3D objects from range images. Our model uses a combination of simple local shape descriptors extracted from range images to recognize 3D object categories, since shape is an important cue provided by range images. Moreover, we present a novel dataset for generic object recognition that provides 2D and range images of different object classes, captured with a Time-of-Flight (ToF) camera.

Pattern Recognition and Computer Vision

Author: Qingshan Liu
Publisher: Springer Nature
Total Pages: 532
Release: 2023-12-23
Genre: Computers
ISBN: 9819984351

The 13-volume set LNCS 14425-14437 constitutes the refereed proceedings of the 6th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2023, held in Xiamen, China, during October 13–15, 2023. The 532 full papers presented in these volumes were selected from 1420 submissions. The papers have been organized in the following topical sections: Action Recognition, Multi-Modal Information Processing, 3D Vision and Reconstruction, Character Recognition, Fundamental Theory of Computer Vision, Machine Learning, Vision Problems in Robotics, Autonomous Driving, Pattern Classification and Cluster Analysis, Performance Evaluation and Benchmarks, Remote Sensing Image Interpretation, Biometric Recognition, Face Recognition and Pose Recognition, Structural Pattern Recognition, Computational Photography, Sensing and Display Technology, Video Analysis and Understanding, Vision Applications and Systems, Document Analysis and Recognition, Feature Extraction and Feature Selection, Multimedia Analysis and Reasoning, Optimization and Learning methods, Neural Network and Deep Learning, Low-Level Vision and Image Processing, Object Detection, Tracking and Identification, Medical Image Processing and Analysis.

Person Re-Identification

Author: Shaogang Gong
Publisher: Springer Science & Business Media
Total Pages: 446
Release: 2014-01-03
Genre: Computers
ISBN: 144716296X

The first book of its kind dedicated to the challenge of person re-identification, this text provides an in-depth, multidisciplinary discussion of recent developments and state-of-the-art methods. Features: introduces examples of robust feature representations, reviews salient feature weighting and selection mechanisms and examines the benefits of semantic attributes; describes how to segregate meaningful body parts from background clutter; examines the use of 3D depth images and contextual constraints derived from the visual appearance of a group; reviews approaches to feature transfer function and distance metric learning and discusses potential solutions to issues of data scalability and identity inference; investigates the limitations of existing benchmark datasets, presents strategies for camera topology inference and describes techniques for improving post-rank search efficiency; explores the design rationale and implementation considerations of building a practical re-identification system.

Reconstruction and Analysis of 3D Scenes

Author: Martin Weinmann
Publisher: Springer
Total Pages: 250
Release: 2016-03-17
Genre: Computers
ISBN: 3319292463

This unique work presents a detailed review of the processing and analysis of 3D point clouds. A fully automated framework is introduced, incorporating each aspect of a typical end-to-end processing workflow, from raw 3D point cloud data to semantic objects in the scene. For each of these components, the book describes the theoretical background, and compares the performance of the proposed approaches to that of current state-of-the-art techniques. Topics and features: reviews techniques for the acquisition of 3D point cloud data and for point quality assessment; explains the fundamental concepts for extracting features from 2D imagery and 3D point cloud data; proposes an original approach to keypoint-based point cloud registration; discusses the enrichment of 3D point clouds by additional information acquired with a thermal camera, and describes a new method for thermal 3D mapping; presents a novel framework for 3D scene analysis.

Enhancing Point Cloud Generation From Various Information Sources by Applying Geometry-aware Folding Operation

Author: Yu Lin
Release: 2022
Genre: Machine learning

A plethora of cutting-edge computer vision and graphics applications, such as Augmented Reality (AR), Virtual Reality (VR), autonomous vehicles, and robotics, require rapid creation of and access to abundant 3D data. Among various data representations, e.g., RGB images, depth images, or voxel grids, point clouds attract considerable attention from the research community because they offer additional geometric, shape, and scale information in comparison with 2D images and demand fewer computational resources to process than other 3D representations, e.g., voxel grids, octrees, or triangle meshes. Unfortunately, even with the increasing availability of 3D sensors, the size and variety of 3D point cloud datasets pale when compared to the vast datasets of other representations. Therefore, it will benefit many applications if we can generate point clouds from other information sources. Point cloud generation is a sub-field of 3D reconstruction, which aims to generate a complete 3D object from other information sources. Conventional methods generally focus on 2D images and rely heavily on knowledge of multi-view geometry, while multiple 2D views of a target 3D object are usually inaccessible in real-world scenarios. In contrast, recent deep learning approaches are either dedicated to 3D representations with regular structures, such as voxel grids and octrees, and thus suffer from resolution and scalability issues, or inadvertently ignore crucial 3D prior knowledge and lead to sub-optimal solutions. To address the aforementioned drawbacks, we explore ways to improve point cloud generation by developing advanced folding operations and geometry-aware (3D-prior-aware) reconstruction networks in this dissertation. Specifically, we start with a novel point cloud generation framework, TDPNet, that reconstructs complete point clouds by employing a hierarchical manifold decoder and a collection of latent 3D prototypes.
Later, we find that applying a vanilla folding operation is insufficient for a realistic reconstruction, and that using KMeans centroids as the prototype features is unstable and lacks interpretability. Inspired by these observations, we further introduce a novel framework equipped with a collection of Learnable Shape Primitives (L-SHAP), which encode the crucial 3D prior knowledge from training data through an additional folding operation. On the other hand, it is beneficial to many applications if point clouds can be generated in a few-shot scenario. We tackle this problem with a novel few-shot generation framework, FSPG, which simultaneously considers class-agnostic and class-specific 3D priors during the generation process. Finally, we observe that conventional folding operations are implemented by a simple shared MLP, which increases training difficulty and limits the network’s modeling capability. To solve this problem, we incorporate the popular Transformer architecture into a novel attentional folding decoder, AttnFold, and introduce a Local Semantic Consistency (LSC) regularizer to further boost the model’s capability. Based on our research, we demonstrate that learning flexible data-driven 3D priors and adopting advanced folding operations are effective for point cloud generation under different problem settings.
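
As a rough illustration of what a "vanilla" folding operation with a shared MLP looks like, the sketch below deforms a fixed 2D grid into 3D points conditioned on a latent codeword. It is a generic, untrained NumPy sketch and not the dissertation's TDPNet, L-SHAP, FSPG, or AttnFold code; the function names (`make_grid`, `shared_mlp`, `fold`, `init_weights`), the layer sizes, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def make_grid(n_side):
    """Return an (n_side**2, 2) array of 2D grid points in [-1, 1]^2."""
    u = np.linspace(-1.0, 1.0, n_side)
    uu, vv = np.meshgrid(u, u)
    return np.stack([uu.ravel(), vv.ravel()], axis=1)

def init_weights(sizes, rng):
    """Random (untrained) weights for an MLP with the given layer sizes."""
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def shared_mlp(x, weights):
    """Apply the same MLP to every row (point) of x, with ReLU between layers."""
    for i, (W, b) in enumerate(weights):
        x = x @ W + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def fold(codeword, grid, weights):
    """Concatenate the codeword to each grid point and map it to a 3D point."""
    tiled = np.tile(codeword, (grid.shape[0], 1))
    return shared_mlp(np.concatenate([grid, tiled], axis=1), weights)

rng = np.random.default_rng(0)
codeword = rng.normal(size=16)                 # hypothetical latent shape code
grid = make_grid(8)                            # 64 grid points
weights = init_weights([2 + 16, 32, 3], rng)   # hypothetical layer sizes
points = fold(codeword, grid, weights)
print(points.shape)  # (64, 3)
```

Because the same weights are applied to every grid point independently, each output point depends only on its own grid coordinate and the global codeword, which is one way to read the abstract's remark that a simple shared MLP limits modeling capability.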

Domain Adaptation in Computer Vision with Deep Learning

Author: Hemanth Venkateswara
Publisher: Springer Nature
Total Pages: 256
Release: 2020-08-18
Genre: Computers
ISBN: 3030455297

This book provides a survey of deep learning approaches to domain adaptation in computer vision. It gives the reader an overview of the state-of-the-art research in deep learning-based domain adaptation. This book also discusses the various approaches to deep learning-based domain adaptation in recent years. It outlines the importance of domain adaptation for the advancement of computer vision, consolidates the research in the area and provides the reader with promising directions for future research in domain adaptation. The book is divided into four parts. The first part begins with an introduction to domain adaptation, which outlines the problem statement, the role of domain adaptation and the motivation for research in this area. It includes a chapter outlining domain adaptation techniques from the pre-deep learning era. The second part of this book highlights feature alignment-based approaches to domain adaptation. The third part of this book outlines image alignment procedures for domain adaptation. The final part of this book presents novel directions for research in domain adaptation. This book targets researchers working in artificial intelligence, machine learning, deep learning and computer vision. Industry professionals and entrepreneurs seeking to adopt deep learning into their applications will also be interested in this book.

2D Object Detection and Recognition

Author: Yali Amit
Publisher: MIT Press
Total Pages: 334
Release: 2002
Genre: Computers
ISBN: 9780262011945

A guide to the computer detection and recognition of 2D objects in gray-level images.

Distributionally Robust Unsupervised Domain Adaptation and Its Applications in 2D and 3D Image Analysis

Author: Yibin Wang
Release: 2023

Obtaining ground-truth label information from real-world data along with uncertainty quantification can be challenging or even infeasible. In the absence of labeled data for a certain task, unsupervised domain adaptation (UDA) techniques have achieved strong results by learning transferable knowledge from labeled source domain data and adapting it to unlabeled target domain data, yet uncertainty remains a major concern under domain shifts. Distributionally robust learning (DRL) is emerging as a high-potential technique for building reliable learning systems that are robust to distribution shifts. In this research, a distributionally robust unsupervised domain adaptation (DRUDA) method is proposed to enhance the generalization ability of machine learning models under input space perturbations. The DRL-based UDA learning scheme is formulated as a min-max optimization problem by optimizing worst-case perturbations of the training source data. Our Wasserstein distributionally robust framework can reduce the shifts in the joint distributions across domains. The proposed DRUDA method has been tested on various benchmark datasets. In addition, a gradient mapping-guided explainable network (GMGENet) is proposed to analyze 3D medical images for extracapsular extension (ECE) identification. DRUDA-enhanced GMGENet is evaluated, and experimental results demonstrate that the proposed DRUDA successfully improves transfer performance on target domains for the 3D image analysis task. This research enhances the understanding of distributionally robust optimization in domain adaptation and is expected to advance current unsupervised machine learning techniques.
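
For orientation, a Wasserstein distributionally robust objective is commonly written in the generic min-max form below. This is the standard textbook formulation, not necessarily the exact DRUDA objective; the symbols $\theta$ (model parameters), $\ell$ (loss), $P_S$ (source data distribution), $W$ (a Wasserstein distance), and $\rho$ (radius of the uncertainty set) are notation assumed here rather than taken from the dissertation.

```latex
\min_{\theta} \; \sup_{Q \,:\, W(Q,\, P_S) \le \rho} \;
\mathbb{E}_{(x, y) \sim Q} \big[ \ell(\theta; x, y) \big]
```

The inner supremum selects the worst-case distribution within distance $\rho$ of the source data, and the outer minimization trains the model against that worst case.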

Visual Domain Adaptation in the Deep Learning Era

Author: Gabriela Csurka
Publisher: Morgan & Claypool Publishers
Total Pages: 190
Release: 2022-04-05
Genre: Computers
ISBN: 163639342X

Solving problems with deep neural networks typically relies on massive amounts of labeled training data to achieve high performance. While in many situations huge volumes of unlabeled data can be and often are generated and available, the cost of acquiring data labels remains high. Transfer learning (TL), and in particular domain adaptation (DA), has emerged as an effective solution to overcome the burden of annotation, exploiting the unlabeled data available from the target domain together with labeled data or pre-trained models from similar, yet different source domains. The aim of this book is to provide an overview of such DA/TL methods applied to computer vision, a field whose popularity has increased significantly in the last few years. We set the stage by revisiting the theoretical background and some of the historical shallow methods before discussing and comparing different domain adaptation strategies that exploit deep architectures for visual recognition. We introduce the space of self-training-based methods that draw inspiration from the related fields of deep semi-supervised and self-supervised learning in solving the deep domain adaptation problem. Going beyond the classic domain adaptation problem, we then explore the rich space of problem settings that arise when applying domain adaptation in practice, such as partial or open-set DA, where source and target data categories do not fully overlap, continuous DA, where the target data comes as a stream, and so on. We next consider the least restrictive setting of domain generalization (DG), as an extreme case where neither labeled nor unlabeled target data are available during training. Finally, we close by considering the emerging area of learning-to-learn and how it can be applied to further improve existing approaches to cross-domain learning problems such as DA and DG.