Linguistically Motivated Statistical Machine Translation

Author: Deyi Xiong
Publisher: Springer
Total Pages: 159
Release: 2015-02-11
Genre: Language Arts & Disciplines
ISBN: 9812873562

This book provides a wide variety of algorithms and models to integrate linguistic knowledge into Statistical Machine Translation (SMT). It helps advance conventional SMT to linguistically motivated SMT by enhancing the following three essential components: translation, reordering and bracketing models. It also promotes the in-depth study of the impact of linguistic knowledge on machine translation. Finally, it provides a systematic introduction to Bracketing Transduction Grammar (BTG) based SMT, one of the state-of-the-art SMT formalisms, as well as a case study of linguistically motivated SMT on a BTG-based platform.

Using Linguistic Knowledge in Statistical Machine Translation

Author: Rabih Mohamed Zbib
Publisher:
Total Pages: 162
Release: 2010
Genre:
ISBN:

In this thesis, we present methods for using linguistically motivated information to enhance the performance of statistical machine translation (SMT). One of the advantages of the statistical approach to machine translation is that it is largely language-agnostic. Machine learning models are used to automatically learn translation patterns from data. SMT can, however, be improved by using linguistic knowledge to address specific areas of the translation process, where translations would be hard to learn fully automatically. We present methods that use linguistic knowledge at various levels to improve statistical machine translation, focusing on Arabic-English translation as a case study. In the first part, morphological information is used to preprocess the Arabic text for Arabic-to-English and English-to-Arabic translation, which reduces the gap in the complexity of the morphology between Arabic and English. The second method addresses the issue of long-distance reordering in translation to account for the difference in the syntax of the two languages. In the third part, we show how additional local context information on the source side is incorporated, which helps reduce lexical ambiguity. Two methods are proposed for using binary decision trees to control the amount of context information introduced. These methods are successfully applied to the use of diacritized Arabic source in Arabic-to-English translation. The final method combines the outputs of an SMT system and a Rule-based MT (RBMT) system, taking advantage of the flexibility of the statistical approach and the rich linguistic knowledge embedded in the rule-based MT system.

Syntax-based Statistical Machine Translation

Author: Philip Williams
Publisher: Springer Nature
Total Pages: 190
Release: 2022-05-31
Genre: Computers
ISBN: 3031021649

This unique book provides a comprehensive introduction to the most popular syntax-based statistical machine translation models, filling a gap in the current literature for researchers and developers in human language technologies. While phrase-based models have previously dominated the field, syntax-based approaches have proved a popular alternative, as they elegantly solve many of the shortcomings of phrase-based models. The heart of this book is a detailed introduction to decoding for syntax-based models. The book begins with an overview of synchronous context-free grammar (SCFG) and synchronous tree-substitution grammar (STSG) along with their associated statistical models. It also describes how three popular instantiations (Hiero, SAMT, and GHKM) are learned from parallel corpora. It introduces and details hypergraphs and associated general algorithms, as well as algorithms for decoding with both tree and string input. Special attention is given to efficiency, including search approximations such as beam search and cube pruning, data structures, and parsing algorithms. The book consistently highlights the strengths (and limitations) of syntax-based approaches, including their ability to generalize phrase-based translation units, their modeling of specific linguistic phenomena, and their function of structuring the search space.

Statistical Machine Translation

Author: Philipp Koehn
Publisher:
Total Pages: 433
Release: 2010
Genre: Machine translating
ISBN: 9780511690587

Hybrid Approaches to Machine Translation

Author: Marta R. Costa-jussà
Publisher: Springer
Total Pages: 208
Release: 2016-07-12
Genre: Computers
ISBN: 3319213113

This volume provides an overview of the field of Hybrid Machine Translation (MT) and presents some of the latest research conducted by linguists and practitioners from different multidisciplinary areas. Nowadays, most important developments in MT are achieved by combining data-driven and rule-based techniques. These combinations typically involve hybridization of different traditional paradigms, such as the introduction of linguistic knowledge into statistical approaches to MT, the incorporation of data-driven components into rule-based approaches, or statistical and rule-based pre- and post-processing for both types of MT architectures. The book is of interest primarily to MT specialists, but also – in the wider fields of Computational Linguistics, Machine Learning and Data Mining – to translators and managers of translation companies and departments who are interested in recent developments concerning automated translation tools.

Machine Translation with Minimal Reliance on Parallel Resources

Author: George Tambouratzis
Publisher: Springer
Total Pages: 92
Release: 2017-08-09
Genre: Computers
ISBN: 3319631071

This book provides a unified view of a new methodology for Machine Translation (MT). This methodology extracts information from widely available resources (extensive monolingual corpora) while only assuming the existence of a very limited parallel corpus, thus having a unique starting point to Statistical Machine Translation (SMT). In this book, a detailed presentation of the methodology principles and system architecture is followed by a series of experiments, where the proposed system is compared to other MT systems using a set of established metrics including BLEU, NIST, Meteor and TER. Additionally, free-to-use code is available that allows the creation of new MT systems. The volume is addressed to both language professionals and researchers. Prerequisites for readers are very limited and include a basic understanding of machine translation as well as of the basic tools of natural language processing.

Neural Machine Translation

Author: Philipp Koehn
Publisher: Cambridge University Press
Total Pages: 409
Release: 2020-06-18
Genre: Computers
ISBN: 1108497322

Learn how to build machine translation systems with deep learning from the ground up, from basic concepts to cutting-edge research.

Quality Estimation for Machine Translation

Author: Lucia Specia
Publisher: Springer Nature
Total Pages: 148
Release: 2022-05-31
Genre: Computers
ISBN: 3031021681

Many applications within natural language processing involve performing text-to-text transformations, i.e., given a text in natural language as input, systems are required to produce a version of this text (e.g., a translation), also in natural language, as output. Automatically evaluating the output of such systems is an important component in developing text-to-text applications. Two approaches have been proposed for this problem: (i) to compare the system outputs against one or more reference outputs using string matching-based evaluation metrics and (ii) to build models based on human feedback to predict the quality of system outputs without reference texts. Despite their popularity, reference-based evaluation metrics are faced with the challenge that multiple good (and bad) quality outputs can be produced by text-to-text approaches for the same input. This variation is very hard to capture, even with multiple reference texts. In addition, reference-based metrics cannot be used in production (e.g., online machine translation systems), when systems are expected to produce outputs for any unseen input. In this book, we focus on the second set of metrics, so-called Quality Estimation (QE) metrics, where the goal is to provide an estimate on how good or reliable the texts produced by an application are without access to gold-standard outputs. QE enables different types of evaluation that can target different types of users and applications. Machine learning techniques are used to build QE models with various types of quality labels and explicit features or learnt representations, which can then predict the quality of unseen system outputs. This book describes the topic of QE for text-to-text applications, covering quality labels, features, algorithms, evaluation, uses, and state-of-the-art approaches. It focuses on machine translation as application, since this represents most of the QE work done to date. It also briefly describes QE for several other applications, including text simplification, text summarization, grammatical error correction, and natural language generation.

A Machine Translation Approach to Cross Language Text Retrieval

Author: María Gabriela Fernandez-Diaz
Publisher: Universal-Publishers
Total Pages: 137
Release: 2005-03
Genre: Language Arts & Disciplines
ISBN: 1581122675

Cross Language Text Retrieval (CLTR) has been defined as the retrieval of documents in a language different from that of the original query. To make this possible, some kind of mechanism has to be applied in order to translate the information contained in the source sentence. Many different approaches have been tried with the purpose of transferring the information from the source language query to the target language one. Though all these methods deal with a way of translating as much information as possible from the source query, little research has been conducted in relation to the field of Machine Translation (MT). The purpose of this research work is to determine the feasibility of using MT techniques for CLTR. Specifically, I will describe how an MT system has been adapted without much effort to translate Spanish queries of a specific domain, i.e. Finance and Economics, into English in order to retrieve documents related to that field. The results of this process will then be compared with the results obtained from the retrieval of the original English queries. Thus, I will discuss the advantages and disadvantages of using MT for CLTR.