Characterizing the Redundancy of Universal Source Coding for Finite-length Sequences

Author: Ahmad Beirami
Publisher:
Total Pages:
Release: 2011
Genre: Communication
ISBN:


In this thesis, we first study the average redundancy resulting from the universal compression of a single finite-length sequence from an unknown source. For the universal compression of a source with d unknown parameters, Rissanen demonstrated that the expected redundancy of regular codes is asymptotically (d/2) log n + o(log n) for almost all sources, where n is the sequence length. Clarke and Barron derived the asymptotic average minimax redundancy for memoryless sources. The average minimax redundancy concerns the redundancy of the worst parameter vector under the best code, and thus says little about the effect of different source parameter values. Our treatment in this thesis is probabilistic. In particular, we derive a lower bound on the probability measure of the event that a sequence of length n from an FSMX source chosen using Jeffreys' prior is compressed with a redundancy larger than a certain fraction of (d/2) log n. Further, our results show that the average minimax redundancy is a good estimate of the average redundancy of most sources for large enough n and d. On the other hand, when the number of source parameters d is small, the average minimax redundancy overestimates the average redundancy for small to moderate length sequences. Additionally, we precisely characterize the average minimax redundancy of universal coding when the coding scheme is restricted to the family of two-stage codes, where we show that the two-stage assumption incurs a negligible extra redundancy for small and moderate length n unless the number of source parameters is small. Our results, collectively, help to characterize the non-negligible redundancy resulting from the compression of small and moderate length sequences.
Next, we apply these results to the compression of a small to moderate length sequence when the context present in a sequence of length M from the same source has been memorized, and we quantify the achievable performance improvement that such context memorization brings to the universal compression of the small to moderate length sequence.
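As a rough illustration of the scaling discussed in the abstract (this sketch is not taken from the thesis), the leading (d/2) log n term of the average minimax redundancy can be evaluated numerically to see why the per-symbol overhead is non-negligible for short sequences; the function names and the choice of d are illustrative assumptions:

```python
import math

def minimax_redundancy_bits(d: int, n: int) -> float:
    """Leading-order average minimax redundancy (d/2) * log2(n) in bits,
    ignoring the o(log n) terms of the asymptotic expansion."""
    return 0.5 * d * math.log2(n)

def redundancy_per_symbol(d: int, n: int) -> float:
    """The same redundancy amortized over the n symbols of the sequence."""
    return minimax_redundancy_bits(d, n) / n

if __name__ == "__main__":
    # d = 4 could correspond, e.g., to a binary source model with four
    # free parameters; the point is only how the overhead decays with n.
    for n in (256, 4096, 65536, 1 << 20):
        print(f"n = {n:>8}: {redundancy_per_symbol(4, n):.5f} bits/symbol")
```

For n = 256 the overhead is a noticeable fraction of a bit per symbol, while for n on the order of a million it is essentially negligible, which mirrors the thesis's point that redundancy matters chiefly for small and moderate length sequences.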

Analytic Information Theory

Author: Michael Drmota
Publisher: Cambridge University Press
Total Pages: 381
Release: 2023-08-31
Genre: Computers
ISBN: 1108474446


Explores problems of information and learning theory, using tools from analytic combinatorics to analyze precise behavior of source codes.

Information and Life

Author: Gérard Battail
Publisher: Springer Science & Business Media
Total Pages: 264
Release: 2013-07-30
Genre: Science
ISBN: 9400770405


Communication, one of the most important functions of life, occurs at any spatial scale from the molecular one up to that of populations and ecosystems, and at any time scale from that of fast chemical reactions up to that of geological ages. Information theory, the mathematical science of communication initiated by Shannon in 1948, has been very successful in engineering but has been largely ignored by biologists. This book aims at bridging this gap. It proposes an abstract definition of information, based on the engineers' experience, which makes it usable in the life sciences. It expounds information theory and its by-product, error-correcting codes, as simply as possible. Then the fundamental biological problem of heredity is examined. It is shown that biology does not adequately account for the conservation of genomes over geological ages, which can be understood only if it is assumed that genomes are made resilient to casual errors by proper coding. Moreover, the good conservation of very old parts of genomes, like the HOX genes, implies that the assumed genomic codes have a nested structure which makes information the more resilient to errors, the older it is. The consequences that information theory draws from these hypotheses match very basic but as yet unexplained biological facts, e.g., the existence of successive generations, that of discrete species, and the trend of evolution towards complexity. Being necessarily inscribed on physical media, information appears as a bridge between the abstract and the concrete. Recording, communicating, and using information occur exclusively in the living world. Information is thus coextensive with life and delineates the border between the living and the inanimate.

Redundancy of Lossless Data Compression for Known Sources by Analytic Methods

Author: Michael Drmota
Publisher:
Total Pages: 140
Release: 2017
Genre: Coding theory
ISBN: 9781680832853


Lossless data compression is a facet of source coding and a well-studied problem of information theory. Its goal is to find the shortest possible code from which the original data can be unambiguously recovered. Here, we focus on a rigorous analysis of code redundancy for known sources. The redundancy rate problem asks by how much the actual code length exceeds the optimal code length. We present precise analyses of three types of lossless data compression schemes, namely fixed-to-variable (FV) length codes, variable-to-fixed (VF) length codes, and variable-to-variable (VV) length codes. In particular, we investigate the average redundancy of Shannon, Huffman, Tunstall, Khodak, and Boncelet codes. These codes have succinct representations as trees, either coding or parsing trees, and we analyze some of their parameters (e.g., the average path length from the root to a leaf). Such trees can be analyzed precisely by analytic methods, also known as analytic combinatorics, in which complex analysis plays a decisive role. These tools include generating functions, the Mellin transform, Fourier series, the saddle point method, analytic poissonization and depoissonization, Tauberian theorems, and singularity analysis. The term analytic information theory has been coined to describe problems of information theory studied with analytic tools. This approach lies at the crossroads of information theory, analysis of algorithms, and combinatorics.
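To make the redundancy rate problem concrete for one of the codes named above, a minimal sketch (not from the monograph) can build a binary Huffman code for a known memoryless source and measure its average redundancy, i.e., the expected code length minus the source entropy; the helper names and the heap-based construction are this sketch's own choices:

```python
import heapq
import math
from itertools import count

def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given symbol
    probabilities, via repeated merging of the two least likely groups."""
    tiebreak = count()  # avoids comparing the symbol lists on ties
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1  # every symbol in a merged group gains one bit
        heapq.heappush(heap, (p1 + p2, next(tiebreak), syms1 + syms2))
    return lengths

def redundancy(probs):
    """Average redundancy: expected Huffman code length minus entropy."""
    lengths = huffman_lengths(probs)
    avg_len = sum(p * l for p, l in zip(probs, lengths))
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return avg_len - entropy
```

For a dyadic source such as (1/2, 1/4, 1/8, 1/8) the redundancy is exactly zero, while for a skewed source like (0.9, 0.1) it stays strictly between 0 and 1 bit — the oscillatory fine structure of this quantity as the source varies is exactly the kind of behavior the analytic methods above capture.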

DCC '95, Data Compression Conference

Author: James Andrew Storer
Publisher:
Total Pages: 528
Release: 1995
Genre: Data compression (Computer science)
ISBN:


Contains the presentations from the March 1995 conference, which was sponsored by the IEEE Computer Society Technical Committee on Computer Communications. Among the topics are hierarchical vector quantization of perceptually weighted block transforms; unbounded length contexts for PPM; quadtree-based JBIG compression; parallel algorithms for static dictionary compression; and CREW, compression with reversible embedded wavelets. Includes a poster session and abstracts from industry and NASA workshops. No subject index. Annotation copyright by Book News, Inc., Portland, OR.

Proceedings

Author:
Publisher:
Total Pages: 540
Release: 1994
Genre: Information theory
ISBN:
