arm |
|
|
bitpack.c |
We're 'MSb' endian; if we write a word but read individual bits,
then we'll read the MSb first. |
3127 |
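To make the MSb-first convention concrete, here is a minimal sketch of a bit reader that returns the most significant bit of each byte first. The `msb_reader` type and `msb_read` function are hypothetical names for illustration only; they are not the interface declared in bitpack.h.

```c
#include <stddef.h>

/* Hypothetical reader state; NOT the structure used by bitpack.c. */
typedef struct {
  const unsigned char *buf;    /* packed input bytes */
  size_t               nbytes; /* number of bytes in buf */
  size_t               bitpos; /* next bit to read, counted from the MSb of buf[0] */
} msb_reader;

/* Reads nbits (0..32) bits, most significant bit first.
   Bits past the end of the buffer are read as zero (an assumption of this sketch). */
static unsigned long msb_read(msb_reader *r, int nbits) {
  unsigned long v = 0;
  while (nbits-- > 0) {
    size_t   byte  = r->bitpos >> 3;
    int      shift = 7 - (int)(r->bitpos & 7);
    unsigned bit   = byte < r->nbytes ? (r->buf[byte] >> shift) & 1u : 0u;
    v = (v << 1) | bit;
    r->bitpos++;
  }
  return v;
}
```

With the bytes 0xA5 0x0F as input, reading 1, 3, 4, and 8 bits in turn yields 1, 2, 5, and 0x0F.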
bitpack.h |
Custom bitpacker implementations. |
2652 |
config.h |
config.h. Generated from config.h.in by configure. |
3053 |
dct.h |
Definitions shared by the forward and inverse DCT transforms. |
1303 |
decinfo.c |
Only used for fuzzing. |
9381 |
decint.h |
Decoder-specific accelerated functions. |
6873 |
decode.c |
No post-processing. |
106993 |
dequant.c |
Note: The caller is responsible for cleaning up any partially
constructed qinfo. |
5290 |
dequant.h |
|
1097 |
fragment.c |
Copies the fragments specified by the lists of fragment indices from one
frame to another.
_dst_frame: The reference frame to copy to.
_src_frame: The reference frame to copy from.
_ystride: The row stride of the reference frames.
_fragis: A pointer to a list of fragment indices.
_nfragis: The number of fragment indices to copy.
_frag_buf_offs: The offsets of fragments in the reference frames. |
2832 |
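The parameter list above suggests a loop of roughly the following shape. This is a sketch only: the function name `frag_copy_list` is hypothetical, and it assumes each fragment is an 8x8 block of 8-bit samples and that both frames share the same offsets and stride.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copies each listed 8x8 fragment from src_frame to
   dst_frame. Assumes 8-bit samples and 8x8 fragments. */
static void frag_copy_list(unsigned char *dst_frame,
                           const unsigned char *src_frame, int ystride,
                           const ptrdiff_t *fragis, ptrdiff_t nfragis,
                           const ptrdiff_t *frag_buf_offs) {
  for (ptrdiff_t i = 0; i < nfragis; i++) {
    /* The same offset locates the fragment in both frames. */
    ptrdiff_t            off = frag_buf_offs[fragis[i]];
    unsigned char       *dst = dst_frame + off;
    const unsigned char *src = src_frame + off;
    for (int row = 0; row < 8; row++) {
      memcpy(dst, src, 8);
      dst += ystride;
      src += ystride;
    }
  }
}
```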
huffdec.c |
Instead of storing every branching in the tree, subtrees can be collapsed
into one node, with a table of size 1<<nbits pointing directly to its
descendants nbits levels down.
This allows more than one bit to be read at a time, and avoids following all
the intermediate branches with next to no increased code complexity once
the collapsed tree has been built.
We do _not_ require that a subtree be complete to be collapsed, but instead
store duplicate pointers in the table, and record the actual depth of the
node below its parent.
This tells us the number of bits to advance the stream after reaching it.
This turns out to be equivalent to the method described in \cite{Hash95},
without the requirement that codewords be sorted by length.
If the codewords were sorted by length (so-called ``canonical-codes''), they
could be decoded much faster via either Lindell and Moffat's approach or
Hashemian's Condensed Huffman Code approach, the latter of which has an
extremely small memory footprint.
We can't use Choueka et al.'s finite state machine approach, which is
extremely fast, because we can't allow multiple symbols to be output at a
time; the codebook can and does change between symbols.
It also has very large memory requirements, which hurts cache locality.
We store the tree packed in an array of 16-bit integers (words).
Each node consists of a single word, followed consecutively by two or more
indices of its children.
Let n be the value of this first word.
This is the number of bits that need to be read to traverse the node, and
must be positive.
1<<n entries follow in the array, each an index to a child node.
If the child is positive, then it is the index of another internal node in
the table.
If the child is negative or zero, then it is a leaf node.
These are stored directly in the child pointer to save space, since they only
require a single word.
If a leaf node would have been encountered before reading n bits, then it is
duplicated the necessary number of times in this table.
Leaf nodes pack both a token value and their actual depth in the tree.
The token in the leaf node is (-leaf&255).
The number of bits that need to be consumed to reach the leaf, starting from
the current node, is (-leaf>>8).
@ARTICLE{Hash95,
author="Reza Hashemian",
title="Memory Efficient and High-Speed Search {Huffman} Coding",
journal="{IEEE} Transactions on Communications",
volume=43,
number=10,
pages="2576--2581",
month=Oct,
year=1995
} |
17786 |
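The packed tree layout described above can be walked as in the following sketch. It is an illustration under stated assumptions, not the decoder in huffdec.c: `bit_src`, `peek_bits`, `huff_decode`, `LEAF`, and `example_tree` are hypothetical names, the root is assumed to sit at index 0, and bits are assumed to be read MSb first. The example tree encodes the codewords 0, 10, 110, and 111 for tokens 10, 20, 30, and 40.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSb-first bit source; bits past the end read as zero. */
typedef struct {
  const unsigned char *buf;
  size_t               nbytes;
  size_t               bitpos;
} bit_src;

/* Returns the next n bits without consuming them. */
static unsigned peek_bits(const bit_src *s, int n) {
  unsigned v = 0;
  for (int i = 0; i < n; i++) {
    size_t   pos  = s->bitpos + (size_t)i;
    size_t   byte = pos >> 3;
    unsigned bit  = byte < s->nbytes ? (s->buf[byte] >> (7 - (pos & 7))) & 1u : 0u;
    v = (v << 1) | bit;
  }
  return v;
}

/* A leaf packs its depth below the current node and its token value. */
#define LEAF(token, depth) (-(((depth) << 8) | (token)))

/* Root reads 2 bits: 00 and 01 both hit the depth-1 leaf (duplicated),
   10 is a depth-2 leaf, 11 points at the internal node at index 5. */
static const int16_t example_tree[8] = {
  2, LEAF(10, 1), LEAF(10, 1), LEAF(20, 2), 5,
  1, LEAF(30, 1), LEAF(40, 1)
};

/* Decodes one token, advancing the bit position by the codeword length. */
static int huff_decode(const int16_t *tree, bit_src *s) {
  int node = 0;
  for (;;) {
    int n     = tree[node];                               /* bits to read here */
    int child = tree[node + 1 + (int)peek_bits(s, n)];
    if (child <= 0) {
      int leaf = -child;
      s->bitpos += (size_t)(leaf >> 8);                   /* actual depth */
      return leaf & 255;                                  /* token value */
    }
    s->bitpos += (size_t)n;
    node = child;
  }
}
```

Decoding the bytes 0x5B 0x80 with `example_tree` yields the tokens 10, 20, 30, 40 and consumes 9 bits. Because the depth-1 leaf is duplicated in both root slots that begin with a 0 bit, the lookup always reads two bits but advances by only one for that codeword.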
huffdec.h |
|
1320 |
huffman.h |
The range of valid quantized DCT coefficient values.
VP3 used 511 in the encoder, but the bitstream is capable of 580. |
2536 |
idct.c |
Performs an inverse 8-point Type-II DCT.
The output is scaled by a factor of 2 relative to the orthonormal version of
the transform.
_y: The buffer to store the result in.
Data will be placed in every 8th entry (e.g., in a column of an 8x8
block).
_x: The input coefficients.
The first 8 entries are used (e.g., from a row of an 8x8 block). |
11371 |
info.c |
This is more or less the same as strncasecmp, but that doesn't exist
everywhere, and this is a fairly trivial function, so we include it.
Note: We take advantage of the fact that we know _n is less than or equal to
the length of at least one of the strings. |
3954 |
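A case-insensitive prefix comparison of the kind described above might look like the sketch below. The name `ci_ncmp` is hypothetical and this is not the function in info.c. It leans on the same precondition: because n is at most the length of one of the strings, a shorter string's terminating NUL always mismatches before the loop can run past it, so no explicit end-of-string check is needed.

```c
#include <ctype.h>

/* Hypothetical sketch: compares the first n characters of s1 and s2,
   ignoring case. Requires n <= strlen(s1) or n <= strlen(s2). */
static int ci_ncmp(const char *s1, const char *s2, int n) {
  for (int i = 0; i < n; i++) {
    int c1 = toupper((unsigned char)s1[i]);
    int c2 = toupper((unsigned char)s2[i]);
    if (c1 != c2) return c1 < c2 ? -1 : 1;
  }
  return 0;
}
```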
internal.c |
A map from the index in the zig zag scan to the coefficient number in a
block.
All zig zag indices beyond 63 are sent to coefficient 64, so that zero runs
past the end of a block in bogus streams get mapped to a known location. |
5973 |
internal.h |
Disable missing EMMS warnings. |
4093 |
mathops.h |
Note the casts to (int) below: this prevents OC_CLZ{32|64}_OFFS from
"upgrading" the type of an entire expression to an (unsigned) size_t. |
5580 |
ocintrin.h |
Some common macros for potential platform-specific optimization. |
6246 |
quant.c |
The maximum output of the DCT with +/- 255 inputs is +/- 8157.
These minimum quantizers ensure the result after quantization (and after
prediction for DC) will be no more than +/- 510.
The tokenization system can handle values up to +/- 580, so there is no need
to do any coefficient clamping.
I would rather have allowed smaller quantizers and had to clamp, but these
minimums were required when constructing the original VP3 matrices and have
been formalized in the spec. |
5635 |
quant.h |
Maximum scaled quantizer value. |
1216 |
state.c |
The function used to fill in the chroma plane motion vectors for a macro
block when 4 different motion vectors are specified in the luma plane.
This version is for use with chroma decimated in the X and Y directions
(4:2:0).
_cbmvs: The chroma block-level motion vectors to fill in.
_lbmvs: The luma block-level motion vectors. |
47423 |
state.h |
A single quadrant of the map from a super block to fragment numbers. |
22114 |
x86 |
|
|
x86_vc |
|
|