Andre Noll [Thu, 26 Aug 2021 19:39:31 +0000 (21:39 +0200)]
mp4: Make most loop variables unsigned.
If the loop variable iterates from zero to some number stored in a
variable of unsigned type, the loop variable should be of the same
unsigned type. This was not always the case, and if it was, the loop
variable was sometimes called i, which is confusing because i usually
indicates a signed quantity.
Quoting Andrew Morton:
Doing "unsigned i;" is an act of insane vandalism, punishable by
spending five additional years coding in fortran.
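As a minimal sketch of the convention, assuming a hypothetical table
structure (struct table and table_sum() are not taken from mp4.c), the
loop variable gets the same unsigned type as the bound and is called n
rather than i:

```c
#include <stdint.h>

/* Hypothetical table whose length is stored in an unsigned 32 bit field. */
struct table {
	uint32_t count;
	const uint32_t *values;
};

static uint64_t table_sum(const struct table *t)
{
	uint64_t sum = 0;

	/* n has the same unsigned type as the upper bound t->count. */
	for (uint32_t n = 0; n < t->count; n++)
		sum += t->values[n];
	return sum;
}
```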
Andre Noll [Thu, 26 Aug 2021 17:57:45 +0000 (19:57 +0200)]
mp4: Replace the five tag value functions by a single one.
It's easier to let the caller pass the tag item string than to
have one caller for each of the five tags of interest. This commit
renames meta_find_by_name() to mp4_get_tag_value(), makes it public
and removes its five callers from mp4.c.
The only user is _aac_afh_get_taginfo() of aac_afh.c, which needs to
be adjusted accordingly. Kill the pointless underscore while at it.
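A single accessor that takes the tag item string could look roughly like
the sketch below. The structures and the name get_tag_value() are
hypothetical stand-ins, not the actual definitions of mp4.c. The value
is returned as a heap-allocated copy which the caller frees.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical tag storage, not the real layout used in mp4.c. */
struct tag {
	const char *item;
	const char *value;
};

struct meta {
	unsigned count;
	const struct tag *tags;
};

/* Return a heap-allocated copy of the value of the given item, or NULL. */
static char *get_tag_value(const struct meta *m, const char *item)
{
	for (unsigned n = 0; n < m->count; n++) {
		if (strcmp(m->tags[n].item, item) != 0)
			continue;
		size_t len = strlen(m->tags[n].value) + 1;
		char *copy = malloc(len);
		if (copy)
			memcpy(copy, m->tags[n].value, len);
		return copy;
	}
	return NULL; /* no such tag item */
}
```

With this shape the caller passes "artist", "title" and so on instead of
calling one wrapper function per tag.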
Andre Noll [Thu, 26 Aug 2021 17:20:49 +0000 (19:20 +0200)]
mp4: Provide proper error codes for all errors.
This changes the few remaining places where we return -1 to indicate
failure by proper error codes which can be turned into a meaningful
error message.
Andre Noll [Thu, 26 Aug 2021 16:36:00 +0000 (18:36 +0200)]
mp4: Remove tracks array.
The mp4 structure currently contains an array of 1024 track pointers
which are initialized to point to track structures allocated as we
encounter tracks. This is kind of wasteful given that we will only
care about audio tracks, and only ever consider the first one.
This patch replaces the pointer array by a single track structure
embedded within struct mp4. Besides the above mentioned memory savings,
this approach allows us to remove a bunch of identical sanity checks
in the atom parsers.
The old code maintained the ->audio_track pointer of struct mp4 to
tell whether we already saw an mp4a atom and thus already allocated
a structure for the corresponding track. We now use a state based
approach with three states instead. The state value determines whether
we have to parse the atom. The first state transition takes place when
the mp4a atom is encountered while the second transition occurs at the
subsequent trak atom, if any. If an atom parser is called while the
state machine is in an unexpected state, we return success rather than
an error code to ignore the atom without failing the whole operation.
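The state based approach can be sketched as follows. The state names,
struct parser and the three functions are assumptions made for
illustration. The description above only fixes the number of states,
the two transitions and the "return success to ignore" rule.

```c
/* Hypothetical state names; only the count of three comes from the text. */
enum audio_track_state {
	ATS_INITIAL,      /* no mp4a atom seen so far */
	ATS_SEEN_MP4A,    /* we are inside the first audio track */
	ATS_TRACK_CHANGE, /* a subsequent trak atom was encountered */
};

struct parser {
	enum audio_track_state state;
	unsigned parsed; /* number of atoms actually parsed */
};

/* First transition: takes place when the mp4a atom is encountered. */
static void on_mp4a(struct parser *p)
{
	if (p->state == ATS_INITIAL)
		p->state = ATS_SEEN_MP4A;
}

/* Second transition: occurs at the subsequent trak atom, if any. */
static void on_trak(struct parser *p)
{
	if (p->state == ATS_SEEN_MP4A)
		p->state = ATS_TRACK_CHANGE;
}

/* Stand-in for an atom parser which only applies to the audio track. */
static int parse_audio_atom(struct parser *p)
{
	if (p->state != ATS_SEEN_MP4A)
		return 1; /* unexpected state: ignore the atom, do not fail */
	p->parsed++;
	return 1;
}
```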
Andre Noll [Tue, 24 Aug 2021 19:37:06 +0000 (21:37 +0200)]
mp4: Introduce skip_bytes().
We often call one of the read_intX() helpers with a NULL result pointer
just to move the file position forward. Calling ->seek() with whence
set to SEEK_CUR is simpler and has the advantage that this operation
cannot fail. If we happen to seek beyond EOF, the next read will
return EOF and we'll abort then.
This patch provides the skip_bytes() helper and replaces all
read_intX(f, NULL) calls by calls to skip_bytes() and removes the
error checking.
Due to this cleanup, read_int8(), read_int24() and read_u24_be()
(the latter being an inline function defined in portable_io.h) have
become unused, so remove these as well.
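A helper of this kind might look like the sketch below. The callback
structure and fd_seek() are hypothetical. Only the idea of calling
->seek() with SEEK_CUR, and of aborting in the wrapper if lseek(2)
fails, is taken from the descriptions in this log.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback structure; only the seek member matters here. */
struct callbacks {
	off_t (*seek)(void *user_data, off_t offset, int whence);
	void *user_data;
};

/* Trivial lseek(2) wrapper which aborts on failure. */
static off_t fd_seek(void *user_data, off_t offset, int whence)
{
	off_t ret = lseek(*(int *)user_data, offset, whence);

	if (ret == (off_t)-1)
		abort();
	return ret;
}

/* Move the file position forward without reading anything. */
static void skip_bytes(const struct callbacks *cb, off_t num_skip)
{
	cb->seek(cb->user_data, num_skip, SEEK_CUR);
}
```

Since skip_bytes() returns void, callers have nothing to check. Seeking
beyond EOF is not an error for a regular file. It only shows up at the
next read.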
Andre Noll [Tue, 24 Aug 2021 18:47:03 +0000 (20:47 +0200)]
mp4: Provide whence parameter for the seek callback.
This adds a parameter to make ->seek() work like the lseek(2) system
call. This is easy to implement in both the memory-mapped callback
case used to retrieve the file information and the metadata update
case where ->seek() is a trivial wrapper for lseek(2).
With the additional functionality in place we don't need to track
the file size and the current file offset any more in mp4.c as these
values can now be obtained by calling ->seek() with a zero offset and
whence set to SEEK_END and SEEK_CUR, respectively. This also makes
the code more robust against corrupt mp4 files because we no longer
rely on the values from the atom headers to compute the file size.
The way mp4.c calls ->seek() should never cause the underlying lseek(2)
system call to fail. Therefore it suffices to check the return value
only in the callback wrapper and abort on failure.
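For the memory-mapped case, an lseek(2)-like callback and the two
queries mentioned above can be sketched like this. struct map_info,
map_seek(), get_offset() and get_size() are hypothetical names, and the
sketch omits bounds checking.

```c
#include <stddef.h>
#include <stdio.h>     /* SEEK_SET, SEEK_CUR, SEEK_END */
#include <sys/types.h> /* off_t */

/* Hypothetical state for the memory-mapped case. */
struct map_info {
	const char *map;
	size_t size;
	off_t offset;
};

/* Works like lseek(2): returns the resulting offset. */
static off_t map_seek(void *user_data, off_t offset, int whence)
{
	struct map_info *mi = user_data;

	switch (whence) {
	case SEEK_SET: mi->offset = offset; break;
	case SEEK_CUR: mi->offset += offset; break;
	case SEEK_END: mi->offset = (off_t)mi->size + offset; break;
	}
	return mi->offset;
}

/* Current offset: zero offset with whence set to SEEK_CUR. */
static off_t get_offset(struct map_info *mi)
{
	return map_seek(mi, 0, SEEK_CUR);
}

/* File size: zero offset with SEEK_END, then restore the old position. */
static off_t get_size(struct map_info *mi)
{
	off_t pos = get_offset(mi);
	off_t size = map_seek(mi, 0, SEEK_END);

	map_seek(mi, pos, SEEK_SET);
	return size;
}
```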
Andre Noll [Tue, 24 Aug 2021 12:49:16 +0000 (14:49 +0200)]
mp4: Implement error checking for the write path.
Although the ->write callback has a return value, it is of unsigned
type and is never checked. Fix this by changing the prototype to
match that of the write(2) system call, check the return value of the
callback in the write_data() wrapper of mp4.c and propagate paraslash
error codes back to aac_afh.c via the public mp4_meta_update().
While at it, handle short writes and EINTR properly, and fix the
indentation of the callback structure in mp4.h.
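Handling short writes and EINTR usually boils down to a loop like the
one sketched below. The name write_all() is hypothetical, and -errno is
used as a stand-in for the paraslash error codes, which are not shown
in this log.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer. Returns 0 on success, -errno on failure. */
static int write_all(int fd, const void *buf, size_t size)
{
	const char *p = buf;

	while (size > 0) {
		ssize_t ret = write(fd, p, size);

		if (ret < 0) {
			if (errno == EINTR)
				continue; /* interrupted, just try again */
			return -errno;
		}
		/* Possibly a short write: advance and write the rest. */
		p += ret;
		size -= (size_t)ret;
	}
	return 0;
}
```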
Andre Noll [Mon, 23 Aug 2021 19:30:14 +0000 (21:30 +0200)]
mp4: Merge write_int32() into mp4_meta_update().
It has only this single caller, and it's short. Use uint8_t instead
of int8_t for the buffer as we do elsewhere and rename the buffer
variable while at it.
Andre Noll [Mon, 23 Aug 2021 18:18:54 +0000 (20:18 +0200)]
mp4: Avoid camel case for members of struct mp4_track.
Only three members of struct mp4_track are in camel case while all others
follow the underscore convention, which is the standard coding style
of the paraslash code base. Let's be consistent here.
Add comments which indicate the origin of the values stored while
at it.
Andre Noll [Mon, 23 Aug 2021 18:12:30 +0000 (20:12 +0200)]
mp4: Kill fix_byte_order_32().
All quantities stored in mp4 files are in big endian format. There's
no reason to "fix" anything, just write out the 32 bit numbers using
write_u32_be().
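Writing a 32 bit number in big endian format needs no knowledge of the
host byte order, as the sketch below illustrates. The name put_u32_be()
is hypothetical. The signature of the real write_u32_be() in
portable_io.h may differ.

```c
#include <stdint.h>

/* Store val at p, most significant byte first. */
static inline void put_u32_be(uint8_t *p, uint32_t val)
{
	p[0] = (val >> 24) & 0xff;
	p[1] = (val >> 16) & 0xff;
	p[2] = (val >> 8) & 0xff;
	p[3] = val & 0xff;
}
```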
Andre Noll [Mon, 23 Aug 2021 14:37:33 +0000 (16:37 +0200)]
mp4: Avoid duplicating the list of atoms.
A little cpp magic can do wonders in this regard. The new
atom_name_to_type() should also be more efficient because we replaced
four 8-bit comparisons by one 32-bit comparison.
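One way to get both the enum and the lookup function from a single list
is the "X macro" technique sketched below. The list is a hypothetical
subset, and the macro names are assumptions. The actual macros in mp4.c
may look different.

```c
#include <stdint.h>

/* Hypothetical subset of the atom list, each name given as four bytes. */
#define ATOM_ITEMS \
	ATOM_ITEM(MOOV, 'm', 'o', 'o', 'v') \
	ATOM_ITEM(TRAK, 't', 'r', 'a', 'k') \
	ATOM_ITEM(MDIA, 'm', 'd', 'i', 'a') \
	ATOM_ITEM(STSZ, 's', 't', 's', 'z')

/* First expansion of the list: the enum. */
#define ATOM_ITEM(_name, a, b, c, d) ATOM_ ## _name,
enum atom {ATOM_ITEMS ATOM_UNKNOWN};
#undef ATOM_ITEM

/* Pack four bytes into a single 32 bit value. */
#define ATOM_VALUE(a, b, c, d) \
	(((uint32_t)(uint8_t)(a) << 24) | ((uint32_t)(uint8_t)(b) << 16) \
	| ((uint32_t)(uint8_t)(c) << 8) | (uint32_t)(uint8_t)(d))

/* Second expansion: one 32 bit comparison per known atom. */
static enum atom atom_name_to_type(const uint8_t *p)
{
	uint32_t val = ATOM_VALUE(p[0], p[1], p[2], p[3]);

#define ATOM_ITEM(_name, a, b, c, d) \
	if (val == ATOM_VALUE(a, b, c, d)) \
		return ATOM_ ## _name;
	ATOM_ITEMS
#undef ATOM_ITEM
	return ATOM_UNKNOWN;
}
```

Adding an atom then means adding one line to ATOM_ITEMS.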
Andre Noll [Mon, 23 Aug 2021 14:24:38 +0000 (16:24 +0200)]
mp4: Merge parse_leaf_atom() into parse_sub_atoms().
This gets rid of the distinction between atoms with and without
subatoms, which was confusing because some atoms "without" subatoms
in fact do contain subatoms, we just did not want to parse them
recursively in parse_sub_atoms().
With this weirdness gone, we may move on to simplify the atoms enum and
atom_name_to_type() further, but this is left to a subsequent patch.
Andre Noll [Mon, 23 Aug 2021 14:18:17 +0000 (16:18 +0200)]
mp4: Simplify parse_sub_atoms().
This converts the while loop into a for loop and replaces the
counted_size variable by "dest" to clarify the loop structure. We
also move the two 8-bit variables into the loop as they are only used
there and skip their pointless initializations.
Andre Noll [Sun, 22 Aug 2021 21:08:36 +0000 (23:08 +0200)]
mp4: Use automatic numbering for atom enum.
The exact numbers of the ATOM enum are irrelevant. The only
thing which matters is the distinction between atoms we are only
interested in because they contain subatoms we care about and atoms
for which there is a corresponding read_xxx() parser.
Andre Noll [Sun, 22 Aug 2021 20:49:13 +0000 (22:49 +0200)]
mp4: Remove unused atoms.
The enum and atom_name_to_type() still know about a lot of atoms we
don't care about. These only clutter up the code and slow things down,
so drop them.
Andre Noll [Sun, 22 Aug 2021 19:16:48 +0000 (21:16 +0200)]
mp4: Kill membuffer API.
Thanks to the previous cleanups, create_ilst() is the last remaining
membuffer user. Since the size of the ilst atom can be computed as the
sum of the tag lengths plus a constant times the number of tag items,
we can allocate a suitably sized buffer up-front instead of relying
on the membuffer framework to allocate and resize buffers as needed.
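The size computation can be sketched as below. The per-item constant of
24 bytes is an assumption based on the common iTunes-style layout (an 8
byte item header followed by a 16 byte data atom header). struct tag
and ilst_payload_size() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Assumed overhead per tag item: item header plus data atom header. */
#define ITEM_OVERHEAD 24

/* Hypothetical tag item representation. */
struct tag {
	const char *item;
	const char *value;
};

/* Sum of the tag lengths plus a constant times the number of items. */
static size_t ilst_payload_size(const struct tag *tags, unsigned count)
{
	size_t size = 0;

	for (unsigned n = 0; n < count; n++)
		size += ITEM_OVERHEAD + strlen(tags[n].value);
	return size;
}
```

With the size known in advance, a single allocation suffices and no
buffer ever needs to be resized.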
Andre Noll [Sun, 22 Aug 2021 18:01:01 +0000 (20:01 +0200)]
mp4: Assume udta, meta and ilst are always present.
Under normal circumstances these atoms exist or can at least be
created by other means (e.g., by running mp4tags -a foo bar.m4a).
This patch makes mp4_open_meta() fail early if at least one of the
three atoms is missing. This allows us to remove the (never tested hence
probably buggy) code which creates these atoms.
Andre Noll [Sat, 21 Aug 2021 15:56:17 +0000 (17:56 +0200)]
mp4: Eliminate duplication between the two open functions.
The only difference between mp4_open_read() and mp4_open_meta()
is that they pass different values for the meta_only flag to
parse_root_atoms(). We can avoid some duplication by moving the
common code to parse_root_atoms(). Rename that function to open_file()
because it now does more than just parsing atoms.
The patch also changes the prototype of both public open functions
to return an integer error code in addition to the pointer to an mp4
structure. This allows us to gradually improve the error diagnostics.
Andre Noll [Sat, 21 Aug 2021 14:23:05 +0000 (16:23 +0200)]
mp4: Remove find_atom() and find_atom_v2().
During mp4_open_meta() we encounter the ILST, META and UDTA atoms
but don't record the size and the location of these atoms. Doing
so allows us to use this information later in mp4_meta_update()
instead of calling find_atom() or find_atom_v2() to search the file
again. This removes some ugly code and speeds up the operation.
Andre Noll [Sat, 21 Aug 2021 11:30:02 +0000 (13:30 +0200)]
mp4: Get rid of find_standard_meta().
We don't need a dedicated function and data structure for that. Just
open-code the logic in create_ilst() and clean up this function a bit
while at it. Specifically:
* Call the loop variable "n" rather than "metaptr" since it is not
a pointer but an unsigned integer.
* Abort if we encounter a tag item name which is not one of the five
standard names. This can never occur because the origin of these
strings is the code in aac_afh.c which only passes standard names.
* Drop the integer return value, since the function can never
fail. Make it return the buffer pointer instead and get rid of the
corresponding parameter.
Andre Noll [Fri, 20 Aug 2021 11:38:55 +0000 (13:38 +0200)]
mp4: Simplify read_mp4a().
The single caller resets the file offset after the call, so we may
stop reading the atom after we've parsed the last field of interest,
which happens to be the sample rate.
Andre Noll [Thu, 19 Aug 2021 18:07:13 +0000 (20:07 +0200)]
mp4: Reduce atom parsing to the bare minimum.
This replaces need_parse_when_meta_only() by need_atom() which is
called from parse_sub_atoms() for both regular opens and meta-only
opens to decide if the detected atom needs to be parsed.
After this patch we skip more atoms than we used to do, speeding up
the operation for both kinds of opens.
Andre Noll [Thu, 19 Aug 2021 17:13:31 +0000 (19:13 +0200)]
mp4: Convert "meta_only" to a boolean.
Several functions receive the "meta_only" parameter to distinguish
between regular and metadata-only opens. The parameter can only be
zero or one, so use a boolean because true/false is more descriptive
than 1/0.
Andre Noll [Thu, 19 Aug 2021 17:06:10 +0000 (19:06 +0200)]
mp4: Simplify parse_atoms().
We are only interested in subatoms of the moov atom, so skip everything
else. Rename the function to parse_root_atoms() and remove the comment
which does not convey any information anymore.
Andre Noll [Thu, 19 Aug 2021 13:52:13 +0000 (15:52 +0200)]
mp4: Hide tracks array.
All functions of mp4.c operate on the first audio track. This
patch makes this fact implicit which allows us to remove the public
mp4_get_total_tracks() and mp4_is_audio_track(). Moreover, the track
parameter can be removed from all public functions.
If no audio track was found in the mp4 file, we now return an error
from the two public open functions of mp4.c. Otherwise, we maintain a
pointer to the first audio track within the mp4 structure and use
that to identify the track rather than letting the API users pass
the track number.
Andre Noll [Wed, 18 Aug 2021 21:15:07 +0000 (23:15 +0200)]
mp4: Simplify chunk_of_sample().
This function was unnecessarily complex. The equivalent replacement
code is much shorter and easier to read. Besides reducing the number
of local variables, we drop the chunk_sample parameter and return
this number via the return value of the function.
Andre Noll [Wed, 18 Aug 2021 18:59:25 +0000 (20:59 +0200)]
mp4: Provide return value for mp4_set_sample_position().
This function fails if the given parameters are invalid. Detect this
and return EINVAL in this case. Add corresponding error checking to
the aac audio format handler.
Andre Noll [Wed, 18 Aug 2021 16:08:38 +0000 (18:08 +0200)]
mp4: Remove ->error of struct mp4.
It's easier to have track_add(), the only function which sets ->error,
return an integer error code instead. Since track_add() is simple
and is only called by parse_sub_atoms(), open-code the logic there.
Also, don't reset ->total_tracks on errors because this leads to a
memory leak, don't increase the track counter on errors and remove
the comment which only states what is obvious.
Andre Noll [Wed, 18 Aug 2021 15:08:08 +0000 (17:08 +0200)]
mp4: Add error checking to parse_atoms() and friends.
After this patch read errors are propagated all the way up from the
read_data() primitive to the public entry functions mp4_open_read()
and mp4_open_meta().
Andre Noll [Wed, 18 Aug 2021 14:59:35 +0000 (16:59 +0200)]
mp4: Add error checking for atom_read().
While the individual atom parsers all perform error checking and
return an error code, their caller, atom_read(), ignores errors.
Address this shortcoming, simplify the function by using a switch
instead of an if-else chain and move the descriptions of the atoms
to the enum where they belong.
Andre Noll [Wed, 18 Aug 2021 13:25:17 +0000 (15:25 +0200)]
mp4: Improve handling of read errors.
Currently read_data() of mp4.c is an atrocious mess. The ->read()
callback is defined to return uint32_t, but the return value is
stored in a signed 32 bit integer. Moreover, read_data() contains a
dead store, it handles neither short nor interrupted reads correctly,
and it moves the file position backwards on errors.
While this is easy to fix, a more intricate problem is that most
callers of read_data(), including all read_intX() helpers, ignore the
return value of read_data() and return uninitialized stack contents in
the error case. This is kind of dealt with by the ->read_error member
of struct mp4, but this is no more than a kludge, which, according to
the comments, was applied after several CVEs had been filed against
the library.
Let's DTRT here, even though it adds a fair amount of new code:
Check the return value of each read operation and fail early on errors.
We have to distinguish three cases: error, EOF, and success, encoded
as return values -1, 0 and 1, respectively. This commit converts most
functions which read from an mp4 file to this convention. More work
is required as return values are not checked everywhere yet. This was
left for subsequent commits to keep the already large patch within
reasonable size.
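The three-valued convention can be sketched as follows, using a
hypothetical in-memory source in place of the ->read() callback. The
function names mirror those mentioned above, but the bodies are
illustrative only.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory source standing in for the read callback. */
struct src {
	const uint8_t *buf;
	size_t size;
	size_t pos;
};

/* Returns 1 on success, 0 on EOF, -1 on error. */
static int read_data(struct src *s, uint8_t *dest, size_t size)
{
	if (!s->buf)
		return -1; /* stand-in for a failing read */
	if (s->size - s->pos < size)
		return 0; /* not enough data left: EOF */
	memcpy(dest, s->buf + s->pos, size);
	s->pos += size;
	return 1;
}

/* Same convention. *result is only written on success. */
static int read_int32(struct src *s, uint32_t *result)
{
	uint8_t data[4];
	int ret = read_data(s, data, sizeof(data));

	if (ret <= 0)
		return ret; /* fail early, never use uninitialized data */
	*result = ((uint32_t)data[0] << 24) | ((uint32_t)data[1] << 16)
		| ((uint32_t)data[2] << 8) | data[3];
	return 1;
}
```

Callers check for ret <= 0 and pass the value on, so both EOF and
errors reach the top-level function unchanged.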
Since we don't rely on ->read_error of struct mp4 any more, it can
be removed.
Andre Noll [Sat, 14 Aug 2021 21:14:38 +0000 (23:14 +0200)]
mp4: Clean up membuffer_transfer_from_file().
The buffer pointer can never be NULL, so drop this check. Next, instead
of defining a void * pointer and casting it to char *, use char * directly.
Finally, the cast to unsigned has no effect, so drop it.
Andre Noll [Sat, 14 Aug 2021 20:40:00 +0000 (22:40 +0200)]
mp4: Simplify membuffer_create().
Since para_malloc() never returns NULL, the error state can only be
zero. Use para_calloc(), skip the zero initializations and kill a
pointless local variable.
Andre Noll [Sat, 14 Aug 2021 20:27:12 +0000 (22:27 +0200)]
mp4: Check return value of membuffer_transfer_from_file().
This function calls the ->read() method of the callback, which may
fail. Currently all three callers ignore the return value and rely
on the fact that the membuffer is set to error state, which will be
detected later.
It's easier and clearer to check for errors in the callers and fail
early on read errors. Since the membuffer is useless in the error
case, free it right away in membuffer_transfer_from_file(). Change
the function to return bool instead of unsigned while at it and remove
a pointless cast in one of its callers.
Andre Noll [Sat, 14 Aug 2021 18:53:15 +0000 (20:53 +0200)]
mp4: Drop integer return type from modify_moov().
This function returns either zero or one to indicate success. On
success, a pointer to a buffer and the buffer size are returned. It
is simpler and less redundant to indicate failure by returning a NULL
buffer pointer. Rather than using a void ** argument for the buffer,
let the function return void *.
Andre Noll [Sat, 14 Aug 2021 18:21:51 +0000 (20:21 +0200)]
mp4: Merge mp4_close() and tag_delete().
The latter is only called by the former, and both are short enough.
Don't bother to zero out meta->tags and meta->count because we free
the containing mp4 structure as well.
Andre Noll [Sat, 14 Aug 2021 17:38:21 +0000 (19:38 +0200)]
mp4: Clean up membuffer_write_std_tag().
Remove the check for the compilation flag since we never pass "cpil"
to this function. Remove the flags variable whose value is now always
one. Introduce a variable for the string length instead of calling
strlen() three times, and unify the way comments are formatted.
Andre Noll [Sat, 14 Aug 2021 17:22:17 +0000 (19:22 +0200)]
mp4: Clean up find_standard_meta().
Use ARRAY_SIZE() instead of open-coding it, move the stdmetas array
into the function since it is only used there, and make it const.
Also replace 0 by NULL, since the function returns a pointer, and
remove the pointless comment.
Finally, move the function and the declaration of the stdmeta_entry
structure closer to its single user.
Andre Noll [Sat, 14 Aug 2021 17:15:29 +0000 (19:15 +0200)]
mp4: Simplify create_ilst().
This function contains a lot of overhead which is just dead code
for paraslash since we only care about five standard tags. In
particular, we never write custom tags. Removing the single caller
of membuffer_write_custom_tag() left a whole bunch of other functions
and data structures unused, so these can be removed as well.
Andre Noll [Sat, 14 Aug 2021 17:00:06 +0000 (19:00 +0200)]
mp4: Call metadata structures "meta".
Currently they are called "tags" or "data", both of which are confusing
because struct mp4_metadata has a member called "tags", and "data"
is also used for generic buffers in the various I/O helpers.
Andre Noll [Sat, 14 Aug 2021 15:53:59 +0000 (17:53 +0200)]
mp4: Simplify and speed up metadata editing.
Currently the aac audio format handler first calls mp4_open_meta()
to get the metadata tags, then alters the in-memory structure of the
tags according to the command line options and passes this modified
structure to mp4_meta_update() to rewrite the tags. This latter call
parses the tags again, which is unnecessary overhead.
This patch changes the signature of mp4_meta_update() to accept an
mp4 structure instead of a callback structure and uses that instead
of re-opening the file.
Andre Noll [Fri, 13 Aug 2021 18:51:38 +0000 (20:51 +0200)]
mp4: Simplify and doxify meta tag accessors.
The integer return value is redundant, so get rid of the value
parameter and simplify meta_find_by_name() accordingly. Document that
tag values are allocated on the heap and should be freed by the caller.
Andre Noll [Fri, 13 Aug 2021 18:19:43 +0000 (20:19 +0200)]
mp4: Simplify parse_tag().
We don't care about arbitrarily named tags, and those tags we're
interested in are generally present in the form of the standard tags
(ATOM_TITLE, ATOM_ARTIST etc.).
Since we now always call get_metadata_name() to get the string
representation of the tag, we don't need to make a copy any more,
just pass the const pointer directly to tag_add_field().
With this change in place it is obvious that we never pass a NULL or
empty tag name to tag_add_field(), and we don't pass a NULL pointer
for the value argument either, so remove the safety check.
Andre Noll [Wed, 11 Aug 2021 20:52:11 +0000 (22:52 +0200)]
mp4: Rename and simplify set_metadata_name().
This function is an atrocious mess. For one, the naming is confusing
because the function does not set the atom name, it *returns* it.
More importantly, the function defines a static array for no good
reason and then hard-codes the array indices in a large switch
statement.
It's much easier to simply return a pointer to a string literal and
perform the strdup operation in the single caller.
Andre Noll [Wed, 11 Aug 2021 19:45:12 +0000 (21:45 +0200)]
mp4: Use uniform names for callback and mp4 structures.
Currently it's a confusing mess, with callbacks called f, ff, or
stream, where the former two are also used for pointers to struct
mp4. Let's call a spade a spade and use cb everywhere for the callbacks
while f is reserved to denote an mp4 pointer.
Andre Noll [Wed, 11 Aug 2021 19:37:16 +0000 (21:37 +0200)]
mp4: Hide ->read_error.
This does not belong in the callback structure whose fields are
supposed to get initialized by the audio format handler. Move it to
the internal struct mp4 instead, next to the existing error counter.
Andre Noll [Wed, 11 Aug 2021 17:26:00 +0000 (19:26 +0200)]
mp4: Don't parse the esds atom any more.
With the decoder specific config no longer in use, we can simplify
mp4.c further by getting rid of some cryptic and underdocumented code
which no longer does anything useful for us.
Andre Noll [Wed, 11 Aug 2021 17:12:07 +0000 (19:12 +0200)]
mp4: Introduce mp4_is_audio_track().
Currently the aac audio format handler iterates over the tracks
in an mp4 file. For each track it tries to get the audio-specific
configuration by calling mp4_get_decoder_config() and calls into faad
to check whether it is a valid configuration for the aac decoder.
We can simplify all this because the mp4 code already knows the type
of each track, albeit it does not expose this information yet. So
provide the new mp4_is_audio_track() helper and let the aac audio
format handler pick the first track for which this helper returns true.
As an additional benefit, we can remove the now unused
mp4_get_decoder_config().
The old name is misleading because the returned file handle is by no
means opened read-only. In fact we call mp4ff_meta_update() on it,
which alters the file to store the modified metadata.
Andre Noll [Wed, 11 Aug 2021 13:41:37 +0000 (15:41 +0200)]
mp4: Introduce mp4ff_get_duration().
This allows us to get rid of an ugly hack in aac_afh.c where we peeked
at the audio-specific config structure to get the scaling factor which
was needed to compute the duration.
Andre Noll [Tue, 10 Aug 2021 16:57:42 +0000 (18:57 +0200)]
mp4: Remove const qualifier from non-pointer function arguments.
In contrast to the pointer case, it's generally not very interesting
to know whether a function will modify the automatic variable which
corresponds to a non-pointer argument.
Andre Noll [Mon, 9 Aug 2021 22:14:53 +0000 (00:14 +0200)]
mp4: Reduce indentation of create_ilst().
This function contained an indented block for no good reason. No real
change except that a few variables are exposed to the code below the
former block, but this is not a problem.
Andre Noll [Mon, 9 Aug 2021 21:13:35 +0000 (23:13 +0200)]
mp4: Use read/write functions from portable_io.h.
This removes quite a bit of ugly code. In particular, atom_get_size()
is completely unnecessary and can be removed. Since there is no
function in portable_io.h to read a 24 bit integer, we have to add
read_u24_be().
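A 24 bit big endian read is a three byte variant of the usual shift and
or pattern. The sketch below uses the hypothetical name get_u24_be()
because the exact signature of read_u24_be() in portable_io.h is not
shown here.

```c
#include <stdint.h>

/* Combine three bytes, most significant first, into a 32 bit value. */
static inline uint32_t get_u24_be(const uint8_t *p)
{
	return ((uint32_t)p[0] << 16) | ((uint32_t)p[1] << 8) | p[2];
}
```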