The mp4 structure currently contains an array of 1024 track pointers
which are initialized to point to track structures allocated as we
encounter tracks. This is wasteful because we only care about audio
tracks, and only ever consider the first one.
This patch replaces the pointer array with a single track structure
embedded within struct mp4. Besides the above-mentioned memory savings,
this approach allows us to remove a bunch of identical sanity checks
in the atom parsers.
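As a rough illustration of the layout change, the sketch below contrasts
the two approaches. The names struct mp4_old, struct mp4_new and
MAX_TRACKS, as well as the members of struct mp4_track, are made up for
this example and do not necessarily match the actual code.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKS 1024 /* size of the old pointer array */

/* Hypothetical per-track data; the real structure has other members. */
struct mp4_track {
	uint32_t sample_rate;
	uint16_t channel_count;
};

/* Before: one pointer slot per possible track, allocated on demand. */
struct mp4_old {
	struct mp4_track *track[MAX_TRACKS];
	struct mp4_track *audio_track; /* NULL until an mp4a atom was seen */
};

/* After: the only track we care about is embedded in the structure. */
struct mp4_new {
	struct mp4_track track;
};
```

With the embedded structure there is no per-track allocation and no
pointer that could be NULL, which is why the atom parsers no longer
need to check it.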
The old code maintained the ->audio_track pointer of struct mp4 to
tell whether we already saw an mp4a atom and thus already allocated
a structure for the corresponding track. We now use a state-based
approach with three states instead. The state value determines whether
we have to parse the atom. The first state transition takes place when
the mp4a atom is encountered, while the second transition occurs at the
subsequent trak atom, if any. If an atom parser is called while the
state machine is in an unexpected state, we return success rather than
an error code so that the atom is ignored without failing the whole
operation.
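The three-state logic described above might look roughly like the
following sketch. The enum constants, the sample_count member, the
function names and the convention that 1 means success are all
assumptions made for illustration; the actual parsers may differ.

```c
#include <stdint.h>

/* Hypothetical state names for the single audio track. */
enum track_state {
	TS_INITIAL,      /* no mp4a atom seen yet */
	TS_SEEN_MP4A,    /* currently inside the first audio track */
	TS_TRACK_CHANGE, /* a later trak atom was seen, audio track is done */
};

struct mp4_track {
	enum track_state state;
	uint32_t sample_count; /* stands in for the real per-track data */
};

struct mp4 {
	struct mp4_track track; /* embedded, no allocation needed */
};

/* First transition: the mp4a atom identifies the audio track. */
static int read_mp4a(struct mp4 *f)
{
	if (f->track.state != TS_INITIAL)
		return 1; /* not the first audio track: ignore the atom */
	f->track.state = TS_SEEN_MP4A;
	return 1;
}

/* Second transition: a trak atom following mp4a ends the audio track. */
static int read_trak(struct mp4 *f)
{
	if (f->track.state == TS_SEEN_MP4A)
		f->track.state = TS_TRACK_CHANGE;
	return 1;
}

/* A typical atom parser: only act while inside the audio track. */
static int read_stsz(struct mp4 *f, uint32_t sample_count)
{
	if (f->track.state != TS_SEEN_MP4A)
		return 1; /* unexpected state: skip the atom, report success */
	f->track.sample_count = sample_count;
	return 1;
}
```

Since every parser performs the same state check up front, atoms which
belong to non-audio tracks, or to audio tracks other than the first
one, are skipped without aborting the parse.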