Topics: Question
Dec 5, 2014 at 6:28 PM
will you have depacketization for AAC in your library at some point ?
Dec 5, 2014 at 6:42 PM
Edited Dec 5, 2014 at 6:46 PM
https://tools.ietf.org/html/rfc6416 is already supported.

For audio there is no depacketization; the packets should start with the sync word 0xFF and be ready for a decoder, as far as I know.

Furthermore, I am not even sure video requires that extra data be placed in the file, as the stream should be aligned. I have included some code to add the start codes, but it will probably end up being removed because it doesn't do any good unless some objects were stripped from the stream for some reason. In that case the SDP would contain a config, which should also be read and would possibly restore the missing header info.

Hopefully that made sense, let me know if you need anything else.
Marked as answer by juliusfriedman on 12/5/2014 at 10:42 AM
Dec 5, 2014 at 7:01 PM
so can i just use Assemble as i would with PCM and send the stream to the container and let the player worry about the decoding ?
Dec 5, 2014 at 9:59 PM
Edited Dec 5, 2014 at 10:34 PM
That would be a correct assertion based on the understanding I have.

RFC6416 is for 'MP4V-ES' and is focused on video. Everything else is just an elementary stream as outlined in mpeg4-generic (RFC3640), and those streams actually have a header whose layout is specified in the SDP on the 'fmtp' line.

See RFC3640Frame.cs for the notes.

If you find you need anything or that there is a bug let me know!
Marked as answer by juliusfriedman on 12/5/2014 at 1:59 PM
Dec 5, 2014 at 10:56 PM
Do you know if Windows media player plays an aac file directly or does it need a decoder plug-in?
Dec 5, 2014 at 11:06 PM
Edited Dec 5, 2014 at 11:08 PM
I know that Windows.Media.SoundPlayer uses Windows Media Player.

Please reference http://support.microsoft.com/kb/316992 for codec support in Windows.

In short, I think you can also get the standard System.Media.SoundPlayer to work IF your device has support for the codec in question and you can properly set up the WaveFormatEx header to indicate that codec / sample rate.

If you look at the source code for Media.SoundPlayer you will see how it creates the WaveFormatEx header and then just puts the samples after it.

E.g. if you know your card supports DTS / AAC etc., you should be able to indicate that format and put the data right afterwards, and it should be decoded by the hardware and played out. Otherwise you will need to use Windows.Media.SoundPlayer and give it the depacketized data, which it will decode and then forward to the regular Media.SoundPlayer for playing automatically.

If you need anything else let me know!
Dec 5, 2014 at 11:13 PM
thanks for the info.

i tried this :
        var context = ((Media.Rtp.RtpClient)sender).GetContextByPayloadType(frame.PayloadTypeByte);

        if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
        {
            // Append the assembled RTP payload to the output file as-is.
            byte[] audioBytes = frame.Assemble().ToArray();
            using (var fs = new FileStream("C:\\temp\\test.aac", FileMode.Append))
            {
                fs.Write(audioBytes, 0, audioBytes.Length);
            }
        }

but the file is not playable. do you have any ideas what i'm missing ?.
Dec 5, 2014 at 11:32 PM
Depending on what handles playing of that file it could be several things.

Did you check the SDP to make sure its type is 'mpeg4-generic' or 'MP4A-LATM'?

You may have to use the RFC3640 class (for mpeg4-generic).

How are you trying to play the file?

Take a look at a simple example of how another library is sending the data:


My RFC3640Frame class should handle this and it may or may not need the information from the SDP.

Can you send me a Wireshark Capture of the AAC Stream?

I will try to make sure the logic is correct based on the data which comes in the packet versus what the RFC states should be there. That will indicate whether depacketization is correct and whether the info is needed from the SDP or can be read from the packet itself.

Marked as answer by juliusfriedman on 12/5/2014 at 3:32 PM
Dec 5, 2014 at 11:44 PM
Edited Dec 5, 2014 at 11:49 PM
here's the wireshark capture


and the SDP

{v=0
o=- 1417817535595464 1417817535595464 IN IP4 Presentation
e=NONE
b=AS:50064
a=control:*
a=range:npt=0.000000-
t=0
m=video 0 RTP/AVP 96
c=IN IP4,0,0;0,1,0;0,0,1
a=control:trackID=1
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1; profile-level-id=4D4029; sprop-parameter-sets=Z01AKZpmAoAy2AtQEBAQXpw=,aO48gA==
m=audio 0 RTP/AVP 97
c=IN IP4 mpeg4-generic/16000/1
a=fmtp:97 streamtype=5; profile-level-id=15; mode=AAC-hbr; config=1408; sizeLength=13; indexLength=3; indexDeltaLength=3; profile=1; bitrate=64000;}

it's of type mpeg4-generic

I'm just playing the file using Windows Media Player. I also tried the Nero AAC decoder/encoder to decode it to a WAV file, but it failed (moov box not found).
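
For reference, the values that matter for depacketizing this stream (sizeLength, indexLength, indexDeltaLength, config) can be pulled out of the audio fmtp line with plain string handling. This is only a minimal sketch and not the library's SDP parser; FmtpSketch, Parse and GetInt are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FmtpSketch
{
    // Splits "a=fmtp:97 key=value; key=value; ..." into a case-insensitive dictionary.
    public static Dictionary<string, string> Parse(string fmtpLine)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);

        // Everything after the first space is the parameter list.
        int space = fmtpLine.IndexOf(' ');
        if (space < 0) return result;

        foreach (string part in fmtpLine.Substring(space + 1).Split(';'))
        {
            string trimmed = part.Trim();
            int eq = trimmed.IndexOf('=');
            if (eq <= 0) continue; // skip empty or malformed entries

            // Only the first '=' separates key and value, so base64 values keep their padding.
            result[trimmed.Substring(0, eq).Trim()] = trimmed.Substring(eq + 1).Trim();
        }
        return result;
    }

    // Reads an integer parameter, returning 'fallback' when it is absent or not a number.
    public static int GetInt(Dictionary<string, string> parameters, string key, int fallback)
    {
        string value;
        int parsed;
        if (parameters.TryGetValue(key, out value) && int.TryParse(value, out parsed))
            return parsed;
        return fallback;
    }
}
```

With the fmtp line from the SDP above, GetInt(parameters, "sizeLength", 0) would give 13 and parameters["config"] would give "1408".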
Dec 6, 2014 at 12:25 AM
Edited Dec 6, 2014 at 12:51 AM
It seems you do need the RFC3640 class then. Give that class a try.

If it has any bugs let me know and I will fix them. One thing I am curious about is that it seems I am stripping the AU Header, which I am not sure is correct. You would think the decoder would need the AU Header because RTP surely doesn't, so if neither the decoder nor RTP needs it, why is it there in the first place?

Since that makes no sense, I imagine that maybe I have the process wrong and only the first 2 bytes or maybe the first 4 bytes get removed for depacketization, or possibly none and the data gets handed off directly to the decoder.

Play around with the RFC3640Frame class and try to see if it works when you skip / keep certain bytes.

I will also take some time tonight to work on this if possible; if not, I will get to it in the next releases, after I finish with the RtpDump changes and before I finish up the other container implementations. Thanks for making me aware of this. Let me know what you find and I will let you know what I find.


Seems to indicate you need the sdp info but looking at the packets it seems that you may also be able to extract it.

The end problem is that when there are interleaved frames they need the audio header removed; the audio header is used to signal that multiple samples appear in a single access unit.

Each fragmented AU also has a header indicating its size; if the size is fixed then this information must come from the SDP.

Then the output stream will be written for each sample in the access unit.

Currently the stream includes each sub header, but if you give the information from the SDP this should be right for the first packet. The change needs to happen when depacketization occurs: iterate the data, reading each header while there is data in the payload, then write only the samples.

I will work on this shortly.
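
As an illustration of the iteration described above, here is a minimal sketch of reading the RFC 3640 AU header section and copying out only the access units. It assumes the payload starts with the 16-bit AU-headers-length field (a length in bits), that each AU header holds only AU-size followed by AU-Index (first header) or AU-Index-delta (later headers), and that there is no auxiliary section, which matches the AAC-hbr fmtp line in this thread. AuHeaderSketch and its members are hypothetical names, not part of RFC3640Frame.

```csharp
using System;
using System.Collections.Generic;

public static class AuHeaderSketch
{
    // Reads 'count' bits (most significant bit first) starting at bit offset 'bitPos'.
    static int ReadBits(byte[] data, ref int bitPos, int count)
    {
        int value = 0;
        for (int i = 0; i < count; i++, bitPos++)
        {
            int bit = (data[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    // Returns the access units carried in one RTP payload (the payload only, not the RTP header).
    public static List<byte[]> ExtractAccessUnits(byte[] payload, int sizeLength, int indexLength, int indexDeltaLength)
    {
        var units = new List<byte[]>();
        if (payload.Length < 2 || sizeLength <= 0) return units;

        int bitPos = 0;
        int headersLengthBits = ReadBits(payload, ref bitPos, 16); // AU-headers-length
        int headersEndBit = 16 + headersLengthBits;
        int dataOffset = 2 + (headersLengthBits + 7) / 8;         // headers are padded to a byte boundary
        if (dataOffset > payload.Length) return units;

        bool first = true;
        while (bitPos + sizeLength + (first ? indexLength : indexDeltaLength) <= headersEndBit)
        {
            int auSize = ReadBits(payload, ref bitPos, sizeLength);
            ReadBits(payload, ref bitPos, first ? indexLength : indexDeltaLength); // index fields are not used here
            first = false;

            // A fragmented access unit announces its full size, but only part of it is in this packet.
            int available = Math.Min(auSize, payload.Length - dataOffset);
            var unit = new byte[available];
            Buffer.BlockCopy(payload, dataOffset, unit, 0, available);
            units.Add(unit);
            dataOffset += available;
        }
        return units;
    }
}
```

For the stream discussed here the call would be ExtractAccessUnits(payload, 13, 3, 3); reassembling fragments that span several packets is left out of the sketch.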
Marked as answer by juliusfriedman on 12/5/2014 at 4:25 PM
Dec 6, 2014 at 1:15 AM
Also looking here

It seems that bytes are needed in the output stream.

An ADTS header with the profile and other info (7 bytes, or 9 when a CRC is present)

The SDP signals the profile ID, which is why it's needed.

Those 7 bytes should then allow it to be played.

I also think that the interleaving specifies how many times you have to add that header e.g. one time for each complete Au in the frame.

I should have all the info here now. I will play with this and let you know what I find.
Marked as answer by juliusfriedman on 12/5/2014 at 5:16 PM
Dec 6, 2014 at 9:46 PM
Okay, the latest changes should work for your stream since it looks like it does not interleave the access units.

Check out the class RFC3640Frame.cs

Just curious if you have a way to change the setting on the camera to allow for interleaving so I can test that out too. The main difference is that when interleaved there is sometimes more than one access unit in each packet, and when that occurs you must consume all the bytes in the current packet, subtract from aacSize and then read the remainder from the next packet. I am curious whether the header then repeats itself or whether there is another indication in the packet that the data belongs to a previous AU.

If not let me know how this works out for you and I will adjust for interleaving when I can.

I am not sure why some people said you need the SDP info, I think what they mean is that if the SDP has a 'config=' entry it must be given to the decoder before decoding a frame along with the audio object indication and profile.

So in short you may still need to add some bytes similar to how the SPS and PPS were added for H264 but only if there is a config= present in the sdp fmtp line it seems.

Give it a try and let me know how it works for you and if you need anything else!
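
To make the 'config=' entry concrete: for mpeg4-generic it is the hex form of the MPEG-4 AudioSpecificConfig, whose first bits are a 5-bit audio object type, a 4-bit sampling frequency index and a 4-bit channel configuration. The sketch below decodes only that simple two-byte form; it assumes neither the object type escape value (31) nor an explicit sampling frequency (index 15) is used. AudioSpecificConfigSketch is a hypothetical name.

```csharp
using System;

public static class AudioSpecificConfigSketch
{
    // Sampling rates addressed by the 4-bit samplingFrequencyIndex (higher indexes are reserved or escape values).
    static readonly int[] SampleRates =
    {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };

    // Decodes the first two bytes of a hex 'config=' value, e.g. "1408".
    public static void Decode(string configHex, out int objectType, out int frequencyIndex, out int channelConfiguration)
    {
        int b0 = Convert.ToInt32(configHex.Substring(0, 2), 16);
        int b1 = Convert.ToInt32(configHex.Substring(2, 2), 16);
        int bits = (b0 << 8) | b1;

        objectType = (bits >> 11) & 0x1F;          // 5 bits: audioObjectType (2 = AAC LC)
        frequencyIndex = (bits >> 7) & 0x0F;       // 4 bits: samplingFrequencyIndex
        channelConfiguration = (bits >> 3) & 0x0F; // 4 bits: channelConfiguration
    }

    // Maps the frequency index to a rate in Hz, or 0 when the index is not in the table.
    public static int SampleRateFor(int frequencyIndex)
    {
        if (frequencyIndex < 0 || frequencyIndex >= SampleRates.Length) return 0;
        return SampleRates[frequencyIndex];
    }
}
```

For config=1408 this yields object type 2 (AAC LC), frequency index 8 (16000 Hz) and channel configuration 1, which agrees with the mpeg4-generic/16000/1 entry in the SDP.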
Marked as answer by juliusfriedman on 12/6/2014 at 1:46 PM
Dec 7, 2014 at 8:29 AM
I just updated the source again, it has more support for interleaving and probably works better than before.

You should probably now be able to get what you need by saving the packet to a file and then playing it (optionally you might also need to add the ADTS header or some configuration info from the SDP). I am still trying to confirm things on that side, as well as to ensure that the overlap in standards is correct while allowing for alternate variations of the standard for whatever reason (e.g. FlexMux or another draft) without having to derive the class.

I really think that this could have been done a lot easier and without the SDP information being required but since it is the way it is I need to just try and cover what needs to be done and then worry about procuring improvement if viable.

After all since the end result is a Decoder I think that it should have been able to accept raw depacketization e.g. the result of Assemble() and the Decoders should have had to worry about de-interleaving and Au Headers but since it's done like this it has to be tied to the SDP.

One seemingly good option would be to make the frame class take the MediaDescription, which would then call another constructor with the explicit parameters, e.g. sizeLength (which would be kept as readonly properties). That way both Packetization and Depacketization would have all the info needed when required.

Anywayz, enough banter from me.

Let me know if you find anything else out and I will keep you updated with anything I find out / change as well!
Marked as answer by juliusfriedman on 12/7/2014 at 12:30 AM
Dec 8, 2014 at 10:28 AM
I have fixed a few bugs and now I will resume working on the AAC stuff. I would definitely grab it, especially if you're using TCP.

Have you gotten anywhere with AAC Playback?

It seems that to get it to play back you may need the following data

00 00 01 (AudioObjectStart) (ProfileId) 00 00 01 (DATA)

Where DATA should be the assembled frame.

I think that DATA may have to be preceded with the data from the config= (1408) and maybe a sync word FFFF

I will play around with it some more and let you know what I find out!

Please also keep me updated.

Dec 8, 2014 at 6:51 PM
I don't see any options for the audio settings to allow interleaving, i do see the mode full duplex/half/simplex, sample rate (8,16) and bit rate.
I'm going to try your class now, i will keep you posted.

Thank you.
Dec 8, 2014 at 7:12 PM
I think it's more of an option for LATM, which is a multiplexing protocol for audio.

In DVB There is one Audio Stream which can then have multiple Sub Programs / Layers which all make up the Audio Stream.

In your case there is no LATM Multiplexing around the packets so that is what I meant by interleaved.

Also it seems that the configuration info can be included in band, this is probably the same for video also.

Also check out what VLC Looks for when decoding AAC, it seems that it handles LATM and AAC in the same context although it doesn't seem to re-order them (it probably leaves that for the decoder)


No problem and thank you!
Dec 9, 2014 at 5:30 PM
I am not able to play back the audio, maybe i am missing the header ?

here's the code i used:
          Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame hAudio = new Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame(frame);


public void saveaudio_stream(Stream fin)
        using (var fs = new FileStream(string.Format("C:\\temp\\test.aac"), FileMode.Append))
Dec 10, 2014 at 6:25 AM
I thought you were able to hear something in the other thread? Was that for PCMU?

I will take another look at this then, sorry for the confusion!

It probably does need some kind of header, what exactly I am not sure but I will find out!

I will give it a try using your dumps and see if I can extrapolate a .aac file, I will work on that tomorrow.

Thanks for keeping me updated!
Dec 10, 2014 at 5:17 PM
yes it was for pcmu (g711), that works fine but in the other thread i mentioned about the beginning being cut off and i will post the code there.
thanks !
Dec 10, 2014 at 6:47 PM
Cool, I will investigate as indicated and get back to you asap. There is probably just a bug in my code.

Thanks again for the help!
Marked as answer by juliusfriedman on 12/10/2014 at 10:47 AM
Dec 10, 2014 at 10:31 PM
Sounds good, looking forward to seeing what you found.
Dec 12, 2014 at 9:24 PM
I know you must have been very busy but i was wondering if you found anything with AAC depacketizing ?
thanks !
Dec 12, 2014 at 9:33 PM
Well, I only really just confirmed that Audacity uses FFmpeg.

I'm also looking into why anything needs to be removed at all from the RTP payload.

I think that in the end it comes down to needing an additional header to be able to detect its aac.

This header identifies the profile and any configuration required also.


So technically you should be able to take the result of Assemble, add an additional header, and have a valid AAC stream.
Marked as answer by juliusfriedman on 12/12/2014 at 1:33 PM
Dec 12, 2014 at 10:02 PM
ok let me give it another shot. I'll keep you posted !
Dec 12, 2014 at 11:45 PM
Cool, here is a link describing the format of what apparently are aac headers from various profiles.


Definitely keep me updated and thanks!
Dec 19, 2014 at 11:26 PM
Hey, I didn't forget about you!

I have just been busy!

I wanted to let you know it will probably be after Christmas before I can get to this, if you find anything definitely keep me updated!
Jan 5, 2015 at 11:42 PM
Hi, I just got back from the holiday break and ready to tackle this again. I understand you've been quite busy ! I can see many releases since i last checked :)
Let me know if you had time to look at aac but i will also keep you posted.
Thanks !
Marked as answer by juliusfriedman on 1/6/2015 at 9:00 PM
Jan 7, 2015 at 5:00 AM
Awesome, good luck!

I haven't had a lot of time to work on AAC specifically but the overall functionality has improved a bit and is going to continue to improve. I plan on de-coupling Rtp from the Rtsp client so other types of middle layer / lower layer transports can be used such as MPEG Transport Streams.

I really do want to get a good example for AAC (all of the profiles) and LATM up I just didn't have the time I needed with the holidays and some other things going on, definitely keep me updated and if I can be of any service please just let me know!
Marked as answer by juliusfriedman on 1/6/2015 at 9:00 PM
Jan 7, 2015 at 5:53 PM
sounds good - did you start on mp4 containers btw ?
Jan 7, 2015 at 6:33 PM
Edited Jan 7, 2015 at 11:18 PM
Reading the MP4 container is already supported but might need a little work in some cases.


I have to do some work to enhance the GetTracks methods to be faster and handle fragmented media.

I will also be revising the readers to Parse data they need as they are enumerated to decrease the amount of time taken to call GetTracks from the middle of a stream.

Writing to a container won't be that hard. One should just have to AddTrack from an existing IMediaContainer, which should then be able to create all of the required data for the samples of that given 'Track' in its own format.
Marked as answer by juliusfriedman on 1/7/2015 at 10:33 AM
Jan 7, 2015 at 11:28 PM
I just released 110754 which should be better overall.

I won't really have any time to dedicate to AAC but I was looking at how FFMPEG's decoder is decoding it to make sure that all the required data is available and it seems that it should be.

See https://www.ffmpeg.org/doxygen/trunk/libavcodec_2aacdec_8c_source.html @ static int decode_audio_specific_config [Line 941]

You will see that it decodes the AuHeaders so it may be best to just leave the data as is and not call Depacketize from the RFC3640Frame class because that will strip away the data that is needed by the decoder.

The question is should any data be added, and it seems that the answer is no.

The context looks for the m4ac->object_type to be recognized, this data comes from the SDP.

So it seems that Depacketize should ADD data, something like how I indicated above:

"00 00 01 (AudioObjectStart) (ProfileId) 00 00 01 (DATA)"

It would be similar to the header used for video @ RFC6416Frame Depacketize only with your ObjectId and profileId from the SDP.

After you add that header and put the data in the buffer FFMPEG / Audacity should be able to play it.

Let me know if that is not the case!

If you do get it working please be sure to update this thread so I can update the class if needed.

Let me know if you come up with anything!
Marked as answer by juliusfriedman on 1/7/2015 at 3:28 PM
Jan 7, 2015 at 11:36 PM
ok let me try it - btw, how long do you think before you get to writing to the mp4 container ?
Jan 7, 2015 at 11:54 PM
Edited Jan 7, 2015 at 11:57 PM
I could start today if I needed to, I would just have to whip up the class.

In the end it's not really that much work: if you have completed reading support and have GetSamples working, then really that's all there is to it.

What is your expectation on the use cases?

E.g. the most basic API I can imagine would have an AddSample method which took a Track and the data for the sample and its duration (offset and count also).

The writer would create an entry in the appropriate sample table for the track which contained the sample size and duration.

It would write the sample data at the offset indicated by the entry in the sample offset table.

The Track would be obtained from a CreateTrack call, which happens first and specifies the type of track and its attributes.

Fragments would be supported by having alternative methods e.g. CreateFragmentedTrack.

The data would only be written to disk when closing or flushing the writer and should allow concurrent reading to the already written data.

I was going to put something together for writing a motion jpeg mp4 but I just didn't get the time yet, I imagine that if I did that it would be easier for others to conceive how writing other types of media would be achieved.
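
The bookkeeping described in this post could look roughly like the sketch below. It is purely illustrative: it keeps per-track sample entries (offset, size, duration) and appends sample data to an in-memory stream, and it does not write any real MP4 boxes. ContainerWriterSketch, TrackSketch and SampleEntry are hypothetical names, not types from the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class SampleEntry
{
    public long Offset;   // position of the sample data in the output
    public int Size;      // sample size in bytes
    public int Duration;  // sample duration in the track's timescale units
}

public sealed class TrackSketch
{
    public readonly string Kind; // e.g. "audio" or "video"
    public readonly List<SampleEntry> Samples = new List<SampleEntry>();

    public TrackSketch(string kind) { Kind = kind; }
}

public sealed class ContainerWriterSketch
{
    // Stands in for the sample data area of the file.
    readonly MemoryStream data = new MemoryStream();

    public TrackSketch CreateTrack(string kind)
    {
        return new TrackSketch(kind);
    }

    // Records size and duration in the track's sample table, then stores the data at the recorded offset.
    public void AddSample(TrackSketch track, byte[] buffer, int offset, int count, int duration)
    {
        track.Samples.Add(new SampleEntry { Offset = data.Position, Size = count, Duration = duration });
        data.Write(buffer, offset, count);
    }

    public long DataLength { get { return data.Length; } }
}
```

A real writer would additionally serialize the sample tables into the container's own structures when it is flushed or closed.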
Marked as answer by juliusfriedman on 1/7/2015 at 3:54 PM
Jan 13, 2015 at 11:40 PM
On this:

"00 00 01 (AudioObjectStart) (ProfileId) 00 00 01 (DATA)"

where would i find the audioobjectstart ?

here's the fmtpline

{a=fmtp:97 streamtype=5; profile-level-id=15; mode=AAC-hbr; config=1408; sizeLength=13; indexLength=3; indexDeltaLength=3; profile=1; bitrate=64000;

Seems like the profileid you referred to is the profile-level-id ?

Let me know.
Jan 13, 2015 at 11:51 PM
Edited Jan 13, 2015 at 11:52 PM
Excellent question, sorry I should have made that apparent.

I might have it defined in the mpeg 4 codec solution or in another place.


I will see about determining this from the sdp (as well as the other properties) in the next release I do, I had this information somewhere but I can't seem to find where I put it.

Thanks for bringing this up and sorry for the inconvenience!
Marked as answer by juliusfriedman on 1/13/2015 at 3:51 PM
Jan 14, 2015 at 12:04 AM
Edited Jan 14, 2015 at 12:12 AM
Thanks for the help and the quick reply.

So my header now looks like this

fs.Write(new byte[] { 0x00, 0x00, 0x01,0xA5, 0x0F, 0x00, 0x00, 0x01 }, 0, 8);
where A5 is for AC3
and 0x0F is the profile id

fs.Write(audioBytes, 0, audioBytes.Length);
where audioBytes is a byte array containing the frame.Assemble().ToArray()

The decoder still thinks that it is an invalid format, any ideas ?

PS I do see it defined in your solution now that you mention it; it is under Media.Codecs.Video.Mpeg4/ObjectTypeIndication.cs
Marked as answer by juliusfriedman on 1/14/2015 at 2:41 PM
Jan 14, 2015 at 10:41 PM
Interesting, does it give any output as far as errors saying why it thinks it's invalid?

I am really sorry I don't have more time to put into this right now but I am genuinely interested and I am sure I will have to deal with this at one point or another.

It may be worth making a thread on




To ask if anyone else has tackled this before.

If you can find another solution which works (regardless of the language it uses), I will be more than happy to take a look.

Please definitely keep me updated if you find anything, I suspect that I won't have much time to invest into this again until the end of the month at least while I am revising some of the classes in the solution.

Thank you again for bringing this up and please definitely keep me updated!
Marked as answer by juliusfriedman on 1/14/2015 at 2:41 PM
Jan 14, 2015 at 10:51 PM
You may also want to check out


It seems for raw aac the header might just be 0x00ff and then the assembled data.

If that works great!
Jan 15, 2015 at 6:56 PM
My ultimate goal is to actually save the video and aac audio to an mp4 file.
Do you think i still have to pursue this as it is or once i put the raw aac and the video into the container, i should be able to play it back with vlc ?
Jan 15, 2015 at 8:23 PM
Edited Jan 15, 2015 at 8:33 PM
It's hard to say.

When AAC comes in a MP4 File , (Base Media Format) it contains an 'ESDS' atom which describes the configuration required for the decoder to decode each sample.

This 'configuration' data can also come 'in-band' within the stream samples, e.g. as a sample itself within the stream. It is basically the information you would have in the SDP, such as 'profile' or 'sizeDelta' (just like H.264 video).

It shouldn't matter if the AAC is coming from a MP4 or RTP as the data is the same, the difference is that the SDP data would be materialized into a 'esds' atom.

Technically even if the configuration data is missing it should be able to be played using defaults for the profile although it might sound a bit distorted.

Understanding that, vlc SHOULD play the audio data you give it now, the fact that it doesn't probably means that something is wrong with the data such that 'ffprobe' cannot find a compatible demuxer.

Even if you wrote the data into a MP4 container, vlc would demux the MP4 and when it came time to read the 'AAC' data if something was wrong with the header being added to the sample it simply would not play and you would be at the same point you are now.

When I take your data that you gave me from the Wireshark Capture and I put the Payload section of a RtpPacket with the PayloadType 97 into a 'raw.aac' file and play it with VLC I get messages like this

packetizer_mpeg4audio debug: running MPEG4 audio packetizer
packetizer_mpeg4audio debug: no decoder specific info, must be an ADTS or LOAS stream
es debug: did not sync on first block

Try just this:

'fs.Write(new byte[] { 0x00, 0xFF }, 0, 2);'

Followed by the 'audioData'

That is a mpeg audio sync word which should force ffprobe to think there is some kind of MPEG audio there.

Doing this I am able to get audio but I am not sure if it's playing correctly or not as I didn't do the entire stream only a few packets.

Please give it a try and if it plays exactly as expected then that's fine, if it's distorted then we may have to tackle how to insert the configuration into a raw AAC stream which shouldn't be too hard.

If you look here you will see that the next step would be to read the properties after being 'synced'

https://github.com/gpac/gpac/blob/master/modules/aac_in/aac_in.c @ ADTS_SyncFrame ->[Line 183]

It will basically be looking to read the profile id and the rest of the data, which is already present in the depacketized data.

I will revise the RFC3640 class to pretty much not read anything and just use assemble in the future if this is the case.

You can access the messages in VLC by selecting 'Tools -> Messages' which may help you determine if things are moving in the right direction!

Please verify if this works or not for you!

Jan 15, 2015 at 10:11 PM
Edited Jan 15, 2015 at 10:36 PM
Thanks for the suggestion.

I added 'fs.Write(new byte[] { 0x00, 0xFF }, 0, 2);' followed by the audioData and VLC has the following messages:

packetizer_mpeg4audio error: Multiple blocks per frame in ADTS not supported

here's the complete code:

if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
{
    byte[] audioBytes = frame.Assemble().ToArray();

    using (var fs = new FileStream("C:\\temp\\testing.aac", FileMode.Append))
    {
        // Write the experimental headers once, before the first audio frame only.
        if (bFirstAudio)
        {
            fs.Write(new byte[] { 0x00, 0xFF }, 0, 2);
            fs.Write(new byte[] { 0x00, 0x00, 0x01, 0x20, 0x0F, 0x00, 0x00, 0x01 }, 0, 8);
        }

        bFirstAudio = false;
        fs.Write(audioBytes, 0, audioBytes.Length);
    }
}
Jan 15, 2015 at 10:35 PM
Wow. This isn't you or me....

Google search that and you'll see why I am writing my own media library.

Also check out
How to play AAC files - AfterDawn: Guides

It lists some other players.

You can probably also make a small aac player on windows using media foundation and adapting the mp3 sample by changing the codec parameters.

I bet the decoder can do it, but there is some logic getting in the way. I will look at the source for ffmpeg and see if there is a work around but I can't promise anything.

Sorry that this is so hard!
Marked as answer by juliusfriedman on 1/15/2015 at 2:35 PM
Jan 15, 2015 at 10:40 PM
i'm wondering though,

if you save the wireshark rtp from the pcap and added the header and were able to play it,
wouldn't assembling the frame and adding the header in the code have the same result ?
Jan 15, 2015 at 10:45 PM
Edited Jan 15, 2015 at 10:52 PM
I inferred that from the absence of the sync errors, and I assumed that if I appended more data I would get something.

I can try that and see, but if you're getting the error I believe you; I was just trying to see where and how it was failing.

You can patch mpeg4audio.c, there's some code there right after the error but I'm not sure how useful it will be.

Your best bet at this point is to try another decoder though or transcode to mp3 with vlc then playback.

You could also try to forge the header into a single frame and then feed each block manually to the decoder or through the file although I'm not sure it would play smoothly.
Marked as answer by juliusfriedman on 1/15/2015 at 2:45 PM
Jan 16, 2015 at 12:29 AM
i'm just curious, how are you saving the payload with wireshark ? I made another test and saved the payload 97 but i haven't had any messages popping up on VLC at all when i tried to play.

i went to Telephony -> RTP -> Stream Analysis and save payload
Jan 16, 2015 at 1:29 AM
I used Utility.HexStringToBytes and the data from wireshark came by right click and copying the hex stream from the rtp payload.

On the saved payload, does a conversion work, e.g. via file convert in VLC?

I'll try extracting the same way and see if it reveals anything different but I still have a feeling that vlc isn't going to work because of how the demuxers are written.

If you can force the mpeg video demuxers on your file from the payload it may also work. (Either the one from the library or wireshark)

If you were passing samples directly to a decoder then I think the 0x00ff header would be enough.

Definitely keep me updated!
Marked as answer by juliusfriedman on 1/15/2015 at 5:29 PM
Jan 16, 2015 at 2:55 PM
See also

It seems with mplayer you can specify the demuxer, with vlc you may also be able to.

Since I am fairly sure the header should be the sync word you may also want to try a command line transcode and see if the resulting file is playable.

If I can get more time in between or after some work I am doing I will update you if I can make the data playable and how.
Jan 18, 2015 at 11:11 AM
See also


It looks like you can use that to both display the video and play the sound (if you're not on Mono).

I'm also done with my first round of changes; I will see if I can make some more progress on this in my downtime.
Jan 19, 2015 at 11:24 PM
Edited Jan 19, 2015 at 11:26 PM
i'm trying to understand this as well


Could the issue also be that it might work during a stream but not when saving to a file and attempting to play back ?
Jan 21, 2015 at 12:25 AM
i've added this for every audio frame as a header, VLC now recognizes it as AAC audio and is able to figure out the duration.
However, there's no audio during playback although the slider moves.

private void addADTStoPacket(byte[] packet, int packetLen)
{
        // packetLen must be the full frame length: the 7 header bytes plus the AAC payload.
        int profile = 2;  // AAC LC
        int freqIdx = 8;  // 16 kHz
        int chanCfg = 1;  // mono

        // Fill in the 7-byte ADTS header (no CRC).
        packet[0] = (byte)0xFF;
        packet[1] = (byte)0xF9;
        packet[2] = (byte)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
        packet[3] = (byte)(((chanCfg & 0x3) << 6) + (packetLen >> 11));
        packet[4] = (byte)((packetLen & 0x7FF) >> 3);
        packet[5] = (byte)(((packetLen & 7) << 5) + 0x1F);
        packet[6] = (byte)0xFC;
}
Marked as answer by juliusfriedman on 1/21/2015 at 11:19 PM
Jan 22, 2015 at 7:19 AM
You're the man!

Where did you find the information about the header?

I will add a Depacketize method which takes the info from the media description automatically.

Great job and thanks!
Jan 22, 2015 at 5:24 PM
Sorry to confuse you, VLC recognizes it but i can't hear any audio...

I wonder if VLC does need the AU headers or if they need to be stripped. I think I'm on the right track though, and you definitely need the ADTS header.
Marked as answer by juliusfriedman on 1/23/2015 at 6:22 PM
Jan 25, 2015 at 8:12 PM
Edited Jan 25, 2015 at 8:12 PM
Not sure if it needed the ADTS header then, that seems similar to the result I have when using 0x00FF.

Let me know what you find and thanks for the code, I will see about using it somewhere and include your name in the comments :) !
Jan 29, 2015 at 11:41 PM
It seems that for each RTP packet that comes in, the raw AAC data might be either a partial or a complete frame, so I think we might need the AU header to determine its size before inserting the ADTS header. Is this your understanding as well?

Also i've been playing with your rfc3640 class and see if stripping the AU header would make a difference but i've run into an issue on Media.Rtp.RtpClient
OnRtpFrameChanged , an entry with the same key already exists.

here's a payload parser for rfc3640, for AAC-hbr


let me know what you think
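
On the question of sizes: the ADTS frame length field counts the 7 header bytes plus the raw AAC frame, and the payload that follows an ADTS header is the access unit itself, without the RFC 3640 AU header section. A minimal sketch of wrapping one complete access unit is shown here; it assumes a header without CRC and takes the object type, frequency index and channel configuration as parameters (for this stream 2, 8 and 1). AdtsSketch.Wrap is a hypothetical helper, not part of the library.

```csharp
using System;

public static class AdtsSketch
{
    const int HeaderLength = 7; // ADTS header without CRC

    // Returns a new buffer holding a 7-byte ADTS header followed by the access unit.
    public static byte[] Wrap(byte[] accessUnit, int objectType, int frequencyIndex, int channelConfiguration)
    {
        int frameLength = HeaderLength + accessUnit.Length; // the length field counts the header too
        if (frameLength > 0x1FFF)
            throw new ArgumentException("Access unit too large for one ADTS frame.");

        var frame = new byte[frameLength];
        frame[0] = 0xFF;                                           // first 8 bits of the sync word
        frame[1] = 0xF1;                                           // last 4 sync bits, MPEG-4, layer 0, no CRC
        frame[2] = (byte)(((objectType - 1) << 6) | (frequencyIndex << 2) | (channelConfiguration >> 2));
        frame[3] = (byte)(((channelConfiguration & 0x3) << 6) | (frameLength >> 11));
        frame[4] = (byte)((frameLength >> 3) & 0xFF);
        frame[5] = (byte)(((frameLength & 0x7) << 5) | 0x1F);      // low length bits, buffer fullness (high part)
        frame[6] = 0xFC;                                           // buffer fullness (low part), one raw data block
        Buffer.BlockCopy(accessUnit, 0, frame, HeaderLength, accessUnit.Length);
        return frame;
    }
}
```

The second byte here marks the stream as MPEG-4 (0xF1); the snippet earlier in the thread uses 0xF9, which marks it as MPEG-2 AAC. A fragmented access unit would need to be reassembled across packets before it is wrapped.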
Jan 30, 2015 at 12:06 AM
Well I do notice that they don't implement 'de-interleaving' which means re-ordering by the AU by index.

Thats from OSCL_EXPORT_REF PayloadParserStatus

You should see right after the first line...

"//@TODO: Implement AU de-interleaving"

That's why I do the following and then return the result of selecting that sorted list just after.

//Add the Access Unit to the list and move to the next in the packet payload
                    accessUnits.Add(auIndex, accessUnit);
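
A sketch of how re-ordering across packets might work is given here. It is not the code from RFC3640Frame: it assumes every access unit has a constant duration (1024 samples for AAC LC), that the RTP timestamp belongs to the first access unit in the packet, and that each later header's AU-Index-delta counts the units skipped since the previous one. Timestamp wrap-around is ignored. DeinterleaveSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public sealed class DeinterleaveSketch
{
    // Access units waiting to be emitted, keyed by their reconstructed timestamp.
    readonly SortedList<long, byte[]> pending = new SortedList<long, byte[]>();

    // indexFields[0] is the AU-Index of the first header; later entries are AU-Index-delta values.
    public void AddPacket(uint rtpTimestamp, IList<int> indexFields, IList<byte[]> accessUnits, int samplesPerUnit)
    {
        int offset = 0; // distance, in access units, from the first unit in this packet
        for (int i = 0; i < accessUnits.Count; i++)
        {
            if (i > 0) offset += indexFields[i] + 1;
            long key = rtpTimestamp + (long)offset * samplesPerUnit;

            // The indexer overwrites an existing key, whereas SortedList.Add throws on a duplicate.
            pending[key] = accessUnits[i];
        }
    }

    // Returns the buffered access units in timestamp order and clears the buffer.
    public List<byte[]> Drain()
    {
        var ordered = new List<byte[]>(pending.Values);
        pending.Clear();
        return ordered;
    }
}
```

Using the indexer instead of Add means a re-transmitted packet replaces the earlier copy rather than raising an exception.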

headersPresent is also set by the constructor of that class but it doesn't show where it would be set (e.g. by what value in the SDP) unless I glanced over it.

Ideally there should be a method to call which can determine from the SDP whether the headers will be present. Furthermore, it should be handled intelligently: if headers are expected but absent (or the opposite), an attempt to recover should be made.

Are you doing High Bit Rate (HBR) or Low Bit Rate (LBR)?

Also have you seen these threads?

Some indicate that the ADTS must be stripped.




Depending on if it's HBR or LC it may be a different spec also?


But the most interesting of these documents, which I think may give you the most benefit and possibly also answer this format question, is:

https://tools.ietf.org/html/draft-ietf-avt-rtp-mpeg2aac-02 (Draft)

Which came from a link here which has more useful information.


I don't have time to digest it all yet but I will get around to it, let me know if you find anything else or have any other questions!
Marked as answer by juliusfriedman on 1/29/2015 at 4:06 PM
Jan 30, 2015 at 12:16 AM
I am doing HBR according to the SDP but it's also called AAC-LC.
I looked at most of these threads and I guess i need to digest them as well, let me know if you need more details about the exception i encountered on your RFC 3640 class.
Jan 30, 2015 at 12:21 AM
Yea, Sorry the reason why I didn't say anything about that is because that function can't throw that exception from that class, it probably comes from the RtpClient when a re-transmission occurs.

E.g. in TCP sometimes you get packets twice.

The client should recover from this error easily as the SendReceive function should just go to the top of the loop again.

Is it affecting your debugging? I can add a try catch to prevent it from propagating. I hadn't, in case someone was using the occurrences of the event to do something else, e.g. track re-transmissions or check if re-transmitted data was the same length etc.
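For anyone who would rather not see the exception at all, the same recovery can be made explicit by checking the key before adding. This is only a minimal sketch, assuming packets are keyed by sequence number in a SortedList; PacketTable and TryAdd are hypothetical names for illustration and are not the library's RtpFrame or RtpClient:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a frame that keys packets by sequence number.
public sealed class PacketTable
{
    private readonly SortedList<int, byte[]> packets = new SortedList<int, byte[]>();

    // Number of distinct packets stored so far.
    public int Count { get { return packets.Count; } }

    // Returns false instead of throwing when the sequence number is already present,
    // which is what a re-transmitted packet would look like.
    public bool TryAdd(int sequenceNumber, byte[] payload)
    {
        if (payload == null) throw new ArgumentNullException("payload");
        if (packets.ContainsKey(sequenceNumber)) return false;
        packets.Add(sequenceNumber, payload);
        return true;
    }
}
```

A caller could count the false results to track re-transmissions, which keeps that information available without relying on the exception.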

Definitely let me know of any difficulties you are experiencing.

I am releasing a new version tonight which may resolve some of it and is definitely better with threading.

Jan 30, 2015 at 12:26 AM
Ok got it, you are correct, the client easily recovers from it.

is the composite indicating the size of the AAC packets read from the AU header ?
Jan 30, 2015 at 12:38 AM
Cool, I will see about skipping packets which have already been received. The exception occurs because the frame allows a packet to be removed and then replaced with a new packet, in case that functionality is required for some reason.

Yes and No...

readHeaderLength determines that.

Then each individual AU is read starting at the offset either directly at the start of the data or after the length if the length is even read and composite is defined in that loop.

                //If we are reading the Access Unit Header Length
                if (readHeaderLength)
                {
                    //Then read it
                    auHeaderLength = Common.Binary.ReadU16(rtp.Payload, offset, BitConverter.IsLittleEndian);

                    //If the value was positive
                    if (auHeaderLength > 0)
                    {
                        //Convert bits to bytes
                        auHeaderLength /= 8;

                        //Move the offset
                        offset += 2;
                    }
                }

If that data is not read then composite will be read from the next two bytes after the RtpHeader.
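To look at the length step in isolation, here is a minimal sketch. It assumes the payload begins with the 16 bit AU-headers-length in network byte order, counted in bits as RFC 3640 describes. AuHeaderSection and ReadHeaderSectionBytes are made-up names, and rounding up to whole bytes is my reading of the RFC rather than what the library does:

```csharp
using System;

public static class AuHeaderSection
{
    // Reads the 16 bit AU-headers-length (big endian, in bits) at the given offset
    // and returns how many whole bytes the AU header section occupies after it.
    public static int ReadHeaderSectionBytes(byte[] payload, int offset, out int headerBits)
    {
        if (payload == null) throw new ArgumentNullException("payload");
        if (offset < 0 || payload.Length - offset < 2)
            throw new ArgumentException("Need 2 bytes for AU-headers-length.");

        headerBits = (payload[offset] << 8) | payload[offset + 1];

        // The header section is padded to a byte boundary, so round up.
        return (headerBits + 7) / 8;
    }
}
```

With sizeLength=13 and indexLength=3 each AU header is 16 bits, so a payload starting with 00 10 describes one header (2 bytes) and one starting with 00 60 describes six (12 bytes).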

You will see there is a region which indicates No AU Headers Length

You may want to see if you can decode an esId there which means anything useful to your session. This may also reveal that you're looking at some form of TransportStream which has to be unwrapped to get to the AAC data itself, and that's why you're having so much trouble.

Hopefully that made sense, definitely keep me updated and let me know if I can help in any way further!
Marked as answer by juliusfriedman on 1/29/2015 at 4:38 PM
Feb 5, 2015 at 12:42 AM
Edited Feb 5, 2015 at 12:47 AM
i was looking at the frame assemble
byte[] audioBytes = frame.Assemble().ToArray();

and it seems that it always has a bunch of 0s at the end of the array for each frame, from 10 to 20 bytes depending on the size of the frame...

could there be a bug there?
Feb 5, 2015 at 1:05 AM
Which class? Anything is possible :)

Assemble should just be using the default logic from RtpFrame.

I think there may be a corner case but sometimes I tend to over think and that's why there is a comment there which says

'//Should check PayloadData is > profileHeaderSize ?'

This basically just indicates that PayloadData may have fewer bytes to enumerate than profileHeaderSize allows and shouldn't be an issue in most normal cases.

I would look into making sure that useExtensions was not causing it but that is normally false anyway. (The bug would be due to an invalid extension or a bug somewhere else but I have a unit test for Rtp.Extension so I don't think that's the case)

Are you using TCP? I have a few fixes I am going to post which should resolve a few issues there. If this is UDP then I am not sure where the extra bytes are coming from.

Can you post an example of what the packet looks like and what it is supposed to look like and I will be better suited to answer where the issue may lie.

It would be good to find out why the extra data is there, that is my primary concern.

Thanks for bringing this up!
Feb 5, 2015 at 1:15 AM
Here's an example here :


I added the ADTS header for each frame so if you look before FF F1, you should see the extra bytes there.

I am using TCP.

I'm not sure if that information is enough or if an associated pcap with the file would help.
let me know.
Feb 5, 2015 at 1:52 AM
Edited Feb 5, 2015 at 1:53 AM
There is a bug in RtpClient.ProcessFrameData in certain cases (with small payloads or when the ssrc is checked).

I have corrected this I believe in the latest code which I haven't released yet.

I am doing some more testing and I hope to release it tonight, let's check again after that release and see if the issue persists.

If so then I will need a capture unfortunately :(

Feb 5, 2015 at 11:59 AM
Really sorry but I didn't get everything that I wanted to done and now I need to take a break and get some sleep.

I should have something tomorrow to release, I am sorry for any inconvenience it may have caused.
Feb 5, 2015 at 5:45 PM
Thanks, please let me know when you're ready to release and I'll test it right away.
Feb 5, 2015 at 7:22 PM
Edited Feb 5, 2015 at 7:24 PM
I have actually stayed up all night running tests and making improvements.

The performance tests are running now and I will do a quick review again and average the results before release.

The new release which is both improved in performance and features will be released as early as possible.

I am thinking if I should waste more time on unit tests or if I should focus on the release and let others try and find the issues and build more test cases depending on what comes up.

It will probably be no later than 8PM EST but probably sooner.

Thank you again for all of your testing and feedback!
Marked as answer by juliusfriedman on 2/5/2015 at 11:23 AM
Feb 5, 2015 at 7:37 PM
Edited Feb 5, 2015 at 7:40 PM
sounds good ! i'm thinking though even if there's a few extra bytes at the end, wouldn't they be treated as silence anyway ?

also i've been wondering if this is getting nowhere, do you know of any means of transcoding PCM to AAC before muxing it with the h264 stream to mp4 ?
Feb 5, 2015 at 8:04 PM
The short answer is that it may but if the data is not indicating the correct amount of bytes it would be up to the decoder to choose what to do.

One easy way to remove them it seems is to check for Padding to be false and remove any 0 bytes which occur before using the assembled data.
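A minimal sketch of that trimming step, assuming the trailing 0 bytes really are filler and not part of the AAC data; a frame that legitimately ends in 0 would be shortened too, so treat it as a diagnostic aid. AssembledData and TrimTrailingZeros are hypothetical names, not library methods:

```csharp
using System;

public static class AssembledData
{
    // Returns a copy of data without any run of 0 bytes at the end.
    public static byte[] TrimTrailingZeros(byte[] data)
    {
        if (data == null) throw new ArgumentNullException("data");

        int end = data.Length;
        while (end > 0 && data[end - 1] == 0) end--;

        byte[] result = new byte[end];
        Array.Copy(data, result, end);
        return result;
    }
}
```

The check for Padding being false would happen before calling this, on whatever object exposes that flag for the packet in question.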

The latest version SHOULD fix anything which was affecting this related to bugs so I am very curious to see.

I imagine also the problem is intermittent and only manifests itself in some versions :-(

Anyway not much longer now!
Marked as answer by juliusfriedman on 2/5/2015 at 1:56 PM
Feb 5, 2015 at 11:33 PM
https://net7mma.codeplex.com/SourceControl/latest has been updated with the latest release. (As promised :D)

There are a number of performance and API improvements.

Let me know how you like it!
Feb 5, 2015 at 11:44 PM
thanks ! i think it's missing taggedexception.cs, i just did a svn update
Feb 5, 2015 at 11:51 PM
Sorry, just fixed!
Feb 5, 2015 at 11:59 PM
Has IloggingExtensions been removed also from Media.Common ? RtpClient.cs doesn't find it
Feb 6, 2015 at 12:01 AM
Edited Feb 6, 2015 at 12:01 AM
110906 put it back, I had originally moved it to a different source file but then I put it back because that's how I did it with the other Extension classes which define methods for the interfaces...

Anyway Sorry again!

Now everything is able to be built! (Hopefully) :)
Feb 6, 2015 at 12:10 AM
Everything is building fine now.
however, i still seem to be getting extra bytes at the end of each frame.
Please see


for pcap and audio file
Feb 6, 2015 at 12:25 AM
That's no good.

Can I have a high level overview of what you're doing so I can see if I can replicate it and find out why?

E.g. use these bytes to create a RtpPacket

Then call this method (Assemble) etc.

Then do this, what is expected and what is not.

Then I can also add another unit test.

I'll take a look at the dumps right afterwards as I check into dumps for another user who has been waiting two days.

Keep me updated, if I can narrow it down further or eliminate it I will let you know and probably also check in some kind of update.

Thanks again!
Feb 6, 2015 at 12:34 AM
Edited Feb 6, 2015 at 12:42 AM
sure, this is what i am doing: instantiate a new rtsp client to connect to the Axis camera and on the frame changed, code is as follows

void Client_RtpFrameChanged(object sender, Media.Rtp.RtpFrame frame)
{
    if (frame.IsComplete)
    {
        var context = ((Media.Rtp.RtpClient)sender).GetContextByPayloadType(frame.PayloadTypeByte);
        if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
        {
            byte[] audioBytes = frame.Assemble().ToArray();
            byte[] adtsHeader = new byte[7];
            addADTStoPacket(adtsHeader, audioBytes.Length);
            using (var fs = new FileStream("C:\\temp\\testing.aac", FileMode.Append))
            using (var bw = new BinaryWriter(fs))
            {
                // the write calls did not survive in the post; presumably the header, then the frame bytes
                bw.Write(adtsHeader);
                bw.Write(audioBytes);
            }
        }
        else if (context.MediaDescription.MediaType == Media.Sdp.MediaType.video)
        {
            ..... // h264 stuffs
        }
    }
}

this is addADTStoPacket:

private void addADTStoPacket(byte[] packet, int packetLen)
{
        int profile = 2;   // AAC LC
        int freqIdx = 11;  // 8 kHz
        int chanCfg = 1;   // 1 channel (mono); a CPE would be 2

        int nFinalLength = packetLen + 7; // the ADTS length field includes the 7 byte header

        // fill in ADTS data
        packet[0] = (byte)0xFF;
        packet[1] = (byte)0xF1;
        packet[2] = (byte)(((profile - 1) << 6) + (freqIdx << 2) + (chanCfg >> 2));
        packet[3] = (byte)(((chanCfg & 0x3) << 6) + (nFinalLength >> 11)); // top 2 bits of the 13 bit length
        packet[4] = (byte)((nFinalLength & 0x7FF) >> 3);
        packet[5] = (byte)(((nFinalLength & 7) << 5) + 0x1F);
        packet[6] = (byte)0xFC;
}

The high level goal is to connect via rtsp to get both audio/video stream to mp4. We got to a point where PCM and H264 is working but i don't think there's a way to mux pcm to mp4 so AAC would be it.
Marked as answer by juliusfriedman on 2/5/2015 at 4:50 PM
Feb 6, 2015 at 12:44 AM
Awesome, would it be difficult to add a few packets in there, just copying them from wireshark and then using utility.hextobytes to create them, so I can actually follow along?

I don't mean to give you more work but I also kinda need to know what is expected vs what is found to properly diagnose, e.g. assemble is 3020 bytes preceded by the data header for a total of 3026 bytes.

Offset 3027 should point to another adts header but it doesn't, and what is there is x bytes followed by the next adts header.

This way I can easily step through and diagnose and fix it.

If not I will try to create from the capture and go from there.

Thanks again!
Marked as answer by juliusfriedman on 2/5/2015 at 4:50 PM
Feb 6, 2015 at 12:50 AM
Also do you have a working aac file from a similar stream that you can snip and post for me?

I have this suspicious feeling that once I can see what works that it will be much easier to also achieve the end goal of playback!

Thanks again!
Marked as answer by juliusfriedman on 2/5/2015 at 4:50 PM
Feb 6, 2015 at 12:53 AM
I'd be more than happy to do that but i am not familiar with using utility.hextobytes, can you point me on how to do that ?
Feb 6, 2015 at 12:57 AM
here's a working aac that i downloaded


if you open one of the files using the bin editor in VS, you can clearly see the sync word for each frame. I'm suspecting that either those extra bytes might cause an issue or the length specified is not the length of the assembled bytes but maybe the length specified by the AU ? just a wild guess there though.
Feb 6, 2015 at 1:18 AM
i think if you right click on wireshark on the payload type 96 and do a follow tcp stream it should filter out everything else and you'll be able to look at only those RTP and get the hex stream. am i missing something ?
Feb 6, 2015 at 1:31 AM
In a capture window select a packet.

In the information window (right under)

Select Real-Time Transport Protocol

Right Click

Select Copy -> Bytes -> Hex Stream

byte[] rawPacket = Utility.HexStringToBytes("8060003f4f7dc8d6e86e90543c81e6067827f57fff8276f77be3b1df826bbf7d1dfabfc125fbe3710a47c466d88c80001f0435e8efd5feaff5477e09fbbdf7fabfd5fe4df89cbe232ff045d7ff827f7bdfbe0a7bd7efefaf6370e35a7f824f7ec4e3a10144e3f278beb5fe2bdf77f04bbdf7f7103029baad3ba93ee2bdf891a14dfeafd5bc7fe71e14d7d6b6fdd1de61c0b6fdff6271d008faf7c12f5d7ef822efef822efef822dfec338ebe3ffdbfefaf619c35748f7ffffc2b9107effff09ed7dfd5fe3eef57fde272aa853353fafff89a515bdfe111fbddcf8fe4c4fc64b67ffaafcf9ec23d575755f848211755aafafc569fd7c477f4f1a2c66fddebdad71cc2179b2bef786702752607314fad7fdb8a04c0bb7befeae0875e5f045bfdf1d7777bbbbff7dfe5d7f25ffbf7f17ef77c6f7fbbf8dc5dfeaff5ef821dfef937fd5be0935d37deeff177fde15c70287ff6ffe2a9f7fe33afad7ddfc9d7e5f7f09ebf4a15d03fa7fd7f08c57dd277af0ce162417ffffec2b8712704ffffc4e4530a632347dffff827d6bb56f823df6c2b8227dfe6fffb74f044847a537f046c463eb55f045a6bef8aaaf6ab0ae3230934fffff37bc4e302e07c477d78dc3b293be0b69fb59ec6e12c8aa5e619c081a53e6fafffdf04b7adff7c9eb12a8286557effe9dbe0b6bf7fbeebfdfaf8aeaaafc4e3a4286711a53fffeb13b0142b937ebfffad619c2a623feffbfd5fe08f2e276ff5cff57f822df7c5aa1e59ffabfd5fef5fd5f14bf5fcbcbd73f8854fcbbfefdfc23ef7efa393f37ac563e70c16410d2efc4d5fe11dddeff5b53025086feabdbb383030d1ddbebeb3b8568713994789dfafe08fafa82ae32383ffa7f0aaafaffff043afd8554e08fe9fa7f1387e2d467caae2550fc16dfbfef84eeebefe08abfbeeffc657d555efef8423a6cc5e9ebf055bbe7bfaf8abdfa58ac3539fc260b35d3d5ebaf38cefefadddfe4b6bc2aa25cf4f6edff9c199ab5e508828ebbe5df2dfce499513821ad6df7d71c4a03abe6f78d26ff27ba3e84bab708413fbdf661270379a63fdef89c27c51e789ad53e9fbebe08455af5fc7537eae95ff13efbfc257fd68ea1708edefdffbbbf12e08fc779b89c26ed485318649ffff8530668bd36fffc567530b63e101d3ffffc775f54bc563c0d50b3b12fffff11edaff2f697c23d7df7c2ae16890fffeb13826f2c7d88c1acbe6adbf89eaeffc4d7f589248080cf86e10917b036c2aa5395fffa7e087adbe0a2fa77dbe0aebdfebec2ce1c5bcebffb74e13c85fffeb4e270c33a8924741e627200f83177afdd7c2b876239fffa2e17d53ffff7f82db7eef7f828bedebbd757f823f77fabfcbdfe6d6be6f7675210f9
b5af92ffc27eb7fc9be84b9cc67fe6f54295fdf7ae093bdf13b1f9bd619c3434bffebbf6c56bafcd08edbef15d5b17f04bafabd7c7faeb5fe12f5dfe2baaaff0852fbeff84bdf5fbdd70b6089a45ebffe1570859e71dffebf8cbfafaeb54157045e1f12feffd3857282fa6dffff15df7bf0424dff0877dee2b77e3078cdfef5aa749f1a38776d7517d70b61e895ea9fafff0856b55e6ff84ae2bbaefe32f7def6944704f1db0ff053d5f3c7eadf0515d56b6f823af6f827df5d6be08bddb9a14ee7dbdd67be8f7df7a159998a555cddfeb5f26b93abe274e389bfcbeec5b9fc9be0977f7ecc2b868602fafffe6af415509c615b7ffd7f8bebef1b8f0adb7c256bd7a1c2571ca826c4387a21f85b40eba7f4ffc178befdaf828d277b7b7828043a5b629ca218a5")

RtpPacket managed = new RtpPacket(rawPacket);
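When comparing a pasted hex stream against what is expected, it can help to decode the 12 byte fixed header by hand first. A minimal sketch following the RFC 3550 layout; RtpHeaderPeek is a hypothetical helper for illustration and is not the library's RtpPacket or RtpHeader:

```csharp
using System;

public static class RtpHeaderPeek
{
    // Decodes the fixed RTP header fields from the first 12 bytes of a packet.
    public static void Parse(byte[] packet, out int version, out bool marker,
        out int payloadType, out int sequenceNumber, out uint timestamp, out uint ssrc)
    {
        if (packet == null || packet.Length < 12)
            throw new ArgumentException("An RTP packet needs at least 12 bytes.");

        version = packet[0] >> 6;                       // top 2 bits
        marker = (packet[1] & 0x80) != 0;               // top bit of the second byte
        payloadType = packet[1] & 0x7F;                 // low 7 bits
        sequenceNumber = (packet[2] << 8) | packet[3];  // big endian 16 bit
        timestamp = ((uint)packet[4] << 24) | ((uint)packet[5] << 16) | ((uint)packet[6] << 8) | packet[7];
        ssrc = ((uint)packet[8] << 24) | ((uint)packet[9] << 16) | ((uint)packet[10] << 8) | packet[11];
    }
}
```

For the packet above, which starts 80 60 00 3f, this gives version 2, payload type 96 and sequence number 63 with the marker bit clear.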
Feb 6, 2015 at 1:47 AM
Was axis-capture and testing created from the same data?

Are you talking about @ 0x3b0 ...

I see it in the capture also...

Check out this packet from frame 1324 (seq = 5328)
Feb 6, 2015 at 2:05 AM
Yes the pcap (axis capture) was taken while the application was recording to the testing file so we are looking at the exact same data.
i think you meant @ 0x3f0 ?

it's occurring prior to every sync word
Feb 6, 2015 at 2:20 AM
And I do think you can put PCM in mp4, you can put anything there, it's just a matter of who can read it.

I think that it is supported if forced and I know that you can transcode from PCM with ffmpeg or vlc.

I am not sure why it would be a big deal since MP4 does allow you to specify the codec as PCM and include the WaveInfo(Ex) but then again :P
Feb 6, 2015 at 2:32 AM
would that be an issue for streaming if we use pcm ? just wondering.
I am trying to avoid using ffmpeg or vlc since they are just big libraries to carry around when deploying our solution. ffmpeg actually does rtsp and saves directly to mp4 but there seems to be so much overhead
Feb 6, 2015 at 3:35 AM
Edited Feb 6, 2015 at 3:36 AM
It depends on your use case, to a low bandwidth device it is possible that you will exhaust the resources of either the device providing the stream or the resources of the network.

I would agree as well with the 'overhead' but another option may be to only use it in house for trans-coding to the 'RtspServer', another option is to use Media Foundation for the decoding which may be easier.

Also, the ADTS to frame function, Where did that come from, do you have a standard reference?

The best thing I could find was

http://sourceforge.net/p/jaadec/code/HEAD/tree/jaad/src/net/sourceforge/jaad/aac/ and it seems a bit different, maybe an older or unrelated format (ADIF).

As much as I hate to look, Live 555 does it...

I need to take a break for the night but I don't think there is a bug @ all, that data comes from Wireshark in the packets and that's why its there although I do see a potentially small problem where the offset math is wrong which I will fix...

If you check the capture you sent me @ Seq - 5332 (Frame No 1560) you will see it there as well, and in many others. I think it occurs everywhere a Marker is, and all your packets seem to have the Marker bit set.

I made another update to the code just for you!

Give it a shot and let me know if it helps any!

I will give this another run tomorrow or the day after if I can!
Marked as answer by juliusfriedman on 2/5/2015 at 7:35 PM
Feb 6, 2015 at 6:34 AM
Ok, I was just suspecting something only because it seems odd that the 0 were at the end of each frame.

What ADTS to frame function are you referring to ? Adding the header to each frame ?

What did your last update address ? I'm just asking to make sure i'm in sync with your latest changes.
Anyway, I appreciate your help and good work.
Feb 6, 2015 at 1:30 PM
There are also 0 at the end of each frame in the capture.

Yes the add adts to frame functionality.

It ensured that the offset was corresponding to the payload and not header data.

Np @ all and thank you!
Marked as answer by juliusfriedman on 2/6/2015 at 5:30 AM
Feb 6, 2015 at 5:51 PM
this is where i got the info about the ADTS structure


was that the reference you were asking about ?
Feb 6, 2015 at 5:57 PM
Edited Feb 6, 2015 at 5:59 PM
Cool yes thanks.

I will try to make a unit test(s) for aac and ensure packetization is also working.

If not today probably tomorrow.
Feb 6, 2015 at 6:05 PM
should i give rfc3640 another shot as opposed to assemble ?
Feb 6, 2015 at 7:37 PM
Edited Feb 6, 2015 at 7:38 PM
Yes, but you still might need the adts header, and I might need to change the way I obtain the data, since it is in bits and might be split between bytes.

Assemble already used the corresponding offset according to the packet's PayloadData so the offset change I made in the other class (RFC3640) shouldn't affect it. I was trying to be more efficient there and access the payload directly without generating an enumerator.
Marked as answer by juliusfriedman on 2/6/2015 at 11:37 AM
Feb 6, 2015 at 9:14 PM
I used the RFC3640 class as follows
                Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame hframe = new Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame(frame);

                using (var stream = new System.IO.MemoryStream())
but the stream length always seem to be 0 ?
Feb 7, 2015 at 3:34 PM
                    //Move the offset past the bytes read
                        offset += auHeaderLength;

                        //These values may be split between bytes, should read them using a bit reader.

                        //The size of the esData is given by removing the bits used for the index
                        auSize = (int)composite >> indexLength;

                        //The index of the access unit is given by removing the bits used for the size
                        auIndex = (int)composite << sizeLength;
This code is probably the culprit.

Depending on what indexLength and sizeLength are there may not be enough bits available.

I have to use a BitReader or a BitArray to fix this :)
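Roughly what reading those fields bit by bit could look like: a minimal sketch, assuming the AU header stores the size first (sizeLength bits) and the index after it (indexLength bits), most significant bit first. BitFields and ReadBits are hypothetical names and this is not the code that went into the library:

```csharp
using System;

public static class BitFields
{
    // Reads 'count' bits (at most 31) starting at bitOffset, most significant bit first,
    // and advances bitOffset past them. Fields may straddle byte boundaries.
    public static int ReadBits(byte[] data, ref int bitOffset, int count)
    {
        if (data == null) throw new ArgumentNullException("data");
        if (count < 0 || count > 31) throw new ArgumentOutOfRangeException("count");
        if (bitOffset < 0 || bitOffset + count > data.Length * 8)
            throw new ArgumentException("Not enough bits available.");

        int value = 0;
        for (int i = 0; i < count; i++, bitOffset++)
        {
            int bit = (data[bitOffset >> 3] >> (7 - (bitOffset & 7))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }
}
```

With sizeLength=13 and indexLength=3, the header bytes 00 78 read as a size of 15 and an index of 0, and 03 70 read as a size of 110 and an index of 0.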

I will make another release today probably!
Marked as answer by juliusfriedman on 2/7/2015 at 7:34 AM
Feb 7, 2015 at 8:00 PM
110910 Should be better.

I will be unavailable for the next few days so definitely keep me updated!
Feb 8, 2015 at 7:26 AM
Thanks, i'll give it a try shortly.
Also i experimented with ffmpeg to capture an rtsp stream from the camera to an mp4 file and extracted the aac from the mp4 to take a quick look.
It seems that the number of bytes between each ADTS header is about 160-180 bytes.
The size of frame.Assemble is around 500-600 bytes, which could be the issue. I'll see what the RFC3640 class returns as a length.
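A comparison like this can be automated by walking the file from one ADTS header to the next using the 13 bit frame length field. A minimal sketch, assuming the data starts on a sync word; AdtsWalker and FrameLengths are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class AdtsWalker
{
    // Returns the length of every ADTS frame (header included) found back to back
    // from the start of data. Stops at the first position that is not a sync word.
    public static List<int> FrameLengths(byte[] data)
    {
        if (data == null) throw new ArgumentNullException("data");

        var lengths = new List<int>();
        int offset = 0;
        while (offset + 7 <= data.Length)
        {
            // The sync word is 12 set bits: FF followed by a high nibble of F.
            if (data[offset] != 0xFF || (data[offset + 1] & 0xF0) != 0xF0) break;

            // 13 bit frame length: low 2 bits of byte 3, all of byte 4, top 3 bits of byte 5.
            int frameLength = ((data[offset + 3] & 0x03) << 11)
                            | (data[offset + 4] << 3)
                            | (data[offset + 5] >> 5);
            if (frameLength < 7) break; // cannot be shorter than the header itself

            lengths.Add(frameLength);
            offset += frameLength;
        }
        return lengths;
    }
}
```

If the walk stops early on a generated file, the offset where it stopped is where the length written into the header and the bytes actually written disagree.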
i'll keep you updated.
Feb 9, 2015 at 7:55 AM
Edited Feb 9, 2015 at 8:03 AM
I think i fixed all the bugs in RFC3640 but who knows :)

I haven't done much testing but I am confident the latest changes put the code in much better state than it was, but I still have some work to do there.

Anyway, I shouldn't even be on the computer right now so I may not be around until Wednesday, Let me know if it helps!

Also check out

Marked as answer by juliusfriedman on 2/8/2015 at 11:56 PM
Feb 9, 2015 at 6:22 PM
it's strange, with the latest version, i'm not even getting the rtpframechanged event, I tried a previous version and it worked just fine.
I know you're off for the next few days but just thought i'd let you know.
Feb 9, 2015 at 6:38 PM

Post a wireshark capture of the session with the issues if you can, along with any accompanying logs from the library.

And yes,
I'll probably have time tomorrow or the next day.

There may also be a few changes to the rfc3640 class still :-)

Marked as answer by juliusfriedman on 2/9/2015 at 10:38 AM
Feb 9, 2015 at 8:03 PM
you can find the capture at https://www.dropbox.com/s/7s24pyyaufx0rzu/capture%202-9-15.pcapng?dl=0

Where would the logs file from the library reside ?

I've applied the latest changes of the rfc3640 class to the last working version but it seems that it does not exit the
while (offset < max) loop and return the access units.
Feb 9, 2015 at 9:56 PM
In the console by default, unless you run the rtsp client tests, then they are temporarily in the same place the application executed until the end of the tests.

I'll check it out asap and thanks.
Feb 9, 2015 at 10:45 PM
Edited Feb 10, 2015 at 12:30 AM
i don't see any logs, do they have to be enabled ?

also looking at RFC3640Media.cs, i have a question. Is there a 12 byte RTP header preceding the AU-headers-length and if so, should the offset be 12 then ?

this is the reference

I was calling Depacketize without any parameters, that was why i never got the access units.
I passed in the info from the SDP, however the sdp only provides info about the sizelength, indexlength and indexdeltalength.

What about CTSDeltaLength, DTSDeltaLength and auxDataSizeLength ?

by passing the sizelength, indexlength and indexdeltalength values only, i am getting the buffer back but its length is still about 500 bytes which imo still seems too big.
Feb 10, 2015 at 7:57 AM
No, the unit test makes a 'logWriter' which is a Media.Common.Loggers.FileLogger

("rtspLog" + DateTime.UtcNow.ToFileTimeUtc() + ".log.txt")

Yes there is a RtpHeader and yes from your capture it appears it's 12 bytes.

No, the offset should not be 12, the Payload starts after the Header. See Prepare.

This allows ROHC to be used as well as other types of headers.

The overall data may be 496 bytes but that is because there are what seems to be 6 access units (in the packet I chose to test on)..

Index 96 - 15 Bytes

Index 320 - 110 Bytes

Index 448 - 156 Bytes

Index 576 - 26 Bytes

Index 608 - 141 Bytes

And finally Index 540 - 35 bytes(actually 48) but with padding

See TestRFC3640Frame.

Also I think I have to do some more work in the AuHeaderBits section..

I probably also have to store each au header and then give the header then the unit to the buffer (in that order) (header, au, header au)

I will update the code tomorrow or the day after when I get more time to show you.

Keep me updated!
Marked as answer by juliusfriedman on 2/9/2015 at 11:57 PM
Feb 10, 2015 at 7:45 PM
Edited Feb 10, 2015 at 8:34 PM
cool -

i'm still unclear on what determines the length of the CTS and DTS delta along with the auxDataSize ?

also, if the ADTS header is needed for each AAC frame, would an AAC frame be considered as
RTP header plus AU-header-length plus AU size plus AU-Index plus AAC frame data or just append in front of each AAC frame data ?
Feb 10, 2015 at 8:37 PM
Edited Feb 10, 2015 at 8:37 PM
Np, Working on the updates now.

I will release around 6 or 8 PM EST.

Just to be sure we are talking about the same thing I mean this header.


You should also check your bit shifting as I don't think it's entirely correct for what you specified.

I come up with

FF F1 54 00

Which specifies 32000 hz and 0 channels.

If you want 1 channel, the channel bits sit in the last bit of the third byte and the top two bits of the fourth

'FF F1 54 40'

Which specifies 32000 hz and 1 channel.

You come up with

'FF F1 6C 00'

Which specifies a sample rate of 8000 hz with 0 channels.

I think that is your problem.
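For checking header bytes like these without doing the shifts by hand, a minimal sketch that decodes the third and fourth ADTS bytes. It assumes the usual ADTS layout (2 bit profile, 4 bit sampling frequency index, 1 private bit, 3 bit channel configuration); AdtsBits and Decode are hypothetical names:

```csharp
using System;

public static class AdtsBits
{
    // Sampling rates for frequency index 0 through 12.
    private static readonly int[] Rates =
    {
        96000, 88200, 64000, 48000, 44100, 32000, 24000,
        22050, 16000, 12000, 11025, 8000, 7350
    };

    // b2 and b3 are the third and fourth bytes of the ADTS header.
    public static void Decode(byte b2, byte b3, out int objectType, out int sampleRate, out int channels)
    {
        objectType = (b2 >> 6) + 1;                        // ADTS stores object type minus 1
        int freqIdx = (b2 >> 2) & 0x0F;
        sampleRate = freqIdx < Rates.Length ? Rates[freqIdx] : 0;
        channels = ((b2 & 0x01) << 2) | (b3 >> 6);         // 1 bit from b2, 2 bits from b3
    }
}
```

Decoding 54 00 gives object type 2 (AAC LC), 32000 hz and 0 channels, 6C 00 gives 8000 hz and 0 channels, and 6C 40 gives 8000 hz and 1 channel.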

Also, are you essentially saying that one also needs to add the ADTS header for each accessUnit in the Payload?

If that's the case I can just have it provided and written to when writing the accessUnits...

Something like this
                    //Add the Access Unit to the list and move to the next in the packet payload, optionally add the missing bytes as padding of 0
                    var accessUnitHeader = rtp.Payload.Skip(auHeaderOffset).Take(auHeaderLengthBytes);

                    var accessUnitData = rtp.Payload.Array.Skip(offset).Take(auSize);

                    var accessUnitAndHeader = Enumerable.Concat(accessUnitHeader, accessUnitData);

                    var depacketizedAccessUnit = accessUnitAndHeader;

                    //If a frameHeader is required for each accessUnit then prepend it here
                    if (frameHeader != null)
                        depacketizedAccessUnit = Enumerable.Concat(frameHeader, depacketizedAccessUnit);

                     //Add padding if required..
                    ////.Concat(Enumerable.Repeat<byte>(byte.MinValue, max - offset))

                    accessUnits.Add(auIndex, depacketizedAccessUnit); 
Lemme know!
Feb 10, 2015 at 10:01 PM
Edited Feb 10, 2015 at 10:11 PM
yes we are talking about the same frame header.

I changed my camera sample rate to 8000hz actually. The bit rate is 32Kbits/s but i don't think we have to worry about that, do we ? so now I have FF F1 6C 40 which specifies a sampling frequency of 8000 according to the document and a channel configuration of 1

Also i am not sure if it should be F9 for mpeg 4 or F1 for mpeg 2

I am not entirely sure if one needs to add the ADTS header for each access unit, we can certainly try.
Would that make sense if you look at this ?

Could you also elaborate where the length of the CTS, DTS delta and auxDataSize come from ?
Feb 10, 2015 at 11:07 PM
Edited Feb 10, 2015 at 11:22 PM
Bit rate is essentially the number of bits of encoded audio data produced per second, which is separate from the sampling rate at which the audio was sampled.



It would depend on the media for mpeg 2 or mpeg 4 indication.

The SDP usually outlines this :

'a=rtpmap:97 mpeg4-generic/32000/1'

RFC3640 is for MPEG4 streams.

The differences are outlined here.


The CTS, DTS and other data come from the SDP or otherwise (whatever describes the media).

I could make an overload of 'Depacketize' which reads the values from the fmtp or rtpmap lines in a given MediaDescription, parses them and passes them for the developer, I just didn't get around to it yet.

a=rtpmap:97 mpeg4-generic/8000/1
a=fmtp:97 streamtype=5; profile-level-id=15; mode=AAC-hbr; config=1588; sizeLength=13; indexLength=3; indexDeltaLength=3; profile=1; bitrate=32000;

Your media description doesn't explicitly specify those parameters which would leave them to the default indicated by the profile I assume.
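Pulling those values out of the fmtp line is straightforward. A minimal sketch, assuming the 'a=fmtp:<pt> name=value; ...' shape shown above; FmtpParameters is a hypothetical helper, not part of the library, and falling back to 0 for a missing length parameter is my reading of the RFC 3640 defaults:

```csharp
using System;
using System.Collections.Generic;

public static class FmtpParameters
{
    // Splits 'a=fmtp:97 name=value; name=value;' into a name to value map.
    public static Dictionary<string, string> Parse(string fmtpLine)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        if (string.IsNullOrEmpty(fmtpLine)) return result;

        int space = fmtpLine.IndexOf(' ');
        if (space < 0) return result; // no parameters after the payload type

        foreach (string part in fmtpLine.Substring(space + 1).Split(';'))
        {
            string trimmed = part.Trim();
            int eq = trimmed.IndexOf('=');
            if (eq <= 0) continue; // skips empty entries such as the one after a trailing ';'
            result[trimmed.Substring(0, eq).Trim()] = trimmed.Substring(eq + 1).Trim();
        }
        return result;
    }

    // Returns the parameter as an integer, or fallback when it is missing or not numeric.
    public static int GetInt(IDictionary<string, string> parameters, string name, int fallback)
    {
        string text;
        int parsed;
        return parameters.TryGetValue(name, out text) && int.TryParse(text, out parsed) ? parsed : fallback;
    }
}
```

For the line above this yields sizeLength=13, indexLength=3 and indexDeltaLength=3, while CTSDeltaLength, DTSDeltaLength and auxiliaryDataSizeLength come back as the fallback of 0.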

The blog doesn't really tell me anything about interleaving, which is where my question lies.

E.g. if you have 1 accessUnit then it really doesn't matter.

In your sample (capture) however there seem to be 'interleaved' units, which means more than 1 frame's data is contained in a single Payload.

So, essentially the AddADTSHeaderToPacket function would probably have to be called with the same information as given to Depacketize for everything to work as expected, all that data comes from the SDP (fmtp or rtpmap lines).
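The config value on the fmtp line already carries what the ADTS header needs. A minimal sketch that decodes the first two bytes of it as an MPEG-4 AudioSpecificConfig (5 bit object type, 4 bit frequency index, 4 bit channel configuration); AudioConfig is a hypothetical name, and the escape values (object type 31, frequency index 15) are deliberately not handled:

```csharp
using System;

public static class AudioConfig
{
    // configHex is the hex string from the SDP, e.g. "1588".
    public static void Decode(string configHex, out int objectType, out int frequencyIndex, out int channelConfiguration)
    {
        if (configHex == null || configHex.Length < 4)
            throw new ArgumentException("Need at least 2 bytes of config.");

        int b0 = Convert.ToInt32(configHex.Substring(0, 2), 16);
        int b1 = Convert.ToInt32(configHex.Substring(2, 2), 16);
        int bits = (b0 << 8) | b1;

        objectType = bits >> 11;                    // top 5 bits
        frequencyIndex = (bits >> 7) & 0x0F;        // next 4 bits
        channelConfiguration = (bits >> 3) & 0x0F;  // next 4 bits
    }
}
```

For config=1588 this gives object type 2 (AAC LC), frequency index 11 (8000 hz) and channel configuration 1, which lines up with the rtpmap line mpeg4-generic/8000/1.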

In closing,

FF F1 6C 40 is your header, it specifies the correct profile and rate.

The 13 bit length of the single frame (header included) starts in the low 2 bits of that last octet and runs through the next 2 octets.

The AAC Frame is defined by the [Frame Header, Aux Data] and usually preceded by ADTS Headers where the profile requires.

In the case of more than 1 access unit I believe that there must be another frame header and (any aux data) followed by the next access unit.

The header then only needs to be generated again if the length of the access unit is different.

So we have in total:

'ADTS Header'
Followed by
AU Header

Hopefully that clears it up!

I will update the code soon!
Feb 10, 2015 at 11:20 PM
yes definitely clears it up, thanks...

I understand the bit rate but what i meant is that it is not required for the adtsheader it seems.
I've actually parsed the fmtp to retrieve the sizeLength, indexLength and indexDeltaLength and pass them to the Depacketizer as follows:

hAudio.Depacketize(true, sizeLength, indexLength, indexDeltaLength, 0, 0, 0, false, false);

everything else stayed 0 since there were no further info from the SDP.
Feb 11, 2015 at 1:00 AM

I updated the code!

Cool, you should post the code so I can include it with

'TestRFC3640Frame' as part of the example.

I will work on the overloads once we are sure everything is working!

Keep me updated!
Marked as answer by juliusfriedman on 2/10/2015 at 5:00 PM
Feb 11, 2015 at 1:12 AM
thanks - this version is however behaving like the last release, it does not always get to RtpFrameChanged so i wasn't able to test properly.
I will apply the changes to the last working version however as i'm anxious to see if we can get everything else working.
Please let me know what you need so you can address the RtpFrameChanged issue.
Feb 11, 2015 at 1:28 AM
here is the latest file that the latest RFC3640 code generated


it seems that the length of each aac frame between the adts header vary from 16 to 20 something bytes which i think is too small
most of the working samples i've seen have frames of length of about 160 bytes
Feb 11, 2015 at 1:30 AM
Do you see any output in the console such as 'Unknown Packet or Incompatible Packet' ?

Your wireshark capture looks good the problem must be with some logic I added recently.

Does UDP work as expected?

110920 adds more logging points.

Can you post the output if any.

I will try to make heads or tails based on that.

I may need to add a few more points to narrow it down.

Feb 11, 2015 at 1:35 AM
Thinking and knowing are two different things.

Look at my results...

Index 96 - 15 Bytes

Index 320 - 110 Bytes

Index 448 - 156 Bytes

Index 576 - 26 Bytes

Index 608 - 141 Bytes

Especially with that small of a sample rate you aren't going to get a lot of data for each sample.... I would say more than 80 bytes for single channel 8000 hz sample is wayyy too much...

I will take a look at the file tomorrow probably as I am about to take a break for a few hours at least.

See if you can track down the issues with the event and I will go from there!

Feb 11, 2015 at 1:36 AM
i do not see any output other than 'The thread 0x3274 has exited with code 259 (0x103).' which probably won't mean much to you.
UDP does not seem to work either.
Feb 11, 2015 at 1:40 AM
Edited Feb 11, 2015 at 1:42 AM
UDP could be nat, if you post a wireshark capture of you trying to use udp I will see if I can tell from that.

Are you seeing the message you cited in the output window of the debugger? Are you using a DebuggingLogger?

You have to attach a Logger. If you're using your own code I don't know where the best place to do that is, but probably after you connect.

If you're using an RtspClient, it already has a logger property which can be set after creation.

You can then wait for the Play event to set the logger on the RtpClient.

I do have a unit test in the Test solution which shows all of this, 'TestRtspClient', and it creates the logs for you automatically, so you can either dump the console output to a file with 'Test.exe >> myOutput.txt' or use a separate file logger instance.

Marked as answer by juliusfriedman on 2/10/2015 at 5:41 PM
Feb 11, 2015 at 2:23 AM
I am seeing it in the output window of the debugger.

I attached a logger as follows in my own code:

System.IO.FileInfo rtspLog = new System.IO.FileInfo("rtspLog" + DateTime.UtcNow.ToFileTimeUtc() + ".log.txt");
Media.Common.Loggers.FileLogger logWriter = new Media.Common.Loggers.FileLogger(rtspLog);
//Attach the logger to the client
client.Logger = logWriter;
right after client.IsConnected is executed but i don't seem to see any logs at all.
Feb 11, 2015 at 2:44 AM
Edited Feb 11, 2015 at 2:44 AM
The logs would be found in the FileInfo you specified once created (if you can create the file)

I don't know where IsConnected gets executed in your code.

I have an example on how it should be called in my code which is via an event.

This is from Program.cs which is in the Tests solution.

connectHandler = (sender, args) =>

But what you want is client.OnPlay

And then inside that you need to put the following code inside the handler:

client.Client.Logger = logWriter;

To attach it to the RtpClient.


.. I need to add some documentation but part of the problem here is that you don't seem to be understanding the API of the RtspClient.

If you can provide feedback on that it would probably be helpful.

Anyway, once you attached a logger you will only notice a file is created if you try to log something, try a test log.

logWriter.Log("This is my log");

That should be easy enough to verify.

There should be "This is my log" in the contents of the file, maybe I should also make a unit test for that so copy and paste can be as easy as it was intended to be :)

Anyway after all that you would be able to obtain the results of any log call by checking the source of the log, in this case that file.

Hopefully then I can try to diagnose what is happening and why.

It may also make sense to step through the function ProcessFrameData and determine whether it gets to ParseAndCompleteData, which would be easy enough to do by just setting a breakpoint there. If it gets hit, the problem is in HandleIncomingRtpPacket. If it doesn't, the problem is still in ProcessFrameData; put another breakpoint on reading the frameLength with ReadRFC2326FrameHeader and step line by line from there until you can find the issue.

Thank you!
Marked as answer by juliusfriedman on 2/10/2015 at 6:44 PM
Feb 11, 2015 at 4:02 AM
Edited Feb 11, 2015 at 4:16 AM
i had
client.Logger = logWriter as opposed to
client.Client.Logger = logWriter;

here's what the log indicated

Large Packet of 16388 for Channel 0
No Context for Channel 203
No Context for Channel 103
No Context for Channel 175
No Context for Channel 142
No Context for Channel 158
No Context for Channel 69
No Context for Channel 69
No Context for Channel 18
No Context for Channel 188
No Context for Channel 188
No Context for Channel 72
No Context for Channel 72
No Context for Channel 191
No Context for Channel 112
No Context for Channel 112
Feb 11, 2015 at 5:02 AM
Edited Feb 11, 2015 at 5:02 AM
I don't really have a keyboard right now.

It looks like you're using a buffer larger than the default.

That or a network error occurred.

If you could capture the same session with wireshark it would be helpful.

It seems that, as a result of the large packet and buffer, overlapping TCP segments caused a framing error, and because of that the stream didn't recover at the correct offset.

I'll fix the math for the large-frame issue in the next release.
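
To illustrate what "unframed" means here, a minimal sketch of RFC 2326 interleaved framing follows. Each frame on the TCP connection starts with '$', a one-byte channel and a two-byte big-endian length. If the reader lands even one byte off, an arbitrary payload byte gets read as the channel, which is the kind of thing that would produce "No Context for Channel 203" style entries. InterleavedSketch, TryReadHeader and Resync are hypothetical names and this is not the library's ProcessFrameData logic, just the idea under those assumptions.

```csharp
using System;

// Hypothetical helper, not the library's implementation.
static class InterleavedSketch
{
    // Tries to read one interleaved frame header at 'offset'.
    // Layout: '$' (0x24), 1 byte channel, 2 byte big-endian payload length.
    public static bool TryReadHeader(byte[] buffer, int offset, out byte channel, out int length)
    {
        channel = 0;
        length = 0;
        if (offset < 0 || buffer.Length - offset < 4) return false; // header incomplete
        if (buffer[offset] != 0x24) return false;                   // not on a frame boundary
        channel = buffer[offset + 1];
        length = (buffer[offset + 2] << 8) | buffer[offset + 3];
        return true;
    }

    // Scans forward for the next '$' followed by a channel number that was
    // actually negotiated (0..maxChannel). Returns -1 when none is found.
    public static int Resync(byte[] buffer, int offset, byte maxChannel)
    {
        for (int i = Math.Max(offset, 0); i + 4 <= buffer.Length; i++)
        {
            if (buffer[i] == 0x24 && buffer[i + 1] <= maxChannel) return i;
        }
        return -1;
    }
}
```

A '$' can also occur inside payload data, so a real implementation would want to validate the candidate further (for example by checking the RTP version bits after the header) before trusting the new offset.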
Marked as answer by juliusfriedman on 2/10/2015 at 9:02 PM
Feb 11, 2015 at 5:24 AM
here's the wireshark capture


let me know if you need anything else, thanks.
Feb 11, 2015 at 5:25 AM
Yes, what version works as expected?

Feb 11, 2015 at 5:37 AM
i believe the last version that worked was 110190
Feb 11, 2015 at 6:21 AM
I found an issue on header[3] for the ADTS header: I meant to use nFinalLength instead of packetlen, so header[3] should be as follows:
header[3] = (byte)(((channelConfiguration & 3) << 6) + (nFinalLength >> 11));

After making that change though, the AAC file that was generated had the first ADTS header as FF F1 6C 40 24 BF FC, which indicates an AAC frame length of 293 bytes.
However, looking at the AAC file, there are 287 bytes between the first and the next ADTS header for the AAC frame, so I'm trying to find out why it's off by 6 bytes.
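
For checking headers like this by hand, a small sketch that decodes the relevant ADTS fields may help. The frame length is a 13-bit field spread over bytes 3, 4 and 5, and it counts the header bytes as well as the payload. AdtsDecodeSketch and its method names are hypothetical and the sketch assumes the 7-byte header form without CRC.

```csharp
using System;

// Hypothetical helper for inspecting a 7-byte ADTS header (no CRC).
static class AdtsDecodeSketch
{
    // 13-bit frame length: 2 bits in byte 3, 8 bits in byte 4, 3 bits in byte 5.
    // The value includes the header itself.
    public static int FrameLength(byte[] header)
    {
        if (header == null || header.Length < 7) throw new ArgumentException("Need at least 7 header bytes.");
        return ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5);
    }

    // Sampling frequency index: 4 bits in byte 2.
    public static int FrequencyIndex(byte[] header)
    {
        return (header[2] >> 2) & 0x0F;
    }

    // Channel configuration: 1 bit in byte 2 and 2 bits in byte 3.
    public static int ChannelConfiguration(byte[] header)
    {
        return ((header[2] & 0x01) << 2) | (header[3] >> 6);
    }
}
```

Applied to FF F1 6C 40 24 BF FC this gives a frame length of 293, frequency index 11 (8000 Hz) and channel configuration 1. Since the field counts the 7 header bytes too, the number to compare against the payload bytes found between two headers is the field minus 7.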
Marked as answer by juliusfriedman on 2/11/2015 at 6:09 AM
Feb 11, 2015 at 2:09 PM
I'll check into both of these today.

I think when there is a small number of bytes missing it's due to padding.

I will keep you updated.

Feb 11, 2015 at 7:13 PM
Edited Feb 11, 2015 at 7:16 PM
Also just curious, what is the size of your RtpClient's buffer?

(Obtained by checking m_Buffer.Count)

I noticed a TCP segment which was missing around frame 3520.

Right after rtp.seq = 33140

Are you aware of what sequence number the issue started at?

I will have to probably add some more logging points to get all the information I need.

Was stepping through the code helpful? It should have revealed the issue almost immediately.

I will add some additional logging points and update the code.

I will keep you updated.
Feb 11, 2015 at 10:35 PM
i haven't had the chance to step through the code as i was looking at the rfc3640 on the existing working code, i'll probably do so later today.

also the m_Buffer.Count is 16384
Feb 11, 2015 at 10:41 PM
Try using the default buffer size (e.g. not specifying it) or as the example does specifying 0.

I imagine the issue will go away completely.

If it does then that's great I can probably test on my own with the same size buffer to determine what is happening but I would also like to address it if possible.

Just curious, why do you need such a large buffer?

I am going to make a release; it would probably be best to wait until after I do so before doing any further debugging.
Feb 11, 2015 at 11:33 PM
I actually do not need that big of a buffer, I probably specified it when i first started using your library without realizing i didn't need a large buffer.

i used it without specifying the buffer size but i do still get the same issue

Here's what the log indicates:

Large Packet of 8196 for Channel 0
No Context for Channel 72
No Context for Channel 123
No Context for Channel 79
No Context for Channel 254
No Context for Channel 87
No Context for Channel 97
No Context for Channel 16
No Context for Channel 52
No Context for Channel 15
No Context for Channel 168
No Context for Channel 127
No Context for Channel 160
No Context for Channel 72
Large Packet of 8196 for Channel 0
No Context for Channel 217
No Context for Channel 122
No Context for Channel 130
No Context for Channel 63
No Context for Channel 156
No Context for Channel 46
Feb 12, 2015 at 12:44 AM
It looks like the client has somehow become unframed.

I see that it's about 2 bytes off if that's the case.

If you can replicate this with Wireshark running and subsequently determine what the 'SequenceNumber' of all GetTransportContexts was I can try to look and see why the logic messed up by feeding the same packets to the client.

I should probably have put a Time in the log but you can also easily do that by deriving the logger.

E.g. TimedLogger could inherit FileLogger for now just to always indicate the time at the beginning of the line.

If you can't follow any of that can you send a diff of the RtpClient class or the file itself from the working version so I can compare it to the latest.


110925 is released, check that out and let me know if anything changes.
Marked as answer by juliusfriedman on 2/11/2015 at 4:54 PM
Feb 12, 2015 at 1:15 AM
110925 is running better, i am getting the RtpFrameChanged and am getting both streams.

However, there are a few 'HandleIncomingRtpPacket failed reception' entries in the log, which I've attached in the zip file along with the pcap. I am also getting a first chance exception (System.ArgumentException) that is probably occurring during the RFC3640 depacketization, but I will look at that a bit later to make sure this isn't coming from me first. I'll keep you updated there.

Here's the capture

Feb 12, 2015 at 1:16 AM
Looking at the logs and doing some rough math I made a fix for what I think the issue is @ 110926.

Let me know if I was right!
Marked as answer by juliusfriedman on 2/11/2015 at 5:18 PM
Feb 12, 2015 at 1:31 AM
I do have to take a break for the night but...

The Argument Exception would be interesting to see, get a stack trace if you can.

Looking at the newest capture and logs.. It seems ...

The first RR SR Compound packet was received, but you aren't sending anything so you don't need to handle the RR; you're looking for the SR. (The library does this for you anyway.)

The camera shouldn't really send the RR SR combo, I am gonna have to double check the RFC on that one but either way I don't think it's illegal :) But I will handle it better if that ends up being the problem.

You will also probably want to make sure you are using the latest RtspClient class.

I have tested with a bunch of Axis cameras before but I never really noticed that, I will have to try and replicate it locally if possible.

What model is that?

If nothing else step through the code as I indicated earlier which will reveal exactly why the packet is being deemed incompatible.

Thank you for your assistance in testing this!
Marked as answer by juliusfriedman on 2/11/2015 at 5:32 PM
Feb 12, 2015 at 5:26 PM
You were right! With 110926 I am no longer getting the 'HandleIncomingRtpPacket failed reception' message.
I will look at the exception today and see what's going on.

The camera model is a M1034-W and yes I am using the latest RtspClient class.
I can try to give you access to the camera if you want to try and replicate it as well.

Feb 12, 2015 at 9:12 PM
Edited Feb 12, 2015 at 10:33 PM
Only sometimes ;p

If you don't mind yes, but please email it to me so we know that the IP is not getting out there unless you don't care.

I just wanted to run some quick tests against it.

I also want to provide some information about what I was talking about for you to understand.

IMHO it should have sent a Compound Packet like this:


Instead it sent



Even if the SSRC is different in each packet MY interpretation is that the camera SHOULD NOT HAVE SENT TWO DIFFERENT TCP / UDP Segments.

The COMPOUND PACKET would have been






Even if every SSRC in each packet was different then the RULES state that the packets MUST BE 'stacked' as if they were a single packet.


It further states this:
An individual RTP participant SHOULD send only one compound RTCP
packet per report interval in order for the RTCP bandwidth per
participant to be estimated correctly (see Section 6.2), except when
the compound RTCP packet is split for partial encryption as described
in Section 9.1. If there are too many sources to fit all the
necessary RR packets into one compound RTCP packet without exceeding
the maximum transmission unit (MTU) of the network path, then only
the subset that will fit into one MTU SHOULD be included in each
interval. The subsets SHOULD be selected round-robin across multiple
intervals so that all sources are reported.
The difference here is that we are using RTSP to setup the RTP stream and Encapsulate it (tunnel it) through TCP.

In such a scenario each stream will have it's own virtual 'channel' (data [rtp] and control[rtcp])

On such a channel there are no rules about what packets can be present, e.g. can I send a packet to channel 1 with the identity of channel 2?

Well, yes you can. Whether it will be handled is up to how the software you're using works (the RTSP implementation).

As far as I know my software is the only implementation which works by looking up the ssrc even in TCP which allows for channels to overlap if desired.

This may be a problem if you (the receiver) have two separate address spaces which you expect data from and any of the addresses in one of the spaces overlap.


You have channel 0 - 1 for Rtp video and Rtcp, channel 2 - 3 for Rtp audio and Rtcp.

Technically you would also be able to have something like 0 - 1 for Video and 0 - 2 for Rtcp if you wanted, but supposedly not without using the new 'rtcp-mux' datum... or something proprietary... but I digress... Even in version 1.0 it's not really clear and it's up to the software, and I am trying to support ANY variety, so this is why I made this clear.

In such a case, if channel 0's ssrc was 1234 and channel 2's ssrc was ALSO 1234 when the lookup occurred, you would have an overlap. There are various checks in my software to TRY and prevent this, but the FACT is that if it is overlooked and the RtpClient is used in such a manner, you have to handle this by ensuring that each channel uses a unique ssrc.

You're not doing anything like that (I don't think anyway) and as a result you don't have to worry. I will have to add some more tests to my own software to ensure this is realized and working, and it is only an issue when multiple parties are being served media under a single connection and only from my RtspServer.

In this case, your use case, RTSP interleaved, I suppose that since each packet was for a different 'SSRC' there are also two separate report intervals, and hence why it occurs.

Why the packet is a RR SD and not another SR SD is still a bit beyond me, the RR doesn't include the necessary information for NTP time synchronization...

Why it sends the RR at all is... interesting... but not invalid.

My implementation only sends a SR if anything was sent and only sends a RR if anything was received, the SD is sent when ever the bandwidth allows and they are all sent in a single packet.
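
As a rough illustration of what "stacked" means for a compound packet, here is a minimal sketch that walks the RTCP packets inside one buffer using only the common 4-byte header (version/padding/count, packet type, 16-bit length in 32-bit words minus one). RtcpCompoundSketch and PacketTypes are hypothetical names; this is not how the library parses RTCP, only the general layout from RFC 3550.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not the library's RTCP parser.
static class RtcpCompoundSketch
{
    // Returns the packet type of every RTCP packet stacked in one compound packet.
    // Stops at the first header that is not version 2 or that overruns the buffer.
    public static List<int> PacketTypes(byte[] compound)
    {
        var types = new List<int>();
        int offset = 0;
        while (offset + 4 <= compound.Length)
        {
            int version = compound[offset] >> 6;
            if (version != 2) break;

            // Length is expressed in 32-bit words minus one, header included.
            int words = (compound[offset + 2] << 8) | compound[offset + 3];
            int size = (words + 1) * 4;
            if (offset + size > compound.Length) break;

            types.Add(compound[offset + 1]); // 200 = SR, 201 = RR, 202 = SDES
            offset += size;
        }
        return types;
    }
}
```

Feeding it an RR with no report blocks (8 bytes) followed by an empty SDES (4 bytes) yields the types 201 and 202 from a single buffer, which is the single-packet arrangement described above.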

In short, there is nothing really wrong. It would just be good, since there is a camera available, to be able to test a few things, but I don't NEED to, as I can replicate it without a camera using my server anyway and eventually with the cameras I have access to.

It would however help with Depacketization testing :)

Hopefully all of that made sense, Let me know if you have any questions and I will be more than happy to go over anything.

If you can make the camera available let me know for when and how long and if not don't worry!

On a side note, what is your project with the Audio Data? I had quite a few interesting ideas with audio as well as video which I don't think have been realized yet, and I would be willing to share ideas if your project is in the same genre and you so desire.

Also, if you can provide any feedback about the library it would be very helpful, e.g. on its design, API, ease of use...

It doesn't have to be professionally written or anything but would definitely help with future development of documentation and API choices.

It shows me what people understand about the library and what they don't and how I can improve that.

Thank you again for all of your testing and hard work!
Marked as answer by juliusfriedman on 2/12/2015 at 2:33 PM
Feb 12, 2015 at 11:33 PM
Edited Feb 12, 2015 at 11:52 PM
I just emailed you the ip address of the camera. Please let me know if you have any issues accessing it and I'll try to correct them.

it is also set to AAC currently.

I need to digest all the info you've described above in the meantime :) and yes, i would be more than happy to provide feedback about the library, give me a couple of days :)

one thing i can tell you first though about this library that's awesome is its support ! :)
Marked as answer by juliusfriedman on 2/12/2015 at 4:45 PM
Feb 13, 2015 at 12:44 AM
Well thanks :)

I got your email.

I just made a commit so I am taking a break for the night.

There are others, if you read my article which is linked on the project home page I explain some of them..

Others are




None of which support TCP AFAIK and none of which have anywhere near the functionality of this implementation but they do have some interesting differences... e.g. have more events or less properties etc..

I will probably not get a chance to test with the camera tonight, but definitely tomorrow.

I will keep you updated and please also do the same for me!

Thanks again!
Marked as answer by juliusfriedman on 2/12/2015 at 4:45 PM
Feb 13, 2015 at 1:37 AM

I also just thought that if you really wanted to help test that you could actually host the stream under the rtsp server and then make that link available.

That is something I don't currently have the ability to do (Yet) but I am working on it.

You would also be able to add different layers of authentication without relying on making the camera itself public.

The server would instead be made public and you would also be able to control viewing times etc.

But that's only if you want to.

Anyway thanks again!

Marked as answer by juliusfriedman on 2/12/2015 at 5:39 PM
Feb 13, 2015 at 8:11 PM
A new release is out!

I am going to take a break and then re-focus on testing that camera.

I will let you know if I run into any issues.
Feb 13, 2015 at 8:25 PM
Sounds great! Anxious to see what you find on the depacketizer; I am taking a look at it today as well.
Good idea on hosting the stream under the rtsp server, I'll have to try it once this is working.
Feb 14, 2015 at 4:57 AM
Didn't get a chance to work on that yet...

110949 is another release..

I may finally have time to look at the depacketization now..

I am gonna take a break and then I will email you if I have issues connecting to the camera!

Feb 14, 2015 at 7:30 AM
Edited Feb 14, 2015 at 7:30 AM
In other news, It seems those values are in Network Byte Order.

I will have to add the overloads to read bits in reverse endian...
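
For reference, a minimal sketch of reading bit fields most significant bit first, which is what network byte order amounts to for the RFC 3640 AU header section (a 16-bit AU-headers-length followed by AU-size and AU-Index fields whose widths come from sizeLength and indexLength). BitReaderSketch is a hypothetical class and not the library's Binary class or its new overloads.

```csharp
using System;

// Hypothetical MSB-first bit reader, not the library's Binary class.
sealed class BitReaderSketch
{
    private readonly byte[] data;
    private int bitPosition;

    public BitReaderSketch(byte[] data, int byteOffset)
    {
        this.data = data;
        this.bitPosition = byteOffset * 8;
    }

    // Reads 'count' bits (0..32), most significant bit first.
    public uint Read(int count)
    {
        if (count < 0 || count > 32) throw new ArgumentOutOfRangeException("count");
        if (bitPosition + count > data.Length * 8) throw new InvalidOperationException("Not enough data.");

        uint value = 0;
        for (int i = 0; i < count; i++, bitPosition++)
        {
            // Bit 7 of each byte comes first on the wire.
            int bit = (data[bitPosition >> 3] >> (7 - (bitPosition & 7))) & 1;
            value = (value << 1) | (uint)bit;
        }
        return value;
    }
}
```

With the bytes 00 10 08 F0 and sizeLength=13, indexLength=3, Read(16) returns 16 (bits of AU headers), Read(13) returns an AU-size of 286 and Read(3) returns an AU-Index of 0.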


I didn't get a chance to tinker yet, but I released 110950.

Hopefully in another little bit I can get some more done!
Marked as answer by juliusfriedman on 2/13/2015 at 11:30 PM
Feb 14, 2015 at 8:05 AM
Okay, I was able to add the Reverse reading methods and that is done.

See 110951.

I think I will also need to add Reverse writing methods because your values are being written in the opposite byte order I think.

I will break for a bit!

Keep me updated!
Feb 15, 2015 at 6:02 AM
See 110952.

I have fixed a few bugs in the Binary classes :)

I have to take a break for the night but tomorrow I should have enough time to look at the camera (if I am semi lucky)

Take care and keep me updated!
Marked as answer by juliusfriedman on 2/14/2015 at 10:02 PM
Feb 16, 2015 at 5:56 PM
let me grab your latest code and i'll give it a run ! thanks !
Feb 17, 2015 at 8:22 AM
https://net7mma.codeplex.com/SourceControl/latest is released.

This thread is getting quite long :)

Any updates on AAC? Was I right about the endian?
Feb 17, 2015 at 5:25 PM
i just tried the latest, there's still issues with playing back AAC, i'm going to look at the generated file now and its header.
Feb 17, 2015 at 5:31 PM
Cool, I'm not doing much besides thinking until tomorrow or the next day.

Please definitely keep me updated though.

Also I / you probably need a proper WriteBits method especially for creating the header.

I will see about putting something in my next release.

Technically with SetBit you can do it too and that probably should be used until there is better functionality.

Feb 17, 2015 at 5:57 PM
Edited Feb 17, 2015 at 6:00 PM
i think the issue is still there. In one of the samples, the aac header indicates 143 bytes for the frame length but the data length between the first and next header is 138 bytes.

btw were you able to access the camera ?
Feb 17, 2015 at 6:01 PM
Edited Feb 17, 2015 at 6:09 PM
What issue exactly?

You do realize that the data is a bit stream once depacketized?

E.g. any data left over from the previous depacketized result in the Buffer will have bits added which may not belong to the AAC stream, correct?

How are you handling that?

Please note, I probably also need to do the same when writing to the buffer, but can obviously only do so for whole bytes, which is why the RFC indicates that the stream must be aligned.

E.g. if you notice I violated that when writing the headers or otherwise please let me know, but the function expects the data to conform to the RFC from which it was written and also to the standard for the codec.

I could allow for multiple calls to Depacketize to augment the Buffer and keep the bit offset of the last byte, but what is the use case?

If the frame is not completely received how does this help?

Possibly you're asking that I also return the last bit offset written when depacketized?
Marked as answer by juliusfriedman on 2/17/2015 at 10:01 AM
Feb 17, 2015 at 6:08 PM
The issue is that the frame length being written to the AAC header, which tells the decoder how much data to expect, does not match the actual data length.
I hope i'm being clear in trying to describe this.

Right now, I am just appending after the depacketization, would the left over on the next frame still require a separate header for what's left or does it have to be appended to the previous frame ? If that's the case, i'm not handling that as i should then.
Feb 17, 2015 at 6:15 PM
Edited Feb 17, 2015 at 6:15 PM
that's why I asked about the endian.

please clarify the line number, so I can check expected vs actual results.

Please keep in mind im on my mobile (today) and thanks for assistance testing and bringing things up I do appreciate it!
Marked as answer by juliusfriedman on 2/17/2015 at 10:15 AM
Feb 17, 2015 at 8:22 PM
I think the difference in bytes is in line 484 of the RFC3640Media class where you are concatenating the accessUnitHeader with the accessUnitData.

The auSize is passed in the CreateADTSHeader so when the frame length is calculated, it doesn't take into account the extra bytes that are coming from the accessUnitHeader.

Was that what you were asking ?
Feb 17, 2015 at 9:12 PM
Well I just wanted to know what the issue was as I don't yet really use that class and I hoped everyone would benefit from you also explaining why the issue was occurring.

I will get to this eventually, probably when I handle packetization or add the write bits methods.

If you can post up some code which fixes it before then just keep me updated.

And thank you also, please don't forget feedback ;-)
Marked as answer by juliusfriedman on 2/17/2015 at 1:12 PM
Feb 17, 2015 at 9:18 PM
i tried to add the extra bytes to the adts header, but there's still silence when trying to play with VLC.

have you been able to access the camera ?
Feb 17, 2015 at 9:33 PM
I honestly don't think I ever tried.

I got caught up with other things and have as of yet since not had the time to try it out.

I will eventually but just not today until later tonight or tomorrow.

If you needed something sooner axis has a managed decoder wrapper which works with my server and library.
Marked as answer by juliusfriedman on 2/17/2015 at 1:33 PM
Feb 17, 2015 at 9:44 PM
i do see irregular packet and incompatible packet error messages in this latest version.

Media.Rtp.RtpClient-3a3dfb01-3255-4b9b-96f9-b29b73e0865d@ProcessFrameData - Irregular Packet of 50680 for Channel 0 remainingInBuffer=752

3a3dfb01-3255-4b9b-96f9-b29b73e0865dProcessFrameData - Incompatible Packet frameLength=7435 for Channel 2 remainingInBuffer=1188
Feb 17, 2015 at 9:52 PM
Edited Feb 18, 2015 at 12:24 AM
The logs are misleading when you don't understand the traffic on the wire. Once a large frame is encountered, the rest of its data may possibly arrive later, and as a result it needs to be completed in the interleaved data event if desired.

I will include the endpoint and payload bytes information in the future if that's helpful (In the logs), any feedback you have is helpful as I previously stated.

I would ask for an accompanying filtered wireshark capture to properly explain the traffic.

It looks like a large packet was dropped and the source sent a rtcp fb packet to resend the data.

Either that or maybe you missed a tcp segment?

I dunno without an accompanying wireshark capture and complete log from the library not excerpts.

I dunno how you have everything set up, protocol, channel and otherwise, and without that I can't properly advise.

Marked as answer by juliusfriedman on 2/17/2015 at 1:52 PM
Feb 17, 2015 at 10:22 PM
i'll send you the captures + logs shortly. Where can i find the managed decoder wrapper ? The closest i could find is a media viewer and parser, the parser records in .bin and the viewer plays it back. It writes the sample type, flags and start and stop time within their SDK along with the buffer length but that wouldn't work with your library.
Feb 17, 2015 at 11:01 PM
I forgot to tell you about the exception that we talked about a few days ago (first chance exception, System.ArgumentException).

i set the debugger to throw and stop at the very first chance exception and it is stopping at offset += auSize but the exception is in the RtpFrameChanged (entry with the same key already exists)
Feb 18, 2015 at 1:53 AM

Why wouldn't it work with my library?

What research have you done to support that?

I don't think its wise to mix questions and statements especially when also stated (albeit in a contradictory aspect) that you can't even locate the managed wrapper.

Two, what about a first chance exception? Is it possible the TCP segment was retransmitted? Just because the exception occurred doesn't mean it's not being handled. See Thread.Abort.

I still really don't have time yet but if you could respond I could have a better understanding of what I can do to help you.
Marked as answer by juliusfriedman on 2/17/2015 at 5:53 PM
Feb 18, 2015 at 3:55 AM
I think you misunderstood my reply. Yes i cannot locate the managed decoder wrapper hence my question earlier.

What I did locate from Axis was what's called the Media Parser SDK, which I have installed, played with and tested. You have to start the parser, which establishes its own RTSP connection. You'll then have to add an event handler callback for both audio and video to receive each parsed frame. The handler writes the data to disk with starttime, endtime, sampletype and sampleflags as parameters.

It is defined as follows:

OnVideoSample(int cookieID, int sampleType, int sampleFlags, ulong startTime, ulong stopTime, object SampleArray)


OnAudioSample(int cookieID, int sampleType, int sampleFlags, ulong startTime, ulong stopTime, object SampleArray)

The sampleType, flags, start and stop time, followed by the buffer, are written to the file first for the video and the audio frames. The Axis viewer uses that format to play back the frame. It reads the .bin file and parses it to play. The actual parsing is done behind the scenes.

About the exception, you were curious about what was causing it after a few changes within the depacketizer but I am just indicating yes the tcp segment was retransmitted and was a fyi.
Feb 19, 2015 at 2:38 AM
Well how is that RTSP connection not compatible with my library again?

The media format of their SDK is theirs and I am not very familiar with it (besides a few times I played with it) and have intentions of supporting it.

Axis has support, and their cameras come with support so why you would be asking me is another question.

My library fully supports every camera which is compliant with the spec (and even ones which are not compliant in most cases).

I would suspect that now that you have the samples in their depacketized form you can play them by providing them to a decoder from the bin file their SDK created for you. You can probably also use it to compare the results of what my class gives you versus what is in the bin file for 'correctness'.

Lastly, I think your issue with the AAC is that in my class I put the size of each au in the ADTS header function you have provided.

Besides the point that the function doesn't write at what could possibly be an unaligned offset in the bit stream it also possibly needs to give the entire data size of the 'frame' which is technically the 'summation' of all contained access units and not each individual one, hence why I originally didn't provide a CreateADTS header function.

But now that we have that out of the way, I will go ahead and note that I do believe you don't usually need the ADTS header until it's time to go to the decoder, at which point you would insert it based on the entire frame you are giving it and not just a single access unit.

If you could play with that, I will go ahead and fix the class when you have verified it works or when I get the time to do so myself.
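
A minimal sketch of building one ADTS header for the whole depacketized payload, rather than one per access unit, could look like the following. AdtsBuildSketch and CreateHeader are hypothetical names and not the CreateADTSHeader function from the thread; the sketch assumes the 7-byte MPEG-4 header without CRC, one raw data block per frame, and the buffer fullness field set to all ones.

```csharp
using System;

// Hypothetical helper, not the CreateADTSHeader function discussed in the thread.
static class AdtsBuildSketch
{
    // Builds a 7-byte ADTS header for 'payloadLength' bytes of AAC data that follow it.
    public static byte[] CreateHeader(int audioObjectType, int frequencyIndex, int channelConfiguration, int payloadLength)
    {
        int frameLength = payloadLength + 7; // the length field counts the header itself
        if (payloadLength < 0 || frameLength > 0x1FFF) throw new ArgumentOutOfRangeException("payloadLength");

        int profile = audioObjectType - 1;   // ADTS stores the object type minus one in 2 bits
        if (profile < 0 || profile > 3) throw new ArgumentOutOfRangeException("audioObjectType");

        byte[] header = new byte[7];
        header[0] = 0xFF; // sync word, high 8 bits
        header[1] = 0xF1; // sync word low 4 bits, MPEG-4, layer 0, no CRC
        header[2] = (byte)((profile << 6) | ((frequencyIndex & 0x0F) << 2) | ((channelConfiguration >> 2) & 0x01));
        header[3] = (byte)(((channelConfiguration & 0x03) << 6) | ((frameLength >> 11) & 0x03));
        header[4] = (byte)((frameLength >> 3) & 0xFF);
        header[5] = (byte)(((frameLength & 0x07) << 5) | 0x1F); // low 3 length bits, then buffer fullness
        header[6] = 0xFC; // rest of buffer fullness, one raw data block
        return header;
    }
}
```

With object type 2, frequency index 11, one channel and a 286-byte payload this reproduces the FF F1 6C 40 24 BF FC header quoted earlier in the thread, so the payload length passed in has to be the total of everything written after the header.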

On to the first chance exception...

I was curious why you were experiencing it, then you enlightened me that you debug on first chance exceptions and now I understand.

You have not derived the RtpClient into an application client.

You're probably just using the 'RtpClient' verbatim, copy and paste, right from my unit tests.

Which is fine, I intend that some people will utilize it just as they do the TcpClient and UdpClient classes which Microsoft provided and I support that paradigm as well but you must realize that without knowing what you expect or what your issue is I cannot help you.

It might not be clear to some people how the library operates or even how to handle exceptions but I will assist you if I can especially considering you have mostly been succinct.

Please do not however assert compatibility issues which don't exist in an attempt to foster my time and resources or expect me to review your work if you cannot provide the feedback I asked you for originally.

I have various other responsibilities and unfortunately even though I set the bar high by responding in the middle of the night sometimes I cannot be expected to be at the beck and call of everyone for free 24 hours a day 7 days a week.

I also cannot be bothered to be rushed into providing features which others will not help to even determine what is not working for them.

If you want the exception to stop, handle it: determine if the packet is contained in the current or last frame. Determine if its size is greater. Do whatever you want, because really, as Colin Perkins writes in one of his famous books about jitter buffers, once the packet has been received ANY other packet with the same sequence number is considered a DUPLICATE. There is an RFC with a special format to handle this:


While it would probably be possible to do I don't see a benefit from doing this, the developers still have access to the packet level events and can monitor them just as my example does and if required in their end system can update frames before decoding as required.

The fact it happens in the class at the same time is coincidence. The problem is either that the auIndex is contained twice in the data, which is invalid, or that it was retransmitted and the segment overlapped the previous one.

There can also be encoding 'errors' in the camera, which can cause it to throw an exception and recover (albeit quickly), which can cause a retransmit to occur.
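
For what it's worth, the "entry with the same key already exists" ArgumentException is what the sorted generic collections throw from Add when a key is inserted twice. A minimal sketch of guarding against that in application code, assuming packets are keyed by sequence number, is below. DuplicateSketch and TryAddPacket are hypothetical names and this says nothing about how RtpFrame stores packets internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the library.
static class DuplicateSketch
{
    // Adds a packet keyed by sequence number. Returns false for a duplicate
    // instead of letting Add throw an ArgumentException.
    public static bool TryAddPacket(IDictionary<int, byte[]> packets, int sequenceNumber, byte[] payload)
    {
        if (packets.ContainsKey(sequenceNumber)) return false; // retransmission or duplicate
        packets.Add(sequenceNumber, payload);
        return true;
    }
}
```

Whether to drop the duplicate, as here, or to replace the earlier copy is a policy choice for the application.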

There can be network errors along the route which cause segments to be dropped.

There are many things which can go wrong.

Where is the problem in my library?

Where is the feedback I asked for?

How can I help you?

I will take a look at your camera and I will complete the RFC3640 class, but since you started this thread and I have been nothing but helpful and supportive, it would be nice for you to take responsibility where it is yours: test what I have given you, both the information and the debugging suggestions, post your results, and then we can go from there.

Please don't hesitate to ask questions, but please also do not post incorrect information; ask questions first if you need to.

Now I don't have much more time tonight but I will tomorrow or the next day :)

Marked as answer by juliusfriedman on 2/18/2015 at 6:38 PM
Feb 20, 2015 at 8:17 PM
Ok, let me address your comments/suggestions about AAC first since this is what this thread started with.

You've indicated that you don't believe that you usually need the ADTS header until it is time to go to the decoder.

Based on that, I have removed the following line from the RFC3640 class:

depacketizedAccessUnit = Enumerable.Concat(CreateADTSHeader(profileId, frequencyIndex, channelConfiguration, auSize + accessUnitHeader.ToArray().Length), depacketizedAccessUnit);

By doing that, is it correct that it will no longer create a header for each single access unit?

Depacketize is being called as follows:
   Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame hAudio = new Media.Rtsp.Server.MediaTypes.RFC3640Media.RFC3640Frame(frame);
   hAudio.Depacketize(true, 2, 1, frequencyIndex, sizeLength, indexLength, indexDeltaLength);
When the Depacketize function returns, hAudio.Buffer.Length should contain the size of all contained access units.
Is that correct?

I then inserted the ADTS header for each of those frames containing all the access units before writing to file.

The result that I got from the changes was that there was silence playing.

Please let me know if these were the tests/results that you were asking for.
Feb 20, 2015 at 10:47 PM
I hope you also intend on contributing a patch after you have everything working.

With that being said:

The else branch associated with the removed code should also have been commented out.

The parent if statement (of the removed else) evaluates whether 'frameHeader' is not null and appends in there. The naming could probably have been better chosen, but its logic IS still completely valid, especially if a previously incomplete access unit was allowed in the payload.

'previouslyIncompleteAccessUnitWithAuxHeaderAndData' is a possible candidate, and I will address that if needed later.

I will need to understand what RFCs you are working with and possibly some device configuration settings related to MTU and MSS, and HOW or WHY the data has become 'incomplete' in the first place. (With the accompanying information I asked be included above.)

To be completely correct, custom padding besides just '0' should probably be allowed for, although its use is only applicable in niche cases. It is sometimes referred to as 'trailer' by other libraries, and I will also address that when required.

You might not want to include the data for incomplete access units; you may want to discard them. I don't know unless I can understand the formerly asked portion of my question and get the information I require, and as stated I will also address that if required.

Finally, since I guess that optimizations are at a minimum here, you would then need something equivalent to

'CreateADTSHeader(profileId, frequencyIndex, channelConfiguration, accessUnits.Sum(au=> au.Value.Count()))' since the header generated is now removed.

You would achieve this by using the same values after Depacketize returned and instead of 'accessUnits' you would use 'Buffer.Length' which was populated in Depacketize.
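To make the header arithmetic concrete, here is a minimal sketch in Python rather than the library's C#. It assumes a 7-byte ADTS header without CRC, that the profile value is the MPEG-4 audio object type (2 for AAC LC), and that the length passed in is the payload size in bytes, such as Buffer.Length after Depacketize returns. The function name echoes CreateADTSHeader from this thread, but the body is illustrative and not the library's implementation. Note that the 13-bit frame length field counts the header bytes as well as the payload.

```python
ADTS_HEADER_SIZE = 7  # bytes, when protection_absent = 1 (no CRC)


def create_adts_header(audio_object_type, frequency_index,
                       channel_configuration, payload_length):
    """Build a 7-byte ADTS header for payload_length bytes of AAC data."""
    if not 1 <= audio_object_type <= 4:
        raise ValueError("ADTS can only signal audio object types 1..4")
    if not 0 <= frequency_index <= 12:
        raise ValueError("sampling frequency index must be 0..12")
    if not 0 <= channel_configuration <= 7:
        raise ValueError("channel configuration must be 0..7")

    # The frame length field covers the header itself plus the payload.
    frame_length = payload_length + ADTS_HEADER_SIZE
    if payload_length < 0 or frame_length > 0x1FFF:
        raise ValueError("frame length does not fit in 13 bits")

    profile = audio_object_type - 1  # 2-bit field stores object type minus one
    header = bytearray(ADTS_HEADER_SIZE)
    header[0] = 0xFF  # syncword, high 8 bits
    header[1] = 0xF1  # syncword low 4 bits, MPEG-4, layer 0, no CRC
    header[2] = (profile << 6) | (frequency_index << 2) | (channel_configuration >> 2)
    header[3] = ((channel_configuration & 0x3) << 6) | (frame_length >> 11)
    header[4] = (frame_length >> 3) & 0xFF
    header[5] = ((frame_length & 0x7) << 5) | 0x1F  # buffer fullness 0x7FF, high bits
    header[6] = 0xFC  # buffer fullness low bits, one raw data block
    return bytes(header)
```

For example, `create_adts_header(2, 4, 2, 100)` describes AAC LC at frequency index 4 (44100 Hz) in stereo with a 100 byte payload. Whether a given decoder accepts one header in front of several concatenated access units is a separate question that this sketch does not settle.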

Now, I don't have WriteBits overloads yet but I do have SetBits which is enough (when combined with GetBits) to allow offset writing into the created header.

You will also need to modify the CreateADTSHeader function to achieve that, using the header variable combined with SetBits to write whatever 'bits' you want at the appropriate offset.
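For readers unfamiliar with offset-based bit writing, the following Python sketch shows the idea. The `set_bits` and `get_bits` helpers here are hypothetical stand-ins written for this example; their signatures are assumptions and may not match the library's SetBits and GetBits. Bits are numbered from the most significant bit of the first byte, which is how the ADTS fields are laid out.

```python
def set_bits(buffer, bit_offset, bit_count, value):
    """Write the low bit_count bits of value into buffer, MSB first."""
    if value < 0 or value >> bit_count:
        raise ValueError("value does not fit in bit_count bits")
    for i in range(bit_count):
        bit = (value >> (bit_count - 1 - i)) & 1
        byte_index, bit_in_byte = divmod(bit_offset + i, 8)
        mask = 0x80 >> bit_in_byte
        if bit:
            buffer[byte_index] |= mask
        else:
            buffer[byte_index] &= ~mask & 0xFF


def get_bits(buffer, bit_offset, bit_count):
    """Read bit_count bits from buffer starting at bit_offset, MSB first."""
    value = 0
    for position in range(bit_offset, bit_offset + bit_count):
        byte_index, bit_in_byte = divmod(position, 8)
        value = (value << 1) | ((buffer[byte_index] >> (7 - bit_in_byte)) & 1)
    return value


# In a 7-byte ADTS header the 13-bit frame length starts at bit offset 30.
header = bytearray.fromhex("fff15080001ffc")  # frame length field still zero
set_bits(header, 30, 13, 100 + 7)             # payload bytes + header bytes
```

Writing a field this way leaves the neighbouring bits untouched, which is what makes it usable for patching a length into a header that was created earlier.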

How this does or does not answer your question only you can say but I can say that you have not provided any test results to me as of the time of your last response to this thread (to which I am replying) herein.

Please let me know how I can be of further assistance and your intention on contributing back to the library either through a patch or otherwise, e.g. a review as indicated previously.

Thank you.
Marked as answer by juliusfriedman on 2/20/2015 at 2:47 PM
Feb 20, 2015 at 11:30 PM
With the help and support that you have provided, it goes without saying that I do intend to contribute back to the library if I get this to work.
On a side note, where would you like the review of the library to appear?

I also wanted to clarify one more thing:
Which test results were you expecting that I did not provide? Please elaborate and I will send them to you if I failed to do so.

I've indicated after testing various releases of the library that the audio was still not playable and often resulted in "silence" when sent to the media player (VLC). Please let me know what would be more helpful if that wasn't the feedback you were looking for after each test.

Thank you.
Feb 21, 2015 at 12:07 AM

The review or feedback can appear wherever you would like.

You have already communicated with me via email, and continuing that dialog makes sense to me, especially in relation to questions I would have about your understanding of the library. However, if you would prefer a Discussion or Issue thread, that is also appropriate.

I just want to understand how people are using the library and also what could be easier or better suited for general use cases.

Anything you find relevant to include is probably helpful; it will help me create a library which is easier to use and understand.

The test results from an End to End configuration capture.

E.g. a receiver's and sender's capture if possible (with associated logs on the sender's and receiver's side). See an example I gave someone else here:


Essentially I need to see, if possible, what was sent and also what was received. If only one or the other is provided, then I or anyone for that matter can really only guess what's happening anyway.

If you don't have the means to easily log what was being sent and what was being received on both ends of the connection, you can use the RtspServer to host the connection and diagnose further by logging the connection on the RtspServer itself. What the RtspServer sends via Rtp comes as a result of the underlying hosted connection, which is exactly what is received by the application so long as the data is in sequence as allowed by the Rtp standard (which should be the same as what was sent by the receiver [in most cases]). You will then also have one capture which shows [In] to the server from the camera and [Out] to the application from the server, and another capture which shows [In] to the application.

Combined with the accompanying log output from the RtspServer during that time (if Logging is enabled) I should then have enough data to be able to see what you are doing and what is going wrong.

Another easy way would be to capture what VLC or another player gets from the camera with Wireshark, then use Utility.HexToBytes as I do in the tests to manually depacketize the data, and then give it to your decoder, which may or may not need the ADTS header. VLC doesn't tell me anything because VLC doesn't decode anything itself; it uses FFMPEG and various other libraries for decoding the data, which then takes another fun trip back to the output / speakers for 'play'.
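As an illustration of that workflow outside the library, the Python sketch below turns a hex dump copied from a capture into bytes and splits off the fixed RTP header so the payload can be handed to a depacketizer. The function name `parse_rtp_packet` and the dictionary it returns are assumptions made for this example; the field layout follows the RTP fixed header (12 bytes plus optional CSRC list, header extension and padding).

```python
import struct


def parse_rtp_packet(hex_dump):
    """Parse a hex dump of one RTP packet and return its fields and payload."""
    data = bytes.fromhex(hex_dump)  # whitespace between bytes is ignored
    if len(data) < 12:
        raise ValueError("an RTP packet needs at least 12 header bytes")

    first, second, sequence_number, timestamp, ssrc = struct.unpack(
        "!BBHII", data[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")

    offset = 12 + 4 * (first & 0x0F)  # skip the CSRC identifiers
    if first & 0x10:                  # header extension present
        # 16-bit profile-defined id, then 16-bit length in 32-bit words.
        (extension_words,) = struct.unpack("!H", data[offset + 2:offset + 4])
        offset += 4 + 4 * extension_words

    end = len(data)
    if first & 0x20:                  # padding present
        end -= data[-1]               # last byte holds the padding count

    return {
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence_number": sequence_number,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": data[offset:end],
    }
```

The `payload` entry is what a profile-specific depacketizer (RFC 3640 in this thread) would consume; comparing it byte for byte between the sender's and receiver's captures is one way to rule the network in or out.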

After various releases, audio was still not playable?

This project doesn't play audio, it only assists with the depacketization of Rtp profile data.

The "silence" you hear could be a result of many things, incorrect depacketization or incorrect decoding or otherwise. I don't know because I don't have your code in front of me.

You wrote the CreateADTSHeader originally, I merely put it somewhere to assist you in debugging your issue and now I guess I have also adopted it into the library somehow.

I also corrected several problems with the function already (that I didn't write) and asked that you correct the rest as required and indicate the results of your tests and the modifications made.

I will complete the changes in the class as and when required, once we agree that a sample can finally be played.

Once working, you can extract the lengths of the access units of a randomly received frame and create some unit tests to verify that the functionality works.

Another good test would be to create silence for a pre-determined duration and include a test that packetized and depacketized the same data.
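A round-trip test of that kind could look like the Python sketch below. It is not the library's code: `packetize` and `depacketize` are hypothetical helpers that implement only the AU header section of RFC 3640 with a size field followed by an index field, and the defaults of 13 and 3 bits are an assumption matching the common AAC-hbr mode (the real values come from the 'fmtp' line in the SDP). The access units are placeholder bytes standing in for encoded frames, since producing real encoded silence needs an encoder.

```python
def packetize(access_units, size_length=13, index_length=3, index_delta_length=3):
    """Build an RFC 3640 payload: AU-headers-length, AU headers, then the AUs."""
    bits, bit_count = 0, 0
    for i, unit in enumerate(access_units):
        if len(unit) >> size_length:
            raise ValueError("access unit too large for the size field")
        field = index_length if i == 0 else index_delta_length
        # AU-size followed by an AU-Index / AU-Index-delta of 0 (no interleaving).
        bits = ((bits << size_length) | len(unit)) << field
        bit_count += size_length + field
    padded = (bit_count + 7) // 8 * 8
    bits <<= padded - bit_count  # pad the header section to a whole byte
    section = bit_count.to_bytes(2, "big") + bits.to_bytes(padded // 8, "big")
    return section + b"".join(access_units)


def depacketize(payload, size_length=13, index_length=3, index_delta_length=3):
    """Return the list of access units contained in an RFC 3640 payload."""
    header_bits = int.from_bytes(payload[:2], "big")  # length in bits
    header_bytes = (header_bits + 7) // 8
    section = int.from_bytes(payload[2:2 + header_bytes], "big")
    total_bits = header_bytes * 8

    sizes, position = [], 0
    while position < header_bits:
        field = index_length if not sizes else index_delta_length
        shift = total_bits - position - size_length
        sizes.append((section >> shift) & ((1 << size_length) - 1))
        position += size_length + field

    units, offset = [], 2 + header_bytes
    for size in sizes:
        units.append(payload[offset:offset + size])
        offset += size
    return units


# Placeholder access units of different lengths stand in for encoded frames.
original = [bytes([n]) * (10 * n) for n in range(1, 5)]
assert depacketize(packetize(original)) == original
```

The list of sizes recovered inside `depacketize` is the same information as the access unit lengths of a received frame, so a capture-based unit test can assert on those lengths directly.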

I will provide those samples to validate where applicable when the time comes, as I previously indicated, but right now how does this help anyone? What are you going to do with the data once depacketized, unless you are writing an AAC decoder anyway?

I know that VLC plays the hosted stream from Axis Cameras when viewed through the RtspServer so what exactly does this library do that causes VLC not to be able to play audio and how can I replicate that?

Why don't you just use VLC to consume the data from the RtspServer? Then you don't even have to worry about depacketization.

If you want to or need to manually perform these steps then you DO need the classes in this library (I would appreciate having an understanding of why) and I said I would provide support but that I DO NOT USE THE RFC3640Frame class RIGHT NOW!

I will help you get it to work but I need you to put in effort as well.

If anything about that is unclear, please let me know and I will clarify further.
Marked as answer by juliusfriedman on 2/20/2015 at 4:07 PM
Feb 21, 2015 at 5:00 PM
Edited Feb 21, 2015 at 5:00 PM
Another way you can reverse engineer and test this is to send a file over RTP in the same codec.

You can use various other libraries to do this e.g. vlc or rtpsend.

Once you know what that data looks like you should be able to take the same file and be able to create the same packets using the classes I have already in the library.

You don't have to implement packetization, but then you would need to start with the same data sent and generate your own RTP header and sequence number so that you can then verify that receiving data would be equivalent.

You can do that just by using Utility.HexToBytes to create an RTP packet, or help with it by using the data from the existing camera capture at any point from after a marker packet arrived until the next (not including the first).
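Selecting packets "from after a marker packet until the next" can be sketched as follows, in Python for illustration. The function name `frames_between_markers` is hypothetical; it assumes the packets are raw RTP packets already in sequence order and reads the marker bit from the top bit of the second header byte.

```python
def frames_between_markers(packets):
    """Group raw RTP packets into complete frames using the marker bit.

    Packets up to and including the first marker are skipped because that
    frame may have been joined part-way through. Each returned frame ends
    with its marker packet; a trailing frame without a marker is dropped.
    """
    frames = []
    current = None  # stays None until the first marker has been seen
    for packet in packets:
        marker = bool(packet[1] & 0x80)
        if current is not None:
            current.append(packet)
        if marker:
            if current:
                frames.append(current)
            current = []
    return frames
```

Each resulting frame is a list of whole packets, so the same bytes can be fed to a depacketizer under test and to a reference player for comparison.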

Then you can ensure that the depacketized results are the same by using the same library to play back the depacketized frame results from my library, and you're done.

If this needs clarification or if you're still having trouble, let me know.
Feb 23, 2015 at 12:55 PM
Someone else is also using AAC and apparently has some info.

The issue is here

Marked as answer by juliusfriedman on 2/23/2015 at 4:55 AM