G.711 audio

Topics: Question
Dec 3, 2014 at 7:26 PM
Hi,

Thanks for your continued support. I'm still trying to figure out how to grab the G.711 mu-law audio from the camera using your RTSP client.

I know the G.711 mu-law RTP packets are coming in properly (verified with Wireshark), but should I save them in the RtpFrameChanged event or in the HandleIncomingRtpPacket method?

Thank you.
Coordinator
Dec 3, 2014 at 9:33 PM
That's up to you!

I would handle this in the frame-changed event, so you can check whether the frame is complete yet, etc.
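
For illustration, a minimal sketch of such a handler (the event signature matches the snippets posted later in this thread; the saving logic is just a placeholder):

        void Client_RtpFrameChanged(object sender, Media.Rtp.RtpFrame frame)
        {
            // Skip frames that are still missing packets.
            if (!frame.Complete) return;

            // Assemble() concatenates the payloads of the frame's packets.
            byte[] payload = frame.Assemble().ToArray();
            // ... save or decode the assembled payload here ...
        }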

If you need anything else let me know!
Marked as answer by juliusfriedman on 12/3/2014 at 2:33 PM
Dec 3, 2014 at 10:26 PM
Edited Dec 3, 2014 at 10:30 PM
Thank you for your response.
I tried putting a breakpoint in the frame-changed handler to check whether the media type is audio, but I never hit that condition.

i.e.

        if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
        {
             // breakpoint here
        }
Should I just check for PayloadTypeByte == 0 instead (0 being the static payload type for G.711 mu-law)?
Coordinator
Dec 4, 2014 at 10:29 PM
You can try that, but if the media description is not audio then there is probably no audio stream.

Can you post the SDP from the source?

Also ensure you are using the latest code!
Dec 4, 2014 at 10:46 PM
Yes, I am using yesterday's code.

Here's the SDP:

v=0
o=- 1417736527088879 1417736527088879 IN IP4 192.168.109.179
s=Media Presentation
e=NONE
b=AS:50064
a=control:*
a=range:npt=0.000000-
t=0
m=video 0 RTP/AVP 96
c=IN IP4 0.0.0.0
b=AS:50000
a=framerate:30.0
a=transform:1,0,0;0,1,0;0,0,1
a=control:trackID=1
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1; profile-level-id=4D4029; sprop-parameter-sets=Z01AKZpmAoAy2AtQEBAQXpw=,aO48gA==
m=audio 0 RTP/AVP 0
c=IN IP4 0.0.0.0
b=AS:64
a=control:trackID=2

This is what I did in RtpFrameChanged:

        if (frame.PayloadTypeByte == 0)
        {
            byte[] audioBytes = frame.Assemble().ToArray();
            using (var fs = new FileStream("C:\\temp\\test.pcm", FileMode.Append))
                fs.Write(audioBytes, 0, audioBytes.Length);
        }
I was able to play that audio in Audacity, but I'm wondering if I'm coding it correctly.
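
As an aside, a raw PCMU dump like this can be wrapped in a minimal WAV header so ordinary players open it directly, instead of going through Audacity's raw import. A sketch assuming the usual G.711 mu-law parameters (8000 Hz, mono, 8 bits per sample); WriteMuLawWav is a hypothetical helper, and most players accept this minimal 16-byte fmt chunk:

        // Wraps raw G.711 mu-law bytes in a minimal WAV container (wFormatTag = 7).
        static void WriteMuLawWav(string path, byte[] muLaw)
        {
            const int sampleRate = 8000;
            const short channels = 1, bitsPerSample = 8, formatMuLaw = 7;

            using (var bw = new System.IO.BinaryWriter(System.IO.File.Create(path)))
            {
                bw.Write("RIFF".ToCharArray());
                bw.Write(36 + muLaw.Length);                         // RIFF chunk size
                bw.Write("WAVE".ToCharArray());
                bw.Write("fmt ".ToCharArray());
                bw.Write(16);                                        // fmt chunk size
                bw.Write(formatMuLaw);                               // mu-law format tag
                bw.Write(channels);
                bw.Write(sampleRate);
                bw.Write(sampleRate * channels * bitsPerSample / 8); // byte rate
                bw.Write((short)(channels * bitsPerSample / 8));     // block align
                bw.Write(bitsPerSample);
                bw.Write("data".ToCharArray());
                bw.Write(muLaw.Length);
                bw.Write(muLaw);
            }
        }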

How far along are you with the MP4 container? Or do you know of any C# code that muxes the audio and video streams into an MP4 container? I used the Bento SDK in the past, but it'd be nice to keep everything in C#.

Thanks.
Coordinator
Dec 4, 2014 at 10:57 PM
Okay, cool.
Make sure you give this code a whirl, as it has a few more fixes.

And yes, that appears correct; the only caveat is that the payload type shouldn't be checked like that except in certain cases, e.g. when it's not dynamic.
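
To illustrate the distinction (the numeric ranges come from RFC 3551, not from this library):

        // Static payload types (RFC 3551) have fixed meanings, e.g. 0 == PCMU (G.711 mu-law);
        // dynamic types (96-127) mean only what the SDP's a=rtpmap line says they mean.
        bool isDynamic = frame.PayloadTypeByte >= 96;
        if (!isDynamic && frame.PayloadTypeByte == 0)
        {
            // Safe to treat as PCMU only because 0 is a static assignment; for a
            // dynamic type like 96 you must consult the MediaDescription instead.
        }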

Your SDP has an audio MediaDescription, so it should be able to find it. Do you have 2 TransportContexts? Does each one have a different SynchronizationSourceIdentifier, DataChannel, and ControlChannel? What about the MediaDescription: one should be audio and the other video.

Let me know what you find.

I may need to see a Wireshark capture to help out more.
Marked as answer by juliusfriedman on 12/4/2014 at 3:57 PM
Dec 4, 2014 at 11:06 PM
Coordinator
Dec 4, 2014 at 11:13 PM
The Wireshark capture looks consistent with the SDP.

Are you sure you have two TransportContexts in the RtpClient?

What does the MediaDescription look like for each instance, and does each instance have a unique SynchronizationSourceIdentifier, DataChannel, and ControlChannel?

Beyond that, RtpClient has a bunch of GetContextForPacket overloads; do any of them return the TransportContext with the audio media description?

Lastly, are you playing back through the RtspServer or directly from the source?
Dec 4, 2014 at 11:32 PM
Yes, I do have two TransportContexts.

Here's the media description for each TransportContext:

"m=video 0 RTP/AVP 96\r\nc=IN IP4 0.0.0.0\r\nb=AS:50000\r\na=framerate:30.0\r\na=transform:1,0,0;0,1,0;0,0,1\r\na=control:trackID=1\r\na=rtpmap:96 H264/90000\r\na=fmtp:96 packetization-mode=1; profile-level-id=4D4029; sprop-parameter-sets=Z01AKZpmAoAy2AtQEBAQXpw=,aO48gA=="

"m=audio 0 RTP/AVP 0\r\nc=IN IP4 0.0.0.0\r\nb=AS:64\r\na=control:trackID=2"

The SynchronizationSourceIdentifier for the video is 1088074170, DataChannel is 0, and ControlChannel is 1;
but for the audio, the SynchronizationSourceIdentifier is -892432421, DataChannel is 2, and ControlChannel is 1.

I am playing back directly from the source.
Coordinator
Dec 4, 2014 at 11:43 PM
Are you sure the ControlChannel is 1 for both TransportContexts?

Each should have its own unique value:

0-1 should be for the video;
2-3 should be for the audio.

GetContextForPacket from the RtpClient should return the appropriate TransportContext, at which point its .MediaDescription.MediaType should be 'audio'.

Please confirm; if so, that is how you can tell audio from video :)

GetContextForPacket(somePacket).MediaDescription.MediaType

So long as GetContextForPacket does not return null of course :)
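
Put together, a null-safe version of that lookup might look like this (rtpClient and somePacket are placeholders for whatever instances you have in hand):

        var context = rtpClient.GetContextForPacket(somePacket);
        if (context != null && context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
        {
            // handle the audio packet
        }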
Marked as answer by juliusfriedman on 12/4/2014 at 4:43 PM
Dec 4, 2014 at 11:54 PM
Edited Dec 5, 2014 at 12:12 AM
Yes, I do see that the ControlChannel is 1 for both TransportContexts.

In HandleIncomingRtcpPacket, I added

        if (transportContext.MediaDescription.MediaType == Sdp.MediaType.audio)
        {
            int i = 0; // just somewhere to put a breakpoint
        }

right under the line TransportContext transportContext = GetContextForPacket(packet); and it never hits that breakpoint.

In HandleIncomingRtpPacket, I did the same thing and it does stop at the breakpoint, but the ControlChannel value is still 1.

Is the SynchronizationSourceIdentifier supposed to be assigned for audio too? The value seems to indicate that it was never initialized or assigned.
Coordinator
Dec 5, 2014 at 12:17 AM
Each TransportContext is supposed to be assigned a unique SynchronizationSourceIdentifier.

Can you please make sure you have the latest code, and step through the

internal RtspMessage SendSetup(Uri location, MediaDescription mediaDescription, bool unicast = true) method.

You should see it go to

RtspMessage.TryParseTransportHeader and into a switch where it should enter the 'interleaved' case:

string[] channels = subParts[1].Split(TimeSplit, StringSplitOptions.RemoveEmptyEntries);

The channels should be 0,1 and then 2,3.

Please check that and let me know.

If that is not the case then I will need to find out why and add code to fix that.
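
For reference, those channel numbers come from the interleaved parameter of the Transport header exchanged during SETUP (RFC 2326); with TCP interleaving the header looks roughly like this, one channel pair per track:

        Transport: RTP/AVP/TCP;unicast;interleaved=0-1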

I appreciate your help!
Marked as answer by juliusfriedman on 12/4/2014 at 5:19 PM
Dec 5, 2014 at 12:27 AM
Edited Dec 5, 2014 at 12:31 AM
OK, I grabbed the latest code and recompiled.
The first time around, in the interleaved case, channels[0] is 0 and channels[1] is 1;
the second time, channels[0] is 2 and channels[1] is 3.

Please let me know if you need more info.
Thanks.

By the way, I don't know if this helps, but if the audio encoding of the source is AAC, I do get the media type as audio in RtpFrameChanged.
Coordinator
Dec 5, 2014 at 3:10 AM
Now that the channels are correct, you should get audio for the other stream types as well.

If you don't, I need to look at it again.

In short, in all circumstances you should get a context for every frame that is raised in the event.
Marked as answer by juliusfriedman on 12/4/2014 at 8:10 PM
Dec 5, 2014 at 5:08 PM
I got your changes, but I am still not getting the audio for other stream types.

The channel values are still 0,1 and 2,3.
Coordinator
Dec 5, 2014 at 8:51 PM
What do you mean you're not getting audio?

The capture you sent, even when the channels were wrong, showed that audio was coming in.

The latest changes should have fixed it so that the contexts are associated correctly.

If the channels are now correct, what about the SynchronizationSourceIdentifier for each TransportContext? Those should both be unique and should match some packet being received.

Every RtpClient instance has a GetContextForPacket method; are you saying that calling it returns null?

If that is the case then something is going wrong with the SETUP request; can I access your camera for testing?

I need to determine whether this is an issue and how it is occurring, as I cannot seem to replicate it at all with my testing streams.
Marked as answer by juliusfriedman on 12/5/2014 at 1:51 PM
Dec 5, 2014 at 9:25 PM
Sorry, I think it is working properly now; I must have had some old code mixed with your fixes. I got your latest into a different folder, restarted another test WinForm from scratch, and I do get the MediaDescription MediaType as audio on some packets.

Thanks for your help!
Coordinator
Dec 5, 2014 at 9:34 PM
No problem. What is the issue with just referencing the DLLs? That way you can download and rebuild them without worrying about mixing the code.

Anyway, glad to hear it; if you need anything else let me know!
Marked as answer by juliusfriedman on 12/5/2014 at 2:34 PM
Dec 5, 2014 at 9:48 PM
Do you think the latest changes might have affected the video portion? I can't seem to record the H.264 stream.
Coordinator
Dec 5, 2014 at 9:57 PM
Edited Dec 5, 2014 at 10:00 PM
I highly doubt it; the changes in the RtpClient only affect the jitter calculation.

What seems to be the problem?

The only change from older versions is the namespace:

Rtsp.Server.Media is now Rtsp.Server.MediaTypes, and it will probably either change or be relocated again once the separation of profiles is complete.
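
For example, a reference to the H.264 profile class used later in this thread would change like this (illustrative only):

        // old: Media.Rtsp.Server.Media.RFC6184Media.RFC6184Frame
        // new: Media.Rtsp.Server.MediaTypes.RFC6184Media.RFC6184Frame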

Sorry for the confusion!
Marked as answer by juliusfriedman on 12/5/2014 at 2:57 PM
Dec 5, 2014 at 10:18 PM
It seems like I am not getting all the video packets.

I reset both VLC and the camera.

Here's the code:
        void Client_RtpFrameChanged(object sender, Media.Rtp.RtpFrame frame)
        {
            var context = ((Media.Rtp.RtpClient)sender).GetContextByPayloadType(frame.PayloadTypeByte);

            if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
            {
                byte[] audioBytes = frame.Assemble().ToArray();
                using (var fs = new FileStream("C:\\temp\\test.aac", FileMode.Append))
                    fs.Write(audioBytes, 0, audioBytes.Length);
            }
            else if (context.MediaDescription.MediaType == Media.Sdp.MediaType.video)
            {
                if (!frame.Complete) return;

                var hframe = new Media.Rtsp.Server.MediaTypes.RFC6184Media.RFC6184Frame(frame);

                if (bFirst)
                {
                    // For the first frame, pull the SPS/PPS out of the SDP's fmtp line
                    // and prepend them if the stream itself didn't carry them.
                    Media.Sdp.SessionDescriptionLine fmtp = context.MediaDescription.FmtpLine;

                    byte[] sps = null, pps = null;

                    foreach (string p in fmtp.Parts)
                    {
                        string trim = p.Trim();
                        if (trim.StartsWith("sprop-parameter-sets=", StringComparison.InvariantCultureIgnoreCase))
                        {
                            string[] data = trim.Replace("sprop-parameter-sets=", string.Empty).Split(',');
                            sps = System.Convert.FromBase64String(data[0]);
                            pps = System.Convert.FromBase64String(data[1]);
                            break;
                        }
                    }

                    bool hasSps, hasPps, sei, slice, idr;
                    hframe.Depacketize(out hasSps, out hasPps, out sei, out slice, out idr);
                    byte[] result = hframe.Buffer.ToArray();

                    using (var stream = new System.IO.MemoryStream(result.Length))
                    {
                        if (!hasSps && sps != null)
                        {
                            stream.Write(new byte[] { 0x00, 0x00, 0x00, 0x01 }, 0, 4); // Annex B start code
                            stream.Write(sps, 0, sps.Length);
                        }

                        if (!hasPps && pps != null)
                        {
                            stream.Write(new byte[] { 0x00, 0x00, 0x00, 0x01 }, 0, 4);
                            stream.Write(pps, 0, pps.Length);
                        }

                        hframe.Buffer.CopyTo(stream);
                        stream.Position = 0;

                        decode_stream(stream);
                        bFirst = false;
                    }
                }
                else
                {
                    // Subsequent frames only need depacketizing; the else avoids
                    // depacketizing and decoding the first frame a second time.
                    hframe.Depacketize();
                    decode_stream(hframe.Buffer);
                }
            }
        }
99% of this is pretty much the same as what has been discussed in other threads... The previous version seemed to work with video, but I'm going to test some more.
Coordinator
Dec 5, 2014 at 10:29 PM
Edited Dec 5, 2014 at 10:34 PM
What makes you think you're not getting all of the packets?

Yes, please do test and verify, because the changes shouldn't have affected anything related to this.

If it did, I would be interested to see where and how. You can verify you are getting all the packets by testing the stream with RtspInspector, which shows the sequence numbers.

Let me know if you see unusual discontinuity. You can also test on the command line with the TestRtspClient function, which prints the number of missing packets / incomplete frames at the end.
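
If you want a quick in-code sanity check as well, counting gaps in the 16-bit RTP sequence numbers is enough. A standalone sketch (SequenceCounter is a hypothetical helper; feed it each packet's SequenceNumber in arrival order):

        // Counts gaps in RTP sequence numbers, handling 16-bit wrap-around.
        class SequenceCounter
        {
            ushort? last;
            public int Missing { get; private set; }

            public void Observe(ushort seq)
            {
                if (last.HasValue)
                {
                    // Difference modulo 2^16; a delta of 1 means contiguous.
                    ushort delta = (ushort)(seq - last.Value);
                    if (delta > 1) Missing += delta - 1;
                }
                last = seq;
            }
        }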

You should always have a low number of those.

Let me know if you find anything different at all and thanks!
Marked as answer by juliusfriedman on 12/5/2014 at 3:29 PM
Dec 5, 2014 at 11:01 PM
Oh, just by playing back the video, since I had the camera pointing at a clock :)

It looks intermittent, but I'll know more when I test later on. It shouldn't matter that it's writing to two files during the RTP frame-changed event, right?
Coordinator
Dec 5, 2014 at 11:32 PM
Nice.

Okay, definitely let me know if you find anything.

And no, it shouldn't, unless other threads write to the same file; writes might also overlap each other if you're on an NTFS filesystem.

In short, have only one thread responsible for writing, and queue the bytes to be written onto that thread to prevent issues.
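
A minimal sketch of that single-writer pattern (FileWriterQueue is a hypothetical name; it needs using System, System.Collections.Concurrent, System.IO, and System.Threading.Tasks):

        // One dedicated writer per file; other threads just enqueue buffers.
        class FileWriterQueue : IDisposable
        {
            readonly BlockingCollection<byte[]> queue = new BlockingCollection<byte[]>();
            readonly Task writer;

            public FileWriterQueue(string path)
            {
                writer = Task.Run(() =>
                {
                    using (var fs = new FileStream(path, FileMode.Append))
                        foreach (var buffer in queue.GetConsumingEnumerable())
                            fs.Write(buffer, 0, buffer.Length);
                });
            }

            // Safe to call from any thread, e.g. the frame-changed handler.
            public void Enqueue(byte[] data) { queue.Add(data); }

            public void Dispose()
            {
                queue.CompleteAdding(); // stop accepting input and drain the queue
                writer.Wait();
            }
        }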
Marked as answer by juliusfriedman on 12/5/2014 at 4:32 PM
Dec 9, 2014 at 8:57 PM
I think both streams work fine at this point. I did want to mention, though, that the audio seems to be cut off at the beginning; say I'm recording 'testing 1,2,3', I can hear 'ing, 12,3'... Is it perhaps that the camera hasn't started streaming back yet?

thanks.
Coordinator
Dec 10, 2014 at 3:48 AM
Edited Dec 10, 2014 at 5:26 AM
Can you post your code so I can run some tests on the dumps you sent me?

After receiving frames, I guess you basically just use Assemble, save the result to a .aac file, and then play it with VLC?

If not, it could be a bug in my code; I am still researching to see if it's missing a few things.

Apparently the configuration info can also come in band (which may need to be removed or have data added to it). The only thing I have a problem with is that, since all of this data is technically an elementary stream, I am not sure why anything needs to be 'REMOVED' during depacketization, because nothing would be added.

Hence the decoders should be able to just take the result of Assemble, even with the preceding length bits, because that is the format of the AudioMuxElement.

The profile, in fact, should have either left the data exactly as it was or compressed it somehow, rather than making changes to something it doesn't really have a right to change.

With that being said, I am glad you have something working. I suspect the decoder may be dropping the first frame; why, I don't know. It could be a bug in my code, or it could just be the way that decoder works; you may want to try another player / decoder.

Also, you see 'audio' as the media type now, correct?

Was this just using the Assemble method, or did you use the G.711 class (RFC5215Frame)?

Thanks again for keeping me updated!
Marked as answer by juliusfriedman on 12/9/2014 at 8:48 PM
Dec 10, 2014 at 4:36 PM
Sorry, I might have confused you between PCMU and AAC. This issue is with PCMU, and yes, I am just using Assemble and saving to a file.

Here's the code:

        if (context.MediaDescription.MediaType == Media.Sdp.MediaType.audio)
        {
            byte[] audioBytes = frame.Assemble().ToArray();
            using (var fs = new FileStream("C:\\temp\\testing.raw", FileMode.Append))
                fs.Write(audioBytes, 0, audioBytes.Length);
        }
        else if (context.MediaDescription.MediaType == Media.Sdp.MediaType.video)
        {
            // save video frames
        }
I do see audio as the media type correctly now as well.
I have not used the G711 class; should I give that a try? I was actually converting the raw audio with Audacity and playing it back.
Coordinator
Dec 10, 2014 at 5:46 PM
Both classes in this case should do the same thing, but yes, you should be using that class overall.

Good news about the audio; give the other class a try and let me know how it works out!
Marked as answer by juliusfriedman on 12/10/2014 at 10:47 AM