Sockets are closing unexpectedly

Topics: Question
Apr 13, 2015 at 5:56 PM
Hi there,

I'm currently connecting to a Samsung NVR and have it working using your library (though I had to downgrade it to .net4).

Most of the time this works fine, but I've encountered two issues, one of which seems to be the main one:

When I want to watch 4 streams at the same time (or just fire off a load at the same time), the sockets die quickly, which breaks the setup steps (.StartPlaying()) because the socket is aborted. The exception is always raised at line 2607 of the RtspClient (111212 release), and I've been struggling to find the origin or a solution for quite some time.
In Wireshark we see that the connection is reset by a TCP RST flag (sent from our local IP to the NVR), but I've been unable to tell whether this happens before or after the exception is thrown. My knowledge of the stack at that low a level isn't sufficient to say much about the order and occurrence of those flags in the packets.
Could you maybe share some insight into this? We've got the NVR set up on a public IP and port, so I can give you access or perhaps walk through the issue together to see where in the library it fails. If I know the solution, I'm more than willing to spend some time abstracting it and contributing it back.

Another issue we've seen is that if the first 401 Unauthorized arrives and digest authentication fails for some reason, the NVR issues a new nonce, which the library does not process. I haven't dived into this yet, since the previous issue occurs far more often in the use case we foresee.

I look forward to your response. If you need anything, just let me know: we've got Wireshark captures of both issues, and I'm more than willing to spend some time on it with you over Skype or something, if you're up for it.
Coordinator
Apr 13, 2015 at 6:07 PM
Edited Apr 13, 2015 at 6:07 PM
You cannot easily downgrade the framework version because of some inconsistent behavior in the 4.0 framework.

Additionally 4.0 is no longer supported and has not been for some time. Please use an older version.

In short, the resets are probably a result of the BinaryReader being closed, as 4.0 has no explicit way to leave the underlying stream open.

If you need anything else let me know!
Marked as answer by juliusfriedman on 4/13/2015 at 10:07 AM
Apr 13, 2015 at 7:06 PM
I was already expecting you'd say that ;)
I had wrapped the binary readers in a non-closing stream wrapper, and I've rebuilt the other extension methods using the .NET source code. This seemed to work fine.

Obviously there might be differences in underlying stuff that is happening so I ported my code to use a fresh download of your source and .net 4.5.2 (it only needs to be 4.0 when I include it in another product, so for testing purposes it doesn't matter that much).

Having just done that I've run into the same problems, so using .net 4.5.2 and a fresh download of the source (only had to make one change to allow me to send custom headers), the same error happens.

Thanks for this massive massive massive piece of work btw.
Coordinator
Apr 13, 2015 at 8:02 PM
There's no change required to send custom headers.

I do not experience closed sockets under normal operations, you might have a special network setup which requires further configuration.

Thanks for the kind words.

Let me know if you need anything else.
Marked as answer by juliusfriedman on 4/13/2015 at 12:02 PM
Apr 14, 2015 at 9:55 AM
Ah, your comment about the headers sent me in the direction of the AdditionalHeaders field I can fill.

Why we experience those sockets being closed is unknown to me as well. I've tested it on 4 different machines, both external and internal networks, different versions of Windows, though the target machine always stays the same. I've fixed it by using two things: A hard try-catch retry mechanism in case the other one fails, and an extension to the auto reconnect parts in the SendRtspMessage method, where I try to reconnect on both a SocketError.ConnectionReset and a SocketError.ConnectionAborted, instead of just the SocketError.ConnectionReset. This seems to solve it.

We've also seen the vendor use a status code that is not in the RTSP spec (limiting the number of users looking into the past by returning a 560 status code). I might solve this by making the tag contain the status code, so that I can write an if statement on it that is a little more robust than matching on text. Would you be interested in me pushing this back as well?

The other issue we talked about in the beginning I've nailed down to the absence of the stale header, which is supposed to be there. I've also communicated this to the vendor.
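
For reference, RFC 2617 lets a server add stale=true to a Digest challenge to say that the credentials were fine and only the nonce expired, so the client can retry with the new nonce. A minimal sketch of detecting that flag in a WWW-Authenticate value follows. DigestChallenge and its methods are hypothetical helpers written for illustration, not part of the library, and the realm and nonce in the usage note are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not a net7mma type.
public static class DigestChallenge
{
    // Matches key=value pairs where the value is either quoted or a bare token.
    private static readonly Regex Pair =
        new Regex("(\\w+)\\s*=\\s*(?:\"([^\"]*)\"|([^,\\s]+))");

    // Splits a challenge such as: Digest realm="x", nonce="y", stale=true
    public static Dictionary<string, string> Parse(string headerValue)
    {
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (Match m in Pair.Matches(headerValue))
        {
            string value = m.Groups[2].Success ? m.Groups[2].Value : m.Groups[3].Value;
            result[m.Groups[1].Value] = value;
        }
        return result;
    }

    // True only when the server explicitly marked the previous nonce as stale.
    public static bool IsStale(string headerValue)
    {
        string stale;
        return Parse(headerValue).TryGetValue("stale", out stale)
            && string.Equals(stale, "true", StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, DigestChallenge.IsStale("Digest realm=\"NVR\", nonce=\"abc123\", stale=true") is true, while the same challenge without the stale parameter gives false, which is the case where a client cannot tell an expired nonce from wrong credentials.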

Thanks for your help.
Coordinator
Apr 14, 2015 at 1:53 PM
Edited Apr 14, 2015 at 1:55 PM
Cool.

About the status code, it's fairly easy to just look at the integer value and know that 500 is an error.

509 is allowed and could be any error the camera wanted to convey. Same with success of 101 or 199.
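
To illustrate checking by number rather than by reason phrase: RTSP follows the HTTP convention where the first digit gives the class of the response, so a non-standard code such as 560 still falls in the 5xx range. A rough sketch; StatusClass and RtspStatus are hypothetical names used for illustration, not library types.

```csharp
using System;

// Hypothetical names, not net7mma types.
public enum StatusClass
{
    Unknown = 0,
    Informational = 1,
    Success = 2,
    Redirection = 3,
    ClientError = 4,
    ServerError = 5
}

public static class RtspStatus
{
    // Maps a numeric status code to its class using the leading digit.
    public static StatusClass Classify(int statusCode)
    {
        int firstDigit = statusCode / 100;
        return firstDigit >= 1 && firstDigit <= 5
            ? (StatusClass)firstDigit
            : StatusClass.Unknown;
    }

    // Treats every 4xx and 5xx code as an error, standard or not.
    public static bool IsError(int statusCode)
    {
        return statusCode >= 400 && statusCode <= 599;
    }
}
```

RtspStatus.Classify(560) yields StatusClass.ServerError, so a caller can branch on the class and only special-case 560 where it matters.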

In general, the tag of the latter exception should contain the whole message, so no change is required there either. The initial exception is more for expansion than anything else and only needs to be handled once.

If you post a Wireshark capture, I'll take a look and see if I can shed any light.

The other thing you can try is tcp keep alive.
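
Enabling TCP keep-alive on a plain .NET socket is a single option call. The sketch below only uses the standard System.Net.Sockets API on a socket you create yourself; KeepAliveExample is a hypothetical name, and how the socket is then handed to the client is not shown.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical helper for illustration only.
public static class KeepAliveExample
{
    // Creates an unconnected TCP socket with SO_KEEPALIVE switched on.
    public static Socket CreateKeepAliveSocket()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
        return socket;
    }

    // Reads the option back so the setting can be verified.
    public static bool IsKeepAliveEnabled(Socket socket)
    {
        object value = socket.GetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive);
        return Convert.ToInt32(value) != 0;
    }
}
```

Note that the idle time before the first probe is set by the operating system (commonly two hours by default), so this option alone may not change anything for connections that drop within seconds.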

I would suspect though the machine acting up has a weird configuration which is causing it, changes to the code shouldn't be required.

There's an option for automatically reconnecting as well.
Marked as answer by juliusfriedman on 4/14/2015 at 5:53 AM
Apr 14, 2015 at 2:50 PM
You were right about the status code! Didn't realize I could just catch your tagged exception and query it like an integer. Makes at least that part fixed without having to hack.
I've also used the Additional Headers collection now, clearing it before every new command instead of my own overload of SendPlay I've written. Makes it a lot easier to stay in line with your changes.

The auto reconnect was already turned on, the change I made is below (sorry for the weird formatting, my editor is tabbed and codeplex doesn't like that a lot).

This link (Dropbox) is also a wireshark. An interesting one is for example if you use the filter 'tcp.stream eq 8' where you see everything end after the client for unknown reasons terminates the socket. We've fixed this with underlying code but I'm unsure if that is the best way. If it is, it might be a valuable addition to the automatic reconnect, but I assume you threw the SocketError.ConnectionAborted further for a reason. Works so far in our use case though.
//Parentheses added: without them, && binds tighter than || and a ConnectionReset would reconnect even with AutomaticallyReconnect off
if (AutomaticallyReconnect && (error == SocketError.ConnectionAborted || error == SocketError.ConnectionReset))
{
    //Check for the host to have dropped the connection
    if (error == SocketError.ConnectionReset)
    {
        //Check if the client was connected already
        if (wasConnected && false == IsConnected)
        {
            Reconnect(true);

            goto Receive;
        }
    }
    //Treat an aborted connection the same way (the added case)
    else if (error == SocketError.ConnectionAborted)
    {
        if (wasConnected && false == IsConnected)
        {
            Reconnect(true);

            goto Receive;
        }
    }
}
Coordinator
Apr 14, 2015 at 2:56 PM
Edited Apr 14, 2015 at 2:57 PM
:-)

I will take a look asap.

Also please understand that in such cases you really need to derive the RtspClient and override or provide your own methods.

At the very least you need to wrap the call, catch and handle the exception, and indicate how you're using it so I can determine if the API can be improved.

Please do not modify the code.

Would you modify the tcp client or derive from it / encapsulate it?

Please utilize the same logic here.

Also maybe you should just turn off Automatically Reconnect.
Marked as answer by juliusfriedman on 4/14/2015 at 6:56 AM
Apr 14, 2015 at 3:24 PM
Yeah, preferably I'd just override, but since SendRtspMessage is so massive and the reconnect logic is right in the middle of it, I didn't really find a good way, besides perhaps writing that whole method again. Given its size of 719 lines of code, that didn't seem very maintainable to me; I prefer to let source control figure out where our differences are in the future. Besides that, I already had to make some changes for the .NET 4.0 port.

I wouldn't do anything with the tcp client, the Media framework is the only one who touches it, besides my small changes to the reconnect logic. The code above was from inside the SendRtspMessage method. If I turn the AutomaticReconnect off the error simply persists higher up, the code above does fix this for me. How do you mean it would work with Automatic Reconnect turned off?
Coordinator
Apr 14, 2015 at 3:32 PM
Edited Apr 14, 2015 at 3:40 PM
Uh you clearly don't understand...

Why would you have to override the send method?

Why not just Reconnect?

Not maintainable? Use ffmpeg or something else...

Improvements can be made but what is your specific use case?

Automatically reconnecting is clearly an option; it's clearly set, and I don't know why.

Turn it off by setting it to false THEN handle if desired.

That's how it should be done.

Besides that, the option is for when the client needs this for a specific required reason; you're not even sure why you're getting disconnected yet.

Why can't you override with try { base.Logic(); } catch (Exception) { new logic ... }

And finally, what did you add? You duplicated my existing code without understanding why and specifically ensured that the condition cannot be handled at a higher level. How is that valuable?
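
The derive-and-wrap approach described above might look roughly like the sketch below. DemoClient is a hypothetical stand-in that simulates socket failures so the example is self-contained; it is not the RtspClient, and its StartPlaying and Reconnect members only mirror the names used in this thread.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical stand-in: fails a fixed number of times, then succeeds.
public class DemoClient
{
    private int _failuresLeft;
    public int Reconnects { get; private set; }

    public DemoClient(int failures) { _failuresLeft = failures; }

    public virtual void StartPlaying()
    {
        if (_failuresLeft-- > 0)
            throw new SocketException((int)SocketError.ConnectionAborted);
    }

    public virtual void Reconnect() { Reconnects++; }
}

// Derived client: wraps the base call instead of editing the base class.
public class RetryingClient : DemoClient
{
    private readonly int _maxAttempts;

    public RetryingClient(int failures, int maxAttempts) : base(failures)
    {
        _maxAttempts = maxAttempts;
    }

    public override void StartPlaying()
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                base.StartPlaying();
                return;
            }
            catch (SocketException ex)
            {
                bool dropped = ex.SocketErrorCode == SocketError.ConnectionReset
                            || ex.SocketErrorCode == SocketError.ConnectionAborted;

                // Give up on unrelated errors or once the attempts are used up.
                if (!dropped || attempt >= _maxAttempts) throw;

                Reconnect();
            }
        }
    }
}
```

The trade-off is that each retry repeats whatever the wrapped call had already done, so this keeps the library untouched at the cost of extra round trips.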
Marked as answer by juliusfriedman on 4/14/2015 at 7:33 AM
Apr 14, 2015 at 4:16 PM
If I let the 'error' bubble up and destroy the already setup RtspClient I lose costly time. Fixing the automatic reconnect works and keeps the use case the same.

The specific use case is basically that users have to click an icon and a random number of streams (1-4) will be set up, streaming from a specific time in the past. Users will click quite often, quite fast, so letting the error bubble up while I can do a reconnect a layer lower is very inefficient.

Your way of doing the base.Logic and then catching the exception, running it again, would technically work perfectly fine, besides the point that the cost of setting everything up again (I'm talking about the StartPlaying method) would be too time intensive. The time that is needed for simple roundtrips towards the RTSP server and getting RTP to stream it efficiently with 4 streams at the same time already costs valuable seconds, any slight thing I can take off of it is more than valuable.

I didn't duplicate code, I just changed the specific reconnect in your SendRtspMessage that was throwing the error to try a reconnect.

Sadly, I don't have the luxury of trying, catching, and then re-running logic that has already run, since the time to set it back up is just too long. The system is event based: a user sees a list of hundreds of events, each with a camera icon. Clicking an icon pops up a window with an embedded DLL that I need to provide (hence the annoying .NET 4.0 requirement), which needs to initialize the imagery as soon as possible. We're already using ffmpeg to decode the H264 (if we get that instead of MJPEG). I hope you understand the use case and the reason why I can't rely on try/catch. I know I technically can, but I need to stop the error as early as possible. Letting it be thrown and doing a hard retry executes at least the OPTIONS request one extra time, and since it doesn't always fail at the same point, the retry sometimes repeats the DESCRIBE, the SETUPs, and even the PLAYs as well.
Coordinator
Apr 14, 2015 at 4:28 PM
Edited Apr 14, 2015 at 4:46 PM
Why do you have to destroy it? You can always find a way to do things your way but just because you do unfortunately doesn't mean you fixed something.

Your entire problem here is that you're not properly using sockets.

You need to create those sockets as required and then keep them open with the leave open option. Additionally you need to ensure the Connection:close header is not sent upon teardown.

You will see there is a disconnect sockets method.

You also need to give that existing socket ( which may come from an initial rtsp client instance) to a new rtsp client so it can send and receive messages.

The other way to do this would be using a single client and sharing its socket with other clients which can then do whatever they need to without closing the connection.

This is your design issue and honestly without more information I can't help you further. (Why you have to open multiple connections, or why you can't wait for existing connections to close , etc) even with such information this unfortunately has nothing to do with my library and if it did I assure you I would fix it.

I am already in the process of creating a rtsp session which is separate from a rtsp connection but also realize that I also have other work to do for rtp and rtcp and that there are already issues tracked for most of my ideas.

Most of the value added features in terms of session and connection are simply in fluency of api and allowance of what is already possible in other ways anyway and really only add more overhead and complexity to what I think is designed much more viable in other ways for various reasons.

Again most of the improved functionality I envision is already documented in some aspect.

Please let me know if you would like to contribute there; otherwise, please take my advice while I offer it and provide examples of how and why you needed to do what you did so that:

1) It helps you keep with the latest changes easier
2) It allows me to see if the paradigm you require will be allowed for specifically in api updates

FFmpeg has its own RTSP client; maybe you should consider using that?

With that being said if you need anything else please let me know.
Marked as answer by juliusfriedman on 4/14/2015 at 8:28 AM
Apr 14, 2015 at 4:46 PM
I didn't even know there was the option to create the socket beforehand and then share them.

My current code works as follows (simplified, because the specific client's implementation is abstracted away into an object that manages it; for this copy-paste I just dump it into a collection):
foreach(Camera camera in Cameras){
    RtspClient.ClientProtocolType clientProtocol = _useUdp ? RtspClient.ClientProtocolType.Udp : RtspClient.ClientProtocolType.Tcp;
    rtspClient = new RtspClient(liveStreamSource, clientProtocol);
    rtspClient.AutomaticallyReconnect = true;
    rtspClient.Client.RtpFrameChanged += ClientOnRtpFrameChanged;
    rtspClient.StartPlaying();
    Clients.Add(rtspClient);
}
As you can see I'm not manually creating my sockets at all, I leave that up to your library, where if I read your last comment correctly there lies my issue. Just found the method ConfigureSocket. Am I correct in assuming it is better to create one main socket to the target machine (that stays open) and sharing this one over all the specific instances? And would just running the following (where socket is a single socket shared over all of those) be enough to setup the shared behavior?
rtspClient.ConfigureSocket(socket);
I realize you're busy, and this is a massively complex system. I have been going at the subject for more or less a month, with the last two weeks more intensively, and though I have a rudimentary understanding it is still very complex. I was thinking I could perhaps find some time after this project is done to setup a couple of documentation pages over here as a way to say thanks, for example a getting started with an RtspClient. Obviously only with my limited understanding of the client, but I've seen a lot of duplicate posts where I got my base logic from, this might resolve a little bit of the pressure towards you as well.

I'd like to thank you as your insight and help has been very valuable so far, my apologies if I sounded ungrateful or finger pointing in any way.
Coordinator
Apr 14, 2015 at 4:52 PM
Edited Apr 14, 2015 at 5:11 PM
There is. This is supported for ANY socket; however, on UDP or TCP, sockets will be created by default unless one is given in the constructor of the RtspClient, and a socket created that way will also be closed unless the leave open option is changed manually. (Changing it is not usually required when the socket is given, unless you manually set leave open to false.) You could also assign to the RtspSocket property.

I outlined above how you can derive from the client and/or how you can use an existing socket and the leave open option to ensure the connection is not closed.

You can remove the sockets with the disconnect sockets method; please remember to either close the socket or take a reference to it before closing it if you still need it.

I only call Configure when creating a socket. You're free to call it however you choose, but I do warn you that the API is changing and that Configure is just a quick shim to allow for weird requirements.

I appreciate and do accept your offer of documentation etc and if you need anything do please let me know.

I am more concerned with the comments and xml doc for the functions being correct so please also let me know if you find inconsistent descriptions there also.

Thanks.
Marked as answer by juliusfriedman on 4/14/2015 at 8:52 AM
Apr 16, 2015 at 1:07 PM
Thanks. Have been trying to feed it my own socket (both with leave socket open to false and true) and it still doesn't work like I'd expect. I'll keep digging, while debugging it seems like the socket closes itself automatically and as already apparent from earlier posts my knowledge of the low level of this isn't up to what I'd like it to be.

I've encountered one weird-ish thing that might be valuable for you.

Using RTCP we get the event back, with the following values:

NtpMSW: -250626135
NtpLSW: 1734606888
NtpTimestamp: 7450079859420676009

Using the built-in calculation that the SendersReport uses (in NetworkTimeProtocol.cs), we get the following date back: 18-7-2064 17:11:00.
Using a different calculation I found on the internet I (correctly) get 16-4-2015 09:46:00 back from the MSW and LSW values.

I'm not sure whether the provider of this DVR is not following the specs there or whether the calculation that I found is wrong, but the data might be valuable for you. If you need more examples of this, I'm more than happy to provide them.
Coordinator
Apr 16, 2015 at 1:30 PM
Edited Apr 16, 2015 at 1:32 PM
Sorry I have had other things to do as of late.

You shouldn't have to dig, I outlined everything above.

If you can post a Wireshark capture which shows the miscalculation, I will take a look. It may also indicate why the connection gets closed.

I will check the NTP stuff when I review the capture as well.
Marked as answer by juliusfriedman on 4/16/2015 at 5:32 AM
Apr 16, 2015 at 2:11 PM
It seems to have something to do with unsigned vs. signed integers. If I feed the integers from Wireshark into it, I get the correct date back; if I feed it the integers I get from the report (which are signed), I get a wrong one. The normal NtpTime always calculates it wrongly though, so feeding it the NtpTimestamp goes wrong.

Here is the pcap; the title contains the packet number of the first valid RTCP timestamp.

Since we already have it fixed using a calculator that uses the signed ints I get back from the report, we're solid on this one ;)
Coordinator
Apr 16, 2015 at 2:22 PM
Without even looking then I can explain that.

The result is signed because unsigned types are not CLSCompliant.

Usually one would use the unsigned cast when working with the value, or call Convert.

I think your reference is to the DateTime NtpTimestamp property?

When it's calculated, it uses the unsigned value and not the signed one, and I explained why above.

Is this not the case?

I will check the capture related to the closing of sockets again and let you know what I find.

Let me know if I didn't address your question completely.
Marked as answer by juliusfriedman on 4/16/2015 at 6:23 AM
Apr 16, 2015 at 2:32 PM
That totally answers my question about the time, thanks ;)
It's still sad that a property like NtpTime won't return the right results in all environments. I get the reason, but for a user of a library it is very confusing if the library's property is a timestamp that is far off in the future or in the past.

Oh well. Maybe adding an XML comment about this to warn of such differences would be helpful?

Thanks for your insights!
Coordinator
Apr 16, 2015 at 2:37 PM
Edited Apr 16, 2015 at 2:41 PM
Can you show me how you're using the property?

E.g. a unit test with an expected value, so I can visualize what you mean when I run the test.

AFAIK, if you take the long property and use that, then you must cast it.

If you use the managed DateTime property the value should be calculated in the unsigned space anyways.

Please also realize that daylight savings time and time zone conversion could also be a factor, although this specific instance doesn't sound like a symptom of that.
Marked as answer by juliusfriedman on 4/16/2015 at 6:37 AM
Coordinator
Apr 17, 2015 at 6:23 PM
I could only find a Rtsp Stream on tcp.stream eq 6

e.g.

'73 4.465032000 192.168.0.78 192.168.0.225 RTSP 401 SETUP rtsp://192.168.0.225:558/PlaybackChannel/2/media.smp/session=1555620/track1 RTSP/1.0'

I see that this is not a library issue, so please do not modify the code unless necessary.

What you are experiencing there is either a result of a bad connection from the camera or a result of your debugging while capturing.

If you need anything further, please also respond to my previous question above (related to NTP and how you're using it), and indicate what exactly I can additionally answer so I can help.

Sincerely,
Julius
Marked as answer by juliusfriedman on 4/17/2015 at 10:23 AM
Apr 18, 2015 at 2:39 PM
Sure, sorry for the late reply!

The code for the test is at the end of this comment. I used the values literally from the RtcpSendersReport (the NtpLSW, NtpMSW and NtpTimestamp). It should be 18-4-2015 12:59:00, but the report gives 19-9-2046 16:05:45 as NtpTime. I tried to use the managed DateTime property, but this already returned the wrong value, so the little code I wrote includes the calculation that returns the right value (for my connection; it is certainly not applicable to all NtpTimestamps).

Yeah, the socket issue is still bothering me. I tried to keep one generic socket and wrote a log entry every second about its status. As soon as the socket hits the library and it tries to send OPTIONS (from the StartPlaying() method), the socket dies (with auto reconnect on and off, and leave open on and off). We've encountered this on multiple PCs and multiple networks, so I'm unsure. The resets seem to come from the client's side, but that's about as far as I got. The most stable thing I've found is not to tamper with your code and just retry as soon as it fails; most often it works the second time around (without a custom provided socket).

The wireshark you looked at was/is the second one, with an example of the RTCP report we get back. Another one I posted earlier in this conversation has the Socket problem in it. Tcp.stream eq 6 only returns two packets in that one, but 'tcp.stream eq 8' shows the wrong connection.

Thanks for your time!
// Made the test static and put it in the Media.UnitTests namespace to be a little more in line with the coding style of net7mma.
// One assumption is valid, the other one isn't.
// The values were obtained by debugging the RtcpSendersReport and inspecting its fields.
using System;

namespace Media.UnitTests
{
    /// <summary>
    /// Provides tests which ensure the logic of the NetworkTimeProtocol class is correct
    /// </summary>
    public class NtpTimeTests
    {
        public static void InvalidNtpTimestampCreated()
        {
            DateTime validValue = DateTime.SpecifyKind(new DateTime(635649587400770000), DateTimeKind.Utc);
            //DateTime assumption = CalculateNptDateTime(-656616556, 335007449);
            DateTime assumption = NetworkTimeProtocol.NptTimestampToDateTime(1438846041009738644);

            bool result = DateTime.Equals(assumption, validValue);

            if(!result) throw new Exception("DateTime calculation was not correct.");
        }

        public static DateTime CalculateNptDateTime(int seconds, int fraction)
        {
            DateTime utcEpoch2036 = new DateTime(2036, 2, 7, 6, 28, 16, DateTimeKind.Utc);
            Int32 milliseconds = (Int32)(((Double)fraction / UInt32.MaxValue) * 1000);
            DateTime dateTime = utcEpoch2036.AddSeconds(seconds).AddMilliseconds(milliseconds);
            return dateTime;
        }
        
    }
}
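
As a cross-check of the signed versus unsigned point discussed in this thread, here is a minimal sketch that reinterprets the signed words as unsigned and counts from the NTP epoch. It assumes the MSW holds whole seconds since 1900-01-01 UTC (first NTP era, so dates before 2036) and the LSW holds a fraction in units of 1/2^32 second. NtpExample is a hypothetical helper, not the library's NetworkTimeProtocol class.

```csharp
using System;

// Hypothetical helper, not part of net7mma.
public static class NtpExample
{
    private static readonly DateTime NtpEpoch =
        new DateTime(1900, 1, 1, 0, 0, 0, DateTimeKind.Utc);

    // msw and lsw arrive as signed ints; reinterpret the bits as unsigned.
    public static DateTime FromSignedWords(int msw, int lsw)
    {
        uint seconds = unchecked((uint)msw);
        uint fraction = unchecked((uint)lsw);

        // fraction / 2^32 seconds, expressed in whole milliseconds.
        long milliseconds = (long)(((ulong)fraction * 1000UL) >> 32);

        return NtpEpoch.AddSeconds(seconds).AddMilliseconds(milliseconds);
    }
}
```

Called with the values from the commented-out line in the test, NtpExample.FromSignedWords(-656616556, 335007449) gives 2015-04-18 12:59:00.077 UTC, the same instant as the validValue ticks. Adding the signed seconds to the 2036 epoch, as CalculateNptDateTime does, lands on the same result because that epoch is exactly 2^32 seconds after 1900.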