The different sizes are what SVC would help you achieve, but you would also need a decoder that supports it.
You would have less work to do at the RTP and RTSP level to achieve that, but you would still have to grab both streams and combine them into a single SVC stream correctly.
For the SDP, yes, you could also use a Time Description that makes one-minute streams which repeat every other minute. The syntax for it is described in section 5.10, Repeat Times ("r="), of the SDP specification (RFC 4566).
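As a rough illustration of that syntax, here is a minimal Python sketch that builds the "t=" and "r=" lines for a stream that is active for one minute out of every two. The function name and its parameters are made up for this example and are not part of any library; the only facts relied on are that SDP times are NTP seconds (Unix time plus 2208988800) and that "r=" takes a repeat interval, an active duration, and offsets from the start time.

```python
NTP_UNIX_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def time_description(start_unix, stop_unix, interval_s, active_s, offsets=(0,)):
    """Return the SDP "t=" and "r=" lines for a repeating session.

    Hypothetical helper: start/stop are Unix timestamps, the rest are seconds.
    """
    t_line = "t=%d %d" % (start_unix + NTP_UNIX_OFFSET, stop_unix + NTP_UNIX_OFFSET)
    r_line = "r=%d %d %s" % (interval_s, active_s, " ".join(str(o) for o in offsets))
    return [t_line, r_line]


# One hour in total, active for 60 seconds at the start of every 120 seconds.
lines = time_description(0, 3600, interval_s=120, active_s=60)
print("\r\n".join(lines))
```

The second source would get the same lines with an offset of 60 instead of 0, so the two descriptions alternate.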
You must also realize that it would be entirely up to the consumer of the stream to honor the SDP, unless you also enforced the same time descriptions and repeat times at the server side.
For instance, if you don't end the stream at the 1 minute mark, the client may continue to listen and may or may not check the remaining media descriptions to determine whether they are still active.
I know my RtspClient doesn't do this, but I will add some logic for it, so thanks for showing me that people intend to use these features.
It can be changed quite easily: when the Describe response is received, and before any Setup requests are sent, check whether the SessionDescription contains a TimeDescription that applies to the MediaDescription and says it is not currently active :).
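To give an idea of what such a check could look like, here is a small Python sketch (not the RtspClient code, and all names are hypothetical). It assumes the "t=" and "r=" values have already been parsed into integers, treats a start or stop time of 0 as unbounded, and ignores time zone adjustments ("z=").

```python
NTP_UNIX_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def is_active(now_unix, t_start_ntp, t_stop_ntp, repeat=None):
    """Return True if a time description allows the session at now_unix.

    repeat is (interval_s, active_s, offsets) taken from an "r=" line, or None.
    """
    now = now_unix + NTP_UNIX_OFFSET
    if t_start_ntp != 0 and now < t_start_ntp:
        return False  # not started yet
    if t_stop_ntp != 0 and now >= t_stop_ntp:
        return False  # already over
    if repeat is None:
        return True
    interval, active, offsets = repeat
    phase = (now - t_start_ntp) % interval
    # Active if we are inside the active window that follows any offset.
    return any((phase - off) % interval < active for off in offsets)


start = NTP_UNIX_OFFSET          # Unix time 0, used only to keep the numbers small
stop = NTP_UNIX_OFFSET + 3600
every_other_minute = (120, 60, (0,))
print(is_active(30, start, stop, every_other_minute))
print(is_active(90, start, stop, every_other_minute))
```

A client would call something like this before each Setup request and skip the media whose description is not active.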
In short, even if you get support in the RtspServer and RtspClient, I am not sure that any other client enforces this in real life. In most cases I have seen, the RTSP layer gets the SETUP even if the Time Description within the SessionDescription doesn't allow it, which is why I didn't really bother to check it.
It may be easier, if you can control the client, to use an RTCP APP message to tell the client to switch, or possibly even a server-pushed RTSP message such as ANNOUNCE or PLAY_NOTIFY. You will have to play around to see what works for your application scenario.
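For the RTCP option, the APP packet layout from RFC 3550 is simple enough to build by hand. The Python sketch below is only an illustration: the four-character name "SWCH" and the one-byte payload carrying a source index are invented for this example, and zero-padding the data to a 32-bit boundary is my own choice since the content is application-defined.

```python
import struct

RTCP_APP = 204  # RTCP packet type for APP


def build_rtcp_app(ssrc, name, data=b"", subtype=0):
    """Build an RTCP APP packet: header, SSRC, 4-byte name, then app data."""
    if len(name) != 4:
        raise ValueError("name must be exactly 4 bytes")
    if not 0 <= subtype <= 31:
        raise ValueError("subtype must fit in 5 bits")
    data = data + b"\x00" * (-len(data) % 4)   # pad to a multiple of 32 bits
    length_words = (12 + len(data)) // 4 - 1   # length in 32-bit words minus one
    first_byte = (2 << 6) | subtype            # version 2, no padding bit
    header = struct.pack("!BBHI4s", first_byte, RTCP_APP, length_words, ssrc, name)
    return header + data


# Hypothetical message: ask the receiver to switch to source index 2.
packet = build_rtcp_app(0x11223344, b"SWCH", b"\x02")
print(packet.hex())
```

The receiver has to know the name and payload format in advance, which is why this only works when you control the client.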
What I can say though is that I can see this working more generally as follows:
When the client connects for the DESCRIBE request, create an SDP with multiple Media Descriptions, one from each source you want to encompass in the session.
(You may have to remap the payload types if they overlap, because some receivers will only use the payload type in the media description and not the SSRC.)
You can then use the PacketBuffer to virtually 'Pause' a certain stream, e.g. video1 or video2, to prevent its packets from going into the RtpClient of the receiver.
The receiver will get packets from whichever source is unpaused for your desired time limit.
Unpause the next source and Pause the current source.
Repeat the process of sending (optionally discarding packets which are older if you wanted).
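The payload type remapping mentioned in the steps above could be sketched like this in Python. Nothing here comes from the library: the function names are hypothetical, the 96 to 127 range is the usual dynamic payload type range, and the packet rewrite assumes a plain RTP header where the second byte holds the marker bit and the 7-bit payload type.

```python
def remap_payload_types(sources, first_dynamic=96, last_dynamic=127):
    """sources is a list of (name, payload_type); returns {name: assigned_type}.

    The first source to use a payload type keeps it; later duplicates are moved
    to a dynamic payload type that no source uses.
    """
    taken = {pt for _, pt in sources}
    seen = set()
    assigned = {}
    for name, pt in sources:
        if pt in seen:
            free = [c for c in range(first_dynamic, last_dynamic + 1) if c not in taken]
            if not free:
                raise ValueError("no free dynamic payload type left")
            pt = free[0]
            taken.add(pt)
        seen.add(pt)
        assigned[name] = pt
    return assigned


def rewrite_payload_type(rtp_packet, new_pt):
    """Return a copy of the RTP packet with its payload type replaced."""
    second = (rtp_packet[1] & 0x80) | (new_pt & 0x7F)  # keep the marker bit
    return rtp_packet[:1] + bytes([second]) + rtp_packet[2:]


mapping = remap_payload_types([("video1", 96), ("video2", 96), ("video3", 97)])
print(mapping)
```

The assigned values go into the "m=" and "a=rtpmap" lines of each Media Description, and every forwarded packet from a remapped source needs the same rewrite.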
You would also be able to make a small class with a Collection to help with the switching of packets if you ever wanted more than two sources, or wanted to do different types of buffering for each source depending on some other state variable.
This would allow all streams to keep playing individually and remain accessible individually, while only sending data on a certain session through a certain source at certain times, or in short, only when you wanted to allow it via the server or otherwise.
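A small switching class of that kind might look roughly like the following Python sketch. It is not built on the PacketBuffer; the class and method names are hypothetical, and the clock is passed in as a number of seconds so the logic stays easy to test.

```python
class SourceSwitcher:
    """Decides which source may send right now, rotating on a fixed time slice."""

    def __init__(self, source_names, slice_seconds):
        if not source_names:
            raise ValueError("at least one source is required")
        self._names = list(source_names)
        self._slice = slice_seconds
        self._started = None  # set on first use

    def active_source(self, now):
        """Return the name of the source that is unpaused at time now."""
        if self._started is None:
            self._started = now
        index = int((now - self._started) // self._slice) % len(self._names)
        return self._names[index]

    def should_forward(self, source_name, now):
        """True if a packet from source_name should be sent to the receiver."""
        return source_name == self.active_source(now)


switcher = SourceSwitcher(["video1", "video2"], slice_seconds=60)
print(switcher.should_forward("video1", 0))
print(switcher.should_forward("video2", 30))
```

In the server you would call `should_forward` with the current time for each outgoing packet and drop (or buffer) the packet when it returns False; per-source buffering policies could be stored in the same collection.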
A client such as VLC or QuickTime would not disconnect (Rtcp would still be sent) and SHOULD be able to display all the data within the same view (even with different resolutions). If a client tried to manually switch to a source, you would be able to handle that too, either by allowing it to affect the pause cycle or by choosing to ignore it.
Some viewers might choose to display the different feeds in different sections of the view, and some may choose to show each stream in the entire view; it's up to the viewer to decide. For example, VLC allows you to compose a custom viewport quite easily, either with different sources or with offsets of another source.
So, using a custom viewport or just including multiple media descriptions should be a much easier way to do what you're trying to do, no matter what codecs you use, and without needing to change anything with the sequence numbers or timestamps. You're just simulating input from multiple parties to a client (Audio or Video). Additionally, it should work much more generally than trying to get the Time Descriptions and Repeat Times to be used if they aren't already.
That should achieve what you're trying to achieve without any disconnects and should also work under both TCP and UDP.
This is just like issuing multiple SETUP requests for multiple streams and choosing to issue PLAY or PAUSE from the client, except that the server controls what is playing and what is paused. There is no request other than the SETUP, and the logic for resuming from the pause state lives on the server; the client doesn't have to do anything but SETUP all media in the description, which is much easier to achieve in most cases.
Let me know if you have any other questions or if you need further assistance!