Fix loosing network packets when LanLink socket is reset.
AbandonedPublic

Authored by albertvaka on Feb 25 2017, 4:53 PM.

Details

Reviewers
giedrius
Group Reviewers
KDE Connect
Summary

There is a problem, that if many network packets are sent in short time and also at the same time LanLink socket happens to be reset (calling LanLink.reset method), some packets may get lost. It seems that when LanLink is being reset and old socket is being closed, some data on the wire can be discarded - the peer then thinks that the packets were sent successfully, but the host happens to think that socket is closed and it is all over. As I am not Java programmer, I didn't find a way to close the socket in a manner that all the data would be fully transmitted. Not sure if it is possible with SslSocket class at all.

On the other hand, it seems that LankLink reset shouldn't be needed at all if we already have a functional socket. As far as I know, socket resets may happen in two cases, when a peer connects directly to a previously announced TCP port or when a host connects to a peer TCP port in response to broadcast UDP datagrams. In the former case, we definitely cant ignore the new socket, because the peer might have a good reason to create a new connection - maybe the socket on its side is broken or smth. However on the later case, if the host already has a working socket, there is no indication that it is broken, hence no need to create new sockets and risk loosing packets. This fix does exactly that - prevents creating new sockets if there is one already.

Test Plan

The scenario may be simulated with such bash script on peer side:

$ for i in $(seq 0 10); do echo $i > "tmp-$i.txt"; kdeconnect-cli --share "./tmp-$i.txt" --device 4b7309ec0a4ee3c1; done
$ kdeconnect-cli --refresh

Diff Detail

Repository
R225 KDE Connect - Android application
Lint
Lint Skipped
Unit
Unit Tests Skipped
giedrius created this revision.Feb 25 2017, 4:53 PM
giedrius edited the summary of this revision. (Show Details)Feb 25 2017, 8:09 PM

Hey Giedrius,

Sorry for the late reply. This was implemented this way on purpose: sometimes, links die (eg: a device disconnects from the WiFi) and there is no way for the other end to know, so we refresh the link every time to make sure we have one that works. Maybe the problem could be fixed in a different way? Like waiting for the packets to be sent before closing the socket, or something like that? I'm not sure about what could be a good solution to the problem :/

Albert

Hmm, I didn't notice such a problem and thought it should be resolved by keep-alives. Doesn't keep-alive detect broken connections?

I was thinking about keeping old connection for some time in case remaining packets would arrive, but this seemed unreliable: what timeout would be appropriate? What if network is very slow? There would probably would still be cases when packets would be lost, unless lingering connection would be kept for quite a long time.

Considering alternatives, what if when broadcast message is received, an empty packet would be sent on current connection in order to test it? If connection is broken, sending should fail and socket should close. In such case a new connection could be reestablished

The problem with keep alive is that it takes quite long to detect a broken connection... it is useful to detect a device is completely gone, but it takes a couple minutes to do so.

The solution that you mention I think is the ideal one: pinging the devices ourselves to know if we can reach them. However this is a bigger task, so we never prioritized it. Also it is a bit tricky, as it needs to be backwards compatible with devices that don't implement this feature.

albertvaka requested changes to this revision.May 31 2017, 1:30 PM
This revision now requires changes to proceed.May 31 2017, 1:30 PM
albertvaka commandeered this revision.Mar 25 2018, 11:50 AM
albertvaka abandoned this revision.
albertvaka edited reviewers, added: giedrius; removed: albertvaka.