Better broadcast/neighbor networking
AbandonedPublic

Authored by daniel.z.tg on Sep 2 2017, 9:06 AM.

Details

Reviewers
None
Group Reviewers
KDE Connect
Maniphest Tasks
T6913: Specific Broadcast Packets
Summary

Switch from broadcasting on "255.255.255.255" and checking if on mobile data to broadcasting link-local on compatible network interfaces.

"255.255.255.255" only works well on normal WiFi. While tethering, the packets may leak to the phone network, and/or not reach tethered clients. This problem is avoided with link-local addresses like "192.168.X.255". Using link-local addresses also makes it easier to support IPv6 in the future.

Support clients on:

  • WiFi tethering (even with nobody connected)
  • Bluetooth tethering
  • USB tethering

Diff Detail

Repository
R225 KDE Connect - Android application
Branch
network-broadcasting
Lint
No Linters Available
Unit
No Unit Test Coverage
daniel.z.tg created this revision.Sep 2 2017, 9:06 AM

I like the idea of broadcasting to certain broadcast IP only, but why the change in isOnMobileNetwork()? It is now way more complex than the previous version, and as far as I know the previous implementation was working well, plus you had to add a new permission to the app.

daniel.z.tg planned changes to this revision.EditedSep 2 2017, 10:38 AM

I plan on fully moving isOnMobileNetwork() to getNetworkBroadcastDestinations(). This is because isOnMobileNetwork() currently prevents the "255.255.255.255"-destined packet from being sent from LanLinkProvider, but subnet-specific destinations from getNetworkBroadcastDestinations() will only be found when we will want to use them. If no destinations are discovered, then behavior will be similar to when isOnMobileNetwork() == true. In the GUI, the message could be changed from "you are on a mobile data connection" to something like "It looks like you aren't connected to any compatible networks. Try connecting to a local network."

Most of the changes to 'NetworkHelper' are for temporary debugging. The list of addresses that will be used and where they come from is being debugged. Also, I don't think android.permission.ACCESS_WIFI_STATE will be needed when getNetworkBroadcastDestinations() is done.

Can everyone verify from the logcat that it detects broadcast addresses, and it properly marks them as usable?
In particular, the regular WiFi use case needs to be checked, as it is the setup that most people use:

  1. Connect to the same network, a KDE computer using any type of connection, and an Android device using WiFi.
  2. On the Android, the logcat should show the "wlan0" interface being found, marked as usable, and give a broadcast address. Note this address as the "broadcast address."
  3. Run the command "ip a" on the KDE computer. Find the interface used to connect to the network, and note the address below next to "inet." Note this address as the "computer address."
  4. Check that the "computer address" will receive packets sent to the "broadcast address."

Here are the logs of the expected output:

Regular WiFi to router:

WiFi to computer hotspot:

WiFi tethering:

USB tethering:

Bluetooth tethering:

No networks: (glitch after stopping WiFi tethering?)

No networks: (normal)

Mobile data:

  • Switch to per-interface broadcasts & Fix tethering

Instead of checking whether we are on a mobile data network so that we
know not to send packets there, we will loop through all enabled network
interfaces, check if a KDE connect computer could be on the other end,
and use their broadcast addresses.

Tethered network changes are better detected now.
The network change detection now also filters out refresh storms.

BackgroundService has been made faster and static to help with the
network detection changes.

daniel.z.tg retitled this revision from Prepare for better broadcast/neighbor detection to Better broadcast/neighbor networking.Sep 3 2017, 10:26 AM
daniel.z.tg edited the summary of this revision. (Show Details)

Effects of most recent changes, tested with one computer and one phone:

Unaffected:

  • Normal WiFi should still work.

Internal:

  • Different broadcasts are received
  • BackgroundService is now static (it was already singleton before)
  • BackgroundService has been slightly optimized (not benchmarked)
  • Link-local broadcast addresses are now used instead of the generic "255.255.255.255"
  • Mobile network detection is now incompatible network detection, and is now done by checking if no broadcast addresses can be found
  • Translations for incompatible network detection are missing

Working better:

  • WiFi tethering
    • Enabling/disabling tethering is correctly in sync with detection of it as a compatible network
    • Peer detection & connection works partially
      • Should fully work on Android L and below (untested)
      • Fully working on LineageOS 14.1 (Android N) and some other ROMs
      • Requires refresh for other devices on Android M and above
    • Peer disconnection requires disabling tethering to be noticed
  • Bluetooth tethering sometimes works fully, and always works partially
    • Peer discovery & connection sometimes requires refresh
    • Peer disconnection is fully working
    • Enabling/disabling tethering is not applicable. Network compatibility is detected upon peer connection and disconnection.
  • USB tethering works partially
    • Peer discovery usually requires a refresh
    • Peer connection may require refresh
    • Peer disconnection is fully working
    • Enabling tethering sometimes requires a refresh to network comparability. When peers are detected before refreshing, the GUI still says that no compatible networks are available.
    • Disabling tethering is fully working is correctly in sync with detection of a compatible network being removed
daniel.z.tg added a comment.EditedSep 3 2017, 11:22 AM

Right now there are three different neighbor discovery codepaths. Only the new per-interface link-local one seems to work, and because of this, is enabled.

Using one or more of the other ones, with or without the link-local one, results in many sockets errors. Problems might also arise when having two devices both connected to two networks, or otherwise having them both detect two broadcast/neighbor paths. Refreshing sometimes helps. Sometimes there are problems connecting, or dropped connections:

I'm trying to review this, but it's huge, so I will need some help. What are the three code paths you mention?

Also, if you want people to test if it works on their devices, maybe you can add an UI with some debug info, so people who find this doesn't work can come back with useful info.

src/org/kde/kdeconnect/BackgroundService.java
58–65

Why making them static?

248

Why is it costly?

src/org/kde/kdeconnect/Helpers/NetworkHelper.java
53–54

How do we know every USB tethering will start with 'rndis', etc? Is this documented anywhere?

src/org/kde/kdeconnect/UserInterface/DeviceFragment.java
341

Unrelated change?

daniel.z.tg updated this revision to Diff 19152.Sep 4 2017, 5:54 AM
  • Finish broadcast networking & Remove debug

Fixed a the no network message being covered by the unreachable message
on the device screen.
Most added debug messages were removed.
Simplified code after removing debug messages.
The ARP neighbor discovery was removed. It was broken and causing
connection problems.
Broadcast discovery currently uses NetworkInterface broadcast addresses.
The 255.255.255.255 broadcast is disabled but the code still exists.

The 3 codepaths can be found in NetworkHelper. They were the following (but now changed):

  • "255.255.255.255"
  • Broadcast addresses from NetworkInterfaces. A list of these can be generated on Linux Java 8 using:
Collections.list(NetworkInterface.getNetworkInterfaces()).stream()
		.filter(x -> isInterfaceNameCompatible(x.getName()))
		.map(NetworkInterface::getInterfaceAddresses).flatMap(List::stream)
		.map(InterfaceAddress::getBroadcast).filter(Objects::nonNull)
		.collect(Collectors.toList());
  • Network neighbors, (from ARP, or other places). On Android/Linux shell:
$ cat /proc/net/arp
$ ip n
$ arp

The ARP codepath has been removed in the latest change, and exactly one of the other two can be active at a time. Previously, enabling multiple codepaths would cause things to randomly and silently break.

Also, getting rid of isInterfaceNameCompatible() does not have any effect on my phone. My mobile data interface, rmnet_data0, does not have a broadcast address, so it does not get included. My virtual interfaces like loopback don't have broadcast addresses either. Is this guaranteed for all devices? If so, we can fully remove isInterfaceNameCompatible(). But if this is not the case, removing it might cause a routing loop if things get sent to loopback interfaces.

src/org/kde/kdeconnect/BackgroundService.java
58–65
  1. They were already effectively static, since only 1 instance of BackgroundService gets created, and that instance is stored is a static field, and all users use that static object to access instance fields & methods.
  2. BackgroundService needed to be called everywhere, and to do this either the instance needed to be passed around, or getInstance() needed to be called. It also looks cleaner, and is more efficient to avoid calling getInstace() every time.
  3. KdeConnectBroadcastReceiver needs BackgroundService.refreshNetwork(). refreshNetwork() in turn needs to use the static locks. It also needs to use onNetworkChange(). Mixing instance and static when they could be the same looks wrong, so I changed onNetworkChange() to static. It needed linkProviders to be static, so I changed that as well. For consistency, I changed the other collections, devices and discoverymodeAcquisitions to static, along with their accessor methods. In the end, nothing except the initialization-related methods needed the BackgroundService instance anymore, so I stopped passing it around in onServiceStart().

They don't need to be static, but it just simplifies things to make them that way. They will still work if changed back to instance fields/methods and chained with getInstance().

248

It's not anymore (only 5ms). It used to be costly earlier, when a lot of broadcast receivers were added, and debug messages were printed everywhere.

Now, the filtering is only used to help with connection problems. If the network is refreshed twice before a neighbor can respond, the connection to that neighbor might require another refresh to get working. This is made worse by more broadcast receivers being added, and them actually refreshing the network every time it changes (before, they didn't always get registered).

src/org/kde/kdeconnect/Helpers/NetworkHelper.java
53–54
src/org/kde/kdeconnect/UserInterface/DeviceFragment.java
341

I don't know the side effects of creating the plugin list, so I just skip it when it's invisible and all that should show is an error message. Probably doesn't make a difference.

To make this easier to review, there are 4 main parts:

Part 1: GUI
Being on mobile data is now called being disconnected or "not being connected to any compatible networks". Non-English translations are needed.
The disconnected error is now on top of the unreachable error.

Part 2: BackgroundSerivce
BackgroundService is now mostly static. A lot of changes resulted from changing the signature of onServiceStart() from void (BackgroundService) to void (). Changes were mostly done automatically by finding and replacing references to the service field to the BackgroundService class, and refactoring the method signature.
All calls to onNetworkChange() now go through refreshNetwork().
While making things static, getDevices() has been made to return Collection<Device> instead of ConcurrentHashMap<String, Device> because nobody cared about the key, and only wanted the Collection<Device> obtained through values().

Part 3: Networking
NetworkHelper has finished switching to link-local broadcasts.
Code for "255.255.255.255" has been left in, but is disabled, so that we can switch back if we have problems. A GUI toggle could be added.
Being disconnected is when there are no broadcast destinations.
Calculations are cached, and tied to BackgroundService.refreshNetwork().
Listeners for tethering have been installed. The network change listener has been fixed for Android N ( https://developer.android.com/reference/android/net/ConnectivityManager.html#CONNECTIVITY_ACTION ).

Part 4: Optimization, etc.
Code affected by the other changes:

  • Has been simplified and/or optimized.
    • LanLinkProvider
    • BackgroundService and users
    • CustomDevicesActivity and users
    • DeviceFragment
  • Has its internal API changed from leaking implementation detail like ArrayList foo() to generic List foo() to make things easier to change.
    • BackgroundService
    • CustomDevicesActivity
  • Has been deduplicated.
    • BackgroundService users
    • CustomDevicesActivity

Thanks for your time working on this. Just two quick comments (I haven't finished reviewing it yet, will continue tomorrow):

  • Translations are managed by the KDE translation teams, so we don't need to worry about them :) This includes removing all the instances of 'on_data_message' on languages other than English like you did: this is something that the translation update script would have done automatically for us.
  • It doesn't matter this time, but it is usually a good practice to split unrelated changes in separated review requests. It makes reviewing easier and helps to keep a meaningful git history (so, for example, a change can be reverted without reverting everything else). For example, using TextUtils.join in the CustomDevicesActivity has nothing to do with the rest of the patch, and it could be merged right away if it was its own patch.

Last thing: I see you are not subscribed to the KDE Connect mailing list [1]. If you want to keep contributing to KDE Connect, it would be a good idea to subscribe so you can take part in discussions and code reviews :)

[1] https://mail.kde.org/mailman/listinfo/kdeconnect

daniel.z.tg abandoned this revision.Sep 19 2017, 7:46 PM

I think it would be better if I finished the UI toggle first, then added this, so that we can reduce the risk of breaking certain peoples' connections.

Also, I will properly split this up into individual changes.
I plan on organizing it like:

Branch 1: internals, off of master

  • Change 1: Required internal changes
  • Change 2: Optimizations

Branch 2: ui, off of internals

  • Change 3: Fix UI centering
  • Change 4: Settings

Branch 4: ui-dark, off of ui

  • Change 5: Dark theme support
  • Change 6: Dark theme

Branch 5: network, off of ui

  • Change 7: Improve network change detection
  • Change 8: Mobile network support

This can make the changes only affect one thing.