How should 🖥 Desktop & Mobile connect?

I’ve heard the desktop <-> mobile connection is a hot topic now :fire:. I’ve designed 4 different concepts of how this can be handled from a UX point of view. However, I’m missing information on the technical and security side of things.

What do all concepts have in common?

Status Mobile initiates the “communication” with the desktop, not the other way around. This prevents anyone who knows your public key from requesting access and waiting for you to make a mistake. Imagine receiving random requests at a time when we don’t support any spam prevention feature. Oops :grimacing:

It would be super helpful if you could check Figma and come back here, or comment in Figma, on how this should work.

Open views in Figma <- Start here

A few of the million questions I have:

  1. What happens in the background when you scan (QR)? Is it the desktop’s temporary public key?
  2. Can the user initiate and scan the desktop’s public chat key (QR) from the mobile wallet or another place, or should this be initiated as a new 1:1 chat?
  3. Do we need to send a verification code (OTP) after the user scans the QR? If yes, which device should display it and which should enter it?
  4. Can syncing start automatically after you authorize the desktop client (mobile can revoke syncing anytime)? Previously, you had to allow syncing on both sides (not user friendly). Is there a reason we want to continue doing it this way?

Thanks for posting this, amazing work!

A couple of questions:

Status Mobile initiates the “communication” with the desktop, not the other way around

This is interesting. We had a time when desktop was widely used among core contributors, and from my own personal experience this would be fairly inconvenient for me.
Desktop installations are inherently more “permanent” than mobile ones, as we uninstall/install apps more frequently on mobile devices and we lose/break them more easily.
Effectively, desktop was the installation that I kept throughout, and every now and then I would sync a mobile phone (by entering the seed phrase as before).

If we only allow one way, does it mean that if I first install desktop and later install on mobile, my option would be to restore the account through a seed phrase and then initiate communication with desktop from mobile? That would be fine by me. As long as there’s one option and we don’t prevent this, it would work for my use case.

Another use case is linking multiple mobile devices (such as my phone and tablet). Is that an option?

One thing that I believe is important to keep in mind when designing the flows is that (contrary to other apps like WhatsApp, Signal, etc.) we have no source of authority on how many devices there are, which one is authoritative, etc.
I’ll try to explain that.

Let’s take Signal as an example (WhatsApp has limited support for multiple devices).

Basically in Signal, you first register a device associated with a phone number.
Signal will treat that device as authoritative: it’s associated with your phone number, and that’s how they verify your identity.
In this way, desktop apps are seen as subordinates of this device, and that information is communicated to anyone wanting to send messages to you.
So actions such as “Unlink device” are very clear. Once you unlink desktop, desktop won’t receive messages anymore. And “unlinking” your mobile phone from desktop is not possible (i.e. having desktop receive your messages only, and not the mobile phone), as there’s a hierarchy that is clear and can be enforced (https://support.signal.org/hc/en-us/articles/360007321111-Unlinking-devices).

In our world we don’t really have that option. First, there’s no central authority (Signal’s server in the previous example) to check which device is “authoritative”. Second, anyone can enter a seed phrase on multiple devices and access their account.
If two devices have the same seed phrase, which one do we trust? (This can be a user having lost their phone and restored an account, a user having two devices and having restored the seed phrase, or a malicious user having stolen a seed phrase.)

So practically speaking, no device should be considered above others, and imposing a hierarchy might lead to a convoluted technical solution that will likely be hard to maintain (not just from a technical point of view, but from a UX one as well).

In terms of options for how this technically is achievable, 3 options come to mind:

WhatsApp

WhatsApp provides a web interface, https://web.whatsapp.com/ (I know it is web, but desktop can potentially use a similar mechanism).

Here the phone basically acts as a relayer for the desktop app. No key material is shared: your identity key is always on your phone, and once you have authorized desktop, your phone encrypts messages to desktop as they are received. (We can go into what the QR code will contain, but basically it would be just a key exchange, for example an X3DH bundle.)

The benefit is that no key material is shared. The drawbacks are that your phone always needs to be connected for this to work (same as WhatsApp), and that it still does not address the case where, for example, a user enters the same seed phrase on 2 devices, bypassing this mechanism.

This can potentially be applied both ways, so your desktop might act as a relayer (but it only makes sense if you entered your seed phrase on it first, and then synced your mobile phone).

Overall this does not look very appealing, as keeping a device always on is fairly inconvenient. It suits a web interface (not surprisingly :slight_smile: ), but it is nonetheless an option.
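To illustrate the trade-off, here is a toy in-memory sketch of the relay model. `RelayPhone` and `Desktop` are hypothetical names, and the encryption step is only marked with a comment, so this is not how WhatsApp or Status actually implement it.

```python
from dataclasses import dataclass, field


@dataclass
class Desktop:
    inbox: list = field(default_factory=list)


@dataclass
class RelayPhone:
    online: bool = True
    inbox: list = field(default_factory=list)
    authorized: list = field(default_factory=list)  # desktops approved via QR scan

    def deliver(self, message: str) -> bool:
        """The network delivers a message addressed to the user's identity key."""
        if not self.online:
            return False  # nothing reaches the desktop while the phone is off
        self.inbox.append(message)
        for desktop in self.authorized:
            # A real client would re-encrypt for the desktop's session key here.
            desktop.inbox.append(message)
        return True


desktop = Desktop()
phone = RelayPhone(authorized=[desktop])
phone.deliver("hello")         # relayed to the desktop
phone.online = False
phone.deliver("are you there")  # dropped: the desktop depends on the phone
```

The identity key never appears on the desktop side, which is the benefit, and the second `deliver` call shows the drawback.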

Signal

Signal uses a different mechanism. When syncing, it will actually share your private identity key with your desktop (sending it through an e2e encrypted channel, which is what the QR code helps you set up).
This is the equivalent of restoring a seed phrase on both devices. (There are variations on this: for example, you don’t have to send the root key, and could send only the chat key if you don’t want to share the wallet, but the concept is identical.)
Currently this is basically how multi-device works (minus the fact that we don’t send the key over the wire, only through entering the seed phrase).
The benefits are that it applies most cleanly to our setting, it’s already implemented in the protocol, and the fact that it’s good enough for Signal gives us some degree of confidence. The drawback is that key material is shared across devices, so it is inherently less secure than the solution above.

Hierarchical key scheme

This is another option, where your root key would delegate to other keys and instruct other peers to send to them. So effectively no key material is shared (you only sign some authorization that will be presented to other peers).
The benefit is that this is more secure than Signal’s way (though less so than WhatsApp’s), as no key material is exchanged.
The drawbacks are that it is more complex and fraught with technical issues, as it does not fit very well with a decentralized system, and that it does not solve all the use cases. For example, what happens if a user restores a root key on multiple devices (in this case you get the extra complexity, and you still might have to maintain syncing like in Signal’s way)? What happens if a key that is a delegate delegates further?
I would not go for this option, to be honest, as the complexity does not seem to be justified.

I have only considered options that do not require any on-chain action, to best fit the current usage of the app, which does not require any interaction with the blockchain for its basic functioning.
I favor Signal’s way, as it seems to be the best compromise between security and ease of use, and it covers all the use cases (including a user restoring a seed phrase on multiple devices).

Questions

To answer your questions:

  1. What happens in the background when you scan (QR)? Is it the desktop’s temporary public key?

That really depends on the underlying mechanism. It is likely a key exchange of some sort to build an encrypted session between the two devices that can be used for further communication (it could be a public key or an X3DH bundle, but there is no reason to settle on a strategy for now).

  2. Can the user initiate and scan the desktop’s public chat key (QR) from the mobile wallet or another place, or should this be initiated as a new 1:1 chat?

I don’t have a strong opinion on it myself, technically both are feasible.

  3. Do we need to send a verification code (OTP) after the user scans the QR? If yes, which device should display it and which should enter it?
  4. Can syncing start automatically after you authorize the desktop client (mobile can revoke syncing anytime)? Previously, you had to allow syncing on both sides (not user friendly). Is there a reason we want to continue doing it this way?

Yes, it can start automatically. Previously you had to do it on both sides because pairing was non-interactive, and you definitely want both devices to take some sort of explicit action to enable syncing (so that whoever is operating either of the devices is aware that syncing has taken place, to prevent ghost devices).

If it’s interactive (so an action is required on both devices), syncing can start automatically.
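A minimal sketch of that rule, assuming a hypothetical `PairingSession` object (not something that exists in the Status codebase): syncing turns on by itself, but only once every device in the session has recorded an explicit action.

```python
from dataclasses import dataclass, field


@dataclass
class PairingSession:
    """Hypothetical model: tracks which devices explicitly approved a pairing."""
    device_ids: frozenset
    confirmed: set = field(default_factory=set)

    def confirm(self, device_id: str) -> None:
        # Only devices that are part of this session may confirm it.
        if device_id not in self.device_ids:
            raise ValueError(f"unknown device: {device_id}")
        self.confirmed.add(device_id)

    def sync_enabled(self) -> bool:
        # Syncing starts automatically once every device has acted.
        return self.confirmed == set(self.device_ids)


session = PairingSession(frozenset({"mobile", "desktop"}))
session.confirm("mobile")
one_sided = session.sync_enabled()   # False: desktop has not acted yet
session.confirm("desktop")
both_sided = session.sync_enabled()  # True: both sides took an explicit action
```

A device that merely holds the same key but never confirmed (a ghost device) is not in `confirmed`, so it cannot switch syncing on by itself.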

Overall my advice is to move a bit away from mobile/desktop having different roles, and to consider them as equal as much as you can in terms of the role they have in syncing (of course sometimes that’s not possible, for example we can’t ask a desktop user to scan a QR code :slight_smile: ).

There’s much more to say about syncing unfortunately as it’s quite complex given the landscape we have, but let me know if something is not clear. There might be other options that I didn’t consider of course, and I’ll be glad to hear them.


Awesome, good job :grinning:

Regarding

Status Mobile initiates the “communication” with the desktop, not the other way around

I’d like to separate out several factors and arguments I’ve heard these last weeks so we can discuss the trade-offs between them. I see all of these factors determining the optimal flow for setting up a new device, in this case Desktop. I’d also love to hear if there are more.

  1. Value Desktop application should provide
  2. Value proposition
  3. Experience of setting up a device
  4. Security of key storage
  5. Future scenarios
  6. Time

I also give my own conclusion on what this means for the flow of setting up a device at the end, if you want to skip to that :wink:

1. Value Desktop application should provide

Going by this discussion and other conversations, the key ‘value’ of desktop is to support Status as a network. This includes offering node management, simply adding nodes to the network, and offering infrastructure for other future functionality for mobile that doesn’t necessarily live on mobile. And yes, chat communication.

This value impacts the use cases and thereby the optimal flow.

Another factor here is the potential value of requiring Mobile to set up Desktop, as it would duplicate potential peers. (As pointed out by @jarradhope)

2. Value proposition

I’d like to separate here the value of desktop and the value proposition to end users. To be successful in supporting the network, people need to actually install and run the application. There needs to be an appealing proposition to do this. This proposition can be ‘Earn SNT’ (with node management) and ‘Secure communication on any device’, perhaps both. This value proposition also affects the use cases and what makes for a convenient flow, mainly because it helps us understand how frequently users would be authorizing Desktop to use their keys (whether they are stored on mobile or locally).

cc @jonny_z would love to hear your thoughts on this here as well when you get a chance

3. Experience of setting up a device

Totally agree on the inconvenience when Desktop is the main device, and the proposition is to use Status Desktop, among other things, for office communication or node management, using keys you only ever use on Desktop. If the proposition is anywhere, anytime secure, private communication, mobile would be the main device for most users (not implying authoritative).

Again, this depends on the value proposition that drives people to install the app on Desktop. Importantly, choosing one direction (e.g. ‘Secure, private messaging on desktop to occasionally check messages you normally check on mobile’) doesn’t cancel out all other requirements. It does help optimize the UI for convenience for 80% of users, 80% of the time, effectively making the difference between ‘Enter your seed phrase’ being the primary button and ‘Connect using mobile’ being secondary, or the other way around.

Segueing into security, @iurimatias noted that using your mobile to sign could lead people to perceive the application as being more secure (aside from whether this is or isn’t the case). If ‘security’ is a selling point, this is worthwhile to keep in mind (only if storing on mobile is actually more secure, that is).

4. Security of key storage

Another factor that weighs into Mobile initiating communication is the level of security we can guarantee when storing keys on Desktop. I’ve heard storage on Desktop being described as ‘wild west’, whereas we have an audited and trialed security model in place on Mobile.

However, as @cammellos points out, “we lose and break phones more easily”. Surely storing keys on mobile has its limits too.

5. Future scenarios

Then, to complicate things further, aside from Keycard we currently only have one security model for multiaccounts in the Mobile app, while in the future we may have more security layers. If for some low-value accounts we’d allow storing the password to unlock keys in a password manager, is that all too different from storing keys on desktop? This seems like the hierarchical key scheme @cammellos describes.
Also, and admittedly I don’t fully grasp the potential and impact of account contracts yet, I think the way accounts are used and keys are managed now might change, at least for some accounts. Would you be able to appoint devices as delegates to use your keys? Is it likely that a chat key will ever be held by a contract?

6. Time

Whichever trade-off seems best will depend on the time and complexity needed to deliver it. Personally I anticipate something like:

  • phase 0 - Getting a client up and running will store keys on Desktop as it mirrors the Mobile implementation
  • phase 1 - Client can offer improved security by allowing to use Desktop with keys stored on Mobile
  • phase 2 - Offer increased security model for keys stored on Desktop, but still the option for even better security by storing on mobile (+ optionally Keycard)

My own conclusion

Up until now, I’ve tried to only highlight some of the factors worth considering.

Looking at the options described by @cammellos, I’d be keen to explore the Signal-like approach of sending only the chat key over Waku as the primary option, in combination with a secondary option to recover your full multiaccount using your seed phrase. I believe this combination can offer a clear option for users who are mainly drawn to the p2p, private messaging proposition that a lot of our communication is centered around, while at the same time offering advanced users a way to ‘do more’ with desktop.

Effectively offering users a ‘Light’, chat-only connection and a ‘Full’ install.

I agree that a hierarchical key scheme seems to introduce too much complexity and, aside from that, a dependency on a project that’s beyond desktop and what Core Mobile is currently working on.


Awesome work on the desktop designs.

Account syncing with a new desktop installation

With regard to the Signal model @cammellos has described, I’d propose the following rough process flow.

  • The mobile app stores the private chat key
  • User installs the desktop app
  • Desktop app creates a hidden, throwaway random private key
  • Desktop shows a QR code that contains the throwaway public key and a “sync” identifier
    • Sync identifier tells the app that a sync is being requested
  • User confirms the sync in the mobile app
  • Mobile app sends a special Waku message via a 1:1 chat with the throwaway public key
    • message contains the chat key
    • for extra security this could be done
      • via a LAN only connection
      • have a non-propagation flag attached to the message (in case the desktop is functioning as a full node)
  • Desktop receives the chat key and initialises the account set up and mailserver syncing
  • Desktop deletes the throwaway private key
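The QR and message steps above could be sketched roughly as follows. Everything here is an assumption made for illustration: the JSON layout, the field names (`type`, `pk`, `chat_key`, `propagate`), the 33-byte key size and the function names are hypothetical, the public key is random placeholder bytes rather than a real key, and the encryption of the 1:1 message is only indicated in a comment.

```python
import json
import secrets
from typing import Optional

SYNC_TYPE = "sync"     # hypothetical sync identifier; the thread fixes no format
PUBLIC_KEY_BYTES = 33  # assumed size of a compressed public key


def desktop_make_qr_payload() -> str:
    # Placeholder bytes stand in for the throwaway public key; a real client
    # would derive it from a freshly generated private key.
    throwaway_public_key = secrets.token_bytes(PUBLIC_KEY_BYTES)
    return json.dumps({"type": SYNC_TYPE, "pk": throwaway_public_key.hex()})


def mobile_parse_qr_payload(payload: str) -> str:
    """Return the throwaway public key (hex) if the QR code is a sync request."""
    data = json.loads(payload)
    if not isinstance(data, dict) or data.get("type") != SYNC_TYPE:
        raise ValueError("QR code is not a sync request")
    public_key_hex = str(data.get("pk", ""))
    if len(bytes.fromhex(public_key_hex)) != PUBLIC_KEY_BYTES:
        raise ValueError("unexpected public key length")
    return public_key_hex


def mobile_build_sync_message(payload: str, chat_key: bytes,
                              user_confirmed: bool) -> Optional[dict]:
    public_key_hex = mobile_parse_qr_payload(payload)
    if not user_confirmed:
        return None  # the user declined, so the chat key never leaves the phone
    return {
        "to": public_key_hex,        # 1:1 message addressed to the throwaway key
        "chat_key": chat_key.hex(),  # would be encrypted to that key in practice
        "propagate": False,          # models the suggested non-propagation flag
    }
```

The sync identifier is what lets the mobile app tell a sync request apart from any other scanned QR code, and the chat key is only packaged after the user confirms.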

This, in my mind, is the best approach.


cc @maciej @Simon

Thanks for following up here @samuel! Am I correct in thinking that the flow you’re proposing is for chat key sharing and device pairing without having to enter the seed phrase? (i.e. there’s no exchange of keys used for wallet tx)

for extra security this could be done

  • via a LAN only connection
  • have a non-propagation flag attached to the message (in case the desktop is functioning as a full node)

Simon and I discussed the use of a One-Time PIN shown on the Desktop that needs to be confirmed on Mobile. So you get something like:

  • Desktop app creates a throwaway random key
  • Desktop app shows a QR and sync identifier
  • User scans QR and sends message
  • Desktop shows and sends OTP
  • User compares; if the OTP is correct, mobile sends a message that contains the chat key

Do you think this could offer extra security? I probably need to understand the potential security risks better.

Yes, correct…

Hmm, I don’t think that this would add additional security. An OTP basically confirms that the user has a known device that belongs to them, which is useful for web authentication, for example. But in this case the user would be confirming that they have the device they are looking to pair with; it would serve as a double check that the device the user is pairing with is actually the device they mean to pair with.

But in the case of a Status node, I don’t think that we should have the concept of pairing in the technical sense; we don’t need to form a link between the two devices (unless we do :confused: ). What we really want is to bootstrap a blank desktop node with the signing credentials (private key or seed phrase) that the user’s mobile node has. Once the desktop node has the credentials, it will be able to operate independently of the mobile device.

Both devices (mobile and desktop node) will be listening for and decrypting the same data once they both have the same key.


Thinking about it more, OTP is actually a really good idea; it should just work the other way around. The user needs to confirm that the device that is requesting their key is the device they intend to send the key to. We don’t want users giving away their keys by accident, or getting their keys hacked by someone sending requests to all known chat IDs. Something like:

  • User opens up sync option page on mobile
  • Mobile creates a code
  • User enters the code into the Desktop
  • Desktop app creates a throwaway random key
  • Desktop app shows a QR and sync identifier and OTP code from the user’s device
  • User scans QR and sends message
  • Mobile node compares OTP in QR to OTP in sync session
  • If OTP is correct, mobile node sends message that contains chat key
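The comparison step in this reversed flow can be sketched as below. The function names, the 6-digit code length and the JSON field names (`type`, `pk`, `otp`) are hypothetical, and the public key is again random placeholder bytes.

```python
import hmac
import json
import secrets


def mobile_create_code(digits: int = 6) -> str:
    # Cryptographically random, zero-padded numeric code shown on the phone.
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def desktop_make_qr_payload_with_code(entered_code: str) -> str:
    # The desktop echoes the code the user typed, next to its throwaway key.
    throwaway_public_key = secrets.token_bytes(33)  # placeholder, not a real key
    return json.dumps({
        "type": "sync",
        "pk": throwaway_public_key.hex(),
        "otp": entered_code,
    })


def mobile_should_send_chat_key(payload: str, session_code: str) -> bool:
    data = json.loads(payload)
    if not isinstance(data, dict) or data.get("type") != "sync":
        return False
    scanned_code = str(data.get("otp", ""))
    # Constant-time comparison of the scanned code with this session's code.
    return hmac.compare_digest(scanned_code.encode(), session_code.encode())


code = mobile_create_code()
payload = desktop_make_qr_payload_with_code(code)
allowed = mobile_should_send_chat_key(payload, code)  # True: codes match
```

A QR code produced by a device that never saw the phone's code fails the comparison, so a request broadcast to all known chat IDs would not yield the chat key.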

The reason I suggest

  • via a LAN only connection
  • have a non-propagation flag attached to the message (in case the desktop is functioning as a full node)

is that you limit your network to only the devices in your home or office, for example, and reduce the chance that the requesting agent is illegitimate.

Thanks for the post.
The syncing method looks good. My only observation is that I would make the flow symmetrical, so that for this purpose we don’t talk about desktop and mobile but about device A and device B, with some exceptions (desktop won’t be able to scan a QR code, so we need to provide an input for the public key, etc.), as I might want to connect my desktop with mobile, vice versa, or mobile with mobile.

But in the case of a Status node, I don’t think that we should have the concept of pairing in the technical sense, we don’t need to form a link between the two devices (unless we do :confused: ).

Of course it depends on requirements, but if you want them to be in sync, we do :slight_smile:

The main reasons are:

  1. Some actions don’t generate data over the wire. Opening a public chat is one example: currently, if devices are paired, it creates the public chat on both devices (just by naively sending a message to the paired devices).

  2. You only receive messages that other users sent to you; you would not receive messages sent by you on another device. We currently send a message to the paired devices every time you send a message.

  3. Some historical data is currently synced (contacts/chats open).
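To make reasons 1 and 2 concrete, here is a toy in-memory model of the fan-out to paired devices. `Device` and its methods are hypothetical and do not reproduce the actual pairing messages; reason 3 (historical data) is left out.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare devices by identity, since pairings are cyclic
class Device:
    name: str
    paired: list = field(default_factory=list)    # explicitly paired devices
    messages: list = field(default_factory=list)  # (chat, text) pairs seen locally
    public_chats: set = field(default_factory=set)

    def pair_with(self, other: "Device") -> None:
        # Pairing is mutual and explicit.
        if other not in self.paired:
            self.paired.append(other)
            other.paired.append(self)

    def send_message(self, chat: str, text: str) -> None:
        self.messages.append((chat, text))
        # Nobody else sends our own messages back to us, so each one is
        # copied to the paired devices explicitly.
        for device in self.paired:
            device.messages.append((chat, text))

    def join_public_chat(self, topic: str) -> None:
        self.public_chats.add(topic)
        # Joining produces no traffic by itself, so paired devices are told.
        for device in self.paired:
            device.public_chats.add(topic)


mobile, desktop, unpaired = Device("mobile"), Device("desktop"), Device("unpaired")
mobile.pair_with(desktop)
mobile.send_message("alice", "hi")
desktop.join_public_chat("status")
```

After this runs, `desktop` has the message sent from `mobile` and `mobile` has the `status` chat, while `unpaired` (same user, never paired) has neither.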

There are also some security considerations on 2: as we use PFS, encryption is device-to-device, and you need to take ghost devices into consideration (devices that have access to your pk but have not explicitly paired), so you would not want to sync with only pk encryption. But I won’t get into the weeds.

All of this is already in place (https://github.com/status-im/status-go/blob/develop/protocol/protobuf/pairing.proto) and works (we sync historic/new contacts, historic/new public chats, new messages). It is a bit naive, so it could be improved upon, but it basically just assumes the two devices have the same key and takes it from there. Ideally we keep the same protocol, improve the UX around pairing, and build on top if necessary (historic messages, group chats, for example), so we don’t have to break compatibility.
