Discussion on recent Chat communication protocol changes

learning

#1

Abstract

During our October 2nd weekly chat meeting, we discussed an issue that came up with a recent pull request that changed the communication protocol to support message replies. It did so by adding a new message type c7, so as to not break existing clients which relied on message type c4 (as discussed in Slack, there are ways to achieve that in an existing message type without breaking bi-directional compatibility). The end result was that old clients were not able to display messages from newer clients (which were now only sending the new c7 message type).

Meeting agenda

  • Review: Beta Upgrade Policy.
    • Every release should be compatible especially in terms of:
      • User’s data
      • Messaging/Transactions
    • Should we update the policy, in view of new additions such as PFS?
      • Do we want to write down guidelines regarding what to do when we publish broken builds (e.g. maintaining the new c7 message)?
  • Reduce friction: better communications around new nightlies bug and incompatibility
    • nightly channel posting latest nightlies and known big bugs?
  • Compatibility of messages between releases/nightlies;
  • Message validation;
  • What can be learned from this exercise to reduce the chance of repeating in the future? Any (lightweight) safeguards we can put in place?

Meeting notes

  • Introduction
    • @arnetheduck started by pointing out that the protocol needs to be considered more like a sacrosanct piece of code, which will receive a lot of scrutiny when changed. Moving the protocol to a dedicated repository and formalizing it would help. It was also pointed out that this has repeatedly come up as something we need to do, in different contexts (Cardona with @yenda, PFS with @cammellos - API is actually on the status-go side because of that, etc.)
    • @Chad mentioned that the fact that desktop hasn’t had a proper tested first release so regressions or problems might be in latest desktop nightlies. This will be improved once desktop has regular releases similar to mobile.
  • Beta upgrade policy
    • We agreed that it was time to review the policy given lessons learned since they were first published by @igor, as well as making them more official and covering some aspects such as how to handle a scenario where we put out a broken build.
  • Better communications around new nightlies bugs and incompatibilities
  • Compatibility of messages between releases/nightlies
    • @cammellos mentioned that he thought that we were only striving for compatibility between official releases and that it would be fine to have a broken nightly, however if that’s not the case, then releasing features hidden behind a feature flag that can be enabled locally (maybe through a local .env file that gets merged with the build one) should be an acceptable solution. @cammellos also mentioned that in certain situations it will be more helpful to test with a wider group of people (e.g. all Status contributors), and it was suggested that we could whitelist the CCs so that we would have a way to enable a feature for them only.
  • Message validation
    • @cammellos demonstrated how it is possible to craft payloads which cause clients to crash or otherwise misbehave (e.g. show an error message containing given text). We need to harden the app against non-conforming payloads.
  • Learnings
    • @jan.herich pointed out that the biggest misconception that played a part in this was that most people didn’t realize that when this PR got merged, it would immediately start getting used by people relying on it for daily communications. That is now clear, so any chances of this happening again are very low.
    • As follow up actions, it was agreed to:
      • Start this discuss thread;
      • Create an Architecture decision record (ADR) with context, decision, and consequences;
      • Create GitHub issues for important changes discussed (e.g. doomsday scenario, message validation);
      • Post an updated Beta Upgrade Policy in a visible/permanent place (location TBD). We can probably start with a discuss post to hash it out.

#2

@pedro a big change that we will have soon is decoupling the whisper and wallet keys. as we said previously this is needed for the hardware wallet so that the whisper key is downloaded to the client whereas the wallet key never leaves the card. can we start talking about this maybe in one of the next core-chat meetings please? I think this change will impact many different aspects of the app.


#3

Sure, just add it to the meeting agenda at https://notes.status.im/weekly-chat-sync-agenda#. Tomorrow’s our next meeting.


#4

FYI I can’t see those notes (edit even less so) without being logged into Google. Can we fix that?


#5

Pinged @jakubgs regarding permissions


#6

If you want public notes use https://hackmd.io/, my understanding was that https://notes.status.im/ was intended for Status notes, hence Google OAuth. I see no reason to change that ocnsidering you can do public notes on hackmd.io.


#7

Oh, I wasn’t aware of that, I thought the change was more a question of centralizing our notes under the status.im domain and that we’d stop using https://hackmd.io.