A case for breaking backward compatibility in V1


#7

Agree with @decanus, @Bruno and @jakubgs.

My view on this is that while I believe that we need to do much better in terms of planning for upgradability (and performance/load considerations), we also know that the runway is much shorter than it was just 18 months ago. We need to be as efficient as possible to get a great product out to users ASAP. Therefore, I believe we need to study where we locked ourselves into a corner so we can avoid repeating those mistakes in the future, but also avoid wasting resources maintaining compatibility with a product version that probably no one will be using and that will drag down the user experience of everyone else.


#8

:wave:

Personally I’d be happy to break backward compatibility. As I think about onboarding my family and friends to Status (also non-technical) what matters most to me/them is that they have a fast, reliable and safe app. If breaking backward compatibility would help those things even a little, I think it’s worthwhile.

Whilst other heavy users may indeed be well aware of what they signed up for when they joined the beta - we should probably also acknowledge that this may suck for them even if they’re ok with it. Since they’re our biggest fans right now, we should at the very least acknowledge any inconvenience we might cause them.


#9

And that’s a very fair point. Which is why we should give clear instructions for recovering their old wallet address and provide the community with the reasons why we decided (if we do) to put new features and polish above backwards compatibility before we reach V1.


#10

I concur with the general consensus. Backward compatibility should be a priority after V1 launch. I would also like to propose implementing some kind of voting mechanism built into the status app in the future to get community insight as to the direction of such decisions.

As for the current issue, there needs to be a clear public statement to inform users to pull funds and/or back up seeds. I am also interested to know if there will ever be an advanced option to either recover the private key directly from the wallet, or at least allow me to view the backup phrase again, as I have multiple accounts in Status for which I have actually lost the backup phrase.


#11

This is a great thread, thanks @yenda for getting this discussion going and identifying the pros and cons (although a tiny bit biased :p).

So then if I import my legacy wallet, what chat will I be using? Will it be a separate wallet and chat account forever? Isn’t that more confusing for a user in the long run? If this is the case, we may be better off offering an upgrade path for users to be able to “transfer” their wallet funds from their old wallet to the new account.

Beyond this thread saying “we should have better backward compatibility practices”, there don’t seem to be any suggestions or recommendations on how we can improve processes to do that. Are we sure we are not kicking the can down the road for further migration issues later? Are we confident enough to say “this is the LAST forced account migration cut that we’ll do”?

+1 good idea. We have the Voting DApp, but that isn’t actively maintained afaik.


#12

We don’t have to break compatibility, but kicking the can down the road is exactly what we would be doing if we didn’t do it for v1. We have been keeping backward compatibility for a year now to improve this process.

In this past year of experience we solved backward compatibility issues with 3 different approaches:

  • spend development time
  • accumulate technical debt
  • never move forward

Here are some concrete examples:

  • for the database it is simple, we accumulate schemas and write migrations, we consume development time and accumulate technical debt.
  • we made mistakes when designing the current communication protocol, but the one thing we really took care of was to embed upgradability. We would have been able to improve it over time. Whenever a protocol upgrade was required, we would have released x versions of the client that support both the old and the new version of the protocol, allowing a smooth transition while clients were upgrading, until we could finally make a release that drops the old version. I would even argue that it was a form of backward compatibility already, albeit a sane one. Sadly, as we were about to use this feature and mentioned that we would drop compatibility after a few versions, the decision was made to stay backward compatible at all costs. The communication protocol uses a very Clojure-specific encoding to improve performance and bandwidth. Messaging has since moved to status-go, and we could drop this format and this upgradability to make the protocol easier to implement for third parties. We never moved forward.
  • the communication protocol got some upgrades over the past year, and to stay backward compatible we always kept the old fields and only added new ones, sometimes duplicated for performance over bandwidth. The option here is development time and technical debt.
  • we never enabled PFS in 1-1 chats, because it would break compatibility with versions prior to last December. We never moved forward.
  • the discovery channel that is used by every client to find each other is not partitioned. Old clients are not going to magically learn about the existence of partitioned discovery topics, so unless you start ignoring them at some point, you need to follow this single topic forever. This means that by posting to this single whisper topic, you can spam all Status clients at once, and you need very few resources to achieve that. So to reduce bandwidth we spent development time and came up with a topic negotiation solution that will drain the traffic away from the single discovery topic over time. But for hostile situations, we never moved forward.
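
To make the last point concrete, here is a minimal sketch of how topic negotiation can drain traffic away from a shared discovery topic. It is an illustration only, not the actual Status implementation: the topic names, `PARTITION_COUNT`, and the hashing scheme are all assumptions made up for this example.

```python
import hashlib

# Hypothetical names and values; not taken from the real protocol.
SHARED_DISCOVERY_TOPIC = "discovery"
PARTITION_COUNT = 5000  # assumed number of partitioned topics


def partitioned_topic(public_key: str) -> str:
    """Map a recipient's public key to one of PARTITION_COUNT topics."""
    digest = hashlib.sha256(public_key.encode("utf-8")).digest()
    partition = int.from_bytes(digest[:4], "big") % PARTITION_COUNT
    return f"contact-discovery-{partition}"


def topic_for(recipient_key: str, recipient_supports_partitioning: bool) -> str:
    """Pick the topic to send on.

    Upgraded peers are reached on their partitioned topic; legacy peers
    only know the shared topic, so senders must fall back to it.
    """
    if recipient_supports_partitioning:
        return partitioned_topic(recipient_key)
    return SHARED_DISCOVERY_TOPIC


def topics_to_listen_on(own_key: str, support_legacy: bool) -> set:
    """Topics a client subscribes to.

    Only when legacy support is dropped can the shared topic be ignored.
    """
    topics = {partitioned_topic(own_key)}
    if support_legacy:
        topics.add(SHARED_DISCOVERY_TOPIC)
    return topics
```

As long as `support_legacy` stays true, every client keeps listening on the shared topic, which is why negotiation reduces ordinary traffic over time but does not help against someone deliberately spamming that topic.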

#13

So then if I import my legacy wallet, what chat will I be using? Will it be a separate wallet and chat account forever? Isn’t that more confusing for a user in the long run? If this is the case, we may be better off offering an upgrade path for users to be able to “transfer” their wallet funds from their old wallet to the new account.

You’ll have a new chat account, and it will always be separate from the wallet. I don’t believe it’s confusing to users, because people likely aren’t consciously aware of the connection between the wallet and chat IDs that we have today. Not sure I understand the preference to manually transfer funds; I believe it’s easier and more direct to import your old wallet using the add account flows we’re introducing.

When it comes to future migrations & assurance of compatibility:

  • We can be 100% committed to not breaking compatibility again, acknowledging that this is a special circumstance—beta to v1—that we won’t benefit from in the future.
  • Being 100% prepared for backwards compatibility in the future is a different story. It would be great if we had a CTO-like owner who could help by offering foresight.

To the technical team members on this thread @yenda @jakubgs @pedro @Bruno @decanus : is it possible to have a catch-all procedure in place for backwards compatibility in the future? To what extent are we able to predict and prepare for future breaking changes?


#14

I agree with @yenda, @jakubgs, @pedro, and @Bruno.

I don’t exactly disagree with @decanus and the others trying to set up a quid-pro-quo between breaking backward compatibility now and making absolutely sure that it won’t happen in the future, but I don’t view one as a necessary or useful condition for the other.

We have longer to set up a culture of retaining assurances of backwards compatibility post-1.0, but it’s important to decide what to do pre-1.0 much sooner. I don’t see how one is supposed to set up the temporally backwards causality from the former to the latter.

So, break compatibility to gain all the advantages that @yenda describes, without trying to set up some kind of conditioning on some potential future guarantee (even though, yes, I happen to think that it’s important to keep backwards compatibility post-1.0).


#15

I would take a moment to stop and give these points in particular a second look - and what they indicate. Let’s for a moment assume that post-1.0 is going to look similar to development we have now - we come up with some pretty ok ideas, then come up with some better ideas around the same topic. So far so good, but now we’ve promised everyone to be backwards compatible. What happens next?

Let us indeed for a moment drop the mushy culture bit for a while - think about the architecture and the code, very concretely. If the only way to solve technical debt and inability to move forward is to remove support for some particular version, what does this tell you of its design? Of not having the capability to keep several protocols or versions of things under one hood and expressing their capabilities between the layers so that the final layer - the one that the user sees - can make the best possible choices given the material it has to work with?

Imagine this for a web browser - html 1.0, 2.0, 3.0, 4.0, 5.0… and all the minor versions in between, and all the alternatives like XHTML etc. Same thing for http itself - 0.9, 1.0, 1.1 etc - somehow, most browsers manage to pull this off, and I’m pretty sure we all agree rendering a web page is more involved than showing a chat bubble. Put differently - when you’re building a web browser, there are two facts of life: standards will keep changing and users won’t care. Compatibility is king.

So sure, by all means, remove the old stuff, make a cut somewhere for the sake of sanity, and so on - whisper really is terrible the way we use it and it would be nuts to keep it around forever. But also take a moment to remember that the world will look exactly the same post-1.0, so make sure to address this in the architecture not by just removing the old versions, but by making it painless to keep them around. Technical debt is not debt if it’s a feature, and moving forward is not a problem if the underlying technology makes room for it. You will know you have it right when the decision to remove an old version can be made at leisure from a beach in Thailand, instead of as a plea of desperation.

This is culture.


#16

We deal with it when it happens.
Future imaginary problems are no argument to accumulate technical debt of the past if we can get rid of it at this particular point in time, precisely because we are still in Beta.

Wait wait wait wait, where did you get that from? Where is the “only way” coming from in that sentence? Just because we want to take advantage of the last time we are in Beta before V1 to clear up some technical debt doesn’t mean that it’s the only way to deal with those kinds of problems. That’s a false equivalence, and a silly one at that considering the conversation so far.

So never? Because as far as I know it’s never that easy when you have actual users.

I’m starting to get really tired of this discussion, can we please get some kind of decision instead of endlessly talking about this? @oskarth @jarradhope @naghdy this is getting silly, I’d like a yes or no.


#17

Alright, go ahead and break backwards compatibility, things move fast in crypto and Status can’t afford to spend the time to integrate Legacy Status because we need this up and running quickly. It’s going to be an inconvenience to replace all my contacts and stuff but I’ll do it and I feel like the hard core Status users will too.


#18

I agree with your opinion.


#19

It seems like the overwhelming consensus in this thread is in favor of breaking compatibility. There are a lot of thoughts here and I’m not going to address them all at this time. I’d like to do a few things: give some context on previous decisions made, reframe the discussion to be more high resolution, and finally suggest next steps. This post will be quite long, I apologize for this in advance. It seemed more useful to get it out sooner rather than later.

Context

We’ve had many discussions on compatibility, the value of keeping it and so on. People buy into this value to various degrees. I won’t elaborate on this point here. More specifically, “controversial compatibility” has mainly come up in two contexts recently.

First example - topic negotiation

First is topic negotiation, where the consensus within Core, and later Core Dev calls, was to keep it, and Andrea devised a solution here:


Let’s have a look at some of the tradeoffs for this solution:

  • took some time to develop (-)
  • provide a graceful upgrade path for chat (+)
  • leveled up our general sophistication level for how to do upgrades the right way (+)
  • address the bandwidth problem long-term by planning for the future (+)
  • doesn’t immediately solve the bandwidth issue (-)
  • gives us the option to later deprecate legacy features if we so wish (e.g. unacceptable BW for release => flip switch) (+)

The initial discussions were initiated in protocol<>core calls with me, Dean, Igor and Andrea MP. It was the Core consensus to do it this way, and it was later agreed up on in Core Dev calls.

Second example - multiaccount

The second was with respect to multiaccounts. Now this one is a bit of a clusterfuck, because there’s no real technical owner for this work. This means responsibility, technical ownership, the “decision chain” and general understanding are…lacking, to put it mildly. That’s a different topic though.

There was a call with me, Igor, Andrea F, Rachel, Hester and GL. The notes are here: https://notes.status.im/XTpDfTZRRVi7-54BlXtSug# To be honest, there was a bit of confusion in the call in terms of what “compatibility” even means for multiaccount, and it often got conflated with migration. It’s my view that this confusion remains today, and it is reflected in a lot of people talking past each other.

The general consensus at the end of the call was that there was a fairly simple way to provide the option for pre-V1 chat accounts, and that this was worth pursuing. The main challenges identified were some UX tweaks. After that, several stakeholders were afk for a week or two and this decision was informally accepted but not properly communicated. The gist of it is to give people the option to use a pre-V1 account (‘legacy’) to chat, while still allowing them to use Keycard and multiaccount to secure their funds. The second option would be to essentially burn your old identity and create a new multiaccount. This was my suggestion, as it is what I, as a user, have a strong preference for. The metaphor I used was that you shouldn’t need to change your wallet and driver’s license just because you got a new secure bank account. This has the following tradeoffs, roughly speaking:

  • requires minor tweaks to copy and UX screens (-)
  • provides graceful path to use same contacts and chats as before (+)
  • while still being able to reap the main benefit of Keycard, which is secure funds (+)
  • clarifies benefits of upgrading from legacy to secure account but give user choice (+)
  • enables users to use insecure/legacy account (-)
  • some increased complexity in terms of devs having to think about different key structure (-)
  • reduced risk in terms of being able to use existing structure if things break/take too long (+)
  • gives the option to later deprecate legacy account in one go if we so wish (+)

Re-framing the conversation

Right now the scope of the conversation is quite big. It includes many aspects of the protocol and the app architecture. Additionally, it’s framed rather coarsely in terms of breaking compatibility, aka changing things so they no longer work together, for N reasons. No doubt these N reasons are good and useful, and I even agree with most of them!

I would suggest a more useful way of looking at this problem is in terms of:

  • adding specific capabilities
  • deprecating specific features

This relates to previous discussions we have had regarding feature flags (https://trunkbaseddevelopment.com/feature-flags/) and proper abstractions (https://trunkbaseddevelopment.com/branch-by-abstraction/). Deprecating a feature - like no longer supporting non-PFS in the default chat client profile - then becomes a decision that you can make half-drunk on a beach in Thailand, like Jacek alluded to.
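
As a rough illustration of what deprecating a feature behind a flag could look like, here is a minimal sketch. `ClientProfile`, `accept_non_pfs_messages` and `should_process` are hypothetical names invented for this example, not existing code in the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientProfile:
    """Hypothetical set of feature flags for a client build."""
    accept_non_pfs_messages: bool = True  # legacy capability, on by default


def should_process(message_uses_pfs: bool, profile: ClientProfile) -> bool:
    """Decide whether an incoming 1-1 message is handled or dropped."""
    if message_uses_pfs:
        return True
    # Non-PFS messages are only handled while the legacy flag is on.
    return profile.accept_non_pfs_messages


# Deprecation becomes a one-line profile change rather than a code rewrite.
LEGACY_FRIENDLY = ClientProfile()
DEFAULT_AFTER_DEPRECATION = ClientProfile(accept_non_pfs_messages=False)
```

The point of the sketch is that the old code path stays in place behind the flag, so the decision to turn it off is separate from the work of adding the new capability.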

The fact that we can’t reason about these things at this level implies that we, generally, don’t have this culture of thinking carefully about architectural decisions. Another issue is that we are conflating things: V1 scope and app concerns bleed into what should be purer protocol change discussions. I won’t elaborate on the latter point here, but we are making progress in whipping the specs repo (https://github.com/status-im/specs) into shape. Thanks Adam and Corey for your recent contributions there to make this into a reality.

The current proposal

I think it is great that this conversation got started and it is clear that it resonates with a lot of people. However, somewhat flippantly, right now the proposal as it stands roughly amounts to:

  • change and break everything at the same time
  • (because we can and it is “easy”)

“V1” is mentioned 15 times in OP. So it’s hard to disentangle “protocol compatibility” from all the other changes we want to make to get a solid V1 out there as soon as possible. Which is what we all want! So the conversation is inevitably tied to v1 scope and other “changes” we want to make.

There’s a lot of moving parts here. Off the top of my head: We want to move the protocol to status-go, we want to swap out the underlying database (realm), we want to move message encoding (transit), we want to introduce MVDS, we want to change the fundamental key structure (multiaccount) and everything it could possibly touch in the app, and so on. That’s a lot of things! Especially at the phase where we are now, which is at the (supposed) end sprint, or even past it in terms of the goals we set for ourselves, of V1.

What’s our goal here? If our goal is to cut scope and release a usable V1, is more change necessarily the answer? I’d like this discussion to be more high resolution.

Any time you change things, you are putting a lot at risk. As a rule of thumb, at the end of releasing something you want to reduce risk and cut scope. We are currently on our way to do the opposite, and it could easily end in a disaster.

Next steps

The voices in this thread are loud and clear, people want to break (some forms of) compatibility for V1, for various reasons. To me, that generally means deprecating old/bad APIs, assuming the new introduced capabilities actually work as intended. Deprecating bad stuff feels good! And it allows us to get rid of bad designs and start from “scratch”, and build on solid foundations. However, this is also relative. V1 is just the beginning, and we are going to continue to make big changes (like removing Whisper), so it’s important we get the mindset right. Don’t change, add capabilities and deprecate (when you have to).

The main voice I hear in this thread is that people want to cut scope and reduce uncertainty to get V1 shipped. I 100% agree with this, and this is a larger conversation, as scope covers more than just compatibility.

So let’s make sure we cut scope in a rational way, with the proper framing.

See Status V1 scope call on Thursday (prepare by tomorrow) thread for next steps on this.


Status V1 scope call today noon CEST
#20

Yes, there is not much more to this post than the proposal you describe, and the purpose of the OP was to check if there was a clear consensus on breaking “backward compatibility”. So far most of the focus in multiaccount meetings was on migrating legacy accounts to multiaccount and the consequences of having both. We can now focus on finishing multiaccount properly, which is already a big task before v1.

As you rightfully point out we now need a higher resolution discussion to go further. I would consider this discussion closed and move on to https://discuss.status.im/t/status-v1-scope-call-on-thursday-prepare-by-tomorrow/

  • doesn’t immediately solve the bandwidth issue (-)

It never solves the bandwidth issue if you don’t break backward compatibility at some point.

FTR I started to take notes on removing realm here:


#21

Browsers have legacy support but HTML5 is not backward compatible, you can’t open an HTML5 page in IE5.

The best you can do is keep using the old protocol for a while so that most clients are compatible when you do the switch. That is what we had, messages were versioned, and we were able to support older versions while sending new ones.

When we started using it, we announced that it would make new messages unreadable to older versions after n releases, and the debate on backward compatibility started. It was the change in the way message-ids are calculated. The proper solution IMO would have been to make the fix in the v1 message-type like we did, AND prepare a v2 free of legacy, to which we would have switched after n releases. But the PR at the time only did the latter, while the debate ended up on the decision to do the former. That first fix became the model.
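
For readers who did not follow that PR, here is a minimal sketch of the "support both, switch after n releases" approach. Everything in it is an assumption for illustration: the JSON envelope, the two id schemes and `SWITCH_TO_V2_AT_RELEASE` are made up and do not reflect how message-ids are really calculated.

```python
import hashlib
import json

# Hypothetical constants and formats; nothing here mirrors the real protocol.
SUPPORTED_VERSIONS = {1, 2}
SWITCH_TO_V2_AT_RELEASE = 5  # assumed stand-in for "after n releases"


def message_id(version: int, sender: str, payload: str) -> str:
    """v1 keeps a legacy id scheme; v2 (made up here) also hashes the sender."""
    if version == 1:
        data = payload
    elif version == 2:
        data = sender + ":" + payload
    else:
        raise ValueError(f"unsupported version {version}")
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def encode(sender: str, payload: str, client_release: int) -> str:
    """Keep sending v1 until enough clients can read v2, then switch."""
    version = 2 if client_release >= SWITCH_TO_V2_AT_RELEASE else 1
    return json.dumps({"version": version, "sender": sender, "payload": payload})


def decode(raw: str) -> dict:
    """Read any supported version, regardless of what this client sends."""
    message = json.loads(raw)
    version = message.get("version", 1)  # unversioned messages are treated as v1
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported version {version}")
    message["version"] = version
    message["id"] = message_id(version, message["sender"], message["payload"])
    return message
```

Dropping v1 later would then amount to removing it from `SUPPORTED_VERSIONS` once enough releases have shipped with v2 support.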

Having the communication protocol in the js thread with limited capabilities also led to poor decisions like using transit on the wire, because it is an order of magnitude faster to parse for clojurescript compared to json. But having the protocol on status-go, we could have used protobuf and json and eventually passed transit to status-react. Back then we were somehow much more reluctant to move anything beyond go-ethereum patches to status-go.


#22

" API backwards compatibility

When it comes to software APIs, there’s a school of thought that says that you should never break backwards compatibility for some classes of widely used software. A well-known example is Linus Torvalds:
People should basically always feel like they can update their kernel and simply not have to worry about it. I refuse to introduce “you can only update the kernel if you also update that other program” kind of limitations. If the kernel used to work for you, the rule is that it continues to work for you. … I have seen, and can point to, lots of projects that go “We need to break that use case in order to make progress” or “you relied on undocumented behavior, it sucks to be you” or “there’s a better way to do what you want to do, and you have to change to that new better way”, and I simply don’t think that’s acceptable outside of very early alpha releases that have experimental users that know what they signed up for. The kernel hasn’t been in that situation for the last two decades. … We do API breakage inside the kernel all the time. We will fix internal problems by saying “you now need to do XYZ”, but then it’s about internal kernel API’s, and the people who do that then also obviously have to fix up all the in-kernel users of that API. Nobody can say “I now broke the API you used, and now you need to fix it up”. Whoever broke something gets to fix it too. … And we simply do not break user space.
Raymond Chen quoting Colen:
Look at the scenario from the customer’s standpoint. You bought programs X, Y and Z. You then upgraded to Windows XP. Your computer now crashes randomly, and program Z doesn’t work at all. You’re going to tell your friends, “Don’t upgrade to Windows XP. It crashes randomly, and it’s not compatible with program Z.” Are you going to debug your system to determine that program X is causing the crashes, and that program Z doesn’t work because it is using undocumented window messages? Of course not. You’re going to return the Windows XP box for a refund. (You bought programs X, Y, and Z some months ago. The 30-day return policy no longer applies to them. The only thing you can return is Windows XP.)
While this school of thought is a minority, it’s a vocal minority with a lot of influence. It’s much rarer to hear this kind of case made for UI backwards compatibility. You might argue that this is fine – people are forced to upgrade nowadays, so it doesn’t matter if stuff breaks. But even if users can’t escape, it’s still a bad user experience.
The counterargument to this school of thought is that maintaining compatibility creates technical debt. It’s true! "


#23

But we aren’t discussing mindset or culture here, we are discussing using the opportunity of switching from Beta (which is unstable, unfinished, experimental software) to V1 (which is stable and should be reliable moving forward). As far as I can tell you wrote a wall of text that amounts to “we should have endless conversations because it’s easier than making a decision”.

No, this is a great conversation to finish and get it over with. It’s easy to talk about work endlessly, especially when you have a big team of engineers, but eventually we have to stop talking and do actual work. Or we can keep throwing walls of text at each other and never get anywhere. The bigger the walls of text the less likely people will be to respond, and whoever generates the most noise wins by stopping the conversation, despite being in the minority with their opinion.

Your “Re-framing of the conversation” hasn’t re-framed anything, more like muddled it.
Eric’s proposal is very simple:

  • Drop the DB, which allows us to drop Realm.js and a shitload of old DB migrations
  • Avoid having to design, develop, and test a migration process for accounts
  • Give people new accounts, which gives us a clean break for introducing multi-account
  • Allows us to change the protocol to use JSON/Protobuf for easier usage by 3rd parties

This is the only time we can do this. If we don’t, we will make future changes even harder to make backwards compatible and make development slower.

I really hope the call on Thursday results in some kind of decision.


#24

i like your style, talk is cheap, just do it. :+1:


#25

:clap:

This. Blow it up, if at least just this one time when there’s a forgiving chance to do so.


#26

So I think that, at the protocol level, enforcing backwards compatibility is quite easy. And if not full backwards compatibility, then allowing multiple versions to run at the same time should be simple, as long as clients expose which version they are on.
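
A minimal sketch of what that could look like, assuming each client advertises the list of protocol versions it supports (the function name and the integer version numbers are hypothetical):

```python
from typing import Iterable, Optional


def negotiate_version(ours: Iterable[int], theirs: Iterable[int]) -> Optional[int]:
    """Return the highest protocol version both peers support, or None."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

For example, a client supporting versions 1 to 3 talking to one supporting 2 to 4 would settle on 3, and two clients with no version in common get `None`, which the UI can surface as "ask your contact to upgrade".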