Bootstrap Nodes Smart Contract


#1

While we continue research on the data sync layer, one of the important components we need to think about is the discovery method: how do data sync nodes find their peers? There are multiple methods we could go with, for example Kademlia. However, the actual discovery process is out of scope for this post. What I want to discuss here is how we determine the bootstrap nodes required by most of these peer discovery methods. In order to connect to the network and find peers, a node already needs to know the address of another node it can connect to.

There are multiple processes that exist for knowing these first nodes:

  • Static Bootstrap nodes
  • Node operators sharing addresses with each other through some secondary service

The latter creates a hurdle for anyone trying to join a network, whereas the former is not very democratic. Therefore, we should think of different schemes that make it easy to add new bootstrap nodes and to obtain the addresses of these nodes. Ideally this process would be permissionless, so anyone can submit their node as a bootstrap node (this brings along problems of its own).

My proposal for this would be a smart contract that allows the submission of a node by any participant of the network. To ensure nodes are not malicious, a vote would occur on every submission that requires 2/3 of the current bootstrap nodes to agree on the new node. A stake would also be submitted which could get slashed if the node is found to not be behaving as intended by the rules of the protocol.

New nodes that want to join the network then simply query the smart contract to find a bootstrap node to connect to.
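To make the proposal a bit more concrete, here is a minimal in-memory sketch of the registry logic in Python. It is not contract code, and every name in it (`BootstrapRegistry`, `submit`, `vote`, `slash`, `bootstrap_nodes`) is hypothetical. It also assumes that "2/3" means at least two thirds of the current bootstrap nodes, and that a slashed node is simply removed from the list.

```python
from dataclasses import dataclass, field


@dataclass
class Submission:
    address: str  # network address of the candidate node
    stake: int    # stake locked by the submitter
    approvals: set = field(default_factory=set)


class BootstrapRegistry:
    """In-memory model of the proposed contract (hypothetical API)."""

    def __init__(self, genesis_nodes):
        # node address -> locked stake; genesis nodes start with no stake here
        self.nodes = {addr: 0 for addr in genesis_nodes}
        self.pending = {}

    def submit(self, address, stake):
        if stake <= 0:
            raise ValueError("a positive stake is required")
        if address in self.nodes or address in self.pending:
            raise ValueError("node already known")
        self.pending[address] = Submission(address, stake)

    def vote(self, voter, candidate):
        """Record an approval; return True once the candidate is accepted."""
        if voter not in self.nodes:
            raise PermissionError("only current bootstrap nodes may vote")
        sub = self.pending[candidate]
        sub.approvals.add(voter)
        # Assumption: accept at >= 2/3 of the current bootstrap nodes.
        if 3 * len(sub.approvals) >= 2 * len(self.nodes):
            self.nodes[candidate] = sub.stake
            del self.pending[candidate]
            return True
        return False

    def slash(self, address):
        """Remove a misbehaving node and return its forfeited stake."""
        return self.nodes.pop(address)

    def bootstrap_nodes(self):
        """What a joining node would query."""
        return sorted(self.nodes)
```

With three genesis nodes, a candidate is accepted on the second approval. How misbehaviour is detected before `slash` is called is left open here, as it is in the proposal.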

Of course this idea is very early; comments are welcome to help improve it.


Increasing bootstrap node diversity (KISS edition)
#2

I don’t think it’s practical to rely on a smart contract. Since we can’t assume a static and consistent set of boot nodes, the list would have to be mutable and updated. Applying this sort of logic within a smart contract would only further increase bloating of the state and would also present some inherent degree of vulnerability.

Before looking further into that, I would consider IPFS. You could implement some sort of multisig in which the bootnodes themselves sign off on their inclusion in the list. A node that’s entering the network would refer to this IPFS address, reference the list, and further validate identity when establishing a handshake.
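As a rough sketch of what "bootnodes sign off on the list" could look like on the client side, the Python below counts how many listed nodes have a valid signature over a digest of the list. The signature scheme is deliberately left abstract: `verify` is a caller-supplied function, and `list_digest`, `list_is_endorsed` and the JSON encoding are assumptions made for illustration, not part of IPFS or any existing tool.

```python
import hashlib
import json


def list_digest(bootnodes):
    """SHA-256 digest of a canonical (sorted) encoding of the list."""
    payload = json.dumps(sorted(bootnodes), separators=(",", ":")).encode()
    return hashlib.sha256(payload).digest()


def list_is_endorsed(bootnodes, signatures, verify, threshold=None):
    """Check that enough listed bootnodes signed the list.

    signatures: mapping of bootnode id -> signature over list_digest(bootnodes)
    verify:     callable (node_id, digest, signature) -> bool, supplied by
                the caller for whatever signature scheme is actually used
    threshold:  required number of valid signatures (default: all nodes)
    """
    digest = list_digest(bootnodes)
    members = set(bootnodes)
    if threshold is None:
        threshold = len(members)
    valid = sum(
        1
        for node in members
        if node in signatures and verify(node, digest, signatures[node])
    )
    return valid >= threshold
```

Signatures from nodes that are not on the list are ignored, so an outsider cannot pad the count.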

I know your response is going to be something about sybil resistance, but for this particular function, I would ask why should I care? In regards to the malicious nodes, I would ask the same question. The staking process seems more complicated than it needs to be and I think it’s a misnomer to call it permissionless, since the act of submitting and awaiting entry via vote is logically equivalent to asking for permission to join.


#3

Hi @decanus ,
as part of network incentivisation effort, we are moving to something similar to what you describe:

https://docs.google.com/document/d/10ZVIkYkM8_YwpEbxLX05pIcL8ISuZZvDPV6L_t27xgw/edit

https://hackmd.io/kQHqdZ0xR7WbYwGEG4FFNg

and we are currently in the very early stages of the implementation:

which basically reads nodes (static, bootnodes and mailservers) from a contract. Currently they are approved by us, but eventually the registration will happen through votes (we are working on the contract now).

If you have time it would be very useful if you could give us some feedback on the approach (not everything has been fleshed out yet though)

@zscole thanks for raising those issues. We were thinking about a consensus-based solution, as there is at least one successful live implementation (DASH) which also uses the blockchain for membership. What do you think about their approach?


#4

DASH implements a completely different architecture that relies on a masternode system to accommodate these additional network functions. When I was initially speaking with @decanus about this topic, I suggested something similar, so I think it’s feasible, but I’m not quite sure how practical it would be within the current implementation of the Ethereum protocol.

The easiest thing to do is just offload this logic to smart contracts, but I don’t think that’s practical or scalable since we’re currently trying to address 1.x state-related issues. It could make sense though if we defined a new-fangled ERC token standard which took advantage of some sort of TCR logic to implement this functionality, but I literally just pulled this idea out of my ass as I typed this, so take it with a grain of salt.


#5

IPFS is interesting, but does it work well on a mobile client? Last time I tried the current ref implementation (maybe 3 months ago) it was quite resource hungry.
So, by doing that, isn’t there a danger of becoming dependent on either:

  • a tech that doesn’t exist (light IPFS client)
  • a centralized gateway of some sort (infura?)
  • a problem of discovering non-centralized gateways (which makes it a catch-22, as I understand)

#6

What are your thoughts on allowing anyone to join the smart contract registry without a vote, but adding a reputation system where initial reputation is 0 and over time proof of service is checkpointed to the registry, increasing the reputation of the nodes that have provided service?

Creating a proof for service provided seems to be the thorny issue here, so I guess the first main question to address is if creating such a proof is feasible?
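To illustrate the bookkeeping side of this (not the proof itself), here is a small Python sketch. `ReputationRegistry`, `join`, `checkpoint` and `ranked` are hypothetical names, each valid proof is assumed to be worth one reputation point, and proof validation is left to a caller-supplied `is_valid_proof` function since that is exactly the open question.

```python
class ReputationRegistry:
    """Vote-free registry where reputation is earned by proven service."""

    def __init__(self):
        self.reputation = {}

    def join(self, node):
        # Anyone may join without a vote; reputation starts at 0.
        self.reputation.setdefault(node, 0)

    def checkpoint(self, proofs, is_valid_proof):
        """Apply a batch of (node, proof) pairs.

        Each proof accepted by is_valid_proof(node, proof) raises the
        node's reputation by 1 (an assumed weighting). Proofs for nodes
        that never joined are ignored.
        """
        for node, proof in proofs:
            if node in self.reputation and is_valid_proof(node, proof):
                self.reputation[node] += 1

    def ranked(self):
        # Highest reputation first; ties broken by node id for determinism.
        return sorted(self.reputation, key=lambda n: (-self.reputation[n], n))
```

A joining client could then prefer the top of `ranked()` when picking bootstrap nodes, which keeps newcomers in the list without trusting them as much as long-running nodes.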


#7

I think reputation is a good idea, although probably a bit harder to implement than voting, but it certainly has many benefits.

In terms of proof of service, we have looked at DASH, which is probably the most basic implementation, Loki https://lokidocs.com/Advanced/SwarmFlagging/ and NuCypher https://www.nucypher.com/static/whitepapers/english.pdf , so it looks like, although it might not be bulletproof, it is feasible and we have some references.


#8

It’s also worth noting that voting has its own challenges. For example, if the vote requires 2/3 of nodes, at some point getting a quorum for a vote to reach a decision may be difficult and may call for adding delegate voting in order to prevent deadlock.
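A quick sketch of the arithmetic, in Python. The threshold is taken as the ceiling of 2n/3, and delegation is modelled in the simplest way I could think of: a node that did not vote inherits the direct vote of its delegate, one hop only. `approvals_needed` and `tally` are hypothetical helpers, not an existing API.

```python
def approvals_needed(total_nodes):
    """Smallest number of approvals that is at least 2/3 of total_nodes."""
    return -(-2 * total_nodes // 3)  # integer ceiling of 2n/3


def tally(nodes, direct_votes, delegations):
    """Return True if the vote passes.

    nodes:        list of all current bootstrap nodes
    direct_votes: node -> bool, for nodes that actually voted
    delegations:  node -> delegate, used only when the node did not vote
    """
    approvals = 0
    for node in nodes:
        if node in direct_votes:
            vote = direct_votes[node]
        else:
            # Inherit the delegate's direct vote; absent or silent
            # delegates count as no approval.
            vote = direct_votes.get(delegations.get(node), False)
        approvals += bool(vote)
    return approvals >= approvals_needed(len(nodes))
```

With six nodes, four approvals are needed, so three active voters can never pass anything on their own; a single delegation from an offline node to an approving one is enough to break that deadlock.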


#9

Once we figure out how to manage the bootstrap nodes list, the next immediate question that pops into my head is how to distribute it to the clients.

Currently, we have it hardcoded in the app in a file. However, there are better ways:

  1. A call to a smart contract using Infura or some other HTTP gateway (because obviously one does not have the chain synced at the beginning; this is dangerous as the response can be spoofed),
  2. Use DNS as proposed here http://eips.ethereum.org/EIPS/eip-1459,
  3. Do it offline by getting it from your friend, QR codes etc.

With DNS there is also the problem of how to deploy the bootnodes list to a DNS provider, because someone will need to own the deploy keys.


#10

Handshake may be helpful for solving some of your problems. It is meant to move the DNS root zone to the commons by facilitating on chain auctions of Top Level Domains. After winning an auction, the TLD is represented by a UTXO. The name state is committed to in each block and a Merkle Proof can be used to validate the requested DNS records. Testnet4 just launched and there is a drop in DNS resolver for Node.js here.

It would be possible to use Handshake for keeping a list of bootstrap nodes. Handshake full nodes are not too computationally expensive, so it should be easier for anybody to set up their own nodes to do DNS resolution. There is also a SPV client that can run in the browser.


#11

@tynes I don’t see how handshake is better than a simple smart contract in this case.


#12

I believe in using whatever solution is the best for the particular problem. You don’t have to use Handshake if you don’t think that it’s a good solution. I do think that it’s a good thing to at least consider because:

  • Running a full node is lightweight so you don’t have to rely on something like Infura for uptime
  • Current DNS tooling/infrastructure will play nice with it, like dig and bind
  • The name itself is a liquid asset and can be traded or go up in value if it resolves a useful service
  • The only state rent is a single renewal transaction per year, so a transaction fee pays for data availability
  • Name resolution is encrypted using Noise Protocol
  • Already runs in browsers

It is possible to have a single Handshake TLD that resolves a name server (or many) to which people can add A/AAAA records or even Tor/I2P addresses. This is an offchain solution, which may not be a desirable property in your case. But in return, you would get interoperability with the rest of DNS.

If you wanted to do everything on chain, you can write up to 512 bytes of data to the Urkel tree per name. It is looking like each Handshake name will operate in one of 2 ways - similar to the traditional DNS system where only referral records will be allowed (along with DNSSEC related records) or a DIY mode where you can post whatever sort of unstructured data or any DNS records that you want but you get no compatibility with the current system.

I believe Handshake could be a good alternative to IPFS because Handshake guarantees data availability and provides a Merkle Proof of inclusion. Note that this is based on the network successfully launching and having enough hashrate to keep it secure.
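For readers who have not worked with inclusion proofs, here is a generic Python sketch of the idea. It is a plain binary Merkle tree over SHA-256, not Handshake's Urkel tree, and the helper names (`merkle_root`, `merkle_proof`, `verify_inclusion`) are made up for this example. The point is only that a client holding a trusted root can check a record against a short proof without holding the whole data set.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level):
    # Pair up nodes; an odd node is paired with a copy of itself.
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

A production tree would also separate leaf and inner-node hashing and handle odd levels more carefully; this sketch skips both for brevity.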


#13

Just wait a hot minute here. I was thinking about this on the toilet a moment ago.

A bootstrap list is required during the bootstrapping process in which a node first entering the network identifies its peers. We can’t assume that this newly participating node has any prior knowledge concerning state and even if it does, this state could likely be invalid if the node has been offline for a period of time. Since it has to query the state after establishing connectivity with its peers and then begin (re)syncing with the current state, within which this smart contract would exist, wouldn’t storing a bootstrap list in such a way be pretty useless?


#14

@zscole Good point.

Under a Proof of Work model, there would have to be some subjectivity in what the longest chain is, because you can never really know unless you validate it yourself. A service like Infura could resolve the requests, and you would have to trust that they 1) validated all previous state transitions, 2) are on the heaviest chain and 3) are on the most recent block. This is not the most ideal solution imo.

Under a Proof of Work model with a finalization gadget like Casper FFG, the state could be queried from the most recent checkpoint. This is better, but means that there is a delay in reading the most up-to-date data from the contract. I don’t think that is much of a problem for this use case: sacrifice a bit of liveness for safety.

Under a Proof of Stake model, validators could provide additional services besides producing blocks. I’m not an expert on Proof of Stake but if they provide additional services that increase their margins, it makes sense that they could do it. This becomes messy because it becomes possible to drag higher level abstractions into consensus, which could have unintended consequences.


#15

Reading from the chain reliably is an existing problem even with services like Infura (assuming trust), due to the limited querying capabilities of the JSON RPC. Even with something like EthQL, the queries are likely not performant because of the way the data is structured on the node (generically, and not indexed for specific dapp use cases).

This could possibly be addressed with an L2 solution like https://thegraph.com/ by creating a subgraph for the boot nodes smart contract, with the client checking the response against multiple query nodes. I don’t know if there is any incentive for running a query node; perhaps The Graph intentionally left this out, leaving it up to the subgraph to decide the best incentive model for the application, which does seem to make more sense than a general incentive model for all types of subgraphs.
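The "check the response against multiple query nodes" part could look roughly like the Python below. It only covers the comparison step: fetching the responses is out of scope, and `agreed_bootnodes` and its `min_agreement` parameter are hypothetical, not part of The Graph or any client library.

```python
from collections import Counter


def agreed_bootnodes(responses, min_agreement):
    """Pick the bootnode list that enough query nodes agree on.

    responses:     list of bootnode lists, one per query node asked
    min_agreement: how many identical answers are required
    Returns the agreed list (sorted), or None if no answer is reported
    by at least min_agreement query nodes.
    """
    # Order within a response is ignored when comparing answers.
    counts = Counter(tuple(sorted(r)) for r in responses)
    if not counts:
        return None
    answer, votes = counts.most_common(1)[0]
    return list(answer) if votes >= min_agreement else None
```

This only helps if the query nodes are run by independent parties; several endpoints backed by the same operator would agree with each other whether or not the answer is honest.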


#16

@barry

I’ll poke my friend who works for The Graph to see if he’s willing to respond to your questions.

My main question is how does The Graph handle safety vs liveness regarding reorgs? It would be really cool if there was a knob that could be tweaked in either direction, because I think different applications may have different needs. I’d prefer safety over liveness.