> Despite popular belief, P2P networks suffer from centralization issues in P2P ...

tzpardi · on June 19, 2016

The number of existing bootstrapping nodes does not solve the fundamental problem of the current centralized bootstrapping, not even if a virtual box costs $1 instead of $10. The issue is how a connecting node learns about other nodes in the network, and the current way of propagating this information, i.e. the DNS names or IPs of the seed nodes, introduces centralization.

The closest and best solution to this problem was Bitcoin's early idea of using IRC to publish the listening seed information, but IRC is still not decentralized, so IMHO decentralized bootstrapping is a real problem which we need to solve.

I kicked off a discussion about this at https://groups.google.com/forum/#!topic/streembit-dev/Xl2DpG...

the8472 · on June 19, 2016

It is decentralized in the sense that it is not centralized, i.e. there is no central entity in control managing entry into the network.

The key point is that any node in a p2p network can be used to join the network, so any number of independent entities can publish lists of ways to join the network.
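A rough sketch of that idea, assuming each publisher's list has already been fetched as a plain list of `host:port` strings. The publisher names, addresses and the `merge_entry_points` helper are made up for illustration and do not come from any particular client:

```python
import random

def merge_entry_points(published_lists, sample_size, rng=None):
    """Merge peer lists from independent publishers and pick a random sample.

    published_lists maps a (hypothetical) publisher name to a list of
    "host:port" strings, so no single publisher decides which peers are tried.
    """
    if rng is None:
        rng = random.Random()
    # Deduplicate across publishers; sort so the merged list is reproducible.
    merged = sorted({peer for peers in published_lists.values() for peer in peers})
    if len(merged) <= sample_size:
        return merged
    return rng.sample(merged, sample_size)

# Example data using documentation-only address ranges.
lists = {
    "project-site": ["192.0.2.10:6881", "192.0.2.11:6881"],
    "community-wiki": ["192.0.2.11:6881", "198.51.100.7:6881"],
    "friend": ["203.0.113.5:6881"],
}
candidates = merge_entry_points(lists, sample_size=3, rng=random.Random(1))
everything = merge_entry_points(lists, sample_size=10)
```

Any one of the returned candidates is enough to join, so losing a single publisher only shrinks the pool.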

Certainly, I would also love some multicast/anycast support on the IP level, but we don't have that, so making sure that you have many independent entry points into the network is the next best thing.

Maybe you're thinking distributed, not decentralized? https://i.imgur.com/YsiiKeq.png

zzzcpan · on June 19, 2016

I don't think bootstrapping per se is a problem. You only have to do it on first install, and this allows for a simple solution that distributes bootstrapping over time, i.e. changing the hardcoded bootstrapping data every N minutes, so new users bootstrap from different nodes than the users before them, and so on.
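As a minimal sketch of what "changing the data every N minutes" could look like on the distribution side: the function below picks which nodes to embed in a download made at a given time. The function name, the 10-minute window and the hash-based ranking are all assumptions for illustration, not something an existing project is known to do.

```python
import hashlib

def bootstrap_set_for_window(known_nodes, timestamp, window_seconds=600, count=3):
    """Pick the bootstrap nodes to embed in a download made at `timestamp`.

    Nodes are ranked by a hash of (window index, node), so every window of
    `window_seconds` gets its own reproducible ordering of the known nodes.
    """
    window = int(timestamp // window_seconds)

    def rank(node):
        return hashlib.sha256(f"{window}:{node}".encode()).hexdigest()

    return sorted(known_nodes, key=rank)[:count]

# Hypothetical node list; downloads in the same window get the same subset.
nodes = [f"192.0.2.{i}:8905" for i in range(1, 11)]
early = bootstrap_set_for_window(nodes, timestamp=0)
same_window = bootstrap_set_for_window(nodes, timestamp=599)
```

Whoever runs this still has to learn `known_nodes` from a running network, which is where the rest of the thread's disagreement lies.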

A bigger problem is how software is developed. The development itself is centralized, but even worse, traditional software development approaches cannot predict all future problems, but rely on fixing them to keep the software working, which means it must be delivered to users in a centralized way too.

tzpardi · on June 19, 2016

I am sorry, but centralized network bootstrapping is a very well-known and unsolved problem of decentralized networks.

http://ryandoyle.net/assets/papers/Distributed_Bootstrapping...

http://grothoff.org/christian/dasp2p.pdf

http://www.net.in.tum.de/fileadmin/TUM/NET/NET-2014-08-1/NET...

You say: "i.e. changing hardcoded bootstrapping data every N minutes, so new users could bootstrap from different nodes"

The fundamental issue is: changing the data where? A node that wants to connect to the peer network obviously cannot obtain the information from the peer network itself (as the node is not connected yet), so where does it obtain the information from? Currently all decentralized applications, including my system Streembit, obtain the information from a centralized source, which is the opposite of decentralization. If a web service or another centralized application provides the new node with the list of existing listening/connected nodes, then the solution is surely not decentralized. Government agencies or cyber-criminals only need to attack the hard-coded, listening seed nodes, and then the network is done and cannot come back until a new list of hard-coded seed nodes is published via the application source code or via other channels.

As I said above, Satoshi's original idea to obtain the seed info from IRC was the closest to decentralization, but since IRC is itself centralized, I am sure you can see how far that is from decentralized bootstrapping.

We have solutions for local decentralized networks, such as mDNS and UDP multicasting, which solve this at the protocol level, and at Streembit we are investigating solving the problem with IPv6 anycasting.
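For readers unfamiliar with the local-network case, here is a minimal sketch of UDP multicast discovery using only the Python standard library. It is not Streembit's implementation: the group address, port and JSON message format are assumptions chosen for illustration.

```python
import json
import socket

GROUP = "239.255.42.99"  # hypothetical administratively scoped multicast group
PORT = 45454             # hypothetical discovery port

def encode_announce(node_id, listen_port):
    """Build the datagram a node sends to say 'I am here'."""
    return json.dumps({"type": "announce", "id": node_id, "port": listen_port}).encode()

def decode_announce(data):
    """Return (node_id, port) for a valid announce datagram, else None."""
    try:
        msg = json.loads(data.decode())
    except ValueError:  # covers both bad UTF-8 and bad JSON
        return None
    if not isinstance(msg, dict) or msg.get("type") != "announce":
        return None
    if not isinstance(msg.get("id"), str) or not isinstance(msg.get("port"), int):
        return None
    return msg["id"], msg["port"]

def announce(node_id, listen_port):
    """Send one announce to the group; TTL 1 keeps it on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(encode_announce(node_id, listen_port), (GROUP, PORT))
    finally:
        sock.close()

def listen_once(timeout=5.0):
    """Join the group and wait for one announce; returns (ip, node_id, port) or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        # struct ip_mreq: group address followed by the local interface (any).
        mreq = socket.inet_aton(GROUP) + socket.inet_aton("0.0.0.0")
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        sock.settimeout(timeout)
        try:
            data, (addr, _) = sock.recvfrom(1024)
        except socket.timeout:
            return None
    finally:
        sock.close()
    decoded = decode_announce(data)
    return None if decoded is None else (addr, decoded[0], decoded[1])
```

No seed list is involved here: a new node calls `listen_once()` while existing nodes periodically call `announce()`. The catch is that multicast of this kind generally does not cross the public internet, which is why the global case needs something else.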

<<< A bigger problem is how software is developed. The development itself is centralized, but even worse >>>

I disagree. Open source software can be forked, and then you can adopt whatever democratic development methods and governance you want.

zzzcpan · on June 19, 2016

> The fundamental issue is, changing the data where? The node who wants to connect to the peer network, obviously cannot obtain the information from the peer network itself (as node is not connected), so obtain the information from where?

Ok, you are assuming that the node already has the binary somehow. But that's not the case. In the real world we have to ship binaries to users. And this is where you can put your different bootstrapping data for different users.

You can go a lot further: let users generate a binary distribution of the software to share with each other, with bootstrapping data hardcoded into it that the user obtained from a running network.

> Open source software can be forked

The majority of users are not going to do that; they just don't have the skills. At best there will be a few popular distributions of the same "decentralized" software, with the majority of installations controlled by a few entities. At worst - just one centralized entity that controls every installation.

tzpardi · on June 19, 2016

<<< Ok, you are assuming that the node already has the binary somehow. >>>

Well, I am not assuming anything. I am talking about a fundamental problem of decentralized networks: the current bootstrapping of networks is centralized. Please refer to the quoted papers and there are many other research papers as well which describe this existing problem.

<<< But that's not the case. In the real world we have to ship binaries to users. And this is where you can put your different bootstrapping data for different users. >>>

You are misunderstanding the problem. The point is

a) we should never ever hard-code the seed information into the application and consequently ship it with the binary. If we rely on the current approach, then you always connect to the seeds of the ETH foundation, the Bitcoin foundation, or, in the case of Streembit, my company's seeds, which is the opposite of decentralization.

b) we should have a protocol-level solution for entity discovery instead of an application-level solution such as embedding the seed info in the source and then compiling it into the app. When I say protocol-level solution, I refer to mDNS and UDP multicasting, which work just fine on local networks, and I am proposing IPv6 anycast for entity discovery on global networks.

The simple truth is that Bitcoin, Ethereum and all cryptocurrencies conveniently ignore this issue. The companies, lead developers, foundations or whoever runs the show maintain the seed nodes, but such a solution is surely not decentralized.

zzzcpan · on June 19, 2016

> we should have a protocol level solution

No! This is a problem that solves itself once you solve the binary distribution problem. But none of the papers you refer to address the problem of centralized binary distribution; they jump right to the protocol level for some reason. It doesn't work like that.

tzpardi · on June 20, 2016

I don't want to be condescending and I apologize if I sound like that, but I think you totally misunderstand what the problem is.

What difference does it make if you disseminate a fundamental design problem (i.e. the bootstrapping of the network is centralized) with a different type of binary distribution? Whichever method you use for binary distribution, the distribution will deliver the very same existing problem. Again, please refer to the quoted papers to understand why centralized network bootstrapping is an issue.

On the note of binary distribution, yes that is an issue as well and it would be nice to have a decentralized binary distribution, but again, that is an entirely different problem. BTW, I think our application Streembit can be used for decentralized binary distribution as well.

zzzcpan · on June 20, 2016

I think it's the other way around. You don't want to see that bootstrapping depends on the binary being present on the node somehow, and for some reason you think that solving bootstrapping makes sense even if it depends on another completely unsolved problem. It doesn't. You solve the first problem and only after that move on to the one that depends on it being solved.

tzpardi · on June 20, 2016

<<< you think that solving bootstrapping makes sense even if it depends on another completely unsolved problem >>>

No. The problem of centralized bootstrapping, which is an existing and real issue of decentralized networks (see the quoted papers and my explanation above), does not depend on the problem of "binary distribution". In fact it has nothing to do with "binary distribution". If a user builds the software from source - so the user avoids any "binary distribution" - then the user still has the problem of centralized network bootstrapping.

It seems you don't understand that the problem of centralized bootstrapping is a generic information technology problem for all users, regardless of the method of "binary distribution", if any. That's fine, it was a good discussion, but I am exiting this debate with you, which is becoming meaningless now :-) Thanks for sharing your view :-)

zzzcpan · on June 20, 2016

> If a user builds the software from source

Again, sources do not emerge out of thin air and have to be shipped to users somehow. No matter what you do, the user must get the software first in order to bootstrap. And since bootstrapping always requires information from the sources, it cannot be considered a separate problem until you solve the problem of shipping that information to users in a decentralized way.

All of the so called "decentralized" software is fundamentally broken at the moment.

tzpardi · on June 20, 2016

It seems you are trying to decentralize the whole world, which I think a) is unrealistic and b) doesn't make sense. We, on the other hand, are trying to solve specific use cases with decentralized applications. For instance, Bitcoin implements a decentralized payment network to address a specific use case, and our application Streembit implements a decentralized communication framework for humans and machines.

I am perfectly comfortable with disseminating the source via the centralized Github, though it would be nice to have a decentralized source control system. Users trust me and get the source from my repository. Users who don't trust me can fork the software and modify it, or get it from the repo of a person they trust. The source distribution is centralized, but so be it. Once the users have the source/binary via Github, the application addresses several use cases in a decentralized manner. I don't see how the centralized source repository invalidates the runtime soundness and robustness of a decentralized payment or communication P2P app. At the same time, the runtime soundness and robustness of decentralized payment or communication P2P applications are seriously affected by the centralized bootstrapping.

<<< All of the so called "decentralized" software is fundamentally broken at the moment. >>>

Quite true, but it is broken because of the centralized bootstrapping, and not because the source code is managed by a centralized repository application. I can drive my car to your place and give the source code to you in person, which would be the ultimate decentralized source code distribution, but the issue I was talking about (centralized bootstrapping) will still exist once you run the software. I am talking about design and runtime problems of decentralized networks, and you have been talking about something different. Anyhow, thanks for the chat and all the best! :-)

zzzcpan · on June 20, 2016

I cannot acknowledge the decentralized bootstrapping problem, I'm sorry. It relies on the information from a centralized authority (i.e. source code) and therefore cannot be solved until the first one is solved.

pdimitar · on June 20, 2016

Could you please explain why it doesn't work like that?

zzzcpan · on June 20, 2016

It's very common in distributed systems to push problems between different levels and ignore them, instead of addressing them.

weq · on June 20, 2016

Is falling back to IP scanning a valid alternative to centralised bootstrapping? Let's say you have connected to 20 peers during a session; you could use that list as a starting point for your scanning, because if there is one IP, there might be more IPs close by that also provide the service.

tzpardi · on June 20, 2016

What you suggest with regard to the 20 previous IPs is called seed caching, and yes, that is what many, probably most, P2P networks do. While it works most of the time on large networks, it is obviously not the solution, because it cannot be guaranteed that any of the previously active 20, 50 or 100 nodes is still listening at a later time. Also, such caching and scanning does not solve the fundamental problem of knowing the seed IPs at the very first connection, before the cache has been populated for the first time. The question is how we solve this problem without the hard-coded list of IPs that P2P software usually ships in the source code.
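To make the two limitations concrete, here is a minimal sketch of seed caching with a fallback to hard-coded seeds. The file format, the helper names and the `is_alive` probe are hypothetical; a real client would replace `is_alive` with an actual connection attempt.

```python
import json
import os
import tempfile

def load_cached_peers(path):
    """Return the cached peer list, or [] if the cache is missing or unreadable."""
    try:
        with open(path) as f:
            peers = json.load(f)
    except (OSError, ValueError):
        return []
    if not isinstance(peers, list):
        return []
    return [p for p in peers if isinstance(p, str)]

def save_cached_peers(path, peers, limit=100):
    """Remember up to `limit` peers seen during this session."""
    with open(path, "w") as f:
        json.dump(list(peers)[:limit], f)

def find_entry_point(cache_path, hardcoded_seeds, is_alive):
    """Try cached peers first, then fall back to the hard-coded seeds."""
    for peer in load_cached_peers(cache_path) + list(hardcoded_seeds):
        if is_alive(peer):
            return peer
    return None

# Usage with a fake liveness probe: only one cached peer still answers.
cache_file = os.path.join(tempfile.mkdtemp(), "peers.json")
save_cached_peers(cache_file, ["192.0.2.10:8905", "198.51.100.7:8905"])
entry = find_entry_point(cache_file, ["seed.example.org:8905"],
                         is_alive=lambda peer: peer == "198.51.100.7:8905")
# First run: no cache exists yet, so only the hard-coded seed can be used.
first_run = find_entry_point(cache_file + ".missing", ["seed.example.org:8905"],
                             is_alive=lambda peer: True)
```

The `first_run` case is the problem in miniature: with an empty cache, the only candidates are the ones shipped with the software.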

Other techniques and several research papers suggest that scanning a wide range of IPs, guessing which could be a connected node, is an option as well. In theory it certainly works: if you scan the whole internet then sooner or later you will find a connected node, and in theory this does solve the problem of centralized bootstrapping, but all these papers also agree that such scanning could be terribly inefficient.
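A back-of-the-envelope estimate shows why. Assuming probes are drawn uniformly at random from the address space, each probe is an independent trial with success probability nodes/addresses, so the expected number of probes until the first hit is addresses/nodes. The network size of 10,000 listening nodes below is an assumed figure for illustration only.

```python
def expected_probes(address_space, listening_nodes):
    """Expected number of uniformly random probes until the first hit.

    Each probe succeeds with probability listening_nodes / address_space,
    so the count follows a geometric distribution with mean 1 / p.
    """
    if listening_nodes <= 0:
        raise ValueError("need at least one listening node")
    return address_space / listening_nodes

ipv4_probes = expected_probes(2 ** 32, 10_000)    # roughly 430 thousand probes
ipv6_probes = expected_probes(2 ** 128, 10_000)   # astronomically large
```

For IPv4 that is a few hundred thousand probes on average for a single new node to bootstrap, and every new node pays it again. For the IPv6 address space, blind scanning is not realistic at all.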

As I mentioned, mDNS and UDP multicasting work fine on local networks, and we are experimenting with IPv6 anycast on global networks to solve this problem.