But the only scientific evidence he has is that some bacteria DNA is found.
Whoa, where did you pull that out of? Have you not been paying attention to the Human Genome Project? We have a ton of genes sequenced for a ton of species. This is not all based on a single bacterium!
I see no proof in this article. All I see is a report about some scientist claiming to have found the truth.
That's right. The proof is not in the article; it would take far too long to explain it all. Read the research. See how common genes have travelled the various paths among organisms, and the stepwise refinement introduced in species.
If you want to learn the truth you have to look for it! Scientific understanding does not come through prayer (like faith).
but you can't prove that God doesn't exist to those who have experienced Him.
Which god are you referring to? There are thousands upon thousands of religions with all sorts of gods.
As long as you don't mean to imply that only the god of your religion is true, then I don't have any beef with your self-induced relationship with this pseudo-being.
I get really annoyed at people in any religion saying theirs is the only one that's true! Surprise! Everyone thinks their religion is true. But that doesn't make any of them right.
Is how I find the article. Whether or not you believe in evolution, it seems quite unacceptable to me for a scientist/journalist to make bold and provocative claims about how the now-completely-mapped human genetic code proves beyond a shadow of a doubt that his religion is the right one after all (I say this because the article seems to make clear to me that evolution is his religion), and then not explain it with one shred of evidence!
Gee, I'm glad your religion does not ever make such profound assumptions!
I get offended when religious fanatics attempt to remove evolution from the school curriculum entirely. I get a bit miffed when they call all scientific research, evidence, and effort into learning our origins complete and utter fiction with no justification.
Perhaps you need to take a look in the mirror and see the kind of prejudices you hold against any type of scientific evidence that does not agree with your assumptions.
Pretty good summary, but you are missing some up-to-date information.
Everything started with a Big Bang
Actually, many people now think that everything started with a big expansion. There is no singularity. See supersymmetry and string/M theory.
Spontaneous formation
The Santa Fe Institute is doing some wonderful research on complex systems and organization. They have a number of theories and experiments which show how this can happen in a straightforward manner. Autocatalytic sets are the main interest here.
Imagine an explosion. Stuff goes in all directions, approx. the same speed. Because in case of Big Bang, there's nothing to hit, the matter would fly in all directions forever, without hitting anything and without stopping, without forming anything. Two particles cannot collide if they have the same speed.
A nice thought, but you're missing the effects of quantum mechanics, relativity, and forces in general. We do these types of tests all the time in particle accelerators and they show how all sorts of odd things happen when subatomic particles interact. Once you have atoms, gravity and chemistry / fusion kick into play.
Generation of more complex chemical compounds. Not proven. Generation of amino-acids from the elements in primordial soup is chemically impossible.
Nope, they have made amino acids in recreated environments similar to the primordial soup. However, the more recent, more widely accepted theory is that these complex chemical seeds came from space. They have found complex organic molecules, including amino acids, in space. They do and can form in a wide variety of places.
Abiogenesis
True, but there is very attractive research on this as well. See the autocatalytic sets above. Also, there are a number of simple amino-acid-type chemicals that, when combined, form a structure amazingly similar to a cell membrane. But I digress...
Macroevolution(development of more complex organisms from simpler ones): sorry, not proven. First, there are no 'intermediate' forms found. Every single one is complete and functional. Darwinism states that evolution is a gradual process, taking millions of years. Hence, almost 100% of fossils should be intermediate forms, with clear links. The links are missing.
There is a growing body of evidence that evolution is *not* a linear process, but has distinct jumps from organism to organism, feature to feature. Similar to the way a quantum of energy is not a continuum, but a finite jump.
This has to do with autocatalytic sets, so see those again. Also, the process would be similar to various adaptations occurring in stealth mode in an organism, until a critical mass is reached, and the trait spontaneously manifests.
The details are many and very interesting, so I would encourage you to check them out.
From what I can tell, if I root a few boxes on nice fast T3 lines then I can send out a query to all hosts on the alpine network searching for all
songs with a "3" in the filename. That should return every mp3 on the planet :)
Yes, you could, but only if they allowed the connection.
Second, they would not send an entire list of MP3s. They would send back a single packet that contains the number of hits found, like 1,234.
To get the list of MP3s you need to do some more work.
Now since the query is udp based I can spoof the return address to www.ebay.com.
No, the return is sent to the originating connection. That is you. The handshake protocol for establishing a connection makes this as resistant to spoofing as TCP. (Which isn't perfect, but at least it's better than nothing)
Also you can't throttle your bandwidth on alpine. Certainly you can send out fewer searches but if you have 100,000 users online then there will be about 5000 searches a minute, and every 33.6k user will have to download all those queries.
Again, a modem user will have fewer connections. And the response to broadcast queries is only a single packet with the number of hits found, if anything is sent at all.
The combination of how many searches you perform and how many connections you have active controls how much bandwidth you use.
But a good dial-up connection runs at only 50 kilobit/s. Are you saying that users who want to use ALPINE should pack up and move to an
area where DSL is available?
They would most likely try to locate a proxy peer. They may have to gain favor by providing quality content or resources, or simply ask nicely to get a proxy connection, but they are not left out in the cold.
Also, the network is still usable for modem users. It may take a few minutes to locate something, assuming you have to search everyone in your peer group, but even that may be acceptable to most users.
The adaptive configuration of searching should reduce this as much as possible, so that you may only query a few hundred peers before you locate what you're looking for.
And they're on high-speed T3 or OCx connections to the Internet, connections that are designed to handle such a load.
Yes, that was a poor response on my part...
What if your query isn't an exact match to one file? For instance, I'm looking for "songs by The Offspring, in .ogg or .mp3 format, at bitrate >= 160 kilobit/s,"
You can query as long as you want; however, if you receive 100 replies you could automatically halt querying to see if they would suffice. If you want more, you can continue. This is not an atomic operation.
Also, the adaptive features would increase the likelihood that you would find those hundred hits faster with each successive query.
It's up to you how long, how far, and how fast you want to query.
From every single user who's searching. Say a user searches a 20,000 user network once every 10 minutes (this takes into account inactive users). You'll have to handle (on the average) 2,000 queries a minute, over 60 a second. That's not even counting peak use. Can your hardware and network connection keep up?
On a DSL line you could handle 40-60 queries per second, although you would likely have a smaller network than this, or at least throttle or exclude the really noisy peers.
A DSL line can handle over a thousand a second. This shouldn't be a problem, and again, this is all configurable and adaptable. You get to make the rules.
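As a rough check on the numbers being thrown around here, the load can be worked out directly. This is only a sketch: the 60-byte query size is the figure quoted elsewhere in this thread, the function names are made up for illustration, and it assumes every search from every user reaches you, which is the worst case. Worked this way, 20,000 users each searching once every 10 minutes comes out closer to 33 queries per second than 60.

```python
def queries_per_second(users: int, minutes_between_searches: float) -> float:
    """Average incoming query rate if every user's search reaches this peer."""
    per_minute = users / minutes_between_searches
    return per_minute / 60.0


def query_bandwidth_kbit(rate_per_second: float, packet_bytes: int = 60) -> float:
    """Bandwidth used by incoming queries, in kilobit/s.

    packet_bytes=60 is an assumption taken from the thread; real packets
    may carry extra header overhead.
    """
    return rate_per_second * packet_bytes * 8 / 1000.0


if __name__ == "__main__":
    rate = queries_per_second(20_000, 10)
    print(f"{rate:.1f} queries/s")                        # 33.3 queries/s
    print(f"{query_bandwidth_kbit(rate):.1f} kbit/s")     # 16.0 kbit/s
```

Under those assumptions the incoming queries alone fit within a modem link, though they would use a large share of it, which is where throttling and smaller peer groups come in.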
But whenever I think of the obvious solution to this problem (proxies that cache search requests for a group of users), I realize that such a topology would be equivalent to that of the existing OpenNap network
True, and there are specific instances where a multitude of other systems would be more efficient and responsive than the alpine network.
This is not intended to be the be-all-end-all of peer searching. It is simply a usable, completely decentralized searching network, and in some instances, this may be a very nice solution.
I can think of quite a few better ways to get certain things or locate certain things,
Google, AltaVista, etc,
OpenNap, Napster,
etc. etc... but none of these are completely decentralized searching networks. That is where ALPINE is intended to function.
You're missing the point. There's a difference between effective bandwidth and total bandwidth (which includes protocol overhead). Maximizing effective bandwidth is good; maximizing total bandwidth is antisocial, and ultimately reduces the effective bandwidth left over for getting real work done. With your protocol, all of the capacity will be sucked into a black hole doing queries, and actual downloads will slow to a crawl.
No, queries only use half the bandwidth on a pipe at most, and can be configured to use less.
And I don't see what your implied difference is between effective bandwidth and total bandwidth.
I can log activity on my internet interface for napster, for freenet, for gnutella, and they all max out my available bandwidth.
I guess I don't understand what part of this you're implying is different? Bandwidth in the overall internet with regards to these services? Bandwidth through my ISP?
But if a modem user can only have 200 people to be connected to, then only those 200 people can be connected to him. That means even if I had an OC48 running and "could" handle everyone's connection, I'm still off limits to the modem people that have already "picked" their 200 people
That's correct. I haven't gone into connection cycling, but let's say one of those 200 is a low-rated peer as far as quality is concerned (again, this ties back to the quality metrics)
This peer would probably decide to bump him off, and give you a chance. If you turned out to be a quality peer, you would migrate towards the top of his query list, and would be less likely to be bumped off in return.
If you are a rogue/leecher peer then you may end up in a situation where no one wants to allow you a connection, and your T1 goes to waste.
I think you are taking generalizations too far. Each maintained connection uses a measurable amount of bandwidth, say C.
This amount of bandwidth is not a constant. Each connection shares total bandwidth b, and then the adaptive nature of the alpine protocol as well as filtering and throttling ensure the fair use of this limited bandwidth. You can use as much as you want; this is a configurable setting in the DTCP stack.
If you are searching for something extremely rare (or nonexistent) and your bandwidth is small with respect to the scope of the network you may be required to cycle your connections many times until you achieve hits. As intended, the network allows you to search at the maximum speed allowed by your bandwidth--but gives you the option of doing a long (but exhaustive) search regardless of whether you have a 14.4 or a gigabit connection.
Yes, and this is a drawback, but no different than napster for example. If you can't find your song in the 3,000 to 10,000 peers on the server you're at, you can keep searching, or try a different server.
Oh ok. But that means I start all over again with the "adaptive" process each time I 'log on'.
You don't have to. Part of DTCP provides persistent connections. You can resume a connection when you log on, even if your IP and port changed in the interim. So, you only need to start the adaptive process whenever you create a new connection.
Thus...10,000 searches (at a time) going through the 1 client's bandwidth. (replace 10,000 with whatever number we're working with here).
Yes, you are correct. And that is where slow users would have a smaller peering group that they are connected to, as well as throttling peers who query too aggressively. They can even outright ban peers who are abusing bandwidth.
They may also use a proxy, which would handle the replies.
And last, you control how many peers you query and when. If you find what you're looking for after querying 100 peers, then there is no need to query the rest.
Likewise, if you start getting a large number of responses, you can slow or halt the broadcast of additional queries.
My God, you are so fucking stupid. Bandwidth is never entirely "your own" - unless we're talking about an isolated home LAN - it is shared with many other users of your ISP (because the ISP has only a limited number of outgoing pipes) and from there on with the other users of the intervening networks. If 1000 people on one ISP start clogging up that ISP's pipes with this crap, and that ISP has a clue, they will kick those people off.
Perhaps you misinterpreted what I meant with that statement.
Regardless of any peering application or network you use, you will be using bandwidth. If this application is maxed out, you're using *all* of your allocated bandwidth, i.e. your pipe. This happens all the time.
ISPs continually operate at near peak usage. They don't leave lots of empty bandwidth lying around because someone *might* use it.
Also, 1000 people on alpine would be no different than 1000 people on napster, 1000 people on freenet, etc. Show me a peering application that does not maximize use of your bandwidth in a large network.
And finally, you can tune the amount of bandwidth you use. If you want to use half of your DSL line, and leave the other half free for surfing, etc. you can. UDP gives you complete control over when and how large a packet is sent. TCP cannot do this; you can only send a buffer, and it may go out as one packet, maybe five, it may be delayed a fraction of a second, etc.
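To make the tuning point concrete, here is a minimal pacing sketch. It is not the actual DTCP code; the function names, the 60-byte packet size (borrowed from the figure used elsewhere in this thread), and the 64 kbit/s budget (half of a hypothetical 128 kbit/s uplink) are all assumptions for illustration. The idea is simply that with one datagram per query, a bandwidth budget translates directly into a gap between sends.

```python
from typing import List


def send_interval_seconds(budget_kbit_per_s: float, packet_bytes: int = 60) -> float:
    """Gap between packets that keeps query traffic within the budget."""
    bits_per_packet = packet_bytes * 8
    packets_per_second = budget_kbit_per_s * 1000 / bits_per_packet
    return 1.0 / packets_per_second


def send_times(n_packets: int, budget_kbit_per_s: float, packet_bytes: int = 60) -> List[float]:
    """Offsets (in seconds from the start) at which each packet would be sent."""
    gap = send_interval_seconds(budget_kbit_per_s, packet_bytes)
    return [i * gap for i in range(n_packets)]


if __name__ == "__main__":
    # Hypothetical budget: 64 kbit/s reserved for queries.
    print(f"{send_interval_seconds(64):.4f} s between packets")
    print(send_times(3, 64))
```

A real sender would sleep for that gap between `sendto` calls and could shrink or grow the budget at any time, which is the control being described above.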
You seem to be contradicting yourself. If a modem user can limit (or has to limit) the number of connections in his/her group, then how is it possible for a T1 user to have everyone in their group? Both cannot happen.
It would be very unlikely, but all that would need to occur is that one of the 10,000 connections that every peer has would be to the T1 server. The rest of the connections may be to random peers, but the T1 user would still be connected to everyone, while everyone else maintains only 10,000 connections.
Not only am I keeping track of every IP-node user out there, but I have to keep track of it over time. In a napster-success scenario, I'd have 2 million entries to keep track of. Not only that, but it seems like a lot of wasted overhead?
No, you only keep track of this information for the peers you are currently connected to. This may be 3,000 to 10,000 for a napster-sized group (not all one million napster users are on the same server!) or more if you have a beefy machine that can handle it.
It is entirely up to each user how many connections and how much bandwidth they wish to use.
The ISP will have to handle 10,000 user requests of ME. And you can't reiterate the B.S. about throttling search requests
I don't see your point. Each of the 10,000 MEs would have their own ISP, and would use their own bandwidth.
Ever watch your modem/DSL lights when you're on napster? This is no different, and the throttling does work, unlike TCP streaming where the bandwidth is always wide open (unless you explicitly throttle sending in your application).
If people can query everybody in the network, it isn't going to scale.
They cannot query everybody on the network. They can only query everyone they are connected to. So, modem users would obviously have a smaller connection pool compared to a DSL user.
If a peer they are connected to is causing too much load, they have them slow down, or they drop them entirely.
Someone on a T1 connection may indeed be able to connect to just about everyone, but they would also have the bandwidth and memory to do so.
Not to mention the problem of having the searcher send out an individual query to every client it wants to search. If I understand this correctly, if I want to search 3,000 hosts I have to send out 3,000 otherwise identical packets. This is not what is known as a scalable protocol. In fact, from a network point of view, it's a worst-case scenario.
Worst case scenario is a forwarded broadcast. And at any rate, 3,000 queries to find what you're looking for is indeed a worst case search.
Part of the alpine protocol is the adaptive configuration of the query list so that quality peers are queried first, thus greatly increasing the chances that you don't need to query more than a few hundred to find what you're looking for.
You won't get the answer until you've already sent queries to the next batch. Net result: not only are you consuming all this bandwidth and
creating all this congestion, then you turn around and drop those packets on the floor. That's just adding insult to injury, as far as your upstream is concerned.
No, there is no batch. The query process is iterative, and can be halted or slowed at any point in time. While there may be a dozen to a few hundred packets in transit before you start receiving replies, you can slow or stop the process once you see that you have enough replies, or that you have found what you're looking for, or just want to cancel.
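A simplified sketch of that loop, for illustration only. It is synchronous (one reply is seen before the next query goes out), so it ignores the packets that would be in transit on a real network, and `iterative_search`, `ask`, and `wanted_replies` are hypothetical names rather than anything from the ALPINE code. The reply is modeled as a hit count, as described earlier in the thread.

```python
from typing import Callable, Iterable, List, Tuple


def iterative_search(
    peers: Iterable[str],
    ask: Callable[[str], int],
    wanted_replies: int,
) -> Tuple[List[Tuple[str, int]], int]:
    """Query peers one at a time and stop as soon as enough have replied.

    ask(peer) returns the number of hits that peer reports (0 for none).
    Returns the positive replies and how many peers were actually queried.
    """
    replies: List[Tuple[str, int]] = []
    queried = 0
    for peer in peers:
        if len(replies) >= wanted_replies:
            break  # enough answers: the remaining peers are never contacted
        queried += 1
        hits = ask(peer)
        if hits > 0:
            replies.append((peer, hits))
    return replies, queried


if __name__ == "__main__":
    # Toy network: only peers whose name ends in an even digit have the file.
    peer_list = [f"peer{i}" for i in range(10)]
    found, used = iterative_search(
        peer_list, lambda p: 1 if int(p[-1]) % 2 == 0 else 0, wanted_replies=2
    )
    print(found, used)  # [('peer0', 1), ('peer2', 1)] 3
```

If the peer list is ordered by quality, the peers most likely to answer come first and the loop ends sooner.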
Please describe how this adaptation occurs. The details are not on your website, it's a complex problem, and I think you're just handwaving
about something you don't understand.
Sure, there are various criteria that indicate a bad or good peer. These include, among other things:
- Did the peer respond to your query?
- Did the peer misrepresent the response?
- Is the file or resource valid?
- Is the peer sending you too many queries?
Etc. These properties, and others, control where an individual peer sits in the list of peers to search. A high quality peer, who often responds and has quality files, will be queried long before a peer that never responds.
For negative behavior there are even ban lists and so forth to prevent them from bothering you further.
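One way the criteria above could be turned into an ordering is sketched below. The scoring formula, the weights, the `noisy_threshold`, and all of the names are assumptions made up for this example; the real protocol may weigh these things quite differently.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PeerStats:
    name: str
    queries_sent: int = 0       # queries we sent to this peer
    replies: int = 0            # how many of those it answered
    bad_replies: int = 0        # misrepresented responses or invalid files
    queries_received: int = 0   # queries this peer sent to us
    banned: bool = False


def quality(peer: PeerStats, noisy_threshold: int = 1000) -> float:
    """Hypothetical score: reward replies, penalize bad replies and noise."""
    if peer.queries_sent == 0:
        return 0.0  # nothing known about this peer yet
    score = (peer.replies - 2 * peer.bad_replies) / peer.queries_sent
    if peer.queries_received > noisy_threshold:
        score -= 0.5  # peer is sending us too many queries
    return score


def query_order(peers: Iterable[PeerStats]) -> List[PeerStats]:
    """Banned peers are dropped; the rest are queried best-first."""
    allowed = [p for p in peers if not p.banned]
    return sorted(allowed, key=quality, reverse=True)


if __name__ == "__main__":
    peers = [
        PeerStats("silent", queries_sent=10, replies=0),
        PeerStats("good", queries_sent=10, replies=9),
        PeerStats("spammer", queries_sent=10, replies=9, banned=True),
    ]
    print([p.name for p in query_order(peers)])  # ['good', 'silent']
```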
But the intervening routers are receiving them - and the replies - in huge clumps. That's just like a DDoS.
Only your initial upstream router is receiving them, and from there the packets fan out to their respective destinations. And any ISP that cannot handle the bandwidth generated by a customer has much bigger problems.
I think in order for this system to reach widespread use (especially in the Windoze community), these two functions need to be combined into one interface.
You are correct, and they are combined. Right now a simple TCP transfer a la FTP/HTTP will be used, with additional transfer types provided using pluggable modules.
Secondly, doesn't this facilitate finding an end user's location? After finding the information, now I get to manually enter the IP address into FTP to connect and download. Does this not make it easier for a program to simply track down 'file X', log IP addresses to file and then resolve these IPs and hunt down the users?
Only if the reference you provide for the content is on your machine. You may simply provide a freenet key and the user can then obtain the file anonymously using freenet. You may provide an FTP location on some offshore server that is outside the bounds of US jurisdiction. It could be anywhere. The majority may be on your machine, but this isn't a requirement.
Instead of ever presenting the final address,
perhaps it could transfer this data amongst the network in an encrypted fashion
The final address is only used during a reply. Where you actually get the data is another issue. So, the paranoid may always upload their music into freenet, but locate it using Alpine.
This would be the best of both worlds for fast searching and anonymous downloading.
Sending the same data to 10K hosts in separate packets not only doesn't scale, but it's an extremely antisocial abuse of the network
Funny, I thought web servers acted this way...
Even at 60 bytes per packet, if you're
trying to send to 10000 nodes that's 600K. Then the replies start coming in - in clumps - further clogging that pipe.
If you find the reply you're looking for, then there is no need to query the remaining peers. Also, you will not clog the incoming pipe. I've covered this quite a bit: you control how many queries you send out and when, and also to which peers they are sent. The adaptive nature of the protocol ensures that successive queries will be more likely to find what they are looking for sooner.
You would only query 10,000 in a worst case scenario.
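For reference, the worst-case numbers can be worked out as follows. This is only a back-of-the-envelope sketch: it takes the 60-byte packet size from the quote above and the 33.6 kbit/s modem rate mentioned earlier in the thread, ignores header overhead and replies, and the function name is made up.

```python
from typing import Tuple


def broadcast_cost(
    peers: int, packet_bytes: int = 60, link_kbit_per_s: float = 33.6
) -> Tuple[int, float]:
    """Total bytes and seconds needed to send one query packet to every peer."""
    total_bytes = peers * packet_bytes
    seconds = total_bytes * 8 / (link_kbit_per_s * 1000)
    return total_bytes, seconds


if __name__ == "__main__":
    total, secs = broadcast_cost(10_000)
    print(total, f"{secs:.1f}")  # 600000 142.9
```

So an exhaustive query of 10,000 peers is about 600K and a bit over two minutes on a modem, and it only gets that far if nothing useful comes back sooner. The same function gives roughly 238 minutes for 1,000,000 peers, in line with the "250 minutes" objection raised elsewhere in this thread.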
The traffic patterns ALPINE will generate are like nothing so much as a DDoS attack, with the query originator and their ISP as the victims.
No, each of these 'victims' would only receive a single 60 byte packet. This is the opposite of a DoS attack, as you are sending a large number of packets, but each peer is only receiving one of them.
Omnifarious, are a little naive, but well-known technology in mesh routing and distributed broadcast can easily enough be applied
to create and maintain self-organizing adaptive distributed broadcast trees (phew, that was a mouthful) for this purpose. Read the literature.
I understand what you're getting at, but you're missing the main purpose of this network. If you need to search a large number of peers for dynamic content in real time, you need to reach all of them to do it. Whether you do this using a tree/routing/forward approach, or a single peer using multiple unicast packets, you have to reach them to do it.
The design of this network is so that the resources you use are your own and that you can tailor the bandwidth, peers, and effectiveness of the search to your own preferences.
This is a highly specific network architecture with a very specific purpose using very small packets. This is why alpine can bend the common conceptions about scalability and performance and still remain efficient and scalable.
The point you have to remember is that you control exactly how much bandwidth you use for queries and how many peers you query. Also, the alpine protocol adapts to the responses you receive so that you tend towards a more efficient search.
Similar peers that have similar content and quality service will gravitate towards the top of each other's query lists. Thus, these higher quality peers will be queried before the others (if the others are queried at all).
The net result is that each successful query you make enhances the probability and speed with which the next query will be answered.
For example, napster has grown to millions of users, but whenever you execute a napster query, you are only searching among a group of 3,000-10,000! And these are randomly selected.
Alpine will allow you to search 3,000 to 100,000+ *selective* peers, which you have tuned for optimal results.
you still will have to search every node every time you want to find something
This is not the case. You only have to search until you *find* what you're looking for. This is a big difference, and part of the ALPINE protocol is adapting to the responses and peers you're communicating with to ensure that you search fewer peers each time you're looking for something.
This is covered in the documents, and is a major benefit. The network adapts to your preferences and optimizes accordingly.
Yes, you are correct. And you will always send a packet first. If you are behind a NAT firewall this will be a NAT discovery packet.
A reply is then returned which has your masqueraded IP and port which the NAT router is using. From this point on, this masqueraded address is what you use to identify yourself.
Some systems may need to turn on loose UDP masquerade or the equivalent to allow reply packets from sources other than the initial destination to which you sent the discovery packet.
There are additional details, but the end result is NAT users are supported.
First, 10,000 peers isn't exactly world domination. If your network is successful and swells to 1,000,000 peers (still less than 1% of the Internet), suddenly you're tying up your modem for 250 minutes per query.
No, you connect to as many as you want. You can stop at 10,000, half a million, etc. Each peer is in direct control of how much bandwidth they use, how many peers to connect to, and how many queries they perform.
Second, presumably other people are making queries, too. If there are even 20 queries per second, your modem link will be saturated even if you're not making any queries of your own.
That's where ALPINE comes into play. It allows the ordering of peers based on quality and value of responses. If you start getting busy, you simply quit replying to queries, and your perceived value to those peers drops; you then get queried less.
The details are more complex, but you should never encounter saturation unless you specifically configure your client to do so, and even then it's unlikely.
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive. Remember, there's no centralized server dishing out lists of addresses
You build them up gradually, and continually refine your list over time, so that you eventually have a list of similar peers with quality content and service. You don't get one million peers all at once. There is a discovery protocol in place, where you can ask for a number of peers from one you are already connected to. No need to do it all at once.
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive
You can store the connection information for 10,000 peers in 2 megabytes of RAM. The DTCP protocol is specifically designed to be very compact with almost no overhead per connection.
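To put that figure in per-connection terms, here is a quick calculation. It is plain arithmetic on the numbers stated above, assuming "2 megabytes" means 2 x 1024 x 1024 bytes; the function names are just for illustration and nothing here describes the actual DTCP record layout.

```python
def bytes_per_peer(total_megabytes: float, peers: int) -> float:
    """Average memory available per connection record."""
    return total_megabytes * 1024 * 1024 / peers


def megabytes_for(peers: int, per_peer_bytes: float) -> float:
    """Memory needed for a given number of peers at a fixed record size."""
    return peers * per_peer_bytes / (1024 * 1024)


if __name__ == "__main__":
    per_peer = bytes_per_peer(2, 10_000)
    print(f"{per_peer:.1f} bytes per peer")                      # 209.7 bytes per peer
    print(f"{megabytes_for(1_000_000, per_peer):.0f} MB")        # 200 MB
```

At roughly 210 bytes per connection, even the one-million-peer case from the objection would need about 200 MB, which is large but only relevant for a peer that chooses to hold that many connections.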
Fourth, the amount of churn in a group of 10,000 peers is quite high -- nodes are arriving, leaving, and crashing all the time. Even if you could find out about all 10,000 peers, your link would be saturated keeping up with changes in group membership
There are protocols for resuming connections if your IP address and port change. Also, these peers would have a perceived low quality in comparison to more stable nodes, and thus would move down your list. You may not even need to maintain a connection to them at all.
The design of the server is also similar to a daemon process. Your GUI or client would interface with the server through a CORBA interface. You can shut down the client and the server is still running. You can reduce bandwidth usage if you wish, or shut down the server. However, it is designed for a more persistent presence than most peering services.
And last, your network inherits all of the d-o-s, spam, and privacy problems inherent in any broadcast-search network. Gnutella has demonstrated these problems (if not solutions to them) handily. Learn from the idiocy of others.
Actually, this should be less of a problem than you would suspect. DoS is still only as bad as TCP. There is a connection protocol similar to TCP with handshakes, etc.
Spam is even less of a problem, as you can ban peers which spam or attack you. Peers can share this information in a growing pool so that spammers and rogue clients are effectively ostracized from the network. Each peer can decide who and when they communicate with. It puts the power back in your hands.
By the way, these were some very insightful questions. Thanks for the reply.
But the only scientific evidence he has is that some bacteria DNA is found.
Whoa, where did you pull that out of? Have you not been paying attention to the Human Genome Project? We have a ton of genes sequenced for a ton of species. This is not all based on a single bacterium!
Get educated man!
I see no proof in this article. All I see is a report about some scientist claiming to have found the truth.
That's right. The proof is not in the article; it would take far too long to explain it all. Read the research. See how common genes have travelled the various paths among organisms, and the stepwise refinement introduced in species.
If you want to learn the truth you have to look for it! Scientific understanding does not come through prayer (like faith).
I BELIEVE that God created everything.
Do you believe this because you want to, or because of cold hard evidence presented to you?
Evidence for evolution: 122,345,566 pieces of evidence.
Evidence for creationism, aka GOD: 1 billion people attesting their faith.
Hmmm... which one seems more logical? A large cult of fanatics? Or maybe reproducible, logical scientific fact... Hard choice!
God is not a programmer, he is a comedian.
Enjoy the humor that is the human condition.
but you can't prove that God doesn't exist to those who have experienced Him.
Which god are you referring to? There are thousands of thousands of religions with all sorts of gods.
As long as you dont mean to imply that only your god of your religion is true, then I dont have any beef with your self induced relationship with this psuedo-being.
I get really annoyed at people in any religion saying theirs is the only one thats true! Surprise! Everyone thinks their religion is true. But that doesn't make any of them right.
Is how I find the article. Whether or not you believe in evolution, it seems quite unacceptable to me for a scientist/journalist to make bold and provacative claims about how the now-completely-mapped human genetic code proves beyond a shadow of a doubt that his religion is the right one after all (I say this because the article seems to make clear to me that evolution is his religion), and then not explain it with one shred of evidence!
Gee, I'm glad your religion does not ever make such profound assumptions!
I get offended when religious fanatics attempt to remove evltion from school curriculmn entirely. I get a bit miffed when they call all scientific reasearch, evidence, and effort into learning our ourigins complete and utter fiction with no justification.
Perhaps you need to take a look in the mirror and see the kind of prejudices you hold against any type of scientific evidience that does not agree with your assumptions.
Pretty good summary, but you are missing some up to date information. Everything started with a Big Bang Actually, many people are now thinking that everything starting with a big expansion. There is no singularity. See supersymmetry and string/M theory. Spontaneous formation The SanteFe institute is doing some wonderful research on complex systems and organization. They have a number of theories and experiements which show how this can happen in a straightforward manner. Autocatalytic sets are the main interest here. magine an explosion. Stuff goes in all directions, approx. the same speed. Because in case of Big Bang, there's nothing to hit, the matter would fly in all directions forever, without hitting anything and without stopping, without forming anything. To particles cannot collide if they have the same speed. A nice thought, but your mising the effects of quantum mechanics, relativity, and forces in general. We do these types of tests all the time in particle accelerators and they show how all sorts of odd things happen when subatomic particles interact. Once you have atoms, gravity and chemistry / fusion kick into play. Generation of more complex chemical compounds. Not proven. Generation of amino-acids from the elements in primordial soup is chemically impossible. Nope, they have made amino acids in recreated environments similar to the primordial soup. However, the more recent more accepted theory is that these complex chemical seeds cam from space. They have found complex organic molecules, including amino acids, in space. They do and can form in a wide variety of places. Abiogenesis True, but there is very attractive research on this as well. See the autocatalytic sets above. Also, there are a number of simple aminoacid type chemicals when combined form a structure amazingly similar to a cell membrane. But i digress... Macroevolution(development of more complex organisms from simpler ones): sorry, not proven. 
First, there are no 'intermediate' forms found. Every single one is complete and functional. Darwinism states that evolution is a gradual process, taking millions of years. Hence, almost 100% of fossils should be intermediate forms, with clear links. The links are missing.
There is a growing body of evidence that evolution is *not* a linear process, but has distinct jumps from organism to organism, feature to feature. This is similar to the way a quantum of energy is not a continuum, but a finite jump. This has to do with autocatalytic sets, so see those again. Also, the process would be similar to various adaptations occurring in stealth mode in an organism, until a critical mass is reached and the trait spontaneously manifests. The details are many and very interesting, so I would encourage you to check them out.
From what I can tell, if I root a few boxes on nice fast T3 lines then I can send out a query to all hosts on the alpine network searching for all songs with a "3" in the filename. That should return every mp3 on the planet :)
Yes, you could, but only if they allowed the connection.
Second, they would not send an entire list of MP3s. They would send back a single packet that contains the number of hits found, like 1,234.
To get the list of MP3s you need to do some more work.
Now since the query is udp based I can spoof the return address to www.ebay.com.
No, the return is sent to the originating connection. That is you. The handshake protocol for establishing a connection makes this as resistant to spoofing as TCP. (Which isn't perfect, but at least it's better than nothing)
Also you can't throttle your bandwidth on alpine. Certainly you can send out fewer searches but if you have 100,000 users online then there will be about 5000 searches a minute, and every 33.6 user will have to download all those queries.
Again, a modem user will have fewer connections. And the response to broadcast queries is only a single packet with the number of hits found, if anything is sent at all.
The combination of how many searches you perform and how many connections you have active controls how much bandwidth you use.
But a good dial-up connection runs at only 50 kilobit/s. Are you saying that users who want to use ALPINE should pack up and move to an
area where DSL is available?
They would most likely try to locate a proxy peer. They may have to gain favor by providing quality content or resources, or simply ask nicely to get a proxy connection, but they are not left out in the cold.
Also, the network is still usable for modem users. It may take a few minutes to locate something, assuming you have to search everyone in your peer group, but even that may be acceptable to most users.
The adaptive configuration of searching should reduce this as much as possible, so that you may only query a few hundred peers before you locate what you're looking for.
And they're on high-speed T3 or OCx connections to the Internet, connections that are designed to handle such a load.
Yes, that was a poor response on my part...
What if your query isn't an exact match to one file? For instance, I'm looking for "songs by The Offspring, in .ogg or .mp3 format, at bitrate >= 160 kilobit/s,"
You can query as long as you want, however, if you received 100 replies you could automatically halt querying to see if they would suffice. If you want more, you can continue. This is not an atomic operation.
Also, the adaptive features would increase the likelihood that you would find those hundred hits faster with each successive query.
It's up to you how long, how far, and how fast you want to query.
From every single user who's searching. Say a user searches a 20,000 user network once every 10 minutes (this takes into account inactive users). You'll have to handle (on the average) 2,000 queries a minute, over 60 a second. That's not even counting peak use. Can your hardware and network connection keep up?
On a DSL line you could handle 40-60 queries per second, although you would likely have a smaller network than this, or at least throttle or exclude the really noisy peers.
A DSL line can handle over a thousand a second. This shouldn't be a problem, and again, this is all configurable and adaptable. You get to make the rules.
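As a rough check on figures like these, here is a minimal sketch of the arithmetic. It assumes the 60-byte query packet mentioned elsewhere in this thread is the full on-the-wire size (header overhead is ignored), and the link rates are illustrative assumptions, not measurements; `queries_per_second` is a hypothetical helper name.

```python
def queries_per_second(link_bits_per_second, packet_bytes=60, query_share=1.0):
    """Return how many query packets per second fit in the given share of a link.

    packet_bytes=60 follows the packet size quoted in the thread; query_share is
    the fraction of the link reserved for queries (1.0 = the whole pipe).
    """
    bytes_per_second = link_bits_per_second * query_share / 8
    return bytes_per_second / packet_bytes


if __name__ == "__main__":
    # Link rates below are assumptions chosen for illustration.
    links = [
        ("33.6k modem", 33_600),
        ("50k dial-up", 50_000),
        ("640k DSL (assumed rate)", 640_000),
    ]
    for name, rate in links:
        full = queries_per_second(rate)
        half = queries_per_second(rate, query_share=0.5)
        print(f"{name}: {full:.0f} queries/s at full pipe, {half:.0f} at half")
```

Under these assumptions a modem lands in the tens of queries per second and a DSL line above a thousand; a real link would carry somewhat fewer once packet headers are counted.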
But whenever I think of the obvious solution to this problem (proxies that cache search requests for a group of users), I realize that such a topology would be equivalent to that of the existing OpenNap network
True, and there are specific instances where a multitude of other systems would be more efficient and responsive than the alpine network.
This is not intended to be the be-all-end-all of peer searching. It is simply a usable completely decentralized searching network, and in some instances, this may be a very nice solution.
I can think of quite a few better ways to get certain things or locate certain things,
Google, AltaVista, etc,
OpenNap, Napster,
etc. etc... but none of these are completely decentralized searching networks. That is where ALPINE is intended to function.
You're missing the point. There's a difference between effective bandwidth and total bandwidth (which includes protocol overhead). Maximizing effective bandwidth is good; maximizing total bandwidth is antisocial, and ultimately reduces the effective bandwidth left over for getting real work done. With your protocol, all of the capacity will be sucked into a black hole doing queries, and actual downloads will slow to a crawl.
No, queries only use half the bandwidth on a pipe at most, and can be configured to use less.
And I don't see what your implied difference is between effective bandwidth and total bandwidth.
I can log activity on my internet interface for napster, for freenet, for gnutella, and they all max out my available bandwidth.
I guess I don't understand what part of this you're implying is different. Bandwidth in the overall internet with regards to these services? Bandwidth through my ISP?
But if a modem user can only have 200 people to be connected to, then only those 200 people can be connected to him. That means even if I had an OC48 running and "could" handle everyone's connection, I'm still off limits to the modem people that have already "picked" their 200 people
That's correct. I haven't gone into connection cycling, but let's say one of those 200 is a lowly rated peer as far as quality is concerned (again, this ties back to the quality metrics).
This peer would probably decide to bump him off, and give you a chance. If you turned out to be a quality peer, you would migrate towards the top of his query list, and would be less likely to be bumped off in return.
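The connection cycling described here can be sketched roughly as a fixed-size pool that drops its lowest-rated member to give a newcomer a chance. Everything below is a hypothetical illustration: `ConnectionPool`, `offer`, and `bump_threshold` are made-up names, and the actual ALPINE policy may differ.

```python
class ConnectionPool:
    """Fixed-size set of peers; a low-rated member can be bumped for a newcomer."""

    def __init__(self, capacity, bump_threshold=0.0):
        self.capacity = capacity
        # Assumption: only peers rated below this threshold may be bumped.
        self.bump_threshold = bump_threshold
        self.quality = {}  # peer id -> quality score

    def offer(self, peer, initial_quality=0.0):
        """Try to admit a peer. Return (admitted, evicted_peer_or_None)."""
        if peer in self.quality:
            return True, None
        if len(self.quality) < self.capacity:
            self.quality[peer] = initial_quality
            return True, None
        worst = min(self.quality, key=self.quality.get)
        if self.quality[worst] >= self.bump_threshold:
            return False, None  # everyone here is good enough; no room
        del self.quality[worst]
        self.quality[peer] = initial_quality
        return True, worst


if __name__ == "__main__":
    pool = ConnectionPool(capacity=2)
    pool.offer("good-peer", 5.0)
    pool.offer("leecher", -1.0)
    print(pool.offer("newcomer"))  # the leecher is bumped to make room
```

A newcomer that turns out to be a quality peer would then gain score over time and become less likely to be bumped in turn.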
If you are a rogue/leecher peer then you may end up in a situation where no one wants to allow you a connection, and your T1 goes to waste.
I think you are taking generalizations too far. Each maintained connection uses a measurable amount of bandwidth, say C.
This amount of bandwidth is not a constant. Each connection shares total bandwidth b and then the adaptive nature of the alpine protocol as well as filtering and throttling ensure the fair use of this limited bandwidth. You can use as much as you want, this is a configurable setting in the DTCP stack.
If you are searching for something extremely rare (or nonexistent) and your bandwidth is small with respect to the scope of the network you may be required to cycle your connections many times until you achieve hits. As intended, the network allows you to search at the maximum speed allowed by your bandwidth--but gives you the option of doing a long (but exhaustive) search regardless of whether you have a 14.4 or a gigabit connection
Yes, and this is a drawback, but no different than napster for example. If you can't find your song in the 3,000 to 10,000 peers on the server you're at, you can keep searching, or try a different server.
Oh ok. But that means I start all over again with the "adaptive" process each time I 'log on'.
You don't have to. Part of DTCP provides persistent connections. You can resume a connection when you log on, even if your IP and port changed in the interim. So, you only need to start the adaptive process whenever you create a new connection.
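One way such persistence could work is to key per-peer state on a connection identifier rather than on the (IP, port) pair, so a resume only rebinds the address. This is a sketch under that assumption; it is not DTCP's documented mechanism, and `ConnectionTable` and its methods are hypothetical names. A real protocol would also need to authenticate the resume request.

```python
class ConnectionTable:
    """Per-connection state keyed by a connection id instead of by address."""

    def __init__(self):
        self._by_id = {}

    def open(self, conn_id, address):
        # A fresh connection starts the adaptive process from scratch.
        self._by_id[conn_id] = {"address": address, "quality": 0.0}

    def get(self, conn_id):
        return self._by_id.get(conn_id)

    def resume(self, conn_id, new_address):
        """Rebind a known connection to a new (ip, port); keep its learned state."""
        state = self._by_id.get(conn_id)
        if state is None:
            return False  # unknown id: the caller must open a new connection
        state["address"] = new_address
        return True


if __name__ == "__main__":
    table = ConnectionTable()
    table.open(42, ("10.0.0.5", 4000))
    table.get(42)["quality"] = 7.5          # learned over earlier sessions
    table.resume(42, ("10.0.9.1", 5123))    # new IP and port after logging on
    print(table.get(42))
```

The learned quality survives the address change, which is the property the reply relies on.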
Thus...10,000 searches (at a time) going through the 1 client's bandwidth. (replace 10,000 with whatever number we're working with here).
Yes, you are correct. And that is where slow users would have a smaller peering group that they are connected to, as well as throttling peers who query too aggressively. They can even outright ban peers who are abusing bandwidth.
They may also use a proxy, which would handle the replies.
And last, you control how many peers you query and when. If you find what you're looking for after querying 100 peers, then there is no need to query the rest.
Likewise, if you start getting a large number of responses, you can slow or halt the broadcast of additional queries.
My God, you are so fucking stupid. Bandwidth is never entirely "your own" - unless we're talking about an isolated home LAN - it is shared with many other users of your ISP (because the ISP has only a limited number of outgoing pipes) and from there on with the other users of the intervening networks. If 1000 people on one ISP start clogging up that ISP's pipes with this crap, and that ISP has a clue, they will kick those people off.
Perhaps you misinterpreted what I meant with that statement.
Regardless of any peering application or network you use, you will be using bandwidth. If this application is maxed out, you're using *all* of your allocated bandwidth, i.e. your pipe. This happens all the time.
ISPs continually operate at near peak usage. They don't leave lots of empty bandwidth lying around because someone *might* use it.
Also, 1000 people on alpine would be no different than 1000 people on napster, 1000 people on freenet, etc. Show me a peering application that does not maximize use of your bandwidth in a large network.
And finally, you can tune the amount of bandwidth you use. If you want to use half of your DSL line, and leave the other half free for surfing, etc. you can. UDP gives you complete control over when and how large a packet is sent. TCP cannot do this: you can only send a buffer, and it may go out as one packet, maybe five, it may be delayed a fraction of a second, etc.
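Because each UDP datagram is sent explicitly, an application can pace its own sends. A common way to do that is a token bucket; the sketch below is a generic example of that technique, not ALPINE's actual throttle, and `TokenBucket` and `send_paced` are hypothetical names.

```python
import time


class TokenBucket:
    """Byte budget that refills at a fixed rate, capped at a burst size."""

    def __init__(self, rate_bytes_per_s, burst_bytes, clock=time.monotonic):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.clock = clock          # injectable so the bucket can be tested
        self.tokens = burst_bytes
        self.last = clock()

    def try_consume(self, nbytes):
        """Spend nbytes of budget if available; return whether it was spent."""
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


def send_paced(sock, bucket, packet, address):
    """Send one datagram only if the budget allows; return whether it was sent."""
    if bucket.try_consume(len(packet)):
        sock.sendto(packet, address)
        return True
    return False
```

Setting the rate to half the uplink would leave the other half free for surfing, as described above; a caller that gets `False` back simply retries the query later.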
You seem to be contradicting yourself. If a modem user can limit (or has to limit) the number of connections in his/her group, then how is it possible for a T1 user to have everyone in their group? Both cannot happen.
It would be very unlikely, but all that would need to occur is that one of the 10,000 connections that every peer has would be to the T1 server. The rest of the connections may be to random peers, but the T1 user would still be connected to everyone, while everyone else maintains only 10,000 connections.
Not only am I keeping track of every IP-node user out there, but I have to keep track of it over time. In a napster-success scenario, I'd have 2 million entries to keep track of. Not only that, but it seems like a lot of wasted overhead?
No, you only keep track of this information for the peers you are currently connected to. This may be 3,000 to 10,000 for a napster sized group (not all one million napster users are on the same server!) or more if you have a beefy machine that can handle it.
It is entirely up to each user how many connections and how much bandwidth they wish to use.
The ISP will have to handle 10,000 user requests of ME. And you can't reiterate the B.S. about throttling search requests
I don't see your point. Each of the 10,000 ME's would have their own ISP, and would use their own bandwidth.
Ever watch your modem/DSL lights when you're on napster? This is no different, and the throttling does work, unlike TCP streaming where the bandwidth is always wide open (unless you explicitly throttle sending in your application).
If people can query everybody in the network, it isn't going to scale.
They cannot query everybody on the network. They can only query everyone they are connected to. So, modem users would obviously have a smaller connection pool compared to a DSL user.
If a peer they are connected to is causing too much load, they have them slow down, or they drop them entirely.
Someone on a T1 connection may indeed be able to connect to just about everyone, but they would also have the bandwidth and memory to do so.
Not to mention the problem of having the searcher send out an individual query to every client it want to search. If I understand this correctly, if I want to search 3,000 hosts I have to send out 3,000 otherwise identical packets. This is not what is known as a scalable protocol. In fact, from a network point of view, it's a worst-case scenario.
Worst case scenario is a forwarded broadcast. And at any rate, 3,000 queries to find what you're looking for is indeed a worst case search.
Part of the alpine protocol is the adaptive configuration of the query list so that quality peers are queried first, thus greatly increasing the chances that you don't need to query more than a few hundred to find what you're looking for.
You won't get the answer until you've already sent queries to the next batch. Net result: not only are you consuming all this bandwidth and
creating all this congestion, then you turn around and drop those packets on the floor. That's just adding insult to injury, as far as your upstream is concerned.
No, there is no batch. The query process is iterative, and can be halted or slowed at any point in time. While there may be a dozen to a few hundred packets in transit before you start receiving replies, you can slow or stop the process once you see that you have enough replies, or that you have found what you're looking for, or just want to cancel.
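The halt-when-satisfied behaviour can be sketched as a loop over the ranked peer list that stops as soon as enough hits have arrived. This is a simplified, synchronous illustration: a real client would have some packets in flight before replies come back, and `iterative_query` and `send_query` are hypothetical names.

```python
def iterative_query(ranked_peers, send_query, wanted_hits):
    """Query peers in ranked order; stop once wanted_hits results are in.

    send_query(peer) is assumed to return a list of hits (possibly empty).
    Returns (hits, number_of_peers_queried).
    """
    hits = []
    queried = 0
    for peer in ranked_peers:
        if len(hits) >= wanted_hits:
            break  # enough replies: the remaining peers are never contacted
        hits.extend(send_query(peer))
        queried += 1
    return hits, queried


if __name__ == "__main__":
    # Toy stand-in for the network: each peer "replies" from a local dict.
    library = {"a": ["x.mp3"], "b": [], "c": ["y.mp3"], "d": ["z.mp3"]}
    print(iterative_query(["a", "b", "c", "d"], library.__getitem__, wanted_hits=2))
```

In the toy run the last peer is never queried, which is the point being made: packets for peers beyond the stopping point are not sent at all.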
Please describe how this adaptation occurs. The details are not on your website, it's a complex problem, and I think you're just handwaving
about something you don't understand.
Sure, there are various criteria that indicate a bad or good peer. These include, among other things:
- Did the peer respond to your query?
- Did the peer misrepresent the response?
- Is the file or resource valid?
- Is the peer sending you too many queries?
Etc. These properties, and others, control where in the list of peers to search an individual peer is located. A high quality peer, who often responds and has quality files, will be queried long before a peer that never responds.
For negative behavior there are even ban lists and so forth to prevent them from bothering you further.
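The criteria listed above can be turned into a simple scoring scheme that orders the query list. The sketch below is only an illustration: the weights, the ban threshold, and the names `PeerRecord`, `record_event`, and `query_order` are all assumptions, not values or identifiers from the ALPINE design.

```python
from dataclasses import dataclass


@dataclass
class PeerRecord:
    peer_id: str
    score: float = 0.0
    banned: bool = False


# Hypothetical weights, one per criterion in the list above.
WEIGHTS = {
    "responded": 1.0,        # the peer answered a query
    "valid_file": 2.0,       # the file or resource turned out to be valid
    "misrepresented": -5.0,  # the response did not match what was delivered
    "excess_queries": -1.0,  # the peer is sending too many queries
}
BAN_THRESHOLD = -10.0  # assumed cut-off for the ban list


def record_event(peer, event):
    """Adjust a peer's score for one observed event; ban it if it sinks too low."""
    peer.score += WEIGHTS[event]
    if peer.score <= BAN_THRESHOLD:
        peer.banned = True


def query_order(peers):
    """Return peer ids, best score first, leaving out banned peers."""
    ranked = sorted(peers, key=lambda p: p.score, reverse=True)
    return [p.peer_id for p in ranked if not p.banned]
```

With scores like these, a peer that keeps answering with valid files rises to the front of the list, a silent peer stays in the middle, and a peer that repeatedly misrepresents results drops onto the ban list.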
But the intervening routers are receiving them - and the replies - in huge clumps. That's just like a DDoS.
Only your initial upstream router is receiving them, and from there the packets fan out to their respective destinations. And any ISP that cannot handle the bandwidth generated by a customer has much bigger problems.
I think in order for this system to reach widespread use (especially in the Windoze community), these two functions need to be combined into one interface.
You are correct, and they are combined. Right now a simple TCP transfer ala FTP/HTTP will be used, with additional transfer types provided using pluggable modules.
Secondly, doesn't this facilitate in finding an end users location? After finding the information, now I get to manually enter the IP address into FTP to connect and download. Does this not make it easier for a program to simply track down 'file X', log IP addresses to file and then resolve these IP's and hunt down the users?
Only if the reference you provide for the content is on your machine. You may simply provide a freenet key and the user can then obtain the file anonymously using freenet. You may provide an FTP location on some offshore server that is outside the bounds of US jurisdiction. It could be anywhere. The majority may be on your machine, but this isn't a requirement.
Instead of ever presenting the final address,
perhaps it could transfer this data amongst the network in an encrypted fashion
The final address is only used during a reply. Where you actually get the data is another issue. So, the paranoid may always upload their music into freenet, but locate it using Alpine.
This would be the best of both worlds for fast searching and anonymous downloading.
We meet again, ;)
Sending the same data to 10K hosts in separate packets not only doesn't scale, but it's an extremely antisocial abuse of the network
Funny, I thought web servers acted this way...
Even at 60 bytes per packet, if you're
trying to send to 10000 nodes that's 600K. Then the replies start coming in - in clumps - further clogging that pipe.
If you find the reply you're looking for, then there is no need to query the remaining peers. Also, you will not clog the incoming pipe. I've covered this quite a bit: you control how many queries you send out and when, and also to which peers they are sent. The adaptive nature of the protocol ensures that successive queries will be more likely to find what they are looking for sooner.
You would only query 10,000 in a worst case scenario.
The traffic patterns ALPINE will generate are like nothing so much as a DDoS attack, with the query originator and their ISP as the victims.
No, each of these 'victims' would only receive a single 60 byte packet. This is the opposite of a DoS attack, as you are sending a large number of packets, but each peer is only receiving one of them.
Omnifarious, you are a little naive, but well-known technology in mesh routing and distributed broadcast can easily enough be applied
to create and maintain self-organizing adaptive distributed broadcast trees (phew, that was a mouthful) for this purpose. Read the literature.
I understand what you're getting at, but you're missing the main purpose of this network. If you need to search a large number of peers for dynamic content in real time, you need to reach all of them to do it. Whether you do this using a tree/routing/forward approach, or a single peer using multiple unicast packets, you have to reach them to do it.
The design of this network is so that the resources you use are your own and that you can tailor the bandwidth, peers, and effectiveness of the search to your own preferences.
This is a highly specific network architecture with a very specific purpose using very small packets. This is why alpine can bend the common conceptions about scalability and performance and still remain efficient and scalable.
The point you have to remember is that you control exactly how much bandwidth you use for queries and how many peers you query. Also, the alpine protocol adapts to the responses you receive so that you tend towards a more efficient search.
Similar peers that have similar content and quality service will gravitate towards the top of each other's query lists. Thus, these higher quality peers will be queried before the others (if the others are queried at all).
The net result is that each successful query you make enhances the probability and speed with which the next query will be answered.
For example, napster has grown to millions of users, but whenever you execute a napster query, you are only searching among a group of 3,000-10,000! And these are randomly selected.
Alpine will allow you to search 3,000 to 100,000+ *selective* peers, which you have tuned for optimal results.
you still will have to search every node every time you want to find something
This is not the case. You only have to search until you *find* what you're looking for. This is a big difference, and part of the ALPINE protocol is adapting to the responses and peers you're communicating with to ensure that you search fewer peers each time you're looking for something.
This is covered in the documents, and is a major benefit. The network adapts to your preferences and optimizes accordingly.
Yes, you are correct. And you will always send a packet first. If you are behind a NAT firewall this will be a NAT discovery packet.
A reply is then returned which has your masqueraded IP and port which the NAT router is using. From this point on, this masqueraded address is what you use to identify yourself.
Some systems may need to turn on loose UDP masquerade or the equivalent to allow reply packets from sources other than the initial destination to which you sent the discovery packet.
There are additional details, but the end result is NAT users are supported.
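The discovery exchange can be sketched with plain UDP sockets: the client sends a packet first, and the peer replies with the source address it actually saw. The packet contents and function names below are made up for illustration and are not the ALPINE/DTCP wire format. On loopback there is no NAT, so the observed address equals the local one; behind a NAT router it would be the masqueraded IP and port.

```python
import socket


def send_discovery(client, server_address):
    """Client side: always send first, so a NAT router creates a mapping."""
    client.sendto(b"DISCOVER", server_address)


def answer_discovery(server):
    """Peer side: reply with the source address the packet arrived from."""
    _, observed = server.recvfrom(64)
    server.sendto(f"{observed[0]}:{observed[1]}".encode(), observed)


def read_observed_address(client):
    """Client side: parse the reply into the (ip, port) others see for us."""
    data, _ = client.recvfrom(64)
    ip, port = data.decode().rsplit(":", 1)
    return ip, int(port)


def demo():
    """Run the exchange over loopback; return (observed, local) addresses."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))
        client.bind(("127.0.0.1", 0))
        server.settimeout(2.0)
        client.settimeout(2.0)
        send_discovery(client, server.getsockname())
        answer_discovery(server)
        return read_observed_address(client), client.getsockname()
    finally:
        server.close()
        client.close()


if __name__ == "__main__":
    print(demo())
```

From that point on, the client would identify itself by the observed address rather than by its private one.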
First, 10,000 peers isn't exactly world domination. If your network is successful and swells to 1,000,000 peers (still less than 1% of the Internet), suddenly you're tying up your modem for 250 minutes per query.
No, you connect to as many as you want. You can stop at 10,000, half a million, etc. Each peer is in direct control of how much bandwidth they use, how many peers to connect to, and how many queries they perform.
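To put numbers on the sweep times being argued about, here is a small sketch of the arithmetic. It assumes one 60-byte query per peer (the packet size quoted earlier in the thread), ignores header overhead and reply traffic, and uses illustrative link rates; `sweep_seconds` is a hypothetical helper name.

```python
def sweep_seconds(peer_count, link_bits_per_second, packet_bytes=60):
    """Seconds needed to send one query packet to each of peer_count peers."""
    return peer_count * packet_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    modem = 33_600  # bits per second; an assumed modem rate
    for peers in (300, 10_000, 1_000_000):
        seconds = sweep_seconds(peers, modem)
        print(f"{peers:>9} peers: {seconds:10.1f} s ({seconds / 60:.1f} min)")
```

Under these assumptions a full sweep of a million peers does take hours on a modem, while a search that stops after a few hundred well-ranked peers finishes in seconds, which is why the stopping point matters more than the size of the network.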
Second, presumably other people are making queries, too. If there are even 20 queries per second, your modem link will be saturated even if you're not making any queries of your own.
That's where ALPINE comes into play. It allows the ordering of peers based on quality and value of responses. If you start getting busy, you simply quit replying to queries, and your perceived value to those peers drops; you then get queried less.
The details are more complex, but you should never encounter saturation unless you specifically configure your client to do so, and even then it's unlikely.
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive. Remember, there's no centralized server dishing out lists of addresses
You build them up gradually, and continually refine your list over time, so that you eventually have a list of similar peers with quality content and service. You don't get one million peers all at once. There is a discovery protocol in place, where you can ask for a number of peers from one you are already connected to. No need to do it all at once.
Third, discovering and storing a list of 10,000 peers -- not to mention 1 million peers -- is prohibitively expensive
You can store the connection information for 10,000 peers in 2 megabytes of RAM. The DTCP protocol is specifically designed to be very compact with almost no overhead per connection.
Fourth, the amount of churn in a group of 10,000 peers is quite high -- nodes are arriving, leaving, and crashing all the time. Even if you could find out about all 10,000 peers, your link would be saturated keeping up with changes in group membership
There are protocols for resuming connections if your IP address and port change. Also, these peers would have a perceived low quality in comparison to more stable nodes, and thus would move down your list. You may not even need to maintain a connection to them at all.
The design of the server is also similar to a daemon process. Your GUI or client would interface with the server through a CORBA interface. You can shut down the client and the server is still running. You can reduce bandwidth usage if you wish, or shut down the server. However, it is designed for a more persistent presence than most peering services.
And last, your network inherits all of the d-o-s, spam, and privacy problems inherent in any broadcast-search network. Gnutella has demonstrated these problems (if not solutions to them) handily. Learn from the idiocy of others.
Actually, this should be less of a problem than you would suspect. DoS is still only as bad as TCP. There is a connection protocol similar to TCP with handshakes, etc.
Spam is even less of a problem, as you can ban peers which spam or attack you. Peers can share this information in a growing pool so that spammers and rogue clients are effectively ostracized from the network. Each peer can decide who and when they communicate with. It puts the power back in your hands.
By the way, these were some very insightful questions. Thanks for the reply.