Query Flooding


Gnutella, a public-domain file-sharing application, is the best-known example of a P2P application that locates content by query flooding

The peers form an abstract, logical network called an overlay network

Edge - an abstract link between two peers that hold a TCP connection with each other; one edge may span tens of underlying physical links

Although a Gnutella network may have hundreds of thousands of participating peers, a given peer will typically be connected to fewer than 10 other nodes in the overlay network

Steps in query flooding

  • A peer sends a Query message to its neighboring peers in the overlay network over
    pre-existing TCP connections
  • The neighbors forward the Query message to all of their neighbors, which forward it
    to all of their neighbors, etc.
  • When a peer receives a Query message, it checks to see whether the keyword
    matches any of the files it is making available for sharing
  • Once a match is found, it sends back a QueryHit message, which contains the file
    name and the file size of the match
  • The QueryHit message follows the reverse path of the Query message, thereby using pre-existing TCP connections
  • Multiple QueryHit messages may be received, in which case the user decides which
    file to download
  • The Gnutella process then sets up a direct TCP connection with the desired peer and sends an HTTP GET message that includes the specific file name
  • The file is sent in an HTTP response message
  • Once the entire file is received, the direct TCP connection is terminated
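
The Query and QueryHit steps above can be sketched as a small simulation. This is an illustrative model, not Gnutella's wire protocol: the overlay graph `OVERLAY`, the file table `SHARED`, and the function `flood_query` are hypothetical names made up for the example. The sketch also assumes that a peer drops a Query it has already seen, so that the flood terminates; the steps above do not describe that detail.

```python
from collections import deque

# Hypothetical overlay: each peer maps to its neighbors (edges are TCP connections).
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

# Hypothetical shared files per peer: file name -> size in bytes.
SHARED = {
    "A": {},
    "B": {"notes.txt": 1200},
    "C": {},
    "D": {},
    "E": {"song.mp3": 4000000, "notes.txt": 1300},
}


def flood_query(origin, keyword):
    """Flood a Query from origin and return the QueryHit records it produces."""
    came_from = {origin: None}  # peer -> neighbor that delivered the Query first
    queue = deque([origin])
    hits = []
    while queue:
        peer = queue.popleft()
        if peer != origin:
            # The peer checks the keyword against the files it shares.
            for name, size in SHARED[peer].items():
                if keyword in name:
                    # Walk the recorded links backwards: the QueryHit's reverse path.
                    path, hop = [], peer
                    while hop is not None:
                        path.append(hop)
                        hop = came_from[hop]
                    hits.append({"peer": peer, "file": name,
                                 "size": size, "reverse_path": path})
        # Forward the Query to every neighbor that has not seen it yet
        # (assumption: duplicates are dropped).
        for neighbor in OVERLAY[peer]:
            if neighbor not in came_from:
                came_from[neighbor] = peer
                queue.append(neighbor)
    return hits


if __name__ == "__main__":
    for hit in flood_query("A", "notes"):
        print(hit)
```

In this toy overlay, a query for "notes" from peer A produces two QueryHit records, one from B and one from E. The record from E travels back along E, D, B, A, the same TCP connections the Query used in the other direction.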

 



 

Gnutella criticisms

  • Not scalable
  • Because a query propagates to every other peer in the overlay network, a significant amount of traffic is dumped into the Internet

 

 

 

To answer this second criticism, Gnutella engineers created limited-scope query flooding

  • When a query message is initially sent, a peer-count field is set to an initial amount
  • Each time the query message reaches a new peer, the peer-count is reduced by 1
  • When a peer receives a query with a peer-count equal to 0, it stops forwarding the message

This reduces the query traffic that is dumped into the Internet, but it also reduces the number of peers that are queried, which increases the probability that the desired file won't be found
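
The trade-off can be seen in a minimal sketch of limited-scope flooding. The overlay `OVERLAY` and the function `limited_flood` are hypothetical, and the sketch makes two assumptions: a peer that receives the query with a peer-count above 0 forwards it with the count reduced by 1, and a peer ignores a query it has already seen.

```python
from collections import deque

# Hypothetical overlay: each peer maps to its neighbors.
OVERLAY = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}


def limited_flood(overlay, origin, peer_count):
    """Return the set of peers that a query from origin reaches."""
    seen = {origin}
    reached = set()
    # Messages in flight: (receiving peer, peer-count carried by the query).
    in_flight = deque((n, peer_count) for n in overlay[origin])
    while in_flight:
        peer, count = in_flight.popleft()
        if peer in seen:
            continue  # assumption: a duplicate query is ignored
        seen.add(peer)
        reached.add(peer)
        if count == 0:
            continue  # peer-count 0: the peer stops forwarding
        for neighbor in overlay[peer]:
            if neighbor not in seen:
                in_flight.append((neighbor, count - 1))
    return reached


if __name__ == "__main__":
    for initial in range(4):
        print(initial, sorted(limited_flood(OVERLAY, "A", initial)))
```

With an initial peer-count of 0 only A's direct neighbors B and C are queried. Peer F, at the far end of the overlay, is reached only once the initial peer-count is 3 or more, so a file held only by F would not be found with a smaller value.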

Another problem, called the bootstrap problem, is how a peer that wants to join finds peers that are currently online

Bootstrap solution

  1. The user maintains a list of peers that are often up in the Gnutella network
  2. The user tries in turn to establish a TCP connection with each peer on this list
    until a connection is made
  3. After the connection is established, the user sends a ping message which includes a
    peer-count field, and the peer sends a pong message back which includes its IP
    address, the number of files it is sharing, and the number of Kbytes taken by the
    files it is sharing
  4. The peer forwards the ping message to its neighbors in the overlay network, each
    peer that receives it also answers with a pong message, and this process continues
    until the peer-count reaches 0
  5. Once the user receives all of the pong messages, it knows the IP addresses of many members in the overlay network
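
The bootstrap steps above can be sketched in two parts. The helper names (`tcp_connect`, `first_reachable`, `collect_pongs`) and the example data are hypothetical, and the ping and pong messages are modeled as dictionary lookups rather than real Gnutella messages. The sketch assumes that each peer answers a given ping once and that the peer-count is reduced by 1 on each forward. `first_reachable` makes a single pass over the list; a real client would presumably retry.

```python
import socket
from collections import deque


def tcp_connect(address, timeout=3.0):
    """Try one TCP connection; return the socket, or None if the peer is down."""
    try:
        return socket.create_connection(address, timeout=timeout)
    except OSError:
        return None


def first_reachable(candidates, connect=tcp_connect):
    """Steps 1-2: walk the list of often-up peers until one accepts a connection."""
    for address in candidates:
        conn = connect(address)
        if conn is not None:
            return address, conn
    return None, None


def collect_pongs(overlay, peer_info, entry_peer, peer_count):
    """Steps 3-5: flood a ping from entry_peer and gather one pong per peer reached."""
    pongs = {}
    # Pings in flight: (receiving peer, peer-count carried by the ping).
    in_flight = deque([(entry_peer, peer_count)])
    while in_flight:
        peer, count = in_flight.popleft()
        if peer in pongs:
            continue  # assumption: a peer answers a given ping only once
        # The pong carries (IP address, files shared, Kbytes shared).
        pongs[peer] = peer_info[peer]
        if count == 0:
            continue  # peer-count 0: the ping is not forwarded further
        for neighbor in overlay[peer]:
            if neighbor not in pongs:
                in_flight.append((neighbor, count - 1))
    return pongs


if __name__ == "__main__":
    # Made-up overlay and pong contents, for illustration only.
    overlay = {"P1": ["P2", "P3"], "P2": ["P1", "P4"], "P3": ["P1"], "P4": ["P2"]}
    info = {
        "P1": ("10.0.0.1", 12, 3400),
        "P2": ("10.0.0.2", 5, 900),
        "P3": ("10.0.0.3", 0, 0),
        "P4": ("10.0.0.4", 40, 120000),
    }
    print(collect_pongs(overlay, info, "P1", peer_count=1))
```

Passing `connect` as a parameter keeps the connection attempt separate from the list-walking logic, so the latter can be exercised without touching the network.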