Monday, June 1, 2015

Socket Programming

Sockets allow communication between two processes on the same machine or on different machines. More precisely, a socket is a way to talk to other computers using standard Unix file descriptors.

Types of sockets:
  • Internet Sockets (DARPA Internet addresses)
  • Unix Sockets (path names on a local node)
  • X.25 Sockets (CCITT X.25 addresses)

Types of Internet Sockets:
  • Stream Sockets (SOCK_STREAM)
  • Datagram Sockets (SOCK_DGRAM) - connectionless sockets
  • Raw Sockets
  • Sequenced Packet Sockets

Stream Sockets (SOCK_STREAM)
  • Reliable, two-way, connected communication streams
  • Data arrives in the same order in which it was sent
  • Error-free
  • Example: telnet
  • Use TCP, which is why data arrives sequentially and error-free

Datagram Sockets (SOCK_DGRAM) - connectionless sockets
  • If you send a datagram, it may or may not arrive, and it may arrive out of order.
  • If it arrives, the data within the packet will be error-free.
  • Use IP for routing, but UDP rather than TCP as the transport.
  • Used either when a TCP stack is unavailable or when a few dropped packets here and there are not a problem.
  • Examples: tftp (trivial file transfer protocol, a little brother to FTP), dhcpcd (a DHCP client), multiplayer games, streaming audio, video conferencing
  • Why would you use an unreliable underlying protocol? Speed.
  • How to implement reliable SOCK_DGRAM applications: tftp and similar programs have their own protocol on top of UDP. For example, the tftp protocol says that for each packet that gets sent, the recipient has to send back a packet that says, "I got it!" (an "ACK" packet). If the sender of the original packet gets no reply in, say, five seconds, it re-transmits the packet until it finally gets an ACK.
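
The ACK-and-retransmit idea can be sketched in a few lines of Python. This is only an illustration of the technique, not the real tftp wire format: the `ACK` payload, the function names, the short timeout and the deliberately "lost" first packet are all assumptions made for the demo.

```python
import socket
import threading

ACK = b"ACK"  # hypothetical acknowledgement payload, not the real tftp format


def send_reliably(sock, data, addr, timeout=5.0, max_tries=5):
    """Send one datagram, retransmitting until an ACK arrives; return tries used."""
    sock.settimeout(timeout)
    for attempt in range(1, max_tries + 1):
        sock.sendto(data, addr)
        try:
            reply, _ = sock.recvfrom(1024)
        except socket.timeout:
            continue  # no ACK in time, so retransmit
        if reply == ACK:
            return attempt
    raise TimeoutError("no ACK received")


def ack_receiver(sock, received, drop_first=1):
    """Receive one datagram and ACK it, ignoring the first `drop_first` to fake loss."""
    dropped = 0
    while True:
        data, addr = sock.recvfrom(1024)
        if dropped < drop_first:
            dropped += 1
            continue  # pretend this packet never arrived
        received.append(data)
        sock.sendto(ACK, addr)
        return


def demo():
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    receiver.settimeout(5.0)         # safety net so the thread cannot hang forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    received = []
    worker = threading.Thread(target=ack_receiver, args=(receiver, received))
    worker.start()
    try:
        # A short timeout keeps the demo fast; the text above suggests ~5 seconds.
        tries = send_reliably(sender, b"block 1", receiver.getsockname(), timeout=0.2)
    finally:
        worker.join()
        sender.close()
        receiver.close()
    return tries, received
```

Because the receiver ignores the first packet, `demo()` needs two transmissions before the ACK comes back.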

Raw Sockets
  • These provide users access to the underlying communication protocols, which support socket abstractions.
  • normally datagram oriented, though their exact characteristics are dependent on the interface provided by the protocol.
  • provided mainly for those interested in developing new communication protocols, or for gaining access to some of the more cryptic facilities of an existing protocol

Sequenced Packet Sockets:
  • similar to a stream socket, with the exception that record boundaries are preserved
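
A small sketch of what "record boundaries are preserved" means in practice. It assumes a Linux machine, where Unix-domain sockets support SOCK_SEQPACKET (other platforms may not), and the helper name is made up for illustration.

```python
import socket


def read_after_two_sends(sock_type):
    """Send two messages over a local socket pair; return what each recv() yields."""
    a, b = socket.socketpair(socket.AF_UNIX, sock_type)
    with a, b:
        a.send(b"first")
        a.send(b"second")
        b.setblocking(False)  # so we can stop as soon as the queue is empty
        chunks = []
        while True:
            try:
                chunks.append(b.recv(1024))
            except BlockingIOError:
                break  # nothing left to read
        return chunks


# Expected on Linux:
#   SOCK_STREAM    -> one chunk, the two writes run together
#   SOCK_SEQPACKET -> two chunks, one per send()
stream_chunks = read_after_two_sends(socket.SOCK_STREAM)
seqpacket_chunks = read_after_two_sends(socket.SOCK_SEQPACKET)
```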

Hostname Resolution:
  • The process of finding the dotted-decimal IP address for a given alphanumeric host name.
  • Usually done by the Domain Name System (DNS).
  • Static mappings between host names and IP addresses can also be kept in the local file /etc/hosts.
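
As a rough illustration, Python's standard `socket.getaddrinfo()` asks the system resolver (which typically consults /etc/hosts and DNS) for the addresses behind a name. The helper name `resolve_ipv4` is made up for this sketch.

```python
import socket


def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses the system resolver finds for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (address, port) pair.
    return sorted({info[4][0] for info in infos})
```

For example, `resolve_ipv4("localhost")` usually yields a 127.x.x.x loopback address, and a name that cannot be resolved raises `socket.gaierror`.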

Layered Architecture:

A layered model more consistent with Unix might be:
  • Application Layer (telnet, ftp, etc.)
  • Host-to-Host Transport Layer (TCP, UDP)
  • Internet Layer (IP and routing)
  • Network Access Layer (Ethernet, wi-fi, or whatever)

  • All you have to do for stream sockets is send() the data out.
  • All you have to do for datagram sockets is encapsulate the packet in the method of your choosing and sendto() it out.
  • The kernel builds the Transport Layer and Internet Layer for you, and the hardware does the Network Access Layer.
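
A minimal sketch of the difference, using Python's socket module on the local machine. The function names are invented for the example; the point is only that a connected stream socket needs no address when sending, while a datagram socket names the destination on every sendto().

```python
import socket


def stream_roundtrip(payload):
    """Send over an already-connected stream pair: no address is needed."""
    left, right = socket.socketpair()  # a connected pair of stream sockets
    with left, right:
        left.sendall(payload)
        return right.recv(1024)


def datagram_roundtrip(payload):
    """Send one UDP datagram over loopback: the destination goes with each call."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as rx, \
            socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as tx:
        rx.bind(("127.0.0.1", 0))  # port 0: let the kernel choose a free port
        rx.settimeout(2.0)         # do not hang if the datagram is lost
        tx.sendto(payload, rx.getsockname())
        data, _sender = rx.recvfrom(1024)
        return data
```

In both cases the program only hands over the payload; the TCP, UDP and IP headers are added below the socket API.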

How to Make a Client:
The steps involved in establishing a socket on the client side are as follows:

  • Create a socket with the socket() system call.
  • Connect the socket to the address of the server using the connect() system call.
  • Send and receive data. There are a number of ways to do this, but the simplest way is to use the read() and write() system calls.
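
The client steps above might look like this in Python, where `sendall()` and `recv()` play the role of write() and read(). The `request` function and the throwaway upper-casing server used to exercise it are assumptions for the sketch, not part of any real service.

```python
import socket
import threading


def request(host, port, payload):
    """Connect to a TCP server, send payload, and return the reply up to EOF."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:  # socket()
        sock.connect((host, port))                                   # connect()
        sock.sendall(payload)                                        # send data
        sock.shutdown(socket.SHUT_WR)  # tell the server we are done sending
        chunks = []
        while True:                                                  # receive data
            chunk = sock.recv(4096)
            if not chunk:
                break  # server closed the connection
            chunks.append(chunk)
    return b"".join(chunks)


def _upper_server(listener):
    """Stand-in server for the demo: replies with the upper-cased request."""
    conn, _ = listener.accept()
    with conn:
        data = b""
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            data += chunk
        conn.sendall(data.upper())


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: kernel picks a free port
    listener.listen(1)
    worker = threading.Thread(target=_upper_server, args=(listener,))
    worker.start()
    try:
        host, port = listener.getsockname()
        return request(host, port, b"ping")
    finally:
        worker.join()
        listener.close()
```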

How to make a Server:
The steps involved in establishing a socket on the server side are as follows:

  • Create a socket with the socket() system call.
  • Bind the socket to an address using the bind() system call. For a server socket on the Internet, an address consists of the host's IP address and a port number.
  • Listen for connections with the listen() system call.
  • Accept a connection with the accept() system call. This call typically blocks the calling process until a client connects to the server.
  • Send and receive data using the read() and write() system calls
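
A minimal echo server following the same steps, sketched in Python. The names `make_listener` and `echo_once`, the loopback address and the single-connection design are all choices made for the example; a real server would normally loop over accept().

```python
import socket
import threading


def make_listener(host="127.0.0.1", port=0):
    """Create, bind and listen; port 0 asks the kernel for any free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)    # socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # allow quick restarts
    srv.bind((host, port))                                     # bind()
    srv.listen(5)                                              # listen()
    return srv


def echo_once(srv):
    """Accept one client and echo back everything it sends."""
    conn, _peer = srv.accept()                                 # accept() blocks here
    with conn:
        while True:
            data = conn.recv(4096)                             # read
            if not data:
                break  # client finished sending
            conn.sendall(data)                                 # write


def demo():
    srv = make_listener()
    worker = threading.Thread(target=echo_once, args=(srv,))
    worker.start()
    reply = b""
    with socket.create_connection(srv.getsockname()) as client:
        client.sendall(b"hello")
        client.shutdown(socket.SHUT_WR)  # signal end of request
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
    worker.join()
    srv.close()
    return reply
```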

  • To identify a particular server process running on a host, both TCP and UDP use port numbers, and both define a group of well-known ports for standard services.
  • A port is an integer between 0 and 65535. All port numbers smaller than 1024 are considered well-known.
  • The port assignments to network services can be found in the file /etc/services.
  • For your own servers, it is common practice to pick a port number above 5000.
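
A small sketch of these rules in Python. `socket.getservbyname()` is the standard call that consults the system services database (usually /etc/services); the helper names and the choice to return None when no entry is found are assumptions made for the example.

```python
import socket


def port_class(port):
    """Classify a port number using the ranges described above."""
    if not 0 <= port <= 65535:
        raise ValueError("port must be between 0 and 65535")
    return "well-known" if port < 1024 else "other"


def lookup_service(name, proto="tcp"):
    """Return the port registered for a service name, or None if not listed."""
    try:
        return socket.getservbyname(name, proto)
    except OSError:
        return None  # no services database, or no entry for this name
```

On a typical system `lookup_service("http")` gives 80, which `port_class` reports as well-known.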

Network Byte Order
  • Not all computers store the bytes that comprise a multibyte value in the same order.
    • Little Endian: In this scheme, the low-order byte is stored at the starting address (A) and the high-order byte is stored at the next address (A + 1).
    • Big Endian: In this scheme, the high-order byte is stored at the starting address (A) and the low-order byte is stored at the next address (A + 1).
  • Network Byte Order: To allow machines with different byte order conventions to communicate with each other, the Internet protocols specify a canonical byte order convention (big endian) for data transmitted over the network.
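
To make the byte layouts concrete, here is a short Python sketch using the standard `struct` and `socket` modules. The value 0x1234 and the function name are arbitrary choices for the illustration.

```python
import socket
import struct
import sys


def layouts(value):
    """Show how a 16-bit value is laid out in memory under each convention."""
    return {
        "little": struct.pack("<H", value),   # low-order byte first
        "big": struct.pack(">H", value),      # high-order byte first
        "network": struct.pack("!H", value),  # network order, same as big endian
    }


result = layouts(0x1234)
# result["little"]  == b"\x34\x12"
# result["big"]     == b"\x12\x34"
# result["network"] == b"\x12\x34"

# htons()/ntohs() convert 16-bit values between host and network order;
# whatever the host order (sys.byteorder), a round trip restores the value.
round_trip = socket.ntohs(socket.htons(0x1234))
```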

The select Function
  • The select function indicates which of the specified file descriptors is ready for reading, ready for writing, or has an error condition pending.
  • When an application calls recv or recvfrom, it is blocked until data arrives for that socket.
  • An application could be doing other useful processing while the incoming data stream is empty. Another situation is when an application receives data from multiple sockets.
  • Calling recv or recvfrom on a socket that has no data in its input queue prevents immediate reception of data from other sockets.
  • The select function call solves this problem by allowing the program to poll all the socket handles to see if they are available for non-blocking reading and writing operations.
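
A minimal sketch of the idea with Python's `select.select()`, watching two local sockets and reading only from the one that has data. The socket pairs and function names are stand-ins for real network connections.

```python
import select
import socket


def readable_sockets(socks, timeout=1.0):
    """Return the subset of socks that can be read without blocking."""
    ready, _writable, _errors = select.select(socks, [], [], timeout)
    return ready


def demo():
    a_tx, a_rx = socket.socketpair()
    b_tx, b_rx = socket.socketpair()
    try:
        # Nothing has been sent yet, so a zero-timeout check finds nothing ready.
        idle = readable_sockets([a_rx, b_rx], timeout=0)
        b_tx.sendall(b"data on b")
        # Now only b_rx is reported, and recv() on it will not block.
        ready = readable_sockets([a_rx, b_rx])
        messages = [s.recv(1024) for s in ready]
        return idle, ready == [b_rx], messages
    finally:
        for s in (a_tx, a_rx, b_tx, b_rx):
            s.close()
```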

Blocking vs Non Blocking Sockets

  • By default, TCP sockets are in "blocking" mode. It's possible to put a descriptor into "non-blocking" mode instead.

Blocking Mode:
  • When you call recv() to read from a stream, control isn't returned to your program until at least one byte of data has been received from the remote side.
  • This process of waiting for data to appear is referred to as "blocking". The same applies to calls such as connect() and write().

Non Blocking Mode:
  • When placed in non-blocking mode, you never wait for an operation to complete.
  • If you call recv() in non-blocking mode, it will return any data that the system has in its read buffer for that socket.
  • But it won't wait for that data.
  • If the read buffer is empty, the system will return from recv() immediately with an "Operation would block" error (EWOULDBLOCK or EAGAIN).
  • Non-blocking sockets can also be used in conjunction with the select() API. In fact, if you reach a point where you actually WANT to wait for data on a socket that was previously marked as "non-blocking", you could simulate a blocking recv() just by calling select() first, followed by recv().
  • Programs that use non-blocking sockets typically use one of two methods when sending and receiving data.

  1. Polling: the program periodically attempts to read or write data from the socket, typically using a timer.
  2. Asynchronous notification: the program is notified whenever a socket event takes place, and can respond to it in turn.
  • When designing a high performance networking application with non-blocking socket I/O, the architect needs to decide which polling method to use to monitor the events generated by those sockets.

  1. Polling with select()
  2. Polling with poll()
  3. Polling with epoll()
  4. Polling with libevent
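
A small Python sketch of the non-blocking behaviour described above. In Python the "would block" condition surfaces as a `BlockingIOError` exception; the helper names and the local socket pair are assumptions for the demo.

```python
import select
import socket


def try_recv(sock, size=1024):
    """Non-blocking read: return data, or None if the call would block."""
    try:
        return sock.recv(size)
    except BlockingIOError:  # EAGAIN / EWOULDBLOCK: read buffer is empty
        return None


def recv_when_ready(sock, timeout, size=1024):
    """Simulate a blocking recv() on a non-blocking socket via select()."""
    ready, _, _ = select.select([sock], [], [], timeout)
    return sock.recv(size) if ready else None


def demo():
    tx, rx = socket.socketpair()
    with tx, rx:
        rx.setblocking(False)     # switch the receiving end to non-blocking mode
        empty = try_recv(rx)      # nothing sent yet, so this returns at once
        tx.sendall(b"hi")
        waited = recv_when_ready(rx, timeout=1.0)  # wait with select(), then read
        return empty, waited
```

The first call returns immediately with no data; the second waits in select() and then reads what arrived.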

