Wednesday, August 20, 2008

Sockets Programming

Author: Chad Z. Hower


This article is an introduction to socket (TCP/IP sockets) concepts. It isn't meant to cover every socket topic; it's a primer that brings the reader to a level at which socket programming can be easily communicated. I've also chosen not to cover higher-level protocols, such as FTP, the World Wide Web, etc., as I assume you're familiar with these (after all, you are on the Internet using the World Wide Web as you read this).

There are several concepts that must be introduced first. As much as possible, the following concepts will be likened to a real-world concept you are likely familiar with: a phone system.

Winsock ("Windows Sockets")

Winsock is a defined and documented standard API for programming network protocols. Most commonly it is used to program TCP/IP, but can also be used to program Novell (IPX/SPX) and other network protocols. Winsock is accessible as a DLL that is part of Win32.


TCP/IP

TCP/IP stands for Transmission Control Protocol and Internet Protocol. TCP/IP can mean many things, but in most cases, it refers to the network protocol itself.


Client

A client is a process that initiates a connection. Typically, clients talk to one server at a time. If a process needs to talk to multiple servers, it creates multiple clients.

Likening it to a phone call, a client would be the person who makes a call.


Server

A server is a process that answers incoming requests. A typical server handles numerous requests from many clients simultaneously. Each connection between the server and a client, however, is a separate socket.

Likening it to a phone call, the server would be the person (or voice mail, interactive system, etc.) who answers the phone when it rings. A server is typically set up so that it can handle multiple incoming phone calls. This is similar to how a call center might handle many calls by having hundreds of operators and routing each incoming call to an available operator.
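
The call-center picture can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular server is built: the names handle_client, serve and demo are hypothetical, everything runs on the local machine, and port 0 simply asks the operating system for any free port. The point to notice is that every accept() hands back a new socket, one per caller.

```python
import socket
import threading


def handle_client(conn: socket.socket) -> None:
    """One operator: serve a single caller on its own socket, then hang up."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(b"echo: " + data)


def serve(listener: socket.socket, max_clients: int) -> None:
    """Answer max_clients calls, routing each one to its own thread."""
    for _ in range(max_clients):
        conn, _addr = listener.accept()   # a new socket for every connection
        threading.Thread(target=handle_client, args=(conn,), daemon=True).start()


def demo() -> list:
    """Open three client connections at once and collect the three replies."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    threading.Thread(target=serve, args=(listener, 3), daemon=True).start()

    # All three calls are connected before any of them starts talking.
    clients = [socket.create_connection(("127.0.0.1", port), timeout=5)
               for _ in range(3)]
    replies = []
    for number, client in enumerate(clients):
        with client:
            client.sendall(b"caller %d" % number)
            replies.append(client.recv(1024))
    listener.close()
    return replies
```

Calling demo() should return one echoed reply per caller. A thread per connection is just the simplest way to show the idea; real servers often use thread pools or asynchronous I/O instead.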

IP Address

Each computer on a TCP/IP network has a unique address associated with it. Some computers may have more than one address associated with them. An IP address (IPv4) is a 32-bit number, usually written in dot notation as four decimal numbers separated by periods. Each number represents one byte of the 32-bit address.
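
To make the "32-bit number" concrete, here is a small Python sketch that converts between dot notation and the underlying integer. The function names are hypothetical, and 192.0.2.1 is a documentation-only address chosen purely for illustration.

```python
import socket
import struct


def dotted_to_int(address: str) -> int:
    """Pack a dotted IPv4 address into its 32-bit integer value."""
    packed = socket.inet_aton(address)       # four raw bytes, network byte order
    return struct.unpack("!I", packed)[0]    # read them as one unsigned 32-bit number


def int_to_dotted(value: int) -> str:
    """Turn a 32-bit integer back into dot notation."""
    return socket.inet_ntoa(struct.pack("!I", value))


# 192.0.2.1 is an illustrative address, not one taken from this article.
print(dotted_to_int("192.0.2.1"))     # 3221225985
print(int_to_dotted(3221225985))      # 192.0.2.1
```

Each of the four numbers contributes one byte: 192 is the most significant byte and 1 the least significant.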

An IP address is like a phone number; a location (residence, business, etc.) can have one or more phone numbers. To talk to someone at that location, a connection attempt is initiated (a call is placed and the dialed location's phone rings) by dialing the phone number for that location. The party at the ringing end of the phone can then decide whether or not to answer the phone.


Port

A port is an integer number that identifies which application or service the client wishes to contact.

A port is much like a phone extension. Calling a phone number will get you to a location, but with TCP/IP, every location also has an extension. Unlike a residential phone, there is no default extension.

When an application (a server) is ready to accept incoming requests, it begins to listen on a port. This is why the words "application" and "protocol" are sometimes used interchangeably with the word "port." When a client wants to talk to a server, it must know where the application is (the IP address/phone number), and which port (extension) it's listening (answering) on.

Typically, applications have a fixed port so that no matter where they run, the port is fixed for that type of application. For example, HTTP (Web) uses port 80, and FTP uses port 21. So when you want to retrieve a Web page, you only need to know the location of the computer you wish to retrieve it from, as you know HTTP uses port 80.

Port numbers below 1024 are reserved, and should only be used if you're talking to or implementing a known protocol that has such a port reserved. Most popular protocols use reserved port numbers.


Socket

All references to sockets in this article are references to TCP/IP. A socket is the combination of an IP address and a port number. A socket is also a virtual communication conduit between two processes. These processes may be local (residing on the same computer) or remote.

A socket is like a phone connection that carries a conversation. To have a conversation, you must first make the call, and have the other party answer; otherwise, no connection (socket) will be established.
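
The "no answer, no socket" rule is easy to observe. The following Python sketch is an illustration only: try_call and demo are hypothetical names, and it assumes the port that was just released on the local machine isn't immediately reused by another program. It places a call while a server is listening, then again after the server has gone away.

```python
import socket


def try_call(host: str, port: int) -> bool:
    """Return True if a server answered on host:port, False if nobody picked up."""
    try:
        with socket.create_connection((host, port), timeout=2):
            return True
    except OSError:            # refused, timed out, unreachable, and so on
        return False


def demo() -> tuple:
    """Call a listening server, then call the same port after it stops listening."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))            # any free local port
    listener.listen()
    port = listener.getsockname()[1]
    answered = try_call("127.0.0.1", port)     # someone is listening
    listener.close()
    unanswered = try_call("127.0.0.1", port)   # nobody is listening any more
    return answered, unanswered
```

The first call succeeds because the IP address and port both lead to a listening application; the second fails because the address is still valid but nothing answers on that port.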

Host Names

Host names are "human-readable" names for IP addresses. Every host name has an equivalent IP address.

Host names are used both to make it easier on us humans, and to allow a computer to change its IP address without causing all of its potential clients (callers) to lose track of it.

A host name is like a person's name or a business name. A person or business can change their phone number, but we can still contact them.


DNS

DNS stands for Domain Name System. DNS is the service that translates host names into IP addresses. To establish a connection, an IP address must be used, so DNS is used to look up the IP address first.

To make a phone call, you must dial by using the phone number. You cannot dial using a person's name. If you don't have the person's phone number, or it has changed, you would look up the person's phone number in the phone book, or call directory assistance. Thus, DNS is the phone book/directory assistance for the Internet.
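
In code, the "look it up in the phone book" step is usually a single call. Here is a minimal Python sketch; resolve and resolve_or_none are hypothetical helper names, and the result for "localhost" depends on how the machine running it is configured.

```python
import socket


def resolve(host: str) -> str:
    """Return the IPv4 address for a host name, in dot notation."""
    return socket.gethostbyname(host)


def resolve_or_none(host: str):
    """Like resolve(), but return None when the name cannot be looked up."""
    try:
        return resolve(host)
    except OSError:            # socket.gaierror is a subclass of OSError
        return None


# A name that is normally resolved locally, without asking a DNS server.
print(resolve_or_none("localhost"))
```

If you pass something that is already an IP address, gethostbyname returns it unchanged, so the same code path works whether the user typed a name or a number.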

More Topics

Now that the basics have been covered, you should have a basic understanding of sockets and related topics. Further topics can now be covered, and programming tasks communicated.

The following topics are not essential for a basic understanding, but can be useful.


TCP

TCP (Transmission Control Protocol) is sometimes also referred to as "stream." TCP/IP includes many protocols and many ways to communicate. The most common transports are TCP and UDP. TCP is a connection-based protocol - that is, you must connect to a server before you can send data - that guarantees delivery and accuracy of the data sent and received on the connection. TCP also guarantees that data will arrive in the order that it's sent. Most things that use TCP/IP use TCP for their transport.

TCP connections are like placing a phone call to carry on a conversation.


UDP

UDP (User Datagram Protocol) is for datagrams and is connectionless. UDP allows "lightweight" packets to be sent to a host without first having to connect to it. UDP packets are not guaranteed to arrive at their destination, and may not arrive in the same order they were sent. A UDP packet is sent as one block, so it must not exceed the maximum packet size specified by your TCP/IP stack or component. The Windows TCP/IP stack's maximum UDP packet size is typically 32KB.

Because of these factors, many people assume UDP is utterly useless. This is not the case. Many streaming protocols, such as RealAudio, use UDP. (The term "streaming" can be easily confused with "stream" connection, which is TCP. When you see these terms, you need to determine the context in which they're used to determine their proper meaning.) The reliability of UDP packets depends on the reliability of the network. UDP packets are also often used in applications that run on a LAN, as the LAN is very reliable. UDP packets across the Internet are generally reliable, and are therefore often used. This, however, cannot be guaranteed - so don't assume your data will always arrive at your destination.

Because UDP doesn't have delivery confirmation, a packet is not guaranteed to arrive. So if you send a UDP packet to another host, you have no way of knowing whether it arrived. Winsock will not - and cannot - determine this, and thus will not provide an error. If you need this information, you'll need to send some sort of return notification back from the remote host.

UDP is like sending someone a message on their pager. You know you sent it, but you don't know if they received it. The pager may not exist, may be out of range, may not be on, or may not be functioning. In addition, the pager network may lose the page. Unless the person pages you back, you don't know if your message was delivered. In addition, if you send multiple pages, it's possible for them to arrive out of order.
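
The "page me back" idea can be sketched as follows in Python. Treat it as an illustration under assumptions rather than a recommended protocol: the b"ACK" reply, the one-second timeout, and the names receiver_once, send_with_ack and demo are all invented for this example, and both ends run on the local machine.

```python
import socket
import threading


def receiver_once(sock: socket.socket, inbox: list) -> None:
    """Wait for one datagram, record it, and page the sender back with an ACK."""
    data, sender = sock.recvfrom(1024)
    inbox.append(data)
    sock.sendto(b"ACK", sender)


def send_with_ack(destination, payload: bytes, timeout: float = 1.0) -> bool:
    """Send one datagram; True only if the other side confirmed it in time."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(payload, destination)      # no connect step is needed
        try:
            reply, _sender = sock.recvfrom(1024)
        except OSError:                        # timed out, or the OS reported an error
            return False
        return reply == b"ACK"


def demo() -> tuple:
    """Send one packet to a local receiver that acknowledges it."""
    inbox = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as server:
        server.bind(("127.0.0.1", 0))          # any free local port
        server.settimeout(5)                   # don't wait forever if the packet is lost
        address = server.getsockname()
        worker = threading.Thread(target=receiver_once, args=(server, inbox))
        worker.start()
        confirmed = send_with_ack(address, b"hello")
        worker.join()
    return confirmed, inbox
```

Note that a False result doesn't tell you whether the original packet or the acknowledgment was lost; the sender only knows that no confirmation came back.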


ICMP

ICMP stands for Internet Control Message Protocol. ICMP is a control and maintenance protocol that you typically won't need to use directly. It's mostly used to communicate with routers and other network devices. It allows nodes to share IP status and error information and is used for PING, TRACEROUTE, and other such protocols.


HOSTS

HOSTS is a text file that exists somewhere in your Windows directory or a sub-directory (its location varies depending on which version of Windows you're using). By default, many installations don't have such a file, but have a HOSTS.SAM (.SAM = Sample), which you can use to create a HOSTS file. A HOSTS file contains a local host lookup table. When Winsock attempts to resolve a host name to an IP address, it first looks in the HOSTS file. If a matching entry exists, it will use that entry. If an entry doesn't exist, it will proceed to use DNS.

Here is an example HOSTS file:
# This is a sample HOSTS file
caesar  # Server computer
augustus  # Firewall computer

The IP address and host name can be separated by spaces or the tab character. A comment is optional and begins with the # character.

HOSTS can be used to fake entries, or override DNS entries. The HOSTS file is often used on computers on a small LAN that have no DNS server. The HOSTS file is also useful for overriding host IPs for debugging. You don't need to read the HOSTS file; Winsock will take care of this detail for you transparently whenever name resolution occurs.
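
Although you never need to read HOSTS yourself, parsing it is a handy way to see how simple the format is. The Python sketch below is an illustration only: parse_hosts is a hypothetical name, the addresses in the sample are invented (192.0.2.x is a documentation-only range), and lower-casing the names is my own choice because host names are not case-sensitive.

```python
def parse_hosts(text: str) -> dict:
    """Map each host name in HOSTS-style text to its IP address."""
    table = {}
    for line in text.splitlines():
        entry = line.split("#", 1)[0].strip()   # drop the optional comment
        if not entry:
            continue                            # blank or comment-only line
        address, *names = entry.split()         # spaces and tabs both separate fields
        for name in names:
            table[name.lower()] = address
    return table


# The addresses below are invented for illustration.
sample = """\
# This is a sample HOSTS file
192.0.2.10    caesar     # Server computer
192.0.2.11\taugustus   # Firewall computer
"""

print(parse_hosts(sample))
```

The second sample line uses a tab between the address and the name to show that either separator works.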


SERVICES

A SERVICES file is similar to a HOSTS file, but instead of resolving host names into IP addresses, it resolves service names into the ports they're assigned to.

The following is a partial SERVICES file. You can look on your computer to see a complete file, or obtain RFC 1700. RFC 1700 contains assigned and reserved port numbers:
echo                7/tcp
echo                7/udp
discard             9/tcp    sink null
discard             9/udp    sink null
systat             11/tcp    users                  #Active users
systat             11/udp    users                  #Active users
daytime            13/tcp
daytime            13/udp
qotd               17/tcp    quote                  #Quote of the day
qotd               17/udp    quote                  #Quote of the day
chargen            19/tcp    ttytst source          #Character generator
chargen            19/udp    ttytst source          #Character generator
ftp-data           20/tcp                           #FTP, data
ftp                21/tcp                           #FTP. control
telnet             23/tcp
smtp               25/tcp    mail                   #Simple Mail Transfer Protocol

The format of each entry is:
<service name> <port number>/<protocol> [aliases...] [#<comment>]
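
The entry format above maps naturally onto a small parser. This Python sketch is illustrative only; ServiceEntry and parse_service_line are hypothetical names, not part of any library.

```python
from typing import NamedTuple, Optional


class ServiceEntry(NamedTuple):
    name: str
    port: int
    protocol: str
    aliases: tuple
    comment: str


def parse_service_line(line: str) -> Optional[ServiceEntry]:
    """Parse one SERVICES entry; return None for blank or comment-only lines."""
    body, _, comment = line.partition("#")         # split off the optional comment
    fields = body.split()
    if len(fields) < 2:
        return None
    port_text, _, protocol = fields[1].partition("/")
    return ServiceEntry(fields[0], int(port_text), protocol,
                        tuple(fields[2:]), comment.strip())


print(parse_service_line("discard             9/udp    sink null"))
```

Everything after the port/protocol field and before the # is treated as an alias, which matches entries such as "sink null" in the listing.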

You don't need to read the SERVICES file; Winsock will also take care of this detail for you. The SERVICES file is read by certain function calls in Winsock; however, most programs don't call these functions, and therefore ignore its values. For example, most FTP programs default to 21 without ever using Winsock to look up the port for the 'ftp' entry.

Normally, you should never modify this file. Some programs, however, add an entry to this file and actually use it. You can then change this entry to tell those programs to use another port. One such program is Interbase. Interbase makes the following entry:
gds_db           3050/tcp

You can change this entry to make Interbase use a different port. While it's not common practice to do this, it's a good practice and should be considered if you write socket applications, especially servers. It's also good practice for clients to use Winsock to look up the value in SERVICES, especially for non-standard protocols. If no entry is found, a default should be used.
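
The "look it up, otherwise use a default" practice takes only a few lines. Here is a hedged Python sketch; service_port is a hypothetical helper name, and whether a gds_db entry exists depends entirely on the machine it runs on.

```python
import socket


def service_port(name: str, default: int, protocol: str = "tcp") -> int:
    """Look the service up in SERVICES; fall back to default if it has no entry."""
    try:
        return socket.getservbyname(name, protocol)
    except OSError:            # raised when the service/protocol pair is not listed
        return default


# Uses the SERVICES entry if one exists on this machine, otherwise 3050.
print(service_port("gds_db", 3050))
```

With this approach, an administrator can move a server to another port by editing one line in SERVICES, and clients that perform the lookup will follow without being recompiled.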


LOCALHOST

LOCALHOST is similar to "Self" in Delphi, or "this" in C++. LOCALHOST refers to the computer you're working on. It's a loopback address: if you use it in any client, the connection will always loop back and look for a server on the computer the client is on.

This is useful for debugging and can also be used to contact any service running on your computer. If you have a local Web server, instead of needing to know the IP of the computer or having each developer change it in test scripts, you can specify LOCALHOST.


Ping

Ping is a protocol that verifies whether a host is reachable by the local computer. Ping is usually used in a diagnostic capacity.

Windows has a command-line utility to perform a Ping. Its usage is:
ping <host name or IP>

The following is sample output of a successful Ping:

Pinging with 32 bytes of data:

Reply from bytes=32 time<10ms TTL=128
Reply from bytes=32 time<10ms TTL=128
Reply from bytes=32 time<10ms TTL=128
Reply from bytes=32 time<10ms TTL=128

Ping statistics for
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum =  0ms, Average =  0ms

If a host cannot be reached, the output will look similar to this:
D:\>ping CAESAR

Pinging with 32 bytes of data:

Destination host unreachable.
Destination host unreachable.
Destination host unreachable.
Destination host unreachable.

Ping statistics for
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum =  0ms, Average =  0ms


TraceRoute

TCP/IP packets don't travel directly from one host to another. They are routed, much like a car drives from one house to another. Typically, the car must travel on more than one road to reach its destination. TCP/IP packets travel much in the same way. Each time a packet changes "roads" at an "interchange," it travels through a node. By obtaining a list of nodes or "interchanges" that a packet must travel through between hosts, you can determine its path. This is quite useful in determining why a host cannot be reached.

Windows has a command-line utility to perform TraceRoutes, called TraceRt. TraceRt displays a list of IP routers (nodes) that are used in delivering a packet from your computer to the specified host, and how long each trip (hop) took. These times can be useful in determining bottlenecks. TraceRt can also display the last router that successfully handled your packet in the case of a failed transfer. TraceRt is used to further diagnose problems detected with Ping.

The following is sample output of a successful TraceRt:

Tracing route to []
over a maximum of 30 hops:

  1   <10 ms   <10 ms   <10 ms  LANmodem []
  2    40 ms    40 ms    50 ms []
  3    50 ms    50 ms    50 ms []
  4    70 ms    71 ms    60 ms  588.Hssi3-0-0.GW1.ATL1.ALTER.NET []
  5    60 ms   130 ms    91 ms  103.ATM2-0.XR2.ATL1.ALTER.NET []
  6    60 ms    70 ms    60 ms  194.ATM2-0.TR2.ATL1.ALTER.NET []
  7   120 ms   100 ms    70 ms  109.ATM6-0.TR2.CHI4.ALTER.NET []
  8   110 ms    70 ms   141 ms  103.ATM2-0.XR2.ATL1.ALTER.NET []
  9    90 ms    80 ms    90 ms  194.ATM9-0-0.BR1.CHI1.ALTER.NET []
 10   101 ms    80 ms    90 ms  198.ATM6-0.XR2.CHI4.ALTER.NET []
 11   131 ms   160 ms    90 ms []
 12   290 ms   211 ms   140 ms []
 13   141 ms   170 ms   160 ms []
 14   210 ms   130 ms   170 ms []
 15   130 ms   151 ms   140 ms []
 16   180 ms   160 ms   151 ms []
 17   140 ms   210 ms   180 ms
 18   151 ms   150 ms   140 ms

Trace complete.


IETF

The IETF (Internet Engineering Task Force) is an open community that promotes the operation, stability, and evolution of the Internet. The IETF works much like Open Source software development teams.


RFC

RFCs (Requests for Comments) are the official documents of the IETF that describe and detail protocols of the Internet.


Conclusion

My goal for this article was to provide the reader with a basic understanding of socket concepts necessary to move to the next level, and to liken them to real-world concepts. I hope that I have met this goal.


This article is an extract from the book Indy in Depth. Indy in Depth is an e-book which you can subscribe to and receive the complete book by e-mail. Also check out the Atozed Indy Portal.

About the Author

Chad Z. Hower, a.k.a. "Kudzu," is the original author and project coordinator for Internet Direct (Indy). Indy consists of over 110 components and is included as a part of Delphi, Kylix and C++ Builder. Chad's background includes work in the employment, security, chemical, energy, trading, telecommunications, wireless, and insurance industries. Chad's areas of specialty are TCP/IP networking and programming, inter-process communication, distributed computing, Internet protocols, and object-oriented programming. When not programming, he likes to cycle, kayak, hike, downhill ski, drive, and do just about anything outdoors. Chad, whose motto is "Programming is an art form that fights back", also posts free articles, programs, utilities and other oddities at Kudzu World. Chad is an American expatriate who currently spends his summers in St. Petersburg, Russia and his winters in Limassol, Cyprus.

Chad works as a Senior Developer for Atozed Software.