How the Internet Works

The key to how the Internet works lies in understanding what it is:

The Internet is a network of networks. It is not a single, easily defined object.

 

Introduction

The Internet is a global collection of networks, both big and small. These networks connect together in many different ways to form the single entity that we know as the Internet. In fact, the very name comes from this idea of interconnected networks. The Internet is the world's largest distributed system; it was designed and engineered for redundancy and resilience. The Internet is not a single company or a group of companies, nor even a single network. It is a worldwide mesh of hundreds of thousands of networks, owned and operated by hundreds of thousands of people in hundreds of countries, all interconnected by about 8,000 Internet Service Providers (ISPs). No single organization controls the Internet: not the U.N., not the biggest ISPs, and not the U.S. government, whose control the Internet has long since outgrown.

 

Internet Service Provider

When you, the user, look at a web page through the Internet, many things happen along the way. There are various ways to get from your house or office through the "last mile" to the Internet: dial-up modem, ISDN, DSL, cable modem, wireless, leased line, and so on. These access methods provide speeds anywhere from very slow (a few hundred bits per second) to very fast (billions of bits per second). All of them are on-ramps to the information superhighway.

 

History

The Internet has had a relatively brief but explosive history so far. It grew out of an experiment begun in the 1960s by the U.S. Department of Defense. The Department of Defense wanted to create a computer network that would continue to function in the event of a disaster, such as a nuclear war. If part of the network were damaged or destroyed, the rest of the system still had to work.

The theory behind the Internet first appeared in print in 1961 with Leonard Kleinrock’s paper on packet-switching theory, “Information Flow in Large Communication Nets.” This paper presented the theory behind the first problem of the Internet, and how to solve it. The problem was this: when a large document is sent as a single unit, pieces of it can be lost in transfer and the entire document has to be resent, only for different pieces to go missing from the new copy. The solution is to “chop” the information up into smaller pieces and then transmit the smaller pieces, so that only a lost piece has to be sent again.

The network that grew out of the Department of Defense experiment was ARPANET, which linked U.S. scientific and academic researchers. It was the forerunner of today's Internet. The open architecture developed for the ARPANET, which later became the Internet, included four key points that contributed to the success of the Internet.

  • Independent networks should not require any internal changes in order to be connected to the network.
  • Packets that do not arrive at their destination must be retransmitted from their source.
  • Routers do not retain information about the packets that they handle.
  • No global control exists over the network.

In 1985, the National Science Foundation (NSF) created NSFNET, a series of networks for research and education communication. Based on ARPANET protocols, the NSFNET created a national backbone service, provided free to any U.S. research and educational institution. At the same time, regional networks were created to link individual institutions with the national backbone service.

 NSFNET grew rapidly as people discovered its potential, and as new software applications were created to make access easier. Corporations such as Sprint and MCI began to build their own networks, which they linked to NSFNET. As commercial firms and other regional network providers have taken over the operation of the major Internet arteries, NSF has withdrawn from the backbone business.  

 

Packet Switching/Routing Technology

Rather than reserving a dedicated circuit for each conversation, the Internet employs a less expensive and more easily managed technique known as packet switching. Data is broken down into packets, labeled with their source and destination, which travel from computer to computer until they reach their destination. The destination computer collects the packets and reassembles the original data. The computers encountered en route are called routers, and they determine the best way to move each packet towards its destination.

Packet switching over the Internet has several benefits. A long stream of data is broken down into smaller, manageable chunks, allowing the small packets to be distributed over a number of paths in order to balance the traffic. In addition, it is inexpensive to replace a missing or damaged packet, because only that packet has to be sent again rather than the whole stream.
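The idea of labeling, scattering and reassembling packets can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the Packet fields, the 8-byte packet size and the host names "host-a" and "host-b" are made up for the example and do not correspond to any real packet format.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: str        # sender label (hypothetical)
    destination: str   # receiver label (hypothetical)
    seq: int           # position of this chunk in the original data
    total: int         # number of packets in the whole message
    payload: bytes


def packetize(data: bytes, source: str, destination: str, size: int = 8) -> list[Packet]:
    """Split data into fixed-size chunks, each labeled with source and destination."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [Packet(source, destination, seq, len(chunks), chunk)
            for seq, chunk in enumerate(chunks)]


def missing_seqs(received: list[Packet], total: int) -> list[int]:
    """Sequence numbers the destination still needs the source to resend."""
    seen = {p.seq for p in received}
    return [seq for seq in range(total) if seq not in seen]


def reassemble(received: list[Packet]) -> bytes:
    """Put the packets back in order, whatever order they arrived in."""
    return b"".join(p.payload for p in sorted(received, key=lambda p: p.seq))


message = b"The Internet is a network of networks."
packets = packetize(message, "host-a", "host-b")

# Simulate the network: one packet is lost and the rest arrive out of order.
arrived = packets[:3] + packets[4:]
random.shuffle(arrived)

lost = missing_seqs(arrived, packets[0].total)   # the destination notices the gap
arrived += [packets[seq] for seq in lost]        # the source resends only that packet
restored = reassemble(arrived)
```

Only the one missing packet is sent a second time, which is the saving the paragraph above describes.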

In order to transmit text or pictures, your data is chopped up into small packets which are routed through the Internet. But first they have to go from you to your local ISP, or the equivalent piece of the Internet inside your organization (an intranet). This local ISP is a possible point of failure. If something goes wrong at your local ISP, it may look to you like the Internet is broken. It's not. Only one small piece of it is broken. The rest of the Internet, with its portals and stock portfolios and shops and reams of scientific data and plethora of information and people on it will not break because one ISP does.

To reach a web server, your local ISP sends your packets of data to another ISP, which may send them to another ISP, or through an Internet Exchange Point (IX) or a Network Access Point (NAP) or Local Access Point (LAP) to get to another ISP. Thus your packets pass through a chain of ISPs and nodal points to reach their destination. Your packets may pass through fiber optic cables in the ground, satellites in the sky, undersea cables, or radio links. They may travel over links such as T-1 (1.544 Mbps) or T-3 (about 45 Mbps), or over faster or slower ones. The Internet Protocol (IP) ties all of those links together, enabling your packets to travel through the Internet.

Eventually your packets arrive at the web server, and the web server sends responses back along a similar path (almost definitely not the same one). Any of these Internet providers can have problems (congestion, broken link, power outage, broken computer, etc.), which may cause the web server to seem slow or unresponsive to you. But the web server is broken only if the web server is actually broken. Problems in intervening parts of the Internet do not break the web server, which may well be accessible to other people, and may become accessible to you as soon as the various Internet providers route your traffic around problems.

Much rerouting in the Internet is dynamic, and happens automatically. (Imagine you are driving up the California coast and come to a sign that says that there has been a mudslide. You drive inland, north on another road, perhaps rejoining the coastal highway again. You have changed your route dynamically.) Some rerouting isn't automatic. In particular, the biggest ISPs, frequently called backbones, cover vast geographical areas and carry large proportions of the Internet's traffic. A failure in a backbone or in one of the major interconnection points between them can affect many Internet users. And such a problem may take some time to be resolved, as the biggest ISPs often prefer to manually examine changes in major routes before implementing them.
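Dynamic rerouting can be pictured with a toy model. The sketch below assumes a made-up topology of five providers (all of the names are hypothetical) and uses a plain breadth-first search to find the route with the fewest hops. Real routers rely on routing protocols rather than this search, so it is an illustration of the idea, not of how ISPs actually compute routes.

```python
from collections import deque


def find_route(links, start, goal):
    """Breadth-first search: return the route with the fewest hops, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in sorted(links.get(node, ())):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


def without_link(links, a, b):
    """Copy of the topology with the link between a and b out of service."""
    return {node: {n for n in peers if {node, n} != {a, b}}
            for node, peers in links.items()}


# Hypothetical topology: each provider maps to the providers it connects to.
links = {
    "local-isp": {"backbone-1", "backbone-2"},
    "backbone-1": {"local-isp", "server-isp"},
    "backbone-2": {"local-isp", "exchange"},
    "exchange": {"backbone-2", "server-isp"},
    "server-isp": {"backbone-1", "exchange"},
}

normal = find_route(links, "local-isp", "server-isp")

# The "mudslide": the link from backbone-1 to the server's ISP fails.
broken = without_link(links, "backbone-1", "server-isp")
detour = find_route(broken, "local-isp", "server-isp")
```

With the link down, the search finds a longer path through the second backbone and the exchange point, just as the driver in the example takes the inland road.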

Internet providers use the same methods for routing packets for electronic mail or file transfers or remote login or voice or video. People tend to be quicker to notice slowness in accessing web pages, so we have used accessing a web server as an example.

The TCP/IP Protocol Suite

Internet protocols were first developed in the mid-1970s, when the Defense Advanced Research Projects Agency (DARPA) became interested in establishing a packet-switched network that would facilitate communication between dissimilar computer systems at research institutions. TCP/IP was first proposed in 1973, but it was not until 1983 that the first standardized version was developed and adopted for wide-area use. TCP/IP was later included with Berkeley Software Distribution (BSD) UNIX and has since become the foundation on which the Internet and the World Wide Web (WWW) are based.

TCP/IP uses a large but limited pool of addresses, which are organized into classes; a setting called the subnet mask lets an administrator divide a given network into separate segments. TCP/IP is one of the most important elements of Internet technology and is the element that makes intranets so easy to set up and use. TCP/IP is actually a whole family of protocols, which provides the foundation of the Internet. It takes its name from TCP, the Transmission Control Protocol, and IP, the Internet Protocol, and it is the first thing that must be set up on your workstations before they can connect to the Internet or communicate with one another.

TCP controls the division of a message into smaller packets before transmission over the Internet and the reassembly of those packets at their destination. IP controls the routing of the packets across the Internet. IP is the language that computers use to communicate over the Internet. A protocol is the pre-defined way that someone who wants to use a service talks with that service. The "someone" could be a person, but more often it is a computer program like a Web browser.

The Internet Protocol (IP) is a network-layer protocol that contains addressing information and some control information that enables packets to be routed. IP is the primary network-layer protocol in the Internet protocol suite. Along with the Transmission Control Protocol (TCP), IP represents the heart of the Internet protocols. IP has two primary responsibilities: providing connectionless, best-effort delivery of datagrams through an internetwork; and providing fragmentation and reassembly of datagrams to support data links with different maximum transmission unit (MTU) sizes.
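Fragmentation against an MTU can be sketched as follows. This is a simplified model, not a real IP implementation: it assumes a fixed 20-byte header (the minimum IPv4 header, without options), records only an offset and a "more fragments" flag for each piece, and uses 576 bytes as a hypothetical link MTU. As in IPv4, offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 bytes.

```python
IP_HEADER_BYTES = 20  # assumed header size: minimum IPv4 header, no options


def fragment(payload: bytes, mtu: int) -> list[dict]:
    """Split a datagram payload into fragments that fit within the MTU."""
    # Each fragment carries its own header, and its data length is rounded
    # down to a multiple of 8 because offsets are counted in 8-byte units.
    max_data = (mtu - IP_HEADER_BYTES) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    for start in range(0, len(payload), max_data):
        fragments.append({
            "offset": start // 8,
            "more_fragments": start + max_data < len(payload),
            "data": payload[start:start + max_data],
        })
    return fragments


def reassemble_fragments(fragments: list[dict]) -> bytes:
    """Rebuild the payload by ordering the fragments by offset."""
    ordered = sorted(fragments, key=lambda f: f["offset"])
    return b"".join(f["data"] for f in ordered)


payload = bytes(range(256)) * 5            # 1280 bytes of sample data
fragments = fragment(payload, mtu=576)     # hypothetical link MTU
rebuilt = reassemble_fragments(list(reversed(fragments)))
```

The receiving side can rebuild the datagram even when fragments arrive in a different order, because each one records where its data belongs.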

The Internet protocols are the world's most popular open-system (nonproprietary) protocol suite because they can be used to communicate across any set of interconnected networks and are equally well suited for LAN and WAN communications. The Internet protocol suite not only includes lower-layer protocols (such as TCP and IP), but it also specifies common applications such as electronic mail, terminal emulation, and file transfer.

As with any other network-layer protocol, the IP addressing scheme is integral to the process of routing IP datagrams through an internetwork. Each IP address has specific components and follows a basic format. These IP addresses can be subdivided and used to create addresses for subnetworks.
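A small example may make the subdivision concrete. The sketch below uses Python's standard ipaddress module. The network block 216.27.61.0/24 and the decision to split it into four subnetworks are assumptions chosen for illustration; only the sample host address comes from this paper.

```python
import ipaddress

# Assumed network block for the example (not assigned by this paper).
network = ipaddress.ip_network("216.27.61.0/24")
mask = network.netmask                      # the subnet mask of the whole block

# Borrow two host bits to create four smaller subnetworks.
subnets = list(network.subnets(prefixlen_diff=2))

# Find the subnetwork that contains a given host.
host = ipaddress.ip_address("216.27.61.137")
home = next(subnet for subnet in subnets if host in subnet)

# The same answer by hand: AND the address with the subnet mask.
network_part = int(host) & int(home.netmask)

for subnet in subnets:
    print(subnet, "mask", subnet.netmask)
```

Applying the mask with a bitwise AND keeps the network bits and clears the host bits, which is how a machine decides whether another address is on its own subnetwork.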

IP Address and Addressing

Each host on a TCP/IP network is assigned a unique 32-bit logical address that is divided into two main parts: the network number and the host number. The network number identifies a network and must be assigned by the Internet Network Information Center (InterNIC) if the network is to be part of the Internet. An Internet Service Provider (ISP) can obtain blocks of network addresses from the InterNIC and can itself assign address space as necessary. The host number identifies a host on a network and is assigned by the local network administrator.

An IP address consists of 32 bits, grouped into four octets.


A typical IP address looks like this: 216.27.61.137

 Every machine on the Internet has an IP Address. To make it easier for us to remember, IP addresses are normally expressed in decimal format as a dotted decimal number like the one above. But computers communicate in binary form. Look at the same IP address in binary:

 11011000.00011011.00111101.10001001
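The conversion between the two notations is simple arithmetic, and can be sketched in Python as follows. The function names are made up for the example.

```python
def to_binary(address: str) -> str:
    """Write each octet of a dotted decimal address as eight binary digits."""
    return ".".join(f"{int(octet):08b}" for octet in address.split("."))


def to_int(address: str) -> int:
    """Treat the four octets as one 32-bit number."""
    value = 0
    for octet in address.split("."):
        value = value * 256 + int(octet)
    return value


def to_dotted(value: int) -> str:
    """Turn a 32-bit number back into dotted decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


binary = to_binary("216.27.61.137")
number = to_int("216.27.61.137")
round_trip = to_dotted(number)

print(binary)       # 11011000.00011011.00111101.10001001
print(2 ** 32)      # 4294967296 possible 32-bit values
```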

The four numbers in an IP address are called octets, because they each have eight positions when viewed in binary form. If you add all the positions together, you get 32, which is why IP addresses are considered 32-bit numbers. Since each of the eight positions can have two different states (1 or 0), the total number of possible combinations per octet is 256. So each octet can contain any value between 0 and 255. Combining the four octets gives a total of 4,294,967,296 possible unique values.

The octets serve a purpose other than simply separating the numbers. They are used to create classes of IP addresses that can be assigned to a particular business, government or other entity based on size and need. The octets are split into two sections: Net and Host. The Net section always contains the first octet. It is used to identify the network that a computer belongs to. Host (sometimes referred to as Node) identifies the actual computer on the network. The Host section always contains the last octet.

Out of the almost 4.3 billion possible combinations, certain values are restricted from use as typical IP addresses. For example, the IP address 0.0.0.0 is reserved for the default network and the address 255.255.255.255 is used for broadcasts.
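The Net and Host split can be sketched as a small function. The class boundaries used here (first octet below 128 for class A, below 192 for class B, below 224 for class C) are the traditional classful ranges, which this paper does not list, so treat them as background brought in for the example. The function name and the returned dictionary layout are hypothetical.

```python
def classify(address: str) -> dict:
    """Split an address into Net and Host octets under classful addressing."""
    if address == "0.0.0.0":
        return {"kind": "default network"}
    if address == "255.255.255.255":
        return {"kind": "broadcast"}

    octets = [int(part) for part in address.split(".")]
    first = octets[0]
    if first < 128:
        name, net_octets = "A", 1      # large networks: 1 Net octet, 3 Host octets
    elif first < 192:
        name, net_octets = "B", 2      # medium networks: 2 Net octets, 2 Host octets
    elif first < 224:
        name, net_octets = "C", 3      # small networks: 3 Net octets, 1 Host octet
    else:
        return {"kind": "class D or E (not assigned to ordinary hosts)"}

    return {
        "kind": "class " + name,
        "net": octets[:net_octets],
        "host": octets[net_octets:],
    }


sample = classify("216.27.61.137")
print(sample)
```

In every class the Net section includes the first octet and the Host section includes the last one, as described above.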

Domain Name Server

When the Internet was in its infancy, it consisted of a small number of computers hooked together with modems and telephone lines. You could only make connections by providing the IP address of the computer you wanted to establish a link with. For example, a typical IP address might be 216.27.22.162. This was fine when there were only a few hosts out there, but it became unwieldy as more and more systems came online.

The first solution to the problem was a simple text file maintained by the Network Information Center that mapped names to IP addresses. Soon this text file became so large it was too cumbersome to manage. In 1983, Paul Mockapetris at the University of Southern California's Information Sciences Institute designed the Domain Name System (DNS), which maps text names to IP addresses automatically. Human-readable names like "www.csi.cuny.edu" are easy for people to remember, but they don't do machines any good. All of the machines use IP addresses to refer to one another. For example, the machine that humans refer to as "www.csi.cuny.edu" has the IP address 163.38.52.223. Every time you use a domain name, you use the Internet's domain name servers to translate the human-readable domain name into the machine-readable IP address. During a day of browsing and e-mailing, you might access the domain name servers hundreds of times!
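The early text-file approach is easy to picture in code. The sketch below assumes a file layout of one address followed by one or more names per line, with "#" starting a comment; that layout, the function names and the second entry are assumptions for illustration. The first entry is the example from this paper.

```python
# Assumed layout: address, then names; "#" starts a comment.
HOSTS_TEXT = """\
# name-to-address table in the style of the early shared text file
163.38.52.223   www.csi.cuny.edu
216.27.61.137   sample-host.test     # hypothetical name for the sample address
"""


def parse_hosts(text: str) -> dict:
    """Build a name-to-address table from the text file's contents."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address      # names are not case-sensitive
    return table


def resolve(name: str, table: dict):
    """Translate a human-readable name into an IP address, or None if unknown."""
    return table.get(name.lower())


table = parse_hosts(HOSTS_TEXT)
print(resolve("www.csi.cuny.edu", table))
```

Every machine needed an up-to-date copy of this one file, which is why the approach stopped scaling. To ask the real DNS from Python, the standard library call socket.gethostbyname(name) hands the question to the system's resolver.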

Domain name servers translate domain names to IP addresses. That sounds like a simple task, and it would be -- except for five things:

  • There are billions of IP addresses currently in use, and most machines have a human-readable name as well.
  • There are many billions of requests made to domain name servers every day. A single person can easily make a hundred or more DNS requests a day, and there are hundreds of millions of people and machines using the Internet every day.
  • Domain names and IP addresses change daily.
  • New domain names get created daily.
  • Millions of people do the work to change and add domain names and IP addresses every day.

The DNS is a database, and no other database on the planet gets this many requests. No other database on the planet has millions of people changing it every day, either. That is what makes the DNS unique.

Internet Applications

Electronic Mail (E-mail)

E-mail has been around since the early 1970s and it is an important component of e-commerce. Organizations use it to confirm orders, confirm shipment of goods, and generally maintain communications with their customers. It is also used when software is downloaded from the web to send the customer an electronic key which unlocks the software or turns a demonstration version into a fully-featured product.

 

The real e-mail system consists of two different servers running on a server machine. One is called the SMTP Server, where SMTP stands for Simple Mail Transfer Protocol. The SMTP server handles outgoing mail. The other is a POP3 Server, where POP stands for Post Office Protocol. The POP3 server handles incoming mail. The SMTP server listens on well-known port number 25, while POP3 listens on port 110. Whenever you send a piece of e-mail, your e-mail client interacts with the SMTP server to handle the sending. The SMTP server on your host may have conversations with other SMTP servers to actually deliver the e-mail.

                                                  

 

Figure: A typical e-mail server
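The sending half of this arrangement can be sketched with Python's standard smtplib and email modules. The addresses and the server name "mail.example.com" are placeholders, and the actual send is left commented out because it needs a reachable SMTP server; this is a minimal sketch rather than a complete mail client.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Compose a plain-text e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg: EmailMessage, smtp_host: str, port: int = 25) -> None:
    """Hand the message to an SMTP server listening on the given port."""
    with smtplib.SMTP(smtp_host, port) as server:
        server.send_message(msg)


# Placeholder addresses for an order confirmation.
message = build_message(
    "orders@example.com",
    "customer@example.com",
    "Order confirmed",
    "Your order has shipped.",
)

# send(message, "mail.example.com")   # placeholder host; requires a real SMTP server
```

The client only talks to its own SMTP server on port 25; that server then passes the message along to the recipient's server, where it waits to be collected over POP3.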

 

FTP

FTP (File Transfer Protocol) is a long-established way to deliver files from one computer to another. It is often used for the sale and delivery of software packages and updates. Anonymous FTP is used to access academic and commercial sites to download files without an account. With a username and password, FTP is used to access academic or corporate sites in a more privileged way.

Telnet

Telnet allows you to log in to a remote computer attached to the Internet. It allows you to access the computer as if you were a locally connected user. Most Windows-based telnet clients emulate a few terminals, such as the industry-standard VT-52 and VT-100.

Internet Future

Every year it gets easier to connect computers to a network. New operating systems are making a point of including easy Internet integration, and many new computers come with built-in Ethernet support. In the not-so-distant future, connecting a computer to the Internet will be as simple as plugging a telephone into the wall. The Internet is, or will be, a high-bandwidth connection to any computer in the world. There are many, many possibilities as to what can be done with a network this ubiquitous.

 

 

 
