The World Wide Web (WWW) is a repository of information linked together from points all over the world. The WWW project was initiated by CERN (European Laboratory for Particle Physics) to create a system to handle distributed resources necessary for scientific research.
Architecture of WWW
The WWW is a distributed client-server service, in which a client using a browser can access a service using a server. However, the service provided is distributed over many locations called sites. Each site holds one or more documents, referred to as Web pages. Each Web page can contain a link to other pages in the same site or at other sites.
The browser interprets and displays a Web page. Each browser usually consists of three parts:
- Controller: It receives input from the keyboard or the mouse and uses the client programs to access the document. The controller uses one of the interpreters to display the document on the screen.
- Client Protocol: The client protocol is the protocol used to retrieve the document, such as FTP or HTTP.
- Interpreters: Once the document has been retrieved, an interpreter suited to the type of document (for example, HTML) is used to display it on the screen.
The Web page is stored at the server. Each time a client request arrives, the corresponding document is sent to the client. To improve efficiency, servers normally store requested files in a cache in memory; memory is faster to access than disk. A server can answer more than one request at a time by using multithreading or multiprocessing.
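The caching and multithreading described above can be sketched as follows. This is a minimal illustration, not a real Web server: the `DISK` dictionary, the `read_from_disk` helper, and the request paths are all hypothetical stand-ins, and a thread pool is only one of several ways to serve requests concurrently.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "disk" contents; a real server would read files instead.
DISK = {
    "/index.html": "<html>home</html>",
    "/about.html": "<html>about</html>",
}

cache = {}                     # in-memory copies of documents already requested
cache_lock = threading.Lock()  # protects the cache when threads run concurrently
disk_reads = 0                 # counts how often the slow path is taken


def read_from_disk(path):
    """Stand-in for a slow disk read (assumed helper, not a real API)."""
    global disk_reads
    disk_reads += 1
    return DISK[path]


def handle_request(path):
    """Return the document for path, reading from disk only on a cache miss."""
    with cache_lock:
        if path not in cache:
            cache[path] = read_from_disk(path)
        return cache[path]


# Serve several requests at once with a small pool of worker threads.
requests = ["/index.html", "/about.html", "/index.html", "/index.html"]
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(handle_request, requests))

print(disk_reads)
```

Here four requests are answered, but each document is read from the simulated disk only once; the repeated requests for `/index.html` are served from memory.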
Uniform Resource Locator
A client that wants to access a Web page needs the address. To facilitate access to documents distributed throughout the world, HTTP uses locators. The uniform resource locator (URL) is a standard for specifying any kind of information on the Internet. The URL defines four things:
- Protocol: The protocol is the client/server program used to retrieve the document, for example HTTP or FTP.
- Host Computer: The host is the computer on which the information is located. The computers that store Web pages are often given alias names that begin with the characters “www”.
- Port: The port number of the server is optional. If the port is included, it is inserted between the host and the path, and it is separated from the host by a colon.
- Path: The path is the pathname of the file where the information is located. The path can itself contain slashes that, in the UNIX operating system, separate the directories from the subdirectories and files.
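The four parts of a URL can be seen by splitting one with Python's standard `urllib.parse` module. The URL below is a made-up example chosen only for illustration; it includes the optional port so that all four parts are visible.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only to illustrate the four parts.
url = "http://www.example.com:8080/docs/index.html"

parts = urlsplit(url)

protocol = parts.scheme   # the protocol, here "http"
host = parts.hostname     # the host computer, here "www.example.com"
port = parts.port         # the optional port, here 8080 (None if omitted)
path = parts.path         # the path, here "/docs/index.html"

print(protocol, host, port, path)
```

When the URL omits the port, `parts.port` is `None`, and the client falls back to the well-known port of the protocol.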
Web documents on the World Wide Web (WWW) can be broadly divided into two categories:
- Static Document: Static documents are those whose content cannot be changed by the client. These are fixed-content documents that are created and stored on a server. The client can get only a copy of the document. Static Web pages are generally written in HTML (Hypertext Markup Language). For example, the login page of Facebook.
- Dynamic Document: A dynamic document is created at runtime. When a request arrives, the Web server runs an application program or a script that creates the dynamic document. Because the content is generated for each request, you might see different content each time you request the document from your browser. For example, the timeline of your Facebook profile.
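The difference between the two categories can be sketched with two small functions. This is only an illustration: the function names, the page content, and the use of the server time as the changing data are assumptions, not part of any particular Web server.

```python
from datetime import datetime

# Static document: fixed content created ahead of time and stored on the server.
STATIC_PAGE = "<html><body>Welcome</body></html>"


def serve_static():
    """Return a copy of the stored document; every request gets the same content."""
    return STATIC_PAGE


def serve_dynamic(now=None):
    """Build the document when the request arrives; content depends on the time."""
    if now is None:
        now = datetime.now()
    return f"<html><body>Server time: {now.isoformat()}</body></html>"


print(serve_static())
print(serve_dynamic())
```

Calling `serve_static()` twice returns identical content, while `serve_dynamic()` produces a new document on each call.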
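The Hypertext Transfer Protocol (HTTP) is a protocol used mainly to access data on the World Wide Web. HTTP uses the services of TCP on well-known port 80. HTTP is a stateless protocol: the server does not keep information about earlier requests from a client, so each request must carry everything the server needs to answer it.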
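To make the request format concrete, the sketch below builds the text of a minimal HTTP/1.1 GET request. The helper name `build_get_request` and the host and path are hypothetical, and the example only constructs the message; it does not open a network connection.

```python
def build_get_request(host, path):
    """Return the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, path, protocol version
        f"Host: {host}",         # HTTP/1.1 requires the Host header
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical host and path, used only for illustration.
request = build_get_request("www.example.com", "/index.html")
print(request)
```

A client would send this text over a TCP connection to port 80 on the host. Because HTTP is stateless, a second request for another page would repeat the full request line and headers rather than rely on anything sent earlier.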
A proxy server is a server that sits between a client application (such as a Web browser) and a real server, and presents the client's requests to that server on the client's behalf. It acts as a server while talking with a client, and as a client while talking with a server. It can improve performance and filter requests to prevent users from accessing a specific set of Web sites.
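The two roles of a proxy can be sketched with a small class. This is a simplified model under stated assumptions: the `Proxy` class, the `fake_origin` function, and the host names are hypothetical, the real server is simulated by a plain function call rather than a network connection, and keeping copies of earlier responses is assumed here as one way a proxy can improve performance.

```python
class Proxy:
    """Toy proxy: filters blocked hosts and reuses earlier responses."""

    def __init__(self, origin_fetch, blocked_hosts):
        self.origin_fetch = origin_fetch        # how to reach the real server
        self.blocked_hosts = set(blocked_hosts)
        self.cache = {}                         # responses already fetched

    def handle(self, host, path):
        """Act as a server to the client and as a client to the real server."""
        if host in self.blocked_hosts:
            return "403 Forbidden"              # filtered: never forwarded
        key = (host, path)
        if key not in self.cache:
            self.cache[key] = self.origin_fetch(host, path)
        return self.cache[key]


calls = []  # records every request that actually reaches the real server


def fake_origin(host, path):
    """Stand-in for the real server (hypothetical)."""
    calls.append((host, path))
    return f"document {path} from {host}"


proxy = Proxy(fake_origin, blocked_hosts=["blocked.example"])
first = proxy.handle("www.example.com", "/index.html")
second = proxy.handle("www.example.com", "/index.html")
denied = proxy.handle("blocked.example", "/")
print(len(calls), denied)
```

The second request for the same page is answered from the proxy's own copy, and the request to the blocked host never reaches the real server.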