Introduction to HTTP

Derek Bridge

Department of Computer Science,
University College Cork

Aims:

What makes the web tick

[Diagram showing the main components of the web.] Diagram of web components from The TCP/IP Guide by Charles M. Kozierok

Web servers

Web server computers

The market leaders (www.netcraft.com):

Both free!

Uniform Resource Locators (URLs)

[Illustration of the three main parts of a URL: scheme, hostname, pathname.]
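The three parts named in the illustration can be pulled out of a URL programmatically. The sketch below uses Python's standard urllib.parse module and the example URL that appears later in these notes; the helper name url_parts is made up for illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the (scheme, hostname, pathname) of an absolute URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.hostname, parts.path


print(url_parts("http://www.cs.ucc.ie/cs1/mugshots-2007.html"))
# ('http', 'www.cs.ucc.ie', '/cs1/mugshots-2007.html')
```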

Schemes

Schemes: HTTPS

Hostnames

Pathnames

[A hierarchy of directories/folders.]

Absolute URLs

Relative URLs

Class exercise

[A hierarchy of directories/folders for the exercise.] The current document is highlighted.

  1. Give the absolute pathname for each of the following:
    1. The current document
    2. Its base directory/folder
    3. b.html
    4. ../a.html
    5. ../../dirB/a.html
    6. ../../dirA/dirA/a.html
  2. Give the relative pathname for b.gif
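Answers to exercises of this kind can be checked mechanically. The sketch below resolves relative pathnames against the URL of a current document using urllib.parse.urljoin. The hostname www.example.org, the directory names and the helper resolve_path are all hypothetical; the hierarchy is deliberately not the one in the exercise.

```python
from urllib.parse import urljoin, urlsplit


def resolve_path(base_url, relative):
    """Resolve a relative reference against base_url and return the absolute pathname."""
    return urlsplit(urljoin(base_url, relative)).path


# Hypothetical current document (not the one in the exercise figure).
BASE = "http://www.example.org/dirA/dirB/page.html"

for ref in ("b.html", "../a.html", "../../dirC/a.html"):
    print(ref, "->", resolve_path(BASE, ref))
# b.html -> /dirA/dirB/b.html
# ../a.html -> /dirA/a.html
# ../../dirC/a.html -> /dirC/a.html
```

Each ".." step moves up one directory from the base directory of the current document, which is why "../../dirC/a.html" ends up directly under the root here.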

The HyperText Transfer Protocol (HTTP)

[Diagram showing HTTP requests and responses.]
  • HTTP uses the client-server model
  • Servers listen on port 80 by default
  • Transmission is (mostly) by TCP
Diagram of HTTP requests and responses from The TCP/IP Guide by Charles M. Kozierok

Requests and responses in more detail

  • The user generates an HTTP request
    • by typing the URL or by clicking on a link
  • The client (browser)
    • uses DNS to map the server hostname to its IP address (if necessary)
    • establishes a TCP 'connection' with the server
    • creates an HTTP request and sends it (using TCP)
  • The web server
    • receives the request
    • takes action (e.g. locates the requested file, if it can)
    • creates an HTTP response and sends it (using TCP)
  • The browser
    • receives the response
    • takes action (e.g. displays the web page)
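The client's side of the steps above can be sketched with Python's standard socket module. This is a minimal illustration, not how a real browser is written: it sends an HTTP/1.0 request so that the server closes the connection when the response is complete, and the function names build_get_request and fetch are invented for the example.

```python
import socket


def build_get_request(host, path):
    """Create the bytes of a minimal HTTP/1.0 GET request."""
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def fetch(host, path, port=80):
    """Carry out the client's steps and return the raw response bytes."""
    # Use DNS to map the server hostname to its IP address.
    ip_address = socket.gethostbyname(host)
    # Establish a TCP connection with the server.
    with socket.create_connection((ip_address, port), timeout=10) as conn:
        # Create the HTTP request and send it over TCP.
        conn.sendall(build_get_request(host, path))
        # Receive the response: read until the server closes the connection.
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling fetch("www.example.org", "/") needs network access, so it is not run here. The returned bytes would still have to be split into status line, headers and body before a browser could take action on them.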

Embedded content

  • Suppose the web page contains embedded content (e.g. stylesheets, images)
  • The server does not send all the content in one go
  • The client receives the web page and then sends separate requests for the embedded content
  • Example 1: http://www.cs.ucc.ie/cs1/mugshots-2007.html
  • Example 2
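To see why a single page leads to several requests, the sketch below scans a page for embedded content and lists the URLs the client would then have to request separately. It uses the standard html.parser module; the sample page, the hostname www.example.org and the class name EmbeddedContentFinder are made up for illustration, and only images, scripts and stylesheets are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class EmbeddedContentFinder(HTMLParser):
    """Collect the URLs of stylesheets, images and scripts embedded in a page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("rel") == "stylesheet" and attrs.get("href"):
            self.urls.append(urljoin(self.page_url, attrs["href"]))


def embedded_urls(page_url, html):
    """Return absolute URLs for the embedded content found in html."""
    finder = EmbeddedContentFinder(page_url)
    finder.feed(html)
    return finder.urls


# Hypothetical page, used only to exercise the parser.
PAGE = """<html><head><link rel="stylesheet" href="style.css"></head>
<body><img src="images/a.jpg"><img src="/logo.gif"></body></html>"""

for url in embedded_urls("http://www.example.org/cs1/page.html", PAGE):
    print("separate request for", url)
```

For this sample page the client would send one request for the page itself and then three more, one for the stylesheet and one for each image.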

HTTP requests

  • Request line (required): command (method), URL and HTTP version number
  • Request header lines (largely optional): info about date, browser,...
  • Request message body (optional): empty for most commands (methods)
[Example of an HTTP request.] Example HTTP request from The TCP/IP Guide by Charles M. Kozierok
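The three parts of a request can be made concrete by taking a raw request apart. The sketch below is a simplified parser written for these notes (parse_request is not a library function) and the sample request, including the User-Agent value, is invented; it assumes a well-formed request with one header per line.

```python
def parse_request(raw):
    """Split a raw HTTP request into request line, header lines and body."""
    # A blank line separates the headers from the (possibly empty) body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, url, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {"method": method, "url": url, "version": version,
            "headers": headers, "body": body}


# Hypothetical request for illustration.
RAW_REQUEST = (
    "GET /cs1/mugshots-2007.html HTTP/1.1\r\n"
    "Host: www.cs.ucc.ie\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "\r\n"
)

request = parse_request(RAW_REQUEST)
print(request["method"], request["url"], request["version"])
print(request["headers"])
print(repr(request["body"]))  # empty for a GET request
```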

HTTP request commands (methods)

  • GET: retrieve a file (95% of requests)
  • HEAD: just retrieve header information for a file
  • POST: submit data to a server

Other

  • PUT: store enclosed document on server
  • DELETE: remove the named resource from server
  • LINK/UNLINK: in HTTP 1.0, gone in HTTP 1.1
  • TRACE: HTTP 'echo' for debugging (added in 1.1)
  • CONNECT: used by proxies for tunneling (1.1)
  • OPTIONS: request for server/proxy options (1.1)
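The difference between GET and HEAD can be observed directly. The sketch below starts a throwaway web server on the local machine (an assumption made so that the example needs no external network), sends it one GET and one HEAD request with the standard http.client module, and compares the results. DemoHandler, PAGE and compare_get_and_head are names invented for the example.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"  # hypothetical page content


class DemoHandler(BaseHTTPRequestHandler):
    """Serve the same page for every path; answer GET and HEAD only."""

    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()

    def do_GET(self):
        self._send_headers()
        self.wfile.write(PAGE)  # headers followed by the resource

    def do_HEAD(self):
        self._send_headers()    # headers only, no message body

    def log_message(self, format, *args):
        pass  # silence per-request logging


def compare_get_and_head():
    """Return {method: (status, Content-Length header, body)} for GET and HEAD."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    results = {}
    try:
        for method in ("GET", "HEAD"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
            conn.request(method, "/")
            response = conn.getresponse()
            results[method] = (response.status,
                               response.getheader("Content-Length"),
                               response.read())
            conn.close()
    finally:
        server.shutdown()
        server.server_close()
    return results


for method, (status, length, body) in compare_get_and_head().items():
    print(method, status, "Content-Length:", length, "body bytes received:", len(body))
```

Both responses carry the same status code and headers; only the GET response carries the page itself.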

HTTP responses

  • Status line (required): HTTP version number, status code, short explanation of code
  • Response header lines (largely optional): info about date, server,...
  • Response message body (optional): the requested resource (web page, image,...); absent in, e.g., responses to HEAD requests and 304 Not Modified responses
[Example of an HTTP response.] Example HTTP response from The TCP/IP Guide by Charles M. Kozierok
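A response has the same overall shape as a request, which a small parser makes visible. The sketch below is a simplified illustration (parse_response is not a library function and the sample response is invented); it assumes a well-formed response and keeps the body as raw bytes, since it may be an image rather than text.

```python
def parse_response(raw):
    """Split raw response bytes into status line parts, headers and body."""
    # A blank line separates the headers from the message body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""  # the short explanation may be missing
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return version, code, reason, headers, body


# Hypothetical response for illustration.
RAW_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<p>Hello</p>\n"
)

version, code, reason, headers, body = parse_response(RAW_RESPONSE)
print(version, code, reason)
print(headers)
print(body)
```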

HTTP response status codes

  • 1XX: Informational (used in 1.1), e.g. 100 Continue, 101 Switching Protocols
  • 2XX: Success, e.g. 200 OK, 206 Partial Content
  • 3XX: Redirection, e.g. 301 Moved Permanently, 304 Not Modified
  • 4XX: Client error, e.g. 400 Bad Request, 403 Forbidden, 404 Not Found
  • 5XX: Server error, e.g. 500 Internal Server Error, 503 Service Unavailable
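Because the first digit of a status code gives its class, a client can classify any code with a single integer division. The sketch below does this and looks up the standard short explanation using Python's http.HTTPStatus; the CLASSES table and the describe helper are written for these notes.

```python
from http import HTTPStatus

# The five classes, keyed by the first digit of the status code.
CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def describe(code):
    """Return 'code phrase: class' for an HTTP status code."""
    status_class = CLASSES.get(code // 100, "Unknown class")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "(unrecognised code)"
    return f"{code} {phrase}: {status_class}"


for code in (100, 200, 301, 304, 404, 503):
    print(describe(code))
```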