Web Development

Dr Derek Bridge
School of Computer Science & Information Technology
University College Cork

Lecture Objectives

  • learn the main parts of a URL: scheme, host, port number, pathname
  • learn the difference between absolute pathnames and relative pathnames

URLs

In HTML, we use URLs in several places:
hyperlinks, image elements,
link elements for favicons (also for stylesheets),
video and audio source elements, …


<a href="http://www.example.org/animals/wombats.html">wombats</a>
    

<img src="sea_photo.jpg" alt="..." />
    

<link rel="icon" href="favicon.ico" />
    

<video controls>
   <source src="sea_video.mp4" type="video/mp4" />
</video>
    

We will use them in CSS too for background images and for fonts.
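To see where the URLs sit in the snippets above, here is a minimal sketch in Python that pulls out every href and src value. The names UrlCollector, collect_urls and SNIPPET are invented for this illustration; only the standard-library html.parser module is assumed.

```python
from html.parser import HTMLParser


class UrlCollector(HTMLParser):
    """Collect the values of href and src attributes, in document order."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare
        # attributes such as "controls".
        for name, value in attrs:
            if name in ("href", "src") and value is not None:
                self.urls.append(value)


def collect_urls(html):
    """Return every href/src value found in the given HTML text."""
    collector = UrlCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


# The four snippets from the slide, joined into one string.
SNIPPET = """
<a href="http://www.example.org/animals/wombats.html">wombats</a>
<img src="sea_photo.jpg" alt="..." />
<link rel="icon" href="favicon.ico" />
<video controls>
   <source src="sea_video.mp4" type="video/mp4" />
</video>
"""

print(collect_urls(SNIPPET))
```

Only the first URL names a scheme and a host; the other three are relative and give just a filename.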

URLs

In general, URLs consist of six parts:
scheme://host:port/pathname?query#fragment

Most are optional; several have defaults.

E.g. http://www.example.org/animals/wombats.html
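As a rough illustration, Python's standard urllib.parse module can split a URL into these six parts. The helper name url_parts and the second URL (with a port, query and fragment) are made up for this sketch.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Split a URL into the six parts named on the slide."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,        # None when the URL does not give one
        "pathname": parts.path,
        "query": parts.query,      # empty string when absent
        "fragment": parts.fragment,
    }


# The example from the slide: no port, query or fragment is given.
print(url_parts("http://www.example.org/animals/wombats.html"))

# A hypothetical URL that uses all six parts.
print(url_parts("https://www.example.org:8080/animals/search?name=wombat#diet"))
```

The first call shows the optional parts coming back as None or as empty strings.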

The Scheme

Let's start by looking at the scheme
scheme://host:port/pathname?query#fragment

The Scheme

The scheme tells your browser what action to take when, e.g., a user clicks on the link.

http (the default):
tells browser to send out a request using HTTP.

https:
same as http but HTTP request and response are encrypted.

Other Schemes: file

<a href="file:///C:/Users/derek/mypage.html">My page</a>
tells browser to load web page directly from user's local disk.

Never use this kind of URL in a web page: it refers to a file on the visitor's own disk, so the link breaks for everyone but you.

Other Schemes: mailto

<a href="mailto:s.murphy@cs.ucc.ie">Mail me</a>
tells browser to launch an email program.

Be very wary of using this.

Other Schemes: chrome

chrome://version
tells Chrome browsers to display 'internal' information.

chrome://settings
tells Chrome browsers to display your 'settings' information.

The Host

Next up, the host:
scheme://host:port/pathname?query#fragment

The Host: IP Address

Every device that is connected to the Internet is assigned a unique IP address.

In IPv6, IP addresses are 128 bits long,
e.g. in hexadecimal:
2001:0db8:85a3:0000:0000:8a2e:0370:7334

In a URL, the host specifies the IP address of the server that will receive your request.
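A quick way to check the 128-bit claim is Python's standard ipaddress module. This sketch parses the example address above and also prints its shorter, equivalent written form.

```python
import ipaddress

# The example address from the slide.
address = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

print(address.version)        # IP version of this address
print(address.max_prefixlen)  # number of bits in an address of this version
print(address.compressed)     # shorthand: leading zeros and one zero run dropped
print(address.exploded)       # the full eight-group form
```

The compressed and exploded forms are two spellings of the same 128-bit number.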

The Host: Hostname

Numeric IP addresses are cumbersome for humans.
Hence, most computers (hosts) that are connected to the Internet also have one (or more) names (hostnames).
e.g.
www.cs.ucc.ie
cs1.ucc.ie
www.rte.ie

In a URL, you can give the hostname instead of its IP address.

Domain Names

A hostname comprises
name of the computer + domain name, e.g.:
www.cs.ucc.ie

The domain name is one or more labels that place the computer in a hierarchy.
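Following the slide's simplified view (first label is the computer, the rest is the domain name), a hostname can be taken apart like this. The function names are invented for this sketch.

```python
def split_hostname(hostname):
    """Split a hostname into (computer name, domain name) at the first dot."""
    computer, _, domain = hostname.partition(".")
    return computer, domain


def domain_hierarchy(domain):
    """List the labels of a domain name from the top of the hierarchy down."""
    return domain.split(".")[::-1]


computer, domain = split_hostname("www.cs.ucc.ie")
print(computer)                  # the computer's name
print(domain)                    # the domain name
print(domain_hierarchy(domain))  # labels, most general first
```

Reading the labels right to left walks down the hierarchy: ie, then ucc, then cs.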

The Domain Name System (DNS)

Your browser asks DNS to convert the hostname to its IP address.

Your browser sends a request to a DNS server, which sends back the IP address of the Web server. So now the browser can send a request to the Web server.
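The lookup step can be tried from Python, which hands the question to the operating system's resolver (and so, usually, to a DNS server). This is only a sketch: lookup_addresses is a made-up helper, and the addresses returned for a real hostname depend on your network.

```python
import socket


def lookup_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses the resolver reports."""
    results = socket.getaddrinfo(hostname, None)
    # Each result is (family, type, proto, canonname, sockaddr); the address
    # string is the first item of sockaddr for both IPv4 and IPv6.
    return sorted({result[4][0] for result in results})
```

For example, lookup_addresses("www.example.org") would return whatever addresses are currently registered for that name. Passing a numeric address simply returns it unchanged, since there is nothing to look up.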

The Port Number

Now it's the turn of the port number:
scheme://host:port/pathname?query#fragment

The Port Number

One server might be offering ('hosting') more than one service.
E.g. a server computer may be running web server software and email server software.

How does a client indicate which server program is to handle its requests?

The Port Number

Each server program is assigned an identification number, called a port number.

A client request includes the port number of the server program that should respond.

Listening on Ports

Suppose a machine is running Web server software and email server software. Then requests from clients must indicate which server software should handle the request and deliver the response. This is done by including a number, called a port, with the request: 80 for the Web (HTTP), and 143 for email (IMAP).
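When a URL leaves out the port, the browser falls back on the default for the scheme. A minimal sketch of that rule, covering only http (80) and https (443); the names DEFAULT_PORTS and effective_port are invented here.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this lecture.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a request for this URL would be sent to."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port                  # the URL names a port explicitly
    return DEFAULT_PORTS.get(parts.scheme)  # None for schemes not listed above


print(effective_port("http://www.example.org/animals/wombats.html"))
print(effective_port("https://www.example.org/"))
print(effective_port("http://www.example.org:8080/"))
```

The :8080 URL is hypothetical; it shows an explicit port overriding the default.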

Well-Known Port Numbers

The Internet Assigned Numbers Authority (IANA) oversees the use of port numbers.

The Pathname

Onto the pathname:
scheme://host:port/pathname?query#fragment

The Pathname

The pathname says which resource we want
(usually by giving its filename)
and where it is on the server.

Unix File Systems

  • Directories (what Windows calls folders):
    • can contain other directories;
    • can contain files.
  • The topmost directory (the root) is known as /
  • The directory that contains your personal files, the one you log into, is your home directory.

Example

A file system. / at the top with two subdirectories, var and users. var has a subdirectory called www. www has two subdirectories, html and dogs. dogs contains a file, poodles.html. html contains a file, index.html, plus three subdirectories, errors (containing bad_urls.html), fish (containing guppies.html and catfish.html) and cats (containing index.html, persian.html and siamese.html).

File and Directory Names

  • Avoid spaces.
  • Avoid punctuation symbols except underscore, _
  • For files, use a proper filename extension, e.g.:
    .html .css .gif .jpg .png .js
  • Unix is case-sensitive, so best to consistently use lowercase.
  • Be descriptive but keep them short.

Unix Permissions

Owner: the person who owns the file/directory
Group: users who belong to the group associated with the file/directory
Other: all users who can log in to the system

Unix Permissions

  • Read:
    • file: view the contents of the file
    • directory: list the names of the files in the directory
  • Write:
    • file: modify the contents of the file (deleting it is governed by write permission on its directory)
    • directory: add files to the directory, remove files from it, rename files in it
  • Execute:
    • file: run the file (if it is a program)
    • directory: traverse the directory and, if you also have read permission, list the files, view their contents, view their permissions, …
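Unix stores these nine yes/no permissions (read, write, execute for each of owner, group, other) as bits, usually written as three octal digits. A small sketch that decodes them using the constants in Python's standard stat module; permission_summary is a made-up helper.

```python
import stat

# The read, write and execute bits for each class of user.
CLASSES = {
    "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
}


def permission_summary(mode):
    """Turn permission bits such as 0o754 into 'rwx'-style strings."""
    summary = {}
    for name, bits in CLASSES.items():
        # Show the letter when the bit is set, otherwise a dash.
        summary[name] = "".join(
            letter if mode & bit else "-" for letter, bit in zip("rwx", bits)
        )
    return summary


print(permission_summary(0o754))
print(permission_summary(0o644))
```

The modes 0o754 and 0o644 are just sample values: the first gives the owner everything, the group read and execute, and others read only.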

Pathnames

  • A pathname is a list of directories that you must 'travel' through to get to where you want to go (separated by /).
  • An absolute pathname is one that begins at the root.
    • They start with /
  • A relative pathname is one that begins in the current working directory.
    • They do not start with /
    • To go up a level, use ..
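The two rules (leading / or not) are easy to check mechanically. A minimal sketch using Python's standard posixpath module, which follows Unix pathname conventions; describe_pathname is a name invented for this example.

```python
import posixpath


def describe_pathname(pathname):
    """Say whether a pathname is absolute or relative and list its steps."""
    kind = "absolute" if posixpath.isabs(pathname) else "relative"
    # Split on / and drop the empty piece that precedes a leading slash.
    steps = [step for step in pathname.split("/") if step]
    return kind, steps


print(describe_pathname("/var/www/dogs"))
print(describe_pathname("../cats/siamese.html"))
```

Each step is a directory to travel through (or .. to go up a level), ending at the file or directory you want.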

Absolute Pathnames

The absolute pathname for the directory called dogs is /var/www/dogs

The same filesystem as before.

Absolute Pathnames

The absolute pathname for the file called guppies.html is /var/www/html/fish/guppies.html

The same filesystem as before.

Relative Pathnames

Assume the current working directory is fish.
The relative pathname for the file called siamese.html is ../cats/siamese.html

The same filesystem as before.

Relative Pathnames

Again, assume the current working directory is fish.
The relative pathname for the directory called users is ../../../../users

The same filesystem as before.

Relative Pathnames

This time, assume the current working directory is html.
The relative pathname for the file called index.html (the one in html) is index.html

The same filesystem as before.

Relative Pathnames

Again, assume the current working directory is html.
The relative pathname for the other file called index.html (the one in cats) is cats/index.html

The same filesystem as before.
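The four relative-pathname examples can be checked with Python's standard posixpath module. This is a sketch: resolve is a made-up helper, and the working directories are written out as absolute pathnames in the example filesystem.

```python
import posixpath


def resolve(cwd, relative):
    """Follow a relative pathname from cwd and return the absolute pathname."""
    return posixpath.normpath(posixpath.join(cwd, relative))


FISH = "/var/www/html/fish"
HTML = "/var/www/html"

print(resolve(FISH, "../cats/siamese.html"))
print(resolve(FISH, "../../../../users"))
print(resolve(HTML, "index.html"))
print(resolve(HTML, "cats/index.html"))

# Going the other way: work out the relative pathname from fish to users.
print(posixpath.relpath("/users", start=FISH))
```

Note that these functions only manipulate the text of the pathnames; they do not check that the files exist.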

Absolute URLs

Absolute URLs use absolute pathnames.

But they start at the document root, not the filesystem root.
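Roughly, the web server glues the URL's pathname onto its document root to find the file. The sketch below assumes the document root is /var/www/html in the example filesystem; that choice, and the names DOCUMENT_ROOT and to_filesystem_path, are assumptions made for illustration. A real server does more checking than this.

```python
import posixpath

# Assumption for this example: the server's document root.
DOCUMENT_ROOT = "/var/www/html"


def to_filesystem_path(url_path, document_root=DOCUMENT_ROOT):
    """Map the pathname in a URL onto a file under the document root."""
    # Strip the leading / so that join() appends instead of starting afresh.
    return posixpath.normpath(posixpath.join(document_root, url_path.lstrip("/")))


print(to_filesystem_path("/fish/guppies.html"))
print(to_filesystem_path("/cats/index.html"))
```

So the / at the start of the URL's pathname means the document root, and files outside it (such as poodles.html in dogs) have no URL under this assumption.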

Relative URLs

Relative URLs use relative pathnames.

They start at the directory that contains the current document.

The browser converts the relative URL into an absolute URL before sending a request to the server.
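Python's standard urllib.parse.urljoin performs the same kind of conversion, so it can be used to predict what the browser will request. The base URL below is an assumption: it supposes guppies.html from the example filesystem is served at www.example.org.

```python
from urllib.parse import urljoin

# Assumed URL of the current document.
BASE = "http://www.example.org/fish/guppies.html"

# A relative URL in the same directory as the current document.
print(urljoin(BASE, "catfish.html"))

# Up one level, then down into cats.
print(urljoin(BASE, "../cats/siamese.html"))

# A pathname starting with / begins at the document root.
print(urljoin(BASE, "/errors/bad_urls.html"))
```

In each case the scheme and host are copied from the current document's URL, and only the pathname is worked out afresh.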

G'luck!