In HTML, we use URLs in several places:
hyperlinks, image elements,
link elements for favicons (also for stylesheets),
video and audio source elements, …
<a href="http://www.example.org/animals/wombats.html">wombats</a>
<img src="sea_photo.jpg" alt="..." />
<link rel="icon" href="favicon.ico" />
<video controls>
<source src="sea_video.mp4" type="video/mp4" />
</video>
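To make the pattern concrete, here is a minimal sketch that scans the markup above and lists every attribute holding a URL. It uses Python's standard `html.parser`; the class name `URLCollector` and the choice to look only at `href` and `src` are assumptions made for this illustration.

```python
from html.parser import HTMLParser

# Attributes that carry URLs in the elements shown above (an assumption:
# other attributes, e.g. poster or srcset, are ignored in this sketch).
URL_ATTRIBUTES = {"href", "src"}


class URLCollector(HTMLParser):
    """Collects (tag, attribute, url) triples from an HTML document."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img ... />.
        for name, value in attrs:
            if name in URL_ATTRIBUTES and value is not None:
                self.found.append((tag, name, value))


SNIPPET = """
<a href="http://www.example.org/animals/wombats.html">wombats</a>
<img src="sea_photo.jpg" alt="..." />
<link rel="icon" href="favicon.ico" />
<video controls>
<source src="sea_video.mp4" type="video/mp4" />
</video>
"""

collector = URLCollector()
collector.feed(SNIPPET)
collector.close()

for tag, attribute, url in collector.found:
    print(f"<{tag}> {attribute} = {url}")
```

Only the first URL is absolute; the other three are relative URLs.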
We will use them in CSS too for background images and for fonts.
In general, URLs consist of six parts:
scheme://host:port/pathname?query#fragment
Most are optional; several have defaults.
E.g. http://www.example.org/animals/wombats.html
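The six parts can be inspected with Python's standard `urllib.parse.urlsplit`. This is only a sketch: the first URL (with port 8080, a query and a fragment) is made up for illustration so that every part is present.

```python
from urllib.parse import urlsplit

# Hypothetical URL that uses all six parts.
full = urlsplit("http://www.example.org:8080/animals/wombats.html?colour=grey#diet")
print("scheme:  ", full.scheme)
print("host:    ", full.hostname)
print("port:    ", full.port)
print("pathname:", full.path)
print("query:   ", full.query)
print("fragment:", full.fragment)

# The example from the text: port, query and fragment are left out.
short = urlsplit("http://www.example.org/animals/wombats.html")
print(short.port, repr(short.query), repr(short.fragment))
```

For the shorter URL, `port` is `None` and the query and fragment are empty strings, which is how the parser reports optional parts that are missing.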
Let's start by looking at the scheme:
scheme://host:port/pathname?query#fragment
The scheme tells your browser what action to take when, e.g., a user clicks on the link.
http (the default): tells the browser to send out a request using HTTP.
https: same as http, but the HTTP request and response are encrypted.
file: tells the browser to load a web page directly from the user's local disk.
<a href="file:///C:/Users/derek/mypage.html">My page</a>
Never use this kind of URL in a web page.
mailto: tells the browser to launch an email program.
<a href="mailto:s.murphy@cs.ucc.ie">Mail me</a>
Be very wary of using this.
chrome:
chrome://version tells Chrome browsers to display 'internal' information.
chrome://settings tells Chrome browsers to display your 'settings' information.
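As a small illustration, the scheme of each example URL above can be pulled out with `urllib.parse.urlsplit`. The `ACTIONS` table simply restates the descriptions above; its name and wording are assumptions of this sketch, not part of any library.

```python
from urllib.parse import urlsplit

# Restates the descriptions above; a hypothetical lookup table.
ACTIONS = {
    "http": "send a request using HTTP",
    "https": "send an encrypted HTTP request",
    "file": "load a page from the local disk",
    "mailto": "launch an email program",
    "chrome": "show internal browser information",
}

urls = [
    "http://www.example.org/animals/wombats.html",
    "file:///C:/Users/derek/mypage.html",
    "mailto:s.murphy@cs.ucc.ie",
    "chrome://version",
]

schemes = [urlsplit(url).scheme for url in urls]

for url, scheme in zip(urls, schemes):
    print(f"{scheme:7} {ACTIONS.get(scheme, 'unknown')}  <- {url}")
```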
Next up, the host:
scheme://host:port/pathname?query#fragment
Every device that is connected to the Internet is assigned a unique IP address.
In IPv6, IP addresses are 128 bits long,
e.g. in hexadecimal:
2001:0db8:85a3:0000:0000:8a2e:0370:7334
In a URL, the host specifies the IP address of the server that will receive your request.
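The following sketch uses Python's standard `ipaddress` module to check the size of the address above and to show its shortened form. The URL built at the end (port 8080, `/index.html`) is hypothetical; it only illustrates that an IPv6 address used as a URL host is written inside square brackets.

```python
import ipaddress
from urllib.parse import urlsplit

address = ipaddress.IPv6Address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")

# 16 bytes of 8 bits each.
bits = len(address.packed) * 8
print(bits)

# Leading zeros dropped and one run of zero groups replaced by "::".
print(address.compressed)

# Hypothetical URL: an IPv6 host goes inside square brackets so that its
# colons are not confused with the colon before the port number.
url = urlsplit(f"http://[{address.compressed}]:8080/index.html")
print(url.hostname, url.port)
```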
Numeric IP addresses are cumbersome for humans.
Hence, most computers (hosts) that are connected to the Internet also have one (or more) names (hostnames).
e.g.
www.cs.ucc.ie
cs1.ucc.ie
www.rte.ie
In a URL, you can give the hostname instead of its IP address.
A hostname comprises the name of the computer + the domain name, e.g.:
www.cs.ucc.ie
The domain name is one or more labels that place the computer in a hierarchy.
Your browser asks DNS to convert the hostname to its IP address.
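A minimal sketch of the hostname structure described above: the first label is taken as the computer's name and the remaining labels as the domain name. The helper `split_hostname` is hypothetical and follows the simple rule in the text; it performs no DNS lookup.

```python
def split_hostname(hostname):
    """Split a hostname into (computer name, list of domain labels)."""
    labels = hostname.split(".")
    return labels[0], labels[1:]


computer, domain_labels = split_hostname("www.cs.ucc.ie")
print("computer:", computer)
print("domain:  ", ".".join(domain_labels))

# The labels place the computer in a hierarchy; print it from the most
# general label (rightmost) down to the most specific one.
for depth, label in enumerate(reversed(domain_labels)):
    print("  " * depth + label)
```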
Now it's the turn of the port number:
scheme://host:port/pathname?query#fragment
One server might be offering ('hosting') more than one service.
E.g. a server computer may be running web server software and email server software.
How does a client indicate which server program is to handle its requests?
Each server program is assigned an identification number, called a port number.
A client request includes the port number of the server program that should respond.
The Internet Assigned Numbers Authority (IANA) oversees the use of port numbers:
Port | Service
22 | ssh
25 | smtp
80 | http
110 | pop3
115 | sftp
143 | imap
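When a URL leaves out the port, the client falls back on the default for the scheme. The sketch below shows the idea; `effective_port` and `DEFAULT_PORTS` are hypothetical names, and port 443 for https is the registered default, added here as an assumption beyond the ports listed above.

```python
from urllib.parse import urlsplit

# 80 for http is listed above; 443 for https is an addition in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would contact, or None if unknown."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port  # explicit port given in the URL
    return DEFAULT_PORTS.get(parts.scheme)


print(effective_port("http://www.example.org/animals/wombats.html"))
print(effective_port("http://www.example.org:8080/animals/wombats.html"))
print(effective_port("https://www.example.org/"))
```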
Onto the pathname:
scheme://host:port/pathname?query#fragment
The pathname says which resource we want (usually by giving its filename) and where it is on the server.
.html | .css | .gif | .jpg | .png | .js
Owner | the person who owns the file/directory
Group | users who belong to the group associated with the file/directory
Other | all users who can log in to the system

Permission | On a file | On a directory
Read | view the contents of the file | list the names of the files in the directory
Write | modify the contents of the file (deleting it requires write permission on its directory) | add files to the directory, remove files from it, rename files in it
Execute | run the file (if it is a program) | traverse the directory and, if you also have read permission, list the files, view their contents, view their permissions, …
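The three user classes and three permissions are usually stored as nine bits, often written as three octal digits. The sketch below decodes such a value with constants from Python's standard `stat` module; the function name `describe` and the example mode `0o754` are chosen purely for illustration.

```python
import stat

# (class name, read bit, write bit, execute bit)
CLASSES = [
    ("Owner", stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    ("Group", stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    ("Other", stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
]


def describe(mode):
    """Map each user class to the list of permissions it is granted."""
    result = {}
    for name, read_bit, write_bit, execute_bit in CLASSES:
        granted = []
        if mode & read_bit:
            granted.append("read")
        if mode & write_bit:
            granted.append("write")
        if mode & execute_bit:
            granted.append("execute")
        result[name] = granted
    return result


mode = 0o754  # hypothetical example: rwx for owner, r-x for group, r-- for other
for who, granted in describe(mode).items():
    print(who, granted)

# The same bits in the familiar ls -l notation (for a regular file).
print(stat.filemode(stat.S_IFREG | mode))
```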
..
The absolute pathname for the directory called dogs is /var/www/dogs
The absolute pathname for the file called guppies.html is /var/www/html/fish/guppies.html
Assume the current working directory is fish.
The relative pathname for the file called siamese.html is ../cats/siamese.html
Again, assume the current working directory is fish.
The relative pathname for the directory called users is ../../../../users
This time, assume the current working directory is html.
The relative pathname for the file called index.html (the one in html) is index.html
Again, assume the current working directory is html.
The relative pathname for the other file called index.html (the one in cats) is cats/index.html
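The relative pathnames above can be checked mechanically with Python's standard `posixpath` module. This is a sketch under the assumption, taken from the guppies.html example, that the directories html and fish are /var/www/html and /var/www/html/fish; the helper `to_absolute` is a hypothetical name.

```python
import posixpath


def to_absolute(cwd, relative):
    """Resolve a relative pathname against a current working directory."""
    return posixpath.normpath(posixpath.join(cwd, relative))


HTML = "/var/www/html"       # assumed location of the html directory
FISH = "/var/www/html/fish"  # assumed location of the fish directory

print(to_absolute(FISH, "../cats/siamese.html"))
print(to_absolute(FISH, "../../../../users"))
print(to_absolute(HTML, "index.html"))
print(to_absolute(HTML, "cats/index.html"))

# The reverse direction: derive the relative pathname from two absolute ones.
print(posixpath.relpath("/var/www/html/cats/siamese.html", start=FISH))
```

Each `..` climbs one directory, so four of them from fish reach the filesystem root, which is why the users directory resolves to /users.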
Absolute URLs use absolute pathnames.
But they start at the document root, not the filesystem root.
Relative URLs use relative pathnames.
They start at the directory that contains the current document.
The browser converts the relative URL into an absolute URL before sending a request to the server.
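A sketch of both steps, using Python's standard `urllib.parse.urljoin` for the conversion the browser performs. The current document URL, the document root `/var/www/html`, and the helper `to_server_file` are assumptions for illustration; a real web server's mapping from URL to file is set by its configuration and includes safety checks omitted here.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

DOCUMENT_ROOT = "/var/www/html"  # assumed document root


def to_server_file(absolute_url, document_root=DOCUMENT_ROOT):
    """Map the pathname of an absolute URL to a file under the document root.

    Simplified: no validation of the path is attempted.
    """
    path = urlsplit(absolute_url).path
    return posixpath.join(document_root, path.lstrip("/"))


# Hypothetical URL of the document that contains the links.
current = "http://www.example.org/fish/guppies.html"

# Relative URLs are resolved against the directory of the current document.
sibling = urljoin(current, "tetras.html")
cousin = urljoin(current, "../cats/siamese.html")
print(sibling)
print(cousin)

# The pathname of the absolute URL starts at the document root.
print(to_server_file(cousin))
```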