Web Development

Dr Derek Bridge
School of Computer Science & Information Technology
University College Cork

Lecture Objectives

  • learn about caching — clicks don't always result in requests
  • learn that browsers are not the only clients

Client-server

The web has machines running client software and machines running server software.

Repeated Requests

  • The user clicks on a link:
    the browser sends a GET request and receives a response.
  • Suppose the user presses the Back button later today, or clicks on the same link tomorrow. Must the browser send the same request again?

Browser Caching

  • Once it has received a resource, the browser can store a copy in its cache.
  • Then, subsequent requests for the same resource can be satisfied from the cache, without contacting the server.

What to Cache

HTTP response headers control what to cache:

A resource that the browser should not cache.
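
For example, a response whose Cache-Control header forbids storing the resource might look like this (only the relevant lines are shown):

HTTP/1.1 200 OK
Cache-Control: no-store
Content-Type: text/html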

How Long to Keep It

Headers also control how long the browser may keep a resource before discarding it (expressed in seconds):

A resource that the browser should cache for a week.
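
For example, the max-age directive gives the resource's lifetime in seconds, and a week is 604800 seconds (again, only the relevant lines are shown):

HTTP/1.1 200 OK
Cache-Control: max-age=604800
Content-Type: text/html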

Closing Remarks about Caching

  • The headers are more complicated than those shown here!
  • Caching is used all over the Internet, e.g. in forward proxies.

Clients

Browsers are not the only clients!

Browsers include Chrome, Edge, Firefox, Opera and Safari, and there are browsers for mobile devices too.

Visually-Impaired Users

Visually-impaired users may use a regular browser or a special browser, plus a screen reader, which sends the information from the web page to a speech synthesizer or a braille display.

Sometimes these browsers are voice-controlled.

Other Clients

harmful                      beneficial
denial of service attacks    dead link checkers
email harvesting             search engine crawlers

Denial of Service Attacks


repeat:
    send HTTP GET request to www.example.org
    receive HTTP response but discard it
    

Ouch!
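
A minimal Python sketch of this loop, assuming the third-party requests library (www.example.org is just a stand-in for the victim site):

import requests

while True:
    # Send the request; ignoring the return value discards the response.
    requests.get("http://www.example.org")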

Email Harvesting


list_of_urls = [www.example.org]
list_of_emails = []
while list_of_urls is not empty:
    remove a URL from list_of_urls
    send HTTP GET request to the URL
    receive HTTP response
    find all email addresses within the response
    insert the email addresses into list_of_emails
    find all hyperlinks within the response
    insert the URLs of the hyperlinks into list_of_urls
    

Now sell the list of email addresses to spammers!
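
The same algorithm as a Python sketch, again assuming the requests library; the two regular expressions are rough approximations of email addresses and hyperlinks, not full parsers:

import re
import requests

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
HYPERLINK = re.compile(r'href="(http[^"]*)"')

list_of_urls = ["http://www.example.org"]
list_of_emails = []
while list_of_urls:
    url = list_of_urls.pop()
    html = requests.get(url).text
    list_of_emails.extend(EMAIL.findall(html))    # harvest addresses
    list_of_urls.extend(HYPERLINK.findall(html))  # follow links
    # (a real harvester would also remember URLs it has already visited)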

Dead Link Checking


list_of_urls = [www.example.org]
list_of_dead_links = []
while list_of_urls is not empty:
    remove a URL from list_of_urls
    send HTTP GET request to the URL
    receive HTTP response
    if response status code is 404:
        insert the URL into list_of_dead_links
    else:
        find all hyperlinks within the response
        insert the URLs of the hyperlinks into list_of_urls
    

Print out the dead links for the web developer to fix!
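
As a Python sketch, under the same assumptions as the previous one (a real checker would treat timeouts and other error codes as dead links too):

import re
import requests

HYPERLINK = re.compile(r'href="(http[^"]*)"')

list_of_urls = ["http://www.example.org"]
list_of_dead_links = []
while list_of_urls:
    url = list_of_urls.pop()
    response = requests.get(url)
    if response.status_code == 404:
        list_of_dead_links.append(url)
    else:
        list_of_urls.extend(HYPERLINK.findall(response.text))

print(list_of_dead_links)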

Web Search Engines

Search engines use a crawler to build an index.

Indexes
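
The index maps each important word to the URLs of the pages that contain it. As a minimal sketch, such an index could be a Python dictionary (the words and URLs here are invented examples):

index = {
    "cork": {"http://www.example.org/a.html", "http://www.example.org/b.html"},
    "lectures": {"http://www.example.org/b.html"},
}

index["cork"]  # answering a query is then just a lookup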

Search Engine Crawlers


list_of_urls = [www.example.org]
while list_of_urls is not empty:
    remove a URL from list_of_urls
    send HTTP GET request to the URL
    receive HTTP response
    find all important words within the response
    for each important word:
        in the index entry for that word, insert the URL
    find all hyperlinks within the response
    insert the URLs of the hyperlinks into list_of_urls    
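
A Python sketch of the crawler, with every word standing in for the "important words" (same assumptions as the earlier sketches):

import re
from collections import defaultdict

import requests

HYPERLINK = re.compile(r'href="(http[^"]*)"')
WORD = re.compile(r"[A-Za-z]+")

index = defaultdict(set)  # maps each word to the set of URLs that contain it
list_of_urls = ["http://www.example.org"]
while list_of_urls:
    url = list_of_urls.pop()
    html = requests.get(url).text
    for word in WORD.findall(html):
        index[word.lower()].add(url)
    list_of_urls.extend(HYPERLINK.findall(html))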
    

Summary & Consequences

There are many different clients:
what they have in common is that they all send HTTP requests and receive HTTP responses.

Web Developers must endeavour to create web pages, web sites and web apps that are usable by many different kinds of clients!

G'luck!