Overview for 2024-2025

In 2024-2025, I will supervise three student projects.
Below are my project suggestions for 2024-2025:
- DGB1 (BScCS/BScDSA). Can kNN Explain NN?
  This project is no longer available. It has been assigned to Jack Moloney.
- DGB2 (BScCS/BScDSA). Six Degrees of Everything
  This project is no longer available. It has been assigned to David Brett.
- DGB3 (BScCS). Murder on the Dancefloor
  This project is no longer available. It has been assigned to Conor Shipsey.
Students may also access the Computer Science Department's official Project Site.

DGB1 (BScCS/BScDSA). Can kNN Explain NN?

This project is no longer available. It has been assigned to Jack Moloney.

You want to classify an image, q: does it contain a cat or a dog? You have a dataset, D, that contains examples of cats and dogs.

One simple method is k nearest-neighbours: find k images in D that are similar to q. Your prediction for q is a majority vote of the neighbours. This method is referred to as kNN. An advantage is that you can explain your prediction by displaying the neighbours.
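As a rough illustration of the majority vote, here is a minimal Python sketch. The two-number feature vectors, the Euclidean distance and the name knn_predict are assumptions made for the example; real images would need a more suitable representation and distance.

```python
from collections import Counter
import math

def knn_predict(query, dataset, k=3):
    """Return (predicted_label, neighbours) for `query`.

    `dataset` is a list of (feature_vector, label) pairs. Euclidean
    distance on the raw features is an assumption made for illustration.
    """
    # Sort the examples by distance to the query and keep the k closest.
    neighbours = sorted(dataset, key=lambda ex: math.dist(query, ex[0]))[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in neighbours)
    predicted = votes.most_common(1)[0][0]
    return predicted, neighbours

# Toy 2-D stand-ins for images: cats cluster near (0, 0), dogs near (5, 5).
D = [((0.0, 0.1), "cat"), ((0.2, 0.0), "cat"), ((0.1, 0.3), "cat"),
     ((5.0, 5.1), "dog"), ((4.8, 5.0), "dog"), ((5.2, 4.9), "dog")]

# The returned neighbours double as the explanation for the prediction.
label, explanation = knn_predict((0.3, 0.2), D, k=3)
```

Returning the neighbours alongside the label is what makes the prediction explainable: they can be shown to the user as supporting evidence.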

A different method is a neural network, especially a convolutional neural network. A learning process exposes the network to the images in D; it adjusts weights on the edges between neurons until the network accurately classifies the images in D and, hopefully, will also correctly classify unseen images, such as q. Often, the neural network will be more accurate than kNN. But it cannot explain its predictions.

In this project, we investigate ways of explaining NN predictions. One way is to highlight the parts of the image that contribute most to the prediction: the Grad-CAM method is one popular way to do this [Selvaraju et al., 2016]. Another method is to intercept the outputs of an intermediate layer of the NN and use these to implement the kNN algorithm. The explanation for unseen image q will then be its neighbours, but these will have been chosen using the same representation that the NN uses, rather than the raw images. See [Lee et al., 2020] for an example of doing this.
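The interception idea can be sketched on a toy scale. The code below is not the method of Lee et al.: the weights W_HIDDEN and B_HIDDEN are hand-picked, hypothetical stand-ins for a trained network, and the names hidden_representation and nearest are invented for the example. It only shows that neighbours found on intermediate-layer outputs can differ from those found on the raw inputs.

```python
import math

# Hypothetical, hand-picked weights standing in for a trained network:
# 2 inputs -> 2 hidden ReLU units. In a real project these would come
# from an intermediate layer of a trained CNN.
W_HIDDEN = [[1.0, -1.0], [-1.0, 1.0]]
B_HIDDEN = [0.0, 0.0]

def hidden_representation(x):
    """Outputs of the intermediate (hidden) layer for input x."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_HIDDEN, B_HIDDEN)]

def nearest(query, examples, k, embed=lambda x: x):
    """Indices of the k examples closest to query after applying embed."""
    q = embed(query)
    order = sorted(range(len(examples)),
                   key=lambda i: math.dist(q, embed(examples[i])))
    return order[:k]

examples = [(3.0, 1.0), (1.0, 3.0), (6.0, 4.0), (4.0, 3.5)]
q = (5.0, 3.0)

# Neighbours on the raw inputs versus on the hidden-layer outputs.
raw_neighbours = nearest(q, examples, k=2)
nn_neighbours = nearest(q, examples, k=2, embed=hidden_representation)
```

With these toy values the two neighbour lists share only one example, which is the kind of disagreement the project sets out to measure.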

But there are lots of questions. To what extent do the neighbours found using neural representations agree with those found on the raw images? To what extent do the classifications found by the NN and by the kNN methods agree? To what extent do these results differ depending on where in the NN we make our interception? Which explanation method do different users with different needs find most helpful?
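Two of these questions can be turned into simple numbers. The sketch below uses Jaccard overlap for neighbour agreement and a plain match rate for classification agreement; both measures, and the function names, are assumptions chosen for illustration rather than measures prescribed by the project.

```python
def neighbour_overlap(neighbours_a, neighbours_b):
    """Jaccard overlap between two neighbour sets (1.0 means identical)."""
    a, b = set(neighbours_a), set(neighbours_b)
    union = a | b
    # Two empty sets are treated as being in full agreement.
    return len(a & b) / len(union) if union else 1.0

def agreement_rate(preds_a, preds_b):
    """Fraction of queries on which two classifiers give the same label."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("need two non-empty prediction lists of equal length")
    matches = sum(1 for a, b in zip(preds_a, preds_b) if a == b)
    return matches / len(preds_a)

# Example: neighbour indices from two representations, and the labels
# predicted by two classifiers on the same four queries.
overlap = neighbour_overlap([3, 2], [0, 2])
rate = agreement_rate(["cat", "dog", "dog", "cat"],
                      ["cat", "dog", "cat", "cat"])
```

Averaging neighbour_overlap over many queries, once per intercepted layer, would give one way of comparing interception points.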

In this project, we will build and evaluate systems that explore these questions. A good student will try several of the ideas on multiple datasets. An excellent student will compare with wider ideas from the relevant literature.

Background reading

There are descriptions of kNN and neural networks in my AI1 lecture notes.
Ramprasaath R. Selvaraju, Abhishek Das, Ramakrishna Vedantam, Michael Cogswell, Devi Parikh and Dhruv Batra (2016). Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization, http://arxiv.org/abs/1610.02391
Ritchie Lee, Justin Clarke, Adrian Agogino and Dimitra Giannakopoulou (2020). Improving Trust in Deep Neural Networks with Nearest Neighbors, AIAA Scitech 2020 Forum, https://arc.aiaa.org/doi/abs/10.2514/6.2020-2098

Skills required

This project is suitable for a student who wants to study the research literature, has very good programming skills and is unafraid of mathematical notation.

DGB2 (BScCS/BScDSA). Six Degrees of Everything

This project is no longer available. It has been assigned to David Brett.

Suppose you have scraped the web to obtain a graph. Some nodes correspond to objects in the world, e.g. people. Others correspond to pieces of text, e.g. academic papers or newspaper stories. Edges connect the nodes, e.g. from a person to their university. In particular, if a text mentions an object, there is an edge between the two nodes.

Given a small set of seed objects or texts, we want to find ways to connect them, if possible. This might remind you of Six Degrees of Kevin Bacon, where the task is to find a path between two actors (one being Kevin Bacon). However, I have developed an algorithm for finding a subgraph, rather than a path. Instead of finding a tenuous path, the idea is to find a rich subgraph of interconnections for presentation to the user. Why do this? It gives the user an informative tour through a domain.
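The algorithm itself is not described here, so the sketch below is only a naive baseline for comparison, not that algorithm: it takes the union of one shortest path per pair of seeds, found by breadth-first search. The toy graph, the node names and the function names are all hypothetical.

```python
from collections import deque
from itertools import combinations

def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the parents to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

def connecting_subgraph(graph, seeds):
    """Undirected edges on one shortest path per pair of seeds."""
    edges = set()
    for a, b in combinations(seeds, 2):
        path = shortest_path(graph, a, b)
        if path:
            edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return edges

# Hypothetical toy graph: people, papers and a university (undirected).
graph = {
    "Alice": ["Paper1", "UCC"], "Bob": ["Paper1", "Paper2"],
    "Carol": ["Paper2"], "Paper1": ["Alice", "Bob"],
    "Paper2": ["Bob", "Carol"], "UCC": ["Alice"], "Dave": [],
}
subgraph = connecting_subgraph(graph, ["Alice", "Carol", "Bob"])
```

A union of shortest paths tends to be sparse, which is exactly the "tenuous" connection that a richer subgraph is meant to improve on.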

In this project, we will scrape the web to obtain objects, texts and the edges between objects; we will apply Named Entity Recognition to the texts to connect them to the objects that they mention; we will implement my algorithm (and perhaps some variants of it); we will investigate ways of presenting the subgraph to users; and we will run experiments with users.
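To show how mentions become edges, here is a small sketch that uses a name lookup as a stand-in for Named Entity Recognition. A real pipeline would use a trained NER model instead; the function mention_edges, the text ids and the names are hypothetical.

```python
import re

def mention_edges(texts, known_objects):
    """Return (text_id, object_name) edges for every known object mentioned.

    texts: dict mapping a text id to its content.
    known_objects: iterable of object names already in the graph.
    This lookup is a stand-in for NER, used here only for illustration.
    """
    edges = set()
    for text_id, content in texts.items():
        for name in known_objects:
            # Whole-word, case-insensitive match on the object's name.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, content, re.IGNORECASE):
                edges.add((text_id, name))
    return edges

texts = {
    "t1": "Alice Murphy and Bob Kelly published a paper on graph search.",
    "t2": "A recipe with no people in it.",
}
known = ["Alice Murphy", "Bob Kelly", "Carol Walsh"]
edges = mention_edges(texts, known)
```

Unlike this lookup, a trained NER model can also find objects that are not yet in the graph, at the cost of having to resolve different spellings of the same name.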

The domain that I am thinking about here is academic papers. But I am open to the idea of working in other domains (e.g. music, soccer, recipes, current affairs news stories), provided the data is available. For higher marks, the student might develop a generic code pipeline and apply it to more than one domain.

Skills required

The project requires a student who has excellent programming skills.

DGB3 (BScCS). Murder on the Dancefloor

This project is no longer available. It has been assigned to Conor Shipsey.

Suppose you are playing music in a shared space: a house party, the afters at a wedding, a car journey, or maybe just an online room. As host, you need an app that selects the music for you and your guests. The app allows the following:

- You can choose some seed tracks and put them into a queue.
- You and your guests can give a thumbs-up or thumbs-down to tracks in the queue, including the one that is playing.
- You and your guests can submit song requests, selected from a catalogue.
- You can veto requests, removing them from the queue.

The app decides what to play next.

- It needs to be fair in the handling of requests.
- It needs to take user preferences (thumbs-up/down) into account.
- It needs to ensure that the sequence is agreeable (e.g. tracks that blend into each other).
- It needs to choose additional songs, either when the queue is empty or to ensure an agreeable sequence.

It must do all these things in real-time.
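One possible way of combining fairness, votes and vetoes is a scoring rule over the queue, sketched below. The Request class, the fairness_weight parameter and the rule itself are assumptions made for illustration; the sketch ignores sequencing and the choice of additional songs.

```python
from dataclasses import dataclass

@dataclass
class Request:
    track: str
    requester: str
    ups: int = 0
    downs: int = 0
    vetoed: bool = False

def choose_next(queue, plays_so_far, fairness_weight=1.0):
    """Pick the next request to play, or None if nothing is eligible.

    plays_so_far maps each requester to how many of their requests have
    already been played; requesters with fewer plays are favoured.
    """
    eligible = [r for r in queue if not r.vetoed]
    if not eligible:
        return None

    def score(r):
        net_votes = r.ups - r.downs
        return net_votes - fairness_weight * plays_so_far.get(r.requester, 0)

    # max() keeps the first of equally scored requests, so ties go to
    # the request that is earliest in the queue.
    return max(eligible, key=score)

queue = [
    Request("Track A", "host", ups=3),
    Request("Track B", "guest1", ups=2),
    Request("Track C", "guest2", ups=5, vetoed=True),
]
next_up = choose_next(queue, plays_so_far={"host": 2})
```

Here Track B wins: Track C is vetoed, and the host's two earlier plays outweigh Track A's extra thumbs-up. Raising or lowering fairness_weight shifts the balance between popularity and taking turns.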

In this project, we will develop this app and run some experiments with users. For higher marks, the student will take the research literature into account. See below for some initial pointers to this literature.

Background reading

Clemens Drews and Florian Pestoni. 2002. Virtual Jukebox: Reviving a Classic. In Proceedings of the Annual Hawaii International Conference on System Sciences. 887–893.
David Sprague, Fuqu Wu, and Melanie Tory. 2008. Music Selection using the Partyvote Democratic Jukebox. In Proceedings of the Working Conference on Advanced Visual Interfaces. 433–436.
Felipe Vieira and Nazareno Andrade. 2015. Evaluating Conflict Management Mechanisms for Online Social Jukeboxes. In Proceedings of the International Society for Music Information Retrieval Conference. 190–196.
Dave Cliff. 2000. Hang the DJ: Automatic Sequencing and Seamless Mixing of Dance-music Tracks. Technical Report. HP Laboratories.
Ben Fields and Paul Lamere. 2010. Finding a Path Through the Juke Box: The Playlist Tutorial. In Proceedings of the International Society for Music Information Retrieval Conference.

Skills required

The project has many challenges: getting data from APIs; understanding research papers; multi-user app development; app evaluation with users.

It requires a student who has excellent programming skills. Also, you'd better not kill the groove.

Derek Bridge

B.Sc. and M.Sc. Projects
