07/10/11: It is now too late for undergraduate students to choose any of these projects. I will take on M.Sc.
students, if any are interested and suitable.
DGB1. Ant Algorithms for Constructing Steiner Trees
When ants forage for food, they leave a trail of pheromone (chemical). Other ants detect these trails and, on occasion, follow them. These ants, in their turn, are also secreting pheromone, thus reinforcing the trails. The stronger a trail, the more likely it is that other ants will follow it. Thus we have a positive feedback loop.
From this, some surprising collective behaviour can emerge. For example, an ant colony can establish the shortest paths from its nest to sources of food.
Computer Scientists have been inspired by these observations from Nature. They are building systems in which waves of simulated ants traverse data structures, choosing paths that are influenced by trails of simulated pheromone.
(See Tarasewich and McMullen 2002.)
These systems can be used to find good solutions to ordering problems, such as the Travelling Salesperson Problem (TSP), where the task is to find an order in which to visit a set of towns (See Dorigo et al. 1996.) But in this project we will uses these systems for finding Steiner Trees.
Steiner Trees are like Minimum Spanning Trees, except that we are allowed to introduce new, intermediate nodes. See, e.g.,
the explanation on Wikipedia. There is evidence that ants in Nature find something like Steiner Trees: see the article Ants Build Cheapest Networks. A few Computer Science researchers have already attempted ant algorithms for finding Steiner Trees: see Hu et al. 2004, Liu et al 2003, Luyet et al. 2007, and Shen et al. 2008
In this project, we will compare different ways of using ant algorithms to construct Steiner Trees, basing most of them on the approaches described in the aforementioned papers but perhaps also trying our own ideas too.
Background reading
Chien-Chung Shen, Ke Li, Chaiporn Jaikaeo and Vinay Sridhara: 'Ant-based distributed constrained steiner tree algorithm for jointly conserving energy and bounding delay in ad hoc multicast routing', ACM Transactions on Autonomous and Adaptive Syste, Volume 3 Issue 1, 2008
Yu Hu, Tong Jing, Xianlong Hong, Zhe Feng Xiaodong Hu, Guiying Yan: 'An Efficient Rectilinear Steiner Minimum Tree Algorithm Based on Ant Colony Optimization', International Conference on Communications, Circuits and Systems, pp.1276 - 1280, 2004.
Ying Liu, Jianping Wu, Ke Xu, Mingwei Xu:
'The degree-constrained multicasting algorithm using ant algorithm', 10th International Conference on Telecommunications, pp.370-374, 2003.
L. Luyet, S. Varone and N. Zufferey: 'An Ant Algorithm for the Steiner Tree Problem in Graphs', Proceedings of the EvoWorkshops, M. Giacobini et al. (eds.), LNCS 4448, pp.42-51, Springer, 2007
Tarasewich, P. and McMullen, P.R.: 'Swarm Intelligence: Power in Numbers', Communications of the ACM, vol.45(8), pp.62-67, 2002
Skills required
This project is suitable for a good student on the B.Sc. in Computer Science or a student on the taught M.Sc. in Computing Science. It will best suit someone who excelled at modules in algorithms and data structures. The project also requires good programming skills. The research papers that the student will have to read describe a lot of algorithms and use an amount of mathematical notation.
DGB2. Retro-Tagging
User tagging is a key feature of the Web 2.0 era. Users annotate resources (such as photos, web pages, blog posts, music) with words drawn from an uncontrolled vocabulary in order to facilitate search, browsing and sharing. The uncontrolled vocabulary is tagging's strength, since it means there's no learning curve, but also its weakness, since it means that a resource's tags may suffer from incompleteness, inconsistency in their level of detail, ambiguity, and so on.
We can envisage two kinds of software tool that might alleviate some of these problems:
Tools to tag new resources. When the user inserts or discovers a new resource s/he may want to tag it. S/he might find useful a tool that proposes a set of tags. Of course, many systems already suggest tags that other users have used.
Tools to retrofit tags to existing resources. Suppose the user, when tagging a resource, invents or discovers a tag that s/he has not previously used. S/he might wish to apply the new tag to other relevant resources in his/her collection. She might find useful a tool that recommends the set of resources to which the tag most likely applies.
In this project, I am most interested in the second of these challenges.
We will use a variety of simple AI techniques, mostly ones that concern what AI people call 'classification'. For example, Singh et al. (2008) have used Support Vector Machines and the idea of Active Learning to tackle the first of the two challenges above. I will propose some additional ideas, and I would hope the student will have ideas of his/her own.
It will be necessary to collect a dataset on which we can experiment. For this, the student will need to write code that accesses the API of a Web 2.0 system that uses tagging.
Background reading
M. Singh, E. Curran and P. Cunnningham: 'Active Learning for Multi-Label Image Annotation', in Procs. of the 19th Irish Conference on Artificial Intelligence & Cognitive Science, D. Bridge et al. (eds.), University College Cork, 2008.
Any text on data mining or machine learning will be useful, e.g. Witten et al: Data Mining, or Mitchell: Machine Learning.
Skills required
This project will suit an imaginative student who is keen to explore the (sometimes mathematical) research literature and who has good programming skills. For this project, imagination and programming skills are more important than whether you are B.Sc. or M.Sc.
DGB3. Point2Point: A Product Review Browser
Users of sites such as Amazon and TripAdvisor use the sites to record and share their experiences of the various products available. They rate products; they write reviews of products; they rate other people's reviews; they comment on reviews; and so on. When users have a decision to make in the real-world (e.g. which product to buy, which hotel to book, etc), they need to retrieve, aggregate and reuse the experiences that others have recorded. This is not easy.
There is great value in tools that help users to reuse these experiences. Consider, for example, the way that the Amazon site can contrast the most helpful favourable and critical reviews. This adds value in a very simple way, making it easier for users to decide whether to purchase an item.
In a recent project, Paul Healy and I developed the GhostWriter-2.0 system, which is a tool that helps someone who is writing a new Amazon review (Healy & Bridge 2010). As the user types her review, GhostWriter-2.0 uses Ajax techniques to obtain suggestions for topics that the user might write about. These topics are mined from related and helpful reviews that we extracted from Amazon Web Services (see the Product Advertising API).
This year, I propose that we use similar techniques to help someone who wants to choose a product or hotel. We will investigate ways of mining product reviews for recurring topics. We can then present reviews for pairs of products, comparing the products point by point. For example, hotels might be compared for room cleanliness, front-desk service, gym facilities, etc. The key innovation is that the topics on which we compare the products will not be predefined; they will be mined from the reviews.
Background reading
Aidan Waugh and Derek Bridge: 'An Evaluation of the GhostWriter System for Case-Based Content Suggestions', in Lorcan Coyle, John Dunnion and Jill Freyne (eds.), Procs. of the 20th Irish Conference on Artificial Intelligence and Cognitive Science, pp.264-273, 2009.
Paul Healy and Derek Bridge: 'The GhostWriter-2.0 System: Creating a Virtuous Circle in Web 2.0 Product Reviewing', in Derek Bridge, Sarah Jane Delany, Enric Plaza, Barry Smyth and Nirmalie Wiratunga (eds.), Procs. of WebCBR: The Workshop on Reasoning from Experiences on the Web (Workshop Programme of the Eighteenth International Conference on Case-Based Reasoning), pp.121-130, 2010.
Ibrahim Adeyanju, Nirmalie Wiratunga, Juan A. Recio-García and Robert Lothian: 'Learning to Author Text with textual CBR', Procs. of ECAI, pp.777-782, 2010.
Hu, M. and Liu: 'Mining opinion features in customer reviews'. In A. G. Cohn (ed.), Procs. of the 19th National Conference on Artifical intelligence, pp.755-760, 2004.
Skills required
This very challenging project will suit a student who loves programming, especially mixing lots of technologies and investigating different ways of solving problems. An M.Sc. student would be expected to take the ideas further, and to evaluate them as well.
DGB4. Competence-Driven Active Learning
This project is no longer available. It has been assigned to Charles McCarthy.
In classification problems, the task is to decide to which of a predefined, finite set of categories an object belongs. See section 1 of these lecture notes.
In Artificial Intelligence, there are many ways to build systems that carry out automatic classification. For example, k-NN classifiers store a database of already-classified objects. To classify a new object, they retrieve from the database the k objects that are most similar to the new object; the class of the new object is then whichever classification is most common among the k objects that have been retrieved. See these lecture notes. One problem for this approach (and all others) is how to obtain some already-classified objects: they have to be classified manually, which places a burden on the expert who has to carry this out.
Instead of asking the expert to manually classify a large number of randomly-selected ojects, Active Learning is the name given to the process by which we automatically select a small set of what we believe to be the objects for which knowing the classification will be the most informative. There are many ways of choosing this small set. For example, we might choose the ones whose classification are most uncertain; see, e.g., (Hu et al 2009). Alternatively, we might choose those that come from dense regions of the population while ensuring a level of diversity; see, e.g., (Hu et al 2010).
I would like to try out some new ideas for Active Learning, using measures of competence (Smyth & McKenna 1998, Delany 2009). We might ask the expert to classify those objects that have the potential to bring about the biggest change in competence, or objects that lie in areas where the system currently has low competence.
We would evaluate our ideas, and compare them to existing techniques, by running experiments on publicly-available datasets; see these lecture notes. There is the option of building the software from scratch, or integrating it into existing open-source software such as Weka or jColibri.
Background reading
Rong Hu, Sarah Jane Delany, Brian Mac Namee: 'Sampling with Confidence: Using k-NN Confidence Measures in Active Learning', in Procs. of the UKDS Workshop at 8th International Conference on Case-based Reasoning, pp.181-192, 2009.
Rong Hu, Sarah Jane Delany, Brian Mac Namee: 'EGAL: Exploration Guided Active Learning for TCBR', in Procs of 18th International Conference on Case-based Reasoning, pp.156-170, 2010.
Sarah Jane Delany: 'The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing', in L.McGinty & D. Wilson (eds.), Procs. of the International Conference on Case Based Reasoning, LNCS 5650, pp.135-149, Springer Verlag, 2009.
Smyth, B. and McKenna, E.: 'Modelling the Competence of Case-Bases', in Proceedings of the 4th European Workshop on Advances in Case-Based Reasoning, B. Smyth and P. Cunningham (eds.), LNCS 1488, pp.208-220, Springer-Verlag, 1998
Skills required
This project is quite research-oriented. It will suit a student who is an excellent programmer, willing to read the research literature (some of which is a little mathematical or algorithmic) and keen to try out new ideas.
This project considers case base editing algorithms. These delete noisy and redundant object from the database. The goal is to reduce the size of the database (to speed up classification) while preserving, or even improving, the accuracy of the classifier.
Numerous case base editing algorithms have been proposed. But there is a new family of algorithms: Delany 2009.
In recent work, Lisa Cummins and I have used measures of dataset complexity to evaluate the efficacy of case base editing algorithms: Cummins & Bridge 2011. In this project, I want to take the same approach to evaluate Delany's new case base editing algorithms.
Sarah Jane Delany: 'The Good, the Bad and the Incorrectly Classified: Profiling Cases for Case-Base Editing', in L.McGinty & D. Wilson (eds.), Procs. of the International Conference on Case Based Reasoning, LNCS 5650, pp.135-149, Springer Verlag, 2009.
Lisa Cummins and Derek Bridge: 'On Dataset Complexity for Case Base Maintenance', in Wiratunga et al. (eds), Procs of the International Conference on Case-Based Reasoning, 2011, forthcoming
Skills required
This project is quite research-oriented. It will suit a student who is an excellent programmer, willing to read the research literature (some of which is a little mathematical or algorithmic) and keen to try out new ideas.
DGB6. firstday@ucc.ie
Think back to being a fresher: unable to read UCC's unintelligible timetables; not knowing where anything is; not knowing others in your class.
In this project, the idea is to develop a location-based game that students would play on Orientation Day in order to acquaint themselves with UCC. They would use an iPhone or Android phone app to check-in at various places on campus, to answer questions, to connect with staff and other students, to obtain discounts in campus shops, and to gain points for doing these things until they are no longer newbs (unless they are n00bs).
I've deliberately left this quite vague in order to allow you to bring your own ideas to this project. For inspiration, look at, for example, Seth Priebatsch's TED talk (Seth is from SCVNGR, and he presents some of what he calls their game dynamics.) Closer to home look at iRule Dublin and Viking Ghost Hunt
Important to me is that the software be exceptionally well-designed. In particular, it should have enough generality that a game administrator can easily configure the game (e.g. by adding new locations, questions, activities, etc.), and so that it can be easily ported to another campus.
We can consider using the Gowalla API or FourSquare API to handle check-ins. (However, building something on SCVNGR is probably too simplistic for a Computer Science project.)
Skills required
Excellent software engineering skills and a good imagination are the keys to success in this project. Experience of iPhone or Android development would be useful.
DGB7. Your own Web app
If you have an idea for a cool web app, perhaps a mash-up, especially a social media idea, perhaps with a mobile Web angle to it, then I'm willing to discuss it with you. If it looks like it would make a good project (many student ideas don't!), i.e. if I think it will enable you to display your talents and the skills that we are looking for, and if I feel it lies within my competencies, then I may agree to supervise it.