CM30076 / CM30082
Individual Project

Project Ideas

Dr Alwyn Barry

A.M.Barry@bath.ac.uk

Investigating new algorithms for a Machine Learning Method
Learning Classifier Systems combine techniques from function approximation, reinforcement learning and genetic algorithms to create a very powerful machine learning system that is competitive with the best ML systems. We have developed a number of potential improvements to LCS through a theoretical analysis, and would like to try out these proposed enhancements to see how they work. This project would take initial LCS code, written in Java, Python or C, and modify it to incorporate the various enhancements, one at a time. The project will test out these enhancements on a range of test problems to identify any performance benefits that they bring about.
Prerequisite knowledge: This project is only suitable for a very good 2:1 or 1st class student who has a good grasp of programming, is interested in AI or Machine Learning, and is not afraid of mathematical expressions.
Developing Theory for a Machine Learning Method
This project is primarily for a Maths and Computing or Mathematical Sciences student who has majored in statistics units or has an interest in the application of applied mathematics in AI.
Learning Classifier Systems combine techniques from function approximation, reinforcement learning and genetic algorithms to create a very powerful machine learning system that is competitive with the best ML systems. We have developed a number of potential improvements to LCS through a theoretical analysis, but are limited in our progress by a lack of mathematical knowledge. We need a very good student (1st class profile) who has an interest in statistics or related areas to investigate one of a number of possible avenues for development. The project will be primarily mathematics based, with some application for empirical tests.
Prerequisite knowledge: A strong background in undergraduate level statistics.
Extending PACS
PACS is a high-performance evolutionary computation based machine learning system that was first implemented by a student in a previous undergraduate project. This project will seek to identify the properties of the PACS learning system through the development of appropriate mathematical descriptions of the dynamics, and use this investigation to refine the algorithm.
Prerequisite knowledge: Excellent mathematical knowledge. Good Java programming skills. Prior 1st class or high 2:1 performance.
Robot simulator for Lion Predator Behaviour
In recent work I have examined a particular aspect of cooperative Lion hunting behaviour, and have demonstrated that the complex behaviour can be efficiently implemented using very small visual communication overheads, much smaller than previously predicted. This project will use these results to demonstrate search-and-locate capabilities in a small swarm of robots. The project will use the Player/Stage robot simulator.
Prerequisite knowledge: Ability to read academic papers. Good C++ programming skills. Linux/Unix/Cygwin user.
Genetic Algorithm based on Bacterial forms of Reproduction
Some bacteria can actually gain new genetic material from their environment by a process termed "Homologous Recombination". This provides an interesting new way of providing exploration of a problem space within a Genetic Algorithm (an Evolutionary Computation method of search for solutions). This project will seek to modify an existing Genetic Algorithm with the various operators seen in Homologous Recombination, and will evaluate the effectiveness of these operators in comparison with others from "Evolutionary Strategies". You need to be a good student who is able to read research papers to tackle this project.
Prerequisite knowledge: The ability to read scientific papers from a variety of disciplines; Previous good 2:1 or above performance.
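As a rough illustration of the kind of operator this project might explore, the sketch below models uptake of an environmental fragment into a list-based genome. It is a deliberately simplified, position-aligned model and not an established operator: the names homologous_uptake and shed_fragment, and the flank parameter, are hypothetical and chosen only for this example. A fragment is spliced in only when its two ends match the recipient genome at the same positions, which loosely mimics the sequence similarity that homologous recombination relies on.

```python
import random


def shed_fragment(donor, length, rng=random):
    """Cut a random contiguous fragment from a donor genome.

    Returns (start, fragment), where start is the fragment's position in
    the donor. Hypothetical helper for this sketch only.
    """
    start = rng.randrange(len(donor) - length + 1)
    return start, donor[start:start + length]


def homologous_uptake(genome, start, fragment, flank=2):
    """Return a new genome with `fragment` spliced in at `start`.

    The splice happens only if the first and last `flank` genes of the
    fragment equal the recipient's genes at the same positions. Otherwise
    an unchanged copy of the genome is returned.
    """
    end = start + len(fragment)
    # Reject fragments that overrun the genome or are too short to have
    # two non-overlapping flanks.
    if end > len(genome) or len(fragment) < 2 * flank:
        return list(genome)
    left_ok = genome[start:start + flank] == fragment[:flank]
    right_ok = genome[end - flank:end] == fragment[-flank:]
    if left_ok and right_ok:
        # Flanks agree, so only the interior of the region changes.
        return genome[:start] + fragment + genome[end:]
    return list(genome)


recipient = [0, 1, 0, 0, 0, 1, 1, 0]
fragment = [1, 0, 1, 1, 1, 1]  # flanks [1, 0] and [1, 1] match the recipient
child = homologous_uptake(recipient, 1, fragment)
```

In a Genetic Algorithm this function would sit alongside mutation and crossover, with fragments drawn from a pool shed by individuals removed from the population. How that pool is maintained is left open here.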
Parallel Classifier System for Bioinformatics
Data-mining using Evolutionary Computation techniques has been demonstrably effective and is competitive with the very best Machine Learning approaches. However, where the data is very large, such as within Bioinformatics datasets, processing speed becomes a problem. If we can identify a technique which would enable us to divide the data within records, as well as between records, then significant parallelisation is possible. This project will investigate mechanisms for dividing data sets, separately mining, and recombining the results.
Prerequisite knowledge: Good programming skills. Previous 1st or 2:1 performance.
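The split, mine and recombine idea in this project can be sketched on a toy task. The example below assumes records are equal-length lists of 0/1 features and that "mining" means counting how often each feature is set; this stand-in is trivially decomposable, which real rule mining is not, and that gap is exactly what the project would investigate. All function names and the chunk-size parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_blocks(records, row_chunk, col_chunk):
    """Divide records between rows and within rows.

    Returns a list of (col_start, block) pairs, where each block is a
    list of row slices covering columns col_start onwards.
    """
    n_cols = len(records[0]) if records else 0
    blocks = []
    for r in range(0, len(records), row_chunk):
        rows = records[r:r + row_chunk]
        for c in range(0, n_cols, col_chunk):
            blocks.append((c, [row[c:c + col_chunk] for row in rows]))
    return blocks


def mine_block(item):
    """Toy mining step: count the 1s in each column of one block."""
    col_start, block = item
    return col_start, [sum(col) for col in zip(*block)]


def recombine(results, n_cols):
    """Merge per-block counts back into one total per original column."""
    totals = [0] * n_cols
    for col_start, counts in results:
        for offset, count in enumerate(counts):
            totals[col_start + offset] += count
    return totals


def mine_in_parallel(records, row_chunk=2, col_chunk=2, workers=4):
    """Split the data, mine every block concurrently, then recombine."""
    if not records:
        return []
    blocks = split_blocks(records, row_chunk, col_chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(mine_block, blocks))
    return recombine(results, len(records[0]))


data = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]
feature_counts = mine_in_parallel(data)
```

Threads are used only to keep the sketch short. For CPU-bound mining in Python a process pool or separate machines would be the more likely choice.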
P2P for Bio-informatics
Bio-informatics data is characterised by its very large size and heavy computational demands. Parallel and grid-based distributed processing has been widely used to address these demands. Peer-to-peer processing is an alternative avenue, but data distribution, collection and load balancing can be problematic. This project will examine existing P2P solutions, and then use advances in P2P database implementation to inform new P2P solutions for bio-informatics processing.
Prerequisite knowledge: A very strong technical understanding. Ability to read technical and academic papers. Ability to deploy and use highly unstable research software. Only suitable for academically strong students.
Visualisation of the Genetic Algorithm inside an LCS
Learning Classifier Systems are a form of Machine Learning which utilise a Genetic Algorithm. Their operation has been described in other project ideas given above. One problem with LCS is that although relatively small in code size, they are very complex algorithms and so it is very difficult to identify the progress of the algorithm in learning the production rules to solve a given problem. What is really needed is a toolset of visualisation techniques that will enable the user to call up a number of views of the operation of the LCS and the progress in finding the result. This project will seek to provide these views as plug-in components to an existing Java based LCS implementation.
Prerequisite knowledge: You will need good Java programming skills, and an ability to read academic papers.
CMS over an XML Database
Joomla is a very popular CMS, using PHP and MySQL as a back-end. XML is an ideal format for storage and manipulation of data for display within a CMS. This project will use the strength of Joomla's interface and information structuring concepts, with the convenience of a native XML database, such as eXist, to produce a more usable XML CMS than is currently available. Key decisions will be how to structure the stored data, how much user customisation to permit with the additional representational power of XML, the compromise between templating, XSLT and fixed format, and the division of labour between PHP scripting and XQuery/XSLT transformations. It would also be useful if an abstraction layer were created so that any XML database can be used.
Prerequisite knowledge: A strong programming background with excellent conceptual skills, achieving 2:1 or 1st class performances.
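The abstraction layer suggested for this project could take roughly the following shape. This is only a sketch of the design idea, written in Python for brevity rather than the PHP the project would use. The XmlStore interface, the InMemoryXmlStore back-end and render_titles are all hypothetical names, and the in-memory store merely stands in for a real database such as eXist; no eXist API is shown or implied.

```python
from abc import ABC, abstractmethod
import xml.etree.ElementTree as ET


class XmlStore(ABC):
    """Hypothetical interface the CMS codes against, whatever the database."""

    @abstractmethod
    def put(self, doc_id, xml_text):
        """Store an XML document under the given identifier."""

    @abstractmethod
    def query(self, doc_id, path):
        """Return the text of every element matching `path` in the document."""


class InMemoryXmlStore(XmlStore):
    """Stand-in back-end that keeps parsed documents in a dictionary."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, xml_text):
        self._docs[doc_id] = ET.fromstring(xml_text)

    def query(self, doc_id, path):
        # ElementTree supports only a limited XPath subset, which is
        # enough for this illustration.
        return [el.text or "" for el in self._docs[doc_id].findall(path)]


def render_titles(store, doc_id):
    """CMS-side code: depends on the XmlStore interface, not a back-end."""
    return ", ".join(store.query(doc_id, ".//title"))


store = InMemoryXmlStore()
store.put(
    "home",
    "<page><article><title>News</title></article>"
    "<article><title>Events</title></article></page>",
)
listing = render_titles(store, "home")
```

Supporting another XML database would then mean writing one more XmlStore subclass, leaving functions such as render_titles unchanged. Which query language the interface should expose is one of the design decisions the project would need to make.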