CS433H/533 projects

In the case of graduate students, this course requires a substantive project. Project work is also required of Honors students. Other undergraduate students may substitute project work for assignments by choice, subject to approval by the instructor; approval will require, at a minimum, good performance on the standard assignments so far.

I strongly prefer research oriented projects. By this I mean projects which support some research effort, whether it is your own, the vision lab's, or someone else's.

Part of the motivation for the current structure is to allow graduate students to spend more of their time on research, and less of their time on canned assignments. Thus, if your project is research oriented, I will be very flexible on what the project is. Ideally, there will be images involved, but beyond that, I am open to a wide scope of projects.

Below I have begun a list of project suggestions. It is not necessary to choose a project from the list. They are only suggestions.

All project plans need to be finalized with me.

Group projects are encouraged. Honors students may work with grad student teams if they wish.

Because the projects are meant to be research oriented, it will be assumed that others in vision are free to use what you have developed once you are done with it. If, for any reason, this is not the case, then you need to let me know in advance. This will exclude you from doing many (if not all) the suggested research projects, as these are mostly parts of larger projects. Thus, if you have any reservations about releasing your work, you should work with me to find a suitable project. Other than restricting the choice of projects somewhat, there is no consequence for choosing not to release your work, and you do not need to give a reason for your choice.

Publication potential: Most of the projects described have the potential to become part of published work. Publication likely requires that someone (possibly you) continues the work after the course is over (this is not required!). If something you work on as part of a project becomes a substantive part of a research endeavor, I will ensure that you get credit where appropriate. If you have any questions on this, please come talk to me.

Scope: I am willing to consider a wide range of projects related to graphics, including many which are related to computer vision, image processing, bioinformatics, human-computer interfaces, and image databases.

Context: As these are research oriented projects, there are no limitations on using any OpenGL facilities, any code and/or tools available on the internet, or any ideas already published. A literature search is always a good idea. Someone may have already solved your problem better than you were about to, in which case it is best to work from that point. The idea is to go from the current state and do something new.

Platforms: For most projects, it is expected that the platform will be Linux. However, if there is a compelling reason to use another platform, then this should be argued for early on and settled by the proposal due date. Projects using the stereo display will need to compile and run on the specific machine used to drive that display. It is highly recommended that you use some form of conditional compilation to make your code compile, run, and do something reasonable when used on machines with ordinary displays (such as the graphics machines). This will make development easier.
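
For example, a minimal sketch of such conditional compilation might look like the following (the STEREO_WALL macro, the eye offsets, and draw_scene are all made up for illustration; use whatever conventions your project settles on):

    #include <GL/gl.h>

    /* Placeholder for whatever the project actually renders; eye_offset
       shifts the camera horizontally to form the stereo pair. */
    static void draw_scene(double eye_offset)
    {
        (void)eye_offset;    /* ... set up the view and draw here ... */
    }

    void display(void)
    {
    #ifdef STEREO_WALL
        /* Built with -DSTEREO_WALL on the machine driving the stereo
           display: render once per eye into the quad-buffered stereo
           back buffers. */
        glDrawBuffer(GL_BACK_LEFT);
        draw_scene(-0.03);
        glDrawBuffer(GL_BACK_RIGHT);
        draw_scene(+0.03);
    #else
        /* Ordinary display (e.g., the graphics machines): one monocular
           view, so the same source compiles and runs anywhere. */
        glDrawBuffer(GL_BACK);
        draw_scene(0.0);
    #endif
    }

Building with -DSTEREO_WALL only on the stereo machine keeps a single source tree working everywhere.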



Time-line

Oct 5: By this time you have discussed your project with me, at least by E-mail. You have some ideas of what you want to do and who you will be working with.

Oct 21: Proposal

By this time you have chosen your partners and a project. A short project proposal is due (via E-mail). Depending on how much contact you have already had with me regarding what you are doing, it can be very short.

The schedule is designed to help you get through the assignments quickly so that you can focus on your project for the last month of the term. However, you need to make a good start on the project well before then. There will likely be much research and thinking that should be in place before the last assignment is due. Furthermore, if you are using the stereo wall, or any novel software or system, I expect that you will have written small "practice" programs for that environment.

Nov 25: First project review due (worth a modest part of the grade).

What you need to provide is arguably project dependent. However, I suggest approaching it in the following way. Your project has an implementation component and a written component. The written component can be viewed as a draft of a paper, and the project reviews as iterations on that draft. Your proposal was the first draft. On each iteration, you go from writing about what you were going to do to writing about what you have done. For many projects, images are appropriate, and are an excellent way to help tell the story.

I have made the turnin key cs433_project. Formats: PDF is fine, or HTML, or even text. Images are good, either in the PDF document, pointed to by the HTML, or just in the directory, but referred to in your text.

It may help you to think about what the purpose of the review is. First, it should convince me that you are on track. Second, it should help you clarify your thinking. Third, in some cases, some of what you write can be considered an early draft of parts of a paper, and thus you can get feedback when it is most effective (early on).

Dec 16 (after exam): Project demos/presentations: (worth a modest part of the grade).

What has worked in previous classes is to have project presentations after the exam (i.e., from around 4:30 onwards), followed by a session in the pub. If this time is not going to work for you, let me know so we can make other arrangements (which, by necessity, means that you will have to present your project sooner than the others).

Dec 16 (day of exam): Final code and write-ups due (hard deadline -- I need to have EVERYTHING before I do my grades on Dec 17).



Sample projects

These descriptions are necessarily incomplete and likely somewhat mysterious. If a project seems interesting to you, let me know, and I will set up a meeting for all those interested in that direction.

I will update this list over the next few weeks as I think of more ideas and see flaws in existing ones.


Where was this picture taken? (approach one)

(Ongoing vision lab work)

This project has a strong computer vision / image processing component.

Given an image, there are many applications for having some idea of where it was taken. We would like to develop methods to constrain the possible locations. This can be used in conjunction with other information (e.g., a historical picture taken "near Tucson") to perhaps identify a small set of candidate locations.

This project is in collaboration with PhD student Scott Morris, who will work with the project team as needed. Some of the existing code was written by Alan Morris. The skyline matching part was worked on by James Judd. Most of the preliminary work is already done, and completing it may simply involve checking and repackaging existing code.

We begin by putting topography (DEM) data into an appropriate 3D model. We then texture map it with satellite photo data, which gives us the ability to render a view of any place for which we have data (such as the continental USA). Then we take a photograph taken at a known location with known (or estimated) camera parameters and project it onto the view. Ideally this will work on both ordinary 2D displays and the 3D stereo display. Having done this, we will create some kind of correlation measure for the degree of match (the TA has experience with this, and should be consulted for help).
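
As a rough sketch of what such a correlation measure might look like, here is plain normalized cross-correlation over the pixels where both images are valid (the masking convention and the choice of measure are assumptions; the actual measure should be settled with the TA):

    #include <cmath>
    #include <vector>

    /* Sketch of a match score: normalized cross-correlation between the
       photograph and the rendered view, over pixels valid in both
       (e.g., excluding sky/background in the rendering). Returns a
       value in [-1, 1]; higher means a better match. */
    double match_score(const std::vector<double>& photo,
                       const std::vector<double>& rendered,
                       const std::vector<bool>&   valid)
    {
        double sum_p = 0, sum_r = 0;
        int n = 0;
        for (size_t i = 0; i < photo.size(); i++)
            if (valid[i]) { sum_p += photo[i]; sum_r += rendered[i]; n++; }
        if (n == 0) return 0.0;
        double mean_p = sum_p / n, mean_r = sum_r / n;

        double cov = 0, var_p = 0, var_r = 0;
        for (size_t i = 0; i < photo.size(); i++)
            if (valid[i]) {
                double dp = photo[i] - mean_p, dr = rendered[i] - mean_r;
                cov += dp * dr;  var_p += dp * dp;  var_r += dr * dr;
            }
        return cov / (std::sqrt(var_p * var_r) + 1e-12);
    }

A score near 1 means the rendered view and the photograph agree well (up to brightness and contrast), so candidate locations can be ranked by this score.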

The real meat of the project is to take a photo and develop algorithms and data structures which will allow us to identify good candidates for where the picture was taken within the 3D world constructed above. A purely brute-force approach is out of the question. Also available from previous projects is the idea of finding the skyline in the photo, pulling out the corresponding skyline from the DEM data, and focusing on matching the two. Conceivably, this could be used to identify a manageable set of candidates for full matching.
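
To make the skyline idea concrete, here is one deliberately simple formulation (an illustration only, not a description of James's existing code): represent each skyline as horizon elevation angles sampled at regular azimuth steps, and score a candidate DEM location by the best circular alignment of its skyline against the photo's skyline.

    #include <vector>

    /* Each skyline is a vector of horizon elevation angles (radians),
       sampled at regular azimuth steps; the DEM skyline covers the full
       360 degrees, the photo skyline only the camera's field of view. */
    double skyline_distance(const std::vector<double>& photo_sky,
                            const std::vector<double>& dem_sky)
    {
        size_t n = dem_sky.size(), m = photo_sky.size();
        double best = 1e300;
        /* The photo's heading is unknown, so try every azimuth offset
           and keep the best (smallest) sum of squared differences. */
        for (size_t off = 0; off < n; off++) {
            double ssd = 0;
            for (size_t i = 0; i < m; i++) {
                double d = photo_sky[i] - dem_sky[(off + i) % n];
                ssd += d * d;
            }
            if (ssd < best) best = ssd;
        }
        return best;
    }

Locations with small skyline distance would then form the manageable candidate set mentioned above.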

Scott has a nice data set for thinking about this problem, namely 900 pictures taken at known locations along a specified route.

A complete solution to the meat of the project is likely beyond the scope of what can be done as a term project, even with several motivated students working together with Scott and Quanfu. A plausible target is to: a) solidify the infrastructure and user perspective in both 2D and 3D; b) have the capability of doing a brute-force solution; and c) make a start on a pruning approach, perhaps based on the skyline (carrying on James's work).


Where was this picture taken? (approach two)

(See approach one for additional background).

Another approach to help identify where a picture was taken is to match features against an existing database of images which have some geographical information attached, if only a place name (which can be translated into geographical coordinates using the Alexandria on-line gazetteer). Conceivably, easy targets like "Yosemite" or "Grand Canyon" can be handled without too much effort. A key problem is that the camera angle and zoom of the new photograph will differ from those of the image in the database. However, recent developments in the vision community have made this kind of matching feasible, and we have experience with one method for doing so (Kate Taralova can help with this). If you are interested in this approach, you should see me for some papers.

This project will thus involve sorting out a redundant image database with Scott's and/or Quanfu's help, trying out the affine matching code, and developing some notion of how well we can expect to do and what the problems are. Depending on how many people are working on it, and what problems we run into, we may consider reimplementing the affine matcher (or another affine matcher) in order to get reasonable computational performance on a large data set, and/or improving matching performance through more sophisticated consideration of color or other means. Integration with the Alexandria gazetteer is also a possible direction. A good user interface is always a plus. In the long run, the best solution may be a Java-based web client talking to a C-based server.


Virtual Mountain Bike Riding

This project is oriented to the 3D wall.

This project is closely related to the first "where was this picture taken" project above, and will obviously share a great deal of infrastructure, which will be developed jointly. I am open to suggestions regarding different ways to slice and dice the geographically oriented projects.

We again want to texture map DEM (Digital Elevation Model) data, either with satellite photos, or with photos taken by people as they hike / bike around. We have a very good start on this part of the project due to work done last term by James Judd and by Alan Morris the previous term. Alan developed a program to project mountain bike trails onto the 3D terrain model. The goal of this project is to add photo-based texture mapping to the model, and to provide a "view" from the perspective of the rider on the trail. There is likely some prior work of this sort, and a literature search should be done so that you can put the project in the context of what has been published. The meat of the project is less publishable than that of some of the other projects, but it will likely support other research projects such as the "where was this picture taken" work, and the digital trail library work that Scott is doing for his thesis. And it is cool!


Tracking people in the dark (could support several sub-projects)

(Ongoing vision lab work)

This project has a strong computer vision / image processing component. For virtual reality applications we need to have camera systems which know the configuration of the people in the space (e.g. which way you are looking, what direction you are pointing) from camera data. For artistic VR, we need to do this in the dark.

Students in previous courses have made some progress in this endeavor, but there is still much to be done. Kate Taralova is currently working on some of the issues as an undergraduate RA and will play a role in this project.

Some tasks include: (a) developing a calibration procedure for the cameras and characterizing the error (already well under way), and modularizing the calibration tasks; (b) integrating information from multiple cameras to improve the background model; (c) integrating a face finder (Quanfu has some experience here, and you should talk to him about this); (d) integrating stereo matching, especially with structured light; (e) integrating the affine matcher to estimate face pose; (f) tracking body position in general, and hands in particular.


Browsing image data in 3D

Quite likely Honors student John Bruce will take part in this project.

We have developed novel methods to browse and search image data. We have built several 2D systems for browsing such data, and the last time this course was offered, a group did a project on browsing such data in 3D using the stereo display wall. However, we have a long way to go before we understand what 3D is good for in this kind of application.


Visualizing complex statistical data (preferably in 3D)

This is similar to the image data browsing project above, except that the data is now any complex data of interest (perhaps from your research).


Illumination models for fluorescent surfaces

There has been surprisingly limited work on fluorescent surfaces in graphics, and effectively including them in a renderer would be a great project. (However, I have not searched the literature recently, so doing that search would be the first step.)

I have some measured data for fluorescent surfaces and an approach to modelling it which may be useful in graphics (I have already used it in computer vision). There are a few papers that need to be read so that we build our model in the context of what others have done. Given the model, you would render some interesting scenes containing fluorescent surfaces.

Work on the radiosity equations in this paradigm would also be very interesting.


Fitting illumination models to web cam data

The first thing to do is to find, say, 10 web cams from which you can download data with a cron job (the more the better). You will thus collect image sequences, taken over a number of days, which are of exactly the same thing except for occasional human activity (someone walking in front of the camera). The other big difference among the images is the illumination. What you want to do is fit a model which explains the data at each pixel in terms of the illumination and what is in the world.

Having done this, there are a number of things that could be done, although the extent to which any of them need to be done in the context of the term project depends on how many people are working on it, and how the first part went. Two things are of interest: a) you should now be able to illuminate the scene differently (relight it); b) since we envision a number of models (one for each web cam location), we will have information about the statistics of shadows, which is very useful for ongoing vision work.
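
As a concrete (and deliberately simplistic) version of such a model, one might assume each frame k has a single global illumination factor and each pixel i a fixed "world" value, so that pixel(i, k) is approximately albedo(i) times light(k), and fit by alternating least squares. The real model will surely need to be richer (color, sky versus surfaces, shadows), but this sketch shows the structure of the fit:

    #include <vector>

    /* Sketch: images[k][i] is pixel i of frame k (grayscale, frames
       with human activity already discarded or masked). Model
       images[k][i] ~ albedo[i] * light[k] and fit by a few rounds of
       alternating least squares. */
    void fit_illumination(const std::vector< std::vector<double> >& images,
                          std::vector<double>& albedo,  /* per pixel */
                          std::vector<double>& light)   /* per frame */
    {
        size_t K = images.size(), N = images[0].size();
        albedo.assign(N, 1.0);
        light.assign(K, 1.0);

        for (int iter = 0; iter < 20; iter++) {
            /* Best light[k] given albedo: a least-squares slope. */
            for (size_t k = 0; k < K; k++) {
                double num = 0, den = 0;
                for (size_t i = 0; i < N; i++) {
                    num += albedo[i] * images[k][i];
                    den += albedo[i] * albedo[i];
                }
                light[k] = num / (den + 1e-12);
            }
            /* Best albedo[i] given light, symmetrically. */
            for (size_t i = 0; i < N; i++) {
                double num = 0, den = 0;
                for (size_t k = 0; k < K; k++) {
                    num += light[k] * images[k][i];
                    den += light[k] * light[k];
                }
                albedo[i] = num / (den + 1e-12);
            }
        }
    }

Relighting (point (a) above) then amounts to keeping the albedo image and substituting a different light value.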

There is some prior work on this sort of thing. Reading a few papers and a literature search for new papers is in order.


Manipulating soft (DXA) X-ray data and linking it with MRI data

This project requires confirmation of data availability, which I can work on if there is interest.

This project is the exploratory analysis for a big and very important project.

The main idea is to use DXA (soft X-ray) data to measure muscle mass loss in elderly women. The strategy is to warp/align DXA data to MRI data (the standard). This then will support building a classifier or a statistical model so that muscle mass can be estimated using inexpensive/portable DXA imaging instead of expensive MRI imaging. This second part is likely beyond the scope of the term project.

The first concrete task will be creative hacking to reverse engineer the data format produced by the DXA machine (and/or dealing with ancient code and docs). Some people in the vision group have done some legwork on this, and may be useful as resources. Having done that, image processing could be applied to a variety of tasks such as identifying the skeleton. Then we will want to align multiple DXA images (same person, different times) as well as possible so that differences can be tracked over time (a minimal sketch of the simplest form of such alignment follows below). Possibly having done that, the data will need to be warped/aligned/mapped to the MRI data.
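
For illustration only, here is about the simplest possible alignment step: translation-only registration by exhaustive search over small shifts, scoring each shift by the mean squared difference over the overlap. Real DXA-to-DXA (let alone DXA-to-MRI) alignment will need rotation, scaling, or a full warp, but the structure -- search over transforms, score the overlap -- stays the same:

    #include <vector>

    struct Shift { int dx, dy; };

    /* Translation-only alignment of two same-size images (a fixed, b
       moving) by exhaustive search over shifts up to max_shift, scoring
       each shift by mean squared difference over overlapping pixels. */
    Shift align_translation(const std::vector< std::vector<double> >& a,
                            const std::vector< std::vector<double> >& b,
                            int max_shift)
    {
        int h = (int)a.size(), w = (int)a[0].size();
        Shift best; best.dx = 0; best.dy = 0;
        double best_score = 1e300;
        for (int dy = -max_shift; dy <= max_shift; dy++)
            for (int dx = -max_shift; dx <= max_shift; dx++) {
                double ssd = 0; long n = 0;
                for (int y = 0; y < h; y++)
                    for (int x = 0; x < w; x++) {
                        int yy = y + dy, xx = x + dx;
                        if (yy < 0 || yy >= h || xx < 0 || xx >= w)
                            continue;
                        double d = a[y][x] - b[yy][xx];
                        ssd += d * d; n++;
                    }
                double score = ssd / (double)n;  /* normalize by overlap */
                if (score < best_score) {
                    best_score = score;
                    best.dx = dx; best.dy = dy;
                }
            }
        return best;
    }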

This project will need to be started soon in order to verify that we can actually extract the images from the data file. If this proves too difficult, and we cannot think of an alternative strategy, a switch to a different activity may be in order.