What I missed at the APS March Meeting

April 13th, 2012 by

Late in February and early March, while the others were enjoying the IPS Meeting in Singapore, I headed off to IPS’s rather larger cousin, the American Physical Society’s March meeting in Boston.

The March meeting is by far the largest physics conference I have ever been to, with more than 7000 papers presented throughout the week spread across 50 parallel sessions. At this stage, quantum information is the largest and fastest growing topical group within the APS, and will hopefully become the 15th division of the APS within the next few years (you can help it get there sooner by joining GQI). With that in mind, it should not come as a surprise that for most of the conference there were parallel sessions on quantum computing and information. Hopping between sessions I managed to miss IBM announcing their latest results on superconducting qubits (see arXiv:1202.5344). It seems they are getting close to the threshold for fault-tolerance, which would make things very exciting in the field, since superconducting qubits are inherently more easily scalable than things like ion traps or linear optics quantum computing. Previously I had thought the route to scalable quantum computing would involve optically coupling a bunch of few qubit systems, with solid state systems coming only later, but recent advances with superconducting qubits are making me start to doubt this belief.

A hawker tour

March 5th, 2012 by

My father is coming to visit Singapore next week.  Of course one thing I have to do is take him to sample some of the great food at the hawker centres here.  Say we are very ambitious and want to visit 18 of the best hawker centres on the island…in order to intelligently engage in the raging debate on which stall has the best char kway teow.  As my Dad’s stay is limited, we want the shortest tour visiting all these hawker centres.

Here is a google map of the 18 chosen hawker centres.  What do you think the shortest tour is?

18 of the best Singaporean hawker centres: ABC brickhouse, Amoy St, Changi Village, Chinatown Complex, Chomp Chomp, East Coast Lagoon, Geylang Serai, Ghim Moh, Golden Mile Complex, Hong Lim, Lau Pa Sat, Maxwell Rd, Newton, Old Airport Rd, Tampines Roundhouse, Tekka Centre, Tiong Bahru, Whampoa

Read the rest of this entry »

Buy your experiment at Kelantan Lane

February 29th, 2012 by

Singapore is known to be a shopping mecca. But did you know, that you can buy the better part of your physics experiment in just one afternoon? No, don’t rush over to Orchard Road. You won’t find much useful stuff there. Better go down Kelantan Lane.

Street sign of Kelantan Lane

If you have never heard about Kelantan Lane, let me assure you: It’s awesome. Need some really weird screws for you latest creation? Kelantan Lane. Need to connect the piping for water cooling your latest gadget to some exotic fitting? Kelantan Lane. Need to get raw materials really quickly? Kelantan Lane. And where can you finish your shopping experience with some outstandingly good Laksa? You are right, Kelantan Lane.

If you have never been there, I suggest you take a stroll there. Kelantan Lane is a uniquely Singaporean area. It’s a cluster of lots and lots of small shops, all highly specialized in some kind of product. In fact Kelantan Lane is for experimentalists what Orchard Road is for the shopaholics. It is so nice, we regularly take visitors down there, and they love it.

Interested to find out more? Take a look at the impression below from one of our latest shooping sprees. Or contact me if you want to join our next one. To help you if you are planning a trip of your own, we compiled some of our favorite shops in the google map below. If you do know other hidden gems in the Kelantan Lane area, drop us a line and we will add it to the map.

Read the rest of this entry »

When physicists on the little red dot gather

February 24th, 2012 by

So here it is, the meeting of all physicists on the little red dot. Or as it is officially known, the IPS Meeting 2012, this year’s edition of the annual meeting of the Institute of Physics Singapore. As you can see from the program, the IPS Meeting is closely modeled after the annual meetings of bigger physical societies such as the APS in the United States of the DPG in Germany. There are exhibitors, plenary talks, invited talks, contributed talks and plenty of space for poster sessions.

So much for the theory. But will it fly? Isn’t the local physics community to small? Aren’t the topics to widespread? Isn’t there culture too different from the States or Europe? The answer to that is a clear no. Wandering through the halls of the conference venue during the coffee and lunch breaks, you will see that there is a vivid exchange of ideas. And seeing how the conference brings together physicists from all over Singapore, you are easily let to the conclusion that the IPS Meeting has no reason to fear comparison to its bigger cousins. Except for the fact that the whole conference comfortably fits into three seminar rooms and the connecting hallways.

Discussions at the IPS Meeting

Poster Session of the IPS Meeting 2012

Irrationality of reviewing

February 23rd, 2012 by

I have to finish reviewing two papers today.  Posting about the irrationality of reviewing seems like a great way to procrastinate.

Why do we review papers?  As pointed out in the boycott against Elsevier, with the big commercial publishers, reviewers are doing free work for very profitable companies.  Could it actually be that we are more likely to do reviews because they are not paid?

In Predictably Irrational, Dan Ariely distinguishes actions we undertake as part of social norms versus market norms.  When we help a friend move a sofa, we would probably be offended if he offered to pay us something at the end—this action takes place in the social realm, not the market realm.  Ariely did experiments (with college students of course) where he found subjects worked harder when asked to do a simple task as a favor to the experimenter than when they were paid some nominal amount.  Similarly, he relates the story of a daycare center that found that the tardiness of parents picking up their kids increased when the center started imposing fines for lateness.  Before the fines, parents avoided being late as a courtesy to the teachers who they were keeping from going home.  With fines this moved to a market interaction, where you could pay to be late.

Reviews largely work in the realm of social norms.  Many review requests I receive are from colleagues I know well, making them hard to turn down.  It is much easier to say no to form letters from people I do not know, and it would be similarly easier to say no if it became a (low) paid job rather than a favor.

While on the topic of reviewing irrationality, I remember an old post of Lance Fortnow that when evaluating results in a paper, sometimes less is more.  He related the story of a student who had a paper rejected from STOC, then removed one of the two main theorems and won best student paper at FOCS.

Being on a pop economics kick, I just finished reading Thinking, Fast and Slow by Daniel Kahneman, who discusses this very “less is more” phenomenon.  This book was very enjoyable, full of interesting examples of how real human behavior differs from that of the Homo Economicus usually assumed by economists.

Kahneman describes an experiment where subjects are asked to put a price on a set of dinnerware.   Set A has 24 plates and bowls.  Set B is set A with the addition of 8 cups and 8 saucers, a few of which are broken.

In single evaluation, where subjects are presented only one of set A or set B, set A was valued at $33 while set B was valued at $23.  In joint evaluation, where both sets are presented simultaneously, subjects acknowledge that set B can only be better than set A, and valued set B at $32 and set A at $30.

I think the implication for paper submission is clear: if in doubt, submit multiple versions and force your reviewers to behave rationally with joint evaluation.

An open revolution

February 8th, 2012 by

We are in the midst of an open revolution in science. This is a revolution towards doing research in the open, using online tools to share unfinished ideas and allow massive collaboration, but also an open revolt against the traditional model of publishing those finished ideas, in particular against the publishing giant Elsevier.

In his book Reinventing Discovery, Michael Nielsen (co-author of The Book in quantum information) is an enthusiastic chronicler of the rise of open science, and inspiring advocate for its success.  Some of his examples were familiar to me, like the arXiv or the polymath project.  But there were also some neat examples I had not heard of, like the Galaxy Zoo, where anyone can help classify galaxies, FoldIt, where users compete video game style to come up with low-energy configurations of proteins, or Kasparov vs. The World, where 50,000 people democratically played chess against Kasparov, doing much better than expected.

One example I was surprised not to find in the book was the stack exchange family of question and answer websites, mentioned previously on this blog.  Sites like math overflow or cs theory stack exchange, where people can ask research-level questions and expect to get informative answers in a matter of hours, have quickly become very useful research tools.

The end of Nielsen’s book is a call to arms, with suggestions of what scientists, programmers, citizens can do to develop open science, and also some fantasies of what open science might look like in the future.  One such idea I found quite cool was a sort of stack exchange meets eHarmony service.  This tool would automatically pair questions with respondents, based on the respondents expertise and questions they have successfully answered previously.  As Nielsen describes it, every morning a researcher would receive an email with a list of ten requests related to their expertise, which they could then choose to answer or not.

What I personally find the most exciting in the open science revolution is the development of open education.  The Khan academy deserves a lot of credit for innovation here, showing that students often prefer a short video that can be paused and replayed to a live lecture.  They now have over \(10^8\) downloads and 2600 videos.

Sebastian Thrun, who co-taught the Stanford online artificial intelligence class last semester that attracted 160,000 students, recently gave a talk about the experience which I think is well worth watching.  Thrun says that this experienced changed his life, and the talk is surprisingly heartfelt and emotional.  He also has some harsh words for the current university teaching model: “Miraculously, professors teach today exactly the same way they did 1000 years ago.  University has been, surprisingly, the least innovative of all places in society.”  Nielsen also addresses this point in discussing the low rate of participation of researchers in Wikipedia.

Thrun further says that after teaching 160,000 students (of which 23,000 completed the course), he can’t go back to the old way of teaching the 200 in his normal Stanford classes.  He gave up tenure at Stanford in April of last year and now has began a startup, udacity, focused on producing online classes.  Two classes are on tap for the first offering, CS101 going from zero programming experience to programming a web search engine, and CS373, programming some of the algorithms under the hood of a driverless car.

Postdoctoral Position in Quantum Computation

February 2nd, 2012 by

I’m currently looking for a postdoc to work with me here in CQT. The start date is flexible, and I’ll be considering applications until the position is filled. See below the fold for details.

If your thinking to yourself “Sure, CQT is great, but why would I want to travel so far?”, I’ll just point out that the forecast for today is 25-31°C.

Read the rest of this entry »

Blind quantum computing: An introduction

January 20th, 2012 by

Cross-posted from my personal blog.
Blind quantum computing - as illustrated by my lovely wife Liu Jia
Since I have a new paper out today in Science, which seems to have attracted the interest of a fair number of non-physicists, I thought I would take the time to explain the paper: what it is we do, and the background behind it.

The paper itself is an experimental implementation of a cryptographic scheme designed to allow a user to run a calculation on a remote computer (server) in such a way that the users input, the operations performed and the output all remain hidden from the server. In our case we care about running a computation where the remote computer is a quantum computer, but the user has access to a much more limited machine. With that in mind, I thought I might first explain the protocol before talking about the experiments themselves.

The current experiments are based on Universal Blind Quantum Computing, a protocol proposed by Anne Broadbent, Elham Kashefi and myself a few years ago (illustrated above).

In quantum computing, the basic unit of information is the quantum bit (or qubit for short). This is exactly the structure you get if you take a classical bit, which can be 0 or 1, and try to represent it as a quantum system. Quantum mechanics allows for a systems to exist in superposition. This is essentially a class of non-classical states which exists in between classical states. This means that the qubit can exist in various superpositions of 0 and 1 at the same time. Each quantum state is can be described in terms of classical states, associating two numbers with each: an absolute amplitude (between 0 and 1) which when squared gives you the probability of finding obtaining that classical state when you measure the system, and a complex phase which governs interference effects. As it turns out, this means that we can represent the state of an (unentangled) qubit very simply, as simply the point on the surface of a sphere, known as a Bloch sphere (see below).

The Bloch sphere

The classical states 0 and 1 are the top point and bottom point of the sphere, while the other states are in superposition. The lower down you go, the higher the probability of obtaining a 1 instead of a 0, with the points around the equator representing states where there is an equal probability of measuring 0 or 1.

Since we imagine this in 3 dimensional space, it makes sense to assign an X, Y and Z axis to the ball. By convention, we take the Z axis to pass through both 0 and 1, and to put the point (0,0,0) at the centre of the ball. This gives us a very easy way to visualise the basic operations of quantum computing which can be performed on a single qubit: they are simply rotations of the ball, and indeed it turns out that it is enough to consider rotations about only the X, Y and Z axes. If you can do these, you can make any more complex operation as a sequence of (3) rotations about these axes. These operations are basically a generalisation of the NOT operation a classical computer can perform, which takes 0 to 1 and 1 to 0.

As with a classical computer, we also need some way to affect one qubit dependent on the state of another qubit. As it turns out, an operation called a CZ gate is sufficient for this. The CZ gate rotates one qubit 180 degrees about the Z axis if the other qubit is in state 1, and does nothing if it is in state 0. Conveniently, this gate is symmetric, so it has exactly the same net effect independent of which qubit you designate to be rotated.

The way our protocol works is based heavily on a result which says that you can perform any computation (classical or quantum) in the following way:

  1. Prepare a sufficiently large number of qubits in the state which lies on the equator of the Bloch sphere (which specifies its absolute amplitudes) and has phase +1. I will refer to this as the + state, since in the notation most physicists use it is written as \[\frac{1}{\sqrt{2}}\mid 0 \rangle + \frac{1}{\sqrt{2}}\mid 1 \rangle\]
  2. Arrange the qubits into a regular 2D lattice (such as a square grid).
  3. For every qubit i in order:
  • Rotate i about the Z axis by some angle which depends on exactly which gate you want to perform. Note that this angle can depend on the previous measurement result.
  • Rotate i 90 degrees about the Y axis.
  • Measure whether i is 0 or 1.

We noticed that if Z rotations before and after the CZ operations have exactly the same effect, and that a Z rotation before the CZ is identical to simply preparing some other state chosen from the equator of the Bloch sphere. The idea is then to get the server to do all off the hard parts: applying the CZ gates, doing the Z rotations and making measurements, while the client only does the easy parts: computing the angle for the measurement in each round, and preparing single qubits (this last part may seem like a physically hard operation to perform, but in fact the client needs little more than a very weak laser and a pair of 3D glasses from the cinema to accomplish this if the server is sufficiently powerful).

If you look at the list of operations that the server is expected to perform, only the Z rotations actually depend on the computation being performed. This is because it is possible to choose a fixed arrangement for the qubits (the lattice referred to earlier) which will work for any computation you may wish to perform. The lattice has only 2 parameters, length and breadth, which are analogous to the time and memory the computation uses. As these parameters can be artificially inflated, they are not revealing anything particularly useful about the computation. So only the Z rotation reveals information about the computation. However, if the user prepares the initial qubits in random states on the equator of the Bloch sphere, the net rotation the server needs to perform must incorporate both the undoing of this initial random rotation and rotate by the appropriate angle required to correctly implement the desired calculation. This means that, depending on the initial random state chose (of which he is not aware), for any fixed angle required to implement the users computation, the server is equally likely to be asked to perform a Z rotation through any angle. Thus the angle he is told is completely independent of the users calculation.

So none of the classical messages the user send reveal any information about the calculation, but what about the qubits? Can’t they be measured to reveal information about their random rotation? If so, then the entire computation could be infered by the server from a combination of the classical and quantum messages sent by the user. Fortunately for us, the quantum mechanics come to the rescue. Suppose instead of choosing the inital state of the qubit and the angle sent to the server in such a way that the exactly implement the correct angle required to implement their desired computation, suppose they choose them randomly to either add up to the correct angle or add up to the correct angle + 180 degrees. What would happen then? Well, this would be the same as randomly choosing on of two opposing points on the Bloch sphere. No matter what two points you pick, it is impossible to determine either of them by making a measurement on such a system. This is because no matter what axis you pick to make a measurement along, you always have an equal chance of obtaining a 1 or a 0, independent o the choice of points. So if the user sends such randomized states to the server, together with measurement angles, there is nothing to be learned by any measurement of the system. Thus everything is perfectly hidden from a malicious server, even if they deviate from the protocol.

But doesn’t this randomization mess up the computation, resulting in nonsense? As it turns out, no, it doesn’t. This is because this extra rotation through 180 degrees is identical to flipping the measurement result the server gets when they measure the qubit. Since the measurement results are randomly flipped, the server can’t learn anything about the computation. However, the user knows whether or not the result has been flipped, and so can unflip it when they receive the result from the server. As long as they base their actions on these corrected results, the outcome of the computation is identical to if they had never occurred at all. Since the measurement results the server obtains are flipped, even for the last qubits to be measured, they have no knowledge of the outcome of the computation, though it can be trivially read by the user.

Since the server doesn’t know what computations are being performed, it is possible to set up traps to detect if the server deviates from the protocol, but this isn’t necessary to ensure blindness, and is beyond the scope of this blog post.

Now that I’ve explained the explained the theory, I’ll tell you a little about the experiment. To actually see our protocol become something real, we spent much of the last two years working with experimentalists Stefanie Barz, Philip Walther and Anton Zeilinger to find a way to implement the protocol using photons. Photons make perfect qubits, since they have two polarizations which can be used to encode a single qubit. In practical terms they would be ideal for a mature implementation of blind quantum computation, since photons can be relatively easily be transferred over large distances, allowing the user to be hundreds or thousands of kilometers distant from the server. Additionally, the group already had significant success in implementing the measurement based model of quantum computing on which our protocol is based.

Optical setup

The final set-up used (see above), like every other first demonstration, makes some compromises most notably in giving the client something a little more sophisticated than a pocket laser and 3d glasses, since this allowed us to simplify the server somewhat. None the less, it is a pretty faithful implementation of our original idea. With the 4 photonic qubits we had, we were able to hide all of the basic building blocks for a larger scale computation, meaning that we were able to demonstrate blind single and 2 qubit gates, which are sufficient to build an arbitrary computation out of. We were also able to hide instances of two quantum algorithms: the Deutsch-Josza algorithm and Grover’s algorithm.

Lastly, and most importantly, we were able to show that virtually no information could have been leaked by the apparatus. Although we have a theoretical proof, as outlined above, actual experimental apparatus aren’t as well behaved as our theoretical models, and so we need to show that our apparatus isn’t leaking information by introducing some subtle error on the quantum states. This may sound impossible, since how do you prove a negative? By running the experiment many times over for each and every possible choice user could make, we were able to use measurements made on different runs to infer the real state received by the server. This was possible because we knew exactly what the users choice of randomness was in each case, and so weren’t subject to the blindness which affects the server. With all of these measurements made, and the true states inferred, we applied a result known as Holevo’s theorem to calculate a maximum amount of information that could be extracted by a malicious server had they knowledge of the exact peculiarities of our experimental set-up. Philip and Stefanie are so good at their jobs, the system is pretty accurate, an the amount of information which could possibly have been leaked was no more than 0.169 of a bit. Even if the server could somehow learn something about the user’s desired calculation or random choices, the amount he could learn from measurements on the quantum state rises only slightly to 0.185 of a bit. In the case of these experiments, we were using 8 different initial states (i.e. 3 bits) for each blind qubit which is sufficient for any calculation, and so the amount of leakage is tiny. Not too bad for a first attempt, I think!

IPS Meeting 2012

January 18th, 2012 by

Take a look at Singapore on map and you will soon discover that you have to travel very far in order to attend any of the major physics conferences in the world. But despair not: Singapore will host it’s very own physics conference on February 23rd to 24th this year. The IPS Meeting 2012.

The IPS Meeting meeting is Singapore’s own mini version of the APS Meeting, the DPG Tagung or any other conference that aims to bring together all physicists working in a particular region. Last year’s IPS Meeting was great fun. And this year’s meeting is even to get better. Held at the Faculty of Science at NUS, the IPS Meeting 2012 will be the ideal occasion to connect with fellow physicists, present your latest work and catch up on what others are doing.

So if you are a physicists and in Singapore on February 23rd and 24th, submit your abstract by January 31st, polish your poster or slides and head over to the Faculty of Science at NUS for the IPS Meeting 2012.

Putting a face on non-negative rank

January 16th, 2012 by

Last time we talked about non-negative rank in the context of communication complexity.
Non-negative rank also has a more practical face, with applications in data analysis and
machine learning.  Today we will talk about its application to the representation of faces, a nice
change of pace as it involves lots of pictures.

The usual way of representing a digital image is to store values for each pixel indicating the
color of that pixel.  For this example, we will work with grayscale images of 5000 faces from the
faces in the wild dataset.  To get an idea, here are the first 100 images from the dataset.

First 100 images in the dataset

Each face is a \(32 \times 32\) pixel grayscale image.  Each pixel is represented by a number form
0 to 255, with 0 indicating total absence of light (black) and 255 full intensity light (white).  Thus
in total it takes \(32 \times 32 \times 8=8192\) bits, or a kilobyte, to represent each picture.

Representing images by their pixel values is very general, and we can represent any image this way.
Say that for our application, however, we are going to be working exclusively with images of faces.
This information already substantially reduces the space of possible images.  Indeed, if we were
to randomly choose the intensity of each pixel value, the image would probably not look at all
like a face, but like TV static  (though people do have a knack for seeing faces in
unusual places!).

A random grayscale image

If we also allow some reduction in quality of the image (lossy compression), we can have vastly more
efficient representations.  We will see recognizable approximations of these faces using only
100 bytes, a factor of ten reduction in information!

The main idea behind this is to use a different set of basis images.  In the normal pixel representation,
the basis images are single pixels, one for each point of the image.  We could instead use a different
set of basis images that are specialized to working with faces.  For example, our basis images could be
prototypical faces.  Or, we could come up with a small set of images of eyes, noses, mouths, etc. and
express each face by combining the most appropriate features.  To faithfully represent any image,
we would still need at least \(32 \times 32\) basis images, the same number as originally used in the
pixel representation.  Our hope, however, is that by using basis images carefully chosen to work with
faces, with only 100 basis images we can still reasonably approximate any face in our
dataset—that is, the contribution of the other \(32 \times 32 -100\) remaining basis images will be
very small.

Of course, we don’t want to have to construct this basis this by hand, but want some
automatic way of choosing a good set of basis images for our dataset.  Principal components analysis and
non-negative matrix factorization are two ways of doing this.  In the case of principal components analysis, the
basis images are known as eigenfaces.  Each face is represented as a linear combination of
these eigenfaces.  The eigenfaces are chosen in such a way that the first \(k\) eigenfaces can represent
the dataset with less error than any other choice of \(k\) basis images.  (The particular error measure
here is the average squared difference in pixel intensity between the original and the approximation.)
Thus if we want an approximate representation of our faces in terms of only 100 basis images, taking
the top 100 eigenfaces seems like a good bet.

This is not the end story, however.  With principal components analysis, the faces in our dataset are
represented as linear combinations of the eigenfaces, so in particular coefficients can be negative.
This means that we can subtract faces.  Have you ever heard anyone describe a face as Brad Pitt minus a bit of George Clooney?  (I get that all the time.)  Probably not—such a superposition is difficult to visualize, and it seems
unlikely that our brains process faces in this way.  It is much more common to hear that you have your father’s eyes and the mouth of your mother.  (And the curly hair of the mailman?)  This addition of disjoint parts is much easier to visualize.

Non-negative matrix factorization tends to give rise to basis images that more closely correspond to
parts of faces.  In a non-negative matrix factorization we again represent each face as a linear
combination of basis images, but now all coefficients in the linear combination are non-negative—we
can no longer subtract faces.  We can have such a non-negative factorization as all the data we
started out with—all the pixel values—are non-negative numbers.  When we restrict ourselves to 100
basis images, we are looking for the best approximation of all the images by a matrix whose
non-negative rank is 100.  In principal components analysis, when we restrict ourselves to 100
basis images, we are looking at he best approximation of all the images by a rank 100 matrix.

Illustration of matrix factorization

Here is a pictorial representation of matrix factorization.  We represent our dataset as a big \(5000 \times 1024\)
matrix.  The rows are indexed by faces, and each row gives the pixel intensities of the \(32 \times 32\)
image written as a long vector—we stack the rows of the image together, beginning with the first
row of the image, followed by the second row, etc.  In this example, in the \(i^{th}\) image the
\(j^{th}\) pixel value is 42.  Representing the faces as linear combinations of 100 basis images
amounts to (approximately) factorizing the matrix as the product of a \(5000 \times 100\) matrix
holding the coefficients telling the contribution of each basis image to a face, and
a \(100 \times 1024\) matrix giving the 100 basis images.  In the case of a non-negative factorization
the coefficient matrix will be non-negative.
You can judge the final results for yourself.  Here are the 100 basis images for our dataset arising from
principal components analysis and non-negative matrix factorization.

Top 100 eigenfaces

 

Top 100 non-negative basis images

 

And the final product, the reconstruction of the faces in terms of the 100 basis images.  First, from PCA.

Faces in terms of top 100 eigenfaces

And from the non-negative factorization.

Reconstructed faces from rank 100 non-negative matrix factorization

You will probably agree that the recovered images from the non-negative factorization are more
recognizable.  Unfortunately, there is a price to pay for this improvement.  For a \(m\)-by-\(n\) matrix
(in our case a 5000-by-1024 matrix), all the eigenfaces can be found in about \(\max\{m,n\}^3\) many steps.  On my laptop this took about 21 seconds.  Finding the best approximate non-negative factorization, on the other hand, is an NP-hard problem (shown by Vavasis).  In practice, a non-negative factorization is usually found by iterative methods that are not guaranteed to converge to the optimal solution.  I did 500 iterations of the algorithm suggested in this paper by Lee and Seung, which took about 5 minutes.

Another application of these techniques is to recommender systems.  You know, if you like this
you might also like that.  Matrix factorization played a big role in the winning algorithm for the
Netflix prize, a contest to give an improve the recommendation algorithm of Netflix.
Here the data can also be represented as a big matrix, now with rows indexed by users instead of
faces, and columns indexed by movies.  An entry of the matrix gives the user rating for that movie—if
the user has rated it.  The big difference between the setting of recommender systems and that
described above is that now there are lots of holes in the matrix, places where the value is unknown.
Indeed, the whole point is to fill in these holes and predict user ratings.  But, just as here we used
matrix factorization to come up with basis images which could well approximate our faces, in
recommender systems matrix factorization can be used to come up with features of
movies that do a good job of distinguish their preference among users.  In other words, instead
of coming up with factors by hand like, level of action, includes a big star, or humor and evaluating
the movie on each factor, these factors and evaluations are automatically generated just like the eigenfaces, and can often be similarly difficult to interpret.  This survey article by members of the Netflix prize winning team is a nice place to learn more.