The Lost Art of Thinking: January 2011

Sunday 30 January 2011

Molecular Random Tilings

I am still organising the seminars in our group every Friday. The last one was a very interesting talk given by Prof. Juan Garrahan, from Nottingham University. The title was the same as that of this post, Molecular Random Tilings. Although I don't have the version of the talk he gave, you can access a very similar previous version on his webpage through this link.

The idea is a very interesting and beautiful one. The chemical problem is related to an organic molecule which is called TPTC or p-terphenyl-3,5,3’,5’-tetracarboxylic acid. It has the form below:

These molecules are adsorbed onto a graphite substrate and bind together by means of hydrogen bonds in one of two possible relative configurations. After the deposition process, they cover the substrate, forming a hexagonal molecular lattice. In fact, you can associate a rhombus with each of these molecules and, in doing so, each configuration of the molecular lattice can be associated with a tiling of the plane (also known as a tessellation) by these polygons, which is a classic mathematical problem. It's also equivalent to another well-known statistical mechanical problem, called the covering of the lattice by dimers, the simplest one being that on the regular lattice.

The way to associate the rhombus with the molecule is quite interesting. There are three directions for the rhombus in the plane and a colour is associated with each one (guess which...): red, green and blue. The picture at the top of this post, taken from the paper Molecular Random Tilings as Glasses by Garrahan et al., shows how the model works, and the figure below, taken from an article on the AMS site (and property of Peter Beton), shows on the left an image of the molecular lattice taken by a scanning tunneling microscope and on the right the associated tiling.

The interesting thing in statistical mechanics is always to analyse phase transitions. In models like these, what is interesting is to study how the system passes from a phase dominated by random tilings, meaning tilings which are not ordered in the obvious way, to an ordered phase where the tiling is regular as we vary the temperature of the system. The basic quantity we need to calculate turns out to be the free energy, which will allow us to calculate everything else we want to know about the system. The beautiful thing about this model is that the free energy is proportional to the integral of the squared gradient of a field called the height field that can be defined for each point of the tiling. The cool thing is that this field can be seen in some sense as the height of the pile of 3-dimensional cubes you will certainly see when you look at the tiling!
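For those who like to see it written down, the statement about the free energy can be sketched as follows. The symbols are my own choice and are not taken from the talk: $h(\mathbf{r})$ is the height field at the point $\mathbf{r}$ of the tiling and $K$ is an assumed proportionality constant.

\[F[h]=\frac{K}{2}\int d^2r\,|\nabla h(\mathbf{r})|^2.\]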

Another very interesting aspect of this model is that it supports fractional excitations, which are very much like the anyons we already discussed in some previous posts. While anyons have fractional statistics, these defects in the tiling are triangles that result from imperfect matches. Two triangles form a rhombus, but they can then separate and run free through the tiling. This amounts to a fractionalisation of the degrees of freedom of the model and, as you can imagine, charges can be associated with these defects.

The details of the model are in the paper I linked to at the beginning of the post. It's worth taking a look at it, as there are a lot of beautiful images and much more information about the phase diagram of the model. After the seminar, I took Prof. Garrahan to have lunch in our Business School (one of the advantages of giving seminars in our group). A friend also called Juan, who is also Argentinian like Prof. Garrahan, accompanied us. It was a nice lunch and I would like to thank Prof. Garrahan for an excellent talk and a pleasant conversation afterwards.

Monday 24 January 2011

A Note about Footnotes

I know this seems completely off-topic and unnecessary, but one of the advantages of having a blog is being able to make your complaints available to a wider audience. However, I believe that this will not be as useless as it seems, and I would offer it as advice for writing documents, especially reports and theses. It's simple: do not overuse footnotes!

Footnotes are devices that should be used with care, and in many books (some very famous) and articles they are not. I don't mind when the author uses footnotes to place the references, for instance. In fact, in some journals this is part of the article's standard format. I do prefer when the references are at the end of the paper, but that is just a biased opinion and there is not much difference. The biggest and most annoying misuse of footnotes is to add "extra information". I have an opinion about that. If you have any relevant information, just put it in the main text. If it's not relevant, it's almost always better to just keep it out of the document. There are occasions where a footnote is okay, but they are really rare.

When should you consider the information worthy of a footnote? Well, you must use your own common sense, but there are some tips to check whether you are abusing them. For instance, if every page of your thesis has a footnote, you actually have more than one thesis. Also, if your footnotes are longer than two lines, maybe the information should be written with slightly larger characters in the main text. Believe me, I have seen books where the page had just a few lines of main text and the whole rest of it was filled with footnotes!

Another thing: there is nothing more distracting for the reader than a long sequence of footnotes that keep interrupting the flow of the text all the time. It's absolutely disruptive, and I gave up reading some books because the footnotes made them look like a jigsaw puzzle. To give just one example of a brilliant person who overused footnotes, think about the Landau & Lifshitz books (it's a famous series of physics books, for those who are not physicists). On top of all the other issues that make those books difficult to follow, the footnotes keep interrupting the reading over and over again. And Landau is surely in the pantheon of physics gods.

When I wrote my Ph.D. thesis, I used just one footnote. I kept it because, in fact, I wanted to look smart about a topic, but I regret it. The rest of the 150 pages has no footnotes, except for the references, but those were at the end of the document, not at the bottom of the pages. In the end, I received many compliments for the clarity of the text.

So, my advice is: include every relevant piece of information in the main text. Use parentheses, commas or whatever other trick you may need, but don't force the reader to make a detour to the end of the page unless you really, really, really think there is no other way. Your readers (maybe me one day) will thank you. (Of course, that's only MY taste...)

Sunday 23 January 2011

Anthropic Principle

All definitions of the Anthropic Principle can be classified into two groups: the trivial and the wrong.

I know that the above assertion can be criticised for being too strong and too careless, and in some sense I must admit that there is a sort of radicalism in it. However, given that the probability of it being correct is high, it's worth the risk. I would have expected such an issue to be settled long ago, but over and over again I end up reading about the Anthropic Principle as if it were a really great and brilliant idea. In fact, I only decided to write about it because I was reading Richard Dawkins's The God Delusion and he talks about it at some point. So let me explain the reasoning behind my point of view.

The detailed definition of the Anthropic Principle, with all technical terms and such, can be found in a summarised form in the Wikipedia article about the topic. Technically, there are basically two versions that can afterwards be subdivided according to extra details. They are the strong and the weak versions.

The strong version says that the laws of the universe are such that at some point conscious observers must appear. The "must" is what makes the version strong. I have very little to say beyond the fact that this is a highly non-falsifiable argument. It claims that conscious beings are somehow an objective to be reached by a universe and that the laws of the universe should be such that they allow them to appear. Or that without these beings the universe somehow cannot exist. It's actually quite easy to smell a bit of deism in this kind of argument. You may argue that this has something to do with some kind of natural selection principle where universes with conscious beings are fitter, but in fact I do not know of any convincing argument apart from sheer speculation. The fact that it is not falsifiable should be clear. How would we falsify it? Well, we could if the universe did not have conscious observers from the beginning to the end. Too bad it does. We could construct this kind of universe... oh, but wait... if we are constructing it, then the universe that includes ours and that one also contains conscious observers. This is the version I call wrong. I know it's too strong to call it wrong, especially for a philosopher, but it's basically true.

The fact is that, in principle, there is nothing that prevents a version of our universe that is too fast to be able to sustain any kind of life, be it conscious or not, from existing. Mathematically, for instance, I see no problem. The issue is even deeper, because we don't really have a detailed understanding of the phenomenon of consciousness or even of life itself. The only example we have of life is the one we can observe on Earth, which is hardly a fair sample of the whole universe. It's true that, according to some calculations, a slight deviation from the known values of the physical constants would have a huge impact on life as we know it, to the point that it would not be able to exist, but we are not really sure that some other kind of life would not. That brings me to the second group of definitions.

The second group is collectively known as the weak versions. These are the ones I am calling trivial. Again, I am exaggerating on purpose. They all say that the constants of physics must be such that they allow (conscious) life to develop. For example, based on the fact that humans exist, you can get a good estimate of some physical constants, and the allowed range of the estimate falls very close to the real value. I hardly see the point of calling such an observation a "Principle". I tend to think that every person in the world who works with some kind of inference procedure, which obviously includes science as well, should see that if you assume life and try to estimate a physical constant, it just shows that your model is correct, not much else. The fact that humans exist is data. It's given evidence. If you do your estimate and reach a wrong value, it means that you should work on a better model for your physics. Now, you can take every piece of evidence in the world and associate some kind of principle with it. For instance, let's talk about the 'Bread Principle'. It says that the physical constants must be such that bread can exist. Now, bread requires yeast, among other things. So the 'Bread Principle' says that microscopic life must exist. And it must be such that the chemical reactions that take place in the yeast occur in a way that allows bread to rise (!). You can probably see where I am going with this.

In the end, my point is in fact very simple. Any idea trying to justify the laws of the universe by requiring consciousness is relying on a phenomenon that we are not even close to understanding at the moment and, to be honest, is nothing more than some sort of religious argument disguised in scientific clothes. On the other hand, the fact that the laws of the universe are compatible with our existence and that, given the correct model, we can calculate things backwards is just a statement of the obvious: the model must agree with the experimental evidence. I am not aware of any breakthrough provided by the so-called Anthropic Principle, and I am willing to bet that none will ever come from it, besides, of course, the usual ones provided by probabilistic inference.

Wednesday 19 January 2011

xkcd: 3D

Tuesday 18 January 2011

Critical Care

For those who are not Star Trek fans, it is probably not very clear what a geek sci-fi show that is not even being broadcast anymore has to do with anything remotely real. Fans, on the other hand, know better. When Gene Roddenberry created Star Trek, his idea was to discuss the problems of society in a disguised language. By placing them in the distant future and on distant planets, he could be excused for criticizing his own country. Just to give an example, it was on Star Trek (the Original Series) that the first interracial kiss on US television took place, with William Shatner being largely responsible for it not being cut from the script. But that's another episode. The episode I really want to talk about is the one I watched last weekend.

By showing my wife, who is a lawyer, the social side of Star Trek, I was able to convince her to watch all five Star Trek series with me. She's actually enjoying it so much that she's also a fan now. Okay, from now on I will spoil the episode. So if you like surprises, stop reading now. The episode is from the last season of Voyager and it is called Critical Care. The ship's doctor, who is a hologram, is stolen and sold on a planet where the health system has some similarities with the real (not the idealised) terrestrial one. The story goes like this. The planet's economy was crashing (any similarity here?) and then an alien race appeared to help. They ended up leaving the planet with a health system where people would be treated according to an index named the treatment coefficient, TC for short. The TC of a person would be calculated by an advanced computer, left by the nice aliens, that would carefully take into consideration the impact of the corresponding person upon society, i.e., how much the person in question contributes to the well-being of all.

As in all societies, it turns out that the ones with the highest TC receive the best treatments, while the others, who are less relevant, receive just an annual quota of treatment. You can read other synopses on Wikipedia and IMDB. Alternatively you can watch the whole episode, although I am not sure for how long, on YouTube by following the links starting with this:

Critical Care - Part 1

There are very interesting dialogues and scenes. For example, the higher TC patients are treated in a Blue Zone (or something like that) where everything is nice and clean. Then, the computer allows the doctors to treat the patients with a certain quota of medicines, but if the doctors do not use everything, the computer decreases it in the next month. At first sight, it seems okay, but if you think deeply, that is just absurd. Try. Of course the episode was meant to be a direct critique of the US health system, but if you change the time to today and the country to the UK, and replace the term Treatment Coefficient with Research Impact and patient with scientific project, you have an isomorphism.

As a friend of mine said, the messenger changes, but the message is always the same. In fact, what happens to people in that episode is presently happening to science and education in the UK. And it's not just metaphorically. The methods are literally, and I really mean LITERALLY, the same as in the Star Trek show! Is it possible that there are profound and important things that science fiction writers can see about how to make a better society while politicians cannot? If so, aren't we giving the wrong job to each of them?

I am not a person who thinks that politicians do not know what they are doing (well, maybe some...). They are clever people. They know exactly what they are doing. Our duty is not to call them stupid. That's actually just helping them. What we need to think is 'They are not stupid, so they are doing this for some reason. What is the reason?'. The answer to this question is the most important.

Sunday 9 January 2011

The Holographic Way

Those of you who have been following me on Twitter (and had the patience to read what I am posting there) probably noticed the huge number of tweets with the tag #holography attached. The reason is, naturally, that I am trying to learn it. But before I go into details, I need to explain what it is all about. If you already know what it is, I will hardly say anything new.

The term "holography" has two meanings in modern physics, and they are obviously related. The first and most popular one is the technique used to create holograms, those three dimensional images embedded in a two dimensional sheet of paper or plastic. The second one is derived from an analogy with this property of storing the information about a three dimensional environment in a two dimensional one. The story starts with Jacob Bekenstein, a theoretical physicist who was thinking about thermodynamics and black holes. Although I will cut the story short, the main point is that he discovered that the entropy of black holes should be proportional to the area of their event horizon, the surface beyond which nothing can come back. That's what we call, in statistical mechanics language, non-extensive. We call a property extensive when it's proportional to the volume of the object.

The story actually mixes a lot of things, but I will try not to rush. Back to the black holes: they are in fact the most entropic "objects" in the universe. The argument is simple enough and works by, as in many situations, invoking the Second Law of Thermodynamics. Suppose that in a region of space of radius R there is more entropy than in a black hole the size of that region. Then, by adding matter to the region, you can increase its mass. If you do that with no care at all, you can always increase the entropy by creating disorder, which is actually very easy, as anyone knows. It's easy to see where it ends. With enough matter, you can create a black hole the size of the original region. If the black hole has less entropy, then you have decreased the TOTAL entropy of the universe and broken the Second Law.

Enter statistical mechanics. In the late 19th century, Boltzmann discovered that the entropy can be understood microscopically as a count (more precisely, the logarithm) of the number of states accessible to some system. And it was by using this concept that two other physicists, 't Hooft and Susskind, suggested what became known as the Holographic Principle. Consider a region in space. The entropy of that region is bounded by the area of the event horizon of a black hole the size of that region, which means that the maximum entropy of that region is given by this area. Therefore, the number of possible states in which the entire region can be is controlled not by the volume of the region, but by its area!
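In formulas, and only as a rough sketch, the two ingredients are Boltzmann's relation and the Bekenstein-Hawking area law. The notation is mine: $\Omega$ is the number of accessible states, $A$ is the area of the event horizon, and $k_B$, $c$, $G$ and $\hbar$ are the usual constants.

\[S=k_B\ln\Omega,\qquad S_{\rm BH}=\frac{k_B c^3 A}{4G\hbar},\qquad S_{\rm region}\le S_{\rm BH}\;\Rightarrow\;\ln\Omega_{\rm region}\le\frac{c^3 A}{4G\hbar}.\]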

Now it's easy to see why it is called the Holographic Principle. The possible configurations of the whole three dimensional region are in fact limited by the two dimensional area of its boundary. Like a hologram. Well, the Holographic Principle actually goes one step further by suggesting that the boundary actually ENCODES the degrees of freedom (the equivalent of the possible configurations in some sense) inside the region. That's a bit more difficult to accept, but around 1997, a string theory guy named Juan Maldacena, based on his work on strings, proposed something called the AdS/CFT conjecture. In a few words, the conjecture says that the degrees of freedom of a quantum gravity theory in Anti-de Sitter space are encoded in a strongly coupled conformal field theory that lives on its boundary.

The importance of this is that in some limit, the quantum gravity theory becomes classical gravity, which means general relativity. In fact, it means a classical field theory with a dynamical metric, where the metric is the mathematical way of encoding the distance between two points in any kind of space. I am not sure if I understood this point precisely, but I guess that this classical limit is the limit where the conformal field theory becomes strongly coupled. A conformal field theory is a special kind of field theory with an additional scaling symmetry. The good thing is that although we don't know how to deal with strongly coupled field theories, we more or less can calculate things in the gravity sector of the AdS/CFT duality.

To finish, let me finally explain why I am interested in it. Recently, there has been some work where the CFT part of the duality displays a phenomenology very similar to that of some strongly coupled systems in condensed matter. Now, these systems are quite important and very difficult to handle with traditional methods like statistical physics or perturbation theory. One of the most famous examples is the high temperature superconductor. These superconductors were discovered in 1986 and we still do not have a good understanding of them. It seems that AdS/CFT can shed some light on this. Another problem is that of the so-called non-Fermi liquids, which are also strongly coupled systems of fermions in condensed matter.

Well, this was just an introduction to the topic. I will try to write more about it as I read. It's a selfish endeavour, as it's meant to help me think more clearly and understand this subject better. If anyone has comments or suggestions, or wants to correct the probably many mistakes I have made, or the ones I will make, feel free. That's the aim after all. :) Oh, and by the way, the video has really nothing to do with the text. I just thought of it as a nice example of a hologram. :)

Saturday 1 January 2011

Intuition and Neural Networks

I had an interesting discussion with a friend during Christmas. It started because one of my presents, which I chose, was Richard Dawkins's The God Delusion. The discussion at some point became one about spirituality. He was arguing in favor of its existence and I was trying to understand what exactly he meant by the word spirituality. The details of the conversation are really not important, but at some point he argued that spirituality was related to intuition, and that intuition is something that cannot be logically understood. Of course I disagreed, for to me that sounds like a very funny remark as, among all cognitive phenomena, intuition is the one which I would say has been most illuminated by the study of artificial neural networks and machine learning in general.

For many, the above statement may seem not only surprising, but highly unbelievable and extremely exaggerated. It's not. In order to prove it, let me start by explaining what I understand by intuition. This is also probably the concept that everyone shares. Most people have already been in a situation where you have to make a decision and, although you cannot explain why and it may even sound counterintuitive, something inside you tells you what the correct answer is. I will not use intuition in the sense of premonition or anything like this. I will concentrate on this sort of "I know this is the correct answer but I can't explain it" thing.

You may think that the fact that you cannot explain the decision makes it something beyond logic and therefore impossible to understand. Actually, it is the complete opposite. The explanation is in fact the simplest one: the feeling of what is the correct decision comes from our brain's experience with similar situations. Too simplistic, you would say. Okay, but why should this not be so? And this is not just a guess: we can actually reproduce this in a computer. That is exactly how machine learning algorithms work.

Let me start by describing the simplest machine learning model, the perceptron. The perceptron is a mathematical model inspired by a real neuron. It has N entries, which are usually taken as N binary numbers, and computes what is called a boolean function using them, giving as a result another binary number. The simplest rule is this:

\[\sigma(\mathbf{x})=\mbox{sign}\left(\sum_{i=1}^{N} x_i w_i\right),\]

where the $\mathbf{x}=(x_i)_{i=1,...,N}$ are the N boolean entries and the real numbers $w_i$ are what enables this simple model to do some kind of very basic learning. The trick is that, if we change these numbers, we can change (to some extent, which is already a technical issue) the boolean function that is implemented by $\sigma$. The idea is that we have what is called a dataset of pairs $(\sigma_\mu,\mathbf{x}_\mu)$, with the indices $\mu$ labeling the datapoints. We usually call these datapoints by the suggestive name of examples, as they indicate to the perceptron the pattern it must follow. We then use a computer algorithm to modify the $w_i$ such that the perceptron tries to match the correct answers $\sigma_\mu$ for every corresponding $\mathbf{x}_\mu$. The simplest algorithm that works is the so-called Hebb algorithm, which is based on the work of the psychologist Donald Hebb, and amounts to reinforcing connections (by which I mean the numbers $w_i$) when the answer is correct and weakening them when it's wrong.
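To make this less abstract, here is a minimal sketch in Python of a perceptron trained with a Hebb-type rule. Several things here are my own assumptions and not part of the discussion above: the binary numbers are coded as +1 and -1, the update is the plain Hebbian form in which each example adds $\sigma_\mu x_{\mu,i}$ to $w_i$, and the toy dataset (the majority vote of three inputs) and all function names are made up for illustration.

```python
from itertools import product


def sign(value):
    """Return +1 for non-negative input and -1 otherwise."""
    return 1 if value >= 0 else -1


def perceptron_output(weights, inputs):
    """Compute sigma(x) = sign(sum_i x_i w_i) for inputs coded as +1/-1."""
    return sign(sum(w * x for w, x in zip(weights, inputs)))


def hebb_train(examples, n_inputs):
    """One pass of the plain Hebb rule (assumed form): w_i <- w_i + sigma_mu * x_mu_i."""
    weights = [0] * n_inputs
    for target, inputs in examples:
        for i, x in enumerate(inputs):
            weights[i] += target * x
    return weights


# Hypothetical toy dataset of pairs (sigma_mu, x_mu):
# the correct answer is the majority vote of three +1/-1 inputs.
patterns = list(product([-1, 1], repeat=3))
examples = [(sign(sum(x)), x) for x in patterns]

weights = hebb_train(examples, n_inputs=3)
predictions = [perceptron_output(weights, x) for _, x in examples]
n_correct = sum(p == t for p, (t, _) in zip(predictions, examples))

print("learned weights:", weights)
print("correct answers:", n_correct, "of", len(examples))
```

In this tiny case the three weights end up equal and the perceptron answers all eight examples correctly, even though nothing in the code ever states the rule "take the majority". For harder datasets this simple rule is not guaranteed to reproduce every example.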

As I said, in simple situations this algorithm really works. Of course, there are more complex situations where the perceptron does not work, but then there are more sophisticated machine learning models as well as algorithms. I will not discuss these details now, as they are not important to our discussion. The important thing is that, after learning, the perceptron can infer the correct answer to a question based simply on the adjusted numbers $w_i$. Now, notice that the perceptron does not really know the pattern it's learning. It is too simple a model to have any kind of awareness. The perceptron also does not perform any kind of logical thinking to answer the questions; it just knows the correct answer as soon as the question is presented. It never really knows the pattern it's following after learning. Basically, it gives an intuitive answer. But what is even more incredible is that, even if we look at the numbers $w_i$, we also cannot explain what pattern the perceptron learned. It's just a bunch of numbers, and if the number N is too large it becomes even more difficult for us to "understand" it.

It looks too simplistic, but this is exactly what we called intuition above. In the end, making a decision based on intuition happens when your brain tells you that the question you are faced with follows some kind of pattern that you cannot really explain, but that just seems right. You learned it somehow, although you cannot explain what you've learned. As you can see, intuition is in fact the first thing we were able to understand with machine learning, and the myth that this cannot be understood is just that: a myth.
