spoonless: (Default)
I realized after writing part 5 that by continuing on to the anthropic principle and observer selection effects, I skipped over a different issue I had planned to write more about, which was how statistical mechanics and quantum mechanics are actually the same thing. I think I actually covered most of what I'd wanted to cover in part 4, but then forgot to finish the rest in part 5. However, thinking more about it has led to lots more thoughts which complicate all of this and might change my perspective somewhat from what I said earlier in this series. So let me just briefly note some of the things I was going to talk about there, and what complications have arisen. Later, we'll get to the quantum suicide booth stuff.

The first time I used Feynman diagrams in a physics class, believe it or not, was not in Quantum Field Theory, where they are used most frequently, but in graduate Statistical Mechanics, which I took the year before. We weren't doing anything quantum, just regular classical statistical mechanics. But we used Feynman diagrams for it! How is this possible? Because the path integral formulation of quantum mechanics looks nearly identical mathematically to the way in which classical statistical mechanics is done. In both cases, you have to integrate an exponential function over a set of possible states to obtain an expression called the "partition function". Then you take derivatives of that to find correlation functions, expectation values of random variables (known as "operators" in quantum mechanics), and to compute the probability of transitions between initial and final states. This might even be the same reason why the Schrodinger Equation is sometimes used by Wall Street quants to predict the stock market, although I'm not sure about that.

One difference between the two approaches is what function gets integrated. In classical statistical mechanics, it's the Boltzmann factor for each energy state, e^(-E/kT). You sum this over all accessible states to get the partition function. In Feynman's path integral formalism for quantum mechanics, you usually integrate e^(iS), where S is the action (the Lagrangian for a specific path, integrated over time), over all possible paths connecting an initial and final state. Another difference is what you get out. Instead of the partition function, in quantum mechanics you get out a probability amplitude, whose magnitude then has to be squared to be interpreted as a transition probability.
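To make the classical side of this concrete, here's a toy sketch in Python (the energy levels are just made-up numbers, and I'm using units where k = 1): compute the partition function by summing Boltzmann factors, then get the average energy both directly and as a derivative of ln Z with respect to β = 1/kT.

```python
import numpy as np

# Hypothetical discrete energy levels (arbitrary units, k_B = 1)
E = np.array([0.0, 1.0, 1.0, 2.5])
T = 1.5
beta = 1.0 / T

# Partition function: sum of Boltzmann factors over all accessible states
Z = np.sum(np.exp(-beta * E))

# Average energy directly from the Boltzmann distribution...
p = np.exp(-beta * E) / Z
E_avg_direct = np.sum(p * E)

# ...and the same thing from a derivative: <E> = -d(ln Z)/d(beta)
dbeta = 1e-6
Z_plus = np.sum(np.exp(-(beta + dbeta) * E))
E_avg_deriv = -(np.log(Z_plus) - np.log(Z)) / dbeta

print(Z, E_avg_direct, E_avg_deriv)  # the two averages agree to ~1e-6
```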

I was going to write about how these are very close to the same thing, but as I read more in anticipation of writing this, I got more confused about how they fit together. In the path integral for quantum mechanics, you can split the evolution up into a series of tiny time intervals, integrating over each one separately, and then take the limit as the size of these time intervals approaches zero. When you look at one link in the chain, you find that you can split the factor e^{iS} into a product of 2 factors. One is e^{ipδx}, which performs a Fourier transform, and the other is e^{-iHt}, which tells you how to time-evolve an energy eigenstate in quantum mechanics into the future. The latter factor can be viewed as the equivalent of the Schrodinger Equation, and this is how Schrodinger's Equation is derived from Feynman's path integral. (There's a slight part of this I don't quite understand, which is why energy eigenstates and momentum eigenstates seem to be conflated here. The Fourier transform converts the initial and final states from position into momentum eigenstates, but in order to use the e^{-iHt} factor it would seem you need an energy eigenstate. These are the same for a "free" particle, but not if there is some potential energy source affecting the particle! But let's not worry about that now.) So after this conversion is done, it looks even more like statistical mechanics. Because instead of summing over the exponential of the Lagrangian, we're summing over the exponential of the Hamiltonian, whose eigenvalues are the energies being summed over in the stat mech approach.

However, there are still 2 key differences. First, there's the factor of "i": e^{-iEt} has an imaginary exponent, while e^{-E/(kT)} has a negative real exponent. This makes a pretty big difference, although sometimes that difference is made to disappear by using the "imaginary time" formalism, where you replace t with it (this is also known as "analytic continuation to Euclidean time"). There's a whole mystery about where the i in quantum mechanics comes from, and this seems to be the initial source--it's right there in the path integral, where it's missing in regular classical statistical mechanics. This causes interference between paths which you otherwise wouldn't get. The second remaining difference here is that you have a t instead of 1/kT (time instead of inverse temperature). I've never studied the subject known as Quantum Field Theory at Finite Temperature in depth, but I've been passed along some words of wisdom from it, including the insight that if you want to analyze a system of quantum fields at finite temperature, you can do so with almost the same techniques you use for zero temperature, so long as you pretend that time is a periodic variable that loops around every 1/kT seconds, instead of continuing infinitely into the past and the future. This is very weird, and I'm not sure it has any physical interpretation; it may just be a mathematical trick. But nevertheless, it's something I want to think about more and understand better.
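One way to see the t ↔ 1/kT correspondence numerically: take any small Hamiltonian matrix, form the time evolution e^{-iHt}, and plug in the imaginary time t = -i/kT; the trace of the result is exactly the thermal partition function Σ e^{-E/kT}. A toy check (the 2×2 Hamiltonian is an arbitrary made-up example, in units where ħ = k = 1):

```python
import numpy as np

# A made-up 2x2 Hamiltonian (units where hbar = k_B = 1)
H = np.array([[1.0, 0.3],
              [0.3, 2.0]])
T = 0.7
beta = 1.0 / T

E, V = np.linalg.eigh(H)

# Thermal partition function, statistical-mechanics style
Z_statmech = np.sum(np.exp(-beta * E))

# Quantum time evolution e^{-iHt}, analytically continued to t = -i*beta
t = -1j * beta
U = V @ np.diag(np.exp(-1j * E * t)) @ V.conj().T  # e^{-iHt} via eigendecomposition
Z_euclidean = np.trace(U).real

print(Z_statmech, Z_euclidean)  # same number
```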

Another thing I'd like to think about more, in order to understand the connection here, is what happens when you completely discretize the path integral. That is, what if we pretend there's no such thing as continuous space, and we just want to consider a quantum universe consisting solely of a finite number of qubits. Is there a path integral formulation of this universe? There's no relativity here, or any notion of space or spacetime. But as with any version of quantum mechanics, there is still a notion of time. So it should be possible. And the path integral usually used (due to Dirac and Feynman) should be the continuum limit of this. I feel like I would understand quantum mechanics a lot more if I knew what the discrete version looked like.
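For what it's worth, here's a sketch of the kind of thing I have in mind, for the smallest possible case of a single qubit (the Hamiltonian is an arbitrary made-up matrix): chop e^{-iHt} into N short time steps and literally sum over every sequence of intermediate basis states. It's a discrete sum over "paths" in time, with no space anywhere in sight.

```python
import numpy as np
from itertools import product
from scipy.linalg import expm

# Arbitrary qubit Hamiltonian (hbar = 1); the off-diagonal term drives transitions
H = np.array([[0.5, 1.0],
              [1.0, -0.5]])
t_total, N = 1.0, 6
dt = t_total / N

U_step = expm(-1j * H * dt)           # short-time evolution for one slice
amp_direct = expm(-1j * H * t_total)  # exact amplitude matrix, for comparison

# "Path integral": amplitude <1| e^{-iHt} |0> as a sum over all sequences
# of intermediate basis states s_1,...,s_{N-1} in {0, 1}
start, end = 0, 1
amp_paths = 0.0 + 0.0j
for path in product([0, 1], repeat=N - 1):
    states = (start,) + path + (end,)
    contrib = 1.0 + 0.0j
    for a, b in zip(states[:-1], states[1:]):
        contrib *= U_step[b, a]       # amplitude to hop from state a to state b
    amp_paths += contrib

print(amp_paths, amp_direct[end, start])  # these agree
```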

Oh, one more thing before we move on to the quantum suicide booth. While reading through some Wikipedia pages related to the path integral recently, I found something pretty interesting and shocking. Apparently, there is some kind of notion of non-commutativity, even in the classical version of the path integral used to compute Brownian motion. In this version of the path integral, you use stochastic calculus (also known as Ito calculus, I think?) to find the probabilistic behavior of a random walk. (And here again, we find a connection with Wall Street--this is how the Black-Scholes formula for options pricing is derived!) I had stated in a previous part of this series that non-commutativity was the one thing that makes quantum mechanics special, and that there is no classical analog of it. But apparently, I'm wrong, because some kind of non-commutativity of differential operators does show up in stochastic calculus. But I've tried to read how it works, and I must confess I don't understand it much. They say that you get a commutation relationship like [x, k] = 1 in the classical version of the path integral. And then in the quantum version, where there's an imaginary i in the exponent instead of a negative sign, this becomes [x, k] = i or equivalently, [x, p] = iħ. So apparently both non-commutativity and the uncertainty principle are directly derivable from stochastic calculus, whether it's the quantum or the classical version. So this would indicate that really the *only* difference between classical and quantum is the factor of i. But I'm not sure that's true if looked at from the Koopman-von Neumann formalism. Clearly I have a lot more reading and thinking to do on this!
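I won't pretend the following reproduces that commutator derivation, but the ingredient that makes stochastic calculus behave so differently from ordinary calculus is easy to simulate: along a Brownian path, the squared increments (ΔW)² don't vanish in the continuum limit, they add up to the elapsed time. As I understand it, that quadratic variation is what the Ito correction terms, and ultimately the [x, k] = 1 relation, trace back to. A tiny simulation:

```python
import numpy as np

rng = np.random.default_rng(0)
T, n = 1.0, 1_000_000
dt = T / n

dW = rng.normal(0.0, np.sqrt(dt), size=n)  # Brownian increments
W = np.cumsum(dW)                          # the random walk itself

quad_var = np.sum(dW**2)   # sum of (dW)^2 along the whole path
print(quad_var)            # ≈ T = 1.0, not 0, even though each dW ~ sqrt(dt)
```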
spoonless: (Default)
As I mentioned in part 3, I had never heard of the Koopman-von Neumann formulation of classical mechanics until reading Tom Banks' 2011 post about probability in quantum mechanics on Cosmic Variance.

But finding out about it makes so many things about quantum clear to me that were murky in the past. The main thing that's now crystal clear is this: quantum mechanics is a generalization of statistical mechanics. They aren't really two different theories, rather quantum mechanics is statistical mechanics... it's just that a central assumption of statistical mechanics had to be dropped in light of the evidence.

I had made it most of the way to understanding this when I wrote my series on Wandering Sets in 2013. In some ways, I think it's probably the best thing I've ever written on this blog, even though I think it ended up being too long, meandering, and esoteric for my friends to follow all the way through. I want to write a popular physics book at some point where I explain these ideas more clearly, with pictures and more analogies and examples. What I've learned via KvN solidifies my hunch that QM and SM are really the same theory.

I think one of the first things any student is struck by when they take their first course on quantum mechanics is how different the math is from classical mechanics or stat mech. In classical mechanics, you have lots of differential equations that come from a single important master entity called a Lagrangian, and if you want you can write this in an alternate way as something similar called a Hamiltonian. But all of the variables in the theory just stand for regular real numbers (like 2, pi, 53.8, etc.) that describe the world. In quantum mechanics, you start from the assumption that there is a complex Hilbert space of states, acted on by operators. And you can write down a Hamiltonian, which you're told is an analog of the Hamiltonian used in classical mechanics. The Hamiltonian seemed like a weird way of writing the Lagrangian in classical mechanics, but in quantum mechanics it takes on a more important role. But the "variables" used in the quantum Hamiltonian are not ordinary real numbers, they're operators. These operators correspond to observables (things you can observe about the world), but instead of being a single number they are more like a technique for making measurements, which hands you a set of possible results with associated probabilities. And instead of these operators acting on states in a more familiar space (like the ordinary 3-dimensional space we live in, or the phase space used in statistical mechanics), they act on states in a complex Hilbert space. Complex numbers like 5+i play an important role in this space, and yet as a student there's really no way of understanding why or what the purpose is. You're just asked to accept that if you start with these assumptions, somehow they end up predicting the results of experiments correctly where the corresponding classical predictions fail.
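A toy example of the "operator = a menu of possible outcomes with probabilities" idea, with a made-up 2×2 observable and state: the eigenvalues are the possible measurement results, and the squared overlaps of the state with the eigenvectors are their probabilities.

```python
import numpy as np

# A made-up observable (Hermitian matrix) and a made-up normalized state
A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                     # e.g. a spin measurement along x
psi = np.array([1.0, 1.0j]) / np.sqrt(2)

outcomes, eigvecs = np.linalg.eigh(A)          # possible measurement results
probs = np.abs(eigvecs.conj().T @ psi) ** 2    # Born-rule probabilities

expectation = psi.conj() @ A @ psi             # <psi|A|psi>
print(outcomes, probs, expectation.real)
# the expectation value equals sum(outcomes * probs)
```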

There were many reasons why I ended up leaning towards many worlds rather than other interpretations. I've always preferred representational realism to instrumentalism, so that was one reason. Another was locality (reading David Deutsch's 1999 paper on how quantum mechanics is entirely local as long as you assume that wave functions never collapse was the most influential piece of evidence that convinced me.) But there was a third reason.

The third reason was that whenever I had asked myself "what's the essential difference between classical mechanics and quantum mechanics?" it came down to the idea that instead of regular numbers representing a single outcome, you have operators which represent a set of possible outcomes. In other words, instead of reality being single threaded (one possibility happens at a time), it's multi-threaded. Things operate in parallel instead of in series. This especially resonated with my computing background, and my hope that one day quantum computers would be developed. I knew that it was a little more complicated than just "replace single-threaded process with multi-threaded process", but I thought it was the biggest difference between how the two theories work and what they say.

Learning about the KvN formalism hasn't completely destroyed my preference for Many Worlds, but it has obliterated my view that this is the most important difference between the theories. I now understand that this is just not true.

While I was writing my wandering set series in 2013, I discovered the phase space formalism of quantum mechanics (and discussed it a bit in that series, I believe). This was very interesting to me, and I wondered why it wasn't taught more. It demonstrates that you can write quantum mechanics in a different way, using a phase space like the one you use in statistical mechanics, instead of the usual Hilbert space. That was surprising and shocking to me. It hinted that maybe the two theories are more similar than I'd realized. But even more surprising and shocking was my discovery this year of KvN, which shows that you can write statistical mechanics... ordinary classical statistical mechanics... in an alternate formalism using a Hilbert space! What this means is that I was just totally wrong about the number/operator distinction between quantum and classical. This is not a difference in the theories; this is just a difference in how they are written down. Why was I mistaken about this for so long? Because the standard procedure for taking any classical theory and making it a quantum theory is called "canonical quantization", and the procedure says that you just take whatever variables you had in the classical theory and "promote" them to operators. It's true that this is how you can convert one theory to the other, but it's extremely misleading because it obscures the fact that what you're doing is not making it quantum but just rewriting the math in a different way. What makes it quantum is solely the set of commutation relations used!
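To make "it's the commutation relations that matter" a bit more tangible, here's a toy numerical check (not the KvN construction itself, just the quantum relation it gets contrasted with): on a discretized line, take x as a multiplication operator and p as -iħ times a finite-difference derivative, and verify that [x, p] acts like iħ on a smooth test function.

```python
import numpy as np

n, L = 400, 20.0
x = np.linspace(-L / 2, L / 2, n)
dx = x[1] - x[0]
hbar = 1.0

X = np.diag(x)                       # position as a multiplication operator

# Momentum as -i*hbar * d/dx, using a central finite difference
D = (np.diag(np.ones(n - 1), 1) - np.diag(np.ones(n - 1), -1)) / (2 * dx)
P = -1j * hbar * D

psi = np.exp(-x**2)                  # a smooth test function
lhs = X @ (P @ psi) - P @ (X @ psi)  # [X, P] acting on psi

# Away from the grid edges this matches i*hbar*psi, up to O(dx^2) error
print(np.max(np.abs(lhs[5:-5] - 1j * hbar * psi[5:-5])))  # small, -> 0 as dx -> 0
```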

to be continued in part 5...
spoonless: (orangegray)
When I wrote part 9, I was feeling pretty confused and wasn't sure I had really made any progress on answering the basic question that I had set out to answer with this series. But I'm pleased to announce that within 24 hours after writing that one, I started thinking, and piece by piece, it just all came together. I think I have pretty much solved the mystery, aside from some minor threads that might still need to be wrapped up. (I just didn't get a chance to write it down until today.) That was way faster than I'd imagined it would take.

The main thing I wasn't seeing is how mixing (whether the ordinary process of two gasses mixing in a box, or the more esoteric quantum measurement process) relates to wandering sets. And the lynchpin that was missing, that holds everything together, and explains how mixing relates to wandering sets, is "what is the identity of the attractor?"

I realized that if I could pinpoint what the attractor was in the case of mixing, then I would see why mixing is a wandering set (and hence, a dissipative process). Soon after I asked myself that question, the answer became pretty obvious. The attractor in the case of mixing--and indeed, in any case where you're transitioning from a non-equilibrium state to thermodynamic equilibrium--is the macrostate with maximal entropy. In other words, the macrostate that corresponds to "thermodynamic equilibrium".

I think the reason I wasn't seeing this is because I was thinking too much about the microstates. But from the point of view of a microscopic description of physics, any closed system is always conservative--all of the physics is completely reversible. You can only have dissipation in two ways. One is fairly trivial and uninteresting, and that's if the system is open and energy is being sucked out of it. Sucking out energy from a system reduces its state space, so from within that open system, ignoring the outside, you start in any corner of a higher dimensional space and then you get pulled into an attractor that represents the states which have lower total energy. If energy keeps getting sucked out, it will eventually all leave and you'll just be left in the ground state (which would in that case be the attractor).

But there's a much more interesting kind of dissipation, and that's when you coarse grain a system. If you don't care about some of the details of the microscopic state, but you only care about the big picture, then you can use an approximate description of the physics; you can just keep track of the macrostate. And that's where the concept of entropy comes into play, and that's when even closed systems can involve dissipation. There's no energy escaping anywhere, but if you start in a state that's not in thermodynamic equilibrium, such as two gasses that aren't mixed at all, or that are only halfway mixed, or only partially mixed anywhere in between... from the point of view of the macrostate space, you'll gradually get attracted towards the state of maximal entropy. So it's the macrostate phase space where the wandering sets come in, in this case. Not the microstates! The physics of the evolution of the macrostate involves a dissipative action, meaning it contains wandering sets; and it is an irreversible process because you don't have the microstate information that would be required in order to know how to reverse the process.
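A standard toy model that makes this concrete (the Ehrenfest urn, not something from this series): N labeled particles hop one at a time, at random, between the two halves of a box. Every microscopic move is as likely as its reverse, yet the macrostate (how many particles are on the left) gets pulled toward the maximum-entropy value N/2 and then just hovers there, as this sketch shows:

```python
import numpy as np
from math import comb, log

rng = np.random.default_rng(1)
N, steps = 200, 4000

side = np.zeros(N, dtype=int)        # 0 = left, 1 = right; start completely unmixed
entropy = []
for _ in range(steps):
    i = rng.integers(N)              # microscopic move: one random particle...
    side[i] ^= 1                     # ...hops to the other half of the box
    k = int(N - side.sum())          # macrostate: number of particles on the left
    entropy.append(log(comb(N, k)))  # S = log(# of microstates in this macrostate)

print(entropy[0], max(entropy), log(comb(N, N // 2)))
# entropy climbs from near zero toward its ceiling log C(200, 100) and stays close to it
```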

So how does this work in the case of a quantum measurement? It's really the same thing, just another kind of mixing process. Let's say you have a quantum system that is just a single spin (a "qubit") interacting with a huge array of spins comprising the "environment". Before this spin interacts, it's in a superposition of spin-up and spin-down. It is in a pure state, similar to the state where two gasses are separated by a partition. Then you pull out the partition (in the quantum case, you allow the qubit to interact with its environment, suddenly becoming entangled with all of the other spins). In either case, this opens up a much larger space, increasing the dimensionality of the microstate space. Now in order to describe the qubit, you need a giant matrix of correlations between it and all of the other spins. As with the mixing case I described earlier, you could use a giant multidimensional Rubik's cube to do this. The only difference is that classically, each dimension would be a single bit "1" or "0", while this quantum mechanical mixing process involves a continuous space of phases (sort of ironic that quantization in this case makes something discrete into something continuous). If this is confusing, just remember that a qubit can be in any superposition of 1 and 0, and therefore it takes more information to describe its state than a classical bit requires.

But after the interaction, we just want to know what state the qubit is in--we don't really care about all of these extra correlations with the environment, and they are random anyway. They are the equivalent of thermal noise, non-useful energy. So therefore, we shift from our fine-grained description to a more coarse-grained one. We define the macrostate as just the state of the single qubit, but averaged over all of the possibilities for the environmental spins. Each one involves a sum over its up and its down state. And if we sum over all of those different spins, that's accomplished by taking the partial trace of the density matrix over the environment, which I mentioned in part 9. Tracing over the environment is how you coarse grain the system, averaging over its effects. As with the classical mixing case, putting this qubit in contact with the environment suddenly puts it in a non-equilibrium state. But if you let it settle down for a while, it will quickly reach equilibrium. And the equilibrium state, the one with the highest entropy, is one where all of the phases introduced are essentially random, i.e. there are no special extra correlations between them. So the microstate space is a lot larger, but there is one macrostate that the whole system is attracted to. And in that macrostate, when you trace over the spins in the environment, you wind up with a single unique state for the qubit that was measured. And that state is a "mixed state"; it's no longer a coherent superposition of "0" and "1" but a classical probability distribution between "0" and "1". The off-diagonal elements of the density matrix have gone to zero. So while the microstate space has increased in dimensionality, the macrostate space has actually *decreased*! This is why I was running into so much confusion. There's both an increase in dimensionality AND a decrease in dimensionality; it just depends on whether you're asking about the space of microstates or the space of macrostates.
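Here's a small numerical sketch of that last step (my own toy example, with a random made-up environment): a qubit starts in (|0⟩ + |1⟩)/√2, each branch gets entangled with its own random many-spin environment state, and the partial trace over the environment leaves a reduced density matrix whose off-diagonal elements are essentially zero.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_state(dim):
    """A random normalized environment state (stand-in for 'random phases')."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

n_env_spins = 16
d_env = 2 ** n_env_spins

# |Psi> = (|0>|E0> + |1>|E1>) / sqrt(2): after the interaction, each qubit
# branch drags along its own (random) environment state
E0, E1 = random_state(d_env), random_state(d_env)
psi = np.concatenate([E0, E1]) / np.sqrt(2)   # qubit index is the slow index

# Reduced density matrix of the qubit: partial trace over the environment
psi_matrix = psi.reshape(2, d_env)
rho_qubit = psi_matrix @ psi_matrix.conj().T

print(np.round(rho_qubit, 4))
# ≈ [[0.5, ~0], [~0, 0.5]]: a classical mixture. The coherence (off-diagonal)
# is <E1|E0>/2, which is tiny for random high-dimensional environment states.
```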

Mystery solved!

I'm very pleased with this. While I sort of got the idea a long time ago listening to Nima Arkani-Hamed's lecture on this, and I got an even better idea from reading Leonard Susskind's book, it really is all clear to me now. And I have to thank wandering sets for this insight (although in hindsight, I should have been able to figure it out without that).

I would like to say "The End" here, but I must admit there is one thread from the beginning--Maxwell's Demon--which I never actually wrapped up. I suspect that my confusion there, about why erasure of information corresponds to entropy increase, and exactly how it corresponds, is directly related to my confusion between macrostate and microstate spaces. So I will write a tentative "The End" here, but may add some remarks about that in another post if I think of anything more interesting to say. Hope you enjoyed reading this series as much as I enjoyed writing it!

The End
spoonless: (orangegray)
I have to admit, in trying to tie all of this together, I have realized that there still seems to be something big that I don't understand about the whole thing. And there is at least one minor mistake I should correct in part 8. So from here on out, we're treading on thin ice, I'm doing something more akin to explaining what I don't understand rather than describing a solution.

It seemed that if I could understand wandering sets, then all of the pieces would fit together. And it still seems that way, although the big thing I still don't get about wandering sets is how they relate to mixing. And that seems crucial.

The minor mistake I should correct in part 8 is my proposed example of a completely dissipative action. I said you could take the entire space minus the attractor as your initial starting set, and then watch it evolve into the attractor. But this wouldn't work because the initial set would include points that are in the neighborhood of the attractor. However, a minor modification of this works--you would just need to start with a set that excludes not only the attractor but also the neighborhood around it.

In thinking about this minor problem, however, I realized there are also some more subtle problems with how I presented things. First, I may have overstated the importance of dimensionality. In order to have a completely dissipative action, you could really just use any space which has an attractor that is some subset of that space, where it attracts any points outside of it into the attractor basin. The subset wouldn't necessarily have to have a lower dimension--my intuition is that in thermodynamics that would be the usual case, although I must admit that I'm not sure and I don't want to leave out any possibilities.

This leads to a more general point here that the real issue with irreversibility need not be stated in terms of dimension going up or down--a process is irreversible any time there is a 1-to-many mapping or a many-to-1 mapping. So a much simpler way of putting the higher/lower dimensionality confusion on my part is that I often am not sure whether irreversible processes are supposed to time evolve things from 1-to-many or from many-to-1. Going from a higher to lower dimensional space is one type of many-to-1 mapping, and going from lower to higher is one type of 1-to-many mapping. But these are not the only types, just types that arise as typical cases in thermodynamics, because of the large number of independent degrees of freedom involved in macroscopic systems.

Then there's the issue of mixing. I still haven't figured out how mixing relates to wandering sets at all. Mixing very clearly seems like an irreversible process of the 1-to-many variety. But the wandering sets wiki page seems to be describing something of the many-to-1 variety. However, they say at the top of the page that wandering sets describe mixing! I still have no idea how this could be the case. But now let's move on to quantum mechanics...

In quantum mechanics, one can think of the measurement process in terms of a quantum Hilbert space (sort of the analog of state space in classical mechanics) where different subspaces (called "superselection sectors") "decohere" from each other upon measurement. That is, they split off from each other, leading to the Many Worlds terminology of one world splitting into many. Thinking about it this way, one would immediately guess that the quantum measurement process therefore is a 1-to-many process. 1 initial world splits into many different worlds. However, if you think of it more in terms of a "collapse" of a wavefunction, you start out with many possibilities before a measurement, and they all collapse into 1 after the measurement. So thinking about it that way, you might think that quantum physics involves the many-to-1 type of irreversibility. But which is it? Well, this part I understand, mostly... and the answer is that it's both.

The 1-to-many and many-to-1 perspectives can be synthesized by looking at quantum mechanics in terms of what's called the "density matrix". Indeed, you need the density matrix formulation in order to really see how the quantum version of Liouville's theorem works. In the density matrix formulation of QM, instead of tracking the state of the system using a wavefunction--which is a vector whose components can represent all of the different positions of a particle (or field, or string) in a superposition--you use a matrix, which is sort of like the 2-dimensional version of a vector. By using a density matrix instead of just a vector to keep track of the state of the system, you can distinguish between two kinds of states--pure states and mixed states. A pure state is a coherent quantum superposition of many different possibilities. Whereas a mixed state is more like a classical probability distribution over many different pure states. A measurement process in the density matrix formalism, then, is described by a mixing process that evolves a pure state into a mixed state. This happens due to entanglement between the original coherent state of the system and the environment. When a pure state becomes entangled in a random way with a large number of degrees of freedom, this is called "decoherence". What was originally a coherent state (nice and pure, all the same phases) is now a mixed state (decoherent, lots of random phases, too difficult to disentangle from the environment).
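A quick toy illustration of the pure/mixed distinction in this language: the equal superposition (|0⟩ + |1⟩)/√2 and the 50/50 classical mixture of |0⟩ and |1⟩ have identical diagonals, but differ in their off-diagonal coherences and in the purity Tr(ρ²).

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2)

rho_pure = np.outer(plus, plus.conj())                               # coherent superposition
rho_mixed = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket1, ket1)  # classical coin flip

print(rho_pure)    # off-diagonals 0.5: the phases matter, interference is possible
print(rho_mixed)   # off-diagonals 0: just a probability distribution
print(np.trace(rho_pure @ rho_pure), np.trace(rho_mixed @ rho_mixed))  # purity: 1.0 vs 0.5
```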

What happens is that you originally represent the system plus the environment by a single large density matrix. And then, once the system becomes entangled with the environment, the matrix decomposes into the different superselection sectors. These are different submatrices, each of which represents a different pure state. The entire matrix is then seen as a classical distribution over the various pure states. As I began writing this, I was going to say that because it was a mixing process, it went from 1-to-many. But now that I think of it, because the off-diagonal elements between the different sectors end up being zero after the measurement, the final space is actually smaller than the initial space. And I think that's even before you decide to ignore all but one of the sectors (which is where the "collapse" part comes in, in collapse-based interpretations). From what I recall, the off-diagonal elements wind up being exactly zero--or so close to zero that you could never tell the difference--because you assume the way in which the environment gets entangled is random. As long as each phase is random (or more specifically--as long as they are uncorrelated with each other), when you sum over a whole lot of them at once, they add up to zero--although I'd have to look this up to remember the details of how that works.

I was originally going to say that mixed states are more general and involve more possibilities than pure states, so therefore evolving from a pure state to a mixed state goes from 1-to-many, and then when you choose to ignore all but one of the final sectors, you go back from many-to-1, both of these being irreversible processes. However, as I write it out, I remember 2 things. The first is what I mentioned above--even before you pick one sector out you've already gone from many-to-1! Then you go from many-to-1 again if you were to throw away the other sectors. And the second thing I remember is that, mathematically, pure states never really do evolve into mixed states. As long as you are applying the standard unitary time evolution operator, a pure state always evolves into another pure state and entropy always remains constant. However, if there is an obvious place where you can split system from environment, it's traditional to "trace over the degrees of freedom of the environment" at the moment of measurement. And it's this act of tracing that actually takes things from pure to mixed, and from many to 1. I think you can prove that from a point of view inside the system, whether you trace over the degrees of freedom in the environment or not is irrelevant. You'll wind up with the same physics either way, the same predictions for all future properties of the system. It's just a way of simplifying the calculation. But when you do get this kind of massive random entanglement, you wind up with a situation where tracing can be used to simplify the description of the system from that point on. You're basically going from a fine-grained approximation of the system+environment to a more coarse-grained approximation. So it's no wonder that this involves a change in entropy. Although whether entropy goes up or down in the system or in the environment+system, before or after the tracing, or before or after you decide to consider only one superselection sector--I'll have to think about and answer in the next part.

This is getting into the issues I thought I sorted out from reading Leonard Susskind's book. But I see that after a few years away from it, I'm already having trouble remembering exactly how it works again. I will think about this some more and pick this up again in part 10. Till next time...
spoonless: (orangegray)
Our universe has 3 large spatial dimensions (plus one temporal dimension, and possibly another 6 or 7 microscopic dimensions if string theory is right, but those won't be of any importance here).

Given 3 numbers (say, longitude, latitude, and altitude), you can uniquely identify where a particle is located in space. But the state of a system depends not only on what the particles' positions are, but also on what their momenta are, i.e. how fast they are moving (the momentum of a particle in classical mechanics is simply its mass times its velocity--when relativistic and quantum effects are taken into account, this relationship becomes much more complicated). This requires another 3 numbers in order to fully specify what the state of a particle is.

If you were to specify all 6 of these numbers for every particle in a given system, you would have completely described the state of that system. (I'm ignoring spin and charge here, which you'd also need to keep track of in order to fully specify the state.) In order to categorize all possible states of such a system, you therefore need a space with 6N dimensions, where N is the number of particles. It is this 6N dimensional space which is called "phase space" and it is in this space where wandering sets are defined. The state of the system is represented by a single point in phase space, and as it changes dynamically over time, this point moves around.

In statistical mechanics and in quantum mechanics, you often deal with probability distributions rather than a single state. So you might start out with some distribution of points in phase space, some fuzzy cloudy region near some neighborhood of a point for instance. And as the system evolves, this cloud can move around and change its shape. But one really central and important theorem in physics is Liouville's theorem... it says that as this cloud of probability moves around, the volume it takes up in phase space always remains constant. This theorem can be derived from the equations of motion in classical or quantum mechanics. But it really follows as a consequence of energy conservation, which in turn is a consequence of the invariance of the laws of physics under time translations. At any given moment in time, the basic laws of physics appear to be the same, they do not depend explicitly on time, so therefore energy is conserved and so is the volume of this cloud in phase space. It can morph into whatever different shape it wants as it wanders around, but its total volume in phase space must remain constant.

Systems that obey Liouville's theorem are called conservative systems, and they don't have wandering sets. Systems that do not obey Liouville's theorem are called dissipative systems, and they *do* contain wandering sets.
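A minimal numerical illustration of that distinction (my own toy example, nothing deep): evolve a little blob of initial conditions in the 2D phase space of a harmonic oscillator, with and without a friction term. The shoelace formula gives the area enclosed by the blob's boundary; without friction the area stays put (Liouville), with friction it shrinks toward the attractor at the origin.

```python
import numpy as np
from scipy.linalg import expm

def shoelace_area(pts):
    """Area enclosed by a closed polygon given as an (N, 2) array of vertices."""
    x, y = pts[:, 0], pts[:, 1]
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

# Boundary of a small circular blob of initial conditions in (x, p) phase space
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
blob = np.stack([1.0 + 0.2 * np.cos(theta), 0.5 + 0.2 * np.sin(theta)], axis=1)

t, gamma = 3.0, 0.4
A_conservative = np.array([[0.0, 1.0], [-1.0, 0.0]])     # dx/dt = p, dp/dt = -x
A_dissipative = np.array([[0.0, 1.0], [-1.0, -gamma]])   # same, plus a friction term

for A in (A_conservative, A_dissipative):
    evolved = blob @ expm(A * t).T    # linear flow: apply e^{At} to each boundary point
    print(shoelace_area(blob), "->", shoelace_area(evolved))
# conservative: area unchanged; dissipative: area shrinks by a factor e^{-gamma*t}
```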

But wait-- I just said that Liouville's theorem follows from some pretty basic principles of physics, like the fact that the laws of physics are the same at all times. So doesn't that mean that all physical systems in our universe are conservative--in other words, that there really is no such thing as dissipation? And does that in turn mean that entropy never really increases, it just remains constant?

This is one of the most frustrating paradoxes for me whenever I start to think about dissipation. It's very easy to convince yourself that dissipation doesn't really exist, but it's equally easy to convince yourself that all real world systems are dissipative, and that this pervasive tendency physicists have for treating all systems as conservative is no better than approximating a cow as a perfect sphere (a running joke about physicists).

I'll let this sink in for now, but end this part by hinting at where things are going next and what the answers to the above paradox involve. In order to understand the real distinction between conservative and dissipative systems, we have to talk about the difference between open and closed systems, about the measurement problem in quantum mechanics, what it means to make a "measurement", and how to separate a system from its environment, and when such distinctions are important and what they mean. We need to talk about the role of the observer in physics. One of the popular myths that you will find all over in pop physics books is that this role for an observer, and this problem of separating a system from its environment, was something that came out of quantum mechanics. But in fact, this problem has been around much longer than quantum mechanics, and originates in thermodynamics / statistical mechanics. It's something that people like Boltzmann and Maxwell spent a long time thinking about and puzzling over. (But it's certainly true that quantum mechanics has made the problem seem deeper and weirder, and raised the stakes somewhat.) Philosophically, it's loosely connected to the problem of the self vs the other, and how we reconcile subjective and objective descriptions of the world. In short, this is probably the most important and interesting question in the philosophy of physics, and it seems to involve all areas of physics equally, and it all revolves somehow around dissipation and entropy. To be continued...
spoonless: (orangegray)
I gave 3 examples of things that dissipate in part 2: friction, electrical resistance, and hurricanes. I feel like I understand fairly well why we call these dissipative, although I've always felt or hoped that there is some unifying principle that sheds more light on the subject and explains why them and not other things. But there's a fourth example that is far more interesting, and for that example I still don't feel like I really understand why exactly it's dissipative: computation.

Now, you might first think--maybe computation is dissipative because it involves the flow of electricity through circuits (whether those circuits be wires or microchips), but that's beside the point. First, as I understand it, any kind of physical irreversible computational process must necessarily dissipate heat and increase entropy. So this applies not just to electrical circuits but to anything we could conceivably use to compute an answer to something, including, for example, an abacus (of course the amount of computation that can be performed by an abacus is presumably so tiny that you wouldn't notice). Second, it's not just the electrical resistance, because supposedly computers actually draw *more* electricity while they are involved in some intense computation than when they are just idling. There are many circuits which are on while the computer is doing nothing, but it's not being on that creates the entropy I'm worried about... it's the entropy created specifically from irreversible computation, from switching those circuits on and off in just such a way that it computes a simple answer to a more complex question fed to it. Beforehand, there are many possible answers, but afterwards, there is only one... for example, 42. This reduces the available microstates of the system from many to one, and therefore represents a reduction of entropy (which, remember, counts the number of available microstates). Because of the 2nd law, this cannot happen by itself without producing heat... it needs to produce heat in order to cancel out that entropy loss by a gain in entropy due to the heat, for exactly the same reason that the earth must dump heat into its environment if evolution is to result in more highly organized organisms. So even a perfectly efficient computer which caused no net entropy gain for the universe would still produce heat!

The only exception to the above process is if, instead of taking a large set of inputs and reducing them to one output, all of the inputs and outputs correspond exactly in a 1-to-1 fashion; in other words, you use all reversible logic gates to build the computer. An example of an irreversible logic gate is an AND gate. It takes 2 inputs and has 1 output; it outputs "Yes" if both the inputs are on, and "No" if either one of them is off. Another example is an OR gate, which outputs "Yes" if either input is on, and "No" if both are off. To build a reversible gate, you need the same number of outputs as inputs (say, 2 and 2), so that if you ran the computation backwards, you could recover the question from the answer. For example, if you put 42 into the computer, it should be able to spit out what the ultimate question is, just as easily as going the other direction. This is the meaning of reversibility.
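A tiny illustration of that difference, using the AND gate from above and a CNOT gate (a standard reversible gate that flips its second bit whenever its first bit is 1): AND sends several distinct inputs to the same output, so the input can't be recovered, while CNOT is a bijection on its inputs.

```python
from itertools import product

def AND(a, b):
    return (a & b,)      # 2 bits in, 1 bit out

def CNOT(a, b):
    return (a, a ^ b)    # 2 bits in, 2 bits out: flips b when a is 1

for gate in (AND, CNOT):
    table = {inp: gate(*inp) for inp in product([0, 1], repeat=2)}
    reversible = len(set(table.values())) == len(table)
    print(gate.__name__, table, "reversible:", reversible)
# AND collapses three different inputs onto the same output, so information is lost;
# CNOT's outputs are all distinct, so you can run it backwards
```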

Maxwell's demon is a thought experiment that James Clerk Maxwell came up with which illustrates how weird this connection between entropy and information is. If there were a little demon who was watching the individual molecules in a box, and he had a switch that could slide in a divider instantly in the middle of the box, then he could sit there and watch for the moment when each gas particle (normally bouncing around randomly in the box) was about to cross the boundary from one side of the box to the other. If he presses the switch at just the right time, he can deflect the gas particle back into the left side of the box without expending any energy. If he keeps doing this for hours and hours, eventually all of the gas particles will randomly wander into the left side of the box and get stuck there, because he will put in the partition just as they try to cross the boundary back over to the right. Because entropy is connected to volume (smaller volumes have a smaller # of microstates), the final state has less entropy than the initial state, due to having half the volume. And yet, no work was done and no heat was expended in the process! This seems to be a blatant violation of the 2nd law of thermodynamics. So what happened here?

Well, in the real world, demons don't exist. And humans do not have the supernatural powers that demons have that would enable them to see individual gas particles moving super fast around in a box. But what if we set up a computer that could play the role of the demon? In principle, a computer could detect a gas particle much faster than a human, maybe even as fast as Maxwell's hypothetical demon. But if it does this, it has to either store the information about where each of the gas particles is, or temporarily watch each gas particle for a moment and then forget about it. If it stores this information, then it needs to fill up an exponentially large memory storage system. If it wants to keep the storage from getting out of hand, then it has to erase some of this information at some point... and erasure of information is an irreversible process. Because it is irreversible, it must dissipate heat. I mostly understand this part of Maxwell's demon. The other part I've always been a little bit fuzzy on, though... what happens if the computer chooses to just store and store more and more information in its memory? Then it will be filling up its memory with more and more information about the trajectories of the billions and billions of particles in the box. But does this in itself represent an increase in entropy? Or is it just the erasure of such information which increases entropy? It seems to me that storing anything at a memory location which could have previously taken multiple values but is then set to a single value represents a decrease in entropy. It would seem that storing it decreases entropy and then erasing it undoes that, increasing it again. But I must be thinking about things a bit wrong there. I admit, this is where my understanding has always grown a bit fuzzy.
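For a sense of scale, Landauer's principle puts a number on the cost of that erasure: wiping out one bit has to dump at least kT ln 2 of heat into the environment. At room temperature that works out to a few zeptojoules per bit:

```python
import math

k_B = 1.380649e-23     # Boltzmann constant, J/K
T = 300.0              # room temperature, K

heat_per_bit = k_B * T * math.log(2)   # Landauer bound for erasing one bit
print(heat_per_bit)                    # ≈ 2.87e-21 J per erased bit
print(heat_per_bit * 8e9)              # erasing a gigabyte: ≈ 2.3e-11 J at minimum
```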

In the next part, I hope to actually get to wandering sets, and by extension, Boltzmann's brain paradox, Poincare recurrence, and Liouville's Theorem. But maybe that's ambitious. To be continued...
spoonless: (orangegray)
In the summer of 2003, exactly one decade ago, I had a few months to spare before I started graduate school with no full time employment. So I spent that time reading and reviewing as many books on physics as I could. Partly to prepare for graduate school, partly for fun, and partly because I knew the first thing I would have to do when I got to California was take the PhD qualifying exams. One of the most fun parts for me was rereading the interesting parts of my undergrad textbook on statistical mechanics, and making sure that I understood entropy and related concepts very well.

The 2nd law of thermodynamics says that entropy always increases as time progresses. One of the most common popular conceptions of entropy is that it's sort of like a measure of disorder. That's not exactly true--rigorously, it's the logarithm of the number of microstates in a system that correspond to a given macrostate--but if you're reading this and not familiar with it at all, disorder should be the first thing you think of when you think of entropy. The visual picture of the 2nd law you should have is that... when eggs fall from a great height, they naturally break into little tiny pieces. But you never see the time-reversed process happening, little tiny splinters of eggshells gathering themselves up and coming together to form a whole egg. In this way, there is an "arrow" associated with time: it has a direction to it that progresses from low entropy states to high entropy states but not the reverse. Even though all of the microscopic laws of physics are 100% reversible, the macroscopic progression of states is irreversible.

One argument that advocates of intelligent design often make is that if natural processes always go from low entropy states (a lack of disorder) to high entropy states (more disorder), then there is no natural process that can explain the spontaneous emergence of order from nothing. Hence, they claim, there must have been some intelligent being who injected such order into the system, unnaturally. So one of my projects that summer was to understand what's wrong with this argument, and why it isn't the conclusion one draws if you actually understand what entropy is and how it works. For the most part, I succeeded in understanding this--to sum it up, it's mostly that they are forgetting that the earth radiates heat out into space, and heat is a form of disorder--it contributes to entropy. The earth absorbs energy from the sun, turns some of it into useful work, organizing things, and lets some of it messily spill out into the surrounding space. Entropy decreases locally here on earth, in that things look more organized every century than they did the last, but this comes at the cost of dumping all this messy heat into outer space. (And lately, it seems, not all of that heat has even been making it out into space; some of it has been getting trapped in the atmosphere, warming the planet and causing disorganized things to happen like freak hurricanes and tsunamis! Probably another part of science that intelligent design advocates don't believe in.)

I say I succeeded for the most part. But after I made my own notes about everything--what energy, temperature, entropy, pressure, and volume were, and how these macroscopic properties of our world arise from microscopic physical laws that determine the motion of individual particles... after I'd mapped out all that, I realized there was one missing piece. I didn't quite understand the definition of reversibility, and how to tell what kind of a system would be reversible and what kind wouldn't. I knew that reversible processes were those where entropy remains constant, rather than increasing. My textbook said that a reversible process was one which was "slow enough to stay in equilibrium" and without the presence of any "dissipative forces". I understood the first half of that definition. If you expand a box to twice its size instantly, you go out of equilibrium because the air inside expands very rapidly, and this has the effect of increasing the entropy (entropy grows with volume, for a fixed temperature). By contrast, if you double the size by slowly and carefully expanding it, then you cool the gas as you expand it and the whole thing stays in equilibrium the whole time... there is no entropy increase, because miraculously, the entropy that would have been added by an increase in volume is cancelled out exactly by the decrease in temperature. The former process is irreversible, while the latter is reversible. But then there was this pesky second qualification they added to the definition... not only does it have to be slow, but it also has to lack "dissipative forces". But what the heck is a dissipative force? I searched all over for a definition of this but didn't find a good one that made sense to me.
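Here's a quick check of that cancellation using the monatomic ideal gas formulas (my numbers, not the textbook's): up to constants, S/Nk = ln V + (3/2) ln T. In a sudden free doubling of the volume the temperature stays fixed and the entropy per particle jumps by ln 2, while in a slow adiabatic doubling T drops by a factor of 2^(2/3) and the two terms cancel exactly.

```python
import math

def entropy_per_particle(V, T):
    """Monatomic ideal gas, S/(N k_B) up to an additive constant."""
    return math.log(V) + 1.5 * math.log(T)

V1, T1 = 1.0, 300.0
V2 = 2.0 * V1

# Sudden (free) expansion: no work done, temperature unchanged, entropy rises
dS_free = entropy_per_particle(V2, T1) - entropy_per_particle(V1, T1)

# Slow adiabatic (reversible) expansion: T * V^(2/3) stays constant, so T drops
T2 = T1 * (V1 / V2) ** (2.0 / 3.0)
dS_reversible = entropy_per_particle(V2, T2) - entropy_per_particle(V1, T1)

print(dS_free)        # = ln 2 ≈ 0.693 per particle (in units of k_B)
print(dS_reversible)  # = 0: the volume and temperature terms cancel exactly
```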

I knew various examples of forces that I thought of as being dissipative... friction, electrical resistors, hurricanes. And all of these things had certain things in common... they all seem like messy kinds of things that mess up the nice clean elegant laws of physics you usually learn in college, where you can count on things like conservation of energy and path independence. Electrical resistors radiate heat, and friction gives off heat. Hurricanes (or tornadoes, cyclones, or any kind of vortex in a fluid) exchange various stuff, including heat, with their surroundings. But if giving off heat was all there was to it, why didn't they just say that the system can't exchange heat with its surroundings? Couldn't they have just said it has to be a closed system? And most importantly, what is it really about these processes that makes them the kind of things that give off heat, while other types of processes don't? And what about mixing processes--when you pour cream into a coffee cup, and let it sit for a while, it eventually mixes into the coffee. This is an irreversible process, as it won't naturally unmix, and therefore it does increase entropy. But in what sense is this "dissipative"? Surely it was a slow process. Although perhaps it couldn't have been said to be in equilibrium before it got done mixing.

The concept of a wandering set purports to explain what exactly a dissipative system is, so I feel like it is the missing part I was looking for here. To be continued in part 3...
