Thursday, June 14, 2018

Notes on a Cosmology - Part 24b, Epilogue cont'd


One of the delights of quantum science is that it is filled with puzzles, paradoxes and surprises. In this post, I want to lay out the close connection between quantum superposition and parallel computation. At first, these might seem to be entirely unrelated ideas – after all, superposition is a wave phenomenon (continuous), while parallel computation describes a discrete process. As we saw in the previous post, however, the discrete and the continuous are not as incompatible as they might seem. Often, the distinction is a matter of perspective.

The Perl programming language has become a workhorse of Linux systems, bundled by default into most major distributions. Around 2000, a Perl package named Quantum::Superpositions was added to CPAN, the Perl package repository. This package added several features to the language, but the two most notable for our purposes are the any() construct and the all() construct. These constructs allow a program to treat a list as though it were a single value, in one of two different ways. If you are interested in a more detailed description, the documentation is quite straightforward. The irony is that these constructs have nothing to do with simulating quantum computations – so why name the package Quantum::Superpositions?
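Here is a minimal sketch of the two constructs in action, assuming the documented behavior of the module (the list of readings and the thresholds are invented for illustration):

use strict;
use warnings;
use Quantum::Superpositions;    # exports any() and all()

my @temperatures = (58, 64, 71, 69);

# any(): the test succeeds if the comparison holds for at least one element
if ( any(@temperatures) > 70 ) {
    print "at least one reading is above 70\n";
}

# all(): the test succeeds only if the comparison holds for every element
if ( all(@temperatures) > 50 ) {
    print "every reading is above 50\n";
}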

In order to understand the rationale behind the name, we have to understand exactly what is happening in a qubit that is taking advantage of the power of quantum superposition. Physicists will often describe it in terms of parallel universes – imagine that the quantum bit simultaneously exists in two separate universes and, in one universe, its value is 0 and in the other universe its value is 1. In this sense, the qubit is in “two mutually-exclusive states at once.” These kinds of mind-bending descriptions tend to make us feel either monumentally stupid or secretly skeptical. Like the story of the naked Emperor, however, we dare not air our skepticism lest we look like the only dunce in the room.

On modern computers, operating systems use a technique called “multi-tasking” to perform many unrelated tasks in such a way that they all appear to be performed in parallel. The core technique of multi-tasking is time-slicing … if there are 100 processes running on your system, the operating system will divide each unit of time in such a way that each process is allowed to run for a small slice of time before it is forced to give up control to the next process scheduled to run. Because the CPU in a typical computer is operating at billions of cycles per second, the user cannot perceive any observable artifacts from the time-slicing. When more hardware threads are available to the operating system (i.e. when running on a multi-core CPU), the operating system performs less time-slicing because each task can be allowed to run for a longer slice of time on each thread. Suppose we have a system with four threads: the load on each thread is one-fourth the load that a single-threaded system would carry. Thus, the system only needs to time-slice 25% as much, allowing each process to run four times as long without changing the total throughput of the system as perceived by the user. Now, suppose we have a system with eight threads. All these numbers change proportionally – each thread has 1/8th the load of a single-threaded system and we can allow each process to run eight times as long. As we increase the number of threads until we have one thread per process in the system, we reach a point at which every process can run 100% of the time, because every process can monopolize a thread on the CPU without interfering with any other running process – all while maintaining the same multi-tasking guarantees that the operating system must satisfy for the end-user.
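To make the time-slicing idea concrete, here is a toy round-robin scheduler; the process names, work amounts and slice length are invented for illustration:

use strict;
use warnings;

# Each "process" simply needs some number of time units of work.
my %remaining = ( browser => 12, compiler => 9, editor => 7 );
my $slice     = 3;      # time units a process may run before yielding
my $clock     = 0;

# Round-robin time-slicing: every runnable process gets one slice per pass.
while ( grep { $_ > 0 } values %remaining ) {
    for my $proc (sort keys %remaining) {
        next unless $remaining{$proc} > 0;
        my $run = $remaining{$proc} < $slice ? $remaining{$proc} : $slice;
        $remaining{$proc} -= $run;
        $clock            += $run;
        printf "t=%2d  ran %-8s for %d unit(s), %2d unit(s) left\n",
               $clock, $proc, $run, $remaining{$proc};
    }
}

With more hardware threads, each pass can hand several processes their slices simultaneously, which is exactly why the time-slicing burden shrinks in proportion to the number of threads.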

But given the fantastic performance of time-slicing on small numbers of threads, it is obvious that a hardware system with so many threads that every process is running on its own dedicated thread would be wasteful. Most processes (and the threads they are running on) would be idle most of the time. Most of the potential computing power of a CPU with hundreds of threads would be wasted. This brings us to the problem of automatic parallelization.

To understand the problem of automatic parallelization, let’s look at the behavior of a single computer program running on the CPU (suppose it has a complete monopoly on the CPU). How can this program keep the CPU as busy as possible, especially if the CPU has multiple threads? One approach is for the program to explicitly parallelize itself, cutting up its work into blocks so that each block can be dispatched to a separate thread on the CPU. Another approach – the approach that has dominated from the advent of superscalar CPUs in the mid-1990s until today – is to allow the hardware to try to “extract” parallelism from the software, without explicit assistance from the software. This way, the software can focus on describing what it is trying to compute, and the hardware can focus on how to compute it.

How does the hardware do this? After all, the hardware has only a very low-level “machine code” view of what the software is doing. The hardware distinguishes between two classes of instructions: non-branching instructions and branching instructions. Non-branching instructions always proceed in order, one after the other, like the steps in a cooking recipe. Branching instructions, by contrast, can “jump” to any other instruction, based on logic conditions. Branching instructions are how software implements conditional constructs like “if-then-else”. Whenever you want some part of a program to be executed only under certain conditions, you are going to need a branching instruction to implement that functionality in the machine code.
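For example, a single if-then-else in a high-level language boils down to a compare followed by a conditional branch. The pseudo-assembly in the comments below is purely illustrative and does not correspond to any particular instruction set:

use strict;
use warnings;

my $x = 5;
my $y;

# The high-level conditional...
if ($x > 0) { $y = 1 } else { $y = -1 }

# ...is implemented, conceptually, as a compare plus a conditional branch:
#
#       cmp   x, 0      ; non-branching: compare x against 0, set flags
#       jle   ELSE      ; branching: jump if x was less than or equal to 0
#       mov   y, 1      ;   "then" path
#       jmp   DONE      ; branching: skip over the "else" path
#   ELSE:
#       mov   y, -1     ;   "else" path
#   DONE:

print "$y\n";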

We can use code visualization tools to see how machine code looks to the hardware – the resulting picture is called a control-flow graph (CFG). A CFG shows only the branching instructions (as vertices in the graph) while all other instructions are abstracted away as graph edges. Whenever the CPU encounters a branch it faces a conundrum – which branch should it follow? It can wait to find out the outcome of the branch condition, but this wastes valuable cycles that the CPU could use to get a head-start on processing the branch path that will be taken. In order to avoid wasting cycles, the CPU performs “branch prediction”. It makes a guess as to which path will be taken and starts executing instructions down that path. Branches can be nested many levels deep and the branch predictor will attempt to make a reasonable guess at each one. When the branch predictor is wrong, the cost of misprediction is about the same as having waited for the branch condition to be evaluated, but when the branch predictor is right, the program is able to proceed as if there had been no branch at all.
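To give a flavor of how prediction works, here is a toy version of one classic scheme, the two-bit saturating counter; the sequence of branch outcomes is invented:

use strict;
use warnings;

# Two-bit saturating counter: states 0 and 1 predict "not taken" (N),
# states 2 and 3 predict "taken" (T).
my $counter = 2;
my @actual  = qw(T T N T T T N T);    # invented outcome history of one branch
my $correct = 0;

for my $outcome (@actual) {
    my $prediction = $counter >= 2 ? 'T' : 'N';
    $correct++ if $prediction eq $outcome;

    # nudge the counter toward the actual outcome, saturating at 0 and 3
    if ($outcome eq 'T') { $counter++ if $counter < 3 }
    else                 { $counter-- if $counter > 0 }
}

printf "predicted %d of %d branch outcomes correctly\n", $correct, scalar @actual;

Real predictors are far more elaborate (they track many branches and correlate their histories), but the principle is the same: bet on the statistically likely path and pay a penalty only when the bet is wrong.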

When a CFG becomes sufficiently deep and intricate, the branch predictor is bound to make mistakes and the penalties will pile up. A modern, superscalar CPU may have as many as eight available execution units (EUs) for branch-predicted instructions to proceed through – keeping all those EUs busy is the scheduler's job, a job that would be almost pointless without the branch predictor. But this general problem could be scaled up to the level of hardware threads, and automatic parallelization architectures aim to do just this.

One early proposal for automatic parallelization is called eager execution. If you have two available CPU threads and you encounter a branch instruction, rather than trying to predict which path the program will take, you simply set one thread to executing one branch path and the other thread to executing the other branch path and just discard the results of the wrong path. If you have four threads you can process two branches one right after the other – with eight threads you can process three sequential branches, with sixteen threads you can process four branches, and so on.
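A toy sketch of eager execution, with invented branch outcomes – every path through a few nested branches is executed speculatively, and everything but the path the conditions actually select is thrown away:

use strict;
use warnings;

# With enough threads, all 2**k paths through k nested branches can be
# executed eagerly; only one of them will turn out to be the real path.
my @conditions = (1, 0, 1);        # invented outcomes of three nested branches
my @paths      = ('');             # begin with a single, empty path

for (1 .. scalar @conditions) {
    # every pending path forks into a "taken" and a "not-taken" continuation
    @paths = map { ( $_ . 'T', $_ . 'N' ) } @paths;
}
printf "threads needed to cover every path: %d\n", scalar @paths;   # 2**3 = 8

# once the branch conditions resolve, all but one path is discarded
my $actual = join '', map { $_ ? 'T' : 'N' } @conditions;
my ($kept) = grep { $_ eq $actual } @paths;
print "surviving path: $kept\n";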

Obviously, eager execution cannot scale to parallelizing entire programs which may easily have thousands of branch instructions. A CPU with 1,000 threads utilizing EE would waste roughly 99.9% of its processing power on branch paths that would never occur and would still only be able to perform lookahead for a handful of branches at a time. To handle this scaling problem, the idea of disjoint eager execution (DEE) – and many other similar architectures – was proposed. A CPU that implements DEE combines branch-prediction and eager-execution to process the control-flow graph in probability-order – the most probable paths are scheduled across all available threads until there are no more available threads. As the results from those paths become available, the branch-predictor is updated, pruned paths are discarded and the currently most-probable paths are again dispatched across the available threads.
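Here is a simplified sketch of the dispatch policy; the path labels and their predicted probabilities are invented for illustration:

use strict;
use warnings;

# DEE-style dispatch: rank the open paths of the control-flow graph by
# predicted probability and hand the most probable ones to the available
# threads first.
my %probability = (
    'TT' => 0.56,    # both branches predicted taken
    'TN' => 0.24,
    'NT' => 0.14,
    'NN' => 0.06,
);
my $threads = 3;

my @ranked     = sort { $probability{$b} <=> $probability{$a} } keys %probability;
my @dispatched = @ranked[ 0 .. $threads - 1 ];

print "speculatively executing paths: @dispatched\n";

# As branch conditions resolve, mispredicted paths free their threads, the
# weights are re-estimated, and the next most probable paths take their place.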

Conceptually, this is an optimal utilization of a multi-threaded CPU for processing a serial program. Such a CPU, if it were built, would be utilizing all the available information about the program’s runtime behavior to make informed predictions about its future branch behavior, allowing the CPU to speculatively look ahead across dozens, maybe hundreds of branches at a time. This would allow a CPU with N threads to provide speedup in proportion to a non-trivial fraction of N.

The Quantum::Superpositions package provides us with an interesting way of visualizing what is happening at the hardware level of a DEE CPU. Let’s suppose we have written a program that searches for a specific number in a list of 1,000 numbers. Assuming ideal memory performance (no cost to read or write memory), our 1,000-thread CPU could potentially locate the number we are searching for in just 1 step (each available thread gets assigned to perform one of the search comparisons). Using the any() construct from Quantum::Superpositions, our search would look something like this:

use Quantum::Superpositions;

if ( any(@number_list) == $search_number ) {
    print "Found it!";
}

Read: “If any number in number_list is equal to the search_number, tell the user we found it; otherwise, do nothing.” This construct suggests that there is another way of thinking about the search procedure. Suppose that the any() construct actually splits the universe into separate parallel universes until it completes. Instead of having a 1,000-thread CPU, we would have 1,000 copies of a single-threaded CPU, each in its own parallel universe. The overall construct would exhibit the same net behavior either way: we search through 1,000 numbers in 1 time-step. (For me, this was the “Aha”-moment. Your mileage may vary.)

More sophisticated models of branches in the CFG would assign a weight to each edge. They might even “expand” the CFG itself by duplicating the entire graph relative to branches that choose between independent weight-assignments across many edges in the graph[1]. From the point-of-view of the multi-threaded CPU, the CFG looks like a weighted spanning tree (a kind of probability-tree) at any given time. But this would be equally true for our single-thread-parallel-universe CPU … some parallel universes would have more weight than others. I assert that the mathematics that describes the behavior of superposed qubits exactly corresponds to the weighted-parallel-threading CPU model I have described[2]. In short, a DEE CPU using an ideal CFG model would have exactly the same net behavior as a quantum computer processing the same CFG using a number of qubits that is logarithmic in the number of threads of our DEE CPU. Ten qubits could theoretically match the performance of a 1,024-thread DEE CPU; 20 qubits could theoretically match the performance of a roughly 1.05-million-thread DEE CPU, and so on.

But hold on a moment -- where is the quantum fabric of the universe getting the CFG model from? There is no reason to believe that the quantum computer can divine what we’re trying to compute just from the quantum operators we supply for the computation. What the mathematics of quantum computation conceal is the assumption of ideal traversal of the CFG as guided by the weights on its edges – these weights would correlate to the amplitudes of each superposition state within a quantum computation. Think of each of the exponential number of states in a quantum computation as being a path through the spanning tree of the CFG – if we don’t place weights on the spanning tree, this is exactly equivalent to Eager Execution! On a computation spanning just 10 qubits, 99.9% of the computation would be wasted, and this waste factor increases exponentially in the number of qubits.

The mathematics that describes the bulk of quantum computation does not view the discarded paths as waste. Rather, it is taken as a given that Nature performs computation for free, so all we have to do is give her the problem we want to be solved and then wait while she solves it. But the practical challenges of protecting qubits from decoherence have encouraged research into a new area – quantum error-correcting codes. What I assert is this: quantum error-correcting codes are logically equivalent to the weights on a weighted CFG used in a multi-threaded DEE CPU.[2]

Let us think of every particle of matter in the Universe as though it were a tiny, single-threaded CPU, capable of performing some kind of simple functional computation. We are surrounded by a veritable ocean of such particles. If we could somehow program these particles to work together at an atomic scale, we could build a plain old classical multi-threaded CPU that could compute immense problems through sheer scale – a quadrillion such particles could easily fit within the confines of a warehouse and could compute problems at scales that are hardly imaginable for us, today. The promise of quantum computation is that we do not need to have a particle-per-thread because Nature is very clever and she uses her resources much more efficiently. Instead, we can (theoretically) obtain exponential advantage by properly preparing qubits. But once we have prepared our qubits – or so the theory goes in some quarters – Nature stops being so stingy and she will compute countless parallel Universes almost all of which will be discarded at the moment of measurement when all the possible paths through quantum state-space are discarded and only the actual path remains.

To be clear, the theory of quantum computation is correct. But the most optimistic expectations of quantum speedup (exponential in the number of qubits) do not take into account the complexities of practical quantum computation. The further we go down the path of quantum speed-up, the greater the challenges are going to be. Quantum speedup faces a law of diminishing returns, nay, it faces a difficulty curve that becomes asymptotically vertical. You can only get particles so close to absolute zero, you can only isolate so much environmental noise, you can only maintain the coherence of qubits for brief periods of time. Each of these challenges increases in difficulty at a geometric rate, in the number of qubits. Quantum error-correction is a promising path forward but, judging as an outside observer, I sense a certain aloofness among the quantum physics community, as though quantum computers are just one breakthrough away from obsoleting all prior computing technologies. It’s not that easy. Even Nature cannot escape the laws of information theory, and it is those laws that place fundamental limits on what both classical and quantum computers can do.

I propose a new way of thinking about quantum computation. I propose that quantum computation is a viable candidate for the characteristica universalis (CU), a concept that goes back to the 17th-century philosopher and co-inventor of the calculus, Gottfried Leibniz. The idea of the CU is that there must be some language which is ideal for discussing any problem of science, mathematics or physics – a perfect language. This perfect language would enable us to think so much more clearly that it would become the language in which we think about everything, even fuzzy, complicated subjects like law, social norms, economics and politics. This ideal language would be so well-defined, Leibniz imagined, that it could be processed by a mechanical system, in much the same way that modern programming languages are mechanically processed by computers. He called the device that would process the CU the calculus ratiocinator (CR). I propose that Nature herself – in all her quantum subtlety – is the CR that Leibniz was seeking.

In summary: I assert that the Simulation Hypothesis is the case and that quantum computation is the characteristica universalis; the CU is processed by the universal simulator, which is more accurately thought of as being an instantiation of Leibniz’s reasoning machine, the calculus ratiocinator. There are no “levels” of simulation in the vein of simulation-based science-fiction, whereby we are trapped inside of an illusion or construct. Rather, the Universe, considered in its entirety, is self-simulating or co-simulating, more like a peer-to-peer network than like a centrally-controlled computation.[3]

Next: Part 24c, Finis (stay tuned)

Footnotes:

[1] – Suppose when the program starts, the user can select between one of several operating modes… the program’s behavior might be radically different in each of those modes. In that case, you would want to model each of those different modes as separate copies of the entire CFG with each having their own weight-assignments.

[2] – One caveat is that our CPU would have to be probabilistic in order for this logical equivalence to hold, not deterministic like the ordinary CPUs we are familiar with.

[3] As a closing note, the astute reader familiar with quantum phenomena will notice that I have not mentioned entangled states. It is sometimes implied that entangled states illustrate that quantum computers can do things that classical computers cannot -- this is partly true because classical computers cannot exhibit specifically quantum phenomena. Nevertheless, every computational consequence of quantum phenomena can be -- and is -- simulated by classical computers, including entangled states. In short, entangled states are just one among many counter-intuitive consequences of the mathematics of quantum computers.

Saturday, June 9, 2018

Notes on a Cosmology – Part 24a, Epilogue


In the next few posts, I plan to wrap up the Notes on a Cosmology series and draw some general conclusions. The idea I want to convey is very difficult to put into words, not because it is a particularly complicated or novel idea, but because there are so many ways to miscommunicate it.

Let’s begin the project of summarizing the series by looking at one of the emerging technologies of our time: Bitcoin. The principles of Bitcoin are defined as follows:

  • Fixed supply: There will only ever be 21 million coins.
  • No censorship: Nobody should be able to prevent valid transactions from being confirmed.
  • Open-Source: Bitcoin source code should always be open for anyone to read, modify, copy and share.
  • Permissionless: No arbitrary gatekeepers should ever prevent anybody from being part of the network (as a user, node, miner, etc.).
  • Pseudonymous: No ID should be required to own or use Bitcoin.
  • Fungible: All coins are equal and should be equally spendable.
  • Irreversible Transactions: Confirmed blocks should be set in stone. Blockchain history should be immutable.

This may seem to be a topic far removed from the lofty ontology of the Holy Trinity and the other ideas I have covered throughout the series. But it is not so far removed. Let’s re-word the Bitcoin principles slightly (while retaining their essential properties):

  • Fixed, finite pie – no one can pad their pockets or create additional value for themselves, ex nihilo
  • Non-obstruction – everyone can engage the system without the possibility of being censored
  • Reverse-engineering is allowed – no one owns a patent on the system or can act as its central director
  • Permissionless – similar to non-obstruction; you don’t have to ask permission to participate in the system
  • Secrecy is possible within the system (if you don’t divulge your key, no one can access the associated funds)
  • Homogeneity – every part of the system follows the same rules as every other part of the system
  • Irreversibility – no take-backs

If you think about it, these properties describe another system that is very familiar to all of us, whether tech-savvy or not. This system is called the world. Look around you, and ask:

  • Is there anyone who can wave a magic wand and bring gold bars into being and thereby make themselves rich? No. The physical pie is a fixed, finite pie.
  • Is there anyone who can obstruct you from using your own body? Sure, you can be put in jail or even in restraints. But this does not obstruct you from using your own body within those constraints, it merely encloses the extents in which your body can exist, while you remain free as ever to use your body. You could be drugged but this entails some loss of ordinary consciousness, and it is the ordinary world that I mean to examine.
  • Is there anything preventing the rules of Nature from being reverse-engineered and copied verbatim? No. You are free to reverse-engineer the very fabric of reality itself and copy it, verbatim, if you are able.
  • Must you ask anyone’s permission in order to be alive? Sure, someone might kill you if you do not do what they want. But that is not the same as having to ask permission, it just means that your existence is not unconditional and permanent.

As far as we know, it is possible to have secrets in the physical world; that is to say, no one has proved that the physical world is a holographic construct in which all facts are accessible (with sufficient energy) to every observer within that construct. The laws of the physical world are homogeneous – the speed of light on Arcturus is the same as the speed of light on Earth. Finally, all physical processes are irreversible.

It is, at once, remarkable and unremarkable that the principles of a currency that was invented to be ideal, in some sense, would happen to correlate with properties of our world. The correlation is remarkable because physics appears to be a very different kind of thing than a digital currency. But it is unremarkable in that we are physical beings – what we consider to be useful and relevant is inescapably shaped by our physical mode of being.

My purpose in mentioning the correlation between the principles of the world’s largest cryptocurrency and some of the properties of the physical world is this: even though computational systems and material systems appear to be very different, our belief in this “computational dualism” is purely our own prejudice. We see the physical world as being very different from the artificial world only because we are so used to the physical world.

I have recently done a deep-dive into the subject of artificial neural net (ANN) architectures. One of the things I have realized as a result of this study is that the relationship between discrete and continuous computation is not well understood; at least, there is some persistent confusion across the various specializations. If we were to identify one property that makes artificial neural nets so useful and flexible, it would have to be that they are end-to-end differentiable. The backpropagation algorithm is possible because differential relationships work equally well going from the input to the output or vice-versa – the chain rule lets derivative information flow backward just as readily as values flow forward. Differential equations merely specify the differential relationships between two or more variables; they do not specify which variable “causes” or “drives” which. For this reason, we can use an iterative training method that alternately forward-propagates values and back-propagates corrections in any set of equations that is end-to-end differentiable (including ANNs).
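As a toy illustration of the forward/backward interplay, here is a one-parameter “network” trained by gradient descent; the data and learning rate are invented, and a real ANN simply repeats this pattern across millions of parameters:

use strict;
use warnings;

# Toy end-to-end differentiable model: y = w * x, loss = (y - target)**2.
my @data = ( [ 1.0, 2.0 ], [ 2.0, 4.0 ], [ 3.0, 6.0 ] );   # (x, target) pairs
my $w    = 0.5;                                            # initial weight
my $rate = 0.05;                                           # learning rate

for my $epoch (1 .. 50) {
    for my $pair (@data) {
        my ($x, $target) = @$pair;
        my $y    = $w * $x;                     # forward pass
        my $grad = 2 * ($y - $target) * $x;     # backward pass: d(loss)/dw via the chain rule
        $w      -= $rate * $grad;               # gradient-descent update
    }
}
printf "learned w = %.3f (the data were generated with w = 2)\n", $w;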

The physical world has a property very similar to end-to-end differentiability – our best physical theories describe the world using complex-valued functions; we can use analytic continuation to extend these functions beyond the reach of observation. The logic of analytic continuation is this: supposing the world continues to behave at very large and very small scales in a manner that is consistent with the way it behaves at scales we can observe, then it must behave so-and-so. Note that such reasoning is not a substitute for experiment; we only resort to this kind of indirect reasoning at those scales where experiment is simply not possible with current technology. The key point is this: analytic functions are everywhere differentiable. This means that every aspect of the physical world that is described by our best theories is amenable to back-propagation!

In Part 16 of this series, I described the quantum monad. Key to understanding the idea of the quantum monad is understanding the relationship between continuous information and discrete information. Mathematics leads us to think of the discrete and the continuous as two irreconcilably separate domains. But the physical world exhibits a unity between discreteness and continuity. Our best theories of physics describe a continuous world, a world that is everywhere differentiable – yet discrete signals abound within this world. Speech, the written word and hand gestures are all familiar examples of discrete signals. Every digital electronic computer is built using analog circuitry, and yet the electronic digital computer is a physical system that almost perfectly implements idealized, discrete computation. What this tells us is that discreteness can be thought of as a matter of staying far away from boundary conditions. In digital electronics, the boundary condition is called the non-allowed region – the circuit is simply not allowed to remain at voltage levels that are not clearly “high” or clearly “low”. Any circuit that stays in this region is exhibiting undefined behavior. Note that such conditions commonly arise in real electronic circuits as the result of high-impedance states, but a logic failure is likely if these high-impedance states are used as input to logic gates.
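A tiny sketch of the convention: below, an analog voltage is read as a digital level, with a forbidden band in the middle. The thresholds are illustrative (roughly TTL-like) and are not taken from any particular datasheet:

use strict;
use warnings;

# Read an analog voltage as a digital logic level, with a non-allowed region.
sub logic_level {
    my ($volts) = @_;
    return 0 if $volts <= 0.8;     # clearly "low"
    return 1 if $volts >= 2.0;     # clearly "high"
    return undef;                  # non-allowed region: undefined behavior
}

for my $v (0.2, 1.4, 3.1) {
    my $bit = logic_level($v);
    printf "%.1f V  ->  %s\n", $v,
           defined $bit ? $bit : "undefined (forbidden band)";
}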

Another, lesser-known form of electronic computation that enjoyed some popularity in the pre-PC era is called stochastic computation. The basic principle of stochastic computation is to binarize a signal while treating the value of that signal as a ratio of its levels – if a signal is ‘1’ 70% of the time and ‘0’ 30% of the time, then the value of the signal is 0.7. Note that the mathematics that describes the signal values in a stochastic system is continuous, not discrete. The resolution of measurement is, at any given time, finite, but this is a limitation that is very well understood, since physical theories must always take measurement error into account. The point is that we can build a discrete system and use continuous mathematics to correctly describe its behavior, just as we can build continuous systems and implement discrete-symbol systems within them.
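The classic trick of stochastic computation is that multiplication becomes a single AND gate: if two independent bitstreams carry the values p and q, the stream produced by ANDing them together carries the value p*q. A minimal simulation (the values and stream length are invented):

use strict;
use warnings;

my $p = 0.7;
my $q = 0.4;
my $n = 100_000;    # bitstream length

my $ones = 0;
for (1 .. $n) {
    my $a = rand() < $p ? 1 : 0;    # stream encoding 0.7
    my $b = rand() < $q ? 1 : 0;    # stream encoding 0.4
    $ones++ if $a & $b;             # AND gate: a '1' appears with probability p*q
}
printf "estimated product: %.3f (exact: %.2f)\n", $ones / $n, $p * $q;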

An article just published in Nature has calculated that the efficiency gains of artificial neural networks built with analog circuitry are around 100-fold over the GPUs that are commonly used today. Biological brains – including the human brain – are analog computers, not digital computers. We can see that there is a deep relationship among the analog, the discrete, noise tolerance, power consumption and stability. This relationship is one of tradeoffs, not a black-and-white choice between discrete and analog.

The theory of quantum computation predicts (and experiment confirms) that computations using qubits are able to harness modes of computation not available to classical computers. But here’s the punchline: the mathematics of quantum systems straddles the very same divide as the discrete-analog distinction. Is it a wave? (Continuous?) Is it a particle? (Discrete?) The correct answer: it is both. This is true of all real information and all real computation. There is no exception – all digital computers are actually just analog systems that we interpret discretely by staying far away from the boundary conditions.

The theory of computational complexity tells us how hard it is to compute the solutions to different kinds of mathematical problems. Certain problems are easy to solve. Others are very difficult. Some are provably impossible. But the domain of computational complexity theory is restricted to symbolic computation (discrete computation) – the answer to hard mathematical problems can sometimes be found very directly with analog methods. The circuits used to implement artificial neural networks are a good example of this disconnect. Digital multiplication is an O(n²) operation, which basically means that the number of steps required to calculate the multiplication grows as the square of the number of digits in the numbers to be multiplied. But an analog multiplier is very simple and its time complexity is O(1) – it returns a result with a single, fixed delay regardless of the number of digits in the multiplication. The limitation is in the precision of the multiplier. The only way to get more digits of precision is to measure the output of the multiplier more precisely, and experience shows that the cost of measurement grows geometrically with each additional bit of precision.
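To see where the n² comes from, here is the familiar schoolbook algorithm, in which every digit of one number must meet every digit of the other; the analog multiplier skips this loop entirely and pays instead in measurement precision:

use strict;
use warnings;

# Schoolbook multiplication of two decimal digit strings: n digits times
# n digits requires roughly n*n single-digit multiplications.
sub multiply_digits {
    my ($a, $b) = @_;
    my @x = reverse split //, $a;
    my @y = reverse split //, $b;
    my @result = (0) x (@x + @y);

    for my $i (0 .. $#x) {              # n iterations ...
        for my $j (0 .. $#y) {          # ... times n iterations = n*n steps
            $result[ $i + $j ] += $x[$i] * $y[$j];
        }
    }
    for my $k (0 .. $#result - 1) {     # propagate carries
        $result[ $k + 1 ] += int( $result[$k] / 10 );
        $result[$k] %= 10;
    }
    my $out = join '', reverse @result;
    $out =~ s/^0+(?=\d)//;              # trim leading zeros
    return $out;
}

print multiply_digits('1234', '5678'), "\n";    # prints 7006652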

Deep Learning is just the first baby step towards something much bigger. Some people are confidently predicting that we will soon build quantum neural networks, but it is difficult to imagine how we are going to do that when we are not yet even harnessing the power of analog electronic neural networks or optical neural network technologies. We know that the mathematics of classical systems (such as analog computers and digital computers) is just a special case of the mathematics of quantum systems. Despite the computational speedup that quantum computers can deliver over classical computers, there is no well-defined problem that a QC can solve that an analog or digital computer cannot also solve, given enough time and resources. The difference is one of degree, not of kind. The power of quantum computers is not the result of some kind of quantum pixie-dust.

From the information theoretic point-of-view, quantum systems are nothing more or less than continuous systems that admit of discrete conventions, like the non-allowed regions in a digital logic circuit. The mathematical implications of these kinds of continuous systems are subtle and easy to miss from a purely physical perspective. When a physicist has a theory of physics, he takes this to mean that the entire evolution of the physical system is, in some sense, predictable or determined by the parameters of the theory itself (even if the theory is probabilistic, as quantum theory is). But if that physical system can compute, then its long-run evolution might be undecidable. Specifically, the long-run evolution of any physical computer that can simulate a Turing machine is undecidable; otherwise, we could build such a physical machine and use it to solve the halting problem[1].

I have often encountered confusion about the underlying basis of quantum computation, a confusion that I think it is important to dispel. A quantum computer cannot exist in two, mutually exclusive states at the same time. This is true by definition because what we mean by “mutually exclusive states” is that nothing can be in both of those states, at the same time. Quantum experiment only breaks our intuitive notions of the microscopic world, a world that we envision as consisting of tiny classical marbles hurtling unimpeded through empty space and colliding with one another like microscopic billiard balls. This intuitive notion is untenable and is thoroughly contradicted by laboratory experiment.

In Part 16, I laid out the cosmological idea of the quantum monad. The quantum monad is the idea of abolishing the veil of mystery that surrounds quantum phenomena. The problem with quantum physics is not that it is quantum; the problem with quantum physics is how we talk about it. It is one thing to have wonder and awe at the intricate patterns of the natural world. It is another thing to speak of the natural world in frankly magical language, something that frequently happens in popular discussions of quantum physics. If we mean to do science, then we must dispense with non-rational and non-causal thinking.

Let’s return to the two-slit experiment. If we perform this experiment in a shallow pool of water perturbed by a single wave source, we will find that the interference patterns produced exactly correspond to the quantum experiments. Light was thought to be a wave by many early modern physicists. So, there is no surprise here. The surprise arises when we perform the experiment with a single slit – in this case, the pattern formed by the light is not like that which we would see in a shallow pool of perturbed water. Instead, we see the light striking the back-screen in a particle-like pattern. Some early physicists (including Isaac Newton) thought that light was a particle like any other. The surprise of quantum experiment is that light behaves as either a wave or a particle. But light never behaves as both at once. It either exhibits wave-like properties, or it exhibits particle-like properties.

Of course, quantum particles exhibit many other counter-intuitive properties. If we perform the two-slit experiment by emitting a single photon (or electron) at a time, we will see the same wave-pattern on the back-screen as if we had emitted the photons or electrons all at once. Quantum theory explains this phenomenon by extending the wave-equation through both space and time. Of course, classical fluids and gases do not behave this way. Water waves interfere as they do because the countless water molecules are interfering with each other, all at once.

The analogy I assert is this: quantum systems are to classical systems as analog computation is to digital computation. This analogy is a loose one. The idea is this – discrete computation is just a convention, the only computers we can actually build are analog systems. Discrete computers get their discreteness from staying well clear of boundary conditions – nothing more. Similarly, quantum theory says that all classical systems are really just quantum systems with so many copies (nearby quantum particles in similar states) that they exhibit classical behavior.

I opened this post by discussing the principles of Bitcoin in order to make the following point: it is conceivable that all the properties we associate with the material world are what they are for computational reasons. This idea puts the Simulation Hypothesis in a new light – rather than being a reflection of a generic existential angst or nihilism, perhaps the idea that the world is a simulation has a solid basis in laws that must regulate any information system. If this is the case, then we can see every particular fact of the material world in light of a symbolic computation. I will expand on this idea in more depth in an upcoming post.

Next: Part 24b, Epilogue cont'd

----

[1] – As a pedantic point, any finite physical machine is fully computable because it has only finite memory resources; however, we can make a limiting argument to counter this… we mean by “physical computer that can simulate a Turing machine” any physical computer whose indefinite operation is merely a matter of adding more of a homogeneous physical resource, such as loading more and more tape into a mechanical Turing device.
