Tuesday, September 12, 2017

Cryptocurrency, rendering water and very large-scale computation

Consider the following video clip:

[video clip: a natural scene of flowing water]

This is a natural scene. To render this scene artificially using a computer would be a highly non-trivial task: there is no simple algorithm that handles even one aspect of the computation. A "bottom-up" approach of computing the dispersion of light through the air and water according to a wave equation would be unimaginably expensive. This is the kind of thought that motivated Feynman's remark about the computational aspect of the laws of physics:

It always bothers me that, according to the laws as we understand them today, it takes a computing machine an infinite number of logical operations to figure out what goes on in no matter how tiny a region of space, and no matter how tiny a region of time. How can all that be going on in that tiny space? Why should it take an infinite amount of logic to figure out what one tiny piece of space/time is going to do? So I have often made the hypothesis that ultimately physics will not require a mathematical statement, that in the end the machinery will be revealed, and the laws will turn out to be simple, like the chequer board with all its apparent complexities. - Richard Feynman, The Relation of Mathematics to Physics

Modern digital simulation exists in tension with Nature. Despite the massive scale of computation available in modern systems, rendering water and other natural phenomena requires layers and layers of "hacks": simplifying the scenes, approximating visual effects just well enough to "fool" the human eye, and taking every available computational shortcut to speed up the computation.

The natural environment, by contrast, makes computation appear completely effortless. Nature keeps notoriously rigid accounts - energy, mass and so on are exactly conserved. Yet Nature appears to consider computation so cheap that it "throws away" an apparently unlimited amount of it on every detail of the physical state of the world.

This suggests that there is a problem with the way we understand computation. Ethereum, the major challenger to Bitcoin for the cryptocurrency top spot, is based on the idea of charging a very small amount of money for each step of a computation so that the network can perform distributed, secure computations in order to implement smart contracts. This idea is ingenious for the purposes of smart contracts, but it is diametrically opposed to Nature's attitude toward computation, which is that computation is so free and abundant as to require no economization at all. In short, Ethereum is suitable for highly secure distributed computation, but it is not suitable for very large-scale computation.
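
To make the per-step pricing concrete, here is a minimal sketch in Python of a gas-metered interpreter. This is a toy model, not the actual EVM: the cost schedule and the OutOfGas behavior are invented for illustration, but the principle is the same - every step of the computation must be paid for in advance.

```
# Toy illustration of gas-metered execution (not the real EVM).
# Every step deducts from a prepaid gas budget; the computation
# halts the moment the budget runs out.

class OutOfGas(Exception):
    pass

def metered_run(steps, gas_budget, cost_per_step=1):
    """Run a sequence of zero-argument callables, charging gas for each."""
    gas = gas_budget
    results = []
    for step in steps:
        if gas < cost_per_step:
            raise OutOfGas(f"out of gas after {len(results)} steps")
        gas -= cost_per_step
        results.append(step())
    return results, gas

# A computation that is trivial on ordinary hardware becomes
# something that must be paid for, step by step.
steps = [(lambda i=i: i * i) for i in range(1000)]
results, remaining = metered_run(steps, gas_budget=1500)
print(f"completed {len(results)} steps with {remaining} gas left")
```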

So what would a truly large-scale, distributed computation network look like? Cryptocurrencies are making possible new models of large-scale cooperation that were not possible before their invention. According to the source linked above, Disney used 55,000 "cores"[1] to render Moana. From this number, we can surmise that the turnaround time on pre-production rendering of a single scene was probably on the order of a few hours to a couple of days. Nature performs far more than this amount of computation throughout the Cosmos, in parallel, 24/7, down to the tiniest physical detail, and it does all of this apparently without any expenditure of energy! This should encourage us to look at how to compute at much larger scales.
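
For the curious, here is the rough back-of-envelope behind that surmise. The per-frame cost and scene length below are assumptions chosen for illustration, not figures from the source:

```
# Back-of-envelope for render turnaround. The per-frame cost and
# scene length are assumed values, not reported figures.
cores = 55_000               # reported size of the Moana render farm
core_hours_per_frame = 100   # assumed average cost of a film-quality frame
frames_per_scene = 2_000     # assumed: roughly 80 seconds at 24 fps

scene_core_hours = frames_per_scene * core_hours_per_frame
print(f"~{scene_core_hours / cores:.1f} hours on the full farm")  # ~3.6
# Since the farm is shared across many scenes and shots at once,
# wall-clock turnaround stretches from hours into days.
```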

Suppose we had a global, distributed network of computers that could cooperate on a single computation and generate revenue for their operators (thus incentivizing them to join the network). In fact, we already have such networks - they are called cryptocurrencies. Cryptocurrencies don't actually compute anything other than themselves, so they are not of direct use. We can, however, use a cryptocurrency to arrange payment between computation-consumers and computation-producers. In this way, it is possible to construct a global, distributed network that can perform truly massively parallel computation[2].
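
A minimal sketch of what that payment arrangement might look like, with an in-memory dict standing in for the cryptocurrency ledger and a trivial recomputation standing in for result verification - both loud simplifications of what a real network would need:

```
# Toy sketch of the payment rail between a computation-consumer and
# a computation-producer. A real system would replace the dict with
# a cryptocurrency ledger and the check with a verifiable proof.
balances = {"consumer": 100, "producer": 0, "escrow": 0}

def post_job(price):
    """Consumer locks the fee in escrow and publishes the job."""
    balances["consumer"] -= price
    balances["escrow"] += price
    return {"price": price, "task": lambda x: x * x, "input": 7}

def fulfill(job):
    """Producer performs the computation and returns the result."""
    return job["task"](job["input"])

def settle(job, result):
    """Release the escrowed fee if the result checks out."""
    if result == job["task"](job["input"]):  # stand-in for real verification
        balances["escrow"] -= job["price"]
        balances["producer"] += job["price"]

job = post_job(price=10)
settle(job, fulfill(job))
print(balances)  # {'consumer': 90, 'producer': 10, 'escrow': 0}
```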

The amount of computation that the Bitcoin network performs today is about 8 exahashes per second. This dwarfs, by many orders of magnitude, the capacity of the render farm Disney used for Moana. In short, the Bitcoin network is a compute network that could (in principle) have rendered the entirety of Moana in a matter of minutes to hours, instead of days to weeks.
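
Hashes and general-purpose operations are not directly interchangeable, so any conversion between them is loose; but even with the crude assumed factors below, the gap spans several orders of magnitude:

```
# Order-of-magnitude comparison; both conversion factors are assumed.
import math

hashes_per_sec = 8e18        # Bitcoin network hash rate, circa 2017
ops_per_hash = 4_000         # assumed: ALU operations per double SHA-256
bitcoin_ops = hashes_per_sec * ops_per_hash      # ~3e22 ops/s

farm_cores = 55_000
ops_per_core = 1e10          # assumed: a ~10-gigaop general-purpose core
farm_ops = farm_cores * ops_per_core             # ~5.5e14 ops/s

print(f"gap: about 10^{math.log10(bitcoin_ops / farm_ops):.0f}")  # 10^8
```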

We need to readjust how we think about computing-at-scale. No matter what happens to Bitcoin and Ethereum, cryptocurrency is never going away; it's here to stay. The commercial computation services that will be built with the aid of cryptocurrencies are inevitable, and among these services will be distributed computing. In 2017, Bitcoin is already performing computation at a scale that could render an entire 3D CGI film in a matter of minutes, and Bitcoin isn't even designed to perform outsourced computation. The scale of distributed computation in the cryptocurrency market is only going to grow.

Imagine a physics-simulation package used for aircraft wing design that can update aerodynamic simulations in real time in response to deformations of the aircraft's design. Imagine using this to build a genetic-algorithm-based aerodynamics simulator that searches through the space of possible aircraft and wing shapes to find the most efficient shapes in a matter of minutes or hours. These kinds of computations are not a question of having the right algorithms - we already have algorithms that can do all of this. It is a question of the time-scales involved. There is no point in implementing an algorithm that can only find useful answers in a matter of decades or centuries. The algorithms for solving many of the world's most important computational problems are dead simple. They just require unimaginably large amounts of computation to generate useful solutions.
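
The skeleton of such a search really is that simple. Below is a minimal sketch: the "shape" is just a vector of design parameters and the fitness function is a placeholder where a real aerodynamic simulation would go. All of the expense lives inside the fitness evaluations, which is exactly the part that parallelizes:

```
# Skeleton of a genetic search over wing shapes. The fitness function
# is a stand-in; a real implementation would run a CFD simulation.
import random

def fitness(shape):
    # Placeholder for an expensive aerodynamic simulation.
    return -sum((x - 0.5) ** 2 for x in shape)

def mutate(shape, rate=0.1):
    return [x + random.gauss(0, rate) for x in shape]

def crossover(a, b):
    return [random.choice(pair) for pair in zip(a, b)]

def evolve(pop_size=50, n_params=8, generations=100):
    population = [[random.random() for _ in range(n_params)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # This loop is embarrassingly parallel: every fitness call
        # could be farmed out to a different machine.
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 4]
        population = parents + [
            mutate(crossover(random.choice(parents), random.choice(parents)))
            for _ in range(pop_size - len(parents))
        ]
    return max(population, key=fitness)

best = evolve()
print("best shape:", [round(x, 3) for x in best])
```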

At 10 nanometers, we have more or less reached the end of the road for scaling down integrated circuits. This means that chip frequencies are not going to go much higher and chip power consumption is not going to go much lower. However, we have hardly begun to explore the power of parallelism in computation. The most important obstacles to harnessing parallelism are the economic obstacles, not the algorithmic obstacles. Parallelism is a well-studied problem and modern systems already automatically parallelize computation. But the kind of parallelism that is most widely used today is best described as centralized parallelism, as opposed to de-centralized parallelism.

Centralized parallelism is more efficient for small-scale problems and exists in modern computational systems in several forms - instruction-level parallelism (ILP), thread-level parallelism (TLP), single-instruction-multiple-data (SIMD), and so on. Even an ambitious computation like rendering a CGI movie utilizes centralized parallelism, because the render jobs are scheduled centrally.
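
In miniature, "scheduled centrally" looks like the following sketch: a single scheduler owns the complete list of jobs and the complete set of workers, and nothing happens without its say-so (the render function is a stand-in):

```
# Centralized parallelism in miniature: one scheduler (the main
# process) owns the work list and hands frames to a local pool.
from concurrent.futures import ProcessPoolExecutor

def render_frame(frame_number):
    # Stand-in for an expensive per-frame render.
    return sum(i * frame_number for i in range(100_000))

if __name__ == "__main__":
    # The scheduler knows every worker and every job up front.
    with ProcessPoolExecutor() as pool:
        frames = list(pool.map(render_frame, range(48)))
    print(f"rendered {len(frames)} frames")
```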

For very large-scale computation, we need to leverage de-centralized parallelism. De-centralized parallelism is less efficient for small-scale computational problems. But de-centralized parallelism can attain higher computational efficiency on large-scale computations by eliminating most of the need to communicate with a central scheduler. To this end, de-centralized parallelism utilizes techniques like best-effort computation (component computations are allowed to fail or disappear without affecting the overall result), distributed job scheduling (some component computations may be duplicated), and actor-model computation (component computations model the network around themselves, reducing the need to communicate). A scatter-gather approach is used to broadcast available work into the network, and the results are later "mailed back" to the initiator as they become available.
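
Here is a small sketch of the scatter-gather, best-effort pattern, simulated with threads in a single process; in a real network the shared queue would be replaced by broadcast to untrusted peers. Tasks are deliberately duplicated so that individual worker failures do not affect the overall result:

```
# Best-effort scatter-gather, simulated in-process with threads.
import queue
import random
import threading

work = queue.Queue()
results = {}
lock = threading.Lock()

def worker():
    """Best-effort worker: may silently drop any task it picks up."""
    while True:
        try:
            task_id, payload = work.get_nowait()
        except queue.Empty:
            return
        if random.random() < 0.3:   # this worker fails on this task
            continue
        with lock:                  # "mail back" the result to the initiator
            results.setdefault(task_id, payload ** 2)

# Scatter: broadcast every task into the network twice, so that
# duplication masks individual worker failures.
for task_id in range(20):
    for _ in range(2):
        work.put((task_id, task_id))

workers = [threading.Thread(target=worker) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()

# Gather: accept whatever came back; anything still missing can
# simply be re-scattered.
print(f"received {len(results)} of 20 results")
```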

Within a decade, cryptocurrencies are going to make this kind of compute-scale possible, a scale that is almost unimaginable today. The software ecosystem is going to have to evolve to adjust to this new model of computation where problems are farmed out to a global, distributed supercomputation network capable of computing at scales that will dwarf the most powerful, massively-parallel computation systems in existence today.

---

1. For rendering purposes, the relevant cores would be GPU cores, not CPU cores.

2. Cryptocurrency-driven distributed-computing projects include SONM, iExec, Golem and Elastic, as of this writing.

Monday, September 11, 2017

The beautiful concept of recursion

As the author indicates, misunderstanding of recursion abounds even within the computer design community - I've seen it in both hardware and software design. Recursion is beautiful because it preserves symmetry across repetition. Recursion is not the only way to think about solving large problems. It is not always the best way to solve a problem. But it is a beautiful abstraction and the space of possible applications of recursive solutions is unimaginably large.
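
As one tiny illustration of that symmetry (my sketch, not an example from the linked article): a nested list is summed by exactly the same function that sums each of its sub-lists, however deeply they nest.

```
def deep_sum(x):
    """Sum a nested list by applying the same rule at every depth."""
    if isinstance(x, list):
        return sum(deep_sum(item) for item in x)
    return x

print(deep_sum([1, [2, [3, 4]], [5]]))  # 15
```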
