Monthly Archives: June 2013

Avian Computing – Overview

The Avian Computing Project seeks to reduce the length of time it takes to develop parallel computer programs by improving how we think and talk about parallel programs. To accomplish this goal, the Avian Computing Project replaces the current mental programming model (based on “math equation-ish” lines of code) with a mental model based on natural group elements, such as birds or ants or fish. The advantages are discussed in posts about using a nature-based model to encourage overview-level thinking while reducing exposure to program code. These aspects were chosen to follow the guidelines for a next-generation development model, which were derived from the limitations of the human brain AND the shortcomings of developing computer programs using languages.

Flock of Birds

Using a flock of birds as the model for hundreds or thousands of processors/cores will make it easier to develop programs that can use all those cores

One basic assumption of the Avian Computing Project is that the number of processors or cores available in each computing device will continue to increase until hundreds or thousands of cores will be available in each system. While the technology to physically build such devices currently exists (for example, the hundreds of cores currently available in Graphical Processing Units (GPU) in our graphics cards), we do NOT have a way of quickly and efficiently building the software that can make use of all of those cores.

The Avian Computing Project focuses on reducing parallel program development times by making it easier to think and talk about parallel operations. It does this by emphasizing nature-based models instead of lines of code. The primary goal of Avian Computing is NOT to make apps that run faster in parallel but instead to make it faster to get parallel apps running. When the tasks in an app are correctly apportioned and distributed among the (potentially hundreds of) cores available, the speed issue will take care of itself.

All birds follow this basic life cycle

The Avian Computing environment also provides a standard framework (the Concurrency Explorer or ConcX) to configure and launch birds (threads). All birds have a runtime life cycle that resembles the natural behavior and life cycle of birds. Birds are hatched, look for food, eat it and digest it, store the resulting food and then take a nap before doing it all again. They also reproduce when they are well fed and die off when they cannot get enough to eat. The only coding that needs to be done by the developer is describing what is done during the digestion phase; the rest is accomplished by selecting and configuring the right birds.
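The life cycle described above can be sketched in Java as a loop in which only the digestion step is supplied by the developer. This is a minimal illustration, not the ConcX code: `LifeCycleBird`, `ShoutingBird`, `digest`, and the `missesBeforeStarving` setting are hypothetical names, reproduction is left out, and two plain queues stand in for the shared food store so the example bird does not eat its own output.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical skeleton of the bird life cycle; not the ConcX base class.
abstract class LifeCycleBird implements Runnable {
    private final Queue<String> tree;        // stand-in for the shared food store
    private final Queue<String> results;     // where digested food is stored
    private final int missesBeforeStarving;  // assumed starvation setting

    LifeCycleBird(Queue<String> tree, Queue<String> results, int missesBeforeStarving) {
        this.tree = tree;
        this.results = results;
        this.missesBeforeStarving = missesBeforeStarving;
    }

    // The only step the developer writes: turn one food item into a result.
    protected abstract String digest(String food);

    @Override
    public void run() {                      // hatch
        int misses = 0;
        while (misses < missesBeforeStarving) {
            String food = tree.poll();       // look for food
            if (food == null) {
                misses++;                    // found nothing this time
            } else {
                misses = 0;
                results.add(digest(food));   // eat, digest, store
            }
            try {
                Thread.sleep(1);             // nap (fixed length here for brevity)
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;                      // stop promptly if interrupted
            }
        }
        // Loop exit: the bird has starved and its thread ends.
    }
}

// Example bird: its digestion step upper-cases the food it finds.
class ShoutingBird extends LifeCycleBird {
    ShoutingBird(Queue<String> tree, Queue<String> results, int missesBeforeStarving) {
        super(tree, results, missesBeforeStarving);
    }

    @Override
    protected String digest(String food) {
        return food.toUpperCase();
    }
}

class LifeCycleDemo {
    public static void main(String[] args) throws InterruptedException {
        Queue<String> tree = new ConcurrentLinkedQueue<>();
        Queue<String> results = new ConcurrentLinkedQueue<>();
        tree.add("seed");
        tree.add("berry");
        Thread bird = new Thread(new ShoutingBird(tree, results, 3));
        bird.start();
        bird.join();
        System.out.println(results);
    }
}
```

Everything except `digest` lives in the base class, which is the division of labor the paragraph describes.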

All of the fussy details about thread locking and deadlocks that are normally agonized over in parallel programs are handled by ConcX so the developer doesn’t have to. This isn’t a convenience or just some handy feature; it was considered an absolute necessity to hide all locking in the framework so developers can focus solely on making the program work correctly. The same goes for starting new threads, stopping unused threads, and putting threads to sleep; it’s all handled in the framework so the developer doesn’t waste time thinking about them.

The Avian environment and ConcX borrow the concept of the Tuplespace from Linda, the set of parallel programming constructs developed by David Gelernter, Nicholas Carriero, and others in the 1980s and 1990s. In the Avian environment, every bird gets its food from the TupleTree and stores its results back in the TupleTree. All locking is automatically managed by the TupleTree when food objects move into or out of the tree. The TupleTree is the only place where locking is required, and the TupleTree does it all invisibly for the birds. When a bird looks for food, it “requests” a specific food type. The TupleTree locks itself as required and returns a matching food object if it has one or returns null if it does not. A basic diagram of the Avian Computing Environment is shown below.
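A minimal Java sketch of the request-and-return behavior described above. The names `SketchTupleTree`, `Food`, `store`, and `take` are hypothetical and are not the ConcX API; the only behaviors taken from the text are that the locking lives inside the tree, that a request names a food type, and that a miss returns null.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical food object: a type name plus a payload.
class Food {
    final String type;
    final Object payload;

    Food(String type, Object payload) {
        this.type = type;
        this.payload = payload;
    }
}

// Hypothetical stand-in for the TupleTree; not the ConcX implementation.
class SketchTupleTree {
    private final Map<String, Deque<Food>> foodByType = new HashMap<>();

    // Store a food object; the lock is taken inside the tree, not by the caller.
    synchronized void store(Food food) {
        foodByType.computeIfAbsent(food.type, t -> new ArrayDeque<>()).add(food);
    }

    // Remove and return one food object of the requested type, or null if none is available.
    synchronized Food take(String type) {
        Deque<Food> items = foodByType.get(type);
        return (items == null) ? null : items.poll();
    }
}

class TupleTreeDemo {
    public static void main(String[] args) {
        SketchTupleTree tree = new SketchTupleTree();
        tree.store(new Food("Seed", 42));
        Food first = tree.take("Seed");   // the stored seed
        Food second = tree.take("Seed");  // null: that seed has already been taken
        System.out.println(first.payload + " " + second);
    }
}
```

Because `take` removes the object it returns, the caller holds the only reference to that food until it stores something back.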

Basic Avian Diagram

All birds feed in the TupleTree, eating a specific food and storing a specific food. The birds feed asynchronously at the rate appropriate for the task that they are performing.

When a bird receives a food object from the TupleTree, it has the only copy of that object and is the only bird that can make changes to it while it possesses it. (Other instances of the requested food type may still be in the TupleTree, just like a real tree will normally have more than one fruit or seed).

As the bird is digesting the food object, it is performing some work or applying some transforms to the data contained in the food object. When its work is completed, the bird normally stores one or more food objects back in the TupleTree where other birds can eat the resulting food.

Napping allows the processors to share resources – basically so all the threads play nice together. The maximum nap length is configurable by the developer in ConcX. The actual length of each nap is a randomly selected duration between 0 ms and the (maximum) nap length set by the developer.
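The random nap length can be sketched in a few lines of Java. `NapTimer` and its methods are hypothetical names, and treating the configured maximum as an inclusive upper bound is an assumption; the text only says the nap falls between 0 ms and the maximum.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; ConcX may choose nap lengths differently.
class NapTimer {
    // Pick a nap length between 0 and maxNapMillis (inclusive, by assumption).
    static long pickNapMillis(long maxNapMillis) {
        return ThreadLocalRandom.current().nextLong(maxNapMillis + 1);
    }

    // Sleep the calling thread for a randomly chosen nap length.
    static void nap(long maxNapMillis) throws InterruptedException {
        Thread.sleep(pickNapMillis(maxNapMillis));
    }

    public static void main(String[] args) throws InterruptedException {
        long chosen = pickNapMillis(50);
        System.out.println("napping for " + chosen + " ms");
        Thread.sleep(chosen);
    }
}
```

Randomizing the nap keeps birds that started together from waking in lockstep and hitting the tree at the same instant.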

Each bird and each food item maintains its own internal log. The internal bird logs are viewable in ConcX immediately after a bird is stopped. The internal food logs are listed on the TupleTree tab of ConcX; each food item listed shows a timestamp for every action performed on it and which bird performed that action.

By carefully analyzing the logs, issues can be researched and problems identified. The logs are written in CSV format and can be saved to files and then opened with spreadsheet programs. It has been very useful to be able to find what every bird was doing at an exact moment in time. Again, because this is all handled automatically, the developer doesn’t have to worry about it.
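As an illustration of the kind of per-food log described above, here is a small Java sketch that records actions and renders them as CSV. The class name `FoodLog` and the column layout (timestamp, bird, action) are assumptions made for this example; the actual ConcX log format is not given in this post.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical per-food log; the column layout is an assumption.
class FoodLog {
    private final List<String> rows = new ArrayList<>();

    // Record one action performed on this food item by the named bird.
    void record(long timestampMillis, String birdName, String action) {
        rows.add(timestampMillis + "," + quote(birdName) + "," + quote(action));
    }

    // Quote a field so commas and quotes inside it do not break the CSV row.
    private static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Render a header line followed by one line per recorded action.
    String toCsv() {
        StringBuilder csv = new StringBuilder("timestamp_ms,bird,action\n");
        for (String row : rows) {
            csv.append(row).append('\n');
        }
        return csv.toString();
    }

    public static void main(String[] args) {
        FoodLog log = new FoodLog();
        log.record(System.currentTimeMillis(), "FredEater", "ate");
        log.record(System.currentTimeMillis(), "FredEater", "stored");
        System.out.print(log.toCsv());
    }
}
```

Sorting such rows by timestamp in a spreadsheet is one way to see what each bird was doing at a given moment.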

Altogether, the Avian Computing project tries to provide the concepts, tools and resources needed to quickly develop parallel programs. The Avian Computing project encourages us to think about parallelizing applications and stop thinking about code.

Avian Computing Goals – Develop From An Overview Perspective

One of the main goals of the Avian Computing Project is to reduce the length of time it takes to develop parallel computer programs. In this post, we’ll look at how focusing developers on the overview perspective will help us achieve our goal.

Developers/humans work better at the overview level. For example, if I say, “let’s send a person to the moon,” we all have a pretty good idea of what’s going to happen. But if every person who works on the moonshot is required to know every single detail, that trip may never happen because everyone will be trying to do every job and interfering with each other. If we can take an overview perspective, however, we can break the moonshot into logical tasks, such as propulsion, life support, computational, etc., and again break each task into subtasks and so on until everyone has a reasonably-scoped amount of work that they can perform without interfering with others.

In the Avian Computing environment, the overview perspective is provided by thinking about controlling a flock of birds who will together accomplish one goal or task in parallel. Some types of birds in the flock will perform one part of the process while other types of birds will perform other parts of the process.

Avian Life Cycle

The overview perspective is provided again by the diagram of the standard life cycle of a bird (thread). The threads don’t really undergo this life cycle (threads can’t really eat); instead this conceptual overview provides us with enough knowledge about Avian Computing that developers can use the Concurrency Explorer framework (ConcX) to produce parallel applications in relatively short lengths of time.

Locking Details

One reason the overview perspective is easier with the Avian framework is that it handles all the locking and synchronizing automatically, inside the framework. This eliminates spending time thinking about mutexes and synchronizing and deadlocking and all of the related junk that developers normally have to contend with when developing parallel applications. It accomplishes this by borrowing some concepts from the Linda parallel programming construct and adapting them to the bird metaphor. Linda was pioneered by David Gelernter in the mid-’80s and developed with Carriero and others into the ’90s. Linda manages information in tuples, and threads select and work on tuples that meet their specified conditions.

Birds spend a lot of time in trees, so in the Avian environment, the birds find and eat all of their food from the TupleTree. When they are done digesting their food, they store their food (results) back into the tree. In Linda-speak, these are the in() and out() functions.

All of the locking and synchronizing is managed by the TupleTree. The TupleTree manages all requests by birds to get food and to store food as synchronized methods. If a bird gets some food to eat, that food is removed and gone from the tree, guaranteeing that multiple birds cannot access it. And once a bird has a chunk of food (object), it has complete control over it. No other bird is aware of that food object unless and until it is stored back in the tree.
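The guarantee that a piece of food can only ever reach one bird can be checked with a small Java experiment. This is a sketch under assumptions: `SynchronizedTree` and `ContentionDemo` are hypothetical names, integers stand in for food objects, and plain threads stand in for birds. Several threads race to empty the tree, and the total number of items eaten is compared with the number stored.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical tree whose get and store requests are synchronized methods.
class SynchronizedTree {
    private final Deque<Integer> seeds = new ArrayDeque<>();

    synchronized void store(int seed) {
        seeds.add(seed);
    }

    // Removes the seed it returns, so no second bird can receive it.
    synchronized Integer take() {
        return seeds.poll();
    }
}

class ContentionDemo {
    // Let several threads race for the same food and count how many items get eaten.
    static int race(int seedCount, int birdCount) {
        SynchronizedTree tree = new SynchronizedTree();
        for (int i = 0; i < seedCount; i++) {
            tree.store(i);
        }

        AtomicInteger eaten = new AtomicInteger();
        Thread[] birds = new Thread[birdCount];
        for (int b = 0; b < birdCount; b++) {
            birds[b] = new Thread(() -> {
                while (tree.take() != null) {
                    eaten.incrementAndGet();
                }
            });
            birds[b].start();
        }
        for (Thread bird : birds) {
            try {
                bird.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for birds", e);
            }
        }
        return eaten.get();
    }

    public static void main(String[] args) {
        // Every seed is eaten exactly once, so the count equals the number stored.
        System.out.println(race(1000, 4));
    }
}
```

If `take` were not synchronized, two threads could poll the unsynchronized deque at the same time and corrupt it or miscount; the lock inside the tree is what makes the count come out right.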

Avian developers are encouraged to NOT think in too much detail about which individual bird (thread) actually performs any given task. Neither should the order of the food objects in the tree be considered or depended upon. Developers should focus on how to break any given task into subtasks and translate those subtasks into food types, the kind of logic that is best done at a higher level, at an overview level.

By thinking at the overview level in the Avian environment, we can think through the tasks and subtasks that we want our parallel program to perform and then rough out those tasks in ConcX. And then, best of all, we get to run the rough application and verify that it’s doing what we want. If not, it is simple enough to change the order that the tasks are performed in or to add or subtract, combine or split tasks.

This is one of the strengths of the Avian Computing environment; the birds in your flock are loosely coupled and work independently, giving you great flexibility in achieving your goals.

Avian Computing Goals – Loosely Coupled Code

We’re all taught that we should write loosely coupled code so that changes in one code module or function won’t affect another function or module. Like all things, it’s easier said than done.

One of the goals of Avian Computing is to make loosely coupled code easy to create. Each of the birds is a separate entity with its own set of variables. The only way it can get info to operate on is to eat from the TupleTree; the only way it can pass on info is to store food in the TupleTree.

While this might seem restrictive, it is conceptually simple and follows the ways of nature. But this simple mechanism structures our thinking so we automatically produce loosely coupled code. The only way that one bird can affect another bird is by making a change to the food it stores. The only way that a bird can be affected by a different bird is by changing its response to the food that it ate.

For example, if a function changed its return value from “Fred” to “10”, the code that receives the returned value may or may not know how to handle the new value. Usually we manage this changed return value by remembering all the places where we use that changed function and updating the code there. And then we usually forget one place where we used it, causing the system to crash or otherwise go wonky, whereupon we do a code search of our project and find those other instances where we used it. And if we’re lucky, the changed function isn’t in a library that is used in other projects that would also need to be changed.

In Avian Computing, because we share everything thru the TupleTree, we force changes up to the surface or force them down inside the bird. If changing “Fred” to “10” only affects one function in that one bird, the change can never get outside to affect other birds. If the change from “Fred” to “10” should affect other birds, the change is forced to the surface, there in the shared food, where everyone can see it. No pathological invisible couplings lingering inside the code.

Of course, there are times when we’ll have multiple instances of the same bird but we want to have different responses. Perhaps we want it to return “Fred” in some circumstances and return “10” in other circumstances. In this case, the ability of the StdBird to save two different kinds of food is used. If it should return “Fred”, the bird will save a “FredFood” to the TupleTree. If it should return “10”, the bird will save a “10Food” to the TupleTree.

The developer will have to create a FredFood type and a 10Food type, as well as a FredEater bird and a 10Eater bird, but this sounds harder than it really is. Unless there’s something really unique in the new food types, typically they are just sub-classes of StdFood with their own constructor that identifies their food type. Same for new birds. New birds are usually just sub-classes of either StdBird or some other bird, with changes to how they digest their food.
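A rough Java sketch of what such sub-classes might look like. `BaseFood` and `BaseBird` are hypothetical stand-ins for StdFood and StdBird, whose real constructors and methods are not shown in this post, and `TenFood` and `TenEater` are used in place of 10Food and 10Eater because a Java identifier cannot start with a digit. The digestion logic is invented purely for illustration.

```java
// Stand-in for StdFood: the constructor records the food type.
class BaseFood {
    final String foodType;
    final String value;

    BaseFood(String foodType, String value) {
        this.foodType = foodType;
        this.value = value;
    }
}

// Each new food type is a small subclass that only names its type.
class FredFood extends BaseFood {
    FredFood(String value) {
        super("FredFood", value);
    }
}

class TenFood extends BaseFood {
    TenFood(String value) {
        super("TenFood", value);
    }
}

// Stand-in for StdBird: knows which food type it eats; subclasses supply digest().
abstract class BaseBird {
    final String eats;

    BaseBird(String eats) {
        this.eats = eats;
    }

    abstract BaseFood digest(BaseFood food);
}

// Handles the "Fred" behavior only.
class FredEater extends BaseBird {
    FredEater() {
        super("FredFood");
    }

    @Override
    BaseFood digest(BaseFood food) {
        return new FredFood("Hello, " + food.value);
    }
}

// Handles the "10" behavior only; shares no code path with FredEater.
class TenEater extends BaseBird {
    TenEater() {
        super("TenFood");
    }

    @Override
    BaseFood digest(BaseFood food) {
        int doubled = Integer.parseInt(food.value) * 2;
        return new TenFood(Integer.toString(doubled));
    }
}

class SubclassDemo {
    public static void main(String[] args) {
        System.out.println(new FredEater().digest(new FredFood("Fred")).value);
        System.out.println(new TenEater().digest(new TenFood("10")).value);
    }
}
```

Swapping in a different FredEater variant means replacing one small class; nothing in `TenEater` needs to change.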

Sub-classing food and birds this way keeps the code for the different behaviors separate and ensures that there are no invisible connections in the behaviors when processing “Fred” or “10”. This loose coupling of modules also allows developers to substitute different versions of FredEater or 10Eater without affecting the overall program.

The Concurrency Explorer (ConcX) makes bird substitution easy; when ConcX is running, you can configure up to 100 individual birds that will participate in your program. ConcX allows you to start the desired birds (or all birds) and then stop or start any of the birds without negatively affecting (crashing) the rest of the birds. So, for example, you might stop the FredEater bird and then start the FredAEater bird and then stop it and then start the FredBEater bird. Or run Fred, FredA, and FredB and let them compete and show you which one provides the best results.

This flexibility is only available in ConcX (and Avian Computing) because of its inherent loose coupling. All of the birds in the flock behave individually in just the way they were configured, without affecting the other birds, and together as a flock they operate in parallel to accomplish the goals of the program.

Avian Computing Goals – Limiting Programming Language Exposure

One of the main goals of the Avian Computing Project is to reduce the length of time it takes to develop parallel computer programs. In this post, we’ll look at how limiting the exposure of developers to programming languages will help us achieve our goal.

Conceptually, Avian Computing is a programming framework simulating the nature-based model. The model is relatively straight-forward; every thread is modeled on the lifecycle of a bird, hence the name Avian. Every bird is hatched, looks for food (work to do), digests it when it finds it, stores the resulting food, and then takes a nap. Well fed birds reproduce while unfed birds starve and die. Here’s a diagram of the process.

Avian Life Cycle

The birds always look in the “tree” for the food type configured by the developer in the Concurrency Explorer (ConcX). Any bird descended from a BasicBird knows how to find its kind of food without any coding by the developer. Storing the food and taking a nap, reproducing, and dying from old age or starvation are also done without any additional developer coding. The developer configures in ConcX how “well fed” a bird should be before it reproduces as well as how long it will live when it doesn’t find food.
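The two settings mentioned above, how well fed a bird must be before it reproduces and how long it survives without food, can be sketched as a pair of counters. This is an illustration only: `Appetite`, `Outcome`, and both thresholds are hypothetical, and counting consecutive meals and consecutive failed searches is an assumption; ConcX may measure being well fed and starving differently, for example by elapsed time.

```java
// Hypothetical bookkeeping for reproduction and starvation; not the ConcX logic.
class Appetite {
    enum Outcome { CONTINUE, REPRODUCE, DIE }

    private final int mealsBeforeReproducing;  // assumed "well fed" threshold
    private final int missesBeforeStarving;    // assumed starvation threshold
    private int consecutiveMeals;
    private int consecutiveMisses;

    Appetite(int mealsBeforeReproducing, int missesBeforeStarving) {
        this.mealsBeforeReproducing = mealsBeforeReproducing;
        this.missesBeforeStarving = missesBeforeStarving;
    }

    // Call once after every search of the tree; says what the bird should do next.
    Outcome afterSearch(boolean foundFood) {
        if (foundFood) {
            consecutiveMisses = 0;
            consecutiveMeals++;
            if (consecutiveMeals >= mealsBeforeReproducing) {
                consecutiveMeals = 0;          // start counting again after reproducing
                return Outcome.REPRODUCE;
            }
        } else {
            consecutiveMeals = 0;
            consecutiveMisses++;
            if (consecutiveMisses >= missesBeforeStarving) {
                return Outcome.DIE;
            }
        }
        return Outcome.CONTINUE;
    }

    public static void main(String[] args) {
        Appetite appetite = new Appetite(2, 3);
        boolean[] searches = {true, true, false, false, false};
        for (boolean found : searches) {
            System.out.println(appetite.afterSearch(found));
        }
    }
}
```

A steady run of successful searches suggests there is more food than the current birds can keep up with, so reproducing adds capacity; a run of empty searches suggests the opposite, so the bird retires.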

Which means that the developer is only responsible for writing the code to “digest” their food. The digestion code can do anything the developer needs done, such as perform a mathematical calculation or format a string or look up and replace a value. The developer only needs to write the code to make the changes that should be applied to the chunk of food that the bird found before it puts the updated food back in the tree for the next bird to find.

The beauty of this system is that the Avian framework handles all of the locking and synchronizing without involving the developer. When a bird has a chunk of food, it has absolute control over that chunk of food, without having to worry about sharing it, and any changes that it makes to that food apply to that food only.

The Avian Computing environment moves the developer’s focus from “how should these threads be programmed to cooperate?” to “how should the data be broken into logical chunks that make sense to work on?” For example, an invoicing program might use the following food types: customer name and address, customer past due, customer current charges, customer special charges, customer invoice template, etc., and a different bird would eat each one of those food types.

The bird that ate the customer’s name and address would only know how to look up that information and put it into a standard format before putting it back into the tree. Eventually, an Invoicing bird would eat all the pieces for that customer and produce a finished invoice with all necessary information.
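To make the invoicing example a little more concrete, here is a Java sketch of two digestion steps: one for the name-and-address bird and one for the Invoicing bird. The class `InvoiceSketch`, the method names, the piece keys, and the sample customer data are all invented for illustration; the post does not specify how the pieces for one customer are grouped, so a simple map is assumed here.

```java
import java.util.HashMap;
import java.util.Map;

class InvoiceSketch {
    // Digestion step for a hypothetical name-and-address bird:
    // put the raw fields into one standard format.
    static String formatAddress(String name, String street, String city) {
        return name.trim() + "\n" + street.trim() + "\n" + city.trim();
    }

    // Digestion step for a hypothetical Invoicing bird: combine the pieces
    // for one customer, or return null if any piece has not arrived yet.
    static String assembleInvoice(Map<String, String> pieces) {
        String[] required = {"address", "pastDue", "currentCharges"};
        for (String key : required) {
            if (!pieces.containsKey(key)) {
                return null;
            }
        }
        return pieces.get("address")
                + "\nPast due: " + pieces.get("pastDue")
                + "\nCurrent charges: " + pieces.get("currentCharges");
    }

    public static void main(String[] args) {
        Map<String, String> pieces = new HashMap<>();
        pieces.put("address", formatAddress(" Ada Example ", "1 Sample St", "Springfield"));
        pieces.put("pastDue", "0.00");
        System.out.println(assembleInvoice(pieces));  // a piece is still missing
        pieces.put("currentCharges", "19.95");
        System.out.println(assembleInvoice(pieces));
    }
}
```

Each step depends only on the food it is handed, so the pieces can be prepared in any order and by any number of birds before the final assembly.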

In this example, all of the work needed to generate an invoice was done in parallel, with a minimum of programming, and with no particular sequence required. And since the Avian system automatically adjusts to the load, any birds that fall behind will reproduce and clone themselves until enough of them are available to keep up with the load.

All with a minimum of programming by the developer.

Avian Computing Goals – Nature-based Models

One of the main goals of the Avian Computing Project is to reduce the length of time it takes to develop parallel computer programs. In this post, we’ll look at how the use of nature-based models can help us achieve our goal.

By changing our programming model to natural items, such as people or animals, we immediately improve our ability to visualize the actions. For example, it is fairly easy to visualize “George gave a ring to Janet.” We might visualize a pleasant young man standing beside a river, his heart pounding in his chest when he kneels in front of Janet and proposes to her. Or we might imagine an elderly couple that has been married for 50 years, and George reaffirming his love for his lifelong companion.

Compare that to, “Instantiate a ring object and associate it with object G and then have G send a message to object J containing the ring object.” What does an instantiation look like? When asked to visualize this sequence, probably what we’d imagine is some lines of code, or at best, imagine some indistinct boxes that represent the G object and the J object with lines and arrows indicating the motion of the message with the ring object. It’s all very abstract and sanitary, devoid of emotional and associative content.

One reason that using nature-based models improves our visualization is that humans have been successfully visualizing natural elements for (tens or hundreds of) thousands of years, figuring out how to knock fruit out of a tree, or imagining how to climb that tree, or building a ladder to simplify climbing into the tree. Historically, the success of the human species has depended upon our ability to imagine a way of manipulating the natural world to achieve a desired outcome, such as getting enough food to survive for another day or two. “Natural selection” has favored those who can visualize the natural world; we have inherited this ability from our successful ancestors. The ones who lacked this ability to visualize died off.

And once we can visualize something as a natural element, we have a rich vocabulary available to us to describe actions on those natural elements and relationships between them. A bird in a tree or a tiger on a pillow, for instance. It is easy to imagine these items and to imagine their behaviors, such as a bird flying to a branch on a different tree or a tiger searching for food by moving thru the jungle.

By mapping our computer models into a model based on natural elements, we instantly gain improved visualization and a rich vocabulary to describe their actions.

Plus, natural elements improve our ability to remember them in the correct sequence and associate them correctly with other elements. The current “Memory Champ” in the US visualizes numbers as different natural elements, such as “9” is a pillow and “3” is a tiger, so to visualize 39, he imagines a tiger on a pillow. To memorize a series of numbers, he “places” combinations of images (tiger on a pillow) on the furniture in his house. To recall the numbers, he just walks around his mental house and looks at what’s on his furniture. To memorize the names of people, he identifies a characteristic of each person and associates their name with that characteristic. This technique is almost universally recommended by experts in learning the names of people we meet.

Unconvinced? People have demonstrated that they can recall visual “passwords” for far longer periods of time than text passwords. In numerous tests, researchers found that a person can remember a visual password, even if they haven’t used it in 6 months. For comparison, most people can remember a text password for about a week or two without having used it. Hence the most frequent method of password cracking: finding the scrap of paper that has the passwords written on it.

Researchers have also found that when they show a sequence of pictures to their test subjects for just a second or so, their test subjects can generally recognize most of the pictures that they’ve seen before.

Our brains are hard-wired for visual images and dedicate a significant portion of the cerebral cortex to processing those visual images. To make program development faster and more efficient, we need to make better use of the strengths of the human brain, such as our ability to visualize.

New Model Needed for Program Development

The preceding series of blogs has addressed both the physical limitations of the human brain and the limitations of programming languages when developing computer programs.

Finally, it is time to list some of the characteristics that a new model for computer program development should attempt to embrace. Future development models should address as many of the following elements as possible:

  • Not be exclusively language based
  • Not be English language based
  • Move our focus from language details to overview/conceptual level
  • Allow better reuse of the program chunks that others have developed
  • Engage the visual processing capabilities of the brain
  • Include elements of the physical/natural world to simplify and improve our ability to visualize what is happening in the program
  • Intrinsically support parallel behaviors so thinking about parallel behavior in the model automatically translates into thinking about parallel operations in the program
  • Limit human exposure to explicitly managing parallel details (locks, mutexes, etc)

Notice that none of these items address maximizing parallel program performance or any of the other performance side-tracks that we could get derailed on. Avian Computing is exclusively addressing the efficiency of the human-program development interface. The goal of this list is to reduce the time it takes to develop a high-quality parallel program.

Yes, it is always nice to reduce a 100-second job down to 10 seconds, but if it takes 3 months to develop, test, correct, and deliver that 90-second saving, then it’s a false saving. After all, the next generation of hardware will probably achieve that time savings. If not, throw another 100 CPU cores at the task – that should speed it up.

By the way, ConcX implements the last six items in the above list with the specific intention of providing a workspace to explore using multiple threads to achieve a parallel program solution. Its GUI design allows each thread to be pre-configured and then its progress monitored while executing, as well as its runtime behavior saved for analysis after each run. These features allow developers to experiment with different thread settings to identify bottlenecks and performance-critical processes.