Using Virtual Addresses with Communication Channels

Oskar Schirmer
February 11th, 2013

Abstract

While for single processor and SMP machines, memory is the allocatable quantity, for machines made up of large numbers of parallel computing units, each with its own local memory, the allocatable quantity is the single computing unit. Where virtual address management is used to keep memory coherent and to allow allocation of more memory than is physically available, virtual communication channel references can be used to keep computing units connected across allocation and swapping.

Parallel Architecture

For various reasons, an alternative to SMP-based parallel computing is assumed for dynamic applications: large numbers of computing units, each composed of a processing unit and local memory [1]. To allow the computing units to cooperate, they are connected by a network of communication channels. Each computing unit is programmed much like an MMU-less microcontroller, and the full network is understood as a parallel computing system in the sense of communicating sequential processes [2]. The individual computing units should not differ in connectivity or in the amount of local memory.

Resource Allocation and Usage

On a single processor or SMP machine, system global memory is the main resource to administer, and it is usually portioned into memory pages (figure 1).

On a parallel computing system, however, the computing units are the main resource, already portioned into units as such: running an application will make use of an arbitrary, not necessarily fixed, number of computing units (figure 2).

That is, memory is never a passive resource to allocate, but is always served in conjunction with a processing unit. It is not accessed through memory addresses, but through communication channels, which in turn are addressed using a number for each channel end, the channel end address. For computing unit “A” to access some computing unit “B”, it configures its channel end “a” to communicate with channel end “b” at computing unit “B” and subsequently transmits data over the channel. The network layer of the channel is managed automatically by interconnect node hardware; a good real-world example is provided by XMOS [3].
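To illustrate the addressing scheme in code, the following sketch models a channel end as a small structure carrying its own address and the address of its configured peer. The names chan_end_t, chan_configure and chan_send are hypothetical, not part of any existing interconnect API, and the interconnect itself is only simulated by printing the routing information.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a channel end: it carries its own channel end
 * address and the address of the remote peer it is configured for. */
typedef uint32_t chan_addr_t;

typedef struct {
    chan_addr_t local;   /* address of this channel end ("a")     */
    chan_addr_t remote;  /* address of the peer channel end ("b") */
} chan_end_t;

/* Configure channel end "a" to communicate with channel end "b". */
static void chan_configure(chan_end_t *ce, chan_addr_t local, chan_addr_t remote)
{
    ce->local = local;
    ce->remote = remote;
}

/* Transmit data over the configured channel; the interconnect node
 * hardware is simulated by simply printing the routing information. */
static void chan_send(const chan_end_t *ce, const void *data, size_t len)
{
    (void)data;
    printf("send %zu bytes from end 0x%08x to end 0x%08x\n",
           len, (unsigned)ce->local, (unsigned)ce->remote);
}

int main(void)
{
    chan_end_t a;
    const char msg[] = "hello";

    chan_configure(&a, 0x00010001u, 0x00020002u);  /* "a" on unit "A" -> "b" on unit "B" */
    chan_send(&a, msg, strlen(msg));
    return 0;
}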

Resource Demand versus Availability

To cope with the need of application programmes for larger amounts of memory than are actually available on a system, and to avoid fragmentation, virtual address translation was introduced for single processor systems in 1977 (with the VAX by Digital Equipment Corporation). Through virtual address translation, the main resource used by an application (virtual memory) is mapped to the main resource offered by the machine (physical memory, figure 3).

Now, with computing units being the main resource on a parallel computing system, and with them being referenced by the channel end addresses they offer, we introduce a distinction between virtual channel end addresses, as used by applications, and physical channel end addresses, as needed by the interconnect node hardware. This scheme asks for an equivalent to address translation tables: some means of channel end address translation.
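A minimal sketch of such a translation table, assuming fixed-width channel end addresses and a simple array-indexed table in the spirit of a page table; the names, sizes and address values are illustrative assumptions only.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t vcaddr_t;   /* virtual channel end address, used by applications      */
typedef uint32_t pcaddr_t;   /* physical channel end address, used by the interconnect */

#define XLATE_ENTRIES  256
#define PCADDR_INVALID ((pcaddr_t)0xffffffffu)

/* The channel end equivalent of a page table: indexed by the virtual
 * address, yielding the corresponding physical channel end address. */
typedef struct {
    pcaddr_t phys[XLATE_ENTRIES];
} xlate_table_t;

/* Look up a virtual channel end address; an invalid entry signals a
 * translation failure, comparable to a page fault. */
static pcaddr_t xlate_lookup(const xlate_table_t *t, vcaddr_t v)
{
    if (v >= XLATE_ENTRIES)
        return PCADDR_INVALID;
    return t->phys[v];
}

int main(void)
{
    xlate_table_t t;

    for (int i = 0; i < XLATE_ENTRIES; i++)
        t.phys[i] = PCADDR_INVALID;
    t.phys[7] = 0x00420001u;   /* virtual end 7 maps to a physical end on unit "B" */

    printf("virtual 7 -> physical 0x%08x\n", (unsigned)xlate_lookup(&t, 7));
    return 0;
}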

Channel End Address Translation

In analogy to conventional memory address translation, channel end address translation tables have to be provided. Different approaches lend themselves to implementing such tables:

  • Explicit address translation is provided by some dedicated computing unit

  • Implicit address translation is performed automatically by a single facility at a central location. When establishing a connection, the initiating computing unit automatically requests the channel end address translation at the central facility to find the physical destination

  • Implicit address translation is performed automatically, but is distributed, e.g. equally among the interconnect nodes. When establishing a connection, the initiating computing unit “A” automatically requests the channel end address translation at the responsible interconnect node (next to computing unit “T”) to find the physical destination, computing unit “B” (figure 4)

In fact, there are real examples of similar systems. One is the domain name system (DNS), which translates node names into IP addresses. Here, from the application's point of view, translation is accomplished explicitly.

Implementation

With automatic route establishment already in place, and to avoid a performance drop, channel end address translation should be implemented as an automatic feature, too. To avoid congestion at a central facility, and because virtual addresses may be chosen without regard to the numbering, address translation tables, and thus virtual addresses, shall be provided per interconnect node. For example, the upper part of the channel end address may be used to determine the responsible interconnect node, the lower part of the address being the virtual address part, which, used as an index into the translation table at the interconnect node next to computing unit “T”, is translated to a full physical channel end address. This physical channel end address is returned to the establishing computing unit, which then establishes the connection to the physical destination, computing unit “B”.
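The address split described above might be modelled as follows; the 16/16 bit split, the table sizes and the function names are assumptions chosen for illustration, not properties of any existing interconnect.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t vcaddr_t;   /* virtual channel end address  */
typedef uint32_t pcaddr_t;   /* physical channel end address */

#define NODE_COUNT       16      /* interconnect nodes in the system   */
#define ENTRIES_PER_NODE 256     /* translation table entries per node */
#define PCADDR_INVALID   ((pcaddr_t)0xffffffffu)

/* One translation table per interconnect node, indexed by the lower
 * (virtual) part of the channel end address. */
static pcaddr_t xlate[NODE_COUNT][ENTRIES_PER_NODE];

/* The upper part of the virtual address selects the responsible
 * interconnect node, the lower part indexes its translation table. */
static pcaddr_t translate(vcaddr_t v)
{
    uint32_t node  = (v >> 16) & 0xffffu;   /* responsible interconnect node */
    uint32_t index = v & 0xffffu;           /* virtual address part          */

    if (node >= NODE_COUNT || index >= ENTRIES_PER_NODE)
        return PCADDR_INVALID;
    return xlate[node][index];              /* full physical channel end address */
}

int main(void)
{
    /* Interconnect node "T" (here node 3) maps virtual index 7 to the
     * physical channel end address of computing unit "B". */
    for (int n = 0; n < NODE_COUNT; n++)
        for (int i = 0; i < ENTRIES_PER_NODE; i++)
            xlate[n][i] = PCADDR_INVALID;
    xlate[3][7] = 0x00420001u;

    vcaddr_t v = (3u << 16) | 7u;
    printf("virtual 0x%08x -> physical 0x%08x\n",
           (unsigned)v, (unsigned)translate(v));
    return 0;
}

Note that such a split also fixes the maximum number of interconnect nodes and of virtual channel ends per node; a real system would choose the field widths to match its topology.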

Whether it is favourable to keep translation results in local caches for repeated use by the establishing computing unit is a subject for research. From the point of view of system simplicity, caches should be avoided altogether.

Channel end address translation may fail. On a system that supports exception handling, an appropriate exception handler might be triggered on computing unit “A”, or on interconnect node “T”. Again, for reasons of simplicity, it may be desirable to avoid exceptions altogether. To achieve this, the translation facility should allow for configuration to send a message, some exception signal, to a dedicated computing unit, which in turn is responsible for handling the failure by loading or swapping appropriate code.
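One possible shape of this failure path, sketched under the assumption that a translation miss is reported as an ordinary message to a dedicated handler unit rather than as a hardware exception; all names are hypothetical.

#include <stdint.h>
#include <stdio.h>

typedef uint32_t vcaddr_t;   /* virtual channel end address  */
typedef uint32_t pcaddr_t;   /* physical channel end address */

#define PCADDR_INVALID ((pcaddr_t)0xffffffffu)
#define HANDLER_UNIT   ((pcaddr_t)0x00ff0000u)  /* channel end of the dedicated handler unit */

/* Exception signal sent on a translation miss, instead of raising a
 * hardware exception on computing unit "A" or interconnect node "T". */
typedef struct {
    vcaddr_t missed;     /* virtual channel end address that failed  */
    pcaddr_t requester;  /* channel end of the establishing unit "A" */
} xlate_miss_msg_t;

/* Simulated message to the handler unit; a real interconnect would
 * route it over a channel like any other data. The handler unit would
 * then load or swap in code and update the translation table. */
static void send_to_handler(const xlate_miss_msg_t *m)
{
    printf("miss: virtual 0x%08x requested by 0x%08x -> notify 0x%08x\n",
           (unsigned)m->missed, (unsigned)m->requester, (unsigned)HANDLER_UNIT);
}

/* Wrap a lookup so a failed translation produces an exception signal
 * message rather than a trap. */
static pcaddr_t translate_or_signal(pcaddr_t (*lookup)(vcaddr_t),
                                    vcaddr_t v, pcaddr_t requester)
{
    pcaddr_t p = lookup(v);
    if (p == PCADDR_INVALID) {
        xlate_miss_msg_t m = { v, requester };
        send_to_handler(&m);
    }
    return p;
}

/* Stub lookup that always misses, for demonstration only. */
static pcaddr_t always_miss(vcaddr_t v) { (void)v; return PCADDR_INVALID; }

int main(void)
{
    translate_or_signal(always_miss, 0x00030007u, 0x00010001u);
    return 0;
}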

References

[1] R. J. Elliott, C. A. R. Hoare (eds.): Scientific Applications of Multiprocessors, Prentice Hall, 1989

[2] C. A. R. Hoare: Communicating Sequential Processes, Prentice Hall International, 1985

[3] David May: The XMOS XS1 Architecture, XMOS Ltd., 2009
