Grid-Occam

The Grid-Occam project was a research project during my time at the Operating Systems and Middleware Group at the Hasso Plattner Institute. The project was partially sponsored by Microsoft. We organized two courses in the context of this project, and Kai Köhne wrote his Master's thesis (thesis.pdf) on the topic.

The project was about the re-use of the old transputer programming language Occam in today’s Grid environments. In this context, we wanted to separate the parallel and distributed program logic from the underlying infrastructure technology. We developed two compilers for Occam applications, one generating C# code and one generating Java code. The generated code uses a runtime library for the distribution-specific Occam functionality, such as channel communication or transparently replicated variables. Because all distributed functionality is abstracted out of the compiler, it was possible to develop different back-ends for the same compiler output. We had successful demonstrations of a multithreaded and an MPI-based runtime library. The latter is also shown in a video.
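
The separation between compiler output and runtime back-end can be pictured with a small Java sketch. All names in it (OccamRuntime, OccamChannel, OccamProcess, ThreadedRuntime, GeneratedProgram) are assumptions made for illustration; they are not the API of the Grid-Occam runtime library. The point is only that the stand-in for generated code refers to an interface, so a thread-based back-end could be exchanged for another one without recompiling the Occam program.

```java
import java.util.concurrent.SynchronousQueue;

// Hypothetical names throughout; this is NOT the real Grid-Occam runtime API.

/** A synchronous, unbuffered Occam-style channel. */
interface OccamChannel<T> {
    void write(T value) throws InterruptedException; // corresponds to: c ! value
    T read() throws InterruptedException;            // corresponds to: c ? variable
}

/** A process body that may block on channel operations. */
interface OccamProcess {
    void run() throws InterruptedException;
}

/** What generated code might require from any back-end. */
interface OccamRuntime {
    <T> OccamChannel<T> newChannel();
    void par(OccamProcess... processes) throws InterruptedException; // a PAR block
}

/** One possible back-end: plain threads inside a single JVM. */
class ThreadedRuntime implements OccamRuntime {
    @Override
    public <T> OccamChannel<T> newChannel() {
        final SynchronousQueue<T> queue = new SynchronousQueue<>();
        return new OccamChannel<T>() {
            @Override public void write(T value) throws InterruptedException { queue.put(value); }
            @Override public T read() throws InterruptedException { return queue.take(); }
        };
    }

    @Override
    public void par(OccamProcess... processes) throws InterruptedException {
        Thread[] threads = new Thread[processes.length];
        for (int i = 0; i < processes.length; i++) {
            final OccamProcess p = processes[i];
            threads[i] = new Thread(() -> {
                try {
                    p.run();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        // PAR terminates only when all of its branches have terminated.
        for (Thread t : threads) {
            t.join();
        }
    }
}

/** Stand-in for compiler output: it only refers to the OccamRuntime interface. */
class GeneratedProgram {
    static int run(OccamRuntime rt) throws InterruptedException {
        OccamChannel<Integer> c = rt.newChannel();
        int[] received = new int[1];
        rt.par(
            () -> c.write(42),
            () -> received[0] = c.read()
        );
        return received[0];
    }
}
```

Calling GeneratedProgram.run(new ThreadedRuntime()) returns 42. A different back-end would only have to supply another OccamRuntime implementation.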

The project is currently suspended. It would be great to bring this effort back to life, since we stopped with a reasonably well-working Grid-Occam compiler for Java. You can download the last version of the sources (gridoccam.tgz) and get started right away.

This web page is intended to publish the latest status in order to encourage other people to continue the work. If you use the material provided here as the basis for a new scientific publication, we kindly ask you to add one of the following papers to your reference list:

Peter Tröger and Kai Köhne. Grid Programming for Heterogeneous Environments – The Grid-Occam Project. In Tagungsband des 2. Workshops zu Grid-Technologie für den Entwurf technischer Systeme, pages 39–46, April 2006, ISSN 1862-622X.
Peter Tröger, Martin von Löwis, and Andreas Polze. The Grid-Occam Project. In Grid Services Engineering and Management, LNCS 3270, pages 151–164, Springer-Verlag, September 2004, ISBN 3-540-23301-6.

Of course, you can also invite us for co-authorship ...

Technical Details

The compiler (gridoccam.tgz) was implemented for the most part by Kai Köhne. It currently supports only a subset of the Occam 2.1 standard. Furthermore, no work has been done to make the compiler “user-friendly”. If you enter an incorrect Occam program, you will probably get some kind of Java exception, or no error message at all. It could even be the case that the generated Java code compiles correctly but nevertheless doesn’t do what it is intended to do.

Feature Status

Primitive Processes
(Multiple) assignments as well as basic input and output via channels are supported. SKIP and STOP are supported.
Constructed Processes
Both the basic and the replicated versions of sequences, conditionals, selections, loops and parallel blocks are supported. The basic form of alternation is also supported, but channels that are elements of an array cannot be used as guards. The replicated ALT is NOT supported.
Data Types
The primitive data types as well as literals of these types are supported. Named data types, record data types etc. are NOT supported. Arrays (of data types or channels) are partly supported.
Variables and Values
Advanced array features (components, segments etc.) as well as abbreviations are currently NOT supported.
Channels / Protocols
Only basic protocols are supported.
Expressions
Most operators are supported. SIZE, MOSTPOS, MOSTNEG and data type conversions do not work.
Advanced Features
No work has been done on procedures, functions, timers, retyping and reshaping.

Entry Point

The compiler tries to follow the Occam 2.1 language standard as accurately as possible. However, the standard also leaves room for interpretation. One such aspect is the entry point of the program. Our compiler expects every file it compiles to contain a top-level procedure that takes either no parameters at all or three CHAN OF BYTE parameters representing stdin, stdout and stderr:

	PROC main (CHAN OF BYTE keyboard, screen, error)

Using the Compiler

The compiler is itself a Java application. To use it on the command line, make sure the CLASSPATH is set and call:

	java org.occam.grid.compiler.Console example1.occ

Here is an example of specifying the classpath in the Java call, in case the libraries are in the same directory as the compiler (the ; separator applies to Windows; on Unix-like systems, use : instead):

	java -classpath groccam.jar;antlr-2.7.5.jar org.occam.grid.compiler.Console Exercise.occ

You will see a lot of debug information. If no exception occurs, a new Java file example1.java is created, which can then be compiled with javac and executed.

Parameters

The compiler expects the Occam files to compile as parameters. Furthermore, it supports the following parameters:

	--debugTokenStream
	--debugParser
	--debugScopeIdentifier
	--debugNameModifier
	--debugJava
	--help
	--version

The debug parameters show the different stages of the abstract syntax tree during the compilation.

Compilation of the resulting Java code

Since our compiler only translates the Occam code to Java, you also need to generate the final class file with the Java compiler:

	javac Exercise.java

Running the Occam application

Pick the JAR file of a runtime implementation and execute your program:

	java -classpath groccam_mt.jar;trove.jar;.;log4j.jar;backport-util-concurrent.jar Exercise

Your executable accepts the --occamRuntime command-line argument to specify the runtime library to be used. By default, the multi-threading runtime library is used to execute the parallel parts. The exchangeable runtime library would be a good starting point for any project based on our code: you could develop a new runtime library for your own parallel or distributed infrastructure.

Functional Occam Example

PROC hello()
  INT x, y:
  CHAN OF INT c, d:
  PAR
    SEQ
      c ! 117
      d ? x
    SEQ
      c ? y
      d ! 118
:
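
To make the behaviour of this example concrete, here is a hand-written Java sketch of its semantics. It is an illustration only, not the output of the Grid-Occam compiler; the class name HelloSketch and the use of java.util.concurrent.SynchronousQueue as a stand-in for an unbuffered Occam channel are assumptions. The two SEQ branches run in parallel and exchange one value in each direction.

```java
import java.util.concurrent.SynchronousQueue;

// Hand-written illustration of the example's semantics, NOT actual compiler output.
class HelloSketch {
    /** Runs both PAR branches and returns {x, y} once both have terminated. */
    static int[] hello() throws InterruptedException {
        final int[] x = new int[1];
        final int[] y = new int[1];
        // CHAN OF INT c, d: unbuffered, so every transfer is a rendezvous.
        final SynchronousQueue<Integer> c = new SynchronousQueue<>();
        final SynchronousQueue<Integer> d = new SynchronousQueue<>();

        Thread first = new Thread(() -> {
            try {
                c.put(117);       // c ! 117
                x[0] = d.take();  // d ? x
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread second = new Thread(() -> {
            try {
                y[0] = c.take();  // c ? y
                d.put(118);       // d ! 118
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        first.start();
        second.start();
        first.join();   // PAR ends when both branches have ended
        second.join();
        return new int[] { x[0], y[0] };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = hello();
        System.out.println("x = " + result[0] + ", y = " + result[1]);
    }
}
```

After both branches have finished, x holds 118 and y holds 117. The order of the operations matters: if both branches started with an input, neither could proceed and the program would deadlock.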

Ideas For Grid-Occam Language Extensions

Explicit String Type

Occam has no dedicated data type for strings; strings are just arrays of BYTEs. This has the advantage that there is no need for a string handling library, since almost everything can be achieved with the normal array operators. However, it also has the limitation that only ASCII characters are supported. According to the occam manual, the BYTE data type has a value range from 0 to 255. This maps quite nicely to ASCII, because every character can be stored in exactly one byte. If multi-byte encodings are to be supported, the length of the byte array becomes much more complicated to predict.

Our proposal is the introduction of two new data types, CHAR and STRING. The runtime library provides string manipulation and conversion routines.
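
The length problem described above can be reproduced with the standard Java library. The class and method names (StringLengthSketch, byteLength, charLength) are made up for this sketch; only String.getBytes, String.codePointCount and StandardCharsets are real JDK calls.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StringLengthSketch {
    /** Number of BYTE elements needed to store the text in the given encoding. */
    static int byteLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    /** Number of characters (Unicode code points) a reader would count. */
    static int charLength(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String ascii = "Occam";
        String umlaut = "Tr\u00f6ger"; // contains one non-ASCII character
        System.out.println(charLength(ascii) + " chars, "
                + byteLength(ascii, StandardCharsets.US_ASCII) + " bytes in ASCII");
        System.out.println(charLength(umlaut) + " chars, "
                + byteLength(umlaut, StandardCharsets.UTF_8) + " bytes in UTF-8");
    }
}
```

The first line reports 5 characters in 5 bytes, the second 6 characters in 7 bytes. With a plain []BYTE representation, an Occam program would therefore have to know the encoding to size its arrays, which is what a dedicated STRING type would hide.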

Type Extensibility

The type system of Occam 2.1 is quite limited. When Occam is used as a coordination language, there is a high chance that the type system cannot express the complex types needed. The compiler and runtime are extendable to support further “native” Occam types. New types can be easily implemented in Java (e.g. by implementing a certain interface). A mechanism has to be provided to register these types within the compiler and runtime.
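
One way such a registration mechanism might look is sketched below. Nothing here exists in the Grid-Occam sources: the interface OccamType, the class TypeRegistry and the example type COMPLEX are all hypothetical, and the byte-array marshalling is just one plausible choice for sending values over a channel.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical extension interface; not part of the existing compiler or runtime.
interface OccamType {
    String occamName(); // the type name as it would appear in Occam source
    byte[] marshal();   // representation used when the value is sent over a channel
}

/** Registry that compiler and runtime could share to look up additional types. */
class TypeRegistry {
    private final Map<String, Function<byte[], OccamType>> decoders = new HashMap<>();

    void register(String occamName, Function<byte[], OccamType> decoder) {
        if (decoders.putIfAbsent(occamName, decoder) != null) {
            throw new IllegalArgumentException("type already registered: " + occamName);
        }
    }

    boolean isKnown(String occamName) {
        return decoders.containsKey(occamName);
    }

    OccamType unmarshal(String occamName, byte[] data) {
        Function<byte[], OccamType> decoder = decoders.get(occamName);
        if (decoder == null) {
            throw new IllegalArgumentException("unknown type: " + occamName);
        }
        return decoder.apply(data);
    }
}

/** Example of a user-supplied type: a complex number made of two doubles. */
final class ComplexValue implements OccamType {
    final double re;
    final double im;

    ComplexValue(double re, double im) {
        this.re = re;
        this.im = im;
    }

    @Override public String occamName() { return "COMPLEX"; }

    @Override public byte[] marshal() {
        return ByteBuffer.allocate(16).putDouble(re).putDouble(im).array();
    }

    static ComplexValue unmarshal(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        double re = buf.getDouble();
        double im = buf.getDouble();
        return new ComplexValue(re, im);
    }
}
```

A type would be announced with registry.register("COMPLEX", ComplexValue::unmarshal). The compiler could then consult isKnown while checking declarations, and a runtime back-end could call unmarshal on the receiving side of a channel.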

Flexible Arrays

Flexible arrays do not have declared bounds; the bounds are set at runtime, based on which elements of the array have been assigned values.
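
A runtime library could back such an array with a growable buffer, as in the following sketch. The class name FlexibleIntArray is hypothetical, and two details are assumptions that the idea above leaves open: the lower bound is fixed at 0, and elements below the highest assigned index that were never assigned read as 0.

```java
import java.util.Arrays;

// Hypothetical runtime helper for an INT array without declared bounds.
class FlexibleIntArray {
    private int[] data = new int[0];
    private int size = 0; // current upper bound: highest assigned index + 1

    /** Assigns an element and extends the bounds if necessary. */
    void set(int index, int value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        if (index >= data.length) {
            // Grow geometrically so that repeated appends stay cheap.
            data = Arrays.copyOf(data, Math.max(index + 1, data.length * 2));
        }
        data[index] = value;
        size = Math.max(size, index + 1);
    }

    /** Reads an element inside the current bounds. */
    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    /** The value SIZE would report for this array. */
    int size() {
        return size;
    }
}
```

After a single set(3, 7), size() is 4, get(3) is 7, and get(0) is 0 under the assumption stated above.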

Scattered Array Segments

Occam allows the definition of new arrays by means of array segments. Using array segmentation together with the aliasing feature allows the programmer to modify different parts of the original array in parallel:

	[10]INT array:
	seg1 IS [array FOR 5]:
	seg2 IS [array FROM 5]:
	PAR
	  -- manipulate seg1
	  -- manipulate seg2

However, this works only for contiguous areas, which is sometimes not feasible (for instance when the array represents channels). We propose a way to construct scattered arrays:

	[10]INT array:
	seg1 IS [array SELECT [0 2 4 6 8]]:
	seg2 IS [array SELECT [1 3 5 7 9]]:

The indices for the array selection are INT arrays themselves, and therefore can also be constructed computationally:

	[]INT FUNCTION even(INT max)
	  ...
	:
	seg1 IS [array SELECT even(9)]:

It seems that this is a specialization of the more general filter functions known in functional programming languages. However, this would require functions as parameters …
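
The intended behaviour of the proposed SELECT can be sketched in Java as a view onto the original array. The class ScatteredSegment and its methods are hypothetical; the disjoint check reflects the assumption that two segments used in the same PAR must not share elements, in the same way as ordinary array segments.

```java
import java.util.stream.IntStream;

// Hypothetical runtime representation of "seg IS [array SELECT indices]".
final class ScatteredSegment {
    private final int[] base;    // shared with the original array, not copied
    private final int[] indices; // positions of the selected elements in base

    ScatteredSegment(int[] base, int[] indices) {
        for (int i : indices) {
            if (i < 0 || i >= base.length) {
                throw new IndexOutOfBoundsException("index " + i);
            }
        }
        this.base = base;
        this.indices = indices.clone();
    }

    int size() { return indices.length; }

    int get(int i) { return base[indices[i]]; }

    /** Writes go through to the original array. */
    void set(int i, int value) { base[indices[i]] = value; }

    /** True if the two segments can be modified in parallel without overlap. */
    static boolean disjoint(ScatteredSegment a, ScatteredSegment b) {
        if (a.base != b.base) {
            return true; // views onto different arrays never overlap
        }
        boolean[] used = new boolean[a.base.length];
        for (int i : a.indices) {
            used[i] = true;
        }
        for (int i : b.indices) {
            if (used[i]) {
                return false;
            }
        }
        return true;
    }

    /** Counterpart of the even(max) function: all even indices from 0 to max. */
    static int[] even(int max) {
        return IntStream.rangeClosed(0, max).filter(i -> i % 2 == 0).toArray();
    }
}
```

Here even(9) yields the indices 0, 2, 4, 6 and 8, so a segment built from it corresponds to seg1 in the example above, and set(1, 42) on that segment changes element 2 of the original array.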

Other Ideas