[PLA] PLA #00013

Clifford Wolf clifford at clifford.at
Fri Jul 11 00:08:48 CEST 2008


Hi,

I have written yet another PLA with a new language syntax not related to
the 'keyword less language' approach followed in some of the earlier
documents. This one is more JavaScript- and C-like and has some interesting
concepts in it, I think.

	http://www.clifford.at/pla/pla00013.txt

Feedback is - as always - very welcome.

yours,
 - clifford



/01    10\/11    20\/21    30\/31    40\/41    50\/51    60\/61    70\/71    80\
|--------||--------||--------||--------||--------||--------||--------||--------|

Programming Language Apocalypse #00013
Author: Clifford Wolf <http://www.clifford.at/>
Date: 2008-07-10

              A new more C-like proposal for the language syntax
              **************************************************

This document describes another language syntax, one that is more similar to
what C, JavaScript, etc. look like. It is based on some ideas from the prior
PLAs about how objects may be implemented, especially PLA #00012 with
the title "L-value objects and simplified object interface". One of the
implications of PLA #00012 is that functions may return L-values and that
assignment operations such as '=' do not need any special handling by the VM
core.


Keywords and Identifiers
========================

This language proposal uses keywords. All identifiers which are not
identical to a keyword can be written as they are. It is also possible
to prefix an identifier with a dollar sign ($), allowing variables with
names such as "if" without conflicting with the language keyword.

The backslash prefix operator (\) transforms a string into an identifier,
allowing identifiers such as "***" or identifiers generated dynamically at
run time.


Comments
========

The language has support for C-like comments using // and /* ... */.


Variables, variable declarations and simple operators
=====================================================

Variables can be declared using the 'var' keyword. This hides a call to
a special operator in the context object, the object which manages a VM
state and the local variables (aka stack) of this VM state.

The special variable name $$ can be used to access the VM state object
directly. All variable name lookups (i.e. whenever an identifier is used)
are done through this $$ object using the lookup operator ".".

For example:

	var foo = 13, bar = 23;
	foo = foo * bar;

could also be written as:

	var $$.foo = 13, $$.bar = 23;
	$$.foo = $$.foo * $$.bar;

The '$$' object is accessed using a special VM instruction and not by
any kind of name resolution. Normally a programmer never needs to access
$$ and so does not need to care about its existence.
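
The lookup through $$ can be modelled outside the language. The following is
only a rough sketch in Python, not the proposed syntax: the class names
Context and Cell and the methods declare() and lookup() are assumptions made
for illustration. A Cell stands in for the L-value that a lookup returns.

```python
class Cell:
    """Hypothetical storage slot, standing in for an L-value."""

    def __init__(self, value):
        self.value = value


class Context:
    """Hypothetical model of the $$ object: it owns the local variables."""

    def __init__(self):
        self._cells = {}

    def declare(self, name, value):
        # Models 'var name = value': create a new slot in this context.
        cell = Cell(value)
        self._cells[name] = cell
        return cell

    def lookup(self, name):
        # Models '$$.name': return the slot itself, so the caller can
        # read from it or assign to it. Raises KeyError if undeclared.
        return self._cells[name]


ctx = Context()
ctx.declare("foo", 13)                  # var foo = 13, bar = 23;
ctx.declare("bar", 23)
foo = ctx.lookup("foo")                 # foo = foo * bar;
foo.value = foo.value * ctx.lookup("bar").value
print(ctx.lookup("foo").value)          # prints 299
```

Because lookup() hands back the slot rather than a copy of the value, the
assignment needs no special support from the context object.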

The language implements the usual infix and prefix operators (+ - * etc.)
with the precedence we are all used to from existing languages. The operators
themselves are just passed on to the objects in charge, which handle the
logic. I.e. the VM core doesn't know what e.g. '+' does - it just cares
about its precedence and leaves the actual execution to the object
implementations.
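
As an illustration of this division of labour, here is a minimal sketch in
Python (not in the proposed language). The class Number, its operator()
method and the function vm_binary() are hypothetical names; the real object
interface is left to other PLAs.

```python
class Number:
    """Hypothetical scalar object that implements its own operators."""

    def __init__(self, value):
        self.value = value

    def operator(self, name, other):
        # The object, not the VM core, decides what an operator means.
        if name == "+":
            return Number(self.value + other.value)
        if name == "*":
            return Number(self.value * other.value)
        raise TypeError(f"Number does not implement operator {name!r}")


def vm_binary(op_name, left, right):
    # The core knows nothing about '+' or '*'. It only forwards the
    # operator name to the left operand.
    return left.operator(op_name, right)


result = vm_binary("*", Number(13), Number(23))
print(result.value)  # prints 299
```

In this model precedence is purely a matter for the compiler: it decides in
which order vm_binary() is called, and the objects do the rest.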

A special operator is the '()' operator for calling a function. Its left
operand is the function to call and its right operand is the function's
argument list. The argument list is, however, not written after the closing
parenthesis but between the parentheses, as is familiar from many other
programming languages.

An expression followed by a comma-separated list of expressions (which does
not start with an opening parenthesis) is also interpreted as a function call
by the compiler. So the following two lines of code are equivalent:

	printf("Hello %s!\n", "World");

	printf "Hello %s!\n", "World";

A function without arguments must always be called using the parentheses.

Parentheses can also be used for grouping expressions. The use of
parentheses for grouping can be ambiguous with calling a function. E.g.:

	printf ("Hello %s!\n"), "World";

Such uses of the language should certainly be avoided at any price. However,
if it must be done, the user can use the special grouping operator $( .. ) to
make an explicit grouping which can't be confused with a function call:

	printf $("Hello %s!\n"), "World";

String substitutions (which are not covered by this document) should also
use the syntax '$name' for substituting a variable and '$( expression )'
for substituting an entire expression.


Custom operators
================

In addition to the ``normal'' operators the language should support
custom free-form infix and pre- or postfix operators.

Custom infix operators always begin with the tilde character (~)
followed by an identifier. This way e.g. the string object may implement
a special string comparison operator to avoid confusion with the
arithmetic relational operators. For example:

	var string1 = "test";
	var string2 = "demo";

	if (string1 ~eq string2)
		print "Both strings are identical.\n";
	else
		print "The strings are different.\n";

The pre- or postfix operators are written using a double tilde (~~) followed
by an identifier. The tildes are not passed to the operator handling function
of the objects involved, just the identifiers. So it is also possible to
use the backslash operator (\), which creates an identifier from a string, in
combination with (~) and (~~), as the following example demonstrates:

	var mode = i > 0 ? "--" : "++";
	i ~~ \mode;

This evaluates to "i--" for an i greater than 0 and to "i++" for an i less
than or equal to 0.
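
Since only the identifier reaches the object, choosing the operator at run
time amounts to passing a different name to the same handler. A small sketch
in Python, where the class Counter, its postfix_operator() method and the
helper step() are assumptions made for illustration:

```python
class Counter:
    """Hypothetical object that handles the postfix operators ++ and --."""

    def __init__(self, value):
        self.value = value

    def postfix_operator(self, identifier):
        # Only the identifier arrives here; the '~~' never does.
        if identifier == "++":
            self.value += 1
        elif identifier == "--":
            self.value -= 1
        else:
            raise TypeError(f"unknown postfix operator {identifier!r}")


def step(start):
    i = Counter(start)
    mode = "--" if i.value > 0 else "++"   # var mode = i > 0 ? "--" : "++";
    i.postfix_operator(mode)               # i ~~ \mode;
    return i.value


print(step(5), step(0), step(-3))  # prints 4 1 -2
```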

All custom infix operators have a precedence just above that of the standard
relational operators and are evaluated from left to right. So the following
would work as expected without any additional parentheses:

	var string1 = "test";
	var string2 = "demo";

	if (string1 ~eq "demo" == string2 ~eq "demo")
		print "Both strings or none are 'demo'.\n";
	else
		print "Exactly one string is 'demo'.\n";

The pre- and postfix operators have the same precedence as the pre- and
post-increment operators.

Nevertheless, the use of grouping parentheses is recommended in all cases
where custom operators are used.


Control Flow
============

The language uses the C-like control statements if, if-else, while,
do-while and the C-like for with three arguments. Command blocks are
constructed using { ... } and statements are terminated using the
semicolon (;).

All loops support the 'break' and 'continue' statements. It is also
possible to define labels using 'name:' and jump to labels using a
'goto' statement.

Labels which point directly to a loop may also be used as targets for 'break'
and 'continue' statements, as in:

	outer: for (var i=0; i<100; i++)
	inner: for (var j=0; j<100; j++) {
		if (...)
			break inner;
		if (...)
			continue outer;
	}

The '...' actually is a statement: it is the 'yada yada yada' statement that
some might know from Perl 6.


Foreach Statement
=================

There also should be a special foreach statement to hide the complexity of
handling iterators. It should support the following syntax:

	foreach i (20 ~downto 10)
		print "Current value: $i\n";

The expression in the parenthesis must evaluate to an iterator object which
must follow a special object interface for iterators. In fact the code above
is just a shortcut for the following code fragment:

	for (var _i = 20 ~downto 10; _i.valid(); _i.next()) {
		var i = _i.value;
		print "Current value: $i\n";
	}

It is even possible to access the _i variable in the "foreach" loop to
interact directly with the iterator object.
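
The expansion can be tried out in any language with objects. Below is a
sketch in Python that follows the valid() / next() / value interface used
above. The class DownTo and the helper foreach() are hypothetical, and it is
an assumption that '20 ~downto 10' includes both end points.

```python
class DownTo:
    """Hypothetical iterator object for 'start ~downto stop'."""

    def __init__(self, start, stop):
        self._current = start
        self._stop = stop

    def valid(self):
        # True as long as there is a current element.
        return self._current >= self._stop

    def next(self):
        # Advance to the following element.
        self._current -= 1

    @property
    def value(self):
        return self._current


def foreach(iterator, body):
    # Same order of steps as the 'for' loop above:
    # test valid(), run the body with .value, then call next().
    while iterator.valid():
        body(iterator.value)
        iterator.next()


seen = []
foreach(DownTo(20, 10), seen.append)
print(seen)
```

With these assumptions the body runs eleven times, for the values 20 down
to 10.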


Container Objects
=================

Container objects are temporary storage for lists of all kinds. Usually
a container object is used to initialize array or hash data structures or
for passing argument lists to functions.

Container objects are created automatically for the contents of the
parentheses in a function call and whenever square brackets are used in the
program code.

For example, the following code fragments are semantically identical
(some assumptions about the interface of the "Hash" class implied).

	Example 1:

		var h = Hash();

		h("foo") = "demo";
		h("bar") = "test";

	Example 2:

		var h = Hash(foo: "demo", bar: "test");

	Example 3:

		var h = Hash("foo" => "demo", "bar" => "test");

	Example 4:

		var hash_init = [ foo: "demo", \"bar": "test" ];
		var h = Hash(hash_init);

A container object maintains a list of key-value pairs, in which the
key may be omitted and need not be unique. The elements of this
list are written as a comma-separated list in the program code. There
are three ways of writing a list element:

	1.) Without a key.

	2.) With the key as identifier using the "key: value" syntax.

	3.) With the key as expression using the "key => value" syntax.

The consumer of the container object simply processes it linearly. There are
no index structures or anything like that in the container object itself and
it is not meant as permanent data storage.

The container object itself implements the same iterator API which is also
used by the foreach loop, with the current value in the ".value" child and
the current key (if any) in the ".index" child. The specification of the
whole iterator API with all its mandatory and optional features is left to
another PLA.

There are two more special entries in a container object declaration:

	@ expression

		Expression must evaluate to an iterator object. All elements
		returned by this iterator are also returned by the container
		object, but without any keys (i.e. only the ".value" member
		is passed on).

	% expression

		Expression must evaluate to an iterator object. All elements
		returned by this iterator are also returned by the container
		object, including the ".index" member if there is any.

These are useful features for e.g. including arrays or hashes as templates
in parameter lists. The container object should also implement a low-level
API which allows identification of such @... and %... entries so that they
can be handled more efficiently than by iterating over the whole object. For
example, the following code could result in a smart copy-on-write
tree structure in 'x' instead of a plain copy of the two arrays 'a' and 'b'
into the new array 'x':

	var a = Array(...);
	var b = Array(...);

	var x = Array(@a, @b);
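
To make the container semantics concrete, here is a minimal model in Python.
The class Container and its methods add(), splice_values() and
splice_pairs() are hypothetical names. Iterator objects are simplified to
plain sequences of (index, value) pairs, and the sketch always makes a plain
copy; it does not attempt the copy-on-write optimisation.

```python
class Container:
    """Hypothetical container: a flat list of (key, value) pairs.

    A key of None means that the element was written without a key.
    Keys need not be unique and there is no index structure.
    """

    def __init__(self):
        self.items = []

    def add(self, value, key=None):
        # Models 'value', 'key: value' and 'key => value'.
        self.items.append((key, value))
        return self

    def splice_values(self, pairs):
        # Models '@ expression': pass on the values, drop the keys.
        for _index, value in pairs:
            self.items.append((None, value))
        return self

    def splice_pairs(self, pairs):
        # Models '% expression': pass on the values together with their keys.
        for index, value in pairs:
            self.items.append((index, value))
        return self


# [ foo: "demo", bar: "test" ]
hash_init = Container().add("demo", key="foo").add("test", key="bar")

# [ 1, @hash_init, %hash_init ]
c = Container().add(1).splice_values(hash_init.items).splice_pairs(hash_init.items)
for key, value in c.items:
    print(key, value)
```

A consumer such as Hash() would walk c.items from front to back and decide
for itself what to do with missing or repeated keys.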


Function Declarations
=====================

Functions can be declared with or without automated argument handling. A
declaration without automated argument handling for a function dummy can
be written in three different ways:

	function dummy {
		...
	}

	var dummy = function {
		...
	};

	var dummy = ({
		...
	});

In this case the container object holding the parameter list can only be
accessed using the special variable "_args".

Functions with automated argument handling can be declared in two different
ways:

	function dummy(x, y, z) {
		...
	}

	var dummy = function(x, y, z) {
		...
	};

Positional arguments are specified as in the example above by simply writing
the designated name of each positional argument in a comma-separated list
between the parentheses of the function declaration, as known for example
from JavaScript.

Named arguments are specified using the same "key: " syntax we have already
seen for container objects:

	function sort(list: l, cmp: c) {
		...
	}

Optional arguments must be put in square brackets to avoid runtime errors
during parameter checking:

	function sort(list: l, [cmp: c, debug: dbg]) {
		...
	}

Sometimes a function might want to accept an arbitrary number of additional
positional parameters, which can then be accessed using an array. Such an array
can be declared using the @-syntax, similar to the @-syntax we have already
seen in the container object syntax:

	function printf(fmt, @values) {
		...
	}

Sometimes a function might want to accept an arbitrary number of additional
named parameters, which can then be accessed using a hash. Such a hash
can be declared using the %-syntax, similar to the %-syntax we have already
seen in the container object syntax:

	function http_load(url, %options) {
		...
	}

It is also possible to specify types using the '::' notation. The arguments
are then automatically converted to these types, or a runtime error is
produced if the type conversion is not possible:

	function sort(list: Array::l, cmp: Function::c) {
		...
	}

	function printf(String::fmt, @values) {
		...
	}
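
How automated argument handling might bind the entries of a container object
to local names can be sketched in Python. The function bind_arguments() and
its parameter names are assumptions made for illustration. The container is
simplified to a list of (key, value) pairs, type conversion with '::' is left
out, and if a key occurs more than once the last entry wins, which is a
choice made only for this sketch.

```python
def bind_arguments(args, positional=(), named=None, optional=None,
                   rest_values=None, rest_options=None):
    """Bind (key, value) pairs to local variable names.

    positional   -- local names for entries without a key, in order
    named        -- required named arguments: key -> local name
    optional     -- optional named arguments: key -> local name
    rest_values  -- local name for the '@' array, or None
    rest_options -- local name for the '%' hash, or None
    """
    named = dict(named or {})
    optional = dict(optional or {})
    remaining = list(positional)
    bound, extra_values, extra_options = {}, [], {}

    for key, value in args:
        if key is None:
            if remaining:
                bound[remaining.pop(0)] = value
            elif rest_values is not None:
                extra_values.append(value)
            else:
                raise TypeError("too many positional arguments")
        elif key in named:
            bound[named[key]] = value
        elif key in optional:
            bound[optional[key]] = value
        elif rest_options is not None:
            extra_options[key] = value
        else:
            raise TypeError(f"unexpected named argument {key!r}")

    # Everything that is not optional must have been supplied.
    missing = remaining + [n for n in named.values() if n not in bound]
    if missing:
        raise TypeError(f"missing arguments: {missing}")

    if rest_values is not None:
        bound[rest_values] = extra_values
    if rest_options is not None:
        bound[rest_options] = extra_options
    return bound


# function sort(list: l, [cmp: c, debug: dbg]) called without the optional part
sort_call = bind_arguments([("list", [3, 1, 2])],
                           named={"list": "l"},
                           optional={"cmp": "c", "debug": "dbg"})

# function printf(fmt, @values)
printf_call = bind_arguments([(None, "Hello %s!\n"), (None, "World")],
                             positional=["fmt"], rest_values="values")

# function http_load(url, %options); URL and 'timeout' are made-up examples
http_call = bind_arguments([(None, "http://example.org/"), ("timeout", 30)],
                           positional=["url"], rest_options="options")

print(sort_call, printf_call, http_call, sep="\n")
```

Leaving out a required argument, e.g. calling sort without "list:", raises
an error in this sketch, which corresponds to the runtime error during
parameter checking mentioned above.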


Missing Stuff
=============

This document is missing some important parts of the language specification,
such as:

	The handling of strings, string substitutions and SPL-like text
	templates.

	A detailed specification of the object model (incl. creation of
	new classes), the iterator interface, the $$ object and basic
	scalar objects.

	Built-in functions, regular expressions, numerical constants.

	Assembler and Bytecode representation.

	Preprocessor Features.

	Exception handling.

These are left for further PLA documents.


-- 
For extra security, this message has been encrypted with double-ROT13.

