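A target part follows the source part. It is introduced by -> and the production is terminated by a period (.).
Its structure is described by nearly the same mechanisms as those already known from the source part.
Example: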
nonterm = SourcePart
-> TargetPart1 (* can be referenced as nonterm_ or nonterm_1 *)
-> TargetPart2. (* dto. nonterm_2 *)
The i-th target of a nonterminal name can be used by name_i. For i=1 the digit can be omitted (name_1 = name_).
Using a target for which no definition is given (e.g. nonterm_4 in the above example) causes a runtime error message and inserts the text <***UNDEFINED TARGET***> in the generated result.
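The lookup rule for name_i can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not the Depot4 runtime; the function name resolve_target and the wording of the error message are assumptions made for illustration.

```python
UNDEFINED = "<***UNDEFINED TARGET***>"

def resolve_target(targets, index=1):
    """Return the index-th target text (1-based), or the marker if it is missing.

    Omitting the index models name_, which is the same as name_1.
    """
    if 1 <= index <= len(targets):
        return targets[index - 1]
    # Hypothetical message text; the source only says that an error is reported.
    print(f"runtime error: target {index} is not defined")
    return UNDEFINED

nonterm_targets = ["TargetPart1", "TargetPart2"]
print(resolve_target(nonterm_targets))     # nonterm_ (same as nonterm_1)
print(resolve_target(nonterm_targets, 2))  # nonterm_2
print(resolve_target(nonterm_targets, 4))  # nonterm_4 has no definition
```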
There is also a conditional expression
(!boolExpr; targetExpr1 | targetExpr2 )
or
(!boolExpr; targetExpr1 )
where, depending on the value of boolExpr, targetExpr1 or targetExpr2 (resp. nothing) is inserted.
Example:
nonterm = SourcePart
-> SOURCETEXT. (* just copies the source *)
Remark: SOURCETEXT must not be applied if SourcePart contains (direct or indirect) calls to From or Back.
The Ml4 translator need not know anything about them, because their interpretation is defined elsewhere.
Their syntax is:
'<' identifier ' ' {character} '>'
Example:
nonterminal = 'IF' expression 'THEN' statement
-> 'if (' expression_ ')' <'Indentation +'> statement_ <'Indentation -'>.
This could be interpreted as formatting information.
Ml4
production
only.
In each Ml4 production four variables are declared implicitly: c, O, i, and N.
c holds the number of the alternative selected in the source. The counting starts with one.
(* An identity mapping; removing leading spaces and comments: *)
Digit = '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
-> c-1.
O is 1 if a source part enclosed in option brackets appears in the actual source; otherwise its value is 0.
In the following example the variable O is used implicitly to transfer information from source to target.
In the target part, the character "-" is included only if it is present in the source.
SigndNum = ['-'] Num
-> ['-'] Num_.
i and N deal with iterations.
N stores the number of iterations found in the source; i is the loop variable. In the following example N is again used implicitly in source and target, so that the numbers of iterations in source and target are identical.
For the nonterminal Dcl a corresponding number of instances is needed. These are distinguished by an index that refers to the loop variable i.
DclSequence = { Dcl[i] }
-> { Dcl_[i] }.
UnexpectedTranslation = { ('1'|'2'|'3') }
-> { ('1'|'2'|'3') }.
applied to "12123" will result in "33333". The reason is that the value of the (implicit) c is changed in every iteration and finally set to 3, i.e., its last value is always propagated into the target generation. (The effective handling of such cases is dealt with next.)
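Why the result is "33333" can be seen from a small Python model of the two phases. This is a sketch under the assumption that parsing finishes before target generation starts and that c and N are single, shared values; the function name translate is made up for illustration.

```python
ALTERNATIVES = ["1", "2", "3"]

def translate(source):
    """Model of UnexpectedTranslation: one shared c, one shared N."""
    c = 0  # number of the alternative chosen last (counting starts with one)
    n = 0  # number of iterations found in the source
    for ch in source:
        if ch not in ALTERNATIVES:
            raise ValueError(f"unexpected character {ch!r}")
        c = ALTERNATIVES.index(ch) + 1  # overwritten in every iteration
        n += 1
    # Target generation: n iterations, each one emits alternative number c.
    return "".join(ALTERNATIVES[c - 1] for _ in range(n))

print(translate("12123"))
```

Because only the last value of c survives the source phase, every target iteration selects the same alternative.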
Controls are enclosed in slashes: /... /.
A control must be the first significant entity (after the opening bracket, if any) of its surrounding structure.
DclSequence = /c/ ident | [/O/ '-'] number.
The full form of an iteration control is /i=1..N/. There are several shortcuts:
/j/      for /j=1..N/
/=2/     for /i=2..N/
/..M/    for /i=1..M/
/j=2/    for /j=2..N/
/=3..M/  for /i=3..M/
//       as (empty) control.
DclSequence = 'VAR' {// ident {//',' ident}':' type}.
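The shortcut rules for iteration controls amount to filling in defaults: loop variable i, lower bound 1, upper bound N. The following Python sketch expands the text between the slashes into the full form. The function name expand_control and the regular expression are illustrative assumptions, not part of Ml4; the empty control // is returned unchanged because the text above does not give an expansion for it.

```python
import re

# Optional loop variable, optional "=lower", optional "..upper".
_CONTROL = re.compile(r"(?P<var>[A-Za-z]\w*)?(?:=(?P<lo>[^.]+))?(?:\.\.(?P<hi>.+))?")

def expand_control(body):
    """Expand the text between the slashes of an iteration control."""
    if body == "":
        return ""  # empty control: no expansion stated in the text
    match = _CONTROL.fullmatch(body)
    if match is None:
        raise ValueError(f"not an iteration control: {body!r}")
    var = match.group("var") or "i"  # default loop variable
    lo = match.group("lo") or "1"    # default lower bound
    hi = match.group("hi") or "N"    # default upper bound
    return f"{var}={lo}..{hi}"

for shortcut in ["j", "=2", "..M", "j=2", "=3..M"]:
    print(f"/{shortcut}/ for /{expand_control(shortcut)}/")
```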
[/v/ 'abc'] acts as if v>0 then generate('abc').
The following production generates "123456789":
Generate1to9= -> {/..9/ i }.
Control by value can also be useful in the source part. There is a simple way to override the default control-by-source behavior: write an expression that is syntactically different from a simple variable, e.g., by enclosing the variable in brackets.
NonCFG = {'a'}{/..(N)/'b'}{/..(N)/'c'}.
accepts the famous (non-context-free) language a^n b^n c^n. The first iterator is controlled by source, i.e., the number of 'a' determines the value of N. The subsequent iterators will fail if the number of 'b' and 'c' is not the same as N.
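The interplay of the two control modes can be sketched in Python: the first loop counts (control by source), the following loops demand exactly that count (control by value). This is a model of the described behavior only; the function name accepts is hypothetical, and requiring that the whole input is consumed is an assumption.

```python
def accepts(text):
    """Model of NonCFG = {'a'}{/..(N)/'b'}{/..(N)/'c'}."""
    pos = 0
    n = 0
    # Control by source: the number of 'a' found determines N.
    while pos < len(text) and text[pos] == "a":
        pos += 1
        n += 1
    # Control by value: exactly N repetitions of 'b', then of 'c'.
    for symbol in "bc":
        for _ in range(n):
            if pos >= len(text) or text[pos] != symbol:
                return False
            pos += 1
    return pos == len(text)  # assumption: nothing may follow

print(accepts("aabbcc"), accepts("aabbc"))
```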
Another example for exploiting controls and indices is the inversion of a (non-empty) comma separated identifier list:
RevertList = ident[1] {/=2/ ','ident[i]}
-> ident[N]{/..N-1/',' ident[N-i]}.
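The index arithmetic of the target part can be checked with a short Python model. It is a sketch only: the function name revert_list is made up, the dictionary stands in for the indexed instances ident[1] .. ident[N], and splitting at commas replaces the real parser.

```python
def revert_list(source):
    """Model of RevertList: emit ident[N], then ',' ident[N-i] for i = 1..N-1."""
    names = [part.strip() for part in source.split(",")]
    n = len(names)  # N: ident[1] plus the iterations i = 2..N
    ident = {k: name for k, name in enumerate(names, start=1)}
    target = ident[n]
    for i in range(1, n):  # models the control /..N-1/
        target += "," + ident[n - i]
    return target

print(revert_list("a, b, c"))
```

For i = N-1 the index N-i is 1, so the last identifier emitted is the first one of the source list.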
!boolExpr
NonCFG = {'a'}{/..c/'b'} !c=N; {/..O/'c'} !O=N.
(This is also an example of safely re-using the predeclared variables c and O: there are no options or alternatives.)
A condition in connection with an alternative can be used to simulate a conditional statement in Ml4:
(!boolExpr; src1 | src2 )
is semantically equivalent to the (imaginary)
IF boolExpr THEN src1 ELSE src2 FI
It is important that there is always a second (maybe empty) branch, even in incomplete conditional statements, i.e., the meaning of IF boolExpr THEN src1 FI must be expressed as
(!boolExpr; src1 | )
!! (without any symbol in between).
Preferably, it can be applied after acceptance of unique keywords or the like. It acts as a cut: e.g., if it is reached somewhere within an alternative, no further branch will be checked. So it should be used sparingly and preferably in high-level rules (i.e., near the root).
Variables in Ml4 may be local or global. There are also static variables, which will be discussed in 3.14.3.
VAR oi, ot: INT; AR: ARR 20 OF SYM;
There must not be more than one declaration part per production.
For global variables, one has to distinguish between declaration and specification of variables; both are introduced by GLOBVAR.
Declaration:
DCL count, number: INT;
Specification:
USE count, number: INT;
Declarations and specifications of global variables can be arbitrarily mixed.
The runtime system connects specified global variables with their targets. During translation there is no check for adequate bindings. At runtime it is checked for each specified variable whether a target instance has been declared (otherwise the parser issues an error). The runtime system also keeps track of type compatibility (based on structure equivalence).
With the directive LOCAL the implementation of nonterminals can be locally overloaded.
(Locally means for the parse tree rooted by the nonterminal containing the directive.)
This change influences the current production and all productions which are called subsequently.
When the production is finally left, the initial configuration is reestablished.
LOCAL 'id' <- 'Dp4Stdlex';
I.e., as long as the current production is active, the implementation for the symbol id will be taken from a module Dp4Stdlex.
LOCAL 'id' <- 'ident:Dp4ExtLex';
means that id is substituted by ident from module Dp4ExtLex.
A production can also define a SYM value, introduced by => (referenced as "nt_0").
This is useful mainly when describing typical (class) terminals by means of Ml4.
Example:
myIdent = ident
=> 'yy' ident_0 (* this is the SYM value, referenced by myIdent_0 *)
-> 'yy'ident_. (* ordinary target, used as myIdent_ or myIdent_1 *)
Parameterized productions have not been implemented yet.
Imports connect Ml4 code with external software, i.e., they make it an open system. As Depot4 claims host language independence, there are restrictions.
The Ml4 translator assumes that any imported entity is applied correctly. Usually, this requires some knowledge of the implementation strategies (type mapping etc.) of the respective system.
Incorrect application of imported elements will not be reported until the system's host language compiler processes the generated code.
Imports are declared in a list of simple identifiers, e.g., IMPORTS Module1, pack2.
As this does not meet all requirements, there is an additional mapping mechanism. It allows defining, for each of these module identifiers, a string that is actually inserted in the host program.
Mappings can be declared locally in an import or globally (e.g. in the configuration). Individual mapping borrows its syntax from Oberon, i.e., it looks like
IMPORTS Module1:= 'Module_1', pack2:= 'MyPacks.Pack.p2'.
Type definitions are enclosed in TYPE ... TYPEND.
They are of the form identifier = type and separated by semicolons, where type has to be a valid type description (see 3.4.).
The init part is introduced by INIT and follows after the (last) production. It may contain assignment statements and procedure calls only. All variables used must be static.
For example, the start-up message of the Ml4 translator is issued in the init part of its root production:
INIT
Dp4OP.WrStr('Depot4: Ml4/Java - Translator 1.9.2 ');
Dp4OP.WrStr(Ml4Date); Dp4OP.WrLn();
DefCom('(*', '*) ')
for Pascal-type comments. The space at the end of the second string is essential (in Pascal, as in many other languages, a comment acts like a single space).
$ $
DefCom('$+${', '} ')
DefCom('$1${', '} ')
makes this the top record (all others are deleted)
DefCom('$-$', '')
DefCom('$1-$/*', '*/ ')
for C-style comments (overriding the first/default).
An option of the form :NONTERM calls the nonterminal NONTERM to consume the text stretch following the opening. Any remaining text will be skipped as usual, i.e., the closing must not be consumed by NONTERM!
DefCom('$+:NONTERM${$', '} ')
for TurboPascal-like directives.
Any result of NONTERM will be ignored (if not saved via parameters or global variables).
DefCom('$1-$/*', '*/ '); DefCom('$+-$//','\n ');
suffices to define comment formats for Java. However, if there is a need to handle documenting comments differently,
one has to add
DefCom('$+-$/**', '*/ ');
Be aware of the sequence
DefCom('$+-$/*', '*/ ');... DefCom('$+-$/**', '*/ ');
because otherwise any comment starting with "/**" will already be skipped as one of the form "/*".
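Why the order matters can be illustrated with a small Python model. It rests on an assumption that is consistent with the remark above but not stated explicitly in the text: comment records are tried starting with the one defined most recently, and the first opening that matches wins. The function name match_opening is hypothetical.

```python
def match_opening(text, openings):
    """Return the comment opening that applies to text, or None.

    openings is given in definition order; the most recently defined
    record is tried first (assumption, see the lead-in).
    """
    for opening in reversed(openings):
        if text.startswith(opening):
            return opening
    return None

comment = "/** documenting comment */"
print(match_opening(comment, ["/*", "//", "/**"]))  # recommended order
print(match_opening(comment, ["/**", "//", "/*"]))  # reversed order
```

With the recommended order the longer opening "/**" is found; with the reversed order the shorter "/*" matches first and the documenting comment is treated like an ordinary one.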
DefCom('$+:NtDirect$(*!', '*) ');
Nonterminal definition:
NtDirect=
GLOBVAR USE traceState: BOOL;
('notrace'|'trace'|(*ignore*)); traceState:= c=2;.
could be used to analyze
... (*!trace -here starts the ordinary comment- version 33 29-FEB-00 *) ...
and would result in traceState being set to true.
The Ml4 language has evolved through several steps of change, adding and removing features. Some features were added rather ad hoc, following an urgent need. Later on, it became clear that they do not fit well into the general structure, that they are just a special case of some problem which should be solved more generally, or that they are limited to certain host languages. Unfortunately, once a program has been released and applied, it is no longer easy to remove anything without invalidating existing applications. I.e., such features have to be kept in the system even if there is no need for their existence. That is common in software engineering and called backward compatibility. Ml4 is no exception in this respect. We have marked with (-) those features which are still present but should not be used in any new program.
These are:
- Variables chrChr, nxtChr and the intrinsic procedure getChr.
  - As there is no character type, they do not fit in the type system. As a result, their implementation causes several restrictions.
- Intrinsic procedures NoSkip and ReSkip.
  - There is no need for them; use < ... > instead.