EELC v2.0

The EEL Compiler Using Adaptive Object-Oriented Programming and Demeter/Java

April 1997

Motivation

The Secure Internet Gateway System (SIGS) project at GTE Laboratories (now Verizon) has a component whose purpose is to validate incoming requests from customers. These validations are generally simple and take forms such as: "field name must always be present", "field date must be a valid date type", "if field name is present, then field telephone_no must also be present", "field customer-id must be in the list of valid customer ids", etc. Traditionally, these 'edit rules' are buried in different parts of a COBOL, C, or C++ application, which makes maintenance costs extremely high.

The case was made, therefore, to abstract the concept of "editing" and define a very simple, high-level language in which these validation rules can be expressed. In this way, business analysts can write and debug the rules themselves, at a conceptual level.

For these 'high-level rules' to be executed, they need to be translated into some form the computer can understand. Therefore, the Edit Engine was built as a framework into which these rules can be 'plugged in'. In other words, the interfaces of the Rule objects (and all other objects in the system, for that matter) were defined at abstract layers so that specialization by subclassing can take place at a later stage. Still, a compiler from the 'Edit Engine Language' to C++ was necessary.

The first such compiler was developed using the traditional lex/yacc tools, as well as C++. It took about 4 man-months to finish. Because of the way the code-generation phase works, this compiler generates inefficient code (and possibly incorrect code). For example, for a rule like:

if (notempty(field1)) then (field1[1] == "A"); // EEL v1.0 used

the generated code will be of the form (C++ used):
        ...
        bool a = notempty(field1);
        bool b = field1[1] == 'A';
        if (a) return b;
        ...

Succinctly, the compiler makes an in-order traversal of the parse tree and generates code at every node, creating temporaries as it goes along, with the decision being made last in the process. Besides the performance penalty, this approach can generate incorrect code. For instance, the code above would crash if field1 is indeed empty, because field1[1] is evaluated before the result of notempty() is consulted. (Note: the compiler-support functions would thus have to check the validity of their arguments to avoid this problem.)
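The difference between the two code shapes can be sketched in a few lines. The following is an illustrative Java analogue, not the output of either compiler. The class and method names are hypothetical, and the sketch assumes that a rule whose precondition is false counts as passing.

```java
final class EagerVersusShortCircuit {
    // Hypothetical stand-in for the notempty() support function.
    static boolean notEmpty(String field) {
        return field.length() > 0;
    }

    // Shape of the v1.0 output: every sub-expression is computed into a
    // temporary first, and the decision is taken last.
    static boolean eager(String field1) {
        boolean a = notEmpty(field1);
        boolean b = field1.charAt(0) == 'A'; // throws when field1 is empty
        return !a || b;
    }

    // Short-circuit shape: the postcondition is only evaluated when the
    // precondition holds, so an empty field never reaches charAt().
    static boolean shortCircuit(String field1) {
        return !notEmpty(field1) || field1.charAt(0) == 'A';
    }

    public static void main(String[] args) {
        System.out.println(shortCircuit(""));
        System.out.println(shortCircuit("Abc"));
    }
}
```

Calling eager("") fails with an index exception, whereas shortCircuit("") simply reports that the rule did not trigger.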

Inception

After playing with Demeter/Java for a little while, and realizing the potential applicability of Demeter/Java's built-in parsing capabilities to this problem, I decided to rewrite the compiler from scratch. Management approval was obtained at this phase.

Elaboration

The following risks, grouped by category, were identified:

Requirement Risks: there were no major problems in this area. The function of the compiler was understood and the Edit Engine framework's API provided a clear view of the target. Furthermore, a complete set of validation rules, written in the Edit Engine Language (EEL), was already available, which made the requirements pretty much complete.
Technological Risks: while Demeter/Java has been around for quite some time, its use in the creation of mission-critical applications has not yet been demonstrated. Furthermore, Demeter/Java itself uses cutting-edge technology (JDK 1.1, JavaCC), which makes the technological risks high.
Skills Risks: I had never used Demeter/Java in my life. My experience with Java is limited. I am an average object oriented designer, proficient in C++.
Political Risks: these were not a problem.

Planning

Because of the peculiar characteristics of this project, planning was based mostly on incremental acceptance of the EEL specification, followed by incremental code generation, instead of the more common use-case-based approach. One could argue that each micro-step is a scenario of a generic use case, but that is immaterial to this report.

The different phases I anticipated were:

Get a sample of the full EEL file parsed
Get the complete EEL file parsed
Experiment with traversals finding statistics from the parsed code to gain familiarity with the Visitor-pattern style of programming.
Get the code-generator to create the class declarations.
Get the code-generator to create the constructor definitions (for simple rules first, and meta rules later)
Get the code-generator to create the Validate() and MetaCheck() functions
Get the code-generator to create the Parser utility class (declaration and definition)

With regard to schedule estimates, I really could not make any, since I had not seen or used the technology before. Nevertheless, if the project could be finished in the same time as the first version or less, it could be considered a success.

Construction

The Edit Engine Language

The EEL is tailored for doing field validations. Instead of presenting here a full version of a BNF grammar for it, Figure 1 shows an example of an actual validation file.

//==============================================================
issue "1" 1 {
meta {
    E01METH00101:
      if ( in(lsr.reqtyp,"EB","MB") and
           in(lsr.act,"N","T","V") )
      then ( required(lsr) and
             required(eu) and
             required(rsl) and
             prohibited(lp) and
             prohibited(port) and
             prohibited(inp) and
             prohibited(lpnp) and
             required(dsr) )
      do ( cross_form_set, lsr_set, eu_set, rsl_set, dsr_set, dl_set);
}
// --------------------------------------------------------- CROSS_FORM SET
set cross_form_set {
    E01METH00001:
      if ( in(lsr.reqtyp,"EB","MB") and
           in(lsr.act,"N","T","V","A") )
      then ( (lsr.pon == eu.pon) and
             (lsr.pon == rsl.pon) and
             (lsr.pon == dsr.pon) );
    E01METH00002:
      if ( in(lsr.reqtyp,"EB","MB") and
           in(lsr.act,"C","M","R","B","D","S") )
      then ( (lsr.pon == eu.pon) and
             (lsr.pon == rsl.pon) );
}
// ----------------------------------------------------- LSR SET
set lsr_set {
   p E01FLDH00003: if (notempty(lsr.ccna))
                   then (externalvalidate(tbhit,lsr.ccna));
    E01FLDH00012: always (required(lsr.sc));
    E01FLDH00013: if (notempty(lsr.sc[1..2]))
                  then (lsr.sc[1..2] == "GT");
    E01FLDH00014: always (required(lsr.dtsent));
    E01FLDH00015: if   (notempty(lsr.dtsent[1..8]))
                  then (isdate(lsr.dtsent[1..8]));
    E01FLDH00016: if ( notempty (lsr.dtsent[9..10] ) )
                  then ( (lsr.dtsent[9..10] >= "00") and
                         (lsr.dtsent[9..10] <= "23") );
}
}
// ==========================================================================
                                Figure 1: Sample from validation file

Each validation file contains one or more issue{...} blocks. Each issue presents a meta{...} set first, followed by a number of rulesets (set blah{...}). Each set contains a number of rules. Metarules differ from simple rules in that they end with a do( <ruleset_list> ) statement. Furthermore, metarules are always "if" commands. A rule definition consists of:

An optional tag to denote this rule is spawnable on its own [p]
The rule name (this will be transformed to a class name later) followed by a ':'
A command (either if (expr) then (expr), or always (expr))
In the case of metarules, a do( <set_list>)

Expressions can take a number of forms. There are several 'functions', such as in(...), notin(...), length(...), required(...), isdate(...), istime(...), etc., that make rule-writing simple. Data elements (i.e. form fields) are represented by a qualified pathname with the pattern form[.form][.field]. Parts of fields are extracted using an index notation, [a] or [a..b], where a is the first character and b the last to be taken into account (one-based).
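As a small illustration of the one-based index notation, the Java sketch below maps [a..b] onto a substring call. The class and method names are hypothetical, and the treatment of [a] as a single character is an assumption based on the field1[1] == "A" example earlier in this report.

```java
final class FieldSlice {
    // EEL [a..b]: characters a through b, one-based and inclusive.
    static String slice(String value, int first, int last) {
        // One-based inclusive [first..last] maps to the zero-based,
        // half-open range [first - 1, last) that substring() expects.
        return value.substring(first - 1, last);
    }

    // EEL [a]: assumed here to select the single character at position a.
    static String slice(String value, int first) {
        return slice(value, first, first);
    }

    public static void main(String[] args) {
        System.out.println(slice("GTE0", 1, 2));
        System.out.println(slice("1997043015", 9, 10));
    }
}
```

With this convention, a rule such as lsr.sc[1..2] == "GT" compares the first two characters of the field, and lsr.dtsent[9..10] picks out the two hour digits that follow an eight-character date.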

The Edit Engine Framework

In order to understand the function of the EEL compiler, it is useful to review the Edit Engine framework, shown in Figure 2. The objects met001, met002, fld001, fld002, etc are those that have been instantiated from classes defined by the EEL compiler. The Rule class is an abstract superclass of all these generated classes which defines the interfaces they should support. There are two 'types' of rules in the system: meta rules (which direct the validation flow) and simple rules (which validate data on a form). Each class generated by the EEL compiler must support:

An empty constructor
bool Validate(...) function
bool MetaCheck(...) function (for metarules only)

Figure 2: Runtime snapshot of the Edit Engine framework. The objects met001, met002, fld001, fld002, etc. are instances of the classes generated by the EEL compiler.

The Target Generated Code

In terms of the rules, we need both the .h and .cpp files, which contain the class declarations and definitions, respectively.

For rules E01METH00101 and E01FLDH00012 above, for instance, we would expect the following code generated in the header file.

// eelrules.h
class E01METH00101 : public Rule {
public:
   E01METH00101();
   virtual bool Validate(Form*);
   virtual bool MetaCheck(Form*);
private:
   static list<string> meta_list;
};
class E01FLDH00012 : public Rule {
public:
   E01FLDH00012();
   virtual bool Validate(Form*);
};
...

For the .cpp file, we would want the following for the above rules:

// eelrules.cpp
// E01METH00101 first...
list<string> E01METH00101::meta_list;
E01METH00101::E01METH00101() : Rule() {
   Name("E01METH00101");
   Meta(true);
   meta_list.push_back( "cross_form_set" );
   meta_list.push_back( "lsr_set" );
   meta_list.push_back( "eu_set" );
   meta_list.push_back( "rsl_set" );
   meta_list.push_back( "dsr_set" );
   meta_list.push_back( "dl_set" );
   SetList(&meta_list);
}
bool E01METH00101::MetaCheck(Form* form) {
   static vector<string> var_0;
   var_0.push_back( "EB" );
   var_0.push_back( "MB" );
   static vector<string> var_1;
   var_1.push_back( "N" );
   var_1.push_back( "T" );
   var_1.push_back( "V" );
   return ( rf_in(form, "lsr.reqtyp", 0, 0, var_0) && rf_in(form, "lsr.act", 0, 0, var_1));
}
bool E01METH00101::Validate(Form* form) {
   return ( rf_req(form, "lsr", 0, 0) && rf_req(form, "eu", 0, 0) && rf_req(form, "rsl", 0, 0) && !rf_req(form, "lp", 0, 0) && !rf_req(form, "port", 0, 0) && !rf_req(form, "inp", 0, 0) && !rf_req(form, "lpnp", 0, 0) && rf_req(form, "dsr", 0, 0) );
}

// E01FLDH00012 next...
E01FLDH00012::E01FLDH00012() : Rule() {
   Name("E01FLDH00012");
}
bool E01FLDH00012::Validate(Form* form) {
   return ( rf_req(form, "lsr.sc", 0, 0) );
}

Parsing EEL using Demeter/Java

In order to parse EEL using Demeter/Java, I created a class dictionary annotated with syntax. There were a number of problem points since EEL does not lend itself to LL(k) parsing very nicely. Nevertheless, the task was relatively simple. One important point was parsing of the low-level identifiers. Demeter/Java uses Ident as an identifier class. Unfortunately, Demeter/Java Idents do not allow dots in them, while EEL identifiers do. For this reason, I had to define a complete class structure to parse each 'piece' of an EEL identifier. While this was an annoyance, Demeter/Java's support for automatic traversals made it easier.
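To illustrate the decomposition, here is a minimal Java sketch that splits a dotted EEL field name into the three kinds of parts the class dictionary distinguishes: an identifier, a '#'-prefixed row index, and a star. It is a hand-written stand-in for what the generated parser does, not code from the compiler. The names are hypothetical, and the sketch assumes identifiers consist of letters, digits, and underscores.

```java
import java.util.ArrayList;
import java.util.List;

final class FieldNameParts {
    // Splits a dotted field name and labels each part with its kind.
    static List<String> classify(String fieldName) {
        List<String> kinds = new ArrayList<>();
        // The limit of -1 keeps trailing empty parts so that "a." is rejected.
        for (String part : fieldName.split("\\.", -1)) {
            if (part.equals("*")) {
                kinds.add("star");
            } else if (part.matches("#\\d+")) {
                kinds.add("index:" + part.substring(1));
            } else if (part.matches("[A-Za-z_][A-Za-z0-9_]*")) {
                kinds.add("ident:" + part);
            } else {
                throw new IllegalArgumentException("not a valid field name part: " + part);
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        System.out.println(classify("rsl.sd.#1.lna"));
    }
}
```

A bare number such as the "1" in rsl.sd.1.lna is rejected, which mirrors the requirement that row numbers carry a '#' prefix.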

The complete class dictionary, heavily commented, follows:

// ****************************************************************
// eelc.cd - Class Dictionary for EEL
// ---------------------------------------------------------------
// Copyright (c) 1997 - GTE Laboratories Incorporated
//                       40 Sylvan Road
//                       Waltham, MA 02254
// All rights reserved.
// ---------------------------------------------------------------
// this file defines the class structure of an eel program. it is
// used by the Demeter/Java system to generate a hierarchy of Java
// classes on which we later execute traversals. note that the
// different classes are annotated with specific syntax to facili-
// tate parsing by the system. please see
//    http://www.ccs.neu.edu/home/lieber
// for more information on Demeter/Java
// ---------------------------------------------------------------
// date      who       what
// 10-30-97 lblando   created
// ****************************************************************

(@
import java.util.*;
import java.io.*;
@)

// an entire eel program is just a list (possibly empty) of
// issue declarations
CompilationUnit = IssueList .
IssueList ~ {IssueDeclaration} .

// an issue declaration starts with "issue", is followed by
// a string (the name), an optional integer, and the body.
IssueDeclaration        = "issue" <name> String
                                [ <num> Integer ]
                                         IssueBody .

// an issue body is enclosed in braces and contains one
// metaset and a list of rulesets
IssueBody = "{" <metaset> MetaSet
                <sets> RuleSetList "}" .

// a metaset starts with "meta", is enclosed in braces, and
// contains a list of metasetrules.
MetaSet = "meta" MetaSetBody .
MetaSetBody = "{" <rules> MetaRuleList "}" .

// a ruleset starts with "set" and has a name, then
// a brace and a list of rules followed by another brace
RuleSetList             ~ {RuleSet} .
RuleSet                 = "set" <name> Ident
                          "{"   <rules> RuleList "}" .

// metarules cannot be spawned, have a name, then an IfCmd (the
// only one possible) and then the do list.
MetaRuleList            ~ {MetaSetRule} .
MetaSetRule             = <name> Ident ":"
                          <cmd> IfCmd
                          "do" "(" <setlist> NameList ")" ";" .

// simple rules can either be spawnable or not. i resorted to this
// admittedly involved design after i unsuccessfully tried to parse
// the _optional_ "p" in front of _every_ rule.
RuleList                ~ {Rule} .
Rule                    : RuleSetRule | SpawnableRuleSetRule .
//                           *common* <name> Ident ":" <cmd> Command ";" .
// while it would make sense to leave the above line in, demjava
// complains when JavaCC processes grammar.jj. so we need to manually
// flatten the class graph in this part and replicate the above line
// in the subclasses.

// as we see, the only difference between these next two is the 'p'
// tag in front of the threaded ones...
// (notice how we replicate the information that more nicely fits in the
// superclass)
RuleSetRule             =     <name> Ident ":" <cmd> Command ";" .
SpawnableRuleSetRule    = "p" <name> Ident ":" <cmd> Command ";" .

// we only have two types of commands at the moment. always and if.
// always means that the expression must evaluate to true all the
// time and if means that, should the precondition evaluate to true,
// then the postcondition must also evaluate to true. if the
// precondition evaluates to false, then the result of the rule is
// true (ie. did not trigger).
Command                 : IfCmd
                        | AlwaysCmd .
IfCmd                   = "if"      <pre_exp> CondExpression
                          "then"    <post_exp> CondExpression .
AlwaysCmd               = "always" <exp>      CondExpression .

// adding parenthesized expressions. note that parentheses are not
// needed when we have one function only, i.e.:
//   if empty(lsr.act) then required(lsc);
// but MUST be present whenever we have boolean operators, ie:
//   if ( empty(lsr.act) and notempty(lsr.cc) ) then required(lsc);
ParenExpression         = "(" Expression ")" .
Expression              = CondOrExpression .
CondOrExpression        ~ CondAndExpression { "or" CondAndExpression } .
CondAndExpression       ~ CondExpression    { "and" CondExpression    } .

// when we get to the bottom-level, an expression is either a
// simplecondition (ie: one or two fields required), a multivalued
// condition (ie: a field and a list of values), or a parenexp to
// allow for nested parenthesized expressions.
CondExpression          : SimpleCondition
                        | MultiValuedCondition
                        | ParenExpression .

// a simplecondition is either SingleValued (ie: one field), or
// a comparison condition (two fields)
SimpleCondition         : SingleValuedCondition
                        | ComparisonCondition .
// LRB                  | ExtValCondition .

// there are two types of multivaluedconditions: the externalvalidate
// and everything else.
MultiValuedCondition : FieldAndListCondition | ExtValCondition .
// LRB InCond | NotinCond LengthCond ...

// 'everything else' is in(), notin(), or length().
// LRB
FieldAndListCondition   : InCond | NotinCond | LengthCond
                           *common* "(" <field> Field ","
                                        <vlist> StaticValueList ")" .

// the ones that only require one field are empty(), notempty(), etc
SingleValuedCondition   : EmptyCond
                        | NotemptyCond
                        | RequiredCond
                        | ProhibitedCond
                        | IstimeCond
                        | IsdateCond
                        | IsdatetimeCond
                        | IstimerangeCond
                           *common* "(" <field> Field ")" .

// to parse the comparison conditions we need to use the lookahead feature of
// JavaCC-Demeter/Java, since they have infix notation. furthermore, i need to
// use the symbolic version of the lookahead (instead of a fixed offset) since
// fields can be arbitrarily long.
ComparisonCondition     :   *lookahead* (@_EQCond() @) EQCond
                        | *lookahead* (@_NEQCond()@) NEQCond
                        | *lookahead* (@_GTCond() @) GTCond
                        | *lookahead* (@_LTCond() @) LTCond
                        | *lookahead* (@_GEQCond()@) GEQCond
                        | *lookahead* (@_LEQCond()@) LEQCond .

// the definition of an externalvalidate condition is straightforward
ExtValCondition = ExtValCond "(" <id> Ident ","
                                 <flist> FormFieldList ")" .

// all the terminals go here. notice the definition of the comparison
// functions and how they all start with the same form. that is why we
// need the lookahead there.
LengthCond              = "length" .
ExtValCond              = "externalvalidate" .
InCond                  = "in" .
NotinCond               = "notin" .
EmptyCond               = "empty" .
NotemptyCond            = "notempty" .
RequiredCond            = "required" .
ProhibitedCond          = "prohibited" .
IstimeCond              = "istime" .
IstimerangeCond         = "istimerange" .
IsdateCond              = "isdate" .
IsdatetimeCond          = "isdatetime" .
EQCond                  = <field1> Field "==" <field2> Field .
NEQCond                 = <field1> Field "!=" <field2> Field .
GTCond                  = <field1> Field ">" <field2> Field .
LTCond                  = <field1> Field "<" <field2> Field .
GEQCond                 = <field1> Field ">=" <field2> Field .
LEQCond                 = <field1> Field "<=" <field2> Field .

// defining a field was interesting, since fields are dotted lists of
// Idents, which could possibly contain numbers, stars (*), and/or
// indices. we need a fixed lookahead to determine between a FormField
// (ie. lsr.blah.blah) and a StaticValue (ie, "hello", 4, 12121).
Field : *lookahead* (@3@) FormField
      | *lookahead* (@3@) StaticValue .

// a formfield has a name and might have indices.
FormField = <name> FieldName [ <idx> Indices ] .

// an index must have a starting position but the end position is optional
Indices = "[" <first> Integer [".." <second> Integer ] "]" .

// a fieldname must have at least one part, and possibly more after a '.'
FieldName ~ FieldNamePart { "." FieldNamePart } .

// each part of a fieldname is either an ident, a number, or a star
FieldNamePart           : FieldNamePartIdent
                        | FieldNamePartIndex
                        | FieldNamePartStar .

// the next two are trivial...
FieldNamePartIdent = <part_name> Ident .
FieldNamePartStar = "*" .

// for numbers in the naming, we require the presence of the '#'
// character in front of the number. for instance, the lna value of the
// second row of the service details section in the resale form is
// denoted by rsl.sd.1.lna. however, in eel we require that this be
// written as: rsl.sd.#1.lna. this is a very minor change to eel that
// makes the parser simpler (note: as of now, there are NO rules that use
// specific row numbers in them, and their use is not foreseen)
FieldNamePartIndex = "#" <part_index> Integer .

// a nonempty list of FormFields, comma-separated
FormFieldList ~ FormField { "," FormField } .

// a comma-separated, nonempty, list of Idents
NameList ~ Name { "," Name } .
Name = <name> Ident .

// staticvalues are used in both fieldandlistconditions and
// comparisonconditions. we define a small hierarchy here to
// correctly parse them.
StaticValueList         ~ StaticValue { "," StaticValue } .
StaticValue             : StaticValueInt | StaticValueString .
StaticValueInt          = <value> int .
StaticValueString       = <value> String .

// END OF CLASS STRUCTURE
// ********************************************************************
// the following section describes the behavior-dependent classes
// they are included here for simplicity.
// ********************************************************************

// the visitor superclass. all others extend eelvisitor. traversals
// are defined in terms of an eel visitor.
EELVisitor              : RuleConstructorVisitor
                        | HeaderVisitor
                        | ExpressionVisitor
                        | ValidateVisitor
                        | CommandVisitor
                        | MetaValidateVisitor
                        | ParserVisitor
                            *common* <buffer> String
                                      <varbuff> String .

// the expression visitor is the one that does most of the complicated
// work when generating code for an expression. it contains state variables
// to maintain the indices (if present) so that code generation can use
// them later in the traversal.
ExpressionVisitor = "EV" <idx1> int <idx2> int .

// commandvisitor knows how to generate code of the two commands (if and
// always). it uses the result of the expressionvisitor to generate the
// whole expression.
CommandVisitor = "CV" <ev> ExpressionVisitor .

// the validatevisitor, on the other hand, uses a commandvisitor to fill in
// the body of the member function, but knows how to generate the 'shell'
ValidateVisitor = "VV" <cmd> CommandVisitor .

// this one is the ValidateVisitor for metarules. notice we reuse the
// other two (command and expression). this _might_ be a good point in
// favor of the "use many visitors vs. one" argument.
MetaValidateVisitor = "MVV" <cmd> CommandVisitor .

// this one just knows how to do the constructors, notice that in this case,
// there's only one visitor for both simple rules and meta rules, and the
// visitor itself knows what to do for each. (this, on the other hand, would
// be an example of the "use only one visitor instead of many" approach :-)
RuleConstructorVisitor = "RCV" .

// same argument as above applies for the visitor that generates the header
// file. it is one and knows how to 'switch' between them
HeaderVisitor = "HV" .

// with the parservisitor i used yet another approach and had just one
// visitor to do everything (both header and cpp files). this visitor is,
// admittedly, simpler than the ones before (although more tedious to write)
ParserVisitor = "PV" <header> String
                     <body> String .

// the class that starts it all... :-)
Main = .
// ****************************************************************
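The command semantics described in the class dictionary comments can be sketched in plain Java: an always command holds when its expression is true, and an if command holds whenever its precondition is false or its postcondition is true. The class and method names below are hypothetical, and BooleanSupplier is used only to make the lazy evaluation of the postcondition explicit.

```java
import java.util.function.BooleanSupplier;

final class CommandSemantics {
    // always (exp): the expression must evaluate to true.
    static boolean always(BooleanSupplier exp) {
        return exp.getAsBoolean();
    }

    // if (pre) then (post): when pre is false the rule "did not trigger"
    // and the result is true; post is only evaluated when pre is true.
    static boolean ifThen(BooleanSupplier pre, BooleanSupplier post) {
        return !pre.getAsBoolean() || post.getAsBoolean();
    }

    public static void main(String[] args) {
        String sc = "";
        // Mirrors: if (notempty(sc[1..2])) then (sc[1..2] == "GT")
        boolean ok = ifThen(() -> sc.length() >= 2,
                            () -> sc.substring(0, 2).equals("GT"));
        System.out.println(ok);
    }
}
```

Because the postcondition is wrapped in a supplier, it never runs on an empty field, which is the behavior the generated C++ obtains from the short-circuiting && and || operators.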

Specifying the Behavior

To specify the behavior, we create a .beh file in the Demeter/Java language (a superset of Java). The eelc.beh file used in the EEL compiler follows. It is heavily commented and shows several different approaches to programming with traversals and visitors.
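As orientation for the listing, the before/after visitor style can be approximated in plain Java. The sketch below imitates how the RuleConstructorVisitor in eelc.beh builds a meta rule's constructor: code is emitted on entering the rule, once per Name in the do(...) list, and again on leaving the rule. The class, its method names, and the hand-rolled loop that stands in for the Demeter/Java traversal are all hypothetical. In the real system the traversal code is generated by Demeter/Java.

```java
import java.util.Arrays;
import java.util.List;

final class MetaConstructorSketch {
    private final StringBuilder buffer = new StringBuilder();

    // Plays the role of "before MetaSetRule": open the constructor.
    void beforeMetaSetRule(String ruleName) {
        buffer.append("list<string> ").append(ruleName).append("::meta_list;\n");
        buffer.append(ruleName).append("::").append(ruleName).append("() : Rule() {\n");
        buffer.append("   Name(\"").append(ruleName).append("\");\n");
        buffer.append("   Meta(true);\n");
    }

    // Plays the role of "before Name": one push_back per ruleset name.
    void beforeName(String setName) {
        buffer.append("   meta_list.push_back( \"").append(setName).append("\" );\n");
    }

    // Plays the role of "after MetaSetRule": the list is complete, so close up.
    void afterMetaSetRule() {
        buffer.append("   SetList(&meta_list);\n");
        buffer.append("}\n");
    }

    // Hand-written stand-in for the constrMeta traversal over a single meta rule.
    static String generate(String ruleName, List<String> setNames) {
        MetaConstructorSketch visitor = new MetaConstructorSketch();
        visitor.beforeMetaSetRule(ruleName);
        for (String setName : setNames) {
            visitor.beforeName(setName);
        }
        visitor.afterMetaSetRule();
        return visitor.buffer.toString();
    }

    public static void main(String[] args) {
        System.out.print(generate("E01METH00101",
                Arrays.asList("cross_form_set", "lsr_set")));
    }
}
```

The point of the split is that the closing lines can only be emitted after the whole do(...) list has been visited, which is why the constructor is opened in a before hook and closed in an after hook.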

// ****************************************************************
// eelc.beh - Behavior Description for EEL compiler
// ---------------------------------------------------------------
// Copyright (c) 1997 - GTE Laboratories Incorporated
//                       40 Sylvan Road
//                       Waltham, MA 02254
// All rights reserved.
// ---------------------------------------------------------------
// this file describes the different traversals used as well as
// the behavior of the visitor objects, which cooperate to gene-
// rate the output code. the language is Demeter/Java. for more
// information about Demeter/Java please consult:
//   http://www.ccs.neu.edu/home/lieber
// ---------------------------------------------------------------
// date      who       what
// 10-30-97 lblando   created
// ****************************************************************

// once we've parsed an .eel file, we get a huge object graph. the root of this
// graph is the CompilationUnit object. i'll define all the traversals starting from
// here. note that traversals are the only behavior we give this class.
CompilationUnit {
// the allRules traversal is used to do the header file. we go to every rule object
// in the graph but no further (since we don't need to for the class declarations)
traversal allRules(EELVisitor)
{ to {MetaSetRule, SpawnableRuleSetRule, RuleSetRule}; }

// when we are doing the constructor's code, we need to differentiate between
// metarules and normal rules. constrMeta() and constrOther() are two traversals
// that do just that. constrMeta() goes all the way down to the Name class, but
// makes sure we've gone through the MetaSetRule object (that is, we are in a
// meta rule). we need to get down to the Name class because, for meta rules, we
// need to generate the "meta_list.push_back(<<Name>>);" lines in the constructor.
// for the rest of the rules, on the other hand, we do not need to go that far,
// and constrOther stops at the SpawnableRuleSetRule or RuleSetRule objects.
traversal constrMeta(EELVisitor)
{ via MetaSetRule to Name; }
traversal constrOther(EELVisitor)
{ to {SpawnableRuleSetRule, RuleSetRule}; }

// viaRules and viaMetaRules are used in the generation of the Validate() and
// MetaCheck() functions. note that we carry three visitors in our traversals
// to perform the work. the same argument for separating metarule vs. other rule
// traversals as before applies here as well.
traversal viaRules(ValidateVisitor,CommandVisitor,ExpressionVisitor)
{ bypassing MetaSetRule to *; }
traversal viaMetaRules(MetaValidateVisitor,CommandVisitor,ExpressionVisitor)
{ bypassing Rule to *; }
}

// we put some basic behavior in the abstract visitor superclass.
EELVisitor {
init   (@ buffer = ""; @)
// the following two lines are here because of a glitch in Demeter/Java
// (version 0.6.3) that requires empty bodies for superclass visitors in
// order to execute the traversal properly.
before {CompilationUnit,RuleSet,IssueDeclaration,MetaSetRule,RuleSetRule,
          SpawnableRuleSetRule,Name} (@ @)
after {CompilationUnit,RuleSet,IssueDeclaration,MetaSetRule,RuleSetRule,
          SpawnableRuleSetRule,Name} (@ @)
}

// HeaderVisitor --------------------------------------------------------------
// this visitor's responsibility is the creation of the .h file
HeaderVisitor {
// auxiliary function to share code... :-)
void common_part(String name)
(@
     buffer += "class " + name + " : public Rule {\n";
     buffer += " public:\n";
     buffer += "   " + name + "();\n";
     buffer += "   virtual bool Validate(Form*);\n";
   @)

// if we are in a meta rule, then we need to declare MetaCheck() and
// the meta_list member variable
before MetaSetRule (@
     common_part(host.get_name().toString());
     buffer += "   virtual bool MetaCheck(Form*);\n";
     buffer += " private:\n";
     buffer += "   static list<string> meta_list;\n";
     buffer += "};\n";
   @)

// on any other rule, we just need Validate()...
before {RuleSetRule, SpawnableRuleSetRule} (@
     common_part(host.get_name().toString());
     buffer += "};\n";
   @)
}

// RuleConstructorVisitor ------------------------------------------
// used to create the code for all the constructors
RuleConstructorVisitor {
// auxiliary function to print the first part of the constructor
void common_part(String name) (@
     buffer += name + "::" + name + "() : Rule() {\n";
     buffer += "   Name(\"" + name + "\");\n";
   @)

// for a simple rule, common_part(...) is all we need!
before RuleSetRule (@
     common_part(host.get_name().toString());
     buffer += "}\n";
   @)

// for spawnable rules, we need to set the spawn state variable
// to true, so we do it here.
before SpawnableRuleSetRule (@
     common_part(host.get_name().toString());
     buffer += "   Spawn(true);\n";
     buffer += "}\n";
   @)

// when we are entering a meta rule, we start just like any other rule,
// and then we set the Meta() flag. note that we do not 'close' the function
// since we still have not loaded the meta_list variable (because we haven't
// traversed it yet!)
before MetaSetRule (@
     buffer += "list<string> " + host.get_name().toString() + "::meta_list;\n";
     common_part(host.get_name().toString());
     buffer += "   Meta(true);\n";
   @)

// when we are exiting a meta rule, we know that the meta_list variable is
// loaded so we can call SetList() and close out the function.
after MetaSetRule (@
     buffer += "   SetList(&meta_list);\n";
     buffer += "}\n";
   @)

// we will only get to a name object when we are traversing metarules (note: this
// is not a good approach, i know, but is here to show how NOT to do it :-) in
// other words, we know (here) that the only time we will get to a Name object is
// in the traversal that does meta rules ONLY. we can then just output the code
// necessary for the meta case and not do any checks. but were this visitor used
// with some other traversal, the whole thing might fall apart! maybe this is a
// point in favor of the notion of making traversals and visitors self-contained
// (inside a visitor) instead of splitting them, since they seem to be somewhat
// tightly coupled.
before Name (@
     buffer += "   meta_list.push_back( \"" + host.get_name().toString() + "\" );\n";
   @)
}

// ValidateVisitor ---------------------------------------------------------------
// this handles the creation of the Validate() function. it is not used with the
// metarules. (once again, the visitors and traversals, tightly coupled in behavior
// yet not in Demeter :-)
ValidateVisitor {

// once we get to a rule, we do another little traversal from here to
// come up with the .eel representation of this rule *only* and we put it
// as a comment just before we generate the c++ code
before {RuleSetRule,SpawnableRuleSetRule} (@
     java.io.StringWriter sw = new java.io.StringWriter();
     java.io.PrintWriter pw = new java.io.PrintWriter( sw );
     PrintVisitor         pv = new PrintVisitor( pw );
     host.universal_trv0( pv );
     buffer += "// " + sw + "\n";
     buffer += "bool " + host.get_name().toString() + "::Validate(Form* form) {\n";
   @)

// closing a rule is simple, just add a brace
after {RuleSetRule,SpawnableRuleSetRule} (@
     buffer += "}\n\n";
   @)

// the command visitor (cmd) is clever enough to always return bool so all we
// need to do at this level is return that value (note that we do so AFTER the
// traversal has gone 'inside' the command). also, note that before we output
// the command code, we need to check whether any variable initialization
// code is present and, if so, we place it before the command code.
after {AlwaysCmd,IfCmd} (@
     String vars = cmd.get_varbuff();
     if ( !"".equals(vars) ) buffer += vars; // compare contents, not references
     buffer += "   return " + cmd.get_buffer() + ";\n";
   @)
}

// MetaValidateVisitor --------------------------------------------------------
// serves the same function as ValidateVisitor but is used on meta rules only

MetaValidateVisitor {
(@ String host_name; @)

// once we get to a rule, we do another little traversal from here to
// come up with the .eel representation of this rule *only* and we put it
// as a comment just before we generate the c++ code
before MetaSetRule (@
    java.io.StringWriter sw = new java.io.StringWriter();
    java.io.PrintWriter pw = new java.io.PrintWriter( sw );
    PrintVisitor         pv = new PrintVisitor( pw );
    host.universal_trv0( pv );
    buffer += "// " + sw + "\n";
    host_name = host.get_name().toString();
    buffer += "bool " + host_name + "::MetaCheck(Form* form) {\n";
@)

// an if command is tricky, since we need to gain control after the first 'leg'
// of the if (precondition) has been traversed. at this point we close the
// MetaCheck() function and start the Validate() one (notice we clear the
// commandvisitor's buffer before moving on, so we get a clear postcondition
// later). also note the conditional inclusion of [optionally present] variable
// definition code before the command code.
after -> IfCmd,pre_exp,CondExpression (@
    String vars = cmd.get_varbuff();
    if ( !"".equals(vars) ) buffer += vars; // compare contents, not references
    buffer += "   return " + cmd.get_buffer() + ";\n";
    buffer += "}\n";
    cmd.set_buffer( "" );
    cmd.set_varbuff( "" );
    buffer += "bool " + host_name + "::Validate(Form* form) {\n";
@)

// leaving a meta rule means generating the command code for the 'then'
// part of an if command (which we have already traversed). so we check
// to see if we have any local variables to initialize and then we put
// the command code there.
after MetaSetRule (@
    String vars = cmd.get_varbuff();
    if ( !"".equals(vars) ) buffer += vars; // compare contents, not references
    buffer += "   return " + cmd.get_buffer() + ";\n";
    buffer += "}\n\n";
@)
}

// CommandVisitor ------------------------------------------------------------
// handles the traversal at the command level. since there are
// only two commands, 'if' and 'always', its job is very simple. it knows how
// to handle metarules by not adding logic in between the pre and post conditions
CommandVisitor {
(@ boolean in_metarule = false; @)

// in order to handle metarules, we need to make sure we know when we are
// processing one of them. the flag 'in_metarule' will tell us so
before MetaSetRule (@ in_metarule = true; @)
after MetaSetRule (@ in_metarule = false; @)

// before starting any command, we make sure to clear our own buffer as
// well as the buffer of the expression visitor we contain
before {AlwaysCmd,IfCmd} (@
     buffer = ""; varbuff = "";
     ev.set_buffer(""); ev.set_varbuff("");
@)

// an always command is very simple, since the boolean result will
// already be present in the result of the expression visitor, so
// we simply set our buffer to that
after AlwaysCmd (@
     varbuff += ev.get_varbuff();
     buffer += ev.get_buffer();
@)

// the if command is a little bit more complex in that we need to add
// some logic in between the pre and post_conditions. for this reason
// we wait until the traversal has finished the precondition and add
// the glue logic there. notice we only add the logic if we are not
// inside a metasetrule.
after -> IfCmd,pre_exp,CondExpression (@
     varbuff += ev.get_varbuff();
     buffer += ev.get_buffer();
     if ( !in_metarule ) buffer += " ? ";
     ev.set_buffer("");
     ev.set_varbuff("");
@)

// after the postcondition has been generated, we add it to our buffer
// and add some trailing logic to make the whole command a valid c++
// one of the form 'A ? B : true'. note that we do not add logic if
// we are on a meta rule, since these have the logic split into two
// different functions (MetaCheck() and Validate()) instead of one
after -> IfCmd, post_exp,CondExpression (@
     varbuff += ev.get_varbuff();
     buffer += ev.get_buffer();
     if ( !in_metarule ) buffer += " : true";
@)
}

// ExpressionVisitor --------------------------------------------------------
// this is the lowest level (and hardest worker) visitor
// in the chain. its responsibility is to generate code for lower level
// expressions. each expression needs to evaluate to a boolean once finished.
// the expression visitor handles ors, ands, parentheses, and other niceties
ExpressionVisitor {
(@
      boolean flag, condorflag, condandflag, svl_flag, in_extval;
      int local_var_count;
      String current_var_name;
      String evstr;
   @)

// make sure that we reset our or/and flags for the
// second expression in an 'if' command
before -> IfCmd, post_exp,CondExpression (@ condorflag = condandflag = false; @)

// clean up the flags upon entering a command
before Command (@ local_var_count = 0; condorflag = condandflag = false; @)

// wrap any parenthesized expression with "()"
before ParenExpression (@ buffer += "( "; @)
after ParenExpression (@ buffer += " )"; @)

//when we get to an OR node, we clean the or flag
before CondOrExpression (@ condorflag = false; @)

// on an AND node, we check the or flag and only add an '||' if it is
// true (it will be the second time). we also clean the and flag
before CondAndExpression (@
      if ( condorflag ) buffer += " || ";
      condorflag = true; condandflag = false;
   @)

// when we get to the condexpression, we check the and flag and only if it
// is true do we put '&&' on the buffer (this avoids a leading '&&', as in
// '&& a && b'). we also note that we have not reached a static value yet
// and set the and flag to true
before CondExpression (@
      svl_flag = false;
      if ( condandflag ) buffer += " && ";
      condandflag = true;
      // use 'current_var_name' as a flag for SVL and Field to know where to go...
      current_var_name = "";
@)

// fieldname generation is a little tricky, since Demeter/Java does not
// support primitive identifiers with dots in them (and eel identifiers have
// dots in them). a single field is, therefore, a bunch of small parts which
// we need to parse and output as a single unit (enclosed by quotes)
before FieldName (@
    flag = false;
    if (!in_extval) buffer += "\"";
    else            evstr   = "\"";
@)

// close the quotes and exit...
after FieldName (@
    if (!in_extval) buffer += "\"";
    else {
      evstr += "\"";
      varbuff += "   " + current_var_name + ".push_back( " + evstr + " );\n";
      evstr = "";
    }
@)

// only add a dot if we are on the second+ part of the name. note how
// we differentiate between externalvalidate rules and others.
before FieldNamePart (@
     if ( flag ) {
        if (!in_extval) buffer += ".";
        else            evstr += ".";
     }
@)

// once we get to an 'Ident' leaf, we add it to the buffer and set the
// flag to true to signal that we can add dots after this one if we
// have to
before FieldNamePartIdent (@
      if (!flag) flag = true;
      if (!in_extval) buffer += host.get_part_name().toString();
      else            evstr += host.get_part_name().toString();
   @)

// if we've gotten to an index part (ie. lsr.ccna.#1.blah) then we need to
// convert the index (number) to string and add it to the buffer
before FieldNamePartIndex (@
      StringBuffer b = new StringBuffer();
      b.append( host.get_part_index() );
      if (!in_extval) buffer += b;
      else            evstr += b;
   @)

// a star (repeating field) is simply passed 'as is'
before FieldNamePartStar (@
if (!in_extval) buffer += "*";
else evstr += "*";
@)

// if we are in a form field, then we might have indices to denote a
// substring, in this case, we need to initialize the index values to
// zero. if we are in an externalvalidate() expression, then we need
// to add a ',' since we can have many formfields
before FormField (@
idx1 = 0; idx2 = 0;
@)

// after we have traversed an index section, we need to load the
// visitor's idx1 and idx2 fields to remember this value for
// when the traversal exits this particular field. notice we only
// load the idx2 field if it's there. otherwise it's a copy of
// idx1
after Indices (@
    idx1 = host.get_first().intValue();
    idx2 = idx1;
    if ( host.get_second() != null )
      idx2 = host.get_second().intValue();
   @)

// when we exit the field, we need to add the indices. we do this
// all the time except when we have a "static value" field (that is,
// an immediate value such as "AA") or we are generating code for an
// externalvalidate() function
after Field (@
      if (!svl_flag && !in_extval) {
        Integer i1 = new Integer( idx1 );
        Integer i2 = new Integer( idx2 );
        buffer += ", " + i1.toString() + ", " + i2.toString();
      }
@)

// if we are entering a multivaluedcondition, we need to create
// a new local variable to put all the values there. so we do the
// variable creation here (in current_var_name) using the local_var_count
// counter. we also add the variable declaration to the variable buffer
before MultiValuedCondition (@
    current_var_name = new String();
    current_var_name = "var_" + new Integer( local_var_count ).toString();
    varbuff += "   static vector<string> " + current_var_name + ";\n";
@)

// notice we only increment the local_var_count upon exiting the
// multivalued condition, so that we know we have traversed the
// rule with the same value for the variable as was used to define the
// local variable.
after MultiValuedCondition (@ local_var_count++; @)

// generating code for the leaves is simple, this is just the prefix, though
before EmptyCond        (@ buffer += "rf_empty(form, ";          @)
before NotemptyCond     (@ buffer += "!rf_empty(form, ";         @)
before IstimerangeCond (@ buffer += "rf_istimerange(form, ";    @)
before IstimeCond       (@ buffer += "rf_istime(form, ";         @)
before IsdateCond       (@ buffer += "rf_isdate(form, ";         @)
before IsdatetimeCond   (@ buffer += "rf_isdatetime(form, ";     @)
before RequiredCond     (@ buffer += "rf_req(form, ";            @)
before ProhibitedCond   (@ buffer += "!rf_req(form, ";           @)
before InCond           (@ buffer += "rf_in(form, ";    @)
before NotinCond        (@ buffer += "!rf_in(form, ";   @)

// length() is a little tricky
before LengthCond       (@
    buffer += "rf_length(form, ";
    // length is a special case. even though it IS a multivalued condition for
    // convenience in parsing, we do not want to generate the vector<string>,
    // so all we need to do here is flag that we are not generating it and
    // also clear the varbuff (which at this point is already loaded with the
    // declaration of the new variable)
    current_var_name = ""; // flag that we have no variable
    set_varbuff( "" );     // clear varbuff
    local_var_count--;     // decrement this (note: the state is inconsistent from
                           // the moment this statement executes until the increment
                           // of local_var_count happens again (upon exit from
                           // MultiValuedCondition). this violation of the invariant
                           // is harmless in this case)
@)

// some of the above we just close out with a ")" ...
after {EmptyCond, NotemptyCond, IstimeCond, IsdateCond,
IstimerangeCond, IsdatetimeCond, RequiredCond, ProhibitedCond} (@
buffer += ")";
@)

// some others, though, need to be appended with the name of the
// local variable that we are using to load the multiple values
after {InCond, NotinCond, LengthCond} (@
buffer += current_var_name +")";
@)

// some housekeeping, to put commas where they're needed...
// LRB MultiValuedCondition below
after -> FieldAndListCondition,field,Field (@ buffer += ", "; @)

// control the svl_flag invariant
before StaticValueList (@ svl_flag = false; @)
after StaticValueList (@ svl_flag = false; @)

// when we get to a static value, in addition to knowing if we need
// to add a ',' or not, we need to know if we are loading a local variable
// or not. we do the latter with the current_var_name != "" check. depending
// on this we either add to buffer (ie. command) or varbuff (ie. variable
// initialization)
after StaticValueString (@
      // compare string contents with equals(); == only compares references
      if ( svl_flag && "".equals(current_var_name) ) buffer += ", ";
      svl_flag = true;
      if ( "".equals(current_var_name) ) buffer += "\"" + host.get_value() + "\"";
      else varbuff += "   " + current_var_name + ".push_back( \"" + host.get_value() + "\" );\n";
@)
after StaticValueInt (@
      // compare string contents with equals(); == only compares references
      if ( svl_flag && "".equals(current_var_name) ) buffer += ", ";
      svl_flag = true;
      Integer i = new Integer( host.get_value() );
      if ( "".equals(current_var_name) ) buffer += i.toString();
      else varbuff += "   " + current_var_name + ".push_back( \"" + i.toString() + "\" );\n";
   @)

// the comparison conditions are rather simple, since they use support
// functions (to handle repeating fields). note the use of edge methods
// to introduce the proper punctuation.
before EQCond   (@ buffer += "rf_equal(form, "; @)
before NEQCond (@ buffer += "!rf_equal(form, "; @)
before GTCond   (@ buffer += "!rf_lteq(form, "; @)
before LTCond   (@ buffer += "!rf_gteq(form, "; @)
before GEQCond (@ buffer += "rf_gteq(form, ";   @)
before LEQCond (@ buffer += "rf_lteq(form, ";   @)
after -> *,field1,Field (@ buffer += ", ";   @)
after -> *,field2,Field (@ buffer += ")";   @)

// when we get to an extval condition we set the appropriate flag
// to true. we of course unset it upon exiting. on exit, the
// current_var_name contains the name of the local variable we've
// loaded with the name of the fields. notice the edge method to
// insert the id value at the right time.
before ExtValCondition (@
    in_extval = true;
    buffer += "rf_extval(form, \"";
@)
after ExtValCondition (@
    in_extval = false;
    buffer += ", " + current_var_name + ")";
@)
after -> ExtValCondition,id,Ident (@ buffer += dest.toString() + "\""; @)
}

// ParserVisitor ----------------------------------------------------------
// this one takes care of creating the Parser utility class, which does all
// the rule creation and provides 'meta' information about the rules
ParserVisitor {
(@
      Vector issue_names,
             issue_metarulenames,
             issue_rulesetnames,
             issue_rulesetrulenames;
      int issue_count, ruleset_count, rule_count, metarule_count;
   @)

// make sure we initialize everything
init (@
     issue_count = ruleset_count = rule_count = metarule_count = 0;
     issue_names            = new Vector();
     issue_metarulenames    = new Vector();
     issue_rulesetnames     = new Vector();
     issue_rulesetrulenames = new Vector();
@)

// when we get to an issue declaration, reset all the counts and
// add one element to all the vectors to 'make space' for this issue
before IssueDeclaration (@
     ruleset_count = rule_count = metarule_count = 0;
     issue_names.addElement(new String( host.get_name() ));
     issue_metarulenames.addElement(new Vector());
     issue_rulesetnames.addElement(new Vector());
     issue_rulesetrulenames.addElement(new Vector());
@)

// once we're done with this issue, get ready for the next one...
after IssueDeclaration (@
     issue_count++;
@)

// when we get to a meta rule, add its name
before MetaSetRule (@
     Vector v = (Vector)issue_metarulenames.elementAt( issue_count );
     v.addElement(new String( host.get_name().toString()));
@)

// and count it once we're done
after MetaSetRule (@
metarule_count++;
@)

// for a ruleset, we need to create a new vector and also set the name
before RuleSet (@
     rule_count = 0;
     Vector v = (Vector)issue_rulesetnames.elementAt( issue_count );
     v.addElement(new String( host.get_name().toString() ));
     v = (Vector)issue_rulesetrulenames.elementAt( issue_count );
     v.addElement(new Vector());
@)

// and count it once we're done...
after RuleSet (@
ruleset_count++;
@)

// for rules, just add them to the appropriate vector...
before {RuleSetRule,SpawnableRuleSetRule} (@
     Vector v1 = (Vector)issue_rulesetrulenames.elementAt( issue_count );
     Vector v2 = (Vector)v1.elementAt( ruleset_count );
     v2.addElement(new String( host.get_name().toString()));
@)

// and count them, as usual...
after {RuleSetRule,SpawnableRuleSetRule} (@
     rule_count++;
@)

// once we are done with the entire program, we start loading the
// 'body' and 'header' string members from the vectors we've loaded
// during the traversal
after CompilationUnit (@

     // the header is trivial, as it is static.
     header = "#include <gtestl.h>\n\n"
            + "class Rule;\n\n"
            + "class Parser {\n"
            + " public:\n"
            + "    static int    NumOfIssues();\n"
            + "    static int    NumOfCheckersInIssue(const string&);\n"
            + "    static string IssueName(int);\n"
            + "    static int    NumOfMetaRulesInIssue(const string&);\n"
            + "    static string RulesetName(const string&, int);\n"
            + "    static Rule* CreateMetaRule(const string&, int);\n"
            + "    static Rule* CreateRule(const string&, int, int);\n"
            + "    static int    NumOfRulesetsInIssue(const string&);\n"
            + "    static int    NumOfRulesInSet(const string&, int);\n"
            + " private:\n"
            + "    static int    find_issue_index(const string&);\n"
            + "};\n";

     // the body is a different story, since we need to iterate through
     // vectors, create switch statements, etc...
     int i_issue, i_ruleset, i_rule;
     Vector v;
     body = "#include <parser.h>\n"
          + "#include <gterules.h>\n\n";

body += "int Parser::NumOfIssues() { return " + issue_count + "; }\n\n";

body += "int Parser::NumOfCheckersInIssue(const string& a) { return 1; }\n\n";

     body += "string Parser::IssueName(int i) {\n"
          + "   switch (i) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++)
        body += "   case " + i_issue + ": return \"" + (String)issue_names.elementAt(i_issue) + "\";\n";
     body += "   }\n"
          + "   return \"\";\n"
          + "}\n\n";

     body += "int Parser::find_issue_index(const string& issueName) {\n"
          + "   for (int i=0; i < NumOfIssues(); i++)\n"
          + "      if ( IssueName(i) == issueName ) return i;\n"
          + "   return -1;\n"
          + "}\n\n";

     body += "int Parser::NumOfMetaRulesInIssue(const string& iname) {\n"
          + " int i = find_issue_index( iname );\n"
          + " if ( i != -1 ) {\n"
          + "   switch( i ) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++) {
       v = (Vector)issue_metarulenames.elementAt( i_issue );
       body += "   case " + i_issue + ": return " + v.size() + ";\n";
     }
     body += "   }\n"
          + " }\n"
          + " return 0;\n"
          + "}\n\n";

     body += "string Parser::RulesetName(const string& iname, int rsetidx) {\n"
          + "   int i = find_issue_index( iname );\n"
          + "   if ( i != -1 ) {\n"
          + "      switch ( i ) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++) {
        v = (Vector)issue_rulesetnames.elementAt( i_issue );
        body += "      case " + i_issue + ":\n"
             + "         switch( rsetidx ) {\n";
        for (i_ruleset = 0; i_ruleset < v.size(); i_ruleset++) {
           body += "         case " + i_ruleset + ": return \"" + (String)v.elementAt(i_ruleset) + "\";\n";
        }
        body += "         }\n";
     }
     body += "      }\n"
          + "   }\n"
          + "   return \"\";\n"
          + "}\n\n";

     body += "Rule* Parser::CreateMetaRule(const string& iname, int mruleidx) {\n"
          + "   int i = find_issue_index( iname );\n"
          + "   if ( i != -1 ) {\n"
          + "      switch ( i ) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++) {
        v = (Vector)issue_metarulenames.elementAt( i_issue );
        body += "      case " + i_issue + ":\n"
             + "         switch( mruleidx ) {\n";
        for (i_ruleset = 0; i_ruleset < v.size(); i_ruleset++) {
           body += "         case " + i_ruleset + ": return new " + (String)v.elementAt(i_ruleset) + "();\n";
        }
        body += "         }\n";
     }
     body += "      }\n"
          + "   }\n"
          + "   return 0;\n"
          + "}\n\n";

     body += "Rule* Parser::CreateRule(const string& iname, int rsetidx, int ruleidx) {\n"
          + "   int i = find_issue_index( iname );\n"
          + "   if ( i != -1 ) {\n"
          + "      switch ( i ) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++) {
        v = (Vector)issue_rulesetrulenames.elementAt( i_issue );
        body += "      case " + i_issue + ":\n"
             + "         switch( rsetidx ) {\n";
        for (i_ruleset = 0; i_ruleset < v.size(); i_ruleset++) {
           body += "         case " + i_ruleset + ":\n"
                + "            switch ( ruleidx ) {\n";
           Vector v2 = (Vector)v.elementAt(i_ruleset);
           for (i_rule = 0; i_rule < v2.size(); i_rule++) {
              body += "             case " + i_rule + ": return new " + (String)v2.elementAt(i_rule) + "();\n";
           }
           body += "            }\n";
        }
        body += "         }\n";
     }
     body += "      }\n"
          + "   }\n"
          + "   return 0;\n"
          + "}\n\n";

     body += "int Parser::NumOfRulesetsInIssue(const string& iname) {\n"
          + "   int i = find_issue_index(iname);\n"
          + "   if ( i != -1 ) {\n"
          + "      switch ( i ) {\n";
     for (i_issue = 0; i_issue < issue_count; i_issue++) {
        v = (Vector)issue_rulesetnames.elementAt(i_issue);
        body += "      case " + i_issue + ": return " + v.size() + ";\n";
     }
     body += "      }\n"
          + "   }\n"
          + "   return 0;\n"
          + "}\n\n";

     body += "int Parser::NumOfRulesInSet(const string& iname, int rsetidx) {\n"
          + "   int i = find_issue_index( iname );\n"
          + "   if ( i != -1 ) {\n"
          + "      switch ( i ) {\n";
     for (i_issue = 0; i_issue < issue_names.size(); i_issue++) {
        v = (Vector)issue_rulesetrulenames.elementAt( i_issue );
        body += "      case " + i_issue + ":\n"
             + "         switch( rsetidx ) {\n";
        for (i_ruleset = 0; i_ruleset < v.size(); i_ruleset++) {
           Vector v2 = (Vector)v.elementAt(i_ruleset);
           body += "         case " + i_ruleset + ": return " + v2.size() + ";\n";
        }
        body += "         }\n";
     }
     body += "      }\n"
          + "   }\n"
          + "   return 0;\n"
          + "}\n\n";
@)
}

// the main class. simply start up the work...
Main {

// an auxiliary function to generate copyright info and stuff
static public String header(String file_name, String desc)
(@
   String buffer = new String();
   String fname = file_name.replace( '.', '_' );
   buffer += "//=======================================================================\n";
   buffer += "// " + file_name + "\n";
   buffer += "//   - " + desc + "\n";
   buffer += "//-----------------------------------------------------------------------\n";
   buffer += Main.copyright();
   buffer += "//-----------------------------------------------------------------------\n";
   buffer += "// this is an automatically generated file. it was generated on:\n";
   Date date = new Date();
   buffer += "// " + date.toString() + "\n";
   buffer += "//=======================================================================\n";
   buffer += "#ifndef _" + fname + "_\n";
   buffer += "#define _" + fname + "_\n";
   buffer += "\n";
   return buffer;
@)

// another one for the end of a file
static public String footer(String file_name)
(@
   String buffer = new String();
   buffer += "#endif /* " + file_name + " */\n";
   return buffer;
@)

// actually, this is the one that does the copyright! :-)
static public String copyright()
(@
   String buffer = new String();
   buffer += "// (C) 1997 - GTE Laboratories Incorporated\n";
   buffer += "//            40 Sylvan Road, MS40\n";
   buffer += "//            Waltham, MA 02254\n";
   buffer += "// * All rights reserved *\n";
   buffer += "//-----------------------------------------------------------------------\n";
   buffer += "// Generator created by Luis Blando using Demeter/Java AOOP              \n";
   buffer += "// Comments, suggestions, complaints to lblando@gte.com                  \n";
   return buffer;
@)

// and this one saves all the strings to the files.
static public void my_save(String fname, String fcontents)
(@
    System.out.println(" saving " + fname);
    File file;
    try {
       file = new File( fname );
    }
    catch (NullPointerException e) {
       System.out.println("Error: Cannot write to: " + fname);
       return;
    }
    FileOutputStream fStr;
    PrintWriter pStr;
    try {
       fStr = new FileOutputStream( file );
       pStr = new PrintWriter( fStr );
       pStr.println( fcontents );
       pStr.flush();
       fStr.close();
    }
    catch (IOException e) {
       System.out.println("Error: could not save file " + fname);
       return;
    }
@)

// main function
(@
   static public void main(String args[]) throws Exception
   {
     // parse from stdin. TODO: add reading from file
     CompilationUnit pgm = CompilationUnit.parse(System.in);

     // some banner stuff
     System.out.println("EEL2CPP Compiler v2.0 - (c) 1997 - GTE Labs");
     System.out.println("(Done using Demeter/Java by Luis Blando)");

     // create all the class declarations...
     HeaderVisitor hv = new HeaderVisitor();
     System.out.println("Forming the class declarations...");
     pgm.allRules( hv );

     // now do the constructors. notice how we use the same
     // visitor in two different traversals
     RuleConstructorVisitor rcv = new RuleConstructorVisitor();
     System.out.println("Forming all the rule constructors...");
     pgm.constrMeta ( rcv );
     pgm.constrOther( rcv );

     // create all the visitors needed for Validate()...
     ValidateVisitor   vv = new ValidateVisitor();
     CommandVisitor    cv = new CommandVisitor();
     ExpressionVisitor ev = new ExpressionVisitor();
     // link them together
     vv.set_cmd( cv );
     cv.set_ev ( ev );
     System.out.println("Doing Validate() functions...");
     // do all the validates
     pgm.viaRules(vv,cv,ev);

     // now repeat it for metarules
     MetaValidateVisitor mvv = new MetaValidateVisitor();
     // note we reuse cv and ev.
     mvv.set_cmd( cv );
     System.out.println("Doing Metarules validates...");
     pgm.viaMetaRules( mvv, cv, ev );

     // now traverse again to do the parser class. note that
     // we reuse the traversal
     ParserVisitor parser = new ParserVisitor();
     pgm.allRules( parser );

     // once we've finished traversing, collect all the info
     // into four buffers...
     String constructors = rcv.get_buffer();
     String validates    = mvv.get_buffer() + vv.get_buffer();
     String bodies       = validates;
     String gterules_h = Main.header( "gterules.h", "class declarations for rule objects" )
                       + "#include <rules.h>\n"
                       + hv.get_buffer()
                       + Main.footer( "gterules.h" );

     String gterules_cpp = Main.header( "gterules.cpp", "class definitions for rule objects" )
                       + "#include <gterules.h>\n\n"
                       + "// constructors -----------------------------------------\n"
                       + constructors
                       + "\n// Validate/MetaCheck functions -----------------------\n"
                       + bodies
                       + Main.footer( "gterules.cpp" );

     String parser_h = Main.header( "parser.h", "declaration of parser utility class" )
                       + parser.get_header()
                       + Main.footer( "parser.h" );

     String parser_cpp = Main.header( "parser.cpp", "definitions of parser utilities" )
                       + parser.get_body()
                       + Main.footer( "parser.cpp" );

     // create the output files... yeah, i know, the filenames are hard-coded! :-)
     System.out.println("Creating output files...");
     Main.my_save("parser.h",     parser_h);
     Main.my_save("parser.cpp",   parser_cpp);
     Main.my_save("gterules.h",   gterules_h);
     Main.my_save("gterules.cpp", gterules_cpp);
     System.out.println("Done!");
   }
@)
}
// ************************************************************************************
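
The flag-based separator logic in ExpressionVisitor is easy to lose in a listing this long, so here is a minimal plain-Java sketch of the same idea. It is not part of the compiler: the class name SeparatorFlagsSketch, the method names, and the nested-list representation of an expression (an OR of ANDs of leaf conditions) are assumptions made for illustration only. Only the three flag rules are taken from the listing.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the or/and flag handling in ExpressionVisitor.
// An expression is modelled as a list (OR) of lists (AND) of leaf strings.
class SeparatorFlagsSketch {
    private final StringBuilder buffer = new StringBuilder();
    private boolean condorflag;
    private boolean condandflag;

    // mirrors 'before CondOrExpression': a new OR group starts clean
    void beforeOr() { condorflag = false; }

    // mirrors 'before CondAndExpression': emit "||" only from the second AND on
    void beforeAnd() {
        if (condorflag) buffer.append(" || ");
        condorflag = true;
        condandflag = false;
    }

    // mirrors 'before CondExpression': emit "&&" only from the second leaf on
    void beforeCond(String leaf) {
        if (condandflag) buffer.append(" && ");
        condandflag = true;
        buffer.append(leaf);
    }

    // Hand-written walk; in Demeter/Java the traversal would be generated.
    static String generate(List<List<String>> orOfAnds) {
        SeparatorFlagsSketch v = new SeparatorFlagsSketch();
        v.beforeOr();
        for (List<String> and : orOfAnds) {
            v.beforeAnd();
            for (String leaf : and) v.beforeCond(leaf);
        }
        return v.buffer.toString();
    }

    public static void main(String[] args) {
        List<List<String>> expr = Arrays.asList(
            Arrays.asList("a", "b"),
            Arrays.asList("c"));
        System.out.println(generate(expr)); // prints: a && b || c
    }
}
```

The point of the flags is that each node only needs to know whether it is the first of its siblings; no node has to look ahead or back in the tree.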

Compiling and Executing

Once the above two files have been generated, the Demeter/Java compiler (demjava) needs to be executed. This tool generates a number of Java classes which are ultimately compiled (using javacc and javac) to bytecode for execution by the JVM. The following command generates and compiles the system:

% demjava eelc

Last, the EEL compiler is run by feeding it a .eel file on standard input. The four files gterules.h, gterules.cpp, parser.h, and parser.cpp are generated in the current directory. The following command runs the EEL compiler on a file called gterules.eel; the lines after it are what the compiler prints to stdout.

% java Main < gterules.eel
EEL2CPP Compiler v2.0 - (c) 1997 - GTE Labs
(Done using Demeter/Java by Luis Blando)
Forming the class declarations...
Forming all the rule constructors...
Doing Validate() functions...
Doing Metarules validates...
Creating output files...
saving parser.h
saving parser.cpp
saving gterules.h
saving gterules.cpp
Done!

Some Observations

By far, the most impressive measure of the effectiveness of Demeter/Java is the development time required. I was able to implement the solution presented here in about 7 days of full-time work, which includes the time it took to learn Demeter/Java, some Java, JavaCC, etc. We cannot directly compare this figure with the 4 man-months quoted above, mainly because some of the early 'sink-in' time with regard to the domain was not necessary in the second iteration and, more importantly, because I did not develop the first iteration of the parser. Nevertheless, the reduction in the required time is large enough to be 'robust' to a great degree of 'error correction' :-)

Another interesting, albeit inaccurate, measure is to compare the number of lines of code necessary. In both cases I have filtered out comments. The previous version of the compiler has about 2000 lines of code; the Demeter/Java version has 281. We need to take into account, however, that the previous version of the compiler used limited support functions and therefore more code needed to be generated. On the other hand, I am an extremely inexperienced Demeter/Java programmer and there are surely better ways to perform the same functions. Leveling functionality to put it on par with the previous version of the parser would not be, imho, detrimental to the figures presented here. As a matter of fact, a re-engineering of the entire process might yield even further reduction in code size (it happened as I was going through the iterations and learning as I went along...)

Demeter/Java's 'adaptability' was somewhat put to the test when, after most of the system was completed, a change in the underlying class structure was necessary. At that time, changes at the .cd and .beh level were minimal (marked with //LRB in the code), although conceptually the changes were kind of involved, since I changed inheritance links, and added a new class.

Execution time, as was expected, increased substantially. The previous version of the parser is a compiled binary which, right from the start, executes much faster than Java bytecode. Furthermore, I believe (though, as usual, can't prove) that programming exclusively in a traversal/visitor style leads to somewhat inefficient code (i.e. lots of calls to empty functions, lots of traversals that are not needed). Granted, with more training in Demeter/Java, the traversals could have been tuned and execution time would have improved. Since the EEL compiler is not time-constrained, these deficiencies were not important (although they would be critical if Demeter/Java were to be used for other applications).

It is very interesting to note that the 'extensions' to the standard Visitor pattern (as defined in the GoF book) that Demeter/Java proposes turn out to be extremely useful. For instance, in many cases I found myself needing to differentiate (i.e. have two different behaviors) between 'entering' an object and 'leaving' an object (having done the inside traversal in between). Also, the fact that Demeter/Java allows attaching behavior to edges is critical for several of the pieces of the compiler. Let this be clear: I am not claiming that it would be impossible to solve the problem without these niceties (after all, it's all 1s and 0s, right?), I am just reporting that these features came in pretty handy when I needed them.
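
To make the 'entering' versus 'leaving' distinction and the edge methods concrete, here is a small hand-written Java analogue. It is only a sketch under stated assumptions: Demeter/Java generates the traversal from the class dictionary, whereas here the walk is coded by hand, and all names (EdgeHooksSketch, Cond, IfCmd, Hooks, TernaryEmitter) are hypothetical. The emitted shape 'A ? B : true' is the one described in the CommandVisitor comments.

```java
// Hand-written analogue of before/after methods and edge methods.
// All class and method names here are hypothetical, for illustration only.
class EdgeHooksSketch {
    static class Cond {
        final String code;
        Cond(String code) { this.code = code; }
    }

    static class IfCmd {
        final Cond pre, post;
        IfCmd(Cond pre, Cond post) { this.pre = pre; this.post = post; }

        // The traversal owns the order of the walk and fires the hooks.
        void traverse(Hooks v) {
            v.beforeIfCmd();   // 'entering' the object
            v.visitCond(pre);
            v.afterPreEdge();  // edge hook: IfCmd -> pre_exp has been walked
            v.visitCond(post);
            v.afterPostEdge(); // edge hook: IfCmd -> post_exp has been walked
            v.afterIfCmd();    // 'leaving' the object
        }
    }

    interface Hooks {
        void beforeIfCmd();
        void visitCond(Cond c);
        void afterPreEdge();
        void afterPostEdge();
        void afterIfCmd();
    }

    // Builds "return A ? B : true;" purely from hook callbacks.
    static class TernaryEmitter implements Hooks {
        final StringBuilder buffer = new StringBuilder();
        public void beforeIfCmd()     { buffer.setLength(0); }
        public void visitCond(Cond c) { buffer.append(c.code); }
        public void afterPreEdge()    { buffer.append(" ? "); }
        public void afterPostEdge()   { buffer.append(" : true"); }
        public void afterIfCmd()      { buffer.insert(0, "return ").append(";"); }
    }

    static String emit(IfCmd cmd) {
        TernaryEmitter v = new TernaryEmitter();
        cmd.traverse(v);
        return v.buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(emit(new IfCmd(new Cond("A"), new Cond("B"))));
        // prints: return A ? B : true;
    }
}
```

Without the two edge hooks, the visitor would have to track by itself which of the two conditions it is currently in; with them, the glue text is attached exactly where the traversal crosses from one part to the next.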

With regard to the 'internals' of Demeter/Java, it seems to me that the issue of "one intelligent visitor vs. multiple simpler ones" is still undecided. I've used (to some extent) both approaches and both seemed useful. On the issue of making traversals and visitors more tightly coupled, however, I have a firmer opinion. There were several points in this project (some noted in the code, some not) where the behavior of the visitors implied an 'assumption' about the type of traversal that the visitor was going to ride on. This 'dependency' is, imho, very dangerous, because it compromises the future correctness of the program (for instance, what if I later change the traversal just a little bit because some different visitor needs it?). It makes the program brittle. Therefore, based on the limited experience I've had with Demeter/Java, I think that merging traversals with visitors makes a lot of sense.
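
The brittleness described above can be shown with a deliberately tiny Java sketch. It is a hypothetical reduction, not code from the compiler: the Rule class, the NameCollector visitor, and the two hand-written traversals are assumptions chosen to mimic a visitor that records every name it is shown and relies on the traversal to show it only meta rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reduction of a visitor that silently depends on its traversal.
class CouplingSketch {
    static class Rule {
        final String name;
        final boolean meta;
        Rule(String name, boolean meta) { this.name = name; this.meta = meta; }
    }

    // The visitor does no checks of its own: it trusts the traversal.
    static class NameCollector {
        final List<String> metaList = new ArrayList<String>();
        void beforeName(Rule r) { metaList.add(r.name); }
    }

    // The traversal the visitor was written for: only meta rules are reached.
    static void viaMetaRules(List<Rule> rules, NameCollector v) {
        for (Rule r : rules) if (r.meta) v.beforeName(r);
    }

    // A slightly different traversal: every rule is reached.
    static void allRules(List<Rule> rules, NameCollector v) {
        for (Rule r : rules) v.beforeName(r);
    }

    static List<String> run(boolean metaOnlyTraversal) {
        List<Rule> rules = Arrays.asList(
            new Rule("M1", true), new Rule("R1", false), new Rule("R2", false));
        NameCollector v = new NameCollector();
        if (metaOnlyTraversal) viaMetaRules(rules, v);
        else                   allRules(rules, v);
        return v.metaList;
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // prints: [M1]
        System.out.println(run(false)); // prints: [M1, R1, R2]
    }
}
```

The visitor code is identical in both runs; only the traversal changed, and the 'meta list' is silently wrong in the second one. Nothing in the visitor's text records the assumption it makes.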

Concluding Remarks

This report has described my experience in re-implementing a simple compiler using adaptive object oriented programming ideas and Demeter/Java. The gains in development time, simplicity of the code, and extensibility features were remarkable. The project has been a sound success.