Functional Modeling with EMF, Xtext, Groovy and Scala: 2011

During the development of the awesome Xtext Protected Regions Support created by Daniel Dietrich, Daniel asked a really interesting question:

Just interested in your opinion / your 2 cents:
It's best practice to separate generated from non-generated code (when it comes to putting it into a version control system).
Therefore people use the generation gap pattern - the generated class is a base class (or something similar), and the manually written / generated-once classes implement the generated classes.
When using protected regions, all files (including the generated ones) have to be checked in. Some say this ends up in a versioning disaster.
I prefer the protected regions approach. The base classes of the GGP nearly double the number of classes. This is a technical vehicle which makes no sense for the application. But how could it be avoided that generated classes are checked in?
My only answer is that the generated code has to be cut out of the files. Empty files will disappear, protected regions are preserved. When checking out files, the generator has to be started. Before checking in files, the generated code has to be stripped (into a special dir which will be checked in?).
This sounds like a technical vehicle, too. Would that make sense?

To which I replied (note that this is just my opinion, and I don't claim to be an expert, so feel free to challenge my assumptions and prove me wrong!) :-)

I have nothing against GAP. It definitely has its valid uses. So for certain problems, GAP is a valid solution, and is even preferable to regions.
About versioning, I'm not sure it's a "disaster". Redundant maybe, but dangerous? I don't think so. Not checking in generated files may be preferable where the source DSL files are "authoritative" and the target files are "final".

Let me illustrate: wsdlimport. The WSDL file is an authoritative source. The generator is stable. The generated Java proxy classes are "dumb" target files. You can regenerate them anytime, with no need for customization. And the generator never changes, so you can be sure the generated files will actually compile and work. No need to check in the Java targets.

Xtext can be used for projects of that kind, and it makes them very easy.

There are also projects that fall in the middle: they need some customization. So you do GAP: generate the base class and the customizable subclass. This technique is excellent for some cases, but fails when:

You're restricted by the class hierarchy in some way.
The target language doesn't support subclassing, say XML. Or, in my case, yet another custom DSL.
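
For reference, the GAP split described above might look roughly like this in Java. It's only a sketch: CustomerBase, Customer and displayLabel are hypothetical names made up for illustration, not the output of any real generator.

```java
// Hypothetical example: every class and method name here is illustrative.

// Regenerated on every generator run; never edited by hand.
abstract class CustomerBase {
    private String name;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    // Hook that the hand-written subclass is expected to implement.
    public abstract String displayLabel();
}

// Generated once (or written by hand), then owned by the developer.
class Customer extends CustomerBase {
    @Override
    public String displayLabel() {
        return "Customer: " + getName();
    }
}

class GapDemo {
    public static void main(String[] args) {
        Customer c = new Customer();
        c.setName("Alice");
        System.out.println(c.displayLabel()); // prints "Customer: Alice"
    }
}
```

Note how the split relies entirely on subclassing, which is exactly why it breaks down in the two situations listed above.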

There are also cases where GAP technically works, but the generated class structure is complex enough that separating the base class from the actual class makes it "unnatural". I'm sure you've seen stuff like that. Oh, let me give a concrete example from one of my prototypes:

def StringConcatenation genMainFile(String fileName) {
    val result = new StringConcatenation() 
    result.append( augmenter.beforeMainFile(fileName) )
    result.append( augmenter.aroundMainFile(fileName, [fn1 | {
        doMainFile(fn1, [ fn2 | augmenter.innerMainFile(fn2, [fn3 | {
            genImportBlock()
            genClassBlock()
            null
        }]) ])
    }]) )
    result.append( augmenter.afterMainFile(fileName) )
    result
}

Mind you, the above code actually works, but it's damn ugly! At least with protected regions there will only be (marker) comments. And in most editors comments can be folded so they're not so distracting, which makes it much more bearable.
Some may say "it's only the base class, nobody will touch it". On the contrary, you'll see those structural methods on your debugging stack traces, just "perfect" at the time when you're in deep need of clear & cohesive program structure, but oh... you're buried in nested method calls. :-( AspectJ/AOP/weaving also has this problem, though it's manageable to some extent. They only fall apart when pushed too far, so moderate use is OK.

The third class is projects that require extensive customization. The source DSL makes up only ~20% of the target: it provides structure or a supporting form and leaves places to fill in, and these are filled in by the programmer or by some other DSL.

In my (currently hypothetical, but hopefully soon realized) project, the Entity->UI generator will not generate JSF/GWT forms/pages directly, but will generate to an intermediate UI DSL. (You can argue this is Model-to-Model, not M2T, but hey, textual models are much easier to inspect/hack than something buried in XML!) The UI DSL can then be processed to generate JSF or GWT.

I'm not really interested in supporting class inheritance in a UI DSL. In fact, the "class" concept itself may not exist in a UI domain; it only matters to the OO world. So protected regions are the only option, and thankfully Xtext supports comments by default.

With that, it's possible to customize the generated UI DSL files right in the places where they're needed.
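
To make the idea concrete, here is a minimal sketch of how hand-edited regions could be carried over when a file is regenerated. Everything in it is an assumption for illustration: the marker format ("// PROTECTED REGION ID(...) START" / "// PROTECTED REGION END"), the class ProtectedRegionMerger and the little UI DSL text are made up, and this is not the API of the Xtext Protected Regions Support. It also assumes Unix line endings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ProtectedRegionMerger {
    // Assumed marker format (hypothetical):
    //   // PROTECTED REGION ID(<id>) START
    //   ...hand-written content...
    //   // PROTECTED REGION END
    private static final Pattern REGION = Pattern.compile(
        "(// PROTECTED REGION ID\\(([^)]+)\\) START\\n)(.*?)(// PROTECTED REGION END)",
        Pattern.DOTALL);

    // Collects the body of every protected region, keyed by region id.
    static Map<String, String> extractRegions(String text) {
        Map<String, String> regions = new LinkedHashMap<>();
        Matcher m = REGION.matcher(text);
        while (m.find()) {
            regions.put(m.group(2), m.group(3));
        }
        return regions;
    }

    // Takes the freshly generated text and puts back the region bodies
    // found in the previous version of the file. Regions that did not
    // exist before keep their generated default body.
    static String merge(String previousFile, String freshlyGenerated) {
        Map<String, String> kept = extractRegions(previousFile);
        Matcher m = REGION.matcher(freshlyGenerated);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String body = kept.getOrDefault(m.group(2), m.group(3));
            m.appendReplacement(out,
                Matcher.quoteReplacement(m.group(1) + body + m.group(4)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String previous =
            "panel Customer {\n"
          + "  // PROTECTED REGION ID(header) START\n"
          + "  label \"Hand-made\"\n"
          + "  // PROTECTED REGION END\n"
          + "}\n";
        String fresh =
            "panel Customer {\n"
          + "  field name\n"
          + "  // PROTECTED REGION ID(header) START\n"
          + "  // PROTECTED REGION END\n"
          + "}\n";
        // The new "field name" line and the hand-made label both survive.
        System.out.print(merge(previous, fresh));
    }
}
```

Because the markers are plain comments, this works for any target language that has comments, with no class hierarchy required.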

Another use case that I'm exploring is two or more generators (which may or may not be sourcing the same model) generating to the same file, but in different regions. And the FSA should automagically merge them.

And then, add to the mix that the generators themselves are in constant development. That means the same source DSL, when processed, may yield target files that are broken, uncompilable, buggy, etc. And the target files are needed because they form the foundation of yet another project, so unless the previous "working" target files can be recovered, the development of the derived project effectively stops.

For those use cases, would I check in the target files? Of course.

With all of the above said, I have nothing against GAP or against not checking in generated files. Both may be common, but I believe there are classes of problems where they're inappropriate.

To learn Modeling with Eclipse Modeling Framework (EMF), I highly recommend the book EMF: Eclipse Modeling Framework.

Clojure, a functional programming language for the JVM, has powerful, mind-bending features.

The feature that first caught my interest is its ability to "execute data as code".

As demonstrated here, where I define a function process that basically evaluates the symbol processor with whatever params it is given:

=> (defn process [& params] (eval (cons processor params)))

#'user/process

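Note that processor holds data, not code. (In a fresh REPL session it has to be defined before process, otherwise the defn fails to resolve the symbol.) Let's say: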

=> (def processor `java.awt.Point.)

#'user/processor

So I defined processor as a quoted "java.awt.Point." which is equivalent to "new java.awt.Point()" in Java. However, the expression is not executed, it is stored in processor.

Now I can call the process function which in turn creates a new instance of Point with my provided "constructor arguments":

=> (process 12 34)

#< Point java.awt.Point[x=12,y=34] >

The reason why I quoted "constructor arguments" above, is because I can define processor to any other function, not necessarily "constructors":

=> (def processor `Math/min)

#'user/processor

=> (process 12 34)

12

Which is the same as:

=> (eval (list processor 12 34))
12

"So it's just eval," you say. How is it different from eval()-ing expressions in strings, as with JSR-223?

Well, in Clojure (or any Lisp dialect) it's not eval-ing an arbitrary string, but evaluating a data structure which is defined in Clojure itself. In other words, the internal representation of code corresponds exactly to the external representation. This characteristic is known as homoiconicity.

In other programming languages, the structure of the code is known as its Abstract Syntax Tree (AST). Well, in Clojure, the "AST" can be defined, read, and evaluated in Clojure's own data structures, without needing "parsing" and transient storage (strings).
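
For readers coming from Java, here is a rough analogy of "evaluating a data structure instead of a string". It's only a sketch under loud assumptions: Java is not homoiconic, this is not how Clojure works internally, and ListEval, its ENV table and its cons helper are hypothetical names invented for this example. Symbols are modelled as plain Strings and every function takes exactly two integer arguments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class ListEval {
    // Hypothetical "environment": maps a symbol name to a two-argument function.
    static final Map<String, BinaryOperator<Integer>> ENV = Map.of(
        "min", Integer::min,
        "max", Integer::max,
        "+", Integer::sum);

    // An Integer evaluates to itself; a List is read as (symbol arg1 arg2),
    // where each argument may itself be a nested list.
    static int eval(Object form) {
        if (form instanceof Integer) {
            return (Integer) form;
        }
        List<?> list = (List<?>) form;
        BinaryOperator<Integer> fn = ENV.get((String) list.get(0));
        return fn.apply(eval(list.get(1)), eval(list.get(2)));
    }

    // Rough counterpart of Clojure's (cons head tail): prepend an element to a list.
    static List<Object> cons(Object head, List<?> tail) {
        List<Object> result = new ArrayList<>();
        result.add(head);
        result.addAll(tail);
        return result;
    }

    public static void main(String[] args) {
        String processor = "min";                      // data, not code
        List<Object> form = cons(processor, List.of(12, 34));
        System.out.println(eval(form));                // prints 12
    }
}
```

The "program" here is an ordinary list that can be built, inspected and rewritten before it is evaluated; no string is ever parsed. The difference is that in Clojure this comes for free, because the language's own syntax already is that list.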

Wednesday, October 5, 2011

Generation Gap Pattern vs Protected Regions in Xtext MDD

Tuesday, July 12, 2011

eval-ing in Clojure: Executing Data as Code

Thursday, June 23, 2011

Brand New Functional Programming & EMF Modeling Blog!