Friday, April 6, 2012

Tackling Reverse Code Generation

Reverse Code Generation has been kindly discussed in depth by my friend Daniel Dietrich.

To review, the goal is to make it possible to generate 2 artifacts (generator templates and generated application) from 2 artifacts (model and prototype application).

Given this pseudo-Java prototype application:

public void runHello(String[] args) {
  System.out.println("Welcome to Hello World!");
}

And this model:

appName "Hello"
appTitle "Hello World"

I'd like to generate (or perhaps, "reverse generate") this generator template (in StringTemplate syntax):

public void run$appName$(String[] args) {
  System.out.println("Welcome to $appTitle$!");
}

You can think of the above as a representation of the "template" model too. (A template is data, therefore a template is a model, right?)

Now given the actual application model:

appName "Car"
appTitle "Ultimate Car Selling"

This model plus the generated generator template would ultimately generate the final application:

public void runCar(String[] args) {
  System.out.println("Welcome to Ultimate Car Selling!");
}
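
To make the round trip above concrete, here is a minimal Java sketch. It assumes the simplest possible strategy, literal string replacement, and all class and method names (ReverseTemplateSketch, reverseGenerate, generate) are made up for illustration. It is not StringTemplate itself, it only borrows the $name$ placeholder convention.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReverseTemplateSketch {

    /** Prototype text + prototype model -> template with $key$ placeholders. */
    static String reverseGenerate(String prototype, Map<String, String> model) {
        List<Map.Entry<String, String>> entries = new ArrayList<>(model.entrySet());
        // Replace longer values first, so "Hello World" is matched before "Hello".
        entries.sort(Comparator.comparingInt(
                (Map.Entry<String, String> e) -> e.getValue().length()).reversed());
        String template = prototype;
        for (Map.Entry<String, String> e : entries) {
            template = template.replace(e.getValue(), "$" + e.getKey() + "$");
        }
        return template;
    }

    /** Template + actual model -> generated text (ordinary forward generation). */
    static String generate(String template, Map<String, String> model) {
        String out = template;
        for (Map.Entry<String, String> e : model.entrySet()) {
            out = out.replace("$" + e.getKey() + "$", e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        String prototype = "public void runHello(String[] args) {\n"
                + "  System.out.println(\"Welcome to Hello World!\");\n"
                + "}";
        Map<String, String> protoModel = new LinkedHashMap<>();
        protoModel.put("appName", "Hello");
        protoModel.put("appTitle", "Hello World");

        Map<String, String> carModel = new LinkedHashMap<>();
        carModel.put("appName", "Car");
        carModel.put("appTitle", "Ultimate Car Selling");

        String template = reverseGenerate(prototype, protoModel);
        System.out.println(template);
        System.out.println(generate(template, carModel));
    }
}
```

The obvious weakness of this naive approach is that any accidental occurrence of a model value (say, "Hello" inside an unrelated identifier) would also be turned into a placeholder.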


Is it Worth It?

My first question: even if all this is possible and practical, is it worth all the hassle?

My hunch tells me that right now, at best it's seldom used and at worst exotic.

And then about limitations. It's like projecting 3D onto 2D and then trying to get 3D back out of that 2D. It's not just challenging, but some information is lost.

This information can be recorded somewhere (like metadata for generated code), but if we also use features like protected regions it becomes even more challenging. (Some heuristics would help, though.)

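Then there are loops and conditionals: once a generator has unrolled a loop or picked a branch, the output alone no longer shows that the construct was ever there.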

The Solution?

The way I think about it right now is that reverse code generation would be like a submodule inside a greater code generation task. Kinda like protected regions. So the reverse generator does not work globally, but only in specific parts of the project. And for those parts, we know that there are no loops or conditionals or complex constructs. UPCASE and down_case transformations etc. may still be practical.
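
As a rough illustration of why case transformations may still be practical, here is a small Java sketch that also recognizes the upper-case and lower-case variants of a model value. The placeholder suffixes :upper and :lower are a notation I made up for this sketch, not StringTemplate syntax, and the class name is hypothetical.

```java
import java.util.Locale;

final class CaseVariantSketch {

    /** Replaces a model value and its UPPER/lower variants with placeholders. */
    static String reverseGenerate(String prototype, String key, String value) {
        return prototype
                // The exact value first, then its case variants.
                .replace(value, "$" + key + "$")
                .replace(value.toUpperCase(Locale.ROOT), "$" + key + ":upper$")
                .replace(value.toLowerCase(Locale.ROOT), "$" + key + ":lower$");
    }

    public static void main(String[] args) {
        String prototype = "class HelloApp { static final String HELLO_ID = \"hello\"; }";
        System.out.println(reverseGenerate(prototype, "appName", "Hello"));
    }
}
```

This works because a case transformation is a pure function of a single model value, so the reverse direction only has to test a handful of known variants.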

The key lies in the prototype application. The prototype needs to serve a double role:
  1. It is directly executable/buildable as a project in the target platform environment
  2. It can be extracted to form generator templates

So we have to annotate the prototype app in some way. The annotations can be external (like Hibernate/JPA Persistence XML mappings) or internal (like JPA Java annotations, JAX-RS, etc.).

For internal annotations, there seems to be no other way than using the comment capability in the target language, just like protected regions. If the target language doesn't support comments, then the only possible approach is external annotations.

A simple syntax off the top of my head:

/** GENERATOR: START
public void run$appName$(String[] args) {
  System.out.println("Welcome to $appTitle$!");
}
*/
public void runHello(String[] args) {
  System.out.println("Welcome to Hello World!");
}
// GENERATOR: END

Basically we write the offending code twice, once as a template and once in the target language. It seems useless for that example, but consider another example:
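
Extracting the template from such an annotated Java prototype could be sketched as follows. This assumes the markers are written exactly as /** GENERATOR: START, */ and a single-line // GENERATOR: END, each on its own line; the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JavaMarkerSketch {

    // group 1 is the template text inside the comment; the concrete code
    // between "*/" and the END marker is matched but thrown away.
    private static final Pattern REGION = Pattern.compile(
            "/\\*\\* GENERATOR: START\\R(.*?)\\R\\*/\\R.*?\\R// GENERATOR: END",
            Pattern.DOTALL);

    /** Annotated prototype source -> generator template. */
    static String toTemplate(String source) {
        Matcher m = REGION.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // quoteReplacement is needed because the template contains '$'.
            m.appendReplacement(out, Matcher.quoteReplacement(m.group(1)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String source = "class App {\n"
                + "/** GENERATOR: START\n"
                + "public void run$appName$(String[] args) {\n"
                + "}\n"
                + "*/\n"
                + "public void runHello(String[] args) {\n"
                + "}\n"
                + "// GENERATOR: END\n"
                + "}\n";
        System.out.print(toTemplate(source));
    }
}
```

Nothing has to be done in the other direction: since the template lives inside a comment, the prototype compiles as-is.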

<?xml version="1.0" encoding="utf-8"?>
<Deployment xmlns="http://schemas.microsoft.com/windowsphone/2009/deployment" AppPlatformVersion="7.1">
<!-- GENERATOR START
  <App xmlns="" ProductID="$productId$" Title="$appTitle$"
       RuntimeType="Silverlight" Version="1.0.0.0" Genre="apps.normal"  Author="$appAuthor$"

       Description="$appDescription$" Publisher="$appPublisher$">
-->
  <App xmlns="" ProductID="{6b7a1ae6-8d4e-4f85-b08e-387df81d6e8e}" Title="Info Bandung"
       RuntimeType="Silverlight" Version="1.0.0.0" Genre="apps.normal"  Author="Hendy Irawan"

       Description="Woyyy, Orang Bandung kita bagi2 Info" Publisher="Hendy Irawan">
<!-- GENERATOR END -->
    <IconPath IsRelative="true" IsResource="false">ApplicationIcon.png</IconPath>
    <Capabilities>
      <Capability Name="ID_CAP_NETWORKING"/>
    </Capabilities>
    <Tasks>
      <DefaultTask  Name ="_default" NavigationPage="MainPage.xaml"/>
    </Tasks>
    <Tokens>
<!-- GENERATOR SINGLE
      <PrimaryToken TokenID="$appName$Token" TaskName="_default">
-->
      <PrimaryToken TokenID="InfoBandungToken" TaskName="_default">
        <TemplateType5>
          <BackgroundImageURI IsRelative="true" IsResource="false">Background.png</BackgroundImageURI>
          <Count>0</Count>
<!-- GENERATOR SINGLE
          <Title>$appTitle$</Title>
-->
          <Title>Info Bandung</Title>
        </TemplateType5>
      </PrimaryToken>
    </Tokens>
  </App>
</Deployment>

Doesn't look too bad, does it? And I can foresee that writing the reverse code generator for it is doable.
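
Here is a minimal line-based Java sketch of that extraction for the XML markers. It reflects my reading of the example rather than a finished design: every marker sits on its own line, GENERATOR SINGLE replaces exactly one following line, and GENERATOR START replaces everything up to GENERATOR END. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class XmlMarkerSketch {

    private enum State { NORMAL, TEMPLATE_SINGLE, TEMPLATE_BLOCK, SKIP_ONE, SKIP_TO_END }

    /** Annotated prototype XML -> generator template. */
    static String toTemplate(String source) {
        List<String> out = new ArrayList<>();
        State state = State.NORMAL;
        for (String line : source.split("\n", -1)) {
            String t = line.trim();
            switch (state) {
                case NORMAL:
                    if (t.equals("<!-- GENERATOR SINGLE")) state = State.TEMPLATE_SINGLE;
                    else if (t.equals("<!-- GENERATOR START")) state = State.TEMPLATE_BLOCK;
                    else out.add(line);              // ordinary line: copy through
                    break;
                case TEMPLATE_SINGLE:
                case TEMPLATE_BLOCK:
                    if (t.equals("-->")) {           // end of the template comment
                        state = (state == State.TEMPLATE_SINGLE)
                                ? State.SKIP_ONE : State.SKIP_TO_END;
                    } else {
                        out.add(line);               // template line: keep it
                    }
                    break;
                case SKIP_ONE:                       // drop one concrete line
                    state = State.NORMAL;
                    break;
                case SKIP_TO_END:                    // drop concrete lines up to END
                    if (t.equals("<!-- GENERATOR END -->")) state = State.NORMAL;
                    break;
            }
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String source = "<Tokens>\n"
                + "<!-- GENERATOR SINGLE\n"
                + "  <Title>$appTitle$</Title>\n"
                + "-->\n"
                + "  <Title>Info Bandung</Title>\n"
                + "</Tokens>";
        System.out.println(toTemplate(source));
    }
}
```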

And the above code (not the reverse code generator though) isn't entirely imaginary. I took it directly from my open source Windows Phone 7 project Info Bandung.

What do you think?


To learn Modeling with Eclipse Modeling Framework (EMF), I highly recommend the book EMF: Eclipse Modeling Framework.

Tuesday, January 10, 2012

Xperiencing Xtend in a Java EE 6 Web Application

Just saying that to date I'm quite happy with using Xtend in a regular Java EE 6 web application.

It allows me to write this:

    def void reindexSlugs() {
        log.debug("Reindexing slugs")
        slugProvider.clear
        val slugList = userRepo.findAllSlugs.toList
        log.debug("Reindexing slugs for {} users", slugList.size)
        val slugMap = slugList.fold(new HashMap<String, String>(),
            [map, row | map.put(String::valueOf(row.get("slug")), String::valueOf(row.get("id"))); map ]
        )
        slugProvider.batchUpdate(slugMap)
        log.debug("Reindexing slugs")
    }

Instead of: (Xtend-generated Java)

  public void reindexSlugs() {
      this.log.debug("Reindexing slugs");
      this.slugProvider.clear();
      Iterable<Map<String,Object>> _findAllSlugs = this.userRepo.findAllSlugs();
      List<Map<String,Object>> _list = IterableExtensions.<Map<String,Object>>toList(_findAllSlugs);
      final List<Map<String,Object>> slugList = _list;
      int _size = slugList.size();
      this.log.debug("Reindexing slugs for {} users", Integer.valueOf(_size));
      HashMap<String,String> _hashMap = new HashMap<String,String>();
      final Function2<HashMap<String,String>,Map<String,Object>,HashMap<String,String>> _function = new Function2<HashMap<String,String>,Map<String,Object>,HashMap<String,String>>() {
          public HashMap<String,String> apply(final HashMap<String,String> map, final Map<String,Object> row) {
            HashMap<String,String> _xblockexpression = null;
            {
              Object _get = row.get("slug");
              String _valueOf = String.valueOf(_get);
              Object _get_1 = row.get("id");
              String _valueOf_1 = String.valueOf(_get_1);
              map.put(_valueOf, _valueOf_1);
              _xblockexpression = (map);
            }
            return _xblockexpression;
          }
        };
      HashMap<String,String> _fold = IterableExtensions.<Map<String,Object>, HashMap<String,String>>fold(slugList, _hashMap, _function);
      final HashMap<String,String> slugMap = _fold;
      this.slugProvider.batchUpdate(slugMap);
      this.log.debug("Reindexing slugs");
  }

I do miss Scala's syntax though. Xtend is more "Java-compatible", and its classes work as CDI beans.

Closure support and type inference are the two most useful features in languages like Xtend, Scala, and Groovy. :)


Wednesday, October 5, 2011

Generation Gap Pattern vs Protected Regions in Xtext MDD

During the development of the awesome Xtext Protected Regions Support created by Daniel Dietrich, Daniel asked a really interesting question:

Just interested in your opinion / your 2 cents:
It's best practice to separate generated from non-generated code (when it comes to put it in a versioning system).
Therefor people use the generation gap pattern - the generated class is a base class (or s.th. similar), the manual written / generated-once classes implement the generated classes.
When using protected regions, all files (including the generated) have to be checked in. Some say, this ends up in a versioning disaster.
I prefer the protected regions approach. the base classes of the ggp nearly double the count of classes. this is a technical vehicle which makes no sense for the application. but how could be avoided, that generated classes are checked in?
my only answer is, that the generated code has to be cut of the files. empty files will disappear, protected regions are preserved. when checking out files, the generator has to be started. before checking in files, the generated code has to be stripped (into a special dir which will be checked in?).
this sounds like a technical vehicle, too. would that make sense?

To which I replied (note that this is just my opinion, and I don't claim to be an expert, so feel free to challenge my assumptions and prove me wrong!) :-)


I have nothing against GAP. It definitely has its valid uses. So for certain problems, GAP is a valid solution, and is even preferable to regions.
About versioning, I'm not sure it's a "disaster". Redundant maybe, but dangerous? I don't think so. Not checking in generated files may be preferable where the source DSL files are "authoritative" and the target files are "final".

Let me illustrate: wsdlimport. The WSDL file is an authoritative source. The generator is stable. Generated Java Proxy Classes are "dumb" target files. You can regenerate those anytime, no need for customization. And the generator is never changed so you can be sure generated files will actually compile and work. No need to check in the Java targets.

Xtext can be used for projects of that kind, and it makes them very easy.

There are also projects that fall in the middle: need some customization. So you do GAP: generate the base class and the customizable subclass. This technique is excellent for some cases, but fails when:
  1. You're restricted by the class hierarchy in some way.
  2. Target language doesn't support subclassing, say: XML. Or, in my case, yet another custom DSL.
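
For readers who haven't met the pattern, this is roughly what GAP looks like in its simple, well-behaved form. It is only a sketch: the entity and its members (CustomerBase, Customer, displayLabel) are invented for illustration and don't come from any real generator.

```java
/** GENERATED: overwritten on every generator run, never edited by hand. */
abstract class CustomerBase {
    private String name;

    public String getName() { return name; }

    public void setName(String name) { this.name = name; }

    /** Hook whose body is supplied by the handwritten subclass. */
    public abstract String displayLabel();
}

/** HANDWRITTEN: created once, then owned by the developer. */
class Customer extends CustomerBase {
    @Override
    public String displayLabel() {
        return "Customer: " + getName();
    }
}

final class GenerationGapDemo {
    public static void main(String[] args) {
        Customer customer = new Customer();
        customer.setName("Ada");
        System.out.println(customer.displayLabel());
    }
}
```

Only the base class would be regenerated, so in principle only Customer needs to be checked in. Note that the whole arrangement depends on the target language having inheritance.
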
There are also cases where GAP technically works, but the generated class structure is complex enough that separating the base class from the actual class makes it "unnatural". I'm sure you've seen stuff like that. Oh, let me give a concrete example from one of my prototypes:

def StringConcatenation genMainFile(String fileName) {
    val result = new StringConcatenation() 
    result.append( augmenter.beforeMainFile(fileName) )
    result.append( augmenter.aroundMainFile(fileName, [fn1 | {
        doMainFile(fn1, [ fn2 | augmenter.innerMainFile(fn2, [fn3 | {
            genImportBlock()
            genClassBlock()
            null
        }]) ])
    }]) )
    result.append( augmenter.afterMainFile(fileName) )
    result
}

Mind you the above code actually works, but it's damn ugly! At least with protected regions there will only be (marker) comments. And with most editors, comments can be folded and not so distracting, so it's much more bearable.
Some may say "it's only the base class, nobody will touch it". On the contrary, you'll see those structural methods on your debugging stack traces, just "perfect" at the time when you're in deep need of clear & cohesive program structure, but oh... you're buried in nested method calls. :-( AspectJ/AOP/weaving also has this problem, though it's manageable to some extent. They only fall apart when pushed too far, so moderate use is OK.

The third class is projects that require extensive customization. The source DSL accounts for only ~20% of the target, providing structure or supporting form, and leaves places to fill in, which are filled by the programmer or by some other DSL.

In my (currently hypothetical, but hopefully soon realized) project, the Entity->UI generator will not generate JSF/GWT forms/pages directly, but will generate an intermediate UI DSL. (You can argue this is Model-to-Model, not M2T, but hey, textual models are much easier to inspect/hack than something buried in XML!) The UI DSL can then be processed to generate JSF or GWT.

I'm not really interested in supporting class inheritance in a UI DSL. In fact the "class" concept itself may not exist in a UI domain; it only matters in the OO world. So protected regions are the only option, and thankfully Xtext supports comments by default.

With that, it's possible to customize the generated UI DSL files right in the places where they're needed.
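
The core of protected region handling, carrying manual edits over into freshly generated output, can be sketched in a few lines of Java. The marker syntax below (// PROTECTED REGION ID(...) ENABLED START and // PROTECTED REGION END) and the use of // comments are assumptions for this sketch, and the class is hypothetical. It is not the actual implementation of the Xtext Protected Regions Support.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ProtectedRegionSketch {

    // group 1: start marker line, group 2: region id, group 3: body, group 4: end marker
    private static final Pattern REGION = Pattern.compile(
            "(// PROTECTED REGION ID\\(([^)]+)\\) ENABLED START\\n)(.*?)(// PROTECTED REGION END)",
            Pattern.DOTALL);

    /** Collects the manually edited region bodies of an existing file, keyed by id. */
    static Map<String, String> collect(String existing) {
        Map<String, String> bodies = new HashMap<>();
        Matcher m = REGION.matcher(existing);
        while (m.find()) {
            bodies.put(m.group(2), m.group(3));
        }
        return bodies;
    }

    /** Copies region bodies from the existing file into newly generated text. */
    static String merge(String existing, String generated) {
        Map<String, String> manual = collect(existing);
        Matcher m = REGION.matcher(generated);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Keep the generated default body when the region is new.
            String body = manual.getOrDefault(m.group(2), m.group(3));
            m.appendReplacement(out,
                    Matcher.quoteReplacement(m.group(1) + body + m.group(4)));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String existing = "// PROTECTED REGION ID(body) ENABLED START\n"
                + "manual();\n"
                + "// PROTECTED REGION END\n";
        String generated = "// PROTECTED REGION ID(body) ENABLED START\n"
                + "// TODO\n"
                + "// PROTECTED REGION END\n";
        System.out.print(merge(existing, generated));
    }
}
```

Because the markers are plain comments, the same idea applies to any target language with a comment syntax, including a custom UI DSL.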

Another use case that I'm exploring is two or more generators (which may or may not be sourcing the same model) generating to the same file, but in different regions. And the FSA (Xtext's file system access) should automagically merge them.

And then, add to the mix that the generators themselves are in constant development. That means the same source DSL when processed, may yield target files that are broken, uncompilable, buggy, etc. And the target files are needed because they form the foundation of yet another project, so unless the previous "working" target files can be recovered, the development of the derived project effectively stops.

For those use cases, would I check in the target files? Of course.

With all of the above said, I have nothing against GAP or against not checking in generated files. Both may be common, but I believe there are classes of problems where they're inappropriate.


Tuesday, July 12, 2011

eval-ing in Clojure: Executing Data as Code

Clojure, a functional programming language for the JVM, has powerful, mind-bending features.

The feature that caught my interest first is its ability to "execute data as code".

As demonstrated here, where I define a function process that basically evaluates the symbol stored in processor with whatever params it is given:

=> (defn process [& params] (eval (cons processor params)))
#'user/process

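Note that the value of processor is data, not code (and processor must already be defined when process is compiled, otherwise the compiler cannot resolve the symbol). Let's say: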

=> (def processor `java.awt.Point.)
#'user/processor

So I defined processor as a quoted "java.awt.Point.", which at the head of a list is equivalent to "new java.awt.Point(...)" in Java. The expression is not executed, however; it is only stored in processor.

Now I can call the process function which in turn creates a new instance of Point with my provided "constructor arguments":

=> (process 12 34)
#< Point java.awt.Point[x=12,y=34] >

The reason I put "constructor arguments" in quotes above is that I can define processor as any other function, not necessarily a constructor:

=> (def processor `Math/min)
#'user/processor

=> (process 12 34)
12

Which is the same as:

=> (eval (list processor 12 34))
12

"So it's just eval," you say. How is it different to eval()-ing expressions in strings like JSR-223?

Well, in Clojure (or any Lisp dialect) it's not eval-ing an arbitrary string, but evaluating a data structure which is defined in Clojure itself. In other words, the internal representation of code corresponds exactly to the external representation. This characteristic is known as homoiconicity.

In other programming languages, the structure of the code is known as its Abstract Syntax Tree (AST). Well, in Clojure, the "AST" can be defined, read, and evaluated in Clojure's own data structures, without needing "parsing" and transient storage (strings).
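
For Java readers, the idea can be mimicked (only as a rough analogy, since Java is not homoiconic) with a tiny evaluator that treats nested lists as code. Everything here is invented for the sketch: the class name, the two supported operators, and the restriction to integers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ListEvalSketch {

    // The operators this toy evaluator understands.
    private static final Map<String, Function<List<Integer>, Integer>> OPS = new HashMap<>();
    static {
        OPS.put("min", args -> Collections.min(args));
        OPS.put("+", args -> args.stream().mapToInt(Integer::intValue).sum());
    }

    /** Evaluates a form: either an Integer, or a List of (operator, args...). */
    static int eval(Object form) {
        if (form instanceof Integer) {
            return (Integer) form;
        }
        List<?> list = (List<?>) form;
        String op = (String) list.get(0);
        List<Integer> args = new ArrayList<>();
        for (Object arg : list.subList(1, list.size())) {
            args.add(eval(arg));                 // arguments may be nested forms
        }
        return OPS.get(op).apply(args);
    }

    /** Mirrors (eval (cons processor params)): prepend the operator, then evaluate. */
    static int process(String processor, Object... params) {
        List<Object> form = new ArrayList<>();
        form.add(processor);
        form.addAll(Arrays.asList(params));
        return eval(form);
    }

    public static void main(String[] args) {
        System.out.println(process("min", 12, 34));
        System.out.println(eval(Arrays.asList("+", 1, Arrays.asList("min", 5, 2))));
    }
}
```

The difference from Clojure is that here I had to write the evaluator and invent the list convention myself, whereas in a Lisp the language's own reader and eval already work on exactly these structures.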

Thursday, June 23, 2011

Brand New Functional Programming & EMF Modeling Blog!

Welcome to my new Functional Programming & Modeling with EMF Blog!

As its name says, I'll discuss topics such as:
  1. Modeling aka Model Driven Architecture (MDA) aka Model Driven Software Development (MDSD) aka Model Driven Engineering (MDE) ;-)
  2. Functional Programming, lambdas, closures and modern programming techniques
  3. Domain Specific Languages (DSLs)
  4. Code generation
  5. Working with Eclipse Modeling Framework (EMF) Toolkit and its projects such as Xtext/Xtend/Xbase, Ecore Tools, CDO, Texo, Teneo, etc.
Be prepared for extreme "dogfooding" ! ;-)

I also have a blog about programming using Eclipse RCP: Eclipse Driven Rich Application Development that may interest you.
