Domain-specific multimodeling
Domain-specific multimodeling is a software development paradigm where each view is made explicit as a separate domain-specific language.
Successful development of a modern enterprise system requires the convergence of multiple views. Business analysts, domain experts, interaction designers, database experts, and developers with different kinds of expertise all take part in the process of building such a system. Their different work products must be managed, aligned, and integrated to produce a running system. Every participant in the development process has a particular language tailored to solve problems specific to their view of the system. The challenge of integrating these different views and avoiding the potential cacophony of multiple different languages is the coordination problem.
Domain-specific multimodeling is promising when compared to more traditional development paradigms such as single-language programming and general-purpose modeling. To reap the benefits of this new paradigm, we must solve the coordination problem. This problem is also known as the fragmentation problem.
One proposal to solve this problem is the coordination method. This is a three-step method to overcome the obstacles of integrating different views and coordinating multiple languages. The method first prescribes how to identify and then how to specify the references across language boundaries, that is, the overlaps between different languages. Finally, the method offers concrete proposals on how to apply this knowledge in actual development in the form of consistency, navigation, and guidance.
Motivating example
Systems based on multiple domain-specific languages are abundant. Languages with a metamodel defined in the Extensible Markup Language (XML) enjoy particularly widespread adoption. To illustrate development with multiple languages, we will draw an example from a case study of an enterprise resource planning system that includes standard components such as inventory, accounting, and e-commerce. These components are implemented by a mixture of XML-based languages and regular Java code. As an example, let us focus on the content management component, particularly a use case in which the administrative user creates an online web survey as shown in the screenshot below. We will refer to this example as the create survey example.
The figure shows a screenshot of the administrative interface of the content management application in a running instance of the system. To create a survey, the user fills out the fields of the input form and hits the update button. This creates a new survey which can be edited and later published on a frontend website. Behind the scenes, this use case involves several artifacts written in different languages. In this example, let us focus on only three of these languages: the Entity, the Service, and the Form DSL.
These three languages correspond roughly to the structural, the behavioural, and the user interface concerns of the system. The Entity DSL is used to describe the underlying data model and hence the way the created survey will be saved. The Service DSL is used to describe the interface of the service that is invoked when the user hits the update button. Finally, the Form DSL is used to describe the visual appearance of the form. Although the three languages are tailored for different things, they cannot be separated entirely. The user interface invokes a certain application logic and this application logic manipulates the data of the application. This is an example of non-orthogonal concerns: the languages overlap because the concerns that they represent overlap. Let us examine these three languages in a bottom-up manner and point out their overlaps.
Entity DSL
The Entity DSL defines the structure of data in the system. The listing below shows the definition of the Survey entity, which is the business object that represents the concept of a survey. The code in the listing is self-explanatory: an entity called Survey is defined with 10 fields. Each field has a name and a type. The field surveyId is used as the primary key. This definition is loaded by a central component called the entity engine. The entity engine instantiates a corresponding business object. The purpose of the entity engine is to manage transactional properties of all business objects and interact with various persistence mechanisms such as Java Database Connectivity, Enterprise JavaBeans, or even some legacy system.
Service DSL
The Service DSL specifies the interface of the services in the system. Each service encapsulates part of the application logic of the system. The purpose of this language is to have a uniform abstraction over various implementing mechanisms. Individual services can be implemented in Java, a scripting language, or using a rule engine. The listing below shows the interface of the createSurvey service.
Apart from the name, the service element specifies the location and invocation command of the implementation for this service. The default-entity-name attribute specifies that this service refers to the Survey entity which was defined in the previous listing. This is an overlap between the two languages, specifically a so-called soft reference. A model in the Service DSL refers to a model in the Entity DSL. This reference is used in the two auto-attributes elements below, which specify the input and output of the service in the form of typed attributes. As input, the service accepts attributes corresponding to all non-primary key fields of the Survey entity, and these attributes are optional. As output, the service returns attributes corresponding to the primary key fields of Survey, i.e., in this case the surveyId field, and these attributes are mandatory. The purpose of the reference across languages is in this case to reduce redundancy. The attributes of the createSurvey service correspond to the fields of the Survey entity, and it is therefore only necessary to specify them once.
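The derivation rule described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the Field and Attribute types, the AutoAttributes class, and the type names used in the usage note are hypothetical and do not come from the case-study system.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model types; they are not part of any real framework API.
record Field(String name, String type, boolean primaryKey) {}

record Attribute(String name, String type, String mode, boolean optional) {}

class AutoAttributes {
    // Derives service attributes from entity fields, following the rule in the text:
    // primary key fields become mandatory outputs, all other fields become optional inputs.
    static List<Attribute> derive(List<Field> entityFields) {
        List<Attribute> attributes = new ArrayList<>();
        for (Field f : entityFields) {
            if (f.primaryKey()) {
                attributes.add(new Attribute(f.name(), f.type(), "OUT", false));
            } else {
                attributes.add(new Attribute(f.name(), f.type(), "IN", true));
            }
        }
        return attributes;
    }
}
```

Calling AutoAttributes.derive with a surveyId primary key field and an isAnonymous field would yield one mandatory output attribute and one optional input attribute, so the field list only has to be maintained in the entity model.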
Form DSL
The Form DSL is used to describe the layout and visual appearance of input forms in the user interface. The language consists of domain concepts such as Form and Field. The listing below shows the implementation of the EditSurvey form. This time the Form DSL overlaps with the Service DSL. The target attribute of the form and the alt-target elements specify that the input from the submission of this form should be directed to either the updateSurvey or the createSurvey service. The auto-fields-service element specifies that the form should include a field corresponding to each of the attributes of the updateSurvey service. This imports definitions from another model, much like the auto-attributes elements in the previous listing. Further down, we can see that it is possible to customize the appearance of these imported fields, such as isAnonymous. Finally, a submitButton is added with a localized title such that users can submit their data to the referenced service.
The create survey example, as described here, is implemented using models in three different languages. The complete implementation actually involves even more languages, such as a Screen DSL to specify the layout of the screen where the form is placed, and a Minilang DSL, which is a data-manipulation language used to implement the service. However, these three languages do illustrate the main idea of making each concern concrete. The example also shows a simple way of reducing redundancy by letting the languages overlap slightly.
Multi-level customization
Domain-specific languages, like those described above, have limited expressiveness. It is often necessary to add code snippets in a general-purpose language like Java to implement specialized functionality that is beyond the scope of the languages. This method is called multi-level customization. Since this method is very commonly used in setups with multiple languages, we will illustrate it by a continuation of the example. Let us call this the build PDF example.
Suppose we want to build a PDF file for each survey response to the online surveys that users create. Building a PDF file is outside the scope of our languages so we need to write some Java code that can invoke a third-party PDF library to perform this specialized functionality. Two artifacts are required:
First, an additional service model, as shown below, in the Service DSL that defines the interface of the concrete service such that it can be accessed on the modeling level. The service model describes the location of the implementation and what the input and output attributes are.
Second, we need a code snippet, as shown below, that contains the actual implementation of this service. A service can have multiple inputs and outputs, so the input to the Java method is a map, called context, from argument names to argument values, and the method returns its output in the form of another map, called results.
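The calling convention just described (a context map in, a results map out) might look roughly as follows. This is a minimal sketch and not the original implementation: the class and method names, the surveyResponseId input, the errorMessage key, and the renderPdf stand-in for a third-party PDF library are all assumptions made for illustration. Only the names context, results, and outByteWrapper are taken from the text.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class SurveyPdfService {
    // Hypothetical stand-in for a call into a third-party PDF library.
    // It returns plain bytes so that the sketch stays self-contained.
    static byte[] renderPdf(String surveyResponseId) {
        return ("PDF for response " + surveyResponseId).getBytes(StandardCharsets.UTF_8);
    }

    // Service entry point: inputs arrive in the context map, outputs leave in the results map.
    static Map<String, Object> buildPdf(Map<String, Object> context) {
        Map<String, Object> results = new HashMap<>();
        Object id = context.get("surveyResponseId"); // assumed input attribute name
        if (id == null) {
            // On this path the outByteWrapper output is never produced.
            results.put("errorMessage", "surveyResponseId is required");
            return results;
        }
        results.put("outByteWrapper", renderPdf(id.toString()));
        return results;
    }
}
```

Because both maps are keyed by plain strings, the link between the attribute names declared in the service model and the keys used in the code is a soft reference that the Java compiler cannot check.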
This multi-level customization method uses soft references similar to the create survey example. The main difference is that the reference here is between model and code rather than between model and model. The advantage, in this case, is that a third-party Java library for building PDFs can be leveraged. Another typical application is to use Java code snippets to invoke external web services and import results in a suitable format.
Coordination problem
The example illustrates some of the advantages of using multiple languages in development. There are, however, also difficulties associated with this kind of development. These difficulties stem from the observation that the more kinds of artifacts we introduce into our process, the more coordination between developer efforts is needed. We will refer to these difficulties as the coordination problem. The coordination problem has a conceptual and a technical aspect. Conceptually, the main problem is to understand the different languages and their interaction. To properly design and coordinate models in multiple languages, developers must have a sufficient understanding of how languages interact. Technically, the main problem is to enforce consistency. Tools must be provided to detect inconsistencies early, i.e., at modeling time, and assist developers in resolving these inconsistencies. In the following, we will examine these two aspects in greater detail.
Coordination as a conceptual challenge
The first problem that developers encounter when starting on development with multiple languages is language cacophony. Learning the different languages and understanding their interaction is necessary to make sense of the complex composition of artifacts. The framework in the case study, for instance, has seventeen different languages and more than 200 000 lines of domain-specific language code, so the complexity can be quite overwhelming. There is currently no established method of characterizing different languages such that developers can quickly reach an operational understanding. Tools are important here as an ad hoc mechanism for learning and exploration because developers typically use tools to learn by experiment. There are three areas in particular where tools for domain-specific models are helpful:
- Understanding a language
- Understanding language interactions
- Understanding how to use languages
Just as we may complain that XML is a bad choice, we could also object that a general-purpose language like Java is a poor choice for some tasks. Furthermore, developers may feel less intimidated by the editor in the figure than by code listings in XML or Java. If we accept that syntax matters, then the use of different languages with tailored editors becomes a reasonable strategy. The simplicity of the editor makes the language easier to understand and hence easier to use. In other words, the syntax matters objection may be the very reason why we explore the field of domain-specific languages.
Second, language interactions reveal relations between languages. Developers should be able to jump between related elements in different artifacts. Ease of navigation between different software artifacts is an important criterion for tools in traditional development environments. Although we have performed no empirical studies in this area, we hypothesize that proper navigation facilities increase productivity. This claim is supported by the observation that all major development environments today offer quite sophisticated navigation facilities, such as a type hierarchy browser or the ability to quickly locate and jump to references to a method definition. The development environments can provide these navigation facilities because they maintain a continuously updated model of the source files in the form of an abstract syntax tree.
In a development environment with multiple languages, navigation is much more difficult. Existing environments are not geared to parsing and representing DSL models as abstract syntax trees for arbitrary and perhaps even application-specific languages such as the languages from the previous example. Furthermore, without this internal representation, existing environments can resolve neither intra- nor inter-language references for such languages and hence cannot provide useful navigation. This means that developers must maintain a conceptual model of how the parts of their system are related. New tools with navigation facilities geared to multiple languages would, on the other hand, be very helpful in understanding the relations between languages. In terms of the create survey example, such tools should display the relations between the three languages by using the soft references as navigation points.
Third, to understand language use we must be able to distinguish correct editing operations from wrong ones in our development environment. Traditional development environments have long provided guidance during the writing of a program. Incremental compilation allows the environment to offer detailed suggestions to the developer such as how to complete a statement. More intrusive kinds of guidance also exist such as syntax-oriented editors where only input conforming to the grammar can be entered. Generic text-editors that can be parameterized with the grammar of a language have existed for a long time.
Existing editors do not take inter-language consistency relations into account when providing guidance. In the previous example, an ideal editor should, for instance, be able to suggest the createSurvey service as a valid value when the developer edits the target attribute in the Form definition. An environment that could reason about artifacts from different languages would also be able to help the developer identify program states where there is local but not global consistency. Such a situation can arise when a model is well-formed and hence locally consistent but at the same time violates an inter-language constraint. Guidance or intelligent assistance in the form of proposals on how to complete a model would be useful for setups with multiple languages and complex consistency constraints. Tool-suggested editing operations could make it easier for the developer to get started on the process of learning how to use the languages.
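As a rough illustration of such inter-language guidance, the following Java sketch proposes values for the target attribute of a form from the service names known to the environment. The TargetCompletion class and its propose method are hypothetical, and a real editor would obtain the service names by parsing the Service DSL models rather than from a hard-coded list.

```java
import java.util.List;
import java.util.stream.Collectors;

class TargetCompletion {
    // Proposes the known service names that start with what the developer has typed
    // so far in the target attribute, in alphabetical order.
    static List<String> propose(List<String> serviceNames, String typedPrefix) {
        return serviceNames.stream()
                .filter(name -> name.startsWith(typedPrefix))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

With the services updateSurvey and createSurvey from the example, typing "create" in the target attribute would leave createSurvey as the only proposal.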
Coordination as a technical challenge
The technical aspect of the coordination problem is essentially a matter of enforcing consistency. How can we detect inconsistencies across models from multiple languages at modeling time? To fully understand the complexity of the consistency requirements of a system based on multiple languages, it is useful to refine our concept of consistency.
Consistency can be either intra- or inter-consistency. Intra-consistency concerns the consistency of elements within a single model. The requirement here is that the model must conform to its metamodel, i.e., be syntactically well-formed. In terms of the create survey example, the entity model must, for instance, conform to the XSD schema of the Entity DSL. This schema is the metamodel of the Entity DSL, and it specifies how elements can be composed and, to some extent, what the valid domains of attributes are.
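Checking intra-consistency against an XSD metamodel can be sketched with the standard javax.xml.validation API. The schema below is a toy schema invented for this illustration, with an entity element, a name attribute, and field children; it is not the actual schema of the Entity DSL.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

class IntraConsistency {
    // Toy metamodel (an assumption): an <entity> with a required name attribute
    // and one or more <field> children that each require a name and a type.
    static final String TOY_XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='entity'><xs:complexType>"
        + "<xs:sequence><xs:element name='field' maxOccurs='unbounded'><xs:complexType>"
        + "<xs:attribute name='name' type='xs:string' use='required'/>"
        + "<xs:attribute name='type' type='xs:string' use='required'/>"
        + "</xs:complexType></xs:element></xs:sequence>"
        + "<xs:attribute name='name' type='xs:string' use='required'/>"
        + "</xs:complexType></xs:element></xs:schema>";

    // Returns true if the model conforms to the toy metamodel, false otherwise.
    static boolean conforms(String modelXml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(TOY_XSD)));
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(modelXml)));
            return true;
        } catch (Exception e) {
            // A SAXException signals a schema violation or malformed XML; an IOException a read failure.
            return false;
        }
    }
}
```

A model such as `<entity name='Survey'><field name='surveyId' type='id'/></entity>` passes this check, while the same model with the type attribute removed is rejected. Note that such a check says nothing about whether references to other models can be resolved.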
Inter-consistency is achieved when references across language boundaries can be resolved. This kind of consistency can be further subdivided into model-to-model consistency and model-to-code consistency. Model-to-model consistency concerns the referential integrity as well as high-level constraints of the system. In the create survey example, the default-entity-name attribute from the Service listing refers to the name attribute from the Entity listing. If we change one of these values without updating the other, we break the reference. More high-level consistency constraints across different models also exist, as discussed later. A project can have certain patterns or conventions for naming and relating model elements. Current development environments must be tailored to specific languages with handwritten plugins or similar mechanisms in order to enforce consistency between languages such as those from the previous example.
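The referential integrity part of model-to-model consistency can be illustrated with a small Java sketch. It assumes that the models have already been parsed into plain collections; the ReferenceCheck class and the shape of its inputs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReferenceCheck {
    // serviceToEntity maps each service name to the value of its default-entity-name attribute.
    // entityNames holds the names of all entities defined in the Entity DSL models.
    // Returns one message per soft reference that cannot be resolved.
    static List<String> brokenReferences(Map<String, String> serviceToEntity, Set<String> entityNames) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> reference : serviceToEntity.entrySet()) {
            if (!entityNames.contains(reference.getValue())) {
                problems.add("Service '" + reference.getKey()
                        + "' refers to unknown entity '" + reference.getValue() + "'");
            }
        }
        return problems;
    }
}
```

If the Survey entity were renamed without updating the createSurvey service, the check would report one broken reference, even though both models would still be well-formed on their own.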
Model-to-code consistency is an essential requirement in multi-level customization. When models are supplemented with code snippets as in the build PDF example, it is necessary to check that models and code actually fit. This is partly a matter of making sure that soft references between models and code are not broken, similar to referential integrity in model-to-model consistency. But it is also a matter of making sure that the code does not violate expectations set up in the model. In the build PDF example, the model specifies that outByteWrapper will always be part of the output, i.e., the outByteWrapper key is put in the results map. An analysis of the code shows that outByteWrapper will only be part of the output if no exceptions are thrown before line 10. In other words, some possible executions of the code will violate a specification on the modeling level. More generally, we can state that multi-level customization imposes very fine-grained constraints on the involved models and code snippets.
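The expectation that every mandatory output attribute ends up in the results map can at least be checked for a single execution, as the following Java sketch shows. The OutputContract class is hypothetical, and this is a run-time check only: unlike the analysis described above, it cannot show that all possible executions satisfy the model.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class OutputContract {
    // Returns the mandatory output attributes declared in the service model
    // that are missing from the results map produced by one run of the code.
    static Set<String> missingOutputs(Set<String> mandatoryOutputs, Map<String, Object> results) {
        Set<String> missing = new TreeSet<>(mandatoryOutputs);
        missing.removeAll(results.keySet());
        return missing;
    }
}
```

For the build PDF example, a results map without the outByteWrapper key would be flagged as violating the model, whereas a map that contains it would yield an empty set.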