<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:series="http://unfoldingneurons.com/" version="2.0">

<channel>
	<title>mgm technology blog</title>
	
	<link>http://blog.mgm-tp.com</link>
	<description>We discuss software innovation</description>
	<lastBuildDate>Tue, 09 Apr 2013 12:53:15 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/MgmTechBlog" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="mgmtechblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">MgmTechBlog</feedburner:emailServiceId><feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Designing and Implementing our Camel-based mgm Cosmo Router - Robust and Fail-Safe Message Routing with Apache Camel, Part 1</title>
		<link>http://blog.mgm-tp.com/2013/04/camel-router-part1/</link>
		<comments>http://blog.mgm-tp.com/2013/04/camel-router-part1/#comments</comments>
		<pubDate>Tue, 09 Apr 2013 12:38:56 +0000</pubDate>
		<dc:creator>Michael Frieß</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[ActiveMQ]]></category>
		<category><![CDATA[Camel]]></category>
		<category><![CDATA[EAI]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[SOAP]]></category>
		<category><![CDATA[Spring]]></category>
		<category><![CDATA[Web Service]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1513</guid>
		<description><![CDATA[We recently finished a subproject to integrate our mgm Cosmo insurance software with an external CRM system. Both systems had to exchange XML documents in a reliable and robust manner in order to keep their data in sync. We used Apache Camel as the middleware to handle all the transfers between the Java and .NET [...]]]></description>
			<content:encoded><![CDATA[<p>We recently finished a subproject to integrate our <a href="http://www.mgm-tp.com/branchen/versicherungen/loesungen/mgm-cosmo">mgm Cosmo insurance software</a> with an external CRM system. Both systems had to exchange XML documents in a reliable and robust manner in order to keep their data in sync. We used Apache Camel as the middleware to handle all the transfers between the Java and .NET based systems. This blog series discusses our solution and shares our experiences with Apache Camel.</p>
<p><span id="more-1513"></span></p>
<p><a href="http://www.mgm-tp.com/branchen/versicherungen/loesungen/mgm-cosmo">mgm Cosmo</a> is a standardized software solution that allows insurance companies to produce and manage commercial and industrial business. It accompanies the insurance sales process starting with a submission as a sales lead. A submission may produce one or more offers. If an offer is accepted by both the customer and all the other parties, it becomes an insurance policy. This policy can later be extended, changed or cancelled in the form of an endorsement.</p>
<p>Technically, all five of these main business entities are managed as self-contained XML documents which you could easily store in a <a href="http://en.wikipedia.org/wiki/Document-oriented_database">document-oriented NoSQL database</a>. In the following they are simply referred to as <em>XML documents</em>.</p>
<div id="attachment_1618" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-camelrouter1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-4-Cosmo-Business-Entities2.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-4-Cosmo-Business-Entities2-480x259.png" alt="" width="480" height="259" class="size-large wp-image-1618" /></a><p class="wp-caption-text">Exemplary insurance process with all involved parties and its five main business entities.</p></div>
<h2>Project Goals</h2>
<p>The business goal for the system integration was first, to make all XML documents from Cosmo available for viewing in an external CRM system and second, to receive all corresponding partner and account information from the CRM.</p>
<p>We chose <a href="http://camel.apache.org">Apache Camel</a> as the middleware to handle all the <a href="http://en.wikipedia.org/wiki/Enterprise_application_integration">Enterprise Application Integration</a> (EAI) tasks, specifically the SOAP transfers between our Java backend application and the <em>external</em> .NET services. This Camel-driven application was simply named <em>&#8220;Cosmo Router&#8221;</em>.</p>
<p>Our main goal for the Cosmo Router was to be as reliable, automated, robust and fail-safe as possible. For example, we aimed for a system that can handle connection failures and timeout errors with adaptable retry strategies such as increasing waiting periods. As you can imagine, it didn’t take us long to find even more sophisticated failover scenarios. However, retry strategies and failover behavior were only one of the topics we encountered during this project.</p>
<p>This blog article is accompanied by several source code examples but assumes a basic understanding of <a href="http://camel.apache.org">Apache Camel</a>. In case you have never heard of Apache Camel, you might want to take a look at these additional resources and examples:</p>
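<p>Camel itself ships with building blocks for such retry strategies. The following is only a sketch of an exponential back-off redelivery policy in the Spring XML DSL. It is not the failover logic of the Cosmo Router, and the exception types and all numeric values are assumptions chosen for illustration:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- context-scoped: placed inside &lt;camelContext&gt; before the routes --&gt;
&lt;onException&gt;
  &lt;exception&gt;java.net.ConnectException&lt;/exception&gt;
  &lt;exception&gt;java.net.SocketTimeoutException&lt;/exception&gt;

  &lt;!-- start with a 1 second delay and double it on every further attempt;
       the delay is capped at 60 seconds --&gt;
  &lt;redeliveryPolicy maximumRedeliveries=&quot;5&quot;
                    redeliveryDelay=&quot;1000&quot;
                    useExponentialBackOff=&quot;true&quot;
                    backOffMultiplier=&quot;2&quot;
                    maximumRedeliveryDelay=&quot;60000&quot;
                    retryAttemptedLogLevel=&quot;WARN&quot;/&gt;
&lt;/onException&gt;
</pre>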
<ul>
<li><a href="http://camel.apache.org">the official Camel website</a></li>
<li><a href="http://architects.dzone.com/articles/apache-camel-integration">Camel introduction with examples</a></li>
<li><a href="http://www.manning.com/ibsen/Camel_ch01_update.pdf">Free Chapter 1 of the Book &quot;Camel in Action&quot;</a></li>
<li><a href="http://it-republik.de/jaxenter/artikel/Apache-Camel-Agile-Enterprise-Integration-4097.html">Java Magazin article (in German)</a></li>
</ul>
<h2>Architecture Overview</h2>
<p>Early on, our customer defined SOAP as the mandatory messaging technology. The task at hand was both to transfer data to newly created web services and to receive data from them. In this article we focus only on the sending part. To distinguish them clearly, the target web services are called <em>the &#8220;external&#8221; web services</em>. They were developed by a third party in .NET, more precisely with <a href="http://msdn.microsoft.com/en-us/library/ee354381.aspx">Windows Communication Foundation 4</a> (WCF4).</p>
<p>One of our main objectives for our architecture was to decouple the transfer process from the Cosmo Web Application&#8217;s workflow in order to process all transfers asynchronously (from the web application&#8217;s point of view). This decision was possible because the application workflow did not depend on the transfer results. It was also driven by previous negative experiences with another SOAP web service and its limited availability. In practical terms, we wanted a Cosmo user to be able to continue working with the application instead of being blocked by a &#8220;Transfer in progress, please wait&#8221; popup.</p>
<p>As it turned out, this decision proved very handy: during the deployment of the final release, an initial migration of all existing XML documents was planned, which meant that tens of thousands of transfers needed to be processed. Thanks to asynchronous transferring, the Cosmo Web Application only needed to be unavailable for a few hours to safely prepare all transfers. This was a great improvement compared to the two full days needed for the actual transferring. When the initial transfer even stretched to a whole week due to bugs, we could relax with a &#8220;no problem&#8221; smile and happy users.</p>
<p>Driven by the idea of asynchronism we finally ended up with the following architecture:</p>
<div id="attachment_1516" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-camelrouter1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-1-Cosmo-Router-Final-Architecture-e1363879021957.png"><img class="size-large wp-image-1516" src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-1-Cosmo-Router-Final-Architecture-479x211.png" alt="" width="479" height="211" /></a><p class="wp-caption-text">This diagram shows the sending process from the Cosmo Web Application to the external CRM web service.</p></div>
<p>The whole process is triggered by specific user interactions in the web application as indicated by the big grey arrow. We won&#8217;t go into details there but instead focus on the &#8220;Transfer Table&#8221; and particularly the Cosmo Router in the middle lane.</p>
<h2>Transfer Processing</h2>
<p>Let&#8217;s take a closer look at each step in the diagram shown above. Please note that for the sake of simplicity none of the following examples contain log statements or error and exception handling.</p>
<h3>Step 1: Getting new Transfers from the Database</h3>
<p>The Cosmo Router periodically checks the database&#8217;s &#8220;Transfer Table&#8221; for new transfers. At first, we wanted to do the database polling with Camel&#8217;s <a href="http://camel.apache.org/sql-component.html">SQL component</a> which seemed to be a perfect fit. However, it doesn&#8217;t support being used as a &#8220;Camel Consumer&#8221; (i.e. as a starting point of a Camel route or in other words as a &#8220;from&#8221; endpoint). So we decided to simply start the routing process with the <a href="http://camel.apache.org/timer.html">Camel Timer</a> component as described next.</p>
<h4>Starting Routes with Timers</h4>
<p>A timer as a &#8220;from&#8221; endpoint triggers the processing and subsequently the polling of the database for new transfers. In Camel&#8217;s <a href="http://camel.apache.org/spring.html">Spring XML DSL</a> this looks as follows:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- Note: the timer must be explicitly defined as an endpoint
           due to the timer's period being configurable --&gt;
&lt;endpoint id=&quot;newOffersTimer&quot; uri=&quot;timer://newOffersTimer?period=${router.timer.offer.period}&quot;/&gt;

&lt;route id=&quot;Send.Start.Timer.NewOfferTransfers&quot; autoStartup=&quot;true&quot; startupOrder=&quot;901&quot;&gt;
  &lt;from uri=&quot;ref:newOffersTimer&quot;/&gt;
  ...
  &lt;setProperty propertyName=&quot;CurTransferType&quot;&gt;
    &lt;simple&gt;${properties:TRANSFER_TYPE_OFFER}&lt;/simple&gt;
  &lt;/setProperty&gt;
  ...
  &lt;to uri=&quot;direct:Send.GetAndProcessNewTransfers&quot;/&gt;
&lt;/route&gt;
</pre>
<p>The route references the timer as a &#8220;from&#8221; endpoint. The timer fires as soon as the route is started and does nothing more than create an (almost) empty <a href="http://camel.apache.org/exchange.html">Camel Exchange</a> object, which is all we want at that moment. Afterward, the route just sets a Camel property and delegates the processing to another route called &#8220;Send.GetAndProcessNewTransfers&#8221;. As you might have guessed already, this allows the route to be re-used several times (more precisely, once for each type of transferable XML document).</p>
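<p>The snippet above mixes two placeholder styles whose setup is not shown. A plausible minimal configuration, assuming a single properties file with the hypothetical name <code>router.properties</code> on the classpath, might look like the following sketch. Presumably the Spring placeholder resolves <code>${router.timer.offer.period}</code> in the <code>&lt;endpoint&gt;</code> definition, while Camel&#8217;s own properties component serves the <code>${properties:...}</code> lookups inside Simple expressions:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- Spring placeholder support (requires the Spring &quot;context&quot; namespace) --&gt;
&lt;context:property-placeholder location=&quot;classpath:router.properties&quot;/&gt;

&lt;camelContext xmlns=&quot;http://camel.apache.org/schema/spring&quot;&gt;
  &lt;!-- Camel properties component, used by ${properties:KEY} in Simple expressions --&gt;
  &lt;propertyPlaceholder id=&quot;properties&quot; location=&quot;classpath:router.properties&quot;/&gt;
  ...
&lt;/camelContext&gt;
</pre>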
<p>Using multiple, separated timer routes brings some benefits:</p>
<ul>
<li>Each timer route can be disabled without affecting the other timer routes. This is very useful when there is a bug in the processing of one specific XML document type and you want to &#8220;pause&#8221; exactly that type. It keeps your log free of repeated retry and failure messages &#8211; not to mention the savings on bandwidth, memory or CPU usage (if these matter at all).</li>
<li>All the timer routes can run concurrently which greatly speeds things up but keeps the routing logic simple, too. Or at least simpler than using one asynchronously firing Timer for all the types together.</li>
</ul>
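<p>To illustrate the pattern, a second timer route for another document type could look like the following sketch. The names <code>newPoliciesTimer</code>, <code>router.timer.policy.period</code> and <code>TRANSFER_TYPE_POLICY</code> are assumptions modeled on the offer route above. Setting <code>autoStartup</code> to <code>false</code> is one way to pause exactly this type while the other timer routes keep running:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;endpoint id=&quot;newPoliciesTimer&quot; uri=&quot;timer://newPoliciesTimer?period=${router.timer.policy.period}&quot;/&gt;

&lt;!-- autoStartup=&quot;false&quot;: this type is paused, all other timer routes are unaffected --&gt;
&lt;route id=&quot;Send.Start.Timer.NewPolicyTransfers&quot; autoStartup=&quot;false&quot; startupOrder=&quot;902&quot;&gt;
  &lt;from uri=&quot;ref:newPoliciesTimer&quot;/&gt;
  &lt;setProperty propertyName=&quot;CurTransferType&quot;&gt;
    &lt;simple&gt;${properties:TRANSFER_TYPE_POLICY}&lt;/simple&gt;
  &lt;/setProperty&gt;
  &lt;!-- same shared route as for offers --&gt;
  &lt;to uri=&quot;direct:Send.GetAndProcessNewTransfers&quot;/&gt;
&lt;/route&gt;
</pre>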
<h4>Fetching new XML Documents from the Transfer Table</h4>
<p>As mentioned before, all the timer routes delegate the processing to another Camel route named &#8220;Send.GetAndProcessNewTransfers&#8221;. Here it is:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;route id=&quot;Send.GetAndProcessNewTransfers&quot;&gt;
  &lt;from uri=&quot;direct:Send.GetAndProcessNewTransfers&quot;/&gt;
  ...

  &lt;!-- check for new transfers --&gt;
  &lt;to uri=&quot;bean:sqlExecutor?method=selectNewTransfers&quot;/&gt;

  &lt;!-- delegate the processing of the body with all the selected transfers --&gt;
  &lt;to uri=&quot;direct:Send.ProcessSelectedTransfers&quot;/&gt;
&lt;/route&gt;
</pre>
<p>The obvious thing to notice is the use of a custom bean named &#8220;sqlExecutor&#8221;. We found this approach to be much easier than working with Camel&#8217;s SQL component &#8211; particularly when using dynamic SQL placeholders. We will come back to this topic later on (in the section &#8220;Struggling with the SQL Component&#8221;), but for now here is the custom bean:</p>
<pre class="brush: java; wrap-lines: false;">
public class SqlExecutor {
  ...

  public List&lt;Map&lt;String, Object&gt;&gt; selectNewTransfers(String body, Exchange exchange) {
    final String transferType = exchange.getProperty(Constants.TRANSFER_TYPE, String.class);
    List&lt;Map&lt;String, Object&gt;&gt; newTransfers =
        jdbcTemplate.queryForList(
            &quot;SELECT * FROM TRANSFER WHERE TRANSFER_TYPE = ? AND STATUS = ?&quot;,
            transferType,
            TransferStatus.NEW.name() );

    if (newTransfers.size() == 0) {
      log.debug(&quot;No '{}' {} transfers - STOP&quot;, TransferStatus.NEW, transferType);
      // STOP the routing by appropriately marking the Exchange (equals &quot;&lt;stop/&gt;&quot; in XML DSL)
      exchange.setProperty(Exchange.ROUTE_STOP, Boolean.TRUE);
      return null;
    }
    log.info(&quot;Found {} '{}' {} transfers&quot;, new Object[]{ newTransfers.size(), TransferStatus.NEW, transferType });
    return newTransfers;
  }
}
</pre>
<p>The code should be fairly straightforward. The bean uses an underlying <a href="http://static.springsource.org/spring/docs/3.2.x/spring-framework-reference/html/jdbc.html">Spring JdbcTemplate</a>, and all new transfers are returned as a List of Maps (with the SQL column names as keys). So far this is the same behavior as the Camel SQL component. But then one of our processing rules is applied: don&#8217;t continue with the routing if there are no new transfers (see the &#8220;if&#8221; condition). In contrast, the SQL component would continue with an empty result, which would have to be checked and handled similarly in the Camel XML DSL.</p>
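<p>For comparison, the equivalent check with the SQL component might be sketched as follows. This is an illustration only, not code from the Cosmo Router. It assumes that the SQL component sets the <code>CamelSqlRowCount</code> header for SELECT statements, that a <code>DataSource</code> is configured on the <code>sql</code> component, and it hard-codes the status for brevity:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;to uri=&quot;sql:SELECT * FROM TRANSFER WHERE STATUS = 'NEW'&quot;/&gt;

&lt;!-- stop the routing if the query returned no rows --&gt;
&lt;choice&gt;
  &lt;when&gt;
    &lt;simple&gt;${header.CamelSqlRowCount} == 0&lt;/simple&gt;
    &lt;stop/&gt;
  &lt;/when&gt;
&lt;/choice&gt;
</pre>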
<h3>Step 2: Preprocessing</h3>
<p>After having all new transfers as a List in the Camel body, the route &#8220;Send.ProcessSelectedTransfers&#8221; is called. Within this route several great Camel features are used: the <a href="http://camel.apache.org/splitter.html">Splitter</a> (see line 8) and the <a href="http://camel.apache.org/validate.html">Validate DSL</a> (see lines 5 and 13-14).</p>
<pre class="brush: xml; highlight: [5,8,13,14]; wrap-lines: false;">
&lt;route id=&quot;Send.ProcessSelectedTransfers&quot;&gt;
  &lt;from uri=&quot;direct:Send.ProcessSelectedTransfers&quot;/&gt;

  &lt;!-- some validations --&gt;
  &lt;validate&gt;&lt;simple&gt;${body} is 'java.util.List'&lt;/simple&gt;&lt;/validate&gt;

  &lt;!-- Split the List with all the Transfers and process each Transfer --&gt;
  &lt;split stopOnException=&quot;true&quot; parallelProcessing=&quot;false&quot;&gt;
    &lt;!-- choose what to split (here: the body which is a List) --&gt;
    &lt;simple&gt;${body}&lt;/simple&gt;

    &lt;!-- again some validations --&gt;
    &lt;validate&gt;&lt;simple&gt;${body} is 'java.util.Map'&lt;/simple&gt;&lt;/validate&gt;
    &lt;validate&gt;&lt;simple&gt;${body[&quot;REQUEST_XML&quot;]} != null&lt;/simple&gt;&lt;/validate&gt;

    &lt;!-- remember some properties for later use --&gt;
    &lt;setProperty propertyName=&quot;TransferId&quot;&gt;
      &lt;simple&gt;${body[&quot;ID&quot;]}&lt;/simple&gt;
    &lt;/setProperty&gt;

    &lt;!-- set body to XML payload --&gt;
    &lt;setBody&gt;
      &lt;simple&gt;${body[&quot;REQUEST_XML&quot;]}&lt;/simple&gt;
    &lt;/setBody&gt;

    &lt;!-- route continues with Step 3 --&gt;
</pre>
<p>Aside from the validations, the first thing to happen is the splitting of the List which the previously described &#8220;sqlExecutor&#8221; returned. Next, the ID is saved to a Camel property allowing the later correlation of the SOAP reply. Finally, the body is set to the payload XML as required by the (upcoming) CXF component.</p>
<p>Note that the Splitter was deliberately configured for sequential processing and for stopping immediately on exceptions (see line 8). We decided against Camel&#8217;s parallel (asynchronous) processing to keep the routing as simple as possible. For example, suppose you have three transfers: A1, B1 and A2, with A2 being an update of A1. Now A1 fails due to a temporary network failure, but B1 and A2 succeed. Furthermore, let us assume that all failures should be retried. Then either A1 must be discarded, or A2 must be resent after the retry of A1 in order to prevent the older A1 from overwriting the newer A2. Discarding A1 might become a problem if the receiving system requires every state of A to be transferred. Resending all dependent states would make the routing logic more complicated. However, if A2 is not sent until A1 has succeeded, no additional routing logic is needed at all.</p>
<p>Another example: A1 and A2 are sent shortly after each other, but A1 is delayed, e.g. by network traffic, and arrives a millisecond <i>after</i> A2. The receiving system would have to recognize the outdated A1 and handle it appropriately. In the end we felt that the price of a more complex failover behavior was too high for the small performance benefit of parallel processing.</p>
<h3>Step 3: Sending via SOAP</h3>
<p>Before the request is sent via SOAP, the body containing the request XML undergoes a final XML schema validation:</p>
<pre class="brush: xml; wrap-lines: false;">
    &lt;!-- validate before sending --&gt;
    &lt;to uri=&quot;validator:classpath:com/mgmtp/OfferTransfer.xsd?useDom=false&quot;/&gt;
</pre>
<p>This validation has the benefit of directly getting meaningful validation errors instead of generic SOAP faults from an external web service which you might have no or very little control over.</p>
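<p>How a validation failure is handled is not shown in this article. One conceivable approach, sketched here under the assumption that an invalid document should not be retried, is an <code>onException</code> clause for Camel&#8217;s <code>ValidationException</code> that records a final result. The result value <code>ERROR</code> is an assumption, and the interplay with the Splitter&#8217;s <code>stopOnException</code> setting is left aside:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- context-scoped: placed inside &lt;camelContext&gt; before the routes --&gt;
&lt;onException&gt;
  &lt;exception&gt;org.apache.camel.ValidationException&lt;/exception&gt;
  &lt;handled&gt;&lt;constant&gt;true&lt;/constant&gt;&lt;/handled&gt;

  &lt;!-- mark the transfer as failed instead of sending it --&gt;
  &lt;setProperty propertyName=&quot;TransferResult&quot;&gt;
    &lt;constant&gt;ERROR&lt;/constant&gt;
  &lt;/setProperty&gt;
  &lt;to uri=&quot;bean:sqlExecutor?method=insertTransferResult( ${property.TransferResult} )&quot;/&gt;
&lt;/onException&gt;
</pre>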
<p>After the validation, the request XML is sent to the external web service with the help of <a href="http://cxf.apache.org">Apache CXF</a> which is a really simple one-liner:</p>
<pre class="brush: xml; wrap-lines: false;">
    &lt;!-- this is where the payload is wrapped in a SOAP envelope and sent --&gt;
    &lt;to uri=&quot;cxf:bean:offerWebService&quot;/&gt;

    &lt;!-- route continues with Step 4 --&gt;
</pre>
<p>Note that we <a href="http://camel.apache.org/cxf.html">configured CXF</a> to build the <a href="http://www.w3schools.com/soap/soap_envelope.asp">SOAP envelope</a> automatically, so we only have to bother with the important things, namely the payload XML. Here is the complete configuration of the CXF bean (belongs outside of the &#8220;&lt;camelContext/&gt;&#8221; element):</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;cxf:cxfEndpoint id=&quot;offerWebService&quot;
                 address=&quot;${offer.webservice.url}&quot;
                 wsdlURL=&quot;com/mgmtp/transfer/Offer.wsdl&quot;
                 serviceName=&quot;wsdlns:Offer&quot; endpointName=&quot;wsdlns:OfferSoap12&quot;
                 xmlns:wsdlns=&quot;http://www.external-ws.com/Offer/v1&quot;&gt;
    &lt;cxf:properties&gt;
      &lt;entry key=&quot;dataFormat&quot; value=&quot;PAYLOAD&quot;/&gt;

      &lt;!-- prevents DOM parsing of huge messages --&gt;
      &lt;entry key=&quot;allowStreaming&quot; value=&quot;true&quot;/&gt;

      &lt;!-- enables CXF Logging Feature which writes inbound and outbound SOAP messages to log --&gt;
      &lt;entry key=&quot;loggingFeatureEnabled&quot; value=&quot;true&quot;/&gt;
    &lt;/cxf:properties&gt;
&lt;/cxf:cxfEndpoint&gt;
</pre>
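<p>Since connection failures and timeouts were a concern, it may be worth noting that the HTTP timeouts of the CXF client can be set in the same Spring file. The following sketch uses CXF&#8217;s HTTP conduit configuration. The wildcard conduit name applies to all CXF clients in the context, and the timeout values (in milliseconds) are arbitrary assumptions:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- requires xmlns:http-conf=&quot;http://cxf.apache.org/transports/http/configuration&quot; --&gt;
&lt;http-conf:conduit name=&quot;*.http-conduit&quot;&gt;
  &lt;!-- fail after 30s without a connection or 60s without a response --&gt;
  &lt;http-conf:client ConnectionTimeout=&quot;30000&quot; ReceiveTimeout=&quot;60000&quot;/&gt;
&lt;/http-conf:conduit&gt;
</pre>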
<h3>Step 4: Result Processing</h3>
<p>The previous CXF call from Step 3 returns the response XML payload from the external web service as Camel body. In our scenario, the response XML document contains a result code from the external web service like &#8220;<code>INTERNAL_ERROR</code>&#8221; or &#8220;<code>SUCCEEDED</code>&#8221;. In the final routing steps, this response XML is evaluated and saved to the database:</p>
<pre class="brush: xml; wrap-lines: false;">
    &lt;!-- evaluate the SOAP result --&gt;
    &lt;to uri=&quot;bean:responseEvaluator?method=evaluateResponse&quot;/&gt;

    &lt;!-- save the response XML --&gt;
    &lt;to uri=&quot;bean:sqlExecutor?method=insertResponseXml( ${body} )&quot;/&gt;

    &lt;!-- other things like notify a service of a new response --&gt;

    &lt;!-- save the final transfer result --&gt;
    &lt;to uri=&quot;bean:sqlExecutor?method=insertTransferResult( ${property.TransferResult} )&quot;/&gt;
  &lt;/split&gt;
&lt;/route&gt;
</pre>
<p>The custom bean &#8220;<code>responseEvaluator</code>&#8221; inspects the response XML and then decides what the ultimate transfer result should be. For example, we could ignore an &#8220;<code>INTERNAL_ERROR</code>&#8221; result and just send the transfer again and again, because it is most likely a bug in the external system which will be fixed sooner or later. Such a strategy would not need manual interaction to resume or resend a failed transfer and thus save effort. But regardless of the evaluation&#8217;s outcome, the <code>responseEvaluator</code> always sets a Camel property named <code>TransferResult</code> which will be used by the <code>sqlExecutor</code> bean.</p>
<p>The &#8220;<code>sqlExecutor</code>&#8221; saves both the response XML and the <code>TransferResult</code> in the database. The saving is once more executed with a <a href="http://static.springsource.org/spring/docs/3.2.x/spring-framework-reference/html/jdbc.html">Spring JdbcTemplate</a>, similar to Step 1:</p>
<pre class="brush: java; wrap-lines: false;">
public class SqlExecutor {
  ... 

  public void insertTransferResult(final String transferResult, final Exchange exchange) {
    final String transferId = exchange.getProperty(Constants.TRANSFER_ID, String.class);
    Validate.notEmpty(transferId, &quot;Camel Property '&quot; + Constants.TRANSFER_ID + &quot;' must not be empty!&quot;);
    Validate.notEmpty(transferResult, &quot;Parameter transferResult must not be empty!&quot;);

    int affectedRows = jdbcTemplate.update(
        &quot;UPDATE TRANSFER SET STATUS = ? WHERE ID = ?&quot;, transferResult, transferId
    );

    Validate.isTrue(affectedRows == 1, &quot;Not more or less than exactly one row should have been updated.&quot;);
  }
}
</pre>
<p>In case you wonder why we decided to use a bean as a response evaluator instead of a <a href="http://camel.apache.org/content-based-router.html">Camel &#8220;choice&#8221; statement</a>: one of the main reasons was the compile-time checking provided by the Java compiler. The WSDL contains an XML schema, and that schema is integrated into the Cosmo Router by using JAXB&#8217;s <a href="http://www.xyzws.com/scdjws/studyguide/jaxb_samples2.0.html">XJC compiler</a> to generate Java classes from it. As a result, XML schema changes such as renamed return codes (which are an enumeration in the XSD) immediately cause compiler errors in the Java code. Another reason was readability: many &#8220;choice/when/otherwise&#8221; evaluations in the Camel XML DSL are more difficult to read than the same logic in Java.</p>
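<p>For comparison, a &#8220;choice&#8221; based evaluation might look like the following sketch. The element name <code>ResultCode</code> and the values written to <code>TransferResult</code> are assumptions for illustration, and the sketch presumes that the response body can be evaluated with XPath. Note that a renamed result code would go unnoticed here until runtime:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;choice&gt;
  &lt;when&gt;
    &lt;xpath&gt;//*[local-name()='ResultCode'] = 'SUCCEEDED'&lt;/xpath&gt;
    &lt;setProperty propertyName=&quot;TransferResult&quot;&gt;&lt;constant&gt;SUCCESS&lt;/constant&gt;&lt;/setProperty&gt;
  &lt;/when&gt;
  &lt;when&gt;
    &lt;!-- keep the transfer in its NEW state so that it is sent again --&gt;
    &lt;xpath&gt;//*[local-name()='ResultCode'] = 'INTERNAL_ERROR'&lt;/xpath&gt;
    &lt;setProperty propertyName=&quot;TransferResult&quot;&gt;&lt;constant&gt;NEW&lt;/constant&gt;&lt;/setProperty&gt;
  &lt;/when&gt;
  &lt;otherwise&gt;
    &lt;setProperty propertyName=&quot;TransferResult&quot;&gt;&lt;constant&gt;ERROR&lt;/constant&gt;&lt;/setProperty&gt;
  &lt;/otherwise&gt;
&lt;/choice&gt;
</pre>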
<p>This concludes the most important steps in the transfer processing. Of course we were only able to define this behavior because we had the luxury that neither the prevention of duplicate transfers nor high-throughput were very important. Nevertheless we achieved reliability, robustness, simplicity and last but not least project delivery on time.</p>
<h2>Struggling with the SQL Component</h2>
<p>As mentioned before in &#8220;Step 1&#8221;, Camel&#8217;s <a href="http://camel.apache.org/sql-component.html">SQL component</a> requires the data for dynamic placeholders to be delivered in the Camel body. Although this may make sense for integrating the SQL component into Camel&#8217;s way of processing, we found the mandatory replacement of the body cumbersome. It is fine if you don&#8217;t have a Camel body yet, e.g. when checking the database for new transfers. But it can become a headache when trying to save the response.</p>
<p>Imagine your Camel body contains a SOAP response and you want to save it together with a final result code like &#8220;SUCCESS&#8221; or &#8220;ERROR&#8221; in the database. You could use a Camel SQL statement like this:</p>
<pre class="brush: plain; light: true;">
UPDATE TRANSFER
  SET RESPONSE_XML = #, RESULT_CODE = #
  WHERE ID = #
</pre>
<p>But for <em>multiple</em> placeholders, the Camel SQL component requires the body to be an iterable array. Thus you need to convert the body with the SOAP response into an iterable array, supplemented with the result code and the ID to update.</p>
<p>Surprisingly, we found no way to achieve this directly with the standard Camel XML DSL. So one idea was to write a custom <a href="http://camel.apache.org/type-converter.html">Camel Type Converter</a> which would parse a semicolon-separated body and split it into an iterable array:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;route&gt;
  ...

  &lt;setBody&gt;
    &lt;simple&gt;&quot;${body}&quot; ; &quot;${property.TransferResult}&quot; ; &quot;${property.TransferId}&quot;&lt;/simple&gt;
  &lt;/setBody&gt;
  &lt;convertBodyTo type=&quot;com.mgmtp.IterableList&quot;/&gt;
  &lt;to uri=&quot;sql:UPDATE TRANSFER SET RESPONSE_XML = #, RESULT_CODE = # WHERE ID = #&quot;/&gt;

  ...
&lt;/route&gt;
</pre>
<p>Although this idea would work, it is of course more difficult to read and introduces new logic to test and new potential failure points like empty strings, null values, exceeded string lengths or unescaped quotation marks. Additionally, accessing the SOAP response once more after saving it is difficult, too. Not only has the body been changed to an iterable array, but the SQL component also &#8220;overwrites&#8221; the body with the number of updated rows! You could of course save the body to a Camel property before the SQL statement is executed. Or you could use <a href="http://camel.apache.org/multicast.html">Camel&#8217;s Multicast DSL</a> to clone the original body with the SOAP response. But why do all this bending in the first place if you just want to execute a simple dynamic SQL statement with multiple placeholders? Well, our answer was simple: avoid the SQL component and just use a custom bean like the previously described &#8220;<code>sqlExecutor</code>&#8221; bean.</p>
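<p>For completeness, the first workaround mentioned above (parking the SOAP response in a Camel property) could be sketched like this. The property name <code>SoapResponse</code> is hypothetical, and the conversion of the body into an iterable value is left out:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;!-- remember the SOAP response before the body is replaced --&gt;
&lt;setProperty propertyName=&quot;SoapResponse&quot;&gt;
  &lt;simple&gt;${body}&lt;/simple&gt;
&lt;/setProperty&gt;

...
&lt;to uri=&quot;sql:UPDATE TRANSFER SET RESPONSE_XML = #, RESULT_CODE = # WHERE ID = #&quot;/&gt;

&lt;!-- the body now holds the update count, so restore the SOAP response --&gt;
&lt;setBody&gt;
  &lt;simple&gt;${property.SoapResponse}&lt;/simple&gt;
&lt;/setBody&gt;
</pre>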
<h2>Fighting Doubts with Prototypes</h2>
<p>The previously described architecture did not appear out of thin air but underwent several iterations and prototypes. We especially wanted to gain early experience in some crucial areas, such as using <a href="http://cxf.apache.org">Apache CXF</a> as the <a href="http://en.wikipedia.org/wiki/Java_API_for_XML_Web_Services">JAX-WS</a> implementation (we previously used <a href="http://metro.java.net">Metro</a>) and evaluating how well it integrates into <a href="http://camel.apache.org">Apache Camel</a>.</p>
<p>Through early prototypes we quickly gained enough confidence in the chosen architecture. Additionally, we were able to start at once with the definition of most failure scenarios and consequently the optimal failover behavior we wanted the Cosmo Router to have. In fact, every time an unexpected failure occurred it was instantly checked against all previously defined failure scenarios and added, if necessary. The corresponding failover logic was implemented at once, too. Following this agile and pragmatic approach we soon achieved a robust messaging router which became better and better the more the implementation and testing went on.</p>
<p>If you are curious how we defined our failure scenarios and the required failover behavior of the Cosmo Router, then watch out for the next part of this blog series.</p>
<h2>Experiences with ActiveMQ</h2>
<p>Our first architectural draft was based on the idea to use <a href="http://activemq.apache.org">ActiveMQ</a> for exchanging messages between the Cosmo Web Application and the Cosmo Router. The idea was that the &#8220;TransferCreator&#8221; would insert transferable XML documents as JMS messages into the &#8220;Process Queue&#8221;. Thus we drafted the architecture shown in the following diagram:</p>
<div id="attachment_1517" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-camelrouter1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-2-Cosmo-Router-Experimental-Architecture-with-ActiveMQ-Variant-1-e1363879050102.png"><img class="size-large wp-image-1517" src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-2-Cosmo-Router-Experimental-Architecture-with-ActiveMQ-Variant-1-480x177.png" alt="" width="480" height="177" /></a><p class="wp-caption-text">Initial experimental architecture with ActiveMQ.</p></div>
<p>We built some prototypes around this idea but soon stumbled upon several issues. For example, we wanted both the Cosmo Router and the Cosmo Web Application to support non-availability. However, that would have been problematic in the above described architecture because the TransferCreator is attached to a user request. This user request should not have to wait if the Cosmo Router&#8217;s &#8220;Process Queue&#8221; is not available for several minutes or even hours.</p>
<p>So we came up with the solution of having two queues that communicate with each other: one in the Cosmo Web Application and one in the Cosmo Router. In ActiveMQ terminology this is called a &#8220;<a href="http://activemq.apache.org/how-do-distributed-queues-work.html#Howdodistributedqueueswork-Storeandforwardnetworksofbrokers">store and forward network of brokers</a>&#8221;:</p>
<div id="attachment_1518" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-camelrouter1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-3-Cosmo-Router-Experimental-Architecture-with-ActiveMQ-Variant-2.png"><img class="size-large wp-image-1518" src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-3-Cosmo-Router-Experimental-Architecture-with-ActiveMQ-Variant-2-480x247.png" alt="" width="480" height="247" /></a><p class="wp-caption-text">Improved experimental architecture with ActiveMQ using a &quot;store and forward network of brokers&quot; approach.</p></div>
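<p>In ActiveMQ&#8217;s XML configuration, such a store and forward setup is expressed with a network connector. The following sketch shows what the broker on the web application side might look like. The broker name, host name and ports are placeholders, and this is not the configuration from the prototype:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;broker xmlns=&quot;http://activemq.apache.org/schema/core&quot; brokerName=&quot;webAppBroker&quot; persistent=&quot;true&quot;&gt;
  &lt;!-- forwards locally stored messages to the Cosmo Router's broker --&gt;
  &lt;networkConnectors&gt;
    &lt;networkConnector name=&quot;toRouterBroker&quot; uri=&quot;static:(tcp://router-host:61616)&quot;/&gt;
  &lt;/networkConnectors&gt;

  &lt;!-- connector used by the web application's own producers --&gt;
  &lt;transportConnectors&gt;
    &lt;transportConnector name=&quot;local&quot; uri=&quot;tcp://localhost:61617&quot;/&gt;
  &lt;/transportConnectors&gt;
&lt;/broker&gt;
</pre>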
<p>But again we had concerns with this idea, as explained in the next section.</p>
<h2>Why we didn&#8217;t choose ActiveMQ</h2>
<p>Please consider once more our final architecture which favored a simple database over ActiveMQ:</p>
<div id="attachment_1516" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-camelrouter1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-1-Cosmo-Router-Final-Architecture-e1363879021957.png"><img class="size-large wp-image-1516" src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Fig-1-Cosmo-Router-Final-Architecture-479x211.png" alt="" width="479" height="211" /></a><p class="wp-caption-text">Our final architecture for comparison as previously described at the beginning of the article.</p></div>
<p>We saw several benefits in using the transfer table approach as depicted above instead of ActiveMQ:</p>
<ul>
<li>We didn&#8217;t require additional time to gain sufficient ActiveMQ knowledge. Although we were very interested in ActiveMQ, the project timeline was very tight and time was of the essence.</li>
<li>We didn&#8217;t have to introduce yet another framework. This was particularly important because the development process was Scrum, and by process definition everyone should be able to implement everything everywhere. Consequently, the more frameworks are used, the steeper the learning curve becomes.</li>
<li>We didn&#8217;t have to worry about transaction support. Unfortunately we didn&#8217;t achieve quick and solid results with transaction support. For example, it remained unclear how difficult it would be to integrate both ActiveMQ brokers into transactions while also keeping the Cosmo Router a lightweight stand-alone application.</li>
</ul>
<p>Thus we finally decided that a relational database would be entirely sufficient for our requirements and much easier to set up, maintain and understand, although ActiveMQ might well be a reasonable choice in more complex scenarios.</p>
<h2>Conclusion</h2>
<p>In this article we described how Camel can be used to transfer XML documents between SOAP web services. We have shown that the use of a transfer database and the use of synchronous transfer processing allowed us to keep the routing relatively simple and easy to understand. Yet different types can be transferred concurrently which increases the overall throughput and sufficiently reduces bottlenecks. Furthermore the Cosmo Web Application can create and dispatch transfers asynchronously and therefore will not block its users from working with it.</p>
<p>All this is achieved in a transactional way, meaning that as long as a transfer is not completely processed, it stays in a &#8220;<code>NEW</code>&#8221; state and is therefore sent again and again until it succeeds. This adaptable retry behavior not only greatly reduces the need for manual investigation and interaction; it also has the benefit that the Cosmo Router can resume from any state, even after a sudden cold restart.</p>
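<p>The retry semantics described here can be illustrated with a minimal in-memory sketch. It is not the Cosmo Router implementation: the class, the <code>DONE</code> state and the sender callback are hypothetical names chosen for illustration, and a plain list stands in for the transfer table and its database transaction. The only point it makes is that a transfer leaves <code>NEW</code> solely after a successful send, so any failure leaves it in place for the next attempt.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for the transfer table (assumption, not the real schema).
final class TransferRetrySketch {

    // Only NEW is named in the article; DONE is an assumed name for the final state.
    enum State { NEW, DONE }

    static final class Transfer {
        final String payload;
        State state = State.NEW;

        Transfer(String payload) {
            this.payload = payload;
        }
    }

    /** Builds a table in which every transfer starts in state NEW. */
    static List<Transfer> tableOf(String... payloads) {
        List<Transfer> table = new ArrayList<>();
        for (String payload : payloads) {
            table.add(new Transfer(payload));
        }
        return table;
    }

    /**
     * One polling pass: every transfer still in NEW is handed to the sender.
     * The state changes only after the sender reports success, so a failure
     * leaves the transfer in NEW and it is picked up again by the next pass.
     * Returns the number of transfers completed in this pass.
     */
    static int processPending(List<Transfer> table, Predicate<String> sender) {
        int completed = 0;
        for (Transfer transfer : table) {
            if (transfer.state != State.NEW) {
                continue;
            }
            boolean sent;
            try {
                sent = sender.test(transfer.payload);
            } catch (RuntimeException e) {
                sent = false; // treat any delivery error as "try again later"
            }
            if (sent) {
                transfer.state = State.DONE;
                completed++;
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        List<Transfer> table = tableOf("document-1", "document-2");
        // Target system unavailable: nothing completes, everything stays NEW.
        System.out.println("completed: " + processPending(table, payload -> false));
        // Target system back again: the same transfers are retried and complete.
        System.out.println("completed: " + processPending(table, payload -> true));
    }
}
```

<p>Because the state is the only thing that decides whether a transfer is picked up, a restarted process can simply run the same pass again without any additional recovery logic.</p>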
<p>Last but not least, our pragmatic approach enabled us to deliver the application integration of <a href="http://www.mgm-tp.com/branchen/versicherungen/loesungen/mgm-cosmo">mgm Cosmo</a> with the external CRM system on time. Moreover, the asynchronous transfer creation and the highly automated retry strategies saved us effort when going live. We only required a short downtime and needed no further interaction when several XML document transfers kept failing for a whole week.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2013/04/camel-router-part1/">Designing and Implementing our Camel-based mgm Cosmo Router - Robust and Fail-Safe Message Routing with Apache Camel, Part 1</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=30" title="View articles by Michael Frieß">Michael Frieß</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2013/04/camel-router-part1/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<series:name><![CDATA[Robust and Fail-Safe Message Routing with Apache Camel]]></series:name>
	</item>
		<item>
		<title>Tuning Garbage Collection for Mission-Critical Java Applications</title>
		<link>http://blog.mgm-tp.com/2013/03/garbage-collection-tuning/</link>
		<comments>http://blog.mgm-tp.com/2013/03/garbage-collection-tuning/#comments</comments>
		<pubDate>Wed, 27 Mar 2013 11:18:00 +0000</pubDate>
		<dc:creator>Dr. Andreas Müller</dc:creator>
				<category><![CDATA[Tips]]></category>
		<category><![CDATA[ECommerce]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[JVM]]></category>
		<category><![CDATA[Performance]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1525</guid>
		<description><![CDATA[I recently had the opportunity to test and tune the performance of several shop and portal applications built with Java and running on the Sun/Oracle JVM, among them some of the most visited in Germany. In many cases garbage collection is a key aspect of Java server performance. In the following article we take a [...]]]></description>
			<content:encoded><![CDATA[<p>I recently had the opportunity to test and tune the performance of several shop and portal applications built with Java and running on the Sun/Oracle JVM, among them some of the most visited in Germany. In many cases garbage collection is a key aspect of Java server performance. In the following article we take a look at the state-of-the-art advanced GC algorithms and important tuning options and compare them for diverse real-world scenarios.</p>
<p><span id="more-1525"></span></p>
<p>From the point of view of garbage collection, Java server applications have widely varying requirements:</p>
<ol>
<li>Some are high-traffic applications serving a huge amount of requests and creating a huge amount of objects. Sometimes, moderate-traffic applications using wasteful software frameworks do the same thing. Anyway, cleaning up these objects in an efficient way is a challenge for the garbage collector.</li>
<li>Others have extremely long uptimes and require a constant quality of service during that uptime without slow degradation or the risk of sudden deterioration.</li>
<li>Some place tight limits on their user response times (as in the online gaming or betting area) which do not leave much room for extended GC pauses.</li>
</ol>
<p>In many cases you will find a combination of several of these requirements with different priorities. Several of my sample shops and portals were very demanding with respect to point 1, one put extreme priority on point 2 but most applications are not extremely demanding in all of the three aspects at the same time. This leaves you the necessary room to choose the right tradeoffs. </p>
<h2>Out-of-the-Box GC Performance</h2>
<p>JVMs have improved a lot but still cannot do your job of optimizing the runtime for your application. Default JVM settings have a fourth priority in mind in addition to the 3 mentioned above: minimizing the memory footprint. They need to support millions of users who do not run Java on a server with plenty of memory. This is even true for many e-business products which are most of the time preconfigured to run on developer notebooks instead of production servers. As a consequence, if you run your server with a minimal set of heap and GC parameters like the following</p>
<pre class="brush: plain; light: true;">
java -Xmx1024m -XX:MaxPermSize=256m -cp Portal.jar my.portal.Portal
</pre>
<p>you will almost certainly obtain results which are not good enough for efficient server operation. In the first step, it is good practice to configure not only memory limits but also initial sizes to avoid costly step-by-step increases during server startup. Whenever you know how much memory is enough for your server (which you should try to find out in time) it is best to make initial sizes and limits equal by adding </p>
<pre class="brush: plain; light: true;">
-Xms1024m -XX:PermSize=256m
</pre>
<p>The last basic option frequently found in JVM configurations is a similar setting for the size of the so-called <em>New generation</em> heap:</p>
<pre class="brush: plain; light: true;">
-XX:NewSize=200m -XX:MaxNewSize=200m
</pre>
<p>These and other more sophisticated settings are explained in the next sections but let&#8217;s first look how the garbage collector works with them in a load test for one of our portal samples on a rather slow test server:</p>
<div id="attachment_1537" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-1.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-1-480x221.png" alt="" title="Figure 1" width="480" height="221" class="size-large wp-image-1537" /></a><p class="wp-caption-text">1. GC behavior of a JVM with little heap tuning (-Xms1024m -Xmx1024m -XX:NewSize=200m -XX:MaxNewSize=200m) over a period of about 25 hours (Click to enlarge).</p></div>
<p>The blue curve shows the occupied total heap as a function of time, vertical grey lines show the duration of GC pauses. </p>
<p>In addition to these graphs, key indicators of GC operation and performance are shown on the right-hand side. First we have a look at the average amount of garbage created (and collected) in this test run. The value of 30.5 MB/s is marked in yellow because it is a considerable but still moderate garbage creation rate, just about right for an introductory GC tuning example. Other values indicate how well the JVM copes with cleaning up that amount of garbage: 99.55% of that garbage is cleaned up in the <em>New generation</em> and only 0.45% in the <em>Old generation</em> which is rather good and therefore marked green. </p>
<p>Why this is good can be seen from the pauses the GC activity imposes on the JVM (and all the worker threads executing user requests): There are numerous and rather short <em>New generation</em> GC pauses. They occurred on average every 6 seconds and lasted less than 50 milliseconds. Such pauses stopped the JVM during 0.77% of wall time but any single pause is unnoticeable to the users waiting for the server&#8217;s response.</p>
<p>On the other hand, <em>Old generation</em> GC pauses stop the JVM during only 0.19% of the time. But during that time they clean up only 0.45% of the garbage, while 99.55% is cleaned up during the 0.77% of New generation pause time. This shows how extremely inefficient Old generation garbage collection is compared to New generation GC. In addition, Old generation pauses occurred less than once per hour on average but lasted almost 8 seconds on average, with a single outlier even reaching 19 seconds. As these are true pauses for all the JVM&#8217;s threads processing user requests, they should be as infrequent and short as possible.</p>
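<p>To put a number on this inefficiency, the following small sketch divides the share of garbage collected by the share of wall time spent paused, using only the percentages quoted above. The class and method names are made up for this illustration.</p>

```java
// Worked arithmetic for the figures quoted in the text (names are illustrative only).
final class GcEfficiency {

    /** Percent of all garbage collected per percent of wall time spent in pauses. */
    static double garbagePerPausePercent(double garbageSharePercent, double pauseSharePercent) {
        return garbageSharePercent / pauseSharePercent;
    }

    public static void main(String[] args) {
        double newGen = garbagePerPausePercent(99.55, 0.77);
        double oldGen = garbagePerPausePercent(0.45, 0.19);
        System.out.printf("New generation: %.1f%% of garbage per 1%% pause time%n", newGen);
        System.out.printf("Old generation: %.1f%% of garbage per 1%% pause time%n", oldGen);
        System.out.printf("New generation GC is about %.0f times more efficient%n", newGen / oldGen);
    }
}
```

<p>With these inputs, New generation collection removes roughly 50 times more garbage per unit of pause time than Old generation collection.</p>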
<p>From these observations follows the basic tuning goal for generational garbage collection:</p>
<p><em>Collect as much garbage as possible already in New generation and make Old generation pauses as infrequent and short as possible</em>.</p>
<h2>Basic Ideas of Generational Garbage Collection and Heap Sizing</h2>
<p>Start from what you see in a JDK tool like <a href="http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstat.html">jstat</a> or <a href="http://docs.oracle.com/javase/7/docs/technotes/tools/share/jvisualvm.html">jvisualvm</a> and its <a href="http://www.oracle.com/technetwork/java/visualgc-136680.html">visualgc plugin</a>:</p>
<div id="attachment_1527" class="wp-caption alignnone" style="width: 419px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-2.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-2.png" alt="" title="Figure 2" width="409" height="374" class="size-full wp-image-1527" /></a><p class="wp-caption-text">2. The JVM heap structure including the sub-segments of the New generation (right column).</p></div>
<p>The Java heap is made up of the Perm, Old and New (sometimes called Young) generations. The New generation is further made up of <em>Eden space</em> where objects are created and <em>Survivor spaces</em> S0 and S1 where they are kept later for a limited number of New generation garbage collection cycles. If you want more details, you might want to read Sun/Oracle&#8217;s whitepaper <a href="http://www.oracle.com/technetwork/java/javase/memorymanagement-whitepaper-150215.pdf">&#8220;Memory Management in the Java HotSpot Virtual Machine&#8221;</a>.</p>
<p>By default the New generation as a whole, and the survivor spaces in particular, are too small to hold objects long enough until most of them are no longer needed and can be collected. Therefore, they are moved to the Old generation prematurely which will then fill up too fast and need to be cleaned up frequently which causes relatively many of the Full GC stops visible in figure 1 above.</p>
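<p>The same heap structure can also be inspected from inside a running JVM through the standard <code>java.lang.management</code> API. The following is a minimal sketch; the pool names it prints depend on the collector and JVM version in use, so they are not hard-coded here.</p>

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Lists the memory pools (Eden, Survivor, Old, ...) that the running JVM reports.
final class HeapPools {

    private static final long MB = 1024L * 1024L;

    /** One line per memory pool: name, type, used size and maximum size. */
    static List<String> poolSummaries() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // -1 means the maximum is undefined
            String maxText = (max < 0) ? "undefined" : (max / MB) + " MB";
            lines.add(pool.getName() + " [" + pool.getType() + "]: used "
                    + (usage.getUsed() / MB) + " MB, max " + maxText);
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : poolSummaries()) {
            System.out.println(line);
        }
    }
}
```

<p>Running this with different GC settings is a quick way to verify that the sizes configured on the command line are the ones the JVM actually uses.</p>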
<h2>Tuning the Generation Sizes</h2>
<p>Tuning generational GC means making the New generation as a whole and in particular the survivor spaces larger than they are out-of-the-box. But to do this you also have to consider the GC algorithm used:</p>
<p>The default GC algorithm of a Sun/Oracle JVM running on today&#8217;s hardware is called ParallelGC and if it were not the default it could be configured explicitly using the JVM parameter </p>
<pre class="brush: plain; light: true;">
-XX:+UseParallelGC
</pre>
<p>This algorithm by default does not work with fixed sizes for Eden and the survivor spaces but uses a policy called &#8220;AdaptiveSizePolicy&#8221;, which is an adjustment-controlled automatic sizing strategy. As described above, it delivers reasonable behavior for many scenarios including non-server usage but it is not optimal for server operation. To switch it off and start setting your survivor sizes explicitly to fixed values use the following JVM configuration switch:</p>
<pre class="brush: plain; light: true;">
-XX:-UseAdaptiveSizePolicy
</pre>
<p>Once this has been done, we can not only further increase the New generation but also effectively set the survivor sizes to a suitable value:</p>
<pre class="brush: plain; light: true;">
-XX:NewSize=400m -XX:MaxNewSize=400m -XX:SurvivorRatio=6
</pre>
<p>&#8220;<code>SurvivorRatio=6</code>&#8221; means that each survivor space is 1/6 of Eden size or 1/8 of total New generation size, which in this case means 50 MB while adaptive sizing usually works with much smaller sizes in the range of only a few MB. By repeating the same load test as above with these settings we got the following result:</p>
<div id="attachment_1538" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-3.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-3-480x221.png" alt="" title="Figure 3" width="480" height="221" class="size-large wp-image-1538" /></a><p class="wp-caption-text">3. GC behavior of a JVM with tuned heap sizes (-Xms1024m -Xmx1024m -XX:NewSize=400m -XX:MaxNewSize=400m -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=6) over a period of 50 hours.</p></div>
<p>Note that during this test run of doubled duration there was on average almost the same garbage creation rate as before (30.2 compared to 30.5 MB/s). Nevertheless, there were only two Old generation (Full) GC pauses, no more than one in 25 hours. This was achieved by decreasing the rate of garbage ending up in the Old generation (the so-called promotion rate) from 137 kB/s to 6 kB/s or only 0.02% of all garbage. At the same time New generation GC pause duration increased only slightly from an average of 48 to 57 milliseconds and the average interval between pauses rose from 6 to 10 seconds. Altogether, switching off adaptive sizing and fine tuning the heap sizes decreased GC pause time from 0.95% to 0.59% of elapsed time which is an excellent result.</p>
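<p>The promotion shares mentioned above follow directly from the two rates. The sketch below redoes the arithmetic with the figures from the text, assuming 1 MB = 1000 kB (an assumption on my side; the class and method names are illustrative only).</p>

```java
// Worked arithmetic: which share of all garbage ends up in the Old generation?
final class PromotionShare {

    /** Promotion rate as a percentage of the garbage creation rate (assumes 1 MB = 1000 kB). */
    static double promotionSharePercent(double promotionKbPerSec, double garbageMbPerSec) {
        return 100.0 * promotionKbPerSec / (garbageMbPerSec * 1000.0);
    }

    public static void main(String[] args) {
        // Little tuned heap (figure 1): 137 kB/s promoted at 30.5 MB/s garbage creation.
        System.out.printf("before tuning: %.2f%%%n", promotionSharePercent(137, 30.5));
        // Tuned generation sizes (figure 3): 6 kB/s promoted at 30.2 MB/s garbage creation.
        System.out.printf("after tuning:  %.2f%%%n", promotionSharePercent(6, 30.2));
    }
}
```

<p>The results reproduce the 0.45% and 0.02% quoted in the text, which shows that the improvement comes entirely from the lower promotion rate and not from a lower load.</p>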
<p>Similar results after tuning can be obtained with the <em>ParNew</em> algorithm as an alternative to the default <em>ParallelGC</em>. It was developed for compatibility with the <em>CMS</em> algorithm mentioned below and can be configured by <code>-XX:+UseParNewGC</code>. It does not use adaptive sizing but works with fixed values for the survivor sizes. Therefore and with the default of <code>SurvivorRatio=8</code> it usually delivers much better out-of-the-box results for server usage than the <em>ParallelGC</em>. </p>
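<p>How a given <code>SurvivorRatio</code> translates into sizes can be worked out with a few lines of code. This is a simplified sketch that ignores any alignment or rounding the JVM applies internally; the names are chosen for illustration.</p>

```java
// Simplified sizing arithmetic: New generation = Eden + 2 survivor spaces,
// and SurvivorRatio = Eden size / size of one survivor space.
final class SurvivorSizing {

    /** Size of one survivor space in MB. */
    static long survivorMb(long newSizeMb, int survivorRatio) {
        return newSizeMb / (survivorRatio + 2);
    }

    /** Size of Eden in MB. */
    static long edenMb(long newSizeMb, int survivorRatio) {
        return newSizeMb - 2 * survivorMb(newSizeMb, survivorRatio);
    }

    public static void main(String[] args) {
        // Settings used in this article: 400 MB New generation, SurvivorRatio=6.
        System.out.println("ratio 6: survivor " + survivorMb(400, 6) + " MB, Eden " + edenMb(400, 6) + " MB");
        // Same New generation with the ParNew default of SurvivorRatio=8.
        System.out.println("ratio 8: survivor " + survivorMb(400, 8) + " MB, Eden " + edenMb(400, 8) + " MB");
    }
}
```

<p>With 400 MB and a ratio of 6 this yields the 50 MB per survivor space mentioned above and 300 MB for Eden; with the default ratio of 8 each survivor space would be 40 MB.</p>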
<h2>Getting rid of long Old Generation GC Pauses</h2>
<p>The only remaining problem with the latest result above is the long Old generation (Full) GC pauses of about 8 seconds on average. These pauses have been made rare by proper generation tuning, but when they occur they are still a nuisance to users because the JVM does not execute worker threads while they last (stop-the-world GC). In our case, these 8 seconds are caused by an old and slow test server and could be up to a factor of 3 shorter on modern hardware. On the other hand, today&#8217;s applications typically also use larger heaps than 1 GB and have larger amounts of live objects in the heap than in this example. Web applications nowadays work with heaps up to 64 GB and (at least temporarily) need half of that for their live objects. In such cases, 8 seconds is short for Old generation pauses. They can easily come close to one minute which is totally unacceptable for an interactive web application.</p>
<p>One option to alleviate the problem is the use of parallel processing for Old generation GC. By default, the <em>ParallelGC</em> and <em>ParNew</em> GC algorithms in Java 6 used multiple GC threads only for young generation collections while Old generation collections were single-threaded. In the case of the <em>ParallelGC</em> collector this can be changed by adding </p>
<pre class="brush: plain; light: true;">
-XX:+UseParallelOldGC
</pre>
<p>Since Java 7 this option is activated by default together with the <code>-XX:+UseParallelGC</code>. However, even with 4 or 8 cpu cores in your system you should not expect much more than an improvement by a factor of 2, often less. In some cases, as in our 8 seconds example above, this can be a welcome improvement but in other more extreme cases it is not enough. The solution is to use low-latency GC algorithms.</p>
<h2>The Concurrent Mark and Sweep (CMS) Collector</h2>
<p>The <em>CMS</em> garbage collector is the first and most-widely used low-latency collector. It has been available since Java 1.4.2 but suffered from instability issues in the beginning. Solving them required quite a few Java 5 releases. </p>
<p>As indicated by its name the CMS collector uses a concurrent approach where most of the work is done by a GC thread that runs concurrently with the worker threads processing user requests. A single normal Old generation stop-the-world GC run is split up into two much shorter stop-the-world pauses plus 5 concurrent phases where worker threads are allowed to go on with their work. Find a more detailed description of the CMS in the article <a href="http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#cms">&#8220;Java SE 6 HotSpot Virtual Machine Garbage Collection Tuning&#8221;</a>. </p>
<p>The CMS collector is activated by </p>
<pre class="brush: plain; light: true;">
-XX:+UseConcMarkSweepGC
</pre>
<p>Applying this to our sample application from above (under higher load than before) led to the following result:</p>
<div id="attachment_1539" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-4.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-4-480x219.png" alt="" title="Figure 4" width="480" height="219" class="size-large wp-image-1539" /></a><p class="wp-caption-text">4. GC behavior of a JVM with tuned heap sizes and CMS (-Xms1200m -Xmx1200m -XX:NewSize=400m -XX:MaxNewSize=400m -XX:SurvivorRatio=6 -XX:+UseConcMarkSweepGC) over a period of 50 hours.</p></div>
<p>It is visible that the Old generation pauses in the 8 seconds range are now gone. For each Old generation collection (in our case 5 of them in 50 hours) there are now two pauses and all of them are below 1 second. </p>
<p>By default, the CMS collector uses the ParNew collector to execute the New generation collections. If the ParNew collector runs together with the CMS its pauses tend to be a bit longer than when it runs without it because their cooperation requires some extra effort. In addition to the slightly higher average New generation pause times compared to the previous results, this can be seen from the frequent outliers in New generation pause times which reach up to 0.5 seconds in the test run shown. But they are all short enough to make the CMS/ParNew collector pair a good low-latency option for many applications.</p>
<p>A more important disadvantage of the CMS collector is related to the fact that it cannot be started when the Old generation heap is full. Once the Old generation is full, it is too late for the CMS and it must then fall back to the usual stop-the-world strategy (announced by a &#8220;concurrent mode failure&#8221; in the GC log). To reach its low-latency goal the CMS is started whenever Old generation occupation reaches a threshold set by</p>
<pre class="brush: plain; light: true;">
-XX:CMSInitiatingOccupancyFraction=80
</pre>
<p>The CMS is started once 80% of the Old generation is occupied. For our application this reasonable value (which at the same time is also the default) worked well, but if the threshold is set too high a concurrent mode failure can bring back the long Old generation GC pauses at any time. If on the other hand it is set too low (below the size of the live part of the heap) the CMS might run concurrently all the time and thus consume the processing power of one CPU entirely. If an application experiences abrupt changes in its object creation and heap usage behavior, e.g. by the start of specialized tasks either interactively or by a timed trigger, it can be hard to set this threshold right to avoid both risks at all times.</p>
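<p>To get a feeling for what such a threshold means in absolute numbers, the following sketch computes the trigger point for the heap settings of figure 4 (1200 MB total heap, 400 MB New generation). It assumes that the Old generation is simply the total heap minus the New generation and it ignores any additional heuristics the JVM may apply; the names are illustrative.</p>

```java
// Worked arithmetic for the CMS start threshold (simplified, names are illustrative).
final class CmsThreshold {

    /** Old generation size when -Xmx covers the New and the Old generation. */
    static long oldGenMb(long maxHeapMb, long newSizeMb) {
        return maxHeapMb - newSizeMb;
    }

    /** Old generation occupancy in MB at which a concurrent collection is started. */
    static long triggerMb(long oldGenMb, int occupancyFractionPercent) {
        return oldGenMb * occupancyFractionPercent / 100;
    }

    public static void main(String[] args) {
        long old = oldGenMb(1200, 400);
        long trigger = triggerMb(old, 80);
        System.out.println("Old generation:        " + old + " MB");
        System.out.println("CMS starts at:         " + trigger + " MB");
        System.out.println("Headroom while it runs: " + (old - trigger) + " MB");
    }
}
```

<p>The headroom is the space that remains for objects promoted while the concurrent collection is still running. If promotion fills it before the CMS has finished, the concurrent mode failure described above occurs.</p>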
<h2>The Specter of Fragmentation</h2>
<p>The biggest disadvantage of the CMS, however, is related to the fact that it does not compact the Old generation heap. It therefore carries the risk of heap fragmentation and severe operations degradation over time. Two factors increase this risk: a tight Old generation heap and frequent CMS runs. The first factor can be improved by making the Old generation heap larger than what would be needed with the ParallelGC collector (which I did from 1024 to 1200 MB as can be seen in the previous figures). The second factor can be improved by proper generation sizing as described above. We actually saw how infrequent Old generation GC can be made by it. To demonstrate how essential it is to fine tune the generation sizes before switching to the CMS let&#8217;s have a look at what might happen if we do not follow this rule and apply the CMS directly to the little tuned heap of figure 1:</p>
<div id="attachment_1540" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-5.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-5-480x225.png" alt="" title="Figure 5" width="480" height="225" class="size-large wp-image-1540" /></a><p class="wp-caption-text">5. GC behavior and sudden degradation by fragmentation when the CMS is applied to the poorly tuned heap of figure 1 (GC indicators on the right from the first 14 hours only).</p></div>
<p>The figure shows that with these settings the JVM worked well for almost 14 hours under load test conditions (in production and with lower load this treacherously benign period may last much longer). Then suddenly there were very long GC pauses which actually stopped the JVM for about half of the remaining time. Not only did the attempts to clean up the mess in the Old generation last more than 10 seconds, but even New generation GC pauses were in the seconds range because the collector spent a lot of time searching for space in the Old generation when it tried to promote objects from the New to the Old generation.</p>
<p>The fragmentation risk is the price to pay for the low-latency advantage of the CMS. This risk can be minimized but it is always there and it is hard to predict when it will strike. With proper GC tuning and monitoring, however, the risk can be managed. </p>
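<p>A very basic form of such monitoring is possible with the standard <code>GarbageCollectorMXBean</code> API. The sketch below relates the accumulated collection time to the JVM uptime. Note that the reported times are approximate and, depending on the collector, are not necessarily pure pause times, so the result should be read as a rough indicator and not as a replacement for GC log analysis. The class name and the way the overhead is computed are my own choices for this illustration.</p>

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough GC overhead indicator based on the standard management beans.
final class GcOverheadMonitor {

    /** Accumulated collection time of all collectors in ms (undefined values are skipped). */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 if undefined for this collector
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** GC time as a percentage of elapsed time. */
    static double overheadPercent(long gcTimeMs, long uptimeMs) {
        if (uptimeMs <= 0) {
            return 0.0;
        }
        return 100.0 * gcTimeMs / uptimeMs;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.2f%% of elapsed time%n",
                overheadPercent(totalGcTimeMs(), uptimeMs));
    }
}
```

<p>Sampling these values periodically and alerting on a sudden rise of the overhead is one way to notice a degradation like the one in figure 5 early.</p>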
<h2>The Promise of the Garbage First (G1) Collector</h2>
<p>The G1 collector was designed to achieve low-latency behavior without the risk of heap fragmentation. As such, it is <a href="http://www.oracle.com/technetwork/java/javase/tech/g1-intro-jsp-135488.html">announced as a long-term replacement</a> for the CMS collector by Oracle. G1 avoids the fragmentation risks because it is a compacting collector. As far as GC pauses are concerned, it does not aim at the shortest possible pauses but at controlling pauses by placing an upper limit on their duration which is maintained in a best-effort approach. Readers can find more details about the G1 collector in the great tutorial <a href="http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html">&#8220;Getting Started with the G1 Garbage Collector&#8221;</a>, German readers also in Angelika Langer&#8217;s article <a href="http://www.angelikalanger.com/Articles/EffectiveJava/55.GC.G1.Overview/55.GC.G1.Overview.html">&#8220;Der Garbage-First Garbage Collector (G1) &#8211; Übersicht über die Funktionalität&#8221;</a>.</p>
<p>Before we examine the current state of the G1 collector by comparing its performance on our sample application with the performance of the classic collectors described above, let me summarize two important pieces of information about the G1 collector:</p>
<ul>
<li>G1 has been officially supported by Oracle since Java 7u4, but for G1 you should go for the most recent Java 7 update available. The Oracle GC team is working hard on G1 and improvements in recent Java updates (7u7 to 7u9) have been noticeable. On the other hand, G1 has not been production-ready in any Java 6 release, and the far superior Java 7 implementation will probably never be backported.</li>
<li>The generation sizing approach I described above is obsolete with G1. Setting generation sizes is in conflict with setting pause time targets and will prevent the G1 collector from doing what it was designed for. With G1 you set the overall memory size using &#8220;<code>-Xms</code>&#8221; and &#8220;<code>-Xmx</code>&#8221; and (optionally) a GC pause time target and usually leave all the rest to the G1 collector. It follows an approach similar to the ParallelGC collector&#8217;s AdaptiveSizePolicy and automatically adjusts the generation sizes so as to meet the pause time target.</li>
</ul>
<p>Once these guidelines were followed, the G1 collector delivered the following result out-of-the-box:</p>
<div id="attachment_1541" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-6.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Figure-6-480x221.png" alt="" title="Figure 6" width="480" height="221" class="size-large wp-image-1541" /></a><p class="wp-caption-text">6. GC behavior of a JVM with G1 and minimal configuration (-Xms1024m -Xmx1024m -XX:+UseG1GC) over a period of 26 hours.</p></div>
<p>In this case, we used the default GC pause time target of 200 milliseconds. As can be seen from the indicators this target was almost met on average and the longest GC pauses were as good as with the CMS (figure 4). G1 apparently had very good control of GC pauses because outliers compared to the average duration were rather rare and limited.</p>
<p>On the other hand, average GC pause times were much longer than with the CMS collector (270 vs. 100 ms), and because pauses were also more frequent, the accumulated GC pause time, i.e. the overhead of GC itself, was more than 4 times higher than with the CMS (6.96% vs. 1.66% of elapsed time). </p>
<p>Just like the CMS, the G1 works with GC pauses and with concurrent GC phases. Like the CMS, it starts concurrent phases based on an occupation threshold. It is visible in figure 6 that the available heap of 1 GB is far from fully used. This is because the G1&#8217;s default occupation threshold is much lower than the CMS&#8217; threshold. It is also reported that the G1 in general tends to be satisfied with less heap than the other collectors.</p>
<h2>Quantitative Comparison of Garbage Collectors</h2>
<p>The following table summarizes some key performance indicators achieved with the 4 most important garbage collectors of Oracle Java 7 running the same load test on the same application but with different levels of load (indicated by the garbage creation rate shown in column 2):</p>
<div id="attachment_1532" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-gcg1" href="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Table.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2013/03/Table-480x146.png" alt="" title="Table" width="480" height="146" class="size-large wp-image-1532" /></a><p class="wp-caption-text">Table with Comparison of several Garbage Collectors (Click to enlarge).</p></div>
<p>All the collectors were run with about 1GB of total heap size; the traditional collectors (ParallelGC, ParNewGC and CMS) in addition used the following heap settings:</p>
<pre class="brush: plain; light: true;">
-XX:NewSize=400m -XX:MaxNewSize=400m -XX:SurvivorRatio=6
</pre>
<p>while the G1 collector ran without additional heap size settings and used the default pause time target of 200 milliseconds which can also be set explicitly by</p>
<pre class="brush: plain; light: true;">
-XX:MaxGCPauseMillis=200
</pre>
<p>As can be seen from this table the traditional collectors execute New generation collections (column 3) in similar time. This is true for the ParallelGC and the ParNewGC collectors but also for the CMS which in fact uses the ParNewGC to execute New generation collections. Promotion from new to Old generation, however, requires some coordination between ParNewGC and CMS during New generation GC pauses. This coordination creates an extra cost which translates into slightly longer New generation pauses for the CMS. </p>
<p>Column 7 summarizes the time lost in GC pauses as percentage of elapsed time. This number is a good measure of GC overhead because concurrent GC time (last column) and the CPU usage overhead it implies may be neglected. With heap sizes tuned as described above and thus with rare Old generation collections, column 7 is largely dominated by New generation pause time. New generation pause time is the product of New generation pause duration (column 3) and New generation pause frequency. New generation pause frequency is a function of the New generation size which was the same (400 MB) for all of the traditional collectors. Therefore and for these collectors column 7 more or less mirrors column 3 (for similar load). </p>
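<p>The relation between pause duration, pause frequency and overhead can be checked against the ParallelGC figures quoted earlier in this article (48 ms every 6 seconds before and 57 ms every 10 seconds after generation tuning). The following sketch does this calculation; the class and method names are illustrative.</p>

```java
// Worked arithmetic: pause time share = average pause duration x pause frequency.
final class PauseOverhead {

    /** Share of elapsed time spent in pauses, in percent. */
    static double overheadPercent(double avgPauseMs, double intervalSeconds) {
        double pausesPerSecond = 1.0 / intervalSeconds;
        return 100.0 * (avgPauseMs / 1000.0) * pausesPerSecond;
    }

    public static void main(String[] args) {
        // Little tuned heap: 48 ms pauses, one every 6 seconds.
        System.out.printf("before tuning: %.2f%%%n", overheadPercent(48, 6));
        // Tuned generation sizes: 57 ms pauses, one every 10 seconds.
        System.out.printf("after tuning:  %.2f%%%n", overheadPercent(57, 10));
    }
}
```

<p>The computed values of about 0.80% and 0.57% are close to the measured New generation pause share of 0.77% and the total of 0.59% after tuning. Small deviations are to be expected because the inputs are rounded averages and, in the second case, the total also contains the two Old generation pauses.</p>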
<p>The benefit of the CMS collector in this picture is evident from column 6: it trades much (one order of magnitude) shorter Old generation GC pauses against a slightly higher overhead. For many real world applications this is a very good deal.</p>
<p>How well does the G1 collector compete for our application? Column 6 (and 5) tells us that it successfully competes with the CMS in reducing Old generation GC pauses. But column 7 indicates that it pays a rather high price to achieve this: GC overhead was 7% compared to 1.6% for the CMS under the same load. </p>
<p>I will examine the conditions under which this higher overhead occurs as well as the strengths and weaknesses of the G1 compared to other collectors (in particular to the CMS collector) in a follow-up to this article as it is a vast and newsworthy subject in its own right.</p>
<h2>Summary and Outlook</h2>
<p>For all the classic Java GC algorithms (SerialGC, ParallelGC, ParNewGC and CMS) generation sizing is an essential tuning and fine tuning procedure which in many real-world applications is not practiced sufficiently. The consequences are suboptimal application performance and the risk of operations degradation (loss of performance and even application standstill for extended periods of time if it is not well monitored).</p>
<p>Generation sizing can improve application performance noticeably and reduce the occurrence of long GC pauses to a minimum. Elimination of long GC pauses, however, requires the usage of a low-latency collector. The preferred and most proven low-latency collector has been (and still is as of today) the CMS collector which in many cases does what is needed and, with proper tuning, also provides long-term stability in spite of its inherent heap fragmentation risk. The intended replacement, the G1 collector, is now (as of Java 7u9) a supported and usable alternative but there is still room for improvement. For many applications, it will deliver acceptable but not yet better results than the CMS collector. The details of its strengths and weaknesses deserve closer examination.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2013/03/garbage-collection-tuning/">Tuning Garbage Collection for Mission-Critical Java Applications</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=31" title="View articles by Dr. Andreas Müller">Dr. Andreas Müller</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2013/03/garbage-collection-tuning/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Securing your Password Database with bcrypt</title>
		<link>http://blog.mgm-tp.com/2013/02/securing-your-password-database-using-bcrypt/</link>
		<comments>http://blog.mgm-tp.com/2013/02/securing-your-password-database-using-bcrypt/#comments</comments>
		<pubDate>Fri, 08 Feb 2013 15:19:18 +0000</pubDate>
		<dc:creator>Dr. Christian Winkler</dc:creator>
				<category><![CDATA[Tips]]></category>
		<category><![CDATA[Database]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1459</guid>
		<description><![CDATA[Do you also spend sleepless nights because you have saved the passwords of your users in clear text or near-clear text (MD5)? We will show you a simple method how you can smoothly migrate your password database to a much more secure format. The transition is transparent to the users and instant, i.e. as soon [...]]]></description>
			<content:encoded><![CDATA[<p>Do you also spend sleepless nights because you have saved the passwords of your users in clear text or near-clear text (MD5)? We will show you a simple method how you can smoothly migrate your password database to a much more secure format. The transition is transparent to the users and instant, i.e. as soon as you have implemented the process, your passwords are safe. If you still store your passwords in an insecure format, you should convert them to a secure format as soon as possible. Do it now!</p>
<p><span id="more-1459"></span></p>
<p>Many online shops, forums or communities save their user passwords in either plaintext or MD5 hashed form for historical reasons. Passwords saved in plaintext are obviously a bad idea, but MD5 is not much better these days as modern graphics cards can create several hundred million MD5 hashes per second and passwords can easily be broken this way, e.g. by using the <a href="http://www.elcomsoft.com/lhc.html">Lightning Hash Cracker</a>. Suitable hardware can be rented at a very low price in the cloud, e.g. <a href="http://www.nvidia.com/object/gpu-cloud-rendering.html">NVidia&#8217;s GPU Cloud Rendering</a> or <a href="http://aws.amazon.com/ec2/">Amazon EC2</a>, so the danger of exposing user passwords is increasing. Even institutions like the <a href="http://crypt0nymous.tumblr.com/post/33886239538/fbi-gov-hacked-294-plain-text-passwords-leaked-by-the">FBI still save their passwords in clear text</a>, so there is no shame involved, but it&#8217;s something you should definitely change as soon as possible.</p>
<p>The usual method to change the hashing algorithm without asking users to modify their passwords is to convert the passwords to the new hash as soon as users log in. Our algorithm, however, instantly converts the database to a safe password format.</p>
<h2>Who is not at risk?</h2>
<p>You do not need to worry if you are already using a secure, salted hashing algorithm that is hard to break. SHA1 or SHA256 are more secure than MD5, but still not really safe. There is a multitude of hashing algorithms that are more computationally intensive and &#8211; due to their intrinsic salt &#8211; hard to beat using <a href="http://en.wikipedia.org/wiki/Rainbow_table">rainbow tables</a>, the most prominent of which is <a href="http://en.wikipedia.org/wiki/Bcrypt">bcrypt</a>. More generic <a href="http://en.wikipedia.org/wiki/Key_derivation_function">key derivation functions</a> can also be used for one-way encrypting passwords; a good candidate is <a href="http://en.wikipedia.org/wiki/PBKDF2">PBKDF2</a>. There are intensive ongoing discussions about which algorithm is theoretically superior, but these are, in our opinion, not directly relevant as both seem to be secure enough for a long time to come.</p>
<div id="attachment_1460" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-bcrypt" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig1-Password-Saving-Methods.png"><img class="size-large wp-image-1460" src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig1-Password-Saving-Methods-480x137.png" alt="" width="480" height="137" /></a><p class="wp-caption-text">A Web App uses a database to persist the users&#39; passwords. The red methods for saving passwords can easily be &quot;inverted&quot; if the database is compromised whereas the blue algorithms are safe today and probably for a few more years.</p></div>
<h2>Some background</h2>
<p>Let&#8217;s first discuss how an attacker actually gets hold of a database. Passwords (or their hashed versions) are often stored in relational databases. By far the most common attack to get a copy of the database (or individual rows) is a technique called <a href="http://en.wikipedia.org/wiki/SQL_injection">SQL injection</a>. URL parameters or POST parameters of a web page are manipulated to contain malicious characters which are interpreted by the SQL database. Of course not all web sites are susceptible to SQL injection, but it is very hard to prove that a web site is invulnerable.</p>
<p><a href="http://en.wikipedia.org/wiki/Vulnerability_(computing)">Security vulnerabilities</a> are also a permanent danger. Security leaks might exist in some server software, operating system kernels or even in the database system itself. Such leaks are traded on the Internet in order to abuse them.</p>
<p>You should never underestimate internal attacks. Often databases are accessible from the inside and passwords can readily be found in the source code or server configuration. This makes it easy for employees to get hold of all data and analyze it after it has been saved to a local disk or even transferred to an external medium.</p>
<p>Backups are often performed in data centers and it is not always safe to assume that these backups are treated as confidentially as the servers themselves. Often backup tapes are reused or even disposed of; this sometimes leads to interesting finds including old databases. Remember that users tend to change their passwords very infrequently if not forced to; so even very old passwords have a high chance of still working (maybe also on other sites).</p>
<p>The same is basically true for old hard disks. Hard disks are often replaced routinely and the old disks are then sold on used hardware wholesale sites. If you cannot absolutely trust your hosting provider (and who can?) you should take into account that your password database might someday go the same way.</p>
<div id="attachment_1461" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-bcrypt" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig2-Attack-Vectors.png"><img class="size-large wp-image-1461" src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig2-Attack-Vectors-480x327.png" alt="" width="480" height="327" /></a><p class="wp-caption-text">Different ways in which an attacker can get hold of the (encrypted) passwords.</p></div>
<h3>How Hashing works</h3>
<p>One way to protect the password itself is to not save it at all, but to just save a &#8220;fingerprint&#8221; of the password instead. If a password is entered, it is enough to calculate the fingerprint of the entered password and compare it to the saved fingerprint. This fingerprinting technique is called &#8220;<a href="http://en.wikipedia.org/wiki/Hash_function">hashing</a>&#8221;. There are of course more formal explanations of hashing, but for our discussion it is not necessary to understand it on the deepest level. The most important aspect for us is the use of a <a href="http://en.wikipedia.org/wiki/Cryptographic_hash_function">cryptographic hash function</a> that ensures that small changes in the original text lead to completely different hashes.</p>
<p>To make fingerprints different for users even if their passwords are the same, an additional (random) value is combined with the password before it is hashed. This value is called &#8220;<a href="https://www.owasp.org/index.php/Password_Storage_Cheat_Sheet#Recommendation:_Make_it_hard_to_steal_the_salt">salt</a>&#8221;. It is stored together with the hash so that the fingerprint of an entered password can be recalculated correctly with the same salt. It is obviously much safer to store the salt in a different place, but for our discussion this just complicates matters and therefore we ignore it. Sometimes the algorithm is modified further by adding a &#8220;secret&#8221;, which is only known to the application, to the password as well before it is hashed. This slightly enhances security if the secret is well-chosen.</p>
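<p>The effect of a salt can be illustrated with a minimal Python sketch. It is only an illustration: SHA-256 is used to keep the example short (it is not a recommendation for password storage), and the names <code>fingerprint</code>, <code>new_record</code> and <code>check</code> are made up for this example.</p>

```python
import hashlib
import hmac
import os

def fingerprint(password, salt):
    """Hash salt + password. SHA-256 only illustrates salting here."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def new_record(password):
    """What would be stored for one user: the salt and the fingerprint."""
    salt = os.urandom(16)  # fresh random salt per user
    return {"salt": salt, "hash": fingerprint(password, salt)}

def check(entered, record):
    """Recompute the fingerprint with the stored salt and compare."""
    candidate = fingerprint(entered, record["salt"])
    return hmac.compare_digest(candidate, record["hash"])

alice = new_record("secret123")
bob = new_record("secret123")
print(alice["hash"] != bob["hash"])  # same password, different fingerprints
print(check("secret123", alice), check("wrong", alice))
```

<p>Because each record carries its own salt, two users with the same password end up with different fingerprints, while verification still works for each of them.</p>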
<p>There is a variety of hashing algorithms; the most famous ones are <a href="http://en.wikipedia.org/wiki/MD5">MD5</a> and <a href="http://en.wikipedia.org/wiki/SHA-1">SHA1</a>. Both, however, were designed a long time ago and optimized for calculating hashes with high performance. This makes them not very well suited for storing passwords, as we will see below.</p>
<div id="attachment_1462" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-bcrypt" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig3-Hash.png"><img class="size-large wp-image-1462" src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig3-Hash-480x144.png" alt="" width="480" height="144" /></a><p class="wp-caption-text">A hash is a one-way function which can calculate fingerprints from passwords. However, it cannot be inverted. Just as in real life, you cannot find a person from his/her fingerprints alone; to check whether a fingerprint belongs to a specific person, you have to ask that person for a fingerprint.</p></div>
<h3>How hashed Passwords can be broken</h3>
<p>Unfortunately, not all is safe even when passwords are hashed. If an attacker gets hold of the (hashed) password database, there are several ways in which this database can be broken. We concentrate on the most famous ones:</p>
<ul>
<li><strong>Dictionary attacks</strong>: Dictionary attacks make use of the fact that most passwords are words which are also found in a dictionary. The attacker takes a dictionary (usually one which is specifically designed for cracking passwords), hashes each word in it and compares the result to the hashed passwords. The tools used for dictionary attacks, like &#8220;<a href="http://en.wikipedia.org/wiki/John_the_Ripper">John the Ripper</a>&#8221;, have become quite sophisticated and perform all kinds of permutations and obvious replacements (like &#8220;I&#8221; ⇔ &#8220;1&#8221; and &#8220;E&#8221; ⇔ &#8220;3&#8221; etc.). Of course this only works if the hashing algorithm is known and the position of the salt can be extracted. This can usually be accomplished quite easily if the attacker creates a login for him/herself before stealing the password database.</li>
<li><strong>Brute force attacks</strong>: These attacks are a simpler version of the dictionary attacks. Instead of using a dictionary, all conceivable combinations of characters are created and hashed. Of course the short-term success rate is lower as most users will have passwords which are in a dictionary. However, this method will find all passwords if given enough time and not only passwords which are found in a dictionary.</li>
<li><strong>Rainbow tables</strong>: Rainbow tables are precomputed tables that cover the hash values of a very large number of passwords. Because they store only the start and end points of chains of hashes (a time-memory trade-off) instead of every single hash, they are considerably smaller than a full lookup table. Obviously rainbow tables work best when no salt is included in the passwords. Then this method is very efficient and can break a few million passwords a day on decent hardware.</li>
</ul>
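<p>To give an impression of how little effort a dictionary attack on unsalted MD5 hashes requires, here is a minimal Python sketch. The word list, the replacement table and the &#8220;leaked&#8221; hash are made up for this example; real tools such as John the Ripper are far more elaborate.</p>

```python
import hashlib
from itertools import product

# Common character replacements a cracking tool would try (illustrative subset).
SUBSTITUTIONS = {"i": "i1", "e": "e3", "o": "o0", "a": "a4"}

def variants(word):
    """Yield the word with every combination of the replacements above."""
    choices = [SUBSTITUTIONS.get(ch, ch) for ch in word.lower()]
    for combo in product(*choices):
        yield "".join(combo)

def dictionary_attack(target_md5, wordlist):
    """Return the first candidate whose MD5 equals target_md5, else None."""
    for word in wordlist:
        for candidate in variants(word):
            if hashlib.md5(candidate.encode("utf-8")).hexdigest() == target_md5:
                return candidate
    return None

# Hypothetical leaked hash and a tiny hypothetical word list.
leaked = hashlib.md5(b"s3cr3t").hexdigest()
print(dictionary_attack(leaked, ["password", "letmein", "secret"]))  # s3cr3t
```

<p>A brute force attack has the same structure; only the candidate generator changes from dictionary words to all conceivable character combinations.</p>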
<p>All these methods are dangerous, as computing power has exploded in the last few years. Both generating and checking passwords can be done at a rate of several hundred million passwords per second if common graphics cards are used and the hashes were created with outdated, unsuitable algorithms like MD5 or SHA1. It&#8217;s naïve to assume this kind of hashing already makes the passwords secure.</p>
<p>Not convinced that this works? We have created the hash <code>2257151269b83ef0e139c3eec8bbcbcb</code> from a password, so head over to <a href="http://www.md5decrypt.org">www.md5decrypt.org</a> and try to find the original password. Even search engines carry some of the most popular hashes today.</p>
<p>To create a secure storage of passwords, a more complicated algorithm has to be used. <a href="http://codahale.com/how-to-safely-store-a-password">Bcrypt</a> is an algorithm which was specifically designed for this purpose; its complexity can be adjusted by increasing the number of cycles. Both bcrypt and PBKDF2 take several orders of magnitude longer to calculate than MD5 / SHA1, which is a bit cumbersome if you expect mass logins, but on the other hand makes it impractical to crack the password database by using brute force techniques. By adjusting this &#8220;work factor&#8221; the algorithms will remain safe for a long time to come. For a more detailed discussion regarding complexity see the article &#8220;<a href="http://codahale.com/how-to-safely-store-a-password">How To Safely Store A Password</a>&#8221;.</p>
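<p>The effect of an adjustable work factor can be tried out with a few lines of Python. As bcrypt is not part of the Python standard library, this sketch uses PBKDF2 (<code>hashlib.pbkdf2_hmac</code>) as a stand-in; its iteration count plays the role of the work factor. The function name <code>slow_hash</code> and the iteration counts are chosen for illustration only.</p>

```python
import hashlib
import os
import time

def slow_hash(password, salt, iterations):
    """PBKDF2-HMAC-SHA256; 'iterations' plays the role of the work factor."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = os.urandom(16)
for iterations in (1_000, 10_000, 100_000):
    start = time.perf_counter()
    slow_hash("secret123", salt, iterations)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{iterations:>7} iterations: {elapsed_ms:.1f} ms")
```

<p>The time per hash grows roughly linearly with the iteration count. A single login hardly notices the difference, whereas an attacker who has to try millions of candidates pays the price for every single one.</p>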
<h2>Our Conversion Process</h2>
<p>Let’s assume that we already have a password database with a significant number of users. The passwords are hashed with a hashing function hash(cleartext), where hash can be e.g. MD5, SHA-1 or in the simplest case a function just returning the cleartext (in which case the passwords are stored unencrypted).</p>
<h3>The challenge</h3>
<p>As described above and in the <a href="http://en.wikipedia.org/wiki/Bcrypt">bcrypt Wikipedia page</a>, one of the advantages of bcrypt is its complexity: it takes time to compute the hash value from the original. This makes the algorithm hard to attack but also poses some problems when converting many values at once, as we have to make sure that no concurrent access is modifying the password at the same time (e.g. a user trying to change his/her password while it is being converted).</p>
<p>We have to ensure that no password change is lost and no user will be denied login during the whole conversion process.<br />
There are some additional complexities related to the different ways a password was saved before the change; these will be explained individually below.</p>
<h3>Prerequisites</h3>
<p>As a first step the software has to be made aware of bcrypted passwords. This of course makes it necessary to save bcrypted passwords and check against these saved versions. As the transition cannot be performed for all users at once, it makes sense to introduce a new column in the database first to host these bcrypted passwords. Alternatively, we could also have a Boolean column which tells us whether the password is still saved in the old representation or is already bcrypted. The choice is up to you.</p>
<p>In the next step, the software itself has to be adapted. It has to be aware of the two password formats, must be able to check entered passwords against both formats and be able to write changed passwords in the new format. This software will still be fully functional even if not a single password has been converted to the new format.</p>
<h3>Algorithm for Changing Passwords</h3>
<p>In the first step the procedure for changing passwords should check whether the row is already locked. This is a good idea as it is possible (though very unlikely) that a user opened multiple windows and is trying to change the password in all of them.</p>
<p>In the next step, the row of the current user is locked exclusively. Then the original (maybe already hashed) password is read, bcrypted and written back, the original password is deleted, the transaction is finished and the lock is released. It is essential that all this is performed within a single transaction. This ensures that no password change gets lost and users will be able to log in to the system using their old (and new) passwords without interruption (or only with a very small delay if a user whose password is about to be converted tries to log in simultaneously).</p>
<div id="attachment_1463" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-bcrypt" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig4-Changing-Password-Process.png"><img class="size-large wp-image-1463" src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig4-Changing-Password-Process-479x286.png" alt="" width="479" height="286" /></a><p class="wp-caption-text">Activity diagram for changing the password to a bcrypted version of the password.</p></div>
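<p>The conversion of a single row could be sketched as follows. This is only an illustration under several assumptions: the table and column names (<code>users</code>, <code>old_password</code>, <code>new_password</code>) are invented, PBKDF2 from the Python standard library stands in for bcrypt, and SQLite is used so that the example runs anywhere. SQLite has no row locks, so <code>BEGIN IMMEDIATE</code> locks the whole database for writing; with a server database you would typically lock just the row, e.g. with <code>SELECT ... FOR UPDATE</code>.</p>

```python
import hashlib
import os
import sqlite3

ITERATIONS = 50_000  # stand-in work factor, chosen arbitrarily for this sketch

def slow_hash(value):
    """Stand-in for bcrypt: returns 'salt_hex$hash_hex' computed with PBKDF2."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()

def convert_user(conn, user_id):
    """Convert one user inside a single transaction; True if a row was converted."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading
    try:
        row = conn.execute(
            "SELECT old_password FROM users WHERE id = ? AND new_password IS NULL",
            (user_id,),
        ).fetchone()
        if row is None:  # already converted, or no such user
            conn.execute("ROLLBACK")
            return False
        conn.execute(
            "UPDATE users SET new_password = ?, old_password = NULL WHERE id = ?",
            (slow_hash(row[0]), user_id),
        )
        conn.execute("COMMIT")  # new value written, old value gone, lock released
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, old_password TEXT, new_password TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'legacy-hash-or-cleartext', NULL)")
print(convert_user(conn, 1), convert_user(conn, 1))  # True False
```

<p>Calling the function a second time for the same user is harmless, which makes it easy to run the conversion as a restartable batch job over all user ids.</p>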
<h3>Algorithm for Verifying Passwords</h3>
<p>To avoid concurrency problems due to passwords just being converted, the conversion algorithm will perform the necessary change in one single transaction, so the verification does not have to concern itself with it. If the password is in bcrypt format, only this value should be used to check the password. The bcrypt value already contains some meta information like salt and number of iterations, so verification is as simple as</p>
<pre class="brush: plain;">
Bcrypt.check(hash(enteredPassword), storedBcryptHash)
</pre>
<p>If the password has not yet been converted the previously used algorithm should take over.<br />
In the other cases below we concentrate only on the verification of the bcrypted password, while the old password is gracefully handled by the old password checking algorithm.</p>
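<p>A verification routine that handles both formats might look like the following Python sketch. It assumes that the legacy format is unsalted MD5 and again uses PBKDF2 as a stand-in for bcrypt; all function names and the dictionary layout of a row are hypothetical.</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # stand-in work factor, chosen arbitrarily for this sketch

def slow_hash(value, salt=None):
    """Stand-in for bcrypt: 'salt_hex$hash_hex' computed with PBKDF2."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()

def slow_check(value, stored):
    """Recompute the slow hash with the salt embedded in 'stored' and compare."""
    salt_hex, _ = stored.split("$")
    return hmac.compare_digest(slow_hash(value, bytes.fromhex(salt_hex)), stored)

def legacy_hash(password):
    """The function the application used before (assumed here: unsalted MD5)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def verify(entered, row):
    """row has the keys 'old_password' and 'new_password'; one of them is None."""
    if row["new_password"] is not None:
        # Converted row: only the new column counts.
        return slow_check(legacy_hash(entered), row["new_password"])
    # Not yet converted: the previously used algorithm takes over.
    return hmac.compare_digest(legacy_hash(entered), row["old_password"])

old_row = {"old_password": legacy_hash("secret123"), "new_password": None}
new_row = {"old_password": None, "new_password": slow_hash(legacy_hash("secret123"))}
print(verify("secret123", old_row), verify("secret123", new_row), verify("nope", new_row))
```

<p>The same entered password is accepted before and after the conversion, so users do not notice when their row is migrated.</p>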
<h3>Handling clear-text Passwords</h3>
<p>Converting clear text passwords is straightforward. In each step, a row is locked, its unencrypted password is taken, bcrypted and saved to the new column.</p>
<p>Verifying the password is similar; depending on whether it has already been bcrypted or not, the clear text or the bcrypted password is used to check against the entered password.</p>
<h3>Handling unsalted MD5 passwords</h3>
<p>As MD5 cannot be easily inverted (unless you attack the database yourself) it is much simpler to use the MD5 hashed password as a starting point and during conversion just bcrypt that value into a new column.</p>
<p>Verification is a bit more complicated, because the entered password must be MD5 hashed and that hash then be checked against the bcrypted column in the database.</p>
<h3>Handling salted MD5 passwords</h3>
<p>The conversion procedure is more complicated if MD5 passwords with salt are used. To re-create the salted MD5 hash, the salt has to be separated in a first step and should be kept in an extra column in the database. After doing that, the conversion procedure is identical to the one described above, i.e. the salted MD5 hash will be bcrypted.</p>
<p>Checking an entered password is also a bit more complicated. First the correct salt must be selected from the database, the entered password must be MD5 hashed with this salt and finally this value will be checked with the bcrypt algorithm. Checking in this case works with:</p>
<pre class="brush: plain;">
Bcrypt.check(MD5(salt + enteredPassword), storedBcryptPassword)
</pre>
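<p>For the salted case, conversion and verification could be sketched like this in Python. The MD5 salt is assumed to be prepended to the password and kept in its own column (<code>md5_salt</code>, a made-up name); PBKDF2 once more stands in for bcrypt and brings its own, separate salt.</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # stand-in work factor, chosen arbitrarily for this sketch

def slow_hash(value, salt=None):
    """Stand-in for bcrypt: 'salt_hex$hash_hex' computed with PBKDF2."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()

def slow_check(value, stored):
    """Recompute the slow hash with the salt embedded in 'stored' and compare."""
    salt_hex, _ = stored.split("$")
    return hmac.compare_digest(slow_hash(value, bytes.fromhex(salt_hex)), stored)

def salted_md5(md5_salt, password):
    """Legacy format assumed for this sketch: MD5(salt + password)."""
    return hashlib.md5((md5_salt + password).encode("utf-8")).hexdigest()

def convert(row):
    """Wrap the existing salted MD5 hash; the MD5 salt stays in its own column."""
    return {"md5_salt": row["md5_salt"], "new_password": slow_hash(row["old_password"])}

def verify(entered, row):
    """Re-create the salted MD5 hash first, then check it against the slow hash."""
    return slow_check(salted_md5(row["md5_salt"], entered), row["new_password"])

legacy = {"md5_salt": "x9F2", "old_password": salted_md5("x9F2", "secret123")}
converted = convert(legacy)
print(verify("secret123", converted), verify("secret124", converted))  # True False
```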
<h3>Converting to a &#8220;pure&#8221; bcrypt database</h3>
<p>Storing the bcrypt&#8217;ed version of an already (via e.g. MD5) hashed password is not as secure as storing the password directly with bcrypt, as some of the entropy is eaten up by the original hashing algorithm. However it is still tremendously more secure than storing the password trivially hashed or in clear text.</p>
<p>If you started with an MD5 password database and are now unhappy about the &#8220;mixed&#8221; hashes in your database, there is also a solution for this. You have to use a flag column which indicates that the passwords are in the double-hashed intermediate format. As soon as a user successfully logs in you know the correct password and can save a pure bcrypted version in the database. Of course you still have to change the flag so that your future password verifications only use bcrypt.</p>
<p>This procedure is also non-intrusive but will never catch all users. If you want to get rid of those &#8220;mixed&#8221; users completely, you either have to force them to log in or reset their passwords. If they have not logged in for a long time, chances are high that they have forgotten their passwords anyway.</p>
<div id="attachment_1464" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-bcrypt" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig5-Incremental-conversion-process.png"><img class="size-large wp-image-1464" src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Fig5-Incremental-conversion-process-480x331.png" alt="" width="480" height="331" /></a><p class="wp-caption-text">Passwords can be converted incrementally to a &quot;pure&quot; bcrypt format.</p></div>
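<p>The upgrade on login could be sketched as follows. The flag name <code>double_hashed</code> and the row layout are invented for this example, unsalted MD5 is assumed as the legacy algorithm, and PBKDF2 again stands in for bcrypt.</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 50_000  # stand-in work factor, chosen arbitrarily for this sketch

def slow_hash(value, salt=None):
    """Stand-in for bcrypt: 'salt_hex$hash_hex' computed with PBKDF2."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", value.encode("utf-8"), salt, ITERATIONS)
    return salt.hex() + "$" + digest.hex()

def slow_check(value, stored):
    """Recompute the slow hash with the salt embedded in 'stored' and compare."""
    salt_hex, _ = stored.split("$")
    return hmac.compare_digest(slow_hash(value, bytes.fromhex(salt_hex)), stored)

def legacy_hash(password):
    """Legacy algorithm assumed for this sketch: unsalted MD5."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

def login(entered, row):
    """Verify; on success, replace a double-hashed value by a pure slow hash."""
    if row["double_hashed"]:
        ok = slow_check(legacy_hash(entered), row["password"])
        if ok:
            row["password"] = slow_hash(entered)  # pure format from now on
            row["double_hashed"] = False          # future checks skip MD5
        return ok
    return slow_check(entered, row["password"])

row = {"password": slow_hash(legacy_hash("secret123")), "double_hashed": True}
print(login("wrong", row), row["double_hashed"])      # False True
print(login("secret123", row), row["double_hashed"])  # True False
print(login("secret123", row))                        # True
```

<p>In a real application the update of the password and the flag would of course be written to the database in one transaction, as described for the conversion above.</p>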
<h2>Conclusion</h2>
<p>After performing the steps described above your users&#8217; passwords will be instantly safe. Now even if the password database is compromised, hackers will have a hard time figuring out the passwords. The process is easy and straightforward as we reuse the already existing passwords and make the process transparent for the users.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2013/02/securing-your-password-database-using-bcrypt/">Securing your Password Database with bcrypt</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=20" title="View articles by Dr. Christian Winkler">Dr. Christian Winkler</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2013/02/securing-your-password-database-using-bcrypt/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>On the Quality Benefits of Formal Domain Specific Languages - Software Quality driven by Formal DSLs, Part 2</title>
		<link>http://blog.mgm-tp.com/2013/01/formal-dsl-part2/</link>
		<comments>http://blog.mgm-tp.com/2013/01/formal-dsl-part2/#comments</comments>
		<pubDate>Sat, 12 Jan 2013 13:40:47 +0000</pubDate>
		<dc:creator>Dr. Jürgen Knopp</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[DSL]]></category>
		<category><![CDATA[QA]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[Validation]]></category>
		<category><![CDATA[Web Forms]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1450</guid>
		<description><![CDATA[One of the assets of mgm is dedicated quality for software, including especially portal technology for applications with high-safety and reliance demands. In the first blog within this series, &#8220;Using Domain Specific Languages to Implement Interactive Frontends&#8220;, we described an approach using a specification language (DSL) family on customer level to specify valid inputs and [...]]]></description>
			<content:encoded><![CDATA[<p>One of the assets of mgm is dedicated quality for software, especially portal technology for applications with high safety and reliability demands. In the first blog within this series, &#8220;<a href="http://blog.mgm-tp.com/2012/02/formal-dsl-part1/">Using Domain Specific Languages to Implement Interactive Frontends</a>&#8221;, we described an approach using a specification language (DSL) family on customer level to specify valid inputs and frontend computations for forms-based interactive or batch systems. Let us continue and focus on the quality benefits of this approach.</p>
<p><span id="more-1450"></span></p>
<p>The presented specification language family supports modeling of frontends, such as form-based user interfaces, or field-related batch programs that e.g. process XML input.</p>
<p>In a nutshell, the approach is characterized by formal specifications for well-defined and valid field inputs, for the computation of values for calculated fields and for relationships between these fields. For an example see the summary at the end of part 1, &#8220;<a href="http://blog.mgm-tp.com/2012/02/formal-dsl-part1/">Using Domain Specific Languages to Implement Interactive Frontends</a>&#8221;.</p>
<p>The specification means are as follows:</p>
<ul>
<li><strong>All fields are typed</strong> as numbers, currencies, percentage types, plain text, or enumerations, etc. For some of these types, additional constraints can be expressed, such as minimum and maximum values, field lengths or regular expression restrictions.</li>
<li><strong>Calculated fields</strong> have functional dependencies describing value dependencies on values of other fields by means of functional rules.</li>
<li><strong>Constraint rules</strong> describe cross-field relationships between fields. Error messages describing the non-compliance of inputs to these rules are part of the specification. </li>
</ul>
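<p>To make these three specification means more tangible, here is a small Python sketch. It does not show the syntax of the mgm specification languages; the classes, the field names (<code>net</code>, <code>tax_rate</code>, <code>gross</code>) and the rule are invented purely for illustration.</p>

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class NumberField:          # a typed input field with additional constraints
    name: str
    minimum: float
    maximum: float

@dataclass
class CalculatedField:      # value derived from other fields by a functional rule
    name: str
    rule: Callable[[Dict[str, float]], float]

@dataclass
class Constraint:           # cross-field rule with its own error message
    message: str
    holds: Callable[[Dict[str, float]], bool]

FIELDS = [NumberField("net", 0, 1_000_000), NumberField("tax_rate", 0, 100)]
CALCULATED = [CalculatedField("gross", lambda v: v["net"] * (1 + v["tax_rate"] / 100))]
CONSTRAINTS = [Constraint("tax_rate requires a positive net amount",
                          lambda v: v["tax_rate"] == 0 or v["net"] > 0)]

def process(values):
    """Validate the inputs, then compute calculated fields. Returns (values, errors)."""
    errors = []
    for f in FIELDS:
        v = values.get(f.name)
        if not isinstance(v, (int, float)):
            errors.append(f"{f.name}: not a number")
        elif not f.minimum <= v <= f.maximum:
            errors.append(f"{f.name}: out of range")
    if not errors:  # cross-field rules only make sense for well-typed inputs
        errors += [c.message for c in CONSTRAINTS if not c.holds(values)]
    result = dict(values)
    if not errors:
        for c in CALCULATED:
            result[c.name] = c.rule(result)
    return result, errors

print(process({"net": 100, "tax_rate": 19}))
print(process({"net": 0, "tax_rate": 19}))
```

<p>The first call passes validation and gets the calculated field <code>gross</code> added; the second one violates the cross-field rule and is reported with the error message given in the specification.</p>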
<p>Both constraint rules and functional rules, aside from referring to the values of fields, allow the expression of different behavior depending on whether values for input fields exist or are missing.</p>
<p>Such specifications automatically drive the behavior of the specified system:</p>
<ul>
<li>Field values are computed according to functional rules.</li>
<li>Constraints on and between fields lead to automatic validation:
<ul>
<li>Type errors are flagged with corresponding messages.</li>
<li>For constraint rules, the specified error messages are issued if constraints do not hold.</li>
</ul>
</li>
</ul>
<p>The specification languages are tailored to the application domain (DSL). Therefore specifications can be (and indeed are) written by customers rather than by programmers. </p>
<h2>Practical Quality Benefits for Specifications</h2>
<p>Let us discuss how the formal language approach supports development, quality and test.</p>
<h3>Avoiding implementation errors by code generation</h3>
<p>Similarly to programming languages, specification languages substantially simplify the software development process. Code generated from specifications eliminates a great deal of complexity and leads to less error-prone systems. Here&#8217;s how code can be generated:</p>
<ul>
<li>Functional rules are simply compiled to fully operational code. </li>
<li>Field type definitions and cross-field constraints are translated to validation code performing checks and delivering adequate error messages. </li>
</ul>
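<p>The principle of generating validation code from a field type definition can be sketched in a few lines of Python. This is not the mgm generator; the specification format (<code>FIELD_SPEC</code>) and the generated function are invented for this example, and Python serves as both generator and target language.</p>

```python
# Hypothetical field type definition: an integer field with a value range.
FIELD_SPEC = {"name": "age", "minimum": 0, "maximum": 120}

def generate_validator(spec):
    """Return Python source code for a validator derived from the specification."""
    return (
        f"def validate_{spec['name']}(value):\n"
        f"    if not isinstance(value, int):\n"
        f"        return '{spec['name']}: not an integer'\n"
        f"    if not {spec['minimum']} <= value <= {spec['maximum']}:\n"
        f"        return '{spec['name']}: must be between "
        f"{spec['minimum']} and {spec['maximum']}'\n"
        f"    return None\n"
    )

source = generate_validator(FIELD_SPEC)
print(source)

namespace = {}
exec(source, namespace)  # "compile" the generated code
validate_age = namespace["validate_age"]
print(validate_age(42), validate_age(200))
```

<p>Once such a generator is tested, every validator it produces follows the same pattern, which is why individual validators need far less testing than hand-written ones.</p>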
<p>At mgm tp, code generators have been implemented for a variety of languages including Java, C++, JavaScript and different runtime environments. The generator approach yields a substantial quality gain. Once the code generators are well tested, there is not much need to test the validation software for each case again. The generated code conforms to the specification.</p>
<h3>Generating Test Data from Specifications</h3>
<p>Formal specifications substantially facilitate and improve the quality-assurance process, especially testing. Since constraints and dependencies define the set of correct inputs, they are an ideal prerequisite for the generation of test suites. This includes both consistent data (valid inputs) and inconsistent data (deliberately invalid inputs). Generation of test data is discussed in more detail later in this blog series.</p>
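<p>As a very simple illustration of the idea, the following Python sketch derives boundary-value test data from the range constraint of an integer field. The function name and the choice of boundary values are assumptions made for this example; they do not describe the actual mgm test data generator.</p>

```python
def generate_test_values(minimum, maximum):
    """Boundary-value test data for an integer field limited to [minimum, maximum]."""
    candidates = {minimum, minimum + 1, (minimum + maximum) // 2, maximum - 1, maximum}
    valid = sorted(v for v in candidates if minimum <= v <= maximum)
    invalid = [minimum - 1, maximum + 1]  # deliberately just outside the range
    return valid, invalid

valid, invalid = generate_test_values(1, 120)
print(valid)    # [1, 2, 60, 119, 120]
print(invalid)  # [0, 121]
```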
<h3>Finding Specification Errors</h3>
<p>Translating specifications to code and especially to test data improves the quality of the specifications, because it reveals flaws and thus gives early feedback to the writers of the specifications. Inconsistent and sometimes even incomplete specifications can easily be discovered:</p>
<ul>
<li><em>Inconsistent specifications</em> lead to contradictions and thus inhibit valid test data generation. This can be detected already at test data generation time.</li>
<li><em>Incomplete specifications</em> give too much freedom to test data generation. Since &#8220;too much freedom&#8221; cannot be decided upon statically and automatically, this will show up only later in the process. In most cases such specification errors are reported in the testing phase.</li>
</ul>
<h3>Improving the Processes</h3>
<p>In addition to the above technical benefits, this formalized approach improves the requirement, development, and quality processes as follows:</p>
<ul>
<li>Formalization of requirements <em>in the user&#8217;s domain</em> enforces the analysis of well-known and less well-known informal requirements. In other words, the likelihood that the system does what the customer explicitly or implicitly expected increases.</li>
<li>The differentiation between input validity and business logic <em>modularizes</em> the requirement process and thus substantially improves the latter.</li>
<li>The early availability of validity checks allows early feedback to specification writers. </li>
<li>Early availability of code and test data generation improves testing for front- and backends.</li>
</ul>
<h2>Quality Assurance Methods for Formal Specifications</h2>
<p>The aspects mentioned above already substantially improve software quality without any dedicated quality measure. Still, quality assurance, especially testing, is needed to deliver software with the behavior the customer expects. Fortunately, even testing becomes simpler, more reliable, adaptable, and measurable due to the usage of formal specifications. How this is accomplished is described in the rest of this blog.</p>
<p>Formal specifications facilitate well-defined structuring of tests and a focus on more interesting testing aspects. To explain these aspects we refer to typical system aspects (for both single-tier and multi-tier architectures) in the following table:</p>
<table>
<thead>
<tr>
<th></th>
<th>Aspect</th>
<th>How is it done in our framework?</th>
</tr>
</thead>
<tbody>
<tr>
<td>1.1</td>
<td>Validation of inputs,<br />Valid cases</td>
<td>By generated software that has to be called by the application. After the check input data is known to be consistent.</td>
</tr>
<tr>
<td>1.2</td>
<td>Validation of inputs,<br />Invalid cases</td>
<td>By generated software that has to be called by the application. The input data has to be flagged by the system with differentiated error messages.</td>
</tr>
<tr>
<td>2</td>
<td>Frontend computations</td>
<td>By generated software that has to be called by the application.</td>
</tr>
<tr>
<td>3</td>
<td>Generation of artifacts for backend processing</td>
<td>By generated software that has to be called by a web application. This is not discussed further in this article.</td>
</tr>
<tr>
<td>4</td>
<td>Backend behavior</td>
<td>Implemented by other means. Subject to extensive testing triggered by the frontend using valid input data controlled by aspects 1 and 2.</td>
</tr>
</tbody>
</table>
<p>Aspects 1.1, 1.2, 2 and 3 can be tested using automatically generated data: valid (and deliberately invalid) test data for functional tests can be generated using the methods described below.</p>
<p>Extensive testing is needed for backend behavior (see aspect 4). This needs to be done for valid data only, and the backend data needed can be generated automatically from the formal frontend specifications.</p>
<h3>Automated testing</h3>
<p>Automated functional testing, both for newly developed software and for regression purposes, is a must for the kind of systems considered here. It not only reduces the amount of manual testing for each delivery, but is also less prone to testing errors and makes it possible to increase test coverage.</p>
<p>It is beyond the scope of this article to describe the overall test process. We merely mention two aspects of automated testing: test data and test paths (the sequence of input processing). These aspects are dealt with separately.</p>
<ul>
<li><strong>Test data:</strong> Test data generation allows test cases to be pre-computed by defining various values for input fields. Formal specifications allow the definition of test coverage measures for test cases related to these field values.</li>
<li><strong>Test paths:</strong> The same test data sets are fed into the interactive system in different orders, including multiple inputs, i.e. editing of fields. The term used here is <em>path</em>. A path describes the order of processing, including multiple visits of forms and editing of fields. Within a path, pre-computed test data (as described above) can be used as input to the system and thus test its behavior.</li>
</ul>
<p>Test paths and test data are orthogonal aspects which complement each other as described below.</p>
<h3>Field value based test data generation</h3>
<p>Is there a way to generate all valid test cases for field values? For the formal systems described above, the number of input fields, their types, and their constraints are well known. Moreover, the number of values per field is finite and so is the number of fields. Thus, at least in principle, the set of all valid test cases is enumerable by enumerating field values and combinations of all values for all fields. Field value combinations that are invalid according to the constraints can be automatically excluded from this enumeration.</p>
<p>In theory, this means that all possible input-driven test cases can be generated for the purpose of testing. In practice, however, this finite number is far too large. Without constraints between fields, the overall number of test cases is the product of n factors, where n is the number of fields and each factor is the number of valid values of one field.</p>
<p>Constraints reduce the number of legal inputs but the data variety is still too high. Thus, we must substantially reduce the set of test cases while trying to keep &#8220;good&#8221; test coverage.</p>
<p>There are two independent and compatible ingredients to reduce the number of test cases:</p>
<ol>
<li>We reduce the number of different values considered for fields, and focus on fewer but preferably more &#8220;interesting&#8221; values.</li>
<li>We reduce the number of combinations of different values for different fields: Rather than considering all combinations, we only require that each value of each field is covered. The number of test cases is thereby reduced to the number of different values for the field with the highest number of different values. Sometimes, constraints lead to a reduction or growth of this number.</li>
</ol>
<p>Both techniques, considered together, substantially reduce the number of test cases. </p>
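<p>The effect of the second ingredient can be sketched in a few lines of JavaScript. This is only an illustration under simplifying assumptions, not the generator used in our framework: constraints between fields are ignored, and the field names and values in <code>sampleFields</code> are hypothetical.</p>

```javascript
// Hypothetical fields, each with its list of "interesting" values.
var sampleFields = {
	Quantity: [0, 1, 999],
	Currency: ['EUR', 'USD'],
	Country: ['DE', 'FR', 'IT', 'US']
};

// Full enumeration: the product of n factors, one factor per field.
function countAllCombinations(fields) {
	return Object.keys(fields).reduce(function (product, name) {
		return product * fields[name].length;
	}, 1);
}

// Reduced enumeration: every value of every field occurs at least once.
// The number of test cases equals the size of the longest value list.
function buildEachValueOnce(fields) {
	var names = Object.keys(fields);
	if (names.length === 0) {
		return [];
	}
	var size = Math.max.apply(null, names.map(function (name) {
		return fields[name].length;
	}));
	var testCases = [];
	for (var i = 0; i !== size; i++) {
		var testCase = {};
		names.forEach(function (name) {
			var values = fields[name];
			// Wrap around so that shorter value lists are reused.
			testCase[name] = values[i % values.length];
		});
		testCases.push(testCase);
	}
	return testCases;
}
```

<p>For the sample fields, the full enumeration yields 3 * 2 * 4 combinations, whereas the reduced enumeration needs only as many test cases as the field with the most values has values.</p>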
<p>Let us now analyze what &#8220;interesting&#8221; means. The criterion is certainly the likelihood of finding errors, and of demonstrating correctness, with the resulting test data. We describe this in two steps.</p>
<h4>Special and interesting values for types</h4>
<p>The first step is to define special (&#8220;interesting&#8221;) values for specific field types. Test cases using these shall primarily be considered:</p>
<ul>
<li>Special values such as min or max for numbers, currencies and several related types.</li>
<li>Special values such as zero, non-zero, hazardous characters, empty or very large strings, etc.  </li>
<li>For enumeration fields, potentially all values are important, since enumeration fields often control business logic ramifications (e.g. in interactive systems, enumerations are often represented by drop down boxes for selecting important choices). </li>
</ul>
<p>Obviously, this method has limitations. No knowledge about rules is used here, so the business logic is covered only incompletely. Moreover, some types, even enumerations, are sometimes too &#8220;large&#8221;, i.e. they have many values, and not all of them might really be &#8220;interesting&#8221; in the application domain. Nevertheless, considering special values depending on field types is a good basis for the techniques described below.</p>
<h4>Special and interesting values for specific fields</h4>
<p>Considering individual fields rather than types, interesting values can be selected more specifically. The most important information stems from rules (both functional and constraints): </p>
<ul>
<li>Each comparison with respect to equality of a field value with a specific constant generates an interesting value equal to the constant. These values come on top of the interesting values defined by the field types. They reflect the fact that such constants have an impact on business logic.</li>
<li>Each comparison with respect to equality of a field with another field implies that the sets of interesting values of these fields are shared.</li>
</ul>
<h4>Field value coverage of test data generation</h4>
<p>The main idea with respect to test coverage for test data generation algorithms is this:</p>
<ul>
<li>For each interesting value of each field generate at least one occurrence within the test data.</li>
<li>Extend data sets applying all constraints to obtain fields with mutually consistent values.</li>
<li>Extend data sets by generating random or interesting values for unconstrained fields.</li>
</ul>
<p>Note that in most cases the constraints do not allow the full variety called for by the first point, since constraint rules usually exclude some values for specific fields. As an example, consider the following constraint rule, which is taken from the previous blog article &#8220;<a href="http://blog.mgm-tp.com/2012/02/formal-dsl-part1/">Using Domain Specific Languages to Implement Interactive Frontends</a>&#8221;:</p>
<pre class="brush: plain; wrap-lines: false;">
   AlternativeVat == 0
   or AlternativeVat == NormalVat
   or AlternativeVat == NormalVat/2
   =&gt; failed: &quot;VAT can only be normal, half normal or zero&quot;
</pre>
<p>The interesting values for <code>AlternativeVat</code> are <code>NormalVat</code>, zero, <code>NormalVat/2</code>, and &#8220;undefined&#8221; (the latter stands for: no value has been specified). These interesting values are propagated to all dependent fields, thus producing more interesting values for test data. This can be demonstrated with the following specification snippet.</p>
<pre class="brush: plain; wrap-lines: false;">
AllVat = If FieldValueSpecified(AlternativeVat)
            then  AlternativeVat/100*NetAmount
            else  NormalVat/100*NetAmount
</pre>
<p>For the field <code>AllVat</code> the interesting values are (see the <code>then</code> branch) <code>NormalVat/100*NetAmount</code>, zero, <code>NormalVat/2/100*NetAmount</code> and (see the <code>else</code> branch) <code>NormalVat/100*NetAmount</code>.</p>
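<p>The propagation of interesting values through the <code>AllVat</code> rule can be traced with a small JavaScript sketch. It transcribes the rule by hand and is not generated from the specification; the concrete numbers for <code>NormalVat</code> and <code>NetAmount</code> are hypothetical and chosen only for illustration.</p>

```javascript
// Hypothetical concrete inputs, chosen only for illustration.
var sampleNormalVat = 19;
var sampleNetAmount = 200;

// Interesting values of AlternativeVat according to the constraint rule;
// undefined stands for "no value has been specified".
function interestingAlternativeVat(normalVat) {
	return [0, normalVat, normalVat / 2, undefined];
}

// Hand-written transcription of the AllVat rule from the snippet above.
function allVat(alternativeVat, normalVat, netAmount) {
	if (alternativeVat !== undefined) {
		return alternativeVat / 100 * netAmount;
	}
	return normalVat / 100 * netAmount;
}

// Propagate every interesting input value through the rule and
// keep each distinct result only once.
function interestingAllVat(normalVat, netAmount) {
	var results = [];
	interestingAlternativeVat(normalVat).forEach(function (value) {
		var result = allVat(value, normalVat, netAmount);
		if (results.indexOf(result) === -1) {
			results.push(result);
		}
	});
	return results;
}
```

<p>Calling <code>interestingAllVat(sampleNormalVat, sampleNetAmount)</code> yields three distinct values, because the unspecified case and the case <code>AlternativeVat == NormalVat</code> lead to the same result.</p>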
<p>This description of test data generation is by no means complete; here we merely intended to show the relationship to interesting values. For more information, especially on algorithms (in an earlier language setting), see the blog articles &#8220;<a href="http://blog.mgm-tp.com/2010/12/test-data-generation-part2/">Producing High-Quality Test Data</a>&#8221; and &#8220;<a href="http://blog.mgm-tp.com/2010/10/test-data-generation-part1/">Form Validation with Rule Bases</a>&#8221;.</p>
<h4>Multiplicity of Fields</h4>
<p>A specific testing aim refers to multiplicity, i.e. to multiple occurrences of fields in forms (such as personal data for several people). This concept was explained and exemplified in the first article of this series. The following snippet shows rules for the multiplicity fields <code>PosFullPrice</code>, <code>UnitPrice</code>, and <code>Quantity</code>.</p>
<pre class="brush: plain; wrap-lines: false;">
NetAmount         = Sum(PosFullPrice.all)
PosFullPrice.each = UnitPrice.each * Quantity.each
</pre>
<p>The number of instances of fields with multiplicity can be defined in advance in order to cover implementation errors occurring in this context. In our experience, generating test data for a multiplicity of 3 suffices in most cases to find hidden bugs. In the example above, three instances of <code>PosFullPrice</code>, <code>UnitPrice</code>, and <code>Quantity</code> are generated. They are fed into the algorithm for enumeration of test data. If a specific multiplicity index is referred to in constraint rules, this multiplicity automatically delivers interesting values (similarly to fields with no multiplicity as shown above). For scaling tests and performance tests, specific fields can selectively be set to a very high multiplicity.</p>
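<p>As a rough illustration of the two multiplicity rules, the following JavaScript sketch builds a configurable number of position instances and evaluates <code>PosFullPrice</code> and <code>NetAmount</code> by hand. The helper names and the input values are hypothetical; the real test data comes from the generator described above.</p>

```javascript
// Build `multiplicity` instances of the position fields from
// hypothetical lists of interesting unit prices and quantities.
function generatePositions(multiplicity, unitPrices, quantities) {
	var positions = [];
	for (var i = 0; i !== multiplicity; i++) {
		var unitPrice = unitPrices[i % unitPrices.length];
		var quantity = quantities[i % quantities.length];
		positions.push({
			UnitPrice: unitPrice,
			Quantity: quantity,
			// PosFullPrice.each = UnitPrice.each * Quantity.each
			PosFullPrice: unitPrice * quantity
		});
	}
	return positions;
}

// NetAmount = Sum(PosFullPrice.all)
function netAmount(positions) {
	return positions.reduce(function (sum, position) {
		return sum + position.PosFullPrice;
	}, 0);
}

// Default multiplicity of 3, as suggested in the text.
var samplePositions = generatePositions(3, [0, 10, 250], [1, 2, 3]);
var sampleNetAmount = netAmount(samplePositions);
```

<p>For scaling tests, the same helper could simply be called with a much larger multiplicity.</p>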
<h3>Path Aspects: Considering Order of Field Editing in Tests</h3>
<p>At mgm tp, the second automation aspect, the order of editing fields, is pursued with two approaches. Both are independent of test data. Data is simply provided from pre-computed test data sets as described in the previous sections.</p>
<h4>Controlled random testing</h4>
<p>One method of covering path dependencies consists of using randomly generated contexts. More specifically, <em>controlled random testing</em> (see, e.g., the presentation &#8220;<a href="http://www.infoq.com/presentations/Testing-for-the-Unexpected">Testing for the Unexpected</a>&#8221;) increases the likelihood that &#8220;interesting&#8221; paths/value combinations are found by chance, since both values and paths are addressed in parallel. Once interesting contexts have been found through test failures, one can focus further testing around them.</p>
<h4>Explicit testing paths</h4>
<p>In contrast to controlled random testing, explicit testing paths specify evaluation orders (interactive form visits and field assignments). Several strategies such as multiple visits with value reassignments or with cancelations can be configured. The visiting strategy is specified in advance, depending on the quality aims of the customers. Random testing is just a special case which can be specified in our test driver infrastructure.</p>
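<p>One way to read &#8220;controlled&#8221; is that a random path must be reproducible, so that a failing path can be replayed. The following JavaScript sketch illustrates this reading with a seeded shuffle of field names. It is an assumption made for illustration and not the mgm test driver; the function names are hypothetical and the seed is expected to be a non-negative integer.</p>

```javascript
// Small linear congruential generator, so that the same seed
// always produces the same sequence of pseudo-random numbers.
function createRandom(seed) {
	var state = seed % 4294967296;
	return function () {
		state = (state * 1664525 + 1013904223) % 4294967296;
		return state / 4294967296;
	};
}

// Fisher-Yates shuffle of the field names, driven by the seeded generator.
// The result is one editing order (path); the input array is not modified.
function randomPath(fieldNames, seed) {
	var random = createRandom(seed);
	var path = fieldNames.slice();
	var remaining = path.length;
	while (remaining !== 0) {
		var pick = Math.floor(random() * remaining);
		remaining--;
		var swap = path[remaining];
		path[remaining] = path[pick];
		path[pick] = swap;
	}
	return path;
}

// An explicit path is simply a fixed order, here with a revisit of one field.
var explicitPath = ['UnitPrice', 'Quantity', 'AlternativeVat', 'Quantity'];
```

<p>Logging the seed together with a test failure is then enough to repeat exactly the same editing order later.</p>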
<h3>Semantic Test Coverage using adaptable Testing Aims</h3>
<p>From the testing perspective, the greatest advantage of using formal specifications is the fact that the set of valid fields and field combinations is very well defined, and it is defined at the customer level. This enables the techniques discussed above. Furthermore, one can profit from the fact that customers possess knowledge about important and less important domain related aspects.</p>
<p>This permits a deviation from the pre-computed aims in terms related to the domain rather than to technology. It bridges from the customer domain to technology (semantics beats automatics).</p>
<ol>
<li>Sometimes interesting values, such as enumerations, have no impact on program ramifications, and in addition have many values which have similar semantics, with the exception of some specific ones &#8211; which can be derived from the domain knowledge.</li>
<li>Important cases may neither be modeled by enumerations nor explicitly be visible in constraint rules. In these cases non-formalized domain knowledge might be fed into the set of interesting values for specific fields. </li>
<li>By analogy, the need for multiplicity tests for specific fields might better be decided upon by persons with domain knowledge rather than taking defaults or algorithmic artifacts.</li>
<li>In some cases invalid data are of interest in order to test reactions to invalid inputs. If this is requested, it can be achieved simply by generating test data based on rule sets which contain deliberate negations of specific constraints.</li>
<li>Explicit testing paths are set to useful defaults, sometimes even including controlled random testing.</li>
</ol>
<p>The overall method starts from formal specifications and combines domain expert knowledge with automated testing (see the illustration for an overview):</p>
<ol>
<li>Automatically derive testing aims from specifications and store them in a default configuration.</li>
<li>Let these aims be analyzed by experts with domain knowledge, and adapt the testing aim configuration. This leads to an increase or a reduction of the size of the data set, as described in points 1 to 5 above.</li>
<li>Generate test data starting from an adapted test aim configuration for field values. Within this step specification flaws (e.g. contradicting rules) are being reported.</li>
<li>Perform tests with the generated test data while considering path configurations proven useful in former tests. Within this step any insufficient test coverage (in most cases due to incomplete specifications) is being reported.</li>
</ol>
<div id="attachment_1453" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-forms-specification-part2" href="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Adaptation-of-Testing-Aims.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/12/Adaptation-of-Testing-Aims-480x270.png" alt="" title="Adaptation-of-Testing-Aims" width="480" height="270" class="size-large wp-image-1453" /></a><p class="wp-caption-text">Illustration of the method for adaptation of testing aims.</p></div>
<p>In conclusion, the adaptation of testing aims is a tool-supported manual process which fits well with test data generation and automated testing.</p>
<h2>Conclusions</h2>
<p>In this blog series we have shown that, in several respects, formal specification languages can be used to deliver high quality software systems. Since specifications are provided at the comprehension level of the customer, we obtain a high likelihood that the system does what the customer intended. In particular, we gain high quality due to&#8230;</p>
<ul>
<li>automatic code generation,</li>
<li>well defined automatic test data generation,</li>
<li>measurable test coverage and adaptable testing aims,</li>
<li>fast turnarounds in the presence of requirement changes,</li>
<li>backends guarded by benign frontends, and</li>
<li>backends with well-defined test coverage based on frontend testing aims.</li>
</ul>
<p>In addition to these implementation and quality assurance benefits, the interaction with the customer is improved as well. Early feedback on errors, inconsistencies, and incompleteness in specifications becomes possible, which also improves the requirements analysis process.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2013/01/formal-dsl-part2/">On the Quality Benefits of Formal Domain Specific Languages - Software Quality driven by Formal DSLs, Part 2</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=23" title="View articles by Dr. Jürgen Knopp">Dr. Jürgen Knopp</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2013/01/formal-dsl-part2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Software Quality driven by Formal DSLs]]></series:name>
	</item>
		<item>
		<title>Mobile Dashboard Reporting powered by JAX-RS and Highcharts</title>
		<link>http://blog.mgm-tp.com/2013/01/mobile-dashboard-reporting-powered-by-jax-rs-and-highcharts/</link>
		<comments>http://blog.mgm-tp.com/2013/01/mobile-dashboard-reporting-powered-by-jax-rs-and-highcharts/#comments</comments>
		<pubDate>Fri, 04 Jan 2013 15:20:04 +0000</pubDate>
		<dc:creator>Jan Esser</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Chart]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[Java EE]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Mobile]]></category>
		<category><![CDATA[REST]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1478</guid>
		<description><![CDATA[When we developed this sales reporting solution for the insurance sector, we went for a mobile, browser-based dashboard that renders the reports on the client-side and thus enable a high degree of interactivity. That means that once the reporting data is delivered, the client should be able to e.g. drill down into the data or [...]]]></description>
			<content:encoded><![CDATA[<p>When we developed this sales reporting solution for the insurance sector, we went for a mobile, browser-based dashboard that renders the reports on the client side and thus enables a high degree of interactivity. That means that once the reporting data is delivered, the client should be able to e.g. drill down into the data or slide along the time axis. This article focuses on the technical aspects of the data delivery in JSON format and interactive charting in the browser.</p>
<p><span id="more-1478"></span></p>
<p>The basic idea was to develop a data bridge which receives the reporting data from the server side and provides it conveniently for consumption on the client side. The data is delivered on-demand to the HTML <a href="http://en.wikipedia.org/wiki/Single-page_application">single-page application</a> using <a href="http://www.adaptivepath.com/ideas/ajax-new-approach-web-applications">AJAX</a> to load <a href="http://www.json.org">JSON</a> documents. This gives us very high flexibility (as we will see) and is in contrast to the conventional mixing of data into HTML pages that happens on the server side (think of <a href="http://www.oracle.com/technetwork/java/javaee/jsp/index.html">JSP</a>/<a href="http://www.asp.net">ASP.NET</a>).</p>
<p>In our approach, the server is solely responsible for providing a data API (e.g. with support for time-sliced queries) and for delivering the data in the universal JSON format, i.e. it encodes data as code that can be interpreted by a broad range of different client platforms. This reduces the complexity of the server application by leaving the interpretation and visualization to the client application of the solution. The actual display of the data may even differ between the clients like desktop, browser or mobile applications.</p>
<p>The present browser application is based on <a href="http://www.highcharts.com">Highcharts</a> and <a href="http://jquery.com">jQuery</a>. We chose Highcharts since it&#8217;s a mature JavaScript charting framework.</p>
<div id="attachment_1479" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-dashboard" href="http://blog.mgm-tp.com/wp-content/uploads/2013/01/Dashboard.png"><img class="size-large wp-image-1479" src="http://blog.mgm-tp.com/wp-content/uploads/2013/01/Dashboard-480x322.png" alt="" width="480" height="322" /></a><p class="wp-caption-text">Screenshot of a sample Sales Mobile Dashboard.</p></div>
<h2>Providing Reporting Data to the Browser via JAX-RS</h2>
<p>Reporting data is known to have a multidimensional nature, while in practice reports (in the <a href="http://en.wikipedia.org/wiki/ROLAP">ROLAP</a> flavor) tend to have several views or perspectives cutting the data cube. Additionally, reporting data is time-sliced. For instance, the report data of 2001 is not going to change anymore. This is a perfect situation for caching, for instance in HTML5&#8217;s <a href="http://dev.w3.org/html5/webstorage/">Web Storage</a> or a mobile browser&#8217;s database. Furthermore, the time-slicing allows querying for distinct portions of the reporting data. Dividing the load in this way leads to acceptable response times (in contrast to mobile <a href="http://en.wikipedia.org/wiki/Data_warehouse">data warehouses</a>).</p>
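<p>The caching idea can be sketched as a thin wrapper around a storage object that offers the <code>getItem</code>/<code>setItem</code> methods of the Web Storage API (in the browser, for example, <code>window.localStorage</code>). This is an illustrative sketch and not the code of our dashboard: the names <code>createSliceCache</code> and <code>loadSlice</code> and the key layout are hypothetical, and the sketch does not distinguish closed periods from the current, still changing period, which should not be cached this way.</p>

```javascript
// Returns a function that serves a time slice from the storage if present
// and otherwise loads it once through `loadSlice` and stores it.
function createSliceCache(storage, loadSlice) {
	return function getSlice(reportType, periodBegin) {
		// Hypothetical key layout: one entry per report type and period.
		var key = 'report:' + reportType + ':' + periodBegin;
		var cached = storage.getItem(key);
		if (cached !== null) {
			return JSON.parse(cached);
		}
		var slice = loadSlice(reportType, periodBegin);
		// Web Storage holds strings only, hence the JSON round trip.
		storage.setItem(key, JSON.stringify(slice));
		return slice;
	};
}
```

<p>Because the storage object is passed in, the same function can be exercised outside the browser with a simple in-memory stand-in.</p>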
<p>The choice for the protocol to provide and consume the report data was easy, since HTTP is the most familiar protocol to browsers. The strong points of that decision came up later, when all pieces of that puzzle fit together.</p>
<p>Applying a <a href="http://en.wikipedia.org/wiki/Resource-oriented_architecture">resource-oriented approach</a> first requires some simple addressability, which makes it easier for clients to formulate follow-up requests. Next, a collection of resource representations needs to be defined. For the given setup, JSON was the evident choice. The listing below shows the key part of the <a href="http://docs.oracle.com/javaee/6/tutorial/doc/giepu.html">JAX-RS</a> bean serving the report data in JSON format using the <a href="http://jackson.codehaus.org">Jackson JSON processor</a>.</p>
<pre class="brush: java; wrap-lines: false;">
// JAX-RS Resource bean
public abstract class ReportingResource {
	...
	@GET
	@Produces(MediaType.APPLICATION_JSON)
	public final Response getReportData(
		@HeaderParam(DATE_FROM_HEADER) final Date dateFrom,
		@HeaderParam(DATE_UNTIL_HEADER) final Date dateUntil,
		@Context final HttpHeaders httpHeaders,
		@Context final SecurityContext securityContext)
	{
		// require authentication and a certain user role
		if (securityContext.getUserPrincipal() == null
			|| !securityContext.isUserInRole(REQUIRED_USER_ROLE))
		{
			return Response.status(Status.UNAUTHORIZED).build();
		}

		return Response.ok(
		    getReportDataInDateRange(dateFrom, dateUntil, httpHeaders))
		    .build();
	}
	...
}
</pre>
<h2>The Data-as-Code Approach using JSON</h2>
<p>The data-as-code principle applies here, as seen in the next listing: <a href="http://www.json.org">JSON</a> is a data format and at the same time part of the JavaScript language specification. Furthermore, it reflects the memory model of JavaScript runtime environments.</p>
<pre class="brush: jscript; wrap-lines: false;">
{
  &quot;data&quot;: [
    {
      &quot;lobNames&quot;: [
        &quot;LIABILITY_POLICY_PREMIUM&quot;,
        &quot;LIABILITY_OFFER_PREMIUM&quot;
      ],
      &quot;currency&quot;: &quot;EUR&quot;,
      &quot;periodBegin&quot;: &quot;2012-08-01&quot;,
      &quot;name&quot;: &quot;PremiumLobTimelineTimeslice&quot;,
      &quot;LIABILITY_POLICY_PREMIUM&quot;: 1461812,
      &quot;LIABILITY_OFFER_PREMIUM&quot;: 0
    },
    {
      &quot;lobNames&quot;: [
        &quot;LIABILITY_POLICY_PREMIUM&quot;,
        &quot;LIABILITY_OFFER_PREMIUM&quot;,
        &quot;PROPERTY_OFFER_PREMIUM&quot;,
        &quot;PROPERTY_POLICY_PREMIUM&quot;
      ],
      &quot;currency&quot;: &quot;EUR&quot;,
      &quot;periodBegin&quot;: &quot;2012-09-01&quot;,
      &quot;name&quot;: &quot;PremiumLobTimelineTimeslice&quot;,
      &quot;LIABILITY_POLICY_PREMIUM&quot;: 1081483,
      &quot;LIABILITY_OFFER_PREMIUM&quot;: 129314,
      &quot;PROPERTY_OFFER_PREMIUM&quot;: 1240473,
      &quot;PROPERTY_POLICY_PREMIUM&quot;: 61074854
    }
  ]
}
</pre>
<p>We use <a href="http://api.jquery.com/jQuery.ajax/">jQuery Ajax</a> requests to asynchronously fetch specific data from server REST/JAX-RS resources that follow the mentioned addressability rules. The data is basically fetched in time-slices, as seen in the listing below (<code>dateFrom</code> and <code>dateUntil</code>). We can restrict the data exchange to just those portions of the report data that are of current interest to the user.</p>
<pre class="brush: jscript; wrap-lines: false;">
// Excerpt from Reporting.js
reporting.Reporting = {
	/* ... */
	loadData : function(reportType, noCache) {
		/* ... */
		// the requested time slice is passed to the server in request headers
		var headerParams = {
			'dateFrom' : reporting.utils.DateNavigator.getStartDate(),
			'dateUntil': reporting.utils.DateNavigator.getEndDate()
		};
		jQuery.ajax({
			url : url,
			dataType : 'json',
			cache : !noCache,
			// this call blocks until the response has arrived
			async : false,
			headers : headerParams,
			success : onSuccess,
			error : onError
		});
		return resultData;
	}
	/* ... */
}
</pre>
<h2>Rendering Data Charts in the Browser</h2>
<p>Regarding the JSON structure, Highcharts has so-called formatters, like <a href="http://api.highcharts.com/highcharts#yAxis.labels.formatter"><code>yAxis.labels.formatter</code></a>, that allow easy access to row-wise reporting data. The minimum requirement is to have an ordered JSON array below <code>data</code> (see the JSON listing above). In some edge cases (as in the subsequent listing) it is also possible to re-group the reporting data so that it fits, e.g., a stacked bar chart like in this <a href="http://www.highcharts.com/demo/column-stacked-and-grouped">example</a>.</p>
<pre class="brush: jscript; wrap-lines: false;">
// Highcharts configuration
renderChartConfig : function(data) {
	// values for xAxis
	var categories = [];
	// stack of lob (line of business) totals
	var series = [];

	//...

	for ( var i = 0; i &lt; data.length; i++) {
		categories[i] = Highcharts.dateFormat(
				'%b %y', reporting.utils.DateUtil.parseUtcMillis(data[i].periodBegin));
		// series[j].data[i] = data[i][lobName];
	}

	var config = {
		// ...
		xAxis : {
			categories : categories,
			// ...
		},
		yAxis : {
			labels : {
				formatter : function() {
					return Highcharts.numberFormat(
							this.value, 0, '.', ',') + ' &amp;euro;';
				}
			},
		},
		legend : {
			labelFormatter : function() {
				return this.name;
			},
		},
		tooltip : {
			formatter : function() {
				return '&lt;strong&gt;' + this.x + '&lt;/strong&gt;'
					+'&lt;br/&gt;' + this.series.name + ': '
					+ Highcharts.numberFormat(this.y, 0, '.', ',') + '&lt;br/&gt;'
					+ 'Total: ' + Highcharts.numberFormat(
							this.point.stackTotal, 0, '.', ',');
			}
		},
		series : series
	};
	return config;
}
</pre>
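<p>The re-grouping step that is only hinted at by the commented-out line in the loop above could look roughly like the following sketch. It turns the row-wise time slices of the JSON document into one Highcharts series (an object with <code>name</code> and <code>data</code>) per line of business. The function name <code>toSeries</code> is hypothetical, and treating a missing value as 0 is an assumption made for this illustration.</p>

```javascript
// Re-group row-wise time slices into column-wise chart series.
function toSeries(data) {
	// Collect all line-of-business names in order of first appearance.
	var names = [];
	data.forEach(function (slice) {
		slice.lobNames.forEach(function (lobName) {
			if (names.indexOf(lobName) === -1) {
				names.push(lobName);
			}
		});
	});
	// One series per name, with one data point per time slice.
	return names.map(function (lobName) {
		return {
			name: lobName,
			data: data.map(function (slice) {
				// Assumption: slices lacking this line of business count as 0.
				var value = slice[lobName];
				return value === undefined ? 0 : value;
			})
		};
	});
}
```

<p>The resulting array has the shape expected for the <code>series</code> option of the chart configuration, with the data points in the same order as the <code>categories</code> of the x axis.</p>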
<h2>Conclusion</h2>
<p>Realizing the server side with JAX-RS and JSON marshalling resulted in a modular and extensible solution. Since reporting requirements change considerably over time, it should be possible to create a new perspective or view (on the data cube) in no time.</p>
<p>This is achieved by decoupling the client application&#8217;s use of the data from the server and its internals of how to provide the data. In other words, providing data does not require knowing how it will be interpreted. The complexity of browsers and client diversity mentioned above can be moved to the JavaScript frameworks and thus away from the server side.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2013/01/mobile-dashboard-reporting-powered-by-jax-rs-and-highcharts/">Mobile Dashboard Reporting powered by JAX-RS and Highcharts</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=29" title="View articles by Jan Esser">Jan Esser</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2013/01/mobile-dashboard-reporting-powered-by-jax-rs-and-highcharts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How KICKZ uses Facebook for a better Customer Experience</title>
		<link>http://blog.mgm-tp.com/2012/10/kickz-facebook-integration/</link>
		<comments>http://blog.mgm-tp.com/2012/10/kickz-facebook-integration/#comments</comments>
		<pubDate>Wed, 10 Oct 2012 14:01:17 +0000</pubDate>
		<dc:creator>Jiri Honc</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[ECommerce]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[REST]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Social]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1398</guid>
		<description><![CDATA[The KICKZ online store is our latest e-commerce project that has gained a deep integration with Facebook. This blog article presents the four ways to utilize Facebook for a better and smoother customer experience. First we show how customers can use their Facebook accounts for registration and login into the online store using the OAuth [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.kickz.com">KICKZ</a> online store is our latest e-commerce project that has gained a deep integration with Facebook. This blog article presents four ways to utilize Facebook for a better and smoother customer experience. First, we show how customers can use their Facebook accounts to register and log in to the online store using the OAuth 2.0 protocol. Next, we discuss accessing the customer data from Facebook&#8217;s Open Graph (if the user&#8217;s permission is given) in order to prefill registration and order forms. Finally, we deal with product reviews and with forwarding customer Facebook posts to the Facebook wall of the online store.</p>
<p><span id="more-1398"></span></p>
<p><a href="https://developers.facebook.com/docs/authentication/">Facebook uses</a> the authentication and authorization protocol <a href="http://oauth.net/2/">OAuth 2.0</a>. Due to the broad, world-wide user base of Facebook, OAuth has become a valuable authentication resource for web applications. But Facebook provides more for web applications: it gives access to the whole social graph of a user, provided the user grants permission. We now show step by step what such an integration can look like.</p>
<h2>Facebook OAuth</h2>
<p>The benefits of integrating an authentication provider are manifold. First, users can use their Facebook account to log in to several web applications. They don&#8217;t have to remember a bunch of credentials, and they don&#8217;t need to fill in registration forms again and again. Thus, an authentication service saves time and helps to keep the number of accounts to manage low.</p>
<div id="attachment_1412" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Button.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Button-480x377.png" alt="" title="Facebook Login Button" width="480" height="377" class="size-large wp-image-1412" /></a><p class="wp-caption-text">Login via Facebook: Users can login into the KICKZ web shop with their Facebook account. For new users, their Facebook profile data is used to prefill the registration form (see also screenshot below).</p></div>
<p>Second, nearly every Web 2.0 site already provides share buttons for Facebook or Google+; that is a must by now. The integration of Facebook&#8217;s OAuth service for user registration in your own web application will lower the bar for new users, because it&#8217;s just one click away. The huge popularity of Facebook with nearly 1 billion users around the world makes Facebook a good partner for authentication purposes. And what&#8217;s important for users: with the OAuth 2.0 protocol, your web application never sees the user&#8217;s Facebook password and therefore cannot leak it.</p>
<h2>Facebook&#8217;s Open Graph API</h2>
<p>Facebook provides a public API for its social graph (now <a href="http://developers.facebook.com/docs/opengraph/">Open Graph</a>). Any web application, identified to Facebook by its Facebook application, can ask the user for permission to access his personal data and his personal wall. If the user grants it, the web application is able to authenticate the user and log him in. If the user is not yet registered in your web application, it can read the personal data from Facebook and prefill the registration form. Additional actions include posting on the user&#8217;s wall and re-posting some of his posts to the Facebook wall of the company behind the web application.</p>
<div id="attachment_1404" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Prefilled-Registration-Form.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Prefilled-Registration-Form-480x415.png" alt="" title="Prefilled Registration Form" width="480" height="415" class="size-large wp-image-1404" /></a><p class="wp-caption-text">Example of a prefilled registration form with data obtained from the user's Facebook profile.</p></div>
<p>In order to get access to the <a href="http://developers.facebook.com/docs/reference/api">Facebook Open Graph API</a>, a web application developer needs to register a so-called &#8220;Facebook Application&#8221;. The Facebook App describes the connection to our application, in our case the online shop. It&#8217;s the integration point between our web application and Facebook.</p>
<p>Once the Facebook App is prepared via the <a href="https://developers.facebook.com/apps">App Dashboard</a>, we note the values &#8220;App ID/API Key&#8221; and &#8220;App Secret&#8221;. Our web application uses them to identify itself to Facebook when it accesses the Facebook Open Graph API. In the following screenshot, the App ID/API Key and App Secret values are blurred.</p>
<div id="attachment_1405" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Application.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Application-480x219.png" alt="" title="Facebook Application" width="480" height="219" class="size-large wp-image-1405" /></a><p class="wp-caption-text">Preparing a Facebook Application on the App Dashboard for your web application. You are provided with an App ID/API Key and an App Secret.</p></div>
<p>Facebook Applications are bound to a dedicated site URL and domain. This is a security mechanism: only requests from the registered site/domain are allowed. Find out more about Facebook Applications and the Open Graph API in the <a href="http://developers.facebook.com/docs/beta/opengraph/tutorial">Open Graph Tutorial</a>.</p>
<p>Hint: If you&#8217;re developing a web application, it&#8217;s helpful to register a Facebook Application for &#8220;localhost&#8221;. This way, all developers can share this single Facebook Application. For the production environment, just use another Facebook Application registered to the real web site and domain.</p>
<p>Now with the Facebook Application in place, we&#8217;re able to implement the features.</p>
<h2>Feature 1: User-Registration with Facebook</h2>
<p>Facebook provides a <a href="http://developers.facebook.com/docs/reference/javascript/">JavaScript SDK</a> which enables you to connect to your Facebook Application and then to access the Facebook Open Graph API to interact with the users&#8217; Facebook accounts. The JavaScript SDK can be rendered in three ways: XFBML, XFBML in HTML5-compliant markup, and via Iframe. We chose the first two, XFBML and HTML5, because they support better tracking capabilities. For more details, please refer to the <a href="http://developers.facebook.com/docs/plugins/">Social Plugins documentation</a>.</p>
<div id="attachment_1406" class="wp-caption alignnone" style="width: 490px"><a href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Client-side-Flow.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Client-side-Flow-480x364.png" alt="" title="Client-side Flow" width="480" height="364" class="size-large wp-image-1406" /></a><p class="wp-caption-text">Client-side authentication flow via the Facebook JavaScript SDK (steps 1 and 2 in diagram).</p></div>
<h3>1. Connect with Facebook Application (JavaScript)</h3>
<p>It is possible to initialize the JavaScript SDK asynchronously, and we prefer this implementation as it provides us with the fastest response if we use the following code right after the opening <code>&lt;body&gt;</code> tag:</p>
<pre class="brush: jscript; wrap-lines: false;">
&lt;div id=&quot;fb-root&quot;&gt;&lt;/div&gt;
&lt;script&gt;
  window.fbAsyncInit = function() {
    FB.init({appId: 'YOUR_APP_ID', status: true, cookie: true, oauth: true, xfbml: true});
    // Additional initialization code here
  };

  // Load the SDK asynchronously
  (function(d){
     var js, id = 'facebook-jssdk';
     if (d.getElementById(id)) {return;}
     js = d.createElement('script'); js.id = id; js.async = true;
     js.src = &quot;//connect.facebook.net/en_US/all.js&quot;;
     d.getElementsByTagName('head')[0].appendChild(js);
   }(document));
&lt;/script&gt;
</pre>
<h3>2. Log in a User (JavaScript)</h3>
<p>The authentication component of our web application needs to support three use-cases:</p>
<ol>
<li>Registered user and already logged in to Facebook</li>
<li>Registered user, but not logged in to Facebook</li>
<li>New User: The user has a Facebook account, but is not registered in our web application</li>
</ol>
<p>The listing below shows the necessary code on the client side. It just uses the Facebook API to either show the OAuth login dialog or ask for permission to access the user&#8217;s Facebook data in case he&#8217;s not a registered user. Facebook then maintains the granted permissions in our Facebook application, so we don&#8217;t need to check whether this user is already registered or not.</p>
<p>First we define the link:</p>
<pre class="brush: xml; wrap-lines: false;">
&lt;a href=&quot;#&quot; onclick=&quot;fb_login_register(); return false;&quot;&gt;Login via Facebook&lt;/a&gt;
</pre>
<p>And then we define the JavaScript function <code>fb_login_register</code>:</p>
<pre class="brush: jscript; wrap-lines: false;">
function fb_login_register() {
    FB.getLoginStatus(function(response) {
        if (response.status === &quot;connected&quot;) {
            // Case 1: Registered user and already logged in to Facebook
            LOGIN_TO_WEBAPP(response.authResponse.accessToken);
        } else {
            // Cases 2/3: show FB login dialog and ask for permissions to access private data
            FB.login(function(response) {
                if (response.authResponse) {
                    console.log(&quot;User is connected to the application.&quot;);
                    LOGIN_TO_WEBAPP(response.authResponse.accessToken);
                } else {
                    console.log('User cancelled login or did not fully authorize.');
                }
            }, {scope:'email,read_stream,user_location,publish_stream,offline_access'});
        }
    });
}
</pre>
<p>For further details see the <a href="http://developers.facebook.com/docs/reference/javascript/FB.getLoginStatus/">FB.getLoginStatus</a> and <a href="http://developers.facebook.com/docs/reference/javascript/FB.login/">FB.login</a> documentation.</p>
<p>And this is what the login procedure looks like to the user:</p>
<div id="attachment_1403" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook"  href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Popup.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Popup-480x292.png" alt="" title="Facebook Login Popup" width="480" height="292" class="size-large wp-image-1403" /></a><p class="wp-caption-text">Login: Using the client-side authentication flow will open a Facebook Login popup.</p></div>
<div id="attachment_1407" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Request-for-Permissions.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Request-for-Permissions-479x435.png" alt="" title="Request for Permissions" width="479" height="435" class="size-large wp-image-1407" /></a><p class="wp-caption-text">Request for Permissions.</p></div>
<p>After the successful login, we just send the user access token for this user to our server, which then gets the user&#8217;s data and provides a session for him.</p>
<p>Hint: The permission &#8220;offline_access&#8221; comes in very handy on the server side, because it allows access to the user&#8217;s data while the user is not logged in. However, Facebook has since removed offline_access. If you are building a new application, you shouldn&#8217;t use this permission. Instead, check the <a href="https://developers.facebook.com/docs/offline-access-deprecation/">Deprecation of Offline Access Permission</a>.</p>
<p>With our approach, the complete authentication takes place on the client side (i.e. in the browser) as shown above. The client sends Facebook&#8217;s user access token to the server, so on the server side we can access the <a href="http://developers.facebook.com/docs/reference/api/">Open Graph API</a> to get the personal information. This is done in step 3 below.</p>
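<p>How the token reaches the server is up to the web application. The following is only a minimal sketch of what the <code>LOGIN_TO_WEBAPP</code> placeholder could do, assuming a hypothetical endpoint <code>/login/facebook</code> and a form parameter named <code>accessToken</code>; neither is part of the Facebook API, and the request should go over HTTPS.</p>

```javascript
// Hypothetical endpoint of our own web application (an assumption for this sketch).
const LOGIN_ENDPOINT = "/login/facebook";

// Builds the request that hands the user access token to our server.
function buildLoginRequest(accessToken) {
  if (typeof accessToken !== "string" || accessToken.length === 0) {
    throw new Error("Missing Facebook user access token");
  }
  return {
    url: LOGIN_ENDPOINT,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      // URL-encode the token so that special characters survive the transfer.
      body: "accessToken=" + encodeURIComponent(accessToken),
      // Include the session cookie of our own site.
      credentials: "same-origin"
    }
  };
}

// One possible implementation of the LOGIN_TO_WEBAPP placeholder.
function loginToWebapp(accessToken) {
  const request = buildLoginRequest(accessToken);
  return fetch(request.url, request.options);
}
```

<p>Using POST keeps the token out of the URL, so it does not end up in browser history or in typical access logs.</p>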
<h3>Alternative: Server-Side Authentication</h3>
<p>Analogous to steps 1 and 2 above, which use the JavaScript API from within the browser, you can also perform the OAuth authentication on the server side using the so-called <a href="http://developers.facebook.com/docs/authentication/server-side/">server-side OAuth flow</a>, an HTTP-based approach. Here again, you first connect to our Facebook Application, and then you can use the Facebook Open Graph API to read the user&#8217;s data (step 3 below).</p>
<p>Doing the authentication with the server-side flow leads to some HTTP redirects, so the user has to leave the origin website (our <a href="http://www.kickz.com">KICKZ</a> shop) during the login, as shown in the screenshot below. The benefit of the server-side authentication is better security, because the access token does not have to be transferred over the network from the user&#8217;s browser to the web application on the server.</p>
<div id="attachment_1408" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook"  href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Redirect.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Login-Redirect-480x275.png" alt="" title="Facebook Login Redirect" width="480" height="275" class="size-large wp-image-1408" /></a><p class="wp-caption-text">Login: Using the server-side authentication flow causes an HTTP redirect to the Facebook login page.</p></div>
<p>Here&#8217;s how you connect to our Facebook Application:</p>
<p><strong>Step 1: Request a code and ask for permissions</strong></p>
<p>In this first step we generate a URL as follows:</p>
<pre class="brush: plain;">

https://www.facebook.com/dialog/oauth

    ?client_id=YOUR_APP_ID
    &amp;redirect_uri=YOUR_REDIRECT_URI
</pre>
<p>Here, you pass your Facebook App&#8217;s ID as the <code>client_id</code> parameter in place of <code>YOUR_APP_ID</code>. For a full list of parameters, see the <a href="http://developers.facebook.com/docs/reference/dialogs/oauth/">OAuth Dialog documentation</a>.</p>
<p>Hint: If you define <code>YOUR_REDIRECT_URI</code> as &#8220;http://localhost/&#8221; you can test it directly in your browser. First you will be prompted to authorize the Facebook App to access your Facebook account, then you will be redirected to your localhost with the code as a URL parameter:</p>
<pre class="brush: plain;">

http://localhost/?code=OAUTH_CODE_GENERATED_BY_FACEBOOK
</pre>
<p>In this step you also request the permissions your application needs. The <a href="http://developers.facebook.com/docs/reference/api/permissions/">permissions</a> are defined by the <code>scope</code> parameter, e.g.:</p>
<pre class="brush: plain; wrap-lines: false;">
scope=email,read_stream,user_location,publish_stream,offline_access
</pre>
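<p>Putting the pieces of step 1 together, the dialog URL could be assembled as sketched below. The function name <code>buildOAuthDialogUrl</code> is our own choice for this illustration; only the URL and the parameters <code>client_id</code>, <code>redirect_uri</code> and <code>scope</code> come from the text above.</p>

```javascript
// Builds the URL of the Facebook OAuth dialog (step 1 of the server-side flow).
// The function name is hypothetical; the parameter names are those shown above.
function buildOAuthDialogUrl(appId, redirectUri, permissions) {
  const params = [
    "client_id=" + encodeURIComponent(appId),
    "redirect_uri=" + encodeURIComponent(redirectUri)
  ];
  if (permissions.length > 0) {
    // The scope parameter is a comma-separated list of permission names.
    params.push("scope=" + encodeURIComponent(permissions.join(",")));
  }
  return "https://www.facebook.com/dialog/oauth?" + params.join("&");
}

// Example: the localhost test from the hint above.
const dialogUrl = buildOAuthDialogUrl("YOUR_APP_ID", "http://localhost/", ["email", "read_stream"]);
console.log(dialogUrl);
```

<p>Encoding each value separately matters mostly for the redirect URI, which itself contains characters such as <code>:</code> and <code>/</code>.</p>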
<div id="attachment_1409" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Server-side-Flow.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Server-side-Flow-480x365.png" alt="" title="Server-side Flow" width="480" height="365" class="size-large wp-image-1409" /></a><p class="wp-caption-text">Server-side authentication flow using HTTP requests (steps 1 and 2 in diagram).</p></div>
<p><strong>Step 2: Request for User Access Token</strong></p>
<p>Now that you&#8217;ve got the code, you can ask for the access token as follows:</p>
<pre class="brush: plain; wrap-lines: false;">

https://graph.facebook.com/oauth/access_token?

	client_id=YOUR_APP_ID&amp;redirect_uri=YOUR_REDIRECT_URI&amp;
	client_secret=YOUR_APP_SECRET&amp;code=OAUTH_CODE_GENERATED_BY_FACEBOOK
</pre>
<p>The response for this request contains the access token in the message body. Example:</p>
<pre class="brush: plain;">
access_token=USER_ACCESS_TOKEN&amp;expires=YYYY
</pre>
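<p>As an illustration of step 2, the request URL can be built from the code, and the response body shown above can be parsed like a query string. This is only a sketch: the function names are our own, and we assume that <code>expires</code> is a lifetime in seconds.</p>

```javascript
// Builds the request URL for step 2 from the code returned in step 1.
function buildAccessTokenUrl(appId, appSecret, redirectUri, code) {
  const query = new URLSearchParams({
    client_id: appId,
    redirect_uri: redirectUri,
    client_secret: appSecret,
    code: code
  });
  return "https://graph.facebook.com/oauth/access_token?" + query.toString();
}

// Parses a response body of the form "access_token=...&expires=...".
function parseAccessTokenResponse(body) {
  const fields = new URLSearchParams(body.trim());
  const accessToken = fields.get("access_token");
  if (!accessToken) {
    throw new Error("No access_token in response: " + body);
  }
  const expires = fields.get("expires");
  return {
    accessToken: accessToken,
    // Assumed to be a lifetime in seconds; null if the field is absent.
    expiresInSeconds: expires === null ? null : Number(expires)
  };
}
```

<p>Because the request contains the App Secret, it belongs on the server and must never be issued from the browser.</p>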
<p>We just use the <a href="http://hc.apache.org/httpclient-3.x/apidocs/org/apache/commons/httpclient/methods/GetMethod.html">GetMethod</a> class of Commons HttpClient 3.x, the predecessor of the <a href="http://hc.apache.org">Apache HttpComponents</a> library, for sending the HTTP requests and obtaining the responses.</p>
<h3>3. Load the user&#8217;s data and create a session (Server-Side)</h3>
<p>Once we have our user access token, obtained either the JavaScript way or via the server-side flow, we can use any Java library to load the personal information of a user from Facebook.</p>
<p>There are several Java libraries for accessing the Facebook API. In this example we just use the <a href="http://restfb.com">restfb</a> library, because it offers a very intuitive way for working with the Facebook Open Graph.</p>
<p>This is the code to retrieve a user from Facebook, using the access token for the user delivered by the client-side: </p>
<pre class="brush: java; wrap-lines: false;">
// The user access token identifies the user; &quot;me&quot; is the Graph API object to fetch.
FacebookClient facebookClient = new DefaultFacebookClient(USER_ACCESS_TOKEN);
FcbUser fbUser = facebookClient.fetchObject(
            &quot;me&quot;,
            FcbUser.class,
            Parameter.with(&quot;fields&quot;, &quot;id, first_name, last_name, picture, email, location, gender, birthday&quot;));
</pre>
<p>Now that we&#8217;ve successfully got the Facebook user, we can copy his basic data into our user base, create a new session for him and log him in. In our web application we just inform the user that he&#8217;s now logged in, which looks like this:</p>
<div id="attachment_1410" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Successful-Login.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Successful-Login-480x184.png" alt="" title="Successful Login" width="480" height="184" class="size-large wp-image-1410" /></a><p class="wp-caption-text">The web application greets and informs the user about the successful login via Facebook Connect.</p></div>
<h3>Conclusion</h3>
<p>With these fairly easy steps the &#8220;Login via Facebook&#8221; feature is implemented. The web application can provide a one-click registration process for new users. In contrast to our native registration process, in which we ask for all relevant information, we now support a so-called &#8220;partial&#8221; registration. All information that is not available via Facebook, like the postal address, is requested from the user during the first check-out.</p>
<h2>Feature 2: Generate posts on a user&#8217;s Facebook wall</h2>
<p>Everybody knows the &#8220;like buttons&#8221; on web pages, but with the Facebook API there are many more possibilities to interact with the users. In our web shop each user is able to post reviews on products. These reviews are shared with other customers directly in our web shop.</p>
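<p>The idea of a partial registration can be sketched as a simple mapping from the Facebook user object to the registration form. The form field names below are hypothetical; the Facebook field names are the ones requested in step 3.</p>

```javascript
// Maps a Facebook user object to the fields of a hypothetical registration form.
function prefillRegistration(fbUser) {
  const form = {
    firstName: fbUser.first_name || "",
    lastName: fbUser.last_name || "",
    email: fbUser.email || "",
    // Facebook's location is a single display name, not a postal address.
    city: fbUser.location ? (fbUser.location.name || "") : "",
    street: "",
    postalCode: ""
  };
  // Fields that are still empty have to be asked for later, e.g. during check-out.
  const missing = Object.keys(form).filter(function (key) {
    return form[key] === "";
  });
  return { form: form, missing: missing };
}

const result = prefillRegistration({
  first_name: "Jane",
  last_name: "Doe",
  email: "jane@example.com",
  location: { name: "Munich, Germany" }
});
console.log(result.missing);
```

<p>Keeping the list of missing fields next to the form makes it easy to decide at check-out which data still has to be collected.</p>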
<p>With the Facebook integration, we can share the reviews of a user and post them directly on his <a href="http://en.wikipedia.org/wiki/Facebook_features#Wall">Facebook wall</a>. This leads to a win-win situation, because the user sees his reviews liked by other users. And the web shop gets more incoming links and more attention which helps to boost marketing and SEO.</p>
<p>Let&#8217;s look at how to post a message to a user&#8217;s Facebook wall. Since we already have the permissions and know how to obtain an access token, we can just use the restfb library like this:</p>
<pre class="brush: java; wrap-lines: false;">
FacebookClient facebookClient = new DefaultFacebookClient(ACCESS_TOKEN);
FacebookType publishMessageResponse = facebookClient.publish(
    &quot;me/feed&quot;, FacebookType.class,
    Parameter.with(&quot;message&quot;, MESSAGE_TO_POST),
    Parameter.with(&quot;link&quot;, LINK_TO_POST),
    Parameter.with(&quot;picture&quot;, PICTURE));

String postId = publishMessageResponse.getId();
log.debug(&quot;Published message ID: &quot; + postId);
</pre>
<p>And since we have the ID of the post, we can also add a comment to that message as follows:</p>
<pre class="brush: java; wrap-lines: false;">
FacebookType response = facebookClient.publish(
    postId + &quot;/comments&quot;, FacebookType.class,
	Parameter.with(&quot;message&quot;, COMMENT_TO_ORIGINAL_MESSAGE)); 

log.debug(&quot;Comment ID: &quot; + response.getId());
</pre>
<p>In our web application we use this feature to automatically post a user&#8217;s product reviews on his personal Facebook wall. The reviews are checked by shop personnel first, thus not all reviews get distributed automatically. So the code snippets above are integrated into the customer review approval workflow.</p>
<h2>Feature 3: Listen on a user&#8217;s Facebook wall and copy posts to the shop wall</h2>
<p>Facebook also provides a listener interface, so a Facebook application can register listeners for the posts of a user. For each new post the listener gets invoked and is then able to decide what to do.</p>
<p>The callback functionality cuts across basic Facebook privacy. With this mechanism each post can break out of the walled Facebook garden in which the user originally wanted to share it. So this feature needs to be handled with great respect and empathy for the users&#8217; sense of privacy. Accordingly, a special permission from the user is required in order to get these posts.</p>
<p>As KICKZ is a strong shop brand and the customers have a close relationship to the brand, they usually accept this post forwarding and feel proud to be present on the Facebook wall of their favorite shop. This may not be the case for every company or brand, and customers may get offended when they see their &#8220;private messages&#8221; on the company&#8217;s Facebook wall.</p>
<p>The following steps are necessary to allow us to listen and read a user&#8217;s posts:</p>
<ul>
<li>Ask the user for the permissions &#8220;read_stream&#8221; and &#8220;publish_stream&#8221;.</li>
<li>Register a Facebook listener (callback server).</li>
<li>Filter user activities and publish them to the company&#8217;s Facebook wall or to the web shop site.</li>
</ul>
<div id="attachment_1414" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-kickz-facebook" href="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Callback.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/10/Facebook-Callback-480x190.png" alt="" title="Facebook Callback" width="480" height="190" class="size-large wp-image-1414" /></a><p class="wp-caption-text">Main steps of this listening and filtering process (read left-to-right).</p></div>
<p>We implemented a simple &#8220;fb_callback_servlet&#8221; which accepts the incoming calls from Facebook. If the user&#8217;s post matches certain criteria, then his post is directly posted to the Facebook wall of the shop and its web site.</p>
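<p>The criteria themselves depend on the shop. Purely as an illustration, and not as the rule actually used in the project, a filter could check whether a post mentions one of a list of shop-related keywords:</p>

```javascript
// Hypothetical criteria: the post has a text message that mentions a shop keyword.
function shouldForward(post, keywords) {
  if (!post || typeof post.message !== "string") {
    return false;
  }
  const text = post.message.toLowerCase();
  return keywords.some(function (keyword) {
    return text.indexOf(keyword.toLowerCase()) !== -1;
  });
}

const keywords = ["kickz", "sneaker"];
console.log(shouldForward({ message: "Got my new sneakers from KICKZ today!" }, keywords));
console.log(shouldForward({ message: "Lunch was great." }, keywords));
```

<p>A filter like this only preselects candidates; whether a post is really forwarded remains a decision that should respect the user&#8217;s privacy.</p>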
<p>Here is some background information about the steps:</p>
<ul>
<li><strong>Registration of the callback server:</strong> We define a &#8220;callback_url&#8221; on which the &#8220;fb_callback_servlet&#8221; listens. During the registration process we pass the access token of our Facebook Application and the parameters describing what we want to listen to and on which object of the Graph API.</li>
<li><strong>User activity listener:</strong> Our &#8220;fb_callback_servlet&#8221; provides a <code>doGet()</code> method that Facebook calls to verify the subscription to the user stream. This verification is also required for any modification or deletion of the subscription. The <code>doPost()</code> method of the servlet is used as the &#8220;callback&#8221; and receives the data of the user stream.</li>
</ul>
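<p>The logic behind the two servlet methods can be sketched as two plain functions. This is not our servlet code but an illustration in JavaScript. It assumes the request parameters <code>hub.mode</code>, <code>hub.verify_token</code> and <code>hub.challenge</code> and a JSON notification with <code>object</code>, <code>entry</code>, <code>uid</code> and <code>changed_fields</code>, as we understand the Realtime Updates documentation; please check the names against the current documentation.</p>

```javascript
// Logic of doGet(): answer Facebook's request that verifies a subscription.
// query: parsed query parameters; verifyToken: the secret chosen at registration.
function handleVerification(query, verifyToken) {
  const isSubscribe = query["hub.mode"] === "subscribe";
  const tokenMatches = query["hub.verify_token"] === verifyToken;
  if (isSubscribe && tokenMatches) {
    // Echoing the challenge confirms that we control the callback URL.
    return { status: 200, body: String(query["hub.challenge"]) };
  }
  return { status: 403, body: "Verification failed" };
}

// Logic of doPost(): find out whose feed changed.
// Returns the IDs of those users, so that their new posts can be fetched and filtered.
function handleNotification(jsonBody) {
  const update = JSON.parse(jsonBody);
  if (update.object !== "user" || !Array.isArray(update.entry)) {
    return [];
  }
  return update.entry
    .filter(function (entry) {
      return (entry.changed_fields || []).indexOf("feed") !== -1;
    })
    .map(function (entry) {
      return String(entry.uid);
    });
}
```

<p>Under these assumptions the notification only says which users changed, so the posts themselves still have to be read via the Graph API with the &#8220;read_stream&#8221; permission.</p>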
<p>An official description of this function is available on &#8220;<a href="http://developers.facebook.com/docs/reference/api/realtime/">Realtime Updates</a>&#8221; in the Facebook developer documentation. And a simple implementation example is available on <a href="https://github.com/facebook/real-time/tree/master/samples">facebook / real-time (Github)</a>.</p>
<h2>Feature 4: Tracking Facebook communication via Google Analytics</h2>
<p><a href="https://developers.google.com/analytics/devguides/collection/gajs/">Tracking</a> Facebook communication with the <a href="http://www.google.com/intl/en/analytics/">Google Analytics tool</a> provides useful information for the management of e-commerce activities and shows traffic related to Facebook.  </p>
<p>The integration of Google Analytics is also straightforward and there are many good examples like <a href="http://analytics-api-samples.googlecode.com/svn/trunk/src/tracking/javascript/v5/social/facebook_js_async.html">Google Analytics Social Tracking Demo</a> and tutorials available, e.g. <a href="http://analytics.blogspot.com/2011/07/social-plugin-tracking-in-google.html">&#8220;Social Plugin Analytics in Google Analytics&#8221;</a>.</p>
<p>In this web application we use the tracker in the traditional way, but we override the global social tracking, as we are currently only interested in Facebook events on our pages. So we add tracking logic for Facebook directly to the Google Analytics <a href="https://developers.google.com/analytics/devguides/collection/gajs/methods/">pageTracker object</a>:</p>
<pre class="brush: jscript; wrap-lines: false;">
&lt;script type=&quot;text/javascript&quot;&gt;
var gaJsHost = ((&quot;https:&quot; == document.location.protocol) ? &quot;https://ssl.&quot; : &quot;http://www.&quot;);
document.write(unescape(&quot;%3Cscript src='&quot; + gaJsHost + &quot;google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E&quot;));

var pageTracker = _gat._getTracker(GOOGLE_ANALYTICS_ID);
pageTracker.trackFacebook = function(opt_pageUrl, opt_trackerName) {
    try {
      // Register for all Facebook events we want to track
      if (FB &amp;&amp; FB.Event &amp;&amp; FB.Event.subscribe) {
        FB.Event.subscribe('edge.create', function(targetUrl) {
          pageTracker._trackSocial('facebook', 'like', targetUrl, opt_pageUrl);
        });
        FB.Event.subscribe('edge.remove', function(targetUrl) {
          pageTracker._trackSocial('facebook', 'unlike', targetUrl, opt_pageUrl);
        });
        FB.Event.subscribe('message.send', function(targetUrl) {
          pageTracker._trackSocial('facebook', 'send', targetUrl, opt_pageUrl);
        });
      }
    } catch (e) {
      // Ignore errors, e.g. when the Facebook SDK has not been loaded
    }
};
&lt;/script&gt;
</pre>
<p>The <code>pageTracker.trackFacebook</code> method defined above is called after <code>FB.init()</code> in the web-pages of our web application.</p>
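<p>The call order can be sketched as follows. The helper <code>initFacebookWithTracking</code> is a hypothetical name; it receives the <code>FB</code> and <code>pageTracker</code> objects as arguments so that the wiring can also be tried out with stand-in objects.</p>

```javascript
// fb and tracker are passed in so the wiring can be tried out with stand-ins.
function initFacebookWithTracking(fb, tracker, appId) {
  fb.init({ appId: appId, status: true, cookie: true, oauth: true, xfbml: true });
  // Subscribe only after FB.init(), when the SDK is ready to deliver events.
  tracker.trackFacebook();
}

// In the page this would be wired up roughly as:
// window.fbAsyncInit = function () {
//   initFacebookWithTracking(FB, pageTracker, 'YOUR_APP_ID');
// };
```

<p>Placing the call inside <code>fbAsyncInit</code> ensures that the subscription happens only once the asynchronously loaded SDK is available.</p>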
<h2>Conclusion</h2>
<p>We showed that the integration of Facebook into a web application is quite straightforward. We demonstrated a social login/registration in the context of a shopping use case. These features allow a user to log in quickly and to provide the postal information later, on demand, during check-out.</p>
<p>The integration of customer reviews and forwarded user posts with the shop&#8217;s Facebook wall helps to increase the visibility of the web site, since a lot of fresh incoming links are generated. In order to respect the users&#8217; privacy, no automatic forwarding is applied; the reviews are moderated by the shop personnel. Finally, the integration of Google Analytics helps to monitor the Facebook-related traffic and thus the effectiveness of our efforts.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/10/kickz-facebook-integration/">How KICKZ uses Facebook for a better Customer Experience</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=28" title="View articles by Jiri Honc">Jiri Honc</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/10/kickz-facebook-integration/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Integrating FredHopper into a hybris Marketplace</title>
		<link>http://blog.mgm-tp.com/2012/09/fredhopper-hybris-integration/</link>
		<comments>http://blog.mgm-tp.com/2012/09/fredhopper-hybris-integration/#comments</comments>
		<pubDate>Wed, 26 Sep 2012 15:02:40 +0000</pubDate>
		<dc:creator>Christian Belka</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[ECommerce]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Hybris]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1375</guid>
		<description><![CDATA[In this blog article we discuss how Fredhopper, an advanced site search and merchandising product, can be integrated into the hybris eCommerce suite not only to search for products, but to create cross selling and campaigns as well. In the used scenario hybris is the foundation of a marketplace with a few million products from [...]]]></description>
			<content:encoded><![CDATA[<p>In this blog article we discuss how Fredhopper, an advanced site search and merchandising product, can be integrated into the hybris eCommerce suite not only to search for products, but to create <a href="http://en.wikipedia.org/wiki/Cross-selling">cross selling</a> and campaigns as well. In the used scenario hybris is the foundation of a marketplace with a few million products from thousands of vendors.</p>
<p><span id="more-1375"></span></p>
<p>The standard edition of <a href="http://www.hybris.com">hybris</a> uses the open source Lucene-based <a href="http://lucene.apache.org/solr/">Solr</a> enterprise search server. But for this project Fredhopper will be used instead. The article explains how Fredhopper can be provided with all the required data from the hybris system.</p>
<p><a href="http://www.fredhopper.com/products/">Fredhopper</a> is a commercial Online Marketing Suite with a focus on</p>
<ul>
<li>On-site Search,</li>
<li>On-site Targeting, and</li>
<li>Predictive Targeting.</li>
</ul>
<p>Fredhopper also delivers the navigational structure for our project and all other navigation-related items like breadcrumbs, etc.</p>
<p>A marketplace consists not only of one huge catalog, which lists all products from all marketplace vendors. It also contains individual vendor shops, where the vendors can have their own catalog structure, their own navigation and their own search. All this is also handled by Fredhopper.</p>
<div id="attachment_1385" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-fredhopper" href="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Marketplace.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Marketplace-480x348.png" alt="" title="Marketplace" width="480" height="348" class="size-large wp-image-1385" /></a><p class="wp-caption-text">A marketplace with two vendors (dealers).</p></div>
<p>For example, consider the main catalog in the figure above. It can contain an audio player that is offered by several vendors. The audio player must therefore appear in the main catalog and also in the catalogs of those vendors. Within a vendor shop, only that vendor&#8217;s offer of the audio player should be displayed. The main catalog of the marketplace, in contrast, covers the offers of all vendors at their different prices, and the standard result list (reached via search or navigation) displays the offer with the lowest price.</p>
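<p>In the project this selection is handled by Fredhopper. The following sketch only illustrates the rule itself, using a hypothetical offer shape with <code>productId</code>, <code>vendor</code> and <code>price</code>.</p>

```javascript
// Picks, for every product, the offer with the lowest price (main catalog view).
// offers: array of { productId, vendor, price }; this shape is an assumption.
function cheapestOfferPerProduct(offers) {
  const best = new Map();
  offers.forEach(function (offer) {
    const current = best.get(offer.productId);
    if (current === undefined || offer.price < current.price) {
      best.set(offer.productId, offer);
    }
  });
  return Array.from(best.values());
}

// Restricts the offers to a single vendor shop (vendor shop view).
function offersOfVendor(offers, vendor) {
  return offers.filter(function (offer) {
    return offer.vendor === vendor;
  });
}

const offers = [
  { productId: "audio-player", vendor: "A", price: 99 },
  { productId: "audio-player", vendor: "B", price: 89 },
  { productId: "headphones", vendor: "A", price: 25 }
];
console.log(cheapestOfferPerProduct(offers));
console.log(offersOfVendor(offers, "A"));
```

<p>The same product thus shows up once in the main catalog, with the cheapest offer, and once in every vendor shop that sells it.</p>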
<h2>Why Fredhopper?</h2>
<p>First, Fredhopper can do much more than searching. It can also be used to create campaigns for <a href="http://en.wikipedia.org/wiki/Upselling">upselling</a>, has a recommendation feature and offers sophisticated auto-correction for spelling, etc. in a lot of languages, including those used in the marketplace.</p>
<div id="attachment_1386" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-fredhopper" href="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Fredhopper-Recommendations.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Fredhopper-Recommendations-480x509.png" alt="" title="Fredhopper-Recommendations" width="480" height="509" class="size-large wp-image-1386" /></a><p class="wp-caption-text">Illustrations of Fredhopper-based offers and recommendations.</p></div>
<p>Secondly, changes in its behavior can be easily configured by marketing users without a need for technical support or code changes. This flexibility was the reason why we switched from the provided Solr product to Fredhopper.</p>
<h2>Integrating Fredhopper into hybris</h2>
<p>Before I discuss the details of the integration, let&#8217;s take a look at the hardware and infrastructure we had to integrate. As the following diagram shows, Fredhopper is deployed on different servers where each server has a specific function (Navigation/search and quick search).</p>
<div id="attachment_1387" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-fredhopper" href="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Deployment.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Deployment-479x218.png" alt="" title="Deployment" width="479" height="218" class="size-large wp-image-1387" /></a><p class="wp-caption-text">Distributed Fredhopper deployment for Navigation/search and quick search.</p></div>
<p>To use the Fredhopper features like navigation services, search services, campaigns, etc. we developed a suite of components for the <a href="http://www.hybris.com/en/modules/wcms">hybris WCMS</a> (Web Content Management System):</p>
<ul>
<li>Components used to display categories</li>
<li>Search components</li>
<li>Navigation components</li>
<li>Recommendation components</li>
</ul>
<p>All those components are &#8220;regular&#8221; hybris components that query the Fredhopper index.</p>
<p>The Fredhopper servers receive requests from and deliver responses to the hybris servers and not directly to the user. The hybris servers use web services to communicate with the Fredhopper servers. Fredhopper provides a Java-based web service library to send queries. We wrote our own wrapper to encapsulate all the technical details and created an API where all project-based queries are encapsulated.</p>
<p>The following snippet shows how to use this wrapper (<code>NavigationHelper</code>) to communicate with Fredhopper. It&#8217;s an excerpt of our WCMS search component:</p>
<pre class="brush: java; wrap-lines: false;">
@RequestMapping(&quot;/ProductsSearchComponentController.html&quot;)
public String showSearchNavigation(final ModelMap model, final HttpServletRequest request) throws WebException {

	// ProductsListComponentModel is a hybris component
	ProductsListComponentModel component = (ProductsListComponentModel) request
			.getAttribute(FrontendConstants.CMSConstants.COMPONENT_KEY);
	long startTime = System.currentTimeMillis();
	NavigationState navigationState = this.navigationHelper.getOrCreateNavigationState(request);

	// --- Calls a Fredhopper web service through our navigationHelper wrapper
	// NavigationResult is a DTO, result from Fredhopper
	NavigationResult navigationResult = this.navigationHelper
			.getOrCreateNavigationResult(request, navigationState);

	try {
		this.navigationHelper.getSelectedCategory(request, navigationState);
	} catch (final UnknownIdentifierException e) {
		Logger.getLogger(this.getClass()).warn(&quot;No navigation category with code '&quot; +
				navigationState.getCategoryPath() + &quot;' found&quot;);
		throw new WebException(&quot;Category not found&quot;, HttpServletResponse.SC_NOT_FOUND, e);
	}

	// --- Build up Spring MVC view model
	model.put(&quot;orderByList&quot;, SearchOrderEnumeration.getSortOptions(navigationState.getVendorShop() != null));
	model.put(&quot;pageSizeList&quot;, pageSizeList);
	model.put(&quot;isSearch&quot;, Boolean.valueOf(navigationState.getSearchTerm() != null));
	model.put(&quot;productsPagination&quot;, this.paginationHelper.pagination((int) navigationResult.getTotalCount(),
			navigationResult.getPageSize(), navigationResult.getPage()));
	if (navigationState.getSearchTerm() != null) {
		model.put(&quot;suggestionList&quot;, this.navigationHelper.getSuggestions(navigationState.getSearchTerm()));
		model.put(CampaignComponentConstants.CONTEXT_CAMPAIGN_ELEMENT,
				searchCampaign(navigationResult.getCampaigns(), component.getCampaignName()));
	}
	if (Config.getBoolean(NavigationHelper.FREDHOPPER_LOG_TIME, false)) {
		Logger.getLogger(this.getClass()).warn
				(&quot;FREDHOPPER_TIME: showSearchNavigation took &quot; +
						(System.currentTimeMillis() - startTime) + &quot;ms&quot;);
	}

	if (navigationState.getVendorShop() == null) {
		return FrontendConstants.CMS2_COMPONENT_VIEW_PATH + COMPONENT_JSP_PATH;
	} else {
		model.put(&quot;contextVendorShop&quot;, navigationState.getVendorShop());
		return FrontendConstants.CMS2_COMPONENT_VIEW_PATH + COMPONENT_FOR_VENDOR_SHOP_JSP_PATH;
	}
}
</pre>
<p>The class <code>NavigationHelper</code> calls the Fredhopper Navigation Service, which handles the &#8220;real&#8221; communication with Fredhopper through the Fredhopper Java web service client, like this:</p>
<pre class="brush: java;">
public NavigationResult navigate(final NavigationState state) throws NavigationException {
	...
	// Mapping code that builds the Fredhopper query from the NavigationState object
	final Page page = this.runQuery(query);
	// Mapping code from the Fredhopper Page object to our NavigationResult
	...
}
</pre>
<p>The <code>runQuery()</code> method performs the actual web service call, as shown below:</p>
<pre class="brush: java;">
private Page runQuery(final Query query) {
	final String queryString = query.toQueryString();
	final FASWebService fasService = getFasWebService();
	return fasService.getAll(queryString);
}
</pre>
<p><code>FASWebService</code> is a class provided by Fredhopper. Its <code>getAll()</code> method returns a Fredhopper <code>com.fredhopper.webservice.client.Page</code> object, which we use to fill our <code>NavigationResult</code> object.</p>
<p>Starting from this base, we can easily provide a toolbox consisting of Fredhopper WCMS components.</p>
<h2>Updating the Fredhopper search index</h2>
<p>To export the data from hybris to Fredhopper, we reuse the existing Solr export features of hybris (part of the Solr extension) and extended them for use with Fredhopper. On this basis we run an export that writes out all products, row by row.</p>
<p>If a product has multiple offers from different vendors, the cheapest offer is selected for the export, because the current business rules require the cheapest offer to be displayed as the default. The idea is that the frontend retrieves the other offers from the database and not from the Fredhopper index, because the index would otherwise become too large and its generation too slow.</p>
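<p>The selection rule can be sketched in a few lines of Java. This is only an illustration: the <code>Offer</code> class, its fields and the <code>CheapestOfferSelector</code> helper are hypothetical names, not part of the hybris or Fredhopper APIs, and the real export works on hybris price rows.</p>

```java
import java.math.BigDecimal;

// Hypothetical value class for illustration; the real hybris model is not shown here.
final class Offer {
    final String productCode;
    final String vendor;
    final BigDecimal price;

    Offer(String productCode, String vendor, BigDecimal price) {
        this.productCode = productCode;
        this.vendor = vendor;
        this.price = price;
    }
}

final class CheapestOfferSelector {
    // Returns the offer with the lowest price, or null if there are no offers.
    static Offer cheapest(Offer[] offers) {
        Offer best = null;
        for (Offer offer : offers) {
            if (best == null || offer.price.compareTo(best.price) < 0) {
                best = offer;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Offer[] offers = {
            new Offer("P1", "vendorA", new BigDecimal("19.99")),
            new Offer("P1", "vendorB", new BigDecimal("17.50")),
            new Offer("P1", "vendorC", new BigDecimal("18.00"))
        };
        // Only this offer would be written to the export row for product P1.
        System.out.println(cheapest(offers).vendor);
    }
}
```

<p>Only the selected offer ends up in the index; the remaining offers stay in the database and are loaded by the frontend when needed.</p>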
<div id="attachment_1388" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-fredhopper" href="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Kettle-ETL-Jobs.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/09/Kettle-ETL-Jobs-479x323.png" alt="" title="Kettle-ETL-Jobs" width="479" height="323" class="size-large wp-image-1388" /></a><p class="wp-caption-text">Using Kettle ETL Jobs for exporting and updating the Fredhopper search index.</p></div>
<p>We export the current product list into a CSV file, which is then fed into a Kettle ETL job. This job imports the data into the Fredhopper index, which is completely regenerated in the process. <a href="http://kettle.pentaho.com">Pentaho Kettle</a> is a data integration framework for Extraction, Transformation and Loading (ETL) jobs. The full export is triggered once a week.</p>
<p>We also run two other jobs more frequently. They handle delta updates for new, modified and deleted products. This is our solution for keeping Fredhopper up to date even for huge data stores.</p>
<p>The first incremental job searches for all modifications in the data model (such as an updated name or a change of category) that need to be exported. We get a big search result here and the job takes a few hours to run. It is triggered once a day.</p>
<p>For information about price and availability we use a second incremental job that runs multiple times a day. Fredhopper doesn&#8217;t differentiate between price rows and products so this second job only monitors price rows and only the attributes of price rows (like price and availability) are considered.</p>
<h2>Scaling the Fredhopper Integration to huge Marketplaces</h2>
<p>As mentioned above and shown in the first diagram, there are two contexts, the general marketplace context, where the customer is browsing the marketplace, and the shop context for the specific vendor stores. Therefore we use a mixed mode: we use Fredhopper to create the product listing pages but load vendor-specific product attributes like vendor prices and stock availability directly from the database.</p>
<p>As a result we have a fast, reliable and up-to-date search and navigation engine that fulfils the specific requirement of a broad marketplace, namely to display a large number of products in varying contexts, extremely well.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/09/fredhopper-hybris-integration/">Integrating FredHopper into a hybris Marketplace</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=17" title="View articles by Christian Belka">Christian Belka</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/09/fredhopper-hybris-integration/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>PL/SQL Unit Testing with DBUnit</title>
		<link>http://blog.mgm-tp.com/2012/07/plsql-unit-testing/</link>
		<comments>http://blog.mgm-tp.com/2012/07/plsql-unit-testing/#comments</comments>
		<pubDate>Tue, 24 Jul 2012 06:08:04 +0000</pubDate>
		<dc:creator>Nick Giles</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[QA]]></category>
		<category><![CDATA[SQL]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1347</guid>
		<description><![CDATA[My colleague Slavom&#237;r Jele&#328; and I are currently working on a logistics management application for an international food retailer. It&#8217;s a data-oriented application that performs pre-calculation steps on billions of rows with PL/SQL stored procedures. In order to ensure the correctness of these calculations, we devised a solution for unit testing the stored procedures in [...]]]></description>
			<content:encoded><![CDATA[<p>My colleague Slavom&iacute;r Jele&#328; and I are currently working on a logistics management application for an international food retailer. It&#8217;s a data-oriented application that performs pre-calculation steps on billions of rows with PL/SQL stored procedures. In order to ensure the correctness of these calculations, we devised a solution for unit testing the stored procedures in Oracle based on DBUnit.</p>
<p><span id="more-1347"></span></p>
<p>The logistics management application enables the international food retailer to administer the flow of goods between stores and warehouses. The number of users is relatively low at around 200, but the volume of data is massive with many tables containing hundreds of millions of rows and some containing over one billion. An operating day begins with an import of the latest data from the customer&#8217;s central data warehouse. From 7am on, users analyze tables of results and trends derived from this imported historical data, make decisions based on them and the application then recalculates future trends based on these decisions.</p>
<p>In order to reduce the volume of data to be analyzed in real-time when recalculating future trends, the historical data is pre-calculated into a smaller set of aggregated values. For example, sales data describing how many pieces of various items are sold in which store on which day is imported every morning, but one part of the system only needs data per item and per region for a month and so it makes sense to perform this aggregation at the beginning of the day and cache the results in a separate database table. These pre-calculations take the form of PL/SQL stored procedures which are run directly after the import completes every morning.</p>
<div id="attachment_1352" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-plsql" href="http://blog.mgm-tp.com/wp-content/uploads/2012/07/Aggregation-Overview1.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/07/Aggregation-Overview1-480x480.png" alt="" title="Aggregation Overview" width="480" height="480" class="size-large wp-image-1352" /></a><p class="wp-caption-text">Overview of how the historical data is pre-calculated into a smaller set of aggregated values.</p></div>
<p>We had some quality assurance issues, which generally fell into three groups:</p>
<ul>
<li>Handling unusual circumstances that create data which is not properly aggregated,</li>
<li>Ensuring that changes to aggregation rules have been implemented properly, and</li>
<li>Ensuring that changes made do not adversely affect other aggregation rules.</li>
</ul>
<p>The pre-calculation is split into several stored procedures, which together comprise over 2000 lines of PL/SQL. Ensuring that all this logic performs as it should demands special attention. What was required was a system which would be able to test the stored procedures to make sure they function as specified &ndash; unit testing for PL/SQL stored procedures.</p>
<h2>Proposed solution</h2>
<p>The goal we had in mind was a system which could insert the bare minimum amount of data required for one specific test, run the pre-calculation and then compare the actual result with the expected result. The overall pre-calculation is very complicated, so we split the tests up so that there would be one test for each rule of the aggregation. Using the example mentioned earlier, one test might be to specify sales data on the first day of a month and the last day of the previous month for one given item in a given store and to ensure that only the data in the current month is calculated.</p>
<p>Another test for this aggregation might be to have sales data in a month for two items and to ensure that the data is aggregated separately for each item. Importantly, though, in each case we specify only the absolute minimum amount of data required for the scenario. For example, there might be just one country, one store, one item and perhaps just two rows in the sales table, not forgetting auxiliary data for foreign key integrity.</p>
<h2>Implementation</h2>
<p>After a few prototypes, we settled on DBUnit, which is able to do exactly what we want. DBUnit is a testing framework for databases which can run stored procedures and fetch, manipulate and compare data. It provides a simple and usable interface to the database on top of JDBC, including the ability to compare datasets derived from the database against datasets specified in file-based structures such as XML and CSV. This means that queries, views and full tables can be compared.</p>
<p>Initial data and the expected output state of the tables containing the aggregated values are specified separately in two simple XML files. These are then referenced in a test method in a Java class along with the command to run the pre-calculation stored procedure.</p>
<div id="attachment_1354" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-plsql" href="http://blog.mgm-tp.com/wp-content/uploads/2012/07/Unit-Testing-Solution1.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/07/Unit-Testing-Solution1-480x525.png" alt="" title="Unit Testing Solution" width="480" height="525" class="size-large wp-image-1354" /></a><p class="wp-caption-text">Relevant parts of our Testing solution and how they play together.</p></div>
<h2>An Example</h2>
<p>Using the example mentioned earlier, sales data would be prepared like this:</p>
<h5>sales_data</h5>
<table>
<thead>
<tr>
<td>item_id</td>
<td>date</td>
<td>store_id</td>
<td>quantity</td>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>31/05/12</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>01/06/12</td>
<td>1</td>
<td>2</td>
</tr>
<tr>
<td>1</td>
<td>30/06/12</td>
<td>1</td>
<td>4</td>
</tr>
<tr>
<td>1</td>
<td>01/07/12</td>
<td>1</td>
<td>8</td>
</tr>
</tbody>
</table>
<p>In addition to this, auxiliary data would have to be prepared for entities such as the store, the store&#8217;s country, the country&#8217;s language, etc. Most of this auxiliary data is common and so can be easily duplicated when making new tests, making small changes where necessary.</p>
<p>In this example we are interested in making sure that sales for June are correctly aggregated and that values for May and July are excluded from the June aggregation. The expected values in the table containing the monthly aggregates would therefore look like this:</p>
<h5>monthly_data</h5>
<table>
<thead>
<tr>
<td>item_id</td>
<td>month</td>
<td>store_id</td>
<td>quantity</td>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>01/05/12</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>01/06/12</td>
<td>1</td>
<td>6</td>
</tr>
<tr>
<td>1</td>
<td>01/07/12</td>
<td>1</td>
<td>8</td>
</tr>
</tbody>
</table>
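<p>To make the arithmetic behind the expected values explicit, here is a small Java model of the monthly rule. It is only a sketch of the rule as described above, not the PL/SQL procedure itself; the class and method names are made up for illustration, and the item and store columns are left out because the example uses a single item and store.</p>

```java
import java.time.LocalDate;
import java.util.Map;
import java.util.TreeMap;

final class MonthlyAggregationSketch {

    // Sums the quantities per calendar month, keyed by the first day of that month.
    static Map<LocalDate, Integer> aggregate(LocalDate[] dates, int[] quantities) {
        Map<LocalDate, Integer> totals = new TreeMap<>();
        for (int i = 0; i < dates.length; i++) {
            LocalDate month = dates[i].withDayOfMonth(1);
            totals.merge(month, quantities[i], Integer::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // The four sales_data rows from the table above (dates are day/month/year).
        LocalDate[] dates = {
            LocalDate.of(2012, 5, 31), LocalDate.of(2012, 6, 1),
            LocalDate.of(2012, 6, 30), LocalDate.of(2012, 7, 1)
        };
        int[] quantities = {1, 2, 4, 8};
        // prints {2012-05-01=1, 2012-06-01=6, 2012-07-01=8}
        System.out.println(aggregate(dates, quantities));
    }
}
```

<p>The quantities 1, 2, 4 and 8 are chosen so that every possible sum is distinct: a result of 6 for June can only come from the two June rows, so a row leaking in from May or July would show up immediately.</p>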
<p>Once the two XML files have been created, the next step is to create the Java test method:</p>
<pre class="brush: java; wrap-lines: false;">
public class PrecalcMonthlyDbTestCase extends ProjectDBTestCase {
  // Initial table contents, loaded into the database before the test runs
  @Override
  protected IDataSet getDataSet() throws Exception {
    return loadXmlDataSet
      (&quot;com/mgmtp/project/db/nonfood/precalc/sales/monthly1.xml&quot;);
  }

  @Test
  public void test() throws Exception {
    // Run the pre-calculation stored procedure for a given date
    getConnection().getConnection().createStatement().execute(
      &quot;{call p_recalc_monthly_sales(1, DATE '2012-07-02')}&quot;);
    // Compare the aggregated tables with the expected result
    compareResults(
      &quot;com/mgmtp/project/db/nonfood/precalc/sales/monthly1_result.xml&quot;);
  }

  //...more tests
}
</pre>
<p>The class <code>PrecalcMonthlyDbTestCase</code> contains just one test here. It extends <code>ProjectDBTestCase</code>, our implementation of DBUnit&#8217;s base class for test cases, which sets common configuration options and provides additional functionality such as loading the source data from a specified location on the classpath before the test and cleaning it up afterwards. The test itself is a simple affair: it runs the pre-calculation for a given date and then compares the expected data defined in an XML file with the actual data in the database tables that hold the values aggregated by the pre-calculation.</p>
<h2>Test-driven Development</h2>
<p>We soon discovered that we were able to use this approach to ensure that changes to the aggregation are implemented correctly and that all conceivable corner cases are handled. We adopted a test-driven development approach in which the first step of an implementation is to create a series of tests based on the requirement. Initially these tests must fail, as they specify new behavior. Then the implementation begins, and it is verified once the tests run successfully. Similarly, if a bug is found in the pre-calculation, the first step is to reproduce the situation with a failing test case.</p>
<h2>Conclusion</h2>
<p>Whilst the testing of PL/SQL stored procedures was the initial focus of the work, we are now planning to broaden the scope and use DBUnit to test the large SQL statements in the project. Many of the SQL statements make extensive use of common table expressions and many exceed 200 lines, making them very susceptible to the same vulnerabilities as the pre-calculation stored procedures.</p>
<p>The end result is several suites of tests covering all of the various pre-calculation procedures, which are run as part of the nightly build and also before every release. We don&#8217;t have any numbers to quantify the increase in accuracy and stability of the results, but we are now confident that once a problem has surfaced and is covered by a test, it will not resurface.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/07/plsql-unit-testing/">PL/SQL Unit Testing with DBUnit</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=27" title="View articles by Nick Giles">Nick Giles</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/07/plsql-unit-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Possibly the most malicious Regular Expression - Hacking Java Puzzlers for Fun and Profit, Part 3</title>
		<link>http://blog.mgm-tp.com/2012/06/regexp-java-puzzler-2/</link>
		<comments>http://blog.mgm-tp.com/2012/06/regexp-java-puzzler-2/#comments</comments>
		<pubDate>Tue, 12 Jun 2012 09:57:01 +0000</pubDate>
		<dc:creator>Tobias Budde</dc:creator>
				<category><![CDATA[Puzzler]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1280</guid>
		<description><![CDATA[This is another episode on regular expressions. This time we look at the worst regex you can possibly come up with, and it is nasty and evil. And we’ll see that it is quite common.]]></description>
			<content:encoded><![CDATA[<p>This is the next episode in my series on regular expressions. Today, we look at the worst regex you can possibly come up with, although it looks innocent and simple. You will learn about the backtracking trap that easily lets you wait for 10^30 steps, as an example of an errant email regex will illustrate. One possible solution we investigate is the use of possessive quantifiers.</p>
<p><span id="more-1280"></span></p>
<p>Well, some of you might think: <em>All regular expressions are malicious.</em></p>
<p>If you are one of them: You have no idea how bad they can become!</p>
<h2>Greedily Looking for ‘aa’</h2>
<p>Have a look at this short example:</p>
<pre class="brush: java; wrap-lines: false;">
public static void main (String[] args) {

    final String pattern = &quot;(aa|aab?)*&quot;;

    // two 'a'
    final String a002 = &quot;aa&quot;;
    // three 'a'
    final String a003 = &quot;aaa&quot;;
    // 102 'a'
    final String a102 = &quot;aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&quot;;
    // 103 'a'
    final String a103 = &quot;aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa&quot;;

    System.out.println(a002.matches(pattern)) ;
    System.out.println(a003.matches(pattern)) ;
    System.out.println(a102.matches(pattern)) ;
    System.out.println(a103.matches(pattern)) ;
}
</pre>
<p>What does it print?</p>
<ol>
<li><code>true false true false</code></li>
<li>Throws an exception</li>
<li>Does not compile</li>
<li>None of the above</li>
</ol>
<p>The regular expression &#8220;<code>(aa|aab?)*</code>&#8221; matches all strings that are made up of &#8220;<code>aa</code>&#8221; or &#8220;<code>aab</code>&#8221;. That is, it matches an even number of &#8216;a&#8217; and does not match an odd number of &#8216;a&#8217;. (I know, the same can be done with a simple &#8220;<code>(aab?)*</code>&#8221;, but bear with me.)</p>
<p>So in theory it should print &#8220;<code>true false true false</code>&#8221; (answer 1 above). But actually, it prints &#8220;<code>true false true </code>&#8221; (answer 4). The final &#8220;<code>false</code>&#8221; will come some billion years later if you are patient and your computer does not want you to reboot in the meantime.</p>
<h2>It&#8217;s not a bug, It&#8217;s a feature</h2>
<p>We got caught in a backtracking trap. Let&#8217;s have a look at a shorter example: &#8220;<code>aaaaa</code>&#8221;. The following table shows what the Java regular expression engine does to check whether it matches our pattern.</p>
<table border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="41" valign="top"><strong>Step</strong></td>
<td width="128" valign="top"><strong>Group 1</strong></td>
<td width="128" valign="top"><strong>Group 2</strong></td>
<td width="128" valign="top"><strong>Group 3</strong></td>
<td width="58" valign="top"><strong>Result</strong></td>
</tr>
<tr>
<td width="41" valign="bottom">1</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">2</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">3</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">4</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">5</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">6</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">7</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aa)</td>
<td width="58" valign="bottom">fail</td>
</tr>
<tr>
<td width="41" valign="bottom">8</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="128" valign="bottom">(aab?)</td>
<td width="58" valign="bottom">fail</td>
</tr>
</tbody>
</table>
<p>Explanation:</p>
<ul>
<li><strong>Step 1</strong>: The engine finds two &#8220;<code>aa</code>&#8221;, tries to find a third &#8220;<code>aa</code>&#8221; and fails.</li>
<li><strong>Step 2</strong>: The engine finds two &#8220;<code>aa</code>&#8221;, tries to find an &#8220;<code>aab?</code>&#8221; next and fails.</li>
<li><strong>Step 3</strong>: The engine starts backtracking, replaces the second &#8220;<code>aa</code>&#8221; with &#8220;<code>aab?</code>&#8221; and fails again for both &#8220;<code>aa</code>&#8221; &#8230;</li>
<li><strong>Step 4</strong>: &#8230; and &#8220;<code>aab?</code>&#8221;.</li>
<li>And so on.</li>
</ul>
<p>The result is: &#8220;<code>aaaaa</code>&#8221; does not match &#8220;<code>(aa|aab?)*</code>&#8221;.</p>
<p>In our example with 103 &#8216;a&#8217; the engine will execute on the order of 2^100 &asymp; 10^30 steps, that is 1000 billion billion billion. The regex &#8220;<code>(aa|aab?)*</code>&#8221; looks so innocent and simple but is probably <em>the worst regex I have ever seen</em>.</p>
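<p>The exponential growth can be made visible with a tiny hand-written backtracking matcher for this one pattern. This is a simplified model for illustration only: it is not how <code>java.util.regex</code> is implemented and it does not reproduce the exact step count of the real engine. The class and method names are made up.</p>

```java
import java.util.Arrays;

final class BacktrackingSteps {
    private static long steps;

    // Simplified backtracking model of (aa|aab?)* that must consume the whole input.
    private static boolean match(String s, int pos) {
        steps++;
        if (s.startsWith("aa", pos)) {
            // Alternative 1: "aa", then try to match the rest
            if (match(s, pos + 2)) return true;
            // Alternative 2: "aab?", greedy, so try "aab" before plain "aa"
            if (s.startsWith("b", pos + 2) && match(s, pos + 3)) return true;
            if (match(s, pos + 2)) return true;
        }
        // No further repetition possible: succeed only at the end of the input
        return pos == s.length();
    }

    static long countSteps(String s) {
        steps = 0;
        match(s, 0);
        return steps;
    }

    static String repeatA(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    public static void main(String[] args) {
        // Odd lengths never match, so every combination is tried.
        for (int n = 5; n <= 25; n += 4) {
            System.out.println(n + " x 'a': " + countSteps(repeatA(n)) + " steps");
        }
    }
}
```

<p>In this model an input with an even number of &#8216;a&#8217; succeeds on the first path, while for an odd number the step count roughly doubles with every additional &#8220;<code>aa</code>&#8221;: 7 steps for five &#8216;a&#8217;, 2047 steps for 21.</p>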
<p>But backtracking is useful. If you match &#8220;&lt;a&gt;&lt;b&gt;&lt;/b&gt;&lt;/a&gt;&#8221; against the regex &#8220;<code>&lt;(.*)&gt;(.*)&lt;/\1&gt;</code>&#8221; you will find a pair of matching opening and closing tags. I think a regex engine without backtracking cannot do this match with any regular expression. Please comment on this!</p>
<h2>Switching to Possessive Quantifiers</h2>
<p>What will change if we use a slightly different regex? We change the quantifier from a greedy &#8220;<code>*</code>&#8221; to a possessive &#8220;<code>*+</code>&#8221;.</p>
<pre class="brush: java; light: true;">
final String pattern = &quot;(aa|aab?)*+&quot;;
</pre>
<p>What does it print?</p>
<ol>
<li><code>true false true false</code></li>
<li>Throws an exception</li>
<li>Does not compile</li>
<li>None of the above</li>
</ol>
<p>Since we use a possessive quantifier, the regular expression engine will not backtrack. The process stops after step 2, and the &#8220;<code>aa</code>&#8221; in &#8216;Group 2&#8217; won&#8217;t be replaced by &#8220;<code>aab?</code>&#8221;. All four results, including the last one, are printed at once (answer 1).</p>
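<p>A runnable version of the possessive variant might look like this. The class and helper names are made up for this sketch; only <code>java.util.regex.Pattern</code> and the pattern itself come from the discussion above.</p>

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class PossessiveDemo {
    // Possessive quantifier: iterations that have matched are never given back.
    private static final Pattern POSSESSIVE = Pattern.compile("(aa|aab?)*+");

    static boolean matches(String input) {
        return POSSESSIVE.matcher(input).matches();
    }

    static String repeatA(int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(matches(repeatA(2)));   // true
        System.out.println(matches(repeatA(3)));   // false
        System.out.println(matches(repeatA(102))); // true
        System.out.println(matches(repeatA(103))); // false, returned immediately
    }
}
```

<p>Keep in mind that a possessive quantifier can also change which inputs match, because alternatives that were skipped are never retried. It is worth re-running your positive test cases after such a change.</p>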
<h2>A Case Study with Email Addresses</h2>
<p>Let&#8217;s take email address validation as an example and look at two similar regular expressions for it:</p>
<pre class="brush: plain; wrap-lines: true;">
[a-z0-9](([\-.]|[_]+)?([a-z0-9]+))*@[a-z0-9]+[.](([a-z]{2,3})|([a-z]{2,3}[.]{1}[a-z]{2,3}))
</pre>
<pre class="brush: plain; wrap-lines: true;">
[a-z0-9_\+-]+(?:\.[a-z0-9_\+-]+)*@[a-z0-9-]+(?:\.[a-z0-9-]+)*\.(?:[a-z]{2,4})
</pre>
<p>Do you spot the difference? The first expression is as evil as &#8220;<code>(aa|aab?)*</code>&#8221; and gets stuck if you try to match e.g. <code>"aaaaaaaaaaaaaaaaaaaaa!"</code>, while the second is fine. In other words, if you use the first regex to validate email addresses in your Java code, a sufficiently long input gets your thread stuck for a few million years and your system is vulnerable to a &#8220;<em>regular expression denial of service</em>&#8221; attack. In principle, all regular expression engines that allow backtracking have the same problem. Imagine what will happen to your database system if there are a few thousand invalid email addresses to be stored and you use a malicious regex in the DB!</p>
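<p>For reference, this is how the second expression could be used in Java. The class and method names are made up for this sketch, and the regex is the example from above, not a complete email validator.</p>

```java
import java.util.regex.Pattern;

final class EmailCheck {
    // The second (well-behaved) expression from above, with backslashes escaped for Java.
    private static final Pattern EMAIL = Pattern.compile(
        "[a-z0-9_\\+-]+(?:\\.[a-z0-9_\\+-]+)*@[a-z0-9-]+(?:\\.[a-z0-9-]+)*\\.(?:[a-z]{2,4})");

    static boolean isValid(String candidate) {
        return EMAIL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("john.doe@example.com"));   // true
        System.out.println(isValid("aaaaaaaaaaaaaaaaaaaaa!")); // false, without delay
    }
}
```

<p>The second expression has no quantified group whose content is itself an unbounded repetition of the same characters, so a failing input like the one above is rejected without trying an exponential number of ways to split it.</p>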
<p>There is no simple way to recognize regular expressions which fall into the evil category. But here are some hints:</p>
<ul>
<li>Be careful with options that imply each other: &#8220;<code>aa</code>&#8221; vs. &#8220;<code>aab?</code>&#8221;.</li>
<li>Try to avoid cascading quantifiers like in &#8220;<code>(a+)+</code>&#8221;.</li>
<li>Try to be possessive like in &#8220;<code>(aa|aab?)*+</code>&#8221;.</li>
</ul>
<p>Have a look at the <a href="http://en.wikipedia.org/wiki/Redos">Wikipedia article on ReDoS</a> for more information and additional examples of malicious regular expressions.</p>
<p>I&#8217;m curious to hear your &#8220;war&#8221; stories: Did you actually have this problem in your production systems? What did you do to find it and fix it?</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/06/regexp-java-puzzler-2/">Possibly the most malicious Regular Expression - Hacking Java Puzzlers for Fun and Profit, Part 3</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=26" title="View articles by Tobias Budde">Tobias Budde</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/06/regexp-java-puzzler-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<series:name><![CDATA[Hacking Java Puzzlers for Fun and Profit]]></series:name>
	</item>
		<item>
		<title>Regular Expressions: Splitting Pipes - Hacking Java Puzzlers for Fun and Profit, Part 2</title>
		<link>http://blog.mgm-tp.com/2012/05/regexp-java-puzzler/</link>
		<comments>http://blog.mgm-tp.com/2012/05/regexp-java-puzzler/#comments</comments>
		<pubDate>Wed, 30 May 2012 16:24:38 +0000</pubDate>
		<dc:creator>Tobias Budde</dc:creator>
				<category><![CDATA[Puzzler]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1251</guid>
		<description><![CDATA[It&#8217;s a common saying in IT: &#8220;I had a problem and used regular expressions to solve it. Now I&#8217;ve two problems&#8221;. We want to offer help in a series of mgm &#8220;Hacking Java Puzzler&#8221; blog entries and demonstrate how regular expressions can be useful anyway. In this first episode we will focus on splitting CSV [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s a common saying in IT: &#8220;I had a problem and used regular expressions to solve it. Now I&#8217;ve two problems&#8221;. We want to offer help in a series of mgm &#8220;Hacking Java Puzzler&#8221; blog entries and demonstrate how regular expressions can be useful anyway. In this first episode we will focus on splitting CSV lines.</p>
<p><span id="more-1251"></span></p>
<p>CSV is a very old file format and parsing it is very simple. There are special libraries for this but Java has built-in support as well. The class <code>String</code> offers the <code>split()</code> method that takes a regular expression to parse the content of a string. The implementation of this function directly uses the <code>split</code> method in the <code>java.util.regex.Pattern</code> class.</p>
<p>Have a look at this short example:</p>
<pre class="brush: java;">
public static void main (String[] args){
    final String text = &quot;a|b|c&quot;;
    final String delimiterPattern = &quot;\\|&quot;;

    final String[] columns = text.split(delimiterPattern);

    System.out.println(Arrays.toString(columns));
}
</pre>
<p>Our values in the CSV are delimited by the pipe character &#8216;|&#8217;. This real-world example is taken from a legacy application that represents lists as pipe-concatenated strings, like <code>"a|b|c"</code> in the code above. This example would work in the same way for any other delimiter character.</p>
<p>So, what does it print? Yes, it&#8217;s &#8220;<code>[a, b, c]</code>&#8221;. Well done.</p>
<h2>Puzzler 1: Warming up</h2>
<p>Now for the first simple Java puzzler. We want to do the same, but our first two columns are empty:</p>
<pre>
    final String text = "||c";
</pre>
<p>What does it print?</p>
<ol>
<li><code>[, , c]</code></li>
<li><code>[null, null, c]</code></li>
<li>An exception is thrown</li>
<li>None of the above</li>
</ol>
<p>As expected it&#8217;s &#8220;<code>[, , c]</code>&#8221; and the first two columns are empty strings.</p>
<h2>Puzzler 2: You&#8217;ll be surprised</h2>
<p>Now it gets a little harder. We change the text so that all columns are empty:</p>
<pre>
    final String text = "||";
</pre>
<p>What does it print?</p>
<ol>
<li><code>[, , ]</code></li>
<li><code>[null, null, null]</code></li>
<li>An exception is thrown</li>
<li>None of the above</li>
</ol>
<p>Well, you&#8217;ll be surprised: it&#8217;s number 4, and what gets printed is this: &#8220;<code>[]</code>&#8221;.<br />
One might want to shout out loud: <em>&#8220;That&#8217;s *%$#&amp; stupid. I&#8217;ll never understand regular expressions!&#8221;</em></p>
<h3>What just happened?</h3>
<p>If you blame regular expressions for this unexpected result you are actually barking up the wrong tree. Yes, it is implemented in the <code>regex</code> package, but let&#8217;s read the JavaDoc:</p>
<ul>
<li><code>String.split()</code>: <em>[...] trailing empty strings <strong>will be discarded</strong>.</em></li>
<li><code>Pattern.split()</code>: <em>Trailing empty strings are [...] <strong>not included</strong> in the resulting array.</em></li>
</ul>
<p>If you have a look at the code of the <code>Pattern.split()</code> method, you will find something like this:</p>
<pre class="brush: java; wrap-lines: false;">
// Taken from JDK's Pattern class

int resultSize = matchList.size();
if (limit == 0)
    while (resultSize &gt; 0 &amp;&amp; matchList.get(resultSize-1).equals(&quot;&quot;))
        resultSize--;
String[] result = new String[resultSize];
return matchList.subList(0, resultSize).toArray(result);
</pre>
<p>These lines actively delete trailing empty strings, as documented. In other words, your expected result is deliberately destroyed.</p>
<p>What&#8217;s the reason for this API design? Does anybody have a clue? I don&#8217;t.</p>
<h2>How to Get it Right</h2>
<p>The workaround is quite simple: Just ensure that <code>limit != 0</code> in <code>Pattern.split</code>. How? Luckily, there&#8217;s a variant of the <code>split()</code> method that takes the limit as a parameter. The following small change does the job (note the <code>-1</code> as a second parameter):</p>
<pre class="brush: java; wrap-lines: false;">
final String[] columns = text.split(delimiterPattern, -1);
</pre>
<p>In my opinion this should have been the default behavior.</p>
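<p>To see the effect of the limit side by side, here is a minimal sketch. The class name <code>SplitLimitDemo</code> and the helper <code>show()</code> are made up for this illustration; only <code>String.split(String, int)</code> and <code>Arrays.toString()</code> come from the JDK.</p>

```java
import java.util.Arrays;

public class SplitLimitDemo {

    /** Splits on the pipe character with the given limit and formats the result. */
    static String show(String text, int limit) {
        return Arrays.toString(text.split("\\|", limit));
    }

    public static void main(String[] args) {
        String[] samples = {"a|b|c", "||c", "a||", "||"};
        for (String sample : samples) {
            // limit 0 is what the one-argument split() uses internally
            System.out.println("\"" + sample + "\""
                    + "  limit 0: " + show(sample, 0)
                    + "  limit -1: " + show(sample, -1));
        }
    }
}
```

<p>With limit 0 the inputs <code>"a||"</code> and <code>"||"</code> shrink to <code>[a]</code> and <code>[]</code>, while limit -1 keeps all three columns in both cases. Leading empty columns, as in <code>"||c"</code>, survive either way.</p>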
<p>Another solution is to directly use the regex API:</p>
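<pre class="brush: java;">
// Needs java.util.ArrayList, java.util.List,
// java.util.regex.Matcher and java.util.regex.Pattern
final String text = &quot;||&quot;;
// A column may only start at the beginning of the text or right after a pipe.
Pattern pattern = Pattern.compile(&quot;(?&lt;=^|\\|)[^|]*&quot;);
Matcher matcher = pattern.matcher(text);
List&lt;String&gt; columns = new ArrayList&lt;String&gt;();

while (matcher.find()){
    columns.add(matcher.group());
}
</pre>
<p>The character class &#8220;<code>[^|]*</code>&#8221; matches any run of characters that are not pipe symbols, including an empty run, so the empty words in our sample text are found as well. The lookbehind &#8220;<code>(?&lt;=^|\|)</code>&#8221; only lets a match start at the beginning of the text or directly after a pipe. Without it, <code>find()</code> would report an additional empty match after every non-empty column, so <code>"a|b|c"</code> would yield six entries instead of three.</p>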
<p>Using the regex API is a little more work, but it is the more direct route to a related problem: <em>extract only the non-empty words from a CSV line</em>. <code>split()</code> always returns leading empty words (as you can see in the first simple puzzler). With the regex API it&#8217;s just a matter of using &#8220;<code>[^|]+</code>&#8221;, because the asterisk means &#8216;<em>zero or more</em>&#8217; while the plus quantifier means &#8216;<em>one or more</em>&#8217;.</p>
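<p>The non-empty variant could be sketched as follows. The class and method names are hypothetical and only serve this example; the regex classes are the ones from <code>java.util.regex</code>.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonEmptyColumns {

    // One or more characters that are not a pipe: empty columns never match.
    private static final Pattern NON_EMPTY = Pattern.compile("[^|]+");

    /** Returns only the non-empty columns of a pipe-separated line. */
    static List<String> nonEmptyColumns(String text) {
        List<String> columns = new ArrayList<String>();
        Matcher matcher = NON_EMPTY.matcher(text);
        while (matcher.find()) {
            columns.add(matcher.group());
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(nonEmptyColumns("|a||b|")); // prints [a, b]
        System.out.println(nonEmptyColumns("||"));     // prints []
    }
}
```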
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/05/regexp-java-puzzler/">Regular Expressions: Splitting Pipes - Hacking Java Puzzlers for Fun and Profit, Part 2</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=26" title="View articles by Tobias Budde">Tobias Budde</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/05/regexp-java-puzzler/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<series:name><![CDATA[Hacking Java Puzzlers for Fun and Profit]]></series:name>
	</item>
		<item>
		<title>Building a scalable Web-based Call Center CTI Solution</title>
		<link>http://blog.mgm-tp.com/2012/05/scalable-web-cti-solution/</link>
		<comments>http://blog.mgm-tp.com/2012/05/scalable-web-cti-solution/#comments</comments>
		<pubDate>Tue, 08 May 2012 13:26:51 +0000</pubDate>
		<dc:creator>Lars Immisch</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[COMET]]></category>
		<category><![CDATA[HTML5]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Telephony]]></category>
		<category><![CDATA[Wicket]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1206</guid>
		<description><![CDATA[My project was part of our customer&#8217;s effort to replace all of the enterprise applications with web applications based on a standardized technology stack. In this strategic move, the call center integration was a crucial step. As it turned out, the technical design of the new call center telephony solution was quite challenging. We did [...]]]></description>
			<content:encoded><![CDATA[<p>My project was part of our customer&#8217;s effort to replace all of its enterprise applications with web applications based on a standardized technology stack. In this strategic move, the call center integration was a crucial step. As it turned out, the technical design of the new call center telephony solution was quite challenging. Not only did we learn a lot about <a href="http://en.wikipedia.org/wiki/Computer_telephony_integration">CTI</a>; we also had to make the system scalable and ensure that it can handle more than 1000 call center agents.</p>
<p><span id="more-1206"></span></p>
<p>The call center agents were to keep using mostly the standard web applications, extended by a telephony control that allows them to accept incoming calls, disconnect calls, or make consultation calls to other agents or supervisors.</p>
<h2>An incoming call</h2>
<p>Let&#8217;s have a look at the most important use case first: an incoming call.</p>
<p>The following diagram gives an overview of the flow of events before the agent&#8217;s telephone rings:</p>
<div id="attachment_1222" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-web-cti" href="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Incoming-Flow.png"><img class="size-large wp-image-1222" src="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Incoming-Flow-480x366.png" alt="" width="480" height="366" /></a><p class="wp-caption-text">The simplified flow of events before an agent receives a call.</p></div>
<p>The incoming call of the customer is handled by the <a href="http://en.wikipedia.org/wiki/Private_branch_exchange#Private_branch_exchange">PBX (Private Branch Exchange)</a>. When the agent finally takes the call, a lot of information about the customer has already been collected. In most cases, the caller will already have gone through an interactive voice response system that has collected his account number and verified his PIN (omitted from the picture above).</p>
<p>This is what the agent screen might look like after the agent has taken the call:</p>
<div id="attachment_1209" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-web-cti" href="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Agentscreen.png"><img class="size-large wp-image-1209" src="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Agentscreen-480x185.png" alt="" width="480" height="185" /></a><p class="wp-caption-text">Schematic screen of the call center agent UI.</p></div>
<p>The box on the left contains the telephony controls. They are embedded in an iframe and allow the agent to disconnect the call or place consultation calls to other agents. The telephony controls send commands to the gateway (and by extension to the PBX) and receive asynchronous events.</p>
<h2>CSTA as a Model for our Protocol</h2>
<p>Before it was even determined whether the telephone system should be integrated directly through the PBX or via an integration layer (Genesys), we decided to use the <a href="http://www.ecma-international.org/publications/standards/Ecma-269.htm">CSTA Phase III</a> communication protocol as a model for the protocol between gateway and browser. CSTA (Computer Supported Telecommunications Applications) is an ECMA standard (like JavaScript) and describes third-party call control using services and events. Third-party call control roughly means that the standard looks at an entire switch and all connected devices (telephones), and not just a single telephone. This point of view is reflected in the naming of the services and events. For example, when a call arrives at a terminal the event is called <em>Delivered</em>. A sample event flow is given in the following diagram.</p>
<div id="attachment_1223" class="wp-caption alignnone" style="width: 489px"><a rel="lightbox-web-cti" href="http://blog.mgm-tp.com/wp-content/uploads/2012/05/CSTA-Message-exchange.png"><img class="size-full wp-image-1223" src="http://blog.mgm-tp.com/wp-content/uploads/2012/05/CSTA-Message-exchange.png" alt="" width="479" height="438" /></a><p class="wp-caption-text">Exemplary flow of CSTA services and events in our system.</p></div>
<p><em>Services</em> are commands to the telephone system. An outgoing call (from any device within the domain of the switch) is initiated by a <em>Make Call</em> Service. But there are also services like <em>Set Agent State</em>.</p>
<p>CSTA is extremely comprehensive; we used only a small selection of its services and events. It is also easy to extend &mdash; the transfer of non-standardized key/value pairs within the data part of services and events is explicitly provided for.</p>
<p>CSTA provides an <a href="http://en.wikipedia.org/wiki/Abstract_Syntax_Notation_One">ASN.1</a> and an XML encoding. Writing an ASN.1 parser in JavaScript was obviously not a good idea, and even the XML mapping is quite heavyweight. So we decided to design our own transport encoding on top of <a href="http://www.json.org/">JSON</a> and built a <a href="http://en.wikipedia.org/wiki/Representational_state_transfer">REST</a>-inspired web service as a gateway to the PBX.</p>
<h2>Requirements</h2>
<p>The customer&#8217;s technology framework requirements were:</p>
<ul>
<li>Internet Explorer 8 as the browser for the call center agents</li>
<li><a href="http://wicket.apache.org/">Wicket</a> as web application framework for the call center application</li>
<li>Tomcat 7 as the web application server for both the call center web app and the gateways.</li>
</ul>
<p>The technical requirements were: minimal latency, high throughput, and high availability. For latency, an average delay below 150 ms was required, i.e. a value slightly below the attention threshold. For the call center callers, very low latency is not crucial, since most callers will have waited in the queue for minutes rather than seconds to reach a free agent anyway. But the new web application should, if at all possible, not worsen the ergonomics for the call center agents. In the end, this wasn&#8217;t a problem: during tests under moderate load, latencies below 80 ms could be achieved.</p>
<p>High availability is an obvious requirement: if a call center with about 1000 agents fails, there will be many unhappy customers. On an unlucky day the failure will even be reported in the news. We solved the problem by designing for redundant server components and a low latency failover protocol. The actual web application uses Tomcat&#8217;s built-in clustering mechanism. We couldn&#8217;t reuse this for the telephony gateway, because the relevant state is distributed across the switches anyway.</p>
<p>The gateway has two essential reliability requirements:</p>
<ul>
<li>Commands to the telephone system have to be retried quickly if a gateway fails.</li>
<li>Telephony events must not be lost.</li>
</ul>
<p>The functional requirements were straightforward:</p>
<ul>
<li>Incoming and outgoing calls (simple call control)</li>
<li>Call forwarding (single-step/two-step transfer)</li>
<li>Forwarding to the IVR (Interactive Voice Response) — including customer dependent data — as well as routing back to the same agent that originally took the call</li>
<li>Setting and displaying the agent status</li>
</ul>
<h2>Architecture</h2>
<p>The architecture consists of several interconnected systems as shown in the diagram below:</p>
<ul>
<li>The call center agents&#8217; browser with the JavaScript/HTML,</li>
<li>Telephony-related systems (left): the <em>gateway</em> (a server-side web application running in Tomcat) and the <em>PBX</em>,</li>
<li>Call center web application (right): <em>Wicket-based web application</em> and its <em>database(s)</em>.</li>
</ul>
<p>The integration of telephony and web application happens in the browser. The web application includes our JavaScript library and a telephony control panel in an <em>iframe</em>.</p>
<div id="attachment_1225" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-web-cti" href="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Layers.png"><img class="size-large wp-image-1225" src="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Layers-480x316.png" alt="" width="480" height="316" /></a><p class="wp-caption-text">The architecture of our CTI solution based on web technology.</p></div>
<h2>Sending Server Events to the Browsers</h2>
<p>For redundancy, every client connects to both gateways, and keeps the TCP connection open. This means that every application server (Tomcat) of the gateways has to hold nearly 1000 open connections. We use the <a href="http://tomcat.apache.org/tomcat-7.0-doc/aio.html">AIO interface of Tomcat 7</a>, so all these connections can be processed by a single thread. This greatly reduces memory requirements and scheduling overhead.</p>
<div id="attachment_1226" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-web-cti" href="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Processes.png"><img class="size-large wp-image-1226" src="http://blog.mgm-tp.com/wp-content/uploads/2012/05/Processes-480x304.png" alt="" width="480" height="304" /></a><p class="wp-caption-text">Overview of our system architecture with redundant gateways and PBX systems.</p></div>
<p><a href="http://html5doctor.com/server-sent-events/">Server-sent events</a> (a.k.a. server push) were recently standardized as part of HTML5 in the <a href="http://dev.w3.org/html5/eventsource/#the-eventsource-interface">EventSource interface</a>. Another convenient method to implement bidirectional communication is <a href="http://dev.w3.org/html5/websockets/">WebSockets</a>. But we couldn&#8217;t use either of these because we had to support legacy browsers. At least we didn&#8217;t have to support IE6 and could rely on IE8 as a minimum. So we implemented a <a href="http://en.wikipedia.org/wiki/Comet_(programming)">COMET</a> variant, which essentially consists of long-running <em>XMLHttpRequests</em> through which events are sent as chunked responses.</p>
<p>The asynchronous events from the gateways are decoded by our JavaScript library, which updates the telephony control and forwards the events to the interface part of the browser, which in turn may trigger a server interaction.</p>
<h3>Cross-Domain COMET with IE8</h3>
<p>The customer wanted to be able to run the web application and the gateways on different application servers. This means that the JavaScript <em>XMLHttpRequests</em> are <em>cross-domain</em>, which turned out to be a small challenge on IE8.</p>
<p>Mozilla Firefox, Safari and Google Chrome all support the <a href="http://www.w3.org/TR/cors/">CORS (Cross-Origin Resource Sharing)</a> specification of the W3C. IE8 supports it as well; however, with IE8 one must use <em>XDomainRequest</em> instead of <em>XMLHttpRequest</em>, and the API is slightly different. There is also a <a href="http://blogs.msdn.com/b/ieinternals/archive/2010/04/06/comet-streaming-in-internet-explorer-with-xmlhttprequest-and-xdomainrequest.aspx">subtle buffering bug</a> within IE8 that makes it necessary to send 2 KB of fill characters on every new COMET connection to ensure that the next event is received by the application immediately.</p>
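<p>On the server side, the padding can be as simple as the following sketch. It is not the gateway&#8217;s actual code: the class name <code>CometPrelude</code>, the choice of spaces as fill characters and the assumption that the client ignores leading whitespace are all made up for this illustration.</p>

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Arrays;

public class CometPrelude {

    /** 2 KB of fill characters, sent once per new COMET connection. */
    static final int PADDING_BYTES = 2048;

    /** Writes the fill characters and flushes so they leave the server right away. */
    static void writePrelude(OutputStream out) throws IOException {
        byte[] padding = new byte[PADDING_BYTES];
        Arrays.fill(padding, (byte) ' '); // assumption: the client skips leading spaces
        out.write(padding);
        out.flush();
    }

    /** Helper for the demo: number of bytes the prelude occupies. */
    static int preludeSize() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try {
            writePrelude(buffer);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println(preludeSize()); // prints 2048
    }
}
```

<p>In a real servlet, <code>out</code> would be the response output stream, and the first event would be written right after the prelude.</p>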
<h2>Redundant Gateways and PBX</h2>
<p>Each browser keeps two connections to two different gateways. One is active, and the other is a hot standby. When the connection to the active gateway is broken, the hot standby gateway is immediately activated. If necessary, the last failed command will be retried. As the hot standby gateway has been sending events the whole time as well, it is guaranteed that no event is lost. After this failover, the connection to the failed gateway is retried. Once it is re-established, the previously failed gateway becomes the new hot standby gateway.</p>
<p>Loss of a gateway does not lead to the loss of essential state, because the gateways hold as little state as possible. All relevant state was either pushed up into the JavaScript library or down into the PBX integration layer. The gateways are also independent of each other and interchangeable. This makes the solution inherently scalable: more gateways can be added at any time.</p>
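<p>The failover of commands described above can be sketched in a few lines. In our system this logic lives in the JavaScript library in the browser; the Java version below only illustrates the idea, and the <code>Gateway</code> interface and all names in it are hypothetical.</p>

```java
import java.io.IOException;

public class GatewayFailoverSketch {

    /** Hypothetical view of one gateway connection; not the project's real API. */
    interface Gateway {
        String send(String command) throws IOException;
    }

    private Gateway active;
    private Gateway standby;

    GatewayFailoverSketch(Gateway active, Gateway standby) {
        this.active = active;
        this.standby = standby;
    }

    /** Sends a command; if the active gateway fails, promotes the standby and retries once. */
    String send(String command) throws IOException {
        try {
            return active.send(command);
        } catch (IOException e) {
            Gateway failed = active;
            active = standby;   // the hot standby takes over immediately
            standby = failed;   // to be reconnected later as the new standby
            return active.send(command);
        }
    }

    /** Runs one command against a broken active gateway and a healthy standby. */
    static String demo() {
        Gateway broken = command -> { throw new IOException("connection lost"); };
        Gateway healthy = command -> "ok: " + command;
        try {
            return new GatewayFailoverSketch(broken, healthy).send("MakeCall");
        } catch (IOException e) {
            return "both gateways failed";
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ok: MakeCall
    }
}
```

<p>Because the gateways hold no essential state, swapping the two references is all the client has to do. Event delivery is not covered by this sketch.</p>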
<p>The PBX (a Genesys installation) itself is also redundant. The fallback on this level is hidden by the Genesys API and the gateway doesn’t have to handle it.</p>
<h2>Testing</h2>
<p>Our solution was tested with three different methods:</p>
<ul>
<li>JavaScript unit tests with <a href="http://docs.jquery.com/QUnit">QUnit</a>,</li>
<li>A simulator that implements the gateway&#8217;s HTTP services and simulates a single agent telephone (with a Swing GUI), and</li>
<li>Load tests.</li>
</ul>
<p>Writing the simulator was a substantial effort, but it helped in two ways:</p>
<ul>
<li>It made development without the telephony hardware possible.</li>
<li>It made it easy to test scenarios that were not reliably testable with real hardware (like deliberate race conditions).</li>
</ul>
<p>In an ideal world, the load tests would have been performed with an external load test tool. We didn&#8217;t have one available, so we wrote our own load test generator using the CSTA API to generate and receive calls.</p>
<h2>Conclusion</h2>
<p>Our solution is light-weight, conceptually simple and scalable. The simplicity is the result of two development iterations and rather long design phases.</p>
<p>The decision to use CSTA as the blueprint for the communication protocol worked well, too. It was helpful that we did not have to re-invent two-step transfer for the umpteenth time. Also, the CSTA vocabulary (which goes down to the text in the log messages) can be understood by personnel that are familiar with <a href="http://en.wikipedia.org/wiki/Computer_telephony_integration">CTI</a>.</p>
<h3>In the Footprints of Arnold Schwarzenegger</h3>
<p>Call center applications always remind me of a <a href="http://www.imdb.com/title/tt0111503/">slightly silly movie</a> starring Arnold Schwarzenegger as an undercover agent and Jamie Lee Curtis as his unsuspecting wife. His cover story for her is that he is doing something with IT and in one scene she inquires about his day at work. He starts telling her enthusiastically and quite elaborately about a call center integration &mdash; and she nearly falls asleep.</p>
<p>I, however, think the combination of a call center and a web application is technically quite fascinating.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/05/scalable-web-cti-solution/">Building a scalable Web-based Call Center CTI Solution</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=25" title="View articles by Lars Immisch">Lars Immisch</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/05/scalable-web-cti-solution/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Consequences when using Mutable Fields in hashCode() - Hacking Java Puzzlers for Fun and Profit, Part 1</title>
		<link>http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/</link>
		<comments>http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/#comments</comments>
		<pubDate>Thu, 08 Mar 2012 13:16:08 +0000</pubDate>
		<dc:creator>Ulrich Schrempp</dc:creator>
				<category><![CDATA[Puzzler]]></category>
		<category><![CDATA[Code Quality]]></category>
		<category><![CDATA[IDE]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1123</guid>
		<description><![CDATA[We start our new series with an informative HashSet puzzler. It&#8217;s about a bug that gave us quite a headache since its root cause was hard to identify. This subtle bug has without doubt crept into many code bases, so a detailed discussion should interest every Java coder. I will also discuss code inspection tools [...]]]></description>
			<content:encoded><![CDATA[<p>We start our new series with an informative HashSet puzzler. It&#8217;s about a bug that gave us quite a headache since its root cause was hard to identify. This subtle bug has without doubt crept into many code bases, so a detailed discussion should interest every Java coder. I will also discuss code inspection tools that detect this violation (sadly, only very few do). And by the way, what we learned about HashSet also makes a good topic in our job interviews.</p>
<p><span id="more-1123"></span></p>
<p>The starting point was that we had implemented a class with some fields and needed quick access to its instances. This use case was pretty performance-critical, so we took a HashSet to maintain the class instances. As the instance count was high, we needed a broad distribution of possible hash values. We used the IDE to generate code for the <code>hashCode()</code> and <code>equals()</code> methods. Everything should have been fine with this generated code, or so we thought. However, our application showed some unexpected results and it took us quite a long time to nail down the root cause.</p>
<p>Here&#8217;s a <em>simplified</em> example of code using a HashSet of PhoneNumber instances:</p>
<pre class="brush: java;">
// Class PhoneNumber implements hashCode() and equals()
// The constructor takes the number first, then the name
PhoneNumber obj = new PhoneNumber(&quot;089/358680&quot;, &quot;mgm&quot;);
System.out.println(&quot;Hashcode: &quot; +
	obj.hashCode());  //prints &quot;1852097731&quot;

// Add PhoneNumber object to HashSet
Set&lt;PhoneNumber&gt; set = new HashSet&lt;PhoneNumber&gt;();
set.add(obj);

// Modify object after it has been inserted
obj.setNumber(&quot;089/358680-0&quot;);

// Modification causes a different hash value
System.out.println(&quot;New hashcode: &quot; +
	obj.hashCode()); //prints &quot;-1466137370&quot;

// ... Later or in another class, code such as the following
// is operating on the Set:

// Unexpected Result!
// Output: obj is set member: false
System.out.println(&quot;obj is set member: &quot; +
	set.contains(obj));

// Even stranger unexpected Result!
// Output: obj is set member: false
for (PhoneNumber p : set) {
	if (p.equals(obj)) {
		System.out.println(&quot;obj is set member: &quot; +
			set.contains(p));
	}
}
</pre>
<p>Executing the code above surprisingly produces the following output:</p>
<pre>
obj is set member: false
obj is set member: false
</pre>
<p>Obviously, what we would expect is &#8220;true&#8221;, since obj has been inserted into the HashSet.</p>
<h2>What just happened?</h2>
<p>The unexpected result from the code above is caused by a trap in the JDK Collections framework into which many developers have fallen: <em>If an implementation of <code>hashCode()</code> uses mutable fields to calculate the value, <code>HashSet.contains()</code> produces unexpected results, i.e. your object seems not to be a member of the set.</em></p>
<p>For an illustration, let&#8217;s look at the class <code>PhoneNumber</code> and its mutable field &#8220;number&#8221;:</p>
<pre class="brush: java;">
public class PhoneNumber {

    private final String name;
    private String number;

    public PhoneNumber(String number, String name) {
        this.number = number;
        this.name = name;
    }

	// Setter makes &quot;number&quot; mutable!
    public void setNumber(String number) {
        this.number = number;
    }

    @Override
    public int hashCode() {
        int result = name != null ? name.hashCode() : 0;
        result = 31 * result +
			(number != null ? number.hashCode() : 0);
        return result;
    }

	// equals() left out here ...
}
</pre>
<p>What&#8217;s wrong with this class? Well, it&#8217;s a bad idea to use <a href="http://www.javaranch.com/journal/2003/04/immutable.htm">mutable</a> fields in <code>hashCode()</code> when its instances are put into a <code>HashSet</code> (or as keys into a <code>HashMap</code>). In general, any hash-based collection is problematic. See also <a href="http://stackoverflow.com/questions/5110376/hashset-contains-problem-with-custom-objects">&#8220;HashSet contains problem with custom objects&#8221; (Stackoverflow)</a> and <a href="http://javaadventure.blogspot.com/2007/02/hashcode-pitfalls-with-hashset-and.html">&#8220;hashCode() pitfalls with HashSet and HashMap&#8221;</a>.</p>
<h2><code>HashSet.contains()</code> surprisingly uses <code>hashCode()</code></h2>
<p>Part of the difficulty in spotting our bug was that the <code>HashSet.contains()</code> method relies on the hash value of an element staying the same. Unfortunately, this is not stated explicitly in the <a href="http://docs.oracle.com/javase/6/docs/api/java/util/HashSet.html#contains(java.lang.Object)">HashSet JavaDoc</a>, which only mentions <em>&#8220;&#8230;returns true if and only if this set contains an element e such that <code>(o==null ? e==null : o.equals(e))</code>&#8221;</em>. Actually, this is the same description as the <a href="http://docs.oracle.com/javase/6/docs/api/java/util/Collection.html#contains(java.lang.Object)">JavaDoc of <code>Set.contains()</code></a>.</p>
<p>A conscientious reader may also find the following note in the <a href="http://docs.oracle.com/javase/6/docs/api/java/util/Set.html">JavaDoc of the <code>Set</code></a> interface, which only mentions <code>equals()</code>:</p>
<blockquote><p>&#8220;Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.&#8221;</p></blockquote>
<p>By the way, the following Sun JDK bug was reported quite some time ago: <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6579200">&#8220;(coll) HashSet.contains method violates Set.contains() contract&#8221;</a>. The bug is approved (but not fixed) and the last comment was made in late 2007.</p>
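<p>That the lookup really goes through the hash value first can be checked with a small experiment. The following sketch uses a made-up element class <code>Entry</code> (not our <code>PhoneNumber</code>) and shows the behavior we observe with the JDK&#8217;s <code>HashSet</code>: the element is found again as soon as its original hash code is restored.</p>

```java
import java.util.HashSet;
import java.util.Set;

public class MutableKeyDemo {

    /** Hypothetical element type whose hash code depends on a mutable field. */
    static final class Entry {
        String number;

        Entry(String number) {
            this.number = number;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Entry && ((Entry) o).number.equals(number);
        }

        @Override
        public int hashCode() {
            return number.hashCode();
        }
    }

    /** Returns contains() before the mutation, after it, and after undoing it. */
    static boolean[] run() {
        Set<Entry> set = new HashSet<>();
        Entry entry = new Entry("089/358680");
        set.add(entry);

        boolean before = set.contains(entry);
        entry.number = "089/358680-0";  // hash code changes, the stored hash does not
        boolean afterChange = set.contains(entry);
        entry.number = "089/358680";    // the original hash code is back
        boolean afterUndo = set.contains(entry);

        return new boolean[] {before, afterChange, afterUndo};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints true false true
    }
}
```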
<h2>Properly coding the <code>hashCode()</code> method</h2>
<p>The <em>contract</em> of <code>hashCode()</code> is explained in the <a href="http://docs.oracle.com/javase/6/docs/api/java/lang/Object.html#hashCode()">JavaDoc of <code>Object</code></a>. You will also find hints on proper implementations in the very interesting book <a href="http://www.amazon.com/Effective-Java-2nd-Joshua-Bloch/dp/0321356683">&#8220;Effective Java&#8221;</a> by Joshua Bloch. The book covers many interesting topics; in the second edition, items 8 and 9 in particular shed light on good <code>equals()</code> and <code>hashCode()</code> implementation practices. Another book by the same author, <a href="http://www.javapuzzlers.com">&#8220;Java Puzzlers&#8221;</a>, also contains two puzzlers on this problematic area: specifically, puzzlers 57 and 58 show how these two methods depend on each other.</p>
<p>There are many more discussions about the problems that can occur with <code>hashCode()</code>. For example, in his Google TechTalks presentation <a href="http://www.youtube.com/watch?v=aAb7hSCtvGw">&#8220;How To Design A Good API and Why it Matters&#8221;</a>, Joshua Bloch says that <code>hashCode()</code> is an implementation detail that should not have leaked into the Java API at all (at about 27:30 min).</p>
<p>And don&#8217;t forget the lesson learned here: <em>using mutable fields in hashCode() is a recipe for disaster. And disaster strikes when instances of this class are put in a hash-based collection like <code>HashSet</code> or <code>HashMap</code> (as map keys)</em>.</p>
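<p>One way to apply this lesson is sketched below, under the assumption that a changed phone number may be modeled as a new object. All fields used in <code>hashCode()</code> are final, and an element is replaced in the set instead of being modified in place. The class <code>ImmutablePhoneNumber</code> and its method <code>withNumber()</code> are invented for this example.</p>

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class ImmutablePhoneNumberDemo {

    /** Illustrative immutable variant; not the class from our project. */
    static final class ImmutablePhoneNumber {
        private final String name;
        private final String number;

        ImmutablePhoneNumber(String name, String number) {
            this.name = name;
            this.number = number;
        }

        /** Returns a new instance instead of mutating this one. */
        ImmutablePhoneNumber withNumber(String newNumber) {
            return new ImmutablePhoneNumber(name, newNumber);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ImmutablePhoneNumber)) return false;
            ImmutablePhoneNumber other = (ImmutablePhoneNumber) o;
            return Objects.equals(name, other.name)
                    && Objects.equals(number, other.number);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, number); // only final fields are involved
        }
    }

    /** Replaces an element: remove the old value first, then add the new one. */
    static boolean replaceAndCheck() {
        Set<ImmutablePhoneNumber> set = new HashSet<>();
        ImmutablePhoneNumber old = new ImmutablePhoneNumber("mgm", "089/358680");
        set.add(old);

        ImmutablePhoneNumber updated = old.withNumber("089/358680-0");
        set.remove(old);
        set.add(updated);

        return set.contains(updated) && !set.contains(old) && set.size() == 1;
    }

    public static void main(String[] args) {
        System.out.println("replacement is set member: " + replaceAndCheck());
    }
}
```

<p>If the class has to stay mutable, the same remove-then-add sequence around the setter call keeps the set consistent, but every caller has to remember it.</p>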
<p>Please note that since code usually uses only the respective collection interfaces, e.g. Set and Map, you might not even know about it (as in our case). Or you use a module or library that stores your objects in a collection internally as an implementation detail that&#8217;s hidden from you.</p>
<h2>Don&#8217;t rely on automatic hashCode() Generation</h2>
<p>Even with coding rules in mind, a <code>hashCode()</code> implementation that uses mutable fields creeps into our code base faster than you can spell &#8220;bug&#8221;. This is because developers are reluctant to write the long-winded calculations in the <code>hashCode()</code> methods manually and often generate them with the help of the IDE, as shown in the screenshot below. But it&#8217;s just too easy for a developer to press &#8220;Generate&#8221; without first checking which fields will be included and leaving the mutable fields out. Of all the IDEs I tested, only <a href="http://netbeans.org/">NetBeans</a> leaves all fields unchecked by default, which forces the developer to select them on purpose.</p>
<div id="attachment_1124" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-hashset-java-puzzler" href="http://blog.mgm-tp.com/wp-content/uploads/2012/03/Generating-hashCode-method.png"><img class="size-large wp-image-1124" title="Generating hashCode method" src="http://blog.mgm-tp.com/wp-content/uploads/2012/03/Generating-hashCode-method-480x468.png" alt="" width="480" height="468" /></a><p class="wp-caption-text">Modern IDEs provide automatic generation of hashCode(). Eclipse and IntelliJ IDEA by default include all mutable fields.</p></div>
<h2>Code Inspection Tools to the Rescue?</h2>
<p>You might wonder if classes of your code base contain a <code>hashCode()</code> implementation that uses mutable fields. One option (besides a manual code review) is using a code inspection tool. Unfortunately, prominent open source tools like <a href="http://findbugs.sourceforge.net/bugDescriptions.html/">FindBugs</a>, <a href="http://pmd.sourceforge.net/snapshot/rules/basic.html">PMD</a> and <a href="http://checkstyle.sourceforge.net/availablechecks.html">Checkstyle</a> do not offer such a built-in inspection.</p>
<div id="attachment_1125" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-hashset-java-puzzler" href="http://blog.mgm-tp.com/wp-content/uploads/2012/03/IntelliJ-IDEA-Code-Inspection.png"><img class="size-large wp-image-1125" title="IntelliJ IDEA Code Inspection" src="http://blog.mgm-tp.com/wp-content/uploads/2012/03/IntelliJ-IDEA-Code-Inspection-480x113.png" alt="" width="480" height="113" /></a><p class="wp-caption-text">Only IntelliJ IDEA has a built-in code inspection that detects the use of mutable (actually non-final) fields in hashCode().</p></div>
<p>The only tool support I found was in <a href="http://www.jetbrains.com/idea/">IntelliJ IDEA</a>. This IDE provides a <a href="http://www.jetbrains.com/idea/documentation/inspections.jsp">code inspection</a> named <em>&#8220;Non-final field referenced in &#8216;hashCode()&#8217;&#8221;</em>. Any violation is highlighted as shown in the screenshot above.</p>
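<p>What such an inspection nudges you towards can be sketched as follows. This is only a minimal illustration, not code from our projects: the class <code>Customer</code> and its fields are hypothetical, and the sketch assumes that the immutable <code>id</code> alone identifies an instance. Because <code>hashCode()</code> and <code>equals()</code> ignore the mutable <code>name</code>, a <code>HashSet</code> still finds the element after it has been renamed.</p>

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical entity: only the final field takes part in equals() and hashCode(). */
final class Customer {
    private final long id;  // immutable, safe to use for hashing
    private String name;    // mutable, deliberately left out of hashCode()

    Customer(long id, String name) {
        this.id = id;
        this.name = name;
    }

    void setName(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Customer && ((Customer) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}

final class HashCodeDemo {
    /** Returns true if the set still finds the element after its mutable field changed. */
    static boolean foundAfterRename() {
        Set<Customer> customers = new HashSet<>();
        Customer customer = new Customer(42L, "Alice");
        customers.add(customer);
        customer.setName("Bob"); // hash code is unchanged, so the bucket stays valid
        return customers.contains(customer);
    }
}
```

<p>Whether an identifier alone is an appropriate notion of equality depends on the class at hand; the point here is only that every field used in <code>hashCode()</code> is final.</p>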
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/">Consequences when using Mutable Fields in hashCode() - Hacking Java Puzzlers for Fun and Profit, Part 1</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=24" title="View articles by Ulrich Schrempp">Ulrich Schrempp</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/03/hashset-java-puzzler/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<series:name><![CDATA[Hacking Java Puzzlers for Fun and Profit]]></series:name>
	</item>
		<item>
		<title>Using Domain Specific Languages to Implement Interactive Frontends - Software Quality driven by Formal DSLs, Part 1</title>
		<link>http://blog.mgm-tp.com/2012/02/formal-dsl-part1/</link>
		<comments>http://blog.mgm-tp.com/2012/02/formal-dsl-part1/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 12:06:17 +0000</pubDate>
		<dc:creator>Dr. Jürgen Knopp</dc:creator>
				<category><![CDATA[Research]]></category>
		<category><![CDATA[DSL]]></category>
		<category><![CDATA[QA]]></category>
		<category><![CDATA[Testing]]></category>
		<category><![CDATA[Validation]]></category>
		<category><![CDATA[Web Forms]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1082</guid>
		<description><![CDATA[For many years we have dealt with the challenges that frontends with interactive forms pose w.r.t. validation, test data and quality. Describing the requirements in formal Domain Specific Languages (DSL) became the way of choice to create a specification that gives a twofold benefit: first, the customer understands it better, and secondly, the software engineers [...]]]></description>
			<content:encoded><![CDATA[<p>For many years we have dealt with the challenges that frontends with interactive forms pose w.r.t. validation, test data and quality. Describing the requirements in formal <a href="http://en.wikipedia.org/wiki/Domain-specific_language">Domain Specific Languages</a> (DSL) became the way of choice to create a specification that gives a twofold benefit: first, the customer understands it better, and secondly, the software engineers use the specification not only to implement more resilient software, but also to improve quality assurance. This new series will explain how we do it and why we think it&#8217;s the best approach.</p>
<p><span id="more-1082"></span></p>
<p>We develop many applications for e-government, finance and insurance. These usually rely heavily on the presentation and evaluation of data processed within interactive frontends. Legal and business reasons demand that the resulting applications meet very high standards of reliability and safety. For us, this meant that we had to find a way to work closely with the customer to understand the domain while simultaneously improving the coverage and efficiency of development and QA. </p>
<p>We met this challenge by using Domain Specific Languages to involve the customer in the creation of a specification basis for the domain, which our engineers could use not only as a requirement description but also for the implementation of tools for automatic validation or even code generation. The work spent on the definition, development and maintenance of specification language tools proved to be worth it, as quality improved and effort decreased. For example, while we were analyzing and reporting test coverage in terms of project-specific formalized test data, we were glad that we could rely on a resilient specification basis. </p>
<div id="attachment_1111" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-forms-specification-part1" href="http://blog.mgm-tp.com/wp-content/uploads/2012/02/DSL-Specification-as-a-Bridge.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2012/02/DSL-Specification-as-a-Bridge.png" alt="" title="DSL Specification as a Bridge" width="480" height="201" class="size-full wp-image-1111" /></a><p class="wp-caption-text">Specifications based on a Domain Specific Formal Language bridge the gap between Customers and Software Developers.</p></div>
<p>The proven benefits of formal specifications at the comprehension level of customers encouraged us to bridge the gap in more mgm projects: we encourage more customers to introduce a formal requirements process and more engineers to adopt a domain-specific, though formal, view of requirements so that both can benefit from the improvements.</p>
<p>This first blog article introduces our domain specific formal language approach. The upcoming second part will focus on quality benefits derived from formal specifications.</p>
<h2>Setting the Scene</h2>
<p>The quality of complex interactive applications always includes a high level of conformity to the respective customer&#8217;s requirements. It is not enough to have a well-functioning system; the system also has to do what the customer expects! This holds for both explicit and implicit (sometimes never formulated) requirements. We believe that the only way to accomplish this is to learn from the customer and to strive for exact and comprehensive requirements consistent with the customer&#8217;s domain.</p>
<h4>Requirements modeled by Domain-Specific Formal Languages</h4>
<p>We believe that the need for exact requirements can be met by using formal languages to describe requirements. To model and implement what the customer needs (and not what we believe that he or she needs), the specification must be formulated using the terms of the respective domain and be comprehensible for the customer. We achieve this through the adaptation and extension of our existing language family to support the customer&#8217;s particular domain.</p>
<h4>Requirements modeled for Frontends &ndash; at least</h4>
<p>It is common sense to separate front-end aspects (such as input and output handling for forms) from the full business logic modeled somewhere in the back-end. The main purpose of front-ends is to deliver to the back-end information which is guaranteed to be sound and consistent. The back-end implementers should not be bothered by interactive aspects and user level data consistency.</p>
<h3>Our Approach</h3>
<p>mgm technology partners&#8217; approach to form-centric applications was to implement a framework supporting specification languages along the full software development chain: editors, code generators, documentation and test data generators for the supported language family.</p>
<p>Wherever possible:</p>
<ul>
<li>We use and promote formal specification languages for front-ends and parts of the business logic.</li>
<li>We involve customers in the requirements process, especially by including formal specifications for their domains.</li>
<li>We tailor the specification framework to each customer&#8217;s specific application domain.</li>
</ul>
<p>This yields a high return (for customers and mgm) in terms of delivery time and quality and, in addition, reduces the implementation and quality assurance effort.</p>
<h3>Technical Benefits</h3>
<p>The benefits of using formal front-end specification techniques are manifold:</p>
<ol>
<li><strong>Consistency of the customer requirements</strong>: It becomes very likely that the system accepts and rejects exactly what the customer expects (including implicit requirements which would never have been formulated in the absence of formal specifications).</li>
<li><strong>Reduction of programming effort and risk</strong>: Since formal specifications allow for automatic code generation, a great deal of programming effort and risk simply vanishes.</li>
<li><strong>Increasing functional quality standards for the front-end</strong>: Generated code is consistent with the specifications (once the code generator is mature and tested). Moreover, thanks to the formal description of legal inputs, test suites describing valid inputs are generated automatically.</li>
<li><strong>Driving the tests of the back-end</strong>: Since the front-end specifications define the set of correct inputs, they serve as a basis for tests of the back-end. Automatically generated test suites can be run to check the quality of the back-end in a well-defined setting. Test coverage goals for the back-end can be formulated in terms of the existing formal front-end specifications. Test coverage becomes measurable in terms of front-end data and front-end use cases.</li>
</ol>
<p>In this blog post we focus on topics 1) and 2) described above. We do so by giving a flavor of the specification languages we use. Aspects 3) and 4) will be discussed in a second blog post.</p>
<h2>Specifications supporting customer-related formal Requirements</h2>
<p>In the following we will demonstrate how to derive a formal specification using mgm&#8217;s specification language.</p>
<h3>A User Interface Example</h3>
<p>Imagine a simple bill-writing system with a simple (e.g. web-based graphical) user interface, where we also need to consider both calculations and consistency constraints.</p>
<p><em>Let us assume that a number of bill positions with a net unit price can be typed in, each of them with its own multiplicity (quantity) of at least 1. The system calculates the net price for each position as well as the bottom line, including the default or a specific VAT (value added tax).</em></p>
<p>The input form might look as follows (we ignore GUI details in this example):</p>
<div id="attachment_1083" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-forms-specification-part1" href="http://blog.mgm-tp.com/wp-content/uploads/2012/02/Empty-Form.png"><img class="size-full wp-image-1083" src="http://blog.mgm-tp.com/wp-content/uploads/2012/02/Empty-Form.png" alt="" width="480" height="199" /></a><p class="wp-caption-text">Sample form with input fields (yellow) and calculated fields (blue).</p></div>
<p>Input fields are marked yellow and fields which are to be calculated by the system (and thus are &#8220;read-only&#8221; for the user) are marked blue.</p>
<p>A completed form for two positions with reduced VAT would look as follows:</p>
<div id="attachment_1084" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-forms-specification-part1" href="http://blog.mgm-tp.com/wp-content/uploads/2012/02/Filled-Form.png"><img class="size-full wp-image-1084" src="http://blog.mgm-tp.com/wp-content/uploads/2012/02/Filled-Form.png" alt="" width="480" height="236" /></a><p class="wp-caption-text">Filled form with automatically calculated net and gross amounts.</p></div>
<h2>Formal Specification for the Example</h2>
<p>The specification language is now used to describe the properties of the fields, i.e. their types, the relations between their values and other constraints regarding consistency and completeness in a formal way. This approach is the core of the formalization of requirements.</p>
<p>Let us look at some formal specifications for the example above.</p>
<h3>Types</h3>
<p>Depending on the domain, fields have domain-specific types at user level. These types can be specified as numbers, currencies, dates, strings, enumerations and truth values. Most of them also include a specification of the allowed field length. For instance, <code>AlternativeVat</code> is a positive numeric value with 2 digits, which leads to the specification:</p>
<pre class="brush: plain; wrap-lines: false;">
AlternativeVat: PositiveNumberDigits(2)
</pre>
<p>Some fields (e.g. all position related fields in the example) might have multiple instances. This is also specified along with the type definition. Multiplicity is specified as follows by using <code>multi</code>:</p>
<pre class="brush: plain; wrap-lines: false;">
multi Position:     String(25)
multi UnitPrice:    EurosAndCentsDigits(8)
multi Quantity:     PositiveInteger(3)
multi PosFullPrice: EurosAndCentsDigits(10)
</pre>
<p>Field values that are either calculated by the system or given by the specification are marked as <code>calc</code> or <code>constant</code>, respectively. Neither kind is editable by the user (they are &#8220;read-only&#8221;).</p>
<pre class="brush: plain; wrap-lines: false;">
calc       NetAmount:    EurosAndCentsDigits(10)
constant   NormalVat:    PositiveNumberDigits(2) = 19
calc       AllVat:       EurosAndCentsDigits(10)
calc multi PosFullPrice: EurosAndCentsDigits(10)
</pre>
<h3>Functional Rules</h3>
<p>Functional rules specify values in a functional setting.</p>
<p>For the fields <code>AllVat</code> and <code>GrossAmount</code> one gets straightforward specification formulas:</p>
<pre class="brush: plain; wrap-lines: false;">
AllVat = If FieldValueSpecified(AlternativeVat)
            then  AlternativeVat/100*NetAmount
            else  NormalVat/100*NetAmount
GrossAmount = AllVat + NetAmount
</pre>
<p>The semantics of these rules is intended to be functional rather than imperative, i.e. a field value is not influenced by any state other than the values of the fields it refers to (similar to formulas in spreadsheets or functional languages). Note that functional rules can also be used to specify mappings within the user interface, e.g. from drop-downs to textual representations.</p>
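<p>To make the functional reading more tangible, the two rules can be sketched as pure Java functions. This is a hand-written illustration, not the output of our code generators. The class name <code>VatRules</code>, the use of <code>Optional</code> for an unspecified field and the rounding to cents (half up) are assumptions of the sketch; the rules themselves do not define rounding.</p>

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Optional;

/** Sketch only: the two functional rules as side-effect-free functions. */
final class VatRules {
    static final BigDecimal NORMAL_VAT = new BigDecimal("19"); // constant NormalVat
    private static final BigDecimal HUNDRED = new BigDecimal("100");

    /** AllVat: use AlternativeVat if it is specified, otherwise NormalVat. */
    static BigDecimal allVat(Optional<BigDecimal> alternativeVat, BigDecimal netAmount) {
        BigDecimal rate = alternativeVat.orElse(NORMAL_VAT);
        // Assumption: round to cents, half up.
        return rate.multiply(netAmount).divide(HUNDRED, 2, RoundingMode.HALF_UP);
    }

    /** GrossAmount = AllVat + NetAmount */
    static BigDecimal grossAmount(Optional<BigDecimal> alternativeVat, BigDecimal netAmount) {
        return allVat(alternativeVat, netAmount).add(netAmount);
    }
}
```

<p>Both functions depend only on their arguments, which mirrors the spreadsheet-like semantics described above.</p>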
<h3>Constraint Rules</h3>
<p>Constraint rules describe constraints which have to be satisfied to ensure consistent inputs. They consist of</p>
<ul>
<li>a conditional clause specifying the constraint, which the field values have to satisfy, and of</li>
<li>a fail-clause including a message which shall be issued if the constraints are not satisfied.</li>
</ul>
<p><em>In this example, only specific VATs can be considered: normal VAT, half normal VAT or no VAT at all.</em></p>
<pre class="brush: plain; wrap-lines: false;">
constraint
AlternativeVat == 0
   or AlternativeVat == NormalVat
   or AlternativeVat == NormalVat/2
   =&gt; failed: &quot;VAT can only be normal, half normal or zero&quot;
</pre>
<p>Note that the semantics is similar to assert statements: the failed-clause expresses feedback in case the conditional clause is not fulfilled.</p>
<p>Note further the differences between functional and constraint rules:</p>
<ul>
<li>Unlike functional rules, constraint rules specify constraints (operationally expressed: checks) rather than computations. For syntactic differentiation, we use &#8220;<code>==</code>&#8221; here rather than &#8220;<code>=</code>&#8221;.</li>
<li>Functional rules do not have fail-clauses since they enforce values for fields rather than checking their relationship.</li>
</ul>
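<p>The assert-like semantics of a constraint rule can be sketched in Java as a check that returns the fail-clause message when the condition is violated. The class <code>VatConstraint</code> and the convention of returning an empty <code>Optional</code> for a satisfied constraint are assumptions made for this illustration only.</p>

```java
import java.math.BigDecimal;
import java.util.Optional;

/** Sketch only: the VAT constraint as a check with a failure message. */
final class VatConstraint {
    static final BigDecimal NORMAL_VAT = new BigDecimal("19");
    static final String MESSAGE = "VAT can only be normal, half normal or zero";

    /** Returns the message of the fail-clause, or an empty Optional if the constraint holds. */
    static Optional<String> check(BigDecimal alternativeVat) {
        BigDecimal halfNormalVat = NORMAL_VAT.divide(new BigDecimal("2")); // 9.5, exact
        boolean satisfied = alternativeVat.compareTo(BigDecimal.ZERO) == 0
                || alternativeVat.compareTo(NORMAL_VAT) == 0
                || alternativeVat.compareTo(halfNormalVat) == 0;
        return satisfied ? Optional.empty() : Optional.of(MESSAGE);
    }
}
```

<p>Unlike the functional rules, the check computes no field value; it only reports whether the given value is acceptable.</p>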
<h3>Rules for Multiple Instances</h3>
<p>The language supports multiple instances of fields (and therefore of rules) along two multiplicity dimensions &ndash; &#8220;All&#8221; and &#8220;Each&#8221;.</p>
<h4>All-multiplicity for functional rules</h4>
<p>Computations such as the one for the field <code>NetAmount</code> have to consider multiple instances:</p>
<pre class="brush: plain; wrap-lines: false;">
NetAmount = Sum(PosFullPrice.all)
</pre>
<p>The &#8220;<code>.all</code>&#8221; next to <code>PosFullPrice</code> stands for an &#8220;all instances&#8221; semantics. We call this <em>&#8220;all&#8221; multiplicity</em>.</p>
<h4>Each-multiplicity for functional rules</h4>
<p>Each instance of <code>PosFullPrice</code> is calculated in a homogeneous way by multiplying the unit price by the quantity:</p>
<pre class="brush: plain; wrap-lines: false;">
PosFullPrice.each = UnitPrice.each * Quantity.each
</pre>
<p>This functional rule is evaluated separately for each row (i.e. instance), which is denoted by the postfix &#8220;<code>.each</code>&#8221;. We call this <em>&#8220;each&#8221; multiplicity</em>.</p>
<p>It states that the rule holds for every instance of the fields involved. Intuitively, one can view it as one copy of the rule per instance.</p>
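<p>The difference between the two multiplicities can be sketched in Java: &#8220;each&#8221; behaves like a row-wise map, &#8220;all&#8221; like an aggregation over all rows. The class <code>Multiplicity</code> and the representation of field instances as parallel lists are assumptions of this illustration.</p>

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: "each" as a row-wise computation, "all" as an aggregation. */
final class Multiplicity {
    /** PosFullPrice.each = UnitPrice.each * Quantity.each (one result per row). */
    static List<BigDecimal> posFullPrices(List<BigDecimal> unitPrices, List<Integer> quantities) {
        if (unitPrices.size() != quantities.size()) {
            throw new IllegalArgumentException("every row needs a unit price and a quantity");
        }
        List<BigDecimal> result = new ArrayList<>();
        for (int row = 0; row < unitPrices.size(); row++) {
            int quantity = quantities.get(row);
            result.add(unitPrices.get(row).multiply(BigDecimal.valueOf((long) quantity)));
        }
        return result;
    }

    /** NetAmount = Sum(PosFullPrice.all) */
    static BigDecimal netAmount(List<BigDecimal> posFullPrices) {
        BigDecimal sum = BigDecimal.ZERO;
        for (BigDecimal price : posFullPrices) {
            sum = sum.add(price);
        }
        return sum;
    }
}
```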
<h4>Each-multiplicity for constraint rules</h4>
<p>Consistency constraints for multiple instances of fields are expressed here as well.</p>
<p><em>In the example, it must be guaranteed that each position (row) is fully specified.</em> This is expressed with the specific predicate <code>FieldsCommonlyDefined</code> (using &#8220;each&#8221; multiplicity).</p>
<pre class="brush: plain; wrap-lines: false;">
constraint
FieldsCommonlyDefined(Position.each, UnitPrice.each, Quantity.each)
    ==&gt; failed: &quot;All fields (Position, UnitPrice, Quantity) must be specified if one is specified&quot;
</pre>
<h4>All-multiplicity for constraint rules</h4>
<p>Obviously, &#8220;all&#8221; multiplicity can be used in constraint rules as well. In the example, since the system should not print empty bills, one has to require a multiplicity greater than zero for at least one of the fields referring to positions.</p>
<pre class="brush: plain; wrap-lines: false;">
constraint
AtLeastOneInstanceExists(Position.all)
    ==&gt; failed: &quot;Please specify at least one position&quot;
</pre>
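<p>One possible reading of the two predicates can be sketched in Java. The exact semantics of <code>FieldsCommonlyDefined</code> and <code>AtLeastOneInstanceExists</code> are defined by the specification language, so the following is an interpretation: a row is accepted if either all or none of its three fields are specified, and <code>null</code> stands for &#8220;not specified&#8221;. The class and method names are hypothetical.</p>

```java
import java.math.BigDecimal;
import java.util.List;

/** Sketch only: an interpretation of the two multiplicity constraints. */
final class PositionConstraints {
    /** One bill row; null means "not specified" (an assumption of this sketch). */
    static final class Row {
        final String position;
        final BigDecimal unitPrice;
        final Integer quantity;

        Row(String position, BigDecimal unitPrice, Integer quantity) {
            this.position = position;
            this.unitPrice = unitPrice;
            this.quantity = quantity;
        }
    }

    /** One row: either all three fields are specified or none of them is. */
    static boolean fieldsCommonlyDefined(Row row) {
        int specified = 0;
        if (row.position != null) specified++;
        if (row.unitPrice != null) specified++;
        if (row.quantity != null) specified++;
        return specified == 0 || specified == 3;
    }

    /** "Each" multiplicity: the constraint must hold for every row. */
    static boolean eachRowCommonlyDefined(List<Row> rows) {
        for (Row row : rows) {
            if (!fieldsCommonlyDefined(row)) {
                return false;
            }
        }
        return true;
    }

    /** "All" multiplicity: at least one row has a position. */
    static boolean atLeastOnePosition(List<Row> rows) {
        for (Row row : rows) {
            if (row.position != null) {
                return true;
            }
        }
        return false;
    }
}
```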
<p>This completes the example, which shows some of the means of specification. The full specification is given at the end of the summary below. Since it is complete, it can be used for the automatic generation of code for computations and constraint checking.</p>
<p>The aim of this short example is neither syntactic accuracy nor completeness. It is merely an illustration of important aspects. Keep in mind that each language of the specification language family is tailored to a specific domain anyway.</p>
<h3>Summary: Characterization of the Specification Language Family</h3>
<p>In a nutshell, the specification language family described above defines valid inputs and front-end computations for web applications or other form based systems. The formalism can be characterized as a subset of typed predicate calculus defined in terms of the customer domain. For the sake of simplicity and comprehensibility, we do not extend to full predicate calculus or to higher order logic. The pragmatic expressiveness of the language family is more important than the theoretical power of predicate calculus.</p>
<p>Here is a summary of the aspects which can be described.</p>
<ul>
<li>Typed fields: valid and invalid field values, based on the field description.</li>
<li>Field-value constraints: valid and invalid values for related fields.</li>
<li>Existence constraints: validity and invalidity of existing and missing input values, depending on the values or the existence of input values for other fields.</li>
<li>Both kinds of constraints can be interwoven.</li>
<li>Computation of field values based on other field values. This is expressed by rules including:
<ul>
<li>mapping of external presentations to internal ones (such as mapping drop down selections to values),</li>
<li>constant values which show up in the user interface,</li>
<li>conditional values.</li>
</ul>
</li>
<li>Multiplicity: fields may be defined to occur in several instances, controlled by the specification. For constraints, this allows iterative aspects to be expressed at a higher level than for single fields. The specification language family can express further multiplicity aspects that are not needed in this example; these, however, are beyond the scope of this article.</li>
</ul>
<p>And here&#8217;s the full specification example:</p>
<pre class="brush: plain; wrap-lines: false;">
# Fields and Types:
multi      Position: String(25)
multi      UnitPrice: EurosAndCentsDigits(8)
multi      Quantity: PositiveInteger(3)
multi      PosFullPrice: EurosAndCentsDigits(10)
           AlternativeVat: PositiveNumberDigits(2)
constant   NormalVat: PositiveNumberDigits(2) = 19
calc       NetAmount: EurosAndCentsDigits(10)
calc       AllVat:  EurosAndCentsDigits(10)
calc multi PosFullPrice: EurosAndCentsDigits(10)

# Functional Rules:
AllVat = If FieldValueSpecified(AlternativeVat)
            then AlternativeVat/100*NetAmount
            else NormalVat/100*NetAmount
GrossAmount       = AllVat + NetAmount
NetAmount         = Sum(PosFullPrice.all)
PosFullPrice.each = UnitPrice.each * Quantity.each

# Constraints:
AlternativeVat == 0  or  AlternativeVat == NormalVat
      or  AlternativeVat == NormalVat/2
      =&gt; failed: &quot;VAT can only be normal, half normal or zero&quot;
FieldsCommonlyDefined(Position.each, UnitPrice.each, Quantity.each)
      =&gt; failed: &quot;All fields (Position, UnitPrice, Quantity) must be specified if one is specified&quot;
AtLeastOneInstanceExists(Position.all)
     =&gt; failed: &quot;Please specify at least one position&quot;
</pre>
<h2>Development and Testing Issues</h2>
<p>Thanks to formal specifications, both software development and quality assurance become less cumbersome, more comprehensible, easier to track, more scalable and safer. We conclude by briefly sketching these aspects.</p>
<h3>Avoiding Implementation Efforts and Risks through Code Generation</h3>
<p>Similar to programming languages, specification languages substantially simplify software development. Code generated from specifications eliminates a great deal of complexity and leads to less error-prone systems. Both functional rules and constraint rules can be used as input for generators to generate fully operational code. This is a highly efficient way to automatically produce components for the validation of data in a complex solution environment.</p>
<p>Once the code generators are well tested, there is little need to test the generated code again for each front-end application. mgm technology partners developed code generators that generate code for Java, C++ and JavaScript, each of them running in a variety of environments.</p>
<h3>Test Data Generation and Test Coverage</h3>
<p>Since the specification languages we use describe well-defined inputs, they are an ideal basis for test data generation: consistent test cases are generated directly from the specification thanks to</p>
<ul>
<li>the existence of type definitions for fields,</li>
<li>the existence of constraint rules describing the relationship between fields,</li>
<li>and the existence of calculation formulas for calculated field values.</li>
</ul>
<p>With few exceptions, test data generation is fully automatic. For specific test cases it is also possible to automatically generate inconsistent data by deliberately enforcing wrong data. This is done by negating constraints and re-running the test data generator.</p>
<p>mgm technology partners has developed a test data generator suite based on the formal languages we use. The tool is used for the generation of big and complex test cases allowing for extensive and well-defined test coverage. Some insights can be found in our blog articles &#8220;<a href="http://blog.mgm-tp.com/2010/10/test-data-generation-part1/">Form Validation with Rule Bases</a>&#8221; and &#8220;<a href="http://blog.mgm-tp.com/2010/12/test-data-generation-part2/">Producing High-Quality Test Data</a>&#8221;. Test coverage and other quality assurance topics will be highlighted in the upcoming second blog article.</p>
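<p>The idea of negating a constraint to obtain inconsistent data can be sketched with a deliberately naive generator. This is not how our test data generator works internally; it is a brute-force stand-in. The class <code>TestDataSketch</code>, the candidate range and the reading of <code>PositiveInteger(3)</code> as the range 1 to 999 are assumptions of this illustration.</p>

```java
import java.util.OptionalInt;
import java.util.function.IntPredicate;
import java.util.stream.IntStream;

/** Sketch only: a brute-force stand-in for a constraint-driven test data generator. */
final class TestDataSketch {
    /** Assumed type constraint of Quantity: PositiveInteger(3), read as 1..999. */
    static final IntPredicate QUANTITY_OK = quantity -> quantity >= 1 && quantity <= 999;

    /** Returns the first candidate in a fixed range that satisfies the given constraint. */
    static OptionalInt generate(IntPredicate constraint) {
        return IntStream.rangeClosed(-5, 1005).filter(constraint).findFirst();
    }
}
```

<p>Calling <code>generate(QUANTITY_OK)</code> yields a valid quantity, whereas <code>generate(QUANTITY_OK.negate())</code> re-runs the same generator with the negated constraint and yields a value the front-end must reject.</p>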
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/02/formal-dsl-part1/">Using Domain Specific Languages to Implement Interactive Frontends - Software Quality driven by Formal DSLs, Part 1</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=23" title="View articles by Dr. Jürgen Knopp">Dr. Jürgen Knopp</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/02/formal-dsl-part1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<series:name><![CDATA[Software Quality driven by Formal DSLs]]></series:name>
	</item>
		<item>
		<title>Ultra-Performant Dynamic Websites with Varnish</title>
		<link>http://blog.mgm-tp.com/2012/01/varnish-web-cache/</link>
		<comments>http://blog.mgm-tp.com/2012/01/varnish-web-cache/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 11:41:25 +0000</pubDate>
		<dc:creator>Dr. Christian Winkler</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[Cache]]></category>
		<category><![CDATA[ECommerce]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[Scalability]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1059</guid>
		<description><![CDATA[This article describes how we configured and used the Varnish web cache for the popular German online shop www.lidl.de. Varnish gave us a tremendous performance boost. With this new caching setup, we easily achieve request rates of several thousand pages per second, which are quite common during marketing campaigns like special offers.

In a typical non-caching [...]]]></description>
			<content:encoded><![CDATA[<p>This article describes how we configured and used the <a href="https://www.varnish-cache.org/">Varnish</a> web cache for the popular German online shop <a href="http://www.lidl.de/">www.lidl.de</a>. Varnish gave us a tremendous performance boost. With this new caching setup, we easily achieve request rates of several thousand pages per second, which are quite common during marketing campaigns like special offers.</p>
<p><span id="more-1059"></span></p>
<p>In a typical <em>non-caching</em> setup of a web application as illustrated in the figure below, Apache handles static requests for images, scripts, etc. and forwards requests for the HTML pages to an application server like Tomcat or Glassfish. There the dynamic content is generated and then sent back to Apache and finally to the user. In this scenario, the database access is the most critical bottleneck. Even worse, each page request can cause multiple database requests, i.e. SQL statements.</p>
<div id="attachment_1068" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-varnish" href="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Setup-without-Web-Cache.png"><img class="size-full wp-image-1068" title="Setup without Web Cache" src="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Setup-without-Web-Cache.png" alt="" width="480" height="255" /></a><p class="wp-caption-text">Initial setup without our caching solution: the slow components are shown in red. (Load balancing, which could be done by Apache, is not considered here.)</p></div>
<p>Let&#8217;s assume that without caching, an application server can serve up to <span id="statefullperformance">100 dynamic pages per second</span>. Through a bit of horizontal scaling, i.e. using two server instances (nodes) and load balancing, this can be increased to about 200 dynamic requests per second. However, this scaling is not perfect: with three or more nodes it already starts to degrade, as the sessions have to be distributed among the nodes in the cluster.</p>
<p>The system can of course handle many more simultaneous users than the number 200 suggests, as users do not click links continuously. So the number of users did not really pose a problem during normal operation. However, the situation immediately became critical when newsletters with special offers were sent, as the application server instances were then under &#8220;siege&#8221;. Overloading the instances led to slower and slower responses and decreasing customer satisfaction. Another reason a shop wants to be responsive is that search engines take the response times measured during crawling into account when ranking search results.</p>
<p>So the question we had to solve was: <em>How can we keep the system responsive (ideally with a response time of 1-2 seconds) during high load and peak situations?</em> Please be aware that in the case of online shops, the highest turnover occurs in these situations.</p>
<p>When we analyzed the server log of the <a href="http://www.lidl.de/">www.lidl.de</a> online shop, we noticed an interesting fact, which we used to our advantage later on: the behavior of users is different in these situations. Most users are just browsing and reading. Consider e.g. a newsletter sent to a few million users: most of the readers will just click a few links (which can still easily amount to several million page impressions). Taking a deeper look, we found that most users view exactly identical content, which has nevertheless been generated exclusively for each of them. Only a small percentage used the interactive services of the website such as shopping carts, ordering etc.</p>
<h2>Introducing Varnish</h2>
<p>The peak situation described above implies that most content (even though dynamically generated by the web application) is identical for all users. So the obvious idea for a cache is to store the most frequently requested pages. The <a href="http://www.mediawiki.org/wiki/Manual:Varnish_caching">Varnish manual</a> describes Varnish as a lightweight, efficient <a href="https://www.varnish-cache.org/docs/trunk/tutorial/advanced_backend_servers.html">reverse proxy</a> server, meaning it is working in front of the web servers (Apache). It acts as a so-called <em>HTTP accelerator</em> which stores (caches) copies of the pages served by the web server (thus the synonym &#8220;web cache&#8221;). The next time the same page is requested by a user, Varnish will serve the copy instead of requesting the page from the Apache server. Varnish is blazingly fast, since it stores its cached data in memory.</p>
<p>The new architecture with Varnish as a web cache now looks like this:</p>
<div id="attachment_1069" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-varnish" href="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Setup-with-Varnish-as-Web-Cache.png"><img class="size-full wp-image-1069" title="Setup with Varnish as Web Cache" src="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Setup-with-Varnish-as-Web-Cache.png" alt="" width="480" height="367" /></a><p class="wp-caption-text">Varnish in front of Apache acting as a Web cache. It is configured to cache only stateless page requests. Stateful page requests (session) and static resources are forwarded to Apache.</p></div>
<h2>Performance Improvements</h2>
<p>Caching with Varnish removes the need for the web application to regenerate the same page over and over again, resulting in a tremendous performance boost. Varnish can easily handle 10,000 requests/s on a single node. Especially in high load situations the hit rate is easily above 90% (and almost 100% for the most frequently clicked homepage), so that the setup described above can now handle 50 times the original volume. However, this high performance will only hold for <em>stateless</em> users. Any user with a session will fall back to the 100 requests/s class.</p>
<p>As most of the load is now taken by the Varnish cache servers, the load on the application servers has dropped considerably. Even in high load situations where the Varnish servers handle several thousand requests per second, most of the content comes from the cache and the application servers can concentrate on re-creating expired content (which is then kept in the cache for <code>s-maxage</code> seconds) and handling users with a session (who are hopefully going to order).</p>
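<p>How the hit rate translates into total capacity can be estimated with a back-of-the-envelope model. It is a simplification, not a measurement: it assumes that only cache misses reach the application servers and it ignores stateful users. The figures of 10,000 requests/s for Varnish and 200 requests/s for two application server nodes are taken from the text; the class and method names are hypothetical.</p>

```java
/** Sketch only: simple capacity model for a cache in front of a slower backend. */
final class CacheCapacity {
    /**
     * Upper bound on the total request rate: the cache itself is one limit, and the
     * misses (the fraction 1 - hitRate of all requests) must fit into the backend capacity.
     */
    static double maxRequestsPerSecond(double cacheCapacity, double backendCapacity, double hitRate) {
        if (hitRate < 0.0 || hitRate > 1.0) {
            throw new IllegalArgumentException("hitRate must be in [0, 1]");
        }
        if (hitRate == 1.0) {
            return cacheCapacity; // no request reaches the backend
        }
        return Math.min(cacheCapacity, backendCapacity / (1.0 - hitRate));
    }
}
```

<p>Under this model, <code>maxRequestsPerSecond(10000, 200, 0.90)</code> gives about 2,000 requests/s, and a hit rate of about 98% is needed before the backend stops being the limit at 10,000 requests/s. This illustrates why the very high hit rates seen in peak situations matter so much.</p>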
<p>Our setup leads to a significantly improved end-to-end performance of the system &ndash; even during normal operation. This is interesting as it creates an advantage for users during normal operation and saves money for the website owner at the same time.</p>
<p>Using less hardware means investing less money initially. Even more important, however, is that the operating costs will also be much lower. These costs are caused by the permanent maintenance of the system, like powering servers around the clock, updating, applying patches etc. Since they are the main drivers of the total cost of ownership (TCO), the potential savings are also largest in this area.</p>
<p>Using fewer servers also means consuming less power. By reducing the energy bill this &#8220;green IT&#8221; approach therefore leads to lower operating costs. Compared to extending the existing system without a cache, an enormous amount of money was saved both in hardware and operating costs, while introducing a &#8220;performance buffer&#8221; for situations with even higher loads at the same time.</p>
<p>Another effect is that the shop&#8217;s marketing division can now act freely without having to keep technical constraints in mind: new campaigns can be planned to increase the turnover significantly, like sending more frequent newsletters, using special offers etc.</p>
<h2>Challenges</h2>
<p>Before we dive into the details of our Varnish configuration, let&#8217;s first discuss three problems we had to solve, specifically handling stateful users, keeping users stateless w.r.t. caching as long as possible, and caching pages with changing content.</p>
<h3>Problem: Websites are Stateful</h3>
<p>Most websites nowadays are stateful, e.g. a server-side session is created when a user logs in. In case of an online shop, the session might contain the shopping cart, login information etc.</p>
<p>The problem is that as soon as the session contains personalized information, caching must immediately stop. But, as long as state information does not have an effect on the content of generated pages, it can be ignored. This is what we call a <em>stateless or browsing user</em>, and our first objective should be to cache pages suitable for this user class.</p>
<p>Thus, our <strong>solution is to classify users</strong>, i.e. to carefully distinguish between stateless and stateful users. As the web application did not originally take care of that, it had to be changed in two fundamental ways:</p>
<ol>
<li>The application must only generate and send cookies if it has created some internal state for a user.</li>
<li>This state transition can happen at any time. So a user who has not even touched the application server and is completely unknown to the application must be able to become a stateful user at any time.</li>
</ol>
<div id="attachment_1070" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-varnish" href="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Stateless-vs-Stateful-Users.png"><img class="size-full wp-image-1070" title="Stateless vs Stateful Users" src="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Stateless-vs-Stateful-Users.png" alt="" width="480" height="337" /></a><p class="wp-caption-text">Two classes of users are distinguished by certain attributes. A user should stay stateless as long as possible. Stateful (red) users will need contact to the application server and experience slower performance.</p></div>
<p>Fortunately, the web application was already obeying the <a href="http://en.wikipedia.org/wiki/Representational_state_transfer">REST</a> paradigm. HTTP GET requests were used for all content that was just shown to the users. In contrast to this, all user actions which were actually creating some state on the server side were already modeled in HTTP POST requests. This proved to be extremely helpful when we started to configure the cache software.</p>
<h3>Keeping Users Stateless</h3>
<p>The general goal must be to keep users stateless for as long as possible. In a first, naive approach, only stateless users can be served from the cache.</p>
<p>Keeping users stateless means that the server should never send a session cookie unless really necessary. On the other hand, a lot of web applications require some basic personalization. This dilemma can be solved by using cookies which will be evaluated on the client side only. For example, let&#8217;s assume that users can change the background color of the website as a very simple form of personalization. This can be performed by JavaScript and, for the sake of caching (and achieving a high hit rate), this should be the preferred way of doing simple personalization. Of course, a cookie evaluated on the server side could be used for the personalized background color to get the same result. But the cache hit rate would then suffer considerably (to be exact, by a factor identical to the number of background colors, since one cached copy has to be stored per color).</p>
<p>So one <strong>recipe for staying stateless is to keep simple state on the client-side</strong> and never send it to the server. This state does not necessarily have to reside in a cookie &ndash; you can also use the browser local storage for that, as described in Smashing Magazine&#8217;s <a href="http://coding.smashingmagazine.com/2010/10/11/local-storage-and-how-to-use-it/">&#8220;Using Local Storage In HTML5-Capable Browsers&#8221;</a> article.</p>
<h3>Dealing with Content that is Changing</h3>
<p>Even if now all stateless users can see the same cached content, this content is changing over time. In an online shop, for example, some products might run out of stock and become unavailable or need to be replaced by other products. Unfortunately, this does not only affect the product pages themselves but sometimes also pages that reference them; e.g. links and thumbnail images will have to be changed or removed. Similar situations often occur in online publishing and in nearly all websites which change over time.</p>
<p>Thus, another requirement for the cache is its ability to <strong>partially</strong> <strong>expire content</strong>. And of course, the bookkeeping must be performed externally so that the affected pages can be removed individually.</p>
<p>For the cache to work properly and perform <strong>automatic expiration</strong> of content, it needs to know how long the currently cached content should be kept (i.e. its maximal age). The web application therefore has to generate this so-called time-to-live (TTL) information.</p>
<p>The HTTP specification defined HTTP response header fields such as <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9"><code>Cache-Control</code></a> for exactly this purpose a long time ago. These are set by the web application itself, since it knows best how long the content will be considered &#8220;current&#8221;/&#8220;valid&#8221;. This setting could even be dynamic, e.g. giving a shorter time-to-live to a product page if stock is low. The Cache-Control directive most suitable for this purpose is <code>s-maxage</code>, as it specifies the maximum time in seconds that the response is allowed to be kept in the web cache.</p>
<h3>Determining Cacheable Candidates</h3>
<p>Not all content can or even should be cached. Caching completely static websites is by far the easiest case. However, such sites tend to be rather unattractive, and their pages can simply be pre-generated and placed on the web server. As the cache will sit in front of the web server, all requests will go to the cache first. It does not make much sense to store pages in the cache which are kept statically in the web server&#8217;s file system anyway.</p>
<p>On the other hand, only GET URLs can be candidates for caching. As a POST request transmits information from the browser to the server, it cannot be cached and must always be handled by the application server. This might sound like a big constraint at first but is actually a feature that can be nicely utilized: all URLs which are candidates for performing the state transition of a user from stateless to stateful will be POST requests. And consequently, the application itself can decide whether the POST requests actually qualify for making a user stateful or whether s/he can remain stateless, for example when a wrong login/password combination is entered.</p>
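<p>In Varnish&#8217;s configuration language (VCL, introduced in the next section), this rule can be sketched as follows. Varnish&#8217;s built-in request handling already behaves this way, so the snippet only makes the rule explicit; it assumes Varnish 3.0 syntax.</p>
<pre class="brush: plain; wrap-lines: false;">
sub vcl_recv {
  # only GET and HEAD requests are candidates for a cache lookup;
  # everything else (e.g. POST) goes straight to the application server
  if (req.request != &quot;GET&quot; &amp;&amp; req.request != &quot;HEAD&quot;) {
    return (pass);
  }
}
</pre>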
<h2>Anatomy of Varnish&#8217;s Request Processing</h2>
<p>Varnish distinguishes three stages when processing a request:</p>
<ul>
<li>The request is received from the browser (<code>vcl_recv</code>).<br />
At this stage, Varnish calls the subroutine <code>vcl_recv</code> in the configuration file (VCL). Here, the request header can be manipulated e.g. by removing cookies. It can be decided whether the content should be looked up in the cache or be propagated to the backend server.</li>
<li>The response is received from the backend (<code>vcl_fetch</code>).<br />
This function is only executed when the content is not delivered from the cache. In this phase, response headers from the backend can be modified (either for delivery or for saving in the cache). The request attributes are also still available and can be used for manipulating several settings.</li>
<li>The response is sent to the browser (<code>vcl_deliver</code>).<br />
This stage is passed by all requests and can be used to add headers (like TTL), change cookies etc. The request parameters are available for reading.</li>
</ul>
<div id="attachment_1063" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-varnish" href="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Varnish-Request-Processing.png"><img class="size-large wp-image-1063" title="Varnish Request Processing" src="http://blog.mgm-tp.com/wp-content/uploads/2012/01/Varnish-Request-Processing-480x310.png" alt="" width="480" height="310" /></a><p class="wp-caption-text">Different stages of Varnish&#39;s request processing. Everything related to the cache is in red, i.e. all cacheable content is looked up in the cache and possibly delivered; if it&#39;s not in the cache, the web server will be asked via vcl_fetch.</p></div>
<p>Varnish defines additional subroutines which also hook into the Varnish workflow, but they are not as important. See also the <a href="https://www.varnish-cache.org/docs/trunk/tutorial/vcl.html">VCL tutorial</a> and the <a href="https://www.varnish-cache.org/docs/trunk/reference/vcl.html">VCL reference</a>.</p>
<h3 id="vcl">A Sample Varnish Configuration (VCL)</h3>
<p>This section contains a simple Varnish configuration that provides caching as required. The challenge is to keep the user stateless as long as possible. In order to achieve this, a simple trick is used: if a request does not contain a <code>JSESSIONID</code> cookie, it is a stateless request and even if the (uneducated) backend wants to set a cookie, it will be removed. Only POST requests will set necessary cookies. Manipulating the TTL complements the configuration. A lot of logging is used in the example; this is not just for illustrative purposes but also practical for debugging and optimizing the configuration.</p>
<pre class="brush: plain; wrap-lines: false;">
import std;

backend default {
    .host = &quot;localhost&quot;;  # Varnish is running on same server as Apache
    .port = &quot;80&quot;;
}

sub vcl_recv {
  # remove unnecessary cookies
  if (req.http.cookie ~ &quot;JSESSIONID&quot;) {
    std.log(&quot;found jsessionid in request, passing to backend server&quot;);
    return (pass);
  } else {
    unset req.http.cookie;
  }
}

sub vcl_fetch {
  if (req.http.cookie ~ &quot;JSESSIONID&quot; || req.request == &quot;POST&quot;) {
    std.log(&quot;not removing cookie/passing POST, url &quot; + req.url);
    # Varnish 3.0 (needed for &quot;import std&quot;) renamed pass in vcl_fetch to hit_for_pass
    return (hit_for_pass);
  } else {
    # remove all other cookies and prevent backend from setting any
    std.log(&quot;removing cookie in url &quot; + req.url);
    unset beresp.http.set-cookie;
    # keep the stateless response in the cache for ten minutes
    set beresp.ttl = 600s;
  }
}

sub vcl_deliver {
  # send some handy statistics back, useful for checking cache
  if (obj.hits &gt; 0) {
    set resp.http.X-Cache-Action = &quot;HIT&quot;;
    set resp.http.X-Cache-Hits = obj.hits;
  } else {
    set resp.http.X-Cache-Action = &quot;MISS&quot;;
  }
}
</pre>
<p>Notice the C-like syntax in the Varnish configuration. This is no accident; in fact, the whole configuration code is compiled to a binary shared object at startup and when reloading the script to optimize for performance. As the subroutines in this configuration are called for each request, this helps immensely in creating a fast cache server. Moreover, it is possible to add C code directly to the configuration.</p>
<p>It might seem strange at first to define the configuration in a procedural language, but it proved to be extremely valuable as it enables us to be flexible and to formulate how exactly to handle the requests. Overall, this leads to a much more readable configuration than a declarative approach.</p>
<p>Notice the different &#8220;top level&#8221; objects in the configuration file:</p>
<ul>
<li><code>req</code> is the request (i.e. the URL including all headers) coming from the browser,</li>
<li><code>resp</code> is the response before it is sent to the client, i.e. when it can still be manipulated,</li>
<li><code>beresp</code> is the response which Varnish gets from the backend (if the object is not cacheable or not cached); it can be evaluated as well.</li>
</ul>
<p>On a side note, Varnish can use ACLs to restrict the access to certain resources. Cached content can also be invalidated declaratively: a so-called <a href="https://www.varnish-cache.org/docs/trunk/tutorial/purging.html#bans">&#8220;ban&#8221;</a> specifies an expression, and cached objects matching it are no longer served from the cache. Varnish can also (atomically) delete individual elements from the cache. This is accomplished via a <a href="https://www.varnish-cache.org/docs/trunk/tutorial/purging.html#http-purges">&#8220;purge&#8221; command</a> through the HTTP interface and should be restricted to trusted IP addresses by means of an ACL, as in the standard configuration.</p>
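<p>A minimal sketch of such an HTTP purge, closely following the example in the Varnish 3.0 documentation, could look like this. The ACL name <code>purge</code> and the network <code>192.168.55.0/24</code> are placeholders and have to be replaced by the addresses of the systems that are allowed to invalidate content.</p>
<pre class="brush: plain; wrap-lines: false;">
# hosts that are allowed to send PURGE requests (placeholder addresses)
acl purge {
  &quot;localhost&quot;;
  &quot;192.168.55.0&quot;/24;
}

sub vcl_recv {
  if (req.request == &quot;PURGE&quot;) {
    if (!client.ip ~ purge) {
      error 405 &quot;Not allowed.&quot;;
    }
    # look the object up so that vcl_hit or vcl_miss can remove it
    return (lookup);
  }
}

sub vcl_hit {
  if (req.request == &quot;PURGE&quot;) {
    purge;
    error 200 &quot;Purged.&quot;;
  }
}

sub vcl_miss {
  if (req.request == &quot;PURGE&quot;) {
    purge;
    error 200 &quot;Purged.&quot;;
  }
}
</pre>
<p>With this in place, the external bookkeeping mentioned earlier only has to send a <code>PURGE</code> request for each affected URL.</p>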
<h2>Configuration Details and Tips</h2>
<p>Now that we have seen the basic VCL file and understood how a request is usually processed, let&#8217;s dive in even further and discuss the details and lessons learned.</p>
<h3>Improving the Hit rate with Header Normalization</h3>
<p>Varnish has to be told which HTTP request header fields it should use as a cache index. The index is organized as a hash, thus these selected header fields are often referred to as the <em>hash key</em>.</p>
<p>On a side note, you can select the header fields to be used as a hash by implementing the subroutine <code>vcl_hash</code>. If you don&#8217;t implement it, Varnish uses the full URL plus the <code>Host</code> request header field by default. In addition to the hash key computed in <code>vcl_hash</code>, the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html"><code>Vary</code></a> header field is always automatically added to the hash key. For further information on the hash key, see <a href="https://www.varnish-cache.org/docs/trunk/phk/varnish_does_not_hash.html">&#8220;What Varnish Does&#8221;</a> and <a href="http://stackoverflow.com/questions/6098914/varnish-and-http-header">&#8220;Varnish and http header&#8221; on Stackoverflow</a>.</p>
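<p>For reference, the default behavior corresponds roughly to the following subroutine, shown here in Varnish 3.0 syntax. It is meant as an illustration of how the hash key is built, not as something that has to be copied into your own configuration.</p>
<pre class="brush: plain; wrap-lines: false;">
sub vcl_hash {
  # the URL is always part of the hash key
  hash_data(req.url);
  # followed by the Host header, or the server IP if no Host was sent
  if (req.http.host) {
    hash_data(req.http.host);
  } else {
    hash_data(server.ip);
  }
  return (hash);
}
</pre>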
<p>To improve the cache hit-rate, it is crucial that you clean up the request header fields used for the hash key. Cleaning up means to change them to a common denominator (so-called <em>header normalization</em>). A very good candidate is of course the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23"><code>Host</code> header field</a>, where a normalized version (like &#8220;www.sitename.com&#8221;) should be used even if &#8220;sitename.com&#8221; is sent in the request header. In addition to that, removing unnecessary headers is always a good idea.</p>
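<p>Normalizing the <code>Host</code> header field takes only a few lines in <code>vcl_recv</code>. The following sketch uses the placeholder host names from the paragraph above:</p>
<pre class="brush: plain; wrap-lines: false;">
sub vcl_recv {
  # map the bare domain to the canonical host name so that
  # both variants share the same cache entries
  if (req.http.host == &quot;sitename.com&quot;) {
    set req.http.host = &quot;www.sitename.com&quot;;
  }
}
</pre>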
<p>Be careful that the application server does not send a <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44"><code>Vary</code> header field</a> for the user agent as this effectively means that there has to be a distinct copy for each user agent. There are so many different browsers (see <a href="http://panopticlick.eff.org/">Panopticlick</a>) that this will basically make caching useless. See also <a href="http://mark.koli.ch/2010/09/understanding-the-http-vary-header-and-caching-proxies-squid-etc.html">&#8220;Understanding the HTTP Vary Header and Caching Proxies (Squid, etc.)&#8221;</a> and the <a href="https://www.varnish-cache.org/docs/trunk/tutorial/vary.html#tutorial-vary">Varnish Documentation on Vary</a>.</p>
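<p>If the application cannot be changed, one possible workaround is to strip the user agent from the <code>Vary</code> header field of the backend response before the object is stored. The following is a sketch under the assumption that the pages really do not differ per browser; it is not part of the configuration shown above.</p>
<pre class="brush: plain; wrap-lines: false;">
sub vcl_fetch {
  if (beresp.http.Vary ~ &quot;User-Agent&quot;) {
    # remove the User-Agent entry together with a preceding comma
    set beresp.http.Vary = regsub(beresp.http.Vary, &quot;,? *User-Agent *&quot;, &quot;&quot;);
    # remove a leading comma that is left over if User-Agent came first
    set beresp.http.Vary = regsub(beresp.http.Vary, &quot;^, *&quot;, &quot;&quot;);
    if (beresp.http.Vary == &quot;&quot;) {
      unset beresp.http.Vary;
    }
  }
}
</pre>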
<h3>Compression</h3>
<p>The <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.3"><code>Accept-Encoding</code></a> request header field plays an important role: it can have different values like &#8220;identity&#8221;, &#8220;gzip&#8221; or &#8220;deflate&#8221;. Unfortunately, Internet Explorer prefers the deflate encoding while all other browsers favor gzip. Without intervention, this leads to different copies of the same content in the cache, one in deflate format, the other in gzip format.</p>
<p>Since the request header can be modified on the fly in the <code>vcl_recv</code> subroutine, we can effectively control that only one variant of the content is cached. In your VCL you can modify the request header field and use gzip exclusively if it is available (which is true for both Internet Explorer and others). This technique is presented in detail in the article <a href="https://www.varnish-cache.org/trac/wiki/VCLExampleNormalizeAcceptEncoding">&#8220;Normalize Accept-Encoding header&#8221;</a>. Since both browser families have a market share of roughly 50%, this simple change effectively doubles the hit rate.</p>
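<p>The normalization is usually written along the following lines. This sketch is modeled on the commonly published example; the list of file extensions is only an assumption about which resources are already compressed.</p>
<pre class="brush: plain; wrap-lines: false;">
sub vcl_recv {
  if (req.http.Accept-Encoding) {
    if (req.url ~ &quot;\.(jpg|png|gif|gz|tgz|bz2|mp3|ogg)$&quot;) {
      # already compressed formats, no point in compressing them again
      unset req.http.Accept-Encoding;
    } elsif (req.http.Accept-Encoding ~ &quot;gzip&quot;) {
      # prefer gzip whenever the browser accepts it
      set req.http.Accept-Encoding = &quot;gzip&quot;;
    } elsif (req.http.Accept-Encoding ~ &quot;deflate&quot;) {
      set req.http.Accept-Encoding = &quot;deflate&quot;;
    } else {
      # unknown encoding, request the uncompressed variant
      unset req.http.Accept-Encoding;
    }
  }
}
</pre>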
<p>Please note that beginning with Varnish 3.0, Varnish <a href="https://www.varnish-cache.org/docs/trunk/tutorial/compression.html">supports gzip natively</a> and can modify the <code>Accept-Encoding</code> field by itself, so the measures discussed in the previous paragraph can be skipped.</p>
<h3>Handling Cookies</h3>
<p>Cookies basically fall into different categories:</p>
<ul>
<li>Cookies <em>relevant</em> for caching: These should be kept and their values can be used as part of the hash key for cache index.</li>
<li>Cookies <em>irrelevant</em> for caching: These should be discarded and not considered by the cache.</li>
<li>Cookies <em>partially relevant</em> for caching: These should be modified and the irrelevant parts should be removed. The remaining cookie should then be used as part of the hash key for the cache index.</li>
<li>Session cookies: These cookies must be treated differently as they basically make caching impossible. If such cookies are detected, Varnish should not cache anything but work as a proxy only sending data from the backend server directly to the client.</li>
</ul>
<h3>Consistent Values for TTL and the Expires Field</h3>
<p>Varnish has to decide whether and how long to keep elements in the cache. As we have already learned, the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9"><code>Cache-Control</code></a> header field is utilized here. More specifically, the <code>s-maxage</code> directive (or <code>max-age</code> as a fallback if <code>s-maxage</code> is not present) is examined to determine the specified maximum lifetime of a cacheable object. Of course, this only works as long as the cache is not full; once it is full, the <a href="http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used">LRU algorithm</a> decides which objects are evicted first.</p>
<p>If the web application was not designed with a web cache in mind, it might have conflicting values in <code>s-maxage</code> and the <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21"><code>Expires</code></a> response header field. (See the HTTP specification for a <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3">discussion of the Expires versus the max-age field</a>.) This might lead to the bizarre situation that the cached content is sent by Varnish with an <code>Expires</code> header field value that lies in the past if <code>s-maxage</code> has a larger value than <code>Expires</code>.</p>
<p>This weird behavior can be fixed in several ways, e.g. by statically setting the <code>Expires</code> header in Varnish for each request to <code>s-maxage</code> seconds into the future during <code>vcl_fetch</code>. This will increase the cache efficiency <em>on the browser side</em> and lead to a more responsive website.</p>
<h3>File Descriptors</h3>
<p>In our first tests the solution performed well, but not excellently. But even more critical were the many dropped connections, i.e. requests from browsers that did not even reach Varnish.</p>
<p>The reason and the fix were easy &ndash; the number of file descriptors had to be increased. This is even more important in real-life situations where connections tend to be slow, as each TCP connection consumes one file descriptor. It does not hurt to allow 32768 descriptors for Varnish.</p>
<h3>Monitoring</h3>
<p>If you have set up a web cache solution with Varnish, it is important to measure its performance and especially monitor the hit rate of the cache. This turns out to be a bit complicated since the Varnish log files are not written to disk for performance reasons; instead, Varnish logs to a circular buffer residing in a shared memory segment. The circular buffer can be read at any time but past values will vanish forever. Since we wanted a monitoring solution that would also allow us to perform a post-mortem analysis in case of a problem, we configured the logging to write the <a href="https://www.varnish-cache.org/docs/trunk/reference/varnishd.html#storage-types">circular buffer to a persistent file</a>.</p>
<p>The most relevant tools for monitoring Varnish are:</p>
<ul>
<li><a href="https://www.varnish-cache.org/docs/trunk/reference/varnishlog.html">varnishlog</a>: This shows current requests from the logging ring buffer. Usually request phases will be shown in chronological order which mixes up the requests themselves. This can be fixed by using appropriate options though.</li>
<li><a href="https://www.varnish-cache.org/docs/trunk/reference/varnishtop.html">varnishtop</a>: This shows a continuously updated ranking of the most frequent entries in the log, e.g. the most requested URLs, and can be used to optimize the configuration if a few requests dominate the traffic.</li>
<li><a href="https://www.varnish-cache.org/docs/trunk/reference/varnishhist.html">varnishhist</a>: This is easily the most intuitive and graphical tool for analyzing Varnish. It shows a (text) histogram of the response time distribution and thus gives a good overview how the whole system is performing.</li>
<li><a href="https://www.varnish-cache.org/docs/trunk/reference/varnishstat.html">varnishstat</a>: This shows important statistical information about hit rates, total cache hits, accepted connections from clients etc.</li>
</ul>
<h2>Why Varnish is the best Caching Solution (for us)</h2>
<p>When we began to investigate ways to speed up the <a href="http://www.lidl.de">www.lidl.de</a> site, our first choice was to add the <a href="http://httpd.apache.org/docs/2.1/mod/mod_cache.html">Apache mod_cache</a> caching module to the Apache web server already in use. The first hurdle was the declarative configuration; it is well-suited for a web server but not perfect for modeling a caching behavior. After some fiddling around, it was working smoothly. But more serious problems arose from the fact that certain cookies had to be considered and others had to be neglected. It was impossible to find a viable solution, so the cookie was filtered out by the load balancer. Cache invalidation is performed lazily in Apache, i.e. an outdated resource is removed from the cache only after it is requested. Consequently, outdated resources which are not requested will stay in the cache forever and can only be expired externally. As all cached components are distributed in single files, this expiry is slow and the whole process complicated. For our situation, Apache was not a good solution (although it was in use for quite some time) and hit rates were also rather disappointing.</p>
<p>So our search continued. Via dedicated proxy servers like <a href="http://www.squid-cache.org/">Squid</a>, which are more suitable for large client-side installations, we finally encountered Varnish, an HTTP accelerator specially built for caching purposes on the server-side. Varnish is already used by many big websites like <a href="http://www.facebook.com/">Facebook</a>, <a href="http://search.twitter.com">Twitter (Search)</a> and <a href="http://www.hulu.com/">Hulu</a>.</p>
<p>Varnish is very flexible as it offers procedural configuration of all request stages in a C-like language (which is actually translated to C and compiled at start time to be as efficient as possible). This enables creative cookie handling and all kinds of other tricks which are usually needed in such a scenario. Varnish was specially designed to run on servers with a VM subsystem, so all cached objects live in a single memory-mapped file and can be accessed extremely fast. Varnish handles expiry automatically and correctly and is also much faster than Apache. So the decision was made to go with Varnish.</p>
<p>Other <em>HTTP accelerators</em> were also considered, but proved to be not feasible, like <a href="http://www.oracle.com/technetwork/middleware/ias/index-089317.html">Oracle Web Cache</a>, a commercial software package from Oracle Inc.; the problem here is that the cache cannot grow easily, and that the manipulation of requests and responses is limited. A hardware-based solution is e.g. <a href="http://www.f5.com/solutions/acceleration/web-acceleration/">F5&#8217;s BIG-IP WebAccelerator</a>.</p>
<h2>Further Optimizations</h2>
<p>Below is a discussion of measures that build on a Varnish setup and would speed up the page delivery even further.</p>
<h3>Using a CDN to increase Scale, Reach &amp; Performance</h3>
<p>CDNs take care of delivering the static content while the dynamic content is served via the usual stack. They work in an inherently distributed way and have clever algorithms to select the topologically nearest server for each user. Static and dynamic content can be separated by using virtual webservers with different hostnames. The PDF article <a href="http://www.akamai.com/dl/technical_publications/GloballyDistributedContentDelivery.pdf">&#8220;Globally distributed content delivery&#8221;</a> from Akamai provides an excellent introduction.</p>
<p>Most CDNs offer an API for invalidating all or partial content and respect the expires header field sent from the originating servers. So the Varnish server can work as a central content repository and will be the upstream server for refreshing the CDN.</p>
<p>Almost all traffic would then be served by the CDN. This saves a lot of bandwidth on the Varnish server and the Gigabit interface will not so easily be overloaded. Moreover, as traffic costs in the CDN are negligible, money can be saved as the hosting company does not have to increase its own upstream link. For more information on how to build a CDN see <a href="http://blog.unixy.net/2010/07/how-to-build-your-own-cdn-using-bind-geoip-nginx-and-varnish/">&#8220;How to build your own CDN using BIND, GeoIP, Nginx, and Varnish&#8221;</a>.</p>
<h3>ESI: Caching Page Fragments with diverse TTL</h3>
<p>From a technical point of view, only pages which are requested by the GET method can be cached at all. This is due to the fact that &ndash; by definition &ndash; POST requests change state on the server and therefore necessarily need to reach the application server.</p>
<p>However, the solution described above performs less &#8220;aggressive&#8221; caching since it just stops caching as soon as a session cookie is present. The effect is that stateful users never get cached pages and therefore might have to wait longer for the page to render completely. On the other hand, it does not make sense to cache pages for individual users since it is quite unlikely that the same user will come back to exactly the same page. Even if the user came back, it would not be safe to assume that the page is still up-to-date (e.g. since the shopping cart might have changed in the meantime).</p>
<p>To speed things up again, a compromise needs to be found between caching invariant fragments of a page and producing personalized content on the fly for stateful users. Fortunately, Varnish offers the right tools to perform exactly this decomposition by leveraging <a href="http://www.w3.org/TR/esi-lang">Edge Side Includes (ESI)</a>.</p>
<p>When <a href="https://www.varnish-cache.org/docs/trunk/tutorial/esi.html">Varnish processes ESI tags</a>, the page assembly (out of fragments) is done by Varnish. As these fragments are separate web resources (requested through GET or POST) they can be assigned their own cache settings and handling information. For example, a cache time-to-live (TTL) of several days could be appropriate for the template, but a fragment containing a frequently-changing story or ad may require a much lower TTL. Some fragments may have to be marked uncacheable.</p>
<p>It must be carefully analyzed what the decomposition of the page might look like, as getting it right is essential to achieve a high hit-rate and a low overhead. In case of an online shop, the page could e.g. consist of different (graphical) fragments:</p>
<div id="attachment_1071" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-varnish" href="http://blog.mgm-tp.com/wp-content/uploads/2012/01/ESI-Page-Template-and-Fragements.png"><img class="size-full wp-image-1071" title="ESI Page Template and Fragements" src="http://blog.mgm-tp.com/wp-content/uploads/2012/01/ESI-Page-Template-and-Fragements.png" alt="" width="480" height="365" /></a><p class="wp-caption-text">Decomposition of a typical page into user-specific, dynamic (red) and static (blue) fragments.</p></div>
<p>The shopping cart and the login details would then be transferred directly from the application server via an appropriate ESI fragment, whereas the rest of the page is identical for all users and can be stored in the cache. To minimize the number of requests from Varnish to the application server, both fragments can be transferred in one part and integrated in different locations on the page on the client side or via CSS.</p>
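<p>In Varnish 3.0, ESI processing is switched on per response in <code>vcl_fetch</code>. The following sketch assumes a hypothetical fragment URL <code>/fragments/cart</code> that delivers the shopping cart and login details, and a page template that references it with an ESI include tag. The cookie handling in <code>vcl_recv</code> would have to be adapted as well so that stateful users receive the cached template; this is not shown here.</p>
<pre class="brush: plain; wrap-lines: false;">
# the cached page template is assumed to contain a tag like:
#   &lt;esi:include src=&quot;/fragments/cart&quot; /&gt;

sub vcl_fetch {
  if (req.url == &quot;/fragments/cart&quot;) {
    # user-specific fragment: never store it in the cache
    return (hit_for_pass);
  } else {
    # template: let Varnish resolve the ESI tags and cache the result
    set beresp.do_esi = true;
    set beresp.ttl = 24h;
  }
}
</pre>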
<p>Compared to the performance numbers <a href="#statefullperformance">above</a>, the stateful performance is much higher when using ESI. Rates of about 500 stateful requests per second are now easily possible.</p>
<h3>Memcached: Caching Session-specific Page-Fragments</h3>
<p>If you examine the page diagram above, you might notice that even though the shopping cart and login details are user-specific elements on the page, they are not very dynamic, i.e. they change infrequently.</p>
<p>This leads to an opportunity for further optimization: the user-specific fragments can also be stored, but must of course be associated with the session of the corresponding user. As the information is not persistent (as it becomes invalid with an invalidated session) it can be stored in memory. <a href="http://memcached.org/">Memcached</a> is just made for this scenario and therefore a perfect fit, see e.g. the article <a href="http://blog.preinheimer.com/index.php?/archives/334-Storing-Sessions-in-Memcache-how-everything-behaves.html">&#8220;Storing Sessions in Memcache&#8221;</a>.</p>
<p>Any change in the shopping cart or login details will trigger a regeneration of the HTML fragments which will then be stored in memcached. (This can be done in the same POST request by the application server.) Varnish will include the fragment from Memcached (either via direct integration, via Apache or via Nginx). A SessionListener within Tomcat can take care of removing stale sessions from Memcached.</p>
<p>Memcached is extremely fast. Even for stateful users this leads to a performance of well above 5,000 GET requests/s. POST requests are a different story as they still have to be handled by the application server. As they perform only internal tasks and write both to the database and Memcached, a rate of 500 requests/s is nonetheless realistic.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2012/01/varnish-web-cache/">Ultra-Performant Dynamic Websites with Varnish</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=20" title="View articles by Dr. Christian Winkler">Dr. Christian Winkler</a>.
	</small></em></p>

	<p><em><small>
	We are hiring! mgm technology partners is looking for good software engineers for all our offices. Check out <a rel="external" href="http://www.mgm-tp.com/karriere">www.mgm-tp.com/karriere</a>.
	</small></em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2012/01/varnish-web-cache/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Practical Customer Participation in JIRA Workflows - JIRA beyond Bug Tracking, Part 2</title>
		<link>http://blog.mgm-tp.com/2011/12/jira-beyond-bug-tracking-part2/</link>
		<comments>http://blog.mgm-tp.com/2011/12/jira-beyond-bug-tracking-part2/#comments</comments>
		<pubDate>Thu, 08 Dec 2011 15:43:19 +0000</pubDate>
		<dc:creator>Alexander Weiss</dc:creator>
				<category><![CDATA[Projects]]></category>
		<category><![CDATA[Tips]]></category>
		<category><![CDATA[Agile]]></category>
		<category><![CDATA[JIRA]]></category>
		<category><![CDATA[Project Management]]></category>
		<category><![CDATA[QA]]></category>

		<guid isPermaLink="false">http://blog.mgm-tp.com/?p=1046</guid>
		<description><![CDATA[This second part of our blog series continues with the topic of direct involvement of customers and gives some practical examples of when, where and how to introduce and include your customer into JIRA. We will then discuss two of mgm&#8217;s proven real-world workflows and use them as case studies about appropriate modes for successful [...]]]></description>
			<content:encoded><![CDATA[<p>This second part of our blog series continues with the topic of direct customer involvement and gives some practical examples of when, where and how to introduce your customer to <a href="http://www.atlassian.com/software/jira/overview">JIRA</a>. We will then discuss two of mgm&#8217;s proven real-world workflows and use them as case studies of appropriate modes of successful customer participation. You will also learn about our recommended ways of shielding the customer from the complexity of huge JIRA projects.</p>
<p><span id="more-1046"></span></p>
<p>Let&#8217;s begin with how JIRA can be utilized during the initial project phases. The main steps here are to prepare a more detailed business model and to complete the technical and business concepts. These steps are tightly connected with the compilation of the requirements and the requirement management phase. The requirement analysts of the project and the responsible project manager will interview all the necessary stakeholders to get a complete picture of the required solution that the business modelers and architects need for their work.</p>
<h2>Capturing Requirements as JIRA Tickets</h2>
<p>During this requirement management phase, all collected functional and non-functional requirements are already stored as JIRA tickets in order to control their content and impact and to prioritize them with respect to the solution and its implementation order (planning process). This phase is also a good first point of contact between the customer and JIRA: the customer can be involved in compiling new requirements and in detailing items that have already been filed.</p>
<p>But as the customer is not yet very familiar with JIRA at this very early stage, we typically choose to create all the new requirement tickets ourselves rather than leaving this to the customer. This is not just to unburden the customer: we also want to avoid the additional work of correcting imprecisely formulated requirements. </p>
<p>The descriptions of requirement tickets should always be <em>unambiguous and complete</em>. Thus, the responsibility for phrasing requirements usually remains with us. The customer can nevertheless be involved at any time to contribute details, and he can (and should) take an active part in the elaboration phase by adding his input and expertise as comments on the respective tickets. </p>
<p>Another very important point in requirement management is the <em>terminology</em> used. Always speak (and write) the customer&#8217;s domain-specific language. Use only terms that can be understood by the customer! We find it very helpful to maintain a glossary of all domain-specific words and terms together with the customer. </p>
<h2>Customer Involvement in the JIRA Requirement Process</h2>
<p>In addition to detailing the content of requirements and ensuring their correctness, the customer can take over two other important tasks in the requirement process:</p>
<ul>
<li><strong>Assignment:</strong> Once a requirement has been elaborated, its effort has been estimated and it is ready for realization, it has to be assigned to the supplier (us) for release planning and implementation.</li>
<li><strong>Approval:</strong> When the requirement is implemented and approved by the development team and our internal quality assurance, the realized requirement is ready for approval by the customer.</li>
</ul>
<p>Both of these steps (assignment and approval) can be realized as workflow steps for the issue type &#8220;Requirement&#8221;. Depending on the character of the project and the customer, mgm runs projects with different levels of integration. </p>
<h2>Proven Workflow Implementations</h2>
<p>Let&#8217;s take a look at two requirement workflows that we designed for our projects, each with a different integration level of the mentioned steps.</p>
<div id="attachment_1047" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-jira-beyond-part2" href="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-workflow-1.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-workflow-1-480x313.png" alt="" title="blog2-workflow-1" width="480" height="313" class="size-large wp-image-1047" /></a><p class="wp-caption-text">Workflow 'Requirement' with dedicated customer steps for assignment and approval (Variant 1).</p></div>
<p>This first workflow (shown above) contains a dedicated step to alert the customer that the requirement is &#8220;READY TO ASSIGN&#8221;. On his dashboard the customer has a portlet listing all these &#8216;marked&#8217; requirements as a working queue for assignments!</p>
<p>It is not strictly necessary that the customer himself executes the transition &#8220;assign&#8221; in JIRA. We have projects where the requirement assignment is officially sent via mail or e-mail by the customer. In these cases, our requirement manager performs the transition on behalf of the customer. But we also have projects where the customer himself pushes the &#8220;assign&#8221; button in JIRA.</p>
<p>Following the implementation part with the steps &#8220;IN PROGRESS&#8221;, &#8220;RESOLVED&#8221; and &#8220;CODE-REVIEWED&#8221;, the requirement workflow contains the steps &#8220;READY FOR TESTING&#8221; and &#8220;VERIFIED&#8221;. During the software approval stage the customer can use these dedicated steps to manage his testing and approval tasks. Once again, the needed filters are integrated into the customer dashboard. The development team will explicitly hand over all implementations that passed internal quality assurance to the customer. The approval transition &#8220;VERIFY ISSUE&#8221; will then be executed by the customer himself. Usually we convince the customer to do this directly in JIRA. </p>
<p>In this first example, the step &#8220;REVISION&#8221; of the requirement (the elaboration phase) is separated from the step &#8220;ESTIMATE&#8221;, so the customer retains control over when effort estimates are ordered. </p>
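<p>To make the division of duties more tangible, a workflow like this can be thought of as a table of transitions, each attributed to one party. The following sketch is only an illustration: the status names are taken from the text above, but the exact order of the steps, every transition name other than &#8220;assign&#8221; and &#8220;verify issue&#8221;, and the attribution of each transition are assumptions, and nothing here is JIRA configuration or JIRA API.</p>

```python
CUSTOMER = "customer"
SUPPLIER = "supplier"

# (current status, transition) maps to (next status, party the step belongs to).
# Order and most transition names are assumed for illustration.
TRANSITIONS = {
    ("REVISION", "request estimate"): ("ESTIMATE", CUSTOMER),
    ("ESTIMATE", "estimate done"): ("READY TO ASSIGN", SUPPLIER),
    ("READY TO ASSIGN", "assign"): ("IN PROGRESS", CUSTOMER),
    ("IN PROGRESS", "resolve"): ("RESOLVED", SUPPLIER),
    ("RESOLVED", "review code"): ("CODE-REVIEWED", SUPPLIER),
    ("CODE-REVIEWED", "hand over"): ("READY FOR TESTING", SUPPLIER),
    ("READY FOR TESTING", "verify issue"): ("VERIFIED", CUSTOMER),
}


def apply_transition(status, transition, role):
    # role is the party the step is attributed to, even if somebody else
    # clicks the button on that party's behalf.
    key = (status, transition)
    if key not in TRANSITIONS:
        raise ValueError("no transition {!r} from {!r}".format(transition, status))
    next_status, owner = TRANSITIONS[key]
    if role != owner:
        raise ValueError("{!r} belongs to the {}".format(transition, owner))
    return next_status


def customer_queue(tickets):
    # Tickets in a status from which the customer has to act next,
    # i.e. what a dashboard filter for the customer would list.
    waiting = {status for (status, _), (_, owner) in TRANSITIONS.items()
               if owner == CUSTOMER}
    return sorted(key for key, status in tickets.items() if status in waiting)


tickets = {"REQ-1": "READY TO ASSIGN", "REQ-2": "IN PROGRESS",
           "REQ-3": "READY FOR TESTING"}
print(customer_queue(tickets))  # ['REQ-1', 'REQ-3']
print(apply_transition("READY TO ASSIGN", "assign", CUSTOMER))  # IN PROGRESS
```

<p>Read this way, the customer&#8217;s working queue is simply the set of tickets sitting in a status whose outgoing transition belongs to the customer.</p>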
<p>Now let us consider a second workflow example as depicted below: </p>
<div id="attachment_1048" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-jira-beyond-part2" href="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-workflow-2.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-workflow-2-480x311.png" alt="" title="blog2-workflow-2" width="480" height="311" class="size-large wp-image-1048" /></a><p class="wp-caption-text">Workflow 'Requirement' with dedicated customer steps for assignment and approval (Variant 2).</p></div>
<p>The interesting parts here are the initial step &#8220;DRAFT&#8221;, the step &#8220;ANALYSIS&#8221; (before &#8220;REVISION&#8221;) and the approval step &#8220;SIGNED OFF&#8221;. The &#8220;DRAFT&#8221; step is especially useful if customers create requirement tickets by themselves. The distinct step &#8220;ANALYSIS&#8221; is an independent elaboration and phrasing phase for the customer (typically when they have a dedicated operations department) whereas the step &#8220;REVISION&#8221; is an elaboration phase for the project team (development). At the end of the whole implementation process the customer can use the step &#8220;SIGNED OFF&#8221; for the approval process.</p>
<h2>Overview: Where and How to Involve the Customer</h2>
<p>Requirement management is an obvious area for direct participation of customers, but more traditional areas like &#8220;bug tracking&#8221; and &#8220;change management&#8221; are also potential candidates for customer involvement.</p>
<p>Below is a collection of areas where we constantly try to convince our customers to participate directly within our established JIRA processes:</p>
<ul>
<li><strong>Requirement management</strong>
<ul>
<li>Input of new requirement tickets </li>
<li>Direct participation in the elaboration phase (optionally with additional workflow steps)</li>
<li>Assignment for realization (workflow steps)</li>
<li>Testing and approval of implemented requirements (workflow steps)</li>
</ul>
</li>
<li><strong>Change management</strong>
<ul>
<li>Input of new change request tickets </li>
<li>Direct participation in the elaboration phase (optionally with own workflow steps)</li>
<li>Assignment for realization (workflow steps)</li>
<li>Testing and approval of implemented change requests (workflow steps)</li>
</ul>
</li>
<li><strong>Bug tracking</strong>
<ul>
<li>Input of new bug tickets </li>
<li>Testing and approval of fixed bugs (workflow steps)</li>
</ul>
</li>
<li><strong>Software approval process</strong>
<ul>
<li>Execution of dedicated testing tickets </li>
<li>Approval of all individual development tickets (requirements, change requests and bugs) (workflow steps)</li>
<li>Issue the final software (or release) acceptance (workflow steps)</li>
</ul>
</li>
</ul>
<p>The &#8220;Change Management&#8221; process has to be aligned with the customer&#8217;s organizational structure and change process. In our experience, change management in particular is usually an already well-defined process on the customer side. However, for &#8220;Change Management&#8221; we can typically apply the same workflow as for &#8220;Requirements&#8221;.</p>
<p>&#8220;Bug tracking&#8221; nowadays follows standard workflows. But bug tickets are also development tasks, i.e. changes to the product/source code. Thus, we extended the bug workflows with steps representing the approval and quality assurance parts (&#8220;READY FOR TESTING&#8221;, &#8220;VERIFIED&#8221; and &#8220;SIGNED OFF&#8221;) as well. The same applies to all issue types that lead to development activities, e.g. requirements, change requests, bugs and dedicated implementation tasks, even where requirements and change requests represent only the controlling/management part and not the realization part.</p>
<h2>Dedicated JIRA Projects for the Customer and Development</h2>
<p>Sometimes the number of JIRA tickets per period (e.g. those needed for a software release) exceeds the &#8220;pain&#8221; threshold (typically &gt; 300 per release), and the customer begins to lose the overview of the project and feels lost in the overwhelming number of requirements, change requests, bugs, QA tasks and tasks in general. Our recommendation for these cases is to split the project up and create two dedicated JIRA projects:</p>
<ul>
<li><strong>Customer facing project:</strong> Used for all operational bugs and incidents (source: customer and end-user), comprises the complete requirement and change management and the software approval process. </li>
<li><strong>Development facing project:</strong> Used for all bugs during the development phase, all implementation tasks derived from requirements and change requests, development internal quality assurance tasks, all tasks that are related to the project in general, etc.</li>
</ul>
<div id="attachment_1049" class="wp-caption alignnone" style="width: 490px"><a rel="lightbox-jira-beyond-part2" href="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-project-splitting.png"><img src="http://blog.mgm-tp.com/wp-content/uploads/2011/12/blog2-project-splitting-480x226.png" alt="" title="blog2-project-splitting" width="480" height="226" class="size-large wp-image-1049" /></a><p class="wp-caption-text">Splitting up a JIRA project into a dedicated JIRA project for the customer and another one for the development.</p></div>
<p>The ticket handling is very easy: Assigned requirements within the customer project are just cloned and moved to the development project. The original and the cloned requirements are then automatically linked by JIRA. In the development project you can then create the appropriate division into implementation tasks needed for your team and component diversity. The status update of the customer&#8217;s source tickets (linked tickets) has to be done manually.</p>
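<p>Because the status of the customer&#8217;s source tickets has to be updated by hand, it helps to know which of them are due for an update. The sketch below models the clone-and-link idea with plain data classes; it does not use the JIRA API, and the project keys, field names and the &#8220;RESOLVED&#8221; done-status are assumptions chosen for illustration.</p>

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    key: str
    summary: str
    status: str = "OPEN"
    links: list = field(default_factory=list)  # keys of linked tickets


def clone_to_development(customer_ticket, dev_key):
    # Clone the requirement into the development project and link both ways.
    clone = Ticket(key=dev_key, summary=customer_ticket.summary)
    clone.links.append(customer_ticket.key)
    customer_ticket.links.append(clone.key)
    return clone


def needs_manual_update(customer_tickets, dev_tickets, done="RESOLVED"):
    # Customer tickets whose linked development tickets are all done
    # while the customer ticket itself has not been updated yet.
    by_key = {t.key: t for t in dev_tickets}
    result = []
    for ticket in customer_tickets:
        linked = [by_key[k] for k in ticket.links if k in by_key]
        if linked and all(t.status == done for t in linked) and ticket.status != done:
            result.append(ticket.key)
    return result


req = Ticket("CUST-1", "Export report as PDF")
dev = clone_to_development(req, "DEV-7")
dev.status = "RESOLVED"
print(needs_manual_update([req], [dev]))  # ['CUST-1']
```

<p>A list like this can serve as a reminder for whoever is responsible for keeping the customer project in sync.</p>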
<p>The development project is typically visible only to the development team and not to the customer. When we did this in the past, the customer&#8217;s initial suspicion that we just wanted to hide information from him could always be dispelled by simply opening the development project to him and showing him the hundreds of (open) tickets. Normally he would lose interest in this project very quickly because it gives him no additional benefit. On the contrary, the sheer amount of information tends to confuse him.</p>
<p>We have had very good experiences with the concept of two dedicated JIRA projects, one for the customer and one for development. But there is a <em>second way</em> to keep superfluous information away from an overwhelmed customer: JIRA&#8217;s security level concept. This way you can keep all tickets in one JIRA project. However, you then have to maintain security settings at ticket level, because security levels have to be set manually for each ticket that requires one. Setting a default security level is counterproductive, because every ticket created by the customer would then be hidden from the customer immediately after creation.</p>
<h2>Conclusion</h2>
<p>In our experience, the advantages of direct customer participation in JIRA outweigh the disadvantages, for example the increased effort needed for JIRA configuration. A well-informed customer who is directly involved in his project feels much more comfortable, even if something goes wrong or the project&#8217;s progress stalls. <em>Transparency is the magical keyword.</em></p>
<p>But you have to accept that every project has its own characteristics. It will be mainly influenced by the customer&#8217;s character, organization and stakeholders. You have to find the most appropriate and fitting level for a customer&#8217;s direct participation. Try to get the most accurate picture of the stakeholders you have to work with and then decide how they could fit into the process. And keep in mind that the process can always be adapted afterwards in order to achieve the greatest efficiency in project progress and customer satisfaction.</p>
<p>In keeping with agile practice, apply <em>continuous improvement</em> to your project management processes: Change something &ndash; find out how it went &ndash; learn from it &ndash; change something again!</p>
<h2>Summary of the Key Success Factors</h2>
<ul>
<li>Let customers create tickets directly in JIRA: requirements, change requests, support inquiries, bugs.</li>
<li>Incorporate customers&#8217; duties and responsibilities (e.g. assignments, approvals) directly into the issues workflow as dedicated steps.</li>
<li>Prepare specific filters and dashboards for the customer
<ul>
<li>for his duties (detailing, assignments, approval)</li>
<li>for overviews</li>
<li>for status</li>
</ul>
</li>
<li>Split projects with an overwhelming number of implementation tasks into two separate JIRA projects &#8211; one for the customer and one for development.</li>
<li>Give customers a short JIRA training covering all standard actions as well as how to use and adapt dashboards and how to interpret the data and analysis reports.</li>
<li>Tailor every project set-up individually and don&#8217;t try to compress it into the same template.</li>
</ul>
<p>If you keep all this in mind, you have a good chance that JIRA will become your customer&#8217;s &#8216;sweetheart&#8217;!</p>
<p>There are tons of other interesting topics around JIRA. I will continue to provide you with further ideas, suggestions and mgm experiences. And of course if you have additional questions, ideas, and suggestions around JIRA I would really appreciate any comments and input from you.</p>
	<p><em><small>(c) 2013 <a href="http://www.mgm-tp.com">mgm technology partners</a>. This posting "<a href="http://blog.mgm-tp.com/2011/12/jira-beyond-bug-tracking-part2/">Practical Customer Participation in JIRA Workflows - JIRA beyond Bug Tracking, Part 2</a>" is part of the <a href="http://blog.mgm-tp.com">mgm technology blog</a>. The author of the posting is
	<a href="http://blog.mgm-tp.com/?author=21" title="View articles by Alexander Weiss">Alexander Weiss</a>.
	</small></em></p>

]]></content:encoded>
			<wfw:commentRss>http://blog.mgm-tp.com/2011/12/jira-beyond-bug-tracking-part2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<series:name><![CDATA[JIRA beyond Bug Tracking]]></series:name>
	</item>
	</channel>
</rss>
