Monday, 25 August 2014

Chasing an email display problem (or How the World Was Saved)

A few days ago I was involved in investigating why an email sent by our application was displayed incorrectly by an application on the recipient's side. The problem seemed trivial, and the root cause was indeed hilarious, but it took a couple of days to troubleshoot...

Our application (Java-based) sends emails through a Postfix SMTP server running locally, which relays them to a corporate mail server, which in turn delivers the messages to the intended recipients.


This setup has worked very well for us, but some time ago one of the third party recipients started to complain that emails from our production system were displayed incorrectly in their application. Their custom application consumes received emails from their mailbox and displays the content in a text window. The problem was that the email body was displayed garbled, as if all line breaks were missing. They were surprised, because during tests in the UAT environment everything had looked fine.
Troubleshooting was not made any easier by the fact that this third party is a huge financial corporation, with significant red tape and inertia. The people who observed the problem had little chance of getting proper IT support on their side, so the burden of investigating the issue fell mostly on us - with mild pressure from the client, as the issue was a blocker for business plans the client had with that third party.

Initially there was a suspicion that our production and UAT environments were configured differently, with one sending emails with Linux-style line breaks (\n) and the other with Windows-style endings (\r\n). A wrong format could confuse the application on the recipient side... We spent a couple of days comparing UAT and production configuration - no differences found. We went as far as to capture SMTP traffic in both environments and compare it in Wireshark, to make sure that what leaves our estate is identical for the production and UAT environments. Nothing - both environments sent identical emails, barring the recipient list (the to/cc/bcc fields were populated differently in the two environments).

Then the attention moved to the corporate mail server, based on Mimecast. The suspicion now was that some policies were set up that processed UAT and production emails differently. So we sent dozens of emails from both environments trying to identify the reason, using different application configurations and different Mimecast policy settings. Again, no conclusive outcome - but interestingly, the third party reported that production emails started to look fine at some point. The only difference between those emails and the ones that had caused problems a few days earlier was that one 'cc' address was no longer present. That address was something like img@company.com.

As it turned out, the reason why perfectly good emails were displayed incorrectly was that the string "img" in the recipient address list was interpreted by the third party application as the start of an HTML img tag. Upon seeing this string the application switched to a different parsing mode, which caused the email body to be interpreted and displayed incorrectly.
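To illustrate the class of bug (a purely hypothetical sketch - not the third party's actual code), imagine a renderer that decides a message is HTML by scanning the raw message, headers included, instead of checking the declared Content-Type:

```java
// Hypothetical reconstruction of the bug: the HTML check scans the whole raw
// message (headers included), so the string "img" in a Cc address like
// img@company.com flips the renderer into HTML mode, where line breaks vanish.
public class NaiveEmailRenderer {

    String render(String rawMessage, String body) {
        if (looksLikeHtml(rawMessage)) {
            // HTML mode: whitespace, including line breaks, is collapsed.
            return body.replaceAll("\\s+", " ");
        }
        // Plain-text mode: line breaks are preserved.
        return body;
    }

    private boolean looksLikeHtml(String rawMessage) {
        // The bug: matching tag names anywhere in the message, headers and all.
        return rawMessage.contains("img") || rawMessage.contains("<html");
    }
}
```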

I must admit I was shocked to hear that the root cause was such a silly bug. Fortunately the img@company.com address belonged to our client and not to another third party involved in the business process, so a workaround could be deployed quickly: the client set up a mailbox with a different name on their side, and we got rid of the img address. World saved...

Monday, 7 July 2014

Confitura 2014

Last Saturday I went to Confitura - the biggest free Java conference in Europe. With well over 1000 participants, it is an enormous effort for the community behind it. Let's see how it went.

Venue

The venue was the campus of the University of Warsaw at Krakowskie Przedmieście - the same place as last year. Generally fine: large rooms, good equipment, a lot of space outside to walk, relax and talk with colleagues. If the air conditioning had worked better and the corridors had been a bit more spacious, I would call it perfect. Food and drinks - present, and enough to survive a day full of talks. For those with bigger appetites there are many bars and restaurants nearby (hey, it's the centre of Warsaw after all).
Two interesting features were iBeacons, which were supposed to make navigation between conference rooms easier, and electronic devices near the exits for voting presentations up or down. I did not rely on iBeacons to find rooms, and I always forgot to vote when leaving - but maybe it's just me. Anyway, I would rather see an option to vote up or down in the mobile application prepared for the conference.

Before lunch

There were five parallel tracks, so I had to make some difficult choices. I picked "Disposable dev environments" by Marcin Brański and Przemek Hejman for the first slot. They presented tools that can be used for DevOps deployments, or simply for improving a developer's productivity when provisioning test environments would otherwise take ages. It was an interesting topic for me, as I am considering using these technologies on my current project and my knowledge in this area has so far been only theoretical.
They started by showing Packer - a tool for building machine images for multiple platforms. Other tools covered during the presentation were Puppet, Vagrant, Ansible, Chef and Salt, and - in a slightly different vein - an OS-level virtualization solution, Docker, with its useful companion Fig.
The guys even managed to run 3 or 4 scripts (Packer, Vagrant, Docker), but in my opinion this was the area that could actually be improved. Instead of simply showing a screen full of logs after script execution, it would have been more informative to show how the environments built, set up and started with Packer/Puppet/Docker can be put to actual use by running some simple applications inside them.
That would provide a better learning experience, even if the extra time required meant dropping some tools from the agenda. In other words - a bit more depth at the expense of breadth.
The guys were clearly enthusiastic about disposable environments - so enthusiastic that the talk felt rushed and a bit chaotic; I'm confident their next presentation will be slower paced and more orderly.

The next presentation I attended was about working with a difficult client ("Tajniki współpracy z (trudnym) klientem") by Kasia Mrowca. The presentation unfolded slowly, but ultimately provided quite a few interesting insights. Some that I noted:
  • Keeping systems small is important because it increases the odds of success. If the system is inherently large and complex, try focusing on no more than 3 key processes at a time. Showing statistics on how much more often small projects succeed compared to large ones can go a long way when talking to business owners.
  • Be careful when evaluating requirements originating from end users - they sometimes tend to copy the existing manual processes into new IT systems. You might end up reimplementing Excel functionality in a SAP system...
  • Screen mock-ups are very useful. Balsamiq Mockups was mentioned as a good tool (I tend to agree, as I'm happy to use it in my own work). Prepare more than one mock-up version to avoid getting your mind fixed on just one layout.
  • If preparing a mock-up takes more than a few hours - consider building a prototype instead.
  • If building a prototype would take months then the system is probably too complex - see point one.
  • Options presented to clients can be visualised along three dimensions: effort estimate, risk and business value.
  • Use cases or user stories can be presented using UML-style notation as well as comic strips. The notation does not matter - choose whichever will be most effective for the target audience.
  • Do not present too many options - it makes decision making more difficult.
  • And the most important one: do not teach business people how to run their business - they hate it (from my experience - I strongly agree).

Then I moved to another building to see Wojciech Seliga's talk about innovation. He started by pointing out some innovative practices employed by Atlassian (and Spartez) that are not related to products, like a simple sales model:
  • publicly available, low prices
  • no discounts
  • self-service
Atlassian is now worth $3.3 billion, so it seems these practices work quite well for them...
Later Wojciech argued that innovation cannot be effectively planned or produced - you simply cannot order someone to be innovative. Innovation cannot be brought in by processes and policies, and the traditional tools managers use in other areas - financial incentives, training, processes - are not effective when it comes to innovation.
What helps is creating an innovation-friendly environment where creative individuals can develop and work without being afraid of making mistakes. A zone of "safe failure", where the cost of failure is minimal, helps incubate innovative ideas.
Some ideas shared by Wojciech that work well for Atlassian are:
  • brown bags - knowledge-sharing sessions at lunchtime.
  • FedEx Days (renamed Ship It Days) - a 24-hour hackathon.
  • 20% of working time to be spent on side projects.
  • Ensuring that new joiners are heard - they challenge the status quo and bring a fresh view.
  • HackHouse - a week-long camp for graduate new joiners to have fun and code.
  • Company-wide transparency - all company information is available to everyone, except private or legally protected documents.
  • Easy access to information - regardless of which system it resides in.
  • Collecting feedback, especially from users.
And one piece of advice: do not get stuck in what he calls "plantations of IT workers" - companies working on projects so old-fashioned that nobody in the West wants to deal with them anymore.

After the first three sessions there was a lunch break. I missed the warm meal provided by the organizers, but it was not that bad - I grabbed a few sandwiches and enjoyed some interesting conversations.


After lunch

The first session I chose after lunch was by Jacek Laskowski, on how StackOverflow, GitHub and Twitter can be used by programmers for professional development.
The main point I remember is that being active on StackOverflow and GitHub helps you learn new things (as you get exposed to different problems than the ones you face in your own projects) and improves your reputation in the developer community.
There was also quite a bit of ego pumping during the presentation - not a big surprise given how expressive Jacek is as a person.
One more thing mentioned was Scalania - a learning project for people starting their Scala adventure: https://github.com/jaceklaskowski/scalania.

Then I moved to a session about NoSQL hosted by Marcin Karkocha. Most of the information was fairly standard stuff about the types of NoSQL databases and their relative advantages and disadvantages. The most valuable pieces for me were:
  • The difficulty of defining SQL and NoSQL databases. Some SQL databases do not support ACID transactions (e.g. the MyISAM engine for MySQL), while some NoSQL databases support the SQL query language even though they do not use the relational data model. "Not Only SQL" seems to be the best expansion of NoSQL.
  • Based on Marcin's practical experience, the following architecture works well in practice: MongoDB + PostgreSQL (user data and transactional data like payments) + Redis (cache).
References to real projects Marcin participated in added credibility to the presentation; on the other hand, there were mistakes in the theoretical parts, which left me with mixed feelings.
By the way - I wonder why MarkLogic is so rarely mentioned during presentations about NoSQL. It did not make it into Marcin's presentation, either.

I skipped the next session - I preferred to have spirited discussions with colleagues while sitting in a deck-chair with my face towards the sun (did I mention the weather was just wonderful on Saturday?).

The last presentation for me was about organic architecture, by Jarosław Pałka. I found it too focused on development practices and too little on actual architecture. I cannot see, for example, how removing unused code rather than commenting it out relates to IT architecture.
An interesting point was made about defining architecture as the process of moving a system from one state to another. It highlights the fact that defining an architecture takes time.
Another good point was about the right tools - the right tool is whatever helps you get to the target state, regardless of the current technology hype (microservices, DDD, NoSQL or whatever the next BPMN is).


Other interesting events

I had an interesting discussion with the chaps from Azul Systems at their stand. They sell Zing - a JVM implementation that makes GC pauses go away. It can be used wherever SLAs are strict and stop-the-world activities might lead to an SLA breach. Examples include a trading platform, a messaging hub, or even a human-facing system where the underlying heap is massive and a GC pause might take a few seconds, leading to a terrible user experience. Azul offers the free jHiccup tool to monitor application responsiveness and help diagnose problems that could be resolved with Zing.
Another interesting Azul product is ReadyNow. It precompiles classes that are normally used during program execution, avoiding delays due to JIT compilation at runtime. Since precompilation takes place at application startup, you get a slightly longer startup time but faster and more stable performance afterwards. A popular workaround for this JVM warm-up problem is to run a number of dummy requests through the system to give the JIT compiler a chance to kick in before live traffic is allowed in. It seems that ReadyNow can remove that nuisance.
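A minimal sketch of that hand-rolled warm-up pattern, assuming a hypothetical handleRequest hot path, could look like this:

```java
// Hypothetical warm-up phase: push dummy requests through the hot path so the
// JIT compiler has compiled it before live traffic arrives. ReadyNow aims to
// make this kind of ritual unnecessary.
public class WarmUp {

    private static final int WARM_UP_REQUESTS = 10_000;

    // Stand-in for the real request handler (for illustration only).
    static long handleRequest(long input) {
        return Long.rotateLeft(input * 31 + 17, 7);
    }

    public static void main(String[] args) {
        long checksum = 0;
        for (int i = 0; i < WARM_UP_REQUESTS; i++) {
            checksum += handleRequest(i); // dummy request, result discarded
        }
        // Only now would the system start accepting live traffic.
        System.out.println("Warm-up done, checksum=" + checksum);
    }
}
```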

Finally, the sentence of the conference: "It's better to ask forgiveness than permission" - heard twice on Saturday, once from Jacek Laskowski and once from Jarosław Pałka. In both cases it was in the context of improving a codebase.

Tuesday, 10 June 2014

Writing and testing software for data integrity

Last week I hosted a presentation under the auspices of JUG Łódź (which I happen to be a founding member of): "Writing and testing software for data integrity". Data integrity is a broad topic, so I touched on only a few chosen aspects. Things that made it into the presentation include:

  • Physical vs logical integrity
  • ACID (strong) consistency
  • BASE (weak) consistency
  • Synchronous vs asynchronous replication
  • Distributed systems limitations - CAP theorem
  • Examples of data consistency violations (like the Photo Privacy Violation or the Double Money Withdrawal described in the "Don't settle for eventual consistency" article)
  • Strong consistency comes with a performance penalty. Choosing performance and availability over consistency might be justified and lead to improved revenues (as is the case with Amazon), or it can lead to spectacular failures, as in the case of Flexcoin

Local vs distributed transactions

The second part of the presentation was slightly different, though. It included a live demonstration of a situation where local transactions are not sufficient to guarantee data consistency across multiple resources, and of how distributed transactions come to the rescue. The demonstration was based on the scenario below:

The application consumes messages from a JMS queue (TEST.QUEUE), stores the message content in a database, does some processing inside VeryUsefulBean and finally sends a message to another JMS queue (OUT.QUEUE).
The application itself was a web application deployed on JBoss EAP 6.2. JMS broker functionality was provided by ActiveMQ, and MySQL acted as the database. The web application logic was built with Spring and Apache Camel.
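In Camel's Java DSL the flow can be sketched roughly as below; the queue names and VeryUsefulBean come from the scenario above, while the repository bean and its wiring are assumptions:

```java
import org.apache.camel.builder.RouteBuilder;

// Sketch of the demo route: consume from TEST.QUEUE, persist the message,
// process it in VeryUsefulBean, then send the result to OUT.QUEUE.
public class MessageFlowRoute extends RouteBuilder {

    @Override
    public void configure() {
        from("jms:queue:TEST.QUEUE")
            .transacted()                             // join the ongoing local or XA transaction
            .to("bean:messageRepository?method=save") // store message content in the DB (assumed bean)
            .to("bean:veryUsefulBean")                // the processing step that is made to fail
            .to("jms:queue:OUT.QUEUE");
    }
}
```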
Now let's assume that the processing inside VeryUsefulBean fails. The exception was injected using a JBoss Byteman rule.
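A minimal rule along these lines does the job (a sketch - the class name, method name and exception message are assumptions):

```
RULE inject failure in VeryUsefulBean
CLASS VeryUsefulBean
METHOD process
AT ENTRY
IF true
DO throw new java.lang.RuntimeException("Simulated processing failure")
ENDRULE
```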

As expected, with local transactions the system state was inconsistent after the processing failure. The expected state would be:

  • Incoming message not lost
  • No data saved in database
  • Nothing sent to outbound queue (OUT.QUEUE).

Basically, one would expect the system state not to change as a result of the processing failure. However, the actual behaviour was:
  • Incoming message not lost
  • Data saved to the DB (as many times as the message was (re)delivered).
  • Nothing sent to outbound queue (OUT.QUEUE).
The reason for this behaviour was that the local transaction that saved data to the database was committed independently of the JMS message consumption transaction, leading to an inconsistent state.

Then the experiment was repeated with a JTA transaction manager and XA resources set up. The outcome was correct this time - no data was saved to the database. The JMS message consumption and all processing, including the database inserts, were handled as part of the same distributed transaction, and all changes were rolled back upon failure, as expected.

Automated integration test

The test proved that the application worked correctly with a JTA transaction manager and XA resources (an XA connection factory for JMS, an XA data source for JDBC). However, that test was manual and time-consuming. Ideally this behaviour would be tested automatically, and that was the topic of the final part of the presentation: a walkthrough of an integration test that verifies the transactional behaviour automatically.

First, the test cases were defined as JBehave scenarios.
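For example, the failure case could read roughly like this (a sketch mirroring the expected-state list above):

```
Scenario: processing failure leaves the system state unchanged

Given an empty database
When a message is sent to TEST.QUEUE
And processing inside VeryUsefulBean fails
Then the incoming message is not lost
And no data is saved in the database
And nothing is sent to OUT.QUEUE
```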
JUnit was used to execute the scenarios.
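A minimal JBehave-JUnit runner could look like the sketch below; the story location and the test context file name are assumptions, and SpringStepsFactory (from the jbehave-spring module) is used here to pick the step beans up from the Spring test context:

```java
import java.util.List;
import org.jbehave.core.configuration.Configuration;
import org.jbehave.core.configuration.MostUsefulConfiguration;
import org.jbehave.core.io.CodeLocations;
import org.jbehave.core.io.StoryFinder;
import org.jbehave.core.junit.JUnitStories;
import org.jbehave.core.steps.InjectableStepsFactory;
import org.jbehave.core.steps.spring.SpringStepsFactory;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

// JUnit entry point: JBehave finds the *.story files and binds them to step
// beans defined in the Spring test context.
public class DataIntegrityStories extends JUnitStories {

    @Override
    public Configuration configuration() {
        return new MostUsefulConfiguration();
    }

    @Override
    public InjectableStepsFactory stepsFactory() {
        ApplicationContext context = new ClassPathXmlApplicationContext(
                "spring-jmsconsumer-core.xml",
                "spring-jmsconsumer-infrastructure-test.xml"); // assumed test context file name
        return new SpringStepsFactory(configuration(), context);
    }

    @Override
    protected List<String> storyPaths() {
        return new StoryFinder().findPaths(
                CodeLocations.codeLocationFromClass(getClass()), "**/*.story", "");
    }
}
```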

The Spring context for the web application was split into two parts:
  • spring-jmsconsumer-core.xml - contained the beans with application logic and the definition of the Camel context.
  • spring-jmsconsumer-infrastructure.xml - contained the beans used to access external resources, like the JMS connection factory or the JDBC data source.
In order to execute the application logic in a fully controlled environment, the test had to be completely autonomous. This means that all external interfaces and infrastructure had to be recreated by the test harness:
  • ActiveMQ - replaced by embedded ActiveMQ.
  • Arjuna Transaction Manager provided by JBoss EAP - replaced by standalone Atomikos.
  • MySQL - replaced by embedded HSQLDB.
Both embedded ActiveMQ and HSQLDB support the XA protocol, so they could be used to verify the transactional behaviour.
While the core context could and had to be reused during test execution, the infrastructure context made sense only when the application was deployed on a real JEE server, as it retrieved the necessary resources from JNDI.

Therefore the infrastructure part of the context had to be rewritten for automated test execution.
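A trimmed-down version of the test infrastructure context could look like this (a sketch - apart from the amq-broker and hsqldbServer bean ids, the class choices and properties are assumptions):

```xml
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

    <!-- Embedded ActiveMQ broker, started for the duration of the test -->
    <bean id="amq-broker" class="org.apache.activemq.broker.BrokerService"
          init-method="start" destroy-method="stop">
        <property name="persistent" value="false"/>
    </bean>

    <!-- Embedded HSQLDB server (database name/path configuration omitted in this sketch) -->
    <bean id="hsqldbServer" class="org.hsqldb.server.Server"
          init-method="start" destroy-method="stop"/>

    <!-- Standalone Atomikos JTA transaction manager replacing Arjuna -->
    <bean id="atomikosTransactionManager" class="com.atomikos.icatch.jta.UserTransactionManager"
          init-method="init" destroy-method="close"/>
</beans>
```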


Note the beans with ids amq-broker and hsqldbServer - they are responsible for starting the embedded JMS broker and the DB server needed during test execution.

With the infrastructure in place, it is quite simple to implement the test steps defined in the JBehave scenarios, e.g.:
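For instance (a sketch - the templates, wiring and table name are assumptions; the class is assumed to be a bean in the test context so that SpringStepsFactory can find it):

```java
import org.jbehave.core.annotations.Then;
import org.jbehave.core.annotations.When;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jms.core.JmsTemplate;

import static org.junit.Assert.assertEquals;

// Step implementations: drive the input queue and assert on the database state.
public class TransactionalBehaviourSteps {

    private final JmsTemplate jmsTemplate;   // points at the embedded ActiveMQ broker
    private final JdbcTemplate jdbcTemplate; // points at the embedded HSQLDB data source

    public TransactionalBehaviourSteps(JmsTemplate jmsTemplate, JdbcTemplate jdbcTemplate) {
        this.jmsTemplate = jmsTemplate;
        this.jdbcTemplate = jdbcTemplate;
    }

    @When("a message is sent to TEST.QUEUE")
    public void aMessageIsSentToTestQueue() {
        jmsTemplate.convertAndSend("TEST.QUEUE", "test payload");
    }

    @Then("no data is saved in the database")
    public void noDataIsSavedInTheDatabase() {
        int rows = jdbcTemplate.queryForObject("select count(*) from messages", Integer.class);
        assertEquals(0, rows);
    }
}
```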

There are obviously a few other details that need to be worked out, but since this post has already grown too long, have a look at the complete code on GitHub: https://github.com/mstrejczek/dataintegrity.

Monday, 9 June 2014

Review of Scalar - Scala Conference

This is a long, long overdue review of the Scalar conference that took place on 5 April 2014 in Warsaw. It's been over two months since the event, so I've forgotten a lot - this post is based mostly on the notes I took during the event. Unfortunately, I lost most of those notes when transferring them between devices...

About the event

Scalar was a one-day conference dedicated to Scala, organized by SoftwareMill - a company whose people are well known in the Java/Scala community. It was a free event, and the venue was the National Library - a lovely location next to Pole Mokotowskie park. There was no breakfast served at all - not great, but acceptable given it was a free event. A warning on the website to have a proper breakfast in the morning would have been good, though; it was difficult to stay focused during the first presentation while in starvation mode. The situation was saved to some extent by cookies, which were available in large quantities, and I appreciated that 100% juice was available instead of sugar-enriched nectars or cola. There was one more issue with logistics - the lobby with the cookies and drinks was too small and therefore incredibly crowded during the first breaks. Later the situation improved as more and more people went outside to enjoy the lovely weather.

Content

There was only one track, so I didn't have to make difficult choices. I missed a few sessions, though - for personal reasons I could not attend all of them or stay focused at all times. The first session was about "Database Access with Slick". Slick is a functional-relational mapping library offering a direct mapping between SQL and Scala code with little impedance mismatch. In the SQL world, extensions like PL/SQL are proprietary and take an ugly imperative form; with Slick one can enrich queries using Scala code. There is free support for open-source DBs, while commercial extensions are needed to use Slick with proprietary DBs like Oracle. I remember a complex type system and operators presented during the session - I didn't find that aspect of Slick too appealing. Also, support for the upsert operation was not available yet. Summing up: the presentation was really solid, but I was not impressed by the Slick library itself. One more note - you can try Slick out on Typesafe Activator.

The second session covered "ReactJS and Scala.js". ReactJS is a JavaScript library for building UIs, developed by Facebook and used by Instagram. Scala.js allows developing code in Scala and compiling it to JavaScript. The presentation included an extensive walk through the code of a chat application (Play on the backend, ReactJS + Scala.js on the frontend). What I noted was that Scala collections work nicely in a web browser. The problem is that the JavaScript implementation of the Scala standard library is heavy (20 MB), but it can be stripped down to include only the necessary bits, e.g. around 200 kB. Another issue was that conversions to and from JavaScript types were a pain - especially acute if rapid prototyping is what you're after. Another conclusion: ReactJS works great with immutable data. Regarding the Scala-to-JavaScript compiler: normal compilation is fast but produces huge files, while Closure compilation is slow but produces small output files. From this point on I'm writing from memory only, as my surviving notes covered only the first two sessions.

The next session was "Simple, Fast & Agile REST with Spray" by Adam Warski from SoftwareMill. The presentation was one of the better ones: well-paced, informative and with really smooth live coding. Live coding, if done well, makes presentations more interesting in my opinion. Not only was the presentation itself good, but so was the subject - with Spray you really can expose a RESTful interface (including starting up an HTTP server) in a few lines of code. Definitely recommended if you are looking for a Scala-based REST framework.

After that came "Doing Crazy Algebra With Scala Types". Some interesting analogies were drawn between Scala types and mathematical formulae. That was probably the first time I had seen a Taylor series since I left university... Mildly amusing - I found it a curiosity but could not identify practical uses.

The last session before lunch was "Scaling with Akka", which involved a demo of an Akka cluster running on a few Raspberry Pi nodes. I must admit I don't remember much from this session apart from the fact that the Akka cluster did indeed work and the Raspberry Pis were painfully slow.

The first session after lunch was devoted to Scala weaknesses and peculiarities: "The Dark Side of Scala" by Tomasz Nurkiewicz. It was a good, fast-paced presentation, and it showed some interesting kinks that did not merely repeat those found in other well-known presentations (e.g. the famous "We're Doing It All Wrong" by Paul Phillips).

The next session was a solid introduction to event sourcing: "Event Sourcing with Akka-Persistence" by Konrad Malawski.

I couldn't focus fully during the next three presentations, so I won't write about them. The last one I remember was "Lambda implementation in Scala 2.11 and Java 8". It compared the bytecode generated for a Java 8 lambda expression with the bytecode generated by Scala, and included an excellent explanation of how invokedynamic works in Java 8. Java 8 uses invokedynamic to call the lambda expression code, while Scala generates an additional anonymous class and invokes a virtual method on it. The Java 8 bytecode looks much more concise, although the performance is not necessarily better, as invokedynamic in Java 8 leads to the generation of an anonymous class at runtime. So effectively an anonymous class is used anyway - with Scala it is generated at compile time, with Java 8 at runtime. Currently the main benefit is therefore the smaller size of Java 8 jar files compared to Scala-generated ones. However, if in Java 9 or Java 10 the anonymous class generation gets optimized away entirely, invokedynamic will get a significant runtime performance boost - without any need to touch the source code! Scala is going to migrate to invokedynamic in the future.
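To make the comparison concrete, consider the snippet below. Compiling it with javac 8 and inspecting it with javap -c shows an invokedynamic instruction whose bootstrap method is LambdaMetafactory.metafactory, while the equivalent Scala 2.11 function literal compiles to a separate anonymous class implementing the function type:

```java
import java.util.function.IntUnaryOperator;

public class LambdaDemo {
    public static void main(String[] args) {
        // javac emits invokedynamic here; the class implementing
        // IntUnaryOperator is generated at runtime by LambdaMetafactory.
        IntUnaryOperator inc = x -> x + 1;
        System.out.println(inc.applyAsInt(41)); // prints 42
    }
}
```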

There was one more session at the end, but I missed it. Summing up - the conference was all right. The sessions were short, which made room for many different topics. The level of the presentations was satisfactory on average - not every talk was perfect and interesting, but for the first edition of a free conference - well done.