Overview
Talend recently announced the Beta of Talend Open Studio 6.0. The new version has been in development for quite a while now and we have seen 5 milestone releases so far. The release of Beta means that the final release is getting closer. This is an important update for Talend and you should get excited about it. In this blog we will take a look at some of the changes that version 6.0 brings. We will also perform a benchmark of Talend v6 running on Java 8 versus Talend 5.6.2 with Java 7 and see if there are any performance changes.
New Features
Let’s begin by looking at some of the new features Talend has introduced in this release.
Java 8 Support
Full Java 8 support is here. While some might think this is not that significant, you immediately get the following benefits:
- Tiered compilation, which was introduced in JDK 7, has been enabled by default and brings server VM start up speeds close to or on par with the client VM;
- Further improvements to multi-core CPU performance (parallelisation and multi-threading speed improvements);
- Code execution performance improvements (speed improvements will depend on the type of task).
Those are just some of the benefits of Java 8. For the full list of new features introduced in Java 8, please refer to the following link: ‘What’s New in JDK 8’.
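As a rough illustration of the multi-core point, the sketch below uses the Stream API that arrived with Java 8 to run the same calculation sequentially and in parallel. It also prints the runtime details so you can confirm which JVM you are actually on. The class and method names (`Java8Check`, `sumOfSquaresSequential`, `sumOfSquaresParallel`) are made up for this example and have nothing to do with the code Talend generates. Whether the parallel version is actually faster depends on the workload and the hardware.

```java
import java.util.stream.LongStream;

class Java8Check {

    // Sum of squares 1..n computed on the calling thread.
    static long sumOfSquaresSequential(long n) {
        return LongStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    // Same computation, split across the common fork/join pool.
    static long sumOfSquaresParallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        // Standard system properties: handy for confirming the runtime in use.
        System.out.println("Java version: " + System.getProperty("java.version"));
        System.out.println("JVM: " + System.getProperty("java.vm.name"));
        System.out.println("Available cores: " + Runtime.getRuntime().availableProcessors());

        long n = 1_000_000L;
        System.out.println("Sequential: " + sumOfSquaresSequential(n));
        System.out.println("Parallel:   " + sumOfSquaresParallel(n));
    }
}
```

Both methods should print the same total. Only the way the work is scheduled differs.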
There is one more important thing to say about Java 8 support: client security. Java 7 is on its way out as Java 8 adoption grows, and Oracle has ended its public updates for version 7. Companies that require up-to-date software will not allow Talend to run on anything but Java 8, which was impossible until now. With Talend 6.0 you can run Talend in those environments too, which means you can use it in places where it was previously rejected because it depended on a Java version that is no longer supported.
New Components
With version 6.0, Talend has focused on bug smashing and on updating the existing vast component set with new drivers and functionality. Still, a few new components have been introduced, including:
- Microsoft HDInsight support: HDInsight is, in essence, Microsoft's Hadoop offering, based on the Hortonworks Data Platform;
- MariaDB support: MariaDB is a fork of MySQL focused on performance and on remaining open source;
- MemSQL support: MemSQL is a high-performance in-memory database created by former Facebook employees.
Existing Component Updates
As I’ve mentioned, the existing components received a lot of attention in this version and here are just some of the new features added to them:
Component Name | Details |
---|---|
Vertica connectors | Added SQL Templates support |
Sqoop connectors | New Teradata options |
Hive and Pig connectors | S3 connection: this is a big one, as it allows the Amazon Big Data offering to be used with the Amazon S3 file system instead of the default HDFS, directly from the components. This could cut costs in particular business cases when choosing a file store |
PostgreSQL connectors | Latest PostgreSQL 9.4 support |
Big Data connectors | Latest Cloudera CDH 5.4 support, Updated Hortonworks and MapR |
Eclipse
Since Java 8 support has been introduced in the new Talend version, the Eclipse platform on which the tool is built had to be updated. The new version comes with Eclipse 4.4 (Luna). To find out more about the new features of Eclipse 4.4, please refer to the following link: What’s New in Luna?
The most noticeable new features of the new Eclipse version are:
- Java 8 Support;
- More responsive UI;
- Ubuntu menu integration (this will bring much better support on Ubuntu systems and hopefully solve the horrendous font rendering issues);
- SWT Browser now supports XULRunner 24.x (finally you no longer need to scour the web for the appropriate, several-year-old XULRunner versions for Talend).
Note: XULRunner is what makes internet browsing possible in Eclipse. Eclipse 4.4 also lets you set the GTK+ version it uses via the launcher, so if you are a CentOS user, or on any other Linux distro that runs on GTK+, you should also get somewhat better GUI integration.
UI
Talend has made a big effort to update the tool's interface. This is not necessarily a big selling point for some, but it is nevertheless much more pleasant to work with. The interface is in line with the new Talend Cloud offering, so both tools now have a similar look and feel.
The most noticeable changes are:
- Redesigned menu icons;
- Redesigned component icons;
- A brand new way to link components together;
- A new palette (finally something that can be used).
UI across multiple platforms
Fedora 22
Ubuntu 15.04
Windows 8.1
As you can see, by default the difference is minimal. Mostly it’s font types, font sizes and some colouring differences which are all dependent on the system and not the tool itself. However I did notice an issue with the UI on Fedora 22, which by default runs on top of a Gnome Shell; the minimise and maximise tabs within Talend are using incorrect icons that do not work correctly.
Also, switching to the GTK theme is not an option on Gnome, as the applied colour palette makes most of the tool unusable due to white text on a white background. This is clearly still a work in progress, and Talend might fix the UI rendering issues by the final release.
Benchmarks
This section compares Talend load times and job execution times between two setups: Talend Open Studio Data Integration v5.6.2 (the latest stable release) running on Java 7, and Talend Open Studio Data Integration v6.0 Beta running on Java 8. All tests are run on the same machine. For system specifications, please refer to Appendix A.
Load Times
The following test showcases Talend Open Studio load times. Both versions have the same mandatory components installed. The test has been performed 5 times for each version to get a thorough result set.
Based on Task Manager memory usage information, Talend 6.0 uses more memory from the start, as can be seen in the table below: 625 MB versus 460 MB, or roughly 36% more. However, during execution, Talend 6.0 used less memory on average for the same task.
Talend Version | Memory Usage |
---|---|
Talend 6.0 Beta | 625 MB |
Talend 5.6.2 | 460 MB |
Job-Run Times
The following tests are going to use the Talend Demo jobs that come pre-packaged with Talend. All the jobs will run several times to get a thorough sample of data.
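For anyone who wants to repeat this kind of measurement, here is a minimal timing sketch. It is not the harness used for this article. The `RunTimer` class, its methods and the placeholder workload are all assumptions made for illustration, and in practice the `Runnable` would be replaced with whatever launches the job under test.

```java
import java.util.Arrays;

class RunTimer {

    // Runs the task `runs` times and returns each elapsed time in milliseconds.
    static double[] timeRuns(Runnable task, int runs) {
        double[] millis = new double[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            millis[i] = (System.nanoTime() - start) / 1_000_000.0;
        }
        return millis;
    }

    // Arithmetic mean of the samples (NaN for an empty array).
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Placeholder workload (hypothetical); swap in the job you want to measure.
        Runnable placeholderJob = () -> {
            long acc = 0;
            for (int i = 0; i < 5_000_000; i++) {
                acc += i % 7;
            }
            if (acc < 0) {
                System.out.println(acc); // never true; keeps the loop result in use
            }
        };

        double[] samples = timeRuns(placeholderJob, 5);
        double min = Arrays.stream(samples).min().orElse(Double.NaN);
        double max = Arrays.stream(samples).max().orElse(Double.NaN);
        System.out.printf("runs=%d min=%.2f ms mean=%.2f ms max=%.2f ms%n",
                samples.length, min, mean(samples), max);
    }
}
```

Reporting the minimum and maximum alongside the mean makes it easier to spot a single slow run, such as a cold JVM start, skewing the average.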
Conclusion
Purely from a performance standpoint, there is not much gain right now. Improvements to job execution were not part of the release, and the positive changes observed can most likely be attributed to Java 8 and Eclipse 4.4. Then again, any improvement in speed, intentional or simply due to new software support, is a positive in my book.
More importantly, Talend has completely revised the existing software UI and updated the existing components to support the latest libraries and distribution versions. This may be critical for developers who work with cutting-edge software or with products whose rapid release cycle they want to keep up with.
That said, there are still areas Talend needs to improve on. Schema handling within the tool, settings propagation and a custom component builder are all on my wish list. And by the look of this Beta release, they are already being worked on, which is something to look forward to.
It should be noted that there are some new features and components in Talend 6.0 that we have not yet tested or reviewed. Sit tight for Part 2!