Here’s a quick tip for those working with Talend to build a job that performs calculations on a set of network data. Say your calculations rely on jitter, but you are not given any values; you need to create them yourself, and they have to be realistic and follow a known pattern. Although Talend comes pre-populated with a lot of useful routines, none of them creates values with a normal distribution. So in this blog, I will show you how to quickly create a custom routine that does just that.
To create this custom routine follow the steps below:
- Open Talend
- Expand the Code tab under the Repository window
- Right-click “Routines” and select create folder
- Name your folder anything you like, for example “custom”
- Right-click the new folder you created and select “Create routine”
- Name your routine “customRoutine”
- An auto-generated file will open, containing the class customRoutine you just created
- Remove example routines that might have been created
- Add the code below:
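    // Imports go at the top of the routine file, below the package declaration
    import java.math.BigDecimal;
    import java.math.RoundingMode;
    import java.util.Random;

    /**
     * @return a normally distributed value rounded to a defined scale
     * @param double(10.0): standard deviation
     * @param double(30.0): average value
     * @param int(1): scale (number of decimal places)
     * @example normDistValue(10.0, 30.0, 1) # A double with one decimal place; about 68% of values fall within 30.0 +/- 10.0
     */
    public static double normDistValue(double sd, double avg, int scl) {
        Random r = new Random();
        // nextGaussian() has mean 0 and standard deviation 1, so scale by sd and shift by avg
        BigDecimal bd = new BigDecimal(r.nextGaussian() * sd + avg);
        return bd.setScale(scl, RoundingMode.HALF_UP).doubleValue();
    }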
- Save and close
That’s it. The new routine is created. Its comment block explains what it does and how to use it, but as an overview: you can now call the routine and specify the standard deviation (how far values should spread around the average), the average value itself, and the decimal precision you want, with a scale of 0 if you want no decimal places at all.
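If you want to convince yourself that the values really behave as described, a quick stand-alone check outside Talend can help. The sketch below is only an illustration: the class name NormDistCheck and the two helper methods are made up for this example, and the routine's logic is copied in (using RoundingMode) so the file compiles on its own. It draws a large number of samples and reports the sample mean and the share of values within one standard deviation of the average, which for a normal distribution should come out at roughly 68%.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Random;

// Hypothetical stand-alone check, runnable outside Talend (e.g. `java NormDistCheck.java`).
class NormDistCheck {

    // Copy of the routine's logic so this file is self-contained.
    static double normDistValue(double sd, double avg, int scl) {
        Random r = new Random();
        BigDecimal bd = new BigDecimal(r.nextGaussian() * sd + avg);
        return bd.setScale(scl, RoundingMode.HALF_UP).doubleValue();
    }

    // Arithmetic mean of n generated values.
    static double sampleMean(int n, double sd, double avg, int scl) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += normDistValue(sd, avg, scl);
        }
        return sum / n;
    }

    // Fraction of n generated values that land within one standard deviation of avg.
    static double shareWithinOneSd(int n, double sd, double avg, int scl) {
        int hits = 0;
        for (int i = 0; i < n; i++) {
            double v = normDistValue(sd, avg, scl);
            if (Math.abs(v - avg) <= sd) {
                hits++;
            }
        }
        return (double) hits / n;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.printf("sample mean: %.2f%n", sampleMean(n, 10.0, 30.0, 1));
        System.out.printf("share within one sd: %.3f%n", shareWithinOneSd(n, 10.0, 30.0, 1));
    }
}
```

The exact numbers change on every run because the generator is not seeded, but the mean should sit close to 30 and the share close to 0.68. Some values will fall well outside 30.0 +/- 10.0; that is expected for a normal distribution.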
Note: There are two ways you can go about this. Either use the standard nextGaussian() method of java.util.Random, or use the Apache Commons Mathematics Library, which contains an org.apache.commons.math3.distribution.NormalDistribution class. The example shown above uses the former.
To conclude, by calling this routine from a tRowGenerator or tMap component you can now generate or add random values that fall within a specific distribution, thus following a desired or known pattern.
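Inside Talend, the column expression would typically be something like customRoutine.normDistValue(10.0, 30.0, 1); I am assuming here that the routine is reachable under its class name, so check the expression builder in your own version. To picture what the job then produces, here is a rough plain-Java sketch of a row generator. It is not Talend's generated code: the Row class, the column names id and jitterMs (including the millisecond unit) and the generateRows helper are all hypothetical, and the routine's logic is copied in so the file runs on its own.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration of rows a generator component might emit.
class JitterRowsSketch {

    // Assumed row shape: a sequence number plus a generated jitter value.
    static final class Row {
        final int id;
        final double jitterMs;

        Row(int id, double jitterMs) {
            this.id = id;
            this.jitterMs = jitterMs;
        }
    }

    // Copy of the routine's logic so this file is self-contained.
    static double normDistValue(double sd, double avg, int scl) {
        Random r = new Random();
        BigDecimal bd = new BigDecimal(r.nextGaussian() * sd + avg);
        return bd.setScale(scl, RoundingMode.HALF_UP).doubleValue();
    }

    // Builds `count` rows, numbering them from 1 and filling the jitter column from the routine.
    static List<Row> generateRows(int count, double sd, double avg, int scl) {
        List<Row> rows = new ArrayList<>();
        for (int i = 1; i <= count; i++) {
            rows.add(new Row(i, normDistValue(sd, avg, scl)));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Print five rows as "id;jitterMs"; the jitter values differ on every run.
        for (Row row : generateRows(5, 10.0, 30.0, 1)) {
            System.out.println(row.id + ";" + row.jitterMs);
        }
    }
}
```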
I hope you enjoyed this article. If you’re curious for more, I recommend subscribing to the Datalytyx blog in order to receive a monthly update on recently released articles.