Format date in Batch yyyyMMdd

The usual approach to build a timestamp for a log files in a batch is to use standard Windows utilities date and time. However the format of the output depends on locale and it is almost not possible to make the script which runs on any machine. A solution might be to create a small java […]

Removing time part of a date fails in Kettle

Might be an interesting info. I encountered an error running a Pentaho Data Integration transformation. After simplification I got this: Generate Row creates one row with the date field DATETIME=1981-01-01. Calculator calculates field DATE using operation "Remove time from a date". And this trivial transformation failed with a weird error: Unexpected error : java.lang.IllegalArgumentException: MINUTE […]

How to Obfuscate Password for Command Line Kettle

Pentaho Kettle does not provide possibility to obfuscate password in batch files (though the connection passwords are obfuscated in XML files). You can write: kitchen.bat /rep repos /job test /user admin /pass admin but this will not work: kitchen.bat /rep repos /job test /user admin /pass "Encrypted 2be98afc86aa7f2e4cb79ce71da9fa6d4" Here "Encrypted 2be98afc86aa7f2e4cb79ce71da9fa6d4" is encrypted version of "admin". You can get […]

Creating Aggregate Tables

This post demonstrates an example how to create aggregation tables in Kettle. Let's assume the original transformation loading delta is the following: It reads data from source database, makes some transformations and loads data into data warehouse. The source table JobEntry has two grouping fields project_id and employee_id and one additive measure registered_hrs. The task […]

Use of Index and Characterset

This post demonstrates influence of characterset on use of indexes. See also Language and Use of Indexes in Oracle /* DROP TABLE t; ALTER SESSION SET NLS_LANGUAGE='AMERICAN'; */ SELECT PARAMETER, VALUE FROM V$NLS_PARAMETERS; CREATE TABLE t (x VARCHAR2(255)); INSERT INTO t VALUES ('a'); INSERT INTO t VALUES ('b'); INSERT INTO t VALUES ('c'); INSERT INTO […]

Language and Use of Indexes in Oracle

Symptoms It takes a very long time to load a data warehouse on Oracle database with Kettle when the language in the Windows Region and Language settings is different from English (United States). Cause Oracle JDBC driver used by Kettle issues command ALTER SESSION SET NLS_LANGUAGE and changes default database language. Changing NLS_LANGUAGE also changes […]

Io exception: Socket read timed out

If you get the following message, check the firewall settings. ERROR 30-09 09:13:36,485 – YOUR_LOG_CONNECTION – Error disconnecting from database: Error comitting connection Io exception: Socket read timed out When the transformation/job is started, Kettle opens a connection for logging. The transformation could run for hours, the connection for logging will stay idle all this […]

Influence of Nr of rows in rowset on Merge Join

Let's consider a transformation that merges two flows: Here are some experiments with different Nr of rows in rowset: So the speed of the Merge join depends on the parameter Nr of rows in rowset. It should be reasonably high (3000K). Note that if the parameter is too high, the transformation might fail with exception: java.lang.OutOfMemoryError: […]