This Talend component builds hash keys from various configurable columns. It is designed to support hash key generation for Data Vault scenarios.

cimt-ag/talendcomp_tHashRow

 
 

Talend component tHashRow

This component creates a hash value. All configured input columns are cleansed according to the given rules and concatenated in schema order, separated by the given delimiter. The resulting string is used to calculate the hash value.
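The flow described above can be sketched roughly as follows. This is a minimal illustration, not the component's actual code: the class name `HashRowSketch`, the method `hash`, and the choice of UTF-8 as the string encoding are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical sketch: join already-cleansed values and hash the result.
class HashRowSketch {

    // Joins the values in schema order with the delimiter and returns the
    // hash of the resulting string as lower-case hex.
    static String hash(List<String> values, String delimiter, String algorithm) {
        String hashInput = String.join(delimiter, values);
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance(algorithm); // e.g. "MD5", "SHA-256"
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown hash type: " + algorithm, e);
        }
        // Assumption: the hash input is encoded as UTF-8 before hashing.
        byte[] raw = digest.digest(hashInput.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : raw) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        System.out.println(hash(List.of("CUSTOMER A", "1234", "STREET 1"), ";", "MD5"));
    }
}
```

Because the values are joined in schema order, reordering the columns changes the hash input and therefore the hash value.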

Basic Settings

Basic configuration

Hash type

Algorithm that will be used to generate the hash

Hash output

Column of the output schema in which the hash value will be written

Hash input manipulation

Relevant fields

  • Column: list of input columns
  • Use: check if the column should be added to the hash
  • Trim: check if the column should be trimmed
  • Case Sensitive: select whether the column is converted to upper case, converted to lower case, kept case sensitive, or not in use (e.g. for numeric values)
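As an illustration of the per-column cleansing options, a sketch along these lines applies trimming and case conversion to a single value. The names `CleanseSketch`, `CaseMode`, and `cleanse` are hypothetical, and the order of the steps (trim first, then case conversion) is an assumption.

```java
import java.util.Locale;

// Hypothetical sketch of the Trim and Case Sensitive options for one column value.
class CleanseSketch {

    enum CaseMode { UPPER, LOWER, CASE_SENSITIVE }

    static String cleanse(String value, boolean trim, CaseMode mode) {
        if (value == null) {
            return null; // null handling is covered by the null replacement setting
        }
        String result = trim ? value.trim() : value;
        switch (mode) {
            case UPPER:
                return result.toUpperCase(Locale.ROOT);
            case LOWER:
                return result.toLowerCase(Locale.ROOT);
            default:
                return result; // case sensitive: keep the value as it is
        }
    }

    public static void main(String[] args) {
        System.out.println(cleanse("  Customer A ", true, CaseMode.UPPER));
    }
}
```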

Delimiter

Delimiter to separate the input values

Null replacement

Value that will be used to calculate the hash, if input value is null

  • Example if replacement value is set to "#NULL#"
    • COLUMN_1 = "Test"
    • COLUMN_2 = null
    • COLUMN_3 = 123
    • Hash Input results in "Test";#NULL#;123
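The example above could be reproduced with a sketch like the following. It assumes string quoting is enabled with `"` as the quotation mark (as the quoted "Test" in the example suggests) and that the replacement value itself is not quoted; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: build the hash input with null replacement.
class NullReplacementSketch {

    static String buildHashInput(List<Object> values, String delimiter, String nullReplacement) {
        List<String> parts = new ArrayList<>();
        for (Object value : values) {
            if (value == null) {
                parts.add(nullReplacement);      // replacement is inserted as-is
            } else if (value instanceof String) {
                parts.add("\"" + value + "\"");  // assumption: string quoting enabled
            } else {
                parts.add(String.valueOf(value));
            }
        }
        return String.join(delimiter, parts);
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList("Test", null, 123);
        System.out.println(buildHashInput(row, ";", "#NULL#")); // "Test";#NULL#;123
    }
}
```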

Fraction size (float)

Maximum precision of float values

Fraction size (double)

Maximum precision of double values

Number format

List of available number formats. Grouping is generally disabled.
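To illustrate how a fraction size and a number format with disabled grouping might interact, here is a sketch based on `java.text.NumberFormat`. Whether the component uses this class internally is an assumption, and `NumberFormatSketch` and `format` are hypothetical names.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical sketch: limit fraction digits and disable grouping.
class NumberFormatSketch {

    static String format(double value, int maxFractionDigits, Locale locale) {
        NumberFormat numberFormat = NumberFormat.getNumberInstance(locale);
        numberFormat.setGroupingUsed(false);                 // no thousands separators
        numberFormat.setMaximumFractionDigits(maxFractionDigits);
        return numberFormat.format(value);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.56789, 3, Locale.ENGLISH)); // 1234.568
        System.out.println(format(1234.56789, 3, Locale.GERMAN));  // 1234,568
    }
}
```

Fixing the number format and fraction size keeps the string representation of a number, and therefore the hash, independent of the machine's default locale.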

Date format

Format date as milliseconds

  • if checked
    • all date fields will be represented as milliseconds since the Unix epoch
  • if unchecked
    • all date fields will be represented in the given date format
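The two date representations can be sketched as below. The use of `SimpleDateFormat`, the fixed UTC time zone, and the names `DateFormatSketch` and `format` are assumptions for the example, not confirmed details of the component.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch of the two date representations.
class DateFormatSketch {

    static String format(Date value, boolean asMilliseconds, String pattern) {
        if (asMilliseconds) {
            return Long.toString(value.getTime()); // milliseconds since the Unix epoch
        }
        SimpleDateFormat dateFormat = new SimpleDateFormat(pattern, Locale.ROOT);
        // Assumption: a fixed time zone so the text does not depend on the host.
        dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
        return dateFormat.format(value);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L);
        System.out.println(format(epoch, true, "yyyy-MM-dd HH:mm:ss"));  // 0
        System.out.println(format(epoch, false, "yyyy-MM-dd HH:mm:ss")); // 1970-01-01 00:00:00
    }
}
```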

Enable string quoting

String-based fields will be surrounded with the given quotation mark. Quotation characters that already exist in the text are escaped by doubling them.

  • Example without quoting
    • Hash Input = CUSTOMER "A";1234;STREET 1;;;
    • results in "CUSTOMER ""A""";1234;STREET 1
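The quoting rule for a single string value can be sketched like this. `QuotingSketch` and `quote` are hypothetical names; the sketch covers only the surrounding and doubling of quotation characters.

```java
// Hypothetical sketch: surround a string with the quotation mark and
// double any quotation characters already contained in it.
class QuotingSketch {

    static String quote(String value, char quoteChar) {
        String q = String.valueOf(quoteChar);
        return q + value.replace(q, q + q) + q;
    }

    public static void main(String[] args) {
        System.out.println(quote("CUSTOMER \"A\"", '"')); // "CUSTOMER ""A"""
    }
}
```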

Cut off empty trailing hash input values

If checked, all empty trailing values are removed before the hash is calculated

  • Example without quoting
    • Hash Input = CUSTOMER A;1234;STREET 1;;;
    • results in CUSTOMER A;1234;STREET 1
  • Example with quoting
    • Hash Input = "CUSTOMER A";1234;"STREET 1";"";;""
    • results also in CUSTOMER A;1234;STREET 1
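For the unquoted case, trimming empty trailing values might look like the following sketch. It works on the list of values before they are joined and assumes null values have already been turned into empty strings; the handling of quoted empty values from the second example is not covered. All names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch: drop empty trailing values, then join the rest.
class TrailingValuesSketch {

    static String joinWithoutEmptyTrailing(List<String> values, String delimiter) {
        int end = values.size();
        // Walk backwards until the last non-empty value is found.
        while (end > 0 && values.get(end - 1).isEmpty()) {
            end--;
        }
        return String.join(delimiter, values.subList(0, end));
    }

    public static void main(String[] args) {
        List<String> row = List.of("CUSTOMER A", "1234", "STREET 1", "", "", "");
        System.out.println(joinWithoutEmptyTrailing(row, ";")); // CUSTOMER A;1234;STREET 1
    }
}
```

Empty values in the middle of the row are kept, so only the tail of the hash input changes.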

Advanced Settings

Hash output manipulation

Modify hash output

If all input values are null, the hash value will be replaced with the given value

  • Example 1
    • checked and value is set to "22222222222222222222222222222222"
    • Hash input = ;;;;;
    • Hash value = "22222222222222222222222222222222"
  • Example 2
    • unchecked
    • Hash input = ;;;;;
    • Hash value = 8f0158355357e8302939ea687dba9363
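The decision between the two examples can be sketched as follows. The hash calculation itself is passed in as a function so the sketch stays independent of the algorithm; `AllNullSketch`, `hashOrReplacement`, and the parameter names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch: replace the hash when every input value is null.
class AllNullSketch {

    static String hashOrReplacement(List<Object> values,
                                    boolean modifyHashOutput,
                                    String replacement,
                                    Function<List<Object>, String> hashFunction) {
        boolean allNull = values.stream().allMatch(Objects::isNull);
        if (modifyHashOutput && allNull) {
            return replacement;             // option checked and all inputs are null
        }
        return hashFunction.apply(values);  // otherwise use the regular hash
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.<Object>asList(null, null, null);
        String fixed = "22222222222222222222222222222222";
        System.out.println(hashOrReplacement(row, true, fixed, v -> "regular-hash"));
    }
}
```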

Additional settings

Show hash input

If checked, the hash input (the concatenation of all input values) is written to the selected column

Talend component tMultiHashRow

Basic Settings

Basic configuration

Hash type

Algorithm that will be used to generate the hash

Hash input configuration

To allow several hash configurations in one component, the column configuration known from tHashRow was replaced with a text input. For simplicity, a key-value pair was chosen, with the target column as the key and the source column as the value. If several columns are used for one hash, they are given as a comma-separated list. The familiar options for trimming a column and for handling upper and lower case are set with a definition in square brackets that follows the column. Individual hash configurations are separated by ; .

Shortcut: Description
  • T: Check if column should be trimmed
  • U: Select if column should be converted to upper case
  • L: Select if column should be converted to lower case
  • C: Use case sensitive values (default)

Example:

TARGETCOL_HASH_1=SOURCECOL_1, SOURCECOL_2;
TARGETCOL_HASH_2=SOURCECOL_2, SOURCECOL_1, SOURCECOL_3;
...
TARGETCOL_HASH_n=SOURCECOL_x, SOURCECOL_2, SOURCECOL_9, SOURCECOL_1;
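A configuration text of this shape could be parsed along the lines of the sketch below. It handles only the `TARGET=SOURCE, SOURCE;` form shown in the example; the square-bracket shortcuts are deliberately not parsed, because their exact syntax is not shown here. `MultiHashConfigSketch` and `parse` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: parse "TARGET=SRC_1, SRC_2; ..." into an ordered map.
class MultiHashConfigSketch {

    static Map<String, List<String>> parse(String config) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String entry : config.split(";")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate the trailing ';' and blank lines
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + trimmed);
            }
            String target = trimmed.substring(0, eq).trim();
            List<String> sources = new ArrayList<>();
            for (String column : trimmed.substring(eq + 1).split(",")) {
                String name = column.trim();
                if (!name.isEmpty()) {
                    sources.add(name); // source order is kept for the hash input
                }
            }
            result.put(target, sources);
        }
        return result;
    }

    public static void main(String[] args) {
        String config = "TARGETCOL_HASH_1=SOURCECOL_1, SOURCECOL_2;\n"
                + "TARGETCOL_HASH_2=SOURCECOL_2, SOURCECOL_1, SOURCECOL_3;";
        System.out.println(parse(config));
    }
}
```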

Delimiter

Delimiter to separate the input values

Null replacement

Value that will be used to calculate the hash, if input value is null

  • Example if replacement value is set to "#NULL#"
    • COLUMN_1 = "Test"
    • COLUMN_2 = null
    • COLUMN_3 = 123
    • Hash Input results in "Test";#NULL#;123

Fraction size (float)

Maximum precision of float values

Fraction size (double)

Maximum precision of double values

Number format

List of available number formats. Grouping is generally disabled.

Date format

Format date as milliseconds

  • if checked
    • all date fields will be represented as milliseconds since the Unix epoch
  • if unchecked
    • all date fields will be represented in the given date format

Enable string quoting

String-based fields will be surrounded with the given quotation mark

Cut off empty trailing hash input values

If checked, all empty trailing values are removed before the hash is calculated

  • Example without quoting
    • Hash Input = CUSTOMER A;1234;STREET 1;;;
    • results in CUSTOMER A;1234;STREET 1
  • Example with quoting
    • Hash Input = "CUSTOMER A";1234;"STREET 1";"";;""
    • results also in CUSTOMER A;1234;STREET 1

Advanced Settings

Hash output manipulation

Modify hash output

If all input values are null, the hash value will be replaced with the given value

  • Example 1
    • checked and value is set to "22222222222222222222222222222222"
    • Hash input = ;;;;;
    • Hash value = "22222222222222222222222222222222"
  • Example 2
    • unchecked
    • Hash input = ;;;;;
    • Hash value = 8f0158355357e8302939ea687dba9363

Support of Enterprise features

This version supports the dynamic data type. To use it, this component must be registered in the Talend plugin jar, or the type must be cast to Object. From https://help.talend.com/reader/MNcEDgjyM49yQ58GboyG4Q/doXCP4sJgwe85tm8ny5Zqw

For a list of components that support this feature, go to <install_dir>/plugins/, where <install_dir> is the Studio installation directory, extract the jar file org.talend.core.tis_.jar to get the text file supportDynamic.txt in the resources folder.
