Sunday, March 29, 2009

Active and Passive Transformations

Active Transformations
Any transformation which changes the number of rows passing through it is called active transformations. If the number of input rows is not equal to output rows, then it is called active.

Source Qualifier
Joiner
Normalizer
Union
Filters
Routers
Sort
Rank
Aggregator
Transaction Control
Update Strategy

Passive Transformation
Any transformation which does not change the number of records passing through it is called passive transformation. If the number of input records = number of output records then it is passive.

Expression
Lookups
Stored Procedure
External Stored Procedure

Special Cases : -
Mapplet can be Active or Passive based on the transformations inside the mapplet. If it has any Active transformation, then it is active otherwise it is passive.

Interview Questions :-
1) Is a Mapplet Active or Passive - based on the transformations inside the mapplet
2) Why is a sorter active transformation - beacuse it has "Sort Distinct" property which will return distinct values.

Sequence Generator Transformation

What is NEXTVAL
NEXTVAL - generates sequence numbers when you connect this port to a transformation.

If you want to load 2 targets with the same sequence number
- Connect the NEXTVAL port to a transformation and then connect that transformation to the 2 targets.

If you want to load 2 targets with the unique sequence number
- Connect the NEXTVAL port to two different targets or two different pipelines which has those 2 targets

What is CURRVAL
CURRVAL = NEXTVAL + Increment by Value property

If you connect CURRVAL port without connecting the NEXTVAL, then the CURRVAL will give you constant value.
If you connect CURRVAL, Informatica process one row at a time - can be a performance issue.

Properties :
Start Value : Start Value of the sequence. It will be used as the start value, once it reaches End Value and if the Cycle Option is enabled

Increment By :

End Value : End Value for the Sequence Generator

Current Value : Current Value of the Sequence. It will be updated either at the end of the session(Non-reusable and Number of Cached Values =0 ) or during the session (Number of Cached Value > 0 ) . If Reset Property is enabled, then Informatica resets the value to its original value of the session.

Cycle : Enable if you want to cycle through the values after it reaches the End Value.

Reset : If enabled, Informatica updates current value to the original value every time it runs the session. If disabled Informatica updates the current value to last generated value + 1.

Number of Cached Values : Non-Reusable - If the Value is 0 - It doesnt Cache Values. Informatica reads the start value form the repository and then keeps generating the sequence values. At the ene of the session it updates the current value to Start Value or Lastgeneratedsequence+1 (based on the Reset Option)

If the Value is > 0 then Informatica reads the start value and caches the number of values based on the Number of Cached Values and then updates the Current value in the repository. It will again go to the Repository if all the values are used. At the end of the session it throws away any unremaining Sequence Numbers which were not used.

Interview Questions :-
1) If the End Value is reached and if the Cycle option is not enabled then the session fails
2) Reset Option is disabled for Reusable Sequence Generator
3) Number of Cached Values has to be >= 1 for Reusable Sequence Generator
4) To avoid big gaps between sequence values, set the Number of cached values to 0 if it is non-reusable, set the Number of Cached values to less number if it is Reusable.

Performance Issues :-
1) Do not connect CURRVAL
2) Check Number of Cached Values
Set it to 0 if it is Non-Reusable, Set it to appropriate(default 1000) if it is Reusuable

ENDVALUE & Datatypes in Versions :-
In version 8.6 the datatype of NEXTVAL & CURRVAL is bigint(19) and the ENDVALUE is 9223372036854775807.

In version 7, 8.1. the datatypes are integer(10) and the ENDVALUE is 2147483647