Update Transformation
Active Tranformation
Used to flag records for insert / update / delete or reject based on condition.
Forward Rejected rows -
By default the rejected rows are forwarded to the next transformation.
All the rejected rows are written to the session reject file. If you disable this option then the records are written to session log file. If you enable row error handling then the rows are written to row error logs and not to the session log file.
Aggreagator after Update Transformation
If a record is flagged for delete, then the aggregator will subtract the value, if it is flagged for insert then it is added.
Showing posts with label Transformations. Show all posts
Showing posts with label Transformations. Show all posts
Wednesday, April 22, 2009
Router Transformation
Router Transformation:-
Active Transformation.
1 Input Group and Multiple Output group transformation.
Two types of output groups -Userdefined and default.
You specify the test condition for each user defined group you create.
All the groups are processed by informatica if the default group is connected.
If the default group is not connected then informatica will not process the groups which are not connected.
If a record satisfies more than one group, then informatica will pass this row multiple times.
Active Transformation.
1 Input Group and Multiple Output group transformation.
Two types of output groups -Userdefined and default.
You specify the test condition for each user defined group you create.
All the groups are processed by informatica if the default group is connected.
If the default group is not connected then informatica will not process the groups which are not connected.
If a record satisfies more than one group, then informatica will pass this row multiple times.
Thursday, April 16, 2009
Sorter Transformation
Sorter Transformation
- Active/Passive
- Sorts data according to the sort key specified.
You can specify multiple columns as sort keys and the order (Ascending or Descending for each column)
Options :-
Distinct Output Rows -
Can output distinct records if you check this option
Null Treated Low -
By default it treats as High values. If you check this option it will treat it as Low
Case Sensitive - Checked by default. If you enable uppercase are higher
Work Directory - By default it is $PMTempDir
Sorter Cache Size -
It is the max amount of memory used for sorter.
In version 8 it is set to Auto. In version v7 it is 8MB
If it cannot allocate then the session will fail
If it needs more memory that it will pages the data in the Work Directory, and writes a warning in the log file.
- Active/Passive
- Sorts data according to the sort key specified.
You can specify multiple columns as sort keys and the order (Ascending or Descending for each column)
Options :-
Distinct Output Rows -
Can output distinct records if you check this option
Null Treated Low -
By default it treats as High values. If you check this option it will treat it as Low
Case Sensitive - Checked by default. If you enable uppercase are higher
Work Directory - By default it is $PMTempDir
Sorter Cache Size -
It is the max amount of memory used for sorter.
In version 8 it is set to Auto. In version v7 it is 8MB
If it cannot allocate then the session will fail
If it needs more memory that it will pages the data in the Work Directory, and writes a warning in the log file.
Tuesday, April 7, 2009
SQL transformation
SQL transformation
- To run ANSI SQL queries in the pipelines
Type : Active/Passive based on the Mode
Two modes :-
1) Script Mode
- run scripts that are located externally. you specify script name to the transformation with
- It outputs one row for each input row
Ports
ScriptName Input
ScriptResult Output - Returns PASSED or FAILED
ScriptError Output - Returns error
2) Query Mode
Allows to run queries that you define in the editor. You can pass strings or parameters to the query (dynamic query). You can output multiple rows when the query has a select statement.
Static
Dynamic :
Full Query - put ~Query_Port~ to pass the full query to the transformation
you can pass more than 1 query statment. If the query has SELECT, then you have to configure the output ports
Partial Query -
you can create a port and then write the sql and then use ~Table_Port~ to substitue the value
Connection Type : Specify a connection object or pass the database connection to the transformation at run time.
Number of ports should match with the number of columns in the SELECT
Default port :SQLErrorPort, Numaffectedports
- To run ANSI SQL queries in the pipelines
Type : Active/Passive based on the Mode
Two modes :-
1) Script Mode
- run scripts that are located externally. you specify script name to the transformation with
- It outputs one row for each input row
Ports
ScriptName Input
ScriptResult Output - Returns PASSED or FAILED
ScriptError Output - Returns error
2) Query Mode
Allows to run queries that you define in the editor. You can pass strings or parameters to the query (dynamic query). You can output multiple rows when the query has a select statement.
Static
Dynamic :
Full Query - put ~Query_Port~ to pass the full query to the transformation
you can pass more than 1 query statment. If the query has SELECT, then you have to configure the output ports
Partial Query -
you can create a port and then write the sql and then use ~Table_Port~ to substitue the value
Connection Type : Specify a connection object or pass the database connection to the transformation at run time.
Number of ports should match with the number of columns in the SELECT
Default port :SQLErrorPort, Numaffectedports
Sunday, March 29, 2009
Active and Passive Transformations
Active Transformations
Any transformation which changes the number of rows passing through it is called active transformations. If the number of input rows is not equal to output rows, then it is called active.
Source Qualifier
Joiner
Normalizer
Union
Filters
Routers
Sort
Rank
Aggregator
Transaction Control
Update Strategy
Passive Transformation
Any transformation which does not change the number of records passing through it is called passive transformation. If the number of input records = number of output records then it is passive.
Expression
Lookups
Stored Procedure
External Stored Procedure
Special Cases : -
Mapplet can be Active or Passive based on the transformations inside the mapplet. If it has any Active transformation, then it is active otherwise it is passive.
Interview Questions :-
1) Is a Mapplet Active or Passive - based on the transformations inside the mapplet
2) Why is a sorter active transformation - beacuse it has "Sort Distinct" property which will return distinct values.
Any transformation which changes the number of rows passing through it is called active transformations. If the number of input rows is not equal to output rows, then it is called active.
Source Qualifier
Joiner
Normalizer
Union
Filters
Routers
Sort
Rank
Aggregator
Transaction Control
Update Strategy
Passive Transformation
Any transformation which does not change the number of records passing through it is called passive transformation. If the number of input records = number of output records then it is passive.
Expression
Lookups
Stored Procedure
External Stored Procedure
Special Cases : -
Mapplet can be Active or Passive based on the transformations inside the mapplet. If it has any Active transformation, then it is active otherwise it is passive.
Interview Questions :-
1) Is a Mapplet Active or Passive - based on the transformations inside the mapplet
2) Why is a sorter active transformation - beacuse it has "Sort Distinct" property which will return distinct values.
Sequence Generator Transformation
What is NEXTVAL
NEXTVAL - generates sequence numbers when you connect this port to a transformation.
If you want to load 2 targets with the same sequence number
- Connect the NEXTVAL port to a transformation and then connect that transformation to the 2 targets.
If you want to load 2 targets with the unique sequence number
- Connect the NEXTVAL port to two different targets or two different pipelines which has those 2 targets
What is CURRVAL
CURRVAL = NEXTVAL + Increment by Value property
If you connect CURRVAL port without connecting the NEXTVAL, then the CURRVAL will give you constant value.
If you connect CURRVAL, Informatica process one row at a time - can be a performance issue.
Properties :
Start Value : Start Value of the sequence. It will be used as the start value, once it reaches End Value and if the Cycle Option is enabled
Increment By :
End Value : End Value for the Sequence Generator
Current Value : Current Value of the Sequence. It will be updated either at the end of the session(Non-reusable and Number of Cached Values =0 ) or during the session (Number of Cached Value > 0 ) . If Reset Property is enabled, then Informatica resets the value to its original value of the session.
Cycle : Enable if you want to cycle through the values after it reaches the End Value.
Reset : If enabled, Informatica updates current value to the original value every time it runs the session. If disabled Informatica updates the current value to last generated value + 1.
Number of Cached Values : Non-Reusable - If the Value is 0 - It doesnt Cache Values. Informatica reads the start value form the repository and then keeps generating the sequence values. At the ene of the session it updates the current value to Start Value or Lastgeneratedsequence+1 (based on the Reset Option)
If the Value is > 0 then Informatica reads the start value and caches the number of values based on the Number of Cached Values and then updates the Current value in the repository. It will again go to the Repository if all the values are used. At the end of the session it throws away any unremaining Sequence Numbers which were not used.
Interview Questions :-
1) If the End Value is reached and if the Cycle option is not enabled then the session fails
2) Reset Option is disabled for Reusable Sequence Generator
3) Number of Cached Values has to be >= 1 for Reusable Sequence Generator
4) To avoid big gaps between sequence values, set the Number of cached values to 0 if it is non-reusable, set the Number of Cached values to less number if it is Reusable.
Performance Issues :-
1) Do not connect CURRVAL
2) Check Number of Cached Values
Set it to 0 if it is Non-Reusable, Set it to appropriate(default 1000) if it is Reusuable
ENDVALUE & Datatypes in Versions :-
In version 8.6 the datatype of NEXTVAL & CURRVAL is bigint(19) and the ENDVALUE is 9223372036854775807.
In version 7, 8.1. the datatypes are integer(10) and the ENDVALUE is 2147483647
NEXTVAL - generates sequence numbers when you connect this port to a transformation.
If you want to load 2 targets with the same sequence number
- Connect the NEXTVAL port to a transformation and then connect that transformation to the 2 targets.
If you want to load 2 targets with the unique sequence number
- Connect the NEXTVAL port to two different targets or two different pipelines which has those 2 targets
What is CURRVAL
CURRVAL = NEXTVAL + Increment by Value property
If you connect CURRVAL port without connecting the NEXTVAL, then the CURRVAL will give you constant value.
If you connect CURRVAL, Informatica process one row at a time - can be a performance issue.
Properties :
Start Value : Start Value of the sequence. It will be used as the start value, once it reaches End Value and if the Cycle Option is enabled
Increment By :
End Value : End Value for the Sequence Generator
Current Value : Current Value of the Sequence. It will be updated either at the end of the session(Non-reusable and Number of Cached Values =0 ) or during the session (Number of Cached Value > 0 ) . If Reset Property is enabled, then Informatica resets the value to its original value of the session.
Cycle : Enable if you want to cycle through the values after it reaches the End Value.
Reset : If enabled, Informatica updates current value to the original value every time it runs the session. If disabled Informatica updates the current value to last generated value + 1.
Number of Cached Values : Non-Reusable - If the Value is 0 - It doesnt Cache Values. Informatica reads the start value form the repository and then keeps generating the sequence values. At the ene of the session it updates the current value to Start Value or Lastgeneratedsequence+1 (based on the Reset Option)
If the Value is > 0 then Informatica reads the start value and caches the number of values based on the Number of Cached Values and then updates the Current value in the repository. It will again go to the Repository if all the values are used. At the end of the session it throws away any unremaining Sequence Numbers which were not used.
Interview Questions :-
1) If the End Value is reached and if the Cycle option is not enabled then the session fails
2) Reset Option is disabled for Reusable Sequence Generator
3) Number of Cached Values has to be >= 1 for Reusable Sequence Generator
4) To avoid big gaps between sequence values, set the Number of cached values to 0 if it is non-reusable, set the Number of Cached values to less number if it is Reusable.
Performance Issues :-
1) Do not connect CURRVAL
2) Check Number of Cached Values
Set it to 0 if it is Non-Reusable, Set it to appropriate(default 1000) if it is Reusuable
ENDVALUE & Datatypes in Versions :-
In version 8.6 the datatype of NEXTVAL & CURRVAL is bigint(19) and the ENDVALUE is 9223372036854775807.
In version 7, 8.1. the datatypes are integer(10) and the ENDVALUE is 2147483647
Subscribe to:
Posts (Atom)