Wednesday, April 29, 2009

Workflow Manager

Workflow Manager

Export and Import Objects :-
You can export workflows, worklets, tasks and sessions.
When you export it includes child objects like worklets, sessions, connection objects and session configuration.


List of Tasks :-

Reusable Tasks
- Are the tasks which you can create in Task Developer. Can be used in worklets and Workflows
- Only Email, Command and Session Tasks can be Reusable tasks

Email
You specify Email Usernames, Subject and Text.
Command
Fail task if any Command Fails
Recovery Strategy [Fail task and continue workflow / Restart ]

Timer
Event Wait
Event Raise
Assignment
Control
Decision



Connection Objects :-

FTP Connection
- Username , Password, HostName, Default Remote Directory,
- Retry Period
- Instead of specifying password, you can Use parameter in the password
- Use SFTP. For SFTP you have to specify Public Key Filename, Private Key Filename and Private Key File password

Relational Connections
- Username, password, Connect String and Code Page
- You can use Session parameter $ParamName for database username and password
To use session parameter as password, enable Use Parameter in Password
For the database which doesnt have username/password use PmNullUser and PmNullPasswd


- Connection Environment SQL - SQL to run everytime when a connection is made
- Transaction Environment SQL - SQL to run at the begining of each transaction.
- Connection Retry Period : Time in seconds to retry connecting to the database. 0 means infinite

Sunday, April 26, 2009

Session Properties

Session Properties

General Options

Write backward compatible Session log file
Session log file name
Session log file directory
Parameter file name

Enable test load
Number of rows to test
- will be enabled only if you Enable Test Load

$Source connection value
$Target connection value

Target source rows as
- Insert / Update / Delete / Data driven
- If the mapping has update transformation then by default Data driven.

Commit Type [Target]
- Source / Target / User defined
Commit Interval
- commit after x number of records
Commit on End of file
Rollback transaction on errors

Recovery Strategy
- Restart Task / Fail task continue workflow / Resume from last check point

Java classpath

Performance -

DTM Buffer size [Auto]
- You can specify Auto or value. By default Integ service allocated 12MB

Collect performance data
- to collect performance data. you can use workflow monitor to see it

Write performance data to repository
- writes the performance details to the repository. you can use workflow monitor to see it

Incremental Aggregation
Reinitialize Aggregate data
Enable High precision
- to process Decimal datatype to a precision of 28

Session retry on deadlock
- applicable only for Normal loads
- In the Integ service you can specify number of deadlock retries and deadlock sleep time period.

Session sort order
- If the integration service runs in Ascii then it is Binary
- If it runs on Unicode, then it displays sort order based on the Integration service code page

Pushdown Optimization
- None / To Source /To Source with view / To Target / Full / Full with view / $PushdownConfig
-

Session Configuration

Session Configuration

Advanced -
Constraint based load ordering [Unchecked]
- To load data into the targets based on primary key foreign key relationship
Cache Lookup() Function - [Checked]
Default buffer block size [Auto]
-
Line Sequential Buffer Length [1024]
- Informatica reads a row from the file and holds it in the buffer. If the row is more than 1024 bytes, then increase the value.
Max Memory Allowed for Auto Memory attributes [512Mb]
Max percentage of Total memory allowed for Auto Mem Attributes [5]
Additional Concurrent pipelines for Lookup Cache Creation [Auto]

Log Options -
Save Sessions Log by [Session runs]
- Session runs - saves the number of sessions logs specifed
- Session timestamp - appends the timestamp to the session log file name.
Save Sessions log for these runs
- will be enabled only for Session Log by Session runs.
- specify the number of log files to store .0 means it will store only the current log file
- you can use $PMSessionLogCount

Error Handling -
Stop on errors
- Specify number of non-fatal errors to allow (reader, writer, dtm errors)
- 0 means it wont stop. you can use $PMSessionErrorThreshold service variable

Override tracing
On Stored Procedures error [Stop] - Stop / Continue
On Pre-session command task error [Stop] - Stop / Continue
On Pre-Post SQL error [Stop] - Stop / Continue


Error Log Type
Errot Log DB Connection
Error Log Table Name prefix
Error Log File Directory
Error Log File Name
Log Row Data
Log Source Row Data
Data column delimiter

Partitioning Options -
Dynamic Partitions [Disabled]
- Other options are Based on number of partitions , Based on source Partitioning, Based on number of nodes in the grid.
Number of Partitions [1]
- Will be enabled if you choose "Based on number of partitions"
- You can also specify $DynamicPartitionCount session parameter


Sessions on Grid -
Is Enabled [Unchecked]
- To run the sessions in a grid

Saturday, April 25, 2009

Mapping Designer

Mapping Designer

Source in the Mapping
- Owner Name
- Source Table name - Override with parameter or variable value


Expression
- You can add comments to the expresssion


Target in the Mapping
- Reject Truncated/Overflowed rows
- Update Override
- Table Name Prefix
- Pre SQL & Post SQL

Repository Manager

Repository Manager

Folder
+ Allow Shortcut Option allows you to create shortcuts for the mappings present in the folder
+ Owner - you can specify the owner of the folder
Folder Permissions
For each user/usergroup you can specify RWX permissions
Owner has RWX and you cannot change it.
OS Profile
If the Integration Service uses OS profiles, then you can specify the profile to use.

User Groups
+ Default Groups - Administrator and Public
+ Administrator Group contains 2 users, User Administrator and repository database User.
You cannot delete or remove these 2 users from the Administrator Group.

Show Locks & Show Connections
- You can see the locks obtained on the repository objects by user , folder, application, host.
- To release a lock you have to idnetify the user connection and kill the User Connection.

Export Objects and Import Objects:-
- When you export an workflow, it exports all the child objects like mappings and sessions.
- When you import you can choose to import individual objects like mapping or import the whole workflow.


Metadata Extension
The Metadata extension which you create using Repository Manager is ReUsable.
For example if you create it for Source, then it is available for all the sources in the repository

Vendor-defined cannot be changed. Anything you create is user-defined. To change the value , you use designer or workflow manager.

You can add metadata extension for any Transformation or for A;;.
Data type String / Boolean / Numeric
Default Value
Client Visible
Client Editable
Share Read
Share Write

Constraint based Loading

Constraint based Loading :-

It is an option available in the session.
If enabled, for each row generated by the source, it will load the record in the primary key table first, then it will load any foreign key tables. It applies only to the inserts.

- All the targets should have the same Active source
- You should define the relationships for the Target in the Warehouse Designer
- Targets shoule be in the same Target connection group
- Treat source as "Insert". You cannot use updates with constraint based loading


FullCBLOSupport = Yes/No
- It is a custom property. If you set it in the Integration service, it is applicable for all workflows. You can also set it in the session config object tab in the Custom Properties attributes.
- Allows you to load the changed data (updates and deletes) to be loaded to the target in the same transaction control unit (TCU).
- Your mapping should not contain any active transformation which changes the row id generated by the reader.

Wednesday, April 22, 2009

Update Transformation

Update Transformation

Active Tranformation
Used to flag records for insert / update / delete or reject based on condition.


Forward Rejected rows -
By default the rejected rows are forwarded to the next transformation.
All the rejected rows are written to the session reject file. If you disable this option then the records are written to session log file. If you enable row error handling then the rows are written to row error logs and not to the session log file.

Aggreagator after Update Transformation
If a record is flagged for delete, then the aggregator will subtract the value, if it is flagged for insert then it is added.

Router Transformation

Router Transformation:-

Active Transformation.
1 Input Group and Multiple Output group transformation.

Two types of output groups -Userdefined and default.
You specify the test condition for each user defined group you create.

All the groups are processed by informatica if the default group is connected.
If the default group is not connected then informatica will not process the groups which are not connected.

If a record satisfies more than one group, then informatica will pass this row multiple times.

Friday, April 17, 2009

Version 8 - New Features - Workflow Manager

Workflow Manager

Server Configuration is removed from Workflow and is added to the Repository Administration Console

Session
Session log file is written in new binary format

Write backward compatible session log file
Recovery Strategy (Restart task, Fail task continue workflow, Resume from lask checkpoint)
Java Classpath
Write performance data to repository

DTM Buffer Size value is set to Auto in v8. In v7 it is 120000000


Queries
Object Types which you can query are Schedulers, Session Configuration, Tasks, Sessions, Worklets and Workflows



Session Configuration

Advanced
Maximum Memory Allowed For Auto Memory Attributes
Maximum Percentage of Total Memory Allowed For Auto Memory Attributes
Additional Concurrent Pipelines for Lookup Cache Creation

The value for Default Buffer block size is "Auto" in v8. In v7 it is 64000.

Error Handling
Enable Recovery Option is removed in version 8

Partitioning Options
Dynamic Partitioning (Disabled or Enabled)
Number of Partitions (You can specify a number when you Enable Dynamic Partitioning)

Relational Connections
Envirnoment SQL which was in version 7 was replaced by 2
Connection Environment SQL
Transaction Environment SQL
New : Connection Retry Period : Time in seconds to retry connecting to the database. 0 means infinite
[For Oracle] Rollback Segment which was in version 7 was removed

FTP Connection
Retry Period


Command Tasks
In Version 8
Fail Task if any command fails (checkbox)
Recovery Strategy (Fail Task and continue workflow , Restart)
In Vesrion 7
Run if previous completed (checkbox)

Thursday, April 16, 2009

New Features : Designer : Version 8.1 , 8.6

User-defined Functions
- It can be private - callable in UDF only or public - callable from any transformation expresssion
- You cannot change from Public to private
- You can export/import. If you export a mapping which uses UDF, then it exports UDF too.

"Create and Add Target"
You can right click on any Transformation "

In v8.6
When creating lookup transformation you have the option of choosing source, target, source qualifier or Import

Source Qualifier
Output is deterministic - Checkbox - Source data does not change during the session run
. When you configure this property, the Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.

Output is repeatable - Never/Always Source data is sorted
When output is deterministic and output is repeatable, the Integration Service does not stage source data for recovery.


Sorter Transformation
Sort Cache Size - By default it is Auto. In version 7 it is 8MB

Sequence Generator
In version 8.6 the datatype of NEXTVAL & CURRVAL is bigint(19) and the ENDVALUE is 9223372036854775807.
In version 7, 8.1. the datatypes are integer(10) and the ENDVALUE is 2147483647

Joiner Transformation (v8.6)
Property : Master Sort Order : Auto/Ascending
Specifies the sort order of the master source data. Choose Ascending if the master source data is in ascending order. If you choose Ascending, also enable sorted input. Default is Auto

Lookup Transformation (v8.1)(
Lookup policy on multiple match : Use Any Value is added.
Lookup source is static :checkbox
When you choose dynamic lookup cache (lookup policy on mulitple match defaults to Report Error in v8(. In v7 nothing happens

Lookup Transformation (v8.6)
Lookup Source Filter
Source Type (Database, Flat File, Source Qualifier)
Lookup policy on multiple match : Use Any Value is added.
Lookup source is static :checkbox
When you choose dynamic lookup cache (lookup policy on mulitple match defaults to Report Error in v8(. In v7 nothing happens
Pre-build lookup cache (Auto, Always allowed, Always disallowed)
Subsecond Precision :Default integer 6

Stored Procedure (v8.6)
Subsecond precision
Output is repeatable - Based on Input order
Output is deterministic - Checkbox

Sorter Transformation

Sorter Transformation

- Active/Passive

- Sorts data according to the sort key specified.
You can specify multiple columns as sort keys and the order (Ascending or Descending for each column)

Options :-

Distinct Output Rows -
Can output distinct records if you check this option

Null Treated Low -
By default it treats as High values. If you check this option it will treat it as Low

Case Sensitive - Checked by default. If you enable uppercase are higher

Work Directory - By default it is $PMTempDir

Sorter Cache Size -
It is the max amount of memory used for sorter.
In version 8 it is set to Auto. In version v7 it is 8MB
If it cannot allocate then the session will fail
If it needs more memory that it will pages the data in the Work Directory, and writes a warning in the log file.

Tuesday, April 7, 2009

SQL transformation

SQL transformation
- To run ANSI SQL queries in the pipelines

Type : Active/Passive based on the Mode

Two modes :-

1) Script Mode
- run scripts that are located externally. you specify script name to the transformation with
- It outputs one row for each input row
Ports
ScriptName Input
ScriptResult Output - Returns PASSED or FAILED
ScriptError Output - Returns error

2) Query Mode
Allows to run queries that you define in the editor. You can pass strings or parameters to the query (dynamic query). You can output multiple rows when the query has a select statement.
Static
Dynamic :
Full Query - put ~Query_Port~ to pass the full query to the transformation
you can pass more than 1 query statment. If the query has SELECT, then you have to configure the output ports

Partial Query -
you can create a port and then write the sql and then use ~Table_Port~ to substitue the value

Connection Type : Specify a connection object or pass the database connection to the transformation at run time.

Number of ports should match with the number of columns in the SELECT
Default port :SQLErrorPort, Numaffectedports

Version 8 Architecture

Informatica Version 8 architecture

Domain :-
- A collection of nodes and services
- Primary unit of administration

+ Service Manager
It stores domain users and groups in the domain configuration database and periodically synchronizes them to the PowerCenter repository.


+ Alerts
Subscribe to alerts - notification abt node events such as failure

Administration Console :-
- It is a web based application.

Security Page
- You can manage all Domain users and groups along with roles and privileges at the Domain level

Users -

Domain users with the appropriate privileges can log into the following PowerCenter applications: Administration Console, PowerCenter Client tools.

Repository Object level permissions are administered using the Repository Manager and are stored in the PowerCenter repository



High Availability :-

+ Resilience for sources and lookup transforamtion
Resilience for Network failures or database unavailability. it attempts to reconnect for the amount of time configured for retry period in the connection object

+ Operating mode
Normal mode
Safe mode - to limit access to IntegServ during deployment or maintenance
You can also configure safe mode on failover

Integration Service :-

+ Resource Thershold - resource thershold for each node to consider or eliminate a node for the dispatch of task. Maximum CPU run queue length, Max memory % thersholds, Max processes

+ Dispatch mode - the way load balancer dispatches tasks - round robin, metric based or adaptive dispatch mode

+ Service Levels - the order in which to dispatch tasks from dispatch queue. you have to create services levels in the Admin console


Transformations
+ SQL Transformation
+ HTTP Transformation


pmcmd
- massupdate

NonVersioned repository - deploymnet groups, dynamic deployment groups
Execute permiison on deploymnet group privelage to deploy without having write permission
u still need read on the source

can have mixed version service in domain

oracle supartition support - use dynamic partitioning based on NUMBER of source partition in a session -> the infa service sets the number of partitions to the total number of supartitions at he source

PUSHDOWN Optimization :-
Subquery support


data masking


Performance Improvements
- Sorter uses less cache/ temp space
- Aggregator and Joiner are faster
- Flat file source and targets
- Repository Agent Caching
Caches the objects. By default 10000

EnableRepAgentCaching :

Yes

RACacheCapacity

50000


Use it in production, where there is minimal repository changes. Otherwise it will recache everytime an object changes and might impact performance


Concurrent workflow - multiple instances of workflow
Pipeline Lookup - can lookup relational, flat, sap, jms, msmq
Get runtime information - mapping variables, session variables and workflow variables e.g mapping name, folder name

Incremental Aggregation -
updates the Btree when u run for the firsttime after the upgrade.
if the upgrade is from 7 run the session without data to update the btree

Domain





Parameters and Variables :-
Can have different sections for Service Variables, service process variables, workflow variables, session parameters, mapping parameters, and mapping variables

1) you can use workflow variables in sessions. it treats as parameter and the value does not change

2) you can use the parameters and variables in the following places
Table owner name for relational sources, table prefix for targets
Ftp file name and directory
Lookup cache file name and directory
Email task(address, subject and body)

Target pre and post session sql
pre and post session commands
call text for unconnected stored procedure
target update override
command tasks
workflow log file names
table name prefix for error logs


Performance Improvement in v8.1.1:-

+ Sorter - sorter uses more compact format which reduces the temp disk space . Sort alogorithm is also improved to reduce sort time.
+ Pushdown Optimization

Generate flat file targets by transaction. You can
transaction, you add a FileName port to the flat file target definition. When you connect the FileName port in the mapping, the Integration Service writes a target file at each commit.


Version 8.6.1

Deployment Groups
+ Non versioned repository
+ Execute Deployment Groups privelage - to deploy without having write permission
u still need read on the source
+ Post deployment validation

Partitioning
+ Oracle subpartition support

Transformations
+ Data Masking.


From PowerCenter 8, you can manage all users and groups information along with roles and privileges at the Domain level on the Security page of the Administration Console

The Service Manager stores domain users and groups in the domain configuration database and periodically synchronizes them to the PowerCenter repository.


Domain users with the appropriate privileges can log into the following PowerCenter applications: Administration Console, PowerCenter Client tools.

Repository Object level permissions are administered using the Repository Manager and are stored in the PowerCenter repository

1. Move the files to server/infa_shared directory under the PowerCenter Services installation directory.

2. Update $PMRootDir for the Integration Service process in the Administration Console. For more information about configuring $PMRootDir, see Creating and Configuring the Integration Service in the PowerCenter Administrator Guide.


Monday, April 6, 2009

Version 8 Flat file Enhancements

Merge Type
Not at all
Sequentially
Concurrently
File list

Append if Exists
Header Options

Header Command
Footer Command

Input Type
- File
- Command
- Command Type

Target Command
- Infa writs data to command instead of target flat file


FTP :-
- Source can be File List
- Merge files concurrently on the FTP