Wednesday, June 17, 2009

Query to get the session information from the repository

1) Use the following SQL if you want just the run times for the session or workflow:
SELECT
OPB_SUBJECT.SUBJ_NAME SUBJECT_AREA,
OPB_WFLOW_RUN.WORKFLOW_NAME,
OPB_TASK_INST_RUN.SUBJECT_ID,
OPB_TASK_INST_RUN.WORKFLOW_ID,
OPB_TASK_INST_RUN.WORKFLOW_RUN_ID,
OPB_TASK_INST_RUN.WORKLET_RUN_ID,
OPB_TASK_INST_RUN.CHILD_RUN_ID,
OPB_TASK_INST_RUN.INSTANCE_ID,
OPB_TASK_INST_RUN.INSTANCE_NAME,
OPB_TASK_INST_RUN.TASK_ID,
OPB_TASK_INST_RUN.TASK_TYPE,
OPB_TASK_INST_RUN.START_TIME,
OPB_TASK_INST_RUN.END_TIME,
OPB_TASK_INST_RUN.RUN_ERR_CODE,
OPB_TASK_INST_RUN.RUN_ERR_MSG,
OPB_TASK_INST_RUN.RUN_STATUS_CODE,
OPB_TASK_INST_RUN.TASK_NAME,
OPB_TASK_INST_RUN.VERSION_NUMBER TASK_VERSION_NUMBER,
OPB_TASK_INST_RUN.SERVER_ID,
OPB_TASK_INST_RUN.SERVER_NAME
FROM
OPB_TASK_INST_RUN, OPB_WFLOW_RUN, OPB_SUBJECT
WHERE
OPB_SUBJECT.SUBJ_ID = OPB_TASK_INST_RUN.SUBJECT_ID
AND OPB_WFLOW_RUN.WORKFLOW_ID = OPB_TASK_INST_RUN.WORKFLOW_ID
AND OPB_WFLOW_RUN.WORKFLOW_RUN_ID = OPB_TASK_INST_RUN.WORKFLOW_RUN_ID
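
If you want elapsed run times rather than the raw timestamps, a minimal variant of the query above can compute a duration column. This is a sketch assuming an Oracle repository database, where subtracting two DATE values yields a number of days:

```sql
-- Sketch: task run duration in minutes. Assumes an Oracle repository,
-- where DATE subtraction returns days; adjust for other databases.
SELECT
OPB_SUBJECT.SUBJ_NAME SUBJECT_AREA,
OPB_WFLOW_RUN.WORKFLOW_NAME,
OPB_TASK_INST_RUN.TASK_NAME,
OPB_TASK_INST_RUN.START_TIME,
OPB_TASK_INST_RUN.END_TIME,
ROUND((OPB_TASK_INST_RUN.END_TIME - OPB_TASK_INST_RUN.START_TIME) * 24 * 60, 2) DURATION_MINUTES
FROM
OPB_TASK_INST_RUN, OPB_WFLOW_RUN, OPB_SUBJECT
WHERE
OPB_SUBJECT.SUBJ_ID = OPB_TASK_INST_RUN.SUBJECT_ID
AND OPB_WFLOW_RUN.WORKFLOW_ID = OPB_TASK_INST_RUN.WORKFLOW_ID
AND OPB_WFLOW_RUN.WORKFLOW_RUN_ID = OPB_TASK_INST_RUN.WORKFLOW_RUN_ID
```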

2) Use the following if you want the number of records loaded.
NOTE :: WIDGET_TYPE 3 = source / source qualifier
2 = target

Because source and target rows come back with the same columns, you cannot just SUM the total; you might have to split it into two queries, or you can use DECODE.

SELECT
OPB_SUBJECT.SUBJ_NAME SUBJECT_AREA,
OPB_WFLOW_RUN.WORKFLOW_NAME,
OPB_TASK_INST_RUN.SUBJECT_ID,
OPB_TASK_INST_RUN.WORKFLOW_ID,
OPB_TASK_INST_RUN.WORKFLOW_RUN_ID,
OPB_TASK_INST_RUN.WORKLET_RUN_ID,
OPB_TASK_INST_RUN.CHILD_RUN_ID,
OPB_TASK_INST_RUN.INSTANCE_ID,
OPB_TASK_INST_RUN.INSTANCE_NAME,
OPB_TASK_INST_RUN.TASK_ID,
OPB_TASK_INST_RUN.TASK_TYPE,
OPB_TASK_INST_RUN.START_TIME,
OPB_TASK_INST_RUN.END_TIME,
OPB_TASK_INST_RUN.RUN_ERR_CODE,
OPB_TASK_INST_RUN.RUN_ERR_MSG,
OPB_TASK_INST_RUN.RUN_STATUS_CODE,
OPB_TASK_INST_RUN.TASK_NAME,
OPB_SWIDGINST_LOG.INSTANCE_NAME,
OPB_SWIDGINST_LOG.WIDGET_TYPE,
APPLIED_ROWS SUCCESSFUL_ROWS,
AFFECTED_ROWS SUCCESSFUL_AFFECTED_ROWS,
REJECTED_ROWS FAILED_ROWS,
LAST_ERR_MSG LAST_ERROR,
LAST_ERR_CODE LAST_ERROR_CODE
FROM
OPB_TASK_INST_RUN, OPB_WFLOW_RUN, OPB_SUBJECT, OPB_SWIDGINST_LOG
WHERE
OPB_SUBJECT.SUBJ_ID = OPB_TASK_INST_RUN.SUBJECT_ID
AND OPB_WFLOW_RUN.WORKFLOW_ID = OPB_TASK_INST_RUN.WORKFLOW_ID
AND OPB_WFLOW_RUN.WORKFLOW_RUN_ID = OPB_TASK_INST_RUN.WORKFLOW_RUN_ID
AND OPB_SWIDGINST_LOG.WORKFLOW_RUN_ID = OPB_TASK_INST_RUN.WORKFLOW_RUN_ID
AND OPB_SWIDGINST_LOG.TASK_INSTANCE_ID = OPB_TASK_INST_RUN.INSTANCE_ID
AND OPB_SWIDGINST_LOG.WIDGET_TYPE IN (3, 2)
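
The DECODE option mentioned above can be sketched as follows: it pivots the source and target counts into separate columns so a plain SUM per column is safe. This is a sketch assuming an Oracle repository (DECODE is Oracle syntax); the output column names are illustrative:

```sql
-- Sketch: per-session source vs. target row counts using DECODE
-- (WIDGET_TYPE 3 = source / source qualifier, 2 = target).
SELECT
OPB_TASK_INST_RUN.WORKFLOW_RUN_ID,
OPB_TASK_INST_RUN.TASK_NAME,
SUM(DECODE(OPB_SWIDGINST_LOG.WIDGET_TYPE, 3, APPLIED_ROWS, 0)) SOURCE_ROWS,
SUM(DECODE(OPB_SWIDGINST_LOG.WIDGET_TYPE, 2, APPLIED_ROWS, 0)) TARGET_ROWS,
SUM(DECODE(OPB_SWIDGINST_LOG.WIDGET_TYPE, 2, REJECTED_ROWS, 0)) TARGET_FAILED_ROWS
FROM
OPB_TASK_INST_RUN, OPB_SWIDGINST_LOG
WHERE
OPB_SWIDGINST_LOG.WORKFLOW_RUN_ID = OPB_TASK_INST_RUN.WORKFLOW_RUN_ID
AND OPB_SWIDGINST_LOG.TASK_INSTANCE_ID = OPB_TASK_INST_RUN.INSTANCE_ID
AND OPB_SWIDGINST_LOG.WIDGET_TYPE IN (3, 2)
GROUP BY
OPB_TASK_INST_RUN.WORKFLOW_RUN_ID,
OPB_TASK_INST_RUN.TASK_NAME
```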

Tuesday, June 2, 2009

Teradata External Loaders

FAST LOAD -
- Only on empty tables with no secondary indexes (SI). Use it for truncate and reload.
- High volume inserts, when you run a session with a single partition.
- Each FastLoad job loads one table on the Teradata database.
- When using FastLoad to load more than one table, you must submit multiple FastLoad jobs, one for each table.

FastLoad acquires table-level locks and does not release them if the loader terminates abnormally. You must drop and re-create the table to release the locks.

TPUMP -
- Performs inserts, updates, deletes and upserts.
- Low volume updates, when you run a session with multiple partitions.
- Acquires row-hash locks, so other users can still access the table.

MULTILOAD -
- Performs inserts, updates, deletes and upserts.
- High volume updates, when you run a session with multiple partitions.
- Acquires table-level locks, so others cannot access the table (offline loading).
- Each Teradata MultiLoad import task can perform multiple insert, update, and delete operations on up to five different tables or views.

To recover from a failed MultiLoad, you must release the target table from the MultiLoad state and also drop the MultiLoad log table.
MultiLoad can require a large amount of resources on a Teradata system.
By default, Informatica uses one instance of MultiLoad for every target file.
To load multiple tables, the generated MultiLoad script file must be edited.

Both FastLoad and MultiLoad work at the data block level. In other words, these loaders are much faster than "standard DML" within Teradata.
They both acquire table-level locks, which means they are only appropriate for offline data loading. MultiLoad first writes the data into temporary tables in Teradata, and then it updates the data blocks directly. All changes to a physical data block are made in a single operation.


You can also use data driven mode to perform insert, update, or delete operations based on an Update Strategy or Custom transformation.
For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
If you invoke a greater number of sessions than the maximum number of concurrent sessions the database allows, the session may hang. You can set the minimum value for Tenacity and Sleep to ensure that sessions fail rather than hang.

If you use a Teradata external loader to perform update or upsert operations, use the Target Update Override option in the Mapping Designer to override the UPDATE statement in the external loader control file. For upsert, the INSERT statement in the external loader control file remains unchanged
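
As an illustration of the Target Update Override syntax (the table and port names below are made up), an override entered in the Mapping Designer looks roughly like this; :TU.<port> refers to the value arriving on the corresponding target input port:

```sql
-- Hypothetical example: override the generated UPDATE for a
-- Teradata upsert. CUSTOMER_DIM and its columns are illustrative.
UPDATE CUSTOMER_DIM
SET CUST_NAME = :TU.CUST_NAME,
UPDATE_DATE = :TU.UPDATE_DATE
WHERE CUST_ID = :TU.CUST_ID
```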

- If you use named pipes, set the checkpoint value to 0 to avoid checkpointing

Overriding Control File
- You can override the control file in the session properties. If you do not override the control file, the Integration Service generates a new control file based on the session and loader properties each time you run a session. If you change a target or loader connection setting after you edit the control file, the control file does not include those changes.
- You can enter PmNullPasswd as the database password to prevent the password from appearing in the control file. Instead, the Integration Service writes an empty string for the password in the control file.


Is Staged -
If not checked, the file is piped/streamed to the loader; in that case set CHECKPOINT to 0.
If checked, the file is written to disk first and then that file is used to load.

CHECKPOINT -
Recovery/restart will not work if you use a named pipe.

If a table has a partition defined, the key(s) on which the partition is defined should also be marked as primary key(s) when defining the target table.
If the primary keys are not defined, any attempt to update, upsert or delete data using Teradata loaders will result in an error.


Properties

MULTILOAD

Error Limit - 0 - number of rejected records allowed; 0 means no limit. Uniqueness violations do not count.
Checkpoint - 10000 - 60 or more means a number of records;
- 1-59 means an interval in minutes.
Tenacity - 10000 - hours to keep trying; if login fails, waits x minutes (the sleep time) and then tries to log in again.
Load Mode - Insert, Delete, Update, Upsert or Data Driven.
- If it is data driven, Informatica writes a column to the target to indicate the update strategy.
Max Sessions - 1 - maximum number of MultiLoad sessions per MultiLoad job; the maximum is between 1 and 32767.
Is Staged - Disabled - if you enable it, Informatica writes a flat file and then loads it; otherwise it uses a named pipe.
Error Database Name, Work Table Database, Log Table Database
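
The Checkpoint property's dual meaning (records vs. minutes) is easy to misread; this small Python helper, purely illustrative and not part of any Informatica API, just encodes the rule stated above:

```python
def interpret_mload_checkpoint(value: int) -> str:
    """Interpret a MultiLoad CHECKPOINT value per the notes above:
    60 or more = checkpoint every `value` records, 1-59 = every
    `value` minutes, 0 = no checkpointing (needed for named pipes)."""
    if value == 0:
        return "checkpointing disabled (use with named pipes)"
    if value >= 60:
        return "checkpoint every %d records" % value
    return "checkpoint every %d minutes" % value

# The default session value of 10000 therefore means records, not minutes.
print(interpret_mload_checkpoint(10000))  # checkpoint every 10000 records
print(interpret_mload_checkpoint(15))     # checkpoint every 15 minutes
```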

At Session Level
Error Table 1 - ET_
Error Table 2 - UV_
Work Table - WT_
Log Table - ML_
Control file Content Override

TPUMP -

Error Limit - 0 - number of rows rejected for errors; if exceeded, TPump rolls back the transaction that caused the last error. A value of 0 stops on error.
Checkpoint - 15 -
Tenacity - 4 -
Packing Factor - 20 - number of rows that each session buffer holds; improves network/channel efficiency.
Statement Rate - 0 - initial maximum statement rate per minute.
Serialize - Disabled - enable it if you have multiple updates per row.

Teradata Client

Teradata BTEQ client software, the Teradata ODBC driver, and any other Teradata client software must be installed on the machine where the Integration Service process runs

Mappings/Transformations
Source - Use ODBC. ODBC is a "native" interface for Teradata; it is not a layer on top of CLI.
Lookup & Stored Procedure - Use ODBC
Target - Use ODBC, FASTLOAD, MLOAD, TPUMP

You can use only fixed-width files when you load to Teradata through loaders.
If you specify delimited files, the Informatica Server fails the session

In the control file, all columns are defined as CHAR.
At run time these loaders convert the input data to CHAR and then convert it back to the original datatypes.

When calling a pre- or post-load stored procedure without any argument, use a dummy value for the argument in the call text properties.
If you do not configure the call text property, the mapping is invalid.

Setting up ODBC
1) Set the TERADATA_HOME, ODBCHOME, and PATH environment variables.
2) Set the shared library environment variable
3) Edit the existing odbc.ini file or copy the odbc.ini file to the home directory and edit it

To run the loader utilities, make sure the Teradata server entries are defined in the local hosts file on the machine where the Teradata client is installed.
In the Teradata data source in the odbc.ini file, set: DateTimeFormat=AAA
NoScan=Yes

Session Modes

The default value of SessionMode is Teradata, which is case insensitive.
To change it, set the SessionMode parameter to 'ANSI' in the ODBC DSN in the .odbc.ini file (SessionMode = ANSI).

When you use ANSI session mode, Teradata does not roll back the transaction when it encounters a row error.
If you choose Teradata session mode, Teradata rolls back the transaction when it encounters a row error.
So if you have a session with a Teradata target as well as a lookup, you will want to create a separate .odbc.ini entry (DSN) for the target to avoid this issue.
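
A sketch of what the two .odbc.ini entries might look like; the DSN names, driver path, and DBCName are hypothetical, and only SessionMode differs between them:

```ini
; Hypothetical DSNs: the lookup uses default Teradata mode,
; the target uses ANSI mode so a row error does not roll back.
[td_lookup]
Driver=/opt/teradata/client/odbc/lib/tdata.so
DBCName=tdprod
DateTimeFormat=AAA
NoScan=Yes

[td_target_ansi]
Driver=/opt/teradata/client/odbc/lib/tdata.so
DBCName=tdprod
SessionMode=ANSI
DateTimeFormat=AAA
NoScan=Yes
```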

A stored procedure that was compiled in one mode cannot be run in the other mode; you will get an "Invalid session mode for procedure" error.

ANSI mode:
A transaction initiation is always implicit.
A transaction is opened by the first SQL statement executed in a session or by the first statement executed following the close of a transaction.
A transaction is initiated when no transaction is currently active, and a SQL statement is executed.
The COMMIT [WORK] or ROLLBACK [WORK] statements close a transaction.
MULTISET is the default in the CREATE TABLE statement in ANSI mode.

Teradata mode:
Transactions can be implicit or explicit.
A statement failure or error result rolls back the entire transaction.
Character truncation causes no error.
The default in character comparison is NOT CASESPECIFIC.
SET is the default in the CREATE TABLE statement.


TIMESTAMP -
To load a timestamp with fractional seconds, write the column as a string (varchar) in the format 'YYYY-MM-DD HH24:MI:SS.SSSSSS' (note MI for minutes, not MM).

You may also encounter an error writing a PowerCenter date/time port to a Teradata TIMESTAMP(0) field, that is, a Teradata timestamp with no fractional seconds.
To resolve this issue, change the column in Teradata to TIMESTAMP(n) where n > 0,
or edit the generated control file and change the length of the field that corresponds to the TIMESTAMP(0) field from CHAR(26) to CHAR(19).
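
As a sketch, that control-file edit amounts to changing one .FIELD length in the generated layout (the field name here is made up; the rest of the MultiLoad script is omitted):

```
/* Generated line - 26 characters covers 'YYYY-MM-DD HH:MI:SS.SSSSSS' */
.FIELD LOAD_TS * CHAR(26);

/* Edited line for a TIMESTAMP(0) target - 19 characters covers */
/* 'YYYY-MM-DD HH:MI:SS' with no fractional seconds             */
.FIELD LOAD_TS * CHAR(19);
```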