Tuesday, June 2, 2009

Teradata External Loaders

FAST LOAD -
- Only on Empty Tables with no SI. Use it for Truncate and Reload
- High volume updates, when you run a session with single partition
- Each FastLoad job loads one table on the Teradata Database.
- When using FastLoad to load more than one table, you must submit multiple FastLoad jobs—one for each table.

Fastload acquires table level locks and does not release them if the loader terminates abnormally. You must drop and recreate the table to release the locks

TPUMP -
- Performs Insert, Update, Delete and Upserts
- Low volume updates, when you run a sessions with multiple partitions
- Acquires row-hash lock - other users can access

MULTILOAD -
- Performs Insert, Update, Delete and Upserts
- High volume updates, when you run a sessions with multiple partitions
- Acquires Table level locks - others cannot access it / offline loading
- Each Teradata MultiLoad import task can perform multiple data insert, update, and delete
functions on up to five different tables or views

To recover from a failed MultiLoad, you must release the target table from the MultiLoad state and also drop the MultiLoad log table.
MultiLoad can require a large amount of resources on a Teradata system
By default, infa uses instance of MultiLoad for every target file.
To load multiple tables the generated MultiLoad script file must be editted.

Each Teradata MultiLoad import task can perform multiple data insert, update, and delete
functions on up to five different tables or views

To recover from a failed MultiLoad, you must release the target table from the MultiLoad state and also drop the MultiLoad log table.
MultiLoad can require a large amount of resources on a Teradata system
By default, infa uses instance of MultiLoad for every target file.
To load multiple tables the generated MultiLoad script file must be editted.

Both FastLoad and MultiLoad work at the data block level. In other words, these loaders are much faster than"standard DML" within Teradata.
The both acquire table level locks, which means they are only appropriate for off-line data loading. MultiLoad first writes the data into temporary tables in Teradata, and then it updates the data blocks directly. All changes to a physical data block are made in a single operation


You can also use data driven mode to perform insert, update, or delete operations based on an Update Strategy or Custom transformation
For a session with multiple partitions, use the round-robin partition type to route data to a single target file.
If you invoke a greater number of sessions than the maximum number of concurrent sessions the database allows, the session may hang. You can set the minimum value for Tenacity and Sleep to ensure that sessions fail rather than hang.

If you use a Teradata external loader to perform update or upsert operations, use the Target Update Override option in the Mapping Designer to override the UPDATE statement in the external loader control file. For upsert, the INSERT statement in the external loader control file remains unchanged

- If you use named pipes, set the checkpoint value to 0 to avoid checkpointing

Overriding Control File
- You can override the control file in the session properties. If you do not override the control file, the Integration Service generates a new control file based on the session and loader properties each time you run a session. If you change a target or loader connection setting after you edit the control file, the control file does not include those changes.
- You can enter PmNullPasswd as the database password to prevent the password from appearing in the control file. Instead, the Integration Service writes an empty string for the password in the control file.


IS Staged -
If not checked then the file will be piped/streamed to loader. If not checked set CHECKPOINT 0
If checked then the file is written to the disk first and then the file is used to load.

CHECKPOINT -
Recovery/Restart will not work if you use Named Pipe.

If a table has a partition defined then the key(s) on which the partition has been defined should also be marked as a primary key(s) when defining the target table
If the primary keys are not defined then any attempt to update, upsert or delete data using Teradata loaders will result in an error.


Properties

MULTILOAD

Error Limit - 0 - number of rejected records. 0 means no limit. uniqueness doesnt count.
Checkpoint - 10000 - 60 or more - means multiple of records
- 1-59 means - in minutes
Tenacity - 10000 - hours to try, if login fails, waits for x minutes (sleep time) and then try to login
Load Mode - Insert, Delete, Update, Upsert or Data driven.
- if it is dd, infa writes a column to target to indicate update strategy.
Max Session - 1 - max num of mload sesions per mload job. max is between 1 & 32767
Is Staged - Disabled - if u enable, then it writes a flat file and hten loads, otherwise uses named pipe.
Errro Database Name, Work Table Database, Log Table Database

At Session Level
Error Table 1 -ET_
Errot Table 2 - UV_
Work Table - WT_
Log Table - ML_
Control file Content Override

TPUMP -

Error Limit - 0 - num of rows rejected for errors. if it exceeds tpump rollsback the transaction that causes the last error. 0 stop on error.
Checkpoint - 15 -
Tenacity - 4 -
Packing Factor - 20 - num of rows that each session buffers hold. improves network/channel efficiency.
statement rate - 0 - initial max rate per minute
Serialize - Disabled - enable if u have multiple updates per row

3 comments:

  1. Hi

    Quite a useful post there.
    I had a question regarding Teradata.
    Say I am loading FastLoad to load data into Teradata from Informatica.
    Is there some default text qualifier? If yes, then, will having a delimiter(say comma) inside the text qualifier(say double quotes) create any sort of problem?
    If not, how do I setup a text qualifier? Or change the default option?

    ReplyDelete
  2. I like your blog, I read this blog please update more content on hacking,Nice post
    Informatica Cetification

    ReplyDelete