Data Stage

DataStage Interview Questions and Answers, Solution and Explanation - Part 3

Does selecting 'Clear the table and Insert rows' in the ODBC stage send a TRUNCATE statement to the database, or does it use some kind of DELETE logic?
Ans: The ODBC stage has no TRUNCATE option. 'Clear the table and Insert rows' issues a DELETE FROM statement. An OCI stage such as Oracle offers both Clear and Truncate options. The two differ sharply in the permissions they need: Truncate requires ALTER TABLE permission, whereas Delete does not.

DataStage Interview Questions and Answers, Solution and Explanation - Part 2

What are conformed dimensions?
Ans: A conformed dimension is a single, coherent view of the same piece of data throughout the organization. The same dimension is used in all subsequent star schemas defined. This enables reporting across the complete data warehouse in a simple format.

DataStage Interview Questions and Answers, Solution and Explanation

How did you handle reject data?
Ans: Typically a reject link is defined and the rejected data is loaded back into the data warehouse. A reject link has to be defined for every output link from which you wish to collect rejected data. Rejected data is typically bad data, such as duplicate primary keys or null rows where data is expected.
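The routing idea behind a reject link can be sketched outside DataStage. The Python below is only an illustration, not DataStage code: the function name `split_rejects`, the `reject_reason` column, and the sample rows are all hypothetical. It sends rows with a null in a required column, or with a duplicate key, to a separate reject list.

```python
def split_rejects(rows, key, required):
    """Route rows to an accepted list or a reject list.

    A row is rejected when a required column is missing or None,
    or when its key value was already seen (duplicate primary key).
    """
    seen = set()
    accepted, rejects = [], []
    for row in rows:
        if any(row.get(col) is None for col in required):
            rejects.append({**row, "reject_reason": "null in required column"})
        elif row.get(key) in seen:
            rejects.append({**row, "reject_reason": "duplicate key"})
        else:
            seen.add(row.get(key))
            accepted.append(row)
    return accepted, rejects


# Hypothetical sample data: one duplicate key and one null value.
rows = [
    {"id": 1, "name": "a"},
    {"id": 1, "name": "b"},
    {"id": 2, "name": None},
    {"id": 3, "name": "c"},
]
accepted, rejects = split_rejects(rows, key="id", required=["id", "name"])
```

Keeping the reason alongside each rejected row makes it easier to decide later whether the row can be repaired and reloaded.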

Capturing Unmatched Records from a Join in Data Stage

The Join stage does not provide reject handling for unmatched records (such as in an Inner Join scenario). If unmatched rows must be captured or logged, an OUTER join operation must be performed. In an OUTER join scenario, all rows on an outer link (e.g. Left Outer, Right Outer, or both links in the case of Full Outer) are output regardless of match on key values.
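The outer-join approach can be illustrated with a small Python sketch. This is not DataStage code: the helper `left_outer_join`, the `_matched` flag, and the sample data are hypothetical. Every row from the left (outer) input is output, and the rows that found no partner are then split off for logging.

```python
def left_outer_join(left, right, key):
    """Left outer join of two lists of dicts on `key`.

    Every left row is output. Right-side columns are None when no match exists,
    and a hypothetical `_matched` flag records whether a match was found.
    """
    right_index = {}
    for r in right:
        right_index.setdefault(r[key], []).append(r)
    right_cols = sorted({c for r in right for c in r if c != key})

    out = []
    for l in left:
        matches = right_index.get(l[key])
        if matches:
            for m in matches:
                out.append({**l, **{c: m.get(c) for c in right_cols}, "_matched": True})
        else:
            out.append({**l, **{c: None for c in right_cols}, "_matched": False})
    return out


# Hypothetical sample data: cust_id 9 has no matching customer.
orders = [
    {"cust_id": 1, "amount": 10},
    {"cust_id": 2, "amount": 20},
    {"cust_id": 9, "amount": 5},
]
customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]

joined = left_outer_join(orders, customers, "cust_id")
matched = [r for r in joined if r["_matched"]]
unmatched = [r for r in joined if not r["_matched"]]  # rows to capture or log
```

An explicit flag is used here instead of testing a right-side column for None, so that a genuinely null value in the reference data is not mistaken for a missing match.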

Data Stage - Lookup vs. Join Stages

The Lookup stage is most appropriate when the reference data for all lookup stages in a job is small enough to fit into available physical memory. Each lookup reference requires a contiguous block of physical memory. If the datasets are larger than available resources, the JOIN or MERGE stage should be used.
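The memory trade-off can be illustrated with two generic join strategies in Python. This is a conceptual sketch, not how the DataStage stages are implemented: `hash_lookup` holds the whole reference set in memory, while `merge_join` assumes both inputs are already sorted on the key and keeps only the current rows. The function names and sample data are hypothetical.

```python
def hash_lookup(stream, reference, key):
    """Lookup-style: load the entire reference set into a dict, then probe per row."""
    table = {r[key]: r for r in reference}  # whole reference set held in memory
    for row in stream:
        ref = table.get(row[key])
        if ref is not None:
            yield {**row, **ref}


def merge_join(left_sorted, right_sorted, key):
    """Join-style: both inputs sorted on `key`; only the current rows are held.

    Assumes keys are unique on the right input, to keep the sketch short.
    """
    right_iter = iter(right_sorted)
    r = next(right_iter, None)
    for l in left_sorted:
        # Advance the right input until it catches up with the left key.
        while r is not None and r[key] < l[key]:
            r = next(right_iter, None)
        if r is not None and r[key] == l[key]:
            yield {**l, **r}


# Hypothetical sample data, already sorted on "k".
stream = [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}, {"k": 4, "v": "d"}]
reference = [{"k": 1, "d": "x"}, {"k": 3, "d": "y"}, {"k": 4, "d": "z"}]

via_lookup = list(hash_lookup(stream, reference, "k"))
via_join = list(merge_join(stream, reference, "k"))
```

Both produce the same inner-join result here. The difference is that the first needs memory proportional to the reference data, while the second needs sorted inputs.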

Data Stage Transformer Usage Guidelines

Choosing Appropriate Stages

The parallel Transformer stage always generates “C” code which is then compiled to a parallel component. For this reason, it is important to minimize the number of Transformers, and to use other stages (Copy, Filter, Switch, etc.) when derivations are not needed.

Data Stage Sequential File Stages (Import and Export) Performance Tuning

Improving Sequential File Performance

If the source file is fixed-width or delimited, the Readers Per Node option can be used to read a single input file in parallel at evenly-spaced offsets. Note that in this manner, input row order is not maintained.
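The idea of reading one file at evenly-spaced offsets can be sketched in Python for the fixed-width case. This illustrates the concept only, not how DataStage implements the option: `reader_ranges`, `read_range`, the 4-byte record length, and the temporary file are all hypothetical.

```python
import os
import tempfile


def reader_ranges(file_size, record_len, readers):
    """Split a fixed-width file into `readers` ranges aligned to record boundaries.

    Returns a list of (byte_offset, record_count) pairs, one per reader.
    """
    total = file_size // record_len
    base, extra = divmod(total, readers)
    ranges, start = [], 0
    for i in range(readers):
        count = base + (1 if i < extra else 0)
        ranges.append((start * record_len, count))
        start += count
    return ranges


def read_range(path, offset, count, record_len):
    """Read `count` fixed-width records starting at byte `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)
        return [f.read(record_len) for _ in range(count)]


# Hypothetical input: ten 4-byte records, split across three readers.
RECORD_LEN = 4
records = [f"{i:04d}".encode() for i in range(10)]
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"".join(records))
    path = tmp.name

ranges = reader_ranges(os.path.getsize(path), RECORD_LEN, readers=3)
chunks = [read_range(path, off, cnt, RECORD_LEN) for off, cnt in ranges]
os.remove(path)
```

Each chunk could be read by a separate reader. Because the readers would run concurrently, rows from different chunks can arrive interleaved, which is why the original row order is not preserved.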

Example that Reduces Contention in Data Stage Job - Configuration File

The alternative to the first configuration method is more careful planning of the I/O behavior to reduce contention. You can imagine this could be hard given our hypothetical 6-way SMP with 4 disks because setting up the obvious one-to-one correspondence doesn't work. Doubling up some nodes on the same disk is unlikely to be good for overall performance since we create a hotspot.
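The arithmetic behind the hotspot can be shown with a short Python sketch. It assumes a naive round-robin assignment of nodes to disks, which is only one possible layout. The node and disk names are hypothetical, and this is not DataStage configuration file syntax.

```python
from collections import Counter


def assign_round_robin(nodes, disks):
    """Assign each node to a disk in round-robin order."""
    return {n: disks[i % len(disks)] for i, n in enumerate(nodes)}


# Hypothetical names for the 6-way SMP with 4 disks described above.
nodes = [f"node{i}" for i in range(1, 7)]
disks = [f"/disk{i}" for i in range(1, 5)]

assignment = assign_round_robin(nodes, disks)
load = Counter(assignment.values())  # number of nodes sharing each disk
```

With six nodes and four disks, two disks end up serving two nodes each while the other two serve one, so the I/O load is uneven whichever disks are doubled up.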

Using Configuration Files in Data Stage Best Practices & Performance Tuning

The configuration file tells DataStage Enterprise Edition how to exploit underlying system resources (processing, temporary storage, and dataset storage). In more advanced environments, the configuration file can also define other resources such as databases and buffer storage. At runtime, EE first reads the configuration file to determine what system resources are allocated to it, and then distributes the job flow across these resources.

Datastage ETL Environment Variable Settings DataStage Best Practices and Performance Tuning

DataStage EE provides a number of environment variables to control how jobs operate on a UNIX system. In addition to providing required information, environment variables can be used to enable or disable various DataStage features, and to tune performance settings.
