SSIS Lookup Transformation
The lookup transformation is used to create a lookup dataset and join the selected columns to the major input columns based the connected key columns and it can only use an OLE DB connection manager or a Cache connection manager to connect to the lookup dataset. Also only SQL Server, Oracle and DB2 databases are supported as an OLE DB connection manager by the transform. I'll describe the transform in details in the following example.
Before starting to create the sample package, a lookup table like below must be created in the SQL Server database Test.
CREATE TABLE lendout( ID INT IDENTITY(1,1) NOT NULL, BookName VARCHAR(100) NOT NULL, StudentID INT NOT NULL ) INSERT INTO lendout(BookName, studentID) VALUES('Book A', 1) INSERT INTO lendout(BookName, studentID) VALUES('Book d', 2) INSERT INTO lendout(BookName, studentID) VALUES('Book H', 3) INSERT INTO lendout(BookName, studentID) VALUES('Book A', 4) INSERT INTO lendout(BookName, studentID) VALUES('Book D', 5)
The studentID is the foreign key which is used to link to the student table defined in SQL Task None Result Set. The current records in the table are listed below.
The main data comes from Books.xlsx which is the exactly same as the data source in Data Conversion section.
Now let's get started.
- Open the LearnSSIS2 project in Visual studio 2012 and make a copy of the package DataConversion.dtsx and rename it to Lookup.dtsx.
- Open the package and double click the Data Flow task to go to the data flow design area.
- Remove the OLE DB Destination transform and drag & drop a Lookup, 2 data rows transforms to the design area and have them linked as below.
- Create 2 variables count1 and count2 with Int32 data types and assign the 2 variables to the 2 Row Count transforms.
- Right-click Lookup transform and choose "Edit..." to open its editor. Then click Connection tab, and click "Use results of an SQL query" and input the following as the query.
- Click "Preview..." button the check the query results.
- Click Close button and then click Columns tab, drag the key "Copy of Book Name" in Available Input Columns to the lookup key "bookname" and then click "studentname" to create a new output column, then change its Output Alias to "Student Name".
- Click OK to close the Lookup editor. Then drag the Lookup output to the first Row count and choose "Lookup Match Output" in the pop-up dialogue box.
- Click OK and run the package.
- Stop the execution and right-click the Lookup and open its editor again. Then change the "Fail component" to "Redirect rows to no match output".
- Click OK button and drag the "Lookup No Match Output" to the second Row Count. At last, enable the Data Viewer on the 2 outputs and run the package again.
For details, you can check the Row Count section.
SELECT a.bookname, b.name as studentname FROM lendout a INNER JOIN student b ON a.studentID = b.id
After the setting, the editor looks like below.
You can see "Book A" has 2 records, "Book d" and "Book D" are the same? We'll see it at the running result.
Next we'll use the "bookname" as the connect key and add the "studentname" column to the outputs.
You will get the above error result. Why?
Because the book B and C cannot find its corresponding record in the lookup table and by default this situation is set as errors.
Let's change the setting.
We'll keep the Cash mode as its default "Full cache". This setting will cache the lookup query result in memory to make it run faster on the next call.
The first record of "Book A" was chosen as the matched result because the lookup transform always gets the first matched record in the lookup query. "Book d" was not chosen because the Lookup transformation are case sensitive. That means "Book d" and "Book D" are different in the eyes of the transform.