SSIS Merge Join Transformation
Like INNER JOIN, LEFT JOIN AND FULL OUTER JOIN in SQL statement, the Merge Join transformation provides INNER, LEFT and FULL join between two sorted datasets and output the join result. In the following table, The red colour stands for the Join result of the dataset A and B.
Left Join | Full Join | Inner Join |
---|---|---|
![]() |
![]() |
![]() |
In the following example, I'm going to replace the Merge transformation with the Merge Join transformation in the copy of Merge.dtsx package and trying to use 3 types of Join to see what difference is among them.
- Open the project LearnSSIS2, make a copy of Merge.dtsx and rename it to MergeJoin.dtsx.
- Open the package MergeJoin.dtsx and click Data Flow to go to the data flow edit area of the task.
- Remove the Merge transform and add a new Merge Join from Toolbox and link them as below.
- Put the output of "Sort A" to Merge Join transform to open the Input Output Selection dialogue box, then select "Merge Join Left Input" in Input drop-down list and click OK button.
- Put the output of "Sort B" to Merge Join transform and this input is set as "Merge Join Right Input" automatically.
- Right-click the Merge Join and choose "Edit..." to open the editor. Leave the default Join type "Inner join" unchanged and add the output columns as below.
- Click OK button and add the data viewer on the output of the Merge Join transform.
- Run the package.
- Terminate the running and repeat the step 6 to change Join type to "Left Outer Join" and run it again.
- Terminate the running again and repeat the step 6 to change Join type to "Left Outer Join" and run it again.
The output columns were renamed in the Output Alias column.
That's the data flow looking like after finishing all the above settings.
Two records were returned because both of the inputs have these records.
All the columns from "Sort A" input were outputted and the columns from "Sort B" were left joined to the output.
This time all the records were in the output.