The source combines a table of about 1.2 million rows with one of about 600,000 rows, and the target table ends up with about 550,000 rows.
I don't have Auto correct load set to Yes, and none of the lookups are set to run as a separate process.
I cut the result set down to about 158,000 rows by adding another WHERE clause to one of the queries. With that change, the job ran in about 1.5 minutes in 4.1 but took 35 minutes in 4.2.