When we are using Contains
LINQ operator in an EF subquery, EF Core now generates better queries
using SQL
IN
instead of
EXISTS. This can result in dramatically faster queries.
Let's go by an example.
Consider the following
DbContext.
public class MyDbContext : DbContext
{
public DbSet<Employee> Employees { get; set; }
public DbSet<Department> Departments { get; set; }
}
public record Employee(string Name, int DepartmentId)
{
public int Id { get; init; }
}
public record Department(string Name)
{
public int Id { get; init; }
public bool IsWfhAllowed { get; set; }
}
Now say, I want to get all the employees who are allowed to work from home. I can
write the following LINQ query for my requirement.
List<Employee> wfhEmployees = await context.Employees
.Where(e => context.Departments
.Where(d => d.IsWfhAllowed)
.Select(d => d.Id)
.Contains(e.DepartmentId))
.ToListAsync();
If I am on an earlier EF version before EF 8.0 (like EF 7.x), EF generates the
following query.
SELECT [e].[Id], [e].[DepartmentId], [e].[Name]
FROM [Employees] AS [e]
WHERE EXISTS (
SELECT 1
FROM [Departments] AS [d]
WHERE [d].[IsWfhAllowed] = CAST(1 AS bit) AND [d].[Id] = [e].[DepartmentId])
Here the subquery is referencing the outer [Employees] table, so the subquery must be executed for each row in the [Employees] table (correlated subquery).
With EF 8.0, EF generates the following query.
SELECT [e].[Id], [e].[DepartmentId], [e].[Name]
FROM [Employees] AS [e]
WHERE [e].[DepartmentId] IN (
SELECT [d].[Id]
FROM [Departments] AS [d]
WHERE [d].[IsWfhAllowed] = CAST(1 AS bit))
Here the subquery no longer references the outer table, meaning it can be evaluated once, yielding massive performance improvements on most database systems.
Note: On Microsoft SQL Server, the database can optimize the first query to the second query so that the performance is likely the same.
Hope this helps.
Happy Coding.
Regards,
Jaliya
No comments:
Post a Comment