LINQ vs SQL in Ignite.NET: Performance
Ignite.NET offers a LINQ provider which translates C# expressions to SQL queries. LINQ has many benefits over SQL, but at what cost?
A detailed explanation of Ignite.NET distributed queries can be found in Getting Started With Apache Ignite.NET Part 3: Cache Queries. If you are new to Ignite, please read that first.
Let’s get straight to the results!
Method            | Median      | StdDev    |
------------------|-------------|-----------|
QueryLinq         | 175.8261 us | 9.9202 us |
QuerySql          |  62.2791 us | 5.4908 us |
QueryLinqCompiled |  57.9274 us | 3.1307 us |
Code is at github.com/ptupitsyn/IgniteNetBenchmarks.
This is a comparison of equivalent queries via SQL, LINQ and Compiled LINQ.
The query is very simple (select Age from SqlPerson where (SqlPerson.Id < ?)) and the data set is very small (40 items, 20 returned), which exposes the LINQ overhead better.
We can see right away that plain LINQ is almost three times slower than raw SQL (about 176 us vs 62 us), while compiled LINQ is slightly faster than raw SQL (about 58 us). Note that the results are in microseconds: real-world queries may take tens or even hundreds of milliseconds, so the LINQ overhead of roughly 0.1 ms will be hardly noticeable.
Anyway, how can we explain these results? Why is compiled LINQ faster than raw SQL?
How Ignite LINQ Works
ICache<int, SqlPerson> cache = ignite.GetCache<int, SqlPerson>("persons");
IQueryable<int> qry = cache.AsCacheQueryable().Select(x => x.Value.Age);
IList<int> res = qry.ToList();  // standard LINQ materialization; IQueryable has no GetAll()
If we run the above code in the Visual Studio debugger and inspect the qry variable, we can see what happened: the compiler has translated .Select(x => x.Value.Age) to an Expression Tree and passed it to the Ignite LINQ provider, which turned it into a regular Ignite.NET SqlFieldsQuery. Expression tree processing is not free; that's where the overhead comes from.
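To make the "expression tree" step more concrete, here is a minimal sketch that uses only the standard System.Linq.Expressions API and no Ignite at all. The Person class and the SelectedMember helper are hypothetical names made up for this illustration; the real Ignite LINQ provider does far more work than this, which is exactly why it costs time.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Hypothetical stand-in for the article's SqlPerson type.
    public class Person
    {
        public int Age { get; set; }
    }

    // Because the target type is Expression<Func<...>>, the compiler emits
    // a data structure that describes the lambda instead of executable code.
    public static readonly Expression<Func<Person, int>> Selector = p => p.Age;

    // A LINQ provider has to walk this structure at run time to find out
    // what was requested. Here we only extract the selected member name;
    // the cast fails if the lambda body is not a simple member access.
    public static string SelectedMember(Expression<Func<Person, int>> selector)
    {
        var member = (MemberExpression) selector.Body;
        return member.Member.Name;
    }
}
```

Calling ExpressionTreeDemo.SelectedMember(ExpressionTreeDemo.Selector) returns "Age", which is the kind of information a provider needs in order to emit a column name in SQL. Every uncompiled LINQ query repeats this kind of inspection on each execution.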
We can get that SqlFieldsQuery and run it manually:
IQueryable<int> qry = cache.AsCacheQueryable().Select(x => x.Value.Age);
SqlFieldsQuery fieldsQry = ((ICacheQueryable) qry).GetFieldsQuery();
IQueryCursor<IList> res = cache.QueryFields(fieldsQry);
However, LINQ produces a typed IQueryable<int>, while QueryFields returns an untyped IQueryCursor<IList>. How is this achieved?
You may think that the LINQ engine iterates over the IQueryCursor returned from QueryFields and populates a List<int>, but it is more clever than that.
There is a hidden API, ICacheInternal, which has an IQueryCursor<T> QueryFields<T>(SqlFieldsQuery qry, Func<IBinaryRawReader, int, T> readerFunc) method.
SQL engine returns query results as a raw memory stream where field values are written one after another.
So for the query above with one int field, the LINQ engine will produce the following code:
var cacheInt = (ICacheInternal) cache;
var fieldQry = new SqlFieldsQuery("SELECT Age from SqlPerson");
Func<IBinaryRawReader, int, int> readerFunc = (reader, fieldCount) => reader.ReadObject<int>();
IQueryCursor<int> cur = cacheInt.QueryFields(fieldQry, readerFunc);
This code produces zero extra allocations and zero type casts while reading query results. That is where the LINQ advantage comes from: it is aware of the resulting data types and can generate specialized deserialization code, while a regular SQL query reads all field values as objects, which causes excessive allocations (an IList for each row, boxing of value types) and requires type casting.
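The difference between the two read paths can be sketched without Ignite. In the example below, BinaryReader stands in for IBinaryRawReader, and WriteRows, ReadUntyped and ReadTyped are hypothetical helpers invented for this illustration; they are not Ignite APIs and only mimic the shape of the readerFunc approach described above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;

public static class TypedReaderDemo
{
    // Writes one int field per row, back to back, mimicking a raw result stream.
    public static byte[] WriteRows(params int[] ages)
    {
        using (var ms = new MemoryStream())
        using (var writer = new BinaryWriter(ms))
        {
            foreach (var age in ages) writer.Write(age);
            writer.Flush();
            return ms.ToArray();
        }
    }

    // Untyped path: every row becomes an IList and every int is boxed.
    public static List<IList> ReadUntyped(byte[] data, int rowCount)
    {
        var rows = new List<IList>(rowCount);
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            for (var i = 0; i < rowCount; i++)
                rows.Add(new object[] { reader.ReadInt32() });
        }
        return rows;
    }

    // Typed path: the caller supplies a function that knows the row shape,
    // so values go straight from the stream into a List<T>.
    public static List<T> ReadTyped<T>(byte[] data, int rowCount,
        Func<BinaryReader, int, T> readerFunc)
    {
        var rows = new List<T>(rowCount);
        using (var reader = new BinaryReader(new MemoryStream(data)))
        {
            for (var i = 0; i < rowCount; i++)
                rows.Add(readerFunc(reader, 1)); // 1 = field count per row
        }
        return rows;
    }
}
```

With the typed path, a call such as TypedReaderDemo.ReadTyped(data, 3, (reader, fieldCount) => reader.ReadInt32()) yields a List<int> directly. With the untyped path, the consumer has to write (int) row[0] for every row, paying for an array allocation, a boxing allocation and an unboxing cast each time.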
LINQ is not only much nicer to work with than SQL, it can also be on par or faster when used properly! Just don’t forget to use CompiledQuery when on a hot path.