Fundamentals of LINQ
(Language-Integrated Query)
Octavio Hernandez
About the Author
Octavio Hernandez currently lives and works in Santa Clarita,
California.
He is a seasoned developer with many years of experience with
Microsoft technologies. He is also the author or co-author of
several books.
From 2004 to 2010 he was distinguished by Microsoft as a
Visual C# MVP.
Notice of Liability
The author and publisher have made every effort to ensure the accuracy of the
information herein. However, the information contained in this book is sold without
warranty, either express or implied. Neither the authors and Krasis Consulting S.L., nor
its dealers or distributors, will be held liable for any damages to be caused either directly
or indirectly by the instructions contained in this book, or by the software or hardware
products described herein.
Trademark Notice
Rather than indicating every occurrence of a trademarked name as such, this book
uses the names only in an editorial fashion and to the benefit of the trademark owner
with no intention of infringement of the trademark.
www.campusmvp.net
CHAPTER
1
Fundamentals of LINQ
Note: In this chapter you will see references to C# 3.0. This is not a misprint even
though we presently have a later version of the language. The 3.0 version is cited
because, in reality, LINQ has its origins in it and not in the current version of the
language.
1.- INTRODUCTION
Language-Integrated Query (the source of the LINQ acronym) is one of the most
significant improvements in recent years in the field of programming languages
and systems. It plays an essential role in the goal to minimize the effects of the
phenomenon known as impedance mismatch. Impedance mismatch
imposed the need upon us to use not only our ordinary programming language
when developing applications, but also a whole series of other different
languages to access a wide variety of data sources, such as SQL or
XPath/XQuery, whose syntactic constructs are currently embedded literally
inside the C# or Visual Basic code.
Thanks to LINQ (an integral part of the new versions of these languages
from .NET Framework 3.5 onwards), we can really start to create applications
that contain only .NET code, using a clear and natural syntax to access those
data sources, leaving the task of generating those foreign constructions to the
compiler and the support libraries.
When expressing queries, the developer will be able to make use of all the
benefits that the compiler and the integrated environment offer (syntax and type
checking by the compiler, IntelliSense help within the environment, metadata
access by both).
Although LINQ will not eliminate it completely, it will indeed allow us to
greatly reduce said phenomenon of impedance mismatch produced by the
differences between programming models proposed by the general purpose
languages and the query languages for relational databases (SQL),
XML documents (XPath/XQuery) and other data sources.
Finally, another important advantage of LINQ is that it raises the
level of abstraction and clarity when programming queries. For example, a
developer who needs to access a database nowadays must set out a meticulous
plan specifying how to retrieve the data they need. Query expressions, on the
other hand, are a much more declarative tool that largely allows us to just
indicate what we want to obtain, leaving the details on how to achieve this
purpose to the expression evaluation engine.
The other key element in the LINQ technology is its open architecture, which
makes extensibility possible. The semantics of operators in query expressions is
in no way hardwired into the language, so it can be modified (or extended) by
Microsoft or third party libraries in order to access specific data sources. Apart
from the standard mechanism to apply LINQ to arrays and generic collections
(LINQ to Objects), Microsoft itself provides us with at least four other
technologies: on the one hand, LINQ to XML (to execute integrated queries to
XML documents) and LINQ to DataSets (to execute queries in LINQ style against
typed and untyped datasets); on the other, LINQ to SQL and LINQ to Entities,
which make it possible to make integrated queries against relational databases.
But the architecture of the tools that are available to us through LINQ is such
that it has allowed the emergence of technologies (providers) which simplify
and homogenize access to many other data sources: LINQ to Amazon and LINQ
to SharePoint are two of the most powerful examples that can be found on the net.
The architecture of LINQ can be graphically described as follows:
Query expressions are the mechanism by which LINQ technology comes to life. They
are simply expressions that respond to a new syntax that has been added to C# 3.0 and
Visual Basic 9.0, and they can act on any object implementing the generic interface
IEnumerable<T> (in particular, arrays and collections of .NET 2.0 and above implement
this interface), transforming it into other objects using a set of operators; generally (but
not always), objects that implement that same interface.
Query expressions rely, in turn, on other new features included into C# 3.0 and
Visual Basic 9.0, such as:
using System;

namespace CampusMVP.Classes
{
    public class Country
    {
        public string Code { get; set; }
        public string Name { get; set; }
        // ...
    }

    public class Person
    {
        // ...

        #region Constructors
        public Person() { }

        public Person(string name, string country) : this()
        {
            this.Name = name;
            this.CodeCountryOfBirth = country;
        }
        #endregion

        #region Methods
        public override string ToString()
        {
            return
                (Name == null ? "???" : Name) +
                (CodeCountryOfBirth == null ? " (??)" :
                    " (" + CodeCountryOfBirth + ")") +
                (Gender == null ? "" :
                    (Gender.Value == Gender.Female ? " (F)"
                                                   : " (M)")) +
                (DateOfBirth == null ? "" :
                    " (" + DateOfBirth.Value.ToString("dd/MM/yyyy") +
                    ")");
        }
        #endregion
    }
}
Additionally, the Data class introduced below defines static properties of some test
data:
using System;
using System.Collections.Generic;

namespace CampusMVP.Classes
{
    public static class Data
    {
        public static List<Country> Countries = new List<Country> {
            new Country("ES", "SPAIN"),
            new Country("CU", "CUBA"),
            new Country("RU", "RUSSIA"),
            new Country("US", "UNITED STATES")
        };

        public static List<Person> People = new List<Person> {
            // ...
            new Person {
                // ...
                CodeCountryOfBirth = "CU",
                DateOfBirth = new DateTime(1982, 8, 12),
                Gender = Gender.Female,
            }
        };
    }
}
Suppose that, based on the above data, we want to obtain a collection with
the names and ages of the people appearing in the original list who are older
than 20, with their names converted to uppercase. The objects in the resulting
collection must also appear alphabetically.
If you are not familiar with the wonders of integrated queries yet, the first
thing that will come to your mind will probably be to search for a mechanism
to traverse through the original collection, creating new objects that contain
the characteristics required from the people who meet the requirements, and
then making another pass to order these objects alphabetically by their
names. Thanks to LINQ, this procedure becomes anachronistic.
In C# 3.0, we have a much clearer and more elegant way of obtaining that result in a
single shot:
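using System;
using System.Linq;

namespace Demo1
{
    using CampusMVP.Classes;

    class Program
    {
        static void Main(string[] args)
        {
            var older20 = from h in Data.People
                          where h.Age > 20
                          orderby h.Name
                          select new
                          {
                              Name = h.Name.ToUpper(),
                              Age = h.Age
                          };

            // Iterate over the resulting collection (output format assumed)
            foreach (var p in older20)
                Console.WriteLine(p.Name + " (" + p.Age + ")");
        }
    }
}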
The semantics of the assignment statement that contains the integrated query
will be intuitively clear for anyone familiar with the SELECT statement of
SQL. What is less usual is that the select clause, where we specify what we
want to obtain, is located at the end, unlike in SQL. The reason for this change
in position is quite logical: if select were at the beginning, it would be impossible to
offer IntelliSense help to the developer when typing the data to be selected, because
at that point they have not yet specified the data collection on which the query
will be executed. However, if we have the from clause in first place, it is easy
for the integrated environment to help the developer while they are typing the
expression all along: if Data.People is of the List<Person> type (and hence
IEnumerable<Person>), then h (the name chosen by us for the variable that will
successively refer to each of the elements in the sequence) is of the Person type,
and the system will be able to determine whether any of the expressions
appearing in the where, orderby, etc. clauses is correct or not.
In the statement, we make use of three features that appeared with C# 3.0:
Anonymous types: since the structure of the desired set of results does
not match the structure of the original type of the elements, in the select
clause of the expression we define ad hoc a new data type, which will be
generated by the compiler automatically. The result produced by the query
expression is a collection of objects of this anonymous type.
Anonymous types themselves rely on object initializers, which allow us
to assign initial values to the fields of the anonymous type objects.
Finally, the object resulting from the execution of the query is assigned to a
variable whose type is automatically inferred. Although in some cases
automatic determination of the type of a local variable is merely convenient,
when combined with anonymous types its use is simply indispensable. This
feature is also used later when declaring the variable used to iterate the resulting
collection with the foreach loop.
Next we will see how two other new features of C# 3.0 activate behind the scenes:
lambda expressions and extension methods.
What does the compiler do when it encounters a query expression? The rules of
C# 3.0 stipulate that, before being compiled, any query expression appearing in the
source code is mechanically transformed (rewritten) into a sequence of calls to methods
with predetermined names and signatures. These methods are known as query
operators. The expressions in each of the clauses that make up the query expression are
also rewritten to adapt them to the requirements of those predetermined methods. Before
actually compiling it, the compiler will translate the query expression of our previous
example into the following:
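As a sketch of that rewriting, the program below places the query expression next to the cascade of method calls it is equivalent to. The Person class and the sample data are simplified stand-ins (assumptions made for this illustration), not the Data class used elsewhere in the chapter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the chapter's Person class (an assumption).
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
}

public static class TranslationSketch
{
    // Hypothetical sample data, used only for this illustration.
    public static readonly List<Person> People = new List<Person>
    {
        new Person { Name = "Dora", Age = 25 },
        new Person { Name = "Bob", Age = 17 },
        new Person { Name = "Ana", Age = 40 }
    };

    // The query expression, as the developer writes it.
    public static List<string> WithQuerySyntax()
    {
        var older20 = from h in People
                      where h.Age > 20
                      orderby h.Name
                      select new { Name = h.Name.ToUpper(), Age = h.Age };
        return older20.Select(p => p.Name + " (" + p.Age + ")").ToList();
    }

    // The equivalent chain of calls to query operators.
    public static List<string> WithMethodSyntax()
    {
        var older20 = People
            .Where(h => h.Age > 20)
            .OrderBy(h => h.Name)
            .Select(h => new { Name = h.Name.ToUpper(), Age = h.Age });
        return older20.Select(p => p.Name + " (" + p.Age + ")").ToList();
    }

    public static void Main()
    {
        foreach (string line in WithMethodSyntax())
            Console.WriteLine(line);
    }
}
```

Both methods return the same sequence, since the first form is rewritten into the second before compilation.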
The where clause of the query expression becomes a call to a method named Where().
In order to pass it to that method, the expression that accompanied where, i.e.
h.Age > 20
is transformed into a lambda expression that produces true or false for a person:
h => h.Age > 20
Hopefully, you will find this plainly logical: if the Where() method is to have a
general nature, that is, if it is to be able to work for any condition that a developer might
need, it should hence receive a delegate as a parameter, and this delegate would be
pointing to a Boolean method which would check the condition to be met. Lambda
expressions are just that: a more practical way to specify anonymous delegates.
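To make the relationship between delegates and lambda expressions concrete, the following minimal sketch builds the same condition in three ways: from a named Boolean method, from an anonymous method, and from a lambda expression. For brevity it works on plain int ages rather than on Person objects, and every name in it is hypothetical.

```csharp
using System;

public static class LambdaSketch
{
    // A named Boolean method that checks the condition.
    private static bool IsOlderThan20(int age)
    {
        return age > 20;
    }

    // Three ways of creating the same kind of delegate.
    public static readonly Func<int, bool> FromNamedMethod = IsOlderThan20;
    public static readonly Func<int, bool> FromAnonymousMethod =
        delegate(int age) { return age > 20; };
    public static readonly Func<int, bool> FromLambda = age => age > 20;

    public static void Main()
    {
        foreach (int age in new[] { 15, 20, 25 })
        {
            Console.WriteLine("{0}: {1} {2} {3}", age,
                FromNamedMethod(age), FromAnonymousMethod(age), FromLambda(age));
        }
    }
}
```

All three delegates behave identically; the lambda form is simply the most compact way to write them.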
Once the above is understood, our next question is: what is (or should be)
the signature of this Where() method into which the where clause translates? Where
should the method be located? Note that, after the first step of rewriting our example query
expression, we would have this:
Data.People.Where(h => h.Age > 20)
In order for our call to be valid, Where() must be (a) an instance method of the
class to which People belongs, or (b) a method of an interface implemented by
the class to which People belongs (and since we started with the premise that
objects that can serve as the source of integrated queries must implement
IEnumerable<T>, this interface would be a strong candidate to be extended with
methods such as Where(), etc.).
Either of the two aforementioned paths could work, but the creators of C# 3.0
considered that the resulting architectures would not be as open
and extensible as desired. Thus, at this point in the performance, extension methods make
their appearance. With extension methods at our disposal, Where() (like
OrderBy(), Select() and the other actors) can also be an extension method of
IEnumerable<T>, potentially defined in any static class that is in
scope when the query expression is compiled.
To check the previous theory, do this small experiment: In the source code of
the example, comment out the line at the top of the file that reads:
using System.Linq;
You will see that the program stops compiling (analyze the error message in detail:
Could not find an implementation of the query pattern for source type List<Person>.
Where not found.). The reason is that when you commented out the using directive, the
compiler was deprived of the definitions of the extension methods Where(), OrderBy(),
Select(), etc. (which are generically known as query operators) into which the
different clauses of the integrated query are translated. The default implementations of
these operators are held in a static class that is fittingly called System.Linq.Enumerable.
As you will have concluded, query expressions are pure syntactic sugar. In the
previous section we saw how these expressions are mechanically translated into a
sequence of calls to methods, following a set of predefined rules in the language
specification. We have also seen how, when we eliminate an import of the namespace,
the code containing a query expression stops compiling. In relation to that, we should
emphasize that if we put in scope another namespace containing the static classes with
the relevant definitions of those extension methods, the query expression will compile
again without any problems, and it will use the new set of methods-operators for its
execution. This is what the open architecture of LINQ consists in: C# does not define
specific semantics for the operators implementing query expressions, and anyone can
create one or more classes with custom implementations of query operators for generic
or specific collections and plug them into the system by putting them in scope so that
they are used when compiling the integrated queries on sequences of those types. In fact,
this is the path through which the predefined extensions of LINQ, such as LINQ to XML
or LINQ to Entities, are integrated in the language, as well as the path through which
third parties can develop proprietary providers.
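A custom operator of this kind could look like the following sketch. It is an illustration under stated assumptions, not the author's listing: the MyExtensions class, the simplified Person type and the sample data are all hypothetical. The custom Where() operates on IEnumerable<Person> and, besides applying the predicate, lets only females through.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;   // still supplies OrderBy(), Select() and ToList()

namespace CustomOperatorDemo
{
    public enum Gender { Female, Male }

    // Simplified stand-in for the chapter's Person class (an assumption).
    public class Person
    {
        public string Name { get; set; }
        public int Age { get; set; }
        public Gender Gender { get; set; }
    }

    // Hypothetical class holding a custom Where() operator.
    public static class MyExtensions
    {
        public static IEnumerable<Person> Where(
            this IEnumerable<Person> source, Func<Person, bool> predicate)
        {
            foreach (Person p in source)
                if (p.Gender == Gender.Female && predicate(p))
                    yield return p;
        }
    }

    public static class Program
    {
        public static List<string> OlderThan20()
        {
            var people = new List<Person>
            {
                new Person { Name = "Olga", Age = 52, Gender = Gender.Female },
                new Person { Name = "Bob", Age = 45, Gender = Gender.Male },
                new Person { Name = "Eva", Age = 18, Gender = Gender.Female },
                new Person { Name = "Ana", Age = 30, Gender = Gender.Female }
            };

            // The where clause binds to MyExtensions.Where(), because that
            // class lives in the namespace that encloses this query.
            var query = from h in people
                        where h.Age > 20
                        orderby h.Name
                        select h.Name;
            return query.ToList();
        }

        public static void Main()
        {
            foreach (string name in OlderThan20())
                Console.WriteLine(name);
        }
    }
}
```

Bob is older than 20 but is dropped by the custom operator, while the orderby and select clauses still resolve to the standard implementations.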
If you now compile and execute the query which selects those older than 20 years,
you will verify that our newly created method is used instead of the standard query
operator, and thus there will only be females in the result. Because this query operates
on an object of the IEnumerable<Person> type, our method takes precedence over the
method of the System.Linq.Enumerable class. And what happens with OrderBy(),
Select() and the rest? Well, we have not defined those methods, but
IEnumerable<Person> is compatible with IEnumerable<T>, and therefore the
implementations of those methods located in the base class library will be used.
One final aspect to bear in mind: What if we had defined the Where() method so that
it operated on IEnumerable<T>, just like in the default implementation? The answer is
that our version would be likewise used, because it is located in the same namespace as
the class where the query is executed. The call resolution algorithm of the compiler
searches the namespaces of our code from inside out, and only if it does not find anything
this way does it use the methods that it finds in static classes belonging to other
namespaces in scope.
merely prepares the enumerator objects needed. The result of a query will not be really
obtained until iteration takes place on it. This evaluation on demand, also known as
lazy or deferred evaluation is the default behavior of LINQ. We must always take into
account a possible collateral effect of this: two successive evaluations of one same query
can produce different results if there are changes in the underlying source of information
between them. In spite of this, deferred execution is the best option in most cases.
Nevertheless, sometimes we may want to completely cache the result of a query in
memory for its later reuse. For this purpose, we have the standard query operators
ToArray(), ToList(), ToLookup() and ToDictionary(). For example, we can obtain the
results of the previous query in one go using the statement:
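For the query of the running example, such a statement would presumably be a call like older20.ToList(). The sketch below, which uses a hypothetical list of ages instead of the chapter's data, contrasts a cached result with a deferred query that is evaluated again after the source changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredSketch
{
    // Returns { size of the cached result, size of the re-evaluated query }.
    public static int[] Run()
    {
        var ages = new List<int> { 15, 25, 35 };

        // Nothing is evaluated here: the query is only prepared.
        IEnumerable<int> query = ages.Where(a => a > 20);

        // ToList() iterates over the query now and caches the result.
        List<int> cached = query.ToList();

        // The source changes after the snapshot was taken.
        ages.Add(45);

        // Count() iterates over the query again and sees the new element.
        return new[] { cached.Count, query.Count() };
    }

    public static void Main()
    {
        int[] sizes = Run();
        Console.WriteLine("cached: " + sizes[0] + ", deferred: " + sizes[1]);
    }
}
```

The cached list keeps the two elements that matched when ToList() ran, whereas the deferred query reports three.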
By now it should be clear that, potentially, we can make the methods implemented
by query operators do whatever we want as long as they comply with the signatures
required by the compiler. However, we are meant to associate a functionality to them
that is in keeping with what is generally expected from them. The Where() method, for
example, is supposed to filter the input sequence, leaving only the elements satisfying
the specified condition for output. OrderBy(), on its part, is to collect the input sequence
and produce another one containing the same elements as the original one, but in
ascending order according to a certain criterion.
Still on Where(), by now we are completely acquainted with the signature of the
method, at least in its main overload. If it is implemented as an extension method, it
receives the input sequence (marked with this) as its first argument, and the second
parameter is a delegate to a function that receives a T and returns a bool. The type of the
return value is IEnumerable<T>, as you can conclude from our last example of code: It is
easy to realize it, considering that the result produced by Where() is going to serve as an
input for OrderBy(), Select() or one of the other methods in the cascade of calls that is
generated as a result of rewriting.
The main overload of the standard query operator Where() is implemented like this:
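The following is a minimal sketch of such an implementation, not the library source: the names MyWhere and WhereIterator are hypothetical, chosen to avoid clashing with the real operator. The argument checks live in a separate non-iterator method so that they run when the operator is called rather than when iteration starts.

```csharp
using System;
using System.Collections.Generic;

public static class WhereSketch
{
    public static IEnumerable<T> MyWhere<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        // Validate the arguments eagerly.
        if (source == null) throw new ArgumentNullException("source");
        if (predicate == null) throw new ArgumentNullException("predicate");
        return WhereIterator(source, predicate);
    }

    private static IEnumerable<T> WhereIterator<T>(
        IEnumerable<T> source, Func<T, bool> predicate)
    {
        // Produce only the elements that satisfy the condition.
        foreach (T element in source)
            if (predicate(element))
                yield return element;
    }

    public static void Main()
    {
        int[] ages = { 10, 30, 20, 40 };
        foreach (int age in ages.MyWhere(a => a > 20))
            Console.WriteLine(age);
    }
}
```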
The method first checks the validity of the input arguments. Next, it enters a loop
which iterates through the input sequence, and for each of its elements it calls the
predicate to check whether the element meets the condition or not. Only if the element
meets the condition does the method produce it in the output sequence. In our example,
where we wanted people older than 20, that output sequence would be, in turn, the input
sequence for the OrderBy() operator.
As another example, look at how the Select() operator is implemented:
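Again as a sketch rather than the library source (MySelect and SelectIterator are hypothetical names), a projection operator can be written along these lines:

```csharp
using System;
using System.Collections.Generic;

public static class SelectSketch
{
    public static IEnumerable<TResult> MySelect<TSource, TResult>(
        this IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        // Validate the arguments eagerly.
        if (source == null) throw new ArgumentNullException("source");
        if (selector == null) throw new ArgumentNullException("selector");
        return SelectIterator(source, selector);
    }

    private static IEnumerable<TResult> SelectIterator<TSource, TResult>(
        IEnumerable<TSource> source, Func<TSource, TResult> selector)
    {
        // Transform each input element into an output element.
        foreach (TSource element in source)
            yield return selector(element);
    }

    public static void Main()
    {
        string[] names = { "Ana", "Boris" };
        foreach (int length in names.MySelect(s => s.Length))
            Console.WriteLine(length);
    }
}
```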
Here, the type of the elements of the resulting sequence results from the return type
of the selection or transformation expression used.
The default implementation (in the System.Linq.Enumerable class of
System.Core.dll) of a set of extension methods including Where(), Select(), OrderBy()
and some others, which can be used to execute integrated queries against any enumerable
sequence of objects in memory, is known as LINQ to Objects, and these methods are
called standard query operators.
Not all standard query operators are reflected in the language syntax. For example,
there is an operator called Reverse() which produces the elements of its sequence in
reverse order (from last to first). However, there is no syntactic mapping in the
C# language for this operator. When we need it, we will have to use it with the customary
method call notation:
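For instance, in the following sketch (with hypothetical sample data) Reverse() is applied to a parenthesized query expression using ordinary method call syntax:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReverseSketch
{
    public static List<string> Run()
    {
        string[] names = { "Ana", "Bob", "Gustavo", "Eva" };

        // There is no query clause for Reverse(), so it is called
        // explicitly on the result of the query expression.
        var reversed = (from n in names
                        where n.Length == 3
                        select n).Reverse();
        return reversed.ToList();
    }

    public static void Main()
    {
        foreach (string name in Run())
            Console.WriteLine(name);
    }
}
```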
Also, not even all the overloads of the same query operator are reflected in the syntax
of C# query expressions, but just some of them. For example, the Where() operator has
two overloads, but only one (the one presented above) is used for rewriting query
expressions.
The subset of standard query operators of C# 3.0 on which the syntax of query
expressions depends directly (and which any LINQ extension vendor should therefore support)
makes up what is known as the query expression pattern or LINQ pattern: a
specification of the set of methods (subset of the set of standard query operators) which
must be available in order to ensure full support for integrated queries.
      | <join clause>
      | <join-into clause>
      | <let clause>
      | <where clause>
      | <orderby clause>)
<orderings> ::=
      <ordering>
      | <orderings> , <ordering>
<ordering> ::=
      <key expr.> (ascending | descending)?
<continuation> ::=
      into <element> <query body>
Meta-language:
      *             - zero or more times
      ( ... | ... ) - alternative
      ?             - optional element
Basically, a query expression always begins with a from clause, where the source
collection on which the query will be executed is specified. Next, there may be one or
more from, join, let, where or orderby clauses, and finally a select or group by clause.
Optionally, at the end of the expression there may be a continuation clause, which begins
with the reserved word into and is followed by the body of another query. Remember
that all keywords used here are contextual: they only have a special meaning within
query expressions.
We will next show examples of use of the different syntactic elements of query
expressions.
The following table lists the available standard query operators, grouped by category.
We have first highlighted the basic operators, for which the syntax of query
expressions offers a special clause (where, orderby, select, group, join) and which have
some overloads that are part of the aforementioned query expression pattern, which is
precisely defined in the specification document of C# 3.0. For the rest of standard
operators there is no direct linguistic support in C# 3.0, and to use them we will have to
employ the specific syntax for method calls. Note that, although the basic operators and
many of the non-basic ones produce another sequence as a result, several of the remaining
operators produce scalar values as results, which means that
they must always be placed at the end of the chain of method calls.
Aggregate operators

Operator                 Description
Count() / LongCount()    Returns the number of elements in the original sequence, or
                         the number of elements that satisfy a specified logical predicate.
Max() / Min()            Returns the maximum (or minimum) of the elements in the
                         original sequence.
Sum()                    Returns the sum of the elements in the original (numeric) sequence.
Average()                Returns the average of the elements in the original (numeric)
                         sequence.
Aggregate()              Returns the result of applying a specified aggregate function
                         to the elements in the original sequence.
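The aggregate operators listed above can be tried out with a sketch like the following, which applies each of them to a small, hypothetical array of ages. Since they all return scalar values, each call ends its chain.

```csharp
using System;
using System.Linq;

public static class AggregateSketch
{
    // Hypothetical sample data.
    public static readonly int[] Ages = { 18, 25, 32, 47 };

    // Returns the results of the aggregate operators, in order.
    public static double[] Summary()
    {
        return new double[]
        {
            Ages.Count(),                        // number of elements
            Ages.Count(a => a > 20),             // elements satisfying a predicate
            Ages.Max(),
            Ages.Min(),
            Ages.Sum(),
            Ages.Average(),
            Ages.Aggregate((acc, a) => acc + a)  // custom accumulation: here, a sum
        };
    }

    public static void Main()
    {
        foreach (double value in Summary())
            Console.WriteLine(value);
    }
}
```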
1.11.-Some examples
The purpose of this section is to present some examples of the things we can
accomplish using query expressions.
Basic examples
Given that any object implementing IEnumerable<T> can serve as the source of an
integrated query, and that this interface is implemented by objects as common as arrays,
generic collections and even character strings (which allows us to enumerate the
characters constituting them), it is clear that we can apply integrated queries in a large
number of everyday situations for which we previously used loops, counters and other
various techniques.
Now we will give some examples of query expressions applied to strings and arrays:
orderby w.ToLower()
select w.ToLower()).Distinct();
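As a self-contained sketch in the same spirit, the two methods below query a character string and an array of words. The method names and the input texts are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StringQuerySketch
{
    // A string is a sequence of characters, so it can be queried directly.
    public static string Vowels(string text)
    {
        var vowels = from c in text
                     where "aeiou".IndexOf(char.ToLowerInvariant(c)) >= 0
                     select c;
        return new string(vowels.ToArray());
    }

    // The words of a sentence, in lowercase, sorted and without repetitions.
    public static List<string> DistinctWords(string sentence)
    {
        var words = (from w in sentence.Split(' ')
                     orderby w.ToLower()
                     select w.ToLower()).Distinct();
        return words.ToList();
    }

    public static void Main()
    {
        Console.WriteLine(Vowels("Language"));
        foreach (string word in DistinctWords("The dog and the other dog"))
            Console.WriteLine(word);
    }
}
```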
Cartesian products
In the world of relational databases, a Cartesian product of two tables is simply the
set of rows resulting from combining each row from the first table with each row from
the second one. Here, the same concept can be applied to the combination of two
sequences: if we have two from clauses (which act as generators) one after the other, all
the elements of the second sequence will be produced for each element of the first
sequence. For example:
/* CARTESIAN PRODUCT */
var pc1 = from co in Data.Countries
          from pe in Data.People
          select new {
              co.Name,
              NamePerson = pe.Name
          };
Cartesian products are implemented through calls to the standard query operator
SelectMany(), which is in charge of producing a sequence in which each of the elements
of the first sequence is combined with each of the elements of the second one.
The main danger of Cartesian products is the combinatorial explosion of results that
they can produce. For this reason, it is generally recommended to avoid Cartesian
products whenever possible, perhaps by applying the following techniques.
If you analyze the previous query closely, you will agree that the following option is
better with regard to performance, because the elements of the first sequence that will
ultimately be discarded are eliminated earlier in the pipe of query operators executed:
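The effect can be sketched as follows. The data, the IsSpain() filter and the Checks counter are hypothetical; the counter only serves to show how many times the condition is evaluated in each variant.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CartesianSketch
{
    static readonly string[] Codes = { "ES", "CU", "RU", "US" };
    static readonly string[] Names = { "Ana", "Bob", "Eva" };

    // Number of times the condition has been evaluated.
    public static int Checks;

    static bool IsSpain(string code)
    {
        Checks++;
        return code == "ES";
    }

    // The condition is checked for every combined pair.
    public static List<string> FilterLate()
    {
        Checks = 0;
        var q = from co in Codes
                from pe in Names
                where IsSpain(co)
                select co + "-" + pe;
        return q.ToList();
    }

    // The condition is checked once per element of the first sequence.
    public static List<string> FilterEarly()
    {
        Checks = 0;
        var q = from co in Codes
                where IsSpain(co)
                from pe in Names
                select co + "-" + pe;
        return q.ToList();
    }

    public static void Main()
    {
        FilterLate();
        Console.WriteLine("late: " + Checks + " checks");
        FilterEarly();
        Console.WriteLine("early: " + Checks + " checks");
    }
}
```

Both variants return the same three pairs, but the first evaluates the condition once per pair (4 x 3 times) and the second only once per country code (4 times).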
Although the study of specific optimization techniques for LINQ queries falls beyond
the scope of this chapter, we have deemed it relevant to note this fact so that you can
take it into account when programming integrated queries. A completely different matter
is that an intelligent compiler could transform the first expression into the second one
in a way that was transparent to the developer. Future versions of the C# compiler will
probably do it, but not the current one.
Joins
Joins are another of the typical constructions of relational languages such as SQL that
have been added to C# 3.0 query expressions. A join is basically a Cartesian product on
two sequences which is limited to the tuples (t1, t2), where the value of a certain
expression applied to the element of the first sequence t1 is equal to the value of another
expression applied to the element of the second sequence t2. Basically, the point is to
drastically reduce the combinations that a full Cartesian product would produce, keeping
only the elements of the sequences that match according to a certain shared criterion.
For example, the query that we presented earlier which combines the names of
people and countries would be much better like this:
/* JOIN */
var enc1 = from co in Data.Countries
           join pe in Data.People
               on co.Code equals pe.CodeCountryOfBirth
           select new {
               co.Name,
               NamePerson = pe.Name
           };
The join condition comprises a key selector for the outer sequence, the contextual
keyword equals and another key selector for the inner sequence. The key selectors used
to compare fields can be any expression obtained based on the identifier representing the
element of the corresponding sequence.
The same result could have been obtained using a restricted Cartesian product, but
the performance of the join is much higher. The extension method Join() (which is
called to execute joins) is conceived for using hash tables, in a similar way as table
indexes are used in the world of databases.
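The equivalence between the three formulations (join clause, explicit call to Join() and restricted Cartesian product) can be checked with a sketch like this one. The sample data is hypothetical and uses anonymous types to stay short; it shows only that the results coincide, not the difference in performance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Returns the results of three equivalent formulations.
    public static List<string>[] Run()
    {
        // Hypothetical sample data.
        var countries = new[]
        {
            new { Code = "ES", Name = "SPAIN" },
            new { Code = "CU", Name = "CUBA" }
        };
        var people = new[]
        {
            new { Name = "Ana", Country = "ES" },
            new { Name = "Bob", Country = "CU" },
            new { Name = "Eva", Country = "ES" }
        };

        var viaJoinClause = from co in countries
                            join pe in people on co.Code equals pe.Country
                            select co.Name + ": " + pe.Name;

        // Outer key selector, inner key selector and result selector.
        var viaJoinMethod = countries.Join(people,
            co => co.Code,
            pe => pe.Country,
            (co, pe) => co.Name + ": " + pe.Name);

        var viaRestrictedProduct = from co in countries
                                   from pe in people
                                   where co.Code == pe.Country
                                   select co.Name + ": " + pe.Name;

        return new[]
        {
            viaJoinClause.ToList(),
            viaJoinMethod.ToList(),
            viaRestrictedProduct.ToList()
        };
    }

    public static void Main()
    {
        foreach (string line in Run()[0])
            Console.WriteLine(line);
    }
}
```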
Groups
The syntax of query expressions also supports the organization of the elements of a
sequence into groups according to the different values of a grouping key which is
calculated for each element. For example, the following statement
/* GROUP */
var groupsGender =
    from h in Data.People
    group new { h.Name, h.Age } by h.Gender;
groups the elements of the original sequence according to the different values of the
h.Gender expression. In this case, we will obtain a sequence of two elements, which will
in turn be sequences: the first one, with objects of an anonymous type which includes
the data requested (name and age) of all the females (objects for which the value of the
grouping key is Gender.Female); and the second one, with the data of all the males, for
which the value of h.Gender equals Gender.Male.
C# 3.0 translates the previous query expression into a call to the GroupBy() standard
operator. The result is a sequence, and each of its elements is, in turn, an inner sequence
associated to each group, implementing an interface called IGrouping<TKey, TElmt>
which inherits from IEnumerable<TElmt>. Basically, this interface adds a Key read-only
property, whose type is the type of the grouping key. The following loop shows the
structure of the result of the query:
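A loop of that kind can be sketched as follows, here over hypothetical data in which the gender is a plain string and only the name is selected for each group. The outer loop visits the groups, reading their Key property, and the inner loop visits the elements of each group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupSketch
{
    public static List<string> Describe()
    {
        // Hypothetical sample data.
        var people = new[]
        {
            new { Name = "Ana", Gender = "F" },
            new { Name = "Bob", Gender = "M" },
            new { Name = "Eva", Gender = "F" }
        };

        var groupsGender = from h in people
                           group h.Name by h.Gender;

        var lines = new List<string>();
        // Each element of the result is itself a sequence with a Key.
        foreach (IGrouping<string, string> g in groupsGender)
        {
            lines.Add("Gender " + g.Key);
            foreach (string name in g)
                lines.Add("  " + name);
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe())
            Console.WriteLine(line);
    }
}
```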
Below is an example which involves our two tables of people and countries. The
following statement allows us to group the people by their country of birth:
Console.WriteLine("GROUPS BY COUNTRY");
var groupsCountries =
    from co in Data.Countries
    join pe in Data.People
        on co.Code equals pe.CodeCountryOfBirth
    group new { pe.Name, pe.Age }
        by co.Name;
Note the similarity between this last query expression and the one presented in our
previous section on joins: the difference lies in the presence of the group...by clause
instead of select. These two clauses are precisely final clauses in the syntax of query
expressions.
Continuations
If you execute the previous grouping, you will see that the different groups appear in
the resulting sequence in the same order as the countries appear in the original sequence.
What if we wanted to obtain the groups in alphabetical order of the countries? We could
resort to explicit syntax:
var groupsCountries2 =
    (from co in Data.Countries
     join pe in Data.People
         on co.Code equals pe.CodeCountryOfBirth
     group new { pe.Name, pe.Age }
         by co.Name).OrderBy(g => g.Key);
Alternatively, a continuation (an into clause followed by the body of another query) lets
us express the same thing without leaving the syntax of query expressions:
var groupsCountries3 =
    from co in Data.Countries
    join pe in Data.People
        on co.Code equals pe.CodeCountryOfBirth
    group new { pe.Name, pe.Age } by co.Name
        into tmp
        orderby tmp.Key
        select tmp;
In practice, continuations are especially useful for processing the results produced by
a group...by clause. Observe the following example, similar to the previous one:
var summaryCountries =
    from co in Data.Countries
    join pe in Data.People
        on co.Code equals pe.CodeCountryOfBirth
    group new { pe.Name, pe.Age } by co.Name
        into tmp
        orderby tmp.Count() descending
        select new {
            Name = tmp.Key,
            Number = tmp.Count()
        };
This query produces an ordered sequence of objects with two properties: the name of
the country and the number of people born in that country.
Grouped joins
The second and most important application of the into clause has the purpose of
implementing what is known as grouped joins. This is a type of join that has no direct
equivalent in the world of relational databases. Instead of producing the typical sequence
of pairs yielded by a normal join, it produces a sequence where each element of the first
sequence is paired up with the group of elements of the second one whose matching key
values correspond to the matching key value of the element of the outer sequence. A
join...into construction translates into a call to the GroupJoin() standard operator,
which is based, like Join(), on the use of hash tables.
For example, a more concise way to obtain the list of countries with the number of
people in each country, similar to the one in the previous section, would have been this
one:
var summaryCountries2 =
    from co in Data.Countries
    orderby co.Name
    join pe in Data.People
        on co.Code equals pe.CodeCountryOfBirth
        into gp
    select new {
        Country = co.Name,
        Number = gp.Count()
    };
This query expression is equivalent to the following explicit call to GroupJoin():
var summaryCountries3 =
    Data.Countries.OrderBy(co => co.Name).
        GroupJoin(Data.People,
            co => co.Code,
            pe => pe.CodeCountryOfBirth,
            (c, gp) => new { Country = c.Name,
                             Number = gp.Count() });
Note that, as opposed to our previous query, the countries where nobody has been
born will also be included in the result this time.
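That difference can be observed with a sketch such as the following, in which the sample data (hypothetical, like the method names) includes a country where nobody was born:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupJoinSketch
{
    // Returns { summary via join + group, summary via grouped join }.
    public static List<string>[] Run()
    {
        // Hypothetical sample data: nobody was born in RUSSIA.
        var countries = new[]
        {
            new { Code = "ES", Name = "SPAIN" },
            new { Code = "CU", Name = "CUBA" },
            new { Code = "RU", Name = "RUSSIA" }
        };
        var people = new[]
        {
            new { Name = "Ana", Country = "ES" },
            new { Name = "Bob", Country = "CU" },
            new { Name = "Eva", Country = "ES" }
        };

        // Normal join followed by grouping: countries without people vanish.
        var joinedThenGrouped =
            from co in countries
            join pe in people on co.Code equals pe.Country
            group pe by co.Name into g
            select g.Key + "=" + g.Count();

        // Grouped join: every country appears, possibly with an empty group.
        var groupJoined =
            from co in countries
            join pe in people on co.Code equals pe.Country into gp
            select co.Name + "=" + gp.Count();

        return new[] { joinedThenGrouped.ToList(), groupJoined.ToList() };
    }

    public static void Main()
    {
        foreach (string line in Run()[1])
            Console.WriteLine(line);
    }
}
```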
1.12.-LINQ extensions
As we have already said, query operators do not have predetermined semantics. Their
semantics are instead plugged in at compile time depending on the type of the data
feed to which the query is applied and the sets of extension methods that are in scope at
that moment. The following table shows the assemblies that contain the extension classes
and the namespaces to which these classes belong, for each of the extensions of LINQ
available as standard in .NET Framework 3.5 and above:
Table 2.- LINQ extensions
As you can see, LINQ providers can be classified into two very different categories:
local providers and remote providers. The following table briefly summarizes the main
differences between them.
Table 3.- Differences between local and remote providers
Local providers are those which operate on data sources available in the memory of
the computer where the application is executed. Apart from LINQ to Objects, in this
category we also find LINQ to XML (which allows querying XML documents loaded as
trees of nodes in memory) and LINQ to DataSets (intended for querying in-memory
ADO.NET datasets). Since they actually operate on sequences (which implement
IEnumerable<T>), such providers generally rely on the implementations of the standard
query operators provided by LINQ to Objects. For example, if you use the Object
Browser to examine the System.Xml.Linq.dll assembly, which contains the
implementation of LINQ to XML, you will not find any class having extension methods
called Where(), Select(), etc. LINQ to XML relies on LINQ to Objects. Therefore, to
program an integrated query against a LINQ to XML document, it will not only be
necessary to import the System.Xml.Linq namespace, but also System.Linq.
will compile correctly regardless of whether the source of the query (SourcePeople) is
IQueryable<Person> or simply IEnumerable<Person>. In the case of query expressions
of LINQ to SQL, LINQ to Entities and other providers based on IQueryable<T>, the
implementations of the query operators do not receive, together with the input sequence,
delegates to the functions to be called when appropriate, but rather expression trees
that describe what those functions do.
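The distinction between a delegate and an expression tree can be seen in a minimal sketch. The same lambda is assigned once to a Func<int, bool> and once to an Expression<Func<int, bool>>; the variable names and the sample condition are ours, chosen only for illustration.

```csharp
using System;
using System.Linq.Expressions;

public static class DelegateVsTree
{
    public static string Describe()
    {
        // As a delegate, the lambda is compiled to executable code:
        // all we can do with it is invoke it.
        Func<int, bool> asDelegate = age => age > 18;

        // As an expression tree, the same lambda becomes data that
        // describes the computation and can be inspected at run time.
        Expression<Func<int, bool>> asTree = age => age > 18;

        var body = (BinaryExpression)asTree.Body;

        // A tree can still be turned into a delegate when needed.
        bool fromTree = asTree.Compile()(30);

        return body.NodeType + " | " + body.Left + " | " + body.Right
             + " | " + asDelegate(30) + " | " + fromTree;
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // GreaterThan | age | 18 | True | True
    }
}
```

An operator that receives the delegate can only call it, element by element. An operator that receives the tree can examine the comparison, its parameter and its constant, which is what makes translation into another language possible.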
The same process is repeated for the subsequent query operators (OrderBy() and
Select() in our example), with each IQueryable<T> object returned by one call serving
as the input for the next. At the end of the chain, we have a complex tree that
reflects everything that the query expression must do. This tree is
ready for use, when iteration over the query results begins, as a source from which to
build a statement in the language of the remote data source that we want to query (some
dialect of SQL, in the case of LINQ to SQL and LINQ to Entities). To translate LINQ
expression trees into syntactic constructions in the language of the store, LINQ providers
rely on auxiliary classes known as query providers.
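The way the tree grows with each operator can be observed without a database. In the sketch below, AsQueryable() over an in-memory array is only a stand-in for a remote table (it is the LINQ to Objects implementation of IQueryable<T>, not LINQ to SQL or LINQ to Entities), and the sample names are hypothetical. The Expression property of the final query holds the whole chain, with each call taking the previous one as its first argument.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class QueryTreeDemo
{
    public static List<string> OperatorChain()
    {
        // Stand-in for a remote source; nothing is executed at this point.
        IQueryable<string> source = new[] { "Diana", "Bob", "Alice" }.AsQueryable();

        IQueryable<int> query = source
            .Where(n => n.Length > 3)
            .OrderBy(n => n)
            .Select(n => n.Length);

        // Walk the tree from the outermost call inwards. The first
        // argument of each operator call is the tree of its input.
        var names = new List<string>();
        Expression node = query.Expression;
        while (node is MethodCallExpression call)
        {
            names.Add(call.Method.Name);
            node = call.Arguments[0];
        }
        return names;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" <- ", OperatorChain())); // Select <- OrderBy <- Where
    }
}
```

A remote provider performs a similar walk when iteration begins, but instead of collecting names it emits the corresponding fragments of the statement to be sent to the store.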
As we have remarked, the only standard query operators that a LINQ provider must
implement are, in principle, the overloads of the operators that are reflected in the
query syntax, a set which the language specification calls the query expression pattern
(the LINQ pattern). Also remember that, in the case of the providers that we have
referred to as remote, a query expression ultimately translates into one large
expression tree, which the query provider will later translate into a statement in the
language of the remote store to be queried.
For some types of stores, some of the extended standard query operators (invoked
using functional notation) may be meaningless or even impossible to implement. In
such cases we must avoid those operators; if we use them, we will get an exception of
the System.NotSupportedException type.
For example, consider the ElementAt() operator, which returns the element located
in a specific position in the input sequence. The creators of LINQ to SQL and LINQ to
Entities understood that the use of this operator should not be permitted, so a query such
as the following
will produce an exception. Note, however, that the following alternative works properly:
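One alternative of this kind can be sketched as follows. It is our assumption, not necessarily the author's listing: the positional access is rewritten as Skip() followed by First(), two operators that both providers are documented to support (LINQ to Entities requires an OrderBy() before Skip()). The sample runs against an in-memory array exposed through AsQueryable(), so it only demonstrates that the two forms return the same element; the NotSupportedException itself appears only against a real remote provider.

```csharp
using System;
using System.Linq;

public static class ElementAtDemo
{
    // Hypothetical sample data standing in for a table of people.
    static readonly string[] Names = { "Diana", "Bob", "Alice", "Carol" };

    // Positional access: fine for LINQ to Objects, but rejected by
    // LINQ to SQL and LINQ to Entities.
    public static string ViaElementAt() =>
        Names.OrderBy(n => n).ElementAt(2);

    // The same element, expressed with operators that a remote
    // provider can map to its own paging constructs.
    public static string ViaSkipFirst() =>
        Names.AsQueryable().OrderBy(n => n).Skip(2).First();

    public static void Main()
    {
        Console.WriteLine(ViaElementAt()); // Carol
        Console.WriteLine(ViaSkipFirst()); // Carol
    }
}
```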
Similarly, when writing the lambda expressions that correspond to the where, orderby,
etc. clauses of our integrated queries, we must not forget that the query will
ultimately have to be translated into a statement in the language of the store against
which we are working. For example, the following query
will produce an exception, because the LINQ to Entities query provider has no way to
express the call to the Split() method of the string class of .NET Framework in SQL,
which is considerably poorer than .NET in terms of programming support. You cannot
get blood out of a turnip :-)
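To see why such a call is a problem, it helps to look at what the provider actually receives. The sketch below uses a hypothetical filter (not the query from the text) and a small ExpressionVisitor subclass of our own, MethodCallCollector, to list the .NET methods that appear in the tree. A query provider needs a translation for each of them; when it has none, as with Split(), all it can do is give up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical helper: records every method call found in a tree.
public sealed class MethodCallCollector : ExpressionVisitor
{
    public List<string> Calls { get; } = new List<string>();

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        Calls.Add(node.Method.DeclaringType.Name + "." + node.Method.Name);
        return base.VisitMethodCall(node); // keep visiting nested calls
    }

    public static List<string> Collect(Expression tree)
    {
        var collector = new MethodCallCollector();
        collector.Visit(tree);
        return collector.Calls;
    }
}

public static class SplitDemo
{
    public static List<string> CallsInFilter()
    {
        // The lambda is captured as a tree, exactly as a remote provider
        // would receive it from a where clause.
        Expression<Func<string, bool>> filter =
            fullName => fullName.Split(new[] { ' ' }).Length > 1
                        && fullName.StartsWith("A");

        return MethodCallCollector.Collect(filter);
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CallsInFilter())); // String.Split, String.StartsWith
    }
}
```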
All the information about which operators and functions are supported, supported
with limitations or not supported at all by LINQ to SQL and LINQ to Entities is available
at MSDN.
1.16.- Update mechanisms
Finally, we should mention that although, in principle, LINQ is only associated with
the retrieval (query) of information, most LINQ providers have been equipped with
additional mechanisms to also allow data updates and, more generally, the
manipulation (in the broadest sense of the word) of the containers where these data
are stored, be they XML documents, relational databases or others. In particular, the
technologies based on LINQ which involve access to relational databases (LINQ to SQL
and LINQ to Entities) allow us to make in-memory changes to objects obtained from
executing queries, and then apply those changes to the relational store. For this purpose,
these technologies are also capable of generating the necessary SQL INSERT, UPDATE
and DELETE statements.
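For the XML case, the idea can be sketched entirely in memory with LINQ to XML. The document and its element and attribute names below are hypothetical; the sketch locates nodes with a query and then changes, removes and adds elements through the XElement API.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlUpdateDemo
{
    public static XElement UpdateCountries()
    {
        // Hypothetical document, built in memory for the example.
        var countries = XElement.Parse(
            "<countries>" +
            "<country code='ES' population='0'>Spain</country>" +
            "<country code='CU' population='0'>Cuba</country>" +
            "<country code='XX' population='0'>Obsolete</country>" +
            "</countries>");

        // Update: find one element with a query, then change an attribute.
        XElement spain = countries.Elements("country")
            .First(c => (string)c.Attribute("code") == "ES");
        spain.SetAttributeValue("population", 2);

        // Delete: remove every element selected by a query.
        countries.Elements("country")
            .Where(c => (string)c.Attribute("code") == "XX")
            .Remove();

        // Insert: append a new element to the container.
        countries.Add(new XElement("country",
            new XAttribute("code", "IS"),
            new XAttribute("population", 0),
            "Iceland"));

        return countries;
    }

    public static void Main()
    {
        Console.WriteLine(UpdateCountries());
    }
}
```

With the relational providers the pattern is similar, except that the changes stay in memory until they are explicitly sent to the store, through SubmitChanges() in LINQ to SQL or SaveChanges() in the Entity Framework.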