LINQ: from IEnumerable to concrete collections
In my recent posts introducing LINQ from a game developer’s point of view, I mentioned several times that the many LINQ methods returning sequences of type IEnumerable<T> do not return an actual collection.
Instead they return a query that can be executed any number of times on the given input collection.
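To make the "query, not collection" point concrete, here is a minimal sketch. The names numbers and evens are made up for illustration; the example only assumes the standard Where() and Count() methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var numbers = new List<int> { 1, 2, 3, 4 };

// Defining the query does not run it; nothing has been filtered yet.
IEnumerable<int> evens = numbers.Where(n => n % 2 == 0);

// Count() enumerates the query: it finds 2 and 4.
int countBefore = evens.Count();

// Changing the input afterwards is visible to the same query object.
numbers.Add(6);

// Enumerating again re-runs the filter: it now finds 2, 4 and 6.
int countAfter = evens.Count();

Console.WriteLine($"{countBefore} then {countAfter}"); // prints "2 then 3"
```

The query is evaluated each time it is enumerated, which is exactly why we sometimes want to capture its results in a regular collection.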
Of course, there comes a point at which we need to store the results of such queries as regular collections. Today we will talk about how LINQ supports this almost trivially.
Lists and arrays
Arrays and lists are without a doubt the simplest collection types of C#. In fact, List<T>
is usually the go-to type to hold a set of items – given no additional constraints.
As such, being able to extract the result of a LINQ query as a list is a very common operation.
Of course, we hardly need LINQ to help us do this. Given a sequence of type IEnumerable<T>, for example one returned by a LINQ query, we can simply create a list and add the contained elements as follows.
var sequence = input.Where(someCondition).Select(someConversion);
var list = new List<ItemType>();
foreach (var item in sequence)
{
    list.Add(item);
}
However, this is a lot of code for something rather simple.
Of course, we can shorten it considerably by taking advantage of the list’s AddRange() method.
var sequence = input.Where(someCondition).Select(someConversion);
var list = new List<ItemType>();
list.AddRange(sequence);
This also simplifies the code and makes it much less likely to contain errors.
However, we could go even further by using another constructor overload of our list.
var sequence = input.Where(someCondition).Select(someConversion);
var list = new List<ItemType>(sequence);
Now we are down to a single line, which is great.
Imagine, however, that we do not want to use a variable for our sequence, but define the LINQ query inline as follows.
var list = new List<ItemType>(
    input.Where(someCondition).Select(someConversion));
At this point, the code is becoming slightly unreadable, and it would only get worse if we used lambda functions as parameters for Where() and Select().
Additionally, note how we have to specify the type of the items in the resulting list. This runs contrary to the way that LINQ infers types (more on this in a future post). We do not have to specify types for the Where() and Select() methods, so why should we for the creation of the list?
In fact, LINQ comes with one answer to all these questions: the ToList() extension method.
Instead of doing the above, we can do the following.
var list = input.Where(someCondition).Select(someConversion).ToList();
This line of code is both shorter and clearer. Since it uses method chaining, it can be read from left to right, instead of containing awkwardly nested method calls. Lastly, the call to ToList() also infers the type of the list items, so that we do not have to specify it.
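As a small runnable sketch of the chained version, this time with lambdas in place of someCondition and someConversion (the input data and the name labels are assumptions for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var input = new[] { 1, 2, 3, 4, 5, 6 };

// The element type of the list (string) is inferred from the Select() lambda;
// it is only written out on the left here to make the inference visible.
List<string> labels = input
    .Where(n => n % 2 == 0)
    .Select(n => "item " + n)
    .ToList();

Console.WriteLine(string.Join(", ", labels)); // prints "item 2, item 4, item 6"
```

Even with inline lambdas the statement still reads top to bottom in the order the operations are applied.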
Similarly, there exists a ToArray() extension method, which can be used in exactly the same way but returns an array instead. However, while I think it is important to mention, I do not recommend using this method unless you have a very specific reason to do so.
ToList() always creates a new collection
One detail that is important to understand is that a call to ToList() will always create an entirely new list. It does not create a deep copy, meaning that it does not create copies of reference type items but only copies the references. The returned list itself, however, is a new collection.
For example, since List<T>
implements the IEnumerable<T>
interface, we could run the following code.
var aList = new List<int>() { 1, 2, 3 };
var anotherList = aList.ToList();
aList.Add(4);
// anotherList contains 1, 2, 3, but NOT 4
The same goes for ToArray()
as well as the other methods discussed below.
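The difference between a new collection and a deep copy can be sketched as follows. I use StringBuilder here only because it is a readily available mutable reference type; any class would behave the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

var names = new List<StringBuilder>
{
    new StringBuilder("Ada"),
    new StringBuilder("Bob"),
};
var copy = names.ToList();

// The two lists are separate collections:
// removing from one does not affect the other.
names.RemoveAt(1);
Console.WriteLine(copy.Count); // prints "2"

// But both lists reference the same element objects:
// a change made through one list is visible through the other.
names[0].Append("!");
Console.WriteLine(copy[0].ToString()); // prints "Ada!"
```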
Dictionaries
Dictionaries – or maps – are useful data structures that allow us to index a collection of items by unique keys – like looking up a user by their id in a database.
As above, we could write the code to convert a sequence of users into a dictionary by hand.
var dictionary = new Dictionary<int, User>();
foreach (var user in users)
{
    dictionary.Add(user.Id, user);
}
Maybe unsurprisingly, we can do the same using LINQ’s ToDictionary() extension method.
Note that this method takes two delegates as parameters: one to select the key, and one to select the value; in our example the user id and the user respectively.
Using the method we can write our code as follows instead:
var dictionary = users.ToDictionary(u => u.Id, u => u);
Note how I use the identity lambda function u => u to select the user itself as the value for the dictionary.
In fact, we could skip this parameter and use the method overload that only takes a key selector delegate. That overload defaults to using the element itself as the value in the returned dictionary.
var dictionary = users.ToDictionary(u => u.Id);
However, I thought it was important to mention the value selector, to show this important capability of the method.
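To show the value selector doing actual work, here is a small sketch that maps ids to names rather than to whole users. The users are modelled as tuples with assumed Id and Name fields, purely to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical user data; a real User class would work the same way.
var users = new[]
{
    (Id: 1, Name: "Ada"),
    (Id: 2, Name: "Bob"),
};

// Key selector picks the id, value selector picks only the name.
Dictionary<int, string> namesById = users.ToDictionary(u => u.Id, u => u.Name);

Console.WriteLine(namesById[2]); // prints "Bob"
```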
Lookups
Let us consider the case where we want to create a collection similar to a dictionary, but instead of by unique id, we want to get quick access to our users by their date of birth.
We could do the following.
var dictionary = users.ToDictionary(u => u.BirthDay, u => u);
However, this would throw an ArgumentException as soon as two of our users have the same birthday.
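A quick sketch of that failure, using DateTime as a stand-in for the Date type and tuples with assumed Name and BirthDay fields in place of a User class:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two hypothetical users sharing a birthday.
var users = new[]
{
    (Name: "Ada", BirthDay: new DateTime(1990, 5, 1)),
    (Name: "Bob", BirthDay: new DateTime(1990, 5, 1)),
};

bool threw = false;
try
{
    // The second user has the same key as the first, so adding it fails.
    _ = users.ToDictionary(u => u.BirthDay, u => u);
}
catch (ArgumentException)
{
    threw = true;
}

Console.WriteLine(threw); // prints "True"
```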
Instead of a Dictionary<Date, User> we need a Dictionary<Date, List<User>> or something similar. But how do we go about constructing that?
Of course, we could again do so manually.
var dictionary = new Dictionary<Date, List<User>>();
foreach (var user in users)
{
    List<User> list;
    if (!dictionary.TryGetValue(user.BirthDay, out list))
    {
        // first user with this birthday: create the list for it
        list = new List<User>();
        dictionary.Add(user.BirthDay, list);
    }
    list.Add(user);
}
However, this is rather cumbersome and prone to error.
Instead we could use the GroupBy()
extension method of LINQ, which I introduced in last week’s post, to do the grouping for us. Having our users grouped by birthday, we can then create a dictionary of lists from these groups as follows.
var usersByBirthDay = users.GroupBy(u => u.BirthDay);
var dictionary = usersByBirthDay.ToDictionary(g => g.Key, g => g.ToList());
Or, without the temporary variable:
var dictionary = users
.GroupBy(u => u.BirthDay)
.ToDictionary(g => g.Key, g => g.ToList());
The key of the group we select in ToDictionary() is the same key we selected in GroupBy(): the birthday of that group of users.
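Here is the same statement as a runnable sketch, again with DateTime standing in for Date and tuples with assumed Name and BirthDay fields standing in for User:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two users share a birthday.
var users = new[]
{
    (Name: "Ada", BirthDay: new DateTime(1990, 5, 1)),
    (Name: "Bob", BirthDay: new DateTime(1990, 5, 1)),
    (Name: "Cyd", BirthDay: new DateTime(1985, 12, 24)),
};

var dictionary = users
    .GroupBy(u => u.BirthDay)
    .ToDictionary(g => g.Key, g => g.ToList());

// One entry per distinct birthday, each holding a list of users.
Console.WriteLine(dictionary.Count);                          // prints "2"
Console.WriteLine(dictionary[new DateTime(1990, 5, 1)].Count); // prints "2"

// A birthday nobody has is simply not a key in the dictionary.
Console.WriteLine(dictionary.ContainsKey(new DateTime(2000, 1, 1))); // prints "False"
```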
This is already pretty good. Here we have an example of using three different LINQ extension methods in a single statement to organize our original list of users as we wanted.
However, in this particular case, there is an even easier way: the LINQ method ToLookup().
This method is different from our approach above in a number of ways:
- it returns an ILookup<TKey, TElement> (implemented by the Lookup<TKey, TElement> class), instead of a Dictionary<TKey, TElement>
- the returned lookup object is immutable – no elements can be added or removed
- unlike the dictionary, the lookup’s key-indexer does not return a single element, but a sequence of all elements with that key – or an empty sequence if there are none
We can use it as follows.
var lookup = users.ToLookup(u => u.BirthDay, u => u);
// or simply:
var lookup = users.ToLookup(u => u.BirthDay);
While the resulting collection is not mutable, and mutability may be something we need, in many if not most cases this simple method is the best and clearest way to quickly construct a dictionary-like lookup.
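The following sketch shows the lookup in use, including the empty sequence returned for a key without elements. As before, DateTime and the tuple fields Name and BirthDay are assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data: two users share a birthday.
var users = new[]
{
    (Name: "Ada", BirthDay: new DateTime(1990, 5, 1)),
    (Name: "Bob", BirthDay: new DateTime(1990, 5, 1)),
    (Name: "Cyd", BirthDay: new DateTime(1985, 12, 24)),
};

// Key by birthday, keep only the name as the element.
ILookup<DateTime, string> lookup = users.ToLookup(u => u.BirthDay, u => u.Name);

// The indexer returns every element stored under that key.
string sharedBirthday = string.Join(", ", lookup[new DateTime(1990, 5, 1)]);
Console.WriteLine(sharedBirthday); // prints "Ada, Bob"

// An unknown key yields an empty sequence instead of throwing.
int nobody = lookup[new DateTime(2000, 1, 1)].Count();
Console.WriteLine(nobody); // prints "0"

// Count is the number of distinct keys.
Console.WriteLine(lookup.Count); // prints "2"
```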
Conclusion
I hope this post has given you a good overview of how LINQ provides support to convert the output of queries to arrays, lists, dictionaries, and lookups.
Make sure to let me know if you have any questions or comments, or if there are other features of LINQ you would like me to cover next.
Enjoy the pixels!
Reference: LINQ: from IEnumerable to concrete collections from our NCG partner Paul Scharf at the GameDev<T> blog.