LINQ is a very powerful technique for operating on collections of objects in .NET. If you for example have a collection of integers, it is a simple task to pick the even numbers from the collection using LINQ:
int[] ints = new[] { 1, 3, 6, 8, 9, 10 };
IEnumerable<int> evens = ints.Where(i => i % 2 == 0);
Similarly for dictionaries:
Dictionary<int, int> intDict = new Dictionary<int, int>
{ { 2, 3 }, { 3, 5 }, { 6, 7}};
Dictionary<int, int> evenKeys = intMap.Where(kv => kv.Key % 2 == 0);
What? Compilation error?!?
In fact, the
intDict.Where() statement is not entirely correct. From the LINQ point of view,
Dictionary<,> and other classes implementing the
IDictionary<TKey, TValue> interface are actually regarded as implementations of the
IEnumerable<KeyValuePair<TKey, TValue>> interface. Thus (ignoring for a moment that we could also have used the
var keyword), the correct
evenKeys assignment should read:
IEnumerable<KeyValuePair<int, int>> evenKeys =
intMap.Where(kv => kv.Key % 2 == 0);
Now, my guess is that in the normal case one would rather have the above assignment to return a
Dictionary. Fortunately, LINQ also provides a number of
ToDictionary method overloads. So for the
evenKeys assignment to return a
Dictionary we simply type:
Dictionary<int, int> evenKeys = intMap.Where(
kv => kv.Key % 2 == 0).ToDictionary();
What?! Compilation error again?
Yes, because the
ToDictionary method also operates on
IEnumerable<T> objects. You need to tell the compiler how you want to design your
Dictionary based on this arbitrary type
T. For correctness, our
evenKeys assignment has to be expressed as follows:
Dictionary<int, int> evenKeys = intMap.Where(kv => kv.Key % 2 == 0).
ToDictionary(kv => kv.Key, kv => kv.Value);
For dictionaries, this explicitness may appear quite counter-intuitive, and there are several forum questions on the Internet indicating that this API design indeed has caused confusion (
here,
here and
here for example).
After giving this issue some thought, I believe I have come up with an approach to bypass this hurdle of confusion. I have implemented the following extension method in a static utility class:
public static Dictionary<TKey, TValue>
ToDictionary<TKey, TValue>(
this IEnumerable<KeyValuePair<TKey, TValue>> source)
{
return source.ToDictionary(kv => kv.Key, kv => kv.Value);
}
This overload of ToDictionary takes any object that implements the IEnumerable<KeyValuePair<TKey, TValue>> interface, which is a common return type when invoking LINQ operations on dictionaries, and returns a Dictionary<TKey, TValue> object using the same keys and values as the dictionary in the argument list. With this extension method defined, I now actually can enter:
Dictionary<int, int> evenKeys = intMap.Where(
kv => kv.Key % 2 == 0).ToDictionary();
without getting the annoying compilation error.
Now this is all good and well, but I then decided to take the issue even one step further.
Wouldn't it be good if for example the
Where() extension method when operating on a
Dictionary by default returned a
Dictionary with the same keys and values?
Well, it can of course be done! And here is the solution:
public static Dictionary<TKey, TValue> Where<TKey, TValue>(
this IDictionary<TKey, TValue> source,
Func<KeyValuePair<TKey, TValue>, bool> predicate)
{
return Enumerable.Where(source, predicate).
ToDictionary(kv => kv.Key, kv => kv.Value);
}
When this method is defined, it effectively hides the general
Where(IEnumerable<T>, Func<>) extension method when the
IEnumerable<T> object also implements the
IDictionary<TKey, TValue> interface. Having the above
Where() extension method defined now even makes it possible for us to apply our first dictionary LINQ attempt without compilation error:
Dictionary<int, int> evenKeys = intMap.Where(kv => kv.Key % 2 == 0);
Unfortunately, this method will always return a
Dictionary object, regardless of whether
intDict is a
SortedDictionary or
SortedList or another object implementing the
IDictionary<TKey, TValue> interface. At this point, I have no viable solution for returning the same dictionary type that was used as input to the specialized
Where method. The quest for a solution will continue nonetheless.
I have also collected a few more similar LINQ extension method overloads in a Github project that I have chosen to call ... dictionarylinq!
In this project you may find dictionary overloads for the
Except,
Intersect and
Union extension methods. Due to their signatures, for these methods it has actually been possible to implement a specialized overload that returns the true input dictionary type, together with a designated overload with better performance for the
Dictionary<TKey, TValue> class.
Included are also method overloads for the
Select method when the return type of the
Func<TSource, TResult> object is a
KeyValuePair<TKey, TValue>. It would be relatively easy to add more method overloads to
dictionarylinq, but for now this is what there is.
When the extension methods from the
dictionarylinq utility class are included in your application, these methods will effectively hide the general extension methods in the
System.Linq.Enumerable class. If you
do want to fall back on the general methods in a certain scenario, either make an explicit cast to the
IEnumerable<> interface:
IEnumerable<KeyValuePair<int, int>> output =
((IEnumerable<KeyValuePair<int, int>>)input).Where(kv => kv.Key > 3);
or invoke the static method explicitly:
IEnumerable<KeyValuePair<int, int>> output =
Enumerable.Where(input, kv => kv.Key > 3);
Please also note that the
dictionarylinq class library in the Visual Studio solution is a
Portable Class Library. This means that the library uses the "least common denominator" of .NET Framework 4, Silverlight and Windows Phone 7 base class libraries. The library can be built once and consumed by all three .NET technologies without rebuilding. If you have not installed the
Portable Library Tools, simply create a new designated class library project and include the source code from the
dictionarylinq class library, or include the source code directly in your own class library or application.
I hope that this effort can be useful also to others than myself. For questions and comments on this work, please do not hesitate to contact me for example via a blog comment. If you are a Github user, please feel free to report code issues via the project's
Issues tab.
Good luck with Dictionary and LINQ!