Monday, February 19, 2007

C# and keeping your lists sorted

Every once in a while I feel the need for a having a list in C# thats instantly sorted. And every time I always spend time examining the various lists available in C# and end up being quite annoyed.
Perhaps there is such a list in the .NET framework, but I havn't been able to find it yet.
Sure, there is the SortedList, but thats a Key/Value-style hashtable where it sorts on the keys. Sometimes thats fine, but most often I find that it's pretty useless. Especially since you can't have duplicate keys. You can also use the .Sort() method on an ordinary list of course. But what if you need to have a list that's always sorted, e.g. when you Add a new item to the list it's inserted in a sorted manner?! Calling .Sort() after each .Add() isn't really an option since it's way too slow.

So, as you might have expected, when I had the problem again this weekend I ended up writing my own implementation of a SortList.
It might not be pretty, but it seems to work and compared to the .Add();.Sort() alternative it's pretty darn fast :-)



public class SortList<T> : List<T>
where T : IComparable<T>
{

public new void Add(T Item)
{
if(Count==0){
//No list items
base.Add(Item);
return;
}
if(Item.CompareTo(this[Count-1])>0)
{
//Bigger than Max
base.Add(Item);
return;
}
int min = 0;
int max = Count-1;
while ((max-min)>1)
{
//Find half point
int half = min + ((max - min) / 2);
//Compare if it's bigger or smaller than the current item.
int comp = Item.CompareTo(this[half]);
if (comp == 0)
{
//Item is equal to half point
Insert(half, Item);
return;
}
else if (comp < 0) max = half; //Item is smaller
else min = half; //Item is bigger
}
if(Item.CompareTo(this[min])<=0) Insert(min,Item);
else Insert(min + 1, Item);
}
}


The code is based on a standard generic list, and it simply replaces the Add() method with a method that inserts item sorted. It does this by narrowing in on the list, always diving the list in two - much like one would solve the classical game of "Guess the number I'm thinking of".
To test it, I wrote a small test application, that generated an array with 10.000 random integers (values between 0 and 10.000). Then it timed how long it would take to add them to the SortList, and afterwards how long time it would take to Add+Sort them in a usual list (with Sort being called for each Add, cause we need a constantly sorted list).
On my workstation, SortList took about 40 ms to complete the task while List.Add+List.Sort took 2654 ms.
And in case you're wondering, the answer is Yes. The sorted contents was the same in both lists afterwards.

5 comments:

Alex de Groot said...

Hello Allan,

I've got the exact same problem as you have. By accident I came across the C5 library. It's based on the principles you've learned at school and it's dramatically fast ;-).
It contains stuff like SortedArrays, Trees, etc. Very cool. Check out at:
http://www.itu.dk/research/c5/

Allan Thræn said...

Thanks Alex. I think I heard about that library (not so strange considering I go to the same school).
I'll make sure to check it out. Thanks for the tip!

Allan Thræn said...

Just for completeness I tried to run the same test as described in the post with SortedArray from the C5 library. It took 30 ms so it's 10 ms faster than my algorithm!
I'll make sure to post again when I've found out what to do with all this new-found time. Perhaps it's time to take up a hobby :-)

splintor said...

See http://devlicio.us/blogs/marcin_hoppe/archive/2007/05/15/how-to-keep-list-t-sorted.aspx for a better approach that uses List.BinarySearch.

Gates VP said...

Just looked at Marcin Hoppe's version and though it leverages the BinarySearch, this is useless if you need to leverage a Comparable class instead of a list of Integers.

Thank you for this version Allan, definitely using this.