4 Sept 2009

Natural Sort Compare with Linq OrderBy

Having "fun" with IComparer<T> when trying to do OrderBy in Linq with a custom sort?

Me too. I wanted to sort some rows from a DB table by a string 'filename' field. The standard Linq (and SQL) OrderBy(item => item.sortProperty) sorts the strings like so:

image1.jpg
image200.jpg
image30.jpg
image4.jpg

But because I am a human being, I don't want that. No, instead I wanted them sorted like:

image1.jpg
image4.jpg
image30.jpg
image200.jpg

What I want is "Natural Sort". C# doesn't offer a built-in solution, but the Linq OrderBy method does have an override that allows you to use your own IComparer<T> function, a la
myLinqQuery.OrderBy(item => item.sortProperty, new MyComparer<string>())


(Edit: Incidentally, I now have a way of doing this dynamically, if you don't know what sort field will be used until run-time. See here.)

Building your own "Natural Sort" IComparer is not for the faint of heart, or the lazy, so I just nicked some code from Justin Jones and tweaked it a bit:

    public class NaturalSortComparer<T> : IComparer<string>, IDisposable
    {
        private bool isAscending;

        public NaturalSortComparer(bool inAscendingOrder = true)
        {
            this.isAscending = inAscendingOrder;
        }

        #region IComparer<string> Members

        public int Compare(string x, string y)
        {
            throw new NotImplementedException();
        }

        #endregion

        #region IComparer<string> Members

        int IComparer<string>.Compare(string x, string y)
        {
            if (x == y)
                return 0;

            string[] x1, y1;

            if (!table.TryGetValue(x, out x1))
            {
                x1 = Regex.Split(x.Replace(" ", ""), "([0-9]+)");
                table.Add(x, x1);
            }

            if (!table.TryGetValue(y, out y1))
            {
                y1 = Regex.Split(y.Replace(" ", ""), "([0-9]+)");
                table.Add(y, y1);
            }

            int returnVal;

            for (int i = 0; i < x1.Length && i < y1.Length; i++)
            {
                if (x1[i] != y1[i])
                {
                    returnVal = PartCompare(x1[i], y1[i]);
                    return isAscending ? returnVal : -returnVal;
                }
            }

            if (y1.Length > x1.Length)
            {
                returnVal = 1;
            }
            else if (x1.Length > y1.Length)
            { 
                returnVal = -1; 
            }
            else
            {
                returnVal = 0;
            }

            return isAscending ? returnVal : -returnVal;
        }

        private static int PartCompare(string left, string right)
        {
            int x, y;
            if (!int.TryParse(left, out x))
                return left.CompareTo(right);

            if (!int.TryParse(right, out y))
                return left.CompareTo(right);

            return x.CompareTo(y);
        }

        #endregion

        private Dictionary<string, string[]> table = new Dictionary<string, string[]>();

        public void Dispose()
        {
            table.Clear();
            table = null;
        }
    }


The first time I tried to use this, it failed with the rather useless error: "Unsupported overload used for query operator 'OrderBy'."

Turns out it was because I had tried to use my custom OrderBy on the Linq query before it had got the data records from the server, and hence it thought I was trying to run the natural sort in SQL. So I fixed the prob by getting the results first with a quick call to AsEnumerable(), a la:

List<Photo> photos = DataManager.MainContext.Photos
     .Where(item => item.PhotoFilename != null)
     .AsEnumerable()
     .OrderBy(item => item.PhotoFilename, new NaturalSortComparer<string>())
     .ToList();


Works a treat!
If I helped you out today, you can buy me a beer below. Cheers!