DataTable Row Loop c# Performance Testing (Linq vs. Select vs. Parallel vs. For)

froot-loopsI still find myself using DataTables (from SQL Server, mySQL, etc.) on a regular basis. In an effort to see how some different methods of looping through the data performs and what method might be the fastest, I put together some small, fast tests, below are my test results.

[Just jump to the fastest method.]

datatable-row-tests

My sample data was comprised of about a 34,000  row datatable – running these tests on a i7, 32gb, VS2015 setup. The 34,000 was chosen because a box of Froot Loops contains about 1800 Froot Loops (I think), but 1800 was too small, so how about 18 boxes of Froot Loops?

Continue reading “DataTable Row Loop c# Performance Testing (Linq vs. Select vs. Parallel vs. For)”

DataTable Row Loop c# Performance Testing (Linq vs. Select vs. Parallel vs. For)

Parsing JSON data in C# (JSON.NET, Linq, HttpClient)

Parsing JSON in C# is pretty simple. I’m a fan of using Newtonsoft.Json (known as JSON.NET) – I’ve found this to be the fastest, easiest JSON parser available.

Below are some steps to get parsing on your own (or jump down to the full source code). You can also grab this on GitHub. I’m just using a simple console app for this example. My JSON is also coming from a Google Custom Search (via http get). You’ll need to have your own JSON url (or maybe get a test url here).

  1. Start a new project > console application.
  2. In Package Manager Console, run: Install-Package Newtonsoft.Json
  3. Add the below to the top of your code file (Program.cs):
    using Newtonsoft.Json.Linq;
    using System;
    using System.Linq;
    using System.Net;
    using System.Net.Http;
    using System.Threading.Tasks;
  4. Create a task to get some JSON data from http:public static async Task<string> httpResponse(string url)
    {
    IWebProxy iprox = WebProxy.GetDefaultProxy();
    iprox.Credentials = CredentialCache.DefaultCredentials;HttpClientHandler httpHandler = new HttpClientHandler()
    {
    UseProxy = true,
    Proxy = iprox,
    PreAuthenticate = true,
    UseDefaultCredentials = true,
    Credentials = CredentialCache.DefaultCredentials
    };using (var httpClient = new HttpClient(httpHandler))
    return await httpClient.GetStringAsync(url);
    }
  5. Now get some data:
    JObject jData = JObject.Parse(httpResponse(@”https://www.googleapis.com/customsearch/v1?youneedtocreateyourownlink&#8221;).Result);
  6. Read some simple data:
    string sampleValue = (string)jData[“url”][“type”];
  7. Read array of data:
    foreach (var sampleItem in jData[“items”])
    {
    string sampleItemValue = (string)sampleItem[“link”];
    Console.WriteLine(sampleItemValue);
    }

The full source:

using Newtonsoft.Json.Linq;
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

namespace parsingJSON
{
    internal class Program
    {
        public static async Task<string> httpResponse(string url)
        {
            //i'm using a proxy, you could remove this if needed.
            IWebProxy iprox = WebProxy.GetDefaultProxy();
            iprox.Credentials = CredentialCache.DefaultCredentials;

            HttpClientHandler httpHandler = new HttpClientHandler()
            {
                UseProxy = true,
                Proxy = iprox,
                PreAuthenticate = true,
                UseDefaultCredentials = true,
                Credentials = CredentialCache.DefaultCredentials
            };

            using (var httpClient = new HttpClient(httpHandler))
                return await httpClient.GetStringAsync(url);
        }

        private static void Main(string[] args)
        {
            //get some JSON data
            JObject jData = JObject.Parse(httpResponse(@"https://yourJSONlinkhere").Result);

            string sampleValue = (string)jData["url"]["type"];

            Console.WriteLine(sampleValue);

            //loop through array in JSON
            foreach (var sampleItem in jData["items"])
            {
                string sampleItemValue = (string)sampleItem["link"];
                Console.WriteLine(sampleItemValue);
            }

            //filter w/ linq:

            var sampleArray =
              from p in jData["items"]
              select p;

            foreach (var sampleItem in sampleArray.Where(p => ((string)p["link"]).Contains("https")))
            {
                string sampleItemValue = (string)sampleItem["link"];
                Console.WriteLine(sampleItemValue);
            }

            Console.ReadLine();
        }
    }
}

View on GitHub

Parsing JSON data in C# (JSON.NET, Linq, HttpClient)

Parsing XML with HTML Agility Pack (instead of XDocument, etc.)

If you’re looking to easily parse some XML w/ .net and finding the normal XObject (Linq) isn’t cutting it, using HtmlAgilityPack could be your answer.

Below I’ll layout the simple steps for parsing a XML file w/ HtmlAgilityPack. The XML file I’m using to test is the WordPress export XML format. It uses lots of XML features so seemed like a good test. Let’s start. I’m using Visual Studio and in my example I’m assuming you have a project already open (I’m using a simple Console app). Continue reading “Parsing XML with HTML Agility Pack (instead of XDocument, etc.)”

Parsing XML with HTML Agility Pack (instead of XDocument, etc.)

Reading Excel Files in .net Using LinqToExcel

planetsI’m sure at some point you’ve needed to read some Excel files in an application. You’ve probably even used the Office Interop libraries (using something like: new Excel.Application();). Below I’ll show a much easier way that makes use of a data model to read an Excel file using LinqToExcel. If you’re looking for the complete code, scroll to the end of the page.

  1. Start a new solutions / project. In my example I’m just using a console app.
  2. Make sure you save your solution before running step 3, or you’ll get a “The current environment doesn’t have a solution open.” error.
  3. In the Package Manager Console, at the “PM>” prompt, enter “Install-Package LinqToExcel” and press enter. When complete, it should read: “Successfully added ‘LinqToExcel 1.10.1’ to…”
  4. Add these lines to the very top of your code (if not already there):
    using System;
    using System.Linq;
    using LinqToExcel;

    using LinqToExcel.Attributes;
  5. Add your data model. I’m using an Excel spreadsheet that contains the planets sizes and masses (taken from this sweet NASA page). Why? Because it’s really cool.
    xls1
    My model looks like:

    internal class Planet
     {
     [ExcelColumn("Planetary Body")]
     public string Name { get; set; }
    
    [ExcelColumn("Mass (1024kg)")]
     public double Mass { get; set; }
    
    [ExcelColumn("Diameter (km)")]
     public int Diameter { get; set; }
     }
  6. Notice how the ExcelColumn attribute contains the formatted name of the column (with spaces, etc.). Make sure these match to your Excel columns.
  7. Now in your Main method, let’s add some code to read the Excel file (replacing the below w/ your file path, model name and sheet name (“Planets” in the below):
    var excel = new ExcelQueryFactory(@"F:\cbitting\Planets.xlsx")
    {
     DatabaseEngine = LinqToExcel.Domain.DatabaseEngine.Ace,
     TrimSpaces = LinqToExcel.Query.TrimSpacesType.Both,
     UsePersistentConnection = true,
     ReadOnly = true
     };
    var planets = from p in excel.Worksheet<Planet>("Planets")
     select p;
  8. Now let’s add something to see what data this retrieved:
    foreach (Planet planet in planets)
     {
     Console.WriteLine(planet.Name + " Mass: " + planet.Mass);
     }
    Console.ReadLine();
  9. Your results should be the data in your Excel file.

You can filter the data in the Linq statement using something like:

var planets = from p in excel.Worksheet<Planet>("Planets")
 where p.Diameter > 5000
 select p;

Or order the data using:

 foreach (Planet planet in planets.OrderByDescending(p => p.Diameter))

This is a very simple example of this powerful method of reading Excel spreadsheets. https://github.com/paulyoder/LinqToExcel has many more examples and documentation.

If you’re looking to use this on a machine without Office installed, I believe you need https://www.microsoft.com/en-au/download/details.aspx?id=13255 to enable ACE.

Below is the entire code from this example:

using System;
using System.Linq;
using LinqToExcel;
using LinqToExcel.Attributes;

namespace cSharpExcelRead
{
    internal class Planet
    {
        [ExcelColumn("Planetary Body")]
        public string Name { get; set; }

        [ExcelColumn("Mass (1024kg)")]
        public double Mass { get; set; }

        [ExcelColumn("Diameter (km)")]
        public int Diameter { get; set; }
    }

    internal class Program
    {
        private static void Main(string[] args)
        {
            var excel = new ExcelQueryFactory(@"F:\cbitting\Planets.xlsx")

            {
                DatabaseEngine = LinqToExcel.Domain.DatabaseEngine.Ace,
                TrimSpaces = LinqToExcel.Query.TrimSpacesType.Both,
                UsePersistentConnection = true,
                ReadOnly = true
            };

            var planets = from p in excel.Worksheet<Planet>("Planets")
                          select p;

            foreach (Planet planet in planets.OrderByDescending(p => p.Diameter))
            {
                Console.WriteLine(planet.Name + " Mass: " + planet.Mass);
            }

            Console.ReadLine();
        }
    }
}
Reading Excel Files in .net Using LinqToExcel

Using Parallel and ThreadPool / QueueUserWorkItem in Combination with Writing / Locking Files

If you’ve wanted to combine parallel / multi-threading and also writing to files, you’ve probably experienced the issue of a file being locked when another process is writing at the same time. Something like:

An unhandled exception of type ‘System.IO.IOException’ occurred in mscorlib.dll

Additional information: The process cannot access the file ‘c:\yourfile’ because it is being used by another process.

One of the ways to get around this is utilize the “lock” statement.

In my below example, I’m running parallel tasks (including the optional MaxDegreeOfParallelism) and writing to a file. I also added a stopwatch and a .sleep to add time and debugging to the method. If you were to remove the lock() from the below, you should experience the ioexception.

Hope this helps as you create parallel tasks (so many great uses) and write files (maybe some simple logging, etc.)

You can also download the sample here.

using System;
using System.Diagnostics;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

namespace parallelTasks_Example
{
    //custom object for passing to queue task
    public class UpdateFile
    {
        public string lineData { get; set; }

        public string filePath { get; set; }

        public long timeTaken { get; set; }
    }

    internal class Program
    {
        //set an update to use a lock
        private static object fileLock = new object();

        private static void Main(string[] args)
        {
            //just for testing
            Stopwatch timer = new Stopwatch();

            timer.Reset();

            //get some data
            string[] contacts = { "jon", "jon2", "aaron", "efren", "diedrich", "tina", "sandy", "haylie", "trevor", "shondrella" };

            //do something with the data
            timer.Start();
            Parallel.ForEach(contacts, new ParallelOptions { MaxDegreeOfParallelism = 3 }, contact =>
                {
                    //do the task
                    Console.WriteLine(contact + " " + timer.ElapsedMilliseconds);
                    //create an object to past to the queue item
                    UpdateFile fileupdate = new UpdateFile { filePath = @"c:\temp\temp.txt", lineData = contact, timeTaken = timer.ElapsedMilliseconds };
                    //add the queue item
                    ThreadPool.QueueUserWorkItem(UpdateFile, fileupdate);
                });

            Console.ReadLine();
        }

        private static void UpdateFile(object fileData)
        {
            //set object to be locked
            lock (fileLock)
            {
                //get the file
                using (var fw = File.AppendText(((UpdateFile)fileData).filePath))
                {
                    //write the data
                    fw.WriteLine(((UpdateFile)fileData).lineData + ": " + ((UpdateFile)fileData).timeTaken.ToString());
                    //make process take long (for example)
                    System.Threading.Thread.Sleep(100);
                }
            }
        }
    }
}
Using Parallel and ThreadPool / QueueUserWorkItem in Combination with Writing / Locking Files

Using SortedSet as a Sorted List with Custom Comparer – c#

Sometimes you have a list of objects you want to stay sorted. You can usually use an IList and then linq: orderby to sort the list – but other times it helps to have the list always sorted. In those times, SortedSet works very well. In my below example, a sorted set contains some people – and whenever you look at the list in the SortedSet – they are already sorted (by age, name & location).

I was concerned about the performance of the SortedSet vs. a List, so after running some tests, as expected – the SortedSet needs some initial time when adding items (to compare) vs. the List which will is fast to write to – but needs a little time when sorting the list after. Below is some really fast numbers (data in milliseconds) I’ve compiled based on adding / reading 25000 records:

sortedSet-performance

Below is the sample using SortedSet and an IComparer:

using System;
using System.Collections.Generic;

namespace sortedSet
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            //create the SortedSet
            SortedSet<Person> people = new SortedSet<Person>(new PersonComparer());

            //add some random folks:
            Random rnd = new Random();
            for (int i = 1; i <= 25; i++)
            {
                //new person
                Person person = new Person();
                person.name = "Bob " + i.ToString();
                person.location = "Miami" + i.ToString();

                //random age
                int r = rnd.Next(1, 99);
                person.age = r;

                //add person to set
                people.Add(person);
            }

            //show our list:
            foreach (Person person in people)
            {
                Console.WriteLine(person.name + " - Age: " + person.age.ToString());
            }

            Console.ReadLine();
        }
    }

    //create comparer
    internal class PersonComparer : IComparer<Person>
    {
        public int Compare(Person x, Person y)
        {
            //first by age
            int result = x.age.CompareTo(y.age);

            //then name
            if (result == 0)
                result = x.name.CompareTo(y.name);

            //a third sort
            if (result == 0)
                result = x.location.CompareTo(y.location);

            return result;
        }
    }

    internal class Person
    {
        public string name { get; set; }

        public string location { get; set; }

        public int age { get; set; }
    }
}

-If you don’t have a need to sort your lists – don’t forget about HashSet (very fast).

Using SortedSet as a Sorted List with Custom Comparer – c#

Force WWW & Fix Redundant Hostnames on Google / SEO

I feel the term “SEO” is completely overused, however, there are a few things you want to do besides just having great content. One is make sure your site url is consistent. chrisbitting.com is different from http://www.chrisbitting.com. 

Google Analytics will provide you a suggestion to fix this if you’re experiencing traffic from multiple hostnames. Something like:

Property http://www.yourdomain.com is receiving data from redundant hostnames. Some of the redundant hostnames are:

This is easy to fix using your Global.asax page. Just add this to your code, replacing “yourdomain” with your actual domain. The Application_BeginRequest will catch and redirect to the altered url, also issuing a 301 to help search engines.

void Application_BeginRequest(object sender, EventArgs e)
    {
        

        if (HttpContext.Current.Request.Url.ToString().ToLower().Contains(

            "http://yourdomain.com"))
        {

            HttpContext.Current.Response.Status = "301 Moved Permanently";

            HttpContext.Current.Response.AddHeader("Location",

                HttpContext.Current.Request.Url.AbsoluteUri.ToString().ToLower().Replace(

                    "http://yourdomain.com", "http://www.yourdomain.com"));

            HttpContext.Current.Response.End();
        }
    }

You could do this using web.config + rewrite, but I enjoy this method more.

Force WWW & Fix Redundant Hostnames on Google / SEO