I still find myself using DataTables (from SQL Server, mySQL, etc.) on a regular basis. In an effort to see how some different methods of looping through the data performs and what method might be the fastest, I put together some small, fast tests, below are my test results.
[Just jump to the fastest method.]
My sample data was comprised of about a 34,000 row datatable – running these tests on a i7, 32gb, VS2015 setup. The 34,000 was chosen because a box of Froot Loops contains about 1800 Froot Loops (I think), but 1800 was too small, so how about 18 boxes of Froot Loops?
Let me first show you the slowest method (using DataTable.Select):
foreach (DataRow sampleRow in sampleDT.Select("Field1 = 2")) { string sval = sampleRow["Field2"].ToString(); sb.Append(sval); }
This is an old, popular way to loop through the data, but in all my tests was the slowest.
Now, let me show you the two fastest methods:
Linq:
var sampleResults = from DataRow sampleRow in sampleDT.AsEnumerable() where sampleRow.Field<int>("Field1") == 2 select sampleRow; foreach (DataRow sampleRow in sampleResults) { string sval = contentRow["Field2"].ToString(); sb.Append(sval); }
And the fastest way I found – but obviously adds complexity to other logic if you’re worried about ordering, etc. – was using Parallel + Linq:
var sampleResults = from DataRow sampleRow in sampleDT.AsEnumerable() where sampleRow.Field<int>("Field1") == 2 select sampleRow; Parallel.ForEach(sampleResults, sampleRow => { string sval = sampleRow["Field2"].ToString(); sb.Append(sval); });
And for the fun of it, the other methods I tested against:
for (int i = 0; i < sampleDT.Rows.Count; i++) { if ((int)sampleDT.Rows[i]["Field1"] == 2) { string sval = sampleDT.Rows[i]["Field2"].ToString(); sb.Append(sval); } }
And:
Parallel.For(0, sampleDT.Rows.Count, rowi => { if ((int)sampleDT.Rows[rowi]["Field1"] == 2) { string sval = sampleDT.Rows[rowi]["Field2"].ToString(); sb.Append(sval); } });
I’ll probably stick w/ Linq & Parallel.ForEach.