Search and Truncate / Trim Paragraph By Sentence C# (not word or character)

I’ve had the need on more than one occasion to preview or trim a paragraph around a search term, but I don’t want to just use a character count and cut words in half, or a word count and cut sentences in half. My simple method below takes a paragraph and a search word and returns a string that is truncated by sentences. With some additional params, you can decide how many sentences to include around the matching sentence, and whether to add a “read more” tag or a beginning tag to indicate that more text exists.

In this example, I’m searching for a word in a string object (‘para’), with no sentences prepended and one appended:

truncateBySent(para, "redmond", 0, 1, true, " View More...", false, "...", false, "Not Found")

In this example paragraph:

By a board session on the weekend of Dec. 14, some board members were exhausted from the months of work, and were concerned that the process had dragged on, said a person familiar with the matter. They left the meeting, at a hotel less than 8 miles from Microsoft’s Redmond, Wash., headquarters, without a pick. The board had more potential CEO candidates with whom they wanted to meet, and were frustrated that research wasn’t ready on at least one new prospect, said people familiar with the situation.

It would return:

…They left the meeting, at a hotel less than 8 miles from Microsoft’s Redmond, Wash., headquarters, without a pick. The board had more potential CEO candidates with whom they wanted to meet, and were frustrated that research wasn’t ready on at least one new prospect, said people familiar with the situation.
 

Another variation:

truncateBySent(para, "board", 1, 1, true, " View More...", false, "...", false, "Not Found")

Returns:

By a board session on the weekend of Dec. 14, some board members were exhausted from the months of work, and were concerned that the process had dragged on, said a person familiar with the matter. They left the meeting, at a hotel less than 8 miles from Microsoft’s Redmond, Wash., headquarters, without a pick. View More…
 

And here is the function. You’ll need these two using directives:

using System.Linq;
using System.Text.RegularExpressions;
  public static string truncateBySent(string source, string searchWord, int sentPrepend = 1, int sentAppend = 1, bool onlyShowFirst = true, string viewMoreTag = "", bool alwaysShowViewMoreTag = false, string startTruncTag = "", bool returnSourceIfKeywordNotFound = false, string returnNotFound = "")
        {
            //going to be the final string
            string truncated = "";

            //parse source sentences
           string[] sents = Regex.Split(source, @"(?<=[.?!;])\s+(?=\p{Lu})");

            //create some search start & end holders
            int i = 0;
            int ssent = -1;
            int esent = 0;

            //find start / end
            foreach (string sent in sents)
            {
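                //search using regex word boundaries (\b); Regex.Escape keeps
                //special characters in the search word from being read as regex syntax
                if (Regex.IsMatch(sent, "\\b" + Regex.Escape(searchWord) + "\\b", RegexOptions.IgnoreCase))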
                {
                    if (ssent == -1)
                    {
                        ssent = i;
                    }
                    else
                    {
                        esent = i;
                    }
                }

                i++;
            }

            //make final string:

            if (esent == 0 || onlyShowFirst == true) esent = ssent;

            i = 0;

            foreach (string sent in sents)
            {
                //keep every sentence in the window (assumes sentPrepend and sentAppend are not negative)
                if (i >= ssent - sentPrepend && i <= esent + sentAppend)
                {
                    truncated = truncated + sent + " ";
                }

                i++;
            }

            //add view more

            if (esent + sentAppend + 1 < sents.Count() || alwaysShowViewMoreTag == true)
            {
                truncated = truncated + viewMoreTag;
            }

            //add beginning tag
            if (ssent - sentPrepend > 0)
            {
                truncated = startTruncTag + truncated;
            }

            //check if anything was even found:

            if (ssent == -1)
            {
                if (returnSourceIfKeywordNotFound)
                { truncated = source; }
                else
                {
                    truncated = returnNotFound;
                }
            }

           //and now return the final string - do a trim and remove double spaces.
            //did i ever mention how much i despise double spaces?
            return truncated.Trim().Replace("  ", " ");
        }
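
To see how the sentence-splitting regex behaves by itself, here is a minimal sketch that runs the same pattern over a short made-up paragraph. The class name SentenceSplitDemo, the method SplitSentences, and the sample text are my own for illustration; only the pattern comes from the function above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SentenceSplitDemo
{
    // Same pattern as in truncateBySent: split on whitespace that follows
    // '.', '?', '!' or ';' and is followed by an uppercase letter.
    public static string[] SplitSentences(string text)
    {
        return Regex.Split(text, @"(?<=[.?!;])\s+(?=\p{Lu})");
    }

    public static void Main()
    {
        // Made-up sample text (an assumption, not from the original post).
        string para = "We met on Dec. 14 near Redmond, Wash., headquarters. Nobody was picked. Talks continue.";

        // "Dec. 14" stays together because a digit follows the space, and
        // "Wash.," stays together because no whitespace follows the period.
        foreach (string s in SplitSentences(para))
        {
            Console.WriteLine("[" + s + "]");
        }
    }
}
```

The pattern is a heuristic: an abbreviation followed by a capitalized word, such as "Dr. Smith", will still be split after the period, so treat the sentence boundaries as approximate.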

 

Happy Searching!


C# Whole Word Matching (RegEx)

If you’ve ever wanted to test a string to see if a word exists, but can’t use “.Contains” because it doesn’t respect whole words (not that I would expect it to), below is a fast, simple way using Regex:

Of course you’ll need: using System.Text.RegularExpressions;

Now set up your pattern and Regex object:

string pattern = @"\bteam\b";
Regex rx = new Regex(pattern, RegexOptions.IgnoreCase);

Now create a match:

Match m = rx.Match("Teamwork is working together.");

Does the word exist:

if (m.Success) {
//no: "Teamwork" contains "team", but not as a whole word
}

Try again using a string with the whole word:

m = rx.Match("I am just part of the team.");

Does the word exist now?:

if (m.Success) {
//yes!
}
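
If the word comes from user input instead of being hard-coded, the steps above can be wrapped in a small helper. This is only a sketch: the class WholeWord and the method ContainsWholeWord are hypothetical names of my own, and I’m assuming you want a case-insensitive match as in the example above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWord
{
    // Hypothetical helper (not part of the original post): returns true when
    // `word` occurs in `text` as a whole word, ignoring case.
    public static bool ContainsWholeWord(string text, string word)
    {
        // Regex.Escape makes characters such as '.' or '+' match literally.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.IsMatch(text, pattern, RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(ContainsWholeWord("Teamwork is working together.", "team")); // False
        Console.WriteLine(ContainsWholeWord("I am just part of the team.", "team"));   // True
    }
}
```

One caveat: \b only marks the boundary between a word character and a non-word character, so a search term that starts or ends with punctuation (for example "C#") may not match the way you expect.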

Of course, this is just a tiny portion of the power of Regex. Happy matching!


Google Analytics Custom Report Filters – RegEx

You may have noticed that Google Analytics doesn’t seem to let you filter standard reports on multiple dimensions if you’re only showing one. You can easily create a custom report, add filters on multiple dimensions, and still show only one. However, many users get scared by the RegEx option under Filters (the only other choice is Exact). It’s not that complex; below is a simple example showing multiple filters using a simple RegEx ‘contains’.
