Thursday, September 27, 2012

Playing With Google Search Results - 2

Another way of getting web search results back from Google is to use the Google API. I spent a couple of hours researching that option and did not find many exciting choices. One option is the Google API, which is deprecated, limits the number of searches to about 100 per day, returns no more than 64 results and does not allow automated searches. The other is Google Custom Search, which can only be used to create site-specific custom searches. Anyway, as an exercise I decided to implement a call to the Google API (the deprecated one). There are a few options available.

The easiest way to use the Google API that I found was Google API for .NET. After I downloaded and referenced the GoogleSearchAPI.NET20 dll, it took surprisingly few lines of code to create a quick prototype that queries the Google API:

private void btnSearch_Click(object sender, EventArgs e)
{
 string searchTerms = txtTerms.Text;
 // The deprecated API returns at most 64 results, so fewer than the 100 requested may come back.
 List<string> GoogleApiResults = GoogleAPI.StringResultList(searchTerms, 100);
}

public static class GoogleAPI
{
 public static GwebSearchClient client = new GwebSearchClient("");

 public static List<String> StringResultList(string terms, int number)
 {
  IList<IWebResult> list = client.Search(terms, number);
  List<String> results = new List<string>();
  foreach (var result in list)
  {
   results.Add(result.Url);
  }
  return results;
 }
}

Search Results

References

Google API for .NET

Also posted on my website

Thursday, September 13, 2012

Learning MVC: No parameterless constructor defined for this object

I'm developing a sample application using MVC - a "blog engine". OK, getting rid of all the buzzwords, it is just a few tables: Blogs, Bloggers, Posts. You can add bloggers, create blogs and add posts to a selected blog. Being a good boy, I'm trying not to pass objects like Blog or Post to the view, but rather use ViewModels wherever it makes sense. Nothing complicated, for example:

public class BlogViewModel
{
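 // Properties rather than fields: the default model binder only populates public properties.
 public Blog Blog { get; set; }
 public List<Post> Posts { get; set; }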
 ...
 
 public BlogViewModel(Blog blog, List<Post> posts, ... )
 {
  Blog = blog;
  Posts = posts;
  ...
 }
}

and then in the BlogController I would have these methods for creating a new blog:

public ActionResult Create()
{
 Blogger selectedBlogger = db.Bloggers.First();
 Blog blog = new Blog();
 return View(new BlogViewModel(blog, new List<Post>(), ...));
}

[HttpPost]
public ActionResult Create(BlogViewModel viewModel)
{
 Blog blog = viewModel.Blog;
 blog.Blogger = db.Bloggers.FirstOrDefault(b => b.BloggerID == viewModel.BloggerID);

 if (ModelState.IsValid)
 {
  try
  {
   db.Blogs.Add(blog);
   db.SaveChanges();
  }
  catch (Exception)
  {
   // process errors
  }
 }
 return View(new BlogViewModel(blog, new List<Post>(), ...));
}

Something like that. So I'm testing the Create method when I suddenly receive the "No parameterless constructor defined for this object" error.

No parameterless constructor defined for this object

That left me scratching my head for some time, because I could not figure out which constructor I was missing. It took a bit of searching to realise that the constructor is missing in the ViewModel. If I modify the class shown above as follows

public class BlogViewModel
{
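 // Properties rather than fields: the default model binder only populates public properties.
 public Blog Blog { get; set; }
 public List<Post> Posts { get; set; }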
 ...
 
 public BlogViewModel()
 { 
 }

 public BlogViewModel(Blog blog, List<Post> posts, List<Blog> blogs, int bloggerid, List<Blogger> bloggers)
 {
  Blog = blog;
  Posts = posts;
  ...
 }
}

the error just goes away (notice that parameterless constructor that is just sitting there now, happily doing nothing?). Why is that? Well, I'll be totally honest: I have no idea.
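For what it's worth, the most likely explanation is that the default model binder has to create the BlogViewModel instance for the [HttpPost] action itself, before it fills in the posted values, and it does so through reflection without passing any arguments. Below is a minimal console sketch of that behaviour, not the MVC source: the class names WithoutDefault, WithDefault and ConstructorDemo are made up for illustration, and Activator.CreateInstance stands in for what the binder does internally.

```csharp
using System;

// Hypothetical type with only a parameterised constructor, like the original BlogViewModel.
public class WithoutDefault
{
    public WithoutDefault(int value) { }
}

// Hypothetical type that also declares a parameterless constructor.
public class WithDefault
{
    public WithDefault() { }
    public WithDefault(int value) { }
}

public static class ConstructorDemo
{
    // Tries to create an instance the way reflection-based code does: with no arguments.
    public static bool CanCreate(Type type)
    {
        try
        {
            Activator.CreateInstance(type);
            return true;
        }
        catch (MissingMethodException)
        {
            // Thrown when the type has no public parameterless constructor.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CanCreate(typeof(WithoutDefault))); // False
        Console.WriteLine(CanCreate(typeof(WithDefault)));    // True
    }
}
```

A class with no constructors at all gets a parameterless one from the compiler, which is why the error only shows up once a constructor with parameters is added.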

Reference

Fun and Struggles with MVC – No Parameterless Constructor Defined

Tuesday, September 11, 2012

Running a Command Line Program in C# and Getting Output

A simple example. Let's say I want to run ping from the command line, but to make this more automated, or maybe more user-friendly, I would like to run a C# application that pings an IP address, captures the returned result and displays it in a user-friendly format.

The first thing is to start the command prompt and execute a process. One of the most convenient ways to do it is to use the ProcessStartInfo and Process classes, which are part of the System.Diagnostics namespace. ProcessStartInfo takes the program to run, in this case cmd.exe, and its parameters, in this case ping together with its own parameters. Here's how it works:

private void btnPing_Click(object sender, EventArgs e)
{
 string command = "/c ping " + txtIP.Text;

 ProcessStartInfo procStartInfo = new ProcessStartInfo("CMD", command);

 Process proc = new Process();
 proc.StartInfo = procStartInfo;
 proc.Start();
}

Command prompts started from Windows Form

The process starts and the familiar command window appears, then the ping command runs. Now to capture the results of the ping, a few other lines are needed. Firstly, the output of the process needs to be redirected. The following values need to be set:

procStartInfo.RedirectStandardOutput = true;
procStartInfo.UseShellExecute = false;

Next, to capture the output line by line as it is sent by the process, I'll attach a function that does it asynchronously.

proc.OutputDataReceived += new DataReceivedEventHandler(proc_OutputDataReceived);
proc.Start();
proc.BeginOutputReadLine();
proc.WaitForExit();

The function can do anything, but in my case I'm simply redirecting the output to the Windows Form.

void proc_OutputDataReceived(object sender, DataReceivedEventArgs e)
{
 if (e.Data != null)
 {
  txtOutput.Text = txtOutput.Text + e.Data.Trim() + Environment.NewLine;
 }
}

Looks correct, so why am I receiving this exception:

Cross-thread operation not valid: Control 'txtOutput' accessed from a thread other than the thread it was created on.

Well, it looks like it's telling me that the output handler runs on another thread and cannot access my text box from that thread. Long story short, this is the shortest solution I have found for this issue (there are many options, some as complicated as using a BackgroundWorker).

void proc_OutputDataReceived(object sender, DataReceivedEventArgs e)
{
 if (e.Data != null)
 {
  string newLine = e.Data.Trim() + Environment.NewLine;
  MethodInvoker append = () => txtOutput.Text += newLine;
  txtOutput.BeginInvoke(append);
 }
}

Command prompt output redirected to Windows Form
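To see the redirection pieces above working together outside of a form, here is a minimal console sketch. ProcessRunner and RunAndCapture are hypothetical names, CreateNoWindow is an extra setting that the steps above do not use, and the lines are collected into a list instead of a text box, so no cross-thread call is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class ProcessRunner
{
    // Hypothetical helper: runs a program and returns the lines it wrote to standard output.
    public static List<string> RunAndCapture(string fileName, string arguments)
    {
        var lines = new List<string>();
        var startInfo = new ProcessStartInfo(fileName, arguments)
        {
            RedirectStandardOutput = true, // hand the output to us instead of the console window
            UseShellExecute = false,       // required for redirection
            CreateNoWindow = true          // assumption: hide the command window
        };

        using (var proc = new Process { StartInfo = startInfo })
        {
            // The handler is called on a thread-pool thread, once per output line.
            proc.OutputDataReceived += (sender, e) =>
            {
                if (e.Data != null)
                {
                    lock (lines) { lines.Add(e.Data.Trim()); }
                }
            };
            proc.Start();
            proc.BeginOutputReadLine();
            proc.WaitForExit(); // blocks until the process exits and its output has been read
        }
        return lines;
    }

    public static void Main()
    {
        // Windows-only usage, mirroring the ping example from the post.
        foreach (string line in RunAndCapture("cmd", "/c ping 127.0.0.1"))
        {
            Console.WriteLine(line);
        }
    }
}
```

One thing to keep in mind in the Windows Forms version: calling WaitForExit from the button click handler blocks the UI thread, so updates queued with BeginInvoke are only displayed after the process has finished.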

References:

Having trouble with Process class while redirecting command prompt output to winform
Capturing process output via OutputDataReceived event

Friday, September 7, 2012

Playing with Google Search Results

You will need:

Create a Visual Studio project, for example a C# Windows Forms application. Drop a TextBox, a Button and a ListView on the form. Create a class for the methods to be used, let's say Helper.cs. First, I'm using System.Net.WebClient to call Google and get a page of search results.

public static WebClient webClient = new WebClient();

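public static string GetSearchResultHtml(string keywords)
{
    StringBuilder sb = new StringBuilder("http://www.google.com/search?q=");
    // Escape the keywords so that spaces and characters like & or # do not break the query string.
    sb.Append(Uri.EscapeDataString(keywords));
    return webClient.DownloadString(sb.ToString());
}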

The string that is returned is the html of the first page of the Google search for the string that is passed to the method. Opened in the web browser, it will look something like this

Google search result page

What I want to extract is the actual links, which are marked in red on the screenshot above. Here I'm going to use HtmlAgilityPack to load the string into an HtmlDocument object. After the string is loaded, I will use a simple LINQ query to extract the nodes that match certain conditions: they are HTML links (a href), and the URL of the link contains either "/url?" or "?url=". At this point, I get quite an unreadable list of values.

Raw URLs

To bring it into readable form, I'll match it to a regular expression and then load the results into the ListView. Here is the code:

public static Regex extractUrl = new Regex(@"[&?](?:q|url)=([^&]+)", RegexOptions.Compiled);

public static List<String> ParseSearchResultHtml(string html)
{
    List<String> searchResults = new List<string>();

    var doc = new HtmlAgilityPack.HtmlDocument();
    doc.LoadHtml(html);

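    // SelectNodes returns null (not an empty collection) when nothing matches.
    var anchors = doc.DocumentNode.SelectNodes("//a");
    if (anchors == null)
    {
        return searchResults;
    }

    var nodes = (from node in anchors
                 let href = node.Attributes["href"]
                 where null != href
                 where href.Value.Contains("/url?") || href.Value.Contains("?url=")
                 select href.Value).ToList();

    foreach (var node in nodes)
    {
        var match = extractUrl.Match(node);
        // Skip links where the pattern does not find a q= or url= parameter.
        if (match.Success)
        {
            searchResults.Add(HttpUtility.UrlDecode(match.Groups[1].Value));
        }
    }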

    return searchResults;
}

Here is the result:

Final Results
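To make the regular expression step more concrete, here is a small standalone sketch that applies the same pattern to a single link. The href value is made up for illustration, UrlExtractDemo and ExtractTarget are hypothetical names, and I'm using WebUtility.UrlDecode instead of HttpUtility.UrlDecode only so that the sample runs in a console application without a reference to System.Web.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class UrlExtractDemo
{
    // Same pattern as in the post: capture the value of a "q" or "url" query parameter.
    private static readonly Regex ExtractUrl =
        new Regex(@"[&?](?:q|url)=([^&]+)", RegexOptions.Compiled);

    // Returns the decoded target URL, or null when the href has no q/url parameter.
    public static string ExtractTarget(string href)
    {
        Match match = ExtractUrl.Match(href);
        return match.Success ? WebUtility.UrlDecode(match.Groups[1].Value) : null;
    }

    public static void Main()
    {
        // Made-up href in the shape of a redirect link from the results page.
        string href = "/url?q=http://example.com/some%20page&sa=U&ved=abc";
        Console.WriteLine(ExtractTarget(href)); // http://example.com/some page
    }
}
```

The capture group stops at the first "&", so the tracking parameters that follow the target URL are dropped.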

I'm not quite sure why this may be useful, but as an exercise it is possible to add an option to parse through a certain number of pages, rather than just the first page. But if you try to run those queries in an automated mode, Google will soon start serving 503 errors to you.
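As a rough sketch of the paging idea, the URL for each page could be built like this. It assumes that Google's "start" query parameter gives the offset of the first result and that a page holds 10 results; PagingDemo and BuildSearchUrl are hypothetical names, not part of the code above.

```csharp
using System;

public static class PagingDemo
{
    // Hypothetical helper: builds the search URL for a zero-based results page.
    // Assumes the "start" parameter is the offset of the first result, 10 results per page.
    public static string BuildSearchUrl(string keywords, int pageIndex)
    {
        const int resultsPerPage = 10;
        return "http://www.google.com/search?q=" + Uri.EscapeDataString(keywords)
             + "&start=" + (pageIndex * resultsPerPage);
    }

    public static void Main()
    {
        // Print the URLs for the first three pages.
        for (int page = 0; page < 3; page++)
        {
            Console.WriteLine(BuildSearchUrl("c# tutorial", page));
        }
    }
}
```

Each URL can then be downloaded and parsed the same way as the first page, keeping in mind the 503 responses mentioned above.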
