The Wolf Bytes: June 2009

Thursday, June 25, 2009

Parsing MGraph

OK, now you have created your MGrammar and have a project to turn your user’s input from text into MGraph. Now what? You have a few options. For my little toy project I want to take the user’s input and turn it into a graph of POCO objects that I can do something with. When defining your grammar you have the opportunity to define the projections, that is, how you want your MGraph to look. You can find a great article on MGraph at MSDN.

Intellipad is great when defining, testing and building the projections for your grammar. Here is the relevant portion of the MGraph generated by the About My Pets grammar:

Sample MGraph Output Fragment, this would be considered one record within MGraph.

{
    PetInfo => {
      Name => "Razor",
      AnimalType => "Terrier",
      Age => "9",
      Sex => "female",
      PrimaryColor => "white",
      SecondaryColor => "brown"
    },
    Activities => {
      {
        Type => "jump",
        Minutes => null,
        Seconds => "12"
      },
      {
        Type => "bark",
        Minutes => "2",
        Seconds => null
      }
    },
    Meals => {
      {
        Cups => "1",
        MealTime => "breakfast"
      },
      {
        Cups => "2",
        MealTime => "dinner"
      }
    }
}

This is nice to review but what can we do with it?

First off, let’s generate some XAML from our input. I found this was a good way to understand what was going on under the hood, as well as some of the terminology.

Configuration to run M Command Line Tools

First off, we need to configure a command prompt to make it a little easier to work with the Oslo command line tools. To do this, create a simple .CMD file somewhere and add the following text:

@set PATH=%PATH%;%programfiles%\Microsoft Oslo\1.0\bin

Then create a shortcut to that file with the following information:

Click on that shortcut to open the command prompt. You should now have the Oslo tools in your path; to test this, simply type “m” and you should see something like this:

Compiling our Grammar

Now back to our regularly scheduled post. First we need to compile our grammar, so change the directory to where your file was saved and type:

C:\MyDirectory>m [MyGrammarFile].mg

When you do so, you should see the following:

This creates a compiled version of your grammar with an .mx extension; you can confirm this by looking at the contents of the directory.

Generate some XAML for your User Input

Now we are ready to do something with our input file, so save a sample of user input into a text file named something like UserInput.txt.

The sample I saved is similar to:

About my pets

Razor is a Terrier that is a
9 year old female, and her color is white.
She will jump for 15 seconds and bark for 2 minutes.
She eats 1 cup of food for breakfast and
eats 2 cups of food for dinner.

Rocket is a Rat Terrier that is a 6 year old male,
and his color is black. He will whine for 3 minutes.
He eats 2 cups of food for dinner

Next run the following:

C:\MyDirectory>mgx /r:[MyGrammarFile].mx UserInput.txt /t:xaml

This will generate a XAML file from your sample user input or let you know any errors that may have occurred.

From my sample grammar, here is a portion of the generated XAML.

Getting at this data in .NET

This is nice, and may be a little easier to parse, but what I really want to do is deserialize MGraph into a tree of POCO (plain old CLR objects) so I can do some “stuff” with the data. I built a simple method to load the compiled grammar, and build an in-memory representation of the parsed user input with an instance of System.Dataflow.GraphBuilder from the Oslo runtime library.

Before you get started, you need to add references to your project:

using Microsoft.M;
using System.Dataflow;

Both can be found in the <Program Files>\Microsoft Oslo\1.0\bin directory

Create our Parser

First we need a class that knows how to format parsing errors; this will inherit from System.Dataflow.ErrorReporter.

using System;
using System.Dataflow;
using System.Linq;

namespace PetParser
{
    class ParserErrorReporter : ErrorReporter
    {
        protected override void OnError(ErrorInformation errorInformation)
        {
            string msg = string.Format(errorInformation.Message, errorInformation.Arguments.ToArray());

            throw new FormatException(
                string.Format("Syntax error at [{0}, {1}]: {2}",
                errorInformation.Location.Span.Start.Line,
                errorInformation.Location.Span.Start.Column,
                msg));
        }
    }
}

Parse the User Input

Next we’ll create a simple method to load up our compiled grammar, build a parser instance, perform the parse and hand back the MGraph representation as an object graph of System.Dataflow.Node instances.

public void ParseGrammar(string grammar)
{
    //Load the grammar that was compiled into the assembly as a resource.
    using (var img = MImage.LoadFromResource(System.Reflection.Assembly.GetExecutingAssembly(), "PetParser.mx"))
    {
        //Load up the specific MGrammar file.
        var factory = img.ParserFactories["Pets.MyPets"];

        //Create our parser.
        var parser = factory.Create();

        //Create an instance of NodeGraphBuilder to let the
        //parser know what type of graph to build.
        parser.GraphBuilder = new NodeGraphBuilder();

        try
        {
            var grammarTextStream = new StringTextStream(grammar);
            //Attempt to parse the grammar, any errors will be
            //formatted via the ParserErrorReporter and handed
            //via Try/Catch
            var root = (Node)parser.Parse(grammarTextStream, new ParserErrorReporter());

            //Start walking at the root of the graph, at nesting level 0.
            WalkTree(root, 0);

        }
        catch (FormatException exc)
        {
            //If we get an error parsing the grammar, just write it (for now)
            LogIt(0, exc.Message);
        }
    }
}

Walk the System.Dataflow.Node tree

void WalkTree(Node node, int level)
{
    foreach (Edge recordEdge in node.Edges)
    {
        //Only leaf nodes carry an atomic value.
        var value = string.Empty;
        if (recordEdge.Node != null && recordEdge.Node.AtomicValue != null)
            value = string.Format(@"Value: ""{0}""", recordEdge.Node.AtomicValue.ToString());

        LogIt(level, "{0}. [{1}] Brand Text: \"{2}\" Label Text: \"{3}\" {4}", level, node.NodeKind, node.Brand.Text, recordEdge.Label.Text, value);

        //Recurse into the child one level deeper; skip edges that have no node.
        if (recordEdge.Node != null)
            WalkTree(recordEdge.Node, level + 1);
    }
}

Output

Now when we execute this we get the following output

The following table is from the MSDN Page “MGraph Object Model”

Now that we are walking the tree, we just need to create instances of and populate our simple .NET types. This could be done with some sort of state machine or an implementation of the visitor pattern.
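As a rough sketch of that last step, here is one way the mapping onto POCOs could look. To keep the example self-contained it does not touch the Oslo runtime at all: SimpleNode is a hypothetical stand-in for a parsed node (a label, an optional value and children), and the Pet, Activity and Meal classes and the PetMapper helper are names I made up to mirror the MGraph fragment shown earlier. Treat it as an illustration of the idea, not as the way System.Dataflow.Node has to be consumed.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for a parsed MGraph node. NOT the Oslo Node type.
public class SimpleNode
{
    public string Label { get; set; }
    public string Value { get; set; }
    public List<SimpleNode> Children { get; private set; }

    public SimpleNode(string label, string value = null, params SimpleNode[] children)
    {
        Label = label;
        Value = value;
        Children = new List<SimpleNode>(children);
    }

    // First child with the given label, or null when it is absent.
    public SimpleNode Child(string label)
    {
        return Children.FirstOrDefault(c => c.Label == label);
    }
}

// Plain old CLR objects mirroring the record in the MGraph fragment.
public class Activity
{
    public string Type { get; set; }
    public int? Minutes { get; set; }
    public int? Seconds { get; set; }
}

public class Meal
{
    public int Cups { get; set; }
    public string MealTime { get; set; }
}

public class Pet
{
    public string Name { get; set; }
    public string AnimalType { get; set; }
    public int Age { get; set; }
    public List<Activity> Activities { get; private set; }
    public List<Meal> Meals { get; private set; }

    public Pet()
    {
        Activities = new List<Activity>();
        Meals = new List<Meal>();
    }
}

public static class PetMapper
{
    // Value of the labelled child, or null when the child or value is missing.
    static string Text(SimpleNode parent, string label)
    {
        var child = parent == null ? null : parent.Child(label);
        return child == null ? null : child.Value;
    }

    // Same as Text, parsed as an integer; a null value stays null.
    static int? Number(SimpleNode parent, string label)
    {
        var text = Text(parent, label);
        return text == null ? (int?)null : int.Parse(text, CultureInfo.InvariantCulture);
    }

    // Turns one record (PetInfo, Activities, Meals) into a Pet instance.
    public static Pet ToPet(SimpleNode record)
    {
        var info = record.Child("PetInfo");
        var pet = new Pet
        {
            Name = Text(info, "Name"),
            AnimalType = Text(info, "AnimalType"),
            Age = Number(info, "Age") ?? 0
        };

        var activities = record.Child("Activities");
        if (activities != null)
            foreach (var a in activities.Children)
                pet.Activities.Add(new Activity
                {
                    Type = Text(a, "Type"),
                    Minutes = Number(a, "Minutes"),
                    Seconds = Number(a, "Seconds")
                });

        var meals = record.Child("Meals");
        if (meals != null)
            foreach (var m in meals.Children)
                pet.Meals.Add(new Meal
                {
                    Cups = Number(m, "Cups") ?? 0,
                    MealTime = Text(m, "MealTime")
                });

        return pet;
    }
}
```

With the real runtime, the same shape would apply, with the edge labels and atomic values that WalkTree prints taking the place of SimpleNode's Label and Value.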

I wouldn’t be very surprised if upcoming releases of Oslo include built-in mechanisms to create some sort of .NET code/assemblies and provide automatic serialization/deserialization using MSchema, but for now this isn’t too bad.

The next task is to build a front end to collect user input. I’m thinking along the lines of an Azure-hosted Silverlight application, where the grammar is passed to the server via WCF and all the actual parsing happens there.

-ec

Embedding MGrammar Files Within Your Project

Intellipad is an awesome tool, but if you really want to do some work with Oslo you are probably going to need to write a little code. This implies that you will need to get your MGrammar definition files into your Visual Studio project.

Once you have created and tested your MGrammar definition files within Intellipad, save them with the extension .MG and include them in your Visual Studio project. Next you will need to do a bit of hand-coding on your project file; the easiest way to do this is to click on “Unload” and then “Edit”.

Once you do that, you will need to modify your project file with the following changes:

and

and then finally change the action for your .MG file

Here are the actual lines for your cut-and-paste enjoyment:

<MgTarget>Mgx</MgTarget>
<MgTarget>MgxResource</MgTarget>
<MgIncludeReferences>false</MgIncludeReferences>

<MGrammarLibraryPath Condition="'$(MgCompilerPath)'!=''">$(MgCompilerPath)</MGrammarLibraryPath>
<MGrammarLibraryPath Condition="'$(MGrammarLibraryPath)'==''">$(ProgramFilesx86)\Microsoft Oslo\1.0\bin</MGrammarLibraryPath>

Once you’ve done this, you will need to reload your project.

At this point when you compile, your MGrammar should get compiled and included in your project as a resource. You can verify this by using Reflector.

Now to actually do something with this in your application you need to load the resource. The following chunk of code will do that for you:

DynamicParser parser = null;

using (var img = MImage.LoadFromResource(System.Reflection.Assembly.GetExecutingAssembly(), "AboutMyPets.Web.mx"))
{
    var factory = img.ParserFactories["Pets.MyPets"];
    parser = factory.Create();
}

Also, as a side note, you will need to add the following references to your project.

using Microsoft.M;
using System.Dataflow;

Both can be found in the <Program Files>\Microsoft Oslo\1.0\bin directory

Now that we have our grammar loaded from a resource into M’s DynamicParser instance we can pass in some text and hopefully get something meaningful out of it. Stay tuned for an upcoming post…

-ec

Wednesday, June 24, 2009

WebKit/Safari Application Cache Work-Around

If you are using the offline Application Cache (as specified by the W3C and implemented by Safari) in the initial 4.0 release of WebKit/Safari, you might run into a strange problem. Let me explain.

Initial Download

When you first download the site into your application cache everything works fine. From Fiddler (one of my all time favorite tools) you can see the “stuff” that gets sent over the wire.

As you can see, it includes an ETag and a Last-Modified date even though we have additional headers to ignore the cache. At least that’s how IIS 7.0 serves up the content. I added the following custom response headers to make sure these files aren’t cached.

When you Update Your Application Cache

Now the problem: when you update your manifest file, all the offline files should be re-downloaded and stored in the cache. When the resources are requested, you will see the following traffic within Fiddler.

Drilling down into one of these requests, it becomes fairly obvious where the problem is:

When WebKit/Safari attempts to download a resource, it sends an If-Modified-Since and an If-None-Match header. IIS looks at the file and says, nope, the file hasn’t changed, so I’ll *help* you out by returning a 304 Not Modified status code. WebKit/Safari says, nope, that’s not good enough, I’m going to request it again. As you can see in the Fiddler trace, this will continue as long as you let it. If you clear the cache in the browser and make the request, it won’t send the headers that trigger the 304 Not Modified check, so it will work, but I don’t think you want your users to have to clear the cache every time you publish an update.
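To make the exchange a bit more concrete, here is a simplified sketch of the decision a web server makes for a conditional request. This is not IIS’s actual implementation; ConditionalGet and StatusFor are names I made up, and real servers also deal with weak ETags, lists of ETags and the “*” wildcard. The point is only that a 304 is possible when the browser sends a validator that still matches.

```csharp
using System;

public static class ConditionalGet
{
    // Returns the status code a server would typically choose:
    // 304 when the validators sent by the browser still match the file,
    // 200 (full response) otherwise. If-None-Match wins over
    // If-Modified-Since when both are present.
    public static int StatusFor(string ifNoneMatch, DateTime? ifModifiedSince,
                                string currentETag, DateTime lastModified)
    {
        if (ifNoneMatch != null)
            return ifNoneMatch == currentETag ? 304 : 200;

        if (ifModifiedSince.HasValue)
            return lastModified <= ifModifiedSince.Value ? 304 : 200;

        // No validators on the request: nothing to compare, send everything.
        return 200;
    }
}
```

Note the last branch: a request that carries neither header always gets the full 200 response, which is the behavior we see after clearing the browser cache.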

The Solution

I searched forever (well at least 5 minutes) on how to disable 304 checking and the only thing that was even close was to create an HttpModule to remove the headers. That’s what I ended up doing.

Obviously this is C# code that runs within IIS. If you’ve followed this far, chances are you get the gist of what I’m doing here. The code and solution aren’t the important thing; understanding what needs to be done is.

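using System;
using System.Web;

public class RemoveCacheTagModule : IHttpModule
{
    public void Dispose()
    {
        // Nothing to clean up. Throwing here would surface as an error
        // when the application disposes of the module.
    }

    public void Init(HttpApplication context)
    {
        // Hook in just before the response goes out to the client.
        context.PreSendRequestContent += new EventHandler(context_PreSendRequestContent);
    }

    void context_PreSendRequestContent(object sender, EventArgs e)
    {
        // Strip the validators so the browser has nothing to send back
        // in If-None-Match / If-Modified-Since on its next request.
        HttpContext.Current.Response.Headers.Remove("ETag");
        HttpContext.Current.Response.Headers.Remove("Last-Modified");
    }
}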

With the HttpModule installed and Fiddler running, you can now see that the headers no longer contain a Last-Modified or an ETag, so when WebKit/Safari makes the request for the resource, IIS has no choice but to return the resource with a status code of 200.

Hope that saves someone a bit of time.

-ec

Thursday, June 18, 2009

Not all Code is Created Equally (Nor should it be)

Recently I built a stand-alone component that extends the ASP.NET FileUpload control to save to the Amazon S3 web service using method calls as simple as the FileUpload’s SaveAs() method. As I was writing the code for this component, I slipped into a different mode of development where I was thinking about all the extremely rare, 1 in 100,000 weird edge cases that people using this component might run into. The primary reason was that this component was intended to be released into the wild, and I just wanted to release it and forget it. At this point you may be saying, hmmm…this guy is a hack if he doesn’t always consider the “weird edge cases” that might occur. I beg to differ. Writing really solid and truly 100% industrial-strength code is extremely difficult and time consuming. Getting your code to 99.99% perfect probably takes a couple of orders of magnitude more time and effort than getting it to 99% perfect. The final usage and intended audience for the S3FileUpload component are completely different than those of the majority of the code we write for our business apps.

In most business applications, our software architectures and canned components should be at a point where our role is to write some sort of interface (UI or service) to populate our business objects, validate that the data in those business objects is correct, possibly perform some business logic on that data and persist it to the database. Of course there are all sorts of newfangled patterns that can be followed to make this happen, but in essence that’s for the most part what they are doing.

Once we accept that the majority of the code we write is never going to be used in the navigational systems of the space shuttle, or even in the base class library of the .NET Framework, we can get on with the business of establishing the right level of software quality for the problem we are attempting to solve. If you are thinking at this point, OK good, the way I’m putting all my business logic in the UI is an acceptable level of quality: this article assumes a baseline of experience, knowledge and quality practices, so come back after you have a few more years and/or systems under your belt.

There are different concepts that need to be evaluated when choosing the “level of software quality”:

What happens when your bugs sneak past QA?

Are you building some data entry screens that get used by two people once a quarter to update some values? If this works correctly 95% of the time, you may have a failure once every few years or so.
How easy is it for your users to know that there is a failure? Does it crash? Does it save the wrong value?
What happens when a wrong value gets saved? Do you just need to re-enter the value and all is well?
Do you have to run a simple process that reprocesses data that takes two minutes? Does it take two hours?
Does this wrong value impact the pay of one or two employees? Maybe it impacts one or two thousand? How much work is it to issue new pay checks?
Is this wrong value impacting the commission charged on selling securities where one or two days using the wrong value costs the company millions of dollars?

Obviously making sure that a single value is entered, validated and saved correctly sounds like a simple thing right? You can always get that one right.

How much work is it to craft a 100% perfect solution?

For that value that is entered by two people once a quarter, do you just have a target range that you need to validate against?
Do you need to download values from a government site, make sure that happens properly and use those values to validate?
Do you need to download values, apply some corporate whiz-bang business logic and then do validation?
Do you need to download values, apply biz-logic and get sign-off from the CFO?
Do you need to download values, apply biz-logic and get sign-off from the CFO and then perform some workflow with the board of directors before this value becomes valid?

As you can probably gather, if the impact of the failure is minimal, but the effort and its corresponding cost of building bulletproof software are not, the decision on the investment in time (and money) is not an easy one. It’s in our nature as software developers to try to achieve perfection; well, at least it should be. But in some cases the cost of perfection just isn’t justified. I’m not sure there is an empirical formula here, but I think there are things that can be done.

Some strategies for dealing with being lazy (or using your time wisely)

If you have a spec you are working from, not only understand the words that make up that spec, but understand the spirit in which that spec was created. That is, try to read between the lines and understand what the person really wants to get out of the software and what it needs to do. Writing a spec is not easy, even for us analytical types (programmers); for the creative types it is even more difficult.
Discuss the importance of the features with the business stakeholders. Don’t get too technical, but try to explain your perceived efforts and get an understanding of the impacts of failure.
If you decide that the failure isn’t critical and your time is better spent doing other work or adding other features, whatever you do, don’t fail silently. Fail quickly and fail in an obvious way. Whatever you do, don’t do something like: try {…} catch {};
Make sure you are using a good error logging format. If you do find yourself in an unexpected state that would cause a failure, capture that state, capture the stack trace and have the logging system shoot you an email. Most of the time it’s easier to fix the problem than it is to find it in the first place.
Write the right unit tests. I’ll have to admit, I’m not a big test-first or 100% code coverage fan, but a number of well-placed unit tests have saved my a** more than a few times. What is the right number? Sorry, I don’t have that answer; I think it is a function of the sophistication of the software and the cost of the failures.
Leverage your peers. If you are working on something where you know you should spend considerably more time, but it just doesn’t make sense for you to churn on it for a week or two, bring in a trusted peer and do a code or concept review. You may be too close and miss something obvious.
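To illustrate the “don’t fail silently” and logging points, here is a minimal sketch of a validation routine that fails loudly. The names are made up for the example: QuarterlyRates, ParseRate and the Log hook are hypothetical, and the 0 to 1 range is just an assumed business rule. Swap in your real logger (file, database, email) where Log is assigned.

```csharp
using System;
using System.Globalization;

public static class QuarterlyRates
{
    // Hypothetical logging hook; by default it writes to standard error.
    public static Action<string> Log = message => Console.Error.WriteLine(message);

    // Parses a rate entered by a user. Anything unexpected is logged with
    // the offending input and the stack trace, then rethrown so the caller
    // (and the user) cannot miss the failure.
    public static decimal ParseRate(string input)
    {
        try
        {
            var rate = decimal.Parse(input, CultureInfo.InvariantCulture);
            if (rate < 0m || rate > 1m)
                throw new ArgumentOutOfRangeException("input", input, "Rate must be between 0 and 1.");
            return rate;
        }
        catch (Exception ex)
        {
            // ex.ToString() includes the exception type, message and stack trace.
            Log(string.Format("ParseRate failed for input '{0}': {1}", input, ex));
            throw; // rethrow without resetting the stack trace
        }
    }
}
```

The catch block is the opposite of try {…} catch {}: it records the state that caused the problem and then lets the exception keep travelling.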

Just to be clear, I’m not advocating being sloppy and not thinking through the edge cases of the software you are writing, but there is a cost in the pursuit of perfection. You may win the battle (a perfectly tested and functioning admin module, which I will argue is not perfect anyway) but lose the war by never shipping the system that uses the data from that admin module.

-ec

Saturday, June 13, 2009

Upgrading the Solid State Disk (SSD) in my HP Mini

I’m always searching for the perfect mobile computing platform/configuration. The day the first Tablet PC came out running Windows XP Tablet Edition, I picked up a Toshiba before they even put out the display model. Also, after attending a Mobile Embedded Developers Conference (MEDC) session, I thought Ultra Mobile PCs (UMPCs) were the way to go, so I picked up a Samsung Q1 as soon as I could find one. Last December when I was up in Minneapolis I picked up one of the latest contenders, an HP Mini Netbook. I really like the form factor and the styling even more than my MacBook, but the performance was just not acceptable. It came with Windows XP and I immediately updated it to Windows 7. Performance on the first beta of Windows 7 was better than Windows XP, but it was still slow. Taking 7-10 seconds to launch a browser is just plain painful. I even popped in a little SD card and enabled ReadyBoost; I could tell a difference, but it wasn’t fast enough to make the device usable. The problem was the junky SanDisk SSD that came with the computer. I’m very surprised that HP even went to market with the device.

Pictured: Toshiba Portege 3505 - PIII-M 1.33 GHz - 12.1; Samsung Q1 Tablet PC; HP Mini 1035nr - Atom N270 1.6 GHz - 10.2

I did some research and it looked like the best way to go was a Runcore SSD but the availability wasn’t that great. Last week I finally received this Runcore 32GB that had been backordered for months. Installing it only took me about 10 minutes with these instructions.

I just can’t say enough about how much the new SSD transformed my little HP Mini from a paperweight into something that is actually a very usable device. I probably won’t be doing any development on it, but for taking notes, surfing the web, email and word processing, which are the things I purchased it for, it runs plenty fast. As you can see, my primary hard disk WEI transfer rate is 5.9. I probably should have checked the score before the upgrade, but the number 1.7 sounds familiar.

Bottom line: if you own an HP Mini, run, don’t walk, to your computer (well, I guess if you are reading this, chances are you are already there), go to My Digital Discount and order an SSD. You won’t be sorry, trust me, really, I mean it.

-ec

Friday, June 5, 2009

I’ve got an Awesome Battery

I wish I could find more like this

-ec