Monday, February 23, 2009

Anonymous Types and Regular Expressions

Regular Expressions are great, and they aren't bad to program, but it is "non-obvious" which field you are supposed to use. A group is a capture, and has captures, so where is your named capture group to get that simple piece of text?

In the spirit of the "pit of success", here's a quick way to use Regex in your program where you only have to remember one call. We'll do this by casting to an anonymous type. It looks a tad weird, but it is super easy to use. Super easy, the second time.


using System;
using RegularExpressions;

namespace egrep
{
    class egrep
    {
        static void Main(string[] args)
        {
            string text = "The the quick brown fox fox jumped over the lazy dog dog.";

            var matches = Regex.grep(text, @"\b(?<double_word>\w+)\s+(\k<double_word>)\b",
                new { double_word = "" });

            foreach (var find in matches)
                Console.WriteLine(find.double_word);
        }
    }
}
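If you want to poke at the doubled-word pattern itself outside .NET, here is a rough C++ sketch. It is not the grep helper above: std::regex has no named groups or anonymous types, so numbered group 1 and the backreference \1 stand in for double_word. The function name find_double_words is made up for illustration.

```cpp
#include <regex>
#include <string>
#include <vector>

// Returns the repeated word from every "word word" pair in the text.
// Group 1 plays the role of the named group double_word in the C# pattern.
// Matching is case-sensitive, so "The the" is not reported.
std::vector<std::string> find_double_words(const std::string& text) {
    static const std::regex pattern(R"(\b(\w+)\s+\1\b)");
    std::vector<std::string> found;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end; it != end; ++it) {
        found.push_back((*it)[1].str());
    }
    return found;
}
```

Called on the sample sentence from the post, this should come back with "fox" and "dog".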

Thursday, January 29, 2009

Building Windows Software Tracing from Visual Studio 2008

Rule File to use with Visual Studio 2008 SP1.

The Windows software trace preprocessor (abbreviated WPP; the preprocessor and related support tools are known as WPP Software Tracing) is a preprocessor that simplifies the use of WMI event tracing to implement efficient software tracing in drivers and applications that target Windows 2000 and later operating systems. WPP was created by Microsoft and is included in the Windows DDK.

WPP is run prior to compilation (in other words, before even the C preprocessor), and generates a trace message header for each file that it processes (by default this header is filename.tmh, where filename is the name of the processed source file). This header must then be explicitly included into the source file, for example:

// File: file.cxx
// This file is an example of using WPP
#include "file.tmh"


The preprocessing for WPP is normally handled by the DDK compiler, which at first seems to rule out adding it to a Visual Studio project. After looking around, I found a nearly undocumented solution: the preprocessor also exists as an executable named tracewpp.exe in the bin directory of the DDK. Normally this would be
c:\WinDDK\6001.18002\bin\x86\tracewpp.exe


To add this to VS 2008 SP1, the easiest way I've found is to create a second project as a dependency to your driver project. Then each C/CPP file can use a custom build rule.

Here's how to set up the project.
Create a new project, and change the type to Utility. This will keep it from compiling the .c files.



Right-click on the new project to edit the Custom Build Rules.


Add a custom rule file so that you can reuse it. I called mine WPP.

Here are the Build Rule properties you will need.


Rule file to use with Visual Studio 2008 SP1.

Wednesday, August 27, 2008

NTFS 010 Editor Template



Here's an NTFS (New Technology File System) template I have been working on for viewing the NTFS Master File Table (or MFT) using the 010 Editor.

The MFT contains all the information about the files and directories stored on the disk. In NTFS, everything is a file. The MFT itself is a list of file records, typically 1 KB in size, that holds all your file entries. The root directory of the file structure starts in entry 5, and its name is '.'



Bootstrap
Since the above paragraph is technically correct, but makes no sense, here is how we start. The start of the MFT is stored in the NTFS boot sector. This is the first block of the volume. There is a disk boot sector as well, to confuse matters. The disk boot sector lists all the volumes, such as the NTFS volume we are going to look at.

From the boot sector we can find the start of the MFT. The MFT is logically an array of MFT entries on disk. It can be in multiple chunks, which makes sense for expanding and contracting volumes. When you expand, there is surely little room for another disk's worth of file entries (MFT entries), so NTFS can put these in different spots on the drive.
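To make the bootstrap step concrete, here is a small C++ sketch of pulling the MFT location out of the boot sector. The field offsets (0x0B, 0x0D, 0x30, 0x40) are the commonly documented NTFS boot sector layout, not something taken from the template files, and the struct and function names are made up for illustration. It also assumes the classic one-byte sectors-per-cluster encoding.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// The handful of boot sector fields needed to find the MFT (hypothetical struct).
struct NtfsGeometry {
    std::uint32_t bytes_per_sector;
    std::uint32_t sectors_per_cluster;
    std::uint64_t mft_start_cluster;
    std::uint32_t file_record_size;
};

// Reads an unsigned little-endian integer of `size` bytes at `offset`.
static std::uint64_t read_le(const std::vector<std::uint8_t>& buf,
                             std::size_t offset, std::size_t size) {
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value |= static_cast<std::uint64_t>(buf.at(offset + i)) << (8 * i);
    return value;
}

NtfsGeometry parse_boot_sector(const std::vector<std::uint8_t>& sector) {
    NtfsGeometry g{};
    g.bytes_per_sector = static_cast<std::uint32_t>(read_le(sector, 0x0B, 2));
    g.sectors_per_cluster = static_cast<std::uint32_t>(read_le(sector, 0x0D, 1));
    g.mft_start_cluster = read_le(sector, 0x30, 8);
    // Offset 0x40 is signed: positive means clusters per file record,
    // negative means the record is 2^|n| bytes (0xF6 = -10 gives 1024).
    const auto raw = static_cast<std::int8_t>(sector.at(0x40));
    const std::uint32_t cluster_bytes = g.bytes_per_sector * g.sectors_per_cluster;
    g.file_record_size = raw >= 0 ? static_cast<std::uint32_t>(raw) * cluster_bytes
                                  : (1u << -raw);
    return g;
}

// Byte offset of MFT entry 0 from the start of the volume.
std::uint64_t mft_byte_offset(const NtfsGeometry& g) {
    return g.mft_start_cluster * g.bytes_per_sector * g.sectors_per_cluster;
}

// Byte offset of entry `index`. Only valid while the entry lies in the
// first chunk of the MFT; later chunks need the $DATA runs of entry 0.
std::uint64_t file_record_offset(const NtfsGeometry& g, std::uint64_t index) {
    return mft_byte_offset(g) + index * g.file_record_size;
}
```

With 512-byte sectors, 8 sectors per cluster and 1 KB records, the root directory (entry 5) sits 5120 bytes past the start of the MFT.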

MFT entry 0 is a file called (drum roll....) $MFT. Its file attribute called $DATA lists the different runs for the file. The file in this case being the MFT itself. The first run starts exactly where MFT 0 starts, which is why it is called self describing. We just need the boot sector to get things rolling.
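The runs themselves are stored as a compact byte list. Here is a hedged C++ sketch of decoding one, assuming the usual run-list encoding: a header byte whose low nibble is the size of the length field and whose high nibble is the size of the offset field, a signed offset relative to the previous run, and a zero byte as terminator. DataRun and decode_run_list are illustrative names, and sparse runs (offset size 0) are not treated specially.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One contiguous chunk of a file: where it starts and how long it is.
struct DataRun {
    std::uint64_t length_clusters;
    std::int64_t start_cluster;  // absolute cluster number on the volume
};

std::vector<DataRun> decode_run_list(const std::vector<std::uint8_t>& bytes) {
    std::vector<DataRun> runs;
    std::int64_t current_cluster = 0;
    std::size_t pos = 0;
    while (pos < bytes.size() && bytes[pos] != 0) {
        const std::size_t length_size = bytes[pos] & 0x0F;
        const std::size_t offset_size = bytes[pos] >> 4;
        ++pos;
        // Stop on anything that does not fit; this is a sketch, not a validator.
        if (length_size > 8 || offset_size > 8 ||
            pos + length_size + offset_size > bytes.size())
            break;

        std::uint64_t length = 0;
        for (std::size_t i = 0; i < length_size; ++i)
            length |= static_cast<std::uint64_t>(bytes[pos + i]) << (8 * i);
        pos += length_size;

        std::uint64_t raw = 0;
        for (std::size_t i = 0; i < offset_size; ++i)
            raw |= static_cast<std::uint64_t>(bytes[pos + i]) << (8 * i);
        // Sign-extend: the offset is a delta and may point backwards.
        if (offset_size > 0 && offset_size < 8 && (bytes[pos + offset_size - 1] & 0x80))
            raw |= ~std::uint64_t{0} << (8 * offset_size);
        pos += offset_size;

        current_cluster += static_cast<std::int64_t>(raw);
        runs.push_back({length, current_cluster});
    }
    return runs;
}
```

For example, the bytes 21 18 34 56 00 decode to a single run of 0x18 clusters starting at cluster 0x5634.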



Here's a picture of bcd.hive.LOG1, and a picture is worth way more than a few thousand words this time.

Start at MFT 5. Attribute 4 is the Index_Allocation (Attribute 3 points here). In the second run of the Allocation, entry 2 has the file properties.


ntfs_defs.bt
ntfs.bt

Saturday, March 29, 2008

Design Rules

I ran across this on Kevin's blog. Great stuff; I have to add my slant to it.

Minimize code complexity
Maximize API ease-of-use
Maximize the chance another dev will use an API correctly (see Pit of Success)
Minimize code size
Minimize unnecessary code churn
Maximize code readability
Maximize correctness
Maximize robustness
Maximize flexibility
Maximize code maintainability
Maximize CPU performance characteristics
Maximize Memory performance characteristics
Opt for immutable data structures
Opt for thread-safe data structures and APIs
Verify pre-conditions
Follow design guidelines
Get it done yesterday!

These are all but the same rule to me. It's the first goal listed, Minimize code complexity. Maybe I should say rules 2 through N are how to get to minimal code complexity.

Levels of Complexity
If you have a component with no interrelations, at most it will take you W^1 units of work. Put your slowest or fastest dev on the component; it won't change your project completion date.

Now, if you have a component that relies on something else, this is pretty much going to take you 4 times as long. W^2

W^3? That's right.... 9 times as long.

On the way to reaching your minimal code complexity, items 2 and 3 are super important. Maybe we should change it to minimal API complexity. I will take tons of code in a class with a super easy API. That all but takes you from 9 work units down to 4.

Either way, a great list to use as talking points.

Some crazy extremes come to mind. Boost's Spirit (www.boost.org). A very neat parsing library. Almost nothing in the classes is directly tied to any other class.

The use of templates and the fact that every compiler error and call stack is a good 20 lines long makes it just impossible to use. Impossible unless you know about all the classes that exist to help everything compile. If VC9.0 would just let you turn off namespaces, and even template parameters in the debugger it'd be 100 times easier.

Wednesday, March 26, 2008

Great keyboard

This gets an "epic" rating.
Das Keyboard II

Of course, there are some nice $600 keyboards with mechanical keys, but this one is a whole $80. A no-brainer if you type all day.

IL IDA tool

A little interactive disassembly for you from the Reflector addins.

Reflexil on SourceForge.net

Reflexil is an assembly editor and runs as a plug-in for Reflector. Using Mono.Cecil, Reflexil is able to manipulate IL code and save the modified assemblies to disk. Reflexil also supports 'on the fly' C# and VB.NET code injection.

Tuesday, January 15, 2008

Visual Studio 2008 CRT bug

I run into this "issue" a lot compiling this open source project or that open source project. It's a fun MACRO problem. I love MACROs.

Here's what your compiler will give you.

12>C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\stdio.h(358) : error C3163: '_vsnprintf': attributes inconsistent with previous declaration
12> C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\stdio.h(350) : see declaration of '_vsnprintf'
12>Generating Code...

Now this looks like a no-brainer; I mean, the two declarations are 8 lines apart! So I stare for a while. (Looks the same to me.) Hmmmm. Then I try following a few MACROs, often futile if you don't have browser symbols built yet. Perhaps I have some crazy path with my includes!? I only have 12 SDKs or something installed. So I turn on /showIncludes (how did it take 20 yrs to add this option?). Looks good again.....

Crap, time for the Big Guns. If you love MACRO programming you know what's next. Turn on the C++ listing output. The listing output is what the compiler really compiles after the preprocessor has had its merry way with the code. 2MB of the finest night time reading you'll ever find. Per source file of course. I was kinda dreading trying to find the function after all its beautification had been stripped.

I've read a lot of code over the years, and I'm not even sure what this would preprocess out to.

__DEFINE_CPP_OVERLOAD_STANDARD_NFUNC_0_2_ARGLIST_EX(int, __RETURN_POLICY_SAME, _CRTIMP, _snprintf, _vsnprintf, _Pre_notnull_ _Post_maybez_ char, _Out_cap_(_Count) _Post_maybez_, char, _Dest, _In_ size_t, _Count, _In_z_ _Printf_format_string_ const char *, _Format)

I digress...

It just compiles when you turn on the listing output.

I'll type it more slowly this time, in case that didn't sink in. The compiler switch, that just spits out more information, changed things just enough to have everything compile. Neato, unless that sort of thing keeps you up at night.

btw, the fix, if you've been reading this far, is to not #define vsnprintf in _your_ project. The CRT must redefine it once or thrice.
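If your project really does need its own formatting shim, one way to stay out of the CRT's way is a uniquely named wrapper function instead of a macro. This is only a sketch: portable_vsnprintf and format_message are made-up names, the macro in the comment is a guess at the typical offender rather than the one from the actual project, and std::vsnprintf assumes a C++11 standard library (newer than what shipped with VS 2008).

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// The kind of compatibility macro that rewrites the CRT's own declarations
// once <stdio.h> is included after it (assumed example):
//   #define vsnprintf _vsnprintf
// A wrapper with its own name never touches the CRT's identifiers.
inline int portable_vsnprintf(char* buffer, std::size_t size,
                              const char* format, va_list args) {
    return std::vsnprintf(buffer, size, format, args);
}

// Small printf-style helper built on the wrapper.
std::string format_message(const char* format, ...) {
    char buffer[256];
    va_list args;
    va_start(args, format);
    const int written = portable_vsnprintf(buffer, sizeof buffer, format, args);
    va_end(args);
    if (written < 0)
        return std::string();
    // vsnprintf returns the length it wanted to write; clamp if truncated.
    std::size_t length = static_cast<std::size_t>(written);
    if (length >= sizeof buffer)
        length = sizeof buffer - 1;
    return std::string(buffer, length);
}
```

Because nothing here is a macro, the preprocessor has no way to rename the declarations inside stdio.h.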