Grouse Grep

Grouse Grep is a program that searches for a target string or regular expression pattern in one or more files. It features stunning performance for many common search cases, often outstripping GNU Grep or any other grep for speed. However, at present the program only handles basic regular expressions, and may be much slower when handling more complicated searches.

There are over 10,000 lines of C code in Grouse Grep, and around another 8,000 lines in supporting scripts and test rig files. Finding your way through this body of work can be daunting. This web shows how the program is split into modules, and documents much of the analysis and design behind each component, which is not always easy to see in the code.

The original announcement message may help provide an introduction to the program and this web.

A Map of the Code

RETable, TblDisp, MatchGCC, MatchEng, STBM, STBM Shim, CompDef, Tracery, Bakslash, RegExp, ScanFile, FastFile, GGrep, Platform


Behind The Scenes

There's a lot more to Grouse Grep than meets the eye. Here you'll find detailed tutorials on the major architectures and algorithms used in the code. The design and coding style topic is perhaps a little self-indulgent, but contains lots and lots of information about how I design and code programs, and may be of interest (it's even controversial in places!).

Other bits and pieces about Grouse Grep are also presented here. The test rig is particularly interesting: it's a fully automated regression and flog test rig that is easy to expand.

Design & Coding, String Searching, Grouse FSA, Test Rig, Command Line, Performance

Download

You can download the source code for Grouse Grep as a gzipped tar archive from the Grouse FTP site: ggrep-2.00.tar.gz.

The FTP Archive contains other related files, and also the earlier DOS versions released as part of the DDJ publication (from November 1997).

