Grecce: a Command-line Utility
Though regex-engines designed for character-based input-data already exist & provide a wide spectrum of features, the library exported by the package "regexdot" is polymorphic in the type from which the input-data is composed, & can additionally discover the complete mapping of each input-datum into the regex. This is a useful aid when debugging regex-syntax, & for this reason the derived library "regexchar" (which inherits this ability) has been linked into an executable, "grecce", which is in other respects a re-implementation of egrep.
Despite exploiting parallelism both in the evaluation of any alternative sub-expressions defined in the regex, & in the processing of multiple input-data files (where specified), the performance of grecce is relatively poor, because the underlying polymorphic library can't exploit character-specific optimisations to read its input-data rapidly. The matching-algorithm also has a theoretically inferior time-complexity to that used by TDFA, but this additional factor is significant only for rather atypical, pathological regexen.
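As a rough illustration of what "polymorphic in the type of the input-data" means, the following Python sketch matches a sequence of arbitrary data against a list of repeatable terms, each defined by a predicate rather than by a character. It is not the API or the algorithm of "regexdot"; the names `Term` & `matches` are hypothetical, & the naive backtracking is chosen only for brevity.

```python
from typing import Callable, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")

# Hypothetical term: (predicate on one datum, minimum repetitions,
# maximum repetitions, or None where unbounded).
Term = Tuple[Callable[[T], bool], int, Optional[int]]


def matches(terms: Sequence[Term], data: Sequence[T], pos: int = 0, k: int = 0) -> bool:
    """Return True if terms[k:] consume the whole of data[pos:]."""
    if k == len(terms):
        return pos == len(data)
    accepts, lo, hi = terms[k]
    # Count how many consecutive data this term could consume greedily.
    run = 0
    while pos + run < len(data) and (hi is None or run < hi) and accepts(data[pos + run]):
        run += 1
    # Try the longest repetition first, backtracking towards the minimum.
    return any(matches(terms, data, pos + take, k + 1) for take in range(run, lo - 1, -1))


# The same matcher applied to integers ...
evens_then_two_odds = [(lambda n: n % 2 == 0, 1, None), (lambda n: n % 2 == 1, 2, 2)]
print(matches(evens_then_two_odds, [2, 4, 1, 3]))
print(matches(evens_then_two_odds, [2, 4, 1]))

# ... & to characters; the first term must give back the final 'u'.
vowels_then_u = [(lambda c: c in "aeiou", 1, None), (lambda c: c == "u", 1, 1)]
print(matches(vowels_then_u, "aeiou"))
```

Nothing in the matcher depends on the data being characters, which is also why it can't take the character-specific short-cuts available to a specialised engine.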
Examples
- Words containing all the vowels, in alphabetic order.
  $ grecce 'a[[:alpha:]]*e[[:alpha:]]*i[[:alpha:]]*o[[:alpha:]]*u' '/usr/share/dict/words'
  abstemious abstemiously abstemiousness abstentious adenocarcinomatous adventitious adventitiously adventitiousness amentiferous androdioecious andromonoecious anemophilous antenniferous antireligious arenicolous argentiferous arsenious arteriovenous asclepiadaceous autecious auteciously bacteriophagous caesalpiniaceous cavernicolous chaetiferous facetious facetiously facetiousness flagelliferous garnetiferous hamamelidaceous lateritious parecious quadrigeminous sacrilegious sacrilegiously sacrilegiousness sarraceniaceous supercalifragilisticexpialidocious ultrareligious ultraserious valerianaceous
- The longest words which can be spelt, using only the top row of letters on a typewriter.
  $ grecce '^[qwertyuiop]{10,}$' '/usr/share/dict/words'
  peppertree pepperwort perpetuity perruquier pirouetter prerequire proprietor repertoire rupturewort typewriter
- One can obtain the mapping of input-data into the regex, for any of these, by specifying the "--verbose" flag.
  $ echo 'A typewriter.' | grecce --verbose '[qwertyuiop]{10,}'
  (Just (.*?,0,"A "),Just [(['q','w','e','r','t','y','u','i','o','p']{10,},2,"typewriter")],Just (.*,12,"."))
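
To help read the verbose output above, each triple appears to hold a part of the regex, the offset into the input-data at which that part began to match, & the data it consumed; the unanchored regex is implicitly bracketed by a non-greedy ".*?" & a greedy ".*". The following Python sketch reconstructs the same three triples using the standard "re" module. It is only an approximation under that reading: the function-name `map_input` is hypothetical, & it treats the user's regex as a single part, rather than tracing each term as grecce can.

```python
import re


def map_input(pattern: str, text: str):
    """Return (regex-part, offset, consumed-data) triples for the first
    unanchored match of pattern in text, or None if there is no match."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return [
        (".*?", 0, text[:m.start()]),      # data skipped before the match
        (pattern, m.start(), m.group(0)),  # data consumed by the regex itself
        (".*", m.end(), text[m.end():]),   # data remaining after the match
    ]


mapping = map_input(r"[qwertyuiop]{10,}", "A typewriter.")
for part in mapping:
    print(part)
```

For the input-data 'A typewriter.', the offsets produced are 0, 2 & 12, agreeing with those reported by grecce.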