[NTLUG:Discuss] Counting key presses in a file...

Mon Aug 27 18:25:04 CDT 2007

Robert Citek wrote:
> On 08/27/2007 03:13 PM, Richard Geoffrion wrote:
>   
>> I know that 'wc' can count words and lines, but how would one count individual keystrokes?
>> While 'a' would constitute ONE keystroke, 'A' would constitute two. 
>> (Shift + a).
>>     
>
> Not necessarily.  If you have CapsLock on, then the reverse would be
> true. That is, 'A' would count as one and 'a' as two.  What would happen if it was a string of 'A's, as in AAA?  <snip>
>> Carriage returns would count as one while bolding text
>> would constitute four keystrokes as one would need a CTRL-B to turn 
>> bolding on and another to turn bolding off.
>>     
>
> What if I type 'a^Hah'?  What if I type 'a{left arrow}h'?  What if I
> type 'ah'?  They all produce 'ah' but are a different number of
> key-strokes.  Lastly, what about cut and paste?
>
>   
>> How would one go about <snip> converting [a document] into something that can be parsed and counted?
>>     
>
> If you can't use Word, which has a built-in counter, then open the
> document in OpenOffice Writer, press Ctrl-A to highlight everything,
> press Ctrl-C to copy all the text to the clipboard, open gedit, press
> Ctrl-V to paste the text into gedit, and save it as a text file,
> "foobar.txt".  Then open a terminal and type 'wc foobar.txt'.
>
> Am I even close to answering your question?
>
>   
These are all VERY good points, and maybe I should describe the end 
product instead.  The end result would be an automated script that 
could evaluate a set of files and calculate a $$ value for each 
document based on the number of characters typed.   (One would have to 
assign certain values to CAPS, *bold*, _underline_, et al.)
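
As a rough illustration of that kind of rule table, here is a minimal 
Python sketch.  Everything in it is an assumption made for the example: 
the weights, the rate per keystroke, the function names, and the idea 
that bold and underline show up as *bold* and _underline_ markers in a 
plain-text export.

```python
import re

# All weights below are made-up illustration values, not established rules.
FORMAT_COST = 4              # Ctrl-B to turn on plus Ctrl-B to turn off
RATE_PER_KEYSTROKE = 0.001   # dollars per keystroke (hypothetical)

# Assumed plain-text convention: *bold* and _underline_ spans on one line.
SPAN = re.compile(r"(\*[^*\n]+\*|_[^_\n]+_)")

def keystrokes(text):
    """Return the weighted keystroke count for a plain-text string."""
    total = 0
    # re.split keeps the captured spans, so they appear as their own pieces.
    for piece in SPAN.split(text):
        if SPAN.fullmatch(piece):
            total += FORMAT_COST
            piece = piece[1:-1]          # drop the two marker characters
        for ch in piece:
            total += 2 if ch.isupper() else 1   # Shift + letter counts as two
    return total

def document_value(text):
    """Dollar value of a document under the hypothetical rate above."""
    return keystrokes(text) * RATE_PER_KEYSTROKE
```

The count depends only on the finished text, not on how it was typed, 
so corrections and cut and paste do not change the result.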

So, no matter how quickly or efficiently Bob can type a document, he 
only gets paid a set value for the document based on pre-established 
rules.  If Bob can figure out shortcuts for transcribing his documents, 
then fine -- he makes extra money.  If Bob has to correct 1/3 of the 
text he types or chooses to use the mouse to do formatting -- well then 
that's his problem for being woefully inaccurate or inefficient.

So, with set values assigned for each character type and formatting 
class, any words of wisdom on solving this issue?  I've been on 
SourceForge and Freshmeat but no luck so far.  I've seen a couple of 
commercial win32 packages, but win32 apps don't lend themselves too well 
to automation in a Linux cron job. :)

If one *DID* try to parse a file manually... what would one need to do?  
Lots of grepping, counting and stripping of control characters? Counting 
and stripping higher-cost characters, such as upper case letters (ASCII 
65-90) and foreign characters with values > 127?
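
A manual pass along those lines could be sketched in Python as below.  
This is only a sketch under stated assumptions: the class names, the 
cost table, and the function names are hypothetical, and the input is 
assumed to be text already exported from the word processor.  Note that 
ASCII upper case letters fall in the range 65-90, so they are 
classified separately from characters with values above 127.

```python
from collections import Counter

# Hypothetical cost per character class; control characters are
# counted but cost nothing, which has the effect of stripping them.
COSTS = {"plain": 1, "upper": 2, "non_ascii": 3, "whitespace_key": 1, "control": 0}

def classify(ch):
    """Put one character into one of the classes used in COSTS."""
    code = ord(ch)
    if ch in "\n\r\t":
        return "whitespace_key"      # Enter and Tab are ordinary key presses
    if code < 32 or code == 127:
        return "control"             # e.g. ^H (backspace) left in the file
    if code > 127:
        return "non_ascii"           # accented and other foreign characters
    if ch.isupper():
        return "upper"               # ASCII 65-90, needs Shift
    return "plain"

def tally(text):
    """Count how many characters fall into each class."""
    return Counter(classify(ch) for ch in text)

def total_cost(text):
    """Weighted total using the hypothetical COSTS table."""
    return sum(COSTS[name] * n for name, n in tally(text).items())
```

Reading each file with open(path, encoding="utf-8").read() and passing 
the result to total_cost() would be enough for a cron job to loop over 
a directory, assuming the files are UTF-8 text.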

I'm exploring the OO macro scene. Hopefully there is a similar project 
somewhere.

-- 
Richard