Archive for September, 2011

Using CLI and GUI Together Seamlessly

September 30, 2011

Some people love CLI like heaven and hate GUI like hell, and vice versa for some other people.

Me? I used to belong to the first group, but now I have moved on to another level. Why not stop the love and hate and make them work together?

For example, I now often start in the CLI and end in the GUI. If I want to open a file for uploading in a GUI browser, I start from the shell, because navigating directories and searching for files is so much easier there than clicking all the way down the directory hierarchy in the GUI FindFile dialog. Then, while still in the shell, I put the file path into the clipboard with xclip (on Linux) or putclip (on Cygwin), switch back to the browser's FindFile dialog, and paste it.
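
As a rough illustration of the clipboard step, here is a minimal sketch. The helper name cpath is made up for this example and is not part of my actual .bashrc; it assumes putclip and cygpath are available on Cygwin, and xclip on Linux.

```shell
# cpath: hypothetical helper; copies the absolute path of a file to the clipboard.
cpath() {
    local path
    # Resolve to an absolute path (defaults to the current directory).
    path=$(readlink -f -- "${1:-.}") || return 1
    if command -v putclip >/dev/null 2>&1; then
        # Cygwin: convert to a Windows path so Windows dialogs accept it.
        cygpath -w "$path" | tr -d '\n' | putclip
    elif command -v xclip >/dev/null 2>&1; then
        # Linux: use the clipboard selection, the one most GUI programs paste from.
        printf '%s' "$path" | xclip -selection clipboard
    else
        echo "cpath: neither putclip nor xclip found" >&2
        return 1
    fi
}
```

Typical use would be something like cpath report.doc in the shell, then pasting into the FindFile dialog.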

Similarly, when I want to locate a file in a file manager such as Explorer, I start from the shell with this bash function:

function oc()
{
    cygstart `which explorer.exe` /n,/select,\""`cygpath -alw \"$1\"`"\"
}

When I want to locate a registry entry in regedit.exe, I use another command, which I wrote in Python with pywin32 and gave a short name in bash:

alias #open registry

When I want to open a file with its associated opener, like Word for .doc or Excel for .xls, I use cygstart in Cygwin and gnome-open in Linux:

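if test `uname` = CYGWIN_NT-5.1 -o `uname` = CYGWIN_NT-6.1; then
    function of()
    {
        if test -e "$1"; then
            # An existing file: open it with its associated program.
            cygstart "$@"
        elif which "$1" >/dev/null 2>&1; then
            # A command found in PATH: start it by its full path.
            if [[ "$1" == of ]]; then
                local file=`which cygstart`
            else
                local file="`which \"$1\"`"
            fi
            shift    # drop the command name so it is not passed twice
            cygstart "$file" "$@"
        else
            cygstart "$@"
        fi
    }
else
    alias of=gnome-open
fi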

This way, I can use the same .bashrc under both Linux and Windows (think Cygwin), and the same (well, not completely) of command to Open File.

(So this is not just about using CLI and GUI together, but also about using Linux and Windows (again, think Cygwin) in as similar a way as possible.)


Introducing beagrep, the beast like grep

September 29, 2011

I found out about the beagle project a couple of months ago, and I’m totally excited by it. It’s the missing brick I longed for to write a `grep on steroids’ that I can use as a source code reading tool.

Yeah, right, I was using grep to read source code, often finding cscope insufficient (because some files are not source code, and even cscope’s fuzzy syntax parser cannot parse them). On the other hand, with large projects, such as the Linux kernel or the even larger Android system, grepping the whole project can be very slow. I once searched for readlink in the Android source tree, and it took more than 30 minutes!

With beagrep (beagle combined with grep), I can grep it in less than 2 seconds!

Why is it possible

When you grep in order to read source code, you often don’t need the full power of regexps: when you search for readlink, you grep readlink, not grep r.*e.*a.*d.*l.*i.*n.*k; that just does not make any sense!

IOW, 99.9% of the time you search only for whole words like readlink, which are a simple kind of regexp and, unlike complex regexps (such as r.*e.*a.*d.*l.*i.*n.*k), are something search engines can deal with perfectly.

How is it done

It is really a very simple idea. When you want to grep a target regexp, do the following:

  1. Break the target regexp into whole words. For example, grep -e some.*fun.*stuff should be broken into “some fun stuff”.
  2. Query beagle with the whole words; beagle answers with the files in the repository that contain these words. These files are the possible matching files.
  3. Grep the target regexp in those possible files (which are often only a very small part of the whole repository, so grep can finish in the blink of an eye).
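
The three steps above can be sketched in plain shell. This is only an illustration, not the real beagrep: the function names are made up for this example, and the index query in step 2 is replaced by a stand-in that scans the files with grep -F, where the real tool asks the beagle index instead.

```shell
# Step 1: break a regexp into whole words ("some.*fun.*stuff" -> "some fun stuff").
regexp_to_words() {
    printf '%s\n' "$1" | tr -cs '[:alnum:]' ' ' | sed 's/^ *//; s/ *$//'
}

# Step 2 (stand-in for the index query): print the files under the
# directory $1 that contain every one of the remaining arguments.
query_index() {
    local dir file word ok
    dir=$1; shift
    find "$dir" -type f | while IFS= read -r file; do
        ok=1
        for word in "$@"; do
            grep -qF -- "$word" "$file" || { ok=0; break; }
        done
        if [ "$ok" = 1 ]; then printf '%s\n' "$file"; fi
    done
}

# Step 3: grep the original regexp, but only in the candidate files.
indexed_grep() {
    local regexp dir words file
    regexp=$1; dir=${2:-.}
    words=$(regexp_to_words "$regexp")
    # $words is deliberately unquoted so that it splits into separate arguments.
    query_index "$dir" $words | while IFS= read -r file; do
        grep -Hn -e "$regexp" -- "$file"
    done
}
```

With this loaded, indexed_grep 'some.*fun.*stuff' src would print the matching lines under src. The stand-in in step 2 is of course as slow as a plain grep; the whole speedup comes from replacing it with a real index lookup.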

Modifications to beagle

Here are the details of how I changed beagle to satisfy my needs (warning: boring stuff ahead):

  1. Change all beagle built-in filters to FilterText. This is because I don’t want keywords filtered out by the SourceCode filters. This way, I can beagle-query `extends CFunny’ to see which classes inherit from `CFunny’ in Java (the default Java filter will remove extends since it is too common and uninteresting in Java source files).
  2. Remove some restrictions. For example, only the first 100000 tokens in a file would be indexed, which is undesirable for my purpose. Also, I enlarged the memory threshold by 10 times, since I found it causing problems with some large xml files.
  3. Remove more restrictions. Basically, I kept everything the NoiseFilter would remove. Another filter removes common English words; I kept those as well.
  4. Added support for indexing Chinese characters (this is because I am Chinese).


Here’s how I use it:

  1. Build a static index at the top level dir of the source code:
  2. Use beagrep in any directory in the source tree:
    ~/bin/beagrep -e "ENGLISH_STOP_WORDS" 

    The output is like the following:

    beagle query argument `ENGLISH STOP WORDS'
    /src/beagle/beagled/LuceneCommon.cs:1206: ...ENGLISH_STOP_WORDS...

    Note: ENGLISH_STOP_WORDS is broken into 3 words before beagle is queried.

Where to find everything

I have put the source code on GitHub.

If you check out the source code, you can find beagrep and its helper scripts under windows-config/bin.

The beagle source code I modified is under windows-config/gcode/beagle.

The C# program which breaks ENGLISH_STOP_WORDS into ENGLISH STOP WORDS is under windows-config/gcode/BeagleTokenizer.

The simplest way to set things up is to run


and then


For more details, please RTFS using beagrep!

Switched to USA Programmer Dvorak Keyboard

September 29, 2011

After reading xahlee’s articles on keyboards, I recently made 2 big changes (and many other small changes because of the 2 big ones) to the way I type. I switched from QWERTY to UPD (USA Programmer Dvorak), and I swapped my Control keys and Alt keys.

And how does it feel? Well, at first it was a pain in the ass. But then it seems to get better and better.

And to make it even better, I bought a Microsoft Natural Ergonomic Keyboard 4000 for use at work!

Here is some of the hacking around this switch.

Switching Chinese and English at the same time

Being Chinese, I also need to type Chinese using an Input Method, such as Wubi. Now, I want to use the same UPD keyboard layout when using the IME, because it would be crazy to switch back to QWERTY when typing Chinese and stick to UPD when typing English…

But switching to UPD for my IME is even more painful, because I am using Wubi instead of PinYin, and the secret of Wubi is that you rely on muscle memory for the encoding. So basically I had to re-learn Wubi.

Some changes to Emacs

Xahlee considered Programmer Dvorak no better than Simple Dvorak (he tried UPD and then gave up), but I chose it anyway and have no plan to change again. I kinda feel it’s not that big a difference, now that both are Dvorak…

But Programmer Dvorak does give me some edge, considering the fact that I also swapped Controls and Alts (which is also one piece of Xahlee’s advice).

This made a lot of trouble with C-x and M-x. In the meantime, C-x got separated from many of the key combinations that used to be typed with the same hand: C-x C-s, C-x C-f, etc.

Finally I found my solution (after searching Dvorak on EmacsWiki and some inspiration):

  1. Switch C-h and C-x using keyboard-translate.
  2. Stop using the M- prefix altogether and use C-[ (which has the same effect as ESC) instead. Now Programmer Dvorak seems a much better choice than Simple Dvorak, because C-[ is very easy to type.

I tried to add more hacks, but finally stopped to reduce the confusion, and because the above 2 are about enough.

Using Programmer Dvorak under Windows

Not that I use Windows a lot nowadays, but I do use it once in a little while, and I don’t want to be a fool when I’m at it. So I need to find a way to use Programmer Dvorak under Windows.

Windows already provides a Simple Dvorak keyboard layout, and Programmer Dvorak is also available on the net. But they are both incompatible with IME.

I found that people on the net have used AutoHotkey to provide Simple Dvorak, which can be used together with an IME.

So I just wrote my own Programmer Dvorak version of the AutoHotkey script. You can get it here. This ahk script was generated from 3 bash scripts, which you can find here, here and here.
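
To give an idea of how bash can generate such a script, here is a small sketch. It is not one of my 3 scripts: the function name gen_remap is made up for this example, and it only handles a few letter keys of the home row (where, as far as I know, Programmer Dvorak and Simple Dvorak agree). It prints AutoHotkey remapping lines of the form s::o.

```shell
# gen_remap: hypothetical helper; prints one AutoHotkey remapping line
# per key, pairing the Nth character of $1 with the Nth character of $2.
gen_remap() {
    local from to f t
    from=$1; to=$2
    if [ "${#from}" -ne "${#to}" ]; then
        echo "gen_remap: both key lists must have the same length" >&2
        return 1
    fi
    while [ -n "$from" ]; do
        f=${from%"${from#?}"}    # first character of $from
        t=${to%"${to#?}"}        # first character of $to
        printf '%s::%s\n' "$f" "$t"
        from=${from#?}           # drop the first character
        to=${to#?}
    done
}

# Example: QWERTY home-row letters and what Dvorak puts on the same keys.
# gen_remap sdfghjkl oeuidhtn > dvorak-home-row.ahk
```

Keys that need escaping in AutoHotkey (such as the semicolon) and the number row are left out of this sketch on purpose.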