Archive for April, 2009

The Craft of Text Editing

Sunday, April 26th, 2009

I spent the last few days reading The Craft of Text Editing, a book on the design of text editors. It focuses mainly on editors based on Emacs, but many of the principles apply to all other text editors as well. While the book is old, many of the topics it covers still appear relevant to anyone designing an editor today, which is admittedly roughly nobody.

The book starts off talking about the different types of users, and the different types of input. The section on input types is quite dated (having more than one button on a mouse is bad design?) and not really relevant today. It then covers the requirements for the language you are programming the editor in - not surprisingly for a book focusing on Emacs-type editors, the recommended choices are C and Lisp.

Once it finishes those sections, the book gets to the interesting part: how to actually represent the text and structure your editor. The first part in this section talks about the possible editing models: Text as an array of characters, a 2D array of characters, a list of lines, and a few other options.

Next, several different file formats that you may need to handle are discussed. While not all of them will be applicable to every editor, ones that aspire to be as general as possible must handle them. This section also discusses how extra information not represented in the text of the file may be represented and stored - for example, typesetting information.

The implementation of the actual editor is then discussed, starting with the main ways of representing buffers - essentially, the two most common are a linked list of characters and a ‘buffer gap’ system, which is what Emacs uses. The efficiency of each of these implementations is compared in several categories, including crash recovery. In general, the buffer gap system is found to be better than the other systems.
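To make the buffer gap idea concrete: the text lives in one array with a block of unused slots (the gap) sitting at the editing position, so inserting fills the gap and deleting widens it. Below is a minimal sketch in Emacs Lisp. It is not taken from the book or from Emacs's own C implementation, and every gapbuf- name is made up for illustration.

```elisp
(require 'cl-lib)

;; Hypothetical toy gap buffer; real editors do this in C with far more care.
(cl-defstruct (gapbuf (:constructor gapbuf--make))
  data        ; vector holding the text plus the unused gap
  gap-start   ; index of the first free slot
  gap-end)    ; index just past the last free slot

(defun gapbuf-create (capacity)
  "Return an empty gap buffer with room for CAPACITY characters."
  (gapbuf--make :data (make-vector capacity nil)
                :gap-start 0 :gap-end capacity))

(defun gapbuf-length (gb)
  "Return the number of characters stored in GB."
  (- (length (gapbuf-data gb))
     (- (gapbuf-gap-end gb) (gapbuf-gap-start gb))))

(defun gapbuf-move (gb pos)
  "Move the gap in GB so it starts at text position POS (0 <= POS <= length)."
  (let ((data (gapbuf-data gb)))
    ;; Gap moves left: carry characters from before the gap to after it.
    (while (> (gapbuf-gap-start gb) pos)
      (cl-decf (gapbuf-gap-start gb))
      (cl-decf (gapbuf-gap-end gb))
      (aset data (gapbuf-gap-end gb) (aref data (gapbuf-gap-start gb))))
    ;; Gap moves right: carry characters from after the gap to before it.
    (while (< (gapbuf-gap-start gb) pos)
      (aset data (gapbuf-gap-start gb) (aref data (gapbuf-gap-end gb)))
      (cl-incf (gapbuf-gap-start gb))
      (cl-incf (gapbuf-gap-end gb)))))

(defun gapbuf--grow (gb)
  "Double the storage of GB, keeping the gap at the same text position."
  (let* ((old (gapbuf-data gb))
         (old-len (length old))
         (new-len (max 8 (* 2 old-len)))
         (new (make-vector new-len nil))
         (tail (- old-len (gapbuf-gap-end gb))))
    ;; Copy the text before the gap to the front of the new vector.
    (dotimes (i (gapbuf-gap-start gb))
      (aset new i (aref old i)))
    ;; Copy the text after the gap to the very end of the new vector.
    (dotimes (i tail)
      (aset new (+ (- new-len tail) i)
            (aref old (+ (gapbuf-gap-end gb) i))))
    (setf (gapbuf-data gb) new
          (gapbuf-gap-end gb) (- new-len tail))))

(defun gapbuf-insert (gb pos char)
  "Insert CHAR into GB at text position POS."
  (gapbuf-move gb pos)
  (when (= (gapbuf-gap-start gb) (gapbuf-gap-end gb))
    (gapbuf--grow gb))
  (aset (gapbuf-data gb) (gapbuf-gap-start gb) char)
  (cl-incf (gapbuf-gap-start gb)))

(defun gapbuf-delete (gb pos)
  "Delete the character at text position POS by widening the gap over it."
  (gapbuf-move gb pos)
  (when (< (gapbuf-gap-end gb) (length (gapbuf-data gb)))
    (cl-incf (gapbuf-gap-end gb))))

(defun gapbuf-string (gb)
  "Return the contents of GB as a string, skipping the gap."
  (let ((data (gapbuf-data gb)))
    (concat (cl-subseq data 0 (gapbuf-gap-start gb))
            (cl-subseq data (gapbuf-gap-end gb)))))

;; Usage: type "helo", then go back and insert the missing "l".
(let ((gb (gapbuf-create 4)))
  (dolist (c (string-to-list "helo"))
    (gapbuf-insert gb (gapbuf-gap-start gb) c))
  (gapbuf-insert gb 3 ?l)
  (gapbuf-string gb))
```

The point of the design is that consecutive edits at the same spot cost almost nothing; the copying only happens when the gap has to move or the storage has to grow.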

Redisplay algorithms are covered next - how to display the changes a user makes with as little interruption to the user as possible. This section doesn’t seem as important as it did back when this book was written - we have faster connections and processors, leading to it mattering a lot less whether five commands are issued per redisplay or four. Still quite interesting, though.

The next section deals with user commands. It discusses multilevel commands (such as C-x C-c), arguments, key rebinding, and modes. While the focus is on Emacs, all text editors share this command loop: for example, vim has suffix arguments corresponding to the range to delete when you press ‘d’. Undo and redo are then covered: various methods of implementing them, and whether they are even necessary.

The next chapter deals with the design of the command set, or what exactly you are able to do with the editor. It talks about what you should strive for in a command set: responsiveness, consistency, permissiveness, progress, simplicity, uniformity, and extensibility. It also discusses editing a few special kinds of syntax and how to enhance support for them. For commands that have multiple interpretations, like forward-word, each possible implementation is discussed.

All in all, this was a very interesting book, well worth reading. Since most people spend a large portion of their time inside of their text editor, I believe it is important to understand the basics of how it works. While it probably won’t help in everyday use, having an idea of what is happening behind the scenes of your editor is important if anything goes wrong with it.

Cua in emacs

Wednesday, April 22nd, 2009

I recently started investigating cua-mode, one of Emacs’s built-in minor modes. It does several things; the first is to rebind some keys to be friendlier to users familiar with conventions such as C-x, C-z, etc. I don’t use these, having become familiar with the Emacs commands for them, but if you haven’t mastered Emacs’s built-in keys it may help. Cua enables transient-mark-mode, and its keybindings only take effect when the mark is highlighted, so it should not interfere with your regular keybindings *too* much, although it was still annoying while I was trying them out.

There are two other items that cua uses that are much more interesting to me: cua’s rectangles and the global mark. Once you have enabled the cua minor mode, C-RET will start selection of a rectangle. Unlike selecting a rectangle by setting a mark, this will highlight only the rectangle you are selecting. You can expand the rectangle in any direction, using RET to switch which corner you are expanding from. If you type with a rectangle selected, the string you type is inserted at the edge of the rectangle, for every line. Which edge depends on which corner you have selected. All your normal rectangle commands also work: killing, yanking, etc. It also handles tab characters much more gracefully than just setting the mark and converting a region to a rectangle.

Cua’s global mark is another feature that is sometimes useful, though not as often as the rectangle selection. The way it works is this: once you set the global mark, text you type goes to that location instead of the current point. While a good idea, it is not as developed as I’d like, because it does not function exactly as the point does: I would like functions that look around the point to look around the global mark instead. For example, yasnippet expansion does not work, and neither does pabbrev-mode. I also have some problems with inserting spaces that I haven’t tracked down yet; I believe they come from a conflict with something set by CEDET. It is still somewhat useful if you need to reference another part of a long buffer while typing, but some development could make it much more useful. Of course, having everything that references the point actually reference the global mark sounds pretty hard: I tried with some advice, but it didn’t work out. Oh well.

To enable cua, put the following in your .emacs file:

(setq cua-enable-cua-keys nil)
(cua-mode)

The first line tells cua not to set its compatibility keybindings. The second enables cua-mode’s rectangle selection and global mark. Once these are in your .emacs and evaluated, you’re good to go!
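As far as I know, Emacs also ships a cua-selection-mode command that is meant to do the same thing in one step: turn on cua-mode without the C-x/C-c/C-v/C-z bindings. A sketch of that alternative, assuming your Emacs version provides it:

```elisp
;; Alternative to the two lines above. Assumes `cua-selection-mode' exists
;; in your Emacs (check with C-h f cua-selection-mode before relying on it).
(cua-selection-mode t)
```

Use one form or the other, not both.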

Summing Columns in Emacs

Wednesday, April 15th, 2009

I recently noticed that I had to do one task in Emacs a fair amount, so I decided to write some code so that I didn’t have to do it manually.

(defun sum-column()
  "Sums a column of numbers starting at point"
  (interactive)
  (save-excursion
    (if (and (not (= (current-column) 0))
	     (re-search-backward "[ \t]" 0 t ))
	(forward-char))
    (let ((retn 0)
	  (old-column (current-column))
	  (old-next-line-add-newlines next-line-add-newlines))
      (setq next-line-add-newlines nil)
      (while (not
	      (looking-at "^[ \t]*$"))
	(move-to-column old-column t)
	(if (and (looking-at "-?[0123456789]+")
		 (eq (current-column) old-column))
		(setq retn (+ retn (string-to-number (current-word)))))
	(next-line)
	(beginning-of-line))
      (next-line)
      (next-line)
      (move-end-of-line 0)
      (insert (make-string (- old-column (current-column)) 32))
      (insert (number-to-string retn))
      (setq next-line-add-newlines old-next-line-add-newlines)
      retn)))

This function will sum a column of numbers in a buffer when the point is at the start of them. It will insert the total after the first whitespace-only line, aligned with the rest of the numbers. For example, the following text

10
20
40

will be replaced with:

10
20
40

70

This works even for columns that are not at the left column of the screen, so

foo	10
bar	20
baz	40

becomes

foo	10
bar	20
baz	40

	70

If you have multiple columns, you can sum them up individually:

10	20
30	40
50	60

After two calls to sum-column, this becomes:

10	20
30	40
50	60

90	120

Enjoy!

hideshow-org

Monday, April 13th, 2009

I don’t usually use hideshow-mode, and a lot of the reason why has to do with the awkward key combinations it uses. The prefix of C-c @ is just terrible, so I never really used it. However, hideshow-org was recently released, which provides automatic hiding/showing using the TAB key.

hideshow-org does not interfere with the normal use of your TAB key. It only performs hiding and showing of blocks if the point does not move when you hit TAB, allowing autoindent to work properly. From my tests, it works well with my setup of pabbrev, yasnippet, and autoindenting being bound to the TAB key, although I am not entirely sure of the priorities on all of them.

I have noticed one problem with hideshow so far, but it is unrelated to hideshow-org. If you have a block formatted like this:

int foo( int bar )
{
	//stuff
}

And you hide the block, toggling the hiding will not expand the block, but will instead hide the block above it. For it to work properly, the opening brace must be on the same line as the function definition. If anyone knows how to fix this, let me know - otherwise I’ll take a look at it after my finals are over.

To install hideshow-org, download it from the git repository and put the following in your .emacs file:

(require 'hideshow-org)

More detailed information is found on the GitHub page for the project.

Predictive-Expansion for Emacs

Saturday, April 4th, 2009

One of the minor modes I use most frequently is pabbrev-mode, a mode used for predictive expansion of text. Pabbrev doesn’t use a dictionary; instead, it scavenges your open buffers to find word usage and statistics. This allows it to be useful in a lot more contexts than just text editing: For example, it will offer completions of variable names if you have a buffer containing those names open.

Pabbrev overwrites the tab key, but if there is no valid expansion, then the behaviour falls through to the previously-bound command. This brings tab-completion to your ordinary text editor, which is just incredibly useful. The way pabbrev is implemented offers the suggested expansion as you are typing, so that you know whether or not you should expand your suggestion.

One annoying thing about the default behaviour is that if there are multiple suggestions for expansion, it will open a buffer for you to choose one. I prefer to just expand to the displayed expansion, always, without opening any additional buffers. To get this behaviour, redefine the following function:

(defun pabbrev-expand-maybe-no-buffer()
  "Expand abbreviation, or run previous command.
If there is no expansion the command returned by
`pabbrev-get-previous-binding' will be run instead."
  (interactive)
  (if pabbrev-expansion
      (pabbrev-expand)
    (let ((prev-binding
           (pabbrev-get-previous-binding)))
      (if (and (fboundp prev-binding)
               (not (eq prev-binding 'pabbrev-expand-maybe)))
          (funcall prev-binding)))))

To enable pabbrev-mode, first download it from here. Then, add the following lines to your .emacs file:

(require 'pabbrev)
(global-pabbrev-mode)
(setq pabbrev-read-only-error nil)

Once this is done, pabbrev-mode will be enabled in all buffers that are not read-only. I highly recommend either it or another predictive-expansion mode; there are a few out there, but I haven’t tried them out.