IntelliJ IDEA ate my nREPL

A challenge I’ve recently come across during my work at Nokia is how to enable users of editors other than Emacs to use Clojure over the network. For Emacs, it’s just straightforward nrepl.el; Vi users (aka masochists) have something called Nailgun (no, not the one from the Quake game series) to connect to with VimClojure.
As it turns out, however, not everyone is ready for the red pill, yet, and would rather stick to their familiar Java’esque working environment. Among the variety of these, IntelliJ IDEA seems to be quite popular these days and so it became my duty^Wpleasure to enable the poor souls to connect to Clojure instances running in some different network. The recommended solution for working with Clojure in IntelliJ IDEA is called La Clojure and it packs quite a bunch of Le Features. Configuring it to not start La REPL locally but connect to one over La Network is not among them. Merde!
Le Me to the rescue.

REPL-y in-code

La Clojure is designed to call Java directly, providing all classpath elements determined through the IDE directly as part of the invocation, instead of relying on Leiningen’s or Maven’s built-in swank-clojure or REPL-y support.
At least, it allows one to configure what class to use to invoke Main, so what if we could just trick it into thinking it was running a local REPL when it’s really a remote one?

As La Clojure relies on a certain input/output communication scheme when attaching to the local process, Swank was out of the game – nREPL, on the other hand, is designed to provide a communication scheme together with a server implementation, and all it needs to implement a REPL client – but not the client itself. The existing documentation demonstrates how to communicate separate messages over a connection, but not how to turn the local REPL into a networked one.
On the other end of the spectrum, REPL-y provides a batteries-included solution of an improved Clojure REPL that has built-in nREPL support, but it doesn’t tell you how to use that from your own code. It’s easy if you dig and grok the code, but not obvious. Enough talking – let’s get to the action!

A Guide in Pictures^WCode

REPL-y launches nREPL through some high-level code that passes down options, normally taken from command line. We need to somehow resemble its behavior.

(defn launch-nrepl [options]
  "Launches the nREPL version of REPL-y, with options already
  parsed out"
  (with-launching-context options
    (reader.jline/with-jline-in
      (set-prompt options)
      (eval-modes.nrepl/main options))))

The options get passed down to the code that establishes the connection.

(defn main
  "Mostly ripped from nREPL's cmdline namespace."
  [options]
  (let [connection         (get-connection options)
        ...]
    ...))

(defn get-connection [{:keys [attach host port]}]
  (let [port (if-not attach
               (-> (nrepl.server/start-server :port (Integer. (or port 0)))
                   deref :ss .getLocalPort))
        url (url-for attach host port)]
    (when (-> url java.net.URI. .getScheme .toLowerCase #{"http" "https"})
      (load-drawbridge))
    (nrepl/url-connect url)))

As it appears, the :attach keyword argument determines whether to start a new server or to connect to one. Somehow. Let’s take a look at url-for.

;; TODO: this could be less convoluted if we could break backwards-compat
(defn- url-for [attach host port]
  (if (and attach (re-find #"^\w+://" attach))
    attach
    (let [[port host] (if attach
                        (reverse (.split attach ":"))
                        [port host])]
      (format "nrepl://%s:%s" (or host "localhost") port))))

It may not seem intuitive, but if attach is set, it takes precedence over both host and port. Without a colon contained in attach, it’s assumed to be a port number to connect to (at localhost), but with a colon, the first part is assumed to be the host instead.

One possible way to come to this conclusion, instead of monkey hacking, could be to run different possible combinations of parameters against the implementation of url-for:

(map (fn [[attach host port]]
       (if (and attach (re-find #"^\w+://" attach))
         attach
         (let [[port host] (if attach
                             (reverse (.split attach ":"))
                             [port host])]
           (format "nrepl://%s:%s" (or host "localhost") port))))
     [["foo:456" "bar" 123]
      ["456" "bar" 123]
      ["456" nil nil]
      [nil "bar" 123]
      ["456" "foo" nil]])
=> ("nrepl://foo:456" "nrepl://localhost:456" "nrepl://localhost:456" "nrepl://bar:123" "nrepl://localhost:456")

At this point, it should be possible to put together the code required for Main.

(ns com.my-company.foobar.core
  (:require [reply.main :as reply]
            [clojure.tools.nrepl.server :as nrepl]))

(gen-class
 :name com.my-company.foobar.core.nrepl
 :main true
 :prefix "nrepl-")

(defn nrepl-server [port]
  (print (format "Attempting to start nREPL server on localhost:%s..." port))
  (let [server (nrepl/start-server :port (Integer/parseInt port))]
    (println "done")
    server))

(defn nrepl-client [attach]
  (reply/launch-nrepl {:attach attach}))

(defn nrepl-main
  [mode & opts]
  (apply (case mode
           ":server" nrepl-server
           ":client" nrepl-client
           (throw (IllegalArgumentException. "Please specify either :server or :client")))
         opts))

This would go into a file located at src/main/clojure/com/my_company/foobar/core.clj, relative to your project’s source directory. At this point, it should be possible to fire up nREPL servers and REPL-y shells through Java invocations of your JAR. Just what we need for IntelliJ IDEA.

Hooking Up

All dependencies introduced so far, REPL-y and nREPL (the latter is actually a dependency of the former already), need to be handled by your project’s dependency system. La Clojure seems to have a hard time supporting Leiningen 2, the current state of the art, so the rest of this guide assumes you’re using Maven for your project, which is very well supported by the IDE in general. Furthermore, it is assumed that clojure-maven-plugin is used for Clojure integration. Add the following dependencies to your pom.xml:

<dependency>
    <groupId>reply</groupId>
    <artifactId>reply</artifactId>
    <version>0.1.2</version>
    <exclusions>
        <exclusion>
            <groupId>ring</groupId>
            <artifactId>ring-core</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>org.clojure</groupId>
    <artifactId>tools.nrepl</artifactId>
    <version>0.2.0-beta10</version>
</dependency>

At the time of writing, the latest version of clojure-maven-plugin, 1.3.13, doesn’t include the advertised built-in nREPL server goal, yet. Hence, the manual implementation of an nREPL server from above proves to be useful:

$ mvn clojure:run -Dclojure.mainClass=com.my-company.foobar.core.nrepl -Dclojure.args=:server\ 1234
(...)
Attempting to start nREPL server on localhost:1234...done

So, what’s left… right, connecting! To avoid bad surprises, let’s try from command line first.

$ mvn clojure:run -Dclojure.mainClass=com.my-company.foobar.core.nrepl -Dclojure.args=:client\ 1234
(...)
REPL-y 0.1.2
Clojure 1.4.0
user=> (quit)
Bye for now!

Excellent! If we can achieve the same from the Clojure facet in IntelliJ IDEA, we’ve (almost) reached our goal. Visit the project settings (Ctrl-Alt-Shift-s), add the Clojure facet if necessary, and modify the options to look as follows:

  • JVM arguments: -Xss1m -server
  • REPL options: :client 1234
  • REPL main class: com.my-company.foobar.core.nrepl

Perform a Maven build and fire up the Clojure Console (Ctrl-Shift-F10). If everything goes well, you’ll be greeted with a fresh REPL-y prompt. Hooray! Let’s try evaluating some code.

user=> (+ 1 1)
(+ 1 1))
2

What’s that garbled output? JLine, which is responsible for the readline support of REPL-y, assumes an underlying UNIX-style terminal by default. It needs to be instructed to use a scheme compatible with La Clojure’s console. Re-visit the project settings screen, this time touching just the JVM arguments again:

  • JVM arguments: -Xss1m -server -Djline.terminal=jline
  • REPL options: :client 1234
  • REPL main class: com.my-company.foobar.core.nrepl

Setting -Djline.terminal=jline will make REPL-y and the console get along with each other just fine, so no more garbled output.

If you want to connect to an nREPL server running on a remote machine, change the REPL options to something like :client host:port.

Putting it all together

For your convenience, all the custom code and settings quoted in this article are made available as Free Software in the public domain over at GitHub.

The final conclusion is that I totally kick butt, once again. Oh, and please tell your friends. Over and out.

Git: Evolution of a topic branch and how to review it

The key to reviewing topic branches in Git, i.e. story-related branches in case of Scrum, is Git’s rev-list command.

Let us suppose we’ve got the following commit/branch history:

   B---C-----D---E--
  /        \        \
 A---F---G--+----H---+--I

Here, the series of commits B-C-D-E happened on a topic branch that was created when commit A was the tip (or HEAD, in git-speak) of the master branch. The master branch continued with commits F and G, at which point an intermediate merge had to occur. After another commit H on top of the master branch came the final merge with the topic branch, followed by another commit I.

In this scenario, reviewing the series of commits B-C-D-E plus the merge would be a piece of cake, but what happens if F-G-H-I formed the topic branch, with master getting merged in from time to time? We’d have to tell the commits apart, i.e. know not to review B-C-D-E as well. It might seem trivial here, but imagine B-C-D-E was really hundreds of commits with even more merges in between.

In this case, rev-list offers the inconspicuous (or even misleadingly named) option --first-parent:

Follow only the first parent commit upon seeing a merge commit. This
option can give a better overview when viewing the evolution of a
particular topic branch, because merges into a topic branch tend to be
only about adjusting to updated upstream from time to time, and this
option allows you to ignore the individual commits brought in to your
history by such a merge.

Creating the review patch

To cut a long story short,

git rev-list --first-parent --reverse HASH-OF-A...HASH-OF-I

does just what you need and prints the series of hashes for F-G-+-H-+-I. Similarly,

git rev-list --first-parent --reverse HASH-OF-A...HASH-OF-E

gives you B-C-D-E so it even works for a simple use case without in-between merges. Adding --no-merges will additionally skip the merge commits (+).
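To see the effect of --first-parent without touching a real project, here is a rough sketch that builds a throwaway repository shaped like the diagram above and walks it. Everything in it is made up for the demo: the function names build_demo_repo and first_parent_subjects, the branch names mainline and topic, the tag base (standing in for HASH-OF-A) and the merge subjects M1 and M2. It uses the two-dot range, which selects the same commits as the three-dot form as long as the left side is an ancestor of the right side.

```shell
# Build a throwaway repository shaped like the diagram and print its path.
# All names (mainline, topic, base, M1, M2) are invented for this demo.
build_demo_repo() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    git init -q .
    git symbolic-ref HEAD refs/heads/mainline
    # Local identity so commits work without any global configuration.
    git config user.name "Demo"
    git config user.email "demo@example.invalid"
    git config commit.gpgsign false
    c() { git commit -q --allow-empty -m "$1"; }

    c A; git tag base
    git checkout -q -b topic;  c B; c C
    git checkout -q mainline;  c F; c G
    git merge -q --no-ff -m M1 topic      # first "+" in the diagram
    git checkout -q topic;     c D; c E
    git checkout -q mainline;  c H
    git merge -q --no-ff -m M2 topic      # second "+" in the diagram
    c I
  ) >/dev/null 2>&1 || return 1
  printf '%s\n' "$repo"
}

# Print the commit subjects of a first-parent walk, oldest first.
# $1 is the repository path, $2 a revision range.
first_parent_subjects() {
  git -C "$1" rev-list --first-parent --reverse "$2" |
    while read -r hash; do
      git -C "$1" log -1 --format=%s "$hash"
    done
}

repo=$(build_demo_repo)
first_parent_subjects "$repo" base..mainline
first_parent_subjects "$repo" base..topic
```

The first call should list F, G, M1, H, M2, I and the second B, C, D, E, one subject per line, matching the two rev-list invocations above.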

Make this an alias in your ~/.gitconfig by adding this:

[alias]
        review-branch = rev-list --reverse --first-parent

Now, in order to use this for a review, all that’s left to do is using such a list of hashes to retrieve their diffs:

git show $(git review-branch HASH-OF-A...HASH-OF-I) > review.patch

or, alternatively

git review-branch HASH-OF-A...HASH-OF-I \
  | xargs -n1 git show > review.patch

In order to just get all files involved throughout all commits, you could add some funny action, e.g.

git show $(git review-branch HASH-OF-A...HASH-OF-I) \
  | grep -E '^(--- a|\+\+\+ b)/' \
  | perl -ple '$_ =~ s/......(.*)/$1/' \
  | sort | uniq > files
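The same extraction can probably be done with sed alone, which saves the grep and perl steps. The following is only a sketch; patch_files is a hypothetical helper name, and it shares the weakness of the pipeline above that a removed content line which happens to start with "-- a/" would be picked up as well.

```shell
# Read a patch on stdin and print the unique file paths it touches.
# patch_files is a made-up name for this sketch.
patch_files() {
  # Strip the "--- a/" and "+++ b/" prefixes and print only the lines
  # where a substitution took place.
  sed -n -e 's|^--- a/||p' -e 's|^+++ b/||p' | sort -u
}

# Example with a small hand-written patch header.
printf '%s\n' \
  'diff --git a/src/core.clj b/src/core.clj' \
  '--- a/src/core.clj' \
  '+++ b/src/core.clj' \
  '@@ -1 +1 @@' \
  '-old' \
  '+new' | patch_files
```

With the review alias from above, the input would come from git show $(git review-branch HASH-OF-A...HASH-OF-I) instead of the hand-written header.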

Reviewing the changes

In order to make best use out of the review patch files created above, I’ve been using Review Board, which is Free Software as well, of course.

(Screenshot: Review Board diff viewer)

Review Board uses patch files as its native format and even supports loading missing context code from the repository, if set up to do so.

Conclusion

Git kicks ass and I’ve got Balls of Steel™.

Using GNU Emacs as a terminal emulator

This is something that has long been waiting on my machine and in my mind, but can’t easily be pressed into a project of its own. It is possible, with some tweaking, to use GNU Emacs as a powerful terminal emulator, effectively replacing software such as GNOME Terminal and GNU Screen at the same time. Additionally, it’ll give you something the alternatives lack: fully searchable and storable output buffers.

Basics

At the foundation of the concept lies a forsaken Emacs library called term. term, by itself, already provides a full-fledged terminal emulator. Just try invoking

M-x term

and it will give you a terminal emulating buffer, after querying for your preferred CLI shell. term by itself, however, is not very friendly: It has rough edges here and there, something you’ll likely realize after playing around with it for a while. Then, there’s also the ansi-term function pre-provided as part of term, which tries to be a little more clever with buffer names. But it all gets better by fetching and using multi-term by Andy Stewart and “ahei”, which deals with a number of problems in the default term implementation:

1. term.el just provides commands `term' or `ansi-term'
for creating a terminal buffer.
And there is no special command to create or switch
between multiple terminal buffers quickly.

2. By default, the keystrokes of term.el conflict with global-mode keystrokes,
which makes it difficult for the user to integrate term.el with Emacs.

3. By default, executing *NIX command “exit” from term-mode,
it will leave an unused buffer.

4. term.el won’t quit running sub-process when you kill terminal buffer forcibly.

5. Haven’t a dedicated window for debug program.

And multi-term.el is enhanced with those features.

This little gem will make your life easier, and to get more out of it I use

(when (require 'multi-term nil t)
  (global-set-key (kbd "<f5>") 'multi-term)
  (global-set-key (kbd "<C-next>") 'multi-term-next)
  (global-set-key (kbd "<C-prior>") 'multi-term-prev)
  (setq multi-term-buffer-name "term"
        multi-term-program "/bin/zsh"))

in my .emacs. By setting multi-term-program to my favored shell, zsh, I’m not queried anymore each time I try to open a new terminal emulation. The F5 key would now open new terminal buffers, and the familiar Ctrl-PageUp/Down key combinations would cycle between existing open terminal buffers.

Thanks to multi-term’s cleverness, the default-directory variable is honored by default, which means that opening new terminal buffers will always have them started at the directory of the currently open file, if applicable.

Keybindings

Now, many popular key combinations still get captured by the default keymaps in place for multi-term. Let’s put some new keybindings in place that handle this:

(when (require 'term nil t) ; only if term can be loaded..
  (setq term-bind-key-alist
        (list (cons "C-c C-c" 'term-interrupt-subjob)
              (cons "C-p" 'previous-line)
              (cons "C-n" 'next-line)
              (cons "M-f" 'term-send-forward-word)
              (cons "M-b" 'term-send-backward-word)
              (cons "C-c C-j" 'term-line-mode)
              (cons "C-c C-k" 'term-char-mode)
              (cons "M-DEL" 'term-send-backward-kill-word)
              (cons "M-d" 'term-send-forward-kill-word)
              (cons "<C-left>" 'term-send-backward-word)
              (cons "<C-right>" 'term-send-forward-word)
              (cons "C-r" 'term-send-reverse-search-history)
              (cons "M-p" 'term-send-raw-meta)
              (cons "M-y" 'term-send-raw-meta)
              (cons "C-y" 'term-send-raw))))

Most of the bindings should be self-explanatory. What’s interesting is term-line-mode and term-char-mode: By default, term operates by sending each keystroke to the shell, unless otherwise defined in the active keymap. This prevents us from easily navigating or searching the buffer. By switching to term-line-mode however, which is intended to send input only after return is pressed, we can e.g. backward-search with Ctrl-r as usual, copy text, etc. and switch back to term-char-mode for normal operation when done.

Shell interop

If you were previously using emacsclient to open and edit files in your existing Emacs instance in server mode, you will wonder how that works out with Emacs being the terminal emulator itself. As it turns out, Emacs is not very good at running its own terminal frames inside itself, so we need to find something different.

When digging through term.el, you will eventually find the following comment preceding the function term-handle-ansi-terminal-messages:

Function that handles term messages: code by rms (and you can see the difference ;-) -mm

Hmm, some leftover code by Richard Stallman, founder of the GNU project and GNU Emacs himself? As it turns out, the code in question will scan output for special byte sequences similar to the ones used by the xterm terminal emulator to e.g. change its title. It would consume the sequence as well as the information passed to it and store it in three different variables,

  • term-ansi-at-dir
  • term-ansi-at-host
  • term-ansi-at-user

and use these, in turn, for default-directory and the ange-ftp library. This is, however, problematic as FTP (file transfer protocol) is not in wide use anymore and the generated values for default-directory will default to FTP operations even if the currently operated host is the local one running Emacs.

But we can use the existing function, hijack its implementation and use it to add custom commands as well as make it handle localhost correctly and remote hosts via SSH! To cut a long story short, add the following to your .emacs:

(when (require 'term nil t)
  (defun term-handle-ansi-terminal-messages (message)
    (while (string-match "\eAnSiT.+\n" message)
      ;; Extract the command code and the argument.
      (let* ((start (match-beginning 0))
             (command-code (aref message (+ start 6)))
             (argument
              (save-match-data
                (substring message
                           (+ start 8)
                           (string-match "\r?\n" message
                                         (+ start 8))))))
        ;; Delete this command from MESSAGE.
        (setq message (replace-match "" t t message))

        (cond ((= command-code ?c)
               (setq term-ansi-at-dir argument))
              ((= command-code ?h)
               (setq term-ansi-at-host argument))
              ((= command-code ?u)
               (setq term-ansi-at-user argument))
              ((= command-code ?e)
               (save-excursion
                 (find-file-other-window argument)))
              ((= command-code ?x)
               (save-excursion
                 (find-file argument))))))

    (when (and term-ansi-at-host term-ansi-at-dir term-ansi-at-user)
      (setq buffer-file-name
            (format "%s@%s:%s" term-ansi-at-user term-ansi-at-host term-ansi-at-dir))
      (set-buffer-modified-p nil)
      ;; Local host: plain directory; remote host: TRAMP-style path.
      (setq default-directory (if (string= term-ansi-at-host (system-name))
                                  (concat term-ansi-at-dir "/")
                                (format "/%s@%s:%s/" term-ansi-at-user term-ansi-at-host term-ansi-at-dir))))
    message))

Now, term will play nicely with local shell sessions. But how do we teach term about our current user, host and working directory? The implementation varies with your shell of choice. For zsh, I use:

prompt_eterm_precmd () {
  case $TERM in
    xterm*)
      print -Pn "\e]0;%n@%m:%~ (%l)\a"
      ;;
    eterm-color*)
      print -P "\eAnSiTh %m"
      print -P "\eAnSiTu %n"
      print -P "\eAnSiTc %~"
      ;;
  esac
}

in my custom theme that I set using

autoload -U promptinit && promptinit && \
  prompt $([ ${TERM}X = "eterm-colorX" ] && echo eterm || echo e-user)

in my .zshrc. Similarly, for GNU Bash the following will work if set from .bashrc:

# are we an interactive shell?
if [ "$PS1" ]; then
  case $TERM in
    eterm-color*)
      if [ -n "$SSH_CONNECTION" ]
      then
        _HOST=$(echo -n $SSH_CONNECTION | cut -d\  -f3)
      else
        _HOST=$HOSTNAME
      fi

      PROMPT_COMMAND='echo -ne "\033AnSiTh ${_HOST}\n\033AnSiTu ${USER}\n\033AnSiTc ${PWD/#$HOME/~}\n"'
      ;;
    xterm*)
      PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME%%.*}:${PWD/#$HOME/~}"; echo -ne "\007"'
      ;;
    screen)
      PROMPT_COMMAND='echo -ne "\033_${USER}@${HOSTNAME%%.*}:${PWD/#$HOME/~}"; echo -ne "\033\\"'
      ;;
    *)
      [ -e /etc/sysconfig/bash-prompt-default ] && PROMPT_COMMAND=/etc/sysconfig/bash-prompt-default
      ;;
  esac
fi

What’s left to do now is to make file opening work again; the Emacs bits are already in place with the term-handle-ansi-terminal-messages implementation from above. For zsh, add to .zshrc:

if [ "${TERM}x" = "eterm-colorx" ]
then
  alias e='print -P "\eAnSiTe"'
  alias x='print -P "\eAnSiTx"'
else
  alias e='emacsclient -n -t -a nano'
fi

And for Bash, add to .bashrc:

if [ "${TERM}x" = "eterm-colorx" ]
then
  alias e='echo -e "\033AnSiTe"'
  alias x='echo -e "\033AnSiTx"'
else
  alias e='emacsclient -n -t -a nano'
fi

FWIW, you might not agree with my default fallback choice of GNU nano as editor; just adapt it to your personal needs.

From now on, however, new sessions in Emacs term should be able to make Emacs open files: e filename opens the file in a different window, and x filename opens it in a new buffer in the same window the terminal emulator is running in. Furthermore, remote hosts with the same Bash / zsh RC configuration in place should support find-file with correctly built SSH default paths out of the box.
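Boiled down, every message the handler above understands is an ESC byte, the literal AnSiT, a one-letter command code, a space, an argument and a newline. A small sketch of a helper that emits exactly that with printf, which behaves the same in Bash and zsh, could look as follows. The names eterm_msg and eterm_report are hypothetical and not part of term.el, and the host name reported here may not match what Emacs returns from system-name on every machine.

```shell
# Emit one message for the handler: ESC, "AnSiT", a one-letter command
# code, a space, the argument and a trailing newline.
# eterm_msg is a made-up helper name, not part of term.el.
eterm_msg() {
  printf '\033AnSiT%s %s\n' "$1" "$2"
}

# Hypothetical wrapper reporting host, user and working directory,
# i.e. the h, u and c codes handled above.
eterm_report() {
  eterm_msg h "$(hostname)"
  eterm_msg u "$(id -un)"
  eterm_msg c "$PWD"
}
```

With this in place, e and x could be written as small functions calling eterm_msg e "$1" and eterm_msg x "$1" instead of aliases, which also keeps file names with spaces in one piece.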

Conclusion

Even if a lot of tampering is involved, graphical mode Emacs can be turned into a powerful replacement for terminal emulators and GNU Screen at the same time. By utilizing Emacs buffers for shell output, new possibilities for working without context switches arise (as well as horrifying security nightmares). The possibility to store and search output buffers, however, is unparalleled in power and essentially gives you similar possibilities to that of a typical in-Emacs Lisp REPL.

FSFE Fellowship of Berlin: We want YOU!

So you do care about freedom. You do care about Free Software, because it gives you the four freedoms to use, study, share and improve the fabrics that make up digital life. If you also happen to live in the Berlin area, get your ass here!

We, the Fellowship of the FSFE in Berlin, are an active group of kick-ass Free Software activists who could use another hand or two – your hands!

So you think there’s nothing you could contribute? Think again! Everyone has talents: Providing technical support, bright ideas to share, a smile that makes everyone cheer up, giving speeches about a field of special interest that has to do with Free Software, taking action on the streets.. the possibilities are without limits. And the more we are, the better we can do, together.

Come and join the Berlin Fellowship in the FSFE Berlin office at Linienstr. 141, in 10115 Berlin, 19:30 on every second Thursday each month. Simply find the next date via GriCal. Got questions? Contact e dash user at fsfe dot org.

See you there!

Concatenating PDFs

While digging through my little archive of Emacs Muse based journal files, I found this little gem that will concatenate an arbitrary number of PDF files:

shell> gs -q -sPAPERSIZE=letter -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
          -sOutputFile=out.pdf in1.pdf in2.pdf ... inN.pdf

Ghostscript needs to be installed and gs on your PATH for this to work.

Translation of bit fields

I recently came across the problem of how to convert a bit field input value, in this case representing simultaneously pressed modifier keys coming from the Ace code editor, into a set of keywords. A close-to literal translation of the typical C-paradigmatic way would have been to write something like

(let [result (transient #{})]
  (when-not (zero? (bit-and input 1))
    (conj! result :ctrl))
  (when-not (zero? (bit-and input 2))
    (conj! result :alt))
  (when-not (zero? (bit-and input 4))
    (conj! result :shift))
  (persistent! result))

At first, this may seem intuitive, and could be further refined as

(let [result (transient #{})]
  (doseq [[n field] [[1 :ctrl] [2 :alt] [4 :shift]]]
    (when-not (zero? (bit-and input n))
      (conj! result field)))
  (persistent! result))

and put into a higher order function taking the bit field values as input. Problem? It’s fucking ugly and not really Clojuresque. Here’s what I came up with:

(defn bitmask-seq [& xs]
  (zipmap (iterate (partial * 2) 1) xs))

(defn flagfn [& xs]
  (fn [n]
    (let [bitmask (apply bitmask-seq xs)]
      (->> (filter (complement #(zero? (bit-and n %))) (keys bitmask))
           (select-keys bitmask) vals set))))

(def modifier-keys (flagfn :ctrl :alt :shift))

Not only does it look much better, it feels idiomatic through extensive use of higher-order functions and lazy evaluation. Surprisingly, it’s even a little faster in my version of Clojure.
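If you want to double-check which names a given input value maps to without firing up a Clojure REPL, the same table-driven idea (names assigned to successive powers of two) can be sketched in plain shell arithmetic. This is just a cross-check, not a replacement for flagfn; decode_flags is a hypothetical name, and unlike the Clojure version it returns a space-separated list rather than a set.

```shell
# decode_flags VALUE NAME...
# The first name belongs to bit 1, the second to bit 2, the third to
# bit 4 and so on. Prints the names whose bit is set in VALUE.
decode_flags() {
  value=$1
  shift
  bit=1
  out=""
  for name in "$@"; do
    if [ $(( value & bit )) -ne 0 ]; then
      out="${out:+$out }$name"
    fi
    bit=$(( bit * 2 ))
  done
  printf '%s\n' "$out"
}

decode_flags 5 ctrl alt shift   # bits 1 and 4 are set
```

The call at the end prints ctrl shift, which is what (modifier-keys 5) yields as the set #{:ctrl :shift}.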

Conclusion: I kick ass.