CK knows Wayne

A Rust adventure

Published at by Christian Kruse, updated at
Filed under: programming, rust

tl;dr: Rust rocks

During the 31C3 I started learning Rust. I had wanted to for a long time, and now I finally had the time to do so: a compiled systems programming language featuring static type safety, a type system inspired by Haskell’s type classes and easy parallelism with nearly no runtime overhead? Sounds like a dream come true!

So I installed the Rust nightly (on Arch Linux this is just a yaourt -S rust-nightly-bin && yaourt -S cargo-nightly-bin), read through the guide and started hacking on a small project to get into the language a little bit more: a parallel grep.

The first steps were pretty easy: creating and building new projects is, thanks to cargo, just a command away, and the guide is easy to understand. My basic concept was to create a std::sync::Future (Rust’s futures implementation) for each file we have to search through and return the results as an array to the main thread when finished.

The problems began when I tried to find out how to work with the standard library: the API documentation is pretty unusable. Try to find out how to search for a substring in a string or how to open a file. I wasn’t able to find either by myself; I always had to do a web search. This has to get better!

The first “woah!!” occurred when I tried to put a std::sync::Future into a vector. I definitely have to get used to type inference: I tried to use a type annotation for the vector since I didn’t think the compiler would be able to get the type right. So I read futures.rs to get the type annotation right: Vec<Future<_>>. The underscore tells the type checker to work out that part of the type by itself; we only state that the vector holds some kind of future. Man, that is really cool stuff! Then (after I had spent an hour getting the type right) I tried to simply leave out the type, and it works! This is really amazing.
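To make the two variants concrete, here is a minimal sketch in current Rust. It assumes std::thread::JoinHandle as a stand-in for std::sync::Future, which is no longer part of the standard library; spawn_lengths is a made-up helper for illustration only.

```rust
use std::thread::{self, JoinHandle};

/// Spawns two workers per word and returns the computed lengths.
/// (Hypothetical helper; JoinHandle stands in for the old Future type.)
fn spawn_lengths(words: Vec<String>) -> Vec<usize> {
    // No annotation at all: the element type is inferred from the push below.
    let mut inferred = Vec::new();
    // Partial annotation: `_` asks the compiler to fill in the result type.
    let mut annotated: Vec<JoinHandle<_>> = Vec::new();

    for word in words {
        let copy = word.clone();
        inferred.push(thread::spawn(move || word.len()));
        annotated.push(thread::spawn(move || copy.len()));
    }

    // Both vectors ended up as Vec<JoinHandle<usize>>, so they can be chained.
    inferred
        .into_iter()
        .chain(annotated)
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}

fn main() {
    let lengths = spawn_lengths(vec!["rust".to_string(), "grep".to_string()]);
    println!("{:?}", lengths);
}
```

Both declarations produce the same type; the annotation is only needed when the compiler has nothing to infer the element type from.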

Another problem occurred when I tried to iterate over the vector to gather the results. The default way to iterate over a vector in Rust is using iterators:

for result in results.iter() {
    let lines = result.get();
    for line in lines.iter() {
        print!("{}", line);
    }
}

results is the Vec<Future<_>>. The code above leads to this error message:

/home/ckruse/dev/ngrep/src/main.rs:48:21: 48:27 error: cannot borrow immutable dereference of `&`-pointer `*result` as mutable
/home/ckruse/dev/ngrep/src/main.rs:48         let lines = result.get();
                                                             ^~~~~~
error: aborting due to previous error
Could not compile `ngrep`.

To learn more, run the command again with --verbose.

This error message is not that helpful for beginners, but after reading the guide again I was able to track down the problem: the vector iterator hands out immutable references to the elements, but the .get() method needs the std::sync::Future to be mutable. After searching the web again I found a solution: I iterate over the array the old way with a counter and call .get_mut(), which returns a mutable reference to the element:

for i in range(0, results.len()) {
    let mut result = match results.get_mut(i) {
        Some(v) => v,
        None => panic!("error! no value!")
    };

    let lines = result.get();
    for line in lines.iter() {
        print!("{}", line);
    }
}

This works, but is ugly. Maybe a reader knows a better solution?
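One possible answer, sketched for current Rust: iter_mut() yields mutable references, so the counter and the match are not needed. Since std::sync::Future no longer exists in the standard library, Delayed below is a hypothetical stand-in whose get() also takes &mut self; gather is likewise a made-up helper.

```rust
/// Hypothetical stand-in for the old std::sync::Future:
/// like the original, `get` needs mutable access.
struct Delayed {
    compute: Option<Box<dyn FnOnce() -> Vec<String>>>,
    value: Vec<String>,
}

impl Delayed {
    fn new(f: impl FnOnce() -> Vec<String> + 'static) -> Self {
        Delayed { compute: Some(Box::new(f)), value: Vec::new() }
    }

    /// Runs the computation on first use, then returns the cached result.
    fn get(&mut self) -> &Vec<String> {
        if let Some(f) = self.compute.take() {
            self.value = f();
        }
        &self.value
    }
}

/// Collects all lines from all results, in order.
fn gather(results: &mut [Delayed]) -> Vec<String> {
    let mut out = Vec::new();
    // iter_mut() hands out `&mut Delayed`, so calling get() is allowed.
    for result in results.iter_mut() {
        for line in result.get() {
            out.push(line.clone());
        }
    }
    out
}

fn main() {
    let mut results = vec![
        Delayed::new(|| vec!["a.txt:0: foo\n".to_string()]),
        Delayed::new(|| vec!["b.txt:3: foo\n".to_string()]),
    ];
    for line in gather(&mut results) {
        print!("{}", line);
    }
}
```

Writing for result in &mut results is equivalent shorthand for the iter_mut() loop.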

The rest of the code is pretty straightforward:

use std::os;
use std::sync::Future;
use std::io::BufferedReader;
use std::io::File;

fn main() {
    let args = os::args();
    let mut results = vec![];

    for file in args.slice(2, args.len()).iter() {
        let path = Path::new(file);
        let pattern = args[1].clone();

        let delayed_value = Future::spawn(move || -> Vec<String> {
            let mut retval: Vec<String> = vec![];

            let fd = match File::open(&path) {
                Ok(f) => f,
                Err(e) => panic!("Couldn't open file: {}", e)
            };

            let mut file = BufferedReader::new(fd);
            let mut lineno : uint = 0;

            for maybe_line in file.lines() {
                let line = match maybe_line {
                    Ok(s) => s,
                    Err(e) => panic!("error reading line: {}", e)
                };

                if line.contains(pattern.as_slice()) {
                    let s = path.as_str().unwrap();
                    retval.push(format!("{}:{}: {}", s, lineno, line));
                }

                lineno += 1;
            }

            retval
        });

        results.push(delayed_value);
    }

    for i in range(0, results.len()) {
        let mut result = match results.get_mut(i) {
            Some(v) => v,
            None => panic!("error! no value!")
        };

        let lines = result.get();
        for line in lines.iter() {
            print!("{}", line);
        }
    }
}

Another really cool feature is enums that carry values. Let’s have a look at this:

let mut result = match results.get_mut(i) {
    Some(v) => v,
    None => panic!("error! no value!")
};

The return value of get_mut() is just an enum, but in Rust enum variants may additionally carry a value. This leads to constructs like the one above. The match keyword introduces a pattern matching construct, and we basically say “when the return value of get_mut() is Some, bind its value to the variable v and return it; if it is None, simply panic.” This rocks! Now we can forget about all this int somefunc(char *real_retval) bullshit from C. We can properly distinguish between error cases and real return values. Yay!
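Such enums are not limited to the built-in Option. A minimal sketch of a self-defined one follows; Lookup, find_line and describe are invented names for illustration and not part of the grep above.

```rust
/// Hypothetical result type: each variant carries its own payload.
#[derive(Debug, PartialEq)]
enum Lookup {
    /// Index of the first matching line.
    Found(usize),
    /// Human-readable reason why nothing was found.
    Missing(String),
}

/// Returns the index of the first line containing `pattern`.
fn find_line(lines: &[&str], pattern: &str) -> Lookup {
    match lines.iter().position(|line| line.contains(pattern)) {
        Some(index) => Lookup::Found(index),
        None => Lookup::Missing(format!("no line contains {:?}", pattern)),
    }
}

/// The caller has to handle both cases; there is no out-parameter to forget.
fn describe(lookup: &Lookup) -> String {
    match lookup {
        Lookup::Found(index) => format!("match in line {}", index),
        Lookup::Missing(reason) => format!("error: {}", reason),
    }
}

fn main() {
    let lines = ["foo", "bar"];
    println!("{}", describe(&find_line(&lines, "ar")));
    println!("{}", describe(&find_line(&lines, "qux")));
}
```

The error case and the real return value travel in the same return type, and the compiler complains if a match forgets one of the variants.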

Another interesting point was that I tried to use the pattern variable directly in the closure. The compiler forbade that with some remarks about the lifetime of the variable, and I didn’t understand what exactly it was trying to say. After a web search I found this: everything is fine during the first loop iteration. The closure takes over the pattern variable and we can use it. But after the first iteration (well, to be exact: after the first closure has finished) it would clean up after itself and free the memory. In the next iteration we would then access invalid memory, potentially leading to a crash. This has been caught by the compiler! Wow, great! So the solution was to create a copy for each iteration. Finished.
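Reduced to its core, the pattern looks like the following sketch in current Rust. It assumes std::thread::spawn in place of Future::spawn, and count_matches is a hypothetical helper, not the code from above.

```rust
use std::thread;

/// Counts, for each text, the lines that contain `pattern`.
/// One worker thread is spawned per text.
fn count_matches(pattern: &str, texts: Vec<String>) -> Vec<usize> {
    let pattern = pattern.to_string();
    let mut handles = Vec::new();

    for text in texts {
        // Each closure gets its own copy. Moving `pattern` itself into the
        // closure would leave nothing for the next iteration, and the
        // compiler rejects that.
        let pattern = pattern.clone();
        handles.push(thread::spawn(move || {
            text.lines()
                .filter(|line| line.contains(pattern.as_str()))
                .count()
        }));
    }

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker panicked"))
        .collect()
}

fn main() {
    let texts = vec!["foo\nbar\nfoo".to_string(), "bar".to_string()];
    println!("{:?}", count_matches("foo", texts));
}
```

Removing the pattern.clone() line makes the compiler refuse the program, which is exactly the safety net described above.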

All in all I really like the language. The functional elements (immutable variables by default, pattern matching, tuples, etc.) are integrated well and fit into the flow. Parallelism is easy and reminds me of Erlang. It is really hard to introduce bugs or crashes, because the compiler detects a lot of potential and real problems. Of course there is still work to do (some error messages are hard to understand, the API documentation is really bad), but I could definitely imagine that this will be my first-choice language in the future.

11 comments

Facebook laws for idiots

Published at by Christian Kruse, updated at
Filed under: college humor, facebook

I am hereby immune to gonorrhea!

0 comments

Emacs: run rails test at point

Published at by Christian Kruse, updated at
Filed under: emacs, rails

Lately I’ve been writing a lot of tests. To shorten the round trip times and for some comfort I wanted to run tests in Emacs: I don’t need to switch to the console and I can simply hit enter on an error message to get to the right file at the right position.

While projectile-rails supports running rake tasks, rake can only run the complete test suite or single test cases, not just a single test. This increases the round trip time too much for me, so Emacs to the rescue! I wrote some elisp to run a single rails test:

(defun get-current-test-name ()
  (save-excursion
    (let ((pos)
          (test-name))
      (re-search-backward "test \"\\([^\"]+\\)\" do")
      (setq test-name (buffer-substring-no-properties (match-beginning 1) (match-end 1)))
      (concat "test_" (replace-regexp-in-string " " "_" test-name)))))


(defun run-test-at-point ()
  (interactive)
  (let ((root-dir (projectile-project-root)))
    (compile (format "ruby -Ilib:test -I%s/test %s -n %s" root-dir (expand-file-name (buffer-file-name)) (get-current-test-name)))))


This little snippet runs the test at point, i.e. the test the cursor is currently located in. I bound it to C-c c t.

2 comments

Things You Should Never Do

Published at by Christian Kruse, updated at
Filed under: development, mistakes, software

Last night I was reading an article about the worst mistake in software development: a complete rewrite of a company’s flagship software.

I can definitely confirm that, in general, a complete rewrite is a really bad idea: from personal experience, because I made that mistake myself, and from second-hand experience, since I have several friends struggling with the same problem.

On the other hand it can be a very good idea to cut out old code and replace it with new code: have a look at OpenSSL/LibreSSL. Another example may be the nginx project, which is able to outperform Apache just because its code base is not that crufty.

I can’t tell you what the right way for your project is, but I thought it might be a good idea to share that link from above. It is an interesting read either way.

0 comments

Risiko-Abwägung (risk assessment, German)

Published at by Christian Kruse, updated at
Filed under: media

This!

Risken

0 comments

Trouble at the Koolaid Point

Published at by Christian Kruse, updated at
Filed under: feminism, gamergate

I didn’t want to say anything about “gamergate” since I’m not even a real gamer, just a casual one. But I find it simply horrible what people are capable of, and death threats and harassment should be punished with jail or a very substantial fine. This has nothing to do with “free speech”: your freedom ends where the freedom of others begins.

That said, I’d like to point you to a very interesting blog post by Seriouspony. It talks about a phenomenon called “the kool-aid point”. It describes how, in the beginning, people who don’t like you or your ideas ignore you, and as soon as you get attention they get nasty, because they think that you don’t deserve it and because they want some kind of revenge.

I can totally confirm that theory. I have had similar experiences (although not that severe: I never got death threats, but I’ve been doxed and harassed).

This is serious, people. It is no less serious because you do it on the internet. It is even more serious, because people can still read it years later.

0 comments

Lena Reinhard: This is bigger than us: Building a future for Open Source

Published at by Christian Kruse, updated at
Filed under: development, foss, open source

Diversity is the default. If it’s not diverse, it’s broken.

0 comments

PostgreSQL full text search is good enough

Published at by Christian Kruse, updated at
Filed under: postgresql

LostProperty wrote a nice blog post about the PostgreSQL full text search in which they state that for most use cases it is “good enough”. It gives a nice overview of the capabilities of the PostgreSQL full text search; you should read it if you are new to this field.

0 comments

Emacs: more convenient unique buffer names

Published at by Christian Kruse, updated at
Filed under: emacs

In Emacs each buffer has a unique name. For file buffers the name is derived from the file name, so for example a buffer associated with the file README is named README. This is fine as long as you don’t open several files with the same name. In that case, to ensure the uniqueness of the buffer name, Emacs will append a number to the buffer name, for example README<2>.

This makes it somewhat hard to distinguish file buffers. Luckily there is a solution for that: Uniquify. This module lets you choose a different way to generate unique buffer names: directories. It will use parts of the directory to make the buffer name unique, for example cforum/README instead of README.

I configured it to append the directory parts to the buffer name instead of prepending them; this way the name is still the most prominent piece of information:

(require 'uniquify)
(setq uniquify-buffer-name-style
      'post-forward
      uniquify-separator ":")

0 comments

Bug necromancers

Published at by Christian Kruse, updated at
Filed under: bug, out of date

In the last two weeks I got two mails about bug reports that I submitted or participated in back in 2004. The first one is a feature request for Kate, a text editor for KDE. I requested the ability to fold by the commonly used Vim and Emacs folding markers, as Kate had invented its own. This has finally been implemented, 10 years after I requested it. Yay!

The second one is a bug report for Firefox. For documents encoded in ISO-8859-1 Firefox sent (I don’t know if this is still the case, as today in the age of Unicode and UTF-8 it is a non-issue for me) text entered by the user as a weird mixture of Unicode escape sequences and Windows-1252 encoded text, despite the accept-charset attribute. This clearly is silly, as you won’t be able to reliably reconstruct the text a user entered. This bug has been marked as “WONTFIX”. Hm, it would be interesting to check what today’s behaviour is, but on the other hand… just use UTF-8, NN4 and IE<6 aren’t around anymore. This bug report has been around for ten years as well.

I’m really amused that now, after 10 years, they’ve been closed, one even with a fix ;)

0 comments