Creating custom syntax highlighting for Kakoune

The missing syntax highlighter

I recently got the urge to try my hand at coding C# in my favourite modal editor, Kakoune. To my surprise, however, I found that C# was not among the syntax highlighters that ship with Kakoune.

10 minutes later

When I first confirmed this, I thought adding the syntax would be a slog. But by following the suggestion in this ticket and simply adapting the java syntax, I had syntax highlighting working very quickly.

I created a repo at https://github.com/Thomashrb/csharp.kak to keep track of it.



Kakoune, a fresh take on editors

Editing and exploring text in a modal editor is fun. In editors like Kakoune and Helix (https://helix-editor.com/) it is even more fun.

Modal editing is flipped on its head: where in vim you first state what you want to do and then how much of it or where, in kak/hx you first state how much or where and then what you want to do.

In practice this means that you first select an area of text and then state what you want to do with it. For example, deleting two words forward:

  • in vim: d2w
  • in kak: 2wd
  • in hx: v2wd

This concept also extends to things like search and replace, where you would otherwise reach for macros, because kak and hx are both built around multiple cursors. Besides giving clear visual feedback on what you are about to do, multiple cursors are also a really fun way to edit text.
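To make the "select first, then act" idea concrete, here is a rough Python sketch. It is only an illustration of the model, not how Kakoune or Helix are implemented; the function names and the sample buffer are made up for the example.

```python
import re


def select_all(text: str, pattern: str) -> list[tuple[int, int]]:
    """Step 1: build one selection (start, end) per match, like multiple cursors."""
    return [m.span() for m in re.finditer(pattern, text)]


def replace_selections(text: str, selections: list[tuple[int, int]], new: str) -> str:
    """Step 2: apply a single action to every selection."""
    # Work from the last selection backwards so earlier offsets stay valid.
    for start, end in sorted(selections, reverse=True):
        text = text[:start] + new + text[end:]
    return text


buffer = "foo = bar(foo) + foo"
selections = select_all(buffer, r"\bfoo\b")
print(selections)                                     # [(0, 3), (10, 13), (17, 20)]
print(replace_selections(buffer, selections, "qux"))  # qux = bar(qux) + qux
```

The selections exist (and can be inspected) before anything is changed, which is the visual feedback part of the workflow.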

Why am I drawn to Kakoune specifically

It is not because Kakoune is written in C++ (helix is written in rust btw) but because it tries to stick to the unix philosophy of doing only one thing and doing it well. Instead of building in things like file explorers, lsp, treesitter, vcs integrations, tabs and windows, or more esoteric features like sorting text (which vim includes), or a search and replace for big files where multiple cursors are unwieldy, Kakoune makes it easy to integrate with external tools and even includes utilities to pipe text out to command line utilities.

For windowing and tabs Kakoune either opens new windows in the os or, even better, uses tmux for window handling through its client/server architecture. Why implement and commit to maintaining something that already exists and probably does a better job?

Take the example of sorting text, which is built into vim: in Kakoune you select the lines you want to sort and pipe them to sort (pipe is even bound to | by default). Kakoune's lack of a file explorer has also led me to discover broot, which I didn't know I was missing before I found it, and which is easy to integrate into my Kakoune workflow.

So far I have found that using pipe, and even append-output (which runs a shell command and appends its output to the buffer), lets me work the same way inside my editor as I do outside of it. I spend less time thinking about where I am and how to do something, and more about what I am doing. I know this is also possible in vim with ! % and the angled brackets; it is just more streamlined in Kakoune, where it is “the” way of doing things.
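As a rough sketch of what piping a selection does conceptually (feed the selected text to a command's stdin, take its stdout as the replacement), here is the idea in Python. This is not Kakoune's implementation; pipe_selection is a made-up helper, and the example assumes a sort binary is available on the PATH.

```python
import subprocess


def pipe_selection(selection: str, command: list[str]) -> str:
    """Send the selected text to the command's stdin and return its stdout."""
    result = subprocess.run(
        command,
        input=selection,
        capture_output=True,
        text=True,
        check=True,  # raise if the external command fails
    )
    return result.stdout


selected_lines = "pear\napple\nmango\n"
# The editor would replace the selection with this output.
print(pipe_selection(selected_lines, ["sort"]), end="")
```

The editor never needs to know how to sort; any filter that reads stdin and writes stdout slots in the same way.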

Oh and clippy is back!



Using a C shared library from zig

Using a C library from zig is quite easy. The zig compiler understands C code, so no FFI is required.

Create a standard zig project

$ zig init-exe

Find a library and function(s) you would like to use

$ ls /lib/

As an example I will use libc, and the simplest function I can think of to use is printf:

$ nm -C -A /lib/libc.so.6 | grep " printf"
/lib/libc.so.6:0000000000058230 T printf

Link the library in build.zig

$ git diff build.zig
diff --git a/build.zig b/build.zig
index 0e4f4b7..66343d9 100644
--- a/build.zig
+++ b/build.zig
@@ -12,6 +12,7 @@ pub fn build(b: *std.build.Builder) void {
     const mode = b.standardReleaseOptions();

     const exe = b.addExecutable("c_lib_test", "src/main.zig");
+    exe.linkSystemLibrary("c");
     exe.setTarget(target);
     exe.setBuildMode(mode);
     exe.install();

Import and call library in your zig code

The import is done with two builtin functions

const c = @cImport({
    @cInclude("stdio.h");
});

Now all functions in stdio reside under the const c and can be called like so (the return value has to be discarded explicitly):

_ = c.printf("Hello world of C!");

That's it, here is the full program

const c = @cImport({
    @cInclude("stdio.h");
});

pub fn main() void {
    _ = c.printf("Hello world of C!");
}

$ zig build run
Hello world of C!


A first look at unison language

It's not a philishave

It's unison! Unison is (as of writing this) an alpha release language. I am by no means an expert in unison, so the following is my summary after a deep dive into the documentation.

The lowdown on unison is that it is a functional language that applies the concept of content addressed storage to how code and the codebase are managed. Unlike other languages, where the codebase is a set of files, in unison the codebase is a database of content addressable definitions.

At the heart of how the codebase works are hashed functions. Unison stores and manages these hashes as well as the AST that makes up the program. One can think of a unison program as a graph where every node is a definition that can refer to other definitions. For the programmer to be able to work on the codebase, unison can render these ASTs to and from text files.

Hashed functions

In order to enable content addressing unison will hash all your functions. However, it first normalizes the functions so that names and aliases do not affect the hash.

Let's attempt to pseudo-hash the following function as unison would do it:

sum : [Nat] -> Nat
sum xs = foldLeft (+) 0 xs

The first thing unison does is to “normalize” the function.

sum  xs     foldLeft    (+)        0  xs
_    $arg1  #abcde1234  (##Nat.+)  0  $arg1

  • sum - is the function name, but we do not care about it because the aim is to produce the function's name (its hash)
  • xs - is exchanged for an indexed argument name
  • foldLeft - is exchanged for its hash (here represented with just the start of the hash)
  • 0 - constants are stored as constants
  • (+) - this is a built in function, so it is a special case that resolves to a namespaced function name

Now the “normalized” representation of the function can be hashed.

Some codebase selling points

Refactoring aliases does not affect the codebase

Because of the way the functions are hashed, unison does not care what programmers name their functions, or even whether two functions are given the same name. Our function above:

sum xs = foldLeft (+) 0 xs

would have the same hash as this function:

tally nums = foldLeft (+) 0 nums
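The normalization described earlier can be sketched in Python. This is only a toy: real unison hashes the AST, whereas here a definition is assumed to be a flat list of tokens, SHA-256 is an arbitrary choice of hash, and the KNOWN table simply reuses the placeholder hash for foldLeft from the example above.

```python
import hashlib

# Assumption: names that are already in the codebase, mapped to their hash
# (or, for builtins, to a namespaced name). The values mirror the example above.
KNOWN = {"foldLeft": "#abcde1234", "(+)": "(##Nat.+)"}


def normalize(params: list[str], body: list[str], known: dict[str, str]) -> list[str]:
    """Replace argument names with positions and dependencies with their hashes."""
    positions = {name: f"$arg{i}" for i, name in enumerate(params, start=1)}
    normalized = []
    for token in body:
        if token in positions:
            normalized.append(positions[token])  # xs -> $arg1
        elif token in known:
            normalized.append(known[token])      # foldLeft -> #abcde1234
        else:
            normalized.append(token)             # constants stay as they are
    return normalized


def definition_hash(params: list[str], body: list[str], known: dict[str, str]) -> str:
    """Hash the normalized body; the function's own name never enters the hash."""
    text = " ".join(normalize(params, body, known))
    return hashlib.sha256(text.encode()).hexdigest()


sum_hash = definition_hash(["xs"], ["foldLeft", "(+)", "0", "xs"], KNOWN)
tally_hash = definition_hash(["nums"], ["foldLeft", "(+)", "0", "nums"], KNOWN)
print(sum_hash == tally_hash)  # True
```

Changing anything that matters, such as the constant 0, yields a different hash, while renaming the function or its arguments does not.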

No conflicting dependencies

Imagine that in a non-unison language we have libraries b, c and d like so:

    A
   / \
  b   c
   \ /
    d

Our program is A and it depends on b and c. They in turn depend on d. This is fine as long as b and c use the same version of d, or at least compatible ones. However, if that ceases to be the case we can no longer build our program with upgraded dependencies.

This is not a problem in unison. In unison d would evaluate to a hash, so b and c can each depend on the same d or on two different versions of it.
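A toy content-addressed store shows why two versions of d can coexist. This is a sketch under assumptions: the codebase is modelled as a plain dict, definitions are strings, and the one-line definition bodies are invented for the example; unison's real storage format is different.

```python
import hashlib

# The "codebase": every definition is stored under the hash of its content.
codebase: dict[str, str] = {}


def store(source: str) -> str:
    """Add a definition and return the hash that other definitions refer to."""
    key = "#" + hashlib.sha256(source.encode()).hexdigest()[:10]
    codebase[key] = source
    return key


# Two versions of d are simply two different entries.
d_old = store("d x = x + 1")
d_new = store("d x = x + 2")

# b and c refer to d by hash, so each keeps the exact version it was written against.
b = store(f"b x = {d_old} x")
c = store(f"c x = {d_new} x")
a = store(f"a x = {b} ({c} x)")

print(d_old != d_new)         # True
print(len(codebase))          # 5
```

Because references are hashes rather than names, upgrading d for c does not change what b points at.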

No builds

Code that exists in the codebase is already ready to run. The instant you pull a library into your codebase, its functions can be executed. This in turn enables self-deploying code, another (large) concept in unison that is unfortunately, as of writing this, just a theory and a promise.

Pluggable syntax

While it does not exist currently, the codebase could enable pluggable syntax. As stated initially, unison has tooling to translate its ASTs and function hashes to human readable text files and back again. Adding a new syntax simply means adding a different translation to and from text.



A second look at pony language

I last wrote about my initial look at pony. I have since played a bit more with the language and compiled a list of interesting quirks I learned about pony.

Division by 0

Pony won't crash if you divide by 0. Division by 0 is 0 :)

fun a(i: U64): U64 => 10 / i
env.out.print(a(0).string())                  //  0

But only if the zero is not known when compiling the function

fun b(): U64 => 10 / 0                        //  main.pony:2:24: constant divide or rem by zero

Sum types

The sum types of pony are a beauty, and honestly, short of Haskell, I think they are the most pleasant I have used so far in any language :) They are so powerful they even entirely replace the need for Option/Maybe types.

fun c(str: String): (U8 | None) =>
    match str
        | "one" => 1
        | "two" => 2
        else None
    end

env.out.print(Abc.c("one").string())        //  1
env.out.print(Abc.c("gazzillion").string()) //  None

Calling C

The C FFI of pony is fairly straightforward

// import library
use "lib:computersays"

// declare function from library above
// with the correct types
// pony will trust that you get this
// correct. in this case the function is
// called answer, its return type is USize
// and it takes no arguments
use @answer[USize]()

// now we can call the function
// and print its output
env.out.print(@answer().string())           //  42

Interface or Traits

Some languages support either traits/nominal subtyping or interfaces/structural subtyping. Pony supports both.

trait Named
  fun name(): String => "Bob"

class Bob is Named

env.out.print(Bob.name())                   //  "Bob"

Sorry, I couldn't think of a better example to illustrate the “structural” part :) Interfaces also accept the keyword is, which for interfaces with implemented functions is actually how you would say that a class implements the interface Named, without the “structural subtyping” part :)

interface Named
  fun name(): String => "Bob"

class Bob is Named

env.out.print(Bob.name())                   //  "Bob"
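To illustrate the difference the “structural” part makes, here is an analogy in Python rather than pony: an abstract base class behaves nominally (a class has to declare the relationship), while a typing.Protocol behaves structurally (having the right methods is enough). The class names NamedTrait, NamedInterface and Alice are made up for this sketch, and pony's compile-time rules are of course stricter than these runtime checks.

```python
from abc import ABC
from typing import Protocol, runtime_checkable


class NamedTrait(ABC):
    """Nominal: a class only counts as NamedTrait if it says so by inheriting."""

    def name(self) -> str:
        return "Bob"


@runtime_checkable
class NamedInterface(Protocol):
    """Structural: any class with a name() method counts, declared or not."""

    def name(self) -> str: ...


class Bob(NamedTrait):
    pass


class Alice:
    # Alice never mentions NamedTrait or NamedInterface.
    def name(self) -> str:
        return "Alice"


print(isinstance(Bob(), NamedTrait))        # True, declared explicitly
print(isinstance(Alice(), NamedTrait))      # False, no declaration
print(isinstance(Alice(), NamedInterface))  # True, the shape matches
```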

Behaviours

Actors can have behaviours, declared with be. These behaviours are how actors communicate. There is no “await” function like in BEAM languages, so a Main actor and a Counter actor might communicate like so:

// Range lives in the collections package
use "collections"

actor Counter
  var _count: U32

  new create() =>
    _count = 0

  // this is accessible from the outside because it is a `be`
  be increment() =>
    _count = _count + 1

  // a `be` can be passed an alias to an actor (here with the `tag` refcap)
  be get(main: Main) =>
    // we call Main's `be display`
    main.display(_count)

actor Main
  let _env: Env

  // Main is the entrypoint of an executable, Env is passed implicitly
  // as new create (the constructor) is called on execute
  new create(env: Env) =>
    _env = env

    var count: U32 = 10
    var counter = Counter

    for i in Range[U32](0, count) do
      counter.increment()
    end

    counter.get(this)

  be display(result: U32) =>
    _env.out.print(result.string())         //  "10"
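The message flow above can be mimicked in Python to show what "no await" means in practice: a send returns immediately, and the result comes back as another message. This is a loose sketch and not how the pony runtime works; the Actor base class with one thread and one mailbox per actor, and the send and stop helpers, are assumptions made for the example.

```python
import queue
import threading


class Actor:
    """A tiny actor: one mailbox, one thread, messages handled one at a time."""

    def __init__(self) -> None:
        self._mailbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, behaviour: str, *args) -> None:
        # Like calling a `be`: returns immediately and gives no result back.
        self._mailbox.put((behaviour, args))

    def stop(self) -> None:
        # The stop marker is queued behind all earlier messages.
        self._mailbox.put(None)
        self._thread.join()

    def _run(self) -> None:
        while (message := self._mailbox.get()) is not None:
            behaviour, args = message
            getattr(self, behaviour)(*args)


class Counter(Actor):
    def __init__(self) -> None:
        self._count = 0
        super().__init__()

    def increment(self) -> None:
        self._count += 1

    def get(self, reply_to: Actor) -> None:
        # No return value: the answer is sent as a new message.
        reply_to.send("display", self._count)


class Main(Actor):
    def __init__(self) -> None:
        self.displayed: list[int] = []
        super().__init__()

    def display(self, result: int) -> None:
        self.displayed.append(result)
        print(result)


def run() -> list[int]:
    main = Main()
    counter = Counter()
    for _ in range(10):
        counter.send("increment")
    counter.send("get", main)
    counter.stop()  # finishes the increments and the get first
    main.stop()     # finishes the display message first
    return main.displayed


run()  # prints 10
```

Messages from one sender to one receiver are handled in the order they were sent, which is why the get sees all ten increments.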