LEM: Lilush Editor, Modal

Overview

LEM is the built-in text editor for Lilush Shell. It provides Vi-inspired modal editing with syntax highlighting, multiple buffers, and a piece table buffer for efficient editing with unlimited undo.

LEM is also a tribute to Stanisław Lem, the Polish science fiction author.

Related docs:

Features

LEM runs inside the lilush process as a shell builtin, sharing the event loop, theme system, and loaded modules. It uses the alternate screen buffer and the Kitty keyboard protocol for full modifier and key-event support.

Key characteristics:

Invoking LEM

lem                      # open empty buffer
lem file.lua             # open file
lem file1.lua file2.c    # open multiple files (first is active)
lem +42 file.lua         # open file, jump to line 42

The +N syntax jumps to line N on startup.

LEM requires a terminal (refuses to run without a TTY). It registers as a shell builtin with fork = "never", so it runs in the current process without forking.
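
The argument forms above can be illustrated with a small parser. This is a sketch rather than LEM's actual startup code: `parse_args` is a hypothetical helper, and it assumes a single `+N` applies to the buffer that becomes active.

```lua
-- Hypothetical helper: split LEM-style arguments into file paths and a start line.
local function parse_args(args)
    local files, start_line = {}, nil
    for _, arg in ipairs(args) do
        local n = arg:match("^%+(%d+)$")   -- "+42" -> "42"
        if n then
            start_line = tonumber(n)
        else
            files[#files + 1] = arg
        end
    end
    return files, start_line
end
```

Called with `{ "+42", "file.lua" }` it returns one file and the start line 42; with no arguments it returns an empty list, which corresponds to opening an empty buffer.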

Modes

LEM has four modes. The active mode determines how keystrokes are interpreted.

Mode transitions

              ESC / operation complete
          +------------------------------+
          |                              |
          v          i/a/o/O/I/A         |
     +---------+ ------------------> +---------+
     | NORMAL  |                     | INSERT  |
     |         | <------------------ |         |
     +----+----+        ESC          +---------+
          |
          |  v/V
          |
          v
     +---------+
     | VISUAL  |
     | (char/  | ---- ESC ----> NORMAL
     |  line)  |
     +---------+

     : (from NORMAL)
          |
          v
     +---------+
     | COMMAND | ---- ESC/ENTER ----> NORMAL
     +---------+

Normal mode

The default mode. Keystrokes are interpreted as motions, operators, or commands. Supports the [count]operator[count]motion/text-object grammar. In operator-pending form, counts multiply, so 2d3w acts like d6w, and line-target motions follow the same rule (2d3G targets line 6).
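
The count rule can be illustrated with a short parser. This is only a sketch: `parse_operator_command` is a hypothetical helper that reads a whole key string at once, whereas the editor dispatches keys one at a time.

```lua
-- Hypothetical parser for "[count]operator[count]motion" key strings.
local function parse_operator_command(keys)
    local c1, op, c2, motion = keys:match("^(%d*)([dcy<>])(%d*)(.+)$")
    if not op then return nil end
    local n1 = c1 ~= "" and tonumber(c1) or nil
    local n2 = c2 ~= "" and tonumber(c2) or nil
    local count = nil
    if n1 or n2 then
        -- Both counts multiply; a missing count behaves like 1.
        count = (n1 or 1) * (n2 or 1)
    end
    return { operator = op, motion = motion, count = count }
end
```

With this sketch, `2d3w` and `d6w` both produce operator `d`, motion `w`, and count 6, while `d0` keeps `0` as the motion because a motion must follow the optional count.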

Operators

| Key | Description |
| --- | --- |
| d | Delete the motion/text-object range, yank to register |
| c | Delete + enter insert mode |
| y | Copy (yank) the range to register |
| > | Indent lines covered by motion |
| < | Dedent lines covered by motion |

Operators wait for a motion or text object. Doubling the operator key applies it to the current line (dd deletes line, yy yanks line, >> indents line).

Motions

All motions accept an optional count prefix: 3w moves three words forward, 5j moves five lines down.

| Key | Description |
| --- | --- |
| h / l | Cursor left / right |
| j / k | Cursor down / up (preserves target column) |
| w / b / e | Word forward / backward / end |
| W / B / E | WORD forward / backward / end (whitespace-delimited) |
| 0 | First column |
| ^ | First non-whitespace character |
| $ | End of line |
| gg | First line (or line N with count) |
| G | Last line (or line N with count) |
| { / } | Previous / next blank line (paragraph motion) |
| f{c} / F{c} | Find char forward / backward on line (count finds the Nth match) |
| t{c} / T{c} | Till char forward / backward on line (count targets the Nth match) |
| ; / , | Repeat / reverse last f/F/t/T (count repeats N times) |
| % | Matching bracket ()[]{} |
| Arrow keys | Cursor movement (also works in insert mode) |
| HOME / END | First non-blank / end of line |
| PAGE_UP / PAGE_DOWN | Scroll by screen height |
| CTRL+f / CTRL+b | Same as PAGE_DOWN / PAGE_UP |

Text objects

Used after an operator, prefixed with i (inner) or a (around):

| Object | Inner (i) | Around (a) |
| --- | --- | --- |
| w | Word characters | Word + surrounding space |
| W | WORD characters | WORD + surrounding space |
| " / ' / ` | Inside quotes | Including quotes |
| ( / ) / b | Inside parentheses | Including parens |
| [ / ] | Inside brackets | Including brackets |
| { / } / B | Inside braces | Including braces |
| < / > | Inside angle brackets | Including brackets |

Commands

| Key | Description |
| --- | --- |
| x | Delete character under cursor |
| X | Delete character before cursor |
| r{c} | Replace character under cursor with {c} |
| J | Join current line with next line |
| p / P | Paste after / before cursor |
| u | Undo |
| CTRL+r | Redo |
| . | Repeat last edit operation |
| ~ | Toggle case of character under cursor |
| * / # | Search forward / backward for word under cursor |
| n / N | Next / previous search match |
| m{a-z} | Set mark |
| `{a-z} | Jump to mark (exact position) |
| '{a-z} | Jump to mark (start of line) |
| ZZ | Save and quit |
| ZQ | Quit without saving |
| CTRL+p | Open file chooser |
| CTRL+SHIFT+p | Open file chooser and insert chosen file after cursor (like :r) |
| CTRL+n | Next buffer |

Insert mode entry

| Key | Description |
| --- | --- |
| i | Insert before cursor |
| I | Insert at first non-blank of line |
| a | Insert after cursor |
| A | Insert at end of line |
| o | Open new line below, enter insert |
| O | Open new line above, enter insert |

Insert mode

Keystrokes are inserted as text. Special keys:

| Key | Action |
| --- | --- |
| ESC | Return to normal mode, seal undo group |
| BACKSPACE / CTRL+h | Delete character before cursor |
| DELETE | Delete character under cursor |
| ENTER | Insert newline (with autoindent if enabled) |
| TAB | Insert configured indentation (spaces or tab) |
| CTRL+w | Delete word before cursor |
| CTRL+u | Delete to start of line |
| CTRL+n / CTRL+DOWN | Scroll completion candidates forward (the CALM ghost when it is active with more than one candidate, otherwise the primary ghost) |
| CTRL+p / CTRL+UP | Scroll completion candidates backward (the CALM ghost when it is active with more than one candidate, otherwise the primary ghost) |
| CTRL+space | Trigger CALM prose completion (secondary ghost) |
| CTRL+l | Accept active CALM ghost in full |
| CTRL+RIGHT | Accept next word from active CALM ghost |
| Arrow keys | Move cursor without leaving insert mode |

All typing within an insert session (from entering insert mode to pressing ESC) is grouped as a single undo entry.

Visual mode

Entered from normal mode with v (character-wise) or V (line-wise). Motions extend the selection from the anchor (entry point) to the cursor.

| Key | Action |
| --- | --- |
| Motions | Extend selection |
| o | Swap cursor and anchor |
| d / x | Delete selection |
| c | Change selection (delete + insert) |
| y | Yank selection |
| > / < | Indent / dedent selected lines |
| ~ | Toggle case of selection |
| v / V | Switch visual sub-mode or cancel |
| ESC | Cancel selection, return to normal |

Command mode

Entered by pressing :, /, or ? in normal mode. A command line appears at the bottom of the screen.

Ex commands

| Command | Description |
| --- | --- |
| :w [path] | Save (optionally to a different path) |
| :q | Quit (fails if unsaved changes) |
| :q! | Force quit, discard changes |
| :wq | Save and quit |
| :x | Save if modified, then quit |
| :e <path> | Open file in new buffer |
| :bn / :bp | Next / previous buffer |
| :bd | Close current buffer |
| :b <N\|name> | Switch to buffer by number or partial name match |
| :ls | List open buffers |
| :r <path> | Read file and insert contents below cursor |
| :<N> | Jump to line N |

Search and replace

| Command | Description |
| --- | --- |
| /<pattern> | Search forward (Lua pattern, not regex) |
| ?<pattern> | Search backward |
| :s/pat/rep/[flags] | Substitute on current line |
| :%s/pat/rep/[flags] | Substitute in entire file |
| :<range>s/pat/rep/[flags] | Substitute in line range |

Search and substitute patterns use Lua patterns. Flag g replaces all occurrences on each line (without it, only the first match per line is replaced).
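
The effect of the g flag maps naturally onto the optional replacement limit of Lua's `string.gsub`. The helper below is a hypothetical sketch (`substitute_line` is not a LEM function), and it assumes the replacement string is passed to `gsub` unchanged, so `%1`-style captures and `%%` keep their usual Lua meaning.

```lua
-- Hypothetical helper: apply one :s substitution to a single line.
local function substitute_line(line, pat, rep, flags)
    local replace_all = flags ~= nil and flags:find("g", 1, true) ~= nil
    local max_n = nil                 -- nil means "no limit" for string.gsub
    if not replace_all then max_n = 1 end
    local result, n = line:gsub(pat, rep, max_n)
    return result, n                  -- new text and number of replacements made
end
```

Because these are Lua patterns, characters such as `-`, `.`, and `%` are magic: matching a literal hyphen requires `%-`.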

Range syntax: N (line N), . (current line), $ (last line), % (entire file = 1,$), N,M (lines N through M).
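
A sketch of how these range forms could be resolved to a pair of line numbers follows. `resolve_range` is a hypothetical helper; treating an empty range as the current line and allowing `.` and `$` on either side of the comma are assumptions.

```lua
-- Hypothetical resolver: returns first and last line (1-based, inclusive),
-- or nil when the range cannot be parsed.
local function resolve_range(range, current_line, last_line)
    local function address(token)
        if token == "." then return current_line end
        if token == "$" then return last_line end
        return tonumber(token)
    end
    if range == nil or range == "" then return current_line, current_line end
    if range == "%" then return 1, last_line end
    local from, to = range:match("^([^,]+),([^,]+)$")
    if from then
        local a, b = address(from), address(to)
        if a and b then return a, b end
        return nil
    end
    local single = address(range)
    if single then return single, single end
    return nil
end
```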

Editor settings

| Command | Description |
| --- | --- |
| :set number / :set nonumber | Toggle line numbers |
| :set wrap / :set nowrap | Toggle line wrapping |
| :set tabstop=N | Tab display width |
| :set shiftwidth=N | Indentation width for > / < |
| :set expandtab / :set noexpandtab | Tabs as spaces or literal tabs |
| :set autoindent / :set noautoindent | Auto-indent new lines |
| :set scrolloff=N | Minimum lines above/below cursor |
| :set page_scroll_overlap=N | Lines of overlap kept when PAGE_DOWN/UP scrolls |
| :set filetype=X | Override filetype detection |
| :set | Show all current settings |

CALM prose completion

| Command | Description |
| --- | --- |
| :calm / :calm status | Show active CALM model, enable state, path |
| :calm on / :calm off | Enable / disable the CALM source |
| :calm model <name> | Switch to a registry-named model (session override) |
| :calm model auto | Clear override, return to mode-map mapping |

Piece Table

The piece table is the core data structure for text storage, implemented as a C module (lem.piece_table) for performance.

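A piece table represents a buffer as a sequence of pieces, each referencing a contiguous span in one of two backing buffers: the original buffer, which holds the file content as loaded and is never modified, and the add buffer, an append-only store for all inserted text.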

Editing operations modify only the piece list, never the backing data. This gives efficient insert/delete regardless of position and preserves the original file content until an explicit save.
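
The idea can be shown with a toy piece table in pure Lua. This is an illustrative sketch only: the real structure is the C module, and the names here (`new_piece_table`, the `original` and `add` fields) are hypothetical. Deletion, undo, and the line index are omitted.

```lua
-- Toy piece table: each piece references a span of one of two backing strings.
local PieceTable = {}
PieceTable.__index = PieceTable

local function new_piece_table(content)
    content = content or ""
    local self = setmetatable({ original = content, add = "", pieces = {} }, PieceTable)
    if #content > 0 then
        self.pieces[1] = { source = "original", start = 0, length = #content }
    end
    return self
end

-- Insert text at a 0-based byte offset by splitting the piece that contains it.
function PieceTable:insert(offset, text)
    if #text == 0 then return end
    local new_piece = { source = "add", start = #self.add, length = #text }
    self.add = self.add .. text            -- backing data is only ever appended to
    local pos = 0
    for i, piece in ipairs(self.pieces) do
        if offset <= pos + piece.length then
            local left = offset - pos
            local right = piece.length - left
            local parts = {}
            if left > 0 then
                parts[#parts + 1] = { source = piece.source, start = piece.start, length = left }
            end
            parts[#parts + 1] = new_piece
            if right > 0 then
                parts[#parts + 1] = { source = piece.source, start = piece.start + left, length = right }
            end
            table.remove(self.pieces, i)
            for j = #parts, 1, -1 do
                table.insert(self.pieces, i, parts[j])
            end
            return
        end
        pos = pos + piece.length
    end
    self.pieces[#self.pieces + 1] = new_piece   -- empty table or offset past the end
end

-- Assemble the full text by concatenating each piece's span.
function PieceTable:text()
    local out = {}
    for _, piece in ipairs(self.pieces) do
        local buf = self[piece.source]
        out[#out + 1] = buf:sub(piece.start + 1, piece.start + piece.length)
    end
    return table.concat(out)
end
```

After any number of inserts, the `original` string is still the content that was loaded; only the piece list and the tail of `add` have changed.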

Line index

The piece table maintains a line index -- an array of byte offsets, one per newline. The index is rebuilt on each edit. It provides O(1) line-to-offset lookup; offset-to-line translation is a binary search over the array, O(log n) in the number of lines.
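
A minimal sketch of such an index in Lua, under a few assumptions: it stores the 0-based offset at which each line starts (rather than the newline positions themselves), and it uses binary search for the reverse lookup. The function names are hypothetical and unrelated to the C module's internals.

```lua
-- Build an array of 0-based byte offsets at which each line starts.
local function build_line_index(text)
    local starts = { 0 }
    local pos = 1
    while true do
        local nl = text:find("\n", pos, true)
        if not nl then break end
        -- nl is the 1-based position of the newline, which equals the
        -- 0-based offset of the first byte of the following line.
        starts[#starts + 1] = nl
        pos = nl + 1
    end
    return starts
end

-- Line number (1-based) to byte offset: a plain array lookup.
local function line_offset(starts, line)
    return starts[line]
end

-- Byte offset to line number: binary search for the last start <= offset.
local function offset_to_line(starts, offset)
    local lo, hi = 1, #starts
    while lo < hi do
        local mid = math.floor((lo + hi + 1) / 2)
        if starts[mid] <= offset then lo = mid else hi = mid - 1 end
    end
    return lo
end
```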

Undo/redo

Undo is unlimited and linear (no branching history). Each edit pushes a snapshot to the undo stack. Redo is populated when undo is invoked; any new edit after undo clears the redo stack.

Edit grouping: Consecutive inserts in insert mode are grouped into a single undo entry. The group is sealed when the user leaves insert mode. This matches vi behavior: u undoes the entire insert session.
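
The grouping behaviour can be modelled with two stacks and an optional open group. This sketch records abstract edit descriptions rather than snapshots, and every name in it is hypothetical; it only mirrors the rules described above (linear history, redo cleared by a new edit, one undo entry per insert session).

```lua
local function new_history()
    return { undo = {}, redo = {}, open_group = nil }
end

-- Record one edit. Inside an open group, edits accumulate into that group.
local function record(history, edit)
    history.redo = {}                          -- a new edit clears the redo stack
    if history.open_group then
        local g = history.open_group
        g[#g + 1] = edit
    else
        history.undo[#history.undo + 1] = { edit }
    end
end

local function group_open(history)
    history.open_group = {}
end

-- Seal the group: everything recorded since group_open becomes one undo entry.
local function group_close(history)
    local g = history.open_group
    history.open_group = nil
    if g and #g > 0 then
        history.undo[#history.undo + 1] = g
    end
end

local function undo(history)
    local entry = table.remove(history.undo)
    if not entry then return false end
    history.redo[#history.redo + 1] = entry
    return true, entry
end

local function redo(history)
    local entry = table.remove(history.redo)
    if not entry then return false end
    history.undo[#history.undo + 1] = entry
    return true, entry
end
```

In this model, entering insert mode corresponds to `group_open` and pressing ESC to `group_close`, so a single `undo` call removes the whole insert session.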

Lua API

local pt = require("lem.piece_table")

local p = pt.new(content_string)  -- or pt.new() for empty buffer
p:insert(offset, text)            -- 0-based byte offset
p:delete(offset, length)
local text = p:get_text(offset, length)
local total = p:length()

-- Line index (1-based line numbers)
local count = p:line_count()
local off = p:line_offset(line)
local len = p:line_length(line)
local line = p:offset_to_line(offset)

-- Undo/redo
p:undo()   -- returns true/false
p:redo()
p:group_open()
p:group_close()

-- Full content for saving
local content = p:snapshot()

-- GC releases resources, or explicit:
p:close()

Buffers

A buffer wraps a piece table with file metadata and editor state (cursor position, scroll state, marks).

Buffer fields

| Field | Description |
| --- | --- |
| cfg.path | File path (nil for unnamed/scratch) |
| cfg.filetype | Detected filetype (for highlighting) |
| cfg.readonly | Prevent modifications |
| __cursor | Cursor position {line, col} (1-based) |
| __scroll | Viewport scroll state {top_line, left_col} |
| __marks | Named marks {name = {line, col}} |
| __version | Monotonic counter bumped on every edit |

Modification state is exposed through buf:is_modified(), which returns true whenever the buffer is not at the last-saved position in its undo history.

Filetype detection

Filetype is detected from the file extension, with a fallback to shebang inspection. Extensions with a bundled highlighter:

| Extensions | Filetype |
| --- | --- |
| .lua | lua |
| .c, .h | c |
| .md | markdown |
| .json | json |
| .sh, .bash | sh |
| .lsh | lsh |
| .py | python |

Filenames Makefile / GNUmakefile and Dockerfile (case-insensitive) are recognized as make and dockerfile respectively. Shebang lines (#!/usr/bin/env lua, #!/bin/sh, #!/usr/bin/env python, etc.) provide detection when the extension is absent; recognized interpreters include lua, luajit, python/python3, ruby, sh/bash/dash/zsh, node, and perl.
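
The detection order can be sketched as follows. `detect_filetype` is a hypothetical function, not the one in the LEM configuration module. The interpreter-to-filetype mapping is an assumption and covers only the interpreters whose filetype label appears in the extension table; returning an empty string for an undetected filetype is also an assumption.

```lua
local EXTENSIONS = {
    lua = "lua", c = "c", h = "c", md = "markdown", json = "json",
    sh = "sh", bash = "sh", lsh = "lsh", py = "python",
}

-- Assumed mapping from shebang interpreter to filetype label.
local INTERPRETERS = {
    lua = "lua", luajit = "lua",
    python = "python", python3 = "python",
    sh = "sh", bash = "sh", dash = "sh", zsh = "sh",
}

local function detect_filetype(path, first_line)
    local name = path and path:match("([^/]+)$") or ""
    local lower = name:lower()
    -- Special filenames are matched case-insensitively.
    if lower == "makefile" or lower == "gnumakefile" then return "make" end
    if lower == "dockerfile" then return "dockerfile" end
    -- Extension lookup comes first.
    local ext = name:match("%.([^.]+)$")
    if ext and EXTENSIONS[ext] then return EXTENSIONS[ext] end
    -- Fall back to the shebang line.
    local shebang = first_line and first_line:match("^#!%s*(.+)$")
    if shebang then
        local words = {}
        for w in shebang:gmatch("%S+") do words[#words + 1] = w end
        local prog = words[1] and words[1]:match("([^/]+)$")
        if prog == "env" then prog = words[2] end   -- "#!/usr/bin/env lua"
        if prog and INTERPRETERS[prog] then return INTERPRETERS[prog] end
    end
    return ""
end
```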

Additional extensions (.go, .rs, .js, .ts, .yaml/.yml, .toml, .rb, .html, .css, .xml, .conf, .ini) are detected and assigned a filetype label, but there is no bundled highlighter for them. Register one via highlight.register(filetype, module_name) to enable highlighting.

Multiple buffers

The buffer list manages open buffers with :e, :bn, :bp, :bd, :b, and :ls. CTRL+n cycles to the next buffer from normal mode. When the last buffer is closed, the editor exits.

Closing a modified buffer requires :bd! or saving first.

Registers

LEM has a minimal register system for yank/paste operations.

| Register | Description |
| --- | --- |
| "" | Unnamed -- default target for all yank/delete |
| "0 | Yank register -- last yanked text (not affected by deletes) |
| "_ | Black hole -- discard (delete without storing) |
| "+ | System clipboard via OSC 52 |

Access: "<register><operator> -- for example, "+y yanks to the system clipboard, "0p pastes the last yank.

Linewise vs. character-wise paste is determined by how the text was originally yanked or deleted.

Clipboard (OSC 52)

The "+ register uses OSC 52 terminal escapes for clipboard writes. This works over SSH sessions and in terminals that support OSC 52 (kitty, foot, alacritty, ghostty, etc.). Paste from the system clipboard is delivered via bracketed paste mode.
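
An OSC 52 clipboard write is an escape sequence carrying the base64-encoded text. The sketch below builds one in pure Lua. LEM itself uses b64_encode() from the crypto module; the encoder here is a stand-in so the example is self-contained, and the choice of the `c` (clipboard) selection and the BEL terminator are assumptions about what the editor emits.

```lua
local B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

-- Stand-in base64 encoder (3 input bytes -> 4 output characters).
local function b64_encode(data)
    local out = {}
    for i = 1, #data, 3 do
        local a, b, c = data:byte(i, i + 2)
        local n = a * 65536 + (b or 0) * 256 + (c or 0)
        local idx = {
            math.floor(n / 262144) % 64,
            math.floor(n / 4096) % 64,
            math.floor(n / 64) % 64,
            n % 64,
        }
        local chunk = ""
        for j = 1, 4 do
            chunk = chunk .. B64:sub(idx[j] + 1, idx[j] + 1)
        end
        if not c then chunk = chunk:sub(1, 3) .. "=" end    -- two input bytes
        if not b then chunk = chunk:sub(1, 2) .. "==" end   -- one input byte
        out[#out + 1] = chunk
    end
    return table.concat(out)
end

-- Build the escape sequence: ESC ] 52 ; c ; <base64> BEL
local function osc52_copy(text)
    return "\27]52;c;" .. b64_encode(text) .. "\7"
end
```

Writing the returned string to the terminal asks it to place the text on the system clipboard; because the request travels through the terminal, it also works from a remote shell.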

Syntax Highlighting

Syntax highlighting uses Lua-scriptable per-filetype highlighters. Each highlighter processes one line at a time and returns an array of styled spans.

Design principles

  1. Selective, not exhaustive. Most code stays in base text color. Highlighting is reserved for things that benefit from visual separation.

  2. Minimal palette. The default theme targets five semantic colors.

  3. Highlight the rare, leave the common. Constants and definitions get color. Variable usage and function calls stay in base text.

  4. Language keywords are scaffolding. if, for, local render in base text. The names and values beside them matter more.

  5. Comments deserve attention. Explanatory comments get a visible, distinct color -- not greyed out.

  6. Punctuation steps back. Brackets and delimiters are dimmed below base text.

  7. Red is for errors. No ordinary token type uses red.

Token types

| Token type | TSS path | What it covers |
| --- | --- | --- |
| literal | syntax.literal | Strings, numbers, booleans, nil/null |
| definition | syntax.definition | Names being defined (functions, variables) |
| comment | syntax.comment | Explanatory comments |
| disabled | syntax.disabled | Commented-out code, inactive preprocessor branches |
| punctuation | syntax.punctuation | Brackets, semicolons, commas, dots |
| operator | syntax.operator | Arithmetic, logical, comparison operators |
| directive | syntax.directive | Preprocessor directives, shebangs |
| error | syntax.error | Syntax errors, unmatched brackets |

Untagged text renders in the base text color. This is the default for variable references, function calls, and language keywords.

Multiline constructs

Constructs that span multiple lines (block comments, multiline strings, heredocs) are tracked via line state -- a value passed from each line to the next indicating whether a multiline construct is open.

Bundled highlighters

| Filetype | Module | Notes |
| --- | --- | --- |
| lua | lem.highlight.lua | Long brackets, block comments with nesting |
| c | lem.highlight.c | Block comments, preprocessor directives |
| markdown | lem.highlight.markdown | Fenced code blocks, headings, inline code |
| json | lem.highlight.json | Object keys vs. string values |
| sh | lem.highlight.sh | Heredocs, variable expansion |
| lsh | lem.highlight.lsh | Reuses sh highlighter |
| python | lem.highlight.python | Triple-quoted strings, decorators, string prefixes |

Highlighter interface

A highlighter module exports new() and returns a table with:

local new = function()
    return {
        -- Returns {from, to, token_type} spans (1-based byte positions)
        -- and the new line state for multiline tracking.
        highlight_line = function(self, text, prev_state)
            local spans, new_state = {}, prev_state
            -- ... tokenize `text`, append spans, update new_state ...
            return spans, new_state
        end,

        -- Reset state (e.g., on file reload).
        reset = function(self) end,
    }
end

return { new = new }

Register custom highlighters via highlight.register(filetype, module_name) before the editor starts.
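
As an illustration of the interface and of line-state tracking, here is a toy highlighter that tags only C-style block comments. It is a sketch under assumptions: spans are positional tables `{from, to, token_type}`, and the line state is the string `"block_comment"` or nil. Neither representation is confirmed by the bundled modules.

```lua
local new = function()
    return {
        highlight_line = function(self, text, prev_state)
            local spans = {}
            local pos = 1
            local in_comment = prev_state == "block_comment"
            while pos <= #text do
                local start = pos
                if not in_comment then
                    start = text:find("/*", pos, true)
                    if not start then break end
                end
                -- When a comment opens on this line, skip past "/*" so that
                -- the opener's own "*" cannot be read as part of "*/".
                local search_from = in_comment and start or start + 2
                local _, close_end = text:find("*/", search_from, true)
                local stop = close_end or #text
                spans[#spans + 1] = { start, stop, "comment" }
                in_comment = close_end == nil
                pos = stop + 1
            end
            -- A comment still open at end of line carries over to the next line.
            return spans, in_comment and "block_comment" or nil
        end,

        reset = function(self) end,
    }
end
```

Feeding the state returned for one line into the call for the next line is what lets a comment opened on line N color lines N+1 onward until the closing `*/`.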

Bracket Pair Matching

When the cursor is on a bracket character ((, ), [, ], {, }), LEM highlights both the bracket under the cursor and its matching counterpart. The highlight uses the text.bracket_match TSS path.

The matching uses the same depth-tracking algorithm as the % motion: nested brackets are handled correctly across lines. If no match exists (unmatched bracket), no highlight is shown.
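
A depth-tracking search of this kind can be sketched as below. `find_match` is a hypothetical function operating on an array of line strings; unlike a real editor it does not skip brackets inside strings or comments.

```lua
local OPEN_TO_CLOSE = { ["("] = ")", ["["] = "]", ["{"] = "}" }
local CLOSE_TO_OPEN = { [")"] = "(", ["]"] = "[", ["}"] = "{" }

-- lines: array of strings; line and col are 1-based (col is a byte position).
-- Returns the line and column of the matching bracket, or nil.
local function find_match(lines, line, col)
    local ch = lines[line]:sub(col, col)
    local target, step = OPEN_TO_CLOSE[ch], 1      -- opening bracket: scan forward
    if not target then
        target, step = CLOSE_TO_OPEN[ch], -1       -- closing bracket: scan backward
    end
    if not target then return nil end
    local depth = 0
    local l, c = line, col
    while lines[l] do
        local text = lines[l]
        while c >= 1 and c <= #text do
            local cur = text:sub(c, c)
            if cur == ch then
                depth = depth + 1                  -- same bracket again: nest deeper
            elseif cur == target then
                depth = depth - 1
                if depth == 0 then return l, c end
            end
            c = c + step
        end
        l = l + step
        if lines[l] then
            c = step == 1 and 1 or #lines[l]       -- continue on the adjacent line
        end
    end
    return nil
end
```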

The highlight is computed on every render and appears in all modes.

Inline Diagnostics

LEM provides real-time syntax validation for JSON and Lua files. When a syntax error is detected, the editor shows three indicators:

  1. Gutter marker: The line number on the error line is rendered with the gutter.diagnostic_error style (bold red by default).

  2. Inline highlight: The character at the error position is styled with text.diagnostic (underlined red by default).

  3. Message bar: When the cursor is on the error line, the error message is shown in the message area at the bottom of the screen.

Supported filetypes

| Filetype | Validation method | Error granularity |
| --- | --- | --- |
| json | cjson.safe.decode() | Line and column |
| lua | loadstring() (compile only) | Line |

Only the first error is reported (single-error model).

Validation runs only when the buffer content changes (tracked via an internal version counter), not on cursor-only movements.
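
The Lua half of this can be sketched with the standard compile function and a version check. The names `validate_lua` and `new_validator` are hypothetical, and the shape of the returned diagnostic table is an assumption; only the use of a compile-only load and of a version counter comes from the description above.

```lua
-- loadstring exists in Lua 5.1 / LuaJIT; load accepts strings in 5.2+.
local compile = loadstring or load

-- Compile without running. Returns nil when the source is valid,
-- otherwise a table with the error line and message.
local function validate_lua(source)
    local chunk, err = compile(source, "=buffer")
    if chunk then return nil end
    -- With the chunk name "=buffer", messages look like "buffer:LINE: text".
    local line, message = err:match("^buffer:(%d+):%s*(.*)$")
    return { line = tonumber(line) or 1, message = message or err }
end

-- Re-validate only when the buffer version has changed.
local function new_validator()
    local last_version, last_result
    return function(source, version)
        if version ~= last_version then
            last_result = validate_lua(source)
            last_version = version
        end
        return last_result
    end
end
```

Since the compiler stops at the first syntax error, this approach naturally yields a single diagnostic with line granularity.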

Completion

LEM shows completion suggestions as dim ghost text after the cursor while typing in insert mode. There is no popup list -- completions are end-of-line only.

Triggering

A session auto-triggers on every insert-mode edit whenever the cursor is at end-of-line and the prefix immediately before the cursor is at least two characters long. Typing more characters narrows the active session; deleting characters or moving the cursor refreshes it. Leaving insert mode, moving the cursor away from end-of-line, or reducing the prefix below the threshold dismisses the session.
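
The trigger condition can be sketched as a small predicate. `completion_prefix` is a hypothetical helper, and the set of characters counted as part of the prefix (letters, digits, underscore, dot) is an assumption.

```lua
local MIN_PREFIX = 2

-- line: text of the current line; col: 1-based cursor position, where
-- col == #line + 1 means the cursor sits at end-of-line.
-- Returns the prefix to complete, or nil when no session should be active.
local function completion_prefix(line, col)
    if col ~= #line + 1 then return nil end
    local prefix = line:match("([%w_%.]+)$")
    if not prefix or #prefix < MIN_PREFIX then return nil end
    return prefix
end
```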

When the prefix and cursor position are unchanged between refreshes, the currently chosen candidate is preserved so scrolling doesn't reset as you keep typing the same word.

Sources by filetype

For .lua buffers the session merges three sources in this order:

  1. Lua keywords (and, break, do, ...)

  2. Lua symbols from an enriched global table: every name visible in _G plus the modules configured in cfg.lua_preloads (by default std, crypto, dns, term, lev, mneme, http, json, wg; the json binding loads cjson.safe and wg loads wireguard). Dotted access like std.fs.r walks the table path and lists matching members. Add extra modules by overriding lua_preloads in your LEM config.

  3. Unique identifiers harvested from all open buffers (active buffer first).
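
The dotted-path walk used by the symbol source can be sketched like this. `complete_symbol` is a hypothetical function, and the member names in the usage note are invented for illustration; they are not a claim about the contents of the std module.

```lua
-- env: the table to search (for example an enriched copy of _G).
-- prefix: text before the cursor, such as "std.fs.r" or "st".
-- Returns a sorted list of member names that start with the last segment.
local function complete_symbol(env, prefix)
    local scope = env
    local path, last = prefix:match("^(.*)%.([^.]*)$")
    if path then
        -- Walk every segment before the final dot.
        for name in path:gmatch("[^.]+") do
            scope = type(scope) == "table" and scope[name] or nil
            if scope == nil then return {} end
        end
    else
        last = prefix
    end
    if type(scope) ~= "table" then return {} end
    local matches = {}
    for key in pairs(scope) do
        if type(key) == "string" and key:sub(1, #last) == last then
            matches[#matches + 1] = key
        end
    end
    table.sort(matches)
    return matches
end
```

Given a hypothetical environment `{ std = { fs = { read_file = ..., remove = ..., write = ... } } }`, the prefix `std.fs.r` walks `std` and then `fs` and lists `read_file` and `remove`.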

For every other filetype only the buffer-word source runs.

The buffer-word source scans each buffer's piece table once per __version change and caches the token set, so repeated completions on an unchanged buffer do not rescan.
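
A version-keyed cache of this kind might look as follows. The buffer shape used here (`id`, `version`, and `text` fields) is hypothetical and stands in for the real buffer object and its piece table; the identifier pattern is also an assumption.

```lua
-- Returns a function that yields the unique words of a buffer, rescanning
-- only when the buffer's version differs from the cached one.
local function new_word_cache()
    local cache = {}   -- buffer id -> { version = n, words = { ... } }
    return function(buf)
        local entry = cache[buf.id]
        if entry and entry.version == buf.version then
            return entry.words, false          -- cache hit, no rescan
        end
        local seen, words = {}, {}
        for word in buf.text:gmatch("[%a_][%w_]*") do
            if not seen[word] then
                seen[word] = true
                words[#words + 1] = word
            end
        end
        cache[buf.id] = { version = buf.version, words = words }
        return words, true                     -- scanned this time
    end
end
```

The second return value exists only to make the caching visible in this sketch; a real source would simply return the word list.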

Keys

| Key | Action |
| --- | --- |
| CTRL+n / CTRL+DOWN | Next candidate |
| CTRL+p / CTRL+UP | Previous candidate |
| TAB | Accept the ghost |

TAB falls through to its normal indent behavior when no session is active. Any other key ends the session implicitly by changing the prefix or cursor position; the ghost disappears on the next refresh.

TSS

Ghost text uses the single completion TSS path from the lem theme section (gray italic by default). The secondary (CALM) ghost uses completion_secondary (accent-color italic by default).

CALM Prose Completion

For prose filetypes (markdown, text, and unnamed buffers), LEM can draw a second ghost fed by a CALM language model. The CALM ghost is secondary: it coexists with the regular buffer-word / Lua ghost and never replaces it. Both ghosts render simultaneously -- the primary at the cursor, the CALM ghost immediately after.

Triggering

CALM is manual-only: CTRL+space in insert mode runs inference and installs the secondary ghost. There is no keystroke-by-keystroke auto-trigger -- prose writing has natural pauses, and running the model on every keypress would stall the editor.

Any subsequent edit or cursor move dismisses the secondary ghost (its anchor becomes stale); the user can re-trigger at the new position.

Filetype gating

CALM activates only for filetypes in cfg.calm_filetypes. The default is {markdown, text, ""} (markdown, plain text, and buffers with no detected filetype). Override programmatically before startup:

local lem_cfg = require("lem.config").new({
    calm_filetypes = { markdown = true, text = true },
})

Model resolution

The source resolves its model in this order:

  1. LILUSH_LEM_MODEL environment variable (absolute path override)

  2. calm.registry lookup for mode_map["lem"]

  3. Registry default (same fallback the shell uses)

To use a dedicated prose model, add an entry to ~/.config/lilush/calm/registry.json:

{ "default": "shell",
  "mode_map": { "lem": "prose" } }

:calm model <name> installs a runtime override; :calm model auto clears it and returns to the mode-map mapping.
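
The three-step resolution order can be written down as a short function. This is a sketch with hypothetical names: `registry` stands for registry.json decoded into a Lua table, and `getenv` is injected so the logic can be exercised without changing the process environment. The runtime override installed by :calm model is left out because its position in the order is not specified here.

```lua
-- Returns a table describing where the model comes from:
-- kind = "path" for an absolute path, kind = "name" for a registry name.
local function resolve_model(registry, getenv)
    getenv = getenv or os.getenv
    local path = getenv("LILUSH_LEM_MODEL")
    if path and path ~= "" then
        return { kind = "path", value = path }          -- 1. environment override
    end
    local mapped = registry.mode_map and registry.mode_map["lem"]
    if mapped then
        return { kind = "name", value = mapped }        -- 2. mode_map["lem"]
    end
    return { kind = "name", value = registry.default }  -- 3. registry default
end
```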

Context construction

The source builds a prose-domain sequence directly, without frame tokens:

<BOS> <ATN> [lookback text as byte tokens...]

Lookback is the text from the start of the buffer up to the cursor, capped at the model's l_max budget (minus BOS, ATN, and generation headroom). The trim walks backward to the nearest paragraph boundary (double newline), then sentence boundary (.!? + space/newline), then word boundary -- never splits mid-word.
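
One way to realize this trimming is sketched below. It is an interpretation, not the source's code: the budget is counted in bytes (one byte token per byte), the function keeps the longest suffix that fits and starts at the preferred boundary type, and it returns an empty string if nothing fits without splitting a word. `trim_lookback` and `BOUNDARIES` are hypothetical names.

```lua
-- Boundary kinds in order of preference. `back` is the minimum match length,
-- used to catch a boundary that ends exactly where the window begins.
local BOUNDARIES = {
    { pattern = "\n\n+", back = 2 },          -- paragraph break
    { pattern = "[%.!?][ \n]+", back = 2 },   -- sentence end
    { pattern = "%s+", back = 1 },            -- word gap
}

local function trim_lookback(text, budget)
    if #text <= budget then return text end
    -- 1-based position where the last `budget` bytes begin.
    local window_start = #text - budget + 1
    for _, b in ipairs(BOUNDARIES) do
        local from = math.max(1, window_start - b.back)
        local _, e = text:find(b.pattern, from)
        -- Cut just after the first boundary found inside the window,
        -- unless that would leave nothing.
        if e and e < #text then
            return text:sub(e + 1)
        end
    end
    return ""
end
```

The result always begins at a paragraph, sentence, or word start and never exceeds the budget, so the model does not see a context that opens in the middle of a word.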

Sampler parameters come from the CWGT metadata (sampler_defaults) if present, with fallbacks: temperature=0.8, top_k=8, max_tokens=40, num_candidates=1.

Accepting

| Key | Effect |
| --- | --- |
| CTRL+l | Accept full ghost text, advance cursor to its end |
| CTRL+RIGHT | Accept leading whitespace + next word + trailing space; remainder stays as ghost, cursor advances |
| CTRL+n / CTRL+DOWN | Next candidate (when num_candidates > 1) |
| CTRL+p / CTRL+UP | Previous candidate (when num_candidates > 1) |
| Any other insert-mode key | Dismiss the ghost |

When the secondary ghost has only one candidate, CTRL+n/CTRL+p fall through to scrolling the primary ghost as usual.

Accepting a CALM ghost also dismisses any active primary session so the next keystroke refreshes cleanly.
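
The word-by-word acceptance performed by CTRL+RIGHT amounts to splitting the ghost text at the first word. `split_next_word` is a hypothetical helper showing that split; treating a whitespace-only ghost as accepted in full is an assumption.

```lua
-- Returns the part to insert (leading whitespace + next word + one trailing
-- space, if present) and the remainder that stays visible as ghost text.
local function split_next_word(ghost)
    local head = ghost:match("^%s*%S+ ?")
    if not head then
        return ghost, ""      -- empty or whitespace-only ghost
    end
    return head, ghost:sub(#head + 1)
end
```

Repeated calls on the remainder consume the ghost one word at a time until nothing is left.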

Hot-reload

The source tracks the model file's mtime. Overwriting the .cwgt on disk causes the next trigger to reload the weights without restarting LEM -- useful during training-and-test cycles.

Rendering

All rendering goes through TSS. LEM obtains its stylesheet via theme.subscribe("shell", "lem"), which returns a lazy getter that refreshes when the theme changes.

Screen layout

+-----+----------------------------------------+
| gut | text area                              | row 1
| ter |                                        |
|     |                                        |
|     |                                        |
+-----+----------------------------------------+
| status line                                  | row H-1
+----------------------------------------------+
| command line / message area                  | row H
+----------------------------------------------+

TSS paths

Gutter

| Path | Purpose |
| --- | --- |
| gutter.line_number | Non-current line numbers |
| gutter.current_line | Current line number |
| gutter.diagnostic_error | Line number on a line with a diagnostic error |

Status line

| Path | Purpose |
| --- | --- |
| status | Status line background |
| status.filename | File name |
| status.modified | Modified indicator [+] |
| status.position | Line:col display |
| status.filetype | Filetype label |
| status.mode | Mode indicator (normal) |
| status.mode_insert | Mode indicator (insert) |
| status.mode_visual | Mode indicator (visual) |

Messages

| Path | Purpose |
| --- | --- |
| message.info | Informational messages |
| message.warning | Warning messages |
| message.error | Error messages |

Text area

| Path | Purpose |
| --- | --- |
| text | Base text color |
| text.selection | Visual mode selection overlay |
| text.bracket_match | Matching bracket highlight |
| text.diagnostic | Inline diagnostic error highlight |
| completion | Primary ghost text (buffer-word / Lua) |
| completion_secondary | CALM prose ghost text |
| syntax.* | Syntax token styles (see Token types) |

Configuration

Defaults

| Setting | Default | Description |
| --- | --- | --- |
| number | true | Show line numbers |
| relativenumber | true | Show relative line numbers (hybrid with number) |
| wrap | false | Soft-wrap long lines |
| tabstop | 4 | Tab display width |
| shiftwidth | 4 | Indentation width for > / < |
| expandtab | true | Insert spaces instead of tab characters |
| autoindent | true | Copy indentation from current line on ENTER |
| scrolloff | 5 | Minimum visible lines above/below cursor |
| page_scroll_overlap | 2 | Lines of overlap kept when PAGE_DOWN/UP scrolls |
| lua_preloads | table of 9 modules | Modules merged into the Lua completion environment (see Completion) |
| calm_filetypes | {markdown, text, ""} | Filetypes eligible for CALM prose completion |

Scalar settings can be changed at runtime with :set. Table-valued settings like lua_preloads are set programmatically by overriding config.new(overrides) before starting the editor.

Architecture

Files

| Path | Role |
| --- | --- |
| src/lem/lem_piece_table.c | C piece table implementation |
| src/lem/lem_piece_table.h | C piece table header |
| src/lem/lem.lua | Editor core: startup, main loop, key dispatch |
| src/lem/lem/buffer.lua | Buffer object: wraps piece table + metadata |
| src/lem/lem/buffer_list.lua | Multiple buffer management |
| src/lem/lem/mode.lua | Mode machine: normal, insert, visual, command |
| src/lem/lem/keymap.lua | Key binding registry and dispatch |
| src/lem/lem/motion.lua | Motion definitions and cursor math |
| src/lem/lem/text_object.lua | Text object definitions |
| src/lem/lem/operator.lua | Operator definitions and register system |
| src/lem/lem/command.lua | Ex-mode command parser |
| src/lem/lem/viewport.lua | Viewport: scroll state, cursor positioning |
| src/lem/lem/render.lua | Screen rendering via TSS |
| src/lem/lem/diagnostic.lua | Inline diagnostic validation engine |
| src/lem/lem/highlight.lua | Syntax highlighting engine |
| src/lem/lem/highlight/ | Per-filetype highlighter modules |
| src/lem/lem/completion.lua | Ghost-text completion controller (primary + secondary) |
| src/lem/lem/completion/buffer_words.lua | Buffer-word completion source |
| src/lem/lem/completion/calm.lua | CALM prose completion source |
| src/lem/lem/config.lua | Configuration defaults and filetype detection |
| src/shell/shell/builtins/lem.lua | Shell builtin entry point |

Dependencies

| Module | Purpose |
| --- | --- |
| term | Terminal I/O: alt screen, keyboard input, cursor, output |
| term.tss | Styling engine for all visual output |
| term.widgets | file_chooser() for CTRL+p |
| theme | Theme lazy getter via theme.subscribe() |
| std.fs | File I/O |
| std.utf | UTF-8 text operations |
| std.txt | Text utilities, binary detection |
| crypto | b64_encode() for OSC 52 clipboard |
| shell.completion.source.lua_keywords | Lua keyword list for completion |
| shell.completion.source.lua_symbols | _G walker reused for Lua completion |

Planned Features

The following features are designed but not yet implemented: