Ozan Sazak | sazak.io
sazak.io
home blog whoami

gls: File Manager on Terminal with Go

This week my last term in college is finished and my brain started to come up with creative ides, so I started working on one of the fancy ones: developing a file manager with GUI that I can use in terminal, with Go.

Before diving into my design and development process, here’s a little backstory: I use a 2019 model MacBook Pro storage as my development machine. Since it only has 256 GB storage, I frequently run out of space and try to find which files take up most of the space and which of them to delete. Here I have a few options and the one I do is basically using du command to find out the biggest files on terminal:

$ du -shc ~/* | sort -rh

This basically works out, but I find myself consistently bragging about du not having an interactive interface that also lists the filesystem in tree format with file/folder sizes.

And that’s where I said out loud: I should do this!

The initial design

After deciding to develop a program for this idea, I thought of the main features this program should provide for my daily use:

Many other features could be added, but these are the trivial ones :) I also decided to name my program gls, a mimicking “graphical ls”.

Initial UI design of gls that I've drawn on iPad

Walking down the filesystem

Before diving deep into coding the UI, we first need to build a virtual filesystem representation of the original folder structure with the path we’re currently interested in. This representation will make us able to show the real folder structure in the UI in tree view. It will also help us perform specific file- and folder-related operations, such as copying and removing, by using the virtual representation to get info of specific files and folders instead of using system calls to get the same information each time.

Speaking of which, most of the operating systems hold the whole filesystem structure as a one-rooted tree data structure, where all folders and files are child nodes of the root node or another node with a higher-level parent node. This type of data structure is also called “directed acyclic graph”.

An example file tree in UNIX-like filesystem format

Extra information: in Linux and UNIX-like systems, every node in the filesystem tree is a file1. Even directories, drivers (files under /dev), terminals (eg. /dev/tty), etc. are files with special properties.

Coming back to our virtual filesystem representation, it will be sufficient to construct a basic tree data structure that you’ve probably coded from scratch in a DS course:

type Node struct {
    Name             string
    Mode             os.FileMode
    Size             int64
    SizeOnDisk       int64
    IsDir            bool
    LastModification time.Time
    Children         []*Node
    Parent           *Node
}

You may be asking “Why Size is a signed integer while it’s used for the size of an entity which can’t be negative?”. Well, the answer is os.FileInfo’s size property is also int64 and that’s also probably because of why len() returns a signed integer, to prevent integer underflows in arithmetic calculations2. Also, storing the size in int64 makes us able to properly store the size of files/folders up to 16777216 TB, so it’s actually not a downside to use signed integers here.

The reason that we have both Size and SizeOnDisk is that, Size shows the actual number of bytes a specific file consists of. But as the operating systems allocates a number of blocks on the storage device for each file, the real size on the disk is a multiple of block_size3 (Even if the file content is a single byte, the size of the storage area allocated for the file will be block_size bytes4). So we calculate SizeOnDisk by block_size * number_of_blocks.

Now it’s time to walk on the file tree. To do this, we need to use a tree traversal algorithm5, which are pretty straightforward. I will explain the traversal basics in the next subtitle, so you can pass it if it’s too basic for you.

Primer on tree traversals

The most trivial tree traversal algorithm is depth-first search, where we walk down the tree from a node as deep as possible before hopping to its siblings. DFS can be used as pre-order, post-order and in-order. The only difference is when do we process the root (or current) node, other than that, what DFS does is iteratively applying the same traversal to the child subtrees of the specific node.

In pre-order traversal, we visit the root node and then the children one by one (root-Left-Right for example if we have a binary tree). In post-order, we visit the children one by one and then the root node eventually (e.g L-R-root). Likewise in in-order traversal we visit the root node in between visiting the child nodes (e.g L-root-R).

Node visiting order visualization in the same binary tree using post-order (left) and pre-order (right) algorithms

Coding the traversal function

For the sake of implementation simplicity, I decided to use pre-order traversal to walk in the file tree and construct the *Node tree on our side.

func Walk(path string) (*Node, error) {
    f, err := os.Lstat(path)
    if err != nil {
        log.Warningf("%s: %v", path, err)
        return nil, nil
    }
    st, ok := f.Sys().(*syscall.Stat_t)
    if !ok {
        return nil, fmt.Errorf("could not cast %T to syscall.Stat_t", f.Sys())
    }
    root := &Node{
        Name:             f.Name(),
        Mode:             f.Mode(),
        Size:             st.Size,
        SizeOnDisk:       st.Blocks * 512,
        IsDir:            f.IsDir(),
        LastModification: f.ModTime(),
    }
    if root.IsDir {
        names, err := readDirNames(path)
        if err != nil {
            log.Warningf("%s: %v", path, err)
            return root, nil
        }
        for _, name := range names {
            child, err := Walk(path + "/" + name)
            if err != nil {
                return nil, err
            }
            child.Parent = root
            root.Size += child.Size
            root.SizeOnDisk += child.SizeOnDisk
            root.Children = append(root.Children, child)
        }
    }
    return root, nil
}

We call os.Lstat for each node in the file tree to get the file/directory information. The os.Lstat function uses the lstat6 system call internally, which returns struct stat structure. Then, os.Lstat uses struct stat to return os.FileInfo, which is an easily usable variant of struct stat.

After getting the file information, we construct the root node of type *Node, and then call Walk for each of its children if root is a directory, and lastly update the child nodes’ root, the root’s size and so on. Children of the root are found by the readDirNames function, which basically reads the directories under the specified path and sorts them by their name7. The SizeOnDisk calculation is done by using Stat_t.Blocks. The Stat_t structure is the Go equivalent of Linux kernel’s struct stat, and Stat_t.Blocks is the number of 512-byte blocks that the file uses.

Please note that this code we’ve written will only work in Linux and UNIX-like systems!

We can now call Walk and test it on a real directory:

import (
    "flag"
    "fmt"
    "os"
    "sort"
    "strings"
    "syscall" 
    "time"

    "github.com/ozansz/gls/log"
)

type Node struct {
    Name             string
    Mode             os.FileMode
    Size             int64
    SizeOnDisk       int64
    IsDir            bool
    LastModification time.Time
    Children         []*Node
    Parent           *Node
}

func (n *Node) Print() {
    n.printWithLevel(0)
}

func (n *Node) printWithLevel(level int) {
    fmt.Printf("%s%s [%s] [%d]\n", strings.Repeat("  ", level),
        n.Name, n.Mode.String(), n.SizeOnDisk)
    for _, child := range n.Children {
        child.printWithLevel(level + 1)
    }
}

var (
    path = flag.String("path", "", "path to walk")
)

func main() {
    flag.Parse()
    if *path == "" {
        flag.Usage()
        return
    }

}

func Walk(path string) (*Node, error) {
    f, err := os.Lstat(path)
    if err != nil {
        log.Warningf("%s: %v", path, err)
        return nil, nil
    }
    st, ok := f.Sys().(*syscall.Stat_t)
    if !ok {
        return nil, fmt.Errorf("could not cast %T to syscall.Stat_t", f.Sys())
    }
    root := &Node{
        Name:             f.Name(),
        Mode:             f.Mode(),
        Size:             st.Size,
        SizeOnDisk:       st.Blocks * 512,
        IsDir:            f.IsDir(),
        LastModification: f.ModTime(),
    }
    if root.IsDir {
        names, err := readDirNames(path)
        if err != nil {
            log.Warningf("%s: %v", path, err)
            return root, nil
        }
        for _, name := range names {
            child, err := Walk(path + "/" + name)
            if err != nil {
                return nil, err
            }
            child.Parent = root
            root.Size += child.Size
            root.SizeOnDisk += child.SizeOnDisk
            root.Children = append(root.Children, child)
        }
    }
    return root, nil
}

// Taken directly from the Go stdlib
func readDirNames(dirname string) ([]string, error) {
    f, err := os.Open(dirname)
    if err != nil {
        return nil, err
    }
    names, err := f.Readdirnames(-1)
    f.Close()
    if err != nil {
        return nil, err
    }
    sort.Strings(names)
    return names, nil
}

When we save this code as main.go and run it by specifying a path,

$ go run main.go --path ~/root-folder

…and assume that the file tree is something like that:

Example file tree for the Walk function test

…we will see an output like that:

root-folder [drwxr-xr-x] [10010624]
  subfolder [drwxr-xr-x] [10002432]
    file_3.mp4 [-rw-r--r--] [9998336]
  file_1.c [-rw-rw-r] [8192]
  file_2.sh [-rw-r---] [4096]

  1. https://en.wikipedia.org/wiki/Unix_filesystem ↩︎

  2. https://stackoverflow.com/questions/39088945/why-does-len-returned-a-signed-value ↩︎

  3. https://wiki.linuxquestions.org/wiki/Block_devices_and_block_sizes ↩︎

  4. https://tldp.org/LDP/sag/html/filesystems.html ↩︎

  5. https://en.wikipedia.org/wiki/Tree_traversal ↩︎

  6. https://man7.org/linux/man-pages/man2/lstat.2.html ↩︎

  7. https://cs.opensource.google/go/go/+/refs/tags/go1.18.3:src/path/filepath/path.go;l=543 ↩︎