gls: File Manager on Terminal with Go
This week my last term in college is finished and my brain started to come up with creative ides, so I started working on one of the fancy ones: developing a file manager with GUI that I can use in terminal, with Go.
Before diving into my design and development process, here’s a little backstory: I use a 2019 model MacBook Pro storage as my development machine. Since it only has 256 GB storage, I frequently run out of space and try to find which files take up most of the space and which of them to delete. Here I have a few options and the one I do is basically using du
command to find out the biggest files on terminal:
1$ du -shc ~/* | sort -rh
This basically works out, but I find myself consistently bragging about du
not having an interactive interface that also lists the filesystem in tree format with file/folder sizes.
And that’s where I said out loud: I should do this!
The initial design
After deciding to develop a program for this idea, I thought of the main features this program should provide for my daily use:
- List files and folders under a specified path in tree view
- Show total size of the file/folder in KB, MB, GB, etc.
- Show file type (MIME and file extension) as the
file
command does - Show file/folder permissions
- Optionally show last modification or creation time of files
- Search files by name with regular expressions
- Create (
touch
-like), move, copy, remove files - Open files with the default or specified program
Many other features could be added, but these are the trivial ones :) I also decided to name my program gls
, a mimicking “graphical ls”
.
Walking down the filesystem
Before diving deep into coding the UI, we first need to build a virtual filesystem representation of the original folder structure with the path we’re currently interested in. This representation will make us able to show the real folder structure in the UI in tree view. It will also help us perform specific file- and folder-related operations, such as copying and removing, by using the virtual representation to get info of specific files and folders instead of using system calls to get the same information each time.
Speaking of which, most of the operating systems hold the whole filesystem structure as a one-rooted tree data structure, where all folders and files are child nodes of the root node or another node with a higher-level parent node. This type of data structure is also called “directed acyclic graph”.
Extra information: in Linux and UNIX-like systems, every node in the filesystem tree is a file1. Even directories, drivers (files under
/dev
), terminals (eg./dev/tty
), etc. are files with special properties.
Coming back to our virtual filesystem representation, it will be sufficient to construct a basic tree data structure that you’ve probably coded from scratch in a DS course:
1type Node struct {
2 Name string
3 Mode os.FileMode
4 Size int64
5 SizeOnDisk int64
6 IsDir bool
7 LastModification time.Time
8 Children []*Node
9 Parent *Node
10}
You may be asking “Why
Size
is a signed integer while it’s used for the size of an entity which can’t be negative?”. Well, the answer isos.FileInfo
’s size property is alsoint64
and that’s also probably because of whylen()
returns a signed integer, to prevent integer underflows in arithmetic calculations2. Also, storing the size inint64
makes us able to properly store the size of files/folders up to 16777216 TB, so it’s actually not a downside to use signed integers here.
The reason that we have both Size
and SizeOnDisk
is that, Size
shows the actual number of bytes a specific file consists of. But as the operating systems allocates a number of blocks on the storage device for each file, the real size on the disk is a multiple of block_size
3 (Even if the file content is a single byte, the size of the storage area allocated for the file will be block_size
bytes4). So we calculate SizeOnDisk
by block_size * number_of_blocks
.
Now it’s time to walk on the file tree. To do this, we need to use a tree traversal algorithm5, which are pretty straightforward. I will explain the traversal basics in the next subtitle, so you can pass it if it’s too basic for you.
Primer on tree traversals
The most trivial tree traversal algorithm is depth-first search, where we walk down the tree from a node as deep as possible before hopping to its siblings. DFS can be used as pre-order, post-order and in-order. The only difference is when do we process the root (or current) node, other than that, what DFS does is iteratively applying the same traversal to the child subtrees of the specific node.
In pre-order traversal, we visit the root node and then the children one by one (root-Left-Right for example if we have a binary tree). In post-order, we visit the children one by one and then the root node eventually (e.g L-R-root). Likewise in in-order traversal we visit the root node in between visiting the child nodes (e.g L-root-R).
Coding the traversal function
For the sake of implementation simplicity, I decided to use pre-order traversal to walk in the file tree and construct the *Node
tree on our side.
1func Walk(path string) (*Node, error) {
2 f, err := os.Lstat(path)
3 if err != nil {
4 log.Warningf("%s: %v", path, err)
5 return nil, nil
6 }
7 st, ok := f.Sys().(*syscall.Stat_t)
8 if !ok {
9 return nil, fmt.Errorf("could not cast %T to syscall.Stat_t", f.Sys())
10 }
11 root := &Node{
12 Name: f.Name(),
13 Mode: f.Mode(),
14 Size: st.Size,
15 SizeOnDisk: st.Blocks * 512,
16 IsDir: f.IsDir(),
17 LastModification: f.ModTime(),
18 }
19 if root.IsDir {
20 names, err := readDirNames(path)
21 if err != nil {
22 log.Warningf("%s: %v", path, err)
23 return root, nil
24 }
25 for _, name := range names {
26 child, err := Walk(path + "/" + name)
27 if err != nil {
28 return nil, err
29 }
30 child.Parent = root
31 root.Size += child.Size
32 root.SizeOnDisk += child.SizeOnDisk
33 root.Children = append(root.Children, child)
34 }
35 }
36 return root, nil
37}
We call os.Lstat
for each node in the file tree to get the file/directory information. The os.Lstat
function uses the lstat
6 system call internally, which returns struct stat
structure. Then, os.Lstat
uses struct stat
to return os.FileInfo
, which is an easily usable variant of struct stat
.
After getting the file information, we construct the root node of type *Node
, and then call Walk
for each of its children if root is a directory, and lastly update the child nodes’ root, the root’s size and so on. Children of the root are found by the readDirNames
function, which basically reads the directories under the specified path and sorts them by their name7. The SizeOnDisk
calculation is done by using Stat_t.Blocks
. The Stat_t
structure is the Go equivalent of Linux kernel’s struct stat
, and Stat_t.Blocks
is the number of 512-byte blocks that the file uses.
Please note that this code we’ve written will only work in Linux and UNIX-like systems!
We can now call Walk
and test it on a real directory:
1import (
2 "flag"
3 "fmt"
4 "os"
5 "sort"
6 "strings"
7 "syscall"
8 "time"
9
10 "github.com/ozansz/gls/log"
11)
12
13type Node struct {
14 Name string
15 Mode os.FileMode
16 Size int64
17 SizeOnDisk int64
18 IsDir bool
19 LastModification time.Time
20 Children []*Node
21 Parent *Node
22}
23
24func (n *Node) Print() {
25 n.printWithLevel(0)
26}
27
28func (n *Node) printWithLevel(level int) {
29 fmt.Printf("%s%s [%s] [%d]\n", strings.Repeat(" ", level),
30 n.Name, n.Mode.String(), n.SizeOnDisk)
31 for _, child := range n.Children {
32 child.printWithLevel(level + 1)
33 }
34}
35
36var (
37 path = flag.String("path", "", "path to walk")
38)
39
40func main() {
41 flag.Parse()
42 if *path == "" {
43 flag.Usage()
44 return
45 }
46
47}
48
49func Walk(path string) (*Node, error) {
50 f, err := os.Lstat(path)
51 if err != nil {
52 log.Warningf("%s: %v", path, err)
53 return nil, nil
54 }
55 st, ok := f.Sys().(*syscall.Stat_t)
56 if !ok {
57 return nil, fmt.Errorf("could not cast %T to syscall.Stat_t", f.Sys())
58 }
59 root := &Node{
60 Name: f.Name(),
61 Mode: f.Mode(),
62 Size: st.Size,
63 SizeOnDisk: st.Blocks * 512,
64 IsDir: f.IsDir(),
65 LastModification: f.ModTime(),
66 }
67 if root.IsDir {
68 names, err := readDirNames(path)
69 if err != nil {
70 log.Warningf("%s: %v", path, err)
71 return root, nil
72 }
73 for _, name := range names {
74 child, err := Walk(path + "/" + name)
75 if err != nil {
76 return nil, err
77 }
78 child.Parent = root
79 root.Size += child.Size
80 root.SizeOnDisk += child.SizeOnDisk
81 root.Children = append(root.Children, child)
82 }
83 }
84 return root, nil
85}
86
87// Taken directly from the Go stdlib
88func readDirNames(dirname string) ([]string, error) {
89 f, err := os.Open(dirname)
90 if err != nil {
91 return nil, err
92 }
93 names, err := f.Readdirnames(-1)
94 f.Close()
95 if err != nil {
96 return nil, err
97 }
98 sort.Strings(names)
99 return names, nil
100}
When we save this code as main.go
and run it by specifying a path,
1$ go run main.go --path ~/root-folder
…and assume that the file tree is something like that:
…we will see an output like that:
1root-folder [drwxr-xr-x] [10010624]
2 subfolder [drwxr-xr-x] [10002432]
3 file_3.mp4 [-rw-r--r--] [9998336]
4 file_1.c [-rw-rw-r] [8192]
5 file_2.sh [-rw-r---] [4096]