Behind the Shell

Julian Tabares
5 min read · Apr 14, 2021


What happens when you type `ls -l *.c` in the shell

The “ls” and “cd” commands are probably among the first navigation tools we learn when we start using the shell in a terminal, because they provide some of the most basic and helpful ways to move through directories and list their contents.

ls, like most shell commands, comes with a number of optional arguments that let us adjust its behavior and, therefore, its output.

Checking ls’ man page, we find that the command lists directory contents, and that the -l option tells it to use a long listing format.

Long story short, typing ls -l in the terminal lists the non-hidden files in the current working directory in a long listing format, and the output looks something like this:

IMAGE 1: “ls -l” command

Here we can see, among other things, each file’s permissions, owner, size, last modification date/time and name.

When we add *.c, the shell expands that pattern into the names of the files in the current directory that end in .c (filename.c), so ls only receives, and lists, those files. The example above will then look like this:

IMAGE 2: “ls -l *.c” command

Now, as you have probably noticed, the holberton.h file has not been included in the resulting list. This is because it has a .h (header) extension and we are using the wildcard *.c, which, as previously mentioned, applies the command ls -l ONLY to the .c files.
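To make the wildcard step more concrete, here is a minimal sketch in Python of how a pattern such as *.c can be matched against a list of file names. The file names other than holberton.h are hypothetical, and expand_pattern is an illustrative helper, not something the shell exposes. Keeping the pattern unchanged when nothing matches mirrors the default behavior of bash; other shells may behave differently.

```python
import fnmatch

def expand_pattern(pattern, names):
    """Return the sorted names matching a shell-style pattern.

    If nothing matches, return the pattern itself, unchanged
    (an assumption modelled on bash's default behavior).
    """
    matches = sorted(n for n in names if fnmatch.fnmatchcase(n, pattern))
    return matches if matches else [pattern]

# Hypothetical directory contents (only holberton.h comes from the article).
files = ["holberton.h", "main.c", "putchar.c", "README.md"]

print(expand_pattern("*.c", files))   # ['main.c', 'putchar.c']
print(expand_pattern("*.py", files))  # ['*.py']
```

Note that ls never sees the asterisk when there is a match: it simply receives the matching names as separate arguments.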

Knowing the above, let’s dig deeper and try to understand what’s happening behind the screen to get this result. But first, let’s make a short stop to go through some of the definitions we are going to use in this article:

System call or syscall: the programmatic way in which a computer program requests a service from the kernel of the operating system on which it is running.
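As a small illustration of that definition, Python’s os module offers thin wrappers around some system calls. The sketch below assumes a Unix-like system; say is a hypothetical helper name chosen for this example.

```python
import os

def say(text):
    """Write text to standard output through the write syscall wrapper.

    Returns the number of bytes the kernel reports as written.
    """
    return os.write(1, text.encode())  # 1 is the file descriptor of stdout

if __name__ == "__main__":
    # getpid is another syscall wrapper: it asks the kernel for our process id.
    say("my process id is %d\n" % os.getpid())
```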

Shell: In simple terms, we can think of the shell as an interpreter that allows us to interact with the operating system by reading our commands, translating them and passing them on to it. It has two modes. The first is interactive: the shell shows the prompt, waits for a command from the user, executes it and finally shows the prompt again.

The default prompt is stored in the PS1 variable; changing its value changes the appearance of the shell’s command prompt.

The second mode is non-interactive. Unlike the first one, it does not show a prompt, and it reads its commands from a script (or another input source) rather than from a user typing at the terminal.
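The difference between the two modes can be sketched as follows. This is a simplification that assumes the only test is whether the input comes from a terminal; real shells apply additional rules. prompt_for is a hypothetical helper, and the "$ " fallback is an assumption for the case where PS1 is not present in the environment.

```python
import io
import os
import sys

def prompt_for(stream, ps1=None):
    """Return the prompt text to display before reading from stream."""
    if not stream.isatty():
        return ""  # non-interactive: commands come from a script or a pipe
    # Interactive: use the given prompt, else PS1, else a plain "$ ".
    return ps1 if ps1 is not None else os.environ.get("PS1", "$ ")

# A script held in memory is not a terminal, so no prompt is shown.
print(repr(prompt_for(io.StringIO("ls -l *.c\n"))))  # ''

# For the real standard input, the answer depends on how this file is run.
print(repr(prompt_for(sys.stdin)))
```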

PATH: an environment variable that holds a list of directories where executable programs are located. The directories are written as full routes in the directory tree hierarchy and are separated by a delimiting character that varies according to the operating system (a colon on Unix-like systems, a semicolon on Windows).
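A quick way to see those directories is to split the variable on its delimiter. The PATH value below is a made-up example, and path_directories is a hypothetical helper. Skipping empty entries is a simplification on our part (on POSIX systems an empty entry traditionally stands for the current directory).

```python
import os

def path_directories(path_value, sep=os.pathsep):
    """Split a PATH-style string into its directories, skipping empty entries."""
    return [d for d in path_value.split(sep) if d]

example = "/usr/local/bin:/usr/bin:/bin"  # hypothetical PATH value
print(path_directories(example, sep=":"))
# ['/usr/local/bin', '/usr/bin', '/bin']

# The real PATH of the current process (output varies by machine).
print(path_directories(os.environ.get("PATH", "")))
```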

And finally, fork (a syscall): it creates a copy of the current process, so that two “identical” processes run at the same time. The new one (the child process) receives a copy of all the variables of the current process, including its environment and therefore the PATH.
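The “copy” part is easy to observe. In this sketch (Unix-like systems only), the child changes its own copy of a variable and reports it through its exit status, while the parent’s copy stays untouched. fork_demo and counter are names invented for the example.

```python
import os

def fork_demo():
    """Fork once and show that parent and child hold separate copies."""
    counter = 1
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy of the parent's variables.
        counter += 41
        os._exit(counter)  # report the child's value (must fit in 0..255)
    # Parent: wait for the child and read its exit status.
    _, status = os.waitpid(pid, 0)
    return counter, os.WEXITSTATUS(status)

if __name__ == "__main__":
    print(fork_demo())  # (1, 42)
```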

First off, the shell splits the command into single words so that it can interpret each one. This process is known as tokenization, and it gives the shell something like this:

word 0: ls

word 1: -l

word 2: *.c
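The split above can be reproduced with Python’s shlex module, which follows shell-like quoting rules. This is only an approximation of what a real shell does; tokenize is a hypothetical wrapper name.

```python
import shlex

def tokenize(line):
    """Split a command line into words, honoring quotes like a shell would."""
    return shlex.split(line)

for index, word in enumerate(tokenize("ls -l *.c")):
    print("word %d: %s" % (index, word))
# word 0: ls
# word 1: -l
# word 2: *.c
```

Quotes are the reason a plain split on spaces is not enough: tokenize('echo "hello world"') yields two words, not three.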

IMAGE 3: looking for a command in the PATH

Then the shell searches the PATH for a file (program) with the same name as word 0. To do this, it tokenizes the PATH value much as it did the command line, but instead of splitting it into words it splits it into full routes (directories), which are separated by the delimiting character mentioned before. It then concatenates each route with the entered command (ls) and checks whether a file exists at the resulting address, by means of another system call named “stat”.
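The search just described might look roughly like this. It is a sketch under our own assumptions: find_in_path is a hypothetical name, and the extra check that the file is a regular, executable one is an addition of ours, not a step taken from the article.

```python
import os
import stat

def find_in_path(command, path_value):
    """Return the first "directory/command" that exists and is executable,
    or None when no directory in path_value contains it."""
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # simplification: ignore empty entries
        candidate = os.path.join(directory, command)
        try:
            info = os.stat(candidate)  # wrapper around the stat syscall
        except OSError:
            continue  # nothing at that address, try the next directory
        if stat.S_ISREG(info.st_mode) and os.access(candidate, os.X_OK):
            return candidate
    return None

if __name__ == "__main__":
    # Output depends on the machine, e.g. a full route ending in /ls.
    print(find_in_path("ls", os.environ.get("PATH", "")))
```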

If the system call’s answer is a “yes”, the program is executed by means of another system call named “execve”. That execution comes at a cost, though: the executed program replaces the process that called it, so if the shell called execve directly, the shell itself would be gone! To avoid that, it is necessary to perform another system call first: fork.

Fork allows us to create a copy of the current process and, with the help of one more syscall (wait), the parent process waits until the copy (the child) finishes its execution before carrying on. This way, execve runs only in the child process, and when the child finishes, the parent process, the shell, takes control again and shows the prompt.
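Put together, the fork, execve and wait sequence can be sketched like this (Unix-like systems only). run is a hypothetical helper that expects argv[0] to already be a full route, as produced by the PATH search. The 127 status for a failed execve follows a common shell convention and is a simplification here. The demo launches the Python interpreter itself instead of ls, only so that it does not depend on where ls is installed.

```python
import os
import sys

def run(argv, env=None):
    """Fork, replace the child with the program at argv[0], wait for it,
    and return its exit status."""
    env = dict(os.environ) if env is None else env
    pid = os.fork()
    if pid == 0:
        # Child: on success execve never returns, the new program takes over.
        try:
            os.execve(argv[0], argv, env)
        except OSError:
            os._exit(127)  # could not execute the program
    # Parent (the "shell"): wait until the child finishes.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)  # child was ended by a signal

if __name__ == "__main__":
    code = run([sys.executable, "-c", "print('hello from the child')"])
    print("child exited with", code)
```

Because execve runs only in the child, the parent is still alive after waitpid returns and can go on to read the next command.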

If the program and its arguments are valid, the terminal will show a result like the one in IMAGE 2 (supposing that no other errors occur during its execution); otherwise, we will get an error message, probably saying that the command we entered could not be found or is not valid.

Hope you have found this article helpful.

Article written by Carlos Cruz Zuluaga and Julian Tabares V.
