Jekyll2024-01-14T19:52:11+00:00https://indradhanush.github.io/feed.xmlCracking The CodeHack to learnIndradhanush Guptaindradhanush.gupta@gmail.comWrite a compiler with David Beazley2022-12-31T00:00:00+00:002022-12-31T00:00:00+00:00https://indradhanush.github.io/blog/write-a-compile-with-david-beazley<p>As the title suggests, I attended David Beazley’s course on <a href="https://dabeaz.com/compiler.html" title="The course website">writing a
compiler</a> earlier this
year and wanted to write about my experience.</p>
<p><strong>🏎️️ In a rush? Read the TL;DR at the bottom of this page.</strong></p>
<p>I won’t bore you with the details of the course which you can find on the course
page itself but in one sentence: <strong>A 5 day hands on workshop which will demand
your full attention, energy and then some.</strong></p>
<p>Now with that out of the way here’s what my experience was like.</p>
<p>📖 <strong>Before course start</strong></p>
<p>David recommended reading the first part of the <a href="http://craftinginterpreters.com/contents.html">Crafting
Interpreters</a> book (you can read
it online) and also provided some warm-up exercises. I had a packed schedule
leading up to the course and couldn’t get to the warm-up exercises but did read the
first parts of the book. I found it easy to follow the material in the book and
the great wall of compilers in my mind began to crumble one brick at a time.</p>
<p>🤖 <strong>Day 1</strong></p>
<p>The course kicked off with a discussion on computer instructions and how the
instructions that actually execute on the CPU are much simpler (ADD, SUB, MUL,
LOAD, STORE) than the programs we write. This was a nice refresher of computer
architecture that I had not thought about for years. Working with high level
programming languages is comfortable after all. 🛋️</p>
<p>We had a program simulating a CPU to play with and the goal was to write some
programs against it in order to use the registers, memory and CPU instructions
at a lower level. For the purpose of the course this was a great way to be able
to play around with assembly-like programming in a controlled sandbox. Realizing
how much work goes into a simple program like computing the factorial of a
number has made me appreciate computers and high level programming languages so
much more.</p>
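<p>To give a flavour of that exercise, here is a minimal sketch of a register machine computing a factorial. It is not the simulator from the course (that one is not public): the instruction set, the <code class="language-plaintext highlighter-rouge">JNZ</code> jump instruction and the register names are all assumptions made for illustration.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def run(program, registers):
    """Execute a list of instruction tuples against a dict of registers."""
    pc = 0  # program counter: index of the next instruction
    while pc != len(program):
        op, *args = program[pc]
        pc += 1
        if op == "MUL":
            dst, a, b = args
            registers[dst] = registers[a] * registers[b]
        elif op == "SUB":
            dst, a, b = args
            registers[dst] = registers[a] - registers[b]
        elif op == "JNZ":
            # Hypothetical instruction: jump to target if the register is non-zero.
            reg, target = args
            if registers[reg] != 0:
                pc = target
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return registers


# Factorial of n, assuming n is at least 1 when the program starts.
FACTORIAL = [
    ("MUL", "acc", "acc", "n"),  # acc = acc * n
    ("SUB", "n", "n", "one"),    # n = n - 1
    ("JNZ", "n", 0),             # loop back while n is not zero
]

registers = run(FACTORIAL, {"acc": 1, "n": 5, "one": 1})
print(registers["acc"])
</code></pre></div></div>
<p>Even in this toy version, a one-line loop in a high level language turns into explicit register shuffling and a jump.</p>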
<p>💡 In the process I learned a cool fun fact:</p>
<p><a href="https://twitter.com/indradhanush92/status/1577478322271162368">
<img src="https://indradhanush.github.io/images/writing-a-compiler/fun-fact-registers.png" alt="A tweet about a fun fact about computer registers" />
</a></p>
<p>On day 1, we were introduced to the concept of the Data Model. Consider the
following lines of code:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>print 42;
print 42.5;
</code></pre></div></div>
<p>The data model is how you’d go about representing a statement like this. You can
have a data model for the print statement. Which can print an integer. Or a
float. Or multiple arguments. The data model is code that knows how to represent
such a statement. It’s an API for the compiler to understand what each statement
intends to accomplish. This would form the core of our compiler.</p>
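<p>As a rough illustration, a data model for the two print statements above could look like the following in Python. This is a sketch and not the code from the course: the class names <code class="language-plaintext highlighter-rouge">Integer</code>, <code class="language-plaintext highlighter-rouge">Float</code>, <code class="language-plaintext highlighter-rouge">PrintStatement</code> and <code class="language-plaintext highlighter-rouge">Program</code> are my own assumptions.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass


@dataclass
class Integer:
    value: int


@dataclass
class Float:
    value: float


@dataclass
class PrintStatement:
    # The expression whose value gets printed.
    expression: object


@dataclass
class Program:
    statements: list


# Hand-built representation of:
#   print 42;
#   print 42.5;
program = Program([
    PrintStatement(Integer(42)),
    PrintStatement(Float(42.5)),
])
</code></pre></div></div>
<p>Note that nothing here parses text. The program is assembled by hand out of objects, which is exactly how the early days of the course worked.</p>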
<p>By his own admission, David was running an interesting experiment with our
batch - providing no boilerplate code unlike the previous iterations. He would
spend some time talking about the new concepts and do some live coding and let
us go code it up in our own ways. Spoilers already: This turned out to be a
great decision as I loved being able to approach the problem on a clean slate
while also having some guidance on getting started. The motivation behind this
experiment was to make the course language agnostic. As a result I chose to
start my very own experiment with Go.</p>
<p>However, it became apparent early on that this choice to implement the data
model as the first thing for my eventual compiler would considerably slow me
down. Instead of creating loosely defined abstractions I found myself having to
wrestle with Go’s type system. This is not a Go critique post, just that given
the circumstances, I could ill afford to lose much time as I had to keep pace
with the batch.</p>
<p>After a little more than an hour, I abandoned Go and switched to the next best
tool in my arsenal - Python, which also happened to be the language David was
using to write his very own version of the compiler alongside us and to
demonstrate the concepts.</p>
<p>The test bed was a set of programs in <a href="https://dabeaz.com/wabbit.html">wabbit</a>,
the target language for which we were writing the compiler - starting from
simple programs that print a single number to simple mathematical expressions to
programs with functions. Using these as a base we were to implement the data
model so that these programs could be represented with this API. There’s no
parsing of the program involved yet (this came much later).</p>
<p>🧗 The hardest challenge here was getting started. You have no code. You have
an idea. I found myself staring at the screen a lot, writing some code and
getting rid of it to start again.</p>
<p>As the day ended, I managed to get started but I also found myself lagging
behind where I’d ideally want to be. Maybe a couple of hours of focused coding.
I resisted the temptation to work after hours as a long and intense week was
just getting started and I did not want to burn up too much energy early on.
This was a marathon. Not a sprint.</p>
<p>🚧 <strong>Day 2</strong></p>
<p>With my slow start the previous day, I tried to not get bothered too much about
my pace. My goal was to implement the data model to support slightly complicated
programs with <code class="language-plaintext highlighter-rouge">if</code>, <code class="language-plaintext highlighter-rouge">if-else</code> and <code class="language-plaintext highlighter-rouge">while</code> loops. The hardest challenge of the
previous day was getting started, but with that out of the way already I managed
to quickly reach this goal early on day 2. And when I felt like I had enough to
work with for the next step I switched my focus to the formatter.</p>
<p>The goal here: given a wabbit program, can the formatter output the
“ideal” version of the code? A programming language is only as strong as the dev
tools it provides for its users. So I found this little foray into dev tools for
the language you’re implementing quite insightful.</p>
<p>🔁 Recursion Recursion and more Recursion. It was the bread and butter of this
project and with the formatter it started getting used quite heavily early on.
By the end of the day I was able to format simpler programs including
mathematical expressions but I was still a little behind the goal. The slow
start on day 1 was proving difficult to catch up with.</p>
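<p>Here is a minimal sketch of why recursion shows up so naturally in a formatter: each node formats itself by first formatting its children. The node classes below are hypothetical stand-ins, not the data model I wrote during the course.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass


@dataclass
class Number:
    text: str  # the literal as written, such as "2" or "3.5"


@dataclass
class BinOp:
    op: str
    left: object
    right: object


@dataclass
class PrintStatement:
    expression: object


def format_node(node):
    """Return the source text for a node, recursing into its children."""
    if isinstance(node, Number):
        return node.text
    if isinstance(node, BinOp):
        left = format_node(node.left)
        right = format_node(node.right)
        return f"{left} {node.op} {right}"
    if isinstance(node, PrintStatement):
        return f"print {format_node(node.expression)};"
    raise TypeError(f"cannot format {type(node).__name__}")


tree = PrintStatement(BinOp("+", Number("2"), BinOp("*", Number("3"), Number("4"))))
formatted = format_node(tree)
print(formatted)
</code></pre></div></div>
<p>This sketch ignores parentheses entirely, so a fuller version would need a node for grouped expressions to avoid changing the meaning of the program.</p>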
<p>🧩 <strong>Day 3</strong></p>
<p>Day 3 started with a discussion on the tokeniser and the parser. A token is
the smallest building block of your programming language. The tokeniser is
responsible for reading the blob of text from your file and generating a stream of
tokens. This building block is then used by anything that wants to understand
and execute your code.</p>
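<p>A small sketch of the idea, assuming a tiny subset of the language (numbers, names and a handful of single character operators). The token kinds and the regular expression are my own choices for illustration and do not cover all of wabbit.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Skip leading whitespace, then capture one token: a float, an integer,
# a name, or a single character operator / punctuation mark.
TOKEN_PATTERN = re.compile(r"\s*(\d+\.\d+|\d+|[A-Za-z_]\w*|[-+*/();])")


def classify(text):
    """Decide the kind of a token from its text."""
    if text[0].isdigit():
        return "FLOAT" if "." in text else "INTEGER"
    if text[0].isalpha() or text[0] == "_":
        return "NAME"
    return "OP"


def tokenize(source):
    """Turn source text into a list of (kind, text) pairs."""
    source = source.rstrip()
    tokens = []
    pos = 0
    while pos != len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        text = match.group(1)
        tokens.append((classify(text), text))
        pos = match.end()
    return tokens


tokens = tokenize("print 42.5;")
print(tokens)
</code></pre></div></div>
<p>A keyword like <code class="language-plaintext highlighter-rouge">print</code> comes out as a plain <code class="language-plaintext highlighter-rouge">NAME</code> here. Telling keywords apart from identifiers is left to whoever consumes the tokens.</p>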
<p>Most compiler projects from what I’ve seen in my very limited experience have a
high tendency to start with building the tokeniser and parser. However I still
feel that starting at the data model layer was a great call instead of spending
a ton of time early on fencing around with string patterns and regular
expressions. The data model gave me an early head start at being able to
comprehend the inner workings of a compiler and what might be the abstractions
that need to be designed.</p>
<p>I paused my work on the formatter and started out on building the tokeniser and
the parser. This turned out to be both fun and challenging. And when I needed to
bake in operator precedence for mathematical operators it blew my mind. It
reminded me a lot of my early programming days and data structure lessons in
university. With enough effort I was able to now parse programs that had unary
operators and parentheses like:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>print -(2.0+3.0);
</code></pre></div></div>
<p>It was mind boggling how much code had to be written for seemingly such a simple
line of code! Although they each had their own role, I was now also able to see
a distinct pattern emerging from the code and a similarity between the
tokeniser, the parser and the formatter.</p>
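<p>One common way to bake in operator precedence is a recursive descent parser with one function per precedence level. The sketch below uses that technique. It is not necessarily how my parser ended up, and it works on a plain list of token strings and returns nested tuples, both of which are simplifications for illustration.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>class Parser:
    def __init__(self, tokens):
        # A sentinel marks the end so that peeking never runs off the list.
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expect(self, text):
        token = self.advance()
        if token != text:
            raise SyntaxError(f"expected {text!r}, got {token!r}")

    def parse_print(self):
        self.expect("print")
        expression = self.parse_expression()
        self.expect(";")
        return ("print", expression)

    def parse_expression(self):
        # Lowest precedence level: + and -
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())
        return node

    def parse_term(self):
        # Binds tighter than + and -: * and /
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    def parse_factor(self):
        # Highest precedence: unary minus, parentheses and numbers.
        token = self.advance()
        if token == "-":
            return ("neg", self.parse_factor())
        if token == "(":
            node = self.parse_expression()
            self.expect(")")
            return node
        return ("num", float(token))


tree = Parser(["print", "-", "(", "2.0", "+", "3.0", ")", ";"]).parse_print()
print(tree)
</code></pre></div></div>
<p>Precedence falls out of the call structure: <code class="language-plaintext highlighter-rouge">parse_expression</code> can only combine things that <code class="language-plaintext highlighter-rouge">parse_term</code> has already grouped, so multiplication always binds before addition.</p>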
<p>There was a temptation to try to DRY the code but the end goal here was to get
stuff working first and try to make them efficient later. David would keep
reminding us of maybe not trying too hard to make the code pretty early on. And
I took this advice for the most part in order to stay on track.</p>
<p>The output from sending this line of code into the parser was a data model. And
the data model itself was composed of tokens! It felt very satisfying to start
tying the different pieces together.</p>
<p>By this time I had also started coding after hours a bit as it became clear to
me that I really needed to push myself more and engage the afterburners this
week.</p>
<p>🖥️ <strong>Day 4</strong></p>
<p>This day was all about the interpreter. But I was still working on my parser as
I wanted to be able to have operator precedence working correctly. I knocked
some bugs off the parser instead and made some more progress to be able to
support <code class="language-plaintext highlighter-rouge">if-else</code> and <code class="language-plaintext highlighter-rouge">while</code> loops. I realised I needed to see some tangible
output and started focusing on the interpreter.</p>
<p>Somewhere around mid-day I had a parser and interpreter in place that could
correctly execute print statements with simple mathematical operations. This was
also the moment that I remember fondly. One that gave me great joy and reminded
me how much I loved and enjoyed programming! I also finally got why some of my
close friends have always wanted to write their own compiler and have done so
whenever they got a chance. I only wished I had started sooner, but here I was.
Better late than never! 💪🏽</p>
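<p>For the curious, the core of such an interpreter can be tiny. The sketch below walks a tree of nested tuples and evaluates it. The tuple shapes (<code class="language-plaintext highlighter-rouge">("num", value)</code>, <code class="language-plaintext highlighter-rouge">("neg", operand)</code>, <code class="language-plaintext highlighter-rouge">(operator, left, right)</code> and <code class="language-plaintext highlighter-rouge">("print", expression)</code>) are hypothetical and chosen only to keep the example short.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import operator

BINARY_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate(node):
    """Recursively compute the value of an expression node."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "neg":
        return -evaluate(node[1])
    if kind in BINARY_OPS:
        return BINARY_OPS[kind](evaluate(node[1]), evaluate(node[2]))
    raise ValueError(f"unknown expression node {kind!r}")


def execute(statement, output):
    """Run one statement, appending anything it prints to the output list."""
    if statement[0] == "print":
        output.append(str(evaluate(statement[1])))
    else:
        raise ValueError(f"unknown statement {statement[0]!r}")


# Hand-built tree for: print -(2.0+3.0);
output = []
execute(("print", ("neg", ("+", ("num", 2.0), ("num", 3.0)))), output)
print(output)
</code></pre></div></div>
<p>Collecting the printed lines in a list instead of writing straight to the terminal makes it easy to check the result in a test.</p>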
<p>🏘️ <strong>Day 5</strong></p>
<p>By this time I was almost a full day behind the batch’s current milestone. The
day started with a discussion on virtual machines and generating machine code
from your input program. The goal was to compile the program against a target
VM. While I didn’t do this myself that day, listening to David talk about this
did pique my curiosity.</p>
<p>I spent the rest of the day working on my parser and the interpreter so that I
could execute more complicated programs than simple math expressions. I did
eventually get operator precedence working though. But it did require a
significant rewrite of my parser logic as my initial approach was a bit flawed. 😅</p>
<p>Writing a compiler in a week is a herculean effort. This course was not a
competition. And each programmer in the batch had their own free choice in which
direction they wanted to push more as there is plenty to do in any of the
domains at any given point. It was interesting to keep making these decisions at
almost every juncture.</p>
<p>Overall I am happy I took this course and would do so again if I had the chance
and maybe I’ll go further along this time. 🥂</p>
<p><strong>🧑🏽💻 Talk is cheap; show me the code?</strong></p>
<p>Example wabbit programs were available in a private repository for us and we used
those as a milestone to iteratively build each component out. Sadly it’s hard
for me to share any of the code without also giving away those example programs.
😔</p>
<p>And since this was a paid course it would feel ethically wrong to share them out
in public. But if you’re curious and would like to find out more about my
compiler (more like an interpreter since I didn’t manage to write code that
compiles it technically), I’d be happy to chat or pair program and maybe revisit
some of the concepts! Send me a DM on
<a href="https://twitter.com/indradhanush92" title="My twitter profile">Twitter</a> or an email.</p>
<p>👀 <strong>TL;DR</strong></p>
<ol>
<li>Does the word <code class="language-plaintext highlighter-rouge">compiler</code> make your brain shut down completely and anything
that follows it just does not make sense at all?</li>
<li>Do you want a hands on experience instead of going through a full fledged
theoretical university or MOOC course?</li>
<li>Do you want to understand what a compiler is and how you can start writing one yourself, but don’t yet?</li>
</ol>
<p>If you answered <strong>yes</strong> to at least one of the above questions, you should find
a course date that works for you and do it! David is an awesome technical
educator and a master of breaking down complex topics in an approachable way.</p>
<p>Hint: I was answering <strong>yes</strong> to all of those questions up until a few months ago. 😉</p>Indradhanush Guptaindradhanush.gupta@gmail.comMy experience of attending the one week courseNo dunst notifications when running picom - a debugging story2021-02-16T00:00:00+00:002021-02-16T00:00:00+00:00https://indradhanush.github.io/blog/dunst-notifications-with-picom<p>🚀 I started using <a href="https://wiki.archlinux.org/index.php/Picom">picom</a>, a
compositor, a few days ago as I wanted my inactive windows to be transparent.
This goes a long way toward identifying the active window when you’re working with
multiple split windows on the same workspace. This has not been a problem for me
previously as I was running <a href="https://i3wm.org/">i3wm</a> as my window manager and
<a href="https://github.com/greshake/i3status-rust/">i3status-rust</a> for my status
bar. That meant that each window had a title bar. But I moved to
<a href="https://github.com/Airblader/i3/">i3-gaps</a> and
<a href="https://github.com/polybar/polybar/">polybar</a> to make my desktop prettier. And
with <code class="language-plaintext highlighter-rouge">i3-gaps</code> you need to disable the title bar that <code class="language-plaintext highlighter-rouge">i3</code> adds on each
window. Thus <code class="language-plaintext highlighter-rouge">picom</code> as a visual indicator to quickly spot the active window
among a bunch of small windows within a workspace felt like a good
solution. Also it makes your windows prettier! 🤩</p>
<p>⚠️ <strong>Heads up:</strong> If you decide to go down this route, it <strong>will</strong> consume a lot
of your time. A lot more than you would have originally planned. You have been
warned.</p>
<p>🐛 Setting <code class="language-plaintext highlighter-rouge">picom</code> up was very straightforward, but I ran into an issue very soon:
system notifications stopped showing up completely. And the moment I killed the
<code class="language-plaintext highlighter-rouge">picom</code> process, all the notifications collected by the notification daemon up to
that point would pop right on screen. This had an unintended effect of becoming
a DND mode. 😛</p>
<p>🔔 The notification daemon I use is
<a href="https://wiki.archlinux.org/index.php/Dunst">dunst</a> and it has an easy way of
turning on / off notifications on demand. So I didn’t necessarily need this side
effect from running <code class="language-plaintext highlighter-rouge">picom</code>. It definitely came across to me as a bug and not a
feature. Jokes apart. This bugged me for the last five days. I searched the web
but was unable to find anything related to this.</p>
<p>⚙️ In <code class="language-plaintext highlighter-rouge">picom</code>, the user can set different <em>opacity</em> levels (opposite of
transparency) for different windows. So an opacity level of 0.9 makes a window
only slightly transparent, while an opacity level of 0.1 would make it almost
transparent and barely visible.</p>
<p>🤔 I tried configuring this for dunst windows but to no effect. Finally I tried
to install and run another compositor,
<a href="https://wiki.archlinux.org/index.php/Xcompmgr">xcompmgr</a> to see if I was able
to reproduce the issue. And unsurprisingly I had the same problem with
<code class="language-plaintext highlighter-rouge">xcompmgr</code> as well. This helped me with my direction of debugging the issue
immediately. I started feeling that the issue had to be coming out of <code class="language-plaintext highlighter-rouge">dunst</code>
itself and not <code class="language-plaintext highlighter-rouge">picom</code>. So I looked up my <code class="language-plaintext highlighter-rouge">dunstrc</code> (the config file for
<code class="language-plaintext highlighter-rouge">dunst</code>) and started reading through it. And soon enough I found this setting
there:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>transparency = 100
</code></pre></div></div>
<p>👀 At first glance I thought it was the right setting. Remember in <code class="language-plaintext highlighter-rouge">picom</code>
above, a higher number means more opaque. But then I read the <code class="language-plaintext highlighter-rouge">man</code> pages for
<code class="language-plaintext highlighter-rouge">dunst</code> where it was clear that a higher number means more transparency (quite
expected by the word itself). I’ve been using <code class="language-plaintext highlighter-rouge">dunst</code> for a few years already so
there is a good chance I made that edit at some point as the default value for
this setting is <code class="language-plaintext highlighter-rouge">0</code>. That is, this was a bug of my own making (like most bugs
anyway!).</p>
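<p>The mix-up is easier to see as arithmetic. Here is a small sketch, assuming <code class="language-plaintext highlighter-rouge">dunst</code>’s transparency is an integer from 0 to 100 and opacity is a fraction from 0.0 to 1.0. The function name is made up for illustration and is not part of either tool.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def dunst_transparency_to_opacity(transparency):
    """Convert a 0-100 transparency value into a 0.0-1.0 opacity."""
    if transparency not in range(0, 101):
        raise ValueError("transparency must be an integer from 0 to 100")
    return 1.0 - transparency / 100


# My old setting: fully transparent, so notifications were invisible.
print(dunst_transparency_to_opacity(100))
# The default: fully opaque.
print(dunst_transparency_to_opacity(0))
</code></pre></div></div>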
<p>So I immediately updated my <code class="language-plaintext highlighter-rouge">dunstrc</code> file:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>transparency = 0
</code></pre></div></div>
<p>🎉 Restarted <code class="language-plaintext highlighter-rouge">dunst</code> and voila! The notifications started showing up when
<code class="language-plaintext highlighter-rouge">picom</code> was running. And that too without any additional configurations on
<code class="language-plaintext highlighter-rouge">picom</code>! I can now go back to using DND in <code class="language-plaintext highlighter-rouge">dunst</code> the way the authors intended. 😅</p>Indradhanush Guptaindradhanush.gupta@gmail.comHow I fixed the problem of no system notifications when I ran picomLife of a Container2020-02-24T00:00:00+00:002020-02-24T00:00:00+00:00https://indradhanush.github.io/blog/life-of-a-container<p><strong>Disclaimer:</strong> I gave a talk of the same title at <a href="https://events.linuxfoundation.org/kubernetes-forum-delhi/">Kubernetes Forum Delhi</a> last week. You may watch the <del><a href="https://www.youtube.com/watch?v=gHR8V02dI48">video on YouTube</a></del> if you prefer that. (Update: The original video was removed by CNCF. It has been reuploaded <a href="https://www.youtube.com/watch?v=mGWWTP1Jeso">here</a>) Additionally this post also serves as a reference for the commands used in the demos.</p>
<p>I have been programming for almost six years now and have used containers for nearly the entirety of that time. As comes with being a programmer, curiosity got the better of me and I started asking around: what is a container? One of the answers was along the lines of the following:</p>
<p>“Oh they’re like virtual machines but only that they do not have their own kernel and share the host’s kernel.”</p>
<p>This led me to believe for a long time that containers are a lighter form of virtual machines. And they felt like magic to me. Only when I started digging into the internals of a container much later did I realize that this quote felt very true:</p>
<blockquote>
<p>Any sufficiently advanced technology is indistinguishable from magic.</p>
<p>— Sir Arthur Charles Clarke, author of 2001: A Space Odyssey</p>
</blockquote>
<p>And I have always tried to find a way of explaining things that look like the following at first or second glance:</p>
<figure>
<img src="https://indradhanush.github.io/images/life-of-a-container/complicated-arrow.png" alt="complicated-arrow" />
</figure>
<p>And see if they can be explained in a much easier way:</p>
<figure>
<img src="https://indradhanush.github.io/images/life-of-a-container/simple-arrow.png" alt="simple-arrow" />
</figure>
<p>And the first thing I learned is that there is really no such thing as a container at all. And I found that what we know as a container is made up of two Linux primitives:</p>
<ol>
<li>Namespaces</li>
<li>Control groups (cgroups)</li>
</ol>
<p>Before we look into what they are and how they help form the abstraction known as a container, it is important to understand how new processes are created and managed in Linux. Let us take a look at the following diagram:</p>
<figure>
<img src="https://indradhanush.github.io/images/life-of-a-container/fork.png" alt="fork" />
</figure>
<p>In the above diagram, the parent process can be thought of as an active shell session, and the child process can be thought of as any command being run in the shell, e.g. ls, pwd. Now, when a new command is run, a new process is created. This is done by the parent process by making a call to the function <code class="language-plaintext highlighter-rouge">fork</code>. While it creates a new and independent process, it returns the process ID (PID) of the child process to the parent process that invoked the function <code class="language-plaintext highlighter-rouge">fork</code>. And in due course of time, both the parent and the child can continue to execute their tasks and terminate. The child PID is important for the parent to keep track of the newly created process. We will come back to this later in this blog post. If you’re interested in going deeper into the semantics of <code class="language-plaintext highlighter-rouge">fork</code>, I wrote a more detailed blog post in the past describing this and how to do that with code. You may read it <a href="/blog/writing-a-unix-shell-part-1/">here</a>.</p>
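<p>The same flow can be sketched in a few lines of Python using the standard <code class="language-plaintext highlighter-rouge">os</code> module. This is only an illustration of the parent / child split described above: the helper name <code class="language-plaintext highlighter-rouge">run_child</code> is made up, and a real shell would also replace the child with the requested program using one of the <code class="language-plaintext highlighter-rouge">exec</code> functions.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os


def run_child(task):
    """Fork, run task() in the child, and wait for it in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child process: fork returned 0 here.
        try:
            task()
        finally:
            # Leave immediately without running the parent's cleanup code.
            os._exit(0)
    # Parent process: fork returned the PID of the child.
    waited_pid, status = os.waitpid(pid, 0)
    return pid, waited_pid, os.WEXITSTATUS(status)


child_pid, waited_pid, exit_code = run_child(lambda: None)
print(f"parent {os.getpid()} waited for child {child_pid}, exit code {exit_code}")
</code></pre></div></div>
<p>The parent uses the PID it got back from <code class="language-plaintext highlighter-rouge">fork</code> to wait for exactly that child, which is the bookkeeping the paragraph above refers to.</p>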
<h2 id="namespaces">Namespaces</h2>
<p>So now that we have an idea about how new processes are created in Linux, let us try and understand what namespaces help us achieve.</p>
<p>Namespaces are an isolation primitive that helps us to isolate various types of resources. In Linux, it is possible to do this for seven different types of resources at the moment. They are, in no specific order:</p>
<ul>
<li>Network namespace</li>
<li>Mount</li>
<li>UTS or Hostname namespace</li>
<li>Process ID or PID namespace</li>
<li>Inter process communication or IPC namespace</li>
<li>cgroup namespace</li>
<li>User namespace</li>
</ul>
<p>I won’t go into detail about what each of them does in this post, as there is already a lot of literature on that and the man pages are possibly the best resource for them. Instead, I will try to explain network namespaces in this post and see how they help us to isolate network resources. But before that, it is important to note that by default each of these namespaces already exists in the system, and these are called the host namespaces or the default namespaces. For example, the default network namespace in a system contains network interface cards for WIFI and / or the ethernet port if there’s one.</p>
<p>All the information about a process is contained under <code class="language-plaintext highlighter-rouge">procfs</code>, which is typically mounted on <code class="language-plaintext highlighter-rouge">/proc</code>. Running <code class="language-plaintext highlighter-rouge">echo $$</code> will give us the PID of the current shell process:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ echo $$
448884
</code></pre></div></div>
<p>And if we look inside <code class="language-plaintext highlighter-rouge">/proc/<PID>/ns</code> we will see the list of namespaces used by that process. For example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ ls /proc/448884/ns -lh
total 0
lrwxrwxrwx 1 root root 0 Feb 23 19:00 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 mnt -> 'mnt:[4026531840]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 net -> 'net:[4026532008]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 pid -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 pid_for_children -> 'pid:[4026531836]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 user -> 'user:[4026531837]'
lrwxrwxrwx 1 root root 0 Feb 23 19:00 uts -> 'uts:[4026531838]'
</code></pre></div></div>
<p>For each namespace, there is a file which is a symbolic link<sup>[1]</sup> to the ID of the namespace. So for the network namespace, the ID of the namespace in the above example is <code class="language-plaintext highlighter-rouge">net:[4026532008]</code> while <code class="language-plaintext highlighter-rouge">4026532008</code> is the inode number. For two processes in the same namespace, this number is the same.</p>
<p>On Linux, to create a new namespace, we can use the <code class="language-plaintext highlighter-rouge">unshare</code> command, which is a wrapper around the system call of the same name. And to create a new network namespace we need to add the flag <code class="language-plaintext highlighter-rouge">-n</code>. So in a shell session with root privileges, we will do:</p>
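<p>The comparison of namespace IDs can also be done programmatically. Below is a small sketch that reads these symbolic links with Python’s <code class="language-plaintext highlighter-rouge">os.readlink</code>. The helper names <code class="language-plaintext highlighter-rouge">parse_ns_link</code> and <code class="language-plaintext highlighter-rouge">same_namespace</code> are made up for this post, and <code class="language-plaintext highlighter-rouge">same_namespace</code> assumes a Linux system with <code class="language-plaintext highlighter-rouge">/proc</code> mounted and enough permissions to read the links of both processes.</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import os


def parse_ns_link(target):
    """Split a link target such as 'net:[4026532008]' into (type, inode)."""
    ns_type, _, rest = target.partition(":")
    if not (rest.startswith("[") and rest.endswith("]")):
        raise ValueError(f"unexpected namespace link {target!r}")
    return ns_type, int(rest[1:-1])


def same_namespace(pid_a, pid_b, ns="net"):
    """Return True if both processes are in the same namespace of this type."""
    link_a = os.readlink(f"/proc/{pid_a}/ns/{ns}")
    link_b = os.readlink(f"/proc/{pid_b}/ns/{ns}")
    return parse_ns_link(link_a) == parse_ns_link(link_b)


ns_type, inode = parse_ns_link("net:[4026532008]")
print(ns_type, inode)
</code></pre></div></div>
<p>Calling <code class="language-plaintext highlighter-rouge">same_namespace(os.getpid(), 1)</code> as root would tell you whether the current process shares its network namespace with PID 1.</p>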
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unshare -n
</code></pre></div></div>
<p>We can look into the <code class="language-plaintext highlighter-rouge">/proc/<PID>/ns</code> directory to verify that we have indeed created a new namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ls -l /proc/$$/ns/net
lrwxrwxrwx 1 root root 0 Feb 23 18:46 /proc/447612/ns/net -> 'net:[4026533490]'
</code></pre></div></div>
<p>The namespace ID is different than what we see above for the host network namespace. And running the command <code class="language-plaintext highlighter-rouge">ip link</code> after this will only show us the loopback interface:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
</code></pre></div></div>
<p>If there are any network interfaces like the WIFI card or the ethernet port, they won’t show up at all. In fact, if we tried to run <code class="language-plaintext highlighter-rouge">ping 127.0.0.1</code>, something we mostly take for granted to work won’t work either:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ping 127.0.0.1
ping: connect: Network is unreachable
</code></pre></div></div>
<p>But why did the above happen? Let us try to understand that.</p>
<p>First we created a new network namespace, and that very act isolated us from the network resources in the default namespace. And the only interface available to us in this new namespace is the <code class="language-plaintext highlighter-rouge">loopback</code> interface. However it does not have an IP address assigned to it yet, as a result of which <code class="language-plaintext highlighter-rouge">ping 127.0.0.1</code> does not quite work. This can be verified by running:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ip address
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
</code></pre></div></div>
<p>Which shows that not only does this interface not have an IP address at the moment, its <code class="language-plaintext highlighter-rouge">state</code> is also set to <code class="language-plaintext highlighter-rouge">DOWN</code>. Running the following commands would fix that:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ip address add dev lo local 127.0.0.1/8
# ip link set lo up
</code></pre></div></div>
<p>First we assigned the IP address <code class="language-plaintext highlighter-rouge">127.0.0.1</code> to that interface and then set the state of the interface to <code class="language-plaintext highlighter-rouge">UP</code>, thus making it available to listen for incoming network packets. And now <code class="language-plaintext highlighter-rouge">ping</code> would work as expected:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ping 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.020 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.060 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.071 ms
</code></pre></div></div>
<p>To understand the concept of isolation, we will go forward with trying to get this new network namespace (let’s call it CHILD) to talk to the host network namespace and vice versa.</p>
<p>To aid our understanding, we will set the <code class="language-plaintext highlighter-rouge">PS1</code> variable in this shell to something easily identifiable:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># export PS1="[netns: CHILD]# "
[netns: CHILD]#
</code></pre></div></div>
<p>And we will also spawn a new terminal with root access so that the shell running in it belongs to the host network namespace. Once again we will set the <code class="language-plaintext highlighter-rouge">PS1</code> variable to help with identifying the host namespace easily:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># export PS1="[netns: HOST]# "
[netns: HOST]#
</code></pre></div></div>
<p>Running the <code class="language-plaintext highlighter-rouge">ip link</code> command in this shell would show the currently installed network interfaces in the system. For example:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
3: enp0s31f6: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
link/ether 0e:94:18:de:da:b3 brd ff:ff:ff:ff:ff:ff
4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
link/ether 02:42:ad:0f:83:cc brd ff:ff:ff:ff:ff:ff
11: wlp61s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DORMANT group default qlen 1000
link/ether fa:3d:a9:90:95:5d brd ff:ff:ff:ff:ff:ff
</code></pre></div></div>
<p>To list all the network namespaces in the system we can run:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip netns list
</code></pre></div></div>
<p>But that will produce an empty output if readers have been following along. So does that mean the command didn’t work or we did something wrong there, even though we created a new network namespace earlier? The answer to both questions is no. As everything is a file in UNIX<sup>[2]</sup>, the <code class="language-plaintext highlighter-rouge">ip</code> command looks for network namespaces in the directory <code class="language-plaintext highlighter-rouge">/var/run/netns</code>. And currently that directory is empty. So we will first create an empty file and then try running that command again:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# touch /var/run/netns/child
[netns: HOST]# ip netns list
Error: Peer netns reference is invalid.
Error: Peer netns reference is invalid.
child
</code></pre></div></div>
<p>We do see the <code class="language-plaintext highlighter-rouge">child</code> namespace in the list, but we also see an error. The error appears because we have not yet mapped the network namespace of the child shell to this file. To do that, we will bind mount the <code class="language-plaintext highlighter-rouge">/proc/<PID>/ns/net</code> file to the new file we created above. This can be done by executing the following in the shell running the child network namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# mount -o bind /proc/$$/ns/net /var/run/netns/child
[netns: CHILD]# ip netns list
child
</code></pre></div></div>
<p>And this time the command to list the network namespaces works without any errors. This means that we have associated the namespace with the ID <code class="language-plaintext highlighter-rouge">4026533490</code> to the file at <code class="language-plaintext highlighter-rouge">/var/run/netns/child</code> and the namespace is now persistent.</p>
<p>Now we need to find a way to get the host and the child network namespace to talk to each other. To do this, we will create a pair of virtual ethernet devices in the host network namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip link add veth0 type veth peer name veth1
</code></pre></div></div>
<p>In this command we create a virtual ethernet device named <code class="language-plaintext highlighter-rouge">veth0</code> while the other end of this pair device is called <code class="language-plaintext highlighter-rouge">veth1</code>. We can verify this by running:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip link | grep veth
35: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
36: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
</code></pre></div></div>
<p>At the moment, both of these devices exist in the host namespace. If we run <code class="language-plaintext highlighter-rouge">ip link</code> in the child network namespace, it will only show the <code class="language-plaintext highlighter-rouge">loopback</code> device, as was the case previously:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# ip link
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
</code></pre></div></div>
<p>So what can we do to make one of the veth devices show up in the child namespace? To do that, we will run the following command in the host network namespace, because that is where the veth devices currently exist:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip link set veth1 netns child
</code></pre></div></div>
<p>Here we are instructing the <code class="language-plaintext highlighter-rouge">veth1</code> network device to be assigned to the namespace <code class="language-plaintext highlighter-rouge">child</code>. Running <code class="language-plaintext highlighter-rouge">ip link</code> in the host namespace will no longer show the <code class="language-plaintext highlighter-rouge">veth1</code> device:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip link | grep veth
36: veth0@if35: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
</code></pre></div></div>
<p>While on the other hand, <code class="language-plaintext highlighter-rouge">veth1</code> now appears in the child network namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# ip link | grep veth
35: veth1@if36: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
</code></pre></div></div>
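<p>The <code class="language-plaintext highlighter-rouge">@if36</code> and <code class="language-plaintext highlighter-rouge">@if35</code> suffixes in the output above refer to the interface index of the peer device, which now lives in a different namespace: <code class="language-plaintext highlighter-rouge">veth0</code> has index 36 and points at 35, while <code class="language-plaintext highlighter-rouge">veth1</code> has index 35 and points at 36. Here is a minimal sketch in Go that pulls those numbers out of the start of such a line. The function name <code class="language-plaintext highlighter-rouge">parseVeth</code> is made up for this illustration, and it only handles the <code class="language-plaintext highlighter-rouge">name@ifN</code> form shown above:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVeth extracts the interface index, the name and the peer's index
// from the start of an `ip link` line such as "35: veth1@if36: ...".
// It is a hypothetical helper, not part of any library.
func parseVeth(line string) (int, string, int, error) {
	fields := strings.SplitN(line, ":", 3)
	if len(fields) != 3 {
		return 0, "", 0, fmt.Errorf("unexpected line: %q", line)
	}

	index, err := strconv.Atoi(strings.TrimSpace(fields[0]))
	if err != nil {
		return 0, "", 0, err
	}

	// The part after "@if" is the interface index of the peer device.
	name, peerRef, found := strings.Cut(strings.TrimSpace(fields[1]), "@if")
	if !found {
		return 0, "", 0, fmt.Errorf("no peer index in %q", line)
	}

	peer, err := strconv.Atoi(peerRef)
	if err != nil {
		return 0, "", 0, err
	}

	return index, name, peer, nil
}

func main() {
	fmt.Println(parseVeth("35: veth1@if36: ..."))
	fmt.Println(parseVeth("36: veth0@if35: ..."))
}
</code></pre></div></div>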
<p>We have two more steps before the two namespaces can talk to each other: assigning an IP address to each <code class="language-plaintext highlighter-rouge">veth</code> device and setting its state to up. So let’s do that quickly:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip address add dev veth0 local 10.16.8.1/24
[netns: HOST]# ip link set veth0 up
</code></pre></div></div>
<p>We can verify the results of the commands with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ip address | grep veth -A 5
36: veth0@if35: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN group default qlen 1000
link/ether 32:c7:79:c7:e2:e0 brd ff:ff:ff:ff:ff:ff link-netns child
inet 10.16.8.1/24 scope global veth0
valid_lft forever preferred_lft forever
</code></pre></div></div>
<p>And the same for the child namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# ip address add dev veth1 local 10.16.8.2/24
[netns: CHILD]# ip link set veth1 up
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# ip address | grep veth -A 5
35: veth1@if36: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 5a:62:dd:40:a6:f1 brd ff:ff:ff:ff:ff:ff link-netnsid 0
inet 10.16.8.2/24 scope global veth1
valid_lft forever preferred_lft forever
inet6 fe80::5862:ddff:fe40:a6f1/64 scope link
valid_lft forever preferred_lft forever
</code></pre></div></div>
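<p>Both addresses sit in the same <code class="language-plaintext highlighter-rouge">10.16.8.0/24</code> network, so each end treats the other as directly reachable over the veth pair. Here is a minimal sketch in Go that checks this for any two addresses in CIDR notation. The helper name <code class="language-plaintext highlighter-rouge">sameSubnet</code> is made up for this illustration:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"net/netip"
)

// sameSubnet reports whether two interface addresses in CIDR notation
// (for example "10.16.8.1/24") belong to the same network.
// It is a hypothetical helper written for this post.
func sameSubnet(a, b string) (bool, error) {
	pa, err := netip.ParsePrefix(a)
	if err != nil {
		return false, err
	}

	pb, err := netip.ParsePrefix(b)
	if err != nil {
		return false, err
	}

	// Masked() zeroes the host bits, leaving only the network part.
	return pa.Masked() == pb.Masked(), nil
}

func main() {
	fmt.Println(sameSubnet("10.16.8.1/24", "10.16.8.2/24"))
}
</code></pre></div></div>
<p>If the two devices had been given addresses from different networks, we would most likely have needed to add routes before the ping below could work.</p>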
<p>Finally, the two namespaces should be able to ping each other:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: HOST]# ping 10.16.8.2
PING 10.16.8.2 (10.16.8.2) 56(84) bytes of data.
64 bytes from 10.16.8.2: icmp_seq=1 ttl=64 time=0.086 ms
64 bytes from 10.16.8.2: icmp_seq=2 ttl=64 time=0.099 ms
64 bytes from 10.16.8.2: icmp_seq=3 ttl=64 time=0.100 ms
</code></pre></div></div>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>[netns: CHILD]# ping 10.16.8.1
PING 10.16.8.1 (10.16.8.1) 56(84) bytes of data.
64 bytes from 10.16.8.1: icmp_seq=1 ttl=64 time=0.057 ms
64 bytes from 10.16.8.1: icmp_seq=2 ttl=64 time=0.090 ms
64 bytes from 10.16.8.1: icmp_seq=3 ttl=64 time=0.118 ms
</code></pre></div></div>
<p>Voila! We did it! I hope that helps with understanding namespaces better. What we did above can be best described by this image of two children talking to each other with a string telephone made up of tin cans and a long string. In this image, the children can be thought of as the namespaces while the tin cans are analogous to the virtual ethernet devices we created and used for sending and receiving network traffic.</p>
<figure>
<img src="https://indradhanush.github.io/images/life-of-a-container/telephone.png" alt="form-network-namespace" />
</figure>
<h2 id="cgroups">cgroups</h2>
<p>Next up is cgroups. They help us control the <strong>amount</strong> of resources that a process can consume. The best examples of this are CPU and memory. And the best use case is to prevent a process from accidentally using all the available CPU or memory and choking the entire system. The cgroups reside under the <code class="language-plaintext highlighter-rouge">/sys/fs/cgroup</code> directory. Let us take a look at the contents:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ls /sys/fs/cgroup/ -lh
total 0
dr-xr-xr-x 5 root root 0 Feb 17 01:05 blkio
lrwxrwxrwx 1 root root 11 Feb 17 01:05 cpu -> cpu,cpuacct
lrwxrwxrwx 1 root root 11 Feb 17 01:05 cpuacct -> cpu,cpuacct
dr-xr-xr-x 5 root root 0 Feb 17 01:05 cpu,cpuacct
dr-xr-xr-x 2 root root 0 Feb 17 01:05 cpuset
dr-xr-xr-x 5 root root 0 Feb 17 01:05 devices
dr-xr-xr-x 2 root root 0 Feb 17 01:05 freezer
dr-xr-xr-x 2 root root 0 Feb 17 01:05 hugetlb
dr-xr-xr-x 9 root root 0 Feb 20 00:24 memory
lrwxrwxrwx 1 root root 16 Feb 17 01:05 net_cls -> net_cls,net_prio
dr-xr-xr-x 2 root root 0 Feb 17 01:05 net_cls,net_prio
lrwxrwxrwx 1 root root 16 Feb 17 01:05 net_prio -> net_cls,net_prio
dr-xr-xr-x 2 root root 0 Feb 17 01:05 perf_event
dr-xr-xr-x 5 root root 0 Feb 17 01:05 pids
dr-xr-xr-x 2 root root 0 Feb 17 01:05 rdma
dr-xr-xr-x 5 root root 0 Feb 17 01:05 systemd
dr-xr-xr-x 5 root root 0 Feb 17 01:06 unified
</code></pre></div></div>
<p>Each directory is a resource whose usage can be controlled. To create a new <code class="language-plaintext highlighter-rouge">cgroup</code>, we need to create a new directory inside one of these resources. For example, if we intended to create a new <code class="language-plaintext highlighter-rouge">cgroup</code> to control memory usage, we would create a new directory (the name is up to us) under the <code class="language-plaintext highlighter-rouge">/sys/fs/cgroup/memory</code> path. So let’s do that:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># mkdir /sys/fs/cgroup/memory/demo
</code></pre></div></div>
<p>And let us take a look inside this directory. If you’re thinking why bother because we just created the directory and it should be empty, read on:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># ls -lh /sys/fs/cgroup/memory/demo/
total 0
-rw-r--r-- 1 root root 0 Feb 24 12:29 cgroup.clone_children
--w--w--w- 1 root root 0 Feb 24 12:29 cgroup.event_control
-rw-r--r-- 1 root root 0 Feb 24 12:29 cgroup.procs
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.failcnt
--w------- 1 root root 0 Feb 24 12:29 memory.force_empty
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.failcnt
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.limit_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.max_usage_in_bytes
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.slabinfo
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.tcp.failcnt
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.tcp.limit_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.tcp.max_usage_in_bytes
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.tcp.usage_in_bytes
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.kmem.usage_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.limit_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.max_usage_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.memsw.failcnt
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.memsw.limit_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.memsw.max_usage_in_bytes
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.memsw.usage_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.move_charge_at_immigrate
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.numa_stat
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.oom_control
---------- 1 root root 0 Feb 24 12:29 memory.pressure_level
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.soft_limit_in_bytes
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.stat
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.swappiness
-r--r--r-- 1 root root 0 Feb 24 12:29 memory.usage_in_bytes
-rw-r--r-- 1 root root 0 Feb 24 12:29 memory.use_hierarchy
-rw-r--r-- 1 root root 0 Feb 24 12:29 notify_on_release
-rw-r--r-- 1 root root 0 Feb 24 12:29 tasks
</code></pre></div></div>
<p>It turns out that the operating system creates a whole bunch of files for every new directory. Let us take a look at one of the files:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># cat /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
9223372036854771712
</code></pre></div></div>
<p>The value in this file dictates the maximum memory that a process can use if it is part of this cgroup. Let us set this value to a much smaller number, say 4MB, but in bytes:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># echo 4000000 > /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
</code></pre></div></div>
<p>And let us look inside this file:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># cat /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
3997696
</code></pre></div></div>
<p>While this is not exactly what we wrote into the file, it is approximately 3.99 MB. My guess is that this has something to do with memory alignment, which is managed by the operating system. I haven’t researched this further at the moment. (If you know the answer, please let me know!)</p>
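<p>The numbers are at least consistent with the limit being rounded down to a whole number of memory pages. Here is a minimal sketch in Go of that arithmetic, assuming a page size of 4096 bytes (this is an assumption, you can check yours with <code class="language-plaintext highlighter-rouge">getconf PAGESIZE</code>). The function name <code class="language-plaintext highlighter-rouge">roundDownToPage</code> is made up for this illustration:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import "fmt"

// roundDownToPage rounds a byte count down to a multiple of pageSize.
// Integer division drops the partial page before multiplying back.
func roundDownToPage(bytes, pageSize uint64) uint64 {
	return bytes / pageSize * pageSize
}

func main() {
	// 4096 is an assumed page size, not something read from the system.
	fmt.Println(roundDownToPage(4000000, 4096)) // prints 3997696
}
</code></pre></div></div>
<p>4000000 bytes is 976 full pages plus a partial one, and 976 × 4096 is 3997696, which matches the value we read back from the file.</p>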
<p>Now let us start a new process in a new hostname namespace:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># unshare -u
</code></pre></div></div>
<p>This starts a new shell process. Let us try to run a command, like <code class="language-plaintext highlighter-rouge">wget</code> which I know needs more than 4MB memory to function:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># wget wikipedia.org
URL transformed to HTTPS due to an HSTS policy
--2020-02-24 12:36:58-- https://wikipedia.org/
Loaded CA certificate '/etc/ssl/certs/ca-certificates.crt'
Resolving wikipedia.org (wikipedia.org)... 103.102.166.224, 2001:df2:e500:ed1a::1
Connecting to wikipedia.org (wikipedia.org)|103.102.166.224|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://www.wikipedia.org/ [following]
--2020-02-24 12:36:58-- https://www.wikipedia.org/
Resolving www.wikipedia.org (www.wikipedia.org)... 103.102.166.224, 2001:df2:e500:ed1a::1
Connecting to www.wikipedia.org (www.wikipedia.org)|103.102.166.224|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 76776 (75K) [text/html]
Saving to: ‘index.html’
index.html 100%[============================================>] 74.98K 362KB/s in 0.2s
2020-02-24 12:36:59 (362 KB/s) - ‘index.html’ saved [76776/76776]
</code></pre></div></div>
<p>Notice that the command worked. That is because this process is still part of the default cgroup. To make it part of the new cgroup, we need to write the PID of this process to the <code class="language-plaintext highlighter-rouge">cgroup.procs</code> file:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># echo $$ > /sys/fs/cgroup/memory/demo/cgroup.procs
</code></pre></div></div>
<p>And let us look inside the contents of this file:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># cat /sys/fs/cgroup/memory/demo/cgroup.procs
468401
468464
</code></pre></div></div>
<p>There seem to be two entries here. The first entry is the PID of the shell process that we wrote to the file. The other is the PID of the <code class="language-plaintext highlighter-rouge">cat</code> process that we just ran. This is because all child processes are part of the same cgroup as the parent by default. And once a process terminates, its PID is automatically removed from the file. If we run the same command again, we will still find two entries, but the second one would be different:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># cat /sys/fs/cgroup/memory/demo/cgroup.procs
468401
468464
</code></pre></div></div>
<p>And now let us try to run the <code class="language-plaintext highlighter-rouge">wget</code> command once again:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># wget wikipedia.org
URL transformed to HTTPS due to an HSTS policy
--2020-02-24 12:44:26-- https://wikipedia.org/
Killed
</code></pre></div></div>
<p>The process gets killed immediately because it was trying to use more memory than the cgroup it is part of currently permits. Pretty neat I’d say.</p>
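<p>To double check which memory cgroup a process belongs to, we can also look at the <code class="language-plaintext highlighter-rouge">/proc/self/cgroup</code> file, where every line has the form <code class="language-plaintext highlighter-rouge">hierarchy-ID:controller-list:path</code>. Here is a minimal sketch in Go that picks out the path for the memory controller. The function name <code class="language-plaintext highlighter-rouge">memoryCgroup</code> and the sample text are made up for this illustration and were not captured from my machine:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strings"
)

// memoryCgroup scans text in the format of /proc/self/cgroup and
// returns the path of the cgroup attached to the memory controller.
// It is a hypothetical helper written for this post.
func memoryCgroup(procCgroup string) (string, bool) {
	for _, line := range strings.Split(procCgroup, "\n") {
		fields := strings.SplitN(line, ":", 3)
		if len(fields) != 3 {
			continue
		}

		// The second field is a comma separated list of controllers.
		for _, controller := range strings.Split(fields[1], ",") {
			if controller == "memory" {
				return fields[2], true
			}
		}
	}

	return "", false
}

func main() {
	// Illustrative sample only.
	sample := "5:cpu,cpuacct:/\n9:memory:/demo\n0::/user.slice"
	fmt.Println(memoryCgroup(sample))
}
</code></pre></div></div>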
<h2 id="addendum">Addendum</h2>
<p>So while <code class="language-plaintext highlighter-rouge">namespaces</code> and <code class="language-plaintext highlighter-rouge">cgroups</code> allow us to isolate and control the usage of resources and form the core of the abstraction popularly known as containers, there are two more concepts that are used for enhancing the isolation further:</p>
<ol>
<li>
<p>Capabilities: They limit the use of root privileges. Sometimes we need to run a process that needs elevated permissions to do one thing, but running it as root is a security risk because then the process can do pretty much anything with the system. To limit this, capabilities provide a way of assigning specific privileges to a process without giving it system wide root privileges. For example, if we need a program to be able to manage network interfaces and related operations, we can grant the program the capability <code class="language-plaintext highlighter-rouge">CAP_NET_ADMIN</code>.</p>
</li>
<li>
<p>Seccomp: It limits the use of syscalls. To tighten security even further, it can be used to block syscalls that could cause additional harm. For example, blocking the <code class="language-plaintext highlighter-rouge">kill</code> syscall will prevent processes from being able to terminate or send signals to other processes.</p>
</li>
</ol>
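<p>Under the hood, the capabilities of a process are a bitmask, which the kernel exposes in hexadecimal in fields like <code class="language-plaintext highlighter-rouge">CapEff</code> of <code class="language-plaintext highlighter-rouge">/proc/self/status</code>. Here is a minimal sketch in Go that checks whether <code class="language-plaintext highlighter-rouge">CAP_NET_ADMIN</code> (bit 12) is set in such a mask. The function name <code class="language-plaintext highlighter-rouge">hasCapability</code> and the mask in <code class="language-plaintext highlighter-rouge">main</code> are made up for this illustration:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"fmt"
	"strconv"
)

// capNetAdmin is the bit number of CAP_NET_ADMIN in linux/capability.h.
const capNetAdmin = 12

// hasCapability reports whether the given bit is set in a hexadecimal
// capability mask, such as the CapEff field in /proc/self/status.
// It is a hypothetical helper written for this post.
func hasCapability(hexMask string, bit uint) (bool, error) {
	mask, err := strconv.ParseUint(hexMask, 16, 64)
	if err != nil {
		return false, err
	}

	// Shift the wanted bit into the lowest position and test it.
	return (mask>>bit)&1 == 1, nil
}

func main() {
	// Illustrative mask with only bit 12 set.
	fmt.Println(hasCapability("0000000000001000", capNetAdmin))
}
</code></pre></div></div>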
<h2 id="recap">Recap</h2>
<p>So while <code class="language-plaintext highlighter-rouge">namespaces</code> allow us to isolate the <strong>type</strong> of resource, <code class="language-plaintext highlighter-rouge">cgroups</code> help us to control the <strong>amount</strong> of resource usage by a process. And <code class="language-plaintext highlighter-rouge">capabilities</code> limit the use of root privileges by breaking down operations into different types of capabilities. Finally <code class="language-plaintext highlighter-rouge">seccomp</code> helps to block processes from invoking unwanted syscalls. These concepts combined together form a container, which is a nicer abstraction than having to worry about all of these at the same time.</p>
<h2 id="one-final-note">One final note</h2>
<p>The diagram about <code class="language-plaintext highlighter-rouge">fork</code> earlier in this post is slightly incomplete. Here is a more complete diagram:</p>
<figure>
<img src="https://indradhanush.github.io/images/life-of-a-container/fork-waitpid.png" alt="form-waitpid" />
</figure>
<p>As noted earlier, <code class="language-plaintext highlighter-rouge">fork</code> returns the child’s PID to the parent process, and the parent uses this PID to “wait” for the child process to finish execution. This is done by the <code class="language-plaintext highlighter-rouge">waitpid</code> syscall. This is important to avoid zombie processes and is known as reaping. Once a child process has terminated, it is the responsibility of the parent to ensure any resources allocated for the child process are cleaned up. In a nutshell, <strong>this</strong> is the job of a container runtime or a container engine. It spawns new containers or child processes and ensures the resources are cleaned up once the container has terminated.</p>
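<p>Here is a minimal sketch of the same spawn and reap cycle in Go. Go programs usually don’t call <code class="language-plaintext highlighter-rouge">fork</code> and <code class="language-plaintext highlighter-rouge">waitpid</code> directly, so this uses the standard <code class="language-plaintext highlighter-rouge">os/exec</code> package, which creates the child and waits on it for us. The function name <code class="language-plaintext highlighter-rouge">runAndReap</code> and the <code class="language-plaintext highlighter-rouge">sh</code> command used as the child are made up for this illustration, and this is nowhere close to what a real container runtime does:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code>package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runAndReap starts a child process, waits for it to terminate and
// returns its exit code. It is a hypothetical helper for this post.
func runAndReap(name string, args ...string) (int, error) {
	cmd := exec.Command(name, args...)

	// Start creates the child process without blocking the parent.
	if err := cmd.Start(); err != nil {
		return -1, err
	}

	// Wait blocks until the child exits and collects its status, so
	// that it does not linger as a zombie. A non-zero exit code is
	// reported as *exec.ExitError, which is not a failure to wait.
	var exitErr *exec.ExitError
	if err := cmd.Wait(); err != nil && !errors.As(err, &exitErr) {
		return -1, err
	}

	return cmd.ProcessState.ExitCode(), nil
}

func main() {
	// The child here is only an illustrative stand-in for a container.
	code, err := runAndReap("sh", "-c", "exit 7")
	fmt.Println(code, err)
}
</code></pre></div></div>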
<h2 id="references">References</h2>
<ul>
<li>
<p>I found a lot of information about <code class="language-plaintext highlighter-rouge">namespaces</code> in this amazing seven part series on lwn.net: https://lwn.net/Articles/531114/.</p>
</li>
<li>
<p>Julia Evans’ post on What even is a container is a brilliant guide for grasping the concepts quickly: https://jvns.ca/blog/2016/10/10/what-even-is-a-container/</p>
</li>
<li>
<p>The man pages for <code class="language-plaintext highlighter-rouge">namespaces</code>, <code class="language-plaintext highlighter-rouge">unshare</code> and <code class="language-plaintext highlighter-rouge">cgroups</code> have been very helpful as well and are recommended reading.</p>
</li>
</ul>
<p>That is all for now. I hope this post was helpful and containers don’t feel like magic anymore.</p>
<p><br />
<sup>Footnotes:</sup>
<br />
<sup>[1]: Quoting the man page for <code class="language-plaintext highlighter-rouge">namespaces</code>: “In Linux 3.7 and earlier, these files were visible as hard links. Since Linux 3.8, they appear as symbolic links.”</sup>
<br />
<sup>[2]: https://en.wikipedia.org/wiki/Everything_is_a_file</sup></p>Indradhanush Guptaindradhanush.gupta@gmail.comUnderstanding the internals of a containerWhat I have learned being a programmer2019-11-10T00:00:00+00:002019-11-10T00:00:00+00:00https://indradhanush.github.io/blog/what-i-learned-being-a-programmer<p>I joined <a href="https://loodse.com">Loodse</a> last month and I wanted to reflect upon the journey of my programming career so far. And give some thought to what I’d like to improve or change going forward.</p>
<h2 id="be-a-human-first-programmer-second---">Be a human first, programmer second. 🤓 > 🤖</h2>
<p>I feel this is the most important lesson that I have learned and fittingly this should be the first thing on this post. If there’s one thing that I would like the readers to take away from this post it’s this. And I cannot emphasize this enough.</p>
<p><strong>Developing empathy for your coworkers is very important</strong>. If someone hasn’t been able to accomplish a great deal over the week, instead of following up for an update on the status of the project check with them about their well being. See if there’s something in their personal life bothering them and if there’s anything you can do to help them. And if it is indeed the task at hand that is slowing them down, offer to pair with them on it.</p>
<p><strong>Do not assume a person’s gender</strong>, especially if you only happen to know them by a “username” on the internet. Most of us are susceptible to using <code class="language-plaintext highlighter-rouge">he</code> when referring to a user on the internet because of our biases. Use <code class="language-plaintext highlighter-rouge">they/them</code> first. If you have a Twitter profile put your preferred pronoun on it. When referring to a group of people, prefer not to use <code class="language-plaintext highlighter-rouge">guys</code>. Use <code class="language-plaintext highlighter-rouge">folks</code> or <code class="language-plaintext highlighter-rouge">people</code>. I do not want to argue about why <code class="language-plaintext highlighter-rouge">guys</code> has been always used as a gender neutral term and thus should continue to be acceptable as such. Using <code class="language-plaintext highlighter-rouge">folks</code> doesn’t hurt anyone.</p>
<h2 id="code-reviews-">Code reviews 🖥</h2>
<p>Talking about code reviews, how does one find the right balance between striving for clean and perfect code vs shipping stuff? This is a question that has always bothered me. And while this continues to be an area of improvement for me personally, there are some pointers worth sharing that have helped me along the way.</p>
<p>I try not to block a merge / pull request just because it doesn’t match <em>my</em> idea of what perfect code looks like. If it’s an improvement on the existing state, I try to approve it. That is unless I spot any bugs. This is especially true for new contributors. Getting your first few PRs merged quickly gives you the feeling of getting things done and helps you settle down in a new project / team. And for seasoned contributors it keeps things moving. If there is something that can be improved on the PR but isn’t really a deal breaker, I leave a note about the possibility and suggest doing it in a follow up PR. This helps give them responsibility and ownership of that part of the codebase. In my experience the longer a PR lingers and goes back and forth between review and changes, the lower its chances of getting merged. Sometimes they never get merged. ¯_(ツ)_/¯</p>
<p>Use emojis in code reviews. The more the better. 🎉 👍.</p>
<p><strong>Code reviews shouldn’t only be about what’s wrong.</strong> They should also highlight the good parts. If you spotted a clever design pattern or good documentation or tests, leave a comment. Thank the author. Appreciate them. 🙌</p>
<p>If a PR takes more than one cycle of review, offer to pair on that PR and address the concerns together so that it can get merged. Code reviews needn’t always be done passively. They can be done in real time while the changes are being worked on. Admittedly, this is something that I have not been able to do as much as I would have wanted to, and I would like to fix that in my workflow going forward.</p>
<h2 id="programming-styles-">Programming styles 💻</h2>
<p>Don’t stick to a specific programming style religiously. Be flexible. For example, I have always preferred spaces over tabs but I recently learned that <a href="https://www.reddit.com/r/javascript/comments/c8drjo/nobody_talks_about_the_real_reason_to_use_tabs/">tabs are better for accessibility for programmers with visual impairment</a>. I intend to move to tabs instead of spaces for projects that allow it.</p>
<p>⏎ Never underestimate the power of a newline. Use a newline to separate code blocks with succeeding or preceding blocks of code. For example, look at the following function:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">validateDateTime</span><span class="p">(</span><span class="n">input</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">time</span><span class="o">.</span><span class="n">Time</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
<span class="n">val</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">time</span><span class="o">.</span><span class="n">Parse</span><span class="p">(</span><span class="s">"02/Jan/2006:15:04:05 -0700"</span><span class="p">,</span> <span class="n">input</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
<span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"Invalid DateTime: %q; error: %v"</span><span class="p">,</span> <span class="n">input</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>
<span class="k">return</span> <span class="o">&</span><span class="n">val</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And now look at the same function again:</p>
<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">func</span> <span class="n">validateDateTime</span><span class="p">(</span><span class="n">input</span> <span class="kt">string</span><span class="p">)</span> <span class="p">(</span><span class="o">*</span><span class="n">time</span><span class="o">.</span><span class="n">Time</span><span class="p">,</span> <span class="kt">error</span><span class="p">)</span> <span class="p">{</span>
<span class="n">val</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">time</span><span class="o">.</span><span class="n">Parse</span><span class="p">(</span><span class="s">"02/Jan/2006:15:04:05 -0700"</span><span class="p">,</span> <span class="n">input</span><span class="p">)</span>
<span class="k">if</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
<span class="k">return</span> <span class="no">nil</span><span class="p">,</span> <span class="n">fmt</span><span class="o">.</span><span class="n">Errorf</span><span class="p">(</span><span class="s">"Invalid DateTime: %q; error: %v"</span><span class="p">,</span> <span class="n">input</span><span class="p">,</span> <span class="n">err</span><span class="p">)</span>
<span class="p">}</span>

<span class="k">return</span> <span class="o">&</span><span class="n">val</span><span class="p">,</span> <span class="no">nil</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Does the newline before the return statement make it easier to scan the function at first glance? Now try to imagine a function with 50 lines or so. <strong>Remember that code is more often read than written.</strong> 📖 👓.</p>
<p>If a project uses a specific style, stick to it when contributing to it and do not try to enforce your own style on it.</p>
<h2 id="social-rules--">Social rules 👮 📢</h2>
<p>I continue to look back upon my time at the Recurse Center from more than two years ago as the most influential part of my programming career. Not only did it make me a better programmer, but also a better human being. The Recurse Center is governed by <a href="https://www.recurse.com/manual##sub-sec-social-rules">four social rules</a> that I feel should be a part of every community. Both in and outside of tech. The page linked does a great job of explaining them so I will not go into detail here. I have also written <a href="/tags/#recurse-center">quite a few posts</a> on my work and experience there.</p>
<p>When explaining a concept, I try to not use the words basically and actually. To me, needing to use these words indicates that I am unable to express my thoughts in simpler words (and probably don’t understand the concept well enough myself). In addition to this, using these words reads in my head as “what I’m trying to say is simple enough, so you should have already understood it”, even though that might not have been the intention.</p>
<h2 id="communication-">Communication 📞</h2>
<p>Almost every article that I’ve read on the topic of “what makes a good programmer/software engineer/<insert fancy tech job title>” seems to mention the importance of communication. However it wasn’t always obvious what one could do to improve upon that. Here’s what I feel has helped me.</p>
<p><strong>When communicating over text, use emojis as much as possible.</strong> The more the better. For example, use the 📢 emoji for an announcement. I don’t always feel like I need to use <code class="language-plaintext highlighter-rouge">@here</code> or <code class="language-plaintext highlighter-rouge">@channel</code> to convey a message but still feel it is important for my team mates to know. I just don’t want to intrude on their thoughts with a notification right away. Using an emoji helps draw attention to it from the wall of text that might have already accumulated on a channel. For a question, I use ❓.</p>
<p>Use 🙂 for ending statements that others may not agree with and 🤔 for something you do not understand. Text is not able to convey emotions. Sometimes words may be read in a different tone than the one on your mind when they were originally written. An emoji is worth a thousand words. 😉</p>
<p>I try to not use animated GIFs as emojis. Although they can be fun in the right context, I find them distracting for the most part. Something moving in the corner of my vision keeps vying for my attention.</p>
<p><strong>Communication is more important than your programming skills.</strong> If I have a hard time understanding your question / bug report I am more likely to skip it or file it away as <code class="language-plaintext highlighter-rouge">TODO</code> for later. Spoiler alert: That is a place where no issue gets looked at again. As there is always new work on the table. 😅</p>
<p>Make your sentences shorter and use paragraphs to separate contexts. Use formatting when available. For example, use backticks for code / technical buzz words. Bold for highlighting important stuff.</p>
<p>If English is not your native language, spend some time on learning comprehension skills in English as that is the most commonly used language for programming. It is easier when others can interpret your thoughts based on your written words without having to read it a few times to understand them. 💭</p>
<h2 id="feedback-">Feedback ⤴</h2>
<p>Getting timely feedback is important. But it is also insanely hard. The sooner you get feedback the better. Ask what your colleagues felt about that last feature you worked on. Was there anything that you could have done differently in their view? I’ve elaborated more on this in the past <a href="/blog/how-to-ask-for-feedback/">here</a>. <strong>When asking for feedback, I try to say explicitly that people should not hold back critical feedback for fear of offending me.</strong> It works most of the time. And the critical feedback has been very helpful to me personally.</p>
<h2 id="personal-productivity-">Personal productivity 🚀</h2>
<p><strong>Learn to touch type</strong>. I often find myself getting up from my desk and going to another room to get something. In the meantime I’ve forgotten why I got up from my desk in the first place. Trying to find where that character is on the keyboard interrupts my thought process. And I am better off with my keyboard not interrupting me as well. Additionally, if I’m using the keyboard for eight hours a day, five days a week I would rather be able to use all my ten fingers than just four or five.</p>
<p>It can be hard to quit your old typing habit cold turkey, especially if you already have a day job. One approach that might work is correcting one or two keys at a time. It would still be a struggle but at least your work won’t come to a standstill. Start with the home row (<code class="language-plaintext highlighter-rouge">a,s,d,f,g</code> and <code class="language-plaintext highlighter-rouge">h,j,k,l,;</code> on QWERTY layouts) and try to get the fingers right for these keys. This is going to be your frame of reference and will help to get the other keys right sooner. Do a few simple typing exercises with those keys when you’re waiting for your code to compile (or rather, for your cluster to provision 😝). And do not stress if you don’t get most of them with the right fingers initially. Keep adding a couple of keys every few days and eventually you will have learned to touch type. Also don’t worry about the number row initially. Get to it once you feel you have a good command over the letters. I used this approach while I was at my first internship back in 2013.</p>
<p>You don’t need to have typing speeds of a live note taker. 40-50 wpm is a good point for most programming tasks while 90-110 wpm is awesome. You will get there eventually. If you’ve gone down this path, try not to quit and go back to your older way of typing. That will only slow you down. If you’re a student and not already touch typing, I highly recommend taking time out to learn this skill.</p>
<p><strong>Learn the keyboard shortcuts of the tools you use.</strong> For example, your editor, browser or terminal. Each time you need to take your hand off the keyboard and look for the mouse to do something, it slows you down. I’ve found that <a href="https://addons.mozilla.org/en-US/firefox/addon/vimium-ff/">vimium</a> has helped me to replace most mouse actions on the browser with keyboard shortcuts.</p>
<p>📧 I used to keep my email always open as a pinned tab and had the sync turned on on my phone. This used to make me keep checking for new email and get interrupted whenever there was a new one. I’ve now turned the sync off on my phone and open email on my browser only when I need it or at the start and end of my day.</p>
<p>Don’t feel shy about going offline / turning on DND from your company’s chat app to focus on work. Try to schedule no meeting days at work and schedule as many of your meetings back to back as possible.</p>
<h2 id="red-flags---">Red flags 🔴 ⚠</h2>
<p>While being productive is important, it is also important to look out for signs that affect productivity. It is very important to look after my own interests because I am the only person responsible for my career and my life. Not my manager or my colleagues. On that note, here are a few red flags that I keep a lookout for.</p>
<p><strong>Time spent on Twitter / Facebook / Reddit is inversely proportional to sense of fulfillment at my day job.</strong> That is, the more I’ve felt detached from my work, the more I’ve found myself aimlessly scrolling down my Twitter timeline. In hindsight, both the times in my career when I’ve felt I needed to change my job, my usage of social media has increased significantly.</p>
<p>If you find yourself with a lack of ideas to share as blog posts or talks at meetups, there’s a chance that the technical tasks you have been working on are not helping you improve as a programmer. They’re not making you think and you’re working on things that are easy and require less thought. Please note that this does not imply that you must publish blog posts or present technical talks at meetups or conferences. But you should at least feel that you have an idea which, with further research, could become a blog post or a talk.</p>
<p>Try to change your work setting from time to time, especially if you’re working remotely. While I prefer to work from my home office on most days, I feel spending too much time in there has a negative effect on me. Stepping out to occasionally work from a cafe helps my productivity. Also this is more fitting on a blog post on remote work, so I will expand on this more if I write one in the future.</p>
<p><strong>To me writing code is very important.</strong> I do not envision myself in a role where it isn’t a part of my job description. When not writing much code for weeks on end, I tend to be cranky and less motivated. And at times like these, even the smallest of bash scripts to automate a tedious task makes me feel good again. For example during one such phase, I wrote a tiny bash script that automated the cleanup of some directories on a bunch of servers based on some filters. It took me under an hour or even less to get it working, but I remember feeling great the rest of the day and the day after. I’m a programmer after all. This does not imply that all I want to do at my job is write code, but I prefer it to be 50-60% of my responsibilities.</p>
<p><strong>The best environment for me to grow is where I can be the mentor and the mentee simultaneously.</strong> For the same or different people. That is the ideal scenario, but if I am in a room where I don’t get to fulfill either of the roles, that is not a place I want to be. There’s a beautiful quote that I’ve come to admire:</p>
<blockquote>
<p>In learning you will teach, and in teaching you will learn.</p>
<p>— Phil Collins</p>
</blockquote>
<p><strong>Go read</strong> <a href="https://amzn.to/2PZZ4xk">The Psychopath Code</a>. Reading it helped me a lot in being able to look out for myself and spotting early warning signs. The book helped me to spot patterns where someone might intend to or already be using me for their personal gains. Be it a job interview or otherwise. If something feels fishy or unusual to you, talk about it to someone you trust. And then trust your instincts.</p>
<h2 id="personal-improvement-">Personal improvement ⏫</h2>
<p>I just started a new job and want to create and maintain a <a href="https://jvns.ca/blog/brag-documents/">brag document</a> to track my progress at work. I do not know if my new colleagues maintain one already, but I intend to find out and encourage others to start one if they haven’t already.</p>
<p>So far I’ve had good exposure to a wide variety of tech stacks, from backend to infrastructure. But I would like to go deeper into one or two subject areas. The ones that immediately come to mind are <code class="language-plaintext highlighter-rouge">TLS</code> and <code class="language-plaintext highlighter-rouge">iptables</code>. My current project at work has helped me understand TLS better than I did a few weeks back and I hope I get an opportunity to dig deeper into computer networking.</p>
<h2 id="conclusion-">Conclusion 🏁</h2>
<p>Take everything you see on the internet with a grain of salt. Including this blog post. 😉</p>
<p>There is always another framework / programming language / tool / algorithm to understand. It is a never ending process. Try not to get stressed about the things you’re missing out on and focus on learning one thing at a time. As long as you’re improving, you are on the right track. Everyone has their own pace. And that’s the way it should be.</p>
<p>It has been a good journey over the last five years so far and I am grateful for the opportunities I have had and the people I met along the way. I look forward to the next five.</p>
<p>Is there anything else that comes to your mind? Let me know in the comments below.</p>Indradhanush Guptaindradhanush.gupta@gmail.comLessons learned as a programmer over the last five yearsWhitespaces and strings in Bash2018-09-02T00:00:00+00:002018-09-02T00:00:00+00:00https://indradhanush.github.io/blog/whitespaces-and-strings-in-bash<p>Last week at work I was working on a bash script, part of which needed
to get the status of a Kubernetes node. Specifically I was running
this command:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ kubectl get nodes node-0
</code></pre></div></div>
<p>And if everything looked well, the output would be something similar
to:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>NAME      STATUS    ROLES     AGE       VERSION
node-0    Ready     <none>    22h       v1.11.1
</code></pre></div></div>
<p>As part of the script I was using <code class="language-plaintext highlighter-rouge">cut</code> to extract the value of the second column by doing:</p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><table class="rouge-table"><tbody><tr><td class="gutter gl"><pre class="lineno">1
2
3
4
5
6
</pre></td><td class="code"><pre><span class="c">#!/bin/bash</span>
<span class="nv">result</span><span class="o">=</span><span class="si">$(</span>kubectl get nodes node-0 | <span class="nb">tail</span> <span class="nt">-n</span> 1<span class="si">)</span>
<span class="nv">state</span><span class="o">=</span><span class="si">$(</span><span class="nb">echo</span> <span class="nv">$result</span> | <span class="nb">cut</span> <span class="nt">-d</span> <span class="s1">' '</span> <span class="nt">-f2</span><span class="si">)</span>
<span class="nb">echo</span> <span class="nv">$state</span>
</pre></td></tr></tbody></table></code></pre></figure>
<p>Let’s break that down:</p>
<p>On line three, I pipe the output of <code class="language-plaintext highlighter-rouge">kubectl get nodes node-0</code> to
<code class="language-plaintext highlighter-rouge">tail -n 1</code> and store only the last line of the output (the second line here), since the first
line contains the column headings and is not useful in this case.</p>
<p>On line four, I use the <code class="language-plaintext highlighter-rouge">cut</code> command to split the words by the
character <code class="language-plaintext highlighter-rouge">space</code>, indicated by the <code class="language-plaintext highlighter-rouge">-d ' '</code> flag and extract the
value of the second column, indicated by the <code class="language-plaintext highlighter-rouge">-f2</code> flag. When I ran
the script I saw the output as expected:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Ready
</code></pre></div></div>
<p>However, <a href="https://www.shellcheck.net/">shellcheck</a> complained about
the line number four of the script (it complained about the last line
as well, but that’s not relevant to this post):</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In test.sh line 4:
state=$(echo $result | cut -d ' ' -f2)
^-- SC2086: Double quote to prevent globbing and word splitting.
</code></pre></div></div>
<h2 id="behold-the-mighty-bug-">Behold the mighty bug 🐛</h2>
<p>So I fixed the warning by wrapping the variable in double quotes:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>state=$(echo "$result" | cut -d ' ' -f2)
</code></pre></div></div>
<p>But to my surprise, all I had now was an empty string as output. To
debug this I printed the value of <code class="language-plaintext highlighter-rouge">$result</code> by:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>echo $result
</code></pre></div></div>
<p>Which gave the output:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>node-0 Ready <none> 3d v1.11.1
</code></pre></div></div>
<h2 id="debugging-">Debugging 🔬</h2>
<p>I copied this and piped it to <code class="language-plaintext highlighter-rouge">cut -d ' ' -f2</code> directly on the
bash shell. And I got the expected output – <code class="language-plaintext highlighter-rouge">Ready</code>.</p>
<p>I copied this and ran the code directly on the shell this time:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ result="node-0 Ready <none> 3d v1.11.1"
$ state=$(echo "${result}" | cut -d ' ' -f2)
$ echo $state
Ready
</code></pre></div></div>
<p>I was able to get the desired output this time. I asked in my
company’s Slack and eventually found out that if I ran the following
in the shell directly:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ result=$(kubectl get nodes node-0 | tail -n 1)
$ echo "$result"
node-0    Ready     <none>    3d        v1.11.1
</code></pre></div></div>
<p>Versus, if I ran:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ echo $result
node-0 Ready <none> 3d v1.11.1
</code></pre></div></div>
<p>Notice the difference yet? The difference is the presence of double
quotes around <code class="language-plaintext highlighter-rouge">$result</code>. When we wrap a variable in double quotes,
we force bash to print its value as is, extra spaces included. To quote the Linux
Documentation Project:</p>
<blockquote>
<p>Using double quotes the literal value of all characters enclosed is
preserved, except for the dollar sign, the backticks (backward
single quotes, ``) and the backslash.</p>
</blockquote>
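<p>To see the effect in isolation, here is a minimal sketch that does not need a cluster. The sample line is a hypothetical stand-in for the <code class="language-plaintext highlighter-rouge">kubectl</code> output, with three spaces between columns chosen only for illustration:</p>

```shell
#!/bin/bash
# Hypothetical sample line; the real kubectl output pads columns similarly.
result="node-0   Ready   <none>   3d   v1.11.1"

# Unquoted: bash splits the value into words and echo joins them
# with single spaces, so the padding disappears.
unquoted=$(echo $result)

# Quoted: the value reaches echo as a single argument, padding intact.
quoted=$(echo "$result")

echo "unquoted: ${unquoted}"
echo "quoted:   ${quoted}"
```

<p>The unquoted form is what made my copy-pasted experiments look fine: the spacing had already been collapsed by the time I copied it.</p>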
<h2 id="lessons-learned-">Lessons learned 📖</h2>
<p>I learned two things out of this:</p>
<ol>
<li>The output of <code class="language-plaintext highlighter-rouge">kubectl get nodes</code> is not separated by a single space character.</li>
<li>Bash splits an unquoted variable into words, so <code class="language-plaintext highlighter-rouge">echo $result</code> prints them separated by
single spaces and the extra spacing is lost unless you quote the variable.</li>
</ol>
<p>Now that the problem had been identified, the solution was easy – I
needed to squeeze out the extra spacing between the columns before
using <code class="language-plaintext highlighter-rouge">cut</code> on it:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>state=$(echo "${result}" | tr -s "[:space:]" | cut -d ' ' -f2)
</code></pre></div></div>
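<p>A small sketch of why the plain <code class="language-plaintext highlighter-rouge">cut</code> came back empty and how squeezing helps. The sample line is again hypothetical, and the <code class="language-plaintext highlighter-rouge">awk</code> variant is an alternative I am adding here for comparison, not what my script used:</p>

```shell
#!/bin/bash
# Hypothetical sample line with three spaces between columns.
result="node-0   Ready   <none>   3d   v1.11.1"

# cut treats every single space as a delimiter, so field 2 is the
# empty string between the first and the second space.
with_cut=$(echo "${result}" | cut -d ' ' -f2)

# Squeezing repeated whitespace first makes field 2 the STATUS column.
with_tr=$(echo "${result}" | tr -s '[:space:]' | cut -d ' ' -f2)

# awk splits on runs of whitespace by default, so no squeezing is needed.
with_awk=$(echo "${result}" | awk '{print $2}')

echo "cut:      '${with_cut}'"
echo "tr + cut: '${with_tr}'"
echo "awk:      '${with_awk}'"
```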
<p>Another important lesson that I learned from this exercise is that
when manipulating strings plain text format isn’t the best
option. <code class="language-plaintext highlighter-rouge">kubectl</code> can output in both <code class="language-plaintext highlighter-rouge">yaml</code> and <code class="language-plaintext highlighter-rouge">json</code>. I prefer
<code class="language-plaintext highlighter-rouge">json</code> as it’s easier to parse by the naked eye and I can use
<a href="https://github.com/stedolan/jq">jq</a> to manipulate <code class="language-plaintext highlighter-rouge">json</code> output
programmatically. As a result I was able to minimize my script to a
single line:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>state=$(kubectl get nodes node-0 -o json | jq ".status.conditions[-1]")
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">-o json</code> flag instructs <code class="language-plaintext highlighter-rouge">kubectl</code> to print the response
as <code class="language-plaintext highlighter-rouge">json</code> and we pipe the result to <code class="language-plaintext highlighter-rouge">jq</code> for further processing, where
the filters are specific to Kubernetes.</p>
<h2 id="addendum-">Addendum 📢</h2>
<ol>
<li>
<p>It’s generally considered good practice to wrap bash variable names
in <code class="language-plaintext highlighter-rouge">{}</code>. So instead of <code class="language-plaintext highlighter-rouge">echo "$result"</code>, using <code class="language-plaintext highlighter-rouge">echo "${result}"</code>
makes it unambiguous where the variable name ends and can help avoid bugs involving string expansion in
bash. The double quotes are still needed to preserve whitespace.</p>
</li>
<li>
<p>If you’re into Kubernetes, <code class="language-plaintext highlighter-rouge">kubectl</code> also supports the <code class="language-plaintext highlighter-rouge">-o
jsonpath</code> output format, which lets you directly specify the json filter and returns
the minimized output instead of the entire json string. I tried using
<code class="language-plaintext highlighter-rouge">jsonpath</code> but in my case I noticed that negative indexing raised a
<code class="language-plaintext highlighter-rouge">panic</code>. I haven’t yet filed an issue about this on the Kubernetes
project’s issue tracker and intend to do so once I’ve verified this
with the latest Kubernetes components.</p>
</li>
</ol>Indradhanush Guptaindradhanush.gupta@gmail.comYet another bug when manipulating strings in bash and lessons learned from this exercise.A Helm debugging story2018-04-19T00:00:00+00:002018-04-19T00:00:00+00:00https://indradhanush.github.io/blog/a-helm-debugging-story<p>I’ve been working on a Kubernetes service broker at work, specifically
the <a href="https://github.com/kinvolk/habitat-service-broker/">Habitat service
broker</a>. To put it
mildly without diving deep into the details, the Habitat service
broker is an
<a href="https://github.com/openservicebrokerapi/servicebroker">OSB</a> compliant
piece of software whose job is to run apps built with
<a href="https://www.habitat.sh/">Habitat</a> in a
<a href="https://kubernetes.io">Kubernetes</a> cluster. The service broker is
deployed via <a href="https://github.com/kubernetes/helm/">Helm</a>, which is a
package manager for Kubernetes. What each of these is built for is
not as important as the debugging story that I’m about to
describe. Let’s reserve that for a future post.</p>
<p>I had reached a checkpoint of sorts where I was able to successfully
create and apply a custom configuration on a redis instance that was
provisioned via the broker in the Kubernetes cluster. Once the <a href="https://github.com/kinvolk/habitat-service-broker/pull/7/">pull
request</a>
was merged, I started working on another feature that was to undo this
custom configuration. This required me to tinker with the
<a href="https://kubernetes.io/docs/admin/authorization/rbac/">RBAC</a> rules,
which is an authorization mode available in Kubernetes. I had to add
new rules for accessing additional API endpoints. After writing the
code for this and trying to apply the change on my cluster for the first
time, I started seeing an error that looked like this:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Error: release habitat-service-broker failed: clusterroles.rbac.authorization.k8s.io "habitat-service-broker-habitat-service-broker" already exists
make: *** [Makefile:35: deploy-helm] Error 1
</code></pre></div></div>
<p>Even if someone has no experience in Kubernetes, it should be
relatively straightforward to understand this error at a higher level,
which is that my code is trying to create a resource named
“habitat-service-broker-habitat-service-broker” and failed since it
already exists in the cluster. To elaborate on the error a little, I
am trying to create a new <code class="language-plaintext highlighter-rouge">clusterrole</code> object via Helm, our protagonist
of this story.</p>
<p>My first instinct was that there was something stale lying around on
my Kubernetes cluster from my previous testing and starting on a fresh
cluster would fix it. No surprises there that starting a new cluster
did not fix the bug. I was running my cluster via
<a href="https://github.com/kubernetes/minikube">minikube</a>, which is a tool to
run Kubernetes clusters locally. After a lot of tinkering and without
any success I shared a set of instructions to reproduce the bug with
my colleagues. But surprisingly the code was working fine for them
without any signs of the bug. At this point I thought that there was
something wrong with my minikube setup or that it had some other
configurations that I didn’t know about.</p>
<p>I decided to use <a href="https://github.com/kinvolk/kube-spawn">kube-spawn</a>
which is another tool to quickly setup a Kubernetes cluster but with
some differences than minikube (you can read more about it
<a href="https://kinvolk.io/blog/2017/08/introducing-kube-spawn-a-tool-to-create-local-multi-node-kubernetes-clusters/">here</a>). However
this bug seemed to be omnipresent and it showed up on the kube-spawn
cluster as well. By this point I had spent more than two days on this
bug and was beginning to think that this was something sillier than
expected.</p>
<p>Helm, as mentioned previously, is a package manager for Kubernetes
and uses “charts” for defining services and their properties. The RBAC
related changes are also part of the charts and are defined using
<code class="language-plaintext highlighter-rouge">yaml</code> files.</p>
<p>Not knowing where to look next, I asked my colleagues Iago and Lorenzo
to pair with me on this and I set out to reproduce the bug from
scratch. During the pairing session, they asked to look at the file
that contained the <code class="language-plaintext highlighter-rouge">clusterrole</code> definition.</p>
<p>Here’s what the corresponding directory looked like:<sup>[1]</sup></p>
<figure class="highlight"><pre><code class="language-shell" data-lang="shell"><span class="nv">$ </span>tree <span class="nv">$PROJECT_ROOT</span>/charts/habitat-service-broker/templates/
charts/habitat-service-broker/templates/
├── broker-deployment.yaml
├── broker-service.yaml
├── broker.yaml
├── clusterrolebinding.yaml
├── <span class="c">#clusterrole.yaml#</span>
├── clusterrole.yaml
├── _helpers.tpl
└── serviceaccount.yaml
0 directories, 8 files</code></pre></figure>
<p>Notice the odd file starting and ending with a <code class="language-plaintext highlighter-rouge">#</code>? That’s an Emacs
auto-save file, a leftover copy of the buffer for the original file named <code class="language-plaintext highlighter-rouge">clusterrole.yaml</code>. It turns out
that Helm picks up all files that “look like” yaml from the charts
directory. As a result it was first applying the contents of
<code class="language-plaintext highlighter-rouge">clusterrole.yaml</code> and then those of <code class="language-plaintext highlighter-rouge">#clusterrole.yaml#</code>, or the
other way round. Either way, the command would fail when it got to the
second file. Emacs auto-save files normally get deleted when a
file is saved, and this one might have been left around from a crashed
Emacs session. Deleting it fixed the bug.</p>
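<p>In hindsight, a quick check for editor leftovers would have caught this in seconds. The following is only a sketch: it builds a throwaway directory with hypothetical file names to show the idea, and the patterns cover Emacs auto-save files and <code class="language-plaintext highlighter-rouge">~</code> backup files only.</p>

```shell
#!/bin/bash
# Throwaway directory standing in for a chart's templates directory.
# The file names are hypothetical.
templates_dir=$(mktemp -d)
touch "${templates_dir}/clusterrole.yaml" \
      "${templates_dir}/#clusterrole.yaml#" \
      "${templates_dir}/broker.yaml~"

# '#*#' matches Emacs auto-save files and '*~' matches backup files.
stray=$(find "${templates_dir}" -type f \( -name '#*#' -o -name '*~' \))

echo "Stray editor files:"
echo "${stray}"

# Clean up the throwaway directory.
rm -rf "${templates_dir}"
```

<p>Against a real chart, the same <code class="language-plaintext highlighter-rouge">find</code> expression could be pointed at the templates directory before deploying.</p>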
<p>I felt a mixture of emotions at this point: anger, stupidity,
agitation, happiness and relief. It took me some time to get back to my
original problem statement though.</p>
<p><a href="https://twitter.com/indradhanush92/status/986186213001490434">
<img src="https://indradhanush.github.io/images/a-helm-debugging-story/tweet-debugging.png" />
</a></p>
<p>I was stuck with this for more than two days and was eventually able
to fix it over a ten minute video call thanks to my colleagues. Even
though I found Helm’s approach to dealing with template files a bit
silly, I learned (and relearned) the following things out of this
experience:</p>
<ul>
<li>I should configure my editor to store buffers outside my working
directory.</li>
<li>An extra pair of eyes is always a great idea.</li>
<li>If you’re stuck seeing the same error for a long enough time, ask
for help sooner rather than later. Although a different error from
your last one can be thought of as progress. This is especially
relevant if you’re worried about asking for help “too much”.</li>
<li>If an error cannot be reproduced on someone else’s machine it
becomes much harder for others to help you unless you invite them
over to your machine.</li>
<li>If a bug sounds like a simple oversight somewhere rather than
something obscure and non trivial, it probably is.</li>
</ul>
<p>That’s all for now, thanks for reading and I’ll let you enjoy this
xkcd comic for now:</p>
<p><img src="https://imgs.xkcd.com/comics/debugging.png" /></p>
<p><br />
<sup>Footnotes:</sup>
<br />
<sup>[1]: I used Emacs for navigating the directory, but I used tree
here for better representation.</sup></p>Indradhanush Guptaindradhanush.gupta@gmail.comA description about an annoying bug and the lessons learned from itLeaves2018-03-31T00:00:00+00:002018-03-31T00:00:00+00:00https://indradhanush.github.io/paintings/leaves<figure>
<img src="/images/paintings/leaves.jpg" alt="image" />
</figure>Indradhanush Guptaindradhanush.gupta@gmail.comA painting of leaves.Django Girls - Bangalore 20172017-11-26T00:00:00+00:002017-11-26T00:00:00+00:00https://indradhanush.github.io/blog/django-girls-bangalore<p>Earlier this month on 11th November I participated in Django Girls,
Bangalore as a coach. <a href="https://djangogirls.org/">Django Girls</a> is a
one day workshop aimed at encouraging more women to pick up
programming as a career option by spending the entire day going
through the basics of Python and Django and eventually building a
minimal web app for themselves. I was a coach in the 2015 edition as
well and had had an enjoyable experience. However, at the same time I
thought I could have done a better job of being a mentor. The few key
points that I had pledged to myself to improve upon next time were:</p>
<ul>
<li>
<p>I had shown up a little late at the event in 2015 and my team was
already underway with another coach. I was introduced to my team
when I arrived at the venue and we started with the workshop right
away. This meant that I was unable to “break the ice” the way I
prefer by telling my team that I was no different than them and I
often got stuck with programming as well. As a result it is always
okay to ask questions and as many times as it is necessary to
understand a concept. Besides, if a participant is unable to grasp a
concept after having explained it a few times then the problem
surely lies in my teaching methods in not being able to use simple
to understand instructions. And apart from that, one should always
respect others’ time. I made amends this time by ensuring that I showed up
early at the event which not only meant that I was able to say hello
to my team but also meet the other coaches. And we started right on
time at 10 in the morning.</p>
</li>
<li>
<p>I felt that I took longer than the other coaches to go through the
Python basics at the workshop in 2015 and consequently felt a bit
pressed for time towards the end of the workshop to cover all the
concepts. I had also spent longer than what I intended on the CSS
part of the workshop. This time knowing better, I informed my
participants that I would skip the CSS bits since I felt that the
other core concepts about building a web app were more important and
harder to pick up on their own. However I was unable to do much
better than last time on the Python basics part of the workshop
this time as well. But I’ve always felt that it is important to
understand the basics of Python well before digging into Django. I
had once tried teaching Python to someone while teaching Django at
the same time and it did not go very well. While I do not have a
better idea at the moment I am going to think about this over time and
try to come up with a better strategy.</p>
</li>
</ul>
<p>Going through the basics of Python, understanding the concepts behind
how the web works followed by those that power a web app and all in a
single day is quite a tall order for anyone. I feel that (and I’m
probably not alone) it might serve its purpose better if Django Girls
could be split across two days, one for Python and the other for
Django. But that also presents its own challenges like finding a venue
and coaches who can spare that amount of time at a stretch. I will not
go deeper into this as that is a topic for another blog post itself.</p>
<p>Another set of challenges that come with becoming an efficient coach
are that:</p>
<ul>
<li>
<p>One should always be very patient and open to answering as many
questions as required. There’s more on this in the
<a href="http://coach.djangogirls.org/">Coaching manual</a>. While reading, I
was pleasantly surprised to find the
<a href="https://www.recurse.com/manual#sub-sec-social-rules">social rules of the Recurse Center</a>
adopted into it. I attended the Recurse Center earlier this year and
it is by far my favourite programming community and the <a href="/tags/#recurse-center">best
experience of my life</a>.</p>
</li>
<li>
<p>Not all the participants are on the same level. Some might have more
programming experience than others and some might be already
familiar with a few concepts being taught in the workshop. As a
result, being a coach it is important to ask everyone about their
programming background instead of assuming that they must know a
particular concept already. One strategy that works best is to
describe all the concepts from a beginner’s perspective. The ones
who do not know about it get to learn, while it works as a refresher
for the ones already familiar with it.</p>
</li>
<li>
<p>A participant on my team showed up late on each of the 2015 and 2017
events, which is probably my penance for showing up late myself at
the 2015 event. It is always easier if everyone is on the same page
but with a participant showing up late I felt like I was juggling
tumblers while standing on a football and trying to balance a stool
with one leg at the same time. It was surely stressful for me but I
felt that it also hampered the other participants’ progress who were
already there on time. The strategy that I applied to manage this
situation was to ask the person who showed up late to start reading
a particular section from the
<a href="http://tutorial.djangogirls.org/">tutorial</a> with a very small and
focussed topic. During the time that she was reading, I switched
over to explaining other concepts to the remaining members of my
team and once done asked them to work on a small and isolated task
related to what I had just explained. I switched my attention back
to the other participant who was already reading up on something
else. My end goal was to get her up to speed with the others as soon
as possible, which meant I had to ask her to skip a few parts in
between. There were times when I found myself explaining critical
concepts like what is an <code class="language-plaintext highlighter-rouge">HTTP</code> request and the difference between a
<code class="language-plaintext highlighter-rouge">GET</code> request vs a <code class="language-plaintext highlighter-rouge">POST</code> request or how a database table maps over
to a Django model. At such a juncture I would ask her to pause her
work and listen along instead. I rinsed and I repeated.</p>
</li>
</ul>
<p>At the end of the workshop I asked for the attention of all the
participants and told them to form groups among themselves to stay
connected with each other. The reasoning being that it is easy to
slack off after the workshop and if one does not follow up and build
upon what they spent an entire day learning then the entire workshop is
rendered useless. I told them that it is easier to continue working on
this in a group rather than alone, since then they would be
accountable to each other. This thought did not occur to me during the
2015 event and I see it as a lost opportunity. Someone suggested
creating a Facebook group and adding them or creating a mailing list. However
I insisted that they exchange contacts at the event itself because
tomorrow never comes and it’s hard to regain the lost momentum later
on.</p>
<p>Finally after the long day, I had dinner with the rest of the coaches
and a few participants. As a result of the workshop, I got to meet a
few people whom I only knew by Twitter handles and IRC
nicknames. Organizing this event was no mean feat, and special thanks
to the organizers for selflessly putting in their time. It was great
to have been able to give back to the community and I look forward to
another one soon!</p>
<p>Indradhanush Gupta (indradhanush.gupta@gmail.com). Thoughts from my second Django Girls event as coach.</p>
<p><strong>Job search</strong> (2017-10-14, https://indradhanush.github.io/blog/job-search)</p>
<h2 id="the-story-so-far">The story so far</h2>
<p>It has been a little longer than two months
<a href="/blog/recurse-center-never-graduate/">since I finished my batch</a> at
the <a href="https://www.recurse.com">Recurse Center</a>. Going back to school
for a Master’s degree has never been on my mind which meant that
working full time once again was on the agenda. Naturally most of my
time has been spent searching for a job. The last time I had to look
for a job was back in 2014 which was also the first time. Instamojo
was the only company I interviewed with. The process started with a
technical phone call, followed by a take home task and another
technical phone call. All this was completed in two weeks including
the final offer being rolled out.</p>
<p>This time around, with my limited experience with interviewing for a
full time position I was expecting the job search to maybe take a
month at best, but I was in for a rude awakening. It turns out that my
experience with Instamojo was a one off. When I started searching
for possible opportunities, finding roles that interested me was
hard. And when I actually found something of interest, most of the
places I applied to never replied. A lot has already been said
and written on how recruiting is broken and I will not expand on that
here.</p>
<p>By the beginning of September I had begun talking to quite a few
companies and as I found out, this is a very exhausting process. My
first calls were to get to know the company better and see if I had
any interest in the potential role being offered. This often expanded
to a second follow up call and all this even before I had formally
started interviewing with the company.</p>
<p>The places where I started interviewing ended with either me botching
up the interview (the very first few) or me progressing through
multiple rounds only to hit a dead end. While one such company
decided that the time was not right after I had been
through multiple technical phone interviews and a take home challenge,
another changed their mind after telling me they wanted me on the team
in what was the final call after three technical rounds. I am sure all
of them had valid reasons and were only doing their job but this was
fast becoming a very frustrating experience for me overall. Confidence
levels were going down and I was beginning to feel that the job search
was dragging on longer than what was comfortable for me. I have to
thank my family and friends for understanding me all this time and
hearing out my often unbearable rants.</p>
<p>All this has become a valuable life lesson teaching me more about the
importance of patience and humility and the randomness introduced by
timing factors which is also known as luck and something we really
don’t have much control over.</p>
<p>Double your initial estimate for the entire job search. If your target
is to get a job within six weeks, make that twelve. It is important
to realize that things don’t pan out exactly as planned. Your point of
contact in the company can go on a vacation (this actually
happened). The company may be waiting for the financial quarter to end
to reevaluate their hiring strategy for the forthcoming one.</p>
<p>And most importantly ensure that you have money in the bank to be able
to provide for yourself and your family during that time. Knowing that
you can still keep paying the bills for another month or two is a
huge relief and goes a long way in contributing positively towards
your job search. Have people on whom you can fall back when you are
feeling vulnerable and low on confidence. Having someone to discuss
your problems with makes the fight easier. It helps you to regain
composure. Find some time out for a different activity.</p>
<h2 id="what-next">What next?</h2>
<p>I am joining <a href="https://kinvolk.io/">Kinvolk</a> from Monday, October 16th
and I will be building and improving existing tooling around
Kubernetes. The interview process did not involve any whiteboarding
around programming puzzles. Instead, I was given the task to work on
an <a href="https://github.com/rkt/rkt/issues/3756">open issue</a> of rkt, a
container engine. It was exciting to be able to dig into a new project
and also get a feel for the kind of work I’d be doing at Kinvolk. I
sent a <a href="https://github.com/rkt/rkt/pull/3812">PR</a> containing the fix
and made changes as per feedback from the project’s maintainers. And I
was given an offer very soon after that. I couldn’t think of a better
way to interview a candidate myself.</p>
<p>I have been looking for work around systems programming and
infrastructure tooling and being able to work on exactly that going
forward brings in a very familiar feeling of excitement back to
me. I’ll be working with <code class="language-plaintext highlighter-rouge">Golang</code> and potentially write some <code class="language-plaintext highlighter-rouge">C</code> and
<code class="language-plaintext highlighter-rouge">Rust</code> as well. I’ve met the team and they are a very happy,
fun-to-work-with group of people. The team is based out of Berlin,
Germany while I will be working remotely from India. Oh, and the best
part of this all? I’ll be writing Open Source code!</p>
<p>Here’s looking forward to Monday!</p>
<p>Indradhanush Gupta (indradhanush.gupta@gmail.com). Lessons learned while looking for a full time job.</p>
<p><strong>Notes on Raft, the consensus protocol</strong> (2017-10-04, https://indradhanush.github.io/blog/notes-on-raft)</p>
<p>During my RC batch, I read through the Raft paper,
<a href="https://raft.github.io/raft.pdf">In Search of an Understandable Consensus Algorithm</a>
while taking a lot of notes. Raft is a distributed consensus
protocol. All that culminated in me writing my own implementation in
Erlang. The project is
<a href="http://dilbert.com/strip/2017-10-02">a work in progress</a> and is
available <a href="https://github.com/indradhanush/raft">here</a> on GitHub.</p>
<p>Following are my notes from reading the paper. Some points might
repeat themselves, but that is okay.</p>
<h2 id="introduction">Introduction</h2>
<ul>
<li>Each node has three possible states: follower, candidate or leader.</li>
<li>A follower is a node that accepts writes from a leader.</li>
<li>A candidate has the same responsibilities as a follower but is also
a potential leader.</li>
<li>A leader runs the cluster by serving requests from clients and
sending updates to other nodes in the cluster.</li>
<li>A write request from a client is known as a log entry. The
collection of log entries is known as the log. Each node maintains
its own log.</li>
<li><strong>Strong leader</strong> - Log entries only flow from leader to other
servers.</li>
<li><strong>Leader election</strong> - Randomized timers to elect leaders.</li>
<li><strong>Membership changes</strong> - Joint consensus is used.</li>
</ul>
<h2 id="replicated-state-machines-rsm">Replicated State Machines (RSM)</h2>
<ul>
<li>State machines create identical copies on a cluster of servers.</li>
<li>Systems with a single leader (GFS, HDFS, RAMCloud) use a separate
RSM for managing leader elections and storing configurations
(Zookeeper).</li>
<li>A replicated log is used to build RSM.</li>
<li>Servers store a log of commands in order. This is the same log that
is stored on each server. Thus the machines can remain
consistent and arrive at the same state. This is possible since the
state machines are deterministic.</li>
<li>The replicated log <strong>must</strong> remain consistent. This is the job of
the consensus algorithm.</li>
</ul>
<h3 id="consensus-algorithms">Consensus Algorithms</h3>
<ul>
<li>Each node receives commands that get added to the replicated log.</li>
<li>Each node talks to the other nodes to ensure that the log remains
consistent (eventually) even if some nodes have failed.</li>
<li>When the commands have been replicated in the log, the state machine
in each server runs the command in the <em>log order</em>.</li>
<li>The output of running the commands is then returned to the
client.</li>
</ul>
<h3 id="properties-of-systems-using-the-consensus-algorithm">Properties of systems using the consensus algorithm</h3>
<ul>
<li><strong>Safety</strong> is ensured. Never returns an incorrect result under
<a href="https://en.wikipedia.org/wiki/Byzantine_fault_tolerance">non-byzantine conditions</a>.</li>
<li>The system remains functional as long as the majority of the servers
are up. For \( N \) servers, at least \( \lfloor N/2 \rfloor + 1 \) servers must remain up.</li>
<li>Servers fail if they,
<ul>
<li>stop.</li>
<li>cannot communicate with the other nodes or clients.</li>
</ul>
</li>
<li>Servers <strong>do not</strong> depend on timing to ensure log consistency.</li>
<li>A command is considered to be complete if the majority of the servers
have responded. This means that the slower nodes do not impact the
system’s (cluster) performance.</li>
</ul>
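<p>The majority arithmetic above can be sketched in a few lines of Python. This is only an illustration; the function names <code class="language-plaintext highlighter-rouge">majority</code> and <code class="language-plaintext highlighter-rouge">tolerated_failures</code> are my own and do not come from the paper.</p>

```python
def majority(n: int) -> int:
    """Smallest number of servers that forms a majority of an n-server cluster."""
    if n < 1:
        raise ValueError("cluster must have at least one server")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many servers can be down while a majority is still up."""
    return n - majority(n)


# A 5-server cluster needs 3 servers up and survives 2 failures.
for size in (3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

<p>Note that going from 3 to 4 servers does not buy any extra fault tolerance, which is one reason odd cluster sizes are the usual choice.</p>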
<h2 id="raft-consensus-algorithm">Raft consensus algorithm</h2>
<ul>
<li>It elects a leader first, before doing anything else.</li>
<li>The leader has full responsibility to manage the replicated log.</li>
<li>Clients send log entries to the leader. The leader replicates them to
the other servers.</li>
<li>The leader tells the servers when it is safe to apply the log entries
to their state machines.</li>
<li>The leader decides the position of a log entry without having to ask
anyone else.</li>
<li>A leader can fail or get disconnected from other servers.</li>
<li>As a result the current <em>term</em> ends, a new one begins and a new
leader is elected.</li>
</ul>
<h3 id="three-independent-subproblems">Three independent subproblems</h3>
<ul>
<li><strong>Leader Election</strong> - A new leader must be elected if the current leader has
failed.</li>
<li><strong>Log replication</strong> - The leader must accept entries from clients
and replicate them to the servers. It should also force its decision
on the servers.</li>
<li><strong>State machine safety</strong> - If <em>any</em> server has applied a log entry
to its state machine, then all the servers must also apply the same
log entry for that log index.</li>
</ul>
<p>Raft guarantees that the following properties are true at all times:</p>
<ul>
<li><strong>Election safety</strong> - There can only be one leader elected for a
given term. A term is a phase between elections and is valid until a
leader fails.</li>
<li><strong>Leader Append-Only</strong> - A leader will never overwrite or delete
entries. It can only append new entries to the log.</li>
<li><strong>Log matching</strong> - If two logs contain an entry with the same
<em>index</em> and <em>term</em>, they are identical for all the entries from the
<em>beginning</em> and up to that index.</li>
<li><strong>Leader completeness</strong> - If a log entry has been committed in a
given term, then that particular entry will be present in the logs
of the leaders for all subsequent terms.</li>
</ul>
<h3 id="raft-basics">Raft basics</h3>
<ul>
<li>Typically 5 servers are used.</li>
<li>A server can have one of the following states:
<ul>
<li><strong>Leader</strong>: Handles all client requests. If a client sends a
request to a follower, it forwards the request to the leader.</li>
<li><strong>Follower</strong>: Passive. Only responds to the leader’s and
candidate’s requests or forwards client requests to the leader.</li>
<li><strong>Candidate</strong>: A potential leader.</li>
</ul>
</li>
<li><strong>Term</strong> - This is a sequentially increasing integer and exists for
an arbitrary amount of time. Each term begins with an election and a
leader is elected. After that, the leader remains the “leader” for
the rest of the term. If there is a tie in the election, the term
ends with no leader and a new term begins.</li>
</ul>
<h3 id="flow">Flow</h3>
<figure class="highlight"><pre><code class="language-asciidoc" data-lang="asciidoc"> System starts
+------------------------------+
| Candidates |
| C1 C2 C3 C4 C5 |
+--------------+---------------+
|
|
v
Term starts (T1)
Leader Election begins
+
|
|
v
+------------------------------+
| Nodes |
| F1 F2 L3 F4 F5 |
+--------------+---------------+
|
|
v
Normal operation
(Accept client requests, update log,
apply log, respond to request)
+
|
|
v
Leader (L3) fails
+
|
|
v
Term 1 ends
+
|
|
v
New Term starts (T2)
Leader Election begins
+
|
|
v
+------------------------------+
| Nodes |
| C1 C2 -- C4 C5 |
+------------------------------+
+
|
|
v
Repeat Normal Operation
once leader has been elected</code></pre></figure>
<p>This does not consider a tie during a leader election.</p>
<ul>
<li>A term is a logical clock.</li>
<li>Terms are not constant for all servers. Different servers may switch
terms at different times.</li>
<li>Or a server may not participate in an election at all or may remain
inactive throughout one or more terms as a whole.</li>
<li>Servers will store the current term number. The term number is
exchanged whenever servers communicate.</li>
<li>If a server sees that its term is lower than another server’s, it updates to
the maximum value, i.e. \( T = \max(T_1, T_2) \).</li>
<li>If a leader discovers that its term is out of date, it will <strong>step
down</strong> immediately and become a <em>follower</em>.</li>
<li>If a server receives a request with a stale term number, the request
is rejected.</li>
</ul>
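<p>The term rules in the list above can be put into a small sketch. The <code class="language-plaintext highlighter-rouge">Server</code> class and the <code class="language-plaintext highlighter-rouge">observe_term</code> function are hypothetical names I made up for illustration, and a real implementation would carry a lot more state.</p>

```python
from dataclasses import dataclass


@dataclass
class Server:
    current_term: int = 0
    state: str = "follower"  # "follower", "candidate" or "leader"


def observe_term(server: Server, message_term: int) -> bool:
    """Apply the term rules to an incoming message.

    Returns True if the message should be processed and False if it
    carries a stale term and must be rejected.
    """
    if message_term < server.current_term:
        return False  # stale term: reject the request
    if message_term > server.current_term:
        # Adopt the larger term, i.e. T = max(T1, T2). A leader whose term
        # is out of date steps down (the paper applies this to candidates too).
        server.current_term = message_term
        server.state = "follower"
    return True
```

<p>For example, a leader in term 3 that sees a message from term 5 moves to term 5 as a follower, and afterwards rejects anything that still claims to be from term 4.</p>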
<h3 id="leader-election">Leader Election</h3>
<ul>
<li>When a server starts up, it is in the <em>follower</em> state by default.</li>
<li>There are two kinds of RPC:
<ul>
<li><code class="language-plaintext highlighter-rouge">AppendEntries</code></li>
<li><code class="language-plaintext highlighter-rouge">RequestVote</code></li>
</ul>
</li>
<li><strong>Heartbeats</strong> - This is an <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC, with an empty body
(no log entries). The leader is responsible for sending heartbeats
to all followers to remain the leader.</li>
<li>A server remains a follower as long as it receives valid RPCs from a
leader or a candidate.</li>
<li><strong>Election timeout</strong> - If a follower does not receive any request for
this time duration, it assumes that there are no leaders in the
cluster, and begins a leader election phase. This is generally set
to 150-300 ms.</li>
</ul>
<h4 id="starting-an-election">Starting an Election</h4>
<ul>
<li>A follower increments its current term and becomes a candidate.</li>
<li>It votes for itself and requests votes from all the servers
(<code class="language-plaintext highlighter-rouge">RequestVote</code>) in parallel.</li>
<li>At this point, one of three outcomes is possible:
<ul>
<li>The candidate wins the election and becomes the leader.</li>
<li>A different candidate wins the election instead.</li>
<li>There is a tie with no clear winner.</li>
</ul>
</li>
</ul>
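<p>Here is a rough sketch of how a follower could start an election and count the votes that come back. The <code class="language-plaintext highlighter-rouge">Node</code> class, its fields and the message dictionary are assumptions made for this example; they are not taken from the paper or from my Erlang implementation.</p>

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    state: str = "follower"
    voted_for: Optional[str] = None
    votes_received: Set[str] = field(default_factory=set)


def start_election(node: Node) -> dict:
    """Turn a follower into a candidate and build its RequestVote message."""
    node.current_term += 1
    node.state = "candidate"
    node.voted_for = node.node_id  # the candidate votes for itself
    node.votes_received = {node.node_id}
    return {
        "type": "RequestVote",
        "term": node.current_term,
        "candidate_id": node.node_id,
    }


def record_vote(node: Node, voter_id: str, cluster_size: int) -> None:
    """Count a granted vote and become leader once a majority is reached."""
    if node.state != "candidate":
        return  # the election is already decided one way or the other
    node.votes_received.add(voter_id)
    if len(node.votes_received) >= cluster_size // 2 + 1:
        node.state = "leader"
```

<p>The other two outcomes (another candidate wins, or nobody does) are not modelled here; they would show up as an incoming <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC or as an election timeout respectively.</p>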
<h4 id="voting">Voting</h4>
<ul>
<li>A candidate needs a majority of the votes.</li>
<li>Each server can vote for only one candidate in a specific term. The
vote goes to the first <code class="language-plaintext highlighter-rouge">RequestVote</code> issuer (first come, first served).</li>
<li>While waiting for votes, a server may receive an <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC
from another server claiming to be the leader. This can happen in a
network partition.</li>
<li>If the leader’s term, \( T_l \), is greater than or equal to
the candidate’s term, \( T_c \), i.e. \( T_l \ge T_c \), then the candidate steps down to
become a follower and the leader continues.</li>
<li>This situation can arise when two candidates initiate a new term,
but one of them gets a majority of votes before the other.</li>
<li>If \( T_l &lt; T_c \), then the <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC is rejected by
the server.</li>
<li>Split votes can occur if too many followers become candidates for
the same term. This means none of the candidates were able to secure
a majority vote.</li>
<li>In such a scenario, the candidates will time out and start a new
election.</li>
<li>\(T_c \) is incremented by <code class="language-plaintext highlighter-rouge">1</code> and fresh <code class="language-plaintext highlighter-rouge">RequestVote</code> RPCs are
issued.</li>
<li>However, split votes can keep repeating indefinitely. To avoid this,
random election timeouts are used.</li>
<li>This implies that each candidate will have a different random
election timeout, generally between <code class="language-plaintext highlighter-rouge">150-300 ms</code>. The candidate with
the smallest value will time out before the others,
and will be able to issue a new <code class="language-plaintext highlighter-rouge">RequestVote</code> RPC.</li>
</ul>
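<p>Two of the voting rules above are easy to sketch: the one-vote-per-term rule and the randomized election timeout. The names <code class="language-plaintext highlighter-rouge">Voter</code> and <code class="language-plaintext highlighter-rouge">election_timeout_ms</code> are hypothetical, and this sketch deliberately ignores stale terms and the log comparison that a real voter also performs.</p>

```python
import random
from typing import Dict


def election_timeout_ms(low: int = 150, high: int = 300) -> int:
    """Pick a fresh random election timeout in milliseconds."""
    return random.randint(low, high)


class Voter:
    def __init__(self) -> None:
        self.voted_for: Dict[int, str] = {}  # term -> candidate voted for

    def handle_request_vote(self, term: int, candidate_id: str) -> bool:
        """Grant at most one vote per term, first come, first served."""
        # setdefault records the vote only if none exists for this term.
        first = self.voted_for.setdefault(term, candidate_id)
        return first == candidate_id
```

<p>Because every candidate draws its own timeout, it is unlikely that two of them retry at exactly the same moment, which is what breaks the cycle of repeated split votes.</p>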
<h2 id="log-replication">Log Replication</h2>
<ul>
<li>When a leader is elected, it starts sending <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC
calls.</li>
<li>When there is consensus on an entry, the leader applies it to its
state machine and responds to the client with the result.</li>
<li><strong>Logs</strong>: Each log entry has a term number and a state command. They
also have an index to identify the position of the entry in the log.</li>
<li>The leader decides when it is safe to apply a log entry. The log
entry when applied is said to be <em>committed</em>.</li>
<li>Committed entries are durable and will <em>eventually</em> be applied by
all the servers in their respective state machines.</li>
<li>A log entry is committed when the leader has replicated it to a
majority of the servers. This also commits all previous entries in
the leader’s log (including all entries created by the previous
leaders).</li>
<li>The leader tracks the highest known index that was committed.</li>
<li>This index is sent in all future <code class="language-plaintext highlighter-rouge">AppendEntries</code> calls and
heartbeats. Thus any slow server will eventually find out about
newly committed entries.</li>
<li>When a follower sees a committed entry in the log, it applies it to
its own state machine.</li>
</ul>
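<p>The follower side of this can be sketched as below. The field names <code class="language-plaintext highlighter-rouge">commit_index</code> and <code class="language-plaintext highlighter-rouge">last_applied</code> are modelled on the paper’s state summary, while the <code class="language-plaintext highlighter-rouge">Follower</code> class and the <code class="language-plaintext highlighter-rouge">apply</code> callback are assumptions of mine to keep the example small.</p>

```python
from typing import Callable, List


class Follower:
    def __init__(self, apply: Callable[[str], None]) -> None:
        self.log: List[str] = []  # commands; log index 1 is self.log[0]
        self.commit_index = 0     # highest index known to be committed
        self.last_applied = 0     # highest index applied to the state machine
        self.apply = apply

    def on_leader_commit(self, leader_commit: int) -> None:
        """Advance commit_index from the leader's value and apply new entries in order."""
        # Never commit past the end of our own log, and never move backwards.
        self.commit_index = max(self.commit_index,
                                min(leader_commit, len(self.log)))
        while self.last_applied < self.commit_index:
            self.last_applied += 1
            self.apply(self.log[self.last_applied - 1])
```

<p>A slow follower simply sees a larger commit index in a later heartbeat and catches up by applying the entries it has not applied yet.</p>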
<h3 id="log-matching-property">Log matching property</h3>
<ul>
<li>A leader creates only one entry for a given index in a given term.</li>
<li>The log entries are never changed or deleted. Their position is also
never altered.</li>
<li>This implies that if two entries in two different logs have the
same index and term, they will have the same command.</li>
</ul>
<h3 id="consistency-check-for-appendentries">Consistency check for AppendEntries</h3>
<ul>
<li>When sending a request, the term and the index of the previous log
entry are also sent by the leader.</li>
<li>If the follower does not find an entry in its log for the
corresponding term and index, it rejects the request.</li>
<li>An acknowledgement from a follower indicates that its log matches
that of the leader.</li>
</ul>
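<p>A minimal sketch of the check a follower runs, assuming a log is a list of <code class="language-plaintext highlighter-rouge">(term, command)</code> tuples with 1-based indexes. The function name and the tuple layout are my own choices for illustration.</p>

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command)


def passes_consistency_check(log: List[Entry], prev_index: int, prev_term: int) -> bool:
    """Follower-side check for AppendEntries.

    prev_index is 1-based; 0 means the leader is sending from the very
    start of the log, which always matches.
    """
    if prev_index == 0:
        return True
    if prev_index > len(log):
        return False  # the follower has no entry at that index
    return log[prev_index - 1][0] == prev_term
```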
<h3 id="conflict-resolution">Conflict resolution</h3>
<ul>
<li>In normal operation mode, the logs stay consistent across the
cluster.</li>
<li>However, if a leader crashes and a new leader has emerged, the logs
may become inconsistent. Followers might be missing entries from the
leader, have extra entries that are not present on the leader, or both.</li>
<li>Or if there is a network partition with a leader in each partition
along with their own set of followers.</li>
<li>To fix this, the leader always forces the followers to accept its
log entries. Conflicts in the followers are overwritten with entries
from the leader’s log.</li>
<li>The leader maintains a <code class="language-plaintext highlighter-rouge">nextIndex</code> for each follower. This is the
index of the log entry that the leader will send to that follower.</li>
<li>When a leader is elected, it sets the <code class="language-plaintext highlighter-rouge">nextIndex</code> as the index after
the last log entry in its own log, i.e. this value is same for all
the followers at the start of the term.</li>
<li>If a follower’s log is inconsistent, the leader decrements the
<code class="language-plaintext highlighter-rouge">nextIndex</code> for that follower and tries again. This is repeated
until a matching log entry is found. The index right after this log entry is
the value of <code class="language-plaintext highlighter-rouge">nextIndex</code> for that follower.</li>
<li>The worst case scenario is that the entire log of the follower is
incorrect and it is reset to an empty one.</li>
<li>When the conflict is resolved, all the subsequent log entries from
the leader are appended to the follower’s log.</li>
<li>The log will remain consistent throughout the current term.</li>
<li>When a leader is elected, it starts normal operations right away,
and the logs in the followers converge over time as
<code class="language-plaintext highlighter-rouge">AppendEntries</code> RPCs that fail the consistency check are retried.</li>
</ul>
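<p>To convince myself that walking <code class="language-plaintext highlighter-rouge">nextIndex</code> backwards really repairs a follower, here is a small simulation. It runs both sides in one function, which no real cluster does, and the name <code class="language-plaintext highlighter-rouge">repair_follower</code> is hypothetical. Logs are lists of <code class="language-plaintext highlighter-rouge">(term, command)</code> tuples with 1-based indexes.</p>

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command)


def repair_follower(leader_log: List[Entry], follower_log: List[Entry]) -> int:
    """Simulate the leader walking nextIndex back until the logs agree.

    Mutates follower_log so that it ends up equal to leader_log and returns
    the nextIndex (1-based) at which the consistency check first passed.
    """
    next_index = len(leader_log) + 1  # initial value right after the election
    while True:
        prev_index = next_index - 1
        matches = prev_index == 0 or (
            prev_index <= len(follower_log)
            and follower_log[prev_index - 1][0] == leader_log[prev_index - 1][0]
        )
        if matches:
            break
        next_index -= 1  # rejected: try one entry earlier
    # Overwrite everything after the matching entry with the leader's entries.
    del follower_log[prev_index:]
    follower_log.extend(leader_log[prev_index:])
    return next_index
```

<p>With a leader log of terms <code class="language-plaintext highlighter-rouge">1, 1, 3, 3</code> and a follower log of terms <code class="language-plaintext highlighter-rouge">1, 1, 2, 2, 2</code>, the check first passes at <code class="language-plaintext highlighter-rouge">nextIndex = 3</code>: the follower’s three term-2 entries are dropped and replaced by the leader’s two term-3 entries.</p>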
<h2 id="safety">Safety</h2>
<ul>
<li>Suppose a follower is unavailable during a term, \( T_n \), while the
current leader goes ahead and commits several log entries.</li>
<li>In such a scenario, if the follower becomes the new leader in term
\( T_{n+1} \), it will accept new writes and force the leader of the
previous term to erase its log entries. This means some committed
log entries would be overwritten.</li>
</ul>
<h3 id="election-restriction">Election restriction</h3>
<ul>
<li>To fix this problem, we ensure that a leader for any given term must
contain all the entries committed from all previous terms.</li>
<li>Raft ensures that all committed entries from previous terms are
already present on the new leader from the moment of its election
without having to send them to the leader after it has been elected.</li>
<li>During the voting period, if a candidate does not have all the
committed entries, then servers do not vote for it.</li>
<li>The <code class="language-plaintext highlighter-rouge">RequestVote</code> RPC adds this restriction and includes the index and term of the candidate’s latest
log entry. The voter then compares this with its own latest log entry and
only issues a vote if the log entry in the <code class="language-plaintext highlighter-rouge">RequestVote</code> RPC is at
least as new as the latest log entry in the voter’s log.</li>
<li>If the terms of the latest entries are different, then the log with
the later term is considered to be the more up-to-date one. But if the
term is the same, then the longer log is considered as the more
up-to-date one.</li>
</ul>
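<p>The comparison a voter makes can be written as a tiny function. The function and parameter names are my own; only the rule itself (compare the terms of the last entries first, then the log lengths) comes from the paper.</p>

```python
def candidate_is_up_to_date(candidate_last_term: int, candidate_last_index: int,
                            voter_last_term: int, voter_last_index: int) -> bool:
    """True if the candidate's log is at least as up-to-date as the voter's."""
    if candidate_last_term != voter_last_term:
        # Different terms: the log whose last entry has the later term wins.
        return candidate_last_term > voter_last_term
    # Same term: the longer log (or an equally long one) wins.
    return candidate_last_index >= voter_last_index
```

<p>A short log ending in term 3 therefore beats a much longer log ending in term 2, which is easy to get backwards when reading the rule for the first time.</p>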
<h3 id="committing-entries-from-previous-terms">Committing entries from previous terms</h3>
<ul>
<li>If an entry from the current term is replicated on a majority of
servers, the leader marks it as committed.</li>
<li>If a leader crashes before committing an entry, future leaders will
attempt to finish replicating that entry. But a new leader cannot
assume that an entry from a previous term is committed even if it is
stored in a majority of servers. The previous leader might have
crashed after replicating it to the majority but before committing
it.</li>
<li>The leader will only commit log entries from the current term by
checking if the entry has been replicated in majority of the
servers. As a result all prior entries are also considered to be
committed. This is the log matching property.</li>
</ul>
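<p>Here is a sketch of how a leader might pick its new commit index under this rule. It assumes the leader tracks, for every server, the highest log index known to be stored there (the paper calls this <code class="language-plaintext highlighter-rouge">matchIndex</code>, which my notes above do not cover). The function name and the list-based layout are made up for the example.</p>

```python
from typing import List


def new_commit_index(log_terms: List[int], match_index: List[int],
                     commit_index: int, current_term: int) -> int:
    """Return the leader's commit index after the latest acknowledgements.

    log_terms[i] is the term of the entry at log index i + 1.
    match_index holds, for every server including the leader itself, the
    highest log index known to be stored on that server.
    """
    quorum = len(match_index) // 2 + 1
    for index in range(len(log_terms), commit_index, -1):
        replicated = sum(1 for m in match_index if m >= index)
        # Only an entry from the current term may be committed by counting
        # replicas; earlier entries become committed along with it.
        if replicated >= quorum and log_terms[index - 1] == current_term:
            return index
    return commit_index
```

<p>In a 5-server cluster where a term-2 entry sits on three servers but the leader is now in term 3, the function leaves the commit index untouched until a term-3 entry reaches a majority.</p>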
<h3 id="safety-argument">Safety argument</h3>
<p>Let us assume that the Leader completeness property is false: there is a
term, \( T \), whose leader, \( L_T \), commits a log
entry \( E \), but \( E \) is not stored by a node that becomes the
leader in a future term.</p>
<p>Let \( U \) be the nearest such term after \( T \), so that \( U > T \) and its
leader, \( L_U \), does not have the entry \( E \).</p>
<p>But, \( L_T \) replicated the entry to a majority of the servers and
\( L_U \) received votes from the majority of nodes to be elected a
leader.</p>
<p>This implies, that there exists at least one server that received the
entry \( E \) and voted for \( L_U \) as well.</p>
<p>Also, the voter must have received entry \( E \) before voting
for \( L_U \). \( E \) has term \( T \) as defined above; had the voter
cast its vote first, its current term would already have been \( U \), and the
request carrying \( E \) would
have been rejected since \( U > T \).</p>
<p>Thus we can safely conclude that the voter has stored entry \( E \)
and if the voter voted in favor of \( L_U \), the leader must have
had its log at least as new as that of the voter (election
restriction).</p>
<p>Also, if \( L_U \) had the last element of the voter in its log, it
implies that:</p>
<p>\[ \text{length}(\text{log of } L_U) \ge \text{length}(\text{log of the voter}) \]</p>
<p>which means that log of \( L_U \) has all the elements that are
in the log of the voter.</p>
<p>But, the voter already has the entry \( E \), which means \( L_U
\) also has \( E \). However we assumed that the leader \( L_U \)
does not have the entry \( E \). This is a contradiction and thus an
incorrect assumption.</p>
<p>And finally, if the last log entry of \( L_U \) has a term greater
than that of the voter, it implies that the term of the last log entry
of \( L_U \) is greater than \( T \) since the voter at least
contained the term \( T \).</p>
<p>All log entries are applied in log index order. Therefore, from the
state machine safety property, we can conclude that all the servers
will apply the exact same set of log entries to their state machines
in the same order. This need not be immediate, but will eventually be
true.</p>
<h3 id="follower-and-candidate-crashes">Follower and candidate crashes</h3>
<ul>
<li>If a follower or candidate crashes then the leader retries the
<code class="language-plaintext highlighter-rouge">RequestVote</code> and the <code class="language-plaintext highlighter-rouge">AppendEntries</code> RPCs indefinitely. This
implies that if the crashed server recovers, the RPC calls will
succeed.</li>
<li>If the server crashes after completing the RPC but before responding
back to the leader it will receive the applied RPC again when it
restarts (since the leader will retry indefinitely).</li>
<li>However RPC calls are idempotent, that is they can be safely
repeated without any side effects. If a server receives an
<code class="language-plaintext highlighter-rouge">AppendEntries</code> RPC which it has already completed in the past, it
will ignore this. This is possible since the log entry is already
present in its own log, which can be identified by the
index and the term of the corresponding log entry.</li>
</ul>
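<p>The idempotency argument is easier to see in code. Below is a sketch of the append step on a follower, assuming the consistency check has already passed and that a log is a list of <code class="language-plaintext highlighter-rouge">(term, command)</code> tuples. The function name and layout are assumptions for illustration only.</p>

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command)


def append_entries(log: List[Entry], prev_index: int, entries: List[Entry]) -> None:
    """Store entries after prev_index (1-based), skipping ones already present.

    Assumes the consistency check for prev_index has already passed.
    """
    for offset, entry in enumerate(entries):
        position = prev_index + offset  # 0-based slot for this entry
        if position < len(log):
            if log[position][0] == entry[0]:
                continue  # same index and term: already stored, nothing to do
            del log[position:]  # conflicting entry: drop it and all that follow
        log.append(entry)
```

<p>Delivering the same RPC twice leaves the log exactly as it was after the first delivery, so the leader can retry as often as it likes.</p>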
<h3 id="timing-and-availability">Timing and availability</h3>
<ul>
<li>If message exchanges take longer than the time between server
crashes, we will never be able to elect a leader.</li>
<li>The following is a timing requirement for Raft to function:
\[ broadcastTime \ll electionTimeout \ll MTBF \]</li>
<li><code class="language-plaintext highlighter-rouge">broadcastTime</code>: Average time required for the servers to send RPCs
in parallel to all the servers in the cluster and receive their
responses.</li>
<li><code class="language-plaintext highlighter-rouge">electionTimeout</code>: Time for which a server waits for a leader to be
elected. If no leader is elected within this time, the server starts
a new term and a new election is initiated.</li>
<li><code class="language-plaintext highlighter-rouge">Mean time between failures (MTBF)</code>: Average time between failures for a single server.</li>
<li><code class="language-plaintext highlighter-rouge">broadcastTime</code> should be an order of magnitude less than the
<code class="language-plaintext highlighter-rouge">electionTimeout</code>.</li>
<li>This implies that servers can send heartbeats and RPCs before a node
times out and initiates an election. This also minimizes the chance
of a split vote.</li>
<li>The <code class="language-plaintext highlighter-rouge">electionTimeout</code> should be an order of magnitude less than
the <code class="language-plaintext highlighter-rouge">MTBF</code> for the system to work.</li>
<li>If the leader crashes, the system becomes unavailable for
approximately the election timeout, since at this point no client
requests can be safely processed in the absence of a leader.</li>
<li>The <code class="language-plaintext highlighter-rouge">broadcastTime</code> and <code class="language-plaintext highlighter-rouge">MTBF</code> are derived properties of the system
while the <code class="language-plaintext highlighter-rouge">electionTimeout</code> should be determined by us.</li>
</ul>
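<p>As a sanity check for a chosen <code class="language-plaintext highlighter-rouge">electionTimeout</code>, the inequality can be tested with a helper like the one below. Treating "much less than" as a factor of 10 follows the order-of-magnitude wording above; the function name and the sample numbers are purely hypothetical.</p>

```python
def timing_ok(broadcast_ms: float, election_timeout_ms: float, mtbf_ms: float,
              factor: float = 10.0) -> bool:
    """Check broadcastTime << electionTimeout << MTBF using a fixed factor."""
    return (broadcast_ms * factor <= election_timeout_ms
            and election_timeout_ms * factor <= mtbf_ms)


# Hypothetical numbers: 10 ms broadcasts, 150 ms timeout, 30 days MTBF.
THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000
print(timing_ok(10, 150, THIRTY_DAYS_MS))
print(timing_ok(50, 150, THIRTY_DAYS_MS))
```

<p>The first call satisfies the requirement, while the second does not because a 50 ms broadcast is too close to a 150 ms election timeout.</p>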
<h2 id="cluster-membership-changes">Cluster membership changes</h2>
<ul>
<li>TBD</li>
</ul>
<h2 id="log-compaction">Log compaction</h2>
<ul>
<li>TBD</li>
</ul>
<h2 id="resources">Resources</h2>
<ul>
<li><a href="https://raft.github.io/raft.pdf">https://raft.github.io/raft.pdf</a></li>
<li><a href="http://thesecretlivesofdata.com/raft/">Secret lives of data</a></li>
</ul>
<p>Indradhanush Gupta (indradhanush.gupta@gmail.com). My notes from reading the paper - In Search of an Understandable Consensus Algorithm</p>