# Samrat Man Singh

Email: mail@samrat.me

Hi! I’m Samrat, a programmer from Nepal.

# Summary of recent reading (April–June 2019)

### 2019-06-30

• Peak by Anders Ericsson and Robert Pool: a book on deliberate practice by one of the researchers who originally coined the term. This book lays out a framework for how to achieve expertise. Briefly, this goes as follows: identify what skills and mental representations an expert possesses, design a step-by-step program where you build up those skills, get feedback on what you’re doing wrong, and focus on your weaknesses as you practice. The book emphasizes mental representations as essential to expertise– as chess players study games for many, many hours, they see not just individual pieces but learn to identify larger patterns in any configuration. Deliberate practice is about building similar mental representations in any skill you want to master.

Overall, I think this book is a worthwhile read. You’ve probably encountered the idea of deliberate practice elsewhere by now, and the authors lay down exactly what that entails. The book can be a bit dry at times– it tries to stay firmly grounded on facts, and that means citing one study after another. But I did find the description of how Benjamin Franklin designed a program to practice his writing skills to be really cool.

• Backstabbing for Beginners by Michael Soussan: an insider’s account of the United Nations operation overseeing sanctions imposed on Saddam Hussein’s Iraq. The book is very open about the many failings (including incompetence, naivete and downright corruption) of the Oil-for-Food program and the people involved, and about how Saddam Hussein benefitted massively, to the detriment of the people of Iraq, by exploiting these failings.

I found the book really well-written, and highly recommend it.

• Phoenix by Steven Brust: This is book 5 of the Vlad Taltos series. I’ve reviewed the earlier ones, and I can definitely stick to my recommendation for this series– these books are short and entertaining. I’m reading them in order of publication date; in terms of chronological order, this book follows book 3 (Teckla).

• Athyra by Steven Brust: Book 6 of the Vlad Taltos series. This one’s written in a different voice and centers around a different character, although the protagonist of the series– Vlad– still plays a significant role. Still really fun.

# Summary of recent reading (January–March 2019)

### 2019-03-17

• Messy by Tim Harford: how pursuing tidy systems is harming us, and why we should embrace messy systems that better reflect the unpredictability of the real world. The systems in question can be anything from how you organize your inbox to how cities are laid out. Although it struck me while reading this book how similar most non-fiction books tend to be, I did enjoy it and found the arguments quite compelling.

• Learning to Climb Indoors by Eric Horst: a guide to learning and improving for the beginning climber. The author seems like a knowledgeable person and the book covers everything from how to approach different kinds of handholds to managing mental roadblocks like fear to optimum training schedules.

I also found this to be a really well-written book, and it goes beyond just the technical and talks about how to stay motivated in the long run and even how climbing holds many life-lessons.

• Jhereg by Steven Brust: It had been a while since I’d read any fantasy and this was just a great book to come back to the genre with. This is part of a series called Vlad Taltos, named after the main character who is an assassin.

Unlike most other fantasy books I’ve read (not that many), the book doesn’t take itself too seriously. I thoroughly enjoyed the book, and I plan on picking up other books in the series soon.

[spoiler] Someone has stolen from the Jhereg organization, and Vlad is hired to assassinate him. The thief is taking refuge in Morrolan’s castle, where he knows Jhereg dare not attack. [/spoiler]

• Show Your Work! by Austin Kleon: tries to make the case that creativity is a social process and how you should share your work online to get an audience. I got excited about this book after reading Derek Sivers’s review of it, but I didn’t find the book as insightful as I was hoping.

• Yendi by Steven Brust: Second book in the Vlad Taltos series, from above. I enjoyed this book as well, although the structure of the plot felt a bit similar to the first book.

[spoiler] Vlad goes into a turf war, gets killed by Cawti, gets revivified, then falls in love with Cawti. The turf war turns out to be part of a plan to instate Norathor as Dragon heir, displacing Aliera. [/spoiler]

• A Mind For Numbers by Barbara Oakley: strategies to learn better. The book also has ample explanations, based on research on the human mind, of why the techniques described work. Many of the ideas in the book I’d already encountered in Make It Stick (eg. practicing recall through mini-tests, spaced repetition), but it was still useful to remind myself that I don’t have any excuses not to put these techniques to use.

• Teckla by Steven Brust: next book in the Vlad Taltos series. I felt like this book hinted at a much deeper big-picture narrative than the first two books did. I continue to be excited about this series.

[spoiler] Vlad’s wife Cawti has joined a revolution against the Dragaeran oppression of Easterners and Teckla. [/spoiler]

• The Defining Decade by Meg Jay: targeted towards twenty-somethings. The book tries to push readers into taking ownership of transitioning into adulthood, and being more intentional with life.

• Taltos by Steven Brust: book 4 of the Vlad Taltos series. I really enjoyed this one.

[spoiler] Vlad and Morrolan travel to the Paths of the Dead to revive Aliera [/spoiler]

# Book reviews: Alone on the Wall, The Push, Can’t Hurt Me

### Alone on the Wall by Alex Honnold and David Roberts

This is a memoir of sorts by a rock climber most famous for his free solo ascents (free soloing is rock climbing without a rope). I picked up this book because I’ve been doing some climbing myself, and while the type of climbing I’ve done– mostly bouldering, mostly indoors– is very different from what Honnold does, it’s still very interesting to read about the perspective and experiences of someone who performs at such a high level.

Especially interesting was his perspective on risk versus consequences. Climbing a big wall without a rope obviously has very grave consequences if he falls, but he figures that most of the time the risk of his actually falling is pretty low.

### The Push by Tommy Caldwell

Another rock climbing book. A big theme in this book is getting over adversity and pushing past perceived limits. Caldwell has had quite an interesting life and there are bits of biographical details in the book alongside his rock climbing exploits. The biographical bits are actually quite interesting and they tie in with what fed his sense of purpose while doing some of his climbs.

### Can’t Hurt Me by David Goggins

This one is not about rock-climbing! Instead, it’s by a former Navy SEAL turned endurance athlete– Goggins has participated in several ultra-marathons. This is somewhat thematically similar to “The Push” in that it is also about pushing past your perceived limits, although unlike “The Push”, this one is a bit prescriptive and borders on the “self-help” genre at times.

Still, I think the core message of the book– that you should constantly be pushing past your comfort zone– is a valuable message.

# Disk-based algorithms: External merge-sort

### 2017-11-13

I have lately been studying how databases work under the hood. There are many problems that a database has to solve. At the moment, I’ve been having fun exploring how data is stored on disk. Previously, I wrote about how a hashtable is stored on disk. This post is somewhat similar– I talk about how data stored on disk can be sorted.

Databases frequently have to sort data of massive sizes, sizes big enough that they don’t fit in memory at once. Sorting algorithms that can sort data residing on disk are called external sorting algorithms. External merge-sort is one such algorithm.

## External merge-sort

Just like merge-sort, external merge-sort is a divide-and-conquer algorithm– it recursively sorts smaller subarrays, and then merges those subarrays. Because disk I/O happens in blocks, however, the smallest subarray is the size of a disk page. So, for a file with $n$ pages, the algorithm proceeds as follows.

First, all pages are sorted internally. To achieve this, we bring in however many pages can fit into the program’s buffers (remember, the entire data doesn’t fit in memory), sort each page and write them back. This step is usually referred to as pass 0.
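
Sketched with in-memory vectors standing in for disk pages (illustrative only; the real version reads each page into a buffer and writes it back out), pass 0 is just:

```rust
/// Pass 0 of external merge-sort, sketched with `Vec<i32>`s standing in for
/// 4KB disk pages: each page's contents are sorted individually; no merging
/// across pages happens yet.
fn pass0(pages: &mut [Vec<i32>]) {
    for page in pages.iter_mut() {
        // In the real thing: read the page from disk into a buffer,
        // sort the buffer, write the page back.
        page.sort();
    }
}
```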

Next, in pass 1, we merge runs of length 1. A run is a collection of sorted pages, so that means we take pairs of pages and merge them to get $n/2$ runs of length 2. Now, the contents of pages 0 and 1 are sorted; as are 2 and 3, and so on. In pass 2, we do the same thing, but with runs of length 2. And so on, until finally we have two runs of $n/2$ pages each. Merging those gives us the final sorted result. We start with $n$ runs and halve the number of runs at each pass, so there will be $\log_2(n)$ passes.

All this may seem very familiar to you if you’re already acquainted with merge-sort. The difference here is that there is some extra work to be done when merging. Take, for example, the last pass, where we do the final merge between two runs of $n/2$ pages. Our original constraint was that $n$ pages don’t fit in memory, so two runs of $n/2$ pages certainly won’t either. To solve this, the pages in the two input runs are streamed in.

First, we bring in the first page from each of the two runs. As with “regular” merge-sort, we iterate through both, picking off whichever has the smaller item. Eventually, one page will run out. At that point, we bring the next page from that run into the buffer, and continue. Our output is also a file, so we need to do the same on the output side– once the output buffer is filled, write it to the output file, and start with an empty buffer.
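
Here is a minimal sketch of that streaming merge, with iterators standing in for the page-by-page reads from the two runs on disk, and a tiny `page_size` standing in for a 4KB page (this is illustrative, not the post’s actual implementation):

```rust
/// Merge two sorted streams into sorted output pages, holding only one
/// output page's worth of items in memory at a time. The input iterators
/// stand in for streaming items in from the two runs on disk.
fn merge_streams(
    a: impl Iterator<Item = i32>,
    b: impl Iterator<Item = i32>,
    page_size: usize,
) -> Vec<Vec<i32>> {
    let mut a = a.peekable();
    let mut b = b.peekable();
    let mut out_pages = Vec::new();
    let mut out_buf = Vec::with_capacity(page_size);
    loop {
        // Pick off whichever stream currently has the smaller item.
        let take_a = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x <= y,
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break, // both runs exhausted
        };
        let item = if take_a { a.next().unwrap() } else { b.next().unwrap() };
        out_buf.push(item);
        // Once the output buffer fills a page, "write it to disk" and
        // start with an empty buffer.
        if out_buf.len() == page_size {
            out_pages.push(std::mem::take(&mut out_buf));
        }
    }
    if !out_buf.is_empty() {
        out_pages.push(out_buf);
    }
    out_pages
}
```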

You can check out my Rust implementation of this algorithm on GitHub.

## Multi-way merging

What I’ve described above is called two-way merging, since at any given step we merge two runs. However, it can be advantageous to merge $k$ runs at once. This results in $\log_k(n)$ passes. Since we write everything out to disk in each pass, the efficiency gain from this reduction in the number of passes can be significant.
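
A common way to do a k-way merge is with a min-heap holding the smallest unconsumed item of each run. Here is a sketch over in-memory runs (in the real algorithm, each run would be streamed in from disk page by page):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// k-way merge of sorted runs using a min-heap. Each heap entry is
/// (value, run index, position within run); `Reverse` turns Rust's
/// max-heap into a min-heap.
fn kway_merge(runs: Vec<Vec<i32>>) -> Vec<i32> {
    let mut heap = BinaryHeap::new();
    // Seed the heap with the first element of each non-empty run.
    for (run_idx, run) in runs.iter().enumerate() {
        if let Some(&first) = run.first() {
            heap.push(Reverse((first, run_idx, 0)));
        }
    }
    let mut out = Vec::new();
    while let Some(Reverse((val, run_idx, pos))) = heap.pop() {
        out.push(val);
        // Refill the heap with the next item from the same run.
        if let Some(&next) = runs[run_idx].get(pos + 1) {
            heap.push(Reverse((next, run_idx, pos + 1)));
        }
    }
    out
}
```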

You can read more about the ways to do k-way merging on Wikipedia.

# Implementing a key-value store, part 2: Linear Hashing implementation in Rust

### 2017-11-09

In the last post, I introduced the idea of linear hashing. This post will describe a Rust implementation of the algorithm. I won’t go through every last line of code, but hopefully enough to give you a good understanding of how the whole thing works. I should also mention that even though this is a post about implementing linear hashing, it spends quite some time talking about how storing data to disk works. This is intentional– articles about hashtable implementations are plentiful; articles about how external-storage datastructures work are, in my opinion, not.

You can check out the full source code on GitHub.

## The interface

Before diving into how the hashtable will work, let’s first discuss how a client using the hashtable will be expected to use it.

Our hashtable implementation will map byte arrays to byte arrays. That is, we’ll let the client program figure out how to serialize/deserialize keys and values.

This is how a linear hashing table will be created:

```rust
let mut h = LinHash::open("book_ratings", 32, 4);
```


The parameters to open are the filename, the length of the key and the length of the value. That means that our key-value pairs (or records) will be stored in the file “book_ratings”; keys will be 32 bytes long (in the example above, it’s probably a string) and values will be 4 bytes long (probably an integer). Keeping key and value sizes fixed is somewhat restrictive (RocksDB, for example, allows you to just pass in bytearrays of any length) but this will simplify how records are stored in our file.

Inserting and looking up work like so (I’ve put an assert for get just to give an idea of what the return value looks like):

```rust
h.put(b"Spin", &i32_to_bytearray(9));
h.put(b"Axis", &i32_to_bytearray(6));
// Assumes little-endian architecture
assert_eq!(h.get(b"Spin"), Some(vec![9, 0, 0, 0]));
```


I know this feels clunky, but not having to worry about serialization/deserialization will simplify things quite a bit for us. I should mention that i32_to_bytearray (and other similarly named functions in the util module) use std::mem::transmute under the hood and so are not portable across architectures (hence the comment above about endianness).
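
If portability mattered, one way to write such helpers (a suggestion on my part, not what the util module actually does) would be to pin the byte order explicitly:

```rust
/// Portable alternatives to the transmute-based helpers: by fixing the
/// on-disk byte order to little-endian explicitly, the file format no
/// longer depends on the architecture the program happens to run on.
fn i32_to_bytearray(n: i32) -> [u8; 4] {
    n.to_le_bytes()
}

fn bytearray_to_i32(b: [u8; 4]) -> i32 {
    i32::from_le_bytes(b)
}
```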

## Implementation overview

Here’s a high-level description of what happens when you call put as in the example above (the rest of the post goes into it in depth, of course). First, we hash the key and take however many bits the hashtable is currently set to take (see the last post). This tells us which bucket to place the record in. A bucket is a linked list of pages, which are chunks of bytes on disk. Pages in the linked list may be full, so we now need to figure out which page the record should go in. Once we figure that out, the page is fetched from disk. We then make the necessary changes to the copy of the page in memory (eg. write the record, increment the number of records in the page’s header), then save it out to disk.

get is very similar and uses the same method (search_bucket, described below) that we use to figure out which page in the bucket the record should be placed in.

## Pages and buffer pool

A file is divided into pages of size 4KB. When we want to read or write some chunk of bytes in a page, we have to read the page into memory, make whatever changes we want to make on the copy resident in memory, then save the updated page out to disk.

Now, we will usually have a little more than 4KB of memory at our disposal, so what we can do is buffer the pages we read and write. So, instead of flushing the page out to disk after every operation, we keep the page in memory as long as possible. In this post, we will use a rather simplistic (and inefficient!) way to decide which pages get to stay (or rather, which page gets evicted)– a FIFO queue[^1]. If we have a buffer pool of size 2, and we read pages in the order 1, 1, 2, 3, this is what the reads/writes to disk will look like:

```
read 1 [fetch page 1]
read 1 [no fetch; page 1 already in buffer]
read 2 [fetch page 2]
read 3 [fetch page 3; place in slot occupied by 1]
```
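
In code, a FIFO buffer pool can be sketched like this (a hypothetical stand-alone version, not the post’s actual DbFile code; a zero-filled Vec stands in for the disk read):

```rust
use std::collections::{HashMap, VecDeque};

/// Minimal FIFO page cache sketch: pages are evicted in the order they
/// were brought in, regardless of how recently they were last used.
struct FifoCache {
    capacity: usize,
    queue: VecDeque<usize>,         // resident page ids, oldest first
    pages: HashMap<usize, Vec<u8>>, // page id -> page contents
}

impl FifoCache {
    fn new(capacity: usize) -> Self {
        FifoCache {
            capacity,
            queue: VecDeque::new(),
            pages: HashMap::new(),
        }
    }

    /// Returns true on a cache hit, false when the page had to be fetched.
    fn read(&mut self, page_id: usize) -> bool {
        if self.pages.contains_key(&page_id) {
            return true; // already resident; no disk access
        }
        if self.queue.len() == self.capacity {
            // Evict whichever page has been resident the longest.
            let evicted = self.queue.pop_front().unwrap();
            self.pages.remove(&evicted);
        }
        self.queue.push_back(page_id);
        self.pages.insert(page_id, vec![0u8; 4096]); // stand-in for the disk read
        false
    }
}
```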


A page resident in our buffer pool is represented by the Page struct:

```rust
pub struct Page {
    pub id: usize,
    pub storage: [u8; PAGE_SIZE],
    pub num_records: usize,
    // page_id of overflow bucket
    pub next: Option<usize>,
    pub dirty: bool,

    keysize: usize,
    valsize: usize,
}
```


Notice that besides the bytearray (storage) we have some other metadata about the page here too:

• id specifies which page in the file this page is
• storage is all the bytes in the page copied out to a byte array
• num_records specifies how many records this page has
• next is what strings together the overflow pages we talked about in the last post. It is also used to keep track of pages that used to be overflow pages but are no longer in use.
• dirty specifies whether the page here is out of sync with its corresponding page on file.
• the next two fields keysize and valsize specify what length the key and value bytearrays are in the records the page stores.

The metadata is stored in the page itself. To read and write this metadata we have the following methods:

```rust
// in an `impl Page` block
pub fn read_header(&mut self) {
    let num_records: usize = bytearray_to_usize(self.storage[0..8].to_vec());
    let next: usize = bytearray_to_usize(self.storage[8..16].to_vec());
    self.num_records = num_records;
    self.next = if next != 0 {
        Some(next)
    } else {
        None
    };
}

pub fn write_header(&mut self) {
    mem_move(&mut self.storage[0..8], &usize_to_bytearray(self.num_records));
    mem_move(&mut self.storage[8..16], &usize_to_bytearray(self.next.unwrap_or(0)));
}
```


mem_move is a function similar to C’s memcpy.

Of course, the main content of the page is the records it stores. These are read and written using the following methods:

```rust
pub fn read_record(&mut self, row_num: usize) -> (&[u8], &[u8]) {
    let offsets = self.compute_offsets(row_num);
    let key = &self.storage[offsets.key_offset..offsets.val_offset];
    let val = &self.storage[offsets.val_offset..offsets.row_end];
    (key, val)
}

pub fn write_record(&mut self, row_num: usize, key: &[u8], val: &[u8]) {
    let offsets = self.compute_offsets(row_num);
    mem_move(&mut self.storage[offsets.key_offset..offsets.val_offset],
             key);
    mem_move(&mut self.storage[offsets.val_offset..offsets.row_end],
             val);
}
```


I think both of the above are fairly straightforward. Because records are of fixed size and because we’re only dealing with bytearrays, it’s just a matter of moving stuff around.
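
For concreteness, here is roughly the arithmetic compute_offsets has to do (a sketch with assumed names; the 16-byte header size follows from the num_records and next fields stored at the start of the page):

```rust
/// Hypothetical sketch of the page's offset arithmetic: a 16-byte header
/// (num_records + next), then fixed-size records laid out back to back,
/// each one key bytes followed by value bytes.
const HEADER_SIZE: usize = 16;

struct RecordOffsets {
    key_offset: usize,
    val_offset: usize,
    row_end: usize,
}

fn compute_offsets(row_num: usize, keysize: usize, valsize: usize) -> RecordOffsets {
    let row_size = keysize + valsize;
    let key_offset = HEADER_SIZE + row_num * row_size;
    RecordOffsets {
        key_offset,
        val_offset: key_offset + keysize,
        row_end: key_offset + row_size,
    }
}
```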

## Buckets

Being able to read and write records to pages is great, but we need to be able to write to buckets without having to know in advance which page in the bucket the record will go to:

```rust
pub struct SearchResult {
    pub page_id: Option<usize>,
    pub row_num: Option<usize>,
    pub val: Option<Vec<u8>>,
}

// impl DbFile { ...
pub fn search_bucket(&mut self, bucket_id: usize, key: &[u8]) -> SearchResult {
    let mut page_id = self.bucket_to_page(bucket_id);
    let mut buffer_index;
    let mut first_free_row = SearchResult {
        page_id: None,
        row_num: None,
        val: None,
    };
    loop {
        buffer_index = self.fetch_page(page_id);
        let next_page = self.buffers[buffer_index].next;
        let page_records = self.all_records_in_page(page_id);

        let len = page_records.len();
        for (row_num, (k, v)) in page_records.into_iter().enumerate() {
            if slices_eq(&k, key) {
                return SearchResult {
                    page_id: Some(page_id),
                    row_num: Some(row_num),
                    val: Some(v),
                };
            }
        }

        let row_num = if len < self.records_per_page {
            Some(len)
        } else {
            None
        };

        match (first_free_row.page_id, first_free_row.row_num) {
            // this is the first free space for a row found, so
            // keep track of it.
            (Some(_), None) |
            (None, _) => {
                first_free_row = SearchResult {
                    page_id: Some(page_id),
                    row_num: row_num,
                    val: None,
                };
            },
            _ => (),
        }

        if let Some(p) = next_page {
            page_id = p;
        } else {
            break;
        }
    }

    first_free_row
}
```


This may look like a lot, but what it’s doing is not too complicated. It keeps fetching the next page in the bucket until it finds the record with the key it’s looking for. What is perhaps interesting is what it returns when it doesn’t find the record (because the record hasn’t been inserted)– if there’s space, it indicates in SearchResult which page and row to insert the record in; if there isn’t any space, it returns the last page it looked in, which is meant to be used when creating an overflow page.

## Methods in LinHash

We now know how to figure out where a record should be placed in a bucket (search_bucket) and how to place a record in said location (write_record). An implementation of our hashtable’s put operation arises quite naturally:

```rust
// impl LinHash { ...
pub fn put(&mut self, key: &[u8], val: &[u8]) {
    let bucket_index = self.bucket(&key);
    match self.buckets.search_bucket(bucket_index, key.clone()) {
        SearchResult { page_id, row_num, val: old_val } => {
            match (page_id, row_num, old_val) {
                // new insert
                (Some(page_id), Some(pos), None) => {
                    self.buckets.write_record_incr(page_id, pos, key, val);
                    self.nitems += 1;
                },
                // case for update
                (Some(_page_id), Some(pos), Some(_old_val)) => {
                    panic!("can't use put to reinsert old item: {:?}", (key, val));
                },
                // new insert, in overflow page
                (Some(last_page_id), None, None) => { // overflow
                    self.buckets.allocate_overflow(bucket_index, last_page_id);
                    self.put(key, val);
                },
                _ => panic!("impossible case"),
            }
        },
    }

    self.maybe_split();
    self.buckets.write_ctrlpage((self.nbits, self.nitems, self.nbuckets));
}
```


The self.bucket call at the very top hashes the key and computes which bucket the item should go in. Remember that which bucket an item goes in depends on how many bits of the hash we’re looking at and how many buckets we have. We covered how this works in part 1, so we won’t go into that now.
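
As a refresher, that computation can be sketched like so (a hypothetical free-standing helper; the actual method hashes the key first and reads nbits/nbuckets off self):

```rust
/// Sketch of linear hashing's bucket choice: take the low `nbits` bits of
/// the hash; if that names a bucket that hasn't been allocated yet, drop
/// the top bit and use its "parent" bucket instead.
fn which_bucket(hash: u64, nbits: usize, nbuckets: usize) -> usize {
    let b = (hash as usize) & ((1 << nbits) - 1);
    if b < nbuckets {
        b
    } else {
        b ^ (1 << (nbits - 1))
    }
}
```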

This is how maybe_split is implemented:

```rust
fn maybe_split(&mut self) -> bool {
    if self.split_needed() {
        self.nbuckets += 1;

        self.buckets.allocate_new_bucket();
        if self.nbuckets > (1 << self.nbits) {
            self.nbits += 1;
        }

        // Take index of last item added and subtract the 1 at the
        // MSB position. eg: after bucket 11 is added, bucket 01
        // needs to be split
        let bucket_to_split =
            (self.nbuckets - 1) ^ (1 << (self.nbits - 1));
        // Replace the bucket to split with a fresh, empty
        // page. And get a list of all records stored in the bucket
        let old_bucket_records =
            self.buckets.clear_bucket(bucket_to_split);

        // Re-hash all records in old_bucket. Ideally, about half
        // of the records will go into the new bucket.
        for (k, v) in old_bucket_records.into_iter() {
            self.reinsert(&k, &v);
        }
        return true;
    }

    false
}
```


## Limitations

Let me close by stating some of the limitations of this implementation. I’ve already hinted at the limitations of the cache eviction policy we use. Most real systems use LRU. The error handling story here is also not great– we should not be panicking when the client uses the datastructure incorrectly.

More importantly, our hashtable cannot handle concurrent accesses. We’re not using locks anywhere so, if multiple threads are inserting records, we almost certainly will see the data go out of whack.

Thanks to Kevin Butler for sending in corrections and suggestions.

[^1]: A much better way of doing cache eviction is to kick out the least-recently-used page. I chose a FIFO queue just because it’s much simpler to implement.
