Assignment 5: Search Index using 3-4-5-6 trees

Deadline: Sunday 3rd May 11:59 PM

3-4-5-6 trees are a logical extension of 2-3-4 trees. Whenever we need to insert in a 6-node, it will be split into two 3-nodes and the middle key sent up. When there are more than one elements in the tree, every node will be a 3-node, 4-node, 5-node, or 6-node.

However, the tree is now stored in two files. One file contains all the nodes in JSON format. Use JSON.stringify and JSON.parse to convert to and from JSON. Always append new nodes, each terminated by a new line, leaving old unused nodes where they were. Instead of in-memory pointers you need to now store file offsets where the node starts from. The second file just stores the file offset of the root node. Note that when you change the root node, you will append it and it will have a new address. This mechanism is called copy-on-write (CoW) and this is similar to immutability in Clojure. Use fs.open, fs.read, and fs.close to read from a specific place in file. Read a fixed number of bytes and then take the portion until the first newline.

You need to write an HTTP server application that serves the following two operations where the first should list all files containing the word searchme and the second should add filename to the list of files stored against the keyword addme.

http://localhost:8000/?do=search&word=searchme
http://localhost:8000/?do=add&word=addme&file=filename

Try console.log(url.parse(request.url,true).query); and you will know how to extract the parameters.

When writing in file, you will artificially delay the write by 10 seconds using setTimeout. However, searching should work correctly during this time too with the new data being served. For this reason, you will need to cache all new nodes created due to the addition until they are eventually appended to file successfully.

Like last assignment, we will consider plain text files in a directory. You will write another node progream called build-search-index that works as an HTTP client and sends key-value pairs to your server. The keys are all words longer than 3 letters in all text files and the value is the file containing that word. Consider anything separated by a space as a word.

node --harmony build-search-index /directory/of/text/files

You could run this command on multiple directories and they will all be added to the tree in the server. You could also stop the server and restart it and it will still have the whole tree available for searching.

Submission is via your git repository as before.

Code Quality

Indentation, descriptive naming, good function decomposition, and good use of asynchronous features (NO synchronous functions) are a necessary part of this assignment. A complete working assignment might be worth less than half the grade if done in a poor way. Performance is not a focus of this assignment.

Warnings

Taking any code from Internet, from each other, helping each other debug or having access to each other’s code is strictly prohibited. Any perpetrators will be forwarded for strict action. Any guidance on design or even examples you study from the Internet should be quoted in your README file.