Daniel's Blog

Setting Up Crystal on ARM

Posted by Danila on March 2, 2019. Last edit on March 2, 2019

Having found myself using a Chromebook with Arch Linux, I acutely felt the lack of official Crystal builds for ARM. After spending an hour or so looking online for a comprehensive guide to getting a working compiler on ARM, I think I now know the general idea. I will lay it out here, for myself, as well as for others who may be in the same situation.

To compile Crystal, you need Crystal

To compile Crystal without bootstrapping it from older versions, you need a different machine which is capable of cross-compilation. Fortunately for me, my friends and I had previously set up a server hosting several x86_64 virtual machines, one of which we dedicated to cross-compiling various programs.

To get started, I had to download the compiler on that machine:

sudo pacman -S crystal

Note that this isn't the ARM Chromebook. Running this command on the Chromebook would not work.

Building on the x86_64 Machine

After getting the compiler, I also needed to download the compiler source. This was done using git:

git clone https://github.com/crystal-lang/crystal.git

I also installed llvm6, which is required on both the machine that's building the compiler and the machine for which the compiler is being built:

sudo pacman -S llvm6

From here on in, I ran commands from inside the directory that was downloaded via git clone:

cd crystal

Finally, I didn't want to compile the "master" version of the compiler. It wasn't a release! To check out the latest release (0.27.2 at the time of writing), I used git:

git checkout 0.27.2

Now, I had the compiler and the source. I was ready to compile the source to get myself a nice ARM Crystal compiler. But how? The official guide specified two options for cross compilation:

  • --cross-compile - This option takes no arguments; adding it to the command enables cross-compilation.
  • --target=<llvm target triple> - This specifies the target architecture you're building for.

In order to get the second option right, I had to know the LLVM target triple for my target machine. To find it, I ran the following command on that machine:

gcc -dumpmachine

This produced the output armv7l-unknown-linux-gnueabihf. This was exactly what I needed to know!

Finally, looking through the Makefile in the repository, I found three more flags that are used by default in the process:

  • -D without_openssl
  • -D without_zlib
  • --release - produces an optimized, and therefore faster, compiler

To compile the compiler, I had to compile the src/compiler/crystal.cr file. With all these options, the command came out to be:

crystal build src/compiler/crystal.cr --cross-compile --target=armv7l-unknown-linux-gnueabihf -D without_openssl -D without_zlib --release

There is only one more trick to cross-compiling Crystal: even with the --cross-compile and --target=... options from the official guide, simply invoking the crystal command on the source file won't work. You need to use the wrapper script that the repository provides, which means replacing crystal with ./bin/crystal:

./bin/crystal build src/compiler/crystal.cr --cross-compile --target=armv7l-unknown-linux-gnueabihf -D without_openssl -D without_zlib --release

With this, I finally obtained a crystal.o file. After copying it over to my target machine, along with the linker command that the compiler printed out, I was ready to proceed.

Compiling on the Target ARM Machine

Just like with the x86_64 machine, I needed llvm6:

sudo pacman -S llvm6

I also needed the Crystal source, again!

git clone https://github.com/crystal-lang/crystal.git && cd crystal
git checkout 0.27.2

Finally, I needed a few more libraries. These are gc (the Garbage Collector Crystal uses) and libevent:

sudo pacman -S gc libevent

With these dependencies installed, I could compile the last two files needed to build a working compiler:

make deps

After this, the command I noted down from earlier (on the x86_64 machine) was all that was left (note that I placed crystal.o in the clone of the Crystal repository):

cc 'crystal.o' -o 'crystal'  -rdynamic src/llvm/ext/llvm_ext.o `/usr/bin/llvm-config --libs --system-libs --ldflags 2> /dev/null` -lstdc++ -lpcre -lm -lgc -lpthread src/ext/libcrystal.a -levent -lrt -ldl -L/usr/lib -L/usr/local/lib

This produced a fresh, new crystal executable!

I was not done. The executable couldn't compile a basic "Hello, world"! This is because I once again needed the wrapper script. This script searches the .build directory for the Crystal executable, and so I placed it there:

mkdir .build
mv crystal .build

Finally, I made sure to add the ./bin directory to my PATH.

Shards

Crystal is not complete without its package manager, shards. This program doesn't need to be cross compiled, but it does come separately from the compiler. First, I cloned the repository:

git clone https://github.com/crystal-lang/shards.git

Then, I installed libyaml, which is necessary to compile shards:

sudo pacman -S libyaml

And finally, I ran the Makefile provided in the repository:

make

I once again added the ./bin directory to my PATH. And finally, a working Crystal environment!

Haskell Error Checking and Autocompletion With LSP

Posted by Danila on January 16, 2019. Last edit on January 17, 2019

For Oregon State University's CS 381, Programming Language Fundamentals, Haskell is the language of choice. While it has an excellent REPL, and an excellent compiler, a piece of the puzzle is missing when writing Haskell code: on-the-fly error checking. Although there are many IDEs for more popular languages, such as Java and C++, that check your code as you type, this is not the case for Haskell. However, thanks to the magic of the Language Server Protocol, you can get code error checking and linting in your favorite editor. After we're done, your IDE / Text Editor / etc. should be able to display errors in a way similar to the image below:

[Screenshot: error messages displayed inline in the editor]

Prelude: Language Server Protocol

Language Server Protocol (LSP) is an attempt to simplify the code analysis feature in various IDEs. Before LSP, each individual IDE had to provide its own functionality for analyzing user code, and couldn't re-use the code analysis features of another IDE. Thus, for every IDE, for every language, a new code analyzer had to be developed. For \(m\) IDEs and \(n\) languages, there would need to be \(m \times n\) code analyzers. This is less than ideal. LSP solves this problem by moving the task of analyzing code into an external process, called a Language Server, so that only \(m\) clients and \(n\) servers are needed. The language server examines code, finds errors and warnings, and figures out the names of the various variables and functions that the source file contains. It then communicates this information to a Language Client, which is usually the IDE. The language client is responsible for presenting the information to the user in any way it deems fit.

Because LSP is a protocol, any server is able to communicate with any client. Once someone has written a language server for, say, Haskell, this server can be used by any IDE that supports LSP, which means that IDE instantly gains Haskell support. Similarly, once an IDE works as an LSP client, it can use any language server, and thus is immediately able to support all languages for which there is a language server.

LSP for Haskell: Haskell IDE Engine (HIE)

It so happens that Haskell does have a language server, called HIE. We'll be using it to get all the neat error checking and autocompletion features. As explained above, a language server is an external program: we have to install it separately from an IDE, and then configure the IDE to communicate with it. So, how are we going to install this external program? We will compile HIE from source, using the latest Haskell release (8.6.3 at the time of writing). To do so, you must have a working Haskell compiler (GHC), as well as Stack.

Note: It has come to my attention that on Windows, the GHC compiler is broken for version 8.6.3, leading to the package fclabels taking an infinite amount of time to compile. If you're running Windows, I suggest using 8.4.4 instead of 8.6.3. It is not yet clear if HIE can handle versions of Haskell newer than the one it was built with (which is the situation you'd likely be in, since your project's compiler will most likely be 8.6.3 no matter what). I will update this post as more information becomes available.

First things first, we need to clone the repository:

git clone https://github.com/haskell/haskell-ide-engine.git
cd haskell-ide-engine
git checkout 0.5.0.0

Notice that we checked out the 0.5.0.0 release. At the time of writing, this is the latest release. I have attempted to compile the master branch of HIE directly, and was not successful; it's possible that it is not completely stable.

Next, we will compile HIE. This is a fairly time-consuming step: on my machine, the full compilation process took around 10 minutes. At the very least, the project repo contains a Makefile and a PowerShell script, so all we have to do is run the appropriate one. From here, different steps need to be taken depending on your operating system: choose your adventure!

UNIX-like Systems

On UNIX systems, we can use the provided Makefile to run the compilation script:

make hie-8.6.3

After the command completes, you need to add the executable to your path. You do this by modifying your .bashrc file. First, find the directory:

stack path --local-bin

Copy the output of that command, and add a new line to the bottom of your ~/.bashrc file (not including the square brackets):

export PATH=$PATH:[output of stack path]

Reload your RC file by running

source ~/.bashrc

You are now ready to verify your installation.

Windows Systems

To run the compilation process on Windows, you need to use the provided PowerShell script:

powershell -ExecutionPolicy RemoteSigned -c .\build-all.ps1 8.6.3

Once that completes, you need to add the location of the compiled executable to your Path environment variable. To do so, you first need to find the directory that you want to add to the variable:

stack path --local-bin

Next, you need to locate the Path variable itself. It is accessed through a Control Panel setting, which is found under System and Security > System > Change settings > Advanced > Environment Variables. Select the line that says "Path" and click Edit. If this brings up a single text box, add

;[output of stack path]

to the end of the textbox (do not delete the text that was previously in the box; only add to it). If the Edit button brings up a dialog box with a list of paths, simply click New and paste the output of stack path --local-bin there before pressing Enter.

You are now ready to verify your installation.

Verifying Installation

After you have added HIE to your path, you should be able to start it with the command:

hie

This should produce some output along the lines of "Run entered for HIE(hie) Version 0.5.0.0". Once this is done, HIE is installed and you can now configure your favorite editor.

Setting Up a Language Client

Atom

In Atom, you need to install atom-ide-ui, ide-haskell-hie, and language-haskell. From there, opening a Haskell file in the IDE should enable autocompletion.

If autocompletion and error checking do not work, chances are, it cannot find the HIE executable we built in the previous steps. Locate it by going to Settings > Packages > (Search ide-haskell-hie) > Settings (on ide-haskell-hie package). From there, paste the output of stack path --local-bin into Absolute path of hie executable. To the end of that line, add /hie (or \hie, if you're on Windows): the output of the command gives the folder in which the language server is installed, and appending hie makes the path point to the executable itself.

Visual Studio Code

The package Haskell Language Server provides language client support to Visual Studio Code.

Neovim

First, set up vim-plug as per the GitHub instructions, most likely by dropping the plug.vim file into your Neovim's autoload directory. From there, edit your init.vim file to include the following:

call plug#begin('~/.config/nvim/plugged')
Plug 'roxma/nvim-yarp'
Plug 'ncm2/ncm2'
Plug 'autozimu/LanguageClient-Neovim', {'branch': 'next', 'do': 'bash install.sh' }
call plug#end()

After that's done, restart Neovim, then execute the :PlugInstall command. Restart Neovim again to make sure the plugins have loaded.

Note: If you already use a different plugin manager, just install roxma/nvim-yarp, ncm2/ncm2 and autozimu/LanguageClient-Neovim.

Additionally, you need to add the following options (also in init.vim):

autocmd BufEnter * call ncm2#enable_for_buffer() " Enable ncm2 for all buffers
set completeopt=noinsert,menuone

The above code installs the three plugins, nvim-yarp, ncm2 and LanguageClient-Neovim. While it's pretty clear what the last one does, the first two are less obvious. nvim-yarp is a dependency for ncm2, and allows Neovim to communicate with Python. ncm2 is a plugin that improves autocompletion. The options we set are recommended by the ncm2 installation guide.

Finally, we need to configure the LanguageClient:

let g:LanguageClient_serverCommands = {
            \ 'haskell': ['hie', '--lsp'],
            \ }

After we're done, restart Neovim. You should now have autocompletion and error checking. Some settings that I also use with this setup are as follows:

let g:LanguageClient_diagnosticsDisplay = {
    \ 1: {"name": "Error", "texthl": "ALEError", "signText": ">>", "signTexthl": "ALEErrorSign"},
    \ 2: {"name": "Warning", "texthl": "ALEWarning", "signText": ">>", "signTexthl": "ALEWarningSign"},
    \ 3: {"name": "Information", "texthl": "ALEInfo", "signText": ">>", "signTexthl": "ALEInfoSign"},
    \ 4: {"name": "Hint", "texthl": "ALEInfo", "signText": ">>", "signTexthl": "ALEInfoSign"},
    \ }
set signcolumn=yes

Conclusion

With the Language Server Protocol, you can have consistent and effective error checking and autocompletion in any editor that can act like a client. I hope that after reading this post, you will have a more pleasant time using Haskell.

A Look Into Starbound's File Formats

Posted by Danila on May 17, 2017. Last edit on May 17, 2017

After playing a few hours of Starbound, I decided that I wanted to mess around with its file format, perhaps make a save manager/swapper, or something. There wasn't a really clear idea in my head as to what I was going to do, but I wanted to start writing something and see what happens. Looking up "starbound file formats" led me immediately here. I tried to follow the outline with my own implementation, but things weren't working. At all. So, I cracked open a hex editor. According to the website, I was supposed to see:

Byte[4] Signature "BTDB"

However, this was not the case. Instead, a very similar BTreeDB5 was what greeted me at the beginning of the file. Not good. The next links in my search results were py-starbound, and its friend, by the same author, go-starbound. Neither of these was in a language that I felt comfortable using for my project, so there was only one solution - to figure out the formats that they used, and try one more time to write my own.

py-starbound, nicely enough, actually has a file named FORMATS.md. This file proved very helpful, though slightly incomplete. One key piece of information was missing from it - how the BTreeDB5 was organized beyond its header. The only thing in FORMATS.md on this subject was:

This section will contain information on how to retrieve a value from a BTreeDB5 database.

Not very helpful. Before I go into what I managed to determine from the code, we may first take a look at one thing that we already know about the world format - it is a B-Tree.

Binary Search Trees

The B-Tree is a generalization of a Binary Search Tree, or BST for short. Binary Search Trees (and B-Trees in general) operate on data that can be ordered consistently, the simplest example being numbers. As a running example, I'll be using a BST that holds integers. A BST is made up of nodes, objects that actually hold the pieces of data that the tree itself organizes.

In a BST, the nodes are organized in a simple way. Each node can have up to two children (sub-nodes), and each of those can have up to two children, etc. The children are generally classified as right and left. Conventionally, left children always have a value that is below (or comes before) the value of the node whose child they are (their parent), and right children have a bigger value.

In order to find the data in the tree, a program would first examine the very first node of the BST, called the root. If the data it's looking for is equal to that of the root, the search ends there. If the data is less, it moves on to the left child, and performs the same operation there. Similarly, if the data is larger, the search moves on to the right child. This continues until either a node with the same value is found, or until there is no child left to examine - the algorithm has reached the end of the tree.
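To make this concrete, here's a minimal sketch of that lookup in C (the node layout and function name are my own illustration):

#include <stdbool.h>
#include <stddef.h>

struct bst_node {
  int value;
  struct bst_node *left;
  struct bst_node *right;
};

// Walk down from the root, going left or right at each node,
// until we find the value or run out of children.
bool bst_contains(struct bst_node *node, int value) {
  while (node != NULL) {
    if (value == node->value) return true;
    node = (value < node->value) ? node->left : node->right;
  }
  return false;
}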

The purpose of the binary search tree is to speed up lookups. Let's compare it to a simple list of values. To check if a value is there, a naive approach would go from the beginning to the end of the list, searching for the data. For instance, if we wanted to check whether the list (1, 2, 3, 4, 5, 6, 7, 8, 9) contained the number nine, we would have to look at the first element (one), then the second, all the way until the 9th. This is fine for small lists, but when lists become big enough, this operation gets fairly slow. Let's now take a look at how the same operation would be performed in a binary tree.

Here's an example of a valid binary tree with the numbers one through nine.

           (5)
     (2)         (7)
   (1) (3)     (6) (8)
         (4)         (9)

The root node of this tree is 5, its left child is 2, and its right child is 7. In order to check if the nine is in this tree, we only need to look at the five, then the seven, then the eight, and finally, the nine. We have, therefore, looked at only four of the numbers, instead of all nine. But let's look at another valid binary search tree.

(1)
  (2)
    (3)
      (4)
        (5)
          ...

In this tree, the root node is 1, and the right child is 2. None of the nodes have a left child. Looking through this tree would be as slow as looking through a list - we'd have to look through all the numbers before we find 9. No good.

Although the average efficiency of a Binary Search Tree is \(O(\log n)\), meaning that for a tree with \(n\) nodes, it will only need to examine up to \(\log n\) of these nodes, it can nonetheless be as inefficient as \(O(n)\), meaning that it will have to examine every node in the tree.

This isn't good enough, and many clever algorithms have been invented to speed up the lookup of the tree by making sure that it remains balanced - that is, it isn't arranged like a simple list. Some of these algorithms include Red-Black Trees, AVL Trees, and, of course, B-Trees.

B-Trees

B-Trees are a generalization of Binary Search Trees. That means that every Binary Search Tree is a B-Tree, but not all B-Trees are BSTs. The key difference lies in the fact that B-Trees' nodes aren't limited to having only two child nodes, and can also have more than one value.

Each B-Tree node is a sorted array of values. That is, instead of holding a single number like the nodes of the BST we've looked at, each node holds several, and these numbers must be sorted. Below are some examples of B-Tree nodes:

(1)
(1, 2)
(2, 3, 5)

This raises the question: How do we find the next node to examine if the node doesn't contain the value we're looking for? Before, we only had two choices, but now, our data can be bigger than one of the values and smaller than the other...what do we do?

This is solved using another property of B-Trees - the number of children of a node is always one more than the number of values in it. A node with one value will have two children, and a node with two values will have three children. So, to find the next node we want to look at, we find the child that falls "between" the two values that our data lies between. To clarify, let's look at a simple B-Tree:

           (10,   20,   30)
       (1st)  (2nd) (3rd)  (4th)

If we were looking for the number 15, we'd look between the 10 and the 20, examining the 2nd node, and if we were looking for 45 we'd look past the 30, at the 4th node.
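In code, picking the child to descend into is just a scan across the node's sorted values. Here's an illustrative C sketch (again my own toy layout, not Starbound's):

// A toy B-Tree node holding integers.
struct btree_node {
  int num_values;
  int *values;                  // sorted, num_values entries
  struct btree_node **children; // num_values + 1 entries
};

// Return the child that could contain `value`: child i sits
// "between" values[i - 1] and values[i]. (A full implementation
// would first check whether `value` equals one of the node's own
// values and stop there.)
struct btree_node *btree_pick_child(struct btree_node *node, int value) {
  int i = 0;
  while (i < node->num_values && value > node->values[i])
    i++;
  return node->children[i];
}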

Starbound B-Trees and BTreeDB5

The BTreeDB5 data structure uses something other than integers for its keys - it uses sequences of bytes. These bytes are compared in a very similar fashion to integers. The game first looks at the first byte in the sequence (like the most significant digit of an integer), and if that's the same, moves on to the next one. Also, Starbound's B-Trees store not only the keys used to find data, but the data itself.

The "nodes" in the BTreeDB are called "blocks" and are one of three types - "index", "leaf", and "free" nodes. "Index" nodes are like the (10, 20, 30) node in the above example - they point to other nodes, but actually store no data themselves. The "leaf" nodes actually contain the data, and, if that data is longer than the maximum block size, "leaf" nodes contain the index of the next leaf node where the user might continue to read the data. The "free" nodes are simply free data, empty and ready for Starbound to fill them with something useful.

To be able to read a Starbound BTree, the first thing that needs to be done is to read the general information about the tree. For this, we read the very beginning of the file, called the header. GitHub user blixt has made an excellent table of values that make up the header. The ones that matter to us are "Whether to use root node #2 instead" (hereby referred to as "swap"), "Root node #1 block index", and "Root node #2 block index". We also need to know the key size and block size of the B-Tree to be able to jump around in the file.

Once the header has been parsed (this is an exercise left up to the reader), the next step is to find the root node. This follows exactly the general lookup algorithm for a B-Tree. The index in the file (in bytes) of a block is \(headerSize + blockSize \times n\), where \(n\) is the block number. Depending on the value of swap (whether to use the second root node), we use either the index of root node 1 or 2 for \(n\), and move to that position.
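In C, jumping to a block is a one-liner once the header has been parsed (a sketch; the names are mine):

#include <stdio.h>

// Seek to the start of block n. header_size and block_size
// come from the parsed BTreeDB5 header.
void seek_to_block(FILE *file, long header_size, long block_size, long n) {
  fseek(file, header_size + block_size * n, SEEK_SET);
}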

Next, we proceed to identify the node that we've found. The first two bytes of that node are an ASCII marker: "FF", "LL", or "II", representing, you guessed it, "Free", "Leaf", or "Index". If the node is an index node, we need to search it for the next node we need to examine. To do so, we first read two values from the node, two 32-bit integers. The first is the number of keys in the block, and the next is the block number of the first child node.

The key -> child data is stored in the node as follows:

(1st child)(1st key)(2nd child)(2nd key) ... (nth child)

(The key is of the length given in the header, and the child number is a 32-bit integer)

As you can see, the number of children is one more than the number of keys, and it's easier to treat the first child block number as a separate value, and the rest as key -> child pairs, like:

(1st child) (1st key, 2nd child) ... ((n-1)th key,  nth child)

Simply put, if the value we're searching for is bigger than the first key only, we go to the second child; if it's bigger than the second key, we go to the third child, and so on. If our value is not bigger than any of the keys, we go to the 1st child. After we move to the index of the new child, we once again examine its type, and if it's still "II", we repeat the process.
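Here's roughly what that search looks like in C. This is a sketch based on my reading of the layout above; I'm assuming the 32-bit integers are stored big-endian (as the rest of the format appears to be), and that keys compare lexicographically, which is exactly what memcmp does:

#include <stdio.h>
#include <stdint.h>
#include <string.h>

// Read a big-endian 32-bit integer (assumption: the format
// stores its integers big-endian).
uint32_t read_u32_be(FILE *f) {
  unsigned char b[4];
  fread(b, 1, 4, f);
  return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
       | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

// Sketch: with the file positioned just past an index block's
// two-byte "II" marker, return the block number of the child to
// descend into for `key`. key_size comes from the header.
uint32_t index_pick_child(FILE *f, const unsigned char *key, size_t key_size) {
  unsigned char node_key[64]; // assumes key_size <= 64
  uint32_t num_keys = read_u32_be(f);
  uint32_t child = read_u32_be(f); // block number of the first child
  for (uint32_t i = 0; i < num_keys; i++) {
    fread(node_key, 1, key_size, f);
    uint32_t next_child = read_u32_be(f);
    // If our key is bigger than this node key, the target lies
    // at or past the next child; otherwise we're done.
    if (memcmp(key, node_key, key_size) > 0)
      child = next_child;
    else
      break;
  }
  return child;
}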

Once we reach a "leaf" node, we're very close. After the two ASCII characters describing its type, the leaf node will contain a 32-bit integer describing the number of key-data pairs it has. Each key-data pair is made up of the key, a variable integer describing the length of the data, and the data itself. We examine one pair after another, carefully making sure that we don't exceed the end of the block, located at \(headerSize + blockSize \times (n + 1) - 4\). The reason that we subtract four from this equation is to save space for the address of the next block. As I mentioned above, leaf nodes, if their data is bigger than the block size, contain the block number of the next leaf node to which we can continue if we reach the end of the current leaf. If we do reach the end of the current leaf, we read the 32-bit integer telling us the number of the next block, jump to its index, and, after reading the first two bytes to ensure it's actually a leaf, continue reading our data. Once that's done, voila! We have our bytes.
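The "keep reading across leaves" logic is the fiddliest part, so here's a rough C sketch of just that piece (reusing read_u32_be from above, with all error handling omitted):

// Sketch: read `count` bytes of leaf data into `out`, hopping to
// the next leaf whenever we hit the end of the current block.
// The file must already be positioned somewhere inside leaf
// block number `block`, past its "LL" marker.
void read_leaf_bytes(FILE *f, long header_size, long block_size,
                     uint32_t block, unsigned char *out, size_t count) {
  long end = header_size + block_size * (block + 1) - 4; // last 4 bytes: next-block number
  while (count > 0) {
    size_t available = (size_t)(end - ftell(f));
    size_t take = count < available ? count : available;
    fread(out, 1, take, f);
    out += take;
    count -= take;
    if (count > 0) {
      // Exhausted this leaf: follow the pointer to the next one.
      block = read_u32_be(f);
      fseek(f, header_size + block_size * block, SEEK_SET);
      char marker[2];
      fread(marker, 1, 2, f); // should be "LL"
      end = header_size + block_size * (block + 1) - 4;
    }
  }
}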

If we're using this method to read a Starbound world file, we need to also know that the data is zlib-compressed. I won't go far into the details, as it appears that both Python and Java provide libraries for the decompression of such data.
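For C readers following along, zlib itself does the job; here's a minimal sketch, where data and data_len stand in for the bytes we just pulled out of the leaves:

#include <zlib.h>

// uncompress() needs the output size up front, so this sketch
// over-allocates; a production reader would use the streaming
// inflate() API and grow the buffer as needed.
unsigned char out[1 << 20]; // 1 MiB guess at the decompressed size
uLongf out_len = sizeof(out);
if (uncompress(out, &out_len, data, data_len) == Z_OK) {
  // out[0..out_len) now holds the decompressed world data.
}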

This concludes today's post. I might make another post later about my implementation of the BTreeDB search, written in either Kotlin or Java. In the meantime, if you're interested, check out the following Python, Go and Kotlin (my own) implementations of parsers for BTreeDBs and other Starbound file formats.

New Look, New Features!

Posted by Danila on November 23, 2016. Last edit on January 19, 2017

I've recently taken the time to redesign my website from scratch, on both the back-end and the front-end! On the back-end, this means the addition of tags, which allow me to categorize the not-so-plentiful content on the site. On the front-end, this means a new design written completely from scratch. This new front-end also allows me to render mathematics: $$A=\int_{a}^{b}f(x)\,dx$$
Phew!

The only downside to this is that I chose to migrate the lazy way. That is, I simply copy-pasted all the post data from one database to the other. Boo! This means the dates will say all three previous posts were made today. Oh well!

Learning Emulation, Part 2.5 - Implementation

Posted by Danila on November 23, 2016. Last edit on January 9, 2017

This is the third post in a series I'm writing about Chip-8 emulation. If you want to see the first one, head here.

In the previous part of this tutorial, we created a type to represent a basic Chip-8 machine. However, we've done nothing to make it behave like one! Let's start working on that.

Initializing

If you're writing your emulator in C / C++, simply declaring a variable such as stackp (the stack pointer) will not give you a clean value. The memory backing these variables is allocated but never cleared, so they can (and do) contain garbage. For some parts of our program, it's necessary to initialize everything we have to 0 or another meaningful value. A prime example is that of the timers, especially the sound timer. The sound timer beeps until its value is 0, and if it starts with a value of a few million, the emulator will beep without the game ever asking for it. The stack pointer also needs to be set to 0. As it's proper style to initialize everything, we'll do just that.

  chip->pc = 0x200;
  chip->i = 0;
  chip->stackp = 0;
  chip->delay_timer = 0;
  chip->sound_timer = 0;

  for(int i = 0; i <   16; i++) chip->v[i] = 0;
  for(int i = 0; i < 4096; i++) chip->memory[i] = 0;
  for(int i = 0; i <   16; i++) chip->stack[i] = 0;
  for(int i = 0; i < (64 * 32); i++) chip->display[i] = 0;

I set the program counter to 0x200 because, according to the specification on the Wiki page for Chip-8,

most programs written for the original system begin at memory location 512 (0x200)

The First Steps

We are now ready to start stepping the emulator. I'm going to omit loading a file into memory and assume that one is already loaded. However, I will remind the reader to load the programs into memory starting at address 0x200 when they write their file loading code.

According to the Chip-8 Wiki,

CHIP-8 has 35 opcodes, which are all two bytes long and stored big-endian.

So, the first thing we want to do in our step code is to combine the next two bytes at the program counter into a single short.

unsigned short int opcode = (chip->memory[chip->pc] << 8) | (chip->memory[chip->pc + 1]);

In this piece of code, we take the byte at the program counter, and shift it to the left 8 bits (the size of a byte). We then use binary OR to combine it with the next byte. For example, the bytes 0xA2 and 0xF0 combine into the opcode 0xA2F0.

Now that we have an opcode, we need to figure out how to decode it. Most opcodes are discerned by the first hexadecimal digit that makes them up (for example, the F in 0xFABC), so let's first isolate that first digit. We can do that by using the binary AND operation on the opcode, with the second operand being 0xF000.

unsigned short int head = opcode & 0xf000;

If we had an opcode with a value such as 0x1234, running 0x1234 & 0xF000 will give us 0x1000. This is exactly what we want to tell each opcode apart. We can now start implementing the instructions!

The first instruction listed on the Wiki page is:

0x00E0: Clears the screen.

So, in our step code, we need to check if the opcode starts with a 0. If it does, then the whole head variable will be a 0.

if(head == 0) { 
...
}

Next, though, we have a problem. There are two codes on the Wiki page that start with 0:

0x00E0: Clears the screen.

0x00EE: Returns from a subroutine.

So now, we need to check if the ending of the instruction is 0xE0 and not 0xEE.

if((opcode & 0x00ff) == 0xe0) {
...
}

Now, we can just clear the screen the way we did when we initialized.

for(int i = 0; i < (64 * 32); i++) chip->display[i] = 0;

All together, our code ends up being:

// Get the Opcode
unsigned short int opcode = (chip->memory[chip->pc] << 8) | (chip->memory[chip->pc + 1]);

// Isolate the first hexadecimal digit
unsigned short int head = opcode & 0xf000;

// Decode opcode
if(head == 0) { 
  if((opcode & 0x00ff) == 0xe0) {
    // 0x00E0: Clear the screen
    for(int i = 0; i < (64 * 32); i++) chip->display[i] = 0;
  }
}

// Increment the program counter; We don't want to execute the same thing twice. 
// We increment it by two because a single memory location is one byte, and an instruction is two bytes.
chip->pc += 2;
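To show the same decode-and-dispatch pattern on one more opcode, here's a sketch of 0x1NNN, which jumps to address NNN, written as another branch of the if chain above. Note that a jump must bypass the pc += 2 at the bottom; here I'm assuming the step logic lives in its own function, so an early return does the trick:

else if(head == 0x1000) {
  // 0x1NNN: Jump to address NNN.
  chip->pc = opcode & 0x0fff;
  return; // skip the pc += 2 below; a jump sets pc directly
}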

From here, we should be able to implement most of the instructions, as they are pretty basic. It's important that there is some independence here - though emulating a very popular thing like Chip-8 has plenty of tutorials, if we want to emulate something new, we need to be able to read the documentation and work from it alone. So, I leave the rest of the instructions up to you to implement. I might post another update with a guide on how to implement the draw instruction, as I had some trouble with it, but the documentation for Chip-8 is more than enough to help you with the opcodes on your own.

Learning Emulation, Part 2

Posted by Danila on November 23, 2016. Last edit on May 6, 2017

This is the second post in a series I'm writing about Chip-8 emulation. If you want to see the first one, head here.

Now that we have an understanding of the physical capabilities of a Chip-8 system, we can write code that will represent such a system on our computer. In this post we'll start writing some basic code - be prepared.

A Simple Representation

We'll start with the memory. Chip-8 has, as we remember, 4096 memory locations, all of which are bytes. We will create an array of 4096 bytes to match that.

unsigned char memory[4096];

If we have instructions in memory, we also need what's called a program counter, a variable that points to the next instruction in memory we have to execute. This variable never needs to be negative, and must hold values of up to 4095. The type we use for the program counter thus has to be bigger than a char (maximum unsigned value of 255). The next biggest type available is a short.

unsigned short int pc;

Next up, the registers. Each holds an 8-bit value, which is never negative. We can represent them with a 16-element array.

unsigned char v[16];

Note: I used v for the variable name as the registers are named V0-VF in the Chip-8 specs.

We also need to create the I register. As it's used to handle memory locations, it needs to be able to hold values up to 4095, just like the program counter. So we write it as a short integer.

unsigned short int i;

Now, the stack. The stack is, unsurprisingly, a stack data structure. According to the Wiki page on Chip-8,

modern implementations [of the stack] normally have at least 16 levels

Because of that, we're going to represent the stack as a 16-long array of short integers. These integers will represent locations in memory at which a jump was made to a subroutine. If this sounds confusing, it'll be explained when we implement it. Additionally, we'd need a stack pointer, which holds a number up to 15 and gives the index in the stack of the latest value that was added.

unsigned short int stack[16];
unsigned char stackp;
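To preview how these two will work together once we implement the opcodes (just a sketch of the idea for now; exactly which address gets pushed is a detail we'll pin down when we implement the instructions):

// Entering a subroutine at address addr:
stack[stackp++] = pc; // remember where we came from
pc = addr;

// Returning from the subroutine:
pc = stack[--stackp]; // jump back to the saved location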

We're almost there. We now need to add the timers. I do not remember seeing the size of these timers on the Wiki, so we'll represent them with generous integers.

unsigned int delay_timer;
unsigned int sound_timer;

Finally, we need a variable to hold the display pixels (which are true or false values). My code is in C, and C did not have a primitive boolean type out of the box until C99. Because I'm mildly crazy, I will avoid this new feature and use, instead, a char. If your language of choice supports booleans (and / or you don't exhibit compulsive behaviors), by all means use them. Anyway, since the display is 64x32, we can make a display variable like this:

unsigned char display[64 * 32];

If we were to use booleans, this might look something like:

bool display[64 * 32]; 
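One convention worth settling now (my choice; nothing in the spec mandates it): store the pixels row by row, so that the pixel at column x and row y lives at a predictable index.

// With a row-major layout, pixel (x, y) is found like this:
int x = 10, y = 5; // an example pixel position
unsigned char pixel = display[y * 64 + x];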

Those are all the variables that we have to keep track of! Let's group them under a single chip struct.

struct chip_s {
  unsigned char memory[4096];
  unsigned short int pc;
  unsigned char v[16];
  unsigned short int i;

  unsigned short int stack[16];
  unsigned char stackp;

  unsigned int delay_timer;
  unsigned int sound_timer;

  unsigned char display[32 * 64];
};

Since this is C, if we were to just use a struct we'd have to use the struct keyword every time. To avoid doing that, I'll use a typedef to define a chip_t type:

typedef struct chip_s chip_t;
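With the typedef in place, creating a machine and handing it to our functions looks like this (chip_init is a hypothetical name; we'll write the actual initialization in the next part):

chip_t chip;
chip_init(&chip); // hypothetical: zeroes out memory, registers and timers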

That's it for today! By the end of this post you should have a data type representing a simple Chip-8 machine. I think the next two posts will be independent of each other: one for basic rendering with GLFW, and one for actual Chip-8 code.

Learning Emulation, Part 1

Posted by Danila on November 23, 2016.

I've decided that the purpose of a blog is to actually use it every once in a while. So, to fill up this blank space, I'll be documenting my own experience of starting to learn how emulation works. I'd like to say right now that my main goal was not to learn emulation. Rather, I needed to emulate to refresh my skills for a different subject area. However, emulation turned out fun enough to write about.

Choosing a Target

To emulate is to mimic the behavior of something. In terms of video games, emulation attempts to run games made for a console - without the console. To do so, one must read and execute the game's code, which is written as machine code for the processor inside the console. To be able to execute the code, we must know exactly how the processor would do so.

At this point, you can probably imagine that emulating really complex processors would be, if nothing else, very tedious. They have huge numbers of different instructions, each of which would need to be implemented in our code. So, jumping too far ahead and trying to emulate something complex like a PS2 may leave you overwhelmed. When it comes to starting out, small, rewarding steps are best.

Like many people who are just starting to learn about emulation, I picked Chip-8 as the first thing I'm going to emulate. Chip-8 isn't actually a console by itself. It's an interpreted programming language intended to be run in a virtual machine. However, Chip-8 is very low-level, and running it will give us a good taste of what console emulation is like.

Doing Some Reading

Remember how I said we'd need to know exactly how the processor executes the game code? I wasn't lying. If we want to emulate Chip-8, we need to understand exactly how it was intended to be run. Fortunately, its Wikipedia page contains lots of information about it. Since Chip-8 isn't run on a console - it's an interpreted language - it's instead run in a "virtual machine". For now, we can think of a virtual machine as an equivalent to a console that exists only in our imagination. So, without further ado, let's start examining this virtual machine.

According to the Wiki page, Chip-8 was commonly implemented on machines with 4096 memory locations, all of which are a byte. However, since on these machines the "interpreter" - what actually reads and executes the Chip-8 code - occupied the first 512 bytes, programs typically start right after 512 bytes. The last 256 + 96 bytes are used "internally", so they're not accessed either. Chip-8 has 16 1-byte registers, named V0 through VF; the last register, VF, is used as a carry flag. Another register, I, is present in Chip-8 and is twice as big as the others, so that it can be used in operations involving memory. Chip-8 also has a stack with a maximum length of 16, used only for subroutines, as well as two timers counting down at 60 Hz. The display is 64x32 with pixels being either "on" or "off", and input is done through a hex keyboard with 16 keys.

Phew, that was a lot of information. Whoever might read this will probably need to take time to process this much information, and I'll need to take time to write more. So long, for now!