Post by monokr0me on Apr 30, 2012 21:51:37 GMT
Introduction to Programming the DCPU-16
The DCPU-16 is a versatile tool in the upcoming game 0x10c, which is being made by Notch (creator of Minecraft). In the game, the player pilots a spaceship in the distant future. This ship comes equipped with, among other things, the DCPU-16. But what exactly is the DCPU-16? It's a computer that does nothing; or at least, does nothing on its own. It is entirely programmable to do anything from playing a game of Pong to guiding missiles to enemy craft. To program it, however, one must use the archaic programming language known as "assembly".
What sets assembly apart from other programming languages is that instead of manipulating variables, strings, and built-in functions, one directly edits the memory values used by the computer to run formulas and algorithms. For example, let's compare a program written in Python to a program written in DCPU assembly. Both programs will simply print the word "hi" to the console.
Python
print("hi")
Assembly
SET [0x8000], 0xF068
SET [0x8001], 0xF069
:L SET PC, L
As you can see, assembly seems to make much less sense than Python. Fortunately, I am here to explain the basics. But before I explain assembly, I must explain how to use Base-2 (binary) and Base-16 (hexadecimal) numbers, and the relationships between them. If you already know and understand these concepts, skip past the next section.
Binary and Hexadecimal: Explained
Most humans are used to counting in Base-10. This means that each digit has a range of 10 numerals, 0-9. Once we reach that limit, we add another digit. Each digit represents a value 10x larger than the one before it: 1 is 10^0, 10 is 10^1, 100 is 10^2, and so forth. Binary is Base-2. Base-2 works just like Base-10, except each digit can only hold a range of two numerals, 0-1. In addition, every subsequent digit is worth 2x more than the previous one. So 1 is 2^0, 10 (two) is 2^1, 100 (four) is 2^2, 1000 (eight) is 2^3, and so on.
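The place-value rule is easy to try out in a few lines of Python (used here only because it already showed up earlier in this post). This is just a sketch; from_digits is a helper name I made up for illustration and is not part of any DCPU tool.

```python
def from_digits(digits, base):
    """Return the value of a digit string by summing digit * base**position."""
    total = 0
    # Walk the digits right to left so position 0 is the rightmost digit.
    for position, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** position
    return total


print(from_digits("100", 10))   # one hundred, read as Base-10
print(from_digits("100", 2))    # four, read as Base-2
print(from_digits("1000", 2))   # eight, read as Base-2
```

The helper only handles the numerals 0-9, so it works for binary and decimal but not for hexadecimal letters.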
Now, binary is the base in which computers count. However, binary numbers can get quite long. For example, the number 32, which is two digits in Base-10, has a whopping six digits in binary (100000). The numbers only get longer from there. Such long numbers are tedious for people to read and write. To solve this, programmers usually write them in Base-16, or hexadecimal; the computer itself still stores everything in binary. Each digit in hexadecimal goes from 0-15. Since we only have ten numerals, we use the letters A, B, C, D, E, and F as 10-15.
Now, the DCPU is described as being 16-bit. What does this mean? A bit is simply a binary value of 0 or 1; in other words, a binary digit. The number 1 takes one bit, the number 1010 takes 4 bits, and so on. Now, 16 is 2^4. This means that you can fit 4 bits of binary in a single hexadecimal digit. So, 1111 = F (or fifteen). The DCPU is 16-bit, which means it supports numbers up to 1111 1111 1111 1111, which is long in binary but can be shortened to FFFF in hexadecimal. Here is a table representing some common binary terms.
To tell the computer what base you are using, we put a prefix in front of the number: 0b#### denotes binary and 0x#### denotes hex (the first character is a zero in both cases). A number with no prefix is read as ordinary Base-10. The DCPU-16 system, with current knowledge, only supports decimal and hex notation, not binary. Mathematical operations in binary and hex work the same as in Base-10; if you need examples, just search the web. Just remember how much each digit is worth when you carry.
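Python happens to accept both the 0b and 0x prefixes, so the "four bits per hex digit" relationship can be checked directly. A rough sketch follows; nibbles and to_hex_digits are names I invented for this example.

```python
HEX_NUMERALS = "0123456789ABCDEF"


def nibbles(word):
    """Split a 16-bit word into four 4-bit groups, most significant first."""
    return [(word >> shift) & 0xF for shift in (12, 8, 4, 0)]


def to_hex_digits(word):
    """Turn each 4-bit group into the single hex numeral that represents it."""
    return "".join(HEX_NUMERALS[group] for group in nibbles(word))


largest = 0b1111111111111111       # sixteen 1 bits, the largest 16-bit value
print(largest == 0xFFFF)           # the same number written in hex
print(to_hex_digits(largest))      # each group of 1111 becomes one F
print(nibbles(0xF068))             # the four groups behind F, 0, 6, 8
```

Each group of four bits maps to exactly one hex digit, which is why a 16-bit word always fits in four hex digits.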
Using Assembly
To understand assembly, you must understand the basic structure of a computer. The two parts of the computer we will be focusing on are the CPU (Central Processing Unit) and the RAM (Random Access Memory). The CPU is what reads and performs the programs. The CPU contains an ALU (Arithmetic Logic Unit) and 11 'Registers', which are like built-in variables. These registers are A, B, C, X, Y, Z, I, J, PC, SP, and O. The A, B, C, X, Y, Z, I, and J registers have no specific purpose other than to contain values being used by the ALU. The PC (Program Counter) holds the RAM location of the instruction the CPU is reading, and the SP (Stack Pointer) and O (Overflow) registers I will detail later.
With only 8 variable registers, the CPU needs somewhere more permanent to store data. This is the purpose of the RAM; there are 0x10000 (or 65,536) different RAM stores, numbered 0x0000 through 0xFFFF, each of which can hold a 16-bit word. Some of this RAM has a specific purpose, however. For example, stores 0x8000 through 0x8200 are used for the video RAM (basically, to display things on the screen) and 0x9000 through 0x900F are used for the keyboard input buffer.
Now we will begin learning how to actually write assembly. Assembly has a number of operation codes, which tell the CPU what to do with numbers. The first and most basic OPcode is SET. SET is used to put a value into a register or memory store.
It is used in the format
SET <destination>, <value>
And an example:
SET A, 34
This would put the value 34 inside A.
To access RAM, you use square brackets to tell the CPU to use a number as a RAM location.
SET [0x1000], 34
You can even use the value held in a register as the RAM location:
SET A, 0x00FF
SET [A], 34
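If it helps to picture what SET is doing, here is a toy model in Python. It is only a sketch of the idea, not how any real emulator is written; the names registers, memory, set_register, and set_memory are all my own inventions.

```python
# Eight general registers, all starting at zero.
registers = {name: 0 for name in ("A", "B", "C", "X", "Y", "Z", "I", "J")}
# One slot per RAM store, 0x0000 through 0xFFFF.
memory = [0] * 0x10000


def set_register(name, value):
    """Model of SET <register>, <value>; keep only the low 16 bits."""
    registers[name] = value & 0xFFFF


def set_memory(address, value):
    """Model of SET [<address>], <value>; keep only the low 16 bits."""
    memory[address] = value & 0xFFFF


set_register("A", 0x00FF)           # SET A, 0x00FF
set_memory(registers["A"], 34)      # SET [A], 34
print(memory[0x00FF])               # the value landed in store 0x00FF
```

The square brackets in the assembly correspond to indexing into memory here: the register supplies the location, and the value goes into that store rather than into the register itself.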
Next are the mathematical OPcodes. These are ADD, SUB, MUL, DIV, and MOD. These are mostly self-explanatory, though DIV has a quirk. The DCPU only works with whole numbers, so DIV rounds the answer down (so 2.66 would be 2). MOD is like DIV except it gives the remainder (so 8 MOD 3 would give 2). When using mathematical OPcodes, the result is stored in the first mentioned register. So ADD A, B would store the result in A.
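Here is a small Python sketch of DIV and MOD, showing both the rounding down and the fact that the result replaces the first operand. The function names and the registers dictionary are made up for illustration, and the sketch does not try to handle dividing by zero.

```python
registers = {"A": 8, "B": 3}


def div(dest, src):
    """Model of DIV dest, src: whole-number division, result stored in dest."""
    registers[dest] = registers[dest] // registers[src]


def mod(dest, src):
    """Model of MOD dest, src: remainder of the division, stored in dest."""
    registers[dest] = registers[dest] % registers[src]


div("A", "B")                 # DIV A, B  (8 / 3 = 2.66..., rounded down)
after_div = registers["A"]

registers["A"] = 8            # put 8 back in A for the second example
mod("A", "B")                 # MOD A, B  (8 = 2 * 3 + 2)
after_mod = registers["A"]

print(after_div, after_mod, registers["B"])
```

Note that B is left untouched in both cases; only the first register changes.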
The logic OPcodes are a bit more tricky if you've never used logic gates before. These are AND, BOR, XOR, SHL, and SHR. These do calculations on a binary level.
AND, for starters, compares two (binary) numbers bitwise. It checks to see if the corresponding bits are both 1. Take, for example, 1001 AND 1010. Counting from the right, the first bits are 1 and 0, so the result's first bit is 0. The second bits are 0 and 1, so the result is again 0. The third bits are 0 and 0, so the result is 0. The fourth bits are 1 and 1, so the result is 1. The answer is 1000.
BOR (bitwise OR) is much simpler; it outputs a 1 wherever either number has a 1. So 1000 BOR 0110 would output 1110. XOR (exclusive OR) is similar, but with a key difference: it only outputs a 1 if the compared bits are unequal. So 1011 XOR 0010 would output 1001.
SHL and SHR are bit-shift functions, left and right. Bit shifting is simple; you move every bit in the direction of the shift and fill the emptied position with a 0. So SHL once on 1000 becomes 10000. SHR once on 0101 becomes 0010 (the rightmost bit falls off, since you cannot have decimals). Shifts can be done multiple times per operation; SHR 1000, 3 would shift 1000 to the right 3 times, giving you 0001.
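Python has the same bitwise operations built in (&, |, ^, << and >>), so the logic OPcodes can be tried out without an emulator. This is only a sketch; bits is a small helper I made up to print results with leading zeros.

```python
def bits(value, width=4):
    """Format a number as a binary string padded to the given width."""
    return format(value, "0{}b".format(width))


and_result = 0b1001 & 0b1010     # AND: 1 only where both bits are 1
bor_result = 0b1000 | 0b0110     # BOR: 1 where either bit is 1
xor_result = 0b1011 ^ 0b0010     # XOR: 1 only where the bits differ
shl_result = 0b1000 << 1         # SHL once: every bit moves left
shr_once = 0b0101 >> 1           # SHR once: the rightmost bit falls off
shr_thrice = 0b1000 >> 3         # SHR three times in one operation

print(bits(and_result))          # 1000
print(bits(bor_result))          # 1110
print(bits(xor_result))          # 1001
print(bits(shl_result, 5))       # 10000
print(bits(shr_once))            # 0010
print(bits(shr_thrice))          # 0001
```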
The final OPcodes are the comparisons: IFE (if equal), IFN (if not equal), and IFG (if greater). These are self-explanatory; if the condition being tested is not met, the next instruction is skipped.
Assembly is not linear, however. You can manipulate the PC register to send the CPU to a specific instruction, for example to run the same block repeatedly until a condition is met. To do this, you simply use SET PC, <value>. However, instead of trying to work out which RAM location each part of your program occupies, you can make a label. To make a label, simply start a line with a colon followed by a name.
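The "skip the next instruction" rule can be sketched with a tiny pretend CPU in Python. Everything here (run, value_of, the tuple format for instructions) is invented for illustration and only understands SET and the three comparisons; a real DCPU works on words in RAM, not Python tuples.

```python
import operator

# Each comparison OPcode mapped to the test it performs.
CONDITIONS = {"IFE": operator.eq, "IFN": operator.ne, "IFG": operator.gt}


def value_of(operand, registers):
    """Register names are strings; anything else is treated as a plain number."""
    if isinstance(operand, str):
        return registers[operand]
    return operand


def run(program):
    """Run a list of (opcode, first, second) tuples and return the registers."""
    registers = {name: 0 for name in "ABCXYZIJ"}
    pc = 0
    while pc < len(program):
        opcode, first, second = program[pc]
        pc += 1  # move on to the next instruction by default
        if opcode == "SET":
            registers[first] = value_of(second, registers)
        elif not CONDITIONS[opcode](value_of(first, registers),
                                    value_of(second, registers)):
            pc += 1  # condition not met: skip the next instruction too
    return registers


program = [
    ("SET", "A", 5),
    ("IFE", "A", 5),   # true, so the next line runs
    ("SET", "B", 1),
    ("IFG", "A", 9),   # false, so the next line is skipped
    ("SET", "C", 1),
    ("IFN", "A", 5),   # false, so the next line is skipped
    ("SET", "X", 1),
]
result = run(program)
print(result["B"], result["C"], result["X"])
```

Only B ends up changed, because its SET follows the one comparison that was true.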
:Start
SET PC, banana ;another helpful hint: semicolons denote comments; anything after a semicolon is ignored by the computer and so can be used to make notes
:banana
SET PC, Start ;labels are case-sensitive so be careful
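To give a feel for what a label really is, here is a rough Python sketch that strips comments and records where each label points. resolve_labels is a made-up name, and the sketch makes a simplifying assumption: it counts one position per instruction, whereas a real assembler counts RAM words and some instructions take more than one.

```python
def resolve_labels(lines):
    """Return (labels, instructions): labels maps each name to the position
    of the instruction that follows it."""
    labels = {}
    instructions = []
    for line in lines:
        code = line.split(";", 1)[0].strip()   # drop everything after a semicolon
        if code.startswith(":"):
            parts = code.split(None, 1)        # label name, then any instruction
            labels[parts[0][1:]] = len(instructions)
            code = parts[1] if len(parts) > 1 else ""
        if code:
            instructions.append(code)
    return labels, instructions


source = [
    ":Start",
    "SET PC, banana ; jump forward",
    ":banana",
    "SET PC, Start ; jump back",
]
labels, instructions = resolve_labels(source)
print(labels)
print(instructions)
```

Since the names are stored exactly as written, looking up "start" instead of "Start" would fail, which matches the warning about case sensitivity.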
Afterword
This is simply an introduction to DCPU-16 programming; not much can be done with what is contained here. I will, however, soon add more information on how to get input, display to the screen, and more. I will also reformat and reorganize the essay, as I typed this all in one 2-hour session and it is messy. In the meantime, I recommend you get a DCPU-16 emulator program and look up other tutorials to get started. Please point out to me any glaring mistakes that you find, and thank you for reading!