You all know how an OS starts, right? Usually you push the power button, read the POST output on the screen, then booootttt - it has started!
Well, the idea is that after the POST (Power-On Self-Test) has completed successfully, the firmware invokes the bootstrap loader, which is part of a program we all know as the BIOS (Basic Input/Output System). At least that is what we used to call it on all IBM-compatible PCs.
There are several boot loaders on the market; the most common ones are listed in the second-stage section below. First, let's see how the boot process gets that far.
First stage boot loader
If the BIOS finds all the devices necessary to start the PC then it will usually load a small program (<= 512 bytes) from sector zero, aka the MBR (whose logical size is only 512 bytes), of your boot device (e.g. hard drive, floppy, CD-ROM, USB drive, etc.). The whole process up to this point is called "first-stage boot loading".
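To make "loading sector zero" a bit more concrete: the boot program shown later in this article states in its comments that the BIOS places the sector at address 0:7C00h. The sketch below is only an illustration of how a real-mode segment:offset pair maps to a physical address; the function names are mine, and it ignores the 20-bit wrap-around of the original 8086.

```python
SECTOR_SIZE = 512


def linear_address(segment: int, offset: int) -> int:
    """Real-mode physical address: segment * 16 + offset."""
    return (segment << 4) + offset


# Address where the BIOS places the boot sector (0000:7C00).
BOOT_LOAD_ADDRESS = linear_address(0x0000, 0x7C00)


def sector_offset_to_memory(sector_offset: int) -> int:
    """Map an offset inside the boot sector to its address once loaded."""
    if not 0 <= sector_offset < SECTOR_SIZE:
        raise ValueError("offset must be inside the 512-byte sector")
    return BOOT_LOAD_ADDRESS + sector_offset


print(hex(BOOT_LOAD_ADDRESS))               # 0x7c00
print(hex(linear_address(0x07C0, 0x0000)))  # 0x7c00, same byte, other segment
print(hex(sector_offset_to_memory(0x1FE)))  # 0x7dfe
```

This is why a boot program is assembled with an origin of 7C00h: every label inside the sector ends up at 7C00h plus its offset in the file.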
Second stage boot loader
So the program that resides in your disk's boot sector has been loaded into memory and the CPU is executing it. Now that the program has taken control of your PC, it is the mother and the father of everything. The most common boot loaders are: GNU GRUB (*nix systems), BOOTMGR (Windows Vista and later), Syslinux (Linux, right?) and NTLDR (Windows NT systems). They were designed to load the real OS into memory and then pass control to it. From that point on it is the OS that becomes the mother and the father of your PC. And finally, who controls the OS? Exactly: you. It means that your PC is your child, right? At least this is how I feel about mine, anyway.
But what do they look like? Not very complicated, since they have at most 512 bytes. Below you can find my boot loader, a boot program that I have written myself, but that is another story.
What is important here is that the sector must be exactly 512 bytes in size and that its last 2 bytes (bytes 511 and 512) must be 85 (55 in hex) and 170 (AA in hex), in that order. These 2 bytes are the "boot sector signature". Read as a little-endian 16-bit word they form AA55h.
When I said "exactly 512 bytes in size" I did not mean that the program itself has to be 512 bytes long; I meant only that on disk it must occupy a 512-byte sector whose last two bytes hold the signature (the word AA55h). In real life your boot loader program probably takes no more than 446 bytes on the disk, because the boot sector should also contain some information about the disk itself.
OK, back to the picture above: if you are not going to use your sector for keeping track of the partition table (because, for example, you have no partitions and it is just a dummy disk that you use for tests, etc.) then it is OK to use the whole sector, i.e. 512 bytes minus the 2 signature bytes. If you intend to install an OS on that disk, an OS that normally wants 1-4 primary partitions, then it is a good idea to use a disk layout like the one above.
If you take a look at my boot loader hexdump above (the "matrix" screen-shot) you will notice that I had no intention of using such a disk layout at all 😮
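The padding rule can be sketched in a few lines of Python. This is only an illustration under my own naming (`build_boot_sector` and `has_boot_signature` are hypothetical helpers, not part of any tool): it pads a short program with zeros up to byte 510 and appends the two signature bytes.

```python
SECTOR_SIZE = 512
SIGNATURE = bytes([0x55, 0xAA])  # on-disk order; the 16-bit word is AA55h


def build_boot_sector(program: bytes) -> bytes:
    """Pad a program with zeros and append the boot signature."""
    max_program_len = SECTOR_SIZE - len(SIGNATURE)  # 510 bytes
    if len(program) > max_program_len:
        raise ValueError("program does not fit in one boot sector")
    padding = bytes(max_program_len - len(program))
    return program + padding + SIGNATURE


def has_boot_signature(sector: bytes) -> bool:
    """True if the sector is 512 bytes long and ends with 55h AAh."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == SIGNATURE


# EB FE is a two-byte "jump to myself" instruction, i.e. an endless loop.
sector = build_boot_sector(bytes([0xEB, 0xFE]))
print(len(sector))                               # 512
print(hex(int.from_bytes(sector[510:], "little")))  # 0xaa55
print(has_boot_signature(sector))                # True
```

Writing the result to a file gives a sector that a BIOS would accept as bootable, even though it does nothing useful.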
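For readers who cannot see the picture, the classic layout it refers to is 446 bytes of boot code, four 16-byte partition entries and the 2-byte signature. A minimal sketch, with names of my own choosing, that cuts a raw sector into those three regions:

```python
from typing import List, NamedTuple

BOOTSTRAP_SIZE = 446   # room for the boot program
ENTRY_SIZE = 16        # one partition table entry
ENTRY_COUNT = 4        # up to four primary partitions


class MbrLayout(NamedTuple):
    bootstrap: bytes
    partition_entries: List[bytes]
    signature: bytes


def split_mbr(sector: bytes) -> MbrLayout:
    """Split a 512-byte sector into code, partition table and signature."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly 512 bytes long")
    table_end = BOOTSTRAP_SIZE + ENTRY_SIZE * ENTRY_COUNT  # 510
    table = sector[BOOTSTRAP_SIZE:table_end]
    entries = [table[i:i + ENTRY_SIZE] for i in range(0, len(table), ENTRY_SIZE)]
    return MbrLayout(sector[:BOOTSTRAP_SIZE], entries, sector[table_end:])


layout = split_mbr(bytes(510) + b"\x55\xaa")
print(len(layout.bootstrap), len(layout.partition_entries), layout.signature.hex())
```

446 + 4 * 16 + 2 = 512, which is where the "no more than 446 bytes" limit for the program comes from.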
Create bootloader Linux
OK, I have to admit that the "matrix" screen-shot could be a little intimidating, but at the end of the day it is just code:
; =========================================================================
; Project       : Check/set VT-x flag on Intel VT-x capable CPU
; Filename      : vtx-bootloader.asm
; Date          : Tuesday, 2012-09-04-12.53
; File version  : 0.0.0.1
; File revision : revision 1
; Author        : Eugen Mihailescu <eugenmihailescux at gmail dot com>
; Purpose/usage : Read the VT-x flag of your Intel CPU.
;
; Copyright     : This file is part of VTxBootloader.
;
; VTxBootloader is free software: you can redistribute it and/or modify it
; under the terms of the GNU General Public License as published by
; the Free Software Foundation, either version 3 of the License,
; or (at your option) any later version.
;
; VTxBootloader is distributed in the hope that it will be useful, but WITHOUT
; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
; FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
;
; You should have received a copy of the GNU General Public License along
; with VTxBootloader. If not, see http://www.gnu.org/licenses
; =========================================================================

bits 16                         ; generate 16-bit code

; The BIOS loads the boot sector at address 0:7C00h, so ORG 7C00h makes
; every reference to data resolve to the right offset.
org 7c00h

; =========================================================================
; Program global initialization
; =========================================================================
_init:
    ;call _clearScreen
    mov cx, 1                   ; we will write 1 character at a time
    mov bx, 000ah               ; page 0, colour attribute 0Ah (light green) for the int 10h calls below
    xor dx, dx
    mov ds, dx                  ; ensure ds = 0 (to let us load the messages)
    cld                         ; ensure direction flag is cleared (for LODSB)
    call _read_cursor_position  ; DH = current row, DL = current column

; =========================================================================
; Main program
; =========================================================================
_start:
    mov byte [str_len], str_msg2-str_msg1
    mov si, str_msg1            ; address of the first byte of str_msg1
    inc dh                      ; next row
    xor dl, dl                  ; column 0
    call _printOut

    mov byte [str_len], str_len-str_msg2
    mov si, str_msg2            ; address of the first byte of str_msg2
    inc dh
    xor dl, dl
    call _printOut

    call _checkVTx
    mov eax, [flag_vtx]
    bt eax, 2                   ; test bit 2 read from MSR 3Ah
    jc _end                     ; already set: nothing to do
    call _enableVTx             ; not set: try to enable it

_end:
    mov byte [str_len], flag_vtx-str_press_any_key
    mov si, str_press_any_key   ; address of the first byte of str_press_any_key
    inc dh
    inc dh
    xor dl, dl
    mov bx, 000ch               ; page 0, colour attribute 0Ch (light red)
    call _printOut
    mov ah, 0                   ; int 16h function 0: wait for a key
    int 16h
    cmp al, 0                   ; AL = 0 means an extended (special) key
    jz _end                     ; if so, ask again
    call _clearScreen
    int 18h                     ; hand control back to the BIOS

; =========================================================================
; Print routine
; In: SI = string, [str_len] = length, DH/DL = row/column, BX = page/attribute
; =========================================================================
_printOut:
    ; int 10h function 2: set cursor position; BH = page, DH = row, DL = column
    mov ah, 2
    int 10h
    mov cx, 1
    lodsb                       ; load one byte of the message into AL (DS:SI, DS = 0)
    ; int 10h function 9: write character and colour
    ; BH = page, AL = character, BL = attribute, CX = character count
    mov ah, 9
    int 10h
    inc dl                      ; advance cursor
    cmp dl, 80                  ; wrap around edge of screen if necessary
    jne _skip
    xor dl, dl
    inc dh
    cmp dh, 25                  ; wrap around bottom of screen if necessary
    jne _skip
    xor dh, dh
_skip:
    dec byte [str_len]
    cmp byte [str_len], 0       ; if we are not at the end of the message,
    jne _printOut               ; continue loading characters
    ret

; =========================================================================
; ClearScreen routine
; =========================================================================
_clearScreen:
    mov ax, 0600h               ; int 10h function 6, AL = 0: clear the whole window
    mov bh, 7                   ; fill attribute 07h (light grey on black)
    mov cx, 0                   ; top-left corner: row 0, column 0
    mov dx, 184fh               ; bottom-right corner: row 24 (18h), column 79 (4Fh)
    int 10h
    ; int 10h function 2: set cursor position; BH = page, DH = row, DL = column
    mov dx, 00h                 ; top-left of screen
    mov bh, 0
    mov ah, 2
    int 10h
    ret

; =========================================================================
; Check virtualization flag (VT-x flag) routine
; =========================================================================
_checkVTx:
    mov ecx, 0x3a               ; MSR 0x3A
    rdmsr                       ; read the MSR into EDX:EAX
    mov [flag_vtx], eax
    call _read_cursor_position
    mov dl, str_len-str_msg2    ; set the cursor at the end of str_msg2
    mov eax, [flag_vtx]
    cmp eax, 0
    je _default
_l0:
    mov eax, [flag_vtx]
    and eax, 1                  ; bit 0
    cmp eax, 1
    jne _l1
    call vtx_locked
_l1:
    mov eax, [flag_vtx]
    and eax, 2                  ; bit 1
    cmp eax, 2
    jne _l2
    call vtx_enabled
_l2:
    mov eax, [flag_vtx]
    and eax, 4                  ; bit 2
    cmp eax, 4
    jne _l3
    call vtx_disabled
_l3:
    ret
_default:
    mov byte [str_len], (str_vtx_changed-str_vtx_unknown)
    mov si, str_vtx_unknown
    call _printOut
    ret
vtx_locked:
    mov byte [str_len], (str_vtx_enabled-str_vtx_locked)
    mov si, str_vtx_locked
    call _printOut
    ret
vtx_enabled:
    mov byte [str_len], (str_vtx_disabled-str_vtx_enabled)
    mov si, str_vtx_enabled
    call _printOut
    ret
vtx_disabled:
    mov byte [str_len], (str_vtx_unknown-str_vtx_disabled)
    mov si, str_vtx_disabled
    call _printOut
    ret

; =========================================================================
; Enable VT-x capability
; Note: WRMSR faults if the lock bit (bit 0) is already set.
; =========================================================================
_enableVTx:
    mov ecx, 0x3a               ; MSR 0x3A
    rdmsr                       ; reload EDX:EAX (DX was reused for the cursor)
    bts eax, 0x2                ; enable VMX
    bts eax, 0x0                ; lock MSR 0x3A
    wrmsr                       ; write EDX:EAX back
    mov byte [str_len], str_press_any_key-str_vtx_changed
    mov si, str_vtx_changed
    mov dx, 0400h               ; set the cursor at row 4, column 0
    call _printOut
    ret

; =========================================================================
; Read cursor position (int 10h function 3): returns DH = row, DL = column
; =========================================================================
_read_cursor_position:
    mov ah, 03h
    mov bh, 0
    int 10h
    ret

; =========================================================================
; Program data (each string length is the distance to the next label)
; =========================================================================
str_msg1:           db "Checking your system for VT-x capabilities..."
str_msg2:           db "VT-x capability is "
str_len:            db 0
str_vtx_locked:     db "locked in BIOS; "
str_vtx_enabled:    db "enabled by BIOS; "
str_vtx_disabled:   db "customizable"
str_vtx_unknown:    db "unknown "
str_vtx_changed:    db "VT-x flag was set successfuly!"
str_press_any_key:  db "Press any key to reboot..."
flag_vtx:           dd 0        ; low 32 bits of MSR 0x3A

; =========================================================================
; Boot sector (aka bootloader) signature
; =========================================================================
times 0200h - 2 - ($ - $$) db 0 ; pad with zeros up to byte 510
dw 0AA55h                       ; stored as 55h, AAh
The purpose of this code was to read a flag from a special register inside an Intel CPU and to print a statement on the screen. Even though it is written to the boot sector and is able to run at boot time, it cannot be considered a genuine boot loader (a future version will be).
To be honest I was fooling around here, because I could have achieved that goal with only 10 lines of code, but it was fun to play with the keyboard anyway.
Don't get me wrong, writing a genuine boot loader takes more than that. I plan to write about it in the near future (a mini-OS in C++ and a boot loader in Assembly that will access the disk partition, load the mini-OS program from the disk and then run it; it will be fun, I promise!).
I was curious how the GRUB MBR looks compared with mine, so I dumped my SSD's MBR and it looks like this:
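The register in question is MSR 0x3A, which Intel documents as IA32_FEATURE_CONTROL. The sketch below decodes its low bits under the assumption that the layout is the one from Intel's manuals (bit 0 = lock, bit 1 = VMX allowed inside SMX operation, bit 2 = VMX allowed outside SMX operation). The function names are mine, and the on-screen messages of the boot program above use their own wording for these bits.

```python
LOCK_BIT = 1 << 0             # once set, the MSR cannot be written until reset
VMX_IN_SMX_BIT = 1 << 1       # VMX allowed inside SMX operation
VMX_OUTSIDE_SMX_BIT = 1 << 2  # VMX allowed outside SMX operation


def describe_feature_control(msr_low: int) -> dict:
    """Decode the three low bits of MSR 0x3A into named flags."""
    return {
        "locked": bool(msr_low & LOCK_BIT),
        "vmx_in_smx": bool(msr_low & VMX_IN_SMX_BIT),
        "vmx_outside_smx": bool(msr_low & VMX_OUTSIDE_SMX_BIT),
    }


def can_be_changed(msr_low: int) -> bool:
    """The MSR is writable only while the lock bit is still clear."""
    return not (msr_low & LOCK_BIT)


def with_vmx_enabled(msr_low: int) -> int:
    """Value the boot program writes back: set bit 2, then lock (bit 0)."""
    return msr_low | VMX_OUTSIDE_SMX_BIT | LOCK_BIT


print(describe_feature_control(0b101))
print(can_be_changed(0b000), can_be_changed(0b001))  # True False
print(bin(with_vmx_enabled(0b000)))                  # 0b101
```

A value of 5 (binary 101) is what many firmwares leave behind when virtualization is enabled in the setup screen: VMX on and the register locked.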
The thing I like about Assembly language, compared with others, is that (almost) every instruction is encoded into machine code by a one-to-one translation. I mean that one instruction (like "mov AX,BX") has its own specific encoding (for the hypothetical Y86 processor it is 110 00 001, which is 193 in decimal or C1 in hexadecimal). That means that if we disassemble a program and meet a byte with the value 193 (C1 in hexadecimal) at the start of an instruction, then we know for sure that the instruction sitting there is "mov AX,BX".
To help you understand how these things work, let's take a look at the table below (it applies only to the hypothetical Y86 processor):
The way an instruction is encoded works like this: the first three bits (a bit is 0 or 1; 8 bits = 1 byte/octet) represent the kind of instruction we want to encode. The next two bits represent the register it uses, and the last three bits represent either a register used as an operand or a memory address the instruction uses.
Let's debug my example:
- In my example (mov AX,BX) we want to encode an instruction of the kind "MOV register1, register2", which has the code 110 (look in the first table)
- Next: register1 from my "formula" is AX in my example, which has the code 00 (look in the second table)
- Next: register2 from my "formula" is BX in my example, which has the code 001 (look in the third table)
- If we put these bits in line (from left to right) we get 110 00 001, which is, like I said before, 193 in the decimal system or C1 in the hexadecimal system. More about binary to decimal here.
The encoding model differs from one processor to another. I took this one (the hypothetical Y86 processor) as a reference because its encoding model makes it simple to understand how things work under the hood. You can read more about this here.
Anyway, because there are so many processor types out there, and because they belong to different generations (so they can have totally different architectures), we are not going to write our own "Assembly to machine code encoders" or "machine code to Assembly decoders". We cannot do everything by ourselves, and that is why we use special programs called assemblers and disassemblers.
Personally I use NASM and NDISASM (the NASM disassembler) because I work only on the Intel x86 architecture.
Coming back to the subject touched on before: the same way we encode a human-readable instruction into a machine code/number, we can (in exactly the reverse way) decode a code/number from the machine code back into an Assembly instruction. To prove this I wrote a 5-line program in Assembly, assembled it with NASM, and then disassembled the resulting binary (machine code) back into Assembly language:
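The steps above can be replayed in a few lines of Python. This is only a sketch of the 3 + 2 + 3 bit packing described in the text; the three field values come from the example, while the names `encode` and `decode` are mine and nothing else about the Y86 tables is assumed.

```python
# Field values taken from the worked example above.
MOV_REG_REG = 0b110  # instruction kind "MOV register1, register2"
REG_AX = 0b00        # register1 = AX
OPERAND_BX = 0b001   # register2 = BX


def encode(kind: int, register: int, operand: int) -> int:
    """Pack the 3-bit kind, 2-bit register and 3-bit operand into one byte."""
    if not (0 <= kind < 8 and 0 <= register < 4 and 0 <= operand < 8):
        raise ValueError("field does not fit in its bit width")
    return (kind << 5) | (register << 3) | operand


def decode(byte: int) -> tuple:
    """Split a byte back into (kind, register, operand)."""
    return (byte >> 5) & 0b111, (byte >> 3) & 0b11, byte & 0b111


code = encode(MOV_REG_REG, REG_AX, OPERAND_BX)
print(f"{code:08b}")  # 11000001
print(code)           # 193  (128 + 64 + 1)
print(f"{code:X}")    # C1
print(decode(code))   # (6, 0, 1)
```

The decimal value is just the sum of the powers of two whose bits are set: 128 + 64 + 1 = 193.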
The original Assembly program
Using NASM I did:
nasm -f bin -o test test.asm
cpu 8086
add ax,bx
mov ax,1
mov ax,bx
ret
How it looks in machine code
At the console I did:
hexdump test
0000000 d801 01b8 8900 c3d8
0000008
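One thing that can confuse at first sight: by default hexdump prints 16-bit words in little-endian order, so each pair of bytes appears swapped. A small sketch (the helper name is mine) that turns the line above back into the byte order actually stored in the file:

```python
def hexdump_words_to_bytes(line: str) -> bytes:
    """Convert one line of default `hexdump` output back into raw bytes."""
    fields = line.split()
    words = fields[1:]  # the first field is the file offset
    raw = bytearray()
    for word in words:
        # Each word was printed as a little-endian 16-bit value.
        raw += int(word, 16).to_bytes(2, "little")
    return bytes(raw)


raw = hexdump_words_to_bytes("0000000 d801 01b8 8900 c3d8")
print(raw.hex(" "))  # 01 d8 b8 01 00 89 d8 c3
```

So the file really starts with 01 D8, which is the order a disassembler reads it in.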
The machine code disassembled back
Using NDISASM I did:
ndisasm test
00000000 01D8 add ax,bx
00000002 B80100 mov ax,0x1
00000005 89D8 mov ax,bx
00000007 C3 ret
What we can see clearly is that, contrary to other programming languages, when you assemble something and then disassemble it back into readable code you get about 99.9% of the initial source. On the other hand, if you know Assembly then you can read almost any program written by anyone (kidding). That is actually true, but it is not efficient, because the size and the complexity of the code can easily be a hundred times larger in Assembly than in a high-level language. But sometimes we need to do that. For instance, if I take a look at the 512-byte GRUB boot loader I will learn how GRUB works and what I can do to create my own boot loader (like GRUB) that does something a little different. Or maybe just use the same logic for my own mini-pseudo-OS, why not?
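To show that there is no magic in this reverse step, here is a toy decoder written for this article's 8 bytes only. It is a sketch, not a replacement for NDISASM: it assumes 16-bit code and knows just three encodings (opcodes 01h and 89h with a register-to-register ModRM byte, B8h..BFh with a 16-bit immediate, and C3h).

```python
REG16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def disassemble(code: bytes) -> list:
    """Decode a tiny subset of 16-bit x86 machine code into text."""
    lines = []
    i = 0
    while i < len(code):
        opcode = code[i]
        if opcode in (0x01, 0x89):           # ADD / MOV  r/m16, r16
            modrm = code[i + 1]
            mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
            if mod != 0b11:
                raise NotImplementedError("only register operands supported")
            name = "add" if opcode == 0x01 else "mov"
            lines.append(f"{name} {REG16[rm]},{REG16[reg]}")
            i += 2
        elif 0xB8 <= opcode <= 0xBF:         # MOV r16, imm16
            imm = int.from_bytes(code[i + 1:i + 3], "little")
            lines.append(f"mov {REG16[opcode - 0xB8]},{imm:#x}")
            i += 3
        elif opcode == 0xC3:                 # RET
            lines.append("ret")
            i += 1
        else:
            raise NotImplementedError(f"unknown opcode {opcode:#04x}")
    return lines


program = bytes([0x01, 0xD8, 0xB8, 0x01, 0x00, 0x89, 0xD8, 0xC3])
print(disassemble(program))
# ['add ax,bx', 'mov ax,0x1', 'mov ax,bx', 'ret']
```

The byte D8h is 11 011 000 in binary: both operands are registers, the source is register 3 (BX) and the destination is register 0 (AX), which is why 01 D8 and 89 D8 both end in "ax,bx".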
GRUB under the hood
00000000 EB48 jmp short 0x4a
00000002 90 nop
00000003 D0BC007C sar byte [si+0x7c00],1
00000007 8EC0 mov es,ax
00000009 8ED8 mov ds,ax
0000000B BE007C mov si,0x7c00
0000000E BF0006 mov di,0x600
00000011 B90002 mov cx,0x200
00000014 FC cld
00000015 F3A4 rep movsb
00000017 50 push ax
00000018 681C06 push word 0x61c
0000001B CB retf
0000001C FB sti
0000001D B90400 mov cx,0x4
00000020 BDBE07 mov bp,0x7be
00000023 807E0000 cmp byte [bp+0x0],0x0
00000027 7C0B jl 0x34
00000029 0F851001 jnz word 0x13d
0000002D 83C510 add bp,byte +0x10
00000030 E2F1 loop 0x23
00000032 CD18 int 0x18
00000034 885600 mov [bp+0x0],dl
00000037 55 push bp
00000038 C6461105 mov byte [bp+0x11],0x5
0000003C C6460302 mov byte [bp+0x3],0x2
00000040 FF00 inc word [bx+si]
00000042 0020 add [bx+si],ah
00000044 0100 add [bx+si],ax
00000046 0000 add [bx+si],al
00000048 0002 add [bp+si],al
0000004A FA cli
0000004B 90 nop
0000004C 90 nop
0000004D F6C280 test dl,0x80
00000050 7502 jnz 0x54
00000052 B280 mov dl,0x80
00000054 EA597C0000 jmp word 0x0:0x7c59
00000059 31C0 xor ax,ax
0000005B 8ED8 mov ds,ax
0000005D 8ED0 mov ss,ax
0000005F BC0020 mov sp,0x2000
00000062 FB sti
00000063 A0407C mov al,[0x7c40]
00000066 3CFF cmp al,0xff
00000068 7402 jz 0x6c
0000006A 88C2 mov dl,al
0000006C 52 push dx
0000006D BE7F7D mov si,0x7d7f
00000070 E83401 call word 0x1a7
00000073 F6C280 test dl,0x80
00000076 7454 jz 0xcc
00000078 B441 mov ah,0x41
0000007A BBAA55 mov bx,0x55aa
0000007D CD13 int 0x13
0000007F 5A pop dx
00000080 52 push dx
00000081 7249 jc 0xcc
00000083 81FB55AA cmp bx,0xaa55
00000087 7543 jnz 0xcc
00000089 A0417C mov al,[0x7c41]
0000008C 84C0 test al,al
0000008E 7505 jnz 0x95
00000090 83E101 and cx,byte +0x1
00000093 7437 jz 0xcc
00000095 668B4C10 mov ecx,[si+0x10]
00000099 BE057C mov si,0x7c05
0000009C C644FF01 mov byte [si-0x1],0x1
000000A0 668B1E447C mov ebx,[0x7c44]
000000A5 C7041000 mov word [si],0x10
000000A9 C744020100 mov word [si+0x2],0x1
000000AE 66895C08 mov [si+0x8],ebx
000000B2 C744060070 mov word [si+0x6],0x7000
000000B7 6631C0 xor eax,eax
000000BA 894404 mov [si+0x4],ax
000000BD 6689440C mov [si+0xc],eax
000000C1 B442 mov ah,0x42
000000C3 CD13 int 0x13
000000C5 7205 jc 0xcc
000000C7 BB0070 mov bx,0x7000
000000CA EB7D jmp short 0x149
000000CC B408 mov ah,0x8
000000CE CD13 int 0x13
000000D0 730A jnc 0xdc
000000D2 F6C280 test dl,0x80
000000D5 0F84EA00 jz word 0x1c3
000000D9 E98D00 jmp word 0x169
000000DC BE057C mov si,0x7c05
000000DF C644FF00 mov byte [si-0x1],0x0
000000E3 6631C0 xor eax,eax
000000E6 88F0 mov al,dh
000000E8 40 inc ax
000000E9 66894404 mov [si+0x4],eax
000000ED 31D2 xor dx,dx
000000EF 88CA mov dl,cl
000000F1 C1E202 shl dx,0x2
000000F4 88E8 mov al,ch
000000F6 88F4 mov ah,dh
000000F8 40 inc ax
000000F9 894408 mov [si+0x8],ax
000000FC 31C0 xor ax,ax
000000FE 88D0 mov al,dl
00000100 C0E802 shr al,0x2
00000103 668904 mov [si],eax
00000106 66A1447C mov eax,[0x7c44]
0000010A 6631D2 xor edx,edx
0000010D 66F734 div dword [si]
00000110 88540A mov [si+0xa],dl
00000113 6631D2 xor edx,edx
00000116 66F77404 div dword [si+0x4]
0000011A 88540B mov [si+0xb],dl
0000011D 89440C mov [si+0xc],ax
00000120 3B4408 cmp ax,[si+0x8]
00000123 7D3C jnl 0x161
00000125 8A540D mov dl,[si+0xd]
00000128 C0E206 shl dl,0x6
0000012B 8A4C0A mov cl,[si+0xa]
0000012E FEC1 inc cl
00000130 08D1 or cl,dl
00000132 8A6C0C mov ch,[si+0xc]
00000135 5A pop dx
00000136 8A740B mov dh,[si+0xb]
00000139 BB0070 mov bx,0x7000
0000013C 8EC3 mov es,bx
0000013E 31DB xor bx,bx
00000140 B80102 mov ax,0x201
00000143 CD13 int 0x13
00000145 722A jc 0x171
00000147 8CC3 mov bx,es
00000149 8E06487C mov es,[0x7c48]
0000014D 60 pushaw
0000014E 1E push ds
0000014F B90001 mov cx,0x100
00000152 8EDB mov ds,bx
00000154 31F6 xor si,si
00000156 31FF xor di,di
00000158 FC cld
00000159 F3A5 rep movsw
0000015B 1F pop ds
0000015C 61 popaw
0000015D FF26427C jmp word [0x7c42]
00000161 BE857D mov si,0x7d85
00000164 E84000 call word 0x1a7
00000167 EB0E jmp short 0x177
00000169 BE8A7D mov si,0x7d8a
0000016C E83800 call word 0x1a7
0000016F EB06 jmp short 0x177
00000171 BE947D mov si,0x7d94
00000174 E83000 call word 0x1a7
00000177 BE997D mov si,0x7d99
0000017A E82A00 call word 0x1a7
0000017D EBFE jmp short 0x17d
0000017F 47 inc di
00000180 52 push dx
00000181 55 push bp
00000182 42 inc dx
00000183 2000 and [bx+si],al
00000185 47 inc di
00000186 656F gs outsw
00000188 6D insw
00000189 004861 add [bx+si+0x61],cl
0000018C 7264 jc 0x1f2
0000018E 204469 and [si+0x69],al
00000191 736B jnc 0x1fe
00000193 005265 add [bp+si+0x65],dl
00000196 61 popaw
00000197 640020 add [fs:bx+si],ah
0000019A 45 inc bp
0000019B 7272 jc 0x20f
0000019D 6F outsw
0000019E 7200 jc 0x1a0
000001A0 BB0100 mov bx,0x1
000001A3 B40E mov ah,0xe
000001A5 CD10 int 0x10
000001A7 AC lodsb
000001A8 3C00 cmp al,0x0
000001AA 75F4 jnz 0x1a0
000001AC C3 ret
000001AD 0000 add [bx+si],al
000001AF 0000 add [bx+si],al
000001B1 0000 add [bx+si],al
000001B3 0000 add [bx+si],al
000001B5 0000 add [bx+si],al
000001B7 00361B04 add [0x41b],dh
000001BB BC0000 mov sp,0x0
000001BE 0000 add [bx+si],al
000001C0 0101 add [bx+di],ax
000001C2 831F20 sbb word [bx],byte +0x20
000001C5 41 inc cx
000001C6 0004 add [si],al
000001C8 0000 add [bx+si],al
000001CA 0004 add [si],al
000001CC 0100 add [bx+si],ax
000001CE 0000 add [bx+si],al
000001D0 014283 add [bp+si-0x7d],ax
000001D3 FE db 0xfe
000001D4 FF db 0xff
000001D5 FF00 inc word [bx+si]
000001D7 0801 or [bx+di],al
000001D9 0000 add [bx+si],al
000001DB 081A or [bp+si],bl
000001DD 0880FEFF or [bx+si-0x2],al
000001E1 FF07 inc word [bx]
000001E3 FE db 0xfe
000001E4 FF db 0xff
000001E5 FF00 inc word [bx+si]
000001E7 109B0800 adc [bp+di+0x8],bl
000001EB F0DF01 lock fild word [bx+di]
000001EE 00FE add dh,bh
000001F0 FF db 0xff
000001F1 FF82FEFF inc word [bp+si-0x2]
000001F5 FF00 inc word [bx+si]
000001F7 101B adc [bp+di],bl
000001F9 0800 or [bx+si],al
000001FB 00800055 add [bx+si+0x5500],al
000001FF AA stosb
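A word of caution when reading such a listing: a disassembler happily decodes data as if it were code. The last 66 bytes of the dump (offsets 1BEh to 1FFh) are the partition table and the boot signature, not instructions. The sketch below re-reads those bytes, copied by hand from the listing above, using the standard 16-byte MBR entry layout; the type names are my own short labels.

```python
import struct
from typing import NamedTuple

# Bytes 0x1BE..0x1FF copied from the listing above.
TAIL = bytes.fromhex(
    "00 00 01 01 83 1f 20 41 00 04 00 00 00 04 01 00 "
    "00 00 01 42 83 fe ff ff 00 08 01 00 00 08 1a 08 "
    "80 fe ff ff 07 fe ff ff 00 10 9b 08 00 f0 df 01 "
    "00 fe ff ff 82 fe ff ff 00 10 1b 08 00 00 80 00 "
    "55 aa"
)

TYPE_NAMES = {0x07: "NTFS/exFAT", 0x82: "Linux swap", 0x83: "Linux"}


class Partition(NamedTuple):
    bootable: bool
    type_id: int
    lba_start: int
    sector_count: int


def parse_partition_table(tail: bytes) -> list:
    """Decode four 16-byte MBR entries followed by the 55 AA signature."""
    if len(tail) != 66 or tail[64:] != b"\x55\xaa":
        raise ValueError("expected 64 table bytes plus the boot signature")
    partitions = []
    for i in range(0, 64, 16):
        # status, CHS of first sector, type, CHS of last sector, LBA, count
        status, _chs_first, type_id, _chs_last, lba, count = struct.unpack(
            "<B3sB3sII", tail[i:i + 16]
        )
        partitions.append(Partition(status == 0x80, type_id, lba, count))
    return partitions


for p in parse_partition_table(TAIL):
    name = TYPE_NAMES.get(p.type_id, "unknown")
    print(f"type {p.type_id:#04x} ({name}) start={p.lba_start} "
          f"sectors={p.sector_count} bootable={p.bootable}")
```

Read this way, lines such as "inc word [bx+si]" near the end turn out to be four partition entries: two Linux partitions, one NTFS partition marked as active and one Linux swap partition, the first of them starting at sector 1024. The same applies to offsets 17Fh to 19Fh, which hold the text "GRUB ", "Geom", "Hard Disk", "Read" and " Error" rather than instructions.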
To be continued...
Now, if you think that this article was interesting, don't forget to rate it. It shows me that you care, and then I will continue to write about these things.
Eugen Mihailescu