MIT Operating Systems Lab: MIT JOS lab1

2024-07-25 12:38:26

JOS lab1

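Heh, the lab environment is quite friendly. Most things are already set up, so you can focus on checking the theory against practice.

MIT really is a school that changes and leads the world. My heart longs for it, even if I cannot get there myself~

In China, the OS labs at Peking University, Shanghai Jiao Tong University and other schools use JOS directly; evidence of that is easy to find... what that says goes without saying...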

--------------------------------------------------------------------------------------------------------------------------------------------

Part 1: PC Bootstrap

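This part is a simple introduction to debugging the kernel with qemu and gdb together...

Open two terminals and cd into the lab directory in both. Run make qemu-gdb in one and make gdb in the other, and gdb connects to the QEMU instance.

Interestingly, while the BIOS is running, the system has not yet set up a stack and gdb has no symbol information for the code, so the step and next commands cannot be used (they need source-line and frame information). Only stepi, which executes a single assembly instruction, works. gdb also prints ?? () to say that the current instruction is not inside any known function.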

Part 2: The Boot Loader

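Once the BIOS has finished its work, the boot loader starts executing; it is the boot loader that later loads the kernel.

The BIOS jumps to address 0x7C00.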

When the BIOS finds a bootable floppy or hard disk, it loads the 512-byte boot sector into memory at physical addresses 0x7c00 through 0x7dff, and then uses a jmp instruction to set the CS:IP to 0000:7c00 , passing control to the boot loader.

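Switching from real mode to protected mode changes addresses from 16 bits to 32 bits! You can also see this in gdb, where the displayed address changes from the [0:7c2d] form to the 0x7c32 form.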
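The [0:7c2d] form is a real-mode segment:offset pair. As a minimal sketch (the helper name real_mode_phys is made up for illustration and is not part of JOS), the physical address is the segment shifted left by four bits plus the offset:

```c
#include <stdint.h>

/* Hypothetical helper, not part of JOS: translate a real-mode
 * segment:offset pair into a physical address. */
uint32_t real_mode_phys(uint16_t seg, uint16_t off)
{
    /* segment * 16 + offset (the A20 wrap-around at 1MB is ignored) */
    return ((uint32_t)seg << 4) + off;
}
```

For example, real_mode_phys(0x0000, 0x7c2d) is 0x7c2d, and real_mode_phys(0xf000, 0xfff0) is 0xffff0. After the switch to protected mode gdb shows a single flat 32-bit address instead.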

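Next the boot loader sets up the data segment, code segment and so on for protected mode, and then calls bootmain. Note that the stack is set up before the call to bootmain!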

movl $start, %esp

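This is the earliest stack we see (strictly speaking it is the boot loader's stack; the kernel sets up its own stack later in entry.S).

This part asks you to answer a few questions: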

Be able to answer the following questions:
At what point does the processor start executing 32-bit code? What exactly causes the switch
from 16- to 32-bit mode?

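The processor starts executing 32-bit code at the switch from real mode to protected mode: boot.S sets the PE bit in CR0 and then performs a far jump (ljmp) that reloads CS with a 32-bit code segment.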

What is the last instruction of the boot loader executed, and what is the first instruction of the
kernel it just loaded?

The last line of the boot loader is the call through ELFHDR->e_entry at the end of bootmain.

Where is the first instruction of the kernel?

First we have to locate where the ELFHDR->e_entry above points. ELFHDR itself points to 0x10000 (cast to struct Elf *).

ELFHDR is initialized here by readseg; the data comes from the kernel image on disk.
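As a rough sketch of what the boot loader gets once the header has been read to 0x10000, the snippet below pulls the magic number and the entry point out of a raw ELF32 header. Instead of casting to struct Elf it assembles the fields from bytes, so it runs on any host. The helper names are made up for illustration; the offsets (magic at byte 0, e_entry at byte 24) are those of the ELF32 header.

```c
#include <stdint.h>

#define ELF_MAGIC 0x464C457FU   /* "\x7FELF" read as a little-endian word */

/* Hypothetical helper: assemble a little-endian 32-bit value from bytes. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Return the e_entry field, or 0 if the buffer is not an ELF image. */
uint32_t elf32_entry(const uint8_t *hdr)
{
    if (read_le32(hdr) != ELF_MAGIC)
        return 0;
    return read_le32(hdr + 24);   /* e_entry lives at byte offset 24 */
}
```

A header whose bytes 24..27 are 0c 00 10 00 yields the entry point 0x0010000c.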

So where do we find the location that ELFHDR->e_entry points to? Dump the kernel image with objdump!

objdump -x ./obj/kern/kernel

You will see that the kernel's start address is 0x10000c.

Set a breakpoint there and you will find that the kernel's first instruction is

movw $0x1234, 0x472

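We can confirm this in kern/entry.S, where this instruction appears.

The entry symbol in the kernel image points to the start of the code in entry.S.

In the objdump output you will see an entry symbol whose value is 0xf010000c. That is the kernel's entry address as linked into the image, and it does not conflict with the 0x10000c above: 0x10000c is derived from 0xF010000C.

This conversion used to be done by hand. I dug up the code for the same lab from 2009 and 2010.

Old code (left), current code (right)

The old code converts the address manually with &, while the 2014 code I am using has no such conversion. I was stuck on this for a long time...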

Many machines don't have any physical memory at address 0xf0100000, so we can't count on being able to store the kernel there. Instead, we will use the processor's memory management hardware to map virtual address 0xf0100000 (the link address at which the kernel code expects to run) to physical address 0x00100000 (where the boot loader loaded the kernel into physical memory). This way, although the kernel's virtual address is high enough to leave plenty of address space for user processes, it will be loaded in physical memory at the 1MB point in the PC's RAM, just above the BIOS ROM. This approach requires that the PC have at least a few megabytes of physical memory (so that physical address 0x00100000 works), but this is likely to be true of any PC built after about 1990.

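The hardware ends up mapping 0xf0100000 to 0x100000, and likewise 0xf010000c to 0x10000c. In the current code the entry point stored in the ELF header is already the physical address 0x10000c (entry.S defines _start as RELOC(entry)), so the boot loader no longer needs the mask; the virtual address 0xf010000c only becomes usable once entry.S turns on paging. In effect the manual conversion has been replaced by translation in hardware (which feels more obscure to me; I preferred the manual version... this cost me an hour).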

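The boot messages tell us the same thing (I had ignored this message before).

Later I realized how clueless I had been... the same information is visible in the objdump output. My own fault for not knowing...

Here VMA == virtual memory address and LMA == load memory address.

So 0xf0100000 is the virtual address, and what is actually used when loading is the LMA, the physical address.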
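The relationship between the two addresses is plain arithmetic. The sketch below assumes the kernel is linked at KERNBASE = 0xF0000000, which is the value I believe JOS uses; the function names are illustrative only. The first function is the subtraction form of the conversion, the second is the & 0xFFFFFF mask that, as far as I can tell, the older boot loader applied by hand.

```c
#include <stdint.h>

#define KERNBASE 0xF0000000U   /* assumed link base of the kernel */

/* Link (virtual) address -> load (physical) address by subtraction. */
uint32_t vma_to_lma(uint32_t va)
{
    return va - KERNBASE;
}

/* The manual conversion: keep only the low 24 bits of the entry address. */
uint32_t mask_entry(uint32_t e_entry)
{
    return e_entry & 0xFFFFFF;
}
```

Both give 0x10000c for the entry symbol 0xf010000c, and 0x100000 for the link address 0xf0100000.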

How does the boot loader decide how many sectors it must read in order to fetch the entire kernel from disk? Where does it find this information?

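I had some doubts here, because my first answer differed from other people's.

Some people say the boot loader works this out from the kernel image, using the information stored in the ELF file.

What I noticed in the main.c source is that the first read is fixed at 8 sectors, each of size SECTSIZE (512 bytes). That read only brings in the first 4096 bytes of the image, which contain the ELF header. The two views fit together: after this fixed read, bootmain walks the program headers (located through e_phoff and e_phnum) and reads each segment according to its ph->p_offset and ph->p_memsz, so the total number of sectors does come from the ELF headers.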

Back in boot/main.c, the ph->p_pa field of each program header contains the segment's destination physical address (in this case, it really is a physical address, though the ELF specification is vague on the actual meaning of this field).
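To make the sector arithmetic concrete, here is a small sketch modeled on the loop in readseg from boot/main.c. It does no disk I/O; it only counts how many sectors such a loop would read for a segment and where on disk it would start. The function names are hypothetical.

```c
#include <stdint.h>

#define SECTSIZE 512U

/* Number of sectors a readseg-style loop reads for `count` bytes
 * destined for physical address `pa`. */
uint32_t sectors_for_segment(uint32_t pa, uint32_t count)
{
    uint32_t end_pa = pa + count;
    uint32_t n = 0;

    pa &= ~(SECTSIZE - 1);      /* round down to a sector boundary */
    while (pa < end_pa) {       /* one sector read per iteration */
        pa += SECTSIZE;
        n++;
    }
    return n;
}

/* First disk sector for a byte offset into the kernel image; the image
 * starts at sector 1 because sector 0 holds the boot loader. */
uint32_t first_sector(uint32_t offset)
{
    return offset / SECTSIZE + 1;
}
```

With pa = 0x10000 and count = SECTSIZE * 8 this gives the fixed 8 sectors of the first read; for each program header the same calculation is driven by ph->p_pa, ph->p_memsz and ph->p_offset.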

Part 3: The Kernel

This part is mainly about adding some code.

First read through the files listed below.

Read through kern/printf.c , lib/printfmt.c , and kern/console.c (I read them as I went along...)

“We have omitted a small fragment of code - the code necessary to print octal numbers using patterns of the form "%o". Find and fill in this code fragment.”

Open printfmt.c and fill in the %o case.

Because most of the machinery is already in place, it is enough to model the octal case on the hexadecimal one.
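In the lab's printfmt.c the hexadecimal case fetches the argument, sets the base to 16 and falls into the shared number-printing code, so the octal case presumably only needs the base set to 8. The standalone sketch below is not the JOS source; it uses a made-up fmt_unsigned helper to show that one routine parameterized by base covers both cases.

```c
#include <stddef.h>

/* Hypothetical stand-in for the shared number-printing code: write `num`
 * in the given base (2..16) into buf as a NUL-terminated string.
 * Returns the number of digits written; buf must hold at least 65 bytes. */
size_t fmt_unsigned(unsigned long long num, unsigned base, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[64];
    size_t n = 0, i;

    do {                        /* collect digits, least significant first */
        tmp[n++] = digits[num % base];
        num /= base;
    } while (num != 0);

    for (i = 0; i < n; i++)     /* reverse them into the output buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```

Calling fmt_unsigned(6828, 8, buf) leaves "15254" in buf, while fmt_unsigned(255, 16, buf) leaves "ff".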

Be able to answer the following questions:
1. Explain the interface between printf.c and console.c . Specifically, what function does console.c
export? How is this function used by printf.c ?

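The point here is that all the printf-related functions in JOS are essentially a thin wrapper: putch in printf.c calls cputchar, which console.c exports.

In addition, the printf implementation relies on variable-length argument lists.

I describe this technique in detail here: http://blog.csdn.net/cinmyheart/article/details/24582895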
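The layering can be sketched in a few lines. This is a toy, not the JOS code: console_putc stands in for the character-output function the console layer exports, the formatter understands only %c and %s, and every name except putch is invented for the example.

```c
#include <stdarg.h>
#include <stddef.h>

char console[128];        /* stand-in for the screen */
size_t console_len;

/* Stand-in for the character-output function exported by the console. */
void console_putc(int c)
{
    if (console_len + 1 < sizeof(console)) {
        console[console_len++] = (char)c;
        console[console_len] = '\0';
    }
}

/* Callback handed to the formatter: emit one character and count it. */
void putch(int ch, int *cnt)
{
    console_putc(ch);
    (*cnt)++;
}

/* Toy formatter: walks fmt and pulls arguments out of ap on demand. */
int mini_vcprintf(const char *fmt, va_list ap)
{
    int cnt = 0;

    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            putch(*fmt, &cnt);
            continue;
        }
        fmt++;
        if (*fmt == '\0') {
            break;                          /* stray '%' at end of string */
        } else if (*fmt == 'c') {
            putch(va_arg(ap, int), &cnt);
        } else if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s)
                putch(*s++, &cnt);
        } else {
            putch(*fmt, &cnt);              /* unknown: print literally */
        }
    }
    return cnt;
}

/* The variadic shell: capture the arguments and delegate. */
int mini_cprintf(const char *fmt, ...)
{
    va_list ap;
    int cnt;

    va_start(ap, fmt);
    cnt = mini_vcprintf(fmt, ap);
    va_end(ap);
    return cnt;
}
```

mini_cprintf("a%sc%c", "b", 'd') returns 4 and leaves "abcd" in console; every character reaches the "screen" through putch and nothing else.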

2. Explain the following from console.c :

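This code checks whether the screen's output buffer is full. Note that memmove copies n bytes from the address given by its second argument to the address given by its first argument, where n is its third argument.

If the buffer is full, every row is moved up by one, overwriting the first row and freeing the last one. The for loop then fills the last row with ' ' (spaces), and finally crt_pos is moved back to the start of the last row!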
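The scrolling logic described above can be sketched on a toy screen. The dimensions and the names screen, cursor and scroll_if_full are made up for the example (the real text screen is 80x25); the structure follows the description: one memmove, one fill loop, one cursor adjustment.

```c
#include <stdint.h>
#include <string.h>

#define COLS 4                  /* toy dimensions */
#define ROWS 3
#define SIZE (COLS * ROWS)

uint16_t screen[SIZE];          /* one 16-bit cell per character */
int cursor;                     /* index of the next cell to write */

/* Scroll up one row when the cursor has run off the end of the buffer. */
void scroll_if_full(void)
{
    int i;

    if (cursor >= SIZE) {
        /* rows 1..ROWS-1 move up to rows 0..ROWS-2 */
        memmove(screen, screen + COLS, (SIZE - COLS) * sizeof(uint16_t));
        /* blank the last row: attribute 0x07 in the high byte, space below */
        for (i = SIZE - COLS; i < SIZE; i++)
            screen[i] = 0x0700 | ' ';
        cursor -= COLS;
    }
}
```

memmove rather than memcpy is the right call here because the source and destination ranges overlap.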

3. For the following questions you might wish to consult the notes for Lecture 2. These notes cover GCC's calling convention on the x86.

Trace the execution of the following code step-by-step:

int x = 1, y = 3, z = 4;
cprintf("x %d, y %x, z %d\n", x, y, z);

In the call to cprintf() , to what does fmt point? To what does ap point?

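fmt points to the format string "x %d, y %x, z %d\n". ap is a va_list that points to the first variadic argument on the stack, here x.

But where is this code? I never found it... I will update this when I do.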

List (in order of execution) each call to cons_putc , va_arg , and vcprintf . For cons_putc , list its argument as well. For va_arg , list what ap points to before and after the call. For vcprintf list the values of its two arguments.

4. Run the following code.

unsigned int i = 0x00646c72;
cprintf("H%x Wo%s", 57616, &i);

What is the output? Explain how this output is arrived at in the step-by-step manner of the previous exercise. Here's an ASCII table that maps bytes to characters.

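The output is He110 World

All I can say is... heh. The principle is simple: 57616 is e110 in hexadecimal, and the rest is printed byte by byte as ASCII.

Just note that the %s part prints whatever is stored at the address of i. Since this is a little-endian machine, the value of i is stored in memory in the order 72 6c 64 00, which corresponds to the ASCII characters r, l, d followed by a terminating NUL.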
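To see the byte order without depending on the host machine, the sketch below writes a 32-bit value into a byte array the way a little-endian CPU lays it out in memory. store_le32 is a made-up helper name.

```c
#include <stdint.h>

/* Store v into out[0..3] in little-endian order: least significant
 * byte at the lowest address. */
void store_le32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
    out[3] = (unsigned char)((v >> 24) & 0xff);
}
```

For 0x00646c72 the array becomes 72 6c 64 00, which read as a C string is "rld".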

The output depends on the fact that the x86 is little-endian. If the x86 were instead big-endian, what would you set i to in order to yield the same output? Would you need to change 57616 to a different value?
Here's a description of little- and big-endian and a more whimsical description.

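On a big-endian machine you would set i = 0x726c6400. 57616 does not need to change, because %x prints the value of the number, not its byte layout in memory.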
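The big-endian answer can be checked with the mirror-image layout. Again this is only a sketch with a made-up helper name, store_be32, which places the most significant byte at the lowest address.

```c
#include <stdint.h>

/* Store v into out[0..3] in big-endian order: most significant
 * byte at the lowest address. */
void store_be32(uint32_t v, unsigned char out[4])
{
    out[0] = (unsigned char)((v >> 24) & 0xff);
    out[1] = (unsigned char)((v >> 16) & 0xff);
    out[2] = (unsigned char)((v >> 8) & 0xff);
    out[3] = (unsigned char)(v & 0xff);
}
```

For 0x726c6400 the bytes come out as 72 6c 64 00, so %s would again see "rld".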

5. In the following code, what is going to be printed after 'y=' ? (note: the answer is not a specific value.) Why does this happen?
cprintf("x=%d y=%d", 3);

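A garbage value is printed after y=. Only one argument follows the format string, so when va_arg fetches a value for the second %d it reads whatever happens to sit in the next stack slot above the 3.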

6. Let's say that GCC changed its calling convention so that it pushed arguments on the stack in declaration order, so that the last argument is pushed last. How would you have to change cprintf or its interface so that it would still be possible to pass it a variable number of arguments?

Again, start by looking at how variable arguments are implemented:

#ifndef _STDARG_H
#define _STDARG_H

typedef char *va_list;

/* Amount of space required in an argument list for an arg of type TYPE.
   TYPE may alternatively be an expression whose type is used.  */

#define __va_rounded_size(TYPE)    (((sizeof (TYPE) + sizeof (int) - 1) / sizeof (int)) * sizeof (int))

#ifndef __sparc__
#define va_start(AP, LASTARG) (AP = ((char *) &(LASTARG) + __va_rounded_size (LASTARG)))
#else
#define va_start(AP, LASTARG) (__builtin_saveregs (), AP = ((char *) &(LASTARG) + __va_rounded_size (LASTARG)))
#endif

void va_end (va_list);		/* Defined in gnulib */
#define va_end(AP)

#define va_arg(AP, TYPE) (AP += __va_rounded_size (TYPE), *((TYPE *) (AP - __va_rounded_size (TYPE))))

#endif /* _STDARG_H */

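As you can see above, va_arg gets at each next argument by moving the address upward. This implementation silently assumes that the compiler pushes arguments from right to left. The stack grows from high addresses to low ones, so arguments pushed later sit at lower addresses; to read the arguments in left-to-right order, the compiler therefore has to push them in the opposite order, right to left.

If the compiler changed the push order, then to still fetch all the arguments correctly you would modify the va_start and va_arg macros above so that they compute the new address by subtraction. There is not much more to say here; analyze each case as it comes, it is not hard.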
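The change of direction can be illustrated with a toy cursor over an int array that stands in for the stack (index 0 is the lowest address). Nothing here is real stdarg machinery; arg_cursor, cursor_start and cursor_next are invented names. A step of +1 corresponds to the current macros, a step of -1 to the subtraction variant.

```c
/* Toy argument cursor over a simulated stack of int slots. */
struct arg_cursor {
    const int *slots;   /* the simulated stack, lowest address first */
    long idx;           /* slot holding the next argument */
    int step;           /* +1: later args at higher addresses, -1: lower */
};

/* Like va_start: begin at the slot next to the last named argument. */
struct arg_cursor cursor_start(const int *slots, long last_named, int step)
{
    struct arg_cursor c = { slots, last_named + step, step };
    return c;
}

/* Like va_arg: return the current slot, then move in direction `step`. */
int cursor_next(struct arg_cursor *c)
{
    int v = c->slots[c->idx];
    c->idx += c->step;
    return v;
}
```

With right-to-left pushing the slots hold {fmt, x, y, z} from low to high and the cursor starts after index 0 with step +1. With left-to-right pushing they hold {z, y, x, fmt} and the cursor starts below index 3 with step -1. Both walks return x, y, z in that order.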

To really understand the stack, the best thing is still to do lab 2, the bomb lab, from CS:APP~ Have fun getting blown up, in advance : )

On the question of colored console output:

Look at the cga_putc function,