csapp_shelllab
Linking
- why linkers:
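- modularity: a program can be structured as a collection of smaller source files, and common functions can be packaged into libraries
- efficiency: after a partial change only the modified source file is recompiled before relinking; a library can hold many commonly used functions, yet only the ones actually used are linked in
- what linkers do:
- symbol resolution (symbols are global variables and functions; each is stored as a struct holding its information):
- symbol definitions are stored in the .o file's symbol table (an array of symbol entries)
- in this first step the linker associates each symbol reference with exactly one definition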

- relocation:
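- merges all code and data, changes the relative locations used in the .o files into absolute locations, and updates all references to the relocated symbols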



symbol resolution
- linker symbols:
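- global symbols: non-static global variables and functions
- external symbols: referenced by module m but defined by some other module
- local symbols: defined and referenced exclusively in module m (static)
- local non-static variables: stored on the stack
- local static variables: stored in .data or .bss
- rules
- strong symbols: procedures and initialized globals
- weak symbols: uninitialized globals
- multiple strong symbols with the same name are not allowed
- given one strong symbol and several weak ones, the linker chooses the strong symbol
- given multiple weak symbols, the linker picks an arbitrary one
- so avoid global variables where possible; if you need one, make it static if you can, always initialize it, and declare references to external variables with extern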
relocation


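- the compiler leaves placeholders in the code for the linker to patch
- explanation: there are two basic kinds, 32-bit absolute address relocation and 32-bit PC-relative relocation. In the PC-relative case the PC already points at the next instruction when the reference is used, so the offset is measured from there; the linker computes this offset itself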
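The arithmetic behind the two relocation kinds can be sketched in a few lines. This follows the formulas given in CS:APP (reference address = section address + offset; PC-relative value = symbol address + addend - reference address). The struct layout, the function names, and every address below are illustrative assumptions, not values from a real object file.

```c
#include <stdint.h>

/* Simplified relocation entry (hypothetical layout, for illustration only) */
struct reloc {
    uint64_t offset;      /* offset of the reference within its section */
    uint64_t symbol_addr; /* run-time address assigned to the symbol */
    int64_t  addend;      /* constant bias recorded by the compiler */
};

/* 32-bit PC-relative: symbol address + addend - address of the reference */
int32_t reloc_pc32(uint64_t section_addr, struct reloc r) {
    uint64_t refaddr = section_addr + r.offset;
    return (int32_t)(r.symbol_addr + (uint64_t)r.addend - refaddr);
}

/* 32-bit absolute: symbol address + addend */
uint32_t reloc_abs32(struct reloc r) {
    return (uint32_t)(r.symbol_addr + (uint64_t)r.addend);
}
```

With the made-up values section address 0x4004d0, offset 0xf, symbol address 0x4004e8, and addend -4, the reference sits at 0x4004df and the patched value is 0x5. The addend of -4 accounts for the PC pointing 4 bytes past the reference (at 0x4004e3) when the instruction runs, and 0x4004e3 + 0x5 reaches 0x4004e8.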
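packaging commonly used functions
static lib (.a archive files)
- each function's implementation is compiled into its own .o file, and many .o files are concatenated into one .a archive
- this allows incremental updates
- at link time the linker scans the .o and .a files in command-line order and maintains a list of unresolved references; if any remain after the scan, it reports an error
- the order of command-line arguments matters: if a library comes first, the unresolved list is still empty when the library is scanned (no unknown references yet), so nothing is taken from it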
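The scanning rule above can be modelled with a toy simulation. This is only a sketch under simplifying assumptions: symbols are bits in a mask, every archive has a single member, and the names (`link_scan`, `demo_order`, the `SYM_*` constants) are hypothetical.

```c
#include <stddef.h>

/* Symbols are modelled as bits in a mask (purely illustrative) */
#define SYM_MAIN 1u
#define SYM_FOO  2u

struct input {
    int is_archive;   /* 1 for a one-member .a archive, 0 for a plain .o file */
    unsigned defs;    /* symbols this file defines */
    unsigned refs;    /* symbols this file references */
};

/* Scan inputs left to right; return the mask of still-unresolved symbols */
unsigned link_scan(const struct input *in, size_t n) {
    unsigned defined = 0, unresolved = 0;
    for (size_t i = 0; i < n; i++) {
        /* An archive member is used only if it defines a symbol we still need */
        if (in[i].is_archive && !(in[i].defs & unresolved))
            continue;
        defined |= in[i].defs;
        unresolved |= in[i].refs;
        unresolved &= ~defined;
    }
    return unresolved;   /* nonzero would mean an "undefined reference" error */
}

/* main.o references foo, which only libfoo.a defines */
unsigned demo_order(int library_first) {
    struct input main_o = {0, SYM_MAIN, SYM_FOO};
    struct input libfoo = {1, SYM_FOO, 0};
    struct input order[2];
    order[0] = library_first ? libfoo : main_o;
    order[1] = library_first ? main_o : libfoo;
    return link_scan(order, 2);
}
```

Here `demo_order(0)` (object file first) leaves nothing unresolved, while `demo_order(1)` (library first) leaves `SYM_FOO` unresolved, which is why libraries usually go at the end of the command line.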

shared lib
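- drawbacks of traditional static libraries: almost every program needs libc, so 100 executables that use printf carry 100 copies of it on disk and again in memory when loaded, and any fix to the library means relinking every one of them
- modern approach: code and data can be linked into the application dynamically (DLL, .so)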


library interposition
- allows us to intercept calls to a function
- very useful (debugging, security)
- e.g.: tracing a program's malloc and free calls
- compile time
#ifdef COMPILETIME |

- -I: makes the compiler search the local directory first, so that malloc in the main program resolves to the version in mymalloc
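A single-file sketch of the compile-time idea, as a simplification of the setup in these notes: instead of a local malloc.h found through -I, the macros are defined directly in the same file. The wrapper `myfree`, the counter, and `run_traced` are hypothetical names used only for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>

static int traced_calls = 0;   /* number of intercepted calls */

/* The wrappers are defined before the macros below, so inside them
   malloc and free still refer to the real libc functions */
static void *mymalloc(size_t size) {
    void *ptr = malloc(size);
    traced_calls++;
    fprintf(stderr, "malloc(%zu) = %p\n", size, ptr);
    return ptr;
}

static void myfree(void *ptr) {
    fprintf(stderr, "free(%p)\n", ptr);   /* log before the pointer is released */
    traced_calls++;
    free(ptr);
}

/* From here on the preprocessor rewrites every call in this file */
#define malloc(size) mymalloc(size)
#define free(ptr) myfree(ptr)

/* "Application" code, unaware that its calls are being traced */
int run_traced(void) {
    int *p = malloc(32);
    if (p == NULL)
        return -1;
    free(p);
    return traced_calls;
}
```

Calling `run_traced()` goes through both wrappers once, so the counter ends at 2 and two trace lines appear on stderr.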


exceptional control flow
- control flow: a sequence of instructions (from power-on to shutdown, the CPU simply executes instructions one at a time)
- altering control flow:
- changes in program state: jumps and branches, procedure calls and returns
- changes in system state (exceptional control flow):
- low level:
- exceptions (a change in control flow in response to a system event, implemented by hardware together with OS software)
- higher level:
- process context switch
- signal
- nonlocal jump
exceptions
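- an exception is a transfer of control to the OS kernel in response to some event (i.e., a change in processor state)
- kernel: the part of the operating system that is always resident in memory
- examples of events: divide by zero, arithmetic overflow, I/O completion, page fault, the user typing Ctrl-C
- the system maintains an exception table: each type of exception has a unique number k, and when exception k occurs the corresponding handler is run
- asynchronous exceptions (interrupts)
- caused by events external to the processor
- indicated by setting the processor's interrupt pin
- the handler returns to the next instruction
- e.g.: timer interrupt: fires every few milliseconds and is used by the kernel to take control back from user code
- synchronous exceptions
- caused by events that occur as a result of executing an instruction
- traps: intentional; control returns to the next instruction (e.g. system calls, breakpoint traps)

- faults: unintentional but possibly recoverable; the handler either re-executes the faulting instruction or aborts (page fault, floating point exception)

- aborts: unintentional and unrecoverable (illegal instruction, parity error, machine check)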
process
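- definition: a process is a running instance of a program
- two key abstractions:
- logical control flow: each program seems to have exclusive use of the CPU; provided by a kernel mechanism called context switching
- private address space: each program seems to have exclusive use of main memory; provided by the kernel's virtual memory mechanism

- concurrent processes
- each process is a logical control flow; two flows are concurrent if they overlap in time (after a process has been switched out, the time until it finishes still counts as part of its flow)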
process control
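- calls made from your code that affect processes; on error they typically return -1 and set the global variable errno to indicate the cause
- rule: you must check the return value of every such call, except the few that return void
pid_t getpid(void);   // returns the PID of the current process
pid_t getppid(void);  // returns the PID of the parent process

- terminated: a process terminates when it receives a signal whose action is to terminate, returns from main, or calls exit() (which never returns)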
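A minimal sketch of the "always check the return value" rule, combined with getpid and getppid. The wrapper name `checked_fork` and the helper `child_sees_parent` are hypothetical; fork and waitpid are used here only as calls that can fail.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Error-checking wrapper (hypothetical name): exits if fork fails */
pid_t checked_fork(void) {
    pid_t pid = fork();
    if (pid < 0) {   /* -1 signals an error, errno says why */
        fprintf(stderr, "fork error: %s\n", strerror(errno));
        exit(1);
    }
    return pid;
}

/* Returns 1 if the child's getppid() matched the parent's getpid() */
int child_sees_parent(void) {
    pid_t parent = getpid();
    pid_t pid = checked_fork();
    if (pid == 0)                         /* child */
        _exit(getppid() == parent ? 0 : 1);
    int status;
    if (waitpid(pid, &status, 0) < 0) {   /* check this call as well */
        fprintf(stderr, "waitpid error: %s\n", strerror(errno));
        exit(1);
    }
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

The wrapper keeps the error handling in one place, so the calling code stays short without skipping the check.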


- reaping child processes
- zombie process: a terminated process still consumes system resources (exit status, various OS tables) until it is reaped

- wait: synchronizing with children
pid_t wait(int *child_status);
suspends the current process until one of its children terminates and returns the PID of that child; if the pointer is not NULL, the integer it points to is set to a value that indicates the reason for termination

pid_t waitpid(pid_t pid, int *status, int options);
waits for the child with a specific PID
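A short sketch of reaping several children with wait and decoding the status word. The function name `reap_children` and the exit codes are assumptions chosen for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks n children that exit at once, reaps them all, and returns how
   many terminated normally (or -1 on an unexpected error) */
int reap_children(int n) {
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(100 + i);          /* child: exit with a recognizable code */
    }

    int reaped = 0, status;
    pid_t pid;
    while ((pid = wait(&status)) > 0) {   /* children are reaped in no fixed order */
        if (WIFEXITED(status)) {
            printf("child %d exited with status %d\n",
                   (int)pid, WEXITSTATUS(status));
            reaped++;
        }
    }
    /* wait fails with ECHILD once no children are left */
    return errno == ECHILD ? reaped : -1;
}
```

After the loop no zombies remain, because every terminated child has had its status collected.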
- execve



shell


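- doing it this naive way has a problem: background children are never reaped and become zombies, and with enough of them this amounts to a memory leak
- solution: have the kernel alert us when a background job finishes (signals)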
signal

- send: the kernel sends a signal to a destination process by updating some state in the context of the destination process
- receive: a signal is received when the destination process is forced to react in some way to its delivery
- some ways to react:
- ignore: do nothing
- terminate the process
- catch the signal by executing a user-level function (signal handler)
- a signal is pending if it has been sent but not yet received
- important: at most one pending signal of any given type can exist at a time; signals are not queued
- a process can block a signal: a blocked signal can still be sent, but it is not received until it is unblocked
- the kernel maintains the pending and blocked bit vectors in the context of each process
- pending: the kernel sets bit k when a signal of type k is sent and clears it when the signal is received
- blocked: also called the signal mask
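The "signals are not queued" rule can be observed directly. In this sketch (function and handler names are hypothetical) SIGUSR1 is blocked, sent twice, and then unblocked; because the pending state is a single bit, the handler should run only once.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t handled = 0;

static void on_usr1(int sig) {
    (void)sig;
    handled++;
}

/* Sends SIGUSR1 twice while it is blocked; returns how many times the
   handler ran after unblocking, or -1 on error */
int count_deliveries(void) {
    struct sigaction sa;
    sigset_t mask, pending;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) < 0)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)   /* set the blocked bit */
        return -1;

    raise(SIGUSR1);   /* sets the pending bit */
    raise(SIGUSR1);   /* bit already set: this second signal is discarded */

    if (sigpending(&pending) < 0 || !sigismember(&pending, SIGUSR1))
        return -1;

    handled = 0;
    if (sigprocmask(SIG_UNBLOCK, &mask, NULL) < 0) /* pending signal is received */
        return -1;
    return handled;
}
```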
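- how to send a signal
- process groups: every process belongs to exactly one process group; see getpgrp and setpgid
- /bin/kill -<signal number> dest
- if dest is a positive number, it is a PID
- if dest is a negative number, its absolute value is a process group ID
- from the keyboard
- Ctrl-C sends SIGINT and Ctrl-Z sends SIGTSTP to the foreground process group
- from code: in C, call kill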
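A small sketch of sending a signal from C with kill. SIGKILL is used here because it cannot be ignored or caught, which keeps the example deterministic; the function name `kill_child` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that sleeps forever, terminates it with a signal, and
   returns the number of the signal that killed it (or -1 on error) */
int kill_child(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();              /* child: sleep until a signal arrives */
    }
    if (kill(pid, SIGKILL) < 0)   /* parent: send the signal to one PID */
        return -1;
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

As with /bin/kill, passing a negative value smaller than -1 as the first argument targets the process group whose ID is the absolute value.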
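- how to receive
- signals are received when the kernel switches to a process: kernel-level code has finished and control is about to pass back to that process

- installing a handler

- a signal handler is another concurrent flow

- blocking and unblocking
- first (implicit blocking): the kernel blocks any pending signals of the type currently being handled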

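- guidelines for writing safe handlers

- async-signal-safety: a function is async-signal-safe if it is either reentrant (e.g., all variables stored on the stack frame, CS:APP3e 12.7.2) or cannot be interrupted by signals
- printf has to acquire a lock; if a handler calls printf while the interrupted code is itself inside printf holding that lock, the result is a deadlock
- signals cannot be used to count events, so use wait correctly: put it in a loop and reap every terminated child in one handler invocation
- we cannot assume anything about the child: it may terminate before the parent has added it to the job list; the solution is to block the signal until the job has been added

- explicitly waiting for signals
- similar to a shell waiting for the foreground program to finish
- sigsuspend(mask): atomically replaces the signal mask with mask, suspends the process until a signal is caught, then restores the previous mask

- pause: suspends the program until a signal is received; it returns only after the signal has been caught and its handler has finished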
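The points above (reap in a loop, block SIGCHLD around fork, wait with sigsuspend) fit together roughly as follows. This is a sketch, not the lab solution: the names `reap_all` and `wait_for_children` are hypothetical, and a counter stands in for a real job list.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped = 0;

/* One SIGCHLD may stand for several dead children, so reap in a loop */
static void reap_all(int sig) {
    int saved_errno = errno;              /* a handler must not clobber errno */
    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Forks n children and sleeps until the handler has reaped all of them */
int wait_for_children(int n) {
    struct sigaction sa;
    sigset_t mask, prev, wait_mask;
    int started = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = reap_all;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, &prev);   /* block before fork: no race */
    reaped = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);                       /* child may finish immediately */
        if (pid > 0)
            started++;                      /* a shell would add the job here */
    }

    wait_mask = prev;
    sigdelset(&wait_mask, SIGCHLD);         /* mask used while sleeping */
    while (reaped < started)
        sigsuspend(&wait_mask);             /* atomically unblock and sleep */

    sigprocmask(SIG_SETMASK, &prev, NULL);  /* restore the original mask */
    return reaped;
}
```

Because SIGCHLD stays blocked between the check of `reaped` and the call to sigsuspend, a child that exits in that window cannot cause a lost wakeup, which is the problem a plain pause loop would have.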
shell lab
- the first word on the command line is either the name of a built-in command or the pathname of an executable file
- built-in command: runs in the current process
- pathname: fork a child process and execute the file in the context of the child
- a job can consist of multiple child processes connected by pipes
- I/O redirection: remember to restore the original descriptors afterwards
- stdin: a file descriptor, usually a small non-negative integer, that identifies an open file
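The fork-then-execute path for a non-built-in command might look like the sketch below. It is not the tsh.c code; `run_foreground` and `demo_exit_status` are hypothetical names, and the demo assumes /bin/sh exists on the system.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Runs argv[0] as a foreground job; returns its exit status, or -1 */
int run_foreground(char *const argv[]) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {                       /* child: becomes the new program */
        execve(argv[0], argv, environ);
        perror(argv[0]);                  /* reached only if execve failed */
        _exit(127);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)     /* foreground: wait for the child */
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/* Example job: ask /bin/sh (assumed to exist) to exit with status 3 */
int demo_exit_status(void) {
    char *argv[] = {"/bin/sh", "-c", "exit 3", NULL};
    return run_foreground(argv);
}
```

A real shell would also put the child in its own process group and handle background jobs through SIGCHLD instead of waiting inline.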
// tsh.c |
- races
void eval_none(struct cmdline_tokens tok, int bg, char* cmdline) { |
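- sigsuspend: suspends the calling thread until an unmasked signal is caught
- a suspended process: execution of the current process is paused
- a pending signal: the signal has been sent but is not handled yet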
void sigchld_handler(int sig) { |
system level IO
unix IO
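- a Linux file is just a sequence of bytes (file extensions are not actually used to tell file types apart)
- file types: regular file, directory (index for a related group of files), socket (for communicating with a process on another machine)
- regular files:
- contain arbitrary data; applications distinguish text files from binary files (the kernel doesn't know the difference)
- end of line (EOL):
- Linux and macOS: LF (line feed), '\n' (0xa)
- Windows: CRLF (carriage return and line feed), '\r\n'
- carriage return: move back to the leftmost position; line feed: move down to the next line
- directory:
- consists of an array of links; each link maps a filename to a file
- . is a link to the directory itself, .. is a link to its parent
- root directory: /
- opening a file:
- open returns a file descriptor (a small integer)
- every process created by a shell starts with three open files (0: stdin, 1: stdout, 2: stderr)
- close: always check the return value (things go wrong easily in multithreaded programs)
- short counts (fewer bytes transferred than requested: nbytes < sizeof(buffer))
- occur when hitting EOF, when reading from a terminal, and when reading or writing over the network (data arrives in chunks)
- but never when reading or writing disk files (except at EOF)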
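solution
problems to solve: short counts (without a hand-written loop you lose data), interruption by signals, and low efficiency (every read enters the kernel)
- unbuffered RIO:
- read returns a short count only when it encounters EOF
- write never returns a short count
- buffered RIO:
- reads efficiently by caching part of the data in an internal memory buffer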
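The unbuffered read guarantee comes from a loop around read. The sketch below is written in the spirit of the RIO read routine but is not the CS:APP source; `read_fully` and `demo_pipe` are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads up to n bytes, looping over short counts; returns the number of
   bytes read (less than n only at EOF), or -1 on error */
ssize_t read_fully(int fd, void *buf, size_t n) {
    size_t left = n;
    char *p = buf;
    while (left > 0) {
        ssize_t got = read(fd, p, left);
        if (got < 0) {
            if (errno == EINTR)   /* interrupted by a signal handler: retry */
                continue;
            return -1;
        }
        if (got == 0)             /* EOF */
            break;
        left -= (size_t)got;
        p += got;
    }
    return (ssize_t)(n - left);
}

/* Asks for 16 bytes from a pipe that holds only 5 before EOF */
int demo_pipe(void) {
    int fds[2];
    char buf[16];
    if (pipe(fds) < 0)
        return -1;
    if (write(fds[1], "hel", 3) != 3 || write(fds[1], "lo", 2) != 2)
        return -1;
    close(fds[1]);                /* reader sees EOF after the 5 bytes */
    ssize_t got = read_fully(fds[0], buf, sizeof buf);
    close(fds[0]);
    return (int)got;
}
```

The caller sees a short count only because the pipe reached EOF, never because a single read happened to return early.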

metadata and sharing
- file metadata

man stat: the shell command
man 2 stat: the system call
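A minimal sketch of querying file metadata with the stat system call. The function name `file_kind` and its one-letter return codes are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 'd' for a directory, 'r' for a regular file, 'o' for any other
   type, or '?' if stat fails (for example when the path does not exist) */
char file_kind(const char *path) {
    struct stat st;
    if (stat(path, &st) < 0)
        return '?';
    if (S_ISDIR(st.st_mode))
        return 'd';
    if (S_ISREG(st.st_mode))
        return 'r';
    return 'o';
}
```

The same struct also carries the size (st_size), permissions (in st_mode), and timestamps of the file.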

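- maintaining refcnt tells the kernel which open files are no longer needed
- two different file descriptors can refer to the same file and still have different file positions
- how processes share files: fork
- note that the child and the parent share the same open file table entries
- e.g.: if an operation in the parent moves the file position (pos) of a descriptor, the child's next operation starts from the new pos
- I/O redirection: dup2 decrements the reference count of the old file and increments that of the new one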
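A sketch of redirecting stdout with dup2 and restoring it afterwards. A pipe stands in for the target file so that the result can be read back; the function name `redirect_demo` is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <unistd.h>

/* Temporarily redirects stdout into a pipe, writes to it, restores stdout,
   and returns how many bytes ended up in the pipe (or -1 on error) */
int redirect_demo(void) {
    int fds[2];
    char buf[8];

    if (pipe(fds) < 0)
        return -1;
    int saved = dup(STDOUT_FILENO);        /* keep a reference to the old stdout */
    if (saved < 0)
        return -1;
    if (dup2(fds[1], STDOUT_FILENO) < 0)   /* descriptor 1 now refers to the pipe */
        return -1;
    close(fds[1]);                         /* the pipe stays open via descriptor 1 */

    ssize_t written = write(STDOUT_FILENO, "hi", 2);

    if (dup2(saved, STDOUT_FILENO) < 0)    /* restore; closes the last write end */
        return -1;
    close(saved);

    if (written != 2) {
        close(fds[0]);
        return -1;
    }
    ssize_t n = read(fds[0], buf, sizeof buf);
    close(fds[0]);
    return (int)n;
}
```

Each dup or dup2 adds a reference to an open file, and each close or overwritten descriptor removes one, which is the refcnt bookkeeping described above.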
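standard IO
- a set of higher-level I/O functions in the C standard library (not a good fit for network sockets, though)
- models open files as streams

- use the raw Unix I/O functions when you need the highest performance or inside signal handlers (they are async-signal-safe)
- note: do not use text-oriented functions (scanf, fgets, etc.) on binary files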



