networkprogrammingP1

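lowest level: Ethernet segment
- a group of hosts in a room or a building, each connected by wire to a port on a hub
- each Ethernet adapter has a unique 48-bit address (MAC address); data is sent in chunks called frames
communication between devices of different standards: protocol software runs on the routers and hosts (provides a naming scheme and a delivery mechanism)
ipv4: 32-bit address stored in big-endian (network) byte order; it is mapped to a domain name
three-level domain names: first level (.com, .edu, etc.), second level (mit, berkeley, etc.), third level (www, etc.)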
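DNS (Domain Name System)
- every host has a locally defined domain name, localhost, which maps to the loopback address 127.0.0.1
- common commands: nslookup, hostname
- the mapping between domain names and addresses is not one-to-one; large sites run servers all over the world, so the same name may resolve to different addresses, and a valid domain name may even map to no address at all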
connection: client and server use it to send messages
- socket: endpoint of a connection, written ipaddress:port
- port: 16-bit integer that identifies a process (ephemeral ports vs. well-known ports; the well-known mapping is in /etc/services)
- a connection is identified by its socket pair
socket:
- for the kernel it is the endpoint of a connection
- for the application it is a file descriptor

int socket(int domain, int type, int protocol) //returns the fd
int bind(int sockfd, SA *addr, socklen_t addrlen)
server calls it to ask the kernel to associate the socket address with the file descriptor
int listen(int sockfd, int backlog)
tells the kernel that this descriptor will be used by a server (by default a socket is treated as a client socket)
int accept(int listenfd, SA *addr, socklen_t *addrlen)
server waits for a connection request from a client; returns a connected descriptor
int connect(int clientfd, SA *addr, socklen_t addrlen);
client calls it to establish a connection with the server at addr
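
The call sequence above, and the socket pair that identifies the resulting connection, can be sketched on the loopback interface. This is only an illustration: `loopback_connect` is a made-up helper, it uses the raw POSIX calls instead of the csapp.h wrappers, and it relies on the kernel completing the TCP handshake for a listening socket so that one thread can call `connect` before `accept`.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect a client to a listener on 127.0.0.1 and report both ends of
 * the connection (the socket pair). Returns 0 on success, -1 on error. */
int loopback_connect(struct sockaddr_in *client_end,
                     struct sockaddr_in *server_end)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int listenfd, clientfd, connfd = -1, rc = -1;

    assert(client_end != NULL && server_end != NULL);
    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0 || clientfd < 0)
        goto done;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);   /* 0: let the kernel pick an ephemeral port */

    /* server side: bind, listen, then look up the port that was chosen */
    if (bind(listenfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(listenfd, 1) < 0 ||
        getsockname(listenfd, (struct sockaddr *)&addr, &len) < 0)
        goto done;

    /* client side: connect; then the server accepts the queued connection */
    if (connect(clientfd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto done;
    if ((connfd = accept(listenfd, NULL, NULL)) < 0)
        goto done;

    /* socket pair = (client address:port, server address:port) */
    len = sizeof(*client_end);
    if (getsockname(clientfd, (struct sockaddr *)client_end, &len) < 0)
        goto done;
    len = sizeof(*server_end);
    if (getpeername(clientfd, (struct sockaddr *)server_end, &len) < 0)
        goto done;
    rc = 0;

done:
    if (connfd >= 0) close(connfd);
    if (clientfd >= 0) close(clientfd);
    if (listenfd >= 0) close(listenfd);
    return rc;
}
```

Note that the listening descriptor and the connected descriptor returned by accept are different descriptors; only the connected one belongs to the socket pair.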

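host and service conversion

to handle the many-to-many mapping between names and addresses, getaddrinfo returns a linked list of addrinfo structures

getnameinfo: the inverse of getaddrinfo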

networkprogrammingP2

//an example to get the ip address
#include "csapp.h" 
 
int main(int argc, char **argv) 
{ 
    struct addrinfo *p, *listp, hints; 
    char buf[MAXLINE]; 
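    int rc, flags; 

    /* Guard against a missing argument before argv[1] is used */
    if (argc != 2) { 
        fprintf(stderr, "usage: %s <domain name>\n", argv[0]); 
        exit(0); 
    } 
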
    /* Get a list of addrinfo records */ 
    memset(&hints, 0, sizeof(struct addrinfo)); 
    hints.ai_family = AF_INET;       /* IPv4 only */ 
    hints.ai_socktype = SOCK_STREAM; /* Connections only */ 
    if ((rc = getaddrinfo(argv[1], NULL, &hints, &listp)) != 0) { 
        fprintf(stderr, "getaddrinfo error: %s\n", gai_strerror(rc)); 
        exit(1); 
    } 
    /* Walk the list and display each IP address */ 
    flags = NI_NUMERICHOST; /* Display address instead of name */ 
    for (p = listp; p; p = p->ai_next) { 
        Getnameinfo(p->ai_addr, p->ai_addrlen,  
                    buf, MAXLINE, NULL, 0, flags); 
        printf("%s\n", buf); 
    } 
 
    /* Clean up */ 
    Freeaddrinfo(listp); 
 
    exit(0); 
}

the code for the client to establish a connection

int open_clientfd(char *hostname, char *port) { 
    int clientfd; 
    struct addrinfo hints, *listp, *p; 
 
    /* Get a list of potential server addresses */ 
    memset(&hints, 0, sizeof(struct addrinfo)); 
    hints.ai_socktype = SOCK_STREAM;  /* Open a connection */ 
    hints.ai_flags = AI_NUMERICSERV;  /* …using numeric port arg. */ 
    hints.ai_flags |= AI_ADDRCONFIG;  /* Recommended for connections */ 
    Getaddrinfo(hostname, port, &hints, &listp);

    /* Walk the list for one that we can successfully connect to */ 
    for (p = listp; p; p = p->ai_next) { 
        /* Create a socket descriptor */ 
        if ((clientfd = socket(p->ai_family, p->ai_socktype,  
                               p->ai_protocol)) < 0) 
            continue; /* Socket failed, try the next */ 
 
        /* Connect to the server */ 
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) != -1) 
            break; /* Success */ 
        Close(clientfd); /* Connect failed, try another */ 
    } 
 
    /* Clean up */ 
    Freeaddrinfo(listp); 
    if (!p) /* All connects failed */ 
        return -1; 
    else    /* The last connect succeeded */ 
        return clientfd; 
}

the code for the server to prepare a listening descriptor

int open_listenfd(char *port) 
{ 
    struct addrinfo hints, *listp, *p; 
    int listenfd, optval=1; 
 
    /* Get a list of potential server addresses */ 
    memset(&hints, 0, sizeof(struct addrinfo)); 
    hints.ai_socktype = SOCK_STREAM;             /* Accept connect. */ 
    hints.ai_flags = AI_PASSIVE | AI_ADDRCONFIG; /* …on any IP addr */ 
    hints.ai_flags |= AI_NUMERICSERV;            /* …using port no. */ 
    Getaddrinfo(NULL, port, &hints, &listp); 
 
    /* Walk the list for one that we can bind to */ 
    for (p = listp; p; p = p->ai_next) { 
        /* Create a socket descriptor */ 
        if ((listenfd = socket(p->ai_family, p->ai_socktype,  
                               p->ai_protocol)) < 0) 
            continue;  /* Socket failed, try the next */ 
 
        /* Eliminates "Address already in use" error from bind */ 
        Setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,  
                   (const void *)&optval , sizeof(int)); 
 
        /* Bind the descriptor to the address */ 
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0) 
            break; /* Success */ 
        Close(listenfd); /* Bind failed, try the next */ 
    } 

    /* Clean up */ 
    Freeaddrinfo(listp); 
    if (!p) /* No address worked */ 
        return -1; 
 
    /* Make it a listening socket ready to accept conn. requests */ 
    if (listen(listenfd, LISTENQ) < 0) { 
        Close(listenfd); 
        return -1; 
    } 
    return listenfd; 
}

server side: struct sockaddr_storage //large enough to hold any type of socket address

web server basic

http (hypertext transfer protocol): web content (semantics)
tcp: stream (reliable, complete byte stream)
ip: datagrams
content: a sequence of bytes with an associated MIME (multipurpose internet mail extensions) type
url (uniform resource locator): protocol, server, port, location of the file
http request: request line + headers
http response: response line + headers + content

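GET /cgi-bin/env.pl HTTP/1.1 (cgi-bin: indicates dynamic content; the server is asked to run a program)
- http://add.com/cgi-bin/adder?15213&18213: the argument list starts after ? and the arguments are separated by &
- the server passes the arguments in an environment variable (QUERY_STRING in CGI), then starts a child process to run the program; dup2 redirects the child's stdout to the connected descriptor so the output is written back to the client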
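
What the child program does with the argument list can be sketched as follows. This is an assumption-laden illustration of an adder-style CGI program, not the course's adder.c: `parse_two_args` and `adder_body` are made-up names, and only the `n1&n2` form from the URL above is handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Split "n1&n2" into two integers.
 * Returns 0 on success, -1 on malformed input. */
int parse_two_args(const char *query, long *n1, long *n2)
{
    const char *amp;
    char *end;

    assert(n1 != NULL && n2 != NULL);
    if (query == NULL || (amp = strchr(query, '&')) == NULL)
        return -1;
    *n1 = strtol(query, &end, 10);
    if (end == query || end != amp)        /* first number must end at '&' */
        return -1;
    *n2 = strtol(amp + 1, &end, 10);
    if (end == amp + 1 || *end != '\0')    /* second number must use the rest */
        return -1;
    return 0;
}

/* Build the response body from QUERY_STRING, the variable the server set
 * before running the program. Returns the body length, or -1 on error. */
int adder_body(char *out, size_t outlen)
{
    long n1, n2;

    if (parse_two_args(getenv("QUERY_STRING"), &n1, &n2) < 0)
        return -1;
    return snprintf(out, outlen, "%ld + %ld = %ld\r\n", n1, n2, n1 + n2);
}
```

Because the server has already redirected stdout with dup2, a real CGI program would simply print this body (after the response headers) with printf.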

concurrent programming

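some classical problems with concurrency
- races: the outcome depends on how the processes are scheduled (e.g. the child may already have terminated by the time the parent records it; unintended shared state)
- deadlock: waiting for a condition that will never occur
- starvation: one process is always scheduled while another never gets to run
iterative web server: the server blocks in read while waiting on one client, so it can only handle one client at a time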

method

approach1: spawn a separate process for each client
- must reap the children; the parent closes connfd, the children close listenfd
- simple, but high overhead
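approach2: event-based (I/O multiplexing)
- low overhead; a single thread is easy to debug, but it is hard to take advantage of multiple cores
- difficulty: how to handle a request header that may have arrived only partially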
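
One common way to deal with partial request headers in an event-based server is to keep a buffer per connection and only act once the blank line that ends the headers has arrived. A minimal sketch of that bookkeeping, assuming text-only headers; `struct conn_state`, `conn_feed` and `HDR_MAX` are made-up names, and the select/epoll loop that would call `conn_feed` after each read is left out:

```c
#include <assert.h>
#include <string.h>

#define HDR_MAX 8192   /* assumed upper bound on the header size */

struct conn_state {
    char buf[HDR_MAX];   /* header bytes received so far, NUL-terminated */
    size_t used;         /* number of bytes currently in buf */
};

/* Append the bytes from one read to the connection's buffer.
 * Returns 1 once a complete header (terminated by a blank line) is
 * buffered, 0 if more input is needed, -1 if the header is too large. */
int conn_feed(struct conn_state *c, const char *data, size_t n)
{
    assert(c != NULL && data != NULL);
    if (n > HDR_MAX - 1 - c->used)
        return -1;
    memcpy(c->buf + c->used, data, n);
    c->used += n;
    c->buf[c->used] = '\0';
    return strstr(c->buf, "\r\n\r\n") != NULL;
}
```

The event loop never blocks waiting for the rest of a header; it just returns to waiting on all descriptors until `conn_feed` reports 1 for some connection.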
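approach3: thread-based
- process = process context (program context + kernel context) + code + data + stack
- thread = thread context (program context + stack) + code + data + kernel context
- (in contrast to a sequential structure, the threads of a process can be viewed as a pool)
- thread-based echo server: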

#include "csapp.h"

void echo(int connfd);
void *thread(void *vargp);

int main(int argc, char **argv)
{
    int listenfd, *connfdp;
    socklen_t clientlen;
    struct sockaddr_storage clientaddr;
    pthread_t tid;

    listenfd = Open_listenfd(argv[1]);
    while (1) {
        clientlen = sizeof(struct sockaddr_storage);
        connfdp = Malloc(sizeof(int)); /* one heap slot per connection */
        *connfdp = Accept(listenfd, (SA *) &clientaddr, &clientlen);
        Pthread_create(&tid, NULL, thread, connfdp);
    }
}

/* Thread routine */
void *thread(void *vargp)
{
    int connfd = *((int *)vargp);
    Pthread_detach(pthread_self()); /* run independently; reaped automatically */
    Free(vargp);
    echo(connfd);
    Close(connfd);
    return NULL;
} /* echoservert.c */

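without malloc: the main thread may accept another connection before the new thread has read its argument, so the shared variable gets overwritten (a race)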

sync-basic

def: a variable is shared if and only if multiple threads reference some instance of it (this is not simply global vs. private)
threads memory model:
- conceptually, multiple threads run in the context of a single process
- but in fact the stack of a thread can be accessed by other threads (e.g. through a global pointer that points to a variable on that stack)
mapping variable instances to memory
- global: declared outside a function; virtual memory holds exactly one instance
- local: non-static variable declared inside a function; each thread stack holds its own instance
- local static: declared inside a function with static; virtual memory holds exactly one instance

example

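volatile: a variable declared with this keyword is never kept permanently in a register (every use loads it from memory and writes it back)
two threads updating the same volatile global variable can still go wrong (load -> update -> store)
- e.g. load1 -> update1 -> load2 -> update2 -> store1 -> store2 (easier to see in the assembly)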

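trajectory: a sequence of legal states in the progress graph (safe iff it never enters the unsafe region, where the critical sections interleave)
the three steps above (load, update, store) form the critical section, which must not be interleaved with another thread's critical section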

synchronize

to avoid the above, we need to ensure mutually exclusive access to the critical section
solution: semaphores

non-negative global integers manipulated by P and V
P (locking the mutex):
- if nonzero, decrement and return immediately (test and decrement happen atomically)
- if zero, block the current thread until the semaphore becomes nonzero and the thread is restarted by a V; after restarting, decrement and return control to the caller
V (releasing/unlocking the mutex):
- increment atomically
- if any threads are blocked in P, restart exactly one of them, chosen arbitrarily

basic idea: initialize it to 1, P right before the critical section, V right after the critical section

terminology: binary semaphore: a semaphore whose value is always 0 or 1; mutex: a binary semaphore used for mutual exclusion; holding a mutex means it is locked
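
The basic idea can be sketched with POSIX semaphores, where `sem_wait` plays the role of P and `sem_post` the role of V. This is a sketch in the spirit of a shared-counter example, assuming a Linux-style `sem_init` for unnamed semaphores; `run_counter` and `count_up` are names chosen for this note.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static volatile long cnt;   /* shared counter */
static sem_t mutex;         /* binary semaphore protecting cnt */

static void *count_up(void *vargp)
{
    long niters = *(long *)vargp;

    for (long i = 0; i < niters; i++) {
        sem_wait(&mutex);   /* P: enter the critical section */
        cnt++;              /* load -> update -> store, now uninterrupted */
        sem_post(&mutex);   /* V: leave the critical section */
    }
    return NULL;
}

/* Run two threads that each increment cnt niters times; returns cnt. */
long run_counter(long niters)
{
    pthread_t t1, t2;

    assert(niters >= 0);
    cnt = 0;
    sem_init(&mutex, 0, 1);   /* initialize to 1: unlocked */
    pthread_create(&t1, NULL, count_up, &niters);
    pthread_create(&t2, NULL, count_up, &niters);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&mutex);
    return cnt;
}
```

With the P/V pair in place the result is always 2 * niters; removing the two semaphore calls brings back the lost-update interleavings.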

sync-advance

two examples

producer-consumer problem
producer thread and consumer thread share a buffer; the producer waits for an empty slot, the consumer waits for an item
very common in GUIs (graphical user interfaces)
solution:

three semaphores: mutex (ensures exclusive access to the shared buffer), slots, items
wait for availability (slot or item), lock the buffer, do the operation, unlock, then announce the new item or slot with V
if multiple cores execute P at the same time, the kernel serializes them
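
The three-semaphore solution can be sketched as a bounded buffer of ints. This follows the general shape of an sbuf-style package but is written from scratch for this note with raw `sem_wait`/`sem_post` in place of P/V, so details such as the index handling and the return value of `sbuf_init` are assumptions.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdlib.h>

typedef struct {
    int *buf;      /* circular buffer with n slots */
    int n;
    int front;     /* index of the slot just before the first item */
    int rear;      /* index of the last item */
    sem_t mutex;   /* protects buf, front and rear */
    sem_t slots;   /* counts empty slots */
    sem_t items;   /* counts available items */
} sbuf_t;

/* Create an empty buffer with n slots. Returns 0 on success, -1 on error. */
int sbuf_init(sbuf_t *sp, int n)
{
    assert(sp != NULL && n > 0);
    sp->buf = calloc((size_t)n, sizeof(int));
    if (sp->buf == NULL)
        return -1;
    sp->n = n;
    sp->front = sp->rear = 0;
    sem_init(&sp->mutex, 0, 1);
    sem_init(&sp->slots, 0, (unsigned)n);   /* all slots start empty */
    sem_init(&sp->items, 0, 0);             /* no items yet */
    return 0;
}

void sbuf_deinit(sbuf_t *sp)
{
    free(sp->buf);
    sem_destroy(&sp->mutex);
    sem_destroy(&sp->slots);
    sem_destroy(&sp->items);
}

/* Producer: wait for a slot, lock, insert, unlock, announce the item. */
void sbuf_insert(sbuf_t *sp, int item)
{
    sem_wait(&sp->slots);
    sem_wait(&sp->mutex);
    sp->rear = (sp->rear + 1) % sp->n;
    sp->buf[sp->rear] = item;
    sem_post(&sp->mutex);
    sem_post(&sp->items);
}

/* Consumer: wait for an item, lock, remove, unlock, announce the slot. */
int sbuf_remove(sbuf_t *sp)
{
    int item;

    sem_wait(&sp->items);
    sem_wait(&sp->mutex);
    sp->front = (sp->front + 1) % sp->n;
    item = sp->buf[sp->front];
    sem_post(&sp->mutex);
    sem_post(&sp->slots);
    return item;
}
```

Waiting on slots or items before taking mutex matters: a thread that blocked on an empty or full buffer while holding mutex would keep every other thread out.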

reader-writer problem
there is no need to block readers from each other, because a reader makes no change to the data

the solution that favors readers: readers get priority over writers; a reader does not wait unless a writer already holds the lock

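int readcnt = 0;   /* number of readers currently in the critical section */
sem_t mutex, w;    /* both initialized to 1; mutex protects readcnt */

void reader(void)
{
    P(&mutex);
    readcnt++;
    if (readcnt == 1)   /* first reader in: lock out writers */
        P(&w);
    V(&mutex);
    /* critical section: reading happens here */
    P(&mutex);
    readcnt--;
    if (readcnt == 0)   /* last reader out: let writers in */
        V(&w);
    V(&mutex);
}

void writer(void)
{
    P(&w);
    /* critical section: writing happens here */
    V(&w);
}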

a variant that favors writers also exists

prethreaded

creating and destroying threads on demand incurs overhead
solution: create a pool of worker threads in advance (prethreading)
two ways to initialize the shared buffer

first: initialize it once in the main thread
second: static pthread_once_t once=PTHREAD_ONCE_INIT; Pthread_once(&once,init)
- every thread executes the Pthread_once call, but init runs only once (on the first call)
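
The second way can be sketched with the raw `pthread_once` call. The counter `init_calls` and the names `init_buffer`, `worker` and `run_workers` are made up so that the "runs only once" behaviour can be observed; a real init routine would set up the shared buffer instead.

```c
#include <assert.h>
#include <pthread.h>

#define NWORKERS 8

static pthread_once_t once = PTHREAD_ONCE_INIT;
static int init_calls;   /* how many times init_buffer actually ran */

static void init_buffer(void)
{
    init_calls++;        /* stand-in for the real buffer set-up */
}

static void *worker(void *vargp)
{
    (void)vargp;
    pthread_once(&once, init_buffer);   /* every worker makes this call */
    return NULL;
}

/* Start NWORKERS threads, wait for them, and report how often init ran. */
int run_workers(void)
{
    pthread_t tid[NWORKERS];
    int started = 0;

    for (int i = 0; i < NWORKERS; i++)
        if (pthread_create(&tid[started], NULL, worker, NULL) == 0)
            started++;
    assert(started > 0);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return init_calls;
}
```

This is convenient when the thread routine lives in a library and there is no obvious place in main to do the initialization.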

thread safe

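a function is thread-safe iff it always produces correct results when called repeatedly from multiple concurrent threads

class 1: functions that do not protect shared variables
- fix: use P and V, but the synchronization slows the code down
class 2: functions that keep state across invocations
- different schedules can produce different result sequences
- fix: keep the state in the caller's local variables and pass it in as an argument
class 3: functions that return a pointer to a static variable
- again a race: a call by another thread between the return and the caller's use of the result can overwrite it
- way 1: have the caller supply the buffer
- way 2: lock-and-copy: call the function while holding a lock and copy the result (no change to the library function needed)
class 4: functions that call thread-unsafe functions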

a function is reentrant iff it accesses no shared variables when called by multiple threads (reentrant functions are a proper subset of thread-safe functions)
avoiding deadlock: make every thread acquire the shared resources in the same order
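
The ordering rule can be sketched with two binary semaphores: both threads take `s` before `t`, so neither can end up holding one while waiting for the other. The names `s`, `t`, `worker` and `run_ordered` are made up for this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t s, t;    /* two mutexes, both initialized to 1 */
static long shared;   /* data that needs both locks */

static void *worker(void *vargp)
{
    long niters = *(long *)vargp;

    for (long i = 0; i < niters; i++) {
        sem_wait(&s);   /* every thread takes s first ... */
        sem_wait(&t);   /* ... and t second */
        shared++;
        sem_post(&t);
        sem_post(&s);
    }
    return NULL;
}

/* Run two threads with the same lock order; returns the final count. */
long run_ordered(long niters)
{
    pthread_t t1, t2;

    assert(niters >= 0);
    shared = 0;
    sem_init(&s, 0, 1);
    sem_init(&t, 0, 1);
    pthread_create(&t1, NULL, worker, &niters);
    pthread_create(&t2, NULL, worker, &niters);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    sem_destroy(&s);
    sem_destroy(&t);
    return shared;
}
```

If one of the threads instead took `t` first and `s` second, each could grab its first semaphore and then wait forever for the other.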

proxylab

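Pthread_detach(): detaches the thread so that its resources are reclaimed automatically when it terminates
Signal(SIGPIPE, SIG_IGN): SIG_IGN is the macro that tells the kernel to ignore the signal; this keeps a write to an already closed connection from raising a signal that terminates the process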
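
The effect of ignoring SIGPIPE can be observed with a pipe standing in for a closed connection. A small sketch with the raw `signal` call; `write_to_closed_pipe` is a made-up helper.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Write to a pipe whose read end is already closed, with SIGPIPE ignored.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the set-up itself failed. */
int write_to_closed_pipe(void)
{
    int fds[2];
    ssize_t n;
    int err;

    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR || pipe(fds) < 0)
        return -1;
    close(fds[0]);                   /* nobody will ever read */
    n = write(fds[1], "x", 1);       /* would kill the process by default */
    err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}
```

With the signal ignored, the write fails with EPIPE instead of terminating the process, so the proxy can close that one connection and keep serving the others.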
HTTP requests
- HTTP/1.1 uses persistent connections by default

//request line
GET /index.html HTTP/1.1    //origin form
GET http://example.com:80/index.html HTTP/1.1   //absolute form

//example
GET /some/path HTTP/1.0\r\n
Host: example.com\r\n
User-Agent: ...the given string...\r\n
Connection: close\r\n
Proxy-Connection: close\r\n
<other end-to-end headers, e.g. Accept, Accept-Language>\r\n
\r\n

parsing the request: prefix + host + port + path
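
The split into prefix, host, port and path can be sketched for absolute-form URIs. This is one possible parser, not the lab's reference code: the name `parse_uri`, the default port "80", the default path "/" and the lack of IPv6-literal support are all assumptions of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "http://host[:port][/path]" into its parts.
 * Returns 0 on success, -1 on a malformed URI or a buffer that is too small. */
int parse_uri(const char *uri, char *host, size_t hostlen,
              char *port, size_t portlen, char *path, size_t pathlen)
{
    static const char prefix[] = "http://";
    const char *p, *host_end, *colon, *slash;
    size_t n;

    assert(uri && host && port && path);
    if (strncmp(uri, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    p = uri + sizeof(prefix) - 1;              /* first byte of the host */

    slash = strchr(p, '/');                    /* start of the path, if any */
    host_end = slash ? slash : p + strlen(p);
    colon = memchr(p, ':', (size_t)(host_end - p));

    n = (size_t)((colon ? colon : host_end) - p);
    if (n == 0 || n >= hostlen)
        return -1;
    memcpy(host, p, n);
    host[n] = '\0';

    if (colon) {                               /* explicit port */
        n = (size_t)(host_end - (colon + 1));
        if (n == 0 || n >= portlen)
            return -1;
        memcpy(port, colon + 1, n);
        port[n] = '\0';
    } else if (snprintf(port, portlen, "80") >= (int)portlen) {
        return -1;                             /* default port did not fit */
    }

    if (snprintf(path, pathlen, "%s", slash ? slash : "/") >= (int)pathlen)
        return -1;
    return 0;
}
```

The host and port feed the connection to the origin server, and the path goes into the origin-form request line that is sent on.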

thread-level parallelism

multicore: multiple separate processors on a single chip
hyperthreading: efficient execution of multiple threads on a single core

typical multicore chips give each core its own L1 and L2 caches and share an L3 cache among the cores
out-of-order processor structure:
- instruction control has a decoder that converts instructions into a stream of operations
- a single core has many functional units, each executing a different kind of operation

example

sum 0 to n-1

reminder: operations on semaphores are quite expensive
a good practice: every thread accumulates into a local variable, then stores its partial sum in a global array to be summed up at the end
- avoids races and avoids repeatedly updating numbers in memory
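
The local-accumulation practice can be sketched as follows. Instead of a global array this version gives each thread one element of an array of structs owned by the caller, which plays the same role; `parallel_sum`, `sum_chunk`, `struct chunk` and `MAXTHREADS` are names chosen for this sketch.

```c
#include <assert.h>
#include <pthread.h>

#define MAXTHREADS 32

struct chunk {
    long long begin, end;   /* this thread sums the range [begin, end) */
    long long partial;      /* the thread's result, written exactly once */
};

static void *sum_chunk(void *vargp)
{
    struct chunk *c = vargp;
    long long local = 0;    /* accumulate in a local variable */

    for (long long i = c->begin; i < c->end; i++)
        local += i;
    c->partial = local;     /* single store, no semaphore needed */
    return NULL;
}

/* Sum 0..n-1 with nthreads threads. */
long long parallel_sum(long long n, int nthreads)
{
    pthread_t tid[MAXTHREADS];
    int started[MAXTHREADS];
    struct chunk chunks[MAXTHREADS];
    long long per, total = 0;

    assert(n >= 0 && nthreads >= 1 && nthreads <= MAXTHREADS);
    per = n / nthreads;
    for (int t = 0; t < nthreads; t++) {
        chunks[t].begin = t * per;
        chunks[t].end = (t == nthreads - 1) ? n : (t + 1) * per;
        chunks[t].partial = 0;
        started[t] = (pthread_create(&tid[t], NULL, sum_chunk, &chunks[t]) == 0);
        if (!started[t])
            sum_chunk(&chunks[t]);   /* fall back to summing in this thread */
    }
    for (int t = 0; t < nthreads; t++) {
        if (started[t])
            pthread_join(tid[t], NULL);
        total += chunks[t].partial;  /* main thread adds the partial sums */
    }
    return total;
}
```

No two threads ever write the same location, so there is nothing to protect with P and V, and the inner loop touches only a local variable.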

some concepts:
speedup: $S_p=\frac{T_1}{T_p}$, where $T_k$ is the running time using $k$ cores
efficiency: $E_p=\frac{S_p}{p}$
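
A worked example of the two definitions, with made-up timings (say $T_1 = 12$ s on one core and $T_4 = 4$ s on four cores; these numbers are hypothetical, not measurements):

$$S_4=\frac{T_1}{T_4}=\frac{12}{4}=3,\qquad E_4=\frac{S_4}{4}=\frac{3}{4}=0.75$$

So the four cores give a 3x speedup, and each core is doing useful work 75% of the time on average.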

sort

parallel quick sort:

if N<Nthresh, do sequential quick sort //avoid too fine-grained parallelism
else:
    choose a pivot and rearrange the array around it
    spawn two threads to recursively sort the two subarrays
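
The pseudocode can be turned into a small pthread sketch. It deviates from the outline in one labeled way: it spawns only one new thread per split and sorts the other half in the current thread, which has the same effect with fewer threads. `NTHRESH`, `struct span`, `psort` and `parallel_quicksort` are names chosen here, and the partition step is a plain Lomuto scheme that behaves poorly on arrays with many equal keys.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define NTHRESH 1024   /* below this size, sort sequentially */

struct span { long *base; size_t n; };

static int cmp_long(const void *a, const void *b)
{
    long x = *(const long *)a, y = *(const long *)b;
    return (x > y) - (x < y);
}

/* Lomuto partition around the middle element; returns the pivot's index. */
static size_t partition(long *a, size_t n)
{
    long pivot = a[n / 2], tmp;
    size_t store = 0;

    a[n / 2] = a[n - 1];          /* park the pivot at the end */
    a[n - 1] = pivot;
    for (size_t i = 0; i + 1 < n; i++)
        if (a[i] < pivot) {
            tmp = a[i]; a[i] = a[store]; a[store] = tmp;
            store++;
        }
    tmp = a[store]; a[store] = a[n - 1]; a[n - 1] = tmp;
    return store;
}

static void *psort(void *vargp)
{
    struct span *s = vargp;
    pthread_t tid;

    if (s->n < NTHRESH) {         /* avoid too fine-grained parallelism */
        qsort(s->base, s->n, sizeof(long), cmp_long);
        return NULL;
    }
    size_t p = partition(s->base, s->n);
    struct span left = { s->base, p };
    struct span right = { s->base + p + 1, s->n - p - 1 };

    if (pthread_create(&tid, NULL, psort, &left) != 0) {
        psort(&left);             /* no thread available: sort both here */
        psort(&right);
        return NULL;
    }
    psort(&right);                /* this thread takes the right half */
    pthread_join(tid, NULL);      /* left must stay alive until joined */
    return NULL;
}

void parallel_quicksort(long *a, size_t n)
{
    struct span all = { a, n };

    assert(a != NULL || n == 0);
    psort(&all);
}
```

The two halves are disjoint parts of the array, so the threads need no synchronization beyond the final join.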

some lessons:
- have several strategies and experiment with them
- do not synchronize in the inner loop
- be aware of Amdahl’s law
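
In the speedup notation $S_p$, Amdahl's law can be written as follows, where $\alpha$ is the fraction of the sequential running time that can be parallelized (the value $\alpha = 0.9$ below is a made-up example):

$$S_p=\frac{1}{(1-\alpha)+\frac{\alpha}{p}},\qquad \alpha=0.9,\ p=8:\quad S_8=\frac{1}{0.1+0.1125}\approx 4.7,\qquad \lim_{p\to\infty}S_p=\frac{1}{1-\alpha}=10$$

Even with 90% of the work parallelized, eight cores give less than a 5x speedup, and no number of cores can exceed 10x.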

memory consistency

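principle: sequential consistency (the overall effect must be consistent with each individual thread executing its own operations in program order)
- otherwise, any interleaving of the threads is allowed
how: snoopy caches
- tag each block of data held in a cache with a state
- invalid / shared (read-only copy) / exclusive (writable copy)
- when a core needs data that another cache holds tagged exclusive, that cache supplies the value and the tag is set to shared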
