Telecon 03-20-07
(Meeting notes at bottom)
Agenda:
1) Brief round table and status updates
2) Today's Topic: Pthreads support
Upcoming Topics:
- System-level view
- Interrupts
- Portability and target archs
Discussion Starter:
One of the main new features of the N-way LWK is support for threading. The de facto API is POSIX threads (Pthreads), so we should have a pretty good idea of how they are going to be supported. In our world, full support probably isn't necessary, but we have to have enough to make the OpenMP runtime libraries and auto-parallelizing compilers happy. The core things I think we need to support efficiently are:
1) Spawning one thread per CPU
2) Synchronization primitives
3) Thread-local data
Here's the Pthreads management API:
pthread_t thread; /* opaque thread ID */
int pthread_create(
pthread_t * pthread, /* returned thread ID */
const pthread_attr_t * attr, /* can specify stack, detached, etc. */
void *(*start)(void *), /* function to run */
void * arg /* arg to pass to function */
);
pthread_t pthread_self(void);
void pthread_exit(void *value_ptr); /* does not return */
int pthread_detach(pthread_t thread);
int pthread_join(pthread_t thread, void **value_ptr);
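To make the three core needs concrete, here is a minimal user-level sketch of what an OpenMP-style runtime might do with this API: spawn N workers, give each a thread-local rank, and join them all. The names worker_arg, spawn_and_sum, and online_cpus are made up for illustration. The __thread keyword is a GCC/Clang extension (C11 spells it _Thread_local), and _SC_NPROCESSORS_ONLN is a common but non-POSIX sysconf name. The sketch does not pin threads to CPUs.

```c
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Thread-local data: every thread gets its own copy (GCC/Clang extension). */
static __thread long my_rank = -1;

/* Hypothetical per-worker argument block, not part of Pthreads. */
struct worker_arg {
    long rank;    /* index assigned by the spawner */
    long result;  /* filled in by the worker */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    my_rank = arg->rank;              /* private to this thread */
    arg->result = my_rank * my_rank;  /* stand-in for real work */
    return arg;                       /* handed back through pthread_join */
}

/* Number of CPUs currently online, or 1 if it cannot be determined. */
long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}

/* Spawn n workers, join them all, and return the sum of their results.
   Returns -1 on allocation or thread-creation failure. */
long spawn_and_sum(long n)
{
    if (n <= 0)
        return -1;
    pthread_t *tids = malloc(n * sizeof *tids);
    struct worker_arg *args = malloc(n * sizeof *args);
    long started = 0, sum = 0;
    int failed = 0;

    if (!tids || !args) {
        free(tids);
        free(args);
        return -1;
    }
    for (long i = 0; i < n; i++) {
        args[i].rank = i;
        args[i].result = 0;
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    /* Join only the threads that were actually started. */
    for (long i = 0; i < started; i++) {
        void *ret = NULL;
        pthread_join(tids[i], &ret);
        sum += ((struct worker_arg *)ret)->result;
    }
    free(tids);
    free(args);
    return failed ? -1 : sum;
}
```

A runtime would typically call spawn_and_sum(online_cpus()) to get one worker per CPU.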
Things to discuss:
1) Some CPUs may share resources, e.g., a TLB, virtually tagged physical cache, etc. In this case, sharing the same page tables between cores might be a big win. By definition, Pthreads share the exact same address space, so this shouldn't be a problem. How about a new mechanism to say "I want this CPU to share this other CPU's address space"?
int
region_mirror_all(
cpu_t src,
cpu_t dst
);
Requirements are that the destination CPU not have any regions configured and not be running a thread already.
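A rough sketch of how the two requirements could be checked, using a toy per-CPU table. Everything except the region_mirror_all() signature is an assumption made up for illustration: the cpu_state struct, MAX_CPUS, cpu_t being an int, and the negative-errno return convention.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_CPUS 8      /* assumption: small fixed CPU count */
typedef int cpu_t;      /* assumption: cpu_t is a plain index */

/* Hypothetical per-CPU bookkeeping, not the real LWK structures. */
struct cpu_state {
    void *page_table_root;  /* root of this CPU's page tables */
    int   num_regions;      /* number of configured regions */
    int   running_thread;   /* nonzero if a thread is running */
};

static struct cpu_state cpus[MAX_CPUS];

/* Make dst share src's address space.
   Returns 0 on success, -EINVAL for bad CPU ids, and -EBUSY if dst
   already has regions configured or is already running a thread. */
int region_mirror_all(cpu_t src, cpu_t dst)
{
    if (src < 0 || src >= MAX_CPUS || dst < 0 || dst >= MAX_CPUS || src == dst)
        return -EINVAL;
    if (cpus[dst].num_regions != 0 || cpus[dst].running_thread)
        return -EBUSY;

    /* Sharing the root pointer means both CPUs walk the same tables. */
    cpus[dst].page_table_root = cpus[src].page_table_root;
    cpus[dst].num_regions     = cpus[src].num_regions;
    return 0;
}
```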
2) Need to distinguish between a thread exiting and a process exiting. In the latter, the LWK needs to notify the job launcher that the process has exited. Do we need a concept of the MAIN_THREAD? Each thread has a MAIN_THREAD associated with it, which is inherited unless the MAIN_THREAD flag is set. Exit syscall can be used to kill all threads with the same MAIN_THREAD... a thread group.
int
thread_create(
cpu_t cpu, /* effectively the thread's ID */
void * (*start_address)(void * priv),
void * priv,
void * stack_pointer,
uint32_t flags /* MAIN_THREAD */
);
int
thread_exit(
void * exit_value,
int kill_all /* boolean */
);
Function            Main Thread    Other Threads
========            ===========    =============
return()            All die        This thread dies
pthread_exit()      All die        This thread dies
exit() or _exit()   All die        All die
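The table can be modeled with a small amount of bookkeeping. The following is only a sketch of the thread-group idea, not proposed kernel code: the thread table, the model_ function names, MAX_CPUS, cpu_t as an int, and the value of the MAIN_THREAD flag are all assumptions for illustration. It tracks group membership only and runs nothing.

```c
#include <stdint.h>

#define MAX_CPUS    8      /* assumption */
#define MAIN_THREAD 0x1u   /* assumption: flag bit value */
typedef int cpu_t;         /* assumption: cpu_t is a plain index */

/* Hypothetical per-CPU thread record. */
struct thread {
    int   alive;
    cpu_t main_thread;  /* CPU of the group's main thread */
};

static struct thread threads[MAX_CPUS];

/* Record a new thread on `cpu`. With MAIN_THREAD set it starts a new
   group; otherwise it inherits the creator's MAIN_THREAD. */
int model_thread_create(cpu_t creator, cpu_t cpu, uint32_t flags)
{
    if (cpu < 0 || cpu >= MAX_CPUS || threads[cpu].alive)
        return -1;
    if (flags & MAIN_THREAD) {
        threads[cpu].main_thread = cpu;
    } else {
        if (creator < 0 || creator >= MAX_CPUS || !threads[creator].alive)
            return -1;
        threads[cpu].main_thread = threads[creator].main_thread;
    }
    threads[cpu].alive = 1;
    return 0;
}

/* Apply the exit table. Returns the number of threads that died,
   or -1 if `self` is not a live thread. */
int model_thread_exit(cpu_t self, int kill_all)
{
    if (self < 0 || self >= MAX_CPUS || !threads[self].alive)
        return -1;

    cpu_t group = threads[self].main_thread;
    int killed = 0;

    /* Per the table, the main thread leaving always takes the group. */
    if (self == group)
        kill_all = 1;

    for (cpu_t c = 0; c < MAX_CPUS; c++) {
        if (!threads[c].alive || threads[c].main_thread != group)
            continue;
        if (kill_all || c == self) {
            threads[c].alive = 0;
            killed++;
        }
    }
    return killed;
}
```

In this model, return() and pthread_exit() map to kill_all = 0, and exit() or _exit() map to kill_all = 1.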
3) How is joining implemented?
int
thread_wait(
cpu_t cpu,
void ** exit_value
);
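One simple possibility is busy waiting on a per-CPU exit slot. This is a minimal sketch with C11 atomics, assuming one thread per CPU so that the CPU number identifies the thread. The exit_slot struct, MAX_CPUS, cpu_t as an int, and model_thread_exit_publish() are hypothetical names introduced here.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CPUS 8   /* assumption */
typedef int cpu_t;   /* assumption: cpu_t is a plain index */

/* Hypothetical per-CPU slot written once by the exiting thread. */
struct exit_slot {
    atomic_int done;        /* 0 = still running, 1 = exited */
    void      *exit_value;  /* valid once done == 1 */
};

static struct exit_slot slots[MAX_CPUS];

/* Called on the exiting thread's behalf: publish the value, then the flag.
   The release store orders the value write before the flag. */
void model_thread_exit_publish(cpu_t cpu, void *value)
{
    slots[cpu].exit_value = value;
    atomic_store_explicit(&slots[cpu].done, 1, memory_order_release);
}

/* Spin until the thread on `cpu` has exited, then collect its value.
   Returns 0 on success, -1 for a bad CPU id. */
int thread_wait(cpu_t cpu, void **exit_value)
{
    if (cpu < 0 || cpu >= MAX_CPUS)
        return -1;

    /* The acquire load pairs with the release store above. */
    while (!atomic_load_explicit(&slots[cpu].done, memory_order_acquire))
        ;  /* busy wait */

    if (exit_value)
        *exit_value = slots[cpu].exit_value;

    /* Reset so the slot can be reused by the next thread on this CPU. */
    atomic_store_explicit(&slots[cpu].done, 0, memory_order_relaxed);
    return 0;
}
```

The spin loop keeps the waiting CPU busy, which matters when the waiter shares an execution pipe or cache with the thread it is waiting on.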
4) Pthreads mutexes and condition variables need to be efficiently
supported. It doesn't seem like spin-waiting at user level will be
a good idea with hundreds of cpus. Contention could be very high, with
N cache misses each time somebody unlocks the mutex. Furthermore, some
cpus may share resources (an execution pipe in the case of SMT), making
spin-waiting expensive. Possible options:
1) Add kernel-level support for mutexes (a la futexes)
2) Rely on advanced hardware features like the MONITOR/MWAIT
instructions on x86. MONITOR allows you to specify a cache
line to monitor for changes and MWAIT puts you to sleep until
a change occurs. AFAIK, nobody actually uses these for
synchronization yet.
3) Others?
Perhaps we need something like futexes on Linux.
Futexes: http://people.redhat.com/drepper/futex.pdf
Linux NPTL: http://people.redhat.com/drepper/nptl-design.pdf
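For reference, here is roughly what option 1 looks like on Linux today: a three-state mutex in the style described in the futex paper, where the uncontended path stays in user space and only contended lockers sleep in the kernel. This is a Linux-only sketch, not LWK code. The sketch_mutex_t type and the function names are made up, and it assumes atomic_int has the same layout as the plain int the futex syscall expects.

```c
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* state: 0 = unlocked, 1 = locked, 2 = locked and there may be waiters */
typedef struct {
    atomic_int state;
} sketch_mutex_t;

/* Thin wrapper around the raw futex syscall (glibc has no futex() call).
   Assumption: atomic_int can be passed where the kernel expects an int. */
static long futex_op(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, (int *)addr, op, val, NULL, NULL, 0);
}

void sketch_mutex_lock(sketch_mutex_t *m)
{
    int c = 0;

    /* Fast path: 0 -> 1 with no system call. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Contended: mark the lock as having waiters, then sleep until
       the exchange shows we took it from the unlocked state. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Sleeps only if state is still 2 when the kernel checks. */
        futex_op(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

void sketch_mutex_unlock(sketch_mutex_t *m)
{
    /* An old value of 1 means nobody was waiting, so no system call. */
    if (atomic_fetch_sub(&m->state, 1) != 1) {
        atomic_store(&m->state, 0);
        futex_op(&m->state, FUTEX_WAKE, 1);  /* wake one waiter */
    }
}
```

The design point relevant here is that waiters sleep in the kernel instead of spinning, so an unlock wakes one waiter rather than causing a cache miss on every spinning cpu.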
5) Other topics I've missed
Meeting Notes
0) Discussed current interrupt problems a bit. Traps currently disable local interrupts. For now, probably fine... plan is for there not to really be any system calls during runtime. However, kernel code still has to do locking as if interrupts were enabled, since other cpus can be in the kernel at the same time. So, changing to leaving interrupts enabled for traps shouldn't be very hard.
1.0) Agreed full Pthreads support not necessary. Made analogy to OpenGL implementations... most are very incomplete but have all of the important functionality.
1.1) region_mirror_all() seems OK in principle. Agreed there is an issue with cpus that share resources. Probably don't need to implement at time 0.
2) Goal is to try to keep process management out of the kernel as much as possible.
3) Time 0 join solution is busy waiting. Long term, probably need some mechanism to sleep for CPUs that share physical resources (execution pipe, cache, etc.).
4) Time 0 mutex solution is also busy waiting. MONITOR/MWAIT sound promising.
5) Discussed next week's topic a bit: Job loading