Distributed systems
Number 0x01: 03/23/2006
[ --- The Bug! Magazine
_____ _ ___ _
/__ \ |__ ___ / __\_ _ __ _ / \
/ /\/ '_ \ / _ \ /__\// | | |/ _` |/ /
/ / | | | | __/ / \/ \ |_| | (_| /\_/
\/ |_| |_|\___| \_____/\__,_|\__, \/
|___/
[ M . A . G . A . Z . I . N . E ]
[ Numero 0x01 <---> Edicao 0x01 <---> Artigo 0x09 ]
.> 23 de Marco de 2006,
.> The Bug! Magazine < staff [at] thebugmagazine [dot] org >
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Sistemas Distribuidos
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
.> 23 de Marco de 2006,
.> hash < hash [at] gotfault [dot] net >
"No one is more than anyone else, absolutely,
here who speaks is one more survivor"
-Racionais
Index
- Introduction
- What are distributed systems and what are their advantages?
- Preliminary knowledge
- Problems with concurrent processes
- 4.1 Race Condition
- 4.2 Deadlocks
- Performance and Security
- 5.1 Dynamic memory allocation
- 5.2 Controlling Data Entry
- Concurrent processes
- 6.1 Fork()
- 6.2 Kernel level pthreads
- Traffic lights
- Implementation
- Binary traffic light + 8.
- Mutual exclusion
- Mutex pthreads
- Shared memory
- Final Considerations
- References
- Acknowledgements
1. Introduction
With the processing capacity of personal computers increasing every day the possibilities of using shared resources in detriment of big mainframes takes us to a path through parallel programming and the development of distributed systems in order to take advantage of these shared resources, that before were concentrated in big servers and mainframes.
The possibilities become even bigger if we can develop software architectures and algorithms compatible with the new reality, from this idea comes this paper.
Distributed systems (at the software level) are more a programming methodology than something more tangible. It is about using the best algorithms, specific memory sharing functions between different processes and common sense.
The C language is ideal to approach this subject, being flexible enough to take full advantage of the technological advances and being simple enough to be as popular as it already is.
This paper will go through some topics regarding Distributed Systems, trying, and I hope with success, to approach the most important ones in the clearest way possible, to make it interesting to beginners and more experienced programmers. If I succeed you will make good use of this material.
All code presented here is complete, I will not use code snippets, only its full version.
2. What are distributed systems and what are their advantages?
Distributed systems differ from concentrated systems, where a main server (mainframe) concentrates most of the processing, sending its output to peripheral computers.
This is very outdated nowadays, since the smaller computers have in their processing capacity a great advantage. It is then up to the distributed processing to be efficient in data processing, without the loss of data, being fast enough so that the "SOFTWARE = THAT'S WHAT YOU SPEED" relation is lost in time.
The centralized topology can be exemplified like this:
+-> O mainframe suporta toda
| a carga de processamento
|
+------------------------- ____|____ ---------------------------+
| +---------+MAINFRAME+-------------+ |
| | +----+-----+-----+ | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| WorkStation1 | WorkStation2 | WorkStation3 |
| +-------+ | | +-------+ |
| | | | | |
WorkStation4 WorkStation5 WorkStation6 WorkStation7 WorkStation8
One of the problems of this topology becomes very evident, what happens if the Mainframe "dies" ? All die together, it seems like a plane hijacked by islamic radicals. The data traffic will be more concentrated and therefore, more congested, it will be necessary a mega link to give flow to all that traffic, etc.
A distributed model becomes more dynamic and less dependent on a central trunk, being able to have a better use of each network node, each machine plays a more important role, not being just a data replicator.
Evidently the complexity increases (not everything is perfect), it will be necessary a lot of organization to maintain a distributed system (from medium to high) working well, there will be many variants, however, the risk of a generalized breakdown will be much lower and the speed of processing responses will be much higher than the centralized model.
And the programming focused on the use of these resources is fundamental, essential, invaluable for the success of this organization.
From now on we will see how, programming in C Ansi, take advantage of all this.
We will see performance issues, specific C functions on memory handling, concurrent processes and much more.
And of course, how we will take advantage of this in hacking to develop fast, light and efficient tools for our tasks, whatever they may be. Much of what is covered here was left to your creativity, because nothing can be 100% complete, especially when the subject is security and software development.
Some topics were left out, before the reader asks himself "where is this topic?" or "why did he not talk about this", I want to emphasize once again that covering ALL the possible topics on this subject would result in a book, which I do not have the time or patience to write, much less all the knowledge necessary for such a task. So it is natural that some topics have not been covered here, who knows in a next paper?
According to vaxen (our friend) I should include a code that deals with 100 simultaneous threads (he himself, vaxen, didn't want to do it eheh) for the reader to realize the complexity of the subject, but I believe the reader will understand this when he starts to develop his own tools with threads and the other subjects covered, After all everything is simple when you want to print a printf("Hello world") but to do it using RPC, threads, semaphores, shared memory and dynamic memory allocation in a distributed network of 100 machines for you to print hello world in the third floor printer in Chicago with you being in Jamaica is more complicated.
Except for the 100 simultaneous threads I will try to do a little bit of all this here.
3. pre-knowledge
The knowledge of C Ansi is primordial, not necessarily a deep knowledge, but at least one that does not sound like greek, or some kind of alien language for abduction, since I use MANY practical examples and also some experience on the *nix platform (Linux) is desirable.
Repeating a bit the previous paragraph, I tried to include several practical examples to not just stay in theory. Demonstrating the actual operation of each topic discussed should, at least this is the intention, clarify the ideas in case of any doubts, and they will appear, then the practical demonstration serves as support for the theoretical statements.
So if C for you is only the third letter of the alphabet and unix sounds like the name of a second line deodorant, then I advise you to close this paper and look for information about these subjects, which is really very easy to obtain, www.google.com is a truly inexhaustible source of information.
On the other hand, most of the examples presented here can, with a little effort, be understood by beginners in the C language and distributed systems. And also more advanced readers should be interested in the subject.
Basic notions about GDB and GCC are also useful.
4. Problems with concurrent processes
When we talk about distributed systems, very often, in our implementation we will have to deal with concurrent processes, and there must be a mechanism or technique that can deal with this and prevent concurrent processes from simultaneously accessing a global variable or a system resource, which if not done, can generate inconsistency in results and general breakdown of the software among other inefficiencies.
I will talk in this topic about Deadlocks and Race Condition.
4.1. Race Condition
In this example code I make use of pthreads, don't pay attention to them yet because I will comment about this subject later, the important thing is to understand why a race condition happens. Observe:
--------- code ----------
/*
* simple_race.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Este codigo mostra como uma variavel
* global, compartilhada por duas funcoes
* gera um resultado inconsistente em
* race condition.
*
*/
#include <stdio.h>
#include <pthread.h>
#include <unistd.h>
#include <stdlib.h>
long result; //variavel global, sera compartilhada entre os processos.
int Calc1() {
result = 1;
long y;
for(y=0;y<1000000;y++)
result = result + 2; //manipulacao 1
return 0;
}
int Calc2() {
result = 1;
long y;
for(y=0;y<1000000;y++)
result = result + 3; //manipulacao 2
return 0;
}
int main() {
pthread_t my_thread1, my_thread2; //<--- pthreads serao abordadas mais adiante
int chk_thread;
//chamo a primeira funcao
if((chk_thread = pthread_create(&my_thread1,NULL,(void*)&Calc1,NULL)) == -1) {
printf("Impossible to create thread 1\n");
exit(EXIT_FAILURE);
} else {
printf("Thread 1 OK\n");
}
//chamo a segunda funcao, em paralelo com a primeira
if((chk_thread = pthread_create(&my_thread2,NULL,(void*)&Calc2,NULL)) == -1) {
printf("Impossible to create thread 2\n");
exit(EXIT_FAILURE);
} else {
printf("Thread 2 OK\n");
}
pthread_join(my_thread1,NULL);
pthread_join(my_thread2,NULL);
printf("Result: %ld\n",result);
return 0;
}
--------- code ----------
In the above example code, the global variable result is manipulated simultaneously by the functions Calc1() and Calc2(). When Calc1() adds a value to the result, simultaneously Calc2() does the same, resetting the variable to its previous state and adding the value, when it returns to Calc1() this value is already changed and from then on nothing else is guaranteed, the result is now always inconsistent and variable.
Have a look:
$ ./simple_race
Thread 1 OK
Thread 2 OK
Result: 3000001<---->OK
$ ./simple_race
Thread 1 OK
Thread 2 OK
Result: 3000001<---->OK
$ ./simple_race
Thread 1 OK
Thread 2 OK
Result: 3000001<---->OK
$ ./simple_race
Thread 1 OK
Thread 2 OK
Result: 3673293<---->RACE !!!!!!!
$ ./simple_race
Thread 1 OK
Thread 2 OK
Result: 3000001<---->OK
$
Further down in chapter 8 I will demonstrate how, using Mutex, to solve this problem in a way that nullifies the race condition present in the simple_race.c code.
4.2. Deadlock
Imagine I have a knife and you have a stretcher, I can't use the knife because I don't have the stretcher, and you can't eat the stretcher because you don't have the knife.
The same happens with concurrent processes that share resources, if there is no way to avoid this, the program will probably become unstable and freeze.
The following code demonstrates a deadlock occurring, when one function waits for a resource from the other and vice-versa, they both wait for each other indefinitely. An important addendum is that when using pthreads (calm down, I'll get to that) if the code doesn't generate any output it will be killed immediately in the presence of a deadlock, so I added a printout before the deadlock, that will also help you visualize exactly where the problem occurs, observe:
--------- code ----------
/*
* simple_deadlock.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Exemplo de como um deadlock
* pode ocorrer usando pthreads
* e processos concorrentes.
*
*/
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
int Func1(int);
int Func2(int);
int Func1(int arg) {
int send_arg,x;
printf("Func1: 01a\n");
sleep(1);
send_arg = Func2(arg); // <----------------------------+
// |
printf("Func1: 01b\n"); //Isso nunca sera impresso |
// |
send_arg = arg * 2; // |
// |
x = send_arg; // |
// |
printf("Func1: %d\n",x); // |
// | DEADLOCK!!!!!
return send_arg; // | DEADLOCK!!!!!
// | DEADLOCK!!!!!
} // | DEADLOCK!!!!!
// |
int Func2(int arg) { // |
// |
int send_arg,x; // |
// |
printf("Func1: 02a\n"); // |
// |
sleep(1); // |
// |
send_arg = Func1(arg); // <----------------------------+
printf("Func2: 02b\n"); //Isso nunca sera impresso
send_arg = arg * 2;
x = send_arg;
printf("Func2: %d\n",x);
return send_arg;
}
int main(int av, char *ac[]) {
if(av<2) {
printf("Usage %s <number>\n",ac[0]);
exit(EXIT_FAILURE);
}
int errno;
int arg;
int x;
arg = atoi(ac[1]);
pthread_t my_thread[1];
printf("Deadlock example\n");
printf("[*] Starting threads\n");
if((errno = pthread_create(&my_thread[0],NULL,(void*)&Func1,(void*)arg)) == -1) {
printf("Impossible to create thread 1\n");
exit(EXIT_FAILURE);
}
if((errno = pthread_create(&my_thread[1],NULL,(void*)&Func1,(void*)arg)) == -1) {
printf("Impossible to create thread 1\n");
exit(EXIT_FAILURE);
}
printf("Can you see 01b and 02b??\n");
for(x=0;x<2;x++)
pthread_join(my_thread[x],NULL);
return 0;
}
--------- code ----------
Func1() expects an argument from Func2(), but for Func2() to send this argument it must first receive it from Func1() and vice versa. Deadlock.
It is always important to pay attention to this kind of problem, so in chapter 7 we will see how to solve it.
5. Performance & security
A distributed system created without performance improvement has no reason to exist, performance is fundamental, response times between one call and another must be fast enough to justify the mental investment used to create this multiprocessed environment.
The environment variables such as connections, cabling, processor speed do not depend on the developer (this is not 100% true), so it is up to the programmer to develop the fastest and lightest software possible, so that when the budget people decide to invest in hardware you will be one step ahead.
First I will cover dynamic memory allocation and then I will talk quickly about some vulnerabilities like buffer overflow and how dynamic memory allocation can avoid some of these flaws.
5.1. Dynamic memory allocation
Dynamic memory allocation is often the most efficient way to handle input data, or when your algorithm is not able to predict the memory segment that will be copied into a variable. malloc() is part of the standard stdlib.h library and its use is advisable.
If you are dealing with distributed systems, a dynamic memory allocation mechanism can be useful because you just don't know exactly the capacity of the processors involved in the system.
On the other hand, the lack or misuse of this function in programs leaves us a loophole for an attack, be it heap overflow, stack overflow, etc.
Now I will describe a little bit about malloc() and related functions like realloc(), calloc() and free().
- malloc() -> void *malloc(size_t size)
Take a look at the following code:
--------- code ----------
/*
* simple_malloc.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Um exemplo do uso de malloc()
* para alocacao dinamica de memoria.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int av, char *ac[]) {
if(av<2) {
printf("Use %s <string>\n",ac[0]);
exit(EXIT_FAILURE);
}
//Aqui a alocacao dinamica e` feita com malloc(strlen...)
char *string = (char*)malloc(strlen(ac[1])+1);
strcpy(string,ac[1]);
printf("String: %s\n",string);
return 0;
}
--------- code ----------
No matter how long the string is that the user passes to the program, there will be no buffer overflow with possible arbitrary code execution.
Take a look:
$ ./simple_malloc `perl -e 'print "A"x200'`
...200 A`s
$
Now without malloc(), just setting a fixed size:
--------- code ----------
/*
* simple_no_malloc.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Um exemplo simples de como a nao
* checagem da entrada de dados
* pode gerar um buffer overflow.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int av, char *ac[]) {
if(av<2) {
printf("Use %s <string>\n",ac[0]);
exit(EXIT_FAILURE);
}
char string[50]; //alocacao estatica de memoria
strcpy(string,ac[1]);
printf("String: %s\n",string);
return 0;
}
--------- code ----------
Repeating the test now with the new code:
$ ./simple_no_malloc `perl -e 'print "A"x200'`
...200 A`s
Segmentation fault
$
Now what happens, using gdb:
$ gdb -q ./simple_no_malloc
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r `perl -e 'print "A"x200'`
Starting program: /home/hash/simple_no_malloc `perl -e 'print "A"x200'`
String: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAA
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb)
At this point the system would be completely compromised if it were a remote vulnerability or a local root suid.
This approach is a bit off the point of this paper, but on the other hand it is not, because a secure programming technique is a basic and essential part of any system, distributed or not.
Although all that I have described about malloc() was focused on security, malloc() makes the code robust, portable when used together with sizeof, to retrieve the size of the variable type for each architecture.
Let's look at an example with sizeof instead:
--------- code ----------
/*
* simple_sizeof.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: sizeof() retorna o tamanho
* do tipo de variavel de acordo
* com a arquitetura.
*
*/
#include <stdio.h>
int main() {
printf("size of int: %d bytes\n",sizeof (int));
printf("size of short int: %d bytes\n",sizeof (short int));
printf("size of char: %d bytes\n",sizeof (char));
printf("size of float: %d bytes\n",sizeof (float));
printf("size of double: %d bytes\n",sizeof (double));
printf("size of long double: %d bytes\n",sizeof (long double));
return 0;
}
--------- code ----------
Executando o codigo:
$ gcc simple_sizeof.c -o simple_sizeof
$ ./simple_sizeof
size of int: 4 bytes
size of short int: 2 bytes
size of char: 1 bytes
size of float: 4 bytes
size of double: 8 bytes
size of long double: 12 bytes
$
This could, in more practical code, be used to define a variable in a portable form:
int var = (int)malloc(sizeof(int));
var now stores the size of an integer, which in my case is 4 bytes. malloc() can be used in conjunction with free() as described below.
- free() -> void free(void *ptr)
free() frees a previously used block of memory with malloc(). Once freed this block will no longer be accessible.
See the example:
--------- code ----------
/*
* simple_free.c (example code)
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Demonstracao da funcao free(), que
* deve ser utilizada em conjunto com
* malloc().
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int Func1(char *arg){
char *string = (char*)malloc(strlen(arg)+1);
strcpy(string,arg);
printf("String: %s\n",string); //OK
free(string); //aqui desaloca o bloco de memoria.
printf("String: %s\n",string); //Vazio
return *string;
}
int main(int av, char *ac[]) {
if(av<2) {
printf("Use: %s <string>\n",ac[0]);
exit(EXIT_FAILURE);
}
Func1(ac[1]);
return 0;
}
--------- code ----------
Executando o codigo:
$ gcc simple_free.c -o simple_free
$ ./simple_free teste
String: teste
String: <------ vazio
$
Simple, isn't it? Using free() whenever possible makes the code lighter by freeing resources as soon as they are used. In a larger code this makes a difference.
- alloca() -> void *alloca(size_t size)
alloca() works slightly different from malloc() since it frees the memory block as soon as it is used, without free().
This has its positive and negative sides. The positive side is that you don't have to worry about freeing memory, but the negative side is exactly the same, you don't have to worry about freeing memory, so what happens if a function X allocates a space with alloca() and that content needs to be reclaimed by main() or some other function?
This is what happens:
--------- code ----------
/*
* alloca.c
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: alloca() libera o bloco de memoria
* assim que termina a funcao que o chamou,
* de forma que outras funcoes nao estao
* aptas a recuperarem este espaco.
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char *Func(char *arg) {
char *imprime = (char*)alloca(strlen(arg)+1);
strcpy(imprime,arg);
printf("Variavel imprime: %s\n",imprime); //aqui imprime
return imprime;
}
int main(int av, char *ac[]) {
if(av<2) {
printf("Use: %s string\n",ac[0]);
exit(EXIT_FAILURE);
}
char *imprime_main = (char*)malloc(strlen(ac[1])+1);
imprime_main = Func(ac[1]);
printf("Variavel imprime_main: %s\n",imprime_main); //aqui vem lixo concatenado
return 0;
}
--------- code ----------
Observe o resultado:
$ gcc alloca.c -o alloca
$ ./alloca teste
Variavel imprime: teste
Variavel imprime_main: †∂ˆ∑*8
$
malloc() and free() should be used to avoid this behavior.
- realloc() -> void *realloc(void *ptr, size_t size)
realloc() changes the size of the allocated block of memory, if the space is not compatible it copies the contents to a new address with a suitable size.
free() should be used to free the space changed by realloc().
5.2. Controlling data input
Before any data is copied to memory, either by strcpy, strcat, etc., a previous filtering of this data is advisable. This helps in detecting bugs, preventing vulnerabilities and will train you to notice similar implementation flaws in other people's code, where you can exploit them in the future.
In this topic I will quickly approach 3 very common vulnerabilities, not going into too much detail because this information is available in papers at TheBug! Magazine, like the xgc, sandimas and spud paper in previous issues.
- stack overflow;
In chapter 5 "Performance & Security" further up in the paper you will find the simple_no_malloc.c code that demonstrates a stack overflow.
Please reread chapter 5 if you have any doubts.
- heap overflow;
PS: This example could be in the section about malloc(), still in chapter 5, but I preferred a separate chapter since it is also a separate subject. The same goes for the integer overflow example.
Here is a code vulnerable to heap overflow, contributed by barros <barros at barrossecurity dot com>:
--------- code ----------
/*
* simple_heap.c
*
* Written by barros <barros at barrossecurity dot com>
*
* Descricao: Um erro na aplicacao da funcao malloc()
* gera um heap overflow.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int ac, char **av){
if(ac<2) {
printf("Use: %s string\n",av[0]);
exit(EXIT_FAILURE);
}
char *b1 = (char *)malloc(128); //128 fixos?
char *b2 = (char *)malloc(128); //128 fixos?
strcpy(b1,av[1]);
strcpy(b2,av[1]);
free(b1);
free(b2);
return 0;
}
--------- code ----------
See the execution:
$ gdb -q ./simple_heap
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r `perl -e 'print "A"x200'`
Starting program: /home/hash/simple_heap `perl -e 'print "A"x200'`
Program received signal SIGSEGV, Segmentation fault.
0xb7ebd945 in _int_free () from /lib/libc.so.6
(gdb)
Now a solution, with the correct implementation of malloc():
--------- code ----------
/*
* simple_heap_solucao.c
*
* Written by hash <hash at gotfault dot net>
*
*
* Descricao: Uma correta implementacao
* da funcao malloc() evita um
* heap overflow.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int ac, char **av){
if(ac<2) {
printf("Use: %s value\n",av[0]);
exit(EXIT_FAILURE);
}
char *b1 = (char *)malloc(strlen(av[1])+1);
char *b2 = (char *)malloc(strlen(av[1])+1);
strcpy(b1,av[1]);
strcpy(b2,av[1]);
free(b1);
free(b2);
return 0;
}
--------- code ----------
The execution:
$ gdb -q ./simple_heap_solucao
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r `perl -e 'print "A"x200'`
Starting program: /home/hash/simple_heap_solucao `perl -e 'print "A"x200'`
Program exited normally.
(gdb)
..malloc(strlen(av[1])+1); guarantees that the buffer will be the right size and we won't get heap overflow in this case.
- integer overflow;
Integer overflow, unlike the rest, happens because of bad integer initialization.
In general (but it is not a rule) integer overflow or integer underflow, which are lower values, do not allow direct manipulation of the %eip register, instead the approach becomes indirect, note:
--------- code ----------
/*
* simple_integer.c
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Codigo vulneravel a integer overflow
*
* observe:
* ./simple_integer 19 -1073726060
*/
#include <stdio.h>
#include <stdlib.h>
int main(int av, char *ac[]) {
if(av<3) {
printf("Use: %s slot value\n",ac[0]);
exit(EXIT_FAILURE);
}
int recv[16];
unsigned int slot = atoi(ac[1]);
recv[slot] = atoi(ac[2]);
printf("Value: %d\n",atoi(ac[2]));
return 0;
}
--------- code ----------
We have an integer of type int with buffer 16, but when we try to overwrite beyond this buffer, 16+value, the buffer will overflow into an integer overflow, see:
$ gdb -q ./simple_integer
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r 19 -1073745920
Starting program: /home/hash/simple_integer 19 -1073745920
Value: -1073745920
Program received signal SIGSEGV, Segmentation fault.
0xbffff000 in ?? ()
(gdb)
Notice that the return address is 0xbffff000, so we can store our shellcode in an environment variable preceded by many nop`s (0x90) and viola.
A while ago I made a fuzzer in Perl called flawseeker.pl, and in this fuzzer there is a test option for integer overflow, I'll just show the end of its execution against the vulnerable code above:
sh: line 1: 3307 Segmentation fault ./simple_integer 19 -1073745940 >/dev/null 2>&1
SIGSEGV AT -1073745920 -> $eip: eip 0xbffff000 0xbffff000
sh: line 1: 3310 Segmentation fault ./simple_integer 19 -1073745920 >/dev/null 2>&1
[--->Done!
eax 0x0 0
ecx 0xb7fdd013 -1208102893
edx 0x13 19
ebx 0xb7fc5f2c -1208197332
esp 0xbfcf02d0 0xbfcf02d0
ebp 0xbfcf02f8 0xbfcf02f8
esi 0xb7ff3900 -1208010496
edi 0xbfcf0324 -1076952284
eip 0xbffff000 0xbffff000 /*nosso %eip
eflags 0x10282 66178 nosso nop+shellcode ficara
cs 0x73 115 ficara em um endereco proximo*/
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x0 0
Got return address at value: -1073745920
Hmmm 0xbffff*! Hack it y/N>y
Value: -1073745920
sh-3.00b$
One of the possible solutions, would be to define that the buffer of the integer type int recv[] be dynamic, one of the ways to achieve this:
--------- code ----------
* simple_integer_fixed.c
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Interger overflow nao vulneravel
*
* Check it out:
* ./simple_integer 19 -1073726060
*/
#include <stdio.h>
#include <stdlib.h>
int main(int av, char *ac[]) {
if(av<3) {
printf("Use: %s slot value\n",ac[0]);
exit(EXIT_FAILURE);
}
unsigned int slot = atoi(ac[1]);
int recv[slot+1]; //uma alocacao dinamica sem o uso de malloc()
recv[slot] = atoi(ac[2]);
printf("Value: %d\n",atoi(ac[2]));
return 0;
}
--------- code ----------
Executing:
$ gdb -q ./simple_integer_fixed
Using host libthread_db library "/lib/libthread_db.so.1".
(gdb) r 19 -1073726060
Starting program: /home/hash/simple_integer_fixed 19
-1073726060
Value: -1073726060
Program exited normally.
(gdb)
This was a very basic approach, since other documents deal exclusively with vulnerabilities like this one, I limited myself to a light description with examples in order to point out the need to make use of preventive functions and algorithms and also to help those who are not familiar with overflow exploitation.
In the following topics we will look deeper into important issues like process competition, message exchange and others.
6. Concurrency between processes
Concurrent process is one of the many hearts of distributed systems, and there are many ways to implement this mechanism.
In this topic I will navigate over two controllers (if you can use that term) of concurrent processes, they are:
- fork()
- pthreads
The advantages, characteristics, problems and implementation of each of them will be seen in this chapter.
6.1. fork()
Prototipo: pid_t fork(void)
The primitas fork() and wait() were the first language notations to specify concurrency. Both processes execute the same code.
Each process has its own address space with copy of all variables, which are independent with respect to the variables of the calling process.
The synchronization between the parent and child processes are controlled by the wait() primitive, its prototype is pid_t wait(int *status) , which waits for the child process to execute before continuing the parent process.
First I will demonstrate a fork() implementation and then another with fork() & wait():
--------- code ----------
/*
* simple_fork.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: exemplo de uso de fork()
*
*/
#include <stdio.h>
#include <unistd.h>
int main() {
printf("Father pid: %d\n",getpid());
if(fork() == 0) {
printf("Wait...\n");
sleep(20);
} else {
printf("Child process running in background\n");
}
return 0;
}
--------- code ----------
Running this code:
$ ./simple_fork ; ps ax |grep simple_fork |head -1
Father pid: 3579
Child pid: 3580
Child process running in background
3580 tty1 S 0:00 ./simple_fork
$
As we don't have wait() as primitive in this example, the program calls the child process, which runs in the background and exits releasing the command prompt. Now in the next example the functioning is different, the parent process waits for the son process to execute, before releasing the command prompt, observe:
--------- code ----------
/*
* simple_fork_wait.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Simples fork() & wait() exemplo
*
*/
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
printf("Father pid: %d\n",getpid());
if(fork() == 0) {
printf("Wait...\n");
sleep(20);
} else {
int i,status;
i = wait(&status);
printf("Child pid: %d\nDone\n",i);
}
return 0;
}
--------- code ----------
Now from TTY1 I run the code:
hash@scarface:~/security/C/paper/fork$ ./simple_fork_wait
Father pid: 3653
Wait...
.
.
.
Now in the TTY2 terminal:
$ ps ax |grep simple_fork_wait |head -2
3653 tty1 S+ 0:00 ./simple_fork_wait
3654 tty1 S+ 0:00 ./simple_fork_wait
$
And after the 20 seconds of the child process have passed I get this message from TTY1, the tty from which I called simple_fork_wait and was stuck waiting for the child process, which finally finishes:
.
.
.
Child pid: 3654
Done
$
This behavior is especially useful, if not fundamental when dealing with concurrent processes, to avoid race condition and other inconsistencies in the output.
fork() is still widely used in the real world (the one Neo believed he lived in) and very often, especially in less complex or critical programs, it still works very well. But sometimes fork() is not enough when dealing with concurrent processes and even then by having all the memory space duplicated and becomes heavier compared to the primitive we will see next, pthreads.
6.2. Kernel level pthreads
Posix threads (pthreads), the Ferrari of concurrent processes, also called lightweight process, pthread shares the memory address, data thread and code thread, unlike fork() where the child process only shares the code thread with the parent process.
With pthreads the cost to the processor is lower and makes the difference. Graphs show a processing superiority, speed and cost to the processor lower with pthreads than with fork():
+------------------------------------------------------------------+
| Platforma | fork() | pthread_create() |
+------------------------------------------------------------------+
| | real user sys | real user sys|
|------------------------------------------------------------------|
|IBM 375 MHz POWER3 | 61.94 3.49 53.74 | 7.46 2.76 6.79 |
|------------------------------------------------------------------|
|IBM 1.5 GHz POWER4 | 44.08 2.21 40.27 | 1.49 0.97 0.97 |
|------------------------------------------------------------------|
|IBM 1.9 GHz POWER5 p5-575 | 50.66 3.32 42.75 | 1.13 0.54 0.75 |
|------------------------------------------------------------------|
|INTEL 2.4 GHz Xeon | 23.81 3.12 8.97 | 1.70 0.53 0.30 |
|------------------------------------------------------------------|
|INTEL 1.4 GHz Itanium 2 | 23.61 0.12 3.42 | 2.10 0.04 0.01 |
+------------------------------------------------------------------+
The advantage of using pthreads over fork() is evident. Distributed systems need to be lightweight, fast and consistent, and for this pthread is the best option when it comes to process concurrency.
Pthread uses the standard pthread.h library and a usage example is as follows:
--------- code ----------
/*
* simple_thread.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Exemplo de uma implementacao pthread
*
*
*/
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
void Func1() {
printf("Func1: thread\n");
sleep(10);
}
int main() {
pthread_t my_thread;
pthread_create(&my_thread,NULL,(void*)&Func1,NULL);
pthread_join(my_thread,NULL);
return 0;
}
--------- code ----------
Running on TTY1:
Note: to compile the code it is necessary to tell the compiler to link the pthread primitives with -lpthread, e.g:
$ gcc -lpthread simple_pthread.c -o simple_pthread -Wall
$ ./simple_thread
Func1: thread
Enquando o TTY1 nao e` liberado observe o TTY2:
$ ps ax |grep simple_thread |head -3
3762 tty1 S+ 0:00 ./simple_thread
3763 tty1 S+ 0:00 ./simple_thread
3764 tty1 S+ 0:00 ./simple_thread
$
Now, but 3 processes? why not just 2? The answer is simple, the first process number 3762 is the father process, normal.
The difference comes in the second process number 3763, that is a process that the system creates that is called process controller, it has the function of controlling the following processes created by the threads, no matter how many threads are created, we will have only 1 controller process, this makes things easier for the kernel but on the other hand it limits the number of threads that can be created in the system since every thread will have its unique PID.
The third process number 3764 is the respective light process created by the thread.
The primitive for creating a new thread is pthread_create(), see prototype:
int pthread_create(pthread_t * thread, pthread_attr_t * attr, void * (*start_routine)(void *), void * arg)
pthread_t e` a struct, entao:
pthread_t my_thread;
Onde my_thread sera o identificador da thread. Seguindo:
pthread_create(&my_thread,NULL,(void*)&Func1,NULL);
In this line the thread related to my_thread is created, the second parameter is NULL, it would fit some element of the first parameter. The third parameter (void*)&Func1, is a void pointing to the function that the thread will fire followed by NULL, where you could enter the options of this function, similar to "int function(char *arg)" for example.
O equivalente a wait() para threads e` pthread_join(), veja:
int pthread_join(pthread_t th, void **thread_return);
Where pthread_t is the thread identifier, in this case my_thread and NULL would receive a return value from this function.
Without pthread_join() the function would terminate, but, and pay attention, but, with pthreads this would be drastic, since threads share most of the resources and addressing of the calling process the thread would die immediately, along with main(), so the use of pthread_join() is invaluable and for each thread there should be a specific pthread_join().
If we add one more function to the above code and remove the pthread_join() primitives see what happens:
--------- code ----------
/*
* simple_thread_no_join.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Exemple de uma implementacao pthread
* sem pthread_join()
*
*
*/
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
void Func1() {
int x;
for(x=0;x<100;x++)
printf("%d\n",x);
}
void Func2() {
int x;
for(x=0;x<100;x++)
printf("%d\n",x);
}
int main() {
pthread_t my_thread1, my_thread2;
pthread_create(&my_thread1,NULL,(void*)&Func1,NULL);
pthread_create(&my_thread2,NULL,(void*)&Func2,NULL);
sleep(0.5);
return 0;
}
--------- code ----------
After compiling with -lptread I execute the code consecutively:
$ ./simple_thread_no_join
$ ./simple_thread_no_join
$ ./simple_thread_no_join
$ ./simple_thread_no_join
$ ./simple_thread_no_join
0
1
1
$ ./simple_thread_no_join
The loops of each initialized thread cannot complete, because when main() ends, it takes the threads with it.
There is much more to be said about pthreads, including functions not seen here, but this should introduce the subject properly.
7. Traffic lights
A traffic light, in a short description, is a counter used to control access to shared resources.
Let's first learn about some primitives for the implementation of semaphores.
The libraries are
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
As primitivas que utlizo nesse paper sao:
int semget(key_t, int nsems, int semflag);
int semctl(int semid, int semnum, int cmd,
union semun arg);
int semop(int semid, struct sembuf *sops, size_t nsops);
- semget() : Creates a set of traffic lights or gets the identification of an existing one;
- semctl() : Control the options of a traffic light;
- semop() : Performs simple operations on a traffic light.
Next I will perform the implementation of some traffic lights.
7a - Implementation
An example of how to create a traffic light, that will later be available for other processes is described in the following code:
--------- code ----------
/*
* simple_semaphore.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Cria um semaforo
*
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>
#include <stdlib.h>
#define KEY 1337
int main() {
int semid;
//a primitiva semget() cria o semaforo
if((semid = semget(KEY,1,0600|IPC_CREAT|IPC_EXCL)) == -1){
printf("Impossible to create semaphore\n");
exit(EXIT_FAILURE);
} else {
//semid guarda a identificacao do semaforo
printf("Semaphore id: %d\n",semid);
}
return 0;
}
--------- code ----------
semget() takes as parameters first
KEY, which is the identification of the semaphore, all processes that will access this semaphore need to know the value of KEY.
Then the number 1 indicates that it will be a 2-position traffic light, because it starts with zero, so the positions are 0 and 1.
0600 is the traffic light permission,
IPC_CREAT is the primitive that tells the traffic light to be created, followed by IPC_EXCL, which fails if the traffic light already exists.
A third primitive in addition to IPC_CREAT and IPC_EXCL is IPC_RMID, which removes the traffic light.
I will now execute the code and observe the output, then the ipcs command is used:
$ ./simple_semaphore
Semaphore id: 32768
$ ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000539 32768 hash 600 1
$
The hexadecimal value 0x00000539 equals 1337, which is the traffic light's KEY, the semid value, 32768, is the traffic light's identification, then we see the permission 0600 where only the creator of the traffic light, hash, seen in owner, can change it, besides root, nsens means that it is a 2-position traffic light.
You can use the ipcrm command to remove this traffic light:
$ ipcrm sem 131072
resource(s) deleted
$ ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
$
Traffic light deleted. Now a code that creates a traffic light, sleeps for a while, and then removes the traffic light:
--------- code ----------
/*
* simple_semaphore2.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Cria um semaforo, dorme um pouco
* e deleta o semaforo.
*
*/
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <unistd.h>
#include <stdlib.h>
#define KEY 1337
int main() {
int semid,
semkill;
if((semid = semget(KEY,1,0600|IPC_CREAT|IPC_EXCL)) == -1){
printf("Impossible to create semaphore\n");
exit(EXIT_FAILURE);
} else {
printf("Semaphore id: %d\n",semid);
printf("Check it out with ipcs -s from another tty\n");
}
sleep(20);
//IPC_RMID e` a primitiva que remove o semaforo
if((semkill = semctl(semid,0,IPC_RMID,0)) == -1) {
printf("Impossible to destroy semaphore\n");
exit(EXIT_FAILURE);
} else {
printf("Semaphore id: %d destroyed\n",semid);
}
return 0;
}
--------- code ----------
semctl() uses the semid identifier to then remove the traffic light from the system with IPC_RMID.
TTY1:
$ ./simple_semaphore2
Semaphore id: 163840
Check it out with ipcs -s from another tty
.
.
.
TTY2:
$ ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
0x00000539 163840 hash 600 1
$
TTY1:
.
.
.
Semaphore id: 163840 destroyed
$
TTY2:
$ ipcs -s
------ Semaphore Arrays --------
key semid owner perms nsems
$
Later on I will demonstrate the use of semop() to change the value of a traffic light.
7.1. Binary Traffic Light
For the beginner, the complete implementation of a traffic light may seem a bit complex and difficult to understand, thinking about this Professor Edsger Wybe Dijkstra developed in 1979 what is known today as a binary traffic light, which consists of two basic operations V() and P(), where V() frees and P() blocks.
I will not include an implementation of Dijkstra's semaphore but will give some information:
Non-standard library: #include "dijkstra.h";
In possession of this library the binary traffic light can be implemented in your code, the algorithm goes like this:
----- algoritmo ----
#include "dijkstra.h"
int variavel_global;
funcao1() {
declaracao de variaveis;
comandos;
P(semaforo) //bloqueia area critica
Manipulacao da variavel;
Fim da manipulacao
V(semaforo) //libera area critica
}
funcao2() {
declaracao de variaveis;
comandos;
P(semaforo) //bloqueia area critica
Manipulacao da variavel;
Fim da manipulacao
V(semaforo) //libera area critica
}
int main() {
funcao1();
funcao2();
}
----- algoritmo ----
The first function that accesses the global_variable blocks the access of the other function, which sleeps while waiting, more precisely enters a wait state, at the process level that is controlled by the kernel. When the critical zone is cleared with V(semaphore) the other function gains access to the variable and in turn enters P(semaphore) blocking any other competing processes.
The whole thing is done transparently, freeing the programmer from any logistic burden on the semaphore, it makes life much easier, but it is beyond the scope of this paper, so I preferred to approach a primordial semaphore.
Any information about Dijkstra's binary traffic light can be found on the web as well as his library dijkstra.h, necessary for implementation.
One last interesting fact is that this V() and P() mechanism is very similar to the use of Pthread Mutex, you will see later, but first, Mutual Exclusion.
8. Mutual exclusion
Mutual exclusion, like semaphores, is another way to handle concurrent processes, but mutual exclusion is an algorithm, while semaphores are system functions.
A code to demonstrate mutual exclusion:
--------- code ----------
/*
* simple_mutual_exclusion.c (example code)
*
* Algorithm de Peterson (1981) and
*
* Com pequenas mudancas by hash <hash@gotfault.net>
*
* Descricao: Algoritmo de exclusao mutua, apesar de gerar
* resultado consistente, possui o defeito de
* gerar espera ocupada.
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
int mutex[2];
int share;
int change;
void Func1() {
int i;
for(i=0;i<100;i++) {
mutex[0] = 1;
change = 0;
while(mutex[1]==1 && change==0){} <--+
|
share = share + 5; |
|
mutex[0] = 0; |
} |
|
} | Espera
|-> ocupada
void Func2() { |
|
int i; |
|
for(i=0;i<100;i++) { |
|
mutex[1] = 1; |
change = 1; |
|
while(mutex[0]==1 && change==1) <----+
share = share + 3;
mutex[1] = 0;
}
}
int main() {
pthread_t my_thread1, my_thread2;
int chk_thread;
if((chk_thread = pthread_create(&my_thread1,NULL,(void*)&Func1,NULL)) == -1)
printf("Impossible to start thread1\n");
if((chk_thread = pthread_create(&my_thread2,NULL,(void*)&Func2,NULL)) == -1)
printf("Impossible to start thread2\n");
pthread_join(my_thread1,NULL);
pthread_join(my_thread2,NULL);
printf("Result is: %d\n",share);
return 0;
}
--------- code ----------
Executing the code:
$ ./simple_mutual_exclusion
Result is: 800 <----------------+
$ ./simple_mutual_exclusion |
Result is: 800 <----------------+
$ ./simple_mutual_exclusion |---> 800
Result is: 800 <----------------+
$ ./simple_mutual_exclusion |
Result is: 800 <----------------+
$
An array and a variable (global) are used for critical zone access control, it works very well, but on the other hand it generates busy waits, which in many systems could cause slowdowns and loss of processing and CPU performance, the busy waits in question occur at:
Func1()
while(mutex[1]==1 && change==0){}
Func2()
while(mutex[0]==1 && change==1) {}
Both tests wait for the variable values to change before moving on.
Now, for smaller, lower impact systems this could be easily implemented, ensuring consistency of result, with no root condition.
9. Pthreads Mutex
Phtread has a function called pthread mutex , the struct:
pthread_mutex_t
Initializes the mutex, see declaration:
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
Another option would be with:
pthread_mutex_init
And its initialization form:
pthread_mutex_t mutex;
pthread_mutex_init(&mutex,NULL);
Let's just use the first form for practicality.
In the general sense pthread_mutex_t looks a lot like a binary traffic light, if you have paid attention to the traffic light chapter you will soon notice the visual similarities.
A solution to the race condition simple_race.c code in chapter 4 with mutex is in the following code:
--------- code ----------
/*
* simple_race_mutex.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: pthread_mutex_t em processos concorrentes
*
*/
#include <stdio.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>
long result;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
int Calc1() {
long y;
pthread_mutex_lock(&mutex);
result = 1;
for(y=0;y<1000000;y++)
result = result + 2;
pthread_mutex_unlock(&mutex);
return 0;
}
int Calc2() {
long y;
pthread_mutex_lock(&mutex);
result = 1;
for(y=0;y<1000000;y++)
result = result + 3;
pthread_mutex_unlock(&mutex);
return 0;
}
int main() {
pthread_t my_thread1, my_thread2;
int chk_thread;
if((chk_thread = pthread_create(&my_thread1,NULL,(void*)&Calc1,NULL)) == -1) {
printf("Impossible to create thread 1\n");
exit(EXIT_FAILURE);
} else {
printf("Thread 1 OK\n");
}
if((chk_thread = pthread_create(&my_thread2,NULL,(void*)&Calc2,NULL)) == -1) {
printf("Impossible to create thread 2\n");
exit(EXIT_FAILURE);
} else {
printf("Thread 2 OK\n");
}
pthread_join(my_thread1,NULL);
pthread_join(my_thread2,NULL);
printf("Result: %ld\n",result);
return 0;
}
--------- code ----------
Remember that we are dealing with kernel level pthreads, this means that when a process enters the critical zone with pthread_lock() the other processes that try to access this zone enter a waiting state, this is done transparently by the kernel, not requiring any additional code load as with the semaphore before.
Let's see the execution:
$ ./simple_race_mutex
Thread 1 OK
Thread 2 OK
Result: 3000001<------------------------->\
$ ./simple_race_mutex | \
Thread 1 OK | \
Thread 2 OK | \
Result: 3000001<-------------------------+------->\
$ ./simple_race_mutex | \
Thread 1 OK | \
Thread 2 OK | \ +---------+
Result: 3000001<-------------------------+------------------>> 3000001 |
$ ./simple_race | / +---------+
Thread 1 OK | /
Thread 2 OK | /
Result: 3000001<-------------------------+------->/
$ ./simple_race | /
Thread 1 OK | /
Thread 2 OK | /
Result: 3000001<------------------------->/
$
Here is another example:
--------- code ----------
/*
* mutex2.c
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Mais um exemplo com mutex.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>
#define GO_TO_HELL -1
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER; <- incializador mutex
struct hell {
char *i,*kick,*your,*ass,*jackass;
};struct hell devil;
long result;
int bill1(void *m) {
long i;
struct hell *silicio;
char *daemon;
silicio = (struct hell *) m;
daemon = silicio->i;
printf("%s\n",daemon);
pthread_mutex_lock(&mutex); <-----------------+
|
result = 0; | Mutex protege
|-> a funcao
for(i=0;i<100000;i++) | de calculo.
result = result + 2; |
|
pthread_mutex_unlock(&mutex); <---------------+
return 0;
}
int main() {
int err;
pthread_t you, should, recieve, kick, gambler;
char sick[] = "hey";
char hate[] = "bill,";
char fury[] = "damm!";
char anger[] = "you got";
char greed[] = "billions!?!";
devil.i = sick;
devil.kick = hate;
devil.your = fury;
devil.ass = anger;
devil.jackass = greed;
if((err = pthread_create(&you,NULL,(void*)&bill1,(void*)&devil.i)) == -1)
{
printf("you: possesed!\n");
exit(GO_TO_HELL);
}
if((err = pthread_create(&should,NULL,(void*)&bill1,(void*)&devil.kick)) == -1)
{
printf("should: possesed!\n");
exit(GO_TO_HELL);
}
if((err = pthread_create(&recieve,NULL,(void*)&bill1,(void*)&devil.your)) == -1)
{
printf("recieve: possesed!\n");
exit(GO_TO_HELL);
}
if((err = pthread_create(&kick,NULL,(void*)&bill1,(void*)&devil.ass)) == -1)
{
printf("kick: possesed!\n");
exit(GO_TO_HELL);
}
if((err = pthread_create(&gambler,NULL,(void*)&bill1,(void*)&devil.jackass)) == -1)
{
printf("kick: possesed!\n");
exit(GO_TO_HELL);
}
pthread_join(you,NULL);
pthread_join(should,NULL);
pthread_join(recieve,NULL);
pthread_join(kick,NULL);
printf("%ld\n",result);
pthread_join(gambler,NULL);
return 0;
}
--------- code ----------
Execution:
$ ./teste2
hey <--+
bill, <--|
damm! <--|
you got <--|--> OK
200000 <--|
billions!?!<--+
$ ./teste2
hey
bill,
damm! IDEM
you got
200000
billions!?!
$ ./teste2
hey
bill,
damm!
you got
billions!?! <-- ERRADO!!
200000
$
Notice that the value doesn't change, it's always 200000, so mutex worked, but the order in which the treads are scaled by the kernel can vary, generating the result above. For the code to have its order guaranteed, besides mutex it would have to be implemented a sorting system via semaphore or mutual exclusion, for example, to guarantee that tread 1 would always be followed by thread 2 and so on. It is not difficult and you can do this for training.
A sleep() between each pthread_create() in main() doesn't count!
New threads are already being developed to make this even more efficient, one of them is the NPTL (Next Generation POSIX Threads) developed by IBM that offers more conformity with the classic POSIX model of Pthreads (POSIX Threads) that we have seen so far.
The NPTL was developed from the 2.5 kernel and migrating to this system now requires, at least, the 2.6.x kernel and some adaptations will be necessary, if the system is already using NPTL the getconf command informs:
$ getconf GNU_LIBPTHREAD_VERSION
$ nptl-0.60
Another novelty is the Intel Hyperthreading. These are processors that consecrate the use of threads, according to Intel the performance increase can reach up to 30% depending on the system configuration.
The technology simulates in a single physical processor, two logical processors.
In very loaded systems that depend a lot on performance this makes the difference. Imagine a tool to break 64 or 128 bit encryption on such a system, the chances of success are higher.
But for now POSIX Threads has been meeting most needs well.
10. Shared memory
There are 4 basic types of message exchanges between processes:
- Sinais
- IPC
- Sockets
- RPC
In this chapter I will cover IPC.
And there are still 4 sub-categories with IPC`s, they are:
- Unix Pipes;
- FIFO's Drivers;
- Message Queue;
- Shared Memory.
Among these 4 methods, shared memory is undoubtedly the fastest because it does not use any intermediary such as drivers or queues.
For this reason I consider the shared memory method the most elegant of the four because it meets an important goal in distributed systems, which is speed, not to mention the amplitude of this type of storage that can store various information.
In the process address space, the information will be mapped into memory, making it available to any process that has a valid permission and its identification number, much like the traffic light system seen earlier.
Unlike techniques like unix pipes or drivers, sharing memory allows you to write a variety of information to this segment, a string for example, respecting the available size.
Like mutex the kernel also provides an internal routine with all the necessary parameters in linux/shm.h
, they are
shmid_ds {
struct ipc_perm shm_perm; //permissoes do compartilhamento
int shm_segsz; //tamanho do segmento em bytes
time_t shm_atime; //data ultima ligacao
time_t shm_dtime; //data ultima desconexao
time_t shm_ctime; //data ultima atualizacao
unsigned short shm_cpid; //PID do processo criador
unsigned short shm_lpid; //PID do ultimo processo
short shm_nattch; // numero de ligacoes
}
The libraries are:
#include <sys/ipc.h>
#include <sys/shm.h>
To create a new memory segment the function shmget()
, its prototype:
int shmget(key_t key, int size, int shmflg);
When creating a memory area we have to define the paging of this memory (its size) and this is architecture dependent. A symbol called PAGE_SIZE represents the size that a page of memory can have. On my linux Slackware PIII 750 a page has 4096 bytes, use the code below to check yours:
--------- code ----------
/*
* getpagesize.c (example code)
*
* Written by hash <hash@gotfault.net>
*
*
* Description: Retorna o PAGE_SIZE da
* da arquitetura.
*
*
*/
#include <stdio.h>
#include <unistd.h>
int main(){
int size = getpagesize(); //funcao que retorna o PAGE_SIZE
printf("Page size: %d bytes\n",size);
return 0;
}
--------- code ----------
Observe:
$ ./getpagesize
Page size: 4096 bytes
$ cat getpagesize.c
We can subdivide this page in blocks of 1024 bytes for example.
Like in traffic lights, the memory sharing has arguments IPC_CREAT and IPC_EXCL that have the same function, such as:
- SHMMAX:
Sets the maximum size that a segment
segment should have, default 2M.
- SHMMIN:
Minimum size, default 1 byte.
- SHMMIN:
Maximum number of segments, default 4096.
- SHMALL:
Maximum size of memory pages the system should support. The formula is:
(SHMMAX/PAGE_SIZE*(SHMMNI/16).
So just to illustrate:
--------- code ----------
/*
* shmall.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Recupera, apenas como representacao,
* o tamanho maximo de paginas que o
* sistema deve suportar de acordo com
* os valores padrao.
*
*/
#include <stdio.h>
#include <unistd.h>
int main() {
int size = getpagesize();
int shmall = 2000000/size*(4096/16);
printf("SHMALL: %d\n",shmall);
return 0;
}
--------- code ----------
$ ./shmall
SHMALL: 124928
So in possession of this information I will demonstrate a code that creates a memory segment in the system:
--------- code ----------
/*
* shared_memory_create.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Description: Cria um segmento de memoria compartilhda.
* Teste o sucesso com o comando: ipcs -m
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#define PAGE 1024
#define KEY 31337
int main() {
int shmid;
char *path = "/etc/passwd";
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|IPC_EXCL|0600)) == -1) {
printf("Memory segment already exist\n");
exit(EXIT_FAILURE);
} else {
printf("Memory segment id: %d\n",shmid);
}
return 0;
}
--------- code ----------
ftok() de prototipo:
key_t ftok(const char *pathname, int proj_id);
It generates a composite key with a path "/etc/passwd", but it could be any valid system file, and the KEY key followed by the page size, in this case 1024 bytes and the primitives IPC_CREAT, IPC_EXCL and the permissions 0600.
Let's see how it looks like:
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x69047442 2457623 hash 600 1024 0
$
There, the memory segment is created and able to store information.
To be able to store information in this memory segment we will use some functions, first shmat():
int shmdt(const void *shmaddr);
shmat() is the function responsible for coupling the memory segment to the process. Once coupled we use shmdt():
int shmdt(const void *shmaddr);
Now in possession of this information I can create writes to this thread, but in the code I first need to verify its existence and capture its shmid, which is the thread identification number, see:
--------- code ----------
/*
* shared_memory_write.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Escreve em um segmento de memoria
* compartilhada.
*
* Para remover: ipcrm -m shmid(valor)
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#define PAGE 1024
#define KEY 31337
int main() {
int shmid;
int w,status;
char *path = "/etc/passwd";
char *write;
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|0600)) == -1) {
printf("Cannot read memory segment\n");
exit(EXIT_FAILURE);
}
printf("Memory segment id: %d\n",shmid);
write = shmat(shmid,0,0);
printf("Memory segment pointed to var 'write'\n");
if(fork() == 0) {
strcpy(write,"i wanna hack the world"); <-- strcpy para gravar na memoria
printf("Memory segmente has been written\n");
shmdt(write);
sleep(2);
} else {
w = wait(&status);
printf("Memory segment %d content is: %s \n",shmid,write);
shmdt(write);
}
return 0;
}
--------- code ----------
So, by executing the two codes above, one followed by the other the result will be:
$ ./shared_memory1_create
Memory segment id: 3506194
$ ./shared_memory1_write
Memory segment id: 3506194
Memory segment pointed to var 'write'
Memory segmente written
Memory segment 3506194 content is: i wanna hack the world
$
And take another look at our live segment:
$ ipcs -m
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x6904747f 3506194 hash 600 1024 0
$
In the previous code obeserve that char *write gets the return from shmat():
write = shmat(shmid,0,0);
With that I use strcpy to write to this segment transparently:
strcpy(write,"string");
Then I decouple the memory segment:
shmdt(write);
Simple, isn't it?
To simply read the memory contents is also very easy, just attach it to the memory segment and you are done:
--------- code ----------
/*
* shared_memory_read.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Description: Le o conteudo de um segmento de memoria.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#define PAGE 1024
#define KEY 31337
int main() {
char *read;
char *path = "/etc/passwd";
int shmid;
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|0600)) == -1) {
printf("Cannot read segment\n");
exit(EXIT_FAILURE);
} else {
printf("Memory segment id: %d\n",shmid);
read = shmat(shmid,0,0); <-- read agora contem a string
printf("Memory segment content: %s\n",read);
shmdt(read); <- desacoplamento
}
return 0;
}
--------- code ----------
Take a look:
$ ./shared_memory1_read
Memory segment id: 3506194
Memory segment content: i wanna hack the world
$
I will now remove this segment of memory from the system:
--------- code ----------
/*
* shared_memory_destroy.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Description: Destroy a shared memory segment.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#define PAGE 1024
#define KEY 31337
int main() {
char *path = "/etc/passwd";
int shmid;
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|0600)) == -1) {
printf("Cannot read segment\n");
exit(EXIT_FAILURE);
}
if((shmctl(shmid,IPC_RMID,NULL)) == -1) {
printf("Cannot destroy segment\n");
exit(EXIT_FAILURE);
} else {
printf("Memory segment %d erased\n",shmid);
}
return 0;
}
--------- code ----------
Remember IPC_RMID for traffic lights? Well, it fits here too.
This code now contains shmctl(), prototype:
int shmctl(int shmid, int cmd, struct shmid_ds *buf);
shmctl() provides access to the memory thread information, changes permissions and can also destroy the thread, which is the case with this code.
With shmctl() one can read:
- shm_perm.care -> segment owner id
- shm_perm.cgid -> id of the owner group
- shm_perm.mode -> access permissions
- shm_segsz -> segment size
- shm_atime -> last coupling date
- shm_dtime -> last disconnection date
- shm_ctime -> date of last change
- shm_cpid -> PID date of the creator process
- shm_lpid -> date of last process
- shm_nattch -> number of coupled processes
Let's see for example the segment creator ID:
--------- code ----------
/*
* shared_memory_id.c (example code)
*
* Written by hash <hash@gotfault.net>
*
* Description: Returns ID of segment`s owner.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#define PAGE 1024
#define KEY 31337
int main() {
struct shmid_ds dados;
char *path = "/etc/passwd";
int shmid;
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|0600)) == -1) {
printf("Cannot read segment\n");
exit(EXIT_FAILURE);
}
if(shmctl(shmid,IPC_STAT,&dados) == -1) {
printf("Cannot read data\n");
exit(EXIT_FAILURE);
}
printf("Owner`s ID: %d\n",dados.shm_perm.cuid);
return 0;
}
--------- code ----------
I recovered the shmid_ds structure (see beginning of chapter) in "data" and with this structure it is very fast to recover the data. Take a look at the execution:
$ ./shared_memory1_id
Owner`s ID: 1000
$ id
uid=1000(hash) gid=100(users) groups=100(users),106(hash)
$
A new required option was IPC_STAT which reads the contents of shmid_ds and its members, other options are:
- IPC_SET: Changes the identification of group, user and permissions. Only owner or root can change.
- IPC_RMID: As seen before marks the segment to be destroyed after decoupling.
And to finish the chapter, two codes, one writes in a segment and the other one loops reading what is written in the same segment.
write.c:
--------- code ----------
/*
* write.c
*
* Written by hash <hash at gotfault dot net>
*
* Descricao: Escreve em um segmento de memoria.
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#include <unistd.h>
#define PAGE 1024
#define KEY 31337
int main(int av, char *ac[]) {
int shmid;
char *path = "/etc/hosts";
char *write = (char*)malloc(strlen(ac[1]));
if(ac[2]) {
printf("Use: %s <string>\n",ac[0]);
exit(EXIT_FAILURE);
}
if(strlen(ac[1]) > PAGE) {
printf("String nao deve ter mais que %d bytes\n",PAGE);
exit(EXIT_FAILURE);
}
//Cria o segmento:
if((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_CREAT|IPC_EXCL|0600)) == -1)
{
printf("Segmento de memoria ja existente\n");
exit(EXIT_FAILURE);
} else {
printf("ID do segmendo: %d\n",shmid);
}
write = shmat(shmid,0,0); //acopla no segmento
//Escreve no segmento:
strcpy(write,ac[1]);
printf("Segmento %d escrito\n",shmid);
shmdt(write); //desacopla
sleep(1);
return 0;
}
--------- code ----------
read.c:
--------- code ----------
/*
* codename
*
* Written by hash <hash@gotfault.net>
*
* Descricao: Le um segmento de memoria e em
* seguida o destroi.
*
*
*
*/
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/shm.h>
#include <string.h>
#include <unistd.h>
#define PAGE 1024
#define KEY 31337
int main() {
int shmid;
char *read;
char *path = "/etc/hosts";
printf("Aguardando por escrita no segmento...\n");
printf("Ctrl+c para sair\n");
//Le o segmento:
while(1) {
if ((shmid = shmget(ftok(path,(key_t)KEY),PAGE,IPC_EXCL|0600)) == -1)
{
sleep(1);
} else {
read = shmat(shmid,0,0);
printf("Conteudo do segmento %d: %s\n",shmid,read);
//Remove o segmento:
if((shmctl(shmid,IPC_RMID,NULL)) == -1)
{
printf("Impossivel destruir segmento\n");
exit(EXIT_FAILURE);
} else {
printf("Segmento %d destruido\n",shmid);
}
}
}
return 0;
}
--------- code ----------
No TTY1 executo read:
$ ./read
Aguardando por escrita no segmento...
Ctrl+c para sair
.
.
.
Agora em TTY2 executo write:
$ ./write teste1
ID do segmendo: 33816594
Segmento 33816594 escrito
$ ./write teste2
ID do segmendo: 33849363
Segmento 33849363 escrito
$
Meanwhile on TTY1:
.
.
.
Conteudo do segmento 33816594: teste1
Segmento 33816594 destruido
onteudo do segmento 33849363: teste2
Segmento 33849363 destruido
11. Final considerations
All the content presented so far is not even close to the whole amplitude that the subject has, a lot still has to be said and discussed about Distributed Systems and some topics like RPC, for time reasons, have not been discussed in this paper, what does not prevent that in another opportunity this and other aspects will be described.
On the other hand, the topics discussed here should help the reader to feel more familiar with the aspects of parallel programming and distributed systems, and I hope to have contributed to that.
This document was made in a short period of time, just a few days, so any mistake that may have been made here I ask for the reader's understanding and feel free to send me possible corrections or suggestions.
I hope this subject has aroused your interest, so this document has achieved its goal. Thank you.
12. References
- http://www.llnl.gov/computing/tutorials/pthreads/#Management
- http://www.dcmonster.com/pthread/
- "Sistemas Distribuidos, Desenvolvendo aplicacoes de alta performance no Linux" de Uira Ribeiro
13. Acknowledgments
- A minha gata kk, claro porque sem ela nao tem graca.
- Quero agradecer a todos na gotfault e tripbit
- sandimas, dx, vaxen, barros, lucas, f-117, illusion, spud,
- eniac, posidron, codak, feyman, roberto...
- Se eu esqueci de alguem va se acostumando, a vida e` cruel :)