## Stacks
+ We start with empty stack, append, and take elements off
+ we can approximate this with a Python list
+ LIFO Queue
![[Screenshot 2024-07-25 at 7.08.19 AM.png]]
## Call by Value versus call by reference
+ Call by value - value copied into the function directly, the parameter has the copy of the function
+ Within the function we subtract 10, but in the main function it's unchanged
+ The outside world is unaffected
+ A copy is made on the stack frame
+ Call by reference - the stuff inside can affect the function
+ ![[Screenshot 2024-07-28 at 9.13.11 AM.png]]
+ At the first print statement, 42 is allocated
+ Then, the C runtime library allocated a stack frame
![[Screenshot 2024-07-28 at 9.14.19 AM.png]]
Within the function, we only see the top part of the stack. When we return the C runtime pops off the stack.
## Register variables
+ A register is 40x faster than regular memory, keep in a register which is faster
+ You can't get the memory address of a variable declared as a register
## Recursion
+ A function calling itself with multiple stack frames
## Basics of Functions
[Functions break large computing tasks](https://www.cc4e.com/book/chap04.md) into smaller ones, and enable people to build on what others have done instead of starting over from scratch. Appropriate functions can often hide details of operation from parts of the program that don't need to know about them, thus clarifying the whole, and easing the pain of making changes.
+ C programs generally consist of numerous small functions rather than a few big ones
### Grep implementation
Note: changed a bunch of stuff to satisfy compiler
```c
#include <stdio.h>
#define MAXLINE 1000
int get_line(char s[], int lim) /* get line into s, return length */
{
int c, i;
for (i = 0; i < lim - 1 && (c = getchar()) != EOF && c != '\n'; ++i)
s[i] = c;
if (c == '\n')
{
s[i] = c;
++i;
}
s[i] = '\0';
return (i);
}
int get_index(char s[], char t[]) /* return index of t in s, -1 if none */
{
int i, j, k;
for (i = 0; s[i] != '\0'; i++)
{
for (j = i, k = 0; t[k] != '\0' && s[j] == t[k]; j++, k++)
;
if (t[k] == '\0')
return (i);
}
return (-1);
}
int main(void) /* find all lines matching a pattern */
{
char line[MAXLINE];
while (get_line(line, MAXLINE) > 0)
if (get_index(line, "the") >= 0)
printf("%s", line);
}
```
## Order in functions
A program is just a set of individual function definitions. Communication between the functions is (in this case) by arguments and values returned by the functions; it can also be via external variables. The functions can occur in any order on the source file, and the source program can be split into multiple files, so long as no function is split.
_calling_ routine must state that `atof` returns a non-int value. The declaration is shown in the following primitive desk calculator (barely adequate for check-book balancing), which reads one number per line, optionally preceded by a sign, and adds them all up, printing the sum after each input.
```c
#include <stdio.h>
#define MAXLINE 100
main() /* rudimentary desk calculator */
{
double sum, atof();
char line[MAXLINE];
sum = 0;
while (get_line(line, MAXLINE) > 0)
printf("\t%.2f\n", sum += atof (line));
}
```
## Strings are passed by value with a trick to avoid copying
Pass by value: functions receive private copies of function
+ Functions can change items in arrays
+ structs are also passed by reference
+ Python follows this design decision for the same (efficiency) reason as C. Normal single variables like `int` or `float` are copied before being passed into a function and therefore are passed by value. Collections like `list` or `dict` are passed into functions by reference and so the contents can be changed within a function. Python strings are not copied when being passed into a function, but the way assignments happen in Python makes it seem like strings are passed by value (i.e. they cannot be modified). You can learn more with a bit of web research, but the easy way is to imagine that strings are passed by value in Python with a clever trick to avoid copying.
+ Because in Java, JavaScript, and PHP strings are objects, those languages can make sure that strings act as if they were passed by value and not passed by reference the way they are passed in C.
+ C made decisions on run-time based on getting the maximum performance out of hardware in the 1970's at the expense of making it too easy to write code that overwrites memory and leads to corrupted programs that have dangerous and undefined behavior.
+ Languages like PHP, Java, and JavaScript add a small amount of run-time overhead to do things like store the length of an array and make sure we programmers don't over reference the array and overwrite random bits of our program's code or data.
## Scope Rules
The _scope_ of a name is the part of the program over which the name is defined. For an automatic variable declared at the beginning of a function, the scope is the function in which the name is declared, and variables of the same name in different functions are unrelated. The same is true of the arguments of the function.
## Static variables
internal `static` variables provide private, permanent storage in a function
An external `static` variable is known within the remainder of the _source file_ in which it is declared, but not in any other file. External static thus provides a way to hide names like `buf` and `bufp` in the `getch`-`ungetch` combination, which must be external so they can be shared, yet which should not be visible to users of `getch` and `ungetch`, so there is no possibility of conflict. If the two routines and the two variables are compiled in one file, as
Only one copy is created and it is available to the whole program
Preserves values, not initialized at zero, etc.
![[Screenshot 2024-08-08 at 6.45.23 AM.png]]
![[Screenshot 2024-08-08 at 6.44.56 AM.png]]
## Recursion
A function calling intself directly or indirectly
When a function calls itself recursively, each invocation gets a fresh set of all the automatic variables, quite independent of the previous set.
**Recursion** generally provides no saving in storage, since somewhere a stack of the values being processed has to be maintained. Nor will it be faster. But recursive code is more compact, and often much easier to write and understand. Recursion is especially convenient for recursively defined data structures like trees; we will see a nice example in [Chapter 6](https://www.cc4e.com/book/chap06.md).
```c
#include <stdio.h>
int printd(int n)
{
int i;
if (n < 0)
{
printf(n);
putchar('-');
n = -n;
printf("%d\n", n);
}
if ((i = n / 10) != 0)
printd(i);
putchar(n % 10 + '0');
}
int main()
{
printd(10);
}
```
## Macro Preprocessor
In this section we are talking about the "pre-processor". It is probably a good time to talk a bit about why we use this terminology. For those of you with a Computer Science degree from "back in the day", many of you wrote a compiler as a senior project (just like I did). Building a compiler was a great project because part of the goal of Computer Science is to understand the technologies that make programming possible from the language syntax down to the hardware. The compiler that translates our source code into machine code is an essential part of that technology stack.
Early compilers for languages like the early FORTRAN variants tended to be "translators" - they translated code one line at a time from the high level language to the assembly language. You could think of early FORTRAN programs in the 1950's and 1960's as just more convienent ways to write assembly language for programmers that knew assembly language. You needed to be aware of assembly language and the translation that was going on to write fast FORTRAN. Programs were small and optimization was done at the FORTRAN level, often leading to some hard-to-understand code.
By the mid 1970's programming languages were based on parsing theory and we used what is called a "grammar" to define a language. Kernighan and Ritchie kept I/O statements out of the C language to keep its formal description (i.e. its grammar) as simple as possible. As these new languages emerged, they allowed for a more theoretical and powerful approach to converting source code to machine language.
Part of what made a "compiler generator" possible was the idea of a multi-step compiler - where the tasks of a compiler were broken down into a series of simpler and more well defined steps. Here are the steps of a typical C compiler in the 1970's:
- A preprocessor step that takes C code with syntax like `#define` and `#include` as its input and produces raw C code output with those instructions processed. The preprocessor was a C to C transformation.
- A parser step that took the raw C code, applied the grammar to the language and created what is called a "parse tree". Think of the tree as a hierarchy of statements, grouped into blocks, grouped into functions etc. - Things like loops are just nodes in the parse tree.
- A code generation step that turns the parse tree into some kind of simplistic internal code that expanded things like loops and if-then-else statements into code
- A code optimization step that looked at the internal code, and moved things around and eliminated any redundant computations (i.e. don't compute the same thing twice). This step is why the authors make such a big fuss about how there are times where C might do things in a slightly different order in an expression even in the presence of parenthesis. Remember the "K&R C Rearrangement License" back in Chapter 2? That rule removes constraints on the compiler's optimization step so it can generate the most efficient code.
- I would note that all of the steps up to this point do not depend in any way on the actual machine language of the system they were running on. This meant that a preprocessor, parser, code generator, and code optimizer could be written in C and used on any architecture.
- A "code generator" step takes the optimized intermediate code and generates the actual assembly and then machine language for the processor. For fun, you can add the `-s` parameter to your C compiler and see the resulting assembly language output for your system.
If you look at the machine language generated on an `Intel` or `AMD` processor and compare it to the machine language on an ARM processor, it will look very different.