Chapter 2 Types Operators and Expressions - Learning C

⌨️ ([03:39:39](https://www.youtube.com/watch?v=PaPN51Mm5qQ&t=13179s)) K&R 2: Types, Operators, and Expressions

What seems weird coming from Python?

![[Screenshot 2024-07-03 at 7.07.36 AM.png]]

+ Everything in Python is an object and you don't notice it
+ Float and double work well with modern languages

Division

![[Screenshot 2024-07-03 at 7.09.07 PM.png]]

See also [[Chapter 1 A Tutorial Introduction#C Program Structure | Program Structure]]

+ C and Unix were developed on byte-addressable computers
+ Previously, you could only load words, not characters
+ C became byte-oriented

## Endianness

+ Do you load words with the least significant bytes or the most significant bytes first?
+ Most computers were big-endian, but Intel microprocessors were little-endian from the start, so they could begin an addition on the low-order byte while the high-order bytes were still being loaded
+ Little-endian is about performance

![](https://www.youtube.com/watch?v=NcaiHcBvDR4)

+ Computers arrange memory in bytes
+ Endianness is the order in which a computer stores and reads the bytes of a multi-byte value in memory
+ How does your computer store values in bytes in memory?
+ We have an address for each of those bytes
+ You have a 32-bit value and you need to assign it to a location
+ You'll have to split it into 4 bytes and assign each piece to its own byte location
+ We need to assign the bytes to memory locations
+ ![[Screenshot 2024-07-03 at 7.29.53 PM.png]]![[Screenshot 2024-07-03 at 7.32.31 PM.png]]
+ The internet is big-endian
+ Machines need to agree
+ It matters when you transfer data between machines of different types
+ Translation often happens in software when communicating over an IP network

## Chapter 2 in K+R

+ Expressions: combine variables and constants to produce values
+ Underscore is useful for improving the readability of long variable names
+ Lower case for variable names, upper case for symbolic constants

### Data types in C:

There are only a few basic data types in C:

```
char     a single byte, capable of holding one character in the local character set.
int      an integer, typically reflecting the natural size of integers on the host machine.
float    single-precision floating point.
double   double-precision floating point.
```

In addition, there are a number of qualifiers which can be applied to `int`: `short`, `long`, and `unsigned`. `short` and `long` refer to different sizes of integers. `unsigned` numbers obey the laws of arithmetic modulo $2^n$, where _n_ is the number of bits in an `int`; `unsigned` numbers are always non-negative. The declarations for the qualifiers look like

```
short int x;
long int y;
unsigned int z;
```

+ Compilers interpret `short` and `long` differently depending on the hardware

On most modern platforms `int` is 32 bits long. `long` is 64 bits on 64-bit Linux and macOS but still 32 bits on 64-bit Windows. Even though modern computers can do 64-bit computations in a single instruction, using the shorter `int` type when appropriate saves memory and memory bandwidth.

### On different floating point sizes

Also if you look at the `float` and `double` types, you see different bit-sizes.
Even worse, each of these computers did floating point computation using slightly different hardware implementations, so the same code run on different computers would give _slightly different_ results and behave unpredictably on overflow, underflow, and other exceptional floating point operations. This was solved by the introduction of the IEEE 754 (1985) floating point format standard, which not only standardized the length of `float` and `double` but also ensured that the same set of floating point calculations would produce the _exact same result_ on different processors.

### Constants

Technically, a string is an array whose elements are single characters. The compiler automatically places the null character `\0` at the end of each such string, so programs can conveniently find the end. This representation means that there is no real limit to how long a string can be, but programs have to scan one completely to determine its length. The physical storage required is one more location than the number of characters written between the quotes. The function `strlen(s)`, shown under String length below, returns the length of a character string `s`, excluding the terminal `\0`.

Be careful to distinguish between a character constant and a string that contains a single character: `'x'` is not the same as `"x"`. The former is a single character, used to produce the numeric value of the letter _x_ in the machine's character set. The latter is a character string that contains one character (the letter _x_) and a `\0`.

## String length

```c
/* return length of s, not counting the terminating '\0' */
/* note: shares its name with the standard library function in <string.h> */
int strlen(char s[])
{
    int i;

    i = 0;
    while (s[i] != '\0')   /* scan until the null terminator */
        ++i;
    return i;
}
```

## Integer division and modulus

Integer division truncates any fractional part. The expression

```
x % y
```

produces the remainder when `x` is divided by `y`, and thus is zero when `y` divides `x` exactly. For example, a year is a leap year if it is divisible by 4 but not by 100, except that years divisible by 400 _are_ leap years.
Therefore

`if (year % 4 == 0 && year % 100 != 0 || year % 400 == 0)` _it's a leap year_ `else` _it's not_

The `%` operator cannot be applied to `float` or `double`.

## Structured Programming

The 1970s and 1980s saw a decline in the use of GOTO statements in favor of the [structured programming](https://en.wikipedia.org/wiki/Structured_programming "Structured programming") [paradigm](https://en.wikipedia.org/wiki/Programming_paradigm "Programming paradigm"), with GOTO criticized as leading to unmaintainable [spaghetti code](https://en.wikipedia.org/wiki/Spaghetti_code "Spaghetti code"). Some [programming style](https://en.wikipedia.org/wiki/Programming_style) coding standards, for example the GNU Pascal Coding Standards, recommend against the use of GOTO statements.

[Structured Programming](https://www.cc4e.com/book/chap02.md)

One of the great debates of the 1970s was how to use [Structured Programming](https://en.wikipedia.org/wiki/Structured_programming) to avoid any use of "goto" statements that lead to completely unreadable "spaghetti code". Structured code was easier to read, debug, and validate. Structured Programming advocated for if-then-else, else if, while-do loops, and do-while loops, where the loop exit test was at the top or bottom of the loop respectively. There was a move from [Flow Charts](https://en.wikipedia.org/wiki/Flowchart) with lines, boxes, and arrows to structured diagramming techniques like [Nassi–Shneiderman](https://en.wikipedia.org/wiki/Nassi-Shneiderman_diagram) diagrams that used nested boxes to emphasize the structured nature of the code.

The proponents of each approach tended to approach the problem based on the language they used. ALGOL and PASCAL programmers were strong advocates of structured programming and those languages had syntax that encouraged the approach. FORTRAN programmers had decades of flow-chart style thinking and tended to eschew full adoption of structured programming.
Kernighan and Ritchie chose a "middle path" and made it so C could support both approaches to avoid angering either side of the structured programming debate.

One area where the "structured programming" movement kept hitting a snag was implementing a loop that reads a file and processes data until it reaches the end of file. The loop must be able to handle an empty file or no data at all. There are three ways to construct a "read and process until EOF" loop, and none of them is ideal.

## GOTO considered Harmful

[GOTO Considered Harmful](https://homepages.cwi.nl/~storm/teaching/reader/Dijkstra68.pdf)

![[Screenshot 2024-07-04 at 4.11.02 PM.png]]

## Type Conversions

In the original (first edition) K&R C, all `float` values in an expression are converted to `double`, so all floating point arithmetic is done in double precision. Since ANSI C this is no longer required: arithmetic on `float` operands may be carried out in single precision.

## Increment and Decrement

The unusual aspect is that `++` and `--` may be used either as prefix operators (before the variable, as in `++n`), or postfix (after the variable: `n++`). In both cases, the effect is to increment `n`. But the expression `++n` increments `n` _before_ using its value, while `n++` increments `n` _after_ its value has been used. This means that in a context where the value is being used, not just the effect, `++n` and `n++` are different. For example, if `n` is 5, then `x = n++;` sets `x` to 5, but `x = ++n;` sets `x` to 6; in both cases `n` becomes 6.

## Prefix and Postfix

> [There are two other very important expression formats](https://runestone.academy/ns/books/published/pythonds/BasicDS/InfixPrefixandPostfixExpressions.html) that may not seem obvious to you at first. Consider the infix expression A + B. What would happen if we moved the operator before the two operands? The resulting expression would be + A B. Likewise, we could move the operator to the end. We would get A B +. These look a bit strange.
>
> These changes to the position of the operator with respect to the operands create two new expression formats, **prefix** and **postfix**. Prefix expression notation requires that all operators precede the two operands that they work on.
> Postfix, on the other hand, requires that its operators come after the corresponding operands. A few more examples should help to make this a bit clearer (see [Table 2](https://runestone.academy/ns/books/published/pythonds/BasicDS/InfixPrefixandPostfixExpressions.html#tbl-example1)).
>
> ![[Screenshot 2024-07-08 at 12.44.30 PM.png]]
>
> Consider these three expressions again (see [Table 3](https://runestone.academy/ns/books/published/pythonds/BasicDS/InfixPrefixandPostfixExpressions.html#tbl-parexample)). Something very important has happened. Where did the parentheses go? Why don’t we need them in prefix and postfix? The answer is that the operators are no longer ambiguous with respect to the operands that they work on. Only infix notation requires the additional symbols. The order of operations within prefix and postfix expressions is completely determined by the position of the operator and nothing else. In many ways, this makes infix the least desirable notation to use.

### Bitwise operations

![[Screenshot 2024-07-08 at 12.56.45 PM.png]]

C provides a number of operators for bit manipulation; these may not be applied to `float` or `double`.

```
&    bitwise AND
|    bitwise inclusive OR
^    bitwise exclusive OR
<<   left shift
>>   right shift
~    one's complement (unary)
```

The bitwise AND operator `&` is often used to mask off some set of bits; for example,

```
c = n & 0177;
```

sets to zero all but the low-order 7 bits of `n`. The bitwise OR operator `|` is used to turn bits on:

```
x = x | MASK;
```

sets to one in `x` the bits that are set to one in `MASK`.

You should carefully distinguish the bitwise operators `&` and `|` from the logical connectives `&&` and `||`, which imply left-to-right evaluation of a truth value. For example, if `x` is 1 and `y` is 2, then `x & y` is zero while `x && y` is one.

## Advice

With modern-day compilers and optimizers, you gain little performance by writing dense / obtuse code.
Write code that describes what you want done and let the compiler find the best way to do it. One of the reasons that writing a compiler was a common senior project in many Computer Science degrees was to make sure all Computer Science students understand that they can trust the compiler to generate great code.