# Clean Code

## Overview

Topic:: [[Programming (MOC)]]
URL:: [Course Website](https://www.udemy.com/share/103MgS3@zCdPMnO8CacegkJ5UP2wz6glakSJLJues5RfFNZ4MnIPyMj3YEVUU0U01pmYnzKb/)

## Table of Contents

1. naming
2. structure and comments
3. functions
4. conditionals and error handling
5. classes and data structure

## Notes

### What is Clean Code

your code is like a story (Related:: [[Storytelling]]): something that is easily understood by humans, and not just by computers. it should be:

1. readable and meaningful
2. low in cognitive load
3. concise
4. short and with low complexity
5. in line with best practices
6. fun to write and maintain

clean code is an iterative process. while working on code, or while reviewing it, refactoring is likely to be necessary; it's not a linear "once and done" process.

#### Key Pain Points

the usual suspects where code becomes hard to understand:

1. naming - variables / functions / classes
2. structure and comments - formatting / comments
3. functions - length / parameters
4. conditionals and error handling - nesting / error handling
5. classes and data structure - missing distinction / bloated classes

### Naming

in general, **names should be meaningful**: others should be able to understand the code without diving deep into it.

1. **remove redundancy** - names should not include redundant information. For example, a class "user" shouldn't be called "user_with_age_and_name". what happens if we add gender as well? the name becomes too long and cumbersome.
2. **avoid slang, abbreviations and disinformation** - (like describing something as a list when it is a dict)
3. **choose distinctive names** - avoid creating functions with similar names
4. **be consistent** - if you use "get_user" in one part of the code, don't switch to "fetch_user" later on...

#### Variables and Constants

use nouns or short phrases with adjectives - "user_data", "is_valid"

a good description of **what is stored in them**

a variable can be:

1. an object (list/dict): "database", "list_of_states"
2. a number/string: "name", "age"
3. a boolean value: "is_active", "logged_in" - something that sounds like a question that the variable answers

examples:

| what is stored              | bad      | okay                  | good                  |
| --------------------------- | -------- | --------------------- | --------------------- |
| a user object (email, name) | u / data | person / user_data    | customer / user       |
| input validation (bool)     | v / val  | correct / valid_input | is_correct / is_valid |

#### Functions

use verbs or short phrases with adjectives - "send_data", "input_is_valid"

a good description of **what it does**

| what the function does | bad              | okay                 | good                 |
| ---------------------- | ---------------- | -------------------- | -------------------- |
| saves user data        | process / handle | save / store_data    | saveUser / user.save |
| validates input        | process / save   | validateSave / check | validate / is_valid  |

#### Classes

use nouns or short phrases with nouns - "User", "RequestBody"

a good description of **what it represents**

| what is the object | bad                | okay              | good                   |
| ------------------ | ------------------ | ----------------- | ---------------------- |
| user               | uEntity / objA     | userObj / appUser | User / Admin           |
| a database         | data / dataStorage | Db / class_db     | Database / SqlDatabase |

### Structure and Comments

#### Comments

comments are usually bad because:

1. they convey redundant information (which is obvious from the naming in the code)
2. they unnecessarily split the code into sections
3. they can be misleading (if the code is updated but the comment isn't)
4. commented-out code is redundant

good comments:

1. legal comments
2. explanations that can't be replaced by good naming
3. warnings
4. todo notes
5. documentation for external users

#### Code Formatting

there are two types of formatting:

1. Vertical - spaces between lines / grouping code
2. Horizontal - indentation, width, spaces within lines.

##### Vertical

* the file shouldn't be too long; if it is, consider splitting it into multiple scripts.
* add blank lines between different "sections" of the code.
* related objects should be close to each other (for example - all imports are grouped together)
* have a logical reading "flow", for example - functions should be defined before they are called.

##### Horizontal

* use indentation
* use multiple lines instead of one super long line
* don't use overly long variable names

### Functions

when writing a function, aside from naming, we should also consider:

1. the number and order of parameters
2. the length of the body of the function

#### Parameters

In general, you should try to minimize the number of parameters a function takes, so it is easier to call. As a rule of thumb, try to have no more than 3 params.

A possible solution is to construct a class which contains all the necessary parameters that are just passed along from one function to the other.

🔴 don't do this (since the order might be confusing)

```python
# save_func is assumed to be defined elsewhere
def my_func(email, password):
    save_func(email, password)

my_func("my_email", "my_password")
```

🟢 do this:

```python
class User:
    def __init__(self, email, password):
        self.email = email
        self.password = password

    def save(self):
        # save_func is assumed to be defined elsewhere
        save_func(self.email, self.password)

new_user = User("my_email", "my_password")
new_user.save()
```

if you need more than 3 parameters, try to use a dictionary instead:

🔴 don't do this

```python
def my_func(email, password, name):
    ...  # do something

my_func("my_email", "my_password", "my name")
```

🟢 do this:

```python
def my_func(userdata):
    ...  # do something

my_func(userdata={"email": "my_email", "password": "my_password", "name": "my name"})
```

beware of "output" parameters: cases where the argument you pass in is itself modified, instead of the function only returning a value. For example, a function that changes the dataframe it is given, or adds attributes to an object.

also, use default parameters when possible.

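
To make the last two points concrete, here is a minimal sketch contrasting an "output" parameter with a function that returns a new value, plus a default parameter. The function names, the `separator` parameter and the sample data are made up for illustration; they are not from the course.

```python
def add_full_name_in_place(user):
    """Output-parameter style: mutates the dict passed in and returns nothing."""
    user["full_name"] = f"{user['first_name']} {user['last_name']}"


def with_full_name(user, separator=" "):
    """Return a new dict and leave the argument untouched.

    `separator` has a default value, so most callers can omit it.
    """
    full_name = f"{user['first_name']}{separator}{user['last_name']}"
    return {**user, "full_name": full_name}


original = {"first_name": "Ada", "last_name": "Lovelace"}
updated = with_full_name(original)

print("full_name" in original)  # False: the input was not modified
print(updated["full_name"])     # Ada Lovelace
```

With the second version the caller can see from the assignment that something new was produced, while the first version changes `original` without any visible hint at the call site.
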
#### Function Body

Functions should only do **one thing**.

the "one thing" depends on the level of abstraction of the code. As a general rule, a function should contain a single piece of logic. For example, a function that takes a parameter, runs some validations, manipulates the data, and exports to csv is actually doing 3 things: validation, manipulation, export. Each one should be a different function.

Clean functions are also easier to test. (Related:: [[Testing]])

it's important to follow (Jump:: [[the DRY principle]]), i.e. don't repeat yourself. The same logic should live in a single function, so that it is easier to change/adapt when necessary.

we should keep functions pure, which means that **the same input will always result in the same output**. Functions shouldn't have side effects, which means manipulating the main environment of the script, for example changing a global variable. When a side effect is unavoidable, it should either be obvious from the function name, or be isolated in a dedicated function.

### Control Structures and Error Handling

the more your code is nested, the harder it is to understand, for example when each if/else statement contains further if/else statements. ways to get better control structures:

1. avoid deep nesting
2. use factory functions and polymorphism
3. prefer positive checks over negative ones
4. have error handling

#### Deep Nesting

one way to avoid deep nesting is by using *guards* and *failing fast*, meaning **adding parameter validations and assertions to the beginning of the code/function**. that way you will need fewer "if" statements later (because each param is clean and verified), and if validation fails, the code will fail or return nothing right at the beginning, instead of later.

#### Error Handling

by adding error handling (try/except), we can make sure our code runs smoothly, and we can make it cleaner by catching certain situations without too many nested if-else / patchwork solutions.

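
A minimal sketch of how guards and try/except can work together, assuming a made-up task of parsing an age from user input (`parse_age`, `describe_age` and the upper limit of 150 are illustrative, not from the course):

```python
def parse_age(raw_age):
    """Convert user input to an age, failing fast on bad values."""
    # Guard clauses: reject invalid input up front instead of nesting ifs.
    if raw_age is None:
        raise ValueError("age is required")
    if not str(raw_age).strip().isdigit():
        raise ValueError(f"age must be a whole number, got {raw_age!r}")

    age = int(raw_age)
    if age > 150:
        raise ValueError(f"age is out of range: {age}")

    # Past the guards, the "happy path" needs no further checks.
    return age


def describe_age(raw_age):
    """Handle the error in one place instead of scattering if/else checks."""
    try:
        return f"age is {parse_age(raw_age)}"
    except ValueError as error:
        return f"invalid input: {error}"


print(describe_age("42"))   # age is 42
print(describe_age("abc"))  # invalid input: age must be a whole number, got 'abc'
```

Each guard exits early, so the body of `parse_age` stays flat, and the caller decides in a single `except` block what to do with bad input.
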
#### Factory Functions & Polymorphism

polymorphic functions accept a certain kind of input, but their behavior depends on the concrete type of that input. For example, functions that accept "transactions", but process them differently based on whether it's a credit card or a cash transaction.

factory functions are functions that create class instances or objects.

🔴 don't do this

```python
def my_func(name):
    if name == "type_1":
        ...
    elif name == "type_2":
        ...
    elif name == "type_3":
        ...
```

🟢 do this:

```python
# func_1, func_2 and func_3 are assumed to be defined elsewhere
name_op_dict = {
    "type_1": func_1,
    "type_2": func_2,
    "type_3": func_3,
}

def my_func(name):
    # look up the handler for this name and call it
    result = name_op_dict[name]()
    return result
```

#### Hardcoding

try to avoid hardcoding. for example, if there is a value that is repeated a lot, turn it into a constant which is defined at the beginning of the code.

### Classes and Data Structure

classes can be very helpful for enabling polymorphism in your code, meaning you create a class for each type of "mission" your code has to handle, and then simply create one function that determines which class will be used.

in general, **classes should be small**. classes should be focused on one responsibility only (which is not the same as the "one thing" a function should do; a responsibility usually involves several methods). so it's better to have a lot of small classes than one big class.

Classes should have high **Cohesion**, which means that all your methods use most of the class properties.

classes should stick to the **law of demeter**, which states that code in a method should only interact with:

1. the object it belongs to (the class itself)
2. properties of that object (such as the class properties)
3. method parameters
4. objects created in the method

the purpose of the law of demeter is to make sure your code doesn't rely too much on the structure of other classes/objects, but rather is self-sufficient.

Classes should be built according to the (Jump:: **[[SOLID principle]]**)

## External Resources