# Automate the Boring Stuff with Python (course)
## Overview
Topic:: [[Python (MOC)]]
URL:: [Course Website]()
## Table of Contents
1. Python Basics
2. Flow control
3. Functions
4. Error handling
5. Lists
6. Dictionaries
7. Strings
8. Command line
9. Regular expressions
10. Files
11. Debugging
12. Web scraping
13. Excel, Word and PDF
14. Email
15. GUI automation
## Notes
### Basics
Python components:
1. **Expressions** - for example: 2+2. A piece of code that evaluates to a single value
2. **Functions** - a named group of lines that runs the same logic every time it is used. Functions are *called*
3. **Comments** - start with #. Python ignores everything after the # on that line
4. **Statements** - instructions that perform an action, such as an assignment or an if (they are not expressions, but might contain expressions).
### Flow Control
Comparison operators:
`<=`, `==`, `>=`, `!=`, `<`, `>`
Boolean operators:
`or`, `not`, `and`
#### If Statements
If statements are checked in order, and only one branch will run: once a condition is true, its block runs and all the remaining branches are skipped.
```python
x = 5
if x == 5:
    # this branch runs, because x is 5
    print("x is equal to 5")
elif x < 5:
    print("x is lower than 5")
else:
    # runs only when none of the conditions above is true
    print("x is either higher than 5 or something else happened")
```
#### While Loops
A loop that keeps running as long as its condition is true. Make sure to update the condition inside the loop so that it won't run forever.
```python
x = 0
while x < 5:
    if x == 1:
        x += 1
        continue  # skip the rest of this iteration and go back to the condition check
    print("the value of x is " + str(x))
    x += 1
    if x == 3:
        break  # stop the loop on this condition
```
#### For Loops
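A for loop performs an action for each item in a sequence, such as a range. With range(start, stop, step) you control where the range starts, where it stops (the stop value itself is not included), and the step size.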
```python
for item in range(10, 21, 2):
    print(item)  # prints 10, 12, 14, 16, 18, 20 (one per line)
```
### Functions
Defining a function is not the same as calling it. Functions are meant to reduce repetition and the chance of errors.
A function can have:
1. **Arguments** - values passed in each time the function is called (for example, which name to print)
2. **Optional arguments** - same as arguments, but with a default value, so they are not mandatory when calling the function
3. **A return value** - the function can return a value, which can be assigned to a variable
```python
def my_func():
    print("hello")
    print("my name is")
    print("idan")
my_func()

def func_with_argument(name):
    print("hello")
    print("my name is")
    print(name)
func_with_argument("idan")

def func_with_default_argument(name, greeting='hello'):
    print(greeting)
    print("my name is")
    print(name)
func_with_default_argument("idan")

def func_with_return(name, greeting='hello'):
    print(greeting)
    print("my name is")
    print(name)
    full_greeting = greeting + " my name is " + name
    return full_greeting
my_greeting = func_with_return(name='idan')
```
### Scope
A variable that is assigned outside of any function is part of the global scope and is available to all functions. A variable assigned inside a function is part of that function's local scope and is only available inside it (it is discarded when the function returns).
When a function uses a variable, Python first looks for it in the local scope and then moves up (out) if it is not found (for example, to an enclosing function and then to the global scope).
You can use "global my_var" inside a function so that assignments to my_var affect the global variable.
```python
eggs = 40
def my_func():
    eggs = 20  # a new local variable; the global eggs is untouched
    print(eggs)
my_func()  # prints 20
print(eggs)  # prints 40
def my_func2():
    global eggs
    eggs = 20  # rebinds the global eggs
    print(eggs)
my_func2()  # prints 20
print(eggs)  # prints 20
```
### Lists
A comma-separated sequence of objects inside square brackets, for example ints, strs, or variables.
```python
my_list = ["hey", "hello", "hi"]
# Lists can be indexed (zero based)
my_list[0]  # returns "hey"
# Or sliced:
my_list[:2]  # returns ["hey", "hello"]
# useful list methods
my_list.index("hello")  # find the first occurrence of an item - returns 1
my_list.append("hey there")  # add an item to the end of the list
my_list.insert(2, "hi")  # add an item at a given index
my_list.remove("hey")  # delete the first occurrence of an item
my_list.sort()  # sorts in place, numerically or alphabetically
```
### Mutability
Strings and tuples are immutable objects: you can't update them in place, so any "update" creates a new object that has to be assigned to a variable.
Lists, dictionaries, and DataFrames, for example, are mutable (they can be updated in place).
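A minimal sketch of the difference (the variable names are only for illustration): assigning to an index of a string fails, while the same operation on a list changes it in place.

```python
# Strings are immutable: item assignment raises TypeError.
greeting = "hello"
try:
    greeting[0] = "j"
    string_was_changed = True
except TypeError:
    string_was_changed = False

# "Updating" a string really builds a new object from the old one.
new_greeting = "j" + greeting[1:]

# Lists are mutable: the same object is changed in place.
letters = ["h", "e", "l", "l", "o"]
letters[0] = "j"

print(string_was_changed)  # False
print(new_greeting)        # jello
print(letters)             # ['j', 'e', 'l', 'l', 'o']
```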
### References
A variable holds a reference to an object, not the object itself. Assigning one variable to another copies the reference, so both names point to the same object. With immutable objects this is never noticeable, but with mutable objects a change made through one name is visible through the other. For example:
```python
str_a = "pizza"
str_b = str_a
str_b = "hello"
```
In this case, the values of str_a and str_b are different: strings are immutable, so the last line points str_b at a new object instead of changing "pizza". For mutable objects, however, an in-place update through one name is visible through the other.
```python
my_list = ["a", "b", "c"]
new_list = my_list
new_list.append("d")
print(new_list)
print(my_list)
# both lists will have the added "d"
```
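To avoid this, create a copy with the list's "copy" method (or with copy.deepcopy from the copy module when the list contains other mutable objects).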
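A short sketch of copying, assuming a flat list and a nested list as examples: a shallow copy is enough for the flat list, while the nested list needs copy.deepcopy to be fully independent.

```python
import copy

my_list = ["a", "b", "c"]
new_list = my_list.copy()  # shallow copy: a new list object
new_list.append("d")

nested = [["a", "b"], ["c"]]
shallow = copy.copy(nested)    # new outer list, inner lists are still shared
deep = copy.deepcopy(nested)   # inner lists are copied as well
nested[0].append("x")

print(my_list)   # ['a', 'b', 'c']
print(new_list)  # ['a', 'b', 'c', 'd']
print(shallow)   # [['a', 'b', 'x'], ['c']]
print(deep)      # [['a', 'b'], ['c']]
```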
### Dictionaries
A collection of key-value pairs. Values are accessed by key rather than by position; since Python 3.7, dictionaries keep their insertion order.
```python
my_dict = {"age": 25, "name": "johnas", "gender": "male"}
# You can access (or loop over) the dictionary using
my_dict.keys()  # for "age", "name", "gender"
my_dict.values()  # for 25, 'johnas', 'male'
my_dict.items()  # for ("age", 25), ("name", "johnas"), ("gender", "male")
# useful methods
my_dict.get("school", "no school in dict")  # returns the value if the key exists, otherwise the given default
my_dict.setdefault("age", 15)  # if the key is missing, adds it with the default value; returns the key's value
```
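One common use of setdefault is counting. The sketch below, with an arbitrary example message, counts how often each character appears and then reads a missing key safely with get.

```python
message = "hello world"
char_count = {}
for char in message:
    char_count.setdefault(char, 0)  # add the key with 0 only if it is missing
    char_count[char] += 1

# get() reads a value without raising KeyError for a missing key
missing = char_count.get("z", 0)

for char, count in char_count.items():
    print(char, count)
```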
### Strings
Strings are list-like objects, which means:
```python
my_str = "hello"
# they can be indexed and sliced
hel = my_str[:3]  # "hel"
# "in" works on strings
"hello" in my_str  # True
```
Escape characters let you add characters that would otherwise be problematic (such as quotes) or add functionality (such as \n for a new line) to your string.
```python
my_str = 'this is carol\'s cat'
# a raw string (r prefix) treats the backslash as part of the string
my_str = r'this is carol\'s cat'
```
For very long or multi-line strings, you can use triple quotes:
```python
my_long_str = """this is a very lonnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnng string"""
```
Useful methods:
```python
my_str = "Hello"
# boolean testing
my_str.isalpha()  # only letters
my_str.isalnum()  # letters and digits only
my_str.isdecimal()  # only digits
my_str.startswith("h")  # False here, the check is case-sensitive
my_str.endswith("o")
# str manipulation
new_str = my_str.upper()  # capitalize all
new_str = my_str.lower()  # uncapitalize all
new_str = ' '.join(['one', 'two', 'three'])  # list to a single str
new_list = 'hello world'.split(" ")  # from str to list
new_str = my_str.rjust(20, "*")  # right-justifies: pads on the left until the str is at this length
new_str = my_str.ljust(20, "*")  # left-justifies: pads on the right until the str is at this length
new_str = my_str.strip()  # removes whitespace from both sides of the str
new_str = my_str.replace("H", "J")  # replaces every occurrence of a substring (case-sensitive)
```
String formatting (injecting values into a string):
```python
place = "home"
time = "noon"
food = "pizza"
new_str = "hey, we are meeting at {} at {}, don't forget to bring a {}".format(place, time, food)
```
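As an alternative that these notes do not cover, Python 3.6+ also offers f-strings, and format accepts named placeholders. A small sketch reusing the variables from the example above:

```python
place = "home"
time = "noon"
food = "pizza"

with_format = "we are meeting at {} at {}, bring a {}".format(place, time, food)
# f-string: the variables are written directly inside the braces
with_fstring = f"we are meeting at {place} at {time}, bring a {food}"
# named placeholders make the order explicit
with_names = "we are meeting at {where} at {when}".format(where=place, when=time)

print(with_fstring)  # we are meeting at home at noon, bring a pizza
```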
### Regex
```python
import re
# pattern to look for
message = "hey, my number is 111-3333-111"
pattern = re.compile(r'\d\d\d-\d\d\d\d-\d\d\d')
match = pattern.search(message)  # a match object, or None if nothing was found
result = match.group()  # '111-3333-111'
# to look for all matches
matches = pattern.findall(message)
# to look for one option out of several - the pipe character "|"
re.compile(r'Bat(man|mobile|copter)')
# zero or one time (optional) - "?"
re.compile(r'Bat(wo)?man')
# one or more times - "+"
# zero or more times - "*"
# a defined number of times - {n}, or {3,5} for 3 to 5 times
```
By default, regex matches are "greedy": they try to find the longest string that matches the pattern. We can make the regex non-greedy (match the shortest possible string) by adding a "?" after the quantifier, for example:
```python
pattern = re.compile(r'\d{3,5}?')
```
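To see the difference, here is a small sketch that runs a greedy and a non-greedy pattern on the same input (the sample strings are made up for illustration):

```python
import re

digits = "1234567"
greedy_match = re.compile(r'\d{3,5}').search(digits).group()
non_greedy_match = re.compile(r'\d{3,5}?').search(digits).group()
print(greedy_match)      # 12345
print(non_greedy_match)  # 123

html = "<b>bold</b> and <i>italic</i>"
greedy_tag = re.search(r'<.*>', html).group()  # runs to the last ">"
lazy_tag = re.search(r'<.*?>', html).group()   # stops at the first ">"
print(greedy_tag)  # <b>bold</b> and <i>italic</i>
print(lazy_tag)    # <b>
```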
findall returns a list of strings when the pattern has zero or one group, or a list of tuples when there are two or more groups. Unlike search, findall doesn't return a match object but a list.
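A quick sketch of how the number of groups changes what findall returns, using a made-up message with two phone numbers:

```python
import re

message = "call 111-3333-111 or 222-4444-222"

# no groups: a list of the full matches
no_groups = re.compile(r'\d{3}-\d{4}-\d{3}').findall(message)
# one group: a list of that group's text only
one_group = re.compile(r'(\d{3})-\d{4}-\d{3}').findall(message)
# two groups: a list of tuples, one item per group
two_groups = re.compile(r'(\d{3})-(\d{4})-\d{3}').findall(message)

print(no_groups)   # ['111-3333-111', '222-4444-222']
print(one_group)   # ['111', '222']
print(two_groups)  # [('111', '3333'), ('222', '4444')]
```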
```python
\d # digit characters
\w # word characters (letters, digits and underscore)
\s # whitespace characters (space, tab, new line)
^ # must be at the start of the string
$ # must be at the end of the string
. # wildcard, any character except a new line
() # create groups (return a partial match from your pattern). for example: r'first name: (\w{3,5})'
   # will match the entire pattern, but group(1) will return only the first name
# to use the literal version (for example - "." as a dot, not a wildcard),
# you need to add \, so:
\.
```
The capitalized versions match the opposite (so \D matches any non-digit character).
You can create a custom character class: r'[aeiou]', for example, will match any lowercase vowel. Add ^ right after the opening square bracket to negate the class (r'[^aeiou]' matches any character that is not a lowercase vowel).
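A small sketch of these classes in action (the sample strings are arbitrary):

```python
import re

digits = re.findall(r'\d', "Agent 007")          # each digit on its own
non_digits = re.findall(r'\D+', "Agent 007")     # runs of non-digit characters
vowels = re.findall(r'[aeiou]', "robocop")       # custom class
consonants = re.findall(r'[^aeiou]', "robocop")  # negated custom class

print(digits)      # ['0', '0', '7']
print(non_digits)  # ['Agent ']
print(vowels)      # ['o', 'o', 'o']
print(consonants)  # ['r', 'b', 'c', 'p']
```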
Additional flags:
re.DOTALL - "." will also match new lines
re.I (re.IGNORECASE) - case-insensitive matching (upper and lower case are treated the same)
You can also use the re.VERBOSE flag to make your regex more readable: whitespace inside the pattern is ignored and comments are allowed.
```python
pattern = re.compile(r'''  # this regex is for a phone number
    \d{3}-  # first 3 digits for state
    \d{5}-  # main 5 digits for city
    \d{3}   # last 3 digits for household
''', re.VERBOSE)
```
Since re.compile accepts only one flags argument, combine multiple flags with the bitwise OR operator "|":
re.VERBOSE | re.IGNORECASE ...
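A minimal sketch of combining flags, assuming a made-up "agent" pattern; it also shows the effect of re.DOTALL on the "." wildcard.

```python
import re

pattern = re.compile(r'''
    agent   # the literal word, in any case thanks to IGNORECASE
    \s+     # one or more whitespace characters
    (\w+)   # the name that follows
''', re.VERBOSE | re.IGNORECASE)

name = pattern.search("my name is AGENT Boris").group(1)
print(name)  # Boris

# Without DOTALL "." stops at the line break; with it, "." crosses lines.
first_line_only = re.search(r'.*', "line one\nline two").group()
all_lines = re.search(r'.*', "line one\nline two", re.DOTALL).group()
```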
#### The sub Method
Replaces every match with a different string.
```python
import re
pattern = re.compile(r'agent \w+')
message = 'hey, my name is agent Boris'
pattern.sub('REDACTED', message)  # 'hey, my name is REDACTED'
# use the match in the "sub" - for example, keep only the first letter of the name
pattern = re.compile(r'agent (\w)\w*')
pattern.sub(r'agent \1', message)  # 'hey, my name is agent B' (raw string, so \1 refers to group 1)
```
### File Management
#### The Os Package
A useful package for managing file paths and folders.
```python
import os
os.path.join("my_folder", "nested_folder", "file_name.csv")
# print current working directory
os.getcwd()
# choose the work directory
os.chdir("my_path")
# .. - goes one level up
# extract the folder path for a given file
os.path.dirname("my_path")
# extract file name from a given path
os.path.basename("my_path")
# check if a file exists
os.path.exists("my_path")
# create new folder
os.makedirs("new_folder_path")
```
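The calls above can be tried safely inside a temporary folder. This is only a sketch: the tempfile module and os.path.splitext are not part of the notes, and the folder and file names are placeholders.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base_dir:
    # build a nested path and create the folders
    nested_dir = os.path.join(base_dir, "my_folder", "nested_folder")
    os.makedirs(nested_dir, exist_ok=True)

    file_path = os.path.join(nested_dir, "file_name.csv")
    exists_before = os.path.exists(file_path)
    with open(file_path, "w") as f:
        f.write("a,b\n1,2\n")
    exists_after = os.path.exists(file_path)

    folder = os.path.dirname(file_path)       # everything before the file name
    name = os.path.basename(file_path)        # just the file name
    stem, extension = os.path.splitext(name)  # split off the extension

print(exists_before, exists_after)  # False True
print(name, stem, extension)        # file_name.csv file_name .csv
```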
#### Reading Text File
```python
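# read the whole file as a single string
with open("file_path") as my_text_file:
    content = my_text_file.read()
# read the file again as a list of lines (after read() the file pointer is at the end)
with open("file_path") as my_text_file:
    content_by_line = my_text_file.readlines()
# the "with" block closes the file automatically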
```
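Writing works the same way with a mode argument. The sketch below assumes a temporary folder and a made-up file name; it writes, appends, and reads the file back, and shows why a second read() returns nothing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, "notes.txt")

    with open(file_path, "w") as f:  # "w" creates or overwrites the file
        f.write("first line\n")
    with open(file_path, "a") as f:  # "a" appends to the end
        f.write("second line\n")

    with open(file_path) as f:       # default mode is "r" (read)
        content = f.read()
        leftover = f.read()          # empty: the pointer is already at the end
        f.seek(0)                    # go back to the start of the file
        lines = f.readlines()

print(lines)  # ['first line\n', 'second line\n']
```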
#### Excel Files
```python
import openpyxl
workbook = openpyxl.load_workbook("file_path")
sheets = workbook.sheetnames  # older versions: workbook.get_sheet_names()
sheet = workbook["sheetname"]  # older versions: workbook.get_sheet_by_name("sheetname")
new_sheet = workbook.create_sheet()
new_sheet.title = "my_new_sheet_name"
workbook.save("file_path")
```
#### PDFs
```python
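import PyPDF2
pdf_file = open("file_path", 'rb')
reader = PyPDF2.PdfReader(pdf_file)  # PyPDF2 1.x (used in the course): PyPDF2.PdfFileReader(pdf_file)
page_text = reader.pages[0].extract_text()  # PyPDF2 1.x: reader.getPage(0).extractText()
pdf_file.close()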
```
#### Word
```python
import docx  # package name: python-docx
doc = docx.Document("file_path")
doc.paragraphs[0].text
doc.add_paragraph("text")
doc.save("file_path")
```
The text of each paragraph is split into "runs": sections that end whenever the style changes (bold, underline, italic...).
### Debugging
#### Assertions
You can use assert statements and raise statements to provide custom errors in your code.
```python
def my_func(num_of_states):
    assert num_of_states < 51, "there are too many states!"
```
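A sketch of both approaches side by side. The function names and the age check are hypothetical; note that assert statements are skipped when Python runs with the -O flag, so raise is the safer choice for checks that must always run.

```python
def check_states(num_of_states):
    # assert: a sanity check aimed at the programmer
    assert num_of_states < 51, "there are too many states!"
    return num_of_states

def set_age(age):
    # raise: an explicit error for invalid input
    if age < 0:
        raise ValueError("age cannot be negative")
    return age

try:
    check_states(60)
    assertion_message = None
except AssertionError as error:
    assertion_message = str(error)

try:
    set_age(-1)
    value_error_message = None
except ValueError as error:
    value_error_message = str(error)

print(assertion_message)    # there are too many states!
print(value_error_message)  # age cannot be negative
```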
#### Logging
Log messages to the console or to a file so that you have more information on the actions and progress of your code.
```python
import logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', filename="my_log.txt", datefmt='%Y-%m-%d %H:%M:%S')
logging.info("this is my message")
# logging levels, from lowest to highest
"""
debug
info
warning
error
critical
"""
logging.disable(logging.DEBUG)  # suppresses all messages at this level or lower
```
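To see how the level threshold filters messages, here is a sketch that logs into an in-memory stream instead of a file. The logger name, the stream, and the handler setup are assumptions made for this demo and are not part of the notes.

```python
import io
import logging

log_stream = io.StringIO()
handler = logging.StreamHandler(log_stream)
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))

logger = logging.getLogger("my_demo_logger")  # hypothetical logger name
logger.setLevel(logging.INFO)                 # ignore anything below INFO
logger.addHandler(handler)
logger.propagate = False                      # keep the demo output out of the root logger

logger.debug("hidden: below the INFO threshold")
logger.info("shown")
logger.error("shown too")

logged_lines = log_stream.getvalue().splitlines()
print(logged_lines)  # ['INFO - shown', 'ERROR - shown too']
```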
#### Debugging
1. **Over** - execute the current line and move to the next action or statement (for example, the next line of code or the next function definition) without going inside function calls
2. **Step in** - go inside a function call
3. **Step out** - skip to the return of the current function
4. **Go** - run the code until the next breakpoint (or the end of the script)
5. **Quit** - terminate the run

A breakpoint makes Python run up to that line and pause there.
### Web Scraping
Opening URLs in the browser:
```python
import webbrowser
webbrowser.open("https://www.mysite.com")
```
Downloading pages with requests:
```python
import requests
res = requests.get("my_url")
res.raise_for_status()  # raises an exception if the download failed
res.text
# you can also parse html with beautiful soup
import bs4
soup = bs4.BeautifulSoup(res.text, "html.parser")
elements = soup.select("my_css_element")
elements[0].text.strip()
```
If you need to fill out logins or search bars online, you can control a browser with selenium:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By
browser = webdriver.Firefox()
browser.get("url")
# older selenium versions: browser.find_element_by_css_selector("css_id")
element = browser.find_element(By.CSS_SELECTOR, "css_id")
element.click()
element.send_keys("insert text here")
element.submit()
browser.quit()
```
### Emails
#### Sending
```python
import smtplib
conn = smtplib.SMTP("smtp.gmail.com", 587)
conn.ehlo()
conn.starttls()
conn.login(user='my_email@example.com', password='1234')
# sendmail returns a dictionary of failed recipients; if it's empty, the email was sent to everyone
conn.sendmail(from_addr='mygmailadd', to_addrs='recipient@example.com',
              msg='Subject: my email title\n\n Hello dear User\n How are you doing?\n Best of luck\n Idan\n\n')
conn.quit()
```
#### Reading
```python
import imapclient
import pyzmail
conn = imapclient.IMAPClient('imap.gmail.com', ssl=True)
conn.login(username='my_email@example.com', password='password')
conn.select_folder('INBOX', readonly=True)
UIDs = conn.search(['SINCE 20-Aug-2015'])  # see imap documentation for more search options
message_id = UIDs[0]  # pick one of the returned message ids
raw_message = conn.fetch([message_id], ['BODY[]', 'FLAGS'])
message = pyzmail.PyzMessage.factory(raw_message[message_id][b'BODY[]'])
message.get_subject()
message.get_addresses('from')
message.get_addresses('to')
message.text_part.get_payload().decode('UTF-8')
conn.logout()
```
### GUI Automation
Since controlling your mouse could be dangerous (you can't use your computer while the script is running), there is a failsafe: the program stops automatically if you move your mouse to the top-left corner of the screen.
```python
import pyautogui
# control the mouse
res_width, res_height = pyautogui.size()
pyautogui.moveTo(x=5,y=10) # move to position
pyautogui.moveRel(5,10) # move to a relative (offset) position
pyautogui.click(x=300, y=30) # click on position
# control keyboard
pyautogui.typewrite("hello world")
pyautogui.press("F1")
pyautogui.hotkey("ctrl", "o")
# search for component
pyautogui.screenshot("save_image_to_path")
pyautogui.locateCenterOnScreen("my_image_path")
```
## External Resources