`sqlite3` is a [[Python]] library for [[SQLite]]. It is included as a core Python library so there is no need to install anything. Alternatively, [[SQLAlchemy]] is a database-agnostic Python library for working databases. DuckDB ## Building a database Base level `sqlite3` functions for creating database, inserting data, and creating views. ```python import sqlite3 from sqlite3 import Error import sys # define paths sql_dir = 'sql' db_filepath = 'your_database.db' def create_table(conn, create_table_sql): """ Create a table from the create_table_sql statement :param conn: connection object :param create_table_sql: a CREATE TABLE statement :return: None """ try: c = conn.cursor() c.execute(create_table_sql) except Error as e: conn.close() print(e) def insert_data(conn, sql_file, data): """ Insert provided data into the specified table using a .sql file :param conn: connection object :param sql_file: a .sql file containing the sql command :param data: a list of data :return: th unique identifier for the data """ with open(sql_file, 'r') as file: sql = file.read() try: c = conn.cursor() c.execute(sql, data) except Error as e: conn.close() print(e) return c.lastrowid def create_view(conn, sql_file): """ Executes a sql file :param conn: connection object :param sql_file: a .sql file containing the sql command :return: None """ with open(sql_file, 'r') as file: sql = file.read() try: c = conn.cursor() c.execute(sql) except Error as e: conn.close() print(e) def create_database(conn, database_path, sql_dir): """ Creates a database :param conn: connection object :param database_path: a .db filepath :return: None """ # Create list of table files tables = [table for table in os.listdir(sql_dir) if table.startswith('create_table')] # Create each table from list for table in tables: print('Running {}'.format(table)) with open (os.path.join(sql_dir, table), 'r') as sql_file: sql = sql_file.read() create_table(conn, sql) conn.close() else: print ("Error! failed to create database connection.") with sqlite3.connect(db_filepath) as conn: # Read ddl from file ddl_filepath = os.path.join(sql_dir, 'insert_.sql') # Read data df = pd.read_csv('data.csv') # Insert data into database for _, record in df.iterrows(): data = tuple(record) insert_data(conn, ddl_filepath, data) # Create views view_desktop_results_sql = os.path.join(sql_dir, 'view_.sql') create_view(conn, view_desktop_results_sql) # Query data query = "SELECT * FROM table" # select all from 'table' cursor.execute(query) results = cursor.fetchall() for row in results: print(row) ``` Sqlite 3 does not have a boolean data type, rather stores True/False as 1/0. Maintain this behavior (as opposed to 'True', 'False' as text) as a standard. In pandas, convert boolean data types to int and then to boolean to block for reading as an object. (`.astype('int').astype('bool')`) ## Using a .bat file to initiate a database For some reason, between sessions my simple .bat file stopped working. It probably was finding a different install of python and missing the xlrd package (although I think both of my python installs include xlrd). Ideally, I'd be working in a conda environment and the .bat file would activate it first. Here's some code that might work: ``` call activate [my_env] python my_script.py call conda deactivate ``` #rough see NVAdminTools on EI old hard drive ## SQLite database from Spreadsheet To create a SQLite database from a spreadsheet, use [[Pandas/pandas]] to read the spreadsheet file and then use `sqlite3` to convert to a database. ```python import pandas as pd import sqlite3 # Paths to files data_filepath = os.path.join('local_indicators.xlsx') db_filepath = 'your_database.db' # will be created if not exists # Read data dfs = pd.read_excel(filepath, sheet_name=None) # Build database conn = sqlite3.connect(db_filepath) cursor = conn.cursor() for sheet_name, df in dfs.items(): df.to_sql(sheet_name, conn, if_exists='replace', index=False) # Close connection conn.close() ``` You can create an in-memory database using SQLite if you don't need to persist the database on disk using `sqlite3.connect(':memory:')`; however, [[DuckDB]] is more performant for in-memory database operations with Pandas. ```python ```