`sqlite3` is a [[Python]] library for [[SQLite]]. It is included as a core Python library so there is no need to install anything.
Alternatively, [[SQLAlchemy]] is a database-agnostic Python library for working databases.
DuckDB
## Building a database
Base level `sqlite3` functions for creating database, inserting data, and creating views.
```python
import sqlite3
from sqlite3 import Error
import sys
# define paths
sql_dir = 'sql'
db_filepath = 'your_database.db'
def create_table(conn, create_table_sql):
"""
Create a table from the create_table_sql statement
:param conn: connection object
:param create_table_sql: a CREATE TABLE statement
:return: None
"""
try:
c = conn.cursor()
c.execute(create_table_sql)
except Error as e:
conn.close()
print(e)
def insert_data(conn, sql_file, data):
"""
Insert provided data into the specified table using a .sql file
:param conn: connection object
:param sql_file: a .sql file containing the sql command
:param data: a list of data
:return: th unique identifier for the data
"""
with open(sql_file, 'r') as file:
sql = file.read()
try:
c = conn.cursor()
c.execute(sql, data)
except Error as e:
conn.close()
print(e)
return c.lastrowid
def create_view(conn, sql_file):
"""
Executes a sql file
:param conn: connection object
:param sql_file: a .sql file containing the sql command
:return: None
"""
with open(sql_file, 'r') as file:
sql = file.read()
try:
c = conn.cursor()
c.execute(sql)
except Error as e:
conn.close()
print(e)
def create_database(conn, database_path, sql_dir):
"""
Creates a database
:param conn: connection object
:param database_path: a .db filepath
:return: None
"""
# Create list of table files
tables = [table for table in os.listdir(sql_dir)
if table.startswith('create_table')]
# Create each table from list
for table in tables:
print('Running {}'.format(table))
with open (os.path.join(sql_dir, table), 'r') as sql_file:
sql = sql_file.read()
create_table(conn, sql)
conn.close()
else:
print ("Error! failed to create database connection.")
with sqlite3.connect(db_filepath) as conn:
# Read ddl from file
ddl_filepath = os.path.join(sql_dir, 'insert_.sql')
# Read data
df = pd.read_csv('data.csv')
# Insert data into database
for _, record in df.iterrows():
data = tuple(record)
insert_data(conn, ddl_filepath, data)
# Create views
view_desktop_results_sql = os.path.join(sql_dir, 'view_.sql')
create_view(conn, view_desktop_results_sql)
# Query data
query = "SELECT * FROM table" # select all from 'table'
cursor.execute(query)
results = cursor.fetchall()
for row in results:
print(row)
```
Sqlite 3 does not have a boolean data type, rather stores True/False as 1/0. Maintain this behavior (as opposed to 'True', 'False' as text) as a standard. In pandas, convert boolean data types to int and then to boolean to block for reading as an object. (`.astype('int').astype('bool')`)
## Using a .bat file to initiate a database
For some reason, between sessions my simple .bat file stopped working. It probably was finding a different install of python and missing the xlrd package (although I think both of my python installs include xlrd). Ideally, I'd be working in a conda environment and the .bat file would activate it first. Here's some code that might work:
```
call activate [my_env]
python my_script.py
call conda deactivate
```
#rough see NVAdminTools on EI old hard drive
## SQLite database from Spreadsheet
To create a SQLite database from a spreadsheet, use [[Pandas/pandas]] to read the spreadsheet file and then use `sqlite3` to convert to a database.
```python
import pandas as pd
import sqlite3
# Paths to files
data_filepath = os.path.join('local_indicators.xlsx')
db_filepath = 'your_database.db' # will be created if not exists
# Read data
dfs = pd.read_excel(filepath, sheet_name=None)
# Build database
conn = sqlite3.connect(db_filepath)
cursor = conn.cursor()
for sheet_name, df in dfs.items():
df.to_sql(sheet_name, conn, if_exists='replace', index=False)
# Close connection
conn.close()
```
You can create an in-memory database using SQLite if you don't need to persist the database on disk using `sqlite3.connect(':memory:')`; however, [[DuckDB]] is more performant for in-memory database operations with Pandas.
```python
```