### PostgreSQL Schema for E-commerce Platform
Here are some questions designed to challenge you to query an E-commerce database effectively and discover the embedded trends:
### Basic Questions
1. How many products are there in each category?
2. List the usernames and emails of customers who have made more than 3 orders.
3. What is the average rating for products in the 'Electronics' category?
4. How many orders are in each status (Pending, Shipped, Delivered)?
5. Which product has the highest stock quantity?
### Intermediate Questions
6. What is the total revenue generated from the 'Electronics' category?
7. List the top 5 most ordered products.
8. Which customers have never left a review?
9. What is the average cart quantity for each customer?
10. On which date were the most orders placed?
### Advanced Questions
11. What is the average order value for each customer?
12. Which products have never been ordered?
13. What is the total revenue for each month of the past year?
14. List the customers who have both ordered 'Electronics' and left a review for an 'Electronics' product.
15. What is the correlation between product price and average rating in the 'Electronics' category?
---
# Exercise Setup
Here's the PostgreSQL schema for the E-commerce Platform, designed to give you practice in querying data.
```sql
-- Customers Table
CREATE TABLE Customers (
customer_id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(50) UNIQUE NOT NULL,
password VARCHAR(50) NOT NULL
);
-- Categories Table
CREATE TABLE Categories (
category_id SERIAL PRIMARY KEY,
name VARCHAR(50) NOT NULL
);
-- Products Table
CREATE TABLE Products (
product_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
price DECIMAL(10, 2) NOT NULL,
stock_quantity INT NOT NULL,
category_id INT REFERENCES Categories(category_id)
);
-- Orders Table
CREATE TABLE Orders (
order_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES Customers(customer_id),
order_date DATE NOT NULL,
status VARCHAR(50) NOT NULL
);
-- OrderDetails Table (Many-to-Many relationship between Orders and Products)
CREATE TABLE OrderDetails (
order_id INT REFERENCES Orders(order_id),
product_id INT REFERENCES Products(product_id),
quantity INT NOT NULL,
PRIMARY KEY (order_id, product_id)
);
-- Cart Table
CREATE TABLE Cart (
cart_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES Customers(customer_id),
product_id INT REFERENCES Products(product_id),
quantity INT NOT NULL
);
-- Reviews Table
CREATE TABLE Reviews (
review_id SERIAL PRIMARY KEY,
customer_id INT REFERENCES Customers(customer_id),
product_id INT REFERENCES Products(product_id),
rating INT CHECK (rating >= 1 AND rating <= 5),
review_text TEXT
);
```
#### Concepts Covered
- **One-to-Many Relationships**:
- `Customers` to `Orders`
- `Customers` to `Cart`
- `Customers` to `Reviews`
- `Categories` to `Products`
- **Many-to-Many Relationships**:
- `Orders` to `Products` through `OrderDetails`
- **Data Integrity and Constraints**:
- Unique constraints on `username` and `email` in `Customers`
- Check constraints on `rating` in `Reviews`
This schema should provide a comprehensive foundation for practicing various SQL queries, including JOIN operations, aggregations, and conditional queries.
---
Below is a Python script that uses the `Faker` library to generate fake but realistic data for the E-commerce Platform schema. I've also included some trends for you to discover.
First, you'll need to install the `Faker` library and `psycopg2` for PostgreSQL interaction:
```bash
pip install Faker psycopg2
```
Here's the Python script:
```python
import random
from faker import Faker
import psycopg2
fake = Faker()
# Connect to PostgreSQL
conn = psycopg2.connect(
host="localhost",
database="ecommerce_db",
user="your_username",
password="your_password"
)
cursor = conn.cursor()
# Generate Categories
categories = ['Electronics', 'Clothing', 'Home & Garden', 'Books', 'Toys']
for category in categories:
cursor.execute("INSERT INTO Categories (name) VALUES (%s)", (category,))
# Generate Products with a trend of Electronics being more expensive
for _ in range(100):
name = fake.unique.first_name()
price = round(random.uniform(10.0, 100.0), 2)
stock_quantity = random.randint(1, 100)
category_id = random.randint(1, 5)
if category_id == 1: # Electronics
price += 50
cursor.execute("INSERT INTO Products (name, price, stock_quantity, category_id) VALUES (%s, %s, %s, %s)",
(name, price, stock_quantity, category_id))
# Generate Customers
for _ in range(50):
username = fake.unique.user_name()
email = fake.unique.email()
password = fake.password()
cursor.execute("INSERT INTO Customers (username, email, password) VALUES (%s, %s, %s)",
(username, email, password))
# Generate Orders with a trend of more orders from customer_id 1 to 10
for _ in range(200):
customer_id = random.randint(1, 50)
if customer_id <= 10:
customer_id = random.randint(1, 10)
order_date = fake.date_between(start_date='-1y', end_date='today')
status = random.choice(['Pending', 'Shipped', 'Delivered'])
cursor.execute("INSERT INTO Orders (customer_id, order_date, status) VALUES (%s, %s, %s)",
(customer_id, order_date, status))
# Generate OrderDetails
for _ in range(400):
order_id = random.randint(1, 200)
product_id = random.randint(1, 100)
quantity = random.randint(1, 5)
cursor.execute("INSERT INTO OrderDetails (order_id, product_id, quantity) VALUES (%s, %s, %s)",
(order_id, product_id, quantity))
# Generate Cart
for _ in range(100):
customer_id = random.randint(1, 50)
product_id = random.randint(1, 100)
quantity = random.randint(1, 3)
cursor.execute("INSERT INTO Cart (customer_id, product_id, quantity) VALUES (%s, %s, %s)",
(customer_id, product_id, quantity))
# Generate Reviews with a trend of higher ratings for Electronics
for _ in range(100):
customer_id = random.randint(1, 50)
product_id = random.randint(1, 100)
rating = random.randint(1, 5)
review_text = fake.sentence()
if product_id <= 20: # Assuming first 20 products are Electronics
rating = random.randint(3, 5)
cursor.execute("INSERT INTO Reviews (customer_id, product_id, rating, review_text) VALUES (%s, %s, %s, %s)",
(customer_id, product_id, rating, review_text))
# Commit and close
conn.commit()
cursor.close()
conn.close()
```
Replace `your_username` and `your_password` with your PostgreSQL username and password.
This script will generate:
- 5 categories
- 100 products with Electronics generally being more expensive
- 50 customers
- 200 orders with more orders from customers with IDs 1-10
- 400 order details
- 100 cart entries
- 100 reviews with higher ratings for Electronics
Run this script to populate your PostgreSQL database with fake but realistic data, including some trends that are asked about in the questions above.