[Google Cloud Platform](https://cloud.google.com/?hl=en) (GCP) is a [[cloud computing]] platform offering services for compute, storage, big data and machine learning including [[BigQuery]], [[Vertex AI]], and [[Cloud Run]] among others.
To use GCP, set up an account in the [Google Cloud Console](https://console.cloud.google.com/welcome?hl=en) with your Gmail account.
## billing
Add a credit card to enable pay-as-you-go services. Set a budget to alert you about high spending. I have a $10 per month budget which is more than sufficient for my needs. I like to get an alert based on forecasted spend of 100% of budget.
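A budget like this can also be created from the CLI. A sketch, assuming the gcloud CLI is installed and authenticated; the billing account ID and display name are placeholders:

```bash
# Create a $10/month budget that alerts at 100% of forecasted spend
# (billing account ID below is a placeholder)
gcloud billing budgets create \
  --billing-account=0X0X0X-0X0X0X-0X0X0X \
  --display-name="monthly-budget" \
  --budget-amount=10USD \
  --threshold-rule=percent=1.0,basis=forecasted-spend
```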
> [!Warning]
> Budgets are for alerts only! GCP will not stop your cloud services when you reach your budget. You can still get a huge bill if you mis-provision a resource or run the wrong query. Don't get a [$10,000 bill in just 22 seconds](https://medium.com/data-engineer-things/bigquerys-ridiculous-pricing-model-cost-us-10-000-in-just-22-seconds-7d52e3e4ae60).
The [Google Cloud Resource Manager](https://cloud.google.com/resource-manager) helps manage billing, permissions, and service enablement for resources in projects on GCP. You can also set quotas on rates of use and allocation of resources to limit potential billing spikes. (Google sets its own default quotas; to increase a quota, submit a request to Google.) See the **Billing Report** and **Cost Table** for current and forecasted costs.
If you have multiple resources, you can label each with a key-value pair to better track usage and billing (you can label each resource with multiple labels, up to 64).
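For example, labeling a Compute Engine VM from the CLI might look like this (instance name, zone, and labels are hypothetical):

```bash
# Attach key-value labels to an existing instance for billing breakdowns
gcloud compute instances add-labels my-vm \
  --zone=us-central1-a \
  --labels=env=dev,team=analytics
```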
Billing is typically per second (minimum one minute), but varies **widely** by the resources you've provisioned and whether you're using a Marketplace solution. Helpfully, GCP will email you rightsizing recommendations based on your usage.
## compute
GCP is a good option for securing compute necessary to run ML/AI workflows or simply to run a [[virtual machine]] configured to your needs.
- [[Compute Engine]]: standard service for spinning up a virtual machine
- [[Cloud Functions]]: run scripts in the cloud
- [[Kubernetes]]: you don't need Kubernetes
- [[Cloud Run]]: scalable solution for web apps
## Compute Engine
Compute Engine is a service for spinning up a [[virtual machine]] (VM) on [[Google Cloud Platform]]. Depending on the number of CPUs and amount of RAM, virtual machines can cost from around $8 per year to a few hundred dollars per year.
A VM on Compute Engine can be configured with various options for vCPU, RAM, [[disk]], networking and operating system (OS) image. Pre-configured [machine types](https://cloud.google.com/compute/docs/machine-types) are available for general computing, memory-optimized, compute-optimized and accelerator-optimized machines. Newer generations of Compute Engine will have newer CPUs and may cost more.
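Creating a VM with a specific machine type can be sketched with `gcloud` (instance name and zone are placeholders):

```bash
# Spin up a small general-purpose VM with a Debian image
gcloud compute instances create my-vm \
  --zone=us-central1-a \
  --machine-type=e2-medium \
  --image-family=debian-12 \
  --image-project=debian-cloud
```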
Compute Engine also has a variety of [[GPU]] and [[TPU]] options for [[base/deep learning]] tasks in the [accelerator-optimized machine family](https://cloud.google.com/compute/docs/accelerator-optimized-machines). Look for available public images for your use case like "Deep Learning on Linux".
Check the Marketplace for pre-configurations based on your tech stack, but watch the price! For example, you can set up a Django site with MySQL database in one click (for free).
You can also load a [[container]] on top of a VM to run a [[Docker]] container.
To connect to a VM running Linux, [[SSH]] from the GCP Console or [[Cloudshell]]. See the [docs](https://cloud.google.com/compute/docs/instances/connecting-to-instance) for more information.
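From a local terminal with the gcloud CLI installed, the same connection looks like (instance name and zone are placeholders):

```bash
# Opens an SSH session, generating and pushing keys on first use
gcloud compute ssh my-vm --zone=us-central1-a
```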
Sustained use discounts are applied automatically if you use a resource for over 25% of the month. If you plan to use a VM for an extended period of time, consider committed use discounts. A preemptible (Spot) instance is one that can be terminated at any time so its resources can be provisioned to another customer, and offers big discounts if your task can tolerate interruption.
## Kubernetes
Kubernetes is a [[container]] orchestrator. The Kubernetes API uses a **control plane** to manage **nodes**, which run **pods**, which are groups of containers.
Google Kubernetes Engine is the [[Google Cloud Platform|GCP]] service for managing Kubernetes (Kubernetes itself is an open source software).
## App Engine
## Cloud Functions
[Cloud Functions](https://cloud.google.com/functions/docs) is a managed service from [[Google Cloud Platform|GCP]] that automatically runs scripts when triggered.
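A minimal HTTP-triggered deployment might look like this (function name and entry point are hypothetical; assumes a `main.py` using the Functions Framework):

```bash
# main.py (hypothetical) might contain:
#   import functions_framework
#   @functions_framework.http
#   def hello_http(request):
#       return "Hello!"
gcloud functions deploy hello-http \
  --runtime=python312 \
  --trigger-http \
  --entry-point=hello_http \
  --source=. \
  --allow-unauthenticated
```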
## Cloud Build
Cloud Build is [[Google Cloud Platform|GCP]]'s [[CI/CD]] service.
## storage and database
[[GCP]] offers many storage and database options for relational and non-relational data depending on latency and I/O requirements. GCP encrypts all data at rest by default.
- [[Cloud Storage]]: store objects (images, media, backups) or binary data
- [[Cloud SQL]]: use a SQL database as a backend for web apps
- [[Filestore]]: network file storage
- [[Cloud Spanner]]: horizontally-scalable SQL databases
- Firestore: mobile and web application management like user profiles and app states
- [[Cloud Bigtable]]: high I/O data for non-relational data in analytics workflows
- [[BigQuery]]: technically not a storage option but widely used for data retrieval
GCP has partnerships with some providers like [[neo4j]] to host an Aura graph database or the NoSQL database [[MongoDB]].
## Cloud Storage
Cloud Storage is a service for storing objects like images on [[Google Cloud Platform]]. Objects are stored in buckets. Use a different [[storage and database]] option for structured data like [[Cloud SQL]].
Four options of **storage class** are available:
- **Standard**: best for data that is frequently accessed ("hot" data). Best for temporary storage of intermediate objects that are deleted quickly.
- **Nearline**: lower cost for infrequently accessed data (at most monthly).
- **Coldline**: very low cost for very infrequently accessed data (at most quarterly).
- **Archive**: best for online backup and disaster recovery (access at most annually).
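The storage class is picked at bucket creation (bucket name below is a placeholder; bucket names must be globally unique):

```bash
# Create a Nearline bucket and copy a local file into it
gcloud storage buckets create gs://my-example-bucket \
  --location=us-central1 \
  --default-storage-class=NEARLINE
gcloud storage cp backup.tar.gz gs://my-example-bucket/
```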
Colder storage classes have higher retrieval costs, not necessarily slower retrieval times (contrast with AWS Glacier, for example).
When networking, co-locating storage with compute maximizes performance and minimizes cost.
Cloud Storage is also an option to host a static website, like your blog or portfolio (bonus: use GitHub Actions to automatically deploy updates).
## Filestore
Filestore on [[GCP]] is a Network File System (NFS) or distributed file system which allows you to access files over a virtual network just like a local file system. You might mount a Filestore to a VM on [[Compute Engine]] or [[Kubernetes]] for web content or data intensive applications like [[genomics]] that require file-based data. Filestore is pretty expensive for persistent usage so consider other [[storage and database]] options first.
To mount Filestore to a VM, first set up the VM and the Filestore instance in the same project. In the Compute Engine dashboard, click **SSH** next to the VM to open a shell and connect. Use the command below to install the NFS client.
```bash
sudo apt-get -y update &&
sudo apt-get -y install nfs-common
```
Then create a directory `/mnt` and mount the Filestore instance by its IP address and name.
```bash
sudo mkdir -p /mnt
sudo mount 10.24.183.98:/my-filestore /mnt
```
Use the IP address from the Filestore dashboard and the name you initialized it with.
Finally, set permissions so the files are readable and writable.
```bash
sudo chmod go+rw /mnt
```
To test the mount, create a file, check that it was created, and print the contents.
```bash
echo 'This is a test' > /mnt/testfile
ls /mnt
cat /mnt/testfile
```

## Cloud SQL
Cloud SQL is a fully managed relational database service with [[MySQL]], [[Postgres]], and [[SQL Server]]. Each Cloud SQL instance operates the database on the persistent disk of its own [[virtual machine]] on a host Google Cloud server. You can enable automatic vertical scaling. A static IP address allows connection to other services.
If you need more customization than is available, you could simply set up your own database on your own virtual machine using Compute Engine. You might be able to find a preconfigured VM on the Marketplace that includes a relational database and better meets your needs.
If you need a horizontally scalable solution, see [[Cloud Spanner]].
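Provisioning a Postgres instance from the CLI might look like this (instance name, tier, and region are placeholders):

```bash
# db-custom-1-3840 = 1 vCPU, 3.75 GB RAM
gcloud sql instances create myinstance \
  --database-version=POSTGRES_15 \
  --tier=db-custom-1-3840 \
  --region=us-central1
```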
After setting up your Cloud SQL in your project, you can connect via the Cloud Shell.
```bash
gcloud sql connect myinstance --user=postgres
```
You may need to authorize and/or enable APIs and retry the command. Provide your password (it will not show up when typing).
## Cloud Spanner
Cloud Spanner is a database designed to scale. This is a better option than [[Cloud SQL]] for large production apps with many users (but may be a good choice for some smaller apps as well). Cloud Spanner brings together relational, graph, and key-value databases. It automatically distributes data across regions for redundancy and low latency.

Spanner's interface actually makes it a bit easier to create database schemas, add tables, and add records, since a GUI is available for all of these actions as well as for viewing the data and querying the database.
## Cloud Bigtable
Cloud Bigtable is a sparsely populated table that can store billions of rows and thousands of columns, holding terabytes or petabytes of data. It utilizes a key-value system rather than a true row-store or column-store table. It supports high I/O with low latency and is well suited to MapReduce-style workloads. A single key-value pair can have multiple timestamped entries for versioning. It integrates with [[Hadoop]], [[Hbase]] and [[MapReduce]].
All client requests are handled by a front-end server pool. The Bigtable instance is organized into clusters and nodes. Add nodes to handle more simultaneous requests. Nodes are pointers to tablets on Colossus, which is Google's internal storage system. Read more about the architecture [here](https://cloud.google.com/bigtable/docs/overview).

## Memorystore
Memorystore is a managed [[Redis]] service used for application caches, gaming, and stream processing. Redis is a high-performance in-memory data store.
## BigQuery
BigQuery is a fully managed enterprise [[data warehouse]] with built-in functionality for machine learning, geospatial analysis, and business intelligence. BigQuery's distributed analysis engine can query terabytes of data in seconds by separating the compute from the storage, allowing independent scaling of both.
You can store and analyze data within BigQuery or use BigQuery to analyze data where it lives. Federated queries allow reading from external sources, and streaming inserts allow continuous updates.
BigQuery storage is fully managed by BigQuery (meaning you don't need to provision resources yourself or worry about scaling). BigQuery automatically replicates data for durability.
BigQuery uses columnar data storage for performance using Google's proprietary Capacitor file format. Capacitor can also reshuffle data for compact storage.
BigQuery ML allows building [[machine learning]] models inside of BigQuery with a SQL-like syntax. BigQuery also supports client libraries for Python, JavaScript, Java, and Go.
Play around with BigQuery's public datasets to understand its capabilities (turn off billing for the project to avoid surprise charges).
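For example, querying one of the public datasets from the `bq` CLI:

```bash
# Most common names in the USA names public dataset
bq query --use_legacy_sql=false \
'SELECT name, SUM(number) AS total
 FROM `bigquery-public-data.usa_names.usa_1910_2013`
 GROUP BY name ORDER BY total DESC LIMIT 5'
```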
## data pipelines
[[Google Cloud Platform|GCP]] offers a variety of managed services for data pipelines.
- **Dataprep** is used for data preparation and cleaning prior to analytics and visualization tasks. Think of it like Tableau Prep.
- **Dataflow** is used for data batching and streaming in the middle of a data pipeline.
- **Dataproc** is used for big data processing across multiple compute clusters with Apache [[Spark]] and [[Hadoop]]. Dataproc integrates with [[Vertex AI]] and common interfaces such as [[Jupyter Notebook]].
- **Pub/Sub** is used for streaming analytics and data integration pipelines to ingest and distribute data. Pub/Sub is commonly used to distribute change events between databases.
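The basic Pub/Sub flow can be sketched end to end (topic and subscription names are hypothetical):

```bash
# Create a topic and a pull subscription, publish, then pull one message
gcloud pubsub topics create my-topic
gcloud pubsub subscriptions create my-sub --topic=my-topic
gcloud pubsub topics publish my-topic --message="hello"
gcloud pubsub subscriptions pull my-sub --auto-ack --limit=1
```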
## Identity Access and Management
Google uses [Identity and Access Management (IAM)](https://cloud.google.com/iam/docs) to manage resources across an organization and the roles and privileges granted to entities in it. A full IAM policy is a set of role bindings attaching principals to roles on a resource.
### organizational hierarchy
An organization can be delineated into folders, subfolders, projects and resources. For individuals, only projects and resources are available.
Projects are the main organizational unit in GCP. The Project ID must be globally unique and is immutable. The Project number is assigned by Google and is also immutable. The Project name is for convenience and can be changed.
Folders (and subfolders) are optional and can be used to reflect an organizational structure (e.g., the departments and teams).
Policies are inherited from parent nodes. The effective policy at any level is the union of all relevant privileges, including inherited ones.
### roles
Roles are collections of permissions on a resource. Basic roles are owner, editor, and viewer. Roles can be assigned at the organization level, folder level, project level, and resource level.
Permissions are coded as strings like `bigquery.tables.delete` in the general format `service.resource.verb`.
In production environments, basic roles should not be used. Instead use one of the other pre-defined roles or define a custom role with the most limited set of permissions possible (the **principle of least privilege**). The best way to create a custom role is to copy and edit a predefined role.
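Copying a predefined role into a project-level custom role, then trimming it, might look like this (role and project names are placeholders):

```bash
# Copy a predefined role into the project, then remove an unneeded permission
gcloud iam roles copy \
  --source="roles/bigquery.dataViewer" \
  --destination=customBqViewer \
  --dest-project=my-project
gcloud iam roles update customBqViewer --project=my-project \
  --remove-permissions=bigquery.tables.export
```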
When an organization starts with GCP, it is either already a Google Workspace customer, in which case Google associates the existing account with GCP, or it creates a Cloud Identity account.
Every organization has a **super administrator** to manage the Workspace and GCP roles. The super admin can assign **Organization Admin** roles for GCP. The org admin can define IAM policies and role bindings. The **Organizational Viewer** role provides organization-wide viewer access (provided to the CTO for example).
Roles are granted to a **principal**, which can be an email address associated with a Google account or a [[Google service account]], or a domain name from a Google Workspace account or Cloud Identity domain (which will include all users in the associated group). Two additional principals are `allUsers` (anyone on the internet) and `allAuthenticatedUsers` (anyone signed in to Google).
## Google service account
A service account is used when scripts, APIs and other applications interact with [[Google Cloud Platform|GCP]]. The service account can be granted IAM roles to let it access resources. Service accounts can be user-managed or Google-managed.
Provide a meaningful name so you can remember what the account is for later!
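Creating a service account and binding a role to it might look like this (account name, project, and role are placeholders):

```bash
# Create the service account, then grant it a narrowly scoped role
gcloud iam service-accounts create etl-runner \
  --display-name="ETL pipeline runner"
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:etl-runner@my-project.iam.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```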
## identity aware proxy
[Identity-Aware Proxy (IAP)](https://cloud.google.com/iap/docs) is a service from [[Google Cloud Platform]] that helps automate authentication for GCP web apps.
## regions and zones
Google Cloud is divided into regions. Regions are subdivided into zones. Zones can be composed of multiple data centers, but often a zone is a single data center. Connectivity among zones in a region is very fast, typically requiring less than 5 ms for a round trip signal. Zones can be a single failure domain, so you might spread an application across multiple zones. For applications with broad customer bases, applications can be spread across multiple regions. Regions are guaranteed to be at least 160 km apart and so are unlikely to be affected by the same disaster.
## request routing
Requests to GCP first pass through edge **Points of Presence** (POPs). From the POP, the request is routed to either a data center or an Edge Node (collectively referred to as Google Global Cache (GGC); you may also see Edge Nodes referred to as Content Delivery Network (CDN)). There are many more Edge Nodes than POPs, so Edge nodes can be used to cache commonly referenced information for a particular region.
GCP is connected to the rest of the internet via peering, which (in this context) means an agreement between Google and an Internet Service Provider (ISP).
## Cloud VPN
[Cloud VPN](https://cloud.google.com/network-connectivity/docs/vpn/concepts/overview) connects your on-premises resources and networks to the cloud on [[GCP]] through a gateway on each side. Cloud Router handles routing. Cloud VPN can also connect multiple virtual machines.
For more secure connections, consider Cloud Interconnect. Service providers have physically co-located facilities with Google Cloud to avoid passing data over the public web and increase speed.
For a review of all options for connecting to Google Cloud, see [here](https://cloud.google.com/network-connectivity/docs/how-to/choose-product).
[[Google Cloud CLI]]
## Firebase
Start a new Firebase project or connect an existing Google Cloud Project at the [Firebase Console](https://console.firebase.google.com).