# Project Overview

Realizing that a majority of "Deep Research" tasks stem from summarizing non-realtime sources, I figured I could reduce research costs significantly by using the [open-sourced deep research from LangGraph](https://github.com/langchain-ai/open_deep_research) and a locally hosted Wikipedia instance for retrieving topic context. I set up a hosted Wikipedia site and an "Open Deep Research" FastAPI service (bundled with Wikipedia embeddings) to handle requests inside my local [microk8s](https://microk8s.io/) cluster.

This took quite a bit of effort to get working and, given my limited free time, it will take another stretch of effort to stand up a repo or submit a PR to the LangGraph repo with stable, portable code. Forgive me for not including the entire change set.

## Goals

- **Offline retrieval**: no external web search calls during research besides LLM calls
- **Semantic search** over Wikipedia via [txtai](https://huggingface.co/NeuML/txtai-wikipedia) embeddings
- **Deterministic citations** that point to your local [Kiwix](https://kiwix.org/en/) URLs from the latest Wikipedia [ZIM dump](https://dumps.wikimedia.org/kiwix/zim/wikipedia/)
- **Drop-in tool** named `web_search` so the supervisor/researcher loops reuse the same interface

## Dependencies

- [LangGraph Open Deep Research](https://github.com/langchain-ai/open_deep_research)
- [txtai](https://huggingface.co/NeuML/txtai-wikipedia) embeddings for finding articles similar to an input query
- [Kiwix](https://kiwix.org/en/) hosted Wikipedia site using a recent [ZIM dump](https://dumps.wikimedia.org/kiwix/zim/wikipedia/)

## High-level architecture

```
LLM plan -> researcher -> `web_search` tool
  -> `wiki_search_async(queries)` -> txtai Embeddings -> result doc ids
  -> Kiwix-serve HTTP fetch (local ZIM) -> HTML -> Markdown sanitizer
  -> structured "SOURCE i" blocks with local URLs
  -> compression -> final report
```

**Key Kubernetes components**

- **Kiwix mirror**: `kiwix-serve` on Kubernetes, fronted by an Ingress at `https://wiki.<domain>` with a mounted ZIM volume
- **Embeddings store and Research API**: `txtai` with `neuml/txtai-wikipedia`, used by the Deep Research `wiki_search` tool when the configuration sets `search_api=WIKIPEDIA` (a minimal sketch of this service appears after the search details below)

## Wikipedia Search details

1. **Query resolution**: `wiki_search_async` fans out the incoming queries. Each query runs a `txtai` `similar(...)` SQL query against the Wikipedia index; only hits above a percentile threshold (default `0.99`) are kept, to avoid low-signal pages.
2. **Document fetch**: For each hit, format the Wikipedia title and fetch the content from Kiwix:
   - Try the direct URL `https://wiki.<ingress_domain>/content/<ZIM>/A/<Title>`
   - Else resolve via `https://wiki.<ingress_domain>/search?pattern=...&books.name=<ZIM>` and follow the content link
3. **Normalization**: Strip HTML, remove the references section, collapse whitespace, and output concise Markdown for model-friendly summarization.
4. **Citations**: Emit `SOURCE i` blocks with the canonical local Kiwix URL:
   ```
   https://wiki.<domain>/content/<ZIM>/A/<Title>
   ```
5. **Synthesis**: The normal compression and final-report nodes run unchanged. The only difference is the source of the notes. (A sketch of the retrieval tool follows this list.)
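Since the full diff isn't published yet, here is a minimal sketch of what the retrieval tool can look like, under stated assumptions: the host `wiki.example.internal` and the book name `wikipedia_en_all_nopic` are placeholders for your own Ingress and ZIM, the helper names are invented for illustration, and a production version needs SQL quote escaping, retries, and error handling. The txtai calls follow the published `neuml/txtai-wikipedia` usage, and the Kiwix URL shapes mirror the layout described above.

```python
import asyncio
import re
from urllib.parse import quote

import httpx
from txtai import Embeddings

# Placeholder values -- substitute your own Ingress host and ZIM book name.
KIWIX_BASE = "https://wiki.example.internal"
ZIM_BOOK = "wikipedia_en_all_nopic"
PERCENTILE = 0.99  # keep only high-signal articles (the default noted above)

# Load the prebuilt Wikipedia index from the Hugging Face Hub.
embeddings = Embeddings()
embeddings.load(provider="huggingface-hub", container="neuml/txtai-wikipedia")


def html_to_markdown(html: str) -> str:
    """Crude normalization: cut the references section, strip tags, collapse whitespace."""
    html = re.sub(r"(?is)<h2[^>]*>\s*References.*", "", html)
    text = re.sub(r"(?s)<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


async def fetch_article(client: httpx.AsyncClient, title: str) -> tuple[str, str]:
    """Fetch one article from kiwix-serve, falling back to its search endpoint."""
    url = f"{KIWIX_BASE}/content/{ZIM_BOOK}/A/{quote(title.replace(' ', '_'))}"
    resp = await client.get(url, follow_redirects=True)
    if resp.status_code == 200:
        return url, resp.text
    # Fallback: full-text search, then follow the first content link.
    search = await client.get(
        f"{KIWIX_BASE}/search", params={"pattern": title, "books.name": ZIM_BOOK}
    )
    if match := re.search(r'href="(/content/[^"]+)"', search.text):
        url = f"{KIWIX_BASE}{match.group(1)}"
        resp = await client.get(url, follow_redirects=True)
        if resp.status_code == 200:
            return url, resp.text
    return url, ""


async def wiki_search_async(queries: list[str], limit: int = 3) -> str:
    """Fan queries out to txtai, fetch hits from Kiwix, emit SOURCE blocks."""
    hits: dict[str, float] = {}
    for query in queries:
        # The txtai-wikipedia index exposes a `percentile` column for filtering.
        # NOTE: escape single quotes in `query` before interpolating in production.
        sql = (
            f"SELECT id, score FROM txtai WHERE similar('{query}') "
            f"AND percentile >= {PERCENTILE}"
        )
        for row in embeddings.search(sql, limit):
            hits[row["id"]] = max(row["score"], hits.get(row["id"], 0.0))

    async with httpx.AsyncClient(timeout=30) as client:
        pages = await asyncio.gather(*(fetch_article(client, t) for t in hits))

    blocks = [
        f"SOURCE {i}: {title}\nURL: {url}\n\n{html_to_markdown(html)[:4000]}"
        for i, (title, (url, html)) in enumerate(zip(hits, pages), start=1)
        if html
    ]
    return "\n\n---\n\n".join(blocks) if blocks else "No results found."
```

The percentile filter implements step 1, the direct-then-search fetch implements step 2, and every article the model later cites traces back to a `SOURCE i` block carrying a local URL.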
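And a small sketch of the in-cluster Research API wrapper mentioned under the Kubernetes components, reusing `wiki_search_async` from the sketch above. The route name and request model are illustrative, not the service's actual contract.

```python
from fastapi import FastAPI
from pydantic import BaseModel

# Assumes wiki_search_async from the sketch above is in scope (or imported).


class SearchRequest(BaseModel):
    queries: list[str]


app = FastAPI(title="Wikipedia Deep Research search")


@app.post("/wiki-search")
async def wiki_search_endpoint(request: SearchRequest) -> dict:
    """Expose the local retrieval path over HTTP inside the cluster."""
    return {"notes": await wiki_search_async(request.queries)}
```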
## Integration with other search providers

No changes to planners or reducers. The tool registry wires `wiki_search` as the canonical `web_search` when `search_api=WIKIPEDIA`. You can still swap to Tavily or Exa by changing the config without touching graph logic, as the sketch below illustrates.
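A sketch of that wiring, assuming a `SearchAPI`-style configuration enum. The enum values, function names, and the `tavily_search_async` stub here are illustrative rather than the repo's exact API, and `wiki_search_async` is the tool sketched earlier.

```python
from enum import Enum


class SearchAPI(Enum):
    TAVILY = "tavily"
    WIKIPEDIA = "wikipedia"  # new: local txtai + Kiwix backend


async def tavily_search_async(queries: list[str]) -> str:
    """Stub standing in for the existing external provider."""
    raise NotImplementedError


async def web_search(queries: list[str], search_api: SearchAPI) -> str:
    """Single entry point the researcher loop calls, whatever the backend."""
    if search_api is SearchAPI.WIKIPEDIA:
        # wiki_search_async is the local retrieval tool sketched earlier.
        return await wiki_search_async(queries)
    if search_api is SearchAPI.TAVILY:
        return await tavily_search_async(queries)
    raise ValueError(f"Unsupported search_api: {search_api}")
```

Because the graph only ever sees `web_search`, swapping providers is a pure configuration change.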
## Security and reproducibility

- All fetches remain inside the cluster.
- Results are pinned to your ZIM snapshot for auditability.
- Ingress TLS terminates at your controller using your wildcard cert.

## Cost Difference

Costs varied across my semi-structured testing (run a report, then compare the difference in OpenAI usage cost as the measurement). Both Open Deep Research instances use the same default LLM configuration (see the [README.md](https://github.com/langchain-ai/open_deep_research/blob/main/README.md#results)), isolating whether local search instead of OpenAI Web Search significantly decreases the cost per report.

| Research Provider | Search Provider | Report Cost |
| --- | --- | --- |
| Open Deep Research | OpenAI Web Search | $2.36 |
| Open Deep Research | Wikipedia | $0.32 |

## Summary

This adds a **local, deterministic, citation-clean** retrieval path for the Deep Research agent. It keeps the LangGraph contract the same, swaps only the search backend, and produces final reports with reliable, linkable local sources.

**Final note:** if you're still looking for an external provider, then across the several "Deep Research" providers I have used, the lowest dollar cost seems to be OpenAI's `o4-mini-deep-research-2025-06-26` [model](https://platform.openai.com/docs/models/o4-mini-deep-research).

# Example Wikipedia Report

Query: Write a company overview report on NVIDIA, noting the key suppliers, customers, input materials, and industry/sector dynamics

## NVIDIA Company Overview Report (October 2025)

### Company Overview

NVIDIA Corporation, headquartered in Santa Clara, California, is a global leader in accelerated computing and artificial intelligence (AI). Since its founding in 1993, NVIDIA has evolved from its original focus on graphics processing units (GPUs) for gaming into a multifaceted technology powerhouse developing solutions for data centers, AI, professional visualization, autonomous vehicles, and edge computing. As of the fiscal year ending January 26, 2025, NVIDIA commands an overwhelming 92% of the discrete GPU market for desktops and laptops.

Its principal business areas are:

- **Gaming**: GeForce GPUs remain industry standards for gaming and creative workloads.
- **Data Center and AI**: Serving hyperscale cloud providers, enterprise AI workloads, and scientific research.
- **Professional Visualization**: Supplying graphics hardware and software to professionals, including designers and researchers.
- **Automotive and Embedded Systems**: Providing system-on-chip (SoC) and AI solutions for driver assistance, autonomous vehicles, and robotics.

NVIDIA's product development is organized around advanced GPU architectures (e.g., Ada Lovelace, Hopper, and Blackwell), and its solutions increasingly blend hardware with proprietary software stacks, like CUDA for high-performance computing and the Omniverse platform for 3D collaboration and simulation [1]. Recent strategic moves have deepened NVIDIA's position in generative AI and cloud infrastructure, and expanded its ecosystem through acquisitions and partnerships with leading data center, automotive, and cloud companies. Subsidiaries, such as Mellanox Technologies (high-speed datacenter networking) and Cumulus Networks (software-defined networking), bolster its vertical integration strategy [1].

### Key Suppliers

#### Major Suppliers and Their Roles

NVIDIA follows a "fabless" semiconductor model, focusing on design while outsourcing manufacturing and major component production to third parties. The company's crucial suppliers include:

- **Semiconductor Foundries**:
  - **Taiwan Semiconductor Manufacturing Company (TSMC)**: The principal foundry partner for NVIDIA's most advanced GPUs and AI processors. TSMC fabricates chips using leading-edge process nodes (e.g., 4nm, 5nm), which is essential for keeping NVIDIA's products competitive in performance and energy efficiency [2].
  - **Samsung**: Occasionally serves as a secondary foundry partner for select GPU lines, providing manufacturing diversification and capacity flexibility.
- **Memory Suppliers**:
  - **SK Hynix, Samsung, Micron**: Supply high-bandwidth memory (HBM), GDDR6/6X DRAM, and other specialized memory critical for GPU and AI workloads.
- **Substrate and Packaging Providers**:
  - **Unimicron, Ibiden, and others**: Produce advanced substrates essential for high-performance, multi-die GPU assemblies.
- **Component Suppliers**:
  - Several companies provide capacitors, power management ICs, and connectors. Although individual firms are rarely named in disclosures, NVIDIA notes a dependence on a broad network of specialized component makers.
- **Software Partners**:
  - **Microsoft**: Collaborates with NVIDIA to integrate support for features like real-time ray tracing (RTX) with DirectX APIs, broadening NVIDIA's platform reach [3].

#### Supplier Importance

- TSMC's role is critical: any disruption in TSMC's advanced process node capacity would have immediate repercussions for NVIDIA's product roadmap and supply chain resilience [2].
- Memory suppliers are also vital, especially as advanced AI and GPU workloads require ever more bandwidth and capacity, making these relationships strategic for product competitiveness.
- Substrate shortages have, in industry-wide fashion, periodically constrained GPU and AI accelerator shipments, emphasizing the importance of securing reliable substrate vendor partnerships.

#### Limitations

Detailed supplier lists (by name, proportion, or spend) are confidential and not comprehensively disclosed publicly outside of general industry awareness and historic references in filings [1][2]. The analysis above reflects the best available industry insights.

### Major Customers

#### Customer Segments and Examples

NVIDIA's products serve a widely diversified global customer base, but revenue and business influence are concentrated in a few main segments:

- **Cloud Service Providers ("Hyperscalers")**: The largest customers by far are global cloud operators who deploy NVIDIA hardware for AI training, inference, and cloud-based GPU/AI compute services. These include industry giants such as Amazon Web Services (AWS), Microsoft Azure, Google Cloud, and, to a lesser extent, Alibaba and Tencent, each of which deploys tens or hundreds of thousands of GPUs in their data centers.
- **Enterprise and High-Performance Computing (HPC)**: Large enterprises, universities, and national laboratories leverage NVIDIA hardware for AI research, simulations, and large-scale computation.
- **Gaming OEMs and Channel Partners**: PC manufacturers, gaming system builders, and distributors who bundle GeForce GPUs into consumer systems.
- **Automotive**: Leading automotive OEMs and Tier-1 suppliers source NVIDIA's DRIVE compute platforms for driver assistance and autonomous vehicle programs.
- **Government and Research**: National projects related to AI, supercomputing, and scientific research.
#### Customer Concentration and Analysis

- The data center segment, driven by AI and cloud computing, accounted for the majority of NVIDIA's recent revenue growth as of 2024–2025.
- Specific customer identities are typically not disclosed in SEC filings for confidentiality. However, filings sometimes reference the risk of customer concentration, and it is well understood that a handful of cloud hyperscalers account for a significant percentage of total revenue in data center sales.
- Gaming remains a critical revenue base but is more diversified across OEM and channel partners, while the automotive and enterprise sectors show moderate but growing concentration.

#### Limitations

Precise revenue assignments to specific named companies are generally not available due to customer privacy or contractual restrictions. Where public, the main hyperscalers are widely inferred as dominant clients through industry reporting and partner announcements.

### Critical Input Materials

Manufacturing of NVIDIA's GPUs and AI accelerators is underpinned by a variety of sophisticated materials and components, which are critical to product performance and supply chain continuity.

#### Key Materials

- **Semiconductor Wafers (Silicon)**: Advanced wafers (particularly 300 mm) manufactured at nodes as fine as 4 nm or below, primarily provided by TSMC.
- **Advanced Packaging and Substrates**: High-end, multi-layer substrates capable of handling power, heat, and multi-die interconnection (e.g., CoWoS, HBM interposers), produced by companies such as Unimicron and Ibiden.
- **High-Bandwidth Memory (HBM, GDDR6/6X)**: Provided by SK Hynix, Samsung, and Micron. Used for the rapid data access and parallel processing critical to AI workloads.
- **Rare Earth Elements and Specialty Metals**: Materials like cobalt, tantalum, gold, palladium, and various rare earths are used in small amounts in semiconductor fabrication and board components.
- **Specialty Chemicals and Gases**: Used in lithography, etching, deposition, and cleaning processes during chip production (e.g., photoresists, gases like argon and fluorine).

#### Supply Chain Considerations and Challenges

- Supply of advanced substrates and memory has proven to be a chokepoint, with industry-wide constraints affecting not just NVIDIA but all advanced semiconductor vendors.
- Geopolitical risks (e.g., US-China trade tensions, export controls on advanced tech to China) have prompted NVIDIA to redesign certain SKUs and adapt its sourcing in response to regulatory restrictions.
- The scale and complexity of AI hardware have increased per-unit material requirements, putting further pressure on upstream suppliers and making supply chain management a major focus for NVIDIA and its partners.

#### Limitations

Granular breakdowns of supplier-specific sourcing by material quantity or value are not published, for reasons of competitive and commercial sensitivity. Broad categories and major vendor names are industry facts supported by public manufacturing announcements and supply chain analyses.

### Industry and Sector Dynamics

#### Competitive Landscape

NVIDIA operates within several overlapping yet competitive sectors:

- **AI and Accelerated Computing**: Competitors include AMD (Instinct and Radeon), Intel (Xe accelerators, Gaudi AI processors), and rapidly scaling custom AI chip ventures by hyperscalers themselves (e.g., Google TPU, Amazon Trainium).
- **Semiconductors and GPUs**: NVIDIA maintains dominance in discrete GPUs but faces constant pressure from AMD in gaming and professional visualization.
- **Networking and Interconnect**: Mellanox (an NVIDIA business), Intel, Broadcom, and Marvell compete to define next-generation datacenter fabrics and storage interconnects.

#### Major Trends (2024–2025)

- **AI Boom and AI Infrastructure Buildout**: Unprecedented demand for AI training/inference hardware, driven by advances in large language models, generative AI, and enterprise AI adoption. Investments such as the proposed Stargate LLC (a $500 billion+ US AI infrastructure plan) dwarf historic data center buildouts, setting new demand records [5].
- **Edge and Automotive Expansion**: AI and video processing are pushing computational intelligence to the edge and into vehicles; NVIDIA's automotive platforms power critical ADAS and autonomous test fleets worldwide.
- **Supply Chain and Geopolitics**: Global tension has led to export restrictions, especially around sales of advanced AI chips to China, causing NVIDIA to adapt its supply and design strategies to remain compliant yet competitive [4].
- **Industry Consolidation and Ecosystem Building**: Mergers, acquisitions, and tighter software-hardware integration are defining the next wave of competitive differentiation.

#### Challenges

- **Supply Chain Volatility**: Shortages of substrates, wafers, and memory chips can limit NVIDIA's ability to fulfill demand.
- **Regulatory Risks**: Export bans (notably to China) and technology transfer restrictions are persistent threats.
- **Customer Concentration**: Dependence on a small number of cloud hyperscaler customers exposes NVIDIA to cyclical procurement and market power risks.
- **Talent and R&D Pressure**: The global race for AI technical talent and the escalating costs of advanced chip R&D create barriers for smaller rivals but drive up operational costs.

#### Industry Outlook

The intersection of AI, cloud computing, and high-performance hardware cements NVIDIA's leading sector position as of October 2025, with strong market momentum but significant risks in supply chains, regulatory frameworks, and competitive rivalry.

### Sources

1. [Nvidia](https://en.wikipedia.org/wiki/Nvidia)
2. [GeForce GTX 900 series](https://en.wikipedia.org/wiki/GeForce_GTX_900_series)
3. [Nvidia RTX](https://en.wikipedia.org/wiki/Nvidia_RTX)
4. [AI Action Summit](https://en.wikipedia.org/wiki/AI_Action_Summit)
5. [Stargate LLC](https://en.wikipedia.org/wiki/Stargate_LLC)

# Code Agent Prompt

While I am not including the code changes (formatting and submitting the PRs would take time, and I'm already exhausted from my day job), I have pasted the necessary files into ChatGPT to generate a prompt for a coding agent that recreates the code as a starting template inside a freshly cloned [Open Deep Research](https://github.com/langchain-ai/open_deep_research) repo. If you want to implement this project, the prompt is [here](Published/Projects/Resources/Wikipedia%20Deep%20Research%20Prompt).