### <u>T.L.D.R</u>

Whisper is a data-as-a-service provider that structures and indexes the world's private technology markets. Unlike a web search tool like Google, data in Whisper is structured by meaning, not markup. Instead of websites, we have entities like people, organizations, technologies, and products. People entities have attributes like name, job title, and skillset. Organization entities have attributes like sector, revenue, and list of employees. Each entity is the result of crawling, structuring, and processing hundreds of millions of pages on the web, generating billions of facts, and linking those facts to each entity.

Whisper allows you to search the web as a humongous graph database of entities, filtered by their attributes. We're starting with information on the public web first but plan on merging it with third-party datasets over time, unlocking insights in private markets that were not possible before.

**Data is a growing business**

One of the biggest themes in the last 20 years has been products that help companies manage and use first-party data better. If you invested in that trend, you had an amazing two decades.
Those companies include core tools like CRMs ([Salesforce](https://www.salesforce.com/), [HubSpot](https://www.hubspot.com/products/get-started-f049?utm_id=607212716856&utm_medium=paid&utm_source=google&utm_term=marketing_hubspot_EN&utm_campaign=Marketing_MQLs_EN_NAM_NAM_Brand-HubSpot_e_c_campaignid804389993_agid43208773113_google&utm_content=_&hsa_ver=3&hsa_net=adwords&hsa_acc=2734776884&hsa_kw=hubspot&hsa_grp=43208773113&hsa_mt=e&hsa_cam=804389993&hsa_ad=607212716856&hsa_tgt=kwd-298569398281&hsa_src=g&gad=1&gclid=Cj0KCQjw-pyqBhDmARIsAKd9XIMT3VTMI7npTSBJFoagSDb0q884Dm8baHBSPHZvA2WrzO2xvvAy_JgaAn3QEALw_wcB)), BI ([Tableau](https://www.tableau.com/), [Alteryx](https://www.alteryx.com/designer-trial/free-30-days?utm_campaign=NA_Search_Demgen_Mixed_Brand_New_D_E_AO&utm_source=google&utm_medium=cpc&utm_content=&utm_term=alteryx&gad=1&gclid=Cj0KCQjw-pyqBhDmARIsAKd9XINfZkIlIcrGttcJ2_-dIE2-1xHgD_p3NKowBru_Ro0aVtPfEeajJTMaAuUyEALw_wcB)), big data analytics ([Palantir](https://www.palantir.com/), [Splunk](https://www.splunk.com/)), data cloud platforms ([Snowflake](https://www.snowflake.com/en/), [Databricks](https://www.databricks.com/)), and many more.

The amount of first-party data collected and consumed has grown exponentially due to better tools, increased internet usage, and more business activity coming online. Companies are getting better and better at managing their own data. At the same time, compute and storage costs continue to fall dramatically every year, so it is becoming cheaper and cheaper to process larger volumes of information.

![[Screenshot 2023-11-05 at 11.36.03 AM.png|600]]

<u>First-party data is not enough</u>

Most industries have spent the last 10 years building internal data science teams. They've hired the right talent, built the infrastructure, and gotten very good at taking their internal data and doing something with it.
But even the most advanced organizations are now reaching an asymptote where there's only marginal value they can get from mining their own information. As markets get more competitive, asset managers and companies of all sizes will look externally for third-party sources more and more. This is happening across industries.

**The Rise of Third Party Data**

An order of magnitude more companies buy data today than they did five years ago. One great example is hedge funds. The demand for third-party or "external" datasets naturally developed here at a faster pace than in other industries because the fundamental business of investing is looking at other people's companies from an outside-in view.

This idea of using external data to make more calculated decisions is not new, but it is now being applied to new verticals in pursuit of an information advantage. If you know something that competitors in your market don't, you can exploit it.

![[verticalizedd.png|700]]

Unless you're active in these industries, there's a good chance you've never heard of these providers before. Although they serve very different markets, they all share something in common. These providers create common knowledge in their industries from information only middlemen had access to before, from public-but-hard-to-aggregate data, or from information collected from users themselves. Instead of trying to hoard information, these vendors become the authoritative source on an industry or market at scale. Owning demand for a given dataset gives these companies a compounding advantage, allowing them to own search for their market and build monopolies.

The data-as-a-service market is evolving.

![[Horizontall.png|700]]

Because a niche for a given dataset is often hyper-competitive and dominated by 1 or 2 players, newer vendors have emerged that are targeting multiple markets with datasets that span a spectrum of different use cases (think web traffic, people contacts, location data).
These providers scale their products by making their datasets more static and generic, so that they meet the marginal needs of many customers without meeting the critical needs of any single customer. As a result, these datasets can seamlessly fit into the workflows of asset managers and other companies and directly impact returns and revenue.

Both types of data vendors primarily focus on selling one type of dataset, but there is no reason why they can't be combined and cross over.

**Introducing Whisper**

Whisper is a data-as-a-service provider that structures and indexes the world's private technology markets. Founded in 2023, Whisper is being built to provide private technology companies and investors with market intelligence data they can act on quickly.

What does this all mean?

In its simplest form, Whisper can be:

- A people, organizations, technologies, and products graph database that never goes out of date
- A real-time market intelligence platform for technology companies and investors

With a little more imagination, Whisper can be a powerful source of data powering:

- Sourcing new companies and identifying them before competitors
- Tracking and recruiting the best talent all across the world
- Rigorous market research, sizing, and analysis
- Underwriting and pricing optimization
- Onboarding and KYB for private credit

**The Problem**

The proliferation of data in today's private markets has not led to a proliferation of insights. Technology companies and private market investors have plenty of internal data, but they need a better way to understand what is happening in the world around them.

Petabytes of relevant private market information are being underutilized because the data is unstructured, fragmented, and sits on the shelves of organizations that haven't had enough incentive to share it in the past.
Some organizations have managed to "lift" their capabilities by building internal pipelines and tools around search, but let's be honest, nobody is completely thrilled with the results. Working with massive volumes of unstructured and third-party data is expensive, nuanced, and challenging for most organizations that want to use it.

**Indexing The World's Private Markets is Now Possible**

To bring structure and intelligence to the private markets, we need to get rid of this idea that there is not enough data available. Technically, we need to build a system that extracts information about people, organizations, technologies, and products at web scale, with a near real-time update cadence. Thanks to advancements in data sharing and graph database technology, this is now possible.

While doing research in 2023, one of the key insights I uncovered was that there is a tremendous amount of value that can be unlocked by linking different datasets together. The reason for this is simple: data is only as useful as the questions it can help answer.

In the context of private markets, graphing the relationships between the different entities allows you to answer questions that could not be asked before and analyze how networks of people, organizations, technologies, and products are related to each other. This can help connect the dots and transform the way technology companies and investors make decisions.

For example, here is a list of queries Whisper could answer:

- Give me a list of every machine learning engineer that has more than 3 years of experience working at a public company or hedge fund
- Find me every stealth fintech company that registered a domain in the last 30 days that has not raised a round of financing
- Where are software engineers from Palantir going?
- What is the average monthly subscription price for SaaS products that are generating more than $10M in annual recurring revenue?
- How much revenue did Ramp generate in 2023 and how many new customers did they acquire?
- Give me a list of every person in tech that has left their job in the last 30 days
- How big is the artificial intelligence infrastructure market, and how has it changed in the last 60 days?
- Who are the top 10 leaders in the expense management software market, sorted by revenue?

**The Plan (just between you and me)**

**Step 1: Start with Public Data**

Whisper's first step is to structure every person, organization, technology, and product into entities and map them. That means defining a standard ontology for each entity and including the fields and attributes that our customers care about.

The beachhead market is venture capital and private equity firms that invest in software, where there is a lack of structured data they can act on and a demand to become more systematic and generate higher returns. With Whisper, firms will be able to search people/companies/technologies/products by their attributes, identify them earlier in their lifecycle, and stop wasting time making gut decisions.

**Step 2: Acquire High-Value Datasets**

Providing deeper insights for our customers requires combining public data with third-party sources. These sources are expensive and difficult to procure and clean. I spent the last 9 months doing demos and testing different datasets with over 500 providers and have identified a few that warrant licensing agreements.

In addition to purchasing data, I have identified key organizations that I would like to partner with through a cooperative model. These alliances take time to materialize and often require complex licensing/revenue share agreements.

Privacy surrounding the third-party data we'll be collecting is a top concern for partners and customers. As a result, every piece of information that is ingested by the system will be handled in a GDPR and SOC 2 compliant manner.

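To make the entity model from Step 1 a little more concrete, here is a minimal Python sketch of a people entity with attributes and one of the example queries from earlier ("Where are software engineers from Palantir going?"). Everything in it is an assumption made for illustration: the `Person` fields, the `next_employers` helper, and the sample records (including the made-up employers `ExampleAI` and `SampleDB`) are hypothetical and are not Whisper's actual ontology or implementation. A production system would presumably sit on a graph database rather than in-memory lists.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical people entity; the field names are illustrative only."""
    name: str
    job_title: str
    skills: list[str] = field(default_factory=list)
    # Employment history as (organization name, start year), oldest first.
    employers: list[tuple[str, int]] = field(default_factory=list)


def next_employers(people, org, title_keyword):
    """Count where people with a matching job title went right after `org`."""
    moves = Counter()
    for person in people:
        # Filter by attribute: keep only people whose title matches.
        if title_keyword.lower() not in person.job_title.lower():
            continue
        names = [name for name, _ in person.employers]
        # Walk consecutive (previous job, next job) pairs in the history.
        for prev, nxt in zip(names, names[1:]):
            if prev == org and nxt != org:
                moves[nxt] += 1
    return moves


# Made-up sample data, not real records.
people = [
    Person("Ada", "Software Engineer", ["python"],
           [("Palantir", 2018), ("ExampleAI", 2022)]),
    Person("Grace", "Software Engineer", ["go"],
           [("Palantir", 2019), ("ExampleAI", 2023)]),
    Person("Linus", "Software Engineer", ["rust"],
           [("Palantir", 2017), ("SampleDB", 2021)]),
    Person("Joan", "Account Executive", [],
           [("Palantir", 2020), ("SampleDB", 2022)]),
]

destinations = next_employers(people, "Palantir", "engineer")
print(destinations.most_common())  # [('ExampleAI', 2), ('SampleDB', 1)]
```

The point of the sketch is that once people and organizations are stored as linked entities rather than pages, a question like this reduces to a filter on attributes plus a walk along relationships.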

**Step 3: Deriving insights**

It's not enough to just provide clean structured data. We need to build a layer that connects everything together to deliver insights for the repeated decisions our customers make. This requires building software that integrates with our data and creating a workflow around it.

Whisper's third step is to build a user interface that makes it easy to query the data, visualize the relationships within it, and get answers to critical questions. This is one of the most central features for operators and investors, as it will dictate their ability to understand what is truly happening in their markets and act on it quickly.

**Step 4: Dogfooding the system**

Due to the nature of the data being collected, Whisper has the unique vantage point of being able to monitor and track the identity and financial health of millions of private technology companies. This information can be used to build a comprehensive funnel to target our ideal customer profiles and learn how their needs evolve over time.

By design, the system will also generate a significant amount of metadata from user activity. This activity can be directly analyzed to determine how our customers are using the platform, where they are deriving the most value, and how additional resources should be allocated.

**The World After Whisper Succeeds**

Alternative data is already a $6B market, and it's projected to grow [by $131B in the next 7 years to $137B](https://www2.deloitte.com/us/en/insights/industry/financial-services/financial-services-industry-predictions.html?utm_source=www.mattober.co&utm_medium=referral&utm_campaign=it-s-just-data#fueled-by-better-info). By then, alternative data will be nearly half as big as the current CRM market, in which the top 10 players average over [$2B in revenue and $20B in valuation](https://www.appsruntheworld.com/top-10-crm-software-vendors-and-market-forecast/).
The market opportunity is exciting, but I get up every day and work hard on this problem because of what success for Whisper means. Entrepreneurs and investors today are struggling to outperform their predecessors. Technology is moving faster than ever, markets are getting more competitive by the day, and yet the companies and firms that operate in the private markets don’t have the tools and resources to extract value from the abundance of information available to them. Success for Whisper means riding the wave of advancements in data sharing, machine learning, and graph database technology to help transform the way millions of people across the globe make decisions.