> "A data scientist is someone who is better at software than a statistician and better at stats than a software engineer."
> -- ??? The birth of data science (http://radar.oreilly.com/2011/09/building-data-science-teams.html)

## data science process

1. Import data
2. Tidy data
3. Transform
4. Visualize
5. Model (return to step 3 if needed)
6. Communicate

> [!Tip]- Additional Resources
> [R for Data Science](https://r4ds.hadley.nz/)

## communication

- motivation
- data sources
- analysis/visuals
- conclusions
- limitations, potential error

## history

The promise of machines that compute was envisioned as far back as Charles Babbage and Ada Lovelace in the mid-19th century. Although Babbage's Analytical Engine was never built, its design laid the groundwork for modern computing. [^1]

In 1962, John Tukey predicted that computers would revolutionize how we reason with data.

## Data Scientists

**Brian Brown, Engineering Director** | Brian leads the Geo Machine Intelligence team at Google, which focuses on applied machine learning for Google Maps. Having worked at Google for more than 15 years, he has been involved in many aspects of the evolution of map data development for Google Maps. Brian joined Google through its acquisition of the 3D modeling software SketchUp, which also established the first Google office in Boulder, CO. Prior to Google, Brian earned his BS in AREN from the University of Colorado and worked on software to design non-imaging optics.

**Rinaldo Maldera, Staff Product Analyst** | Rinaldo leads one of Google Maps’ data science teams, focused on evaluating Maps quality. For more than 12 years he has helped create innovative solutions that support data-driven decision making at companies like P&G and M&S. During his tenure at Google, he has paved the way to estimate critical gaps in Google Maps data, democratize machine learning developments, and grow an engaged worldwide community of data scientists.


**Natalie Jackson** is the Director of Research at the Public Religion Research Institute (PRRI), a nonprofit, nonpartisan organization dedicated to conducting independent research at the intersection of religion, culture, and public policy. She has held senior and management positions in media, academia, and nonprofit organizations. Most recently, she was the Managing Director of Polling at JUST Capital, where she built and managed a survey research team and contributed to the overall mission and strategy of the nonprofit organization. Her work has appeared in the peer-reviewed journals _Electoral Studies_ and _Social Science Quarterly_, as well as in several edited volumes. Natalie received her PhD in political science from the University of Oklahoma and was a postdoctoral associate at the Duke University Initiative on Survey Methodology.

**Dr. Vilja Hulden** is a labor and social historian focusing on the modern United States. Her book manuscript, "Government by the Bosses: Employers Against Worker Power Before the New Deal," argues that debates over who should govern at the workplace have crucially shaped American ideas about democratic participation and the relationship between individual rights and collective action. She is also active in digital humanities research and regularly presents her work at digital humanities conferences. Her most recent digital project, "Speaking to the State," constructs a computational analysis of the representation of labor and business at Congressional hearings since 1877.

**Professor Robin Burke** conducts research in personalized recommender systems, a field he helped found and develop. Among other topics, his research group, That Recommender Systems Lab, explores fairness, accountability, and transparency in recommendation through the integration of objectives from diverse stakeholders. He joined the Department of Information Science in 2019 from the School of Computing at DePaul University.
Professor Burke is the author of more than 150 peer-reviewed publications in a range of areas including recommender systems, machine learning, and digital humanities. His work has received support from the National Science Foundation, the National Endowment for the Humanities, the Fulbright Commission, and the MacArthur Foundation, among others.

**Dr. Seth Spielman** is an Associate Professor of Geography and Information Science at the University of Colorado and a Data Scientist at Apple. His expertise is at the intersection of maps, statistics, machine learning, and the social sciences. His recently published book on [Urban Analytics](https://www.amazon.com/Urban-Analytics-Spatial-Gis-ebook/dp/B077PTKRV3) tries to bring together these diverse fields. Dr. Spielman has received the [Breheny Prize](https://journals.sagepub.com/page/epb/collections/the-breheny-prize "Breheny Prize") for the best paper in Urban Analytics and City Science, won a [kaggle.com](https://www.kaggle.com/c/us-census-challenge/overview/winners) data science competition, and was awarded a distinguished scholar award in Urban Planning from the American Association of Geographers. The journal [Science](http://sciencecareers.sciencemag.org/career_magazine/previous_issues/articles/2013_03_18/caredit.a1300045) profiled him as an archetype of a new generation of data-centric geographers. His [publications](http://www.sethspielman.org/publications/) have appeared in a diverse set of journals including PNAS, PLOS ONE, Demography, Annals of the Association of American Geographers, and the International Journal of GIS. Outside of academia, his professional experience has ranged from the hyper-digital world of data science and software engineering for a large tech company in Silicon Valley to the insanely analog practice of being the sole proprietor of an antiquarian bookshop in Manhattan.

**Dr. Katharina Kann** is an Assistant Professor of Computer Science at CU Boulder.
Her research focuses on deep learning for natural language processing. In particular, she is interested in transfer learning, approaches for low-resource languages, and computational morphology.

**Dr. Daniel Larremore** is an Assistant Professor in the Department of Computer Science and the BioFrontiers Institute at the University of Colorado Boulder. He is also an affiliate of the Department of Applied Mathematics at the University of Colorado Boulder and a member of the external faculty of the Center for Communicable Disease Dynamics at the Harvard T.H. Chan School of Public Health. His research develops mathematical methods using novel combinations of networks, dynamical systems, and statistical inference to solve problems in two main areas: infectious disease epidemiology and computational social science. This work focuses on generative models for networks, the ongoing evolution and genomic epidemiology of the malaria parasite, and the origins of social inequalities in academic hiring and careers. Prior to joining the University of Colorado faculty, he was an Omidyar Fellow at the Santa Fe Institute from 2015 to 2017 and a postdoctoral fellow at the Harvard T.H. Chan School of Public Health from 2012 to 2015. He obtained his Ph.D. in Applied Mathematics from the University of Colorado Boulder in 2012 and holds an undergraduate degree in Chemical Engineering from Washington University in St. Louis.

## interesting articles

- [Your Apps Know Where You Were Last Night, and They're Not Keeping It Secret](https://www.nytimes.com/interactive/2018/12/10/business/location-data-privacy-apps.html)

[^1]: James Gleick, *The Information*.