Simply put, Dark Data is stored, largely non-inventoried, unstructured data that is not currently used for data science but is nevertheless kept on a “just in case” basis, either to meet regulatory requirements or in the hope that it will prove useful for research at some point in the future. “Gathering dust” in archives, Dark Data is, as less simply put by CIO and industry pundit Isaac Sacolick, “data and content that exists and is stored, but is not leveraged and analyzed for intelligence or used in forward looking decisions. It includes data that is in physical locations or formats that make analysis complex or too costly, or data that has significant data quality issues. It also includes data that is currently stored and can be connected to other data sources for analysis, but the business has not dedicated sufficient resources to analyze and leverage.” Add to this unstructured data for which sufficiently robust or accurate analysis tools have not yet been invented, and some data (notably most log files) that will simply never yield useful Business Intelligence (BI).
In DARK DATA & DARK SOCIAL, Lars Nielsen explores the nature of Dark Data; how to discern genuinely useful Dark Data amid the mass of useless data debris with which most enterprises are swamped; how to build a data science team to accomplish this task and leverage Dark Data to its full potential; how to safely and irrevocably dispose of unusable data debris; and how to exploit the darkest of dark data: "Dark Social," the hard-to-track but incredibly valuable real-time data pegged to largely anonymous second-party referrals to web sites (as opposed to direct click-throughs). Throughout the book, Nielsen presents information in a user-friendly, jargon-free manner that assumes little technical background. The volume is thus ideal for would-be data scientists as well as for managers and marketers working with, or intending to work with, data science teams.
CONTENTS: What is Dark Data? * The Ebb and Flood of Data Value * Building a Team to Enlighten Dark Data * The Varying Natures of Dark Data * The Need for Universal Ubiquitous Encryption * Dark Data Retention Policies * The Safe Disposal of Data Debris via Crypto-Shredding and Other Approaches * Summing Up: Recognizing the Value Within Non-Transactional Dark Data and Exploiting “Dark Social”
ABOUT THE AUTHOR: Lars Nielsen is a leading systems analyst and developer living in Amsterdam. His bestselling books include A SIMPLE INTRODUCTION TO DATA SCIENCE, UNICORNS AMONG US: UNDERSTANDING THE HIGH PRIESTS OF DATA SCIENCE, and COMPUTING: A BUSINESS HISTORY.