Chapter 1. Introduction -- chapter 2. The mother of invention's triplets : Moore's law, the proliferation of data, and data storage technology -- chapter 3. Hadoop -- chapter 4. HBase and other big data databases -- chapter 5. Machine learning -- chapter 6. Statistics -- chapter 7. Google -- chapter 8. Geographic information systems (GIS) -- chapter 9. Discovery -- chapter 10. Data quality -- chapter 11. Benefits -- chapter 12. Concerns -- chapter 13. Epilogue
Summary
Covering prominent software packages, including Hadoop, Oracle Endeca, and SAP HANA, this book demonstrates the utility and promise of these applications. It also shows why data quality must be understood and how statistics can mislead when applied without due rigor. Both authors are ASQ-certified Six Sigma Black Belts, and they demonstrate how common statistical tools and investigative methodologies can mitigate risks that arise from limitations in the data.