E-book
Author Berman, Jules J.

Title Principles of big data : preparing, sharing, and analyzing complex information
Published Boston : Morgan Kaufmann, 2013

Description 1 online resource
Contents Machine generated contents note: 1. Providing Structure to Unstructured Data -- Background -- Machine Translation -- Autocoding -- Indexing -- Term Extraction -- 2. Identification, Deidentification, and Reidentification -- Background -- Features of an Identifier System -- Registered Unique Object Identifiers -- Really Bad Identifier Methods -- Embedding Information in an Identifier: Not Recommended -- One-Way Hashes -- Use Case: Hospital Registration -- Deidentification -- Data Scrubbing -- Reidentification -- Lessons Learned -- 3. Ontologies and Semantics -- Background -- Classifications, the Simplest of Ontologies -- Ontologies, Classes with Multiple Parents -- Choosing a Class Model -- Introduction to Resource Description Framework Schema -- Common Pitfalls in Ontology Development -- 4. Introspection -- Background -- Knowledge of Self -- eXtensible Markup Language -- Introduction to Meaning -- Namespaces and the Aggregation of Meaningful Assertions -- Resource Description Framework Triples -- Reflection -- Use Case: Trusted Time Stamp -- Summary -- 5. Data Integration and Software Interoperability -- Background -- Committee to Survey Standards -- Standard Trajectory -- Specifications and Standards -- Versioning -- Compliance Issues -- Interfaces to Big Data Resources -- 6. Immutability and Immortality -- Background -- Immutability and Identifiers -- Data Objects -- Legacy Data -- Data Born from Data -- Reconciling Identifiers across Institutions -- Zero-Knowledge Reconciliation -- Curator's Burden -- 7. Measurement -- Background -- Counting -- Gene Counting -- Dealing with Negations -- Understanding Your Control -- Practical Significance of Measurements -- Obsessive-Compulsive Disorder: The Mark of a Great Data Manager --
8. Simple but Powerful Big Data Techniques -- Background -- Look at the Data -- Data Range -- Denominator -- Frequency Distributions -- Mean and Standard Deviation -- Estimation Only Analyses -- Use Case: Watching Data Trends with Google Ngrams -- Use Case: Estimating Movie Preferences -- 9. Analysis -- Background -- Analytic Tasks -- Clustering, Classifying, Recommending, and Modeling -- Data Reduction -- Normalizing and Adjusting Data -- Big Data Software: Speed and Scalability -- Find Relationships, Not Similarities -- 10. Special Considerations in Big Data Analysis -- Background -- Theory in Search of Data -- Data in Search of a Theory -- Overfitting -- Bigness Bias -- Too Much Data -- Fixing Data -- Data Subsets in Big Data: Neither Additive nor Transitive -- Additional Big Data Pitfalls -- 11. Stepwise Approach to Big Data Analysis -- Background -- Step 1 Question Is Formulated -- Step 2 Resource Evaluation -- Step 3 Question Is Reformulated -- Step 4 Query Output Adequacy -- Step 5 Data Description -- Step 6 Data Reduction -- Step 7 Algorithms Are Selected, If Absolutely Necessary -- Step 8 Results Are Reviewed and Conclusions Are Asserted -- Step 9 Conclusions Are Examined and Subjected to Validation -- 12. Failure -- Background -- Failure Is Common -- Failed Standards -- Complexity -- When Does Complexity Help? -- When Redundancy Fails -- Save Money; Don't Protect Harmless Information -- After Failure -- Use Case: Cancer Biomedical Informatics Grid, a Bridge Too Far -- 13. Legalities -- Background -- Responsibility for the Accuracy and Legitimacy of Contained Data -- Rights to Create, Use, and Share the Resource -- Copyright and Patent Infringements Incurred by Using Standards -- Protections for Individuals -- Consent -- Unconsented Data -- Good Policies Are a Good Policy -- Use Case: The Havasupai Story --
14. Societal Issues -- Background -- How Big Data Is Perceived -- Necessity of Data Sharing, Even When It Seems Irrelevant -- Reducing Costs and Increasing Productivity with Big Data -- Public Mistrust -- Saving Us from Ourselves -- Hubris and Hyperbole -- 15. Future -- Background -- Last Words
Bibliography Includes bibliographical references and index
Subject Big data.
Database management.
Form Electronic book
ISBN 9780124047242
0124047246
0124045766
9780124045767