Description |
1 online resource |
Contents |
Machine generated contents note: 1. Providing Structure to Unstructured Data -- Background -- Machine Translation -- Autocoding -- Indexing -- Term Extraction -- 2. Identification, Deidentification, and Reidentification -- Background -- Features of an Identifier System -- Registered Unique Object Identifiers -- Really Bad Identifier Methods -- Embedding Information in an Identifier: Not Recommended -- One-Way Hashes -- Use Case: Hospital Registration -- Deidentification -- Data Scrubbing -- Reidentification -- Lessons Learned -- 3. Ontologies and Semantics -- Background -- Classifications, the Simplest of Ontologies -- Ontologies, Classes with Multiple Parents -- Choosing a Class Model -- Introduction to Resource Description Framework Schema -- Common Pitfalls in Ontology Development -- 4. Introspection -- Background -- Knowledge of Self -- eXtensible Markup Language -- Introduction to Meaning -- Namespaces and the Aggregation of Meaningful Assertions -- Resource Description Framework Triples -- Reflection -- Use Case: Trusted Time Stamp -- Summary -- 5. Data Integration and Software Interoperability -- Background -- Committee to Survey Standards -- Standard Trajectory -- Specifications and Standards -- Versioning -- Compliance Issues -- Interfaces to Big Data Resources -- 6. Immutability and Immortality -- Background -- Immutability and Identifiers -- Data Objects -- Legacy Data -- Data Born from Data -- Reconciling Identifiers across Institutions -- Zero-Knowledge Reconciliation -- Curator's Burden -- 7. Measurement -- Background -- Counting -- Gene Counting -- Dealing with Negations -- Understanding Your Control -- Practical Significance of Measurements -- Obsessive-Compulsive Disorder: The Mark of a Great Data Manager --
8. Simple but Powerful Big Data Techniques -- Background -- Look at the Data -- Data Range -- Denominator -- Frequency Distributions -- Mean and Standard Deviation -- Estimation Only Analyses -- Use Case: Watching Data Trends with Google Ngrams -- Use Case: Estimating Movie Preferences -- 9. Analysis -- Background -- Analytic Tasks -- Clustering, Classifying, Recommending, and Modeling -- Data Reduction -- Normalizing and Adjusting Data -- Big Data Software: Speed and Scalability -- Find Relationships, Not Similarities -- 10. Special Considerations in Big Data Analysis -- Background -- Theory in Search of Data -- Data in Search of a Theory -- Overfitting -- Bigness Bias -- Too Much Data -- Fixing Data -- Data Subsets in Big Data: Neither Additive nor Transitive -- Additional Big Data Pitfalls -- 11. Stepwise Approach to Big Data Analysis -- Background -- Step 1 Question Is Formulated -- Step 2 Resource Evaluation -- Step 3 Question Is Reformulated -- Step 4 Query Output Adequacy -- Step 5 Data Description -- Step 6 Data Reduction -- Step 7 Algorithms Are Selected, If Absolutely Necessary -- Step 8 Results Are Reviewed and Conclusions Are Asserted -- Step 9 Conclusions Are Examined and Subjected to Validation -- 12. Failure -- Background -- Failure Is Common -- Failed Standards -- Complexity -- When Does Complexity Help? -- When Redundancy Fails -- Save Money; Don't Protect Harmless Information -- After Failure -- Use Case: Cancer Biomedical Informatics Grid, a Bridge Too Far -- 13. Legalities -- Background -- Responsibility for the Accuracy and Legitimacy of Contained Data -- Rights to Create, Use, and Share the Resource -- Copyright and Patent Infringements Incurred by Using Standards -- Protections for Individuals -- Consent -- Unconsented Data -- Good Policies Are a Good Policy -- Use Case: The Havasupai Story --
14. Societal Issues -- Background -- How Big Data Is Perceived -- Necessity of Data Sharing, Even When It Seems Irrelevant -- Reducing Costs and Increasing Productivity with Big Data -- Public Mistrust -- Saving Us from Ourselves -- Hubris and Hyperbole -- 15. Future -- Background -- Last Words |
Bibliography |
Includes bibliographical references and index |
Subject |
Big data.
|
|
Database management.
|
Form |
Electronic book
|
ISBN |
9780124047242 |
|
0124047246 |
|
0124045766 |
|
9780124045767 |
|