E-book
Author Sergio Pulvirenti, Adrián

Title Pentaho data integration 4 cookbook : over 70 recipes to solve ETL problems using Pentaho Kettle / Adrián Sergio Pulvirenti, María Carina Roldán
Published Birmingham, U.K. : Packt Pub., 2011

Description 1 online resource (1 electronic source (iii, 332 pages)) : illustrations
Series Quick answers to common problems
Contents Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Working with Databases; Introduction; Connecting to a database; Getting data from a database; Getting data from a database by providing parameters; Getting data from a database by running a query built at runtime; Inserting or updating rows in a table; Inserting new rows where a simple primary key has to be generated; Inserting new rows where the primary key has to be generated based on stored values; Deleting data from a table
Creating or altering a database table from PDI (design time); Creating or altering a database table from PDI (runtime); Inserting, deleting, or updating a table depending on a field; Changing the database connection at runtime; Loading a parent-child table; Chapter 2: Reading and Writing Files; Introduction; Reading a simple file; Reading several files at the same time; Reading unstructured files; Reading files having one field by row; Reading files with some fields occupying two or more rows; Writing a simple file; Writing an unstructured file
Providing the name of a file (for reading or writing) dynamically; Using the name of a file (or part of it) as a field; Reading an Excel file; Getting the value of specific cells in an Excel file; Writing an Excel file with several sheets; Writing an Excel file with a dynamic number of sheets; Chapter 3: Manipulating XML Structures; Introduction; Reading simple XML files; Specifying fields by using XPath notation; Validating well-formed XML files; Validating an XML file against DTD definitions; Validating an XML file against an XSD schema; Generating a simple XML document
Generating complex XML structures; Generating an HTML page using XML and XSL transformations; Chapter 4: File Management; Introduction; Copying or moving one or more files; Deleting one or more files; Getting files from a remote server; Putting files on a remote server; Copying or moving a custom list of files; Deleting a custom list of files; Comparing files and folders; Working with ZIP files; Chapter 5: Looking for Data; Introduction; Looking for values in a database table; Looking for values in a database (with complex conditions or multiple tables involved)
Looking for values in a database with extreme flexibility; Looking for values in a variety of sources; Looking for values by proximity; Looking for values consuming a web service; Looking for values over an intranet; Chapter 6: Understanding Data Flows; Introduction; Splitting a stream into two or more streams based on a condition; Merging rows of two streams with the same or different structures; Comparing two streams and generating differences; Generating all possible pairs formed from two datasets; Joining two or more streams based on given conditions; Interspersing new rows between existent rows
Summary This book has step-by-step instructions to solve data manipulation problems using PDI in the form of recipes. It has plenty of well-organized tips, screenshots, tables, and examples to aid quick and easy understanding. If you are a software developer or anyone involved or interested in developing ETL solutions, or in general, doing any kind of data manipulation, this book is for you. It does not cover PDI basics, SQL basics, or database concepts. You are expected to have a basic understanding of the PDI tool, SQL language, and databases.
Notes Includes index
English
Subject Data integration (Computer science)
Database management -- Computer programs.
Open source software.
Database Management Systems
COMPUTERS -- Data Processing.
Form Electronic book
Author Roldán, María Carina
ISBN 9781849515252
1849515255
1849515247
9781849515245
1283349442
9781283349444
9786613349446
6613349445