As such, I hope you see this book as an open invitation to get started with Kettle in the wonderful world of data integration. Whether there is a migration to do, an ETL process to run, or a need for massively loading data into a database, you have several software tools, ranging from expensive and sophisticated to free, open source, and friendly ones, that help you accomplish the task. Ten years ago, the scenario was clearly different. By , Matt Casters, a Belgian business intelligence consultant, had been working for a while as a data warehouse architect and administrator.
As such, he was one of quite a number of people who, no matter whether the company they worked for was big or small, had to deal with the difficulties involved in bridging the gap between information technology and business needs. What made it even worse at that time was that ETL tools were prohibitively expensive and everything had to be crafted by hand. The last employer he worked for didn't think that writing a new ETL tool would be a good idea.
This was one of the motivations for Matt to become an independent contractor and to start his own company. That was in June. At the end of that year, he told his wife that he was going to write a new piece of software for himself to do ETL tasks.
It was going to take up some time left and right in the evenings and weekends. Surprised, she asked how long it would take him to get it done. He replied that it would probably take five years and that he would perhaps have something working in three. Working on that started in early . Matt's main goals for writing the software included learning about databases, ETL processes, and data warehousing. This would in turn improve his chances on a job market that was pretty volatile.
Ultimately, it would allow him to work full time on the software. Another important goal was to understand what the tool had to do. Matt wanted a scalable and parallel tool, and wanted to isolate rows of data as much as possible. Last but not least, he had to pick the right technology to support the tool. The first idea was to build it on top of KDE, the popular Unix desktop environment. Trolltech, the people behind Qt, the core UI library of KDE, had released plans to create database drivers for popular databases.
However, the lack of decent drivers for those databases drove Matt to change plans and use Java. He picked Java because he had some prior experience with it: he had written a Japanese chess (Shogi) database program when Java 1. After a year of development, the tool was capable of reading text files, reading from databases, and writing to databases, and it was very flexible.
The code had grown unstructured, crashes occurred all too often, and it was hard to get something going with the Java graphics library used at that moment, the Abstract Window Toolkit (AWT); it looked bad and it was slow. As for the library, Matt decided to start using the newly released Standard Widget Toolkit (SWT), which helped solve part of the problem.
As for the rest, Kettle was a complete mess. It was time to ask for help. At various intervals over the next few years, Wim involved himself in the project, giving Matt advice about good practices in Java programming. Listening to that advice meant making massive amounts of code changes. As a consequence, it was not unusual to spend weekends doing nothing but refactoring code and fixing the thousands of errors that resulted.
But, bit by bit, things kept going in the right direction. At the same time, Matt also showed the results to his peers, colleagues, and other senior BI consultants to hear what they thought of Kettle. That was how he got in touch with the Flemish Traffic Centre www. All of a sudden, he was being paid to deploy and improve Kettle to handle that job. The diversity of test cases at the Traffic Centre helped to improve Kettle dramatically. That was somewhere in and Kettle was by its version 1.
While working at the Flemish Traffic Centre, Matt also posted messages on Javaforge www. He got a few reactions. Despite some of them being remarkably negative, most were positive. The most interesting response came from a nice guy called Jens Bleuel in Germany who asked if it was possible to integrate third-party software into Kettle. Kettle didn't have a plugin architecture, so Jens' question made Matt think about a plugin system, and that was the main motivation for developing version 2.
For various reasons, including the birth of Matt's son Sam and a lot of consultancy work, it took around a year to release Kettle version 2. It was a fairly complete release, with advanced support for slowly changing dimensions and junk dimensions (Chapter 9 explains those concepts), the ability to connect to thirteen different databases, and, most importantly, support for plugins.
Matt contacted Jens to let him know the news and Jens was really interested.
There was a lot of excitement, and they agreed to start promoting the sales of Kettle from the Kettle. Those were days of improvements, requests, and people interested in the project. However, it became too much to handle. Doing development and sales all by themselves was no fun after a while. As such, Matt thought about open sourcing Kettle early in and by late summer he made his decision. Jens and Proratio didn't mind and the decision was final. When they finally open sourced Kettle on December , the response was massive. The downloadable package put up on Javaforge got downloaded around times during the first week alone.
The news spread all over the world pretty quickly. What followed was a flood of messages, both private and on the forum.
At its peak in March , Matt got over messages a day concerning Kettle. In no time, he was answering questions like crazy, allowing people to join the development team and working as a consultant at the same time. Added to this, the birth of his daughter Hannelore in February was too much to deal with. Fortunately, good times came.
They had selected Enhydra Octopus, a Java-based ETL software, but they didn't have a strong reliance on a specific tool. While Jens was evaluating all sorts of open source BI packages, he came across that thread.
Matt replied immediately, persuading people at Pentaho to consider including Kettle. He must have been convincing, because the answer came quickly and was positive.
Pentaho 3.2 Data Integration Beginner's Guide
Later on, Matt got in touch with one of the other Pentaho founders, Richard Daley, who offered him a job. That allowed Matt to focus full-time on Kettle.
Four years later, he's still happily working for Pentaho as chief architect for data integration, doing his best to deliver Kettle 4. Jens Bleuel, who has collaborated with Matt since the early versions, is now also part of the Pentaho team.
She has been working as a BI consultant for the last 10 years. At the beginning she worked with the Cognos suite.
However, over the last three years, she has been dedicated, full time, to developing Pentaho BI solutions both for local and several Latin-American companies, as well as for a French automotive company in the last months.
Writing my first book in a foreign language and working on a full time job at the same time, not to mention the upbringing of two small kids, was definitely a big challenge. Now I can say that it's not impossible. I dedicate this book to my husband and kids; I'd like to thank them for all their support and tolerance over the last year.
I'd also like to thank my colleagues and friends who gave me encouraging words throughout the writing process. Special thanks to the people at Packt; working with them has been really pleasant. I'd also like to thank the Pentaho community and developers for making Kettle the incredible tool it is. Thanks to the technical reviewers who, with their very critical eye, contributed to making this a book suited to its audience.
A practical, easy-to-read guide that gives you a full understanding of the Pentaho Data Integration tool and shows you how to use it to your advantage to manipulate data.
Approach: As part of Packt's Beginner's Guide series, this book focuses on teaching by example.
Who this book is for: This book is for software developers, database administrators, IT students, and everyone involved or interested in developing ETL solutions, or, more generally, doing any kind of data manipulation.
Table of Contents
- Spoon
- Setting preferences in the Options window
- Storing transformations and jobs in a repository
- Creating your first transformation
- Time for action - creating a hello world transformation
- What just happened?
- Summary
- 2. Getting Started with Transformations
- Reading data from files
- Time for action - reading results of football matches from files
- What just happened?
- Input files
- Input steps
- Reading several files at once
- Time for action - reading all your files at a time using a single Text file input step
- What just happened?
- Time for action - reading all your files at a time using a single Text file input step and regular expressions
- What just happened?
- Regular expressions
- Troubleshooting reading files
- Grids
- Have a go hero - explore your own files
- Sending data to files
- Time for action - sending the results of matches to a plain file
- What just happened?