Posted on 10:10 by Klaas Bosteels

The next HUGUK meetup will take place on Mon 6th Sept at Skills Matter, starting at 6PM. Cloudera is sending its CEO, Mike Olson, to talk about the company's Hadoop distribution, and Ian Wrigley to present its exciting new log collector, Flume. Andy Kent will explain how Forward does the clever data analysis that Wired wrote about recently:

"Cloudera's Distribution for Hadoop and Cloudera Enterprise" by Mike Olson


Cloudera has assembled a comprehensive, fully open-source distribution of the Apache Hadoop software and related projects. This package, Cloudera's Distribution for Hadoop (CDH), version 3, is easy to acquire, install, configure, run and administer, and dramatically simplifies the use and operation of Hadoop. In addition, Cloudera offers a subscription-based package based on CDH called Cloudera Enterprise, with management and administrative dashboards aimed at IT staff, backed by the company's enterprise support and other services. In this talk, I'll describe CDH and Cloudera Enterprise.


Mike Olson is the CEO of Cloudera. Formerly he was CEO of Sleepycat Software, makers of Berkeley DB, the open source embedded database engine. He spent two years at Oracle Corporation as Vice President for Embedded Technologies after Oracle’s acquisition of Sleepycat in 2006. Prior to joining Sleepycat, Mike held technical and business positions at database vendors Britton Lee, Illustra Information Technologies and Informix Software. Mike has Bachelor’s and Master’s degrees in Computer Science from the University of California at Berkeley.

"Reliable, Distributed Streaming Log Collection with Flume" by Ian Wrigley


As the size of your log files and other dynamically-generated data increases, they become more and more difficult to manage. In this talk we'll discuss how to use Flume, an open-source framework from Cloudera, to collect your log files as they're generated and aggregate them to where you want to process them.


After a stint as a technology journalist, Ian Wrigley started one of the UK's first Web consultancies. He has been managing large amounts of data ever since, starting with flat files and Perl scripts, moving on to database servers such as MySQL, and now harnessing the power of Hadoop. He has taught courses for companies such as Learning Tree International, Sun, and Oracle, and he is currently a Hadoop Instructor at Cloudera, where he describes his job as "helping geeks become geekier." Ian is also PC Pro's Contributing Editor for Unix and Open Source.

"Hadoop in Context" by Andy Kent


Although introduced initially for research into a large document clustering problem, Hadoop (and its related sub-projects) now forms a significant part of our infrastructure that has let us more than quadruple our data volumes within a year whilst putting the majority of data into the hands of analysts before 9:00am.

This talk will tell the story of our adoption of Hadoop from our initial in-house virtualised cluster and EC2 experiment to our current dedicated cluster, the migration from our more traditional RDBMS data warehouse to Hive, and how we've developed tools and infrastructure to integrate Hadoop with the rest of our systems and help put the data in the hands of our analysts.


Andy Kent is an engineer at Forward who works with Hive and Hadoop on a daily basis.

In addition to all this, Ian Broadhead will be giving a lightning talk about the usage of Hadoop at Playfish, the social gaming site from Electronic Arts, and rumour has it there will be several attendees who are looking to hire Hadoop developers and/or data analysts. So there are plenty of reasons to reserve your spot by registering right now!

Cloudera has once again offered to sponsor free beer and pizza, so make sure you arrive early and with an appetite. They are still giving substantial discounts to all HUGUKers for their September training sessions in London too -- those of you who would like to improve or extend your Hadooping skills should definitely consider taking advantage of this offer.