Posted on 11:56 by Dan Harvey and filed under

It's that time again for our next user group meet-up, and this time the talks are on the theme of data integration. We've got three talks organised for the evening :-
  • Informatica - Leveraging Unstructured Data Stored in Hadoop
Learn how Informatica (the leader in Data Integration) can help with unlocking unstructured, semi-structured and complex XML data stored in a Hadoop environment.
Find out how Flume can be used to easily ingest data from a range of different sources in your system into places like HDFS/Hive, and how this has helped SongKick to improve their data analytics internally.
  • Talend - Leveraging MapReduce with Talend: Hadoop, Hive, Pig, and Talend FileScale
Talend Integration Suite MPx packages Talend’s support for Hadoop technologies, and Talend FileScale, a high-performance MapReduce solution for flat file processing. In this presentation, you will learn how to use MPx to incorporate MapReduce-based processing into your data integration solutions. An overview of the MPx technologies, use cases and business benefits will be followed by a demo of the Hadoop features of TIS MPx.
We'll have pizza and drinks again thanks to Informatica and Talend, who are our sponsors for the evening. As usual, because it's hosted at SkillsMatter, you will also need to register on their events page for the evening, but please use our meetup.com/hadoop-users-group-uk page too so you can find out about others who are going!
As usual, we'll be going to The Slaughtered Lamb afterwards to carry on the socialising and discussion after the talks.
See you there,
Posted on 20:38 by Dan Harvey and filed under

Our September meetup is now less than a week away so, to try and start updating this blog more frequently again, here's a post about what's coming up. The theme for the evening will be how the Hadoop project has developed over the last few months. We've got the following two talks lined up :-

State of the Hadoop ecosystem - Dan Harvey
"Over the last 6 months the Hadoop community has seen a lot of change, with many new companies opening up work based around the map reduce model that Hadoop uses. This talk will summarise these developments and point out some interesting discussions that have occurred around them, to highlight where the Hadoop project appears to be heading."

Applications of MapReduce to the Financial Industry - Simon Waterer (Platform Computing)
"The threat of a double-dip recession and greater cost controls are further complicating the abilities of firms to create this utility grid and to scale-up their risk and customer analytics systems. In days past, adding hardware to enable deeper analytics and more complex simulations was the easy answer. No longer. Firms now need to provide greater compute capacity - without performance degradation - using the same or fewer resources. In addition, banks and insurers require highly available systems. The new Platform MapReduce, a 100% compatible version of Apache Hadoop, includes a highly available, scalable utility grid required by financial firms."

Platform Computing are also kindly sponsoring beer and pizza for this evening's meetup, so those will be making a return!

If you want to come along, you'll need to register at SkillsMatter beforehand as usual.

See you at the meetup on Thursday,

Posted on 23:04 by Dan Harvey and filed under

Last week was our April meet-up, where we had two great presentations on the more algorithmic uses of Hadoop. This was a first step towards showing the wider use of map reduce beyond the "bean counting" which people often associate it with.

The first was from Last.fm, where Mark Levy gave an interesting talk on their use of Hadoop for topic modelling, graph-based recommendations, and audio analysis. This went into details of how both topic modelling and some of the graph-based algorithms can be calculated through iterative map reduce processing.

After this we had a presentation from Sean Owen about the collaborative filtering algorithms used in the Apache Mahout project. This started out by explaining the details behind collaborative filtering before going on to show how it can be implemented on top of map reduce.

You can find the slides and videos for all these talks on the Lanyrd event page. Thanks again to SkillsMatter for hosting the event and recording the videos for us.

Overall there was a lot of interest in these talks, with some great discussion in the pub afterwards, so we'll try to get some speakers on these and related topics again.

As ever, please let us know if you want to give a talk on anything Hadoop related. We'll be sending out details of our May meet-up in the next few days.
Posted on 12:28 by Dan Harvey and filed under

Last Wednesday was the March meet-up for the Hadoop Users Group in London. We were lucky to have Jakob Homan, Owen O'Malley and Sanjay Radia over from LinkedIn and Yahoo! in the San Francisco Bay Area. They were over to accept the Guardian Media Innovation Award, which Hadoop won, beating out competition from WikiLeaks and the iPad. Congratulations to everyone involved in the Apache Hadoop project!

The evening was a great success, with around 70 people turning out at the Yahoo! London office. Pizza was provided by Cloudera and drinks in the pub afterwards by Yahoo Developer Networks, who were both sponsors for the event.

The two talks from Yahoo! focused on improvements to MapReduce and HDFS :-

Federated HDFS by Sanjay Radia

The talk from LinkedIn was on one of their new open source projects for stream processing :-

Kafka by Jakob Homan

It's great to see where the future of the Hadoop project is going, and it is as interesting as ever. We'll be having our next meet-up on April the 14th, so it's best to keep that date free in your diary. Details will be coming soon.

Also, as ever, if you want to give a talk about anything Hadoop related or help out with the group in any way, let us know, or speak up on our mailing list.
Posted on 22:04 by Dan Harvey and filed under

The first HUGUK meetup of 2011 is going to be in two weeks, on February 10th, and this time we're back at Skills Matter over in east London. We've got two talks arranged for the evening, with details below. You need to sign up if you want to attend, so please head over to the Skills Matter event page to register. The details of the two talks are :-

Overview of Hadoop in 2010 and what's coming up in 2011
Pig & Project Voldemort: Big data loading (15 min)
Dan Harvey is a Datamining Engineer at Mendeley

Lily - an open source smart content repository running on top of HBase and friends. (45 mins)
Steven Noels is co-founder and CEO of Outerthought

These should be two great talks so we hope to see you there!

Also, we're looking for speakers for evenings every month this year, so please get in touch if you have anything you would like to present.
Posted on 14:38 by Dan Harvey and filed under

Over the last year HUGUK has gathered a lot of momentum in the UK and London due to the ever-growing need for large-scale data processing in a range of companies, from startups to large established businesses. In 2010 HUGUK was run by Klaas from Last.fm, who is now leaving the UK, so I am taking over the running of the meetups in 2011. To start, I thought I would review the last year and see what people would like from HUGUK over the coming year.

We have had four meetups this year covering quite a range of Hadoop-related projects, such as the Hive and Pig higher-level languages, upcoming middleware products from companies like Cloudera (who have been kind enough to sponsor the meetups over the last year), and many applications and uses of Hadoop in production. Finally, we had a half-day meetup on HBase, which went down well, with speakers from Facebook and StumbleUpon in California. You can find more about all the past meetups on Lanyrd, which I've linked to from here on the meetups page.

If you were a speaker at one of these, it would be great if you could add yourself and attach slides if you like, so users can easily find past talks. Hopefully, over time, this will make it easier to find previous HUGUK talks and see what's going on with Hadoop in the UK.

Finally, I'm starting to organise meetups for next year, so please let me know if you have anything you want to present or let people know about, and also what you would like to get from the meetups. Are the current presentations useful? Would more practical workshops be interesting? Or maybe more details of the algorithms and research that are being used with Hadoop? Let me know and I'll see what I can do!

You can also now follow us on Twitter to keep up with Hadoop in the UK: @hug_uk

Hope you all have a great New Year.

Dan Harvey
Posted on 10:33 by Klaas Bosteels and filed under

All of the HUGUK #7 slides and videos are now available on the Lanyrd page for this meetup. Many thanks to Dario Liberman for going to the trouble of recording the talks and uploading the videos.

Hopefully being able to download the slides and videos can mitigate the disappointment somewhat for the unfortunately quite large group of people who would've liked to attend but didn't manage to get a ticket. Next time we'll have to try and find a bigger room I guess, but in any case the best strategy will probably always be to keep an eye on this blog and register as soon as you can!