Co-Founder of Mahout Isabel Drost talks about "Going from raw data to information" at ApacheCon US

"Apache Mahout - Going from raw data to information" is the title of Isabel Drost's talk at ApacheCon US. Isabel is well known in the open search community. She co-founded the Lucene sub-project Apache Mahout and organizes meetups of core contributors around Hadoop, Lucene and Mahout in Berlin. This year's ApacheCon US takes place in Oakland, California. Isabel's talk is on Friday, November 6, 2009 at 10:00 in the morning (track 4).

About the talk: It has become very easy to create, publish, and collect data in digital form. The volume of structured and unstructured data is increasing at a tremendous pace. This has led to a whole new set of applications that can be built if one solves the problem of turning raw data into valuable information.

Hadoop Meet up in Berlin

On September 8, 2008 at 5 p.m. there will be a new Hadoop get together at the newthinking store, Tucholskystr. 48, Berlin.
This is going to be the second German Hadoop get together in Berlin. Just like last time, there will be slots of 20 minutes each for talks on your Hadoop topic. After each talk there will be plenty of time for discussion. ...

Talks scheduled so far:

  • The topic of Marc Hofer's talk is: "UIMA scale-out with MapReduce using Apache Hadoop".
  • Rasmus Hahn is going to share his experiences with Hadoop from the perspective of his projects at neofonie (
Apache Hadoop is a free Java software framework that supports data intensive distributed applications running on large clusters of commodity computers.[1] It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google's MapReduce and Google File System (GFS) papers. Hadoop is a top level Apache project, being built and used by a community of contributors from all over the world.[2] Yahoo! has been the largest contributor[3] to the project and uses Hadoop extensively in its Web Search and Advertising businesses.[4] IBM and Google have announced a major initiative to use Hadoop to support University courses in Distributed Computer Programming.[5] Hadoop was created by Doug Cutting (now a Yahoo employee), who named it after his child's stuffed elephant. It was originally developed to support distribution for the Nutch search engine project.[6] (Wikipedia Version 16 August 2008, at 19:45,



[via Isabel Drost]

Build your Own Search Service - BOSS: Yahoo opens up its search infrastructure further

With a new service, "Build your Own Search Service" (BOSS), Yahoo is opening up its search further than ever before. With Search Monkey, Yahoo had already created a customizable search engine that users can assemble with a few clicks, although it is not 100% open. Users of other services could likewise click together personal search services at Eurekster, Rollyo, Microsoft and Google. "However, all of these services are restricted in one way or another; users do not get full access to the index." (

Yahoo BOSS

BOSS (Build your Own Search Service) is Yahoo!'s open search web services platform. The goal of BOSS is simple: to foster innovation in the search industry. Developers, start-ups, and large Internet companies can use BOSS to build and launch web-scale search products that utilize the entire Yahoo! Search index. BOSS gives you access to Yahoo!'s investments in crawling and indexing, ranking and relevancy algorithms, and powerful infrastructure. By combining your unique assets and ideas with our search technology assets, BOSS is a platform for the next generation of search innovation, serving hundreds of millions of users across the Web. (11.7.2008,

With the current developments at Yahoo, are we on the way to free search engines and open access to search technologies? As Semager recently reported, Yahoo already uses open source search software for its Webmap.

Hadoop Now at the Heart of Every Yahoo! Search ... On a very related note, we're announcing today that we implemented what we believe is the world's largest commercial application of Apache Hadoop. We are now using Hadoop to process the Webmap -- the application which produces the index from the billions of pages crawled by Yahoo! Search. (19.2.2008,
