Pig Pig Pig!


Have you heard of Pig? It is a very useful tool for processing large datasets on Hadoop. Pig runs the MapReduce jobs for me on the Hadoop cluster, so I don't have to write lengthy and complicated MapReduce code in Python. I got the IMDB dataset again, and this time I want to find out which movie is the oldest 5-star movie and which popular movies are bad ones, with ratings below 2. The file below walks through the whole procedure; please take a look.
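
Before opening the file, here is a rough sketch of the logic behind the two questions, written in plain Python on a handful of made-up rows. It is only an illustration, not the Pig script itself: the sample movies, the column order (title, year, rating, votes), and the `MIN_VOTES` cutoff for "popular" are all my own assumptions, and the real IMDB data may be laid out differently. In Pig the same idea would be expressed with operators such as FILTER, ORDER and LIMIT.

```python
# Made-up sample rows, NOT real IMDB data: (title, year, rating, votes)
movies = [
    ("Movie A", 1931, 5.0, 1200),
    ("Movie B", 1975, 5.0, 800),
    ("Movie C", 1999, 1.5, 50000),
    ("Movie D", 2004, 1.8, 300),
    ("Movie E", 2010, 3.9, 70000),
]

# Assumed cutoff for calling a movie "popular"; pick whatever suits the data.
MIN_VOTES = 1000


def oldest_five_star(rows):
    """Return the earliest movie rated exactly 5.0, or None if there is none."""
    five_star = [r for r in rows if r[2] == 5.0]
    return min(five_star, key=lambda r: r[1]) if five_star else None


def popular_bad(rows, min_votes=MIN_VOTES):
    """Return movies rated below 2 with at least min_votes votes, most voted first."""
    bad = [r for r in rows if r[2] < 2.0 and r[3] >= min_votes]
    return sorted(bad, key=lambda r: r[3], reverse=True)


print(oldest_five_star(movies))
print(popular_bad(movies))
```

On a real cluster Pig would do the filtering and sorting across many machines; the sketch only shows what is being asked of the data.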
