Tuesday, September 10, 2013

A Naive Implementation of Rabin Karp


Compile with gcc filename.c -lm -Werror -Wall

/*
 *
 * Program to check if a string is a substring of another
 * string. We will use Rolling Hash for this. Rabin Algorithm
 *

 */

#include
#include
#include

/*
 * The PRIME has to be big but for debugging I have taken a smaller value
 */
#define PRIME 5

/*
 * Function which will find the initial hash
 */
long hash(char *str, int window, int prime) {
        unsigned long hash=0;

        int i, temp;
        for(i=0; i                temp=str[i];
                temp=temp*pow(prime, window-1-i);
                hash=hash+temp;
        }
        return hash;
}

/*
 * It finds the rolling has by removing the first character and
 * adding one more character
 */

long rehash(char *str, int window, int prime, long oldhash) {
        char first=str[0];
        char last=str[window];
        long temp=first*pow(prime,window-1);
        long hash=oldhash-temp;
        hash=hash*prime;
        hash=hash+last;
        return hash;
}


/*
 * substr function works hard to find if the string is a
 * substring or not.
 */

char *substr(char *haystack, char *needle) {
        int i=0;
        long temphash;
        int window=strlen(needle);

        /*
         * Find the hash of the needle
         */
        long needleHash=hash(needle, window, PRIME);

#ifdef DEBUG
        /*
         * This part of the code just prints the different hash for the
         * particular window
         */
        printf("\nHash of the needle is %ld", needleHash);
        for (i=0; i                printf("\nRolling hash of %.*s is %ld", window, haystack+i, temphash=hash(haystack+i, window, PRIME ));
        }
#endif
        /*
         * Find the hash of the first window
         */
        temphash=hash(haystack, window, PRIME);

        /*
         * Find the hashes for the next windows
         */
        for (i=1; i                temphash=rehash(haystack+i-1, window, PRIME, temphash);

#ifdef DEBUG
                printf("\nRolling hash Actual of %.*s is %ld", window, haystack+i, temphash);
#endif
                /*
                 * You have the hash of the next window, check
                 * against the needle hash
                 */
                if (temphash==needleHash) {
                        printf("\nHere is the start of the string \"%s\"", haystack+i);
                        /*
                         * Here if you wish you can write code to check it
                         * byte by byte to rule out chances of collison
                         */
                        return haystack+i;
                        return 0;
                }
        }
        return NULL;
}

/*
 * main
 */
int main() {
        char haystack[]="I Love My Country Very Much";
        char needle[]="Much";
        char *str=substr(haystack, needle);
        if (str) {
                printf("\nThe substr is \"%s\"", str);
        } else {
                printf("\nCould not find the substring");
        }
       
        str=substr("HelloWorldIsTheTestString", "Hey");
        if (str) {
                printf("\nThe substr is \"%s\"", str);
        } else {
                printf("\nCould not find the substring");
        }
        return 0;
}

Wednesday, April 25, 2012

Editing PDFs in Linux

Sometimes we need to edit the PDF files to make it suite our requirements. There are many tools available for it like pdfedit, pdfcrop, gpdftext etc. on Linux and lot other proprietary software in Windows.

I found editing with GIMP (GNU Image Manipulation Program) to be the best and easiest one, though it requires some hands-on with itself or Adobe Photoshop.

Steps

  1. First open the PDF file with GIMP. You will see all the pages come as layers.
  2. Go to the particular layer you want to edit. Make changes using the various versatile tools available for editing. 
  3. In my case I had to remove the white-space around the contents so that while printing the paper is fully utilised. For cropping you need to just crop the first layer and all the layers are cropped.
  4. Check the layers for accidental croppings.
  5. Save the file, exit from GIMP
  6. Now comes the interesting part. We need to save all the layers as separate images so that we can concat them to one pdf.
  7. We can do this by using this magnificent script. Before this script I was unaware that we can write scripts in GIMP. The author has done a great task.
  8.  Following is the script if the links are broken
    #!/usr/bin/env python

    from gimpfu import *

    from os.path import join

    def save_all_layers(image, __unused_drawable, directory, name_pattern):

        for layer in image.layers:
            filename = join(directory, name_pattern % layer.name)
            raw_filename = name_pattern % layer.name
            pdb.gimp_file_save(image, layer, filename, raw_filename)

    register(

        "python_fu_save_all_layers",
        "Save all layers into separate files",
        "Save all layers into separate files",
        "Lie Ryan",
        "Lie Ryan",
        "2012",
        "/File/Save layers as images...",
        "*",
        [
            (PF_DIRNAME, "directory", "Directory to put the images on", "/"),
            (PF_STRING, "name_pattern", "Pattern for file name, %s will be replaced with layer name", "%s.png"),
        ],
        [],
        save_all_layers
    )

    main()

  9. Save the script in the folder ".gimp-X-X/plug-ins/save_all_layers.py
  10. Restart the GIMP and open the file.
  11. In the menu you will have a new entry "FILE>Save all layers into separate files"
  12. Add the required prefix, better add "page" as prefix. Give the folder where you want to save the images.
  13. Now using the convert tool on your Linux systems you need to change the images to a single pdf. 
  14. For those who do not have convert installed run "sudo apt-get install imagemagick". It will install the related tools.
  15. Running "convert page*.png" mydoc.pdf will give the pages in incorrect order, the order will be same as that of the order in "ls" command. If the pages are more than 100 then it will create ordering issues in the pdf.
  16. For solving this find the maximum number of images files, and run the following script.
  17. #!/bin/sh
    main() {
        start=$1
        end=$2
        prefix=$3
        ext=$4
        output_file=$5
        echo "convert --quality=100" > convert.sh
        for i in `seq $start $end`; do
            echo "$prefix$i.$ext" >> convert.sh
        done
        echo "$output_file" >> convert.sh
       
        tr '\n' ' ' < convert.sh > convert2.sh
        mv -f convert2.sh convert.sh
        chmod 777 convert.sh
    }

    if [ "$#" != "5" ]; then
        echo "Error in input"; echo "Usage -- START_NUMBER END_NUMBER PREFIX EXTENSION OUTPUT_FILE"
        exit
    else
        main $1 $2 $3 $4 $5".pdf"
    fi
     
  18. Save the following in any file, give it execute permission "chmod 777 gen_script.sh"
  19. call the script with proper inputs for example "./gen_script START_NUMBER END_NUMBER PREFIX EXTENSION OUTPUT_FILE"
    1. where 
      1. START_NUMBER is the first page number
      2. END_NUMBER is the last page number
      3. PREFIX is the prefix of every file you gave in GIMP
      4. EXTENSION is the extension is the file extension of the images
      5. OUTPUT_FILE is the name of the pdf file you want to generate
  20. Example: ./gen_script.sh 1 210  page png mydoc.pdf
  21. You will get a script named convert.sh in your directory. You just need to run the script and you have the final edited pdf.

Tuesday, July 27, 2010

Saving FLV videos from browser's cache


Rather than installing plug-ins in firefox for downloading videos you can easily copy the viewed video from your firefox's cache. This is how you do it. This is specific to Linux.

Find the location of your browser's cache on the disk 
Ideally it should be located at /home/YOUR-HOME-DIR/.mozilla/firefox/YOUR_PROFILE/Cache

Use the file command to find the flv/flash files in the cache
Sort them by time to search through them easily
   ls -t arranges the output with respect to the modification time
   # ls -t | xargs file | grep -i flash
   A9E2629Ed01: Macromedia Flash Video
   D69CD130d01: Macromedia Flash data (compressed), version 10
   792592C0d01: Macromedia Flash Video
   DBABC51Ed01: Macromedia Flash data (compressed), version 10
   21DF1037d01: Macromedia Flash data (compressed), version 10
   43258D0Fd01: Macromedia Flash data (compressed), version 8
   17DA266Fd01: Macromedia Flash data (compressed), version 7
 


By some hit and trial you will be able to find the file you want to save
    Play the files using any video player which supports flv videos
    # vlc A9E2629Ed01 &

Use mv command to rename and move the file to the desired location
   # mv A9E2629Ed01 ~/Videos/rise_up.flv

You are done :-)