Randomized

Reservoir Sampling

Pinterest LinkedIn Tumblr

Reservoir sampling is a family of randomized algorithms for randomly choosing a sample of k items from a list S containing n items, where n is either a very large or unknown number.

Retrieve k random numbers from an array of the undetermined size we use a technique called reservoir sampling. Write a program to display the random strings from the given file?

source Code

package com.dsacode.Algorithm.randomized;
 
import java.util.List;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Random;
import java.util.Scanner;
 
public class ReservoirSampling {
 
    public List < String > sampler (int reservoirSize) throws FileNotFoundException, IOException {
        String currentLine=null;
        List < String > reservoirList= new ArrayList < String >(reservoirSize);
        int count=0;
 
        Random ra = new Random();
        int randomNumber = 0;
         
        @SuppressWarnings("resource")
        Scanner sc = new Scanner(new File("c:/a.txt")).useDelimiter("\n");
        while (sc.hasNext())  {
            currentLine = sc.next();
            count ++;
            if (count <= reservoirSize) {
                reservoirList.add(currentLine);
            } else if ((randomNumber = (int) ra.nextInt(count)) < reservoirSize) {
                reservoirList.set(randomNumber, currentLine);
            }
        }
        sc.close();
        return reservoirList;
    }
     
    public static void main(String[] args) {
        ReservoirSampling mySampler = new ReservoirSampling();
        System.out.println("Print the file content using  Reservoir Sampling\n");
            List < String > myList = null;
            try {
                myList = mySampler.sampler(10);
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
            for(int index = 0; index < myList.size(); index++){
                System.out.print(myList.get(index));
            }
 
    }
 
}

Output

Print the file content using  Reservoir Sampling
 
import java.util.List;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Random;
import java.util.Scanner;

Algorithm Explanation

Read the text file
Randomly select k items from a stream of items of unknown length.
Display the strings.

Time Complexity

Best CaseAverage CaseWorst Case
O(n)

Reference

  1. https://en.wikipedia.org/wiki/Reservoir_sampling
  2. http://csiq.blogspot.com/2013/10/reservoir-sampling.html
  3. http://gregable.com/2007/10/reservoir-sampling.html

Write A Comment