ch.hevs.io.hadoop

Class MultipleLineTextRecordReader

    • Constructor Summary

      Constructors 
      Constructor and Description
      MultipleLineTextRecordReader(org.apache.hadoop.conf.Configuration job, org.apache.hadoop.mapreduce.lib.input.FileSplit split) 
      MultipleLineTextRecordReader(java.io.InputStream in, long offset, long endOffset, org.apache.hadoop.conf.Configuration job) 
      MultipleLineTextRecordReader(java.io.InputStream in, long offset, long endOffset, int maxLineLength) 
    • Method Summary

      Methods 
      Modifier and Type Method and Description
      void close() 
      org.apache.hadoop.io.LongWritable createKey() 
      org.apache.hadoop.io.Text createValue() 
      org.apache.hadoop.io.LongWritable getCurrentKey() 
      org.apache.hadoop.io.Text getCurrentValue() 
      float getProgress()
      Gets the progress within the split.
      void initialize(org.apache.hadoop.mapreduce.InputSplit is, org.apache.hadoop.mapreduce.TaskAttemptContext tac) 
      boolean nextKeyValue()
      Reads a given number of lines.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final java.util.logging.Logger LOG
      • compressionCodecs

        private org.apache.hadoop.io.compress.CompressionCodecFactory compressionCodecs
      • start

        private long start
      • pos

        private long pos
      • end

        private long end
      • maxLineLength

        int maxLineLength
      • lineRead

        private org.apache.hadoop.io.Text lineRead
      • linesNumber

        private int linesNumber
      • key

        private org.apache.hadoop.io.LongWritable key
      • value

        private org.apache.hadoop.io.Text value
    • Constructor Detail

      • MultipleLineTextRecordReader

        public MultipleLineTextRecordReader(org.apache.hadoop.conf.Configuration job,
                                    org.apache.hadoop.mapreduce.lib.input.FileSplit split)
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • MultipleLineTextRecordReader

        public MultipleLineTextRecordReader(java.io.InputStream in,
                                    long offset,
                                    long endOffset,
                                    int maxLineLength)
      • MultipleLineTextRecordReader

        public MultipleLineTextRecordReader(java.io.InputStream in,
                                    long offset,
                                    long endOffset,
                                    org.apache.hadoop.conf.Configuration job)
                                     throws java.io.IOException
        Throws:
        java.io.IOException
    • Method Detail

      • createKey

        public org.apache.hadoop.io.LongWritable createKey()
      • createValue

        public org.apache.hadoop.io.Text createValue()
      • getProgress

        public float getProgress()
        Gets the progress within the split.
        Specified by:
        getProgress in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
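The source of getProgress is not shown on this page. As a rough illustration only, the sketch below assumes progress is derived from the start, pos and end fields listed under Field Detail, in the way Hadoop's own line-based readers commonly compute it. The class name SplitProgressSketch and its static method are hypothetical and use plain Java so the snippet runs without Hadoop on the classpath.

```java
// Hypothetical stand-alone sketch; NOT the actual source of MultipleLineTextRecordReader.
final class SplitProgressSketch {

    /**
     * Fraction of the byte range [start, end) already consumed when the
     * reader is positioned at pos. Assumption: start <= pos.
     */
    static float progress(long start, long pos, long end) {
        if (start == end) {
            return 0.0f; // empty split: avoid dividing by zero
        }
        // Clamp to 1.0 because a reader may run slightly past the split end
        // while finishing its last record.
        return Math.min(1.0f, (pos - start) / (float) (end - start));
    }

    public static void main(String[] args) {
        System.out.println(progress(100, 150, 300)); // prints 0.25
    }
}
```

The clamp matters for a multi-line reader in particular, since the final record of a split can extend beyond the nominal end offset.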
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.io.Closeable
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        java.io.IOException
      • getCurrentKey

        public org.apache.hadoop.io.LongWritable getCurrentKey()
                                                        throws java.io.IOException,
                                                               java.lang.InterruptedException
        Specified by:
        getCurrentKey in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • getCurrentValue

        public org.apache.hadoop.io.Text getCurrentValue()
                                                  throws java.io.IOException,
                                                         java.lang.InterruptedException
        Specified by:
        getCurrentValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • initialize

        public void initialize(org.apache.hadoop.mapreduce.InputSplit is,
                      org.apache.hadoop.mapreduce.TaskAttemptContext tac)
                        throws java.io.IOException,
                               java.lang.InterruptedException
        Specified by:
        initialize in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        java.io.IOException
        java.lang.InterruptedException
      • nextKeyValue

        public boolean nextKeyValue()
                             throws java.io.IOException,
                                    java.lang.InterruptedException
        Reads a given number of lines. The number of lines to read must be specified in the property mapreduce.textrecordreader.linecount.
        Specified by:
        nextKeyValue in class org.apache.hadoop.mapreduce.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
        Throws:
        java.io.IOException
        java.lang.InterruptedException
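The grouping behaviour described for nextKeyValue can be pictured with a small plain-Java sketch. It is not the class's implementation: MultiLineGroupingSketch, its readAll helper, the use of String instead of Text, the newline used to join lines, and the record counter used as key are all assumptions made for illustration. This page does not state what the real LongWritable key holds or how a short final group is handled.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; NOT the actual source of MultipleLineTextRecordReader.
final class MultiLineGroupingSketch {
    private final BufferedReader in;
    private final int linesPerRecord; // stands in for mapreduce.textrecordreader.linecount
    private long key = -1;            // assumption: a simple record counter
    private String value;

    MultiLineGroupingSketch(BufferedReader in, int linesPerRecord) {
        if (linesPerRecord < 1) {
            throw new IllegalArgumentException("linesPerRecord must be >= 1");
        }
        this.in = in;
        this.linesPerRecord = linesPerRecord;
    }

    /** Reads up to linesPerRecord lines into one value; returns false at end of input. */
    boolean nextKeyValue() throws IOException {
        StringBuilder sb = new StringBuilder();
        int read = 0;
        String line;
        while (read < linesPerRecord && (line = in.readLine()) != null) {
            if (read > 0) {
                sb.append('\n'); // assumption: grouped lines are joined with a newline
            }
            sb.append(line);
            read++;
        }
        if (read == 0) {
            return false; // nothing left to read
        }
        key++;
        value = sb.toString();
        return true;
    }

    long getCurrentKey() { return key; }

    String getCurrentValue() { return value; }

    /** Convenience helper: groups all lines of text into records. */
    static List<String> readAll(String text, int linesPerRecord) {
        MultiLineGroupingSketch reader =
                new MultiLineGroupingSketch(new BufferedReader(new StringReader(text)), linesPerRecord);
        List<String> records = new ArrayList<>();
        try {
            while (reader.nextKeyValue()) {
                records.add(reader.getCurrentValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }
}
```

In this sketch, calling readAll on five input lines with linesPerRecord set to 2 yields three records, the last holding the single leftover line.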