ucar.nc2.iosp.bufr.MessageCompressedDataReader Maven / Gradle / Ivy

The NetCDF-Java Library is a Java interface to NetCDF files, as well as to many other types of scientific data formats.

/*
 * Copyright (c) 1998 - 2009. University Corporation for Atmospheric Research/Unidata
 * Portions of this software were developed by the Unidata Program at the
 * University Corporation for Atmospheric Research.
 *
 * Access and use of this software shall impose the following obligations
 * and understandings on the user. The user is granted the right, without
 * any fee or cost, to use, copy, modify, alter, enhance and distribute
 * this software, and any derivative works thereof, and its supporting
 * documentation for any purpose whatsoever, provided that this entire
 * notice appears in all copies of the software, derivative works and
 * supporting documentation.  Further, UCAR requests that the user credit
 * UCAR/Unidata in any publications that result from the use of this
 * software or in any product that includes this software. The names UCAR
 * and/or Unidata, however, may not be used in any advertising or publicity
 * to endorse or promote any products or commercial entity unless specific
 * written permission is obtained from UCAR/Unidata. The user also
 * understands that UCAR/Unidata is not obligated to provide the user with
 * any support, consulting, training or assistance of any kind with regard
 * to the use, operation and performance of this software nor to provide
 * the user with any updates, revisions, new versions or "bug fixes."
 *
 * THIS SOFTWARE IS PROVIDED BY UCAR/UNIDATA "AS IS" AND ANY EXPRESS OR
 * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
 * WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
 * DISCLAIMED. IN NO EVENT SHALL UCAR/UNIDATA BE LIABLE FOR ANY SPECIAL,
 * INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING
 * FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT,
 * NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION
 * WITH THE ACCESS, USE OR PERFORMANCE OF THIS SOFTWARE.
 */

package ucar.nc2.iosp.bufr;

import ucar.ma2.*;
import ucar.nc2.Structure;
import ucar.nc2.Sequence;
import ucar.unidata.io.RandomAccessFile;

import java.io.IOException;
import java.util.Formatter;
import java.util.List;
import java.util.HashMap;
import java.util.ArrayList;

/**
 * Reads through the data of a message.
 * Can count bits / transfer all or some data to an Array.
 *
 * @author caron
 * @since Nov 15, 2009
 */

/*
    Within one message there are n obs (datasets) and s fields in each dataset.
    For compressed datasets, storage order is data(fld, obs) (obs varying fastest) :

      Ro1, NBINC1, I11, I12, . . . I1n
      Ro2, NBINC2, I21, I22, . . . I2n
      ...
      Ros, NBINCs, Is1, Is2, . . . Isn

    where Ro1, Ro2, . . . Ros are local reference values (number of bits as Table B) for field i.
    NBINC1 . . . NBINCs contain, as 6-bit quantities, the number of bits occupied by the increments that follow.
     If NBINC1 = 0, all values of element I are equal to Ro1; in such cases, the increments shall be omitted.
     For character data, NBINC shall contain the number of octets occupied by the character element.
     However, if the character data in all subsets are identical NBINC=0.
    Iij is the increment for the ith field and the jth obs.
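
    The layout above can be sketched as a small stand-alone decoder. This is a minimal, hypothetical example and
    not the library's BitReader: the class CompressedFieldDemo, its bits() helper and the byte-array input are
    assumptions made for illustration. It also assumes the BUFR convention that an increment with all bits set
    marks a missing value, which is then reported as all bits set at the field's full width.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the NetCDF-Java library.
final class CompressedFieldDemo {
  private final byte[] buf;
  private int bitPos = 0;

  CompressedFieldDemo(byte[] buf) { this.buf = buf; }

  // Read nbits (0..63) as an unsigned, most-significant-bit-first integer.
  long bits(int nbits) {
    long v = 0;
    for (int i = 0; i < nbits; i++) {
      int b = (buf[bitPos >> 3] >> (7 - (bitPos & 7))) & 1;
      v = (v << 1) | b;
      bitPos++;
    }
    return v;
  }

  // Decode one compressed numeric field: Ro, NBINC, then n increments.
  static long[] decodeField(byte[] data, int bitWidth, int n) {
    CompressedFieldDemo r = new CompressedFieldDemo(data);
    long ro = r.bits(bitWidth);   // local reference value, Table B width
    int nbinc = (int) r.bits(6);  // width of each increment, always 6 bits
    long[] out = new long[n];
    for (int j = 0; j < n; j++) {
      long value = ro;            // NBINC == 0: every value equals Ro
      if (nbinc > 0) {
        long inc = r.bits(nbinc);
        long allOnes = (1L << nbinc) - 1;
        // assumption: all bits set means "missing"
        value = (inc == allOnes) ? (1L << bitWidth) - 1 : ro + inc;
      }
      out[j] = value;
    }
    return out;
  }

  public static void main(String[] args) {
    // bitWidth=8, Ro=100, NBINC=2, increments 0, 1, 2, 3 (3 = all ones = missing)
    // bit string: 01100100 000010 00 01 10 11, padded with two zero bits
    byte[] data = {0x64, 0x08, 0x6C};
    System.out.println(Arrays.toString(decodeField(data, 8, 4))); // [100, 101, 102, 255]
  }
}
```

    With NBINC = 0 no increments are stored, so the bytes {0x64, 0x00} decode to 100 for every obs.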

   A replicated field (structure) takes a group of fields and replicates them.
   Let C be the entire compressed block for the ith field, as above.

        Ci =  Roi, NBINCi, Ii1, Ii2, . . . Iin

   data:
        
        C1, (C2, C3)*r, ... Cs

   where r is set in the data descriptor, and is the same for all datasets.

   A delayed replicated field (sequence) takes a group of fields and replicates them, with the number of replications
   in the data :

         C1, dr, 6bits, (C2, C3)*dr, ... Cs

   where the width (nbits) of dr is set in the data descriptor. This dr must be the same for each dataset in the message.
   For some reason there are an extra 6 bits after the dr. My guess is that it's a programming mistake that is now needed.
   There is no description of this case in the spec or the guide.
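
   The bit accounting for this layout can be sketched as follows. The class SequenceBitsDemo and its methods are
   hypothetical names used only for illustration. The sketch assumes each compressed field occupies
   bitWidth + 6 + NBINC * n bits, and, as a simplification, that a field has the same NBINC in every replication
   (in a real message each replicated block carries its own NBINC).

```java
// Hypothetical demo class; not part of the NetCDF-Java library.
final class SequenceBitsDemo {

  // Bits used by one compressed field: Ro, the 6-bit NBINC, then n increments.
  static int fieldBits(int bitWidth, int nbinc, int n) {
    return bitWidth + 6 + nbinc * n;
  }

  // Bits used by "dr, 6bits, (fields)*dr"; each entry of fields is {bitWidth, nbinc}.
  static int sequenceBits(int drWidth, int dr, int[][] fields, int n) {
    int bits = drWidth + 6; // the replication count plus the extra 6 bits
    for (int rep = 0; rep < dr; rep++) {
      for (int[] f : fields) {
        bits += fieldBits(f[0], f[1], n);
      }
    }
    return bits;
  }

  public static void main(String[] args) {
    int n = 4;                                  // obs in the message
    int c1 = fieldBits(8, 2, n);                // 8 + 6 + 2*4 = 22
    int[][] inner = {{10, 0}, {12, 3}};         // C2 and C3
    int seq = sequenceBits(8, 2, inner, n);     // 14 + 2 * (16 + 30) = 106
    System.out.println(c1 + seq);               // 128
  }
}
```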


   --------------------------

   We use an ArrayStructureMA to hold the data, and fill it sequentially as we scan the message.
   Each field is held in an Array stored in the member.getDataArray().
   An iterator is stored in member.getDataObject() which keeps track of where we are.
   For fixed length nested Structures, we need fld(dataset, inner) but we have fld(inner, dataset), so we transpose the dimensions
     before we set the iterator.
   For Sequences, inner.length is the same for all datasets in the message, although it may vary across messages. Since we
     only iterate over the inner sequence, never across all messages, the implementation can be specific to the message.
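
   The effect of iterating over the transposed array can be sketched with plain arrays. TransposedFillDemo is a
   hypothetical stand-alone class that does not use ucar.ma2; it only shows the index arithmetic, assuming values
   arrive in message order fld(inner, dataset) and must land in a row-major fld(dataset, inner) array.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the NetCDF-Java library.
final class TransposedFillDemo {

  // arrival holds values in message order: dataset varies fastest, inner slowest.
  static int[] fill(int[] arrival, int nDatasets, int nInner) {
    int[] dest = new int[nDatasets * nInner]; // row-major fld(dataset, inner)
    for (int k = 0; k < arrival.length; k++) {
      int inner = k / nDatasets;
      int dataset = k % nDatasets;
      dest[dataset * nInner + inner] = arrival[k];
    }
    return dest;
  }

  public static void main(String[] args) {
    // 3 datasets, 2 replications: inner 0 for datasets 0..2, then inner 1
    int[] arrival = {10, 20, 30, 11, 21, 31};
    System.out.println(Arrays.toString(fill(arrival, 3, 2))); // [10, 11, 20, 21, 30, 31]
  }
}
```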

   */

public class MessageCompressedDataReader {

  /**
   * Read all datasets from a single message
   * @param s outer variables
   * @param proto prototype message, has been processed
   * @param m   read this message
   * @param raf from this file
   * @param f  output bit count debugging info (may be null)
   * @return  ArrayStructure with all the data from the message in it.
   * @throws IOException on read error
   */
  public ArrayStructure readEntireMessage(Structure s, Message proto, Message m, RandomAccessFile raf, Formatter f) throws IOException {
    // transfer info (refersTo, name) from the proto message
    DataDescriptor.transferInfo(proto.getRootDataDescriptor().getSubKeys(), m.getRootDataDescriptor().getSubKeys());

    // allocate ArrayStructureMA for outer structure
    int n = m.getNumberDatasets();
    ArrayStructureMA ama = ArrayStructureMA.factoryMA(s, new int[] {n});
    setIterators(ama);

    // map dkey to Member recursively
    HashMap<DataDescriptor, StructureMembers.Member> map = new HashMap<DataDescriptor, StructureMembers.Member>(100);
    associateMessage2Members(ama.getStructureMembers(), m.getRootDataDescriptor(), map);

    readData(m, raf, f, new Request(ama, map, null));

    return ama;
  }

  /**
   * Read some or all datasets from a single message
   *
   * @param ama place data into here in order (may be null). iterators must be already set.
   * @param m   read this message
   * @param raf from this file
   * @param r which datasets, relative to this message. null == all.
   * @param f  output bit count debugging info (may be null)
   * @throws IOException on read error
   */
  public void readData(ArrayStructureMA ama, Message m, RandomAccessFile raf, Range r, Formatter f) throws IOException {
    // map dkey to Member recursively
    HashMap<DataDescriptor, StructureMembers.Member> map = null;
    if (ama != null) {
      map = new HashMap<DataDescriptor, StructureMembers.Member>(2*ama.getMembers().size());
      associateMessage2Members(ama.getStructureMembers(), m.getRootDataDescriptor(), map);
    }

    readData(m, raf, f, new Request(ama, map, r));
  }

  // manage the request
  private class Request {
    ArrayStructureMA ama; // data goes here, may be null
    HashMap<DataDescriptor, StructureMembers.Member> map; // map of DataDescriptor to members of ama, may be null
    Range r; // requested range
    DpiTracker dpiTracker; // may be null
    int outerRow; // if inner process needs to know what row it's on

    Request(ArrayStructureMA ama, HashMap<DataDescriptor, StructureMembers.Member> map, Range r) {
      this.ama = ama;
      this.map = map;
      this.r = r;
    }

    boolean wantRow(int row) {
      if (ama == null) return false;
      if (r == null) return true;
      return r.contains(row);
    }
  }

  // An iterator is stored in member.getDataObject() which keeps track of where we are.
  // For fixed length nested Structures, we need fld(dataset, inner1, inner2,  ...) but we have fld(inner1, inner2, ... , dataset)
  //  so we permute the dimensions
  //  before we set the iterator.
  public static void setIterators(ArrayStructureMA ama) {
    StructureMembers sms = ama.getStructureMembers();
    for (StructureMembers.Member sm : sms.getMembers()) {
      //System.out.printf("doin %s%n", sm.getName());
      //if (sm.getName().startsWith("first"))
      //  System.out.println("HEY");
      Array data = sm.getDataArray();
      if (data instanceof ArrayStructureMA) {
        setIterators( (ArrayStructureMA) data);

      } else {
        int[] shape = data.getShape();
        if ((shape.length > 1) && (sm.getDataType() != DataType.CHAR)) {
          Array datap;
          if (shape.length == 2)
            datap = data.transpose(0, 1);
          else {
            int[] pdims = new int[shape.length]; // (0,1,2,3...) -> (1,2,3...,0)
            for (int i=0; i< shape.length-1; i++) pdims[i] = i+1;
            datap = data.permute( pdims);
          }
          sm.setDataObject(datap.getIndexIterator());
        } else {
          sm.setDataObject(data.getIndexIterator());
        }
      }
    }
  }

  private void associateMessage2Members(StructureMembers members, DataDescriptor parent, HashMap<DataDescriptor, StructureMembers.Member> map) throws IOException {
    for (DataDescriptor dkey : parent.getSubKeys()) {
      if (dkey.name == null) {
        //System.out.printf("ass skip %s%n", dkey);
        if (dkey.getSubKeys() != null)
          associateMessage2Members(members, dkey, map);        
        continue;
      }
      //System.out.printf("ass %s%n", dkey.name);
      StructureMembers.Member m = members.findMember(dkey.name);
      if (m != null) {
        map.put(dkey, m);

        if (m.getDataType() == DataType.STRUCTURE) {
          ArrayStructure nested = (ArrayStructure) m.getDataArray();
          if (dkey.getSubKeys() != null)
            associateMessage2Members(nested.getStructureMembers(), dkey, map);
        }

      } else {
        // System.out.printf("Cant find %s%n", dkey);
        if (dkey.getSubKeys() != null)
          associateMessage2Members(members, dkey, map);
      }
    }
  }

  // read / count the bits in a compressed message
  private int readData(Message m, RandomAccessFile raf, Formatter f, Request req) throws IOException {
    
    BitReader reader = new BitReader(raf, m.dataSection.getDataPos() + 4);
    DataDescriptor root = m.getRootDataDescriptor();
    if (root.isBad) return 0;

    DebugOut out = (f == null) ? null : new DebugOut(f);
    BitCounterCompressed[] counterFlds = new BitCounterCompressed[root.subKeys.size()]; // one for each field
    readData(out, reader, counterFlds, root, 0, m.getNumberDatasets(), req);

    m.msg_nbits = 0;
    for (BitCounterCompressed counter : counterFlds)
      if (counter != null) m.msg_nbits += counter.getTotalBits();
    return m.msg_nbits;
  }

  /**
   *
   * @param out       debug info; may be null
   * @param reader    raf wrapper for bit reading
   * @param fldCounters one for each field
   * @param parent    parent.subkeys() holds the fields
   * @param bitOffset bit offset from beginning of data
   * @param ndatasets number of compressed datasets
   * @param req       for writing into the ArrayStructure;
   * @return bitOffset
   * @throws IOException  on read error
   */
  private int readData(DebugOut out, BitReader reader, BitCounterCompressed[] fldCounters, DataDescriptor parent, int bitOffset,
                                  int ndatasets, Request req) throws IOException {

    List<DataDescriptor> flds = parent.getSubKeys();
    for (int fldidx = 0; fldidx < flds.size(); fldidx++) {
      DataDescriptor dkey = flds.get(fldidx);
      if (!dkey.isOkForVariable()) { // dds with no data to read

        // the dpi nightmare
        if ((dkey.f == 2) && (dkey.x == 36)) {
          req.dpiTracker = new DpiTracker( dkey.dpi, dkey.dpi.getNfields());
          //System.out.printf("HEY gotta dpiTracker %n");
        }

        if (out != null) out.f.format("%s %d %s (%s) %n", out.indent(), out.fldno++, dkey.name, dkey.getFxyName());
        // System.out.printf("HEY skipping %s %n", dkey);
        continue;
      }

      BitCounterCompressed counter = new BitCounterCompressed(dkey, ndatasets, bitOffset);
      fldCounters[fldidx] = counter;

      // sequence
      if (dkey.replication == 0) {
        reader.setBitOffset(bitOffset);
        int count = (int) reader.bits2UInt(dkey.replicationCountSize);
        bitOffset += dkey.replicationCountSize;

        reader.bits2UInt(6);
        // System.out.printf("EXTRA bits %d at %d %n", extra, bitOffset);
        if (null != out)
          out.f.format("%s--sequence %s bitOffset=%d replication=%s %n", out.indent(), dkey.getFxyName(), bitOffset, count);
        bitOffset += 6; // LOOK seems to be an extra 6 bits.

        counter.addNestedCounters(count);

        // make an ArrayObject of ArraySequence, place it into the data array
        bitOffset = makeArraySequenceCompressed(out, reader, counter, dkey, bitOffset, ndatasets, count, req);
        // if (null != out) out.f.format("--back %s %d %n", dkey.getFxyName(), bitOffset);
        continue;
      }

      // structure
      if (dkey.type == 3) {
        if (null != out)
          out.f.format("%s--structure %s bitOffset=%d replication=%s %n", out.indent(), dkey.getFxyName(), bitOffset, dkey.replication);

        // p 11 of "standard", doesn't really describe the case of replication AND compression
        counter.addNestedCounters(dkey.replication);
        for (int i = 0; i < dkey.replication; i++) {
          BitCounterCompressed[] nested = counter.getNestedCounters(i);
          req.outerRow = i;
          if (null != out) {
            out.f.format("%n");
            out.indent.incr();
            bitOffset = readData(out, reader, nested, dkey, bitOffset, ndatasets, req);
            out.indent.decr();
          } else {
            bitOffset = readData(null, reader, nested, dkey, bitOffset, ndatasets, req);
          }
        }
        //if (null != out) out.f.format("--back %s %d %n", dkey.getFxyName(), bitOffset);

        continue;
      }

      // all other fields

      IndexIterator iter = null;
      ArrayStructure dataDpi = null; // if iter is missing - for the dpi case
      if (req.map != null) {
        StructureMembers.Member m = req.map.get(dkey);
        if (m == null)
          System.out.printf("HEY missing member %s%n", dkey);
        iter = (IndexIterator) m.getDataObject();
        if (iter == null) {
          //System.out.printf("HEY missing iter %s%n", dkey);
          dataDpi = (ArrayStructure) m.getDataArray();
        }
      }

      reader.setBitOffset(bitOffset);  // ?? needed ??

      // char data special case
      if (dkey.type == 1) {
        int nc = dkey.bitWidth / 8;
        byte[] minValue = new byte[nc];
        for (int i = 0; i < nc; i++)
          minValue[i] = (byte) reader.bits2UInt(8);
        int dataWidth = (int) reader.bits2UInt(6); // incremental data width in bytes
        counter.setDataWidth(8*dataWidth);
        int totalWidth = dkey.bitWidth + 6 + 8*dataWidth * ndatasets; // total width in bits for this compressed set of values
        bitOffset += totalWidth; // bitOffset now points to the next field
 
        if (null != out)
          out.f.format("%s read %d %s (%s) bitWidth=%d defValue=%s dataWidth=%d n=%d bitOffset=%d %n",
                  out.indent(), out.fldno++, dkey.name, dkey.getFxyName(), dkey.bitWidth, new String(minValue, "UTF-8"), dataWidth, ndatasets, bitOffset);

        for (int dataset = 0; dataset < ndatasets; dataset++) {
          if (dataWidth == 0) { // use the min value
            if (req.wantRow(dataset))
              for (int i = 0; i < nc; i++)
                iter.setCharNext((char) minValue[i]); // ??

          } else { // read the incremental value
            int nt = Math.min(nc, dataWidth);
            byte[] incValue = new byte[nc];
            for (int i = 0; i < nt; i++)
              incValue[i] = (byte) reader.bits2UInt(8);
            for (int i = nt; i < nc; i++) // can dataWidth < n ?
              incValue[i] = 0;

            if (req.wantRow(dataset))
              for (int i = 0; i < nc; i++) {
                int cval = incValue[i];
                if (cval < 32 || cval > 126) cval = 0; // printable ascii KLUDGE!
                iter.setCharNext((char) cval); // ??
              }
            if (out != null) out.f.format(" %s,", new String(incValue, "UTF-8"));
          }
        }
        if (out != null) out.f.format("%n");
        continue;
      }

      // numeric fields
      int useBitWidth = dkey.bitWidth;

      // a dpi Field needs to be substituted
      boolean isDpi = ((dkey.f == 0) && (dkey.x == 31) && (dkey.y == 31));
      boolean isDpiField = false;
      if ((dkey.f == 2) && (dkey.x == 24) && (dkey.y == 255)) {
        isDpiField = true;
        DataDescriptor dpiDD = req.dpiTracker.getDpiDD(req.outerRow);
        useBitWidth = dpiDD.bitWidth;
        //System.out.printf("HEY gotta dpiField bitWidth=%d %n", useBitWidth);
      }

      long dataMin = reader.bits2UInt(useBitWidth);
      int dataWidth = (int) reader.bits2UInt(6);  // increment data width - always in 6 bits, so max is 2^6 - 1 = 63
      if (dataWidth > useBitWidth && (null != out))
        out.f.format(" BAD WIDTH ");
      if (dkey.type == 1) dataWidth *= 8; // char data count is in bytes
      counter.setDataWidth(dataWidth);

      int totalWidth = useBitWidth + 6 + dataWidth * ndatasets; // total width in bits for this compressed set of values
      bitOffset += totalWidth; // bitOffset now points to the next field

      if (null != out)
        out.f.format("%s read %d, %s (%s) bitWidth=%d dataMin=%d (%f) dataWidth=%d n=%d bitOffset=%d %n",
                out.indent(), out.fldno++, dkey.name, dkey.getFxyName(), useBitWidth, dataMin, dkey.convert(dataMin), dataWidth, ndatasets, bitOffset);

      // numeric fields

      // if dataWidth == 0, just use min value, otherwise read the compressed value here
      for (int dataset = 0; dataset < ndatasets; dataset++) {
        long value = dataMin;

        if (dataWidth > 0) {
          long cv = reader.bits2UInt(dataWidth);
          if ( BufrNumbers.isMissing(cv, dataWidth))
            value = BufrNumbers.missingValue(useBitWidth); // set to missing value
          else // add to minimum
            value += cv;
        }

        // workaround for malformed messages
        if (dataWidth > useBitWidth) {
          long missingVal = BufrNumbers.missingValue(useBitWidth);
          if ((value & missingVal) != value) // overflow
            value = missingVal;     // replace with missing value
        }

        if (req.wantRow(dataset)) {
          if (isDpiField) {
            DataDescriptor dpiDD = req.dpiTracker.getDpiDD(req.outerRow);

            StructureMembers sms = dataDpi.getStructureMembers();
            StructureMembers.Member m0 = sms.getMember(0);
            IndexIterator iter2 = (IndexIterator) m0.getDataObject();
            iter2.setObjectNext(dpiDD.getName());

            StructureMembers.Member m1 = sms.getMember(1);
            iter2 = (IndexIterator) m1.getDataObject();
            iter2.setFloatNext(dpiDD.convert( value));
          } else {
            iter.setLongNext(value);
          }
        }
        // since dpi must be the same for all datasets, just keep the first one
        if (isDpi && (dataset == 0))
          req.dpiTracker.setDpiValue(req.outerRow, value); // keep track of dpi values in the tracker - perhaps not expose

        if ((out != null) && (dataWidth > 0)) out.f.format(" %d (%f)", value, dkey.convert(value));
      }
      if (out != null) out.f.format("%n");
    }

    return bitOffset;
  }

  // read in the data into an ArrayStructureMA, holding an ArrayObject() of ArraySequence
  private int makeArraySequenceCompressed(DebugOut out, BitReader reader, BitCounterCompressed bitCounterNested, DataDescriptor seqdd,
                         int bitOffset, int ndatasets, int count, Request req) throws IOException {

    // construct ArrayStructureMA and associated map
    ArrayStructureMA ama = null;
    StructureMembers members = null;
    HashMap<DataDescriptor, StructureMembers.Member> nmap = null;
    if (req.map != null) {
      Sequence seq = (Sequence) seqdd.refersTo;
      int[] shape = new int[]{ndatasets, count};  // seems unlikely this can handle recursion
      ama = ArrayStructureMA.factoryMA(seq, shape);
      setIterators(ama);

      members = ama.getStructureMembers();
      nmap = new HashMap<DataDescriptor, StructureMembers.Member>(2*members.getMembers().size());
      associateMessage2Members(members, seqdd, nmap);
    }
    Request nreq = new Request(ama, nmap, req.r);

    // iterate over the number of replications, reading ndataset compressed values at each iteration
    if (out != null) out.indent.incr();
    for (int i = 0; i < count; i++) {
      BitCounterCompressed[] nested = bitCounterNested.getNestedCounters(i);
      nreq.outerRow = i;
      bitOffset = readData(out, reader, nested, seqdd, bitOffset, ndatasets, nreq);
    }
    if (out != null) out.indent.decr();

    // add ArraySequence to the ArrayObject in the outer structure
    if (req.map != null) {
      StructureMembers.Member m = req.map.get(seqdd);
      if (m == null)
        System.out.printf("HEY missing seq %s%n", seqdd);
      ArrayObject arrObj = (ArrayObject) m.getDataArray();

      // we need to break ama into separate sequences, one for each dataset
      int start = 0;
      for (int i = 0; i < ndatasets; i++) {
        ArraySequence arrSeq = new ArraySequence(members, new SequenceIterator(start, count, ama), count);
        arrObj.setObject(i, arrSeq);
        start += count;
      }
    }

    return bitOffset;
  }

  private class DpiTracker {
    DataDescriptorTreeConstructor.DataPresentIndicator dpi;
    boolean[] isPresent;
    List<DataDescriptor> dpiDD = null;

    DpiTracker(DataDescriptorTreeConstructor.DataPresentIndicator dpi, int nPresentFlags) {
      this.dpi = dpi;
      isPresent = new boolean[nPresentFlags];
    }

    void setDpiValue(int fldidx, long value) {
      isPresent[fldidx] = (value == 0); // present if the value is zero
    }

    DataDescriptor getDpiDD(int fldPresentIndex) {
      if (dpiDD == null) {
        dpiDD = new ArrayList<DataDescriptor>();
        for (int i=0; i



