
BINEX: Binary Exchange Format
Binary Exchange format for GPS/GLONASS/SBAS
Data/Metadata/Ephemerides/Orbits/Solutions
Index:
What is BINEX?
Current BINEX design:
Using BINEX with teqc
BINEX software (including a C/Fortran/Perl callable library)
Compression of BINEX files
BINEX email forum: Subscribe, unsubscribe, search email archive
Related proposals
Contact: UNAVCO data guru
Last modified: 21 July 2004
What is BINEX?
BINEX, for "BINary EXchange",
is a now operational binary format standard for GPS/GLONASS/SBAS research purposes.
It has been designed to grow and allow encapsulation of all (or most) of the
information currently allowed for in RINEX OBS, GPS RINEX NAV, GLONASS RINEX NAV,
RINEX MET, IONEX, SP3, SINEX, and so on, plus other GNSS-related
data and metadata as encountered, including next-generation GNSS.
(One notable exception has been to drop the requirement of being backwards
compatible with RINEX files with NNSS Transit data, if any such files exist.)
BINEX is being used or considered by UNAVCO, UCAR/GST, UCAR/COSMIC, Ashtech, and Trimble in support of
- new data stream standard from certain receivers
- EarthScope PBO GPS raw data
- UCAR/GST Suominet raw data
- certain COSMIC data products
At this time, BINEX should be viewed as an ongoing design, with a possible long-term
goal of a replacement for existing exchange formats (e.g. RINEX, SINEX, IONEX, SP3, etc.).
Some of the design goals of BINEX:
- any two BINEX files must be able to be concatenated -- e.g. using the UNIX cat command
or DOS copy /b command -- to form a new, validly formatted BINEX file
- each BINEX file would be composed of one or more BINEX records
- most (or perhaps all) BINEX records should allow for a wide-range
of possible subrecords
- the data in each BINEX record could be written in either the big- or
little-endian byte order, so that a BINEX file could be optimized for
big- or little-endian processors
- a BINEX parser/reader should be able to parse/read a mixed-endian BINEX file,
whether running on either a big- or little-endian processor
- each BINEX record would have a 1-16 byte checksum or CRC (the number of bytes depends on
the record length = sum of bytes in ID, record message length, and the record message itself)
- if desired, individual BINEX records could allow for parsing a BINEX
file back-to-front (i.e. essentially allowing the file to be read backwards)
- all time tags in BINEX records have to be valid at least from 1980 to 3000 A.D.
- BINEX should be designed to allow encapsulation of all information
currently in RINEX, SINEX, SP3, IONEX, etc. ASCII-formatted files currently used
by the GPS/GLONASS/SBAS community in different BINEX records and subrecords
(with the exception of NNSS Transit data)
- BINEX has to be very extensible
Current BINEX design
There are really two distinct phases to defining BINEX:
- defining the generalized record structure
- defining specific records
The first task had to be done very correctly, because once the generalized
record structure was defined and started in use, there could be no
returning to correct an earlier oversight. The generalized record structure
is fairly simple, yet flexible. Since there are no "version" numbers in
BINEX records, a BINEX parser should be able to parse any
BINEX file or data stream, i.e. the parser should be able to:
- identify discrete records
- tell whether each record is secure (i.e. CRC of record matches the record)
The parser should identify valid records, but not necessarily be able to "translate"
specific records.
Conventions used in this documentation:
Please consult the Web page on BINEX conventions
used in this documentation before proceeding, or, if you get confused, come back to
this first.
Generalized Record Structure:
There are two sets of possible records depending on the level of CRC (cyclic redundancy check)
needed: one with what is called the regular CRC model and another (under design) with what is
called the enhanced CRC model.
The two designs for the regular CRC generalized record structure are:
- 1 byte: synchronization byte, also containing little/big endian bit for record
- 1-4 bytes: record ID
- 1-4 bytes: record message length in bytes
- n bytes: record message
- 1-16 bytes: checksum or CRC (of ID, length, and message bytes)
and
- 1 byte: leading synchronization byte, also containing little/big endian bit for record message
- 1-4 bytes: record ID
- 1-4 bytes: record message length in bytes
- n bytes: record message
- 1-16 bytes: checksum or CRC (of ID, length, and message bytes)
- 1-4 bytes: total number of bytes in the leading synchronization byte, record ID,
record message length, record message, and checksum or CRC, with bytes in reverse order
- 1 byte: terminating synchronization byte, also containing little/big endian bit for record message
Without worrying about specific details at this point, there is a fundamental
difference between these two designs:
- the first allows one to easily read a BINEX file in the forward direction
(consistent with the model used by nearly all native GPS/GLONASS/SBAS formats), though
reading in a backward direction is still possible;
- the second allows one to easily read a BINEX file either forwards or backwards,
but the size of each record is slightly inflated (probably by 3 bytes, in most cases).
The two designs for the enhanced CRC generalized record structure will be similar,
but allow for yet more secure and reliable data transfer and storage.
A quick aside: Why worry about being able to read a file backwards? The analogy
here is determining the start and end epochs (times) of a RINEX OBS file.
The start time, by definition, is a required RINEX OBS header field, whereas the
end time is an optional RINEX OBS header field--and either may correctly
or incorrectly represent the actual start and end times in the file.
In fact, because the header metadata is unreliable, teqc searches
the actual time tags in the file to determine the start and end times
(ignoring the header metadata for this). It turns out that one of the
nice things about RINEX is that one can (usually) read a file backwards.
So, determining the start and end epochs directly from the time tags
does not require reading an entire file forwards to the end. For some
applications, this saves considerable CPU time.
Note: What would be the typical byte-count overhead for each epoch's worth
of GPS/GLONASS/SBAS data represented in BINEX? Usually, about 6 bytes (for the first
of the above structure models), independent of the number of satellites being tracked for
that epoch. For ConanBinary (a fairly compact representation), this compares
with 5 x number of satellites being tracked, e.g. if 10 satellites are
being tracked, 50 bytes are being used in ConanBinary to define the record structure
for a single epoch's worth of data. Additionally, in BINEX the 6 bytes
includes a checksum or CRC for record security, which is totally absent in
ConanBinary.
Synchronization/Endian bytes:
BINEX will utilize both models of the generalized record structure:
the key is the synchronization bytes. In short,
a different synchronization byte is used for the two models. The upshot
is that the bulk of a BINEX file could be composed of the
"forward" readable records, but a reversible record could
be appended on to the end of the file. (If one were to try to
read a "forward" readable record backwards, one would quickly
find that this was invalid; e.g. the last byte would not be equal to
the defined terminating synchronization byte, the CRC for the record would be wrong, etc.)
The current design is that the synchronization byte of any "forward" readable
record is:
- 0xc2 = [11000010] (little endian record message, regular CRC)
- 0xe2 = [11100010] (big endian record message, regular CRC)
or
- 0xc8 = [11001000] (little endian record message, enhanced CRC)
- 0xe8 = [11101000] (big endian record message, enhanced CRC)
and the leading synchronization byte of any "reverse" readable record is:
- 0xd2 = [11010010] (little endian record message, regular CRC)
- 0xf2 = [11110010] (big endian record message, regular CRC)
or
- 0xd8 = [11011000] (little endian record message, enhanced CRC)
- 0xf8 = [11111000] (big endian record message, enhanced CRC)
and the terminating synchronization byte of any "reverse" readable record is:
- 0xb4 = [10110100] (little endian record message, regular CRC)
- 0xb0 = [10110000] (big endian record message, regular CRC)
or
- 0xe4 = [11100100] (little endian record message, enhanced CRC)
- 0xe0 = [11100000] (big endian record message, enhanced CRC)
(Note: These values were selected by using the ASCII hexadecimal values for
the letters
- 'B' and 'b' for the "forward" records, regular CRC [for "BINEX", obviously]
- 'R' and 'r' for the "reverse" records, regular CRC
- 'H' and 'h' for the "forward" records, enhanced CRC
- 'X' and 'x' for the "reverse" records, enhanced CRC
after adding 0x80 (= 128). The terminating values are obtained by reversing and negating the
bits of 0xd2, 0xf2, 0xd8, and 0xf8. More importantly, all these hexadecimal values also
turn out not to be in conflict with any usual leading bytes of other
common GPS native formats.)
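The derivation described in the note above can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `sync_byte` and `terminating_byte` are made up for this example and are not part of any BINEX library.

```python
def sync_byte(letter: str) -> int:
    """Leading synchronization byte: ASCII code of the letter plus 0x80."""
    return ord(letter) + 0x80


def terminating_byte(leading: int) -> int:
    """Terminating byte: reverse the 8 bits of the leading byte, then negate them."""
    reversed_bits = int(f"{leading:08b}"[::-1], 2)
    return ~reversed_bits & 0xFF


# Leading bytes derived from the letters listed in the note.
for letter in "BbRrHhXx":
    print(letter, hex(sync_byte(letter)))

# Terminating bytes derived from the "reverse" leading bytes.
for leading in (0xD2, 0xF2, 0xD8, 0xF8):
    print(hex(leading), "->", hex(terminating_byte(leading)))
```

Running it reproduces the values listed above, e.g. 'B' gives 0xc2 and 0xd2 maps to 0xb4.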
Record ID Bytes:
One of the driving philosophies for BINEX is extensibility. Consequently,
the design isn't pigeon-holed in the number of possible BINEX record types that
are allowable. With the unsigned
BINEX integer (ubnxi) method of using one to four bytes for the record ID,
a total of 2^29 ~= 500 million different record types are possible, though usually
only one or two bytes will be necessary. Additionally, most of these records
will probably have a subrecord ID (usually only one byte will be necessary), which
essentially allows for almost 3e17 possible record-subrecord combinations--far more
than enough for the foreseeable future!
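As a rough illustration of how a 1-4 byte ubnxi could hold 2^29 values, the sketch below assumes a layout that is not spelled out on this page: the top bit of each of the first three bytes is a "more bytes follow" flag, those bytes carry 7 value bits each, and a fourth byte (if present) carries 8 value bits, giving 7 + 7 + 7 + 8 = 29 bits. It also assumes big-endian ordering (most significant bits first). The authoritative definition, including the little-endian case, is on the BINEX conventions page; the function names here are hypothetical.

```python
from typing import Tuple


def encode_ubnxi(value: int) -> bytes:
    """Encode value as a big-endian ubnxi (assumed layout, see lead-in)."""
    if not 0 <= value < 2**29:
        raise ValueError("ubnxi holds 0 .. 2**29 - 1")
    if value < 2**7:
        return bytes([value])
    if value < 2**14:
        return bytes([0x80 | (value >> 7), value & 0x7F])
    if value < 2**21:
        return bytes([0x80 | (value >> 14),
                      0x80 | ((value >> 7) & 0x7F),
                      value & 0x7F])
    # Four-byte form: three flagged 7-bit bytes, then a full 8-bit byte.
    return bytes([0x80 | (value >> 22),
                  0x80 | ((value >> 15) & 0x7F),
                  0x80 | ((value >> 8) & 0x7F),
                  value & 0xFF])


def decode_ubnxi(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, bytes_consumed) for a big-endian ubnxi at buf[pos]."""
    value = 0
    for i in range(3):
        byte = buf[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:          # flag clear: this was the last byte
            return value, i + 1
    value = (value << 8) | buf[pos + 3]
    return value, 4
```

Under these assumptions, record IDs 0-127 occupy one byte and IDs 128-16383 occupy two, which matches the ranges quoted in this section.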
The one-byte record IDs (0-127) are reserved for public domain standardized
records, i.e. for the type of information currently in RINEX, IONEX, SP3, SINEX,
etc. files.
The multi-byte record IDs (128-536870911) have initially been assigned for private use
in blocks of 4 record IDs on a request basis, allowing over 4000 such requests before
the two-byte record IDs are used up. The idea here is that if JPL, say, wants
a block of BINEX record IDs for internal use, they might estimate that they need
20 or so records. Five blocks of 4 record IDs would then be assigned to them.
Any BINEX parser should be able to parse a file containing these JPL BINEX
records once the records were in use; the records would just be recognized as private
and skipped. At a later date, the organization that requested a private block
of IDs might develop one or more of their private records to the point where
they felt the general community could make use of them. Then the specified
records would move into the public arena with the necessary documentation so
that a BINEX reader could actually translate those records.
At this time:
- record IDs 0x80 - 0x87 have been assigned to COSMIC/UCAR
(Contact Doug Hunt <dhunt ucar.edu> for more information.)
- record IDs 0x88 - 0xa7 have been assigned to Ashtech Precision Products
(Contact Xinhua Qin <xqin thalesnavigation.com> for more information.)
- record IDs 0xa8 - 0xaf have been assigned to Topcon Positioning Systems
(Contact Dmitry Kolosov <d.kolosov topconps.com> for more information.)
- record IDs 0xb0 - 0xb3 have been assigned to GPS Solutions, Inc.
(Contact Jim Johnson <jjohnson gps-solutions.com> for more information.)
- record IDs 0xb4 - 0xb7 have been assigned to NRCan for IGS Real-Time Working Group GNSS development
(Contact Ken MacLeod <KMacleod NRCan.gc.ca> for more information.)
Record Message Length Bytes:
The record message length is also represented using the
unsigned BINEX integer (ubnxi)
method of using one to four bytes.
Thus, individual records of up to 0.5 Gbytes would be possible, requiring 4 bytes
to store the length, though most records will probably only require one or two
bytes for the record message length. The value for the record message length specifies the number
of bytes in the record message which immediately follow the bytes for the record message length.
Record Message:
Immediately following the record message length byte(s) is the record message. The
format of each record message depends on which type of record it is (identified,
of course, by the record ID).
Record Checksum/CRC:
Each record contains a record checksum that is generated from all of
the bytes in the record ID, the record message length, and the
record message itself. The decision of which type of checksum to
use (and hence, the number of bytes in the checksum) is based on the
total number of bytes covered by the checksum, and on whether the
record format uses regular CRC or enhanced CRC.
The design for this sum is:
- for regular CRC records:
  - 0-127 bytes; use 1-byte checksum: 8-bit XOR of all bytes
  - 128-4095 bytes; use 2-byte CRC
    (generating polynomial = x^16 + x^12 + x^5 + x^0)
  - 4096-1048575 bytes; use 4-byte CRC
    (generating polynomial = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0)
  - >= 1048576 bytes; use 16-byte CRC: 128-bit MD5 checksum
- for enhanced CRC records:
  - 0-127 bytes; use 2-byte CRC
    (generating polynomial = x^16 + x^12 + x^5 + x^0)
  - 128-1048575 bytes; use 4-byte CRC
    (generating polynomial = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x^1 + x^0)
  - >= 1048576 bytes; use 16-byte CRC: 128-bit MD5 checksum
C/C++, Perl, and Fortran code for the 1-, 2-, and 4-byte checksum/CRC cases
has already been written and tested.
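The size rules above can be sketched as a small lookup, together with the 1-byte XOR checksum used for the shortest regular-CRC records. This is not the tested C/C++, Perl, or Fortran code mentioned above; the function names are hypothetical, and the 2- and 4-byte CRCs are deliberately left out because their initial values and bit ordering are not specified on this page.

```python
from functools import reduce


def checksum_size(covered_bytes: int, enhanced: bool = False) -> int:
    """Bytes of checksum/CRC for a record whose ID, length, and message
    bytes total covered_bytes, following the thresholds listed above."""
    if covered_bytes < 128:
        return 2 if enhanced else 1
    if covered_bytes < 4096:
        return 4 if enhanced else 2
    if covered_bytes < 1048576:
        return 4
    return 16  # 128-bit MD5


def xor_checksum(covered: bytes) -> int:
    """8-bit XOR of all covered bytes (regular CRC, fewer than 128 bytes)."""
    return reduce(lambda acc, byte: acc ^ byte, covered, 0)
```

For example, a regular-CRC record with a one-byte ID, a two-byte length, and a 200-byte message covers 203 bytes, so `checksum_size(203)` selects the 2-byte CRC.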
Reverse Record Length Bytes:
For reversible record structure records, i.e. those whose leading synchronization byte is
0xd2 or 0xf2 (or 0xd8 or 0xf8 for enhanced CRC), the record length bytes are repeated in reverse order from
what they are at the first part of the record. Note that only the byte order
is reversed; the bit order within each byte remains the same.
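A tiny sketch of what "byte order reversed, bit order unchanged" means in practice. The two-byte value used here is an arbitrary example, not taken from a real record, and the helper name is hypothetical.

```python
def reverse_length_bytes(length_bytes: bytes) -> bytes:
    """Reverse the order of the bytes; each byte's bits are left untouched."""
    return length_bytes[::-1]


forward = bytes([0x81, 0x00])            # length bytes as stored near the start
trailer = reverse_length_bytes(forward)  # as stored near the end of the record

# A reader moving backwards through the file meets the trailer bytes from
# last to first, i.e. in the same order a forward reader met the originals.
seen_backwards = bytes(reversed(trailer))
```

The practical consequence is that a backward reader can reuse the same length-decoding logic as a forward reader.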
Time Stamps:
Depending on the resolution needed, a variety of time stamps can be used
within various BINEX records, subrecords, and/or fields. All
measure time from 6.0 Jan 1980, coincident with the start of GPS time. All "minutes"
are understood to be standard 60-second minutes. By expressing the basic time
in minutes as an uint4 (unsigned 4-byte integer), the time stamps have a range of slightly
over 8166 years (so starting in 1980, are usable until 10146 A.D.!). Depending on
the time resolution required for a particular use, the seconds are either ignored
or represented in a variety of ways:
- 4-byte minutes (uint4) = minutes since 6.0 Jan 1980; dividing by 10080 and
ignoring the fractional part gives the corresponding GPS week; used alone or with one of:
  - 1-byte second (uint1) = 0.25-s (second) resolution; 1 = 0.25 second
  - 2-byte second (uint2) = ms (millisecond) resolution; 1 = 0.001 second
  - 4-byte second (uint4) = 20-ns (nanosecond) resolution; 1 = 20 ns
  - 5-byte second = 0.1-ns resolution; 6 bits of first byte are used to store
    the integer second value, the other 34 bits are used to store the fractional
    seconds as 0.1 nanoseconds (1 = 0.1 ns)
  - 6-byte second = 0.25-ps (picosecond) resolution; 6 bits of first byte are used to store
    the integer second value, the other 42 bits are used to store the fractional
    seconds as 0.25 picoseconds (1 = 0.25 ps)
  - 8-byte second (real8) = seconds stored as 8-byte floating-point number (i.e. real8)
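The minutes-based time stamp can be illustrated with a short sketch that computes the GPS week and a calendar time from the uint4 minutes plus the 2-byte millisecond field. The function names are hypothetical. The result is on the GPS time scale (no leap seconds are applied), and Python's `datetime` stops at year 9999, so it cannot represent the very top of the uint4 range.

```python
from datetime import datetime, timedelta

GPS_EPOCH = datetime(1980, 1, 6)      # 6.0 Jan 1980, start of GPS time
MINUTES_PER_WEEK = 7 * 24 * 60        # 10080


def gps_week(minutes: int) -> int:
    """GPS week: minutes since the epoch divided by 10080, fraction dropped."""
    return minutes // MINUTES_PER_WEEK


def to_gps_datetime(minutes: int, milliseconds: int = 0) -> datetime:
    """Calendar time (GPS time scale) for a minutes + millisecond time stamp."""
    return GPS_EPOCH + timedelta(minutes=minutes, milliseconds=milliseconds)


print(gps_week(10080))                 # first minute of GPS week 1
print(to_gps_datetime(1440, 1500))     # one day and 1.5 s after the epoch
```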
Proposed Records:
The assignment of the first few public record IDs is fairly straightforward:
- record ID 0x00 = 0:
site/monument/marker/reference point/visit
metadata; think of this as a superset of the header information in RINEX OBS and/or
RINEX MET files and/or selected information in IGS logs
- record ID 0x01 = 1:
ephemeris/orbit (i.e. navigation) information; think of this as the ephemeris information in
GPS or GLONASS or SBAS RINEX NAV files, or orbit positions in SP3 files, etc.
- record ID 0x02 = 2:
GNSS time tag/channel info/constellation/observables;
think of this as corresponding to the entire epoch of information
in a RINEX OBS file, plus allowing for channel information, better specifications
as to which observable is being reported, higher resolution in the observables themselves,
etc.; constellations could be GPS, GLONASS, SBAS, or future generation GNSS
- record ID 0x03 = 3:
site ancillary data; think of this as all the possible RINEX MET data that's currently
defined, plus any other pertinent meteorological or geophysical data that could be measured at the site
- record ID 0x7d = 125:
prototyping for receiver internal state
- record ID 0x7e = 126:
prototyping for ancillary site data
(see record ID 0x03 = 3 above); this record allows for a prototyping test bed;
various subrecords will define and allow testing of specific prototype algorithms
- record ID 0x7f = 127:
prototyping for GNSS time tag/channel info/constellation/observables
(see record ID 0x02 = 2 above); since compact storing of the observables is the trickiest part,
this record allows for a prototyping test bed; various subrecords will define and allow testing of
specific prototype algorithms
(Only another 121 public records to go!)
Current Outline for Various Records:
- Record 0x00 = 0 for site/monument/marker/reference point/setup metadata
- Record 0x01 = 1 for GNSS navigation information
- Record 0x02 = 2 for generalized GNSS data
(no subrecords proposed at this time; see Record 0x7f instead)
- Record 0x03 = 3 for generalized ancillary site data
(no subrecords proposed at this time; see Record 0x7e instead)
- Record 0x7d = 125 for receiver internal state prototyping
(initial subrecord 0x00 developed for PBO)
- Record 0x7e = 126 for ancillary site data prototyping
(initial subrecord 0x00 developed for SuomiNet)
- Record 0x7f = 127 for GNSS data prototyping
(initial subrecords 0x00, 0x01, and 0x02)
At this point, the above BINEX record outlines should be interpreted
as complete up to the point that they have been defined, though generally allowing
for additions to be made in the future.
BINEX Forward Record Parsing Algorithm
The forward parsing of all BINEX files or data streams follows the same algorithm,
and makes use of the
generalized BINEX record structure.
The first valid byte will be the synchronization and endian byte, either:
- 0xc2 = record is little-endian, "forward" readable, regular CRC
- 0xe2 = record is big-endian, "forward" readable, regular CRC
- 0xd2 = record is little-endian, "forward and backward" readable, regular CRC
- 0xf2 = record is big-endian, "forward and backward" readable, regular CRC
- 0xc8 = record is little-endian, "forward" readable, enhanced CRC
- 0xe8 = record is big-endian, "forward" readable, enhanced CRC
- 0xd8 = record is little-endian, "forward and backward" readable, enhanced CRC
- 0xf8 = record is big-endian, "forward and backward" readable, enhanced CRC
Search the file or data stream from the start forward until one of
these bytes is located. A possible synchronization/endian byte has now been found.
At this point, it is already assumed that the user knows the "endian-ness" of
the processor being used. With the reading of the synchronization/endian
byte, the user should establish the values of two boolean flags:
- BINEX_little_endian_record: set equal to TRUE if the endian-ness of this
record is little-endian; set equal to FALSE otherwise
- BINEX_bytes_reversed: set equal to TRUE if the endian-ness of your
processor and the endian-ness of this record are reversed; set equal to FALSE
otherwise
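The two flags described above can be derived from the synchronization byte and the host byte order, as in the following sketch. The Python names mirror the flag names in the text but are otherwise hypothetical; `sys.byteorder` is the standard way to query the processor's endian-ness in Python.

```python
import sys
from typing import Tuple

LITTLE_ENDIAN_SYNC = {0xC2, 0xD2, 0xC8, 0xD8}
BIG_ENDIAN_SYNC = {0xE2, 0xF2, 0xE8, 0xF8}


def endian_flags(sync: int, host_byteorder: str = sys.byteorder) -> Tuple[bool, bool]:
    """Return (little_endian_record, bytes_reversed) for a leading sync byte."""
    if sync in LITTLE_ENDIAN_SYNC:
        little_endian_record = True
    elif sync in BIG_ENDIAN_SYNC:
        little_endian_record = False
    else:
        raise ValueError(f"not a BINEX leading synchronization byte: {sync:#04x}")
    host_is_little = host_byteorder == "little"
    # Bytes must be swapped whenever record and processor disagree.
    bytes_reversed = little_endian_record != host_is_little
    return little_endian_record, bytes_reversed
```

Passing `host_byteorder` explicitly makes the function easy to exercise for both processor types regardless of the machine it runs on.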
The next 1-4 bytes will establish the record ID of the record. The
C/C++ function read_ubnxi() can be used to do this.
The next 1-4 bytes will establish the length of the message in the record.
The C/C++ function read_ubnxi() can be used to do this.
Next, the number of bytes specified by the message length is
actually read. These bytes form the body of the message, which will have to
be decoded on a per record basis.
Next, 1-16 bytes will be read which form the checksum or CRC for this particular
record. The number of bytes used in the checksum/CRC depends on the total
number of bytes needed for the record ID, length of record message, and the
record message itself. The value of the checksum/CRC should be verified
against the checksum or CRC generated by the actual bytes collected for
this record in the record ID bytes, the length bytes, and the record message bytes.
If there is not 100% agreement, integrity of this record is suspect, or
the synchronization/endian byte selected was actually another part of
a valid record. Start at the byte after the suspect synchronization/endian
byte selected for this pass, and search for the next possible synchronization/endian
byte and repeat the process until a valid checksum/CRC match occurs.
For a synchronization/endian byte value of 0xc2, 0xe2, 0xc8, or 0xe8,
the user has now successfully read in an entire forward readable only BINEX record,
and can proceed with decoding the contents of the message bytes according to
the value of the record ID.
For a synchronization/endian byte value of 0xd2, 0xf2, 0xd8, or 0xf8,
the user must complete the reading of the tail end of the current record,
since these leading synchronization/endian byte values indicate a
forward and backward readable record.
This involves reading the number of length bytes for the message again.
The value of the bytes read should be identical to the value of the bytes read earlier
in the record for the length, except the bytes are in reverse order.
Next the terminating synchronization/endian byte is read.
- If the leading synchronization/endian byte was 0xd2,
the valid terminating synchronization/endian byte will be 0xb4.
- If the leading synchronization/endian byte was 0xf2,
the valid terminating synchronization/endian byte will be 0xb0.
- If the leading synchronization/endian byte was 0xd8,
the valid terminating synchronization/endian byte will be 0xe4.
- If the leading synchronization/endian byte was 0xf8,
the valid terminating synchronization/endian byte will be 0xe0.
If the terminating length bytes and the terminating synchronization/endian byte
are what they should be, the user has now successfully read a forward and backward
readable BINEX record in the forward direction, and can proceed with decoding
the contents of the message bytes according to the value of the record ID.
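The forward parsing steps above can be sketched in a deliberately limited form. The sketch handles only big-endian, "forward" readable, regular CRC records (synchronization byte 0xe2) whose ID, length, and message total fewer than 128 bytes, so that the 1-byte XOR checksum applies; anything else is treated as a failed candidate and skipped. It also assumes the big-endian ubnxi layout with a continuation flag in the top bit of the first three bytes (see the BINEX conventions page for the authoritative definition). All names are hypothetical; this is not the teqc or library implementation.

```python
from functools import reduce
from typing import Iterator, Tuple

SYNC_FORWARD_BIG_ENDIAN = 0xE2


def decode_ubnxi(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_pos) for a big-endian ubnxi (assumed layout)."""
    value = 0
    for i in range(3):
        byte = buf[pos + i]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, pos + i + 1
    return (value << 8) | buf[pos + 3], pos + 4


def xor_checksum(covered: bytes) -> int:
    return reduce(lambda acc, byte: acc ^ byte, covered, 0)


def parse_forward(stream: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (record_id, message) for each verified record in stream."""
    pos = 0
    while pos < len(stream):
        if stream[pos] != SYNC_FORWARD_BIG_ENDIAN:
            pos += 1                      # keep searching for a sync byte
            continue
        try:
            record_id, p = decode_ubnxi(stream, pos + 1)
            length, p = decode_ubnxi(stream, p)
        except IndexError:
            pos += 1                      # header runs past the end of the data
            continue
        end = p + length                  # index of the checksum byte
        covered = stream[pos + 1:end]     # ID, length, and message bytes
        if (end < len(stream) and len(covered) < 128
                and xor_checksum(covered) == stream[end]):
            yield record_id, stream[p:end]
            pos = end + 1                 # continue after this record
        else:
            pos += 1                      # false sync: retry from the next byte
```

Note how a checksum mismatch restarts the search at the byte after the suspect synchronization byte, exactly as described above, rather than skipping the whole candidate record.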
Using BINEX with teqc
BINEX C/C++ functions are being prototyped as part of the teqc engine.
You can try using teqc with BINEX with any
teqc version after ~1 July 1999.
BINEX software
Prototype BINEX software is available. This consists
of C/C++ functions, which can be used as is or modified to create your own.
Also, Doug Hunt (dhunt ucar.edu) at COSMIC/UCAR has modified, added to,
and improved on this prototype code for the standardized reading and writing of BINEX
files, resulting in a BINEX library which can be used in C, Fortran, and Perl code.
See:
Compression of BINEX files
Standard UNIX and GNU compression programs have been tested on
BINEX files made up of mostly
0x7f-00 records (GPS observables), with a few
0x01-01 records (GPS navigation messages).
The compression programs gzip and zip, both with and without the
optimal compression using the -9 option, compressed the test files
nearly equally and seem to have the most effect, compressing files by 20-30%.
The compression program compress has little effect, compressing files by
only a few %. In all cases, the test files with big-endian records compressed
slightly better than the files with little-endian records of the same data
(and we have no idea why this happens).
BINEX email forum
If you want to be included in the BINEX email forum, please go to the
ls.unavco.org / mailman / listinfo / binex support page and subscribe.
(Remove spaces from the italicized address. We have not hyperlinked the preceding URL to help prevent spammers,
web-bots, and harvesters from learning about mail addresses at UNAVCO.)
Once subscribed, emails about BINEX questions and issues should then be sent to:
binex unavco.org
You will be notified of BINEX related issues, etc.
Related Proposals
Last modified Wednesday, 26-Jul-2006 12:02:37 MDT