Matlab Training Session 10: Loading Binary Data

Course Website:
http://www.queensu.ca/neurosci/Matlab Training Sessions.htm

Course Outline
Term 1
1. Introduction to Matlab and its Interface
2. Fundamentals (Operators)
3. Fundamentals (Flow)
4. Importing Data
5. Functions and M-Files
6. Plotting (2D and 3D)
7. Plotting (2D and 3D)
8. Statistical Tools in Matlab
Term 2
9. Term 1 review
10. Loading Binary Data
Weeks 11-14 Topics: Statistics, Creating GUIs, Exponential curve
fitting ….


Week 10 Lecture Outline
Loading Binary Data
A. Week 5 Review – Importing Text Data
B. Binary Encoding
C. Binary Data Formats
D. Exercise

A. Week 5 Review:
Importing Text Data
• Basic issue:
– How do we get data from other sources into
Matlab so that we can play with it?

• Other Issues:
– Where do we get the data?
– What types of data can we import?

• Lots of options to load files
– load for basic numeric files
– fscanf for complex formats
– textread for any text
– xlsread for Excel worksheets

load
• Command opens and imports data from a
standard ASCII file into a matlab variable
• Usage: var_name = load('filename')
• Restrictions
– Every row must have the same number of columns
– Data must be ASCII
– Numbers and delimiters only, no other characters

load
• Works for simple and unstructured data
• Powerful and easy to use but limited
• Will likely force you to simplify the data file by hand
first, which is prone to error
• More complex functions are more flexible
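
A minimal sketch of load on a file that meets the restrictions above. The temporary file name and the variable names (fname, data) are made up for the demo; the numbers are arbitrary.

```matlab
% Write a small numeric ASCII table to a temporary file (hypothetical name).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
% fprintf walks the matrix column by column, so transpose to write row by row.
fprintf(fid, '%d %d %d\n', [105 124 27; 101 102 111]');
fclose(fid);

% Every row has three numbers and nothing else, so load can import it.
data = load(fname);
disp(size(data))

delete(fname);   % remove the temporary file
```

If one row had an extra column or a text label, load would stop with an error, which is where fscanf or textread become useful.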


File Handling
• f* functions are associated with file opening,
reading, manipulating, writing, …
• Basic Functions of Interest for opening and
reading generic files in matlab
– fopen
– fclose
– fseek/ftell/frewind
– fscanf
– fgetl

fopen
• Opens a file object in matlab that points to
the file of interest
• fid = fopen('filepath')
• fid is an integer that represents the file
– Can open multiple files and matlab will assign
unique fids

fclose
• When you are done with a file, it is a good idea
to close it especially if you are opening many
files
• fclose(fid)
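
A short sketch of the open/close pattern, assuming a temporary file created just for the demo (fname and status are illustrative names). fopen returns -1 when it cannot open the file, so it is worth checking before going further.

```matlab
fname = [tempname '.txt'];      % hypothetical temporary file
fid = fopen(fname, 'w');        % open for writing
if fid == -1
    error('Could not open %s', fname);
end
fprintf(fid, 'hello\n');
status = fclose(fid);           % fclose returns 0 when the file closed cleanly
delete(fname);
```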

What is a File?
• A specific organization of data
• In matlab it is identified with a fid
• Location is specified with a pointer that can be
moved around
[Diagram: fid refers to file_name; the pointer marks the current position in the file]

Moving the Pointer
• We already know how to assign a fid (fopen)
• To find where the file is pointing:
– x = ftell(fid)

• To point somewhere else
– fseek(fid,offset,origin)
• Move pointer in file fid by offset bytes relative to origin
– Origin can be beginning ('bof'), current position ('cof') or end of file ('eof')

• To point to the beginning
– frewind(fid)
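
The pointer functions above can be tried on a small file. This is only a sketch: the file is a temporary one holding the ten characters 'abcdefghij', and the variable names are illustrative.

```matlab
fname = [tempname '.txt'];                 % hypothetical temporary file
fid = fopen(fname, 'w');
fprintf(fid, 'abcdefghij');
fclose(fid);

fid = fopen(fname, 'r');
p0 = ftell(fid);                % position right after opening
fseek(fid, 4, 'bof');           % move 4 bytes past the beginning
p1 = ftell(fid);
c = fscanf(fid, '%c', 1);       % read one character at that position
fseek(fid, -1, 'eof');          % move to 1 byte before the end
last = fscanf(fid, '%c', 1);
frewind(fid);                   % back to the beginning
p2 = ftell(fid);
fclose(fid);
delete(fname);
```

Positions are counted in bytes from zero, so an offset of 4 lands on the fifth character.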

Getting Data
• Why move the pointer around?
– Get somewhere in the file from where you want data

• fscanf(fid,format,size)
• Format
– You have to tell matlab the type of data it should be
expecting in the text file so that it can convert it
• '%d', '%f', '%c'
• Size
– You can specify how to organize the imported data
• [m,n] – import the data as m by n, n can be Inf
• Be careful: fscanf fills the output column by column, so a file
laid out in rows comes back rearranged unless you read it as
[columns, rows] and transpose the result. Matlab will not warn
you
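
A sketch of the size argument in practice, assuming a temporary text file with two rows of three integers (the file and variable names are made up). Reading with [3, Inf] and then transposing recovers the layout of the file.

```matlab
fname = [tempname '.txt'];               % hypothetical temporary file
fid = fopen(fname, 'w');
fprintf(fid, '1 2 3\n4 5 6\n');
fclose(fid);

fid = fopen(fname, 'r');
% fscanf fills column by column, so ask for 3 rows (one per file column).
A = fscanf(fid, '%d', [3, Inf]);
fclose(fid);

A = A';                                  % transpose back to the file's 2-by-3 layout
delete(fname);
```

Asking for [2, Inf] instead would also succeed without any warning, but the numbers would land in the wrong places.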

Getting Data
• fgetl returns the next line of the file as a
character array
• You may need to convert these to numbers
>> fid1 = fopen('test1.txt');
>> a_str = fgetl(fid1)
a_str = 1 2
>> a_num = str2num(a_str)
a_num = [1 2]


B. Binary Encoding
• All data files are binary encoded
• ASCII text format is generally the easiest
because it is relatively simple, easy to visualize
in a text editor, and is a common output format
BUT
• ASCII text is not the fastest or the most efficient
way of encoding data
• Not all data files are ASCII!

B. Binary Encoding
• Binary data consists of sequences of 0’s and 1’s
• 10101010101010101000010111110111101011
• Depending on the encoding used, individual
meaningful values will occur every 4, 8, 16, 32 or
64 bits


For a tutorial on converting between binary and decimal numbers
see: http://www.rwc.uc.edu/koehler/comath/11.html


B. Binary Encoding
• Binary data consists of sequences of 0's and 1's
• The same sequence can be divided into values of different widths:
– 4 bit groups: 1010 1010 1010 1010 1000 0101 1111 0111
– 8 bit groups: 10101010 10101010 10000101 11110111
– 16 bit groups: 1010101010101010 1000010111110111
• Depending on the encoding used, individual
meaningful values will occur every 4, 8, 16 or 32
bits
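
To see how the grouping changes the values, the 32 bit sequence from these slides can be split in matlab. This is only an illustration using bin2dec on a character string; real binary files are read with fread, and the variable names are made up.

```matlab
bits = '10101010101010101000010111110111';   % 32 bits from the slides

% reshape puts one group per column; transpose so each row is one group,
% which is the layout bin2dec expects.
as4  = bin2dec(reshape(bits, 4,  [])')';     % eight 4 bit values
as8  = bin2dec(reshape(bits, 8,  [])')';     % four 8 bit values
as16 = bin2dec(reshape(bits, 16, [])')';     % two 16 bit values

disp(as4); disp(as8); disp(as16);
```

The same 32 bits give eight, four or two numbers depending on the width chosen, which is why the format has to be known before reading.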

B. Binary Encoding
• Each group of bits can represent a value,
character, delimiter, command, instruction etc.
• Generally binary data is divided into 8 bit (1
byte) segments
• 00000000 = zero
• 11111111 = 255
• IT IS VERY IMPORTANT TO KNOW WHAT
FORMAT THE DATA IS IN BEFORE YOU CAN
READ IT!

ASCII ENCODING
• ASCII: American Standard Code for Information
Interchange (1968).
• In ASCII every character is coded by only seven
bits of information. The eighth bit is ignored (it can
be a zero or one).
• ASCII consists of 128 characters (codes 0 to 127) which include
uppercase, lowercase, digits, spaces and formatting
characters
• See www.asciitable.com for the full ASCII table
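
Matlab can show the ASCII code behind each character directly. A minimal sketch with arbitrary example text:

```matlab
codes = double('Matlab');      % character codes of each letter
txt   = char([72 105]);        % turn codes back into characters

disp(codes);
disp(txt);
```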

ASCII vs Simple Binary
Encoding
• ASCII requires 1 byte to be used for every
character
Data Table:
105 124 27
101 102 111
• In ASCII 1 byte is used for every character, space
and carriage return = 23 bytes
• If this was encoded in a simple 8 bit binary
representation this would only use 11 bytes (1
byte for every number and 1 for every separator)
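
The byte counts can be checked with a short sketch. It assumes each line of the table ends with a single newline character; the variable names are illustrative.

```matlab
% The data table as ASCII text: every digit, space and line ending is one byte.
txt = sprintf('105 124 27\n101 102 111\n');
n_ascii = numel(txt);

% The same six numbers as 8 bit unsigned integers: one byte per element.
vals = uint8([105 124 27 101 102 111]);
n_values = numel(vals);

fprintf('ASCII: %d bytes, uint8 values alone: %d bytes\n', n_ascii, n_values);
```

The six values alone need 6 bytes; the slide's figure of 11 also counts one byte for each of the five separators between them.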

Binary Precision
• The number of bits used to represent a value
determines how large or small that value can be

• 8 bits: 0 to 255
• 16 bits: 0 to 65535
• 32 bits: 0 to 4294967295 (about 4.2950e+009)
• Precision also determines how many decimal
places can be represented
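
With n bits there are 2^n different patterns, so the largest unsigned value is 2^n - 1. A quick check against matlab's intmax (variable names are illustrative):

```matlab
max8  = 2^8  - 1;
max16 = 2^16 - 1;
max32 = 2^32 - 1;

% intmax reports the largest value each unsigned integer type can hold.
ok = double(intmax('uint8'))  == max8  && ...
     double(intmax('uint16')) == max16 && ...
     double(intmax('uint32')) == max32;
disp(ok);
```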

C. Binary Formats:
Integers and Characters
'schar'    Signed character; 8 bits
'uchar'    Unsigned character; 8 bits
'int8'     Integer; 8 bits
'int16'    Integer; 16 bits
'int32'    Integer; 32 bits
'int64'    Integer; 64 bits
'uint8'    Unsigned integer; 8 bits
'uint16'   Unsigned integer; 16 bits
'uint32'   Unsigned integer; 32 bits
'uint64'   Unsigned integer; 64 bits

* The first bit denotes the sign if the integer or character is signed.
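
A small sketch of what the sign bit means in practice: the same 8 bit pattern read as unsigned and as signed. typecast reinterprets the bits without changing them; matlab's signed integers use two's complement.

```matlab
raw = uint8(255);                         % bit pattern 11111111
as_signed = typecast(raw, 'int8');        % same bits, read as a signed value

range_int8 = [intmin('int8') intmax('int8')];   % limits of a signed 8 bit integer

disp(as_signed);
disp(range_int8);
```

Reading a file with 'uint8' when it was written as 'int8' (or the reverse) therefore changes every value whose first bit is 1.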

Readable Binary Data Formats
Floating Point Representation
Used for numbers that require decimal representation
(real numbers)
• Established by IEEE (Institute of Electrical and
Electronics Engineers)
• Encoded in 32 bits (single precision) or 64 bits (double
precision)
• Single precision (short): 32 bits. 1 bit for the sign, 8 bits for the
exponent, and 23 bits for the mantissa.
• Double precision (long): 64 bits. 1 bit for the sign, 11 bits for the
exponent, and 52 bits for the mantissa.

Readable Binary Data Formats
Floating Point Representation
• By default matlab stores all values with double
precision
• The functions realmax and realmin return the largest
and smallest positive normalized floating-point values

'float32', 'single'    Floating-point; 32 bits
'float64', 'double'    Floating-point; 64 bits
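
A brief sketch comparing the two floating point precisions. The value 0.1 is an arbitrary example of a number that cannot be stored exactly; the variable names are illustrative.

```matlab
x = 1;
cls = class(x);                   % default numeric class

big_d = realmax;                  % largest double
big_s = realmax('single');        % largest single

y = single(0.1);                  % store 0.1 with 32 bits
err = abs(double(y) - 0.1);       % difference from the 64 bit version

fprintf('%s, single error on 0.1: %g\n', cls, err);
```

The error is small but not zero, which matters when a file saved as 'single' is compared against values computed in double.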

Specifying Machine Formats
• Different computer systems record or save binary
data using different byte orders
• In order to load binary data from a particular
system, Matlab needs to know the machine
format
• You can use the fopen function to query the machine
format that an open file is being read with
[filename, mode, machineformat] = fopen(fid)

Binary File Machine Formats
'ieee-be' or 'b':      IEEE floating point with big-endian byte ordering
'ieee-le' or 'l':      IEEE floating point with little-endian byte ordering
'ieee-be.l64' or 's':  IEEE floating point with big-endian byte ordering
                       and 64-bit long data type
'ieee-le.l64' or 'a':  IEEE floating point with little-endian byte ordering
                       and 64-bit long data type
'native' or 'n':       Numeric format of the machine on which MATLAB
                       is running (the default)
'vaxd' or 'd':         VAX D floating point and VAX ordering
'vaxg' or 'g':         VAX G floating point and VAX ordering
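
A sketch of why the machine format matters: the same two bytes read with little-endian and big-endian ordering. The temporary file and the value 1 are arbitrary choices for the demo.

```matlab
fname = [tempname '.bin'];                        % hypothetical temporary file
fid = fopen(fname, 'w');
fwrite(fid, 1, 'uint16', 0, 'ieee-le');           % stored on disk as bytes 01 00
fclose(fid);

fid = fopen(fname, 'r');
right = fread(fid, 1, 'uint16', 0, 'ieee-le');    % same byte order as written
frewind(fid);
wrong = fread(fid, 1, 'uint16', 0, 'ieee-be');    % bytes taken in reverse order
fclose(fid);
delete(fname);

fprintf('little-endian: %d, big-endian: %d\n', right, wrong);
```

No error is raised in the second read; the value is simply different, so a wrong machine format can go unnoticed until the data is plotted.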

Reading Binary Data
• The function fread() performs all binary data
reading in matlab
Syntax
A = fread(fid)
A = fread(fid, count)
A = fread(fid, count, precision)
A = fread(fid, count, precision, skip)
A = fread(fid, count, precision, skip, machineformat)
[A, count] = fread(...)

Reading Binary Data
Input Arguments:
Count:
– x: read x elements
– Inf: read to end of file
– [m,n]: read enough to fill an m by n matrix
Precision:
– Specify input data format e.g. 'int8', 'int16', 'short',
'long'… see previous slides
Skip:
– Skip the specified number of bytes after each value
read with the Precision argument (the skip is counted in
bits only for bit precisions such as 'ubit4')
MachineFormat:
– Specify machine format 'ieee-be', 'ieee-le'…
See previous slides
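
A sketch of the count, precision and skip arguments, assuming a temporary file of six single precision values written with fwrite (the values and names are made up for the demo). With 'single' precision the skip is given in bytes.

```matlab
fname = [tempname '.bin'];                       % hypothetical temporary file
fid = fopen(fname, 'w');
fwrite(fid, [1.5 2.5 3.5 4.5 5.5 6.5], 'single');
fclose(fid);

fid = fopen(fname, 'r');
% Fill a matrix with 2 rows, column by column, until the end of the file.
[A, count] = fread(fid, [2, Inf], 'single');
frewind(fid);
% Skip 4 bytes (one single) after each value: reads every other value.
B = fread(fid, Inf, 'single', 4);
fclose(fid);
delete(fname);

disp(A); disp(count); disp(B');
```

By default fread returns the values as doubles even though they were stored as singles.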

Exercise
Load and plot position data saved in: week10data.rob


This file contains binary position data saved in 32 bit floating point
precision

1. Use the fopen function to determine the machine format
hint: [fname, mode, mformat] = fopen(fid)
2. Load the data using the fread function
3. Plot the position
4. Try loading the data with an incorrect argument to see how this
changes/corrupts the data

Exercise Solution
fid = fopen('week10data.rob','r'); % open file for reading
% Determine file format
[fname, mode, mformat] = fopen(fid);
% Format is ieee-le
% Read binary data
pos_data = fread(fid, inf, 'single', 'ieee-le');
plot(pos_data); % plot position data
fclose(fid); % close file

Getting Help
Help and Documentation
Digital
1. Accessible Help from the Matlab Start Menu
2. Updated online help from the Matlab Mathworks website:
http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.html
3. Matlab command prompt function lookup
4. Built in Demos
5. Websites

Hard Copy
1. Books, Guides, Reference
The Student Edition of Matlab, pub. Mathworks Inc.