Author: Brian A. Ree
0: Tensorflow Linear Regression Loading Data: Intro
Hello, welcome to our tensorflow linear regression tutorial. In this tutorial you will learn how to load stock data from a CSV file
into local data structures for feature development and, eventually, a linear regression model in tensorflow. For this tutorial you should
have PyCharm set up on your computer. PyCharm is a python IDE that has a free community edition and is a great all-around IDE. You can download
and install the IDE from the JetBrains website. Follow the simple setup instructions
and load up our StockWorm project. Now you're ready to go.
1: Tensorflow Linear Regression Loading Data: Data Structures
So first things first: I'm not a procedural language coder, and I like my curly brackets. So I decided to end all my python blocks
of code with comments that make me feel warm and fuzzy inside like the curly brackets would have. A primary focus of tensorflow is to provide
machine learning APIs to the masses, but in order to start crunching numbers we're going to need some data. Setting up a database and populating it
with stock pricing data is a bit beyond the scope of this tutorial, but I can show you how to load CSV files into python data structures. Shall we?
Before I dive into that, here is the general process we'll be creating in this tutorial.
- Load CSV data into local data structures.
- Generate or pinpoint features we want to use tensorflow to analyze for us.
- Convert our python data structures into tensors.
- Push our tensors into a linear regression model.
- Train and evaluate the model.
So let's dive right in. First up we'll create a generic object that is capable of holding different types of data from CSV files.
I don't want to have to recode a lot to introduce new data types, so we're going to focus on abstracting certain parts of this process
so that the process can remain general. Ideally I'd like to have one and only one specific class associated with the source data.
The rest of the linear regression model should be plug and play. This is an advanced tutorial, so I'm going to assume certain python coding
constructs are familiar or that you know where to look them up. First up, our DataRow class.
import sys
import os
import copy

class DataRow:
    """ A class for holding generic CSV data. """

    # Static counter used to hand out unique row ids
    NEXT_ID = 0

    def __init__(self, lVerbose=False, lIgnoreError=False):
        # Base variables
        self.colData = {}
        self.colName2Idx = {}
        self.colIdx2Name = {}
        self.id = -1
        # Internal variables
        self.error = False
        self.verbose = lVerbose
        self.ignoreError = lIgnoreError
    # edef

    def copy(self):
        r = DataRow()
        r.verbose = self.verbose
        r.id = self.id
        r.colData = copy.copy(self.colData)
        r.colName2Idx = copy.copy(self.colName2Idx)
        r.colIdx2Name = copy.copy(self.colIdx2Name)
        return r
    # edef

    def stampId(self):
        self.id = DataRow.NEXT_ID
        DataRow.NEXT_ID += 1
    # edef

    def printDataRow(self):
        for key in self.colData:
            print("%s: %s" % (key, self.colData[key]['value']))
        # efl
        print("---------------------")
    # edef

    def toRow(self):
        csvRow = ''
        for key in self.colData:
            csvRow += str(self.colData[key]['value']) + ","
        # efl
        csvRow = csvRow[:-1]
        csvRow += os.linesep
        return csvRow
    # edef

    def fromRow(self, csvRow):
        cols = csvRow.strip().split(',')
        colnum = 0
        for col in cols:
            # A bare CSV row has no header, so the column index doubles as the name
            self.setMember(str(colnum), colnum, col)
            colnum += 1
        # efl
    # edef

    def setMember(self, name, idx, val):
        try:
            self.colIdx2Name[idx] = name
            self.colName2Idx[name] = idx
            self.colData[name] = {'value': val, 'idx': idx}
            return True
        except Exception:
            if self.ignoreError == False:
                self.error = True
            # eif
            if self.verbose:
                print("Error setting member with index: ", idx, " with value: ", val)
                print("Unexpected error:", sys.exc_info()[0])
            # eif
        # etry
        return False
    # edef

    def getMemberByIdx(self, idx):
        return self.colData[self.colIdx2Name[idx]]
    # edef

    def getMemberByName(self, name):
        return self.colData[name]
    # edef
# eclass
Let's review our class data first. Everything except NEXT_ID is initialized per instance in the constructor; NEXT_ID is a static class variable shared by every row.
- colData: A dictionary for holding all the column data for this row.
- colName2Idx: A dictionary for converting a column name to the index of that column.
- colIdx2Name: A dictionary for converting an index to the name of that column.
- id: A unique id value for this row of data.
- NEXT_ID: A static value used to keep track of unique ids.
- error: An internal flag that records whether an exception occurred when setting a value in the setMember method.
- verbose: A boolean flag that indicates whether or not verbose logging is turned on.
- ignoreError: A boolean flag that indicates whether or not to ignore any error flags that come up.
Next up are our class methods.
- __init__: Default constructor for the class.
- printDataRow: A custom printing method; I didn't use the built-in python machinery.
- copy: Creates a new copy of this object with the same data.
- stampId: Used to set a unique id for this data row object; each row gets a unique incrementing id.
- toRow: A simple method for creating a CSV export of this object.
- fromRow: A simple method for importing a CSV row back into this class.
- setMember: A general method for setting a class value. The method and class members are data driven from the CSV file headers.
- getMemberByIdx: Returns the value of a member at the given index.
- getMemberByName: Returns the value of a member with the given name.
Take a look at the constructor yourself; it is very simple. We'll review the setMember method first
so that you know how our class data is stored. This will make the accessor methods more intuitive.
def setMember(self, name, idx, val):
    try:
        self.colIdx2Name[idx] = name
        self.colName2Idx[name] = idx
        self.colData[name] = {'value': val, 'idx': idx}
        return True
    except Exception:
        if self.ignoreError == False:
            self.error = True
        # eif
        if self.verbose:
            print("Error setting member with index: ", idx, " with value: ", val)
            print("Unexpected error:", sys.exc_info()[0])
        # eif
    # etry
    return False
# edef
Because we want to be able to support almost any CSV file, we use a dynamic way of storing our row information in a local class dictionary.
The setMember method is the way we set our class variables and their values. The name, the index of the column, and the value are the arguments to the
method. The colData dictionary adds a key with the value of name, and sets the value of that key to a dictionary that contains the value of the column
and the index of the column. I decided to add this layer of abstraction because now we have the ability to add more meta information about each column in our CSV file.
We could potentially add data types, default values, cleaning methods, conversion methods, etc. to the dictionary.
You'll notice that we also make entries into colName2Idx and colIdx2Name. These dictionaries make it easy for us to convert between indexes and column names.
We can also look up each column name iteratively now. The last thing we want to look at is the exception handling. If there is an issue storing the dictionary values
we can choose to set an error flag. This can then be picked up later in a cleaning or re-evaluation loop. It might not come in handy here, but if we start enforcing data type
conversions the exception handling will certainly earn its keep. Let's look at a simple example of accessing our class data.
def getMemberByIdx(self, idx):
    return self.colData[self.colIdx2Name[idx]]
# edef

def getMemberByName(self, name):
    return self.colData[name]
# edef
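Before we walk through these accessors, here's a quick usage sketch of the class; the column names and values are made up for illustration.
row = DataRow(lVerbose=True)
row.setMember('Open', 0, '225.04')    # column name, column index, value
row.setMember('Close', 1, '226.15')
row.stampId()                         # assign the next unique row id
print(row.getMemberByName('Open')['value'])   # prints: 225.04
print(row.getMemberByIdx(1)['value'])         # prints: 226.15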
The accessor methods are getMemberByIdx and getMemberByName. They let you pull information out of the class with an index or a column name.
See how flexible this is? Sure, we could have used the default tensorflow CSV file loader, but I wanted to create something dynamic that I had complete control over.
We can connect these classes to our own special database or webservice calls; furthermore, we can control our feature and statistic generation better, and we can data drive
the way our data is fed into a model and ultimately run in tensorflow. Trust me, it will be awesome. Next up, the custom print method.
def printDataRow(self):
    for key in self.colData:
        print("%s: %s" % (key, self.colData[key]['value']))
    # efl
    print("---------------------")
# edef
The print data method iterates over the colData dictionary. For each key in the dictionary,
the key and its associated value are printed out. Again, take a moment to notice how flexible this design is.
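If we call printDataRow on the hypothetical row from the usage sketch above, the output looks something like this:
Open: 225.04
Close: 226.15
---------------------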
def toRow(self):
    csvRow = ''
    for key in self.colData:
        csvRow += str(self.colData[key]['value']) + ","
    # efl
    csvRow = csvRow[:-1]
    csvRow += os.linesep
    return csvRow
# edef

def fromRow(self, csvRow):
    cols = csvRow.strip().split(',')
    colnum = 0
    for col in cols:
        # A bare CSV row has no header, so the column index doubles as the name
        self.setMember(str(colnum), colnum, col)
        colnum += 1
    # efl
# edef
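Here's a hypothetical round-trip sketch of these two helpers. Note that fromRow names the columns by their index, since a bare CSV row carries no header.
row = DataRow()
row.setMember('Open', 0, '225.04')
row.setMember('Close', 1, '226.15')
line = row.toRow()        # "225.04,226.15" plus a line separator
row2 = DataRow()
row2.fromRow(line)        # columns come back named '0' and '1'
print(row2.getMemberByName('0')['value'])   # prints: 225.04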
Last but not least are our little helper methods toRow and fromRow; as the sketch above shows, they let us create a CSV export of loaded data rows and re-import one.
So that takes care of our DataRow class. Next up we're going to take a look at a mapping class, Data2DataRow. This is another layer in our
abstraction/generalization technique. The mapping between CSV columns and our DataRow class data is defined there.
2: Tensorflow Linear Regression Loading Data: Abstraction Layers
# Use -1 to ignore loading a column
mapping = {
    "google_price":
    {
        "Date": "0",
        "Open": "1",
        "Close": "2",
        "High": "3",
        "Low": "4",
        "Volume": "5",
        "Symbol": "6"
    },
    "weight_age":
    {
        "Weight": "0",
        "Age": "1",
        "BloodFat": "2"
    },
}
The Data2DataRow class holds a dictionary with mappings that are associated with types of CSV files.
Each CSV column name and its target column index are listed here and used to load data into instances of the DataRow class.
Again, notice how flexible the design is: we can change which columns are loaded, or skip one entirely with a -1, just by editing the index.
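Pulling a mapping out of the dictionary is a plain lookup, for example:
# Grab the column mapping for Google price style CSV files
dataMap = Data2DataRow.mapping['google_price']
print(dataMap['Close'])   # prints: 2 -- the DataRow column index for Close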
That brings us to our next topic: the execution dictionary we use to drive the entire process.
The dictionary has keys for classes you haven't seen yet, but we're going to review the ones we will use here; it'll help you understand
the CSV loading step and the way the software is configured.
"goog_lin_reg_avg100day":
{
'type': 'csv',
'data_2_datarow_type': 'google_price',
'datarow_2_tensor_type': 'goog_lin_reg_avg100day',
'version': '1.0',
'reset': False,
'checkpoint': False,
'limitLoad': False,
'cleanData': True,
'verbose': False,
'rowLimit': 25,
'validatePrct': 0.30,
'trainPrct': 0.70,
'randomSeed': False,
'trainStepsMultiplier': 5,
'learning_rate': 0.000000001,
'log_reg_positive_result': 0.00,
'lin_reg_positive_result': 0.50,
'model_type': 'linear_regression',
'loader': 'load_csv_data',
'files': {
'file1': {'name': dataDir + "/spy.csv.xls", 'appendCols': [{'Symbol': 'spy', 'idx': '6'}]},
'file2': {'name': dataDir + "/voo.csv.xls", 'appendCols': [{'Symbol': 'voo', 'idx': '6'}]},
'file3': {'name': dataDir + "/ivv.csv.xls", 'appendCols': [{'Symbol': 'ivv', 'idx': '6'}]},
}
},
The execution dictionary is the main definition of how we're going to execute our linear regression model. Read over the following list of entries and what they are for
carefully. You'll see how these values drive the next few classes we look at.
- type: The type of data we're going to be loading. This value is checked in the LoadCsvData class along with a version number, which helps us support different
versions of CSV file formats from the same source.
- data_2_datarow_type: This represents the Data2DataRow entry that is used to load data from the CSV file into our DataRow class.
- datarow_2_tensor_type: The mapping entry used when converting DataRow objects into tensors; it comes into play in a later tutorial.
- version: The version of the type of data we're loading. Corresponds with the type entry listed above.
- reset: A flag indicating if the LoadCsvData class's data storage should be reset and cleared.
- checkpoint: A flag that turns on checkpoint control during the training process.
- limitLoad: A flag that limits the amount of data loaded to the number of rows specified in the rowLimit entry.
- cleanData: A flag that turns on data cleaning; if the error flag is set on a data row, that row is removed.
- verbose: A flag that turns on optional debug logging.
- rowLimit: The maximum number of rows to load when the limitLoad flag is set to true.
- validatePrct: The percentage of the data set to use for the validation process.
- trainPrct: The percentage of the data set to use for the training process.
- randomSeed: A flag that toggles the use of a random seed.
- trainStepsMultiplier: The number of training steps is set to the number of rows found in the training data set; this field multiplies that amount by the number stored here.
- learning_rate: The rate at which the model is trained.
- log_reg_positive_result: The percentage value that indicates a logistic regression model positive result.
- lin_reg_positive_result: The percentage value that indicates a linear regression model positive result.
- model_type: The type of model to run; linear_regression in our case.
- loader: The loader to use when loading files for this model. This lets us specify custom loaders if our default CSV loader doesn't do the trick.
- files: The files to load to provide data for this model.
It should all appear pretty straightforward once you look at the description of each entry. In short, we're defining all the variables and information we need
to run a linear regression model on a set of data. This is part of our abstraction design: separating the data from its use by adding manipulation layers that are designed
to be general in nature. In this way we're designing a very flexible tensorflow linear regression model engine. One thing that may have stood out to you is the appendCols
field in the file entries. This little gem allows us to add columns and data on the fly to each row of an imported file. In this case we're using it to add a column, Symbol, to
the imported stock data. The idx entry in this little object sets the index value for that column; this value should line up with the Data2DataRow entry and
any other appended columns.
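Here's a tiny sketch of what a single appendCols entry effectively does during loading; the entry below is the spy one from the configuration above.
entry = {'Symbol': 'spy', 'idx': '6'}
dRow = DataRow.DataRow()
for key in entry:
    if key == 'idx':
        continue    # idx only tells us which column index to use
    # eif
    dRow.setMember(key, int(entry['idx']), entry[key])
# efl
# Every row loaded from spy.csv.xls now carries a Symbol column at index 6
That brings us to our data loading process.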
3: Tensorflow Linear Regression Loading Data: Loading CSV Data
import csv
import codecs
import sys
import DataRow

class LoadCsvData:
    """ A class for loading csv data into a data row. """

    def __init__(self, lRows=None, lLimitLoad=False, lRowLimit=-1, lCleanData=False, lVerbose=False):
        # Use a None sentinel so instances never share a mutable default list
        self.rows = lRows if lRows is not None else []
        self.rowCount = 0
        self.limitLoad = lLimitLoad
        self.rowLimit = lRowLimit
        self.cleanData = lCleanData
        self.cleanCount = 0
        self.verbose = lVerbose
    # edef

    def loadData(self, csvFile='', type='csv', version='1.0', reset=False, dataMap={}, appendCols=[]):
        print("")
        print("")
        print("Loading Data: " + csvFile + " Type: " + type + " Version: " + version + " Reset: " + str(reset))
        if self.verbose:
            print("Found data mapping:")
            for i in dataMap:
                print(i, dataMap[i])
            # efl
        # eif
        if self.verbose:
            print("Found append cols mapping:")
            for i in appendCols:
                print(i)
            # efl
        # eif
        if reset == True:
            print("Resetting rows:")
            self.resetRows()
        # eif
        if type == 'csv' and version == '1.0' and csvFile != '':
            ifile = codecs.open(csvFile, 'r', encoding="utf-8-sig")
            reader = csv.reader(ifile)
            lrows = []
            rownum = 0
            header = []
            for row in reader:
                if rownum == 0:
                    header = row
                    rownum += 1
                else:
                    colnum = 0
                    dRow = DataRow.DataRow()
                    dRow.verbose = self.verbose
                    # Append static values outside of the csv like stock symbol etc
                    for entry in appendCols:
                        for key in entry:
                            if key == 'idx':
                                continue
                            # eif
                            dRow.setMember(key, int(entry['idx']), entry[key])
                        # efl
                    # efl
                    if self.verbose:
                        print('')
                    # eif
                    for col in row:
                        if self.verbose:
                            print(' %-8s: %s' % (header[colnum], col))
                        # eif
                        colName = header[colnum]
                        if len(dataMap) > 0:
                            try:
                                memberIdx = dataMap[colName]
                                if int(memberIdx) != -1:
                                    dRow.setMember(colName, int(memberIdx), col)
                                # eif
                            except Exception:
                                if self.verbose:
                                    print("Error setting member with index: ", colnum, " with value: ", col)
                                    print("Unexpected error:", sys.exc_info()[0])
                                # eif
                            # etry
                        else:
                            dRow.setMember(colName, colnum, col)
                        # eif
                        colnum += 1
                    # efl
                    if self.verbose:
                        if self.limitLoad:
                            dRow.printDataRow()
                        # eif
                    # eif
                    dRow.stampId()
                    lrows.append(dRow.copy())
                    rownum += 1
                    self.rowCount += 1
                # eif
                if self.limitLoad == True and self.rowCount >= self.rowLimit and self.rowLimit > 0:
                    break
                # eif
            # efl
            ifile.close()
            print("Loaded %i rows from this data file." % (len(lrows)))
            lrows = self.sortRows(lrows)
            self.cleanRows(lrows)
            self.rows.extend(lrows)
            print('CleanCount: %i RowCount: %i RowsFound: %i' % (self.cleanCount, self.rowCount, len(self.rows)))
        # eif
    # edef

    def resetRows(self):
        self.rows = []
        self.rowCount = 0
        self.cleanCount = 0
    # edef

    def sortRows(self, lrows):
        # Sort by each row's stamped id, not python's id() builtin
        return sorted(lrows, key=lambda r: r.id)
    # edef

    def cleanRows(self, lrows):
        if self.cleanData == True:
            print("Cleaning row data...")
            should_restart = True
            while should_restart:
                should_restart = False
                for row in lrows:
                    if row.error == True:
                        lrows.remove(row)
                        self.rowCount -= 1
                        self.cleanCount += 1
                        # Restart the scan since we mutated the list mid-iteration
                        should_restart = True
                        break
                    # eif
                # efl
            # fwl
        # eif
    # edef
# eclass
Let's take a look at our class variables first; they are all initialized in the constructor.
- rows: The row storage data structure; this object contains a list of DataRow objects.
- rowCount: The number of rows loaded into the row data structure.
- limitLoad: A flag that limits the number of rows loaded to the rowLimit.
- rowLimit: The number of rows to load when the limitLoad flag is toggled.
- cleanData: A flag that toggles the clean data check.
- cleanCount: The number of rows that were cleaned from this data import.
- verbose: A flag that toggles verbose debug logging.
Taking a quick look at the constructor __init__, you can see that it takes the arguments lRows, lLimitLoad, lRowLimit, lCleanData, and lVerbose.
Thinking about our execution dictionary? Well you should be! Just kidding. You can probably guess that the values in our execution dictionary
get passed into the constructor. It's pretty straightforward, so we'll move on to our support methods.
def resetRows(self):
    self.rows = []
    self.rowCount = 0
    self.cleanCount = 0
# edef

def sortRows(self, lrows):
    # Sort by each row's stamped id, not python's id() builtin
    return sorted(lrows, key=lambda r: r.id)
# edef

def cleanRows(self, lrows):
    if self.cleanData == True:
        print("Cleaning row data...")
        should_restart = True
        while should_restart:
            should_restart = False
            for row in lrows:
                if row.error == True:
                    lrows.remove(row)
                    self.rowCount -= 1
                    self.cleanCount += 1
                    # Restart the scan since we mutated the list mid-iteration
                    should_restart = True
                    break
                # eif
            # efl
        # fwl
    # eif
# edef
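Here's a quick hypothetical sketch of the cleaning pass these methods implement, assuming module-style imports of DataRow and LoadCsvData; one row gets flagged and swept out.
loader = LoadCsvData.LoadCsvData(lCleanData=True)
good = DataRow.DataRow()
bad = DataRow.DataRow()
bad.error = True            # pretend setMember failed on this row
lrows = [good, bad]
loader.cleanRows(lrows)
print(len(lrows))           # prints: 1 -- the flagged row was removed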
You can quickly see that the resetRows method clears all the data storage and data count values, and the cleanRows method loops over the loaded data
and removes any rows that have the error flag set. Short and sweet. Let's see how the data from our CSV file actually gets loaded.
def loadData(self, csvFile='', type='csv', version='1.0', reset=False, dataMap={}, appendCols=[]):
    print("")
    print("")
    print("Loading Data: " + csvFile + " Type: " + type + " Version: " + version + " Reset: " + str(reset))
    if self.verbose:
        print("Found data mapping:")
        for i in dataMap:
            print(i, dataMap[i])
        # efl
    # eif
    if self.verbose:
        print("Found append cols mapping:")
        for i in appendCols:
            print(i)
        # efl
    # eif
    if reset == True:
        print("Resetting rows:")
        self.resetRows()
    # eif
    if type == 'csv' and version == '1.0' and csvFile != '':
        ifile = codecs.open(csvFile, 'r', encoding="utf-8-sig")
        reader = csv.reader(ifile)
        lrows = []
        rownum = 0
        header = []
        for row in reader:
            if rownum == 0:
                header = row
                rownum += 1
            else:
                colnum = 0
                dRow = DataRow.DataRow()
                dRow.verbose = self.verbose
                # Append static values outside of the csv like stock symbol etc
                for entry in appendCols:
                    for key in entry:
                        if key == 'idx':
                            continue
                        # eif
                        dRow.setMember(key, int(entry['idx']), entry[key])
                    # efl
                # efl
                if self.verbose:
                    print('')
                # eif
                for col in row:
                    if self.verbose:
                        print(' %-8s: %s' % (header[colnum], col))
                    # eif
                    colName = header[colnum]
                    if len(dataMap) > 0:
                        try:
                            memberIdx = dataMap[colName]
                            if int(memberIdx) != -1:
                                dRow.setMember(colName, int(memberIdx), col)
                            # eif
                        except Exception:
                            if self.verbose:
                                print("Error setting member with index: ", colnum, " with value: ", col)
                                print("Unexpected error:", sys.exc_info()[0])
                            # eif
                        # etry
                    else:
                        dRow.setMember(colName, colnum, col)
                    # eif
                    colnum += 1
                # efl
                if self.verbose:
                    if self.limitLoad:
                        dRow.printDataRow()
                    # eif
                # eif
                dRow.stampId()
                lrows.append(dRow.copy())
                rownum += 1
                self.rowCount += 1
            # eif
            if self.limitLoad == True and self.rowCount >= self.rowLimit and self.rowLimit > 0:
                break
            # eif
        # efl
        ifile.close()
        print("Loaded %i rows from this data file." % (len(lrows)))
        lrows = self.sortRows(lrows)
        self.cleanRows(lrows)
        self.rows.extend(lrows)
        print('CleanCount: %i RowCount: %i RowsFound: %i' % (self.cleanCount, self.rowCount, len(self.rows)))
    # eif
# edef
Let's knock out the first few blocks of code and get into the real stuff.
print ("")
print ("")
print("Loading Data: " + csvFile + " Type: " + type + " Version: " + version + " Reset: " + str(reset))
if self.verbose:
print "Found data mapping:"
for i in dataMap:
print(i, dataMap[i])
#efl
#eif
if self.verbose:
print "Found append cols mapping:"
for i in appendCols:
print(i)
#efl
#eif
if reset == True:
print "Resetting rows:"
self.resetRows()
#eif
First off we do a little logging on the configuration of our file load.
This is really useful when you're running models because you can spot a data issue right away. If the verbose flag is set to true,
we print out the keys and values of our data mapping, and we also print out the entries found in the appendCols list. Last but not least, if the
reset flag is set then we reset all our row counts and row storage lists before we load new data. The start of the file loading loop is up next.
if type == 'csv' and version == '1.0' and csvFile != '':
    ifile = codecs.open(csvFile, 'r', encoding="utf-8-sig")
    reader = csv.reader(ifile)
    lrows = []
    rownum = 0
    header = []
    for row in reader:
        if rownum == 0:
            header = row
            rownum += 1
        else:
We check that this is the correct type and version; this allows us to add new types and versions in the future if we need to deal with different file formats or CSV layouts.
A file handle is opened up and then a CSV reader is created, which takes care of the complexities of reading CSV files for us. We reset our local data storage list and row count variable.
We are expecting a header row, so the first row is loaded into a local header variable; the rownum counter is what tells us whether we are still on that header row.
Next we'll look into the data loading code.
else:
    colnum = 0
    dRow = DataRow.DataRow()
    dRow.verbose = self.verbose
    # Append static values outside of the csv like stock symbol etc
    for entry in appendCols:
        for key in entry:
            if key == 'idx':
                continue
            # eif
            dRow.setMember(key, int(entry['idx']), entry[key])
        # efl
    # efl
    if self.verbose:
        print('')
    # eif
    for col in row:
        if self.verbose:
            print(' %-8s: %s' % (header[colnum], col))
        # eif
        colName = header[colnum]
        if len(dataMap) > 0:
            try:
                memberIdx = dataMap[colName]
                if int(memberIdx) != -1:
                    dRow.setMember(colName, int(memberIdx), col)
                # eif
            except Exception:
                if self.verbose:
                    print("Error setting member with index: ", colnum, " with value: ", col)
                    print("Unexpected error:", sys.exc_info()[0])
                # eif
            # etry
        else:
            dRow.setMember(colName, colnum, col)
        # eif
        colnum += 1
    # efl
    if self.verbose:
        if self.limitLoad:
            dRow.printDataRow()
        # eif
    # eif
    dRow.stampId()
    lrows.append(dRow.copy())
    rownum += 1
    self.rowCount += 1
# eif
In the else clause we are dealing with data rows, not header rows, so this is where the main processing of information occurs. The first thing we do
is reset our local variables. If you've been thinking inner loop, you're right. We're going to be looping over the columns in each row and building an
instance of our DataRow class that we'll store in our local data list. Before that, we process the appendCols parameter. This is a list of
dictionary objects that each contain a column name, a column value, and an index. Its purpose is to provide global values to the rows being loaded; for instance you could
pass in the file name, the date, the symbol the data is for, etc. Take a close look and you'll see that we're using the setMember method we reviewed earlier,
and that we skip the idx key itself since it only tells us which column index to use.
The process of building our DataRow entry is underway! Now let's loop over the columns and load up that data as well.
for col in row:
    if self.verbose:
        print(' %-8s: %s' % (header[colnum], col))
    # eif
    colName = header[colnum]
    if len(dataMap) > 0:
        try:
            memberIdx = dataMap[colName]
            if int(memberIdx) != -1:
                dRow.setMember(colName, int(memberIdx), col)
            # eif
        except Exception:
            if self.verbose:
                print("Error setting member with index: ", colnum, " with value: ", col)
                print("Unexpected error:", sys.exc_info()[0])
            # eif
        # etry
    else:
        dRow.setMember(colName, colnum, col)
    # eif
    colnum += 1
# efl
If the verbose flag is toggled we'll get a print out of each column name and its value. I would only recommend doing this if the limitLoad flag and
rowLimit max have been set. The column name comes from the header row we picked up on row number zero. If we have a data mapping, we use it;
if not, we just use the column name and the column number and pass the column value to setMember, which amounts to an automapping. When we do
have mapping info we pull the index we want from the data map and check whether it is a column we want to ignore (a -1 index number). If it is not, we call setMember with the column name,
the column index from our mapping, and the column value. Last but not least, if you look back a few code samples you'll notice at the end of our data loading loop a special
logging call that only runs if both verbose and limitLoad are set to true; it prints out a string representation of each row of data loaded. Not a good idea if you plan
to load a lot of data.
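Putting the pieces together, a direct call to the loader might look something like this; the file path is hypothetical.
data = LoadCsvData.LoadCsvData(lLimitLoad=True, lRowLimit=25, lVerbose=True)
data.loadData('data/spy.csv.xls',                      # hypothetical path
              'csv', '1.0', False,
              Data2DataRow.mapping['google_price'],    # column name to index map
              [{'Symbol': 'spy', 'idx': '6'}])         # static column added to each row
print(data.rowCount)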
4: Tensorflow Linear Regression Loading Data: Conclusions
That brings us to the conclusion of our data loading tutorial. There are a few more lines of code that append each new DataRow object to a local data list,
and loop variables control header detection and row appending. Once outside of the main processing loop, the file handle is closed and the local data row list is appended to our
class member data row list. Bam! You're sooo close to running a simple linear regression model in tensorflow. In the next tutorial we'll see how to generate features on
our data; think statistics and things we want tensorflow to pay attention to.
import Data2DataRow
import LoadCsvData

def run(exeCfg):
    type = exeCfg['type']
    data_2_datarow_type = exeCfg['data_2_datarow_type']
    datarow_2_tensor_type = exeCfg['datarow_2_tensor_type']
    version = exeCfg['version']
    reset = exeCfg['reset']
    checkpoint = exeCfg['checkpoint']
    verbose = exeCfg['verbose']
    limitLoad = exeCfg['limitLoad']
    rowLimit = exeCfg['rowLimit']
    validatePrct = exeCfg['validatePrct']
    trainPrct = exeCfg['trainPrct']
    randomSeed = exeCfg['randomSeed']
    learning_rate = exeCfg['learning_rate']
    log_reg_positive_result = exeCfg['log_reg_positive_result']
    lin_reg_positive_result = exeCfg['lin_reg_positive_result']
    model_type = exeCfg['model_type']
    loader = exeCfg['loader']
    cleanData = exeCfg['cleanData']
    trainStepsMultiplier = exeCfg['trainStepsMultiplier']
    dataMap = Data2DataRow.mapping[data_2_datarow_type]
    files = exeCfg['files']
    data = None
    print("Found loader: " + loader)
    if loader == 'load_csv_data':
        data = LoadCsvData.LoadCsvData()
        data.checkpoint = checkpoint
        data.limitLoad = limitLoad
        data.rowLimit = rowLimit
        data.cleanData = cleanData
        data.verbose = verbose
        for file in files:
            csvFileName = files[file]['name']
            appendCols = files[file]['appendCols']
            data.loadData(csvFileName, type, version, reset, dataMap, appendCols)
        # efl
    # eif
# edef
Oops, forgot to go over the code that executes our execution configuration object, which we reviewed earlier in this tutorial.
It's fairly simple: we offload all the values in the execution configuration dictionary and then begin the file loading process.
Loader-level variables are set right after the instantiation of the LoadCsvData object. Then, for each file listed in the execution configuration's
files entry, we load the csv data and pass the necessary parameters to the loadData method. The parameters are a mixture of local and global
params. Aaaaaaaand that now concludes our tutorial. You can run the tutorial by executing the Main.py file.
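A minimal Main.py might look roughly like the sketch below; the module and dictionary names here are assumptions, since we only looked at a fragment of the execution dictionary earlier.
# Hypothetical Main.py -- module and dictionary names are assumptions
import ExeCfgs    # module holding the execution dictionary from section 2
import Run        # module holding the run function we just reviewed

if __name__ == '__main__':
    Run.run(ExeCfgs.exeCfgs['goog_lin_reg_avg100day'])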
To edit your run configurations in PyCharm, find the Edit Configurations option under the Run menu.
See the picture below.
Fill out the information on the project you want to register on the Edit Configurations screen.
See the picture below.
And last but not least, you should be able to execute your CSV data loading code and see a print out similar to the one below.
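Based on the loader's logging statements, the output looks roughly like this; the row counts are made up.
Found loader: load_csv_data

Loading Data: data/spy.csv.xls Type: csv Version: 1.0 Reset: False
Loaded 1258 rows from this data file.
CleanCount: 0 RowCount: 1258 RowsFound: 1258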