— Scene III —
What are Plato's Pizzas best and worst selling pizzas?

Intro

In this act I'm going to show off my capability with Go and SQL using SQLite. I'm going to be using Go 1.18 which has support for generic programming similar to C++ and Rust although I'll be keeping to simple concrete in so much as possible. The source for this act is available at github.com/PaulioRandall/platos-pizzas-go.

Objective

The question to answer is What are Plato's Pizzas best and worst selling pizzas? from the Maven Pizza challenge. Full details can be found in the Pizza Challenge's full description.

Commentary

Go is currently my favourite programming language. It is by no means perfect but has a non-exhaustive set of amiable characteristics include high performance, compilation speed, ease of automated testing and formatting, straghtforward syntax, concise production rules, and a focus more on readability over condensed syntax. It also has a growing presence in the data analytics domain with native support for complex numbers.

On the downside error handling can be a little verbose and Go is not well suited for developing heavy graphical user interfaces or for scripting. It is, however, exceptionally well suited for Web API development, compiler and developer tools, and internet infrastructure, e.g. orchestration tools. Go is often referred to as the internets infrastructure language.

SQLite is my favourite SQL database. It is small, minimalist, fast, reliable, portable, and doesn't require cumbersome database management system. Providing you have the SQLite drivers you can just create a database as a file on your computer or smart phone.

While it's exceptional for storing a moderate amount of structured data in highly portable files, there are two major drawbacks. The database read capabilities are amazing across many processes, but database writes beyond a single process are not so. Combine this with the lack of networking features you get a database that won't power moderate to heavy use services or handle big data projects.

That being said, it is often overlooked as means of reducing costs on small projects, service prototyping, and as a starting point for developing larger services, end to end systems testing, and small to moderate data storage.

Skills

  • General Go competency
    • Interfaces
    • Error handling
  • Basic SQL & SQLite competency
    • Creating tables
    • Understanding and using keys
    • Writing queries
  • Using Github to store code
  • The importance of interfaces

Menu

— Act I —

The data

The data for this project is the same as provided in the Spreadsheets project. Therefore, I'll be omitting my previous commentary and repeating only the metadata for reference. You may find the data files in the R repository github.com/PaulioRandall/platos-pizzas-r/data/.

Entity relationship diagram of the data tables

orders

Field Description
order_id Unique identifier for each order placed by a table
date Date the order was placed (entered into the system prior to cooking & serving)
time Time the order was placed (entered into the system prior to cooking & serving)

order_details

Field Description
order_details_id Unique identifier for each pizza placed within each order (pizzas of the same type and size are kept in the same row, and the quantity increases)
order_id Foreign key that ties the details in each order to the order itself
pizza_id Foreign key that ties the pizza ordered to its details, like size and price
quantity Quantity ordered for each pizza of the same type and size

pizzas

Field Description
pizza_id Unique identifier for each pizza (constituted by its type and size)
pizza_type_id Foreign key that ties each pizza to its broader pizza type
size Size of the pizza (Small, Medium, Large, X Large, or XX Large)
price Price of the pizza in USD

pizza_types

Field Description
pizza_type_id Unique identifier for each pizza type
name Name of the pizza as shown in the menu
category Category that the pizza fall under in the menu (Classic, Chicken, Supreme, or Veggie)
ingredients Comma-delimited ingredients used in the pizza as shown in the menu (they all include Mozzarella Cheese, even if not specified; and they all include Tomato Sauce, unless another sauce is specified)
— Act II —

Database interface

Below is an initial design for the interface to the database. I've also added function signatures for querying the first n rows of each table. So I can manually confirm the correct data was added to the correct table. They are superfluous for this write up so they've been omitted.

package database 
 
type PlatosPizzaDatabase interface { 
	InsertMetadata(MetadataEntry) error 
	InsertOrder(Order) error 
	InsertOrderDetail(OrderDetail) error 
	InsertPizza(Pizza) error 
	InsertPizzaType(PizzaType) error 
	... 
}

I have a confession to make. This is not actually the simplest database interface design I could have crafted. I have specified some custom complex data structures (models) as inputs to the insert functions rather than passing the values directly. Much of my time in software engineering was spent desiging and implementing APIs that read and write to SQL databases so I have a lot of experience to lean on.

— Act III —

Data structures

Interfaces are the hardest parts of a system to modify so should be the focus during design. I've pre-abstracted the inputs by into simple and easy to handle data structures. They are more meaningful and easier to pass around. They each map directly to a table within the database.

orders

// Order represents an order of pizzas, one or many pizzas per order 
type Order struct { 
 
	// Unique identifier for each order placed by a table 
	Id int 
 
	// Date & time the order was placed 
	// (entered into the system prior to cooking & serving) 
	Datetime time.Time 
}

order_details

// OrderDetail represents a specific pizza type order, one or more pizzas, 
// within an order 
type OrderDetail struct { 
 
	// Unique identifier for each pizza placed within each order 
	// (pizzas of the same type and size are kept in the same row, and the quantity increases) 
	Id int 
 
	// Foreign key that ties the details in each order to the order itself 
	OrderId int 
 
	// Foreign key that ties the pizza ordered to its details, like size and price 
	PizzaId string 
 
	// Quantity ordered for each pizza of the same type and size 
	Quantity int 
}

pizzas

// Pizza represents a specific pizza type at a specific size 
type Pizza struct { 
 
	// Unique identifier for each pizza (constituted by its type and size) 
	Id string 
 
	// Foreign key that ties each pizza to its broader pizza type 
	TypeId string 
 
	// Size of the pizza (Small, Medium, Large, X Large, or XX Large) 
	Size string 
 
	// Price of the pizza in USD 
	Price float64 
}

pizza_types

type PizzaType struct { 
 
	// Unique identifier for each pizza type 
	Id string 
 
	// Name of the pizza as shown in the menu 
	Name string 
 
	// Category that the pizza fall under in the menu 
	// (Classic, Chicken, Supreme, or Veggie) 
	Category string 
 
	// Comma-delimited ingredients used in the pizza as shown in the menu 
	// (they all include Mozzarella Cheese, even if not specified; and they all 
	// include Tomato Sauce, unless another sauce is specified) 
	Ingredients string 
}
— Act IV —

In-memory

Here is an in-memory implementation of the interface. I've excluded most of the functions for brevity. They are almost identical to InsertOrder .

package database 
 
type inMemory struct { 
	dataDictionaries []MetadataEntry 
	orders           []Order 
	orderDetails     []OrderDetail 
	pizzas           []Pizza 
	pizzaTypes       []PizzaType 
} 
 
func NewInMemoryDatabase() *inMemory { 
	return &inMemory{} 
} 
 
... 
 
func (db *inMemory) InsertOrder(order Order) error { 
	db.orders = append(db.orders, order) 
	return nil 
} 
 
...

The error return value is redundant for this in-memory implementation but is required in order satisfy the interface. Implementations to real databases will almost certainly return an error.

Why?

Because it was extremely quick taking only ten to fifteen minutes. It gave me something concrete to test the interface design against. Interfaces are the hardest part to change in a system because they require all consumers and implementations to change at the same time.

Pulling the data from the files into the pre-established data structures was a bigger unknown for me than SQLite integration. I've done SQLite implementations before. It can be a little tedious but it's not difficult.

Furthermore, I now have a database ready for testing the consuming workflow code. Starting up a real database is a real drag, double so when we have to recreate or reset it for every test.

— Act V —

CSV

An example of reading a CSV file and pushing each record into the database. It's not the nicest code to read but it'll do for now. In terms of volume, there's not much data so individual database writes will suffice. AS data volume increases we have the option of batching inserts and using database transactions to optimise the time consuming IO.

I'm ignoring discussion of error handling because it's not of interest from a data analysis perspective. Errors are being handled though as can be seen.

const ( 
	DatetimeFormat = "2006-01-02 15:04:05" 
)
func InsertOrdersFromCSV(db PlatosPizzaDatabase, filename string) error { 
	records, e := readCSV(filename) 
	if e != nil { 
		return ErrDatabase.CausedBy(e, "Failure to read orders %q", filename) 
	} 
 
	orders := make([]Order, len(records)) 
	for i, record := range records { 
		id, e := strconv.Atoi(record[0]) 
		if e != nil { 
			return ErrDatabase.CausedBy(e, "Bad order ID discovered") 
		} 
 
		datetime, e := time.Parse(DatetimeFormat, record[1]+" "+record[2]) 
		if e != nil { 
			return ErrDatabase.CausedBy(e, "Bad order date or time discovered") 
		} 
 
		orders[i] = Order{ 
			Id:       id, 
			Datetime: datetime, 
		} 
	} 
 
	if e = db.InsertOrders(orders...); e != nil { 
		return ErrDatabase.Wrap(e) 
	} 
 
	return nil 
}

Reading

In case you were wondering how I'm reading the CSV files. I'm using Go's standard os and encoding/csv packages. Nothing fancy needed here.

func readCSV(filename string) ([][]string, error) { 
	f, e := os.Open(filename) 
	if e != nil { 
		return nil, ErrCSVFile.CausedBy(e, "Could not open file %q", filename) 
	} 
	defer f.Close() 
 
	r := csv.NewReader(f) 
	records, e := r.ReadAll() 
 
	if e != nil { 
		return nil, ErrCSVFile.CausedBy(e, "Could not read file %q", filename) 
	} 
 
	records = records[1:] // Remove header 
	return records, nil 
}
— Act VI —

Workflow

The workflow package contains the code that pulls it all together. The functions for reading and inserting the CSV have been placed in the database package because I felt it was more at home there.

package workflow 
 
import ( 
	"github.com/PaulioRandall/analytics-platos-pizza/act-3/pkg/database" 
	"github.com/PaulioRandall/analytics-platos-pizza/act-3/pkg/err" 
) 
 
var ErrExecuting = err.Track("Failed to execute workflow") 
 
func Execute() error { 
	db := database.NewInMemoryDatabase() 
 
	if e := insertCSVData(db); e != nil { 
		return ErrExecuting.TraceWrap(e, "Inserting CSV data into database") 
	} 
 
	return nil 
} 
 
func insertCSVData(db database.PlatosPizzaDatabase) error { 
 
	... 
 
	e = database.InsertOrdersFromCSV(db, "../data/orders.csv") 
	if e != nil { 
		return err.Wrap(e, "Failed to insert orders") 
	} 
 
	... 
 
	return nil 
}
To be continued...
Back to top