Come Thou Fount of Every Blessing – Piano Instrumental

After I began learning piano again in May 2022, Come Thou Fount of Every Blessing was the first hymn I learned. This is a fairly easy arrangement suited to my first year of playing again: https://www.musicnotes.com/sheetmusic/mtd.asp?ppn=MN0213498.

While I cannot say which is my favorite hymn, this one is in the top five, alongside Amazing Grace, Be Thou My Vision, In Christ Alone, and It is Well with My Soul.

I am playing here on a Roland FP-30X Digital Piano. One of the primary reasons I chose this digital piano is its keys, which are considered comparable to those of higher-quality digital pianos from both Roland and other manufacturers.

Walmart Daily Sales Prediction Using Time Series Analysis: Seasonality

Time series prediction can be beneficial in many fields including logistics, weather, sales forecasting, and predictive maintenance. Walmart provided a complete data set on Kaggle that can be used to evaluate time series prediction techniques. I will be making several posts using the Walmart data set from the M5 Forecasting – Accuracy competition to develop and evaluate time series methods in Python.

This first post demonstrates preliminary exploratory data analysis (EDA) and prediction using seasonal features. The post also provides a brief summary of polymorphism in Python using an abstract parent class to minimize code duplication and avoid conditionals like switch or if statements.

Walmart Data Set

The Walmart data set includes data for items in three categories of products from 2011 through 2016: hobbies, foods, and household. Each item is associated with a store in CA, FL, or TX. Three tables contain data to identify daily unit sales, selling prices, and event data for any given day.

  • The calendar table has daily rows with weekday and event labels providing the date of notable events. The event labels contained in the table include religious holidays, such as Chanukah End and Easter, sporting events, such as SuperBowl and NBAFinalsStart, and US national holidays, such as Thanksgiving and IndependenceDay.
  • The sales_train_validation table includes daily unit sales data for products in the three categories among stores in the three states. This table is in wide format with each row containing all daily sales data for one product and columns for each day in the full time range.
  • The sell_prices table provides weekly prices for each item.
The calendar table
The sales_train_validation table
The sell_prices table
Exploratory Data Analysis

Prior to predicting unit sales, we would like to identify good candidate items that have strong seasonal variation. Since the Walmart data set only provides item_id labels instead of true item names or descriptions, EDA is needed to identify these candidates. The best candidates for seasonal prediction are items whose sales correlate strongly with events and holidays. Since some foods are often associated with events (e.g., chocolate on Valentine’s Day), this analysis focuses on items in the FOODS category. This initial round of exploratory data analysis identifies foods that show higher sales on event days. Identical items from multiple stores are grouped together since preliminary EDA showed that the same food items behaved similarly across stores and states.

To identify good candidates for seasonal prediction, the data tables need to be merged into a form with one column per event and one row per item, with the same items from multiple stores grouped into one row.

  1. Unpivoting (pandas melt) the sales_train_validation table converts the table from a wide format with a column per day to long format with a primary key including day.
  2. Grouping and averaging (pandas groupby) combines each item sold on one day among all stores into one row for that item with an average unit_sales for this day. The grouped sales_train_validation table now only has three columns: d (day), item_id, and average unit_sales across all stores.
  3. Merging (pandas merge) joins this grouped table with the calendar table on the day column. This step adds the event labels per day to the average unit_sales per day.
  4. Grouping again combines items per event to produce a table with a primary key of item_id and event_name. This table identifies the average unit_sales per event.
  5. Pivoting (pandas pivot) converts this grouped table into wide format with one column per event including a ‘None’ column to group sales on days without events.
  6. Dividing the unit_sales values in the event columns by the unit_sales values in the ‘None’ column produces a unit_sales ratio to highlight foods with higher sales on event days. This step produces the final table values and structure.
Final unsorted wide table with average unit sales per item and per event
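
The six steps above can be sketched on a toy data set. This is a minimal illustration and not the code used for the analysis: the two small tables are made-up stand-ins for sales_train_validation and calendar, the function name event_sales_ratio is hypothetical, and the column names (item_id, store_id, d, event_name) are simplifying assumptions.

```python
import pandas as pd

# Made-up stand-ins for the wide sales table and the calendar table.
sales_wide = pd.DataFrame({
    "item_id": ["FOODS_1", "FOODS_1", "FOODS_2", "FOODS_2"],
    "store_id": ["CA_1", "TX_1", "CA_1", "TX_1"],
    "d_1": [2, 4, 1, 1],
    "d_2": [6, 10, 1, 1],
    "d_3": [4, 2, 3, 1],
})
calendar = pd.DataFrame({
    "d": ["d_1", "d_2", "d_3"],
    "event_name": [None, "Thanksgiving", None],
})


def event_sales_ratio(sales_wide, calendar):
    # 1. Unpivot: one row per item, store, and day.
    long = sales_wide.melt(id_vars=["item_id", "store_id"],
                           var_name="d", value_name="unit_sales")
    # 2. Average each item over all stores for each day.
    per_day = long.groupby(["d", "item_id"], as_index=False)["unit_sales"].mean()
    # 3. Attach the event label of each day; days without events become 'None'.
    merged = per_day.merge(calendar, on="d", how="left")
    merged["event_name"] = merged["event_name"].fillna("None")
    # 4. Average each item over all days that share an event label.
    per_event = merged.groupby(["item_id", "event_name"],
                               as_index=False)["unit_sales"].mean()
    # 5. Pivot to one row per item and one column per event.
    wide = per_event.pivot(index="item_id", columns="event_name",
                           values="unit_sales")
    # 6. Divide every column by the 'None' column to get a sales ratio.
    return wide.div(wide["None"], axis=0)


ratio = event_sales_ratio(sales_wide, calendar)
print(ratio.sort_values("Thanksgiving", ascending=False))
```

In this toy example FOODS_1 sells about 2.7 times its normal daily average on Thanksgiving, while FOODS_2 sells less than its normal average.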

Sorting this wide table in descending order by the ‘Thanksgiving’ column identifies FOODS_3_069 as the food with the highest increase in average unit_sales on Thanksgiving Day compared to days without events.

Sorted table identifies foods that sell more on Thanksgiving than normal days

The unit_sales for FOODS_3_069 at the TX_1 store demonstrate the seasonality of this food. Distinct peaks occur near New Year’s Eve, Christmas, Thanksgiving, and Valentine’s Day, although not every holiday has a peak in each of the five years.

Unit sales of the FOODS_3_069 item shows peaks near three holidays
Unit Sales Prediction with Seasonal Features

This analysis uses a combination of deterministic time series features to predict unit sales: a linear trend, weekly seasonal indicators, and annual seasonal indicators. The linear trend lets the model capture a long-term linear change over time. The annual seasonal features are Fourier series terms in which each term has an integer number of cycles within a one-year time frame. This analysis uses annual seasonal indicators with 1 to 32 cycles per annum. The statsmodels.tsa.deterministic.DeterministicProcess container class is convenient because it provides the Fourier series terms in addition to constants, time trends, and weekly seasonal indicators. The following method demonstrates the DeterministicProcess syntax. DeterministicProcess requires a pandas index, which here is a DatetimeIndex built from the date column.

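    import pandas as pd
    from statsmodels.tsa.deterministic import CalendarFourier, DeterministicProcess

    def create_seasonal_features(self, df_merged_store):
        """Creates seasonal features for one item and one store"""
        df_copy = df_merged_store.copy(deep=True)
        y = df_copy['unit_sales']

        # DeterministicProcess needs a pandas date index
        df_copy['date'] = pd.DatetimeIndex(df_copy['date'])
        df_copy.set_index('date', inplace=True)

        # Annual Fourier terms ('A' is the annual frequency alias)
        fourier = CalendarFourier(freq='A', order=16)
        dp = DeterministicProcess(index=df_copy.index,
                                  constant=True,   # intercept column
                                  order=1,         # linear time trend
                                  seasonal=True,   # seasonal indicator columns
                                  additional_terms=[fourier],
                                  drop=True)       # drop perfectly collinear columns
        X = dp.in_sample()

        return X, y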
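
To show what an annual Fourier feature looks like, here is a minimal hand-rolled sketch using only numpy and pandas. It is not the statsmodels implementation; the function name annual_fourier_features and the column names are hypothetical, and the fraction-of-year calculation is an assumption chosen for illustration. The point is that an order of k yields 2k columns: one sine and one cosine for each of 1 to k cycles per year.

```python
import numpy as np
import pandas as pd


def annual_fourier_features(index, order):
    """Return sin/cos pairs with 1..order cycles per calendar year."""
    # Fraction of the calendar year elapsed at each timestamp (0 on Jan 1).
    days_in_year = np.where(index.is_leap_year, 366, 365)
    tau = (index.dayofyear.to_numpy() - 1) / days_in_year
    features = {}
    for k in range(1, order + 1):
        features[f"sin_{k}"] = np.sin(2 * np.pi * k * tau)
        features[f"cos_{k}"] = np.cos(2 * np.pi * k * tau)
    return pd.DataFrame(features, index=index)


index = pd.date_range("2015-01-01", "2015-12-31", freq="D")
X_fourier = annual_fourier_features(index, order=2)
print(X_fourier.shape)  # (365, 4): two sine/cosine pairs
```

A linear regression on these columns can then fit any smooth annual pattern that is a weighted sum of the included sine and cosine waves; higher orders allow sharper peaks.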

The model fitting and prediction problem presents an opportunity to apply polymorphism in Python using the abc package. A parent class contains a generic plotting method and abstract methods for fitting and prediction. A child class defines fitting and prediction methods that are tailored to a specific combination of input features. This first analysis only uses the seasonal features described previously, and the UnitSalesPredictionSeasonal child class fits a linear regression model from sklearn.linear_model.LinearRegression. The full code used in this example is available on GitHub: https://github.com/bspivey/M5ForecastingAccuracy.

from abc import ABC, abstractmethod

import pandas as pd
import plotly.express as px
from sklearn.linear_model import LinearRegression

class UnitSalesPrediction(ABC):
    def plot_predictions(self, X, y, y_pred):
        list_of_tuples = list(zip(X.index, y, y_pred))
        columns = ['date', 'y', 'y_pred']
        df_wide = pd.DataFrame(list_of_tuples, columns=columns)
        value_vars = ['y', 'y_pred']
        df_tall = pd.melt(df_wide,
                            id_vars='date',
                            value_vars=value_vars,
                            var_name='y_label',
                            value_name='y_value')

        fig = px.line(df_tall,
                        x='date',
                        y='y_value',
                        color='y_label',
                        width=900,
                        height=300)
        fig.update_layout(
            yaxis_title='unit_sales')

        fig.show()

    @abstractmethod
    def fit_unit_sales_model(self):
        pass

    @abstractmethod
    def predict_unit_sales(self):
        pass

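class UnitSalesPredictionSeasonal(UnitSalesPrediction):
    def fit_unit_sales_model(self, X_seasonal, y):
        """Trains a model to predict unit sales for one item and one store"""
        X = X_seasonal
        model = LinearRegression().fit(X, y)

        return model

    def predict_unit_sales(self, model, X_seasonal):
        """Predicts unit sales from seasonal features"""
        # Assumed minimal implementation (not shown in this excerpt); the child
        # must define every abstract method before it can be instantiated.
        # See the GitHub repository for the full code.
        return model.predict(X_seasonal)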
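
The pattern can also be illustrated with a small self-contained sketch. The class names below (Predictor, LeastSquaresPredictor, MeanPredictor, IncompletePredictor) are hypothetical and are not taken from the repository, and numpy least squares stands in for sklearn to keep the example dependency-light. It shows the two properties that make the abc approach useful: shared parent code works with any child without conditionals, and a child that forgets an abstract method cannot be instantiated.

```python
from abc import ABC, abstractmethod

import numpy as np


class Predictor(ABC):
    """Parent class: shared logic plus abstract hooks for the children."""

    def mean_absolute_error(self, X, y):
        # Shared method: works for any child that implements predict().
        return float(np.mean(np.abs(self.predict(X) - y)))

    @abstractmethod
    def fit(self, X, y):
        ...

    @abstractmethod
    def predict(self, X):
        ...


class LeastSquaresPredictor(Predictor):
    def fit(self, X, y):
        self.coef_, *_ = np.linalg.lstsq(X, y, rcond=None)
        return self

    def predict(self, X):
        return X @ self.coef_


class MeanPredictor(Predictor):
    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


class IncompletePredictor(Predictor):
    # predict() is missing, so Python refuses to create an instance.
    def fit(self, X, y):
        return self


X = np.column_stack([np.ones(5), np.arange(5.0)])  # intercept and trend
y = 2.0 + 3.0 * np.arange(5.0)

# The same calling code handles both children; no if/else on model type.
errors = {type(m).__name__: m.fit(X, y).mean_absolute_error(X, y)
          for m in (LeastSquaresPredictor(), MeanPredictor())}
print(errors)

try:
    IncompletePredictor()
    instantiated = True
except TypeError:
    instantiated = False
print("IncompletePredictor instantiated:", instantiated)
```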

The model trains on FOODS_3_069 time series data excluding the final two years. The final year contains the test data not used for model tuning, and the prior year contains the validation data used for model tuning.
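
A chronological split like this one can be sketched as follows. This is a minimal illustration under stated assumptions: the function name split_last_two_years is hypothetical, X and y are assumed to share a daily DatetimeIndex, and each held-out year is taken to be the twelve months counted back from the last date.

```python
import pandas as pd


def split_last_two_years(X, y):
    """Split into train, validation (second-to-last year), and test (last year)."""
    last_day = X.index.max()
    test_start = last_day - pd.DateOffset(years=1)
    val_start = last_day - pd.DateOffset(years=2)

    train = X.index <= val_start
    val = (X.index > val_start) & (X.index <= test_start)
    test = X.index > test_start
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])


# Made-up example data on a four-year daily index.
index = pd.date_range("2012-01-01", "2015-12-31", freq="D")
X = pd.DataFrame({"trend": range(len(index))}, index=index)
y = pd.Series(range(len(index)), index=index, name="unit_sales")

(X_train, y_train), (X_val, y_val), (X_test, y_test) = split_last_two_years(X, y)
print(len(X_train), len(X_val), len(X_test))
```

Splitting by date rather than at random keeps future observations out of the training data, which matters for time series.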

The results for unit_sales predictions on validation and test data demonstrate that the seasonal features model successfully identifies peaks around Thanksgiving and Christmas and a possible peak near Valentine’s Day. The y_pred signal is the predicted unit_sales shown versus the y signal which is the validation or test data unit_sales.

FOODS_3_069 unit sales predictions on validation data
FOODS_3_069 unit sales predictions on test data

While the results demonstrate a correlation with several holidays as expected, the predictions are smoother than the actual sales and show potential for improvement. Ideas for next steps are (1) including categorical features using actual event and holiday labels combined with lag/lead features, (2) using a hybrid linear regression and nonlinear regression model, and (3) using forecasting packages such as Facebook Prophet.

Compare GPU and CPU Training Times for Image Recognition with Tensorflow 2

This article compares the training times for fitting a Tensorflow 2 convolutional neural network (CNN or convnet) using a GPU or CPU on the Kaggle Dogs vs. Cats dataset. The Dogs vs. Cats competition was an early Kaggle competition to demonstrate the power of convnets to solve computer vision recognition problems as winning entries reached 95% accuracy.

The training time comparison follows my prior post explaining how to set up an nvidia-docker container to run TensorFlow 2 on a GPU. I will begin this article by reviewing the main steps to train the convnets using an example in Deep Learning with Python 1st edition by Chollet. These steps are provided in more detail on the book GitHub site: https://github.com/fchollet/deep-learning-with-python-notebooks.

Starting the Container

The GPU can be enabled or disabled when starting the nvidia-docker container by keeping or removing the --gpus all option in the following line:

sudo docker run --gpus all -d -it -p 8848:8888 -v "$(pwd)/data:/home/jovyan/work" -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.4_cuda-11.0_ubuntu-18.04_python-only

If the GPU is not selected as an option, the following command should show no GPUs in the list of local devices:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 2823115825857772105
]

Training the Model

The convnet is constructed with a series of paired convolution and max pooling layers. The first Conv2D layer slides 3×3 windows over the 150 x 150 x 3 tensor representing the scaled RGB input image to produce a 148 x 148 x 32 output feature map, with one channel for each of the 32 convolution filters. The output height and width shrink by two pixels because no padding is used; they can instead match the input height and width by setting padding="same". The MaxPooling2D layer downsamples the feature maps. Downsampling is important to reduce the number of model parameters and to achieve output feature maps that represent general image features such as cat eyes or ears. The convnet is completed by flattening the output feature map and adding Dense neural network layers. The convolution and max pooling layers transform input images to generalized image features which serve as inputs to the Dense neural network classifier. The reader may find many more detailed explanations of convnets online.

from keras import layers
from keras import models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
          input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 148, 148, 32)      896       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 74, 74, 32)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 72, 72, 64)        18496     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 36, 36, 64)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 34, 34, 128)       73856     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 17, 17, 128)       0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 15, 15, 128)       147584    
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 7, 7, 128)         0         
_________________________________________________________________
flatten (Flatten)            (None, 6272)              0         
_________________________________________________________________
dense (Dense)                (None, 512)               3211776   
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 513       
=================================================================
Total params: 3,453,121
Trainable params: 3,453,121
Non-trainable params: 0
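
The output shapes and parameter counts in this summary can be reproduced by hand. The sketch below is plain Python arithmetic rather than Keras code, and the helper names are hypothetical. It assumes what the model above uses: 3×3 convolutions with stride 1 and no padding, and 2×2 max pooling.

```python
def conv_output(size, kernel=3):
    """Side length after an unpadded convolution with stride 1."""
    return size - kernel + 1


def pool_output(size, pool=2):
    """Side length after non-overlapping max pooling (floor division)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """kernel x kernel x in_channels weights plus one bias, per filter."""
    return (kernel * kernel * in_channels + 1) * filters


size, channels = 150, 3
total_params = 0
shapes = []
for filters in (32, 64, 128, 128):
    total_params += conv_params(3, channels, filters)
    size = conv_output(size)
    shapes.append((size, size, filters))   # after Conv2D
    size = pool_output(size)
    shapes.append((size, size, filters))   # after MaxPooling2D
    channels = filters

flat = size * size * channels              # Flatten
total_params += (flat + 1) * 512           # Dense(512): weights plus biases
total_params += (512 + 1) * 1              # Dense(1)

print(shapes[0], shapes[-1], flat, total_params)
# (148, 148, 32) (7, 7, 128) 6272 3453121
```

Almost all of the parameters (3,211,776 of 3,453,121) sit in the first Dense layer, which is why the downsampling before Flatten matters so much for model size.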

The model is compiled with a binary_crossentropy loss function and the acc metric as a generic accuracy metric. This combination suits a problem with two target classes; for a multiclass problem the loss function (and the output layer) should be changed.

from keras import optimizers

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(learning_rate=1e-4),
              metrics=['acc'])

A data generator is used to generate batches of image tensor data that can be augmented at runtime. The first example shows the training time comparison with only image rescaling, and the second example shows the results with rotations, x-y shifts, shear, zoom, and horizontal flip augmentations.

# Image data generator with only scaling
from keras.preprocessing.image import ImageDataGenerator

# All images will be rescaled by 1./255
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

# Image data generator with additional data augmentations
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,)

# Note that the validation data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')
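
To give a feel for what the generator does to each image, here is a minimal numpy sketch of rescaling, a random horizontal flip, and a random width shift. It is not how ImageDataGenerator is implemented: the function names are hypothetical, the random image is made up, and the shift wraps pixels around the edge as a simplification.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up 150 x 150 RGB image with 8-bit pixel values.
image = rng.integers(0, 256, size=(150, 150, 3), dtype=np.uint8)


def rescale(img):
    """Map 8-bit pixel values into [0, 1], like rescale=1./255."""
    return img.astype(np.float32) / 255.0


def random_horizontal_flip(img, rng):
    """Mirror the image left to right with probability 0.5."""
    if rng.random() < 0.5:
        return img[:, ::-1, :]
    return img


def random_width_shift(img, rng, max_fraction=0.2):
    """Shift left or right by up to max_fraction of the width (wrapping)."""
    max_shift = int(max_fraction * img.shape[1])
    shift = int(rng.integers(-max_shift, max_shift + 1))
    return np.roll(img, shift, axis=1)


augmented = random_width_shift(
    random_horizontal_flip(rescale(image), rng), rng)
print(augmented.shape, augmented.dtype)
```

Because the transformations are drawn at random for every batch, the model rarely sees exactly the same image twice, which is what reduces overfitting.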

The image transformations used for data augmentation are beneficial to reduce overfitting since the model becomes less sensitive to placement and orientation of the objects within an image. The convnet model is fit using 30 epochs without data augmentation and 100 epochs with data augmentation. The model is fit with more epochs in the latter run since model validation performance continues to improve without overfitting.

history = model.fit(
      train_generator,
      steps_per_epoch=100,
      epochs=30, # 100 epochs with data augmentation
      validation_data=validation_generator,
      validation_steps=50)

Model Validation Results

The convnet without data augmentation demonstrates overfitting that begins by the second epoch as the training accuracy exceeds the validation accuracy. The validation accuracy saturates at ~70%.

Accuracy and loss learning curves demonstrate overfitting early in learning

The convnet with data augmentations demonstrates increasing validation accuracy above 80% by the final epoch.

Accuracy and loss curves demonstrate continued improvement through 90 epochs.

GPU vs. CPU Training Time Results

Without data augmentation, the training time for all GPU epochs after the first one was 8 seconds versus the CPU epoch time of 27 seconds.

GPU training time without data augmentation
CPU training time without data augmentation

With the data augmentations used above, the training time for the GPU epochs was 15 seconds versus the CPU epoch time of 28 seconds.

GPU training time with data augmentation
CPU training time with data augmentation
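
Expressed as speedups, the epoch times reported above work out as follows. This is only arithmetic on the four numbers already given; the variable names are arbitrary.

```python
# Seconds per epoch reported above.
epoch_seconds = {
    "no augmentation": {"gpu": 8, "cpu": 27},
    "with augmentation": {"gpu": 15, "cpu": 28},
}

speedups = {name: t["cpu"] / t["gpu"] for name, t in epoch_seconds.items()}
for name, speedup in speedups.items():
    print(f"{name}: GPU is {speedup:.1f}x faster per epoch")
# no augmentation: GPU is 3.4x faster per epoch
# with augmentation: GPU is 1.9x faster per epoch
```

The GPU advantage drops from roughly 3.4x to roughly 1.9x once augmentation is added, while the CPU epoch time barely changes.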

The training time for GPU epochs may have increased more than for CPU epochs because the ImageDataGenerator augmented the images asynchronously on the CPU. The following links describe in more detail how the data augmentation may be done synchronously on the GPU: https://keras.io/examples/vision/image_classification_from_scratch/ and https://github.com/keras-team/keras/issues/12120.

Setup TensorFlow to use the GPU with Docker Containers

Having built a machine suitable for deep learning, I was ready to put my EVGA GeForce GTX 1080 Ti GPU to the test. Unfortunately I found that configuring TensorFlow + GPU to run on my local machine was not as straightforward as installing any other Python package. This story has been repeated in many posts online with all the pitfalls that can occur. This post chronicles the simplest approach I have found to start using TensorFlow with the GPU.

I am motivated to make this post since I found no sites that chronicled the complete journey from a fresh GPU installation to having TensorFlow running on a GPU. Many sites show individual steps, and some advertise how easy this can be while only showing the last Conda install steps required and none of the prior CUDA configuration steps. Having tried multiple approaches to install TensorFlow directly on my local machine to work with the GPU, I found that using a Docker container was a reliable method that also makes work more portable to other machines.

In this post, I will describe all steps that were required to stand up a Docker container that can run TensorFlow on Ubuntu 18.04 OS with an EVGA GeForce GTX 1080 Ti GPU.

1. Install Nvidia Drivers

Prior to installing Nvidia drivers, I recommend removing all existing Nvidia drivers. I have seen errors with the GPU not being recognized due to prior Nvidia GPU and CUDA drivers. If you find that you later want to install Tensorflow with GPU support on the local machine, this is the key first step.

$ sudo apt remove 'nvidia-*'
$ sudo apt install
$ sudo apt autoremove

The next step is to find the appropriate driver for the GPU. Here I performed a Manual Driver Search for GeForce 10 Series: https://www.nvidia.com/en-us/geforce/drivers/. I selected the OS with bits (e.g., Linux 64-bit) and downloaded the latest driver for this GPU: Linux x64 (AMD64/EM64T) Display Driver Version: 465.31 with the run file NVIDIA-Linux-x86_64-465.31.run.

The file permissions may need to be changed prior to executing the run file:

$ sudo chmod +x ./NVIDIA-Linux-x86_64-465.31.run
$ sudo ./NVIDIA-Linux-x86_64-465.31.run

If you do not know the meaning of the installation options, I recommend selecting the defaults since other options can produce errors. You may receive a warning about the GCC version being different. I had no errors as long as the system GCC version was more recent than the GCC used to compile the run file.

2. Install Docker

Docker provides the latest instructions to install the Docker engine on Ubuntu here: https://docs.docker.com/engine/install/ubuntu/. Note that Docker may change the steps below, and I recommend following the latest steps from the Docker site. It is recommended to start with the Uninstall Old Versions step to prevent incompatibility issues. Next use the Install Using the Repository and Set Up the Repository steps:

$ sudo apt-get update
$ sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg \
    lsb-release

Add Docker’s official GPG key:

$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg

Set up the stable repository:

$ echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

Install the latest version of the Docker engine:

$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io

Finally verify that Docker is working:

$ sudo docker run hello-world

3. Install Nvidia Docker Support

Nvidia provides working instructions to set up Docker and the Nvidia Container Toolkit under Install on Ubuntu and Debian: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker. I recommend using this link maintained by Nvidia. However, I will also document the steps I used recently to set up Nvidia with Docker support. Note that you can skip the Setting up Docker step since we set up Docker in the prior step. Use the $ docker -v command to confirm that the Docker version is 19.03 or later, which is required for nvidia-docker2.

$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

$ curl -s -L https://nvidia.github.io/nvidia-container-runtime/experimental/$distribution/nvidia-container-runtime.list | sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list

$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
$ sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

The output should show the GPU status similar to below (extra points if you catch the pop-culture reference):

In prior installations I received an error installing nvidia-docker like this one: https://github.com/NVIDIA/nvidia-docker/issues/234. If this error occurs, the solution is to install this deb file: https://github.com/NVIDIA/nvidia-docker/files/818401/nvidia-docker_1.0.1-yakkety_amd64.deb.zip. Then the nvidia-docker2 package should be able to be installed, or else you may also try to install with sudo apt-get install -y nvidia-container-toolkit.

4. Pull a Pre-Built Docker Image

The easiest way to get started with Docker is to pull a pre-built image that has Jupyter notebook and TensorFlow GPU support. I recommend selecting an image with a terminal window to make updating the Python virtual environment easier, and I recommend choosing an image that connects to the local filesystem.

The GPU-Jupyter image provides these features: https://github.com/iot-salzburg/gpu-jupyter/commits?author=ChristophSchranz. I started with Quickstart Step 4 to pull the Docker image. If only Python is needed, the site provides names of additional images that exclude Julia and R which should save time in downloading the image. Also select the proper image for the Ubuntu OS. I used the following command to pull the image:

$ cd your-working-directory 
$ docker run --gpus all -d -it -p 8848:8888 -v "$(pwd)/data:/home/jovyan/work" -e GRANT_SUDO=yes -e JUPYTER_ENABLE_LAB=yes --user root cschranz/gpu-jupyter:v1.4_cuda-11.0_ubuntu-18.04_python-only

The command and options used to run a Docker image are explained here: https://docs.docker.com/engine/reference/run/. The specific options used for GPU-Jupyter are explained as follows:

  • -d: runs the container in detached mode in the background; the container exits when the root process running the container exits.
  • -it: keeps STDIN open and creates a tty (teletypewriter) as a terminal window for interactive processes.
  • -p: specifies the ports to be accessible on the local host. In the command above, container port 8888 is published on host port 8848.
  • -v: specifies the volumes or shared filesystem. In the command above, a data folder with admin-only access will be created in the working directory.

Once the image has been pulled, the container starts automatically and JupyterLab is available at http://localhost:8848. The password at the time of this article is gpu-jupyter.

5. Check that Tensorflow Runs on the GPU

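One way to confirm that TensorFlow runs with the local machine GPU is to open a Jupyter notebook in the GPU-Jupyter container and use the is_gpu_available() function, which returns a Boolean. Note that later TensorFlow 2 releases deprecate this function in favor of tf.config.list_physical_devices('GPU').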

import tensorflow as tf
print(tf.test.is_gpu_available(cuda_only=True))

TensorFlow also provides a function to check the GPU device:

print(tf.test.gpu_device_name())
GPU-Jupyter image provides a JupyterLab web interface

As seen above, both commands confirm that TensorFlow recognizes the GPU. If the image is configured correctly, TensorFlow will use the GPU by default.

In my next post I will show initial results of using TensorFlow + GPU for a common deep learning problem.

My First Woodworking DIY Project: A Bookshelf

I built a hardwood bookshelf during 2016 while living in an apartment. I found a good selection of wood types and sizes at Houston Hardwoods, Inc. (https://www.houstonhardwoods.com/) and found a local makerspace, TXRX Labs (https://www.txrxlabs.org/woodshop), where I could finish the wood pieces and build the bookshelf. The makerspace required safety training and woodshop training prior to granting free access to the woodshop. I took these preparatory classes and made my first wood creation, a cutting board.

If I had to choose inspiration for this work, it must have been how my uncle and dad have both built wooden furniture. My dad built a bookshelf out of pine and built more structures outdoors. My uncle has built bed frames and other furniture and has a woodshop in his basement, and my great grandfather built his home on a farm. These family stories inspired me to try my hand at DIY woodworking. I am inspired by our Creator who has instilled many of us with the mind and ability to create using the raw work of His hands. “So God created man in his own image, in the image of God he created him; male and female he created them. And God blessed them. And God said to them, ‘Be fruitful and multiply and fill the earth and subdue it,…’ ” – Genesis 1:27-28. I may also have been inspired by the many hours my parents watched The New Yankee Workshop on PBS given that we didn’t have cable growing up.

Design

The first step of the project was to decide on a bookshelf design. I found a design very close to what I was seeking, but it was a tall bookshelf with shorter, shallower shelf heights than I needed. I adopted the general design and adapted the dimensions and some of the style to fit my bookshelf. Then I sketched the wood dimensions and counts I would need prior to visiting the lumber shop.

Wood Selection and Preparation

Once I had the dimensions and count of wood pieces, the next decision was the type of wood to buy. I decided early to use a hardwood instead of pine since I wanted a wood with fine grain, few knots, and durability. I landed on maple since I liked the clean appearance and since it was one of the woods recommended by the lumber shop.

Since I needed two 13 inch wide boards for deep shelves, the selection of boards was slim. I also could not find a board wide enough for the top width of 16 inches. Many boards also had some amount of warping, and I had to wait for a second round of lumber to arrive to get all the boards I needed. The shop needed to mill some rough lumber pieces to a grade less than S4S. Since I planned to plane the pieces in the makerspace, I did not need the lumber shop to finish all pieces to S4S. After using a 24″ planer and 8″ jointer to finish the shelves, I saw that the surface was still too rough. Since the makerspace did not have a working drum sander, I took these to the lumber shop where they sanded the top and shelf pieces to produce smooth surfaces.

The 13″ wide middle and bottom shelf pieces.

I finished the remaining pieces of wood using the 24″ planer to smooth surfaces and the 8″ jointer to make square edges. It is important to plane in the direction of the grain and angle the wood downward to avoid tear-out.¹ The makerspace blades weren’t as sharp as ideal due to less frequent maintenance than needed.

All other pieces for the legs and frame

One lesson I learned on this project is that wood can warp significantly when you take it home from the lumber shop. I stored the wood inside my apartment, and the wood subsequently dried compared to the higher humidity inside the lumber shop warehouse, which is not air conditioned. The wide shelf and top boards warped before I started construction. A good solution I found is to place damp towels on the boards followed by weights on top. Leaving the boards under weight for a few days made them flat again.

Building the Frame

I began by building a frame for each side of the bookshelf. Each frame consists of two legs, a top and bottom board, and the side board. I attached all of these with a Kreg pocket hole jig. I recommend starting with a practice board before drilling a pocket hole in the final wood piece. I drilled pocket holes from the inside of the side board along all four edges to attach the legs and top and bottom boards to the side board. It is important to set the angle to ensure the holes break through the edge of the side board and not the opposite side.

Kreg pocket hole jig and drill bit
Clamp the jig and board to the table to ensure a clean hole

The assembled side frames turned out well. I added a chamfer to the legs for added style.

Completed side frame
Close-up of chamfered legs

Next I attached the top two rails to attach the side frames.

Building the Shelves

I built the shelves in the same manner as the side frames using pocket holes to attach a front and back edge. For all shelves besides the bottom shelf, the design calls for rotating the front edge downward to have the middle shelves inset from the bottom shelf.

Where possible, I chose sides with discoloration or knots to be on the bottom

I filled the pocket holes with a water-based wood filler.

Building the Top

I could not find a wood piece wide enough for the top, but I was able to use a method similar to the one I used for the cutting board. I glued together six 3″ wide boards to form the top. Once the glue had dried, I brought the top to the lumber shop to finish it with the drum sander, and I used the router table to cut chamfers along its edges as a design feature. The chamfers were not part of the original design, but I saw that they added character to otherwise square pieces.

Finishing the Back

I wanted a unique look for the back of the bookshelf. The back is often an area neglected on store bookshelves where they may have a cardboard composite backing, but I wanted something better given the time invested thus far. I found that bead board provided a nice style while still being cheaper and lighter than the rest of the bookshelf.

Staining the Bookshelf

Staining the bookshelf was, surprisingly, one of the harder steps. Since I had chosen maple and sanded with 220 grit to remove hairs prior to staining, the stain did not penetrate well. Maple has a much finer grain than oak or especially pine, and I would use either a very light or a very dark stain if I ever work with maple again. Instead of absorbing, the stain tends to smear. The stain also dries quickly and becomes darker with each coat. Thus I had to be careful to stain all adjoining wood at the corners in one pass to prevent lines from multiple staining rounds. The bookshelf turned out well. In retrospect I would have chosen cherry wood to produce a similar finished color.

The finished bookshelf is wide enough to serve as an end piece to a sofa.

  1. https://www.popularwoodworking.com/techniques/your-guide-to-tear-out/

Crispy Buckeye Recipe

One of my favorite desserts to make for the holidays is buckeyes. Making chocolate desserts is easier than most people expect their first time, but the process has several important steps that determine the outcome.

This Christmas I prepared about 45 buckeyes, which require the following ingredients:

  • Peanut butter: 1 cup (16 oz).
  • Confectioner’s sugar: 2 cups.
  • Crispy rice cereal: 1.5 cups.
  • Butter: 1/2 cup.
  • Vanilla extract: 1 tbsp.
  • Semi-sweet chocolate chips: 16 oz.
  • Dark chocolate chips: 16 oz.

The first step is to melt the butter and mix the peanut butter, sugar, rice cereal, melted butter, and vanilla extract together in a bowl. The mix should be relatively dry and not sticky. One may also reduce the confectioner’s sugar as long as the butter is also reduced. These two must be balanced so the mix is firm enough to hold the buckeyes together.

It is important to ensure the right consistency after adding the butter. This image shows the mix after adding butter but before adding more sugar. The consistency should be drier than shown here.

Combining the semi-sweet and dark chocolate is a variation that I have found produces a good chocolate taste. I create a double boiler using a larger frying pan to hold the simmering water and a smaller pot to melt the chocolate. I keep the water temperature low enough to prevent the chocolate from drying on the side of the pot.

Freezing the peanut butter balls overnight ensures they remain in one piece while dipping in warm chocolate.

How Lethal is the Covid-19 Virus for Millennials?

When Covid-19 infections began to be publicized in the United States in March, we had little data to understand this virus except whatever reports came from China. Even by late April, many states outside of New York had little internal data to assess the impact this virus would have in their locale. The early absence of data led to widely varying opinions on the virus, and individual behavior varied in the absence of local government guidance.

In November 2020, after eight months, all states in the US and many countries have firsthand experience with higher than normal hospital ICU admissions. Behavior has become more normalized as the US has adapted to living with this virus until vaccination becomes widespread. However, opinions still vary with regard to the lethality of Covid-19.

Method

The purpose of this article is to take an objective approach to analyzing deaths reported by the CDC. More specifically, this article examines deaths due to natural causes across the United States to investigate the impact of the virus across all age groups and for millennials. Death counts are based on records received and coded by the CDC by November 25, 2020. This work analyzes death counts to minimize debate over the subjectivity of Covid diagnoses. Restricting the analysis to deaths due to natural causes also removes non-viral causes such as drug overdoses correlated with increased economic hardship or lockdowns. Data is downloaded from the National Center for Health Statistics at the following link: https://data.cdc.gov/browse?category=NCHS&sortBy=most_accessed . Plots are created using Plotly and embedded using the Plotly Chart Studio. Note that some plots below have axis titles which will appear after selecting the plot.

Weekly Deaths for All Age Cohorts

A review of the deaths by natural cause indicates that the United States experienced a peak 46% increase in weekly deaths due to natural causes in mid-April, with 73,981 in 2020 versus 50,799 in 2019. Texas experienced a peak 76% increase in weekly deaths due to natural causes in late July, with 5,887 in 2020 versus 3,349 in 2019. The deaths by natural cause demonstrate the mortality of Covid across all age cohorts. No other natural cause besides Covid-19 can explain these deaths, and the deaths correlate with the commonly known timing of reported spikes in United States Covid-19 related deaths.
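The percentages quoted above follow from a simple relative-change calculation. The sketch below reproduces it from the counts given in the text; the function name `percent_increase` is only an illustrative helper, not part of the original analysis code.

```python
def percent_increase(current: float, baseline: float) -> float:
    """Return the percentage change of `current` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline * 100.0


# Peak weekly deaths by natural cause, as quoted in the text.
us_peak = percent_increase(73_981, 50_799)   # United States, mid-April
texas_peak = percent_increase(5_887, 3_349)  # Texas, late July

print(f"US peak increase:    {us_peak:.1f}%")
print(f"Texas peak increase: {texas_peak:.1f}%")
```

Rounded to whole numbers, these come out to the 46% and 76% figures reported for the United States and Texas.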

Weekly Deaths by Natural Cause in the United States

Figure 1. Weekly deaths by natural cause in the United States shows a primary peak in mid-April and a secondary peak in mid-July.

Note that the weekly deaths by natural cause decrease during the summer months in 2019, which is typical. This may be attributed at least partly to influenza and pneumonia, which are more prevalent in winter months. In 2020, by contrast, the weekly deaths by natural cause increase during the summer months.

Weekly Deaths by Natural Cause in Texas

Figure 2. Weekly deaths by natural cause in Texas show a primary peak in late July and have remained elevated since the peak.

Note that the excess deaths shown in Figure 2, the 2020 counts above the 2019 counts, undercount Covid-19 lethality compared to the 2019 baseline. The United States has undergone historic measures, both voluntary and involuntary, to reduce viral spread, so typical cases of other respiratory infections are expected to be lower. Analyzing and verifying a decrease in other respiratory infections is left for a future post.

Monthly Deaths for 25-34 Year Olds

Another question is the extent to which Covid is causing deaths in the 25-34 year old cohort, which includes the most millennials. A plot of monthly deaths by natural cause for this cohort indicates increasing deaths throughout 2020, culminating in a peak 83% increase in monthly deaths due to natural causes with 3,008 in October 2020 versus 1,636 in October 2019. Note that the baseline death rate for 25-34 year olds remains relatively low. For comparison, the October monthly deaths due to natural causes for 65-74 year olds were 44,360 in 2019.

Monthly Deaths by Natural Cause in the United States for 25-34 Year Olds

Figure 3. Monthly deaths by natural cause in the United States for the 25-34 year old cohort show increasing rates through October 2020.

Many of these excess deaths in 2020 compared to 2019 among the 25-34 year old cohort are classified as “Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified R00-R99.” This may be due to the R99 classification being used for “pending Covid-19 testing” as described here: https://www.cdc.gov/nchs/data/nvss/coronavirus/Alert-2-New-ICD-code-introduced-for-COVID-19-deaths.pdf?fbclid=IwAR2XckyC93jfKqvOue5EdPlNA8LlKKgz4vPZTU1whI4vXLSOADSjsL9XY-M. The excess deaths for 25-34 year olds are following a pattern of exponential increase through October 2020.

Monthly Deaths (R00-R99) in the United States for 25-34 Year Olds

Figure 4. Monthly deaths (R00-R99) in the United States for the 25-34 year old cohort demonstrate an exponential increase throughout 2020 to date.
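One way to check whether a monthly series is actually growing exponentially, rather than judging by eye from a plot, is to fit a straight line to the logarithm of the counts. The sketch below is an assumption about how such a check could be done, not the method used for the figures here. The function name `fit_exponential` is hypothetical, and the input series is synthetic placeholder data, not CDC counts.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = a + b * t for t = 0, 1, 2, ...

    Returns (initial_level, growth_factor_per_step, r_squared), where
    r_squared is measured on the log scale.
    """
    if len(counts) < 3 or any(c <= 0 for c in counts):
        raise ValueError("need at least three positive counts")
    t = list(range(len(counts)))
    y = [math.log(c) for c in counts]
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((yi - (intercept + slope * ti)) ** 2 for ti, yi in zip(t, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else 1.0
    return math.exp(intercept), math.exp(slope), r_squared


# SYNTHETIC placeholder series (not CDC data): starts at 100, grows 20% per month.
synthetic = [100 * 1.2 ** month for month in range(10)]
level, factor, r2 = fit_exponential(synthetic)
print(f"initial level ~ {level:.1f}, monthly growth factor ~ {factor:.3f}, R^2 = {r2:.4f}")
```

An R² close to 1 on the log scale, together with a growth factor clearly above 1, would support describing a series as exponential. A series that is merely increasing tends to fit noticeably worse.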

Comparing the 25-34 year old cohort with older age cohorts in October 2020 shows that deaths by natural cause are approaching the same order of magnitude as those of older ages, possibly due to higher viral load. Deaths for the 25-34 year old cohort have reached 61% of deaths for the 55-64 year old cohort, with 1,372 for 25-34 year olds and 2,237 for 55-64 year olds.
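The figures for the 25-34 year old cohort can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in this article. Note that 1,372 equals the difference between the October 2020 and October 2019 counts for this cohort, which suggests the cohort comparison may be between excess deaths rather than total deaths. That reading is an inference from the arithmetic, not something stated in the source data.

```python
# Monthly deaths by natural cause for 25-34 year olds, as quoted in the article.
oct_2020 = 3_008
oct_2019 = 1_636

# Excess deaths and relative increase over the 2019 baseline.
excess = oct_2020 - oct_2019
increase_pct = excess / oct_2019 * 100.0

# Cohort comparison figures quoted in the article.
cohort_25_34 = 1_372
cohort_55_64 = 2_237
share_pct = cohort_25_34 / cohort_55_64 * 100.0

print(f"excess deaths, 25-34 year olds: {excess}")
print(f"increase over October 2019:     {increase_pct:.1f}%")
print(f"25-34 as share of 55-64:        {share_pct:.1f}%")
```

The increase works out to about 83.9%, which the article reports as 83%. The share works out to about 61%, matching the text.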

October 2020 Death Count by Natural Cause for Age Groups

We can assume that the exponential increase in excess deaths during the Covid pandemic among 25-34 year olds is due to changing behavior in this demographic in 2020. One cannot blame them without considering that this cohort is more likely than older cohorts to work in customer-facing service industry jobs. The cohort also has more need to socialize outside of a family bubble compared to older cohorts. However, this data shows that younger Millennials and older Gen-Z members are not immune to Covid-19 related mortality and in fact have increased their chances of death by 83% on average, and those in this generation with the least careful behavior must have increased their rate of death by natural cause even more. This serves as a warning not to take the virus too lightly despite their age.

Big Bend National Park

Five friends and I traveled to Big Bend National Park in April 2017. Big Bend has the least light pollution of any National Park, and on a clear night, visitors can see the Milky Way galaxy and over 2000 stars. We saw the Milky Way during our trip in the southern sky.

April is a good time to visit before the temperatures become hot. The temperature at the park headquarters averages about 84 deg F, but the temperatures where we stayed at the Rio Grande Village reach 5-10 degrees warmer, and we saw highs in the 90s during the day. Since most of our hiking was at Chisos Basin in the mountains, temperatures there were 10-20 degrees cooler. The dry desert environment can have 30 degree temperature swings between night and day in April.

Documenting the 2020 Pandemic

As we all experience larger changes to our world than many of us in the States experienced with the dot-com crash, 9/11, or the Great Recession, I am documenting what it is like to live during this time.

Friday, March 13, 2020

The past week has been stressful. The coronavirus COVID-19 has begun spreading in the US, and the stock market has begun its steepest descent since 2008. Washington and New York have over 300 cases, and Texas has 23 cases. Overall the US had 1,215 cases and 36 deaths, all in Washington. The Washington outbreak began in a nursing home.

Life is Still “Normal.” I am in a work training class with colleagues from Indonesia, Russia, England, and possibly other countries who flew to Houston for this class. All of us were in a normal classroom sitting a couple of feet apart and sharing the same table. All virus cases in Houston are being reported as due to international travelers returning home to Houston. We also have a teacher who had walking pneumonia for the past month with a dry cough, very similar to COVID-19 symptoms. This is not reassuring as he approaches our desk and talks two feet from us. Meanwhile, by Friday we are being told at work to maintain six feet from others. I feel like I am at higher risk based on all the information we have now, but it seems I would be overreacting to skip the class and return to my desk. We were encouraged this past week to ask our supervisor about working from home if we feel this is necessary given the virus, but the company has not provided official guidance on working from home. By the end of the day, I hear that class members are having to rebook their flights to return to their home countries amid tightening travel restrictions.

In the prior month, President Trump placed a ban on non-resident travelers from China on February 1 and quarantined US resident travelers from Wuhan. We have seen 80,000+ people infected and 3,000+ deaths in China, though we find later that these numbers are underreported. I read in the WSJ about Wuhan medical workers who wore hazmat suits all day and had family members trapped outside Wuhan due to a quarantine.

Church Services. By Thursday evening our church had given guidance that the Sunday service would be online, and other Sunday classes are cancelled. Meanwhile some classes at church still planned smaller weekly gatherings. Our church was ahead of government restrictions which would come later.

Monday, March 16, 2020

Working from Home. The past weekend has seemed like a whirlwind of increasing restrictions. On Friday, I was debating whether to request to work from home as some colleagues had done. By Sunday evening our workplace sent an email stating that only essential employees be asked to come into work this week. In the past two weeks I attended the IADC conference in Galveston and a training class with international travelers. I managed to complete these just before the restrictions would have stopped both.

The President also issued guidance called “15 Days to Slow the Spread.” The federal guidance generally was to stay home if you feel sick, are elderly, or are otherwise at increased risk. It also recommended working or attending school from home where possible unless you are in a critical industry as defined by DHS, avoiding social gatherings of more than 10 people, avoiding dining out, and practicing good hygiene.

Official gatherings at church are generally cancelled now since they usually involve more than 10 people.

Thursday, March 19, 2020

Schools and Restaurants. Texas Governor Greg Abbott issued the first public health disaster order in Texas since 1901. Schools will be closed, public gatherings are limited to 10 people or fewer, restaurants are limited to take-out orders only, and non-essential state employees are called to telework.

California Governor Newsom issued one of the strictest lockdown orders outside of China and Italy, which limits Californians to their homes except for exercise and essential needs and does not allow even gatherings of up to 10 people.

Tuesday, March 24, 2020

Stay at Home. Harris County Judge Hidalgo issued an order for residents to stay at home, similar to the California order and other orders issued since. The main restriction now is that people should not be interacting within six feet of others outside their household unless caring for a friend or family member. Groups of up to 10 people are no longer permitted.

Since I live alone, this restriction was particularly difficult, but we are allowed to go out for walks and visit parks with friends as long as we mind social distancing guidelines. The order also closed “non-essential” businesses like barbers, gun ranges, and church business besides preparing for services.

Government officials repeat that masks will not help prevent you from contracting the virus. I cannot believe they recommend not wearing masks in good faith. The virus is spread by respiratory droplets that exit the body through the mouth and nose. Some masks like surgical masks may not filter the virus well but will hinder the sick from spreading it and create some marginal barrier for inhaling the virus.

Sunday, March 29, 2020

Interstate Travel Restrictions. Texas Governor Abbott ordered drivers from Louisiana to self-quarantine for 14 days. He also expanded the self-quarantine for airline travelers from Miami, Atlanta, Detroit, Chicago, California, and Washington state. The Texas DPS is enforcing checks at airports and along highways.

Tuesday, March 31, 2020

I trimmed my hair for the first time, and it actually looks good. A friend in person and colleagues on videochats gave their approval. No barbers are open with the social distancing lockdowns.

April. The federal and state stay-at-home orders were extended through the end of April. Our company likewise extended its work-from-home orders through the end of April.

Saturday, April 4, 2020

Masks. The CDC has finally changed its position and now recommends that the general public wear non-medical cloth masks to hinder the spread of the virus. Many infected people, maybe up to 50%, do not show symptoms, and wearing a mask will reduce the spread from asymptomatic people. The President said it is voluntary and that he would not be wearing a mask.

United States. The US has by far the most cases worldwide now with 275,000 confirmed cases and 7,100 deaths, 1,100 today alone. Texas has 6,050 confirmed cases while New York is at 102,000 confirmed cases. Spain has the largest number of cases outside the US at 124,700.

Houston. The number of cases in Harris County continues to increase. I hear reports of hospitals having their COVID units already full and placing patients showing COVID symptoms in other units. Houston Methodist is seeing patients double every 3 to 4 days. The hospital currently has 116 patients testing positive, not all requiring the ICU, and can handle 450 ICU patients. Texas is still far behind other states with testing. I read that the 25-county SE Texas region has about 1,000 hospital cases. Since about 10-20% of those infected require hospitalization, we could have 5,000-10,000 actual cases in SE Texas alone. Masks and ventilators are already in short supply for hospital workers in New York. Large hospital systems in Houston are not reporting shortages, but smaller providers are.
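The back-of-the-envelope estimate of actual cases in this entry can be written out as a short sketch. It assumes, as the entry does, that 10-20% of infections lead to hospitalization and that about 1,000 people are hospitalized in the region. The function name `estimate_total_cases` is hypothetical.

```python
def estimate_total_cases(hospitalized: int, rate_low: float, rate_high: float):
    """Return (low, high) total infections implied by a hospitalization count.

    rate_low and rate_high bound the assumed share of infections that
    require hospitalization. A higher rate implies fewer total infections.
    """
    if not 0 < rate_low <= rate_high <= 1:
        raise ValueError("rates must satisfy 0 < rate_low <= rate_high <= 1")
    low = hospitalized / rate_high
    high = hospitalized / rate_low
    return round(low), round(high)


# About 1,000 hospital cases in SE Texas, with 10-20% of infections hospitalized.
low, high = estimate_total_cases(1_000, 0.10, 0.20)
print(f"Estimated actual cases in SE Texas: {low:,} to {high:,}")
```

The estimate is only as good as the assumed hospitalization rate, which was highly uncertain at the time given limited testing.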

Life Changes. Besides what I have already noted, a few other life changes are:

  • Family members and friends have experienced layoffs or furloughs.
  • 87-octane gas is priced at $1.50 in Spring, TX.
  • Stores have tape on the floor to encourage social distancing.
  • People stand in line at grocery stores, especially in the morning, in hopes of getting a pack of toilet paper or paper towels. This has improved somewhat, but at the peak I could not get most of my frozen vegetables even after waiting in line for 45 minutes before HEB opened.
  • Parks are closed in other states like SC, NC, CA, … but not in TX.
  • Wimbledon was cancelled for the first time since WW II. My tickets to the theater on March 20 and the Rodeo on March 22 were cancelled, and I am still waiting on a refund for the latter.
  • Weddings have been postponed, including a brother’s wedding. Funerals are having limited or no attendance permitted in some states.
  • Manufacturers have switched their production lines to making ventilators (auto companies) and sanitizer (distillers). ExxonMobil has ramped up its IPA production and is helping with a new mask design which uses less polypropylene.
  • We have daily press conferences by the President and by state and local authorities. In a recent mayoral press conference, Mayor Turner said the city has a right to take over a former hospital building that is for sale regardless of the owner’s preference.

Building a Deep Learning Machine — Part 4: Installing the Ubuntu 18.04 OS

The machine booted successfully using the chassis power button. The initial screen displays that the CPU and RAM are recognized. The image was taken when only the initial 16 GB RAM was recognized before moving the RAM card to another slot.

I decided to install the latest LTS (Long Term Support) desktop version of Ubuntu, 18.04.1 LTS. I chose Ubuntu over Windows 10 since some machine learning packages like OpenCV are easier to set up on Ubuntu, and some applications like Docker are built on Linux and are easier to install on Ubuntu.

I downloaded the 2 GB Ubuntu 18.04.1 ISO file and wrote it to a USB flash drive as a bootable image using Rufus. The machine booted off the USB flash drive without changing BIOS boot settings beforehand. I chose default Ubuntu installation options with login credentials required. Installation steps are described here in more detail.

The only installation issue I had was that I could not get past the Ubuntu login screen. After every attempt to log in by entering the password, Ubuntu would return to the login screen. No incorrect-password error appeared, since the password was correct. On the first installation I had selected the “Use LVM with the new Ubuntu installation” option. LVM stands for Logical Volume Management and allows the user to add, modify, resize, and snapshot partitions.

The infinite login loop issue was resolved by reinstalling Ubuntu without selecting the LVM option.

Ubuntu 18.04.1 desktop