Weekly snippets
Oct 6, 2019
- TL Voice
- Played around with CMUSphinx, specifically pocketsphinx with python wrapper, seeing some promising results
- (birthday weekend!)
Sep 29, 2019
- TL Voice – Researched options for phoneme-based voice recognition for simple dictionaries
- RL library:
- Struggled but finally succeeded in extending pytorch autograd (writing modules) for using a precomputed gradient in backward pass. This will help me reuse existing NN-model code for upcoming Policy-gradient implementation.
Sep 22, 2019
- Tagave and Data pipeline!
- RL library further refactoring:
- Made it much easier/cleaner now to use a common NN model across my projects, and agnostic to the inputs and outputs.
- This enabled me, in the Racecar project, to feed both state and action into the input of the NN rather than having the actions only represented at the output. I’m seeing much better performance here.
- More importantly, this clears my path for implementing Policy Gradient next.
Sep 15, 2019
- RL library refactoring:
- RL racecar:
- Making good progress on having the NN agent drive itself (unlike earlier where the QLookup was the “driver” and NN was merely an observing learner). This still uses only Value Function Approximation, and I soon plan to try the Policy Gradient approach.
- Added some more debug/analysis tools to library.
Sep 8, 2019
- RL racecar and RL coindrop:
- Extracting all common code to a new library I call ReaLLy in a new git repo.
- Got the new library fully functioning for both racecar and coindrop, but have not yet ‘transitioned’ the old codebases to use it.
Sep 1, 2019
- Cleaning up the setup of this blog on aws/bitnami, including theme, email, https support and dns settings.
- RL racecar:
- Rewatched policy-gradient RL video (David Silver)
- Refactoring old racecar code and merging it with my newer RL code, in order to implement/test Policy gradient algorithm with the simple racecar problem.
- Trying to get racecar to drive itself (along the rectangular track) using NN alone as the main driver. Still haven’t found the right hyperparams to reach finish line.
Aug 25, 2019
Aug 18, 2019
Aug 11, 2019
Aug 4, 2019
- Some aws / bitnami wordpress setup for this blog
Jul 29, 2019
Jul 22, 2019
Jul 15, 2019
Jul 9, 2019
- {travelling}
Jul 1, 2019
- deeplearning.ai course continued
- {travelling}
Jun 23, 2019
- Kaggle Expedia hotel recommendations dataset:
- Tried various RF configurations with one-vs-rest
- Found bug (leak) in my earlier xgboost experiment
- deeplearning.ai course on Improving Deep NNs: Hyperparameter tuning, regularization and optimization
Jun 16, 2019
- (Some work on the Kaggle dataset but not much.)
- RL Connect-4: Continued wrangling with the NN:
- Maybe the problem is linear output layer
- Works better without Batchnorm
- Perhaps this is the limit to “value approximation” RL and I should move on to policy gradient methods.
- Came up with a new idea for propagating intermediate “reward vectors” (through sarsa for example) instead of a single reward value. Brainstorming.
Jun 9, 2019
- RL Connect-4: Continued wrangling with the NN model
- Tried overfitting for just the end-of-game states
- Tried various NN architectural changes, including adding layers, changing sigmoid to tanh/relu, trying BatchNorm and Dropout
Jun 2, 2019
- Kaggle Expedia hotel recommendations dataset:
- Cleaned up data with OneHotEncoder and LabelBinarizer
- Memory overflow solved using sparse matrices instead of full matrices
- Got poor results with RandomForest approach
- Playing with XGboost.
- RL Connect-4
- Tried several more architecture configurations. Tried narrowing the network to naturally increase regularization.
- Tried deepening the network in order to first get some good overfitting. However, the NN doesn’t learn anything except an overall blind-guess. Tried with Tanh output. Tried with/without new episodes. I’m still trying to overcome this local minimum.
- Also plotting ratio of dead ReLUs to help debug.
- RL David Silver: Watching lecture on Model-based RL
May 26, 2019
- Kaggle Expedia hotel recommendations dataset:
- Exploring/visualizing the data, finding column correlations, dummifying categorical columns.
- RL Connect-4
- Debugging tool: Output activations from every layer for further analysis.
- Productivity: Tried fixing live-plot-stealing-focus issue but didn’t succeed.
May 19, 2019
- RL Connect-4
- Implemented a browser-based interface to play an interactive game against the trained model (using onnx.js). Turns out the model plays quite poorly (despite seemingly having trained well.)
- Wrote a faster lookahead agent to train against (using alpha-beta pruning)
- Better tooling for graphs and other stats to help understand the training progression and debug issues.
- Tried various ideas for the model but got no substantial improvements yet.
- Sarsa(λ) seems to work better than Q(λ)
- For the input to the NN, I “bound” / applied the actions to the states (i.e. I used the resulting board state) producing a single output value – rather than having the original state as input and 7 separate action-specific outputs. This seems to have improved things somewhat.
- Getting acquainted with Kaggle by walking through the Titanic dataset.
- Continued watching cs231n with its interesting insights into CNN implementation tips as well as RNNs.
May 12, 2019
- RL: Connect-4
- Digging deeper into why error costs could possibly drop (or jump) purely because of Q(λ) updating of the targets. I strongly suspected a bug but eventually realized this was perfectly normal, that there is a coordinate-descent effect between the RL algorithm and the FA algorithm.
- Improved FA, switched to sigmoid output. Improved feature engineering. Somewhat better results.
- Fixed the way validation set data was collected.
- Better tooling for debugging, storing episodes and model to file.
- Cleaned up code for github
- Finished up and posted blog post with my results so far.
- Continued watching cs231n course, because of useful CNN insights that could help me with Connect-4.
- Started setting up this blog on AWS.
May 5, 2019
- RL: Connect-4
- Various debugging to get things to run. Didn’t get positive results with Q algorithm.
- Implemented Q(λ). Still not great results. Simple 3-layer NN not good enough?
- Watched some videos about Convolutional NNs and plugged in here (pytorch).
- Training including “mirror image” and opponent’s episode history.
- Trained successfully against “random” player, then against 1-lookahead and 2-lookahead, with good progress.
- Various experiments trying to debug funny-looking error-cost graphs.
- Testing against validation set; attempted regularization of NN but got very poor results. Needs further investigation.
- Started writing up blog post.
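The “mirror image” training mentioned above relies on Connect-4 being left-right symmetric. A rough sketch of how one recorded sample could be mirrored, assuming a board stored as a list of rows and a move stored as a column index. The encoding (1, -1, 0) and the name `mirror_sample` are hypothetical, not taken from the project code.

```python
def mirror_sample(board, column):
    """Return the left-right mirror of a (board, move) training sample."""
    width = len(board[0])
    mirrored_board = [list(reversed(row)) for row in board]
    return mirrored_board, width - 1 - column

# Assumed encoding: 6 rows x 7 columns, 1 = me, -1 = opponent, 0 = empty.
board = [[0] * 7 for _ in range(6)]
board[5][0] = 1    # bottom row, leftmost column
board[5][1] = -1
mirrored, move = mirror_sample(board, column=2)
```

Each recorded position then yields two training samples with the same target value, which doubles the data at no extra game-play cost.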
Apr 28, 2019
- RL: Connect-4 with Neural Networks
- Coded up connect-4 game mechanics
- Refitted racecar RL code to allow game-play between n players and learning from experience. (I had to turn the code inside-out to be episode/game-centered rather than agent-centered)
- Streamlined code to be better suited to collection of historical episodes for retraining.
Apr 21, 2019
- Kaggle Careercon (online talks about data science career).
Apr 14, 2019
- {Return from Italy vacation}
Apr 7, 2019
- {Italy vacation: Art and pasta and Grand Canal}
Mar 31, 2019
- {Italy vacation: explored Ancient Rome and Christian Rome, and took a train to Renaissance Florence}
Mar 24, 2019
- Attempted big refactor to remove duplication of code across Q/Sarsa and lambda/non-lambda algorithms, but stuck on figuring out how to refactor eligibility trace code (such that it would work across all types of function approximators.)
- {Italy vacation: landed in Ancient Rome!}
Mar 17, 2019
- Started writing up blogpost for Polynomial RL work I’d done earlier.
- Re-ran experiments with better params for circle and rectangle track. With polynomial of degree 3, I got improved results for the circle track but hardly enough to go around. Multi-polynomial of course did better.
- Fixed polynomial to use pytorch. Reduced GPU memory usage by 50%.
- Looked into rewiring the code to (a) allow real-valued/continuous states, and (b) parallelize the episode-runs preferably on GPU… but the implementation will take some time, so shelving it for now.
- Continuing David Silver’s RL videos again where I left off.
Mar 10, 2019
- Published blog post for Q(λ) and Sarsa(λ) / Q-lookup RL experiments of October.
- Performed lots of renewed experiments and hyperparam searches in order to flesh out the blog post.
- Refactored hyperparam / “CV” code to collect CV results into a database that can easily be read back for plotting graphs. Improved the “scoring” to reward both optimality and speed of algorithm (this may still need tweaking).
- Refactored the trainer setup code so as to eliminate duplication of code across racetracks, and to make future experiments easier.
- RL RaceCar and Neural Networks
- After some amount of debugging, this immediately started working well! Specifically, training a pytorch NN on the accumulated “training set” (generated earlier from a different RL approach) worked very well and the “NN car” drove around the circular track. (My “single output backprop” hack seems to be working.)
- Live-training of a “mimic” NN model also seemed to go well.
- Live-training of an independent observer/“student” NN driver also did well enough, although it’s unclear how much of the learning has generalized well enough. Needs further investigation before moving on to a truly independent NN driver.
- Ran into memory-leak issues, which I tried debugging using various memory profiling options (but ultimately just resorted to commenting out individual pieces of code). The satisfying solution lay in simply turning off pytorch’s autograd during the episode-runs.
- Figured out how to plot graphs “live” during the training — so now I can see if a run is clearly doing poorly and fix things immediately.
- I’m finally getting a good grip on git! (great!) I continue to use a feature-based branching flow but I also intend to keep branches as checkpoints for older blogposts.
Mar 3, 2019
- RL RaceCar:
- Verified that the prerecorded dataset was fairly consistent in target values. MultiPoly on Rectangle seems to do okay on this dataset but not great.
- Read up on pytorch’s NN library and coded up a function approximator on top of this. (Trickier than Poly FA because now I have multiple outputs from the NN but each training datapoint trains only one of those outputs.) Nearly ready to test this out.
- Dealt with some git issues, recovered lost branch using git reflog and checkout.
- Preparing long-pending blogpost for RaceCar part 2.
Feb 24, 2019
- CUDA/PyTorch:
- Finally installed CUDA and PyTorch in order to use my GTX 1080 GPU.
- RL RaceCar:
- Training rectangular track: Not much progress. Also dealt with memory problems in loading full dataset.
- Switched from numpy to PyTorch (GPU) and saw 6x speedup in training.
Feb 17, 2019
- RL RaceCar:
- Added various debugging plots to help improve the results for “mimic” and “student” MultiPoly racecar but the results were only getting worse.
- Eventually, tried the fully-bootstrapped Q-learning (using MultiPoly) for circular track and got this to work better!
- Rectangular racetrack: This was harder but by adding some feature-engineering, started getting decent results with mimic for pre-recorded dataset.
- Code refactoring and git check-in
Feb 10, 2019
- RL RaceCar:
- Implemented “MultiPoly” FA, which is 9 separate polynomials, each modeling one of the 9 possible actions.
- This performed notably better than regular polynomial FA. The MultiPoly FA was able to satisfactorily model the data from the fully-trained QTable FA.
- Got this to work even for “live” learning in independent Student mode.
- Tweaked #epochs, #reexplorations and learning rate to find a good balance so that the model covered all types of episodes without forgetting old ones.
Feb 3, 2019
- RL RaceCar:
- Discovered that all the polynomial FAs that I trained so far were not actually modeling the data well.
- Tried engineering various features, including Splines, but these barely helped.
- Tried tweaking number of epochs, and number of reexplorations.
- Tried training on data from a fully-trained QTable FA, but got poor results.
Jan 27, 2019
Jan 20, 2019
- RL RaceCar:
- Attempted polynomial linear regression alone (bootstrapping without a guide) for the circular track. However, this performs terribly. In fact it gets worse every epoch. Perhaps this has to do with uneven distribution of training data.
Jan 14, 2019
- Tagave project (Tool to help learn language):
- Researched online for the latest voice recognition developments, open-source tools and pre-trained models.
- Played around with Mozilla’s DeepSpeech pre-trained model but this is probably not what I’m looking for.
Jan 6, 2019
- RL RaceCar:
- Refactoring to fully decouple “student” learning from “guide” function approximator.
Dec 30, 2018
Dec 23, 2018
Dec 16, 2018
- (Bought a desktop computer with a GTX 1080 GPU, set up dual-boot with Ubuntu.)
Dec 9, 2018
Dec 2, 2018
Nov 25, 2018
- (vacation)
Nov 18, 2018
- (vacation)
Nov 11, 2018
- RL RaceCar:
- Trained polynomial function approximator (FA) on off-policy episodes for circular track
Nov 4, 2018
- RL RaceCar:
- Got Q(λ) working for table lookup.
- Implemented Sarsa and Sarsa(λ) for table lookup.
- Started working on function approximators (polynomial regression)
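For reference, a minimal sketch of a table-lookup Sarsa(λ) update with accumulating eligibility traces (the backward view). This is a generic textbook form, not the racecar code itself. The function name, the hyperparameter values and the three-step toy episode are all assumptions for illustration.

```python
from collections import defaultdict

def sarsa_lambda_step(Q, E, s, a, r, s2, a2, done,
                      alpha=0.5, gamma=1.0, lam=0.9):
    """Apply one backward-view Sarsa(lambda) update to the table Q in place."""
    target = r if done else r + gamma * Q[(s2, a2)]
    delta = target - Q[(s, a)]          # TD error for the visited pair
    E[(s, a)] += 1.0                    # accumulating trace
    for key in list(E):
        Q[key] += alpha * delta * E[key]  # credit all recently visited pairs
        E[key] *= gamma * lam             # decay every trace

Q = defaultdict(float)   # action-value table, keyed by (state, action)
E = defaultdict(float)   # eligibility traces, reset at the start of each episode

# Hypothetical episode: states 0 -> 1 -> 2 -> terminal, reward 1 only at the end.
episode = [(0, "go", 0.0, 1, "go", False),
           (1, "go", 0.0, 2, "go", False),
           (2, "go", 1.0, None, None, True)]
for s, a, r, s2, a2, done in episode:
    sarsa_lambda_step(Q, E, s, a, r, s2, a2, done)
```

After this single episode all three visited pairs have moved toward the final reward, with earlier pairs discounted by their decayed traces. Plain one-step Sarsa would only have updated the last pair.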
Oct 28, 2018
- Back to RaceCar Reinforcement Learning:
- Implemented Q-lambda for the same race car problem. Only a modest improvement over Q-learning. Still trying to figure out the ideal hyperparameters.
Oct 21, 2018
Oct 14, 2018
- Refactored NN code to have similar structure as CS231n’s assignment code
Oct 7, 2018
- (CS231n Assignment 2 (in progress) – Layer Normalization)
- vacation!
Sep 30, 2018
- (CS231n Assignment 2 (in progress) – Fully-connected NN, ReLu, Batch Normalization)
Sep 23, 2018
- Got RaceCar project (part 1) satisfactorily working:
Sep 16, 2018
- Made lots of improvements on the RaceCar project
- Got it to run successfully on the first circular track (It learns to drive the whole loop).
- Played around with hyper-parameters to get it to run faster.
- Lots of refactoring to make it extensible. Uploaded to github.
- Created a more flexible way to define new racetracks. Got it to work on a rectangular track, but still struggling to get it to drive a longer convoluted track.
Sep 9, 2018
- Posted blogpost on the Ads dataset categorization with Neural networks
- Reinforcement learning: Started RaceCar project – a relatively simple RL agent that uses table-lookup and q-learning.
- Learned a lot about matplotlib for drawing images and animations
- Devised a lot of visual tools for displaying statistics on and debugging RL runs.
Sep 2, 2018
- Restarted working on datasets. Tweaked NN and other models for Ads dataset, halfway through writing up blogpost.
- (Restarted CS231n online course where I left off – lecture 5 on Neural Networks)
Aug 26, 2018
- (Week 15 and 16 of Hinton’s NN course: PCA and Auto-encoders)
- Full revision of all lectures.
- Completed Hinton’s NN course, with 96.6% final grade.
- (David Silver’s RL course: Lecture 6 on Function Approximation with linear or neural network models, and Batch methods)
Aug 19, 2018
- (Week 14 of Hinton’s NN course: Stacked RBMs and pre-training deep nets)
- (David Silver’s RL course):
- Lectures 4 and 5: Model-free prediction and control (MC as well as TD/Sarsa algorithms), and off-policy learning (Q-learning)
- Assignment: Finding optimal policy for “Easy21” blackjack game.
Aug 12, 2018
- (Week 13 of Hinton’s NN course: Belief nets and Wake-sleep algorithm)
- Project: Restricted Boltzmann Machines with Contrastive Divergence.
- (David Silver’s Reinforcement Learning course: Watched first 3 lecture videos):
- Markov chains, MRPs, and Markov Decision Processes (MDP)
- MDP-based Planning: Policy evaluation and optimization for a known MDP
- Model-free value estimation: Monte Carlo and Temporal Difference Learning
Aug 5, 2018
- (Week 12 of Hinton’s NN course: Restricted Boltzmann machines)
Jul 29, 2018
- (Week 11 of Hinton’s NN course: Hopfield networks and Boltzmann machines)
- (CS231n lecture 5: Neural networks – various optimizations, batch normalization)
Jul 22, 2018
- (Week 10 of Hinton’s NN course: Combining models, MCMC for Bayesian, Dropout)
- (CS231n assignment 1 – 2-layer NN, feature-extraction, Dropout.)
Jul 15, 2018
- (Week 9 of Hinton’s NN course: Generalization techniques.)
- Project: Backpropagation for 2-layer logistic/softmax NN in Octave
- (CS231n assignment 1 continued.. Debugged softmax code, got it working.)
Jul 8, 2018
- (Week 8 of Hinton’s NN course: Recurrent NNs – multiplicative, echo-state, LSTMs)
- (CS231n assignment 1 – kNN, linear SVM, SoftMax .. to be continued)
Jul 1, 2018
- (Week 7 of Hinton’s NN course: Recurrent NNs)
- (Watched the first 3 lectures of Andrej Karpathy’s CS231n course)
Jun 24, 2018
- (Week 6 of Hinton’s NN course: Momentum, rmsprop, other optimizations)
Jun 17, 2018
- (Week 5 of Hinton’s NN course: Convolution networks and max pooling.)
Jun 10, 2018
- (Week 4 of Hinton’s NN course: Word prediction and softmax.)
Jun 3, 2018
- Successful run of neural network on the Ads dataset, after lots of parameter tweaking/debugging.
- (Week 3 of Hinton’s NN course)
May 27, 2018
- Continued python implementation of a neural network, preparing to run it on Ads dataset.
- (Week 2 of Hinton’s NN course)
May 20, 2018
- Started a python implementation of a neural network (based on pointers from Prof Ng’s coursera course)
- Started Hinton’s Neural Networks coursera course. Duration: 14 weeks.
May 13, 2018
May 6, 2018
Apr 29, 2018
- Fixed some environment issues with matplotlib.
- Got decent results on Ads dataset with SVM Linear but kernelized SVM didn’t work – it required too much memory.
Apr 22, 2018
- Filled missing values for Ads dataset, and got decent results with Logistic regression.
Apr 15, 2018
Apr 8, 2018
- Started working with Internet Ads dataset but stumbling on dealing with missing values as well as the huge feature space (and sparsity of data).
- Getting familiar with Pandas for handling datasets.
- Re-watching video lectures on Neural Networks, which I plan to implement soon.
Apr 1, 2018
- Set up remote repository and uploaded codebase on github.
- Further Abalone dataset analysis, this time running cross-validation for one-vs-one multiclassification with kernelized RBF SVM.
- Final test-run of codebase on Abalone to make sure everything’s still working after all the refactoring. Unfortunately, an overnight run produced a couple of bad results that I was unable to reproduce. Shelving it for now.
Mar 25, 2018
- Diagnosed and fixed several more subtle bugs that were only discovered in overnight cross-validation runs of SVM, mostly having to do with numerical precision issues in calculating the bias w0.
- Published blog post on implementing the SVM
- Obtained final validation and test results (and updated blogposts) for WDBC dataset and Abalone dataset.
- Lots of code cleanup, refactoring; improved debugging using python’s logging library.
Mar 18, 2018
- Got SVM working for multi-classification (Abalone dataset) including kernelized SVM.
- Also implemented one-vs-one multi-classifier, to replace “linear” multi-classifier.
- Implemented and ran Cross-validation for wdbc and Abalone datasets, having to fix a couple of bugs that were hard to nail down. Still dealing with a bug for Abalone.
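A small sketch of how a one-vs-one multi-classifier can combine pairwise binary classifiers by majority vote. The class names, the threshold lambdas and the function name `one_vs_one_predict` are made up for illustration. In practice each pairwise entry would be a trained binary SVM.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, pairwise):
    """Predict by majority vote over all pairwise binary classifiers.

    pairwise[(a, b)] is a callable that returns either a or b for input x.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise[(a, b)](x)] += 1
    # Ties are broken by whichever class received its first vote earliest.
    return votes.most_common(1)[0][0]

# Hypothetical 1-D toy problem with three classes split by thresholds.
classes = ["young", "adult", "old"]
pairwise = {
    ("young", "adult"): lambda x: "young" if x < 8 else "adult",
    ("young", "old"):   lambda x: "young" if x < 10 else "old",
    ("adult", "old"):   lambda x: "adult" if x < 12 else "old",
}
```

With k classes this needs k(k-1)/2 binary classifiers, but each one trains on only two classes' worth of data.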
Mar 11, 2018
- Re-watched old coursera videos on SVMs and researched SVM solver algorithms; realized it was never meant to be solved easily with gradient descent. Decided to proceed with implementing a Quadratic Programming based SVM.
- Studied a bit of QP, worked out the math for applying CVXOPT QP library to the SVM problem; and implemented this.
- Studied more closely the dual-primal SVM problem and worked out the math for calculating w0 in the case of soft SVM (and kernelized SVM); implemented this.
- Successful runs on the wdbc dataset (as compared to sklearn) but still working on multi-classification SVM for Abalone dataset.
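For reference, the standard textbook expression for the soft-margin offset w0, which may differ in detail from what was worked out here. It follows from the KKT conditions: a support vector whose multiplier lies strictly inside the box constraints sits exactly on the margin, so y_i (Σ_j α_j y_j K(x_j, x_i) + w0) = 1. The symbols α (dual variables), C (box constraint), K (kernel) and M (set of margin support vectors) are notation assumed for this note.

```latex
w_0 = \frac{1}{|M|} \sum_{i \in M} \Big( y_i - \sum_{j} \alpha_j \, y_j \, K(x_j, x_i) \Big),
\qquad M = \{\, i : 0 < \alpha_i < C \,\}
```

Averaging over all of M, rather than picking a single margin support vector, is a common way to reduce sensitivity to numerical error in the solver's output.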
Mar 3, 2018
- Finished extending Logistic Regression and Perceptron to multiple classes, and analyzed Abalone dataset.
- Wrote blog post report for the Abalone dataset including all algorithms tried so far.
- Implemented k-Nearest Neighbors algorithm and tested both wdbc and Abalone datasets and updated the blog posts.
- Added utility code to automatically save graphs as image files. Also worked on improving logging various algorithm measurements but shelved this for now.
- Further setup of local git environment.
- Started implementing SVM, figured out the math and then struggled with its gradient descent algorithm, only to eventually realize that it’s not as simple as applying Linear Algebra. I need to read up on Quadratic Programming.
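For the k-Nearest Neighbors item above, a minimal sketch of the classification rule, assuming Euclidean distance and a simple majority vote. The toy data and the name `knn_predict` are illustrative and not from the project code. `math.dist` needs Python 3.8 or later.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label). Majority vote among the k nearest points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical two-cluster toy data.
train = [((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"),
         ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b")]
label_near_origin = knn_predict(train, (0.5, 0.5))
label_far = knn_predict(train, (5.5, 5.5))
```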
Feb 25, 2018
- Implemented Logistic Regression (gradient descent) and ran on wdbc dataset, successfully found the separating hyperplane.
- Plotted graph of solution likelihood over iterations, for training as well as test.
- Extended Logistic Regression as well as Perceptron to multiple classes (for abalone dataset), still working on this.
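A compact sketch of binary logistic regression trained by batch gradient descent, recording the log-likelihood at every iteration so it can be plotted. It uses a single input feature and a hypothetical four-point dataset to stay short. The learning rate and iteration count are arbitrary choices, not the values used on wdbc.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Sum of log-probabilities the model assigns to the true labels."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_logistic(xs, ys, lr=0.5, iters=200):
    """Batch gradient descent on the mean negative log-likelihood (1-D input)."""
    w, b = 0.0, 0.0
    history = []
    n = len(xs)
    for _ in range(iters):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
        history.append(log_likelihood(w, b, xs, ys))
    return w, b, history

# Hypothetical linearly separable data: negatives left of 0, positives right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]
w, b, history = fit_logistic(xs, ys)
```

Plotting `history` against the iteration number gives the kind of likelihood-over-iterations curve described above. Running the same function on a held-out set would give the test curve.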
Feb 18, 2018
- Extended the Naive Bayes classifier to support multiple classes. Abalone dataset model produced a 58.4% accuracy, almost as good as Full Bayes.
- Set up Git locally
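A rough sketch of a multi-class Gaussian Naive Bayes classifier, to illustrate the idea of fitting one independent Gaussian per feature per class. The function names, the tiny variance floor and the three-cluster toy data are assumptions for this example. The real Abalone model and its preprocessing are not shown.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate a prior plus per-feature mean and variance for every class."""
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small floor (assumed) keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, features):
    """Pick the class with the highest log-posterior (up to a constant)."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, var in zip(features, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical data: three well-separated clusters, three samples each.
X = [(1.0, 1.1), (1.2, 0.9), (0.8, 1.0),
     (5.0, 5.2), (5.1, 4.9), (4.9, 5.0),
     (9.0, 1.0), (9.2, 1.1), (8.9, 0.8)]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
model = fit_gaussian_nb(X, y)
```

Supporting more classes only means storing one more (prior, means, variances) entry, which is why the extension from two classes is fairly direct.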
Feb 11, 2018
- Extended Bayes Plug-in (Gaussian) classifier to support multiple classes.
- Started working on Abalone dataset. Applied Bayes Gaussian classifier but this gave only around a 60% accuracy for 3 classes.
Feb 4, 2018
- Fixed convergence issues with Perceptron and optimized performance
- Created several useful plots in an attempt to understand fluctuations and spikes in batch perceptron
- Added normalization and bias component
- Added dampening of learning rate
- Vectorized code to get 10x speed improvement
- Read up theory on perceptron convergence behavior
- Updated and published lengthy blog post about the first dataset and the three classifiers (Bayes plug-in, Naive Bayes and Perceptron) implemented so far.
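To illustrate two of the perceptron changes listed above (a bias component and a dampened learning rate), here is a minimal online perceptron sketch. The decay schedule `lr0 / (1 + decay * epoch)`, the default values and the toy data are assumptions. The version described in this log may have used a batch update and a different schedule.

```python
def train_perceptron(samples, epochs=50, lr0=1.0, decay=0.1):
    """samples: list of (features, label) with label in {-1, +1}."""
    n = len(samples[0][0])
    w = [0.0] * (n + 1)                     # w[0] is the bias weight
    for epoch in range(epochs):
        lr = lr0 / (1.0 + decay * epoch)    # dampened learning rate (assumed schedule)
        mistakes = 0
        for x, y in samples:
            xb = [1.0] + list(x)            # prepend the constant bias input
            score = sum(wi * xi for wi, xi in zip(w, xb))
            if y * score <= 0:              # misclassified or on the boundary
                w = [wi + lr * y * xi for wi, xi in zip(w, xb)]
                mistakes += 1
        if mistakes == 0:
            break                           # converged: every sample is correct
    return w

def predict(w, x):
    score = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if score > 0 else -1

# Hypothetical linearly separable toy data.
data = [((2.0, 1.0), 1), ((3.0, 2.5), 1), ((-1.0, -0.5), -1), ((-2.0, -3.0), -1)]
w = train_perceptron(data)
```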
Jan 28, 2018
- Explored dataset (continuing on last week) by plotting various charts, such as histogram of distributions, pdf distributions. Investigated why the distributions of the two classes looked so different.
- Wrote code to plot an ROC curve (using an optimized greedy approach) for the Bayesian plug-in classifier and attempted to find an ideal threshold value for low false-negative results.
- Coded up a Naive Bayes algorithm and got results almost as good as the Plug-in classifier.
- Coded up a Perceptron (stochastic as well as non-stochastic), and found linear-separability (perfect classification) under some circumstances.
- (Learning Python): Refactored code into classes and multiple files. Edited to conform to Google’s Python style guide.
- Started writing up a report on this dataset for this blog but it’s not yet complete.
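A sketch of one common way to compute ROC points efficiently: sort the classifier scores once, then sweep the threshold downward while updating the true-positive and false-positive counts incrementally. This is a standard sort-and-sweep approach and may not match the “optimized greedy approach” mentioned above. The function names and toy scores are illustrative.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, true positive rate) pairs.

    labels are 1 (positive) or 0 (negative); a higher score means more positive.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only when the next score differs (ties share a threshold).
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
pts = roc_points(scores, labels)
area = auc(pts)
```

To favor low false negatives, one would scan `pts` for the first point whose true positive rate reaches the required level and use the corresponding score as the threshold.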
Jan 21, 2018
- Downloaded my first data set from UCI ML repository – Breast cancer classification – and wrote a python/numpy program for a simple Bayesian plug-in (Gaussian) classifier model, which gave 96% accuracy on the test set. Looked closer at the data column (feature) variances to understand the model.
- Started (this) personal blog and wrote my first post, with advice on the CSMM online course in Machine Learning that I completed last month.