Machine Learning Solutions: Where to Start

Shayan Fazeli
6 min read · Dec 18, 2019

Recent mesmerizing advancements in machine learning have drawn widespread attention to using AI to solve all kinds of problems. From aerospace to retail, data science and deep learning can play essential roles in optimizing strategies and maximizing positive outcomes.

In this post, I go over my personal approach to building a deep learning solution for a problem, whether it is an industrial or an academic one.

Step 1: Do We Know Enough?

Whether a human expert could use the information we are feeding to the machine to make the accurate inferences we expect from the machine is a sufficient, but not necessary, condition for believing that the machine can solve the problem.

Please consider the following problem:

“Use this picture of a flower to predict breast cancer in the Middle Eastern patient cohort.”

Clearly, this does not make any sense, since the information you need is almost certainly not in that picture. Beyond this trivial example, however, it is not always obvious whether the data we have actually contains the information that is vital to the desired task.

Note that before jumping into deep learning and mathematics, one should ask: what is the relationship we are trying to find here? For example, if the problem is detecting brain hemorrhage or heartbeat arrhythmia, we are trying to connect a semantic label to a physical entity; in this case, a disease to a human. What we actually have is an observation of this entity. An image, for instance, is a 3D tensor whose values indicate sampled light intensity for a specific color channel. This representation is in no way a full and thorough description of the entity itself: you do not look at the images in your phone gallery to predict depth or infer object temperatures, and whether such information is even recoverable from them is a separate question.
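
To make the “observation” point concrete, here is a minimal sketch (using PIL and torchvision, with a hypothetical file name) of how an image arrives as nothing more than a 3D tensor of sampled intensities:

```python
# Minimal sketch: an image as a 3D tensor of sampled color intensities.
# "flower.jpg" is a hypothetical placeholder file.
from PIL import Image
from torchvision import transforms

image = Image.open("flower.jpg").convert("RGB")
tensor = transforms.ToTensor()(image)   # shape: (channels, height, width)

print(tensor.shape)                      # e.g. torch.Size([3, 224, 224])
print(tensor.min(), tensor.max())        # intensities scaled into [0, 1] per channel
```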

Even though everything remains largely conjecture for problems like these, a straightforward workaround is to check whether a human expert can use these observations, and these observations only, to make accurate inferences consistent with the ground truth you have in mind. This is a huge step toward knowing that the information is actually there. Let’s look at some examples in this regard:

  • Humans can recognize objects from images (a funny example is the story of Andrej Karpathy training himself on ImageNet and measuring his own accuracy).
  • Human experts can learn to play games such as Atari (compare this with the advances in modern reinforcement learning).
  • Humans can swiftly perform domain adaptation in most areas (this is also an inspiration for many meta-learning approaches).

Step 2: Choosing the Proper Framework

As programmers, we are all inclined to skip the design step and jump straight to the “fun” implementation phase. However, before starting to implement, we need to know precisely what we intend to build. This way we can choose the framework, project structure, and so on in a way that best serves our design decisions.

The next step is choosing the proper tools for this carpentry. If the project is mainly data science-oriented (meaning the work is mostly about mastering the dataset and uncovering internal relationships), languages such as R and Python are natural choices; I personally go with Python, as I find it more robust than R. I have prepared many utilities for myself, two of which are my libraries plauthor and dataflame (full documentation and easy pip installation are provided for both). Together with Pandas, PyTorch, Plotnine (for ggplot and the grammar of graphics in general), SciPy, NumPy, and Matplotlib, these are my essentials in such situations. For computer vision applications, utilities such as the kornia library and OpenCV are some of the things I have found useful. For implementing a machine learning pipeline, I usually start with PyTorch (it is more popular in research and prototyping, whereas TensorFlow is very strong in industry); however, I use the older TensorBoardX or the recent PyTorch built-in TensorBoard support to take advantage of TensorFlow’s advanced logging mechanisms.
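
As a rough illustration of that logging setup, here is a minimal sketch of a PyTorch training loop writing scalars to TensorBoard through the built-in torch.utils.tensorboard support; the model, data, and log directory are placeholders, not anything from a real project.

```python
# Minimal sketch: PyTorch training loop with TensorBoard logging.
# The model, batches, and log directory are stand-ins for illustration.
import torch
from torch import nn, optim
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="runs/example")      # hypothetical log directory
model = nn.Linear(10, 2)                            # stand-in model
optimizer = optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

for step in range(100):
    x = torch.randn(32, 10)                         # stand-in batch
    y = torch.randint(0, 2, (32,))
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    writer.add_scalar("train/loss", loss.item(), step)

writer.close()
```

The curves then show up in TensorBoard by pointing it at the log directory (tensorboard --logdir runs).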

There are many more useful utilities as well. For example, one might find KNIME an easy way to validate basic machine learning ideas on small datasets (I call it the Simulink of machine learning). Or, if a web application is what is requested, the Data-Driven Documents library (D3.js) gives users an effective way of interacting with more complicated visualizations, from word clouds to 3D graphs.

For the time being, assume the problem is to recognize a set of images and assign classes to them. After thinking through the above, much as in software engineering practices such as Agile, the design step is finished. For example, I may have concluded that a certain family of CNN architectures is a strong candidate for this problem. Now it is time for implementation.
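
For concreteness, here is a minimal sketch of such a CNN classifier in PyTorch. It assumes 3-channel 32×32 inputs and an arbitrary number of classes; it is an illustrative architecture, not the one used in my repository.

```python
# Minimal sketch: a small CNN image classifier (illustrative architecture).
import torch
from torch import nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 32x32 -> 16x16
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # 16x16 -> 8x8
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

logits = SmallCNN()(torch.randn(4, 3, 32, 32))          # -> shape (4, 10)
```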

Step 3: Project Structure

In this step, I mainly focus on the specific needs of the project at hand and on what it shares with general machine learning pipelines. The goal is to organize my files and libraries so that technical debt is minimized and further development in the desired direction stays as easy as possible.

In this case, I’d start with making the following folders:

- configurations
- models
- optimization
- trainers
- utilities
- tests

In this case, I usually write the main configuration files as JSON and store them in the configurations folder. As time goes by, the number of parameters tends to grow, and these files make it easier to store the parameters and feed them to the system in a consistent manner. Note that this also plays well with AutoML approaches, since such files can be combined with different configuration-reading mechanisms.
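
As a small sketch of this pattern, a configuration file and a loader might look like the following; the file name and keys are illustrative, not the ones from my repository.

```python
# Minimal sketch of the JSON-configuration pattern.
#
# configurations/baseline.json (illustrative contents):
# {
#   "model": {"name": "small_cnn", "num_classes": 10},
#   "optimization": {"lr": 0.001, "batch_size": 32},
#   "trainer": {"epochs": 20}
# }
import json

def load_configuration(path: str) -> dict:
    """Read a JSON configuration file and return it as a dictionary."""
    with open(path, "r") as handle:
        return json.load(handle)

config = load_configuration("configurations/baseline.json")
learning_rate = config["optimization"]["lr"]
```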

Note that the configurations and optimization folders are not used in my GitHub sample, since the launchers there do not involve many parameters or custom loss functions.

In the models folder, I store the main model designs, and the rest of the folders are likewise consistent with the libraries I keep in them. You can think of it as an instance of layered design, in which one tries to encapsulate the components of the project as much as possible.
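
To illustrate the layering, here is a minimal trainer skeleton of the kind that could live in the trainers folder: it depends only on the interfaces of the model, optimizer, criterion, and data loader handed to it, so each layer can be developed and tested separately. All names are illustrative.

```python
# Minimal sketch of a trainer that only touches the interfaces of its parts,
# keeping the model, optimization, and data layers encapsulated.
from torch import nn

class Trainer:
    def __init__(self, model: nn.Module, optimizer, criterion, loader):
        self.model = model
        self.optimizer = optimizer
        self.criterion = criterion
        self.loader = loader

    def train_one_epoch(self) -> float:
        """Run a single epoch and return the mean training loss."""
        self.model.train()
        total, count = 0.0, 0
        for inputs, targets in self.loader:
            self.optimizer.zero_grad()
            loss = self.criterion(self.model(inputs), targets)
            loss.backward()
            self.optimizer.step()
            total += loss.item()
            count += 1
        return total / max(count, 1)
```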

Step 4: Implementation

It hardly seems sufficient to describe “implementation” as merely one step in this process, since it involves extensive testing, debugging, and so on. Nevertheless, I assume the reader is already familiar with programming and deep learning, so I will not go over many of the details here. I have implemented this mock project and made it available in this Github repository. The code is fully commented, and you can find the documentation there as well.

Step 5: Testing

I remember my software engineering professor always used to say: “Programmers are experts in automating everything but themselves.” Testing is not only important, it should be automated as much as possible, so that developers (especially when multiple groups work on a giant project) can be confident that their changes neither break any “unit” of the design nor degrade the quality of the output, which is the result of an integrated effort.

Refer to the tests folder to see how we can write and automate tests in Python.
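
As a minimal sketch of such an automated test (pytest style), the following checks that a model produces outputs of the expected shape; the import path and the SmallCNN model come from the earlier illustrative sketch, not necessarily from the repository.

```python
# Minimal sketch: a pytest-style unit test checking the model's output shape.
import torch

from models.small_cnn import SmallCNN   # hypothetical module path


def test_small_cnn_output_shape():
    model = SmallCNN(num_classes=10)
    batch = torch.randn(4, 3, 32, 32)
    logits = model(batch)
    assert logits.shape == (4, 10)
```

Running pytest from the project root then discovers and executes everything in the tests folder.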

Step 6: Packaging and Documentation

The final step is to package everything, provide web-based documentation (for example, see my packages mentioned above), and write the article about them. Web-based documentation is my favorite, since the interconnections between the concepts explained there make it easy for users to follow the instructions (much like using an advanced IDE to automate our coding).
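
For packaging, a minimal setuptools sketch along the following lines makes the project pip-installable; the package name, version, and dependencies are placeholders.

```python
# Minimal sketch of a setup.py for pip-installable packaging.
from setuptools import setup, find_packages

setup(
    name="my-ml-project",                 # hypothetical package name
    version="0.1.0",
    packages=find_packages(exclude=["tests"]),
    install_requires=["torch", "numpy", "pandas"],
)
```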

I would have liked to include more references in this post (especially for my claims at the beginning about several inspirations in machine learning); however, the concepts are fairly general, so please refer to deeplearningbook.org and recent articles to validate them.

Thank you.
