Things I wish I knew before I started computer vision

Enos Jeba
Published in Towards AI
6 min read, May 21, 2023

Python will not save you!

Computer vision is where you see the world with your eyes, explain it to a computer, and make it look at the world the same way you do.

Computer vision is a fascinating and diverse field of research that has applications in many domains. It is the science of making computers understand and interpret visual information, such as images, videos, and scenes. Computer vision is special because it combines aspects of different disciplines, such as mathematics, physics, psychology, and engineering.

Some of the subfields of computer vision are:

Image processing: the manipulation and enhancement of images, such as filtering, compression, segmentation, and restoration.

Object recognition: the identification and classification of objects in images or videos, such as faces, cars, animals, or text.

Scene understanding: the analysis and interpretation of complex scenes, such as indoor or outdoor environments, events, or activities.

Computer graphics: the creation and rendering of synthetic images or animations, such as 3D models, virtual reality, or augmented reality.

Machine learning: the use of algorithms and data to learn patterns and make predictions or decisions, such as deep learning, neural networks, or reinforcement learning.

Computer vision is a challenging and exciting field that requires creativity and innovation. It has many potential benefits for society, such as improving security, health care, education, entertainment, and more. Computer vision is also a rapidly evolving field that offers new opportunities and challenges for researchers and practitioners.

Exactly where computer vision scratches the itch (diagram below) is explained by Michigan Online in one of their classes. Watch here:

Lecture 1: Introduction to Deep Learning for Computer Vision — YouTube

Entering such a field was thrilling. It was like thinking from numerous perspectives at once, like going on multiple water rides at the same time.

A little about myself: I have a bachelor’s degree in computer science, so I have some knowledge of coding and how it works, even though I never practiced coding during my undergraduate years because I was too involved with other creative endeavors. After finishing my bachelor’s degree, I pursued a master’s degree in data science to discover what was new in the field. I had a strong desire to be unique, and this was a new subject at the time, with just a few individuals enrolled. That piqued my interest, so I chose it.

I was exposed to Computer Vision in the last semester, and part of me believed that was the professional route for me. So, after some time in the field, here are some things I wish I had known.

Python may not save you.

This applies only to certain use cases. Python was all the rage when I first started out, and because I wasn’t much into coding, Python offered a simple way to get started. It was easy to grasp, so I stopped exploring other languages and concentrated solely on Python. Again, what I am about to say is limited to a few use cases.

After a while of getting my hands dirty building models and training neural networks on various datasets, it was time for deployment. I was really looking forward to this stage. I like deployments for some reason; it’s like offering your light to the rest of the world to use. I was mistaken about how simple I assumed this would be compared to the other stages. You need to account for a number of real-world conditions to ensure that your solution does not stall and keeps working properly.

The icing on the cake is that computer vision deployment happens mostly on edge devices. Edge devices are small computers with a limited amount of GPU memory. Because of their compact form factor, they can be placed practically anywhere, which is why they are widely used across industries. Because these devices are compact, their processing units and memory are small as well. So, assuming you successfully trained on your RTX or Quadro GPU, you must now optimize your solution to work on the edge device.
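To get a feel for why a model that trained comfortably on a desktop GPU may not fit on an edge device, here is a minimal back-of-the-envelope sketch. The parameter count and the memory budget are made-up numbers for illustration, and the helper names are hypothetical. It only counts the weights; activations and runtime overhead would add to the total.

```python
# Rough estimate of how much memory a model's weights need at different precisions.
# All numbers here are illustrative assumptions, not measurements.

BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_mb(num_params: int, dtype: str) -> float:
    """Return the approximate size of the weights in megabytes (1 MB = 1024 * 1024 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 * 1024)


def fits_on_device(num_params: int, dtype: str, budget_mb: float) -> bool:
    """Check whether the weights alone fit inside a (hypothetical) memory budget."""
    return weight_memory_mb(num_params, dtype) <= budget_mb


if __name__ == "__main__":
    params = 25_000_000   # hypothetical model with 25 million parameters
    budget_mb = 64.0      # hypothetical memory budget on the edge device
    for dtype in BYTES_PER_PARAM:
        size = weight_memory_mb(params, dtype)
        print(f"{dtype}: {size:.1f} MB, fits: {fits_on_device(params, dtype, budget_mb)}")
```

With these assumed numbers, the float32 weights exceed the budget while the float16 and int8 versions fit, which is one reason lower-precision formats are a common optimization step before edge deployment.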

Python runs well on these devices; nevertheless, C or C++ gives you additional optimization options. Nvidia’s CUDA libraries, as well as other key libraries like OpenCV and TensorFlow, are written largely in C and C++. Working in C or C++ gives you direct control over memory and lets you make full use of multi-threading and other optimization techniques. Python, in my opinion, is still useful for building and training models owing to its interpreted nature, but when it comes to deployment, C or C++ offers more.
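The gap between interpreted and compiled code can be seen without leaving Python. The rough sketch below binarizes a synthetic grayscale frame twice: once with a per-pixel Python loop, and once with `bytes.translate`, whose per-byte work runs in compiled C inside CPython. The frame size, the cutoff, and the function names are arbitrary choices for illustration, and the timings will vary by machine, so treat this as a demonstration of the idea and not as a benchmark.

```python
import time


def threshold_loop(pixels: bytes, cutoff: int) -> bytes:
    """Binarize an 8-bit grayscale buffer one pixel at a time in interpreted Python."""
    return bytes(255 if p >= cutoff else 0 for p in pixels)


def threshold_table(pixels: bytes, cutoff: int) -> bytes:
    """Same result, but the per-pixel work happens inside bytes.translate (compiled C in CPython)."""
    table = bytes(255 if v >= cutoff else 0 for v in range(256))
    return pixels.translate(table)


if __name__ == "__main__":
    # Synthetic 640x480 grayscale "frame"; the size and cutoff are arbitrary.
    frame = bytes(i % 256 for i in range(640 * 480))

    start = time.perf_counter()
    slow = threshold_loop(frame, 128)
    loop_s = time.perf_counter() - start

    start = time.perf_counter()
    fast = threshold_table(frame, 128)
    table_s = time.perf_counter() - start

    assert slow == fast  # both approaches must produce the same image
    print(f"python loop: {loop_s * 1000:.2f} ms, bytes.translate: {table_s * 1000:.2f} ms")
```

Writing the hot path directly in C or C++ takes the same idea further, since you also control memory layout and threading yourself.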

An idea about the business or industry domain would help you.

This is for the “I want to save the world” part of you. The world can never have too many problem solvers. If you can look at a problem with your eyes and come up with a computer vision algorithm to solve it, you are a hero. You can sell the solution or make it available to the real world, making a huge difference in people’s lives. Many industries are also stuck with their existing methodologies, and they would surely like your help with computer vision.

For example, if you want to integrate machine learning into the healthcare industry, here’s how you would do it!

Counting people, examining traffic, AR filters, face recognition: every algorithm solves a real-world problem. Computer vision is a somewhat complex field; many real-world problems are yet to be solved, and most of them would benefit greatly from the addition of computer vision. If an idea like that has just crossed your mind, I would recommend that you pursue it.

No one knows the whole thing.

It is a field that cuts across many other fields. As a result, no one can ever fully comprehend how everything works. Everyone has a basic understanding of how the world works and can use that knowledge to address the problem at hand, while someone in another part of the world is doing the same thing with a different problem.

Whatever problem we choose, there is always something new to learn, which brings me to my next point.

It may seem like forever learning

At every stage of the project, there is so much to learn. You may be thinking and learning about the many parameters involved in the real world, simulating a possible blocker for your solution and then finding a patch for it, and then learning how exactly things work in the real world so you can modify your solution to yield a better result. It’s learning, learning, learning.

You need a PhD

I’m not sure about this. I am not currently in a position to answer it, but I will certainly update it here, or this may be a chance for you to drop a comment and inspire others.
I believe that coding and deep learning skills can get you started in your career, but I have seen many job postings that require a PhD, so I included this point.

At the end

If you enjoy thinking about several topics and combining knowledge from other disciplines to find a solution, this may be the field for you.

It’s exciting. It’s awesome. Be prepared to learn more.

If you are inspired and want to get started with computer vision, here are a bunch of tools you should keep.
