A digital image is fundamentally an array of numbers (in Python, a NumPy array) in which each pixel holds numerical values representing intensity or color. Grayscale images have one channel, with values ranging from 0 (black) to 255 (white) for 8-bit images. Color images have three channels (RGB) for the red, green, and blue components, and scientific images can have many channels for different fluorescence markers or spectral bands.
Deep Dive
Tutorial 1: Images as Data: Pixels, Channels, and Formats
Hello everyone. Welcome back to my channel, DigitalSreeni, for those of you who have already subscribed. If you haven't subscribed yet, this would be a great time to do so. I started my channel around 2019, pre-pandemic, after visiting quite a few institutes, universities, and research centers where I met a lot of life science, materials science, and geoscience researchers and found one common theme.
Obviously, most of you are amazing researchers. You understand your domain very well, but I see a bit of a weakness when it comes to understanding images, image analysis, and how to process them, and especially when it comes to coding.
So, I started the channel with a few videos on how to get started with Python. If you really want to get started with Python, I recommend watching those. Now, the reason I'm doing this comprehensive series of about five tutorials in 2026 is that coding has become so much easier nowadays with ChatGPT and similar tools.
So, it's easy for you to learn coding, but the fundamentals of images (what images are, how to process them, how to handle them) still require a bit of knowledge. You have to learn this.
Of course, you can learn this from ChatGPT as well, but hopefully you find the five tutorials in this series a bit more intuitive and more useful.
I am going to walk you through the basics of images and image handling in the first part, and then we will progressively go through generating features (and what features are), then machine learning, and then convolutional neural networks. That's basically the plan. This is tutorial one, and all of the code is going to be in one long Google Colab notebook.
So, think of this as your reference, the mother of all image analysis tutorials using Python, if you want to call it that.
Okay, I have been talking for at least 2 minutes, so let me go ahead and jump into Google Colab. Everything is a code walkthrough, and while we are walking through it I'll explain the core concepts of image analysis. That's how I love to learn new concepts, and I hope some of you like to learn that way too.
Let me jump in.
Welcome to Google Colab. For those of you who are new to coding and have never tried Google Colab: if you have a Google account, this is pretty straightforward. Go ahead and search for Colab, and you'll see that we are working with notebooks.
If you want to work with this code in your local Python environment, please feel free. There you'll have to install the libraries you need into that local environment.
This is one long notebook, and as you can see we divided it into five different tutorials. This one gets you started with the setup and discusses what an image is, what the different image formats are, and what we mean by channels, like RGB (red, green, blue), multi-channel, and so on. It doesn't go very deep, but it brings everyone onto the same page in terms of understanding what images are. Then comes image processing, filters, and so on. That's our plan, and I don't want to repeat what I already mentioned.
If you are managing a lab, or if you have new people trying to get into image processing, please do share this video along with this notebook, because for the most part this can be a self-guided learning experience for newcomers.
Okay, given that, chapter zero, the first one, is the environment setup and the dataset that we're going to work with. We install and import all the libraries needed. Google Colab already comes with those, so you don't waste time doing that, although if you like, you can work offline, locally.
And we're going to work with a dogs versus cats dataset. Yes, I'm talking about scientific image analysis and now I'm going back to dogs versus cats. But if you understand dogs versus cats from a classification point of view, and the methodical way we are going (extracting features, machine learning, artificial neural networks, and convolutional neural networks), you should be able to extend this to any imaging you are working with, whether it is life sciences or geosciences.
Okay, since we're working on Google Colab, it makes sense to connect this notebook to our Google Drive, because I'm going to save the dogs versus cats dataset to my Google Drive so I don't have to download it every time I do my tutorial.
I recommend the same for you, so go ahead and do that. But before that, I just want to show you the runtime. When I go to Change runtime, I am going to use CPU only right now, but when we get to tutorial five I'm going to switch to GPU, because that definitely helps speed up our neural network training.
Okay, so let's go ahead and connect to our Google Drive.
There are a couple of steps right here.
Just go ahead and say continue.
And we should be connected. On the left-hand side, when you expand the side panel, you should see the drive show up. Let me refresh this. There you go. The drive is right here, so we're all set.
Now, this is where I would typically install any new packages onto Google Colab, but all the ones we're going to use, like OpenCV, scikit-image, and scikit-learn, are already part of Colab, so there's no point in doing that. In case you wonder, I'm recording this video sometime in April 2026 and working on Python 3.12, but the concepts we're going to cover should be timeless, hopefully, including VGG16 and the convolutional neural network concepts.
The next step is importing all the libraries that we need. I don't want to make this a general Python tutorial, but if you're new and this looks intimidating, we're just importing code that someone else already wrote. NumPy is our workhorse.
That's the one that handles your images as structured arrays of numbers, and those numbers are nothing but pixel values. For an RGB image, for example, you have three such arrays. If it's a 256 by 256 image, you have one two-dimensional 256 by 256 array representing red, one for green, and one for blue.
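To make the "three 2D arrays" picture concrete, here is a minimal sketch. It assumes a synthetic random image in place of a real photo, and the variable names are illustrative, not taken from the notebook.

```python
import numpy as np

# Hypothetical 256 x 256 RGB image filled with random 8-bit values.
rng = np.random.default_rng(seed=0)
rgb = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

# Each channel is an ordinary 2D array with the same height and width.
red = rgb[:, :, 0]
green = rgb[:, :, 1]
blue = rgb[:, :, 2]

print(rgb.shape)  # (256, 256, 3)
print(red.shape)  # (256, 256)

# Stacking the three 2D arrays along a new last axis rebuilds the image.
rebuilt = np.stack([red, green, blue], axis=-1)
```

Slicing like this returns views into the original array, so no pixel data is copied until you modify or stack the channels.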
Matplotlib is used for plotting, and so is Seaborn. cv2 is the OpenCV package. We use that for image input and output, that is, reading and saving images.
Pillow is another library that also allows you to read and write images, and scikit-image is another.
There are many libraries that allow you to read, write, and work with images. These are the common ones, and there are also some proprietary ones that we'll talk about later when we get there.
Now, if you ask me which one is the right one, there is no right or wrong. I use scikit-image for the most part, primarily because I also work with TIFF files. TIFF files can be three-dimensional images, and scikit-image already includes the tifffile package, so I don't have to worry about it. Pillow, for sure, is primarily designed to work with JPEGs, PNGs, and so on, and OpenCV is similar. So, I almost exclusively use scikit-image, in case you wonder.
Okay, all this is just the style settings to keep our plots consistent. It's not necessary, but I try to do that.
Okay, now that we have imported the libraries, let's go ahead and download the dataset. There are two ways. One is to use TensorFlow Datasets: I import that package and download the data from it. Or you can click on this link, manually download the data, upload it to your Google Drive, and access it from there.
If you do that, please place a subset of those images in a directory called train. Create a directory called train, and within it train/cats and train/dogs, and put your cat and dog images there, because the rest of the code assumes that structure. Otherwise you have to change the code. That's the only point.
The actual dataset has about 25,000 images, and in this case I am only going to work with about 1,100 to 1,200 images per class, that is, for cats and for dogs, because we don't need any more than that. Why slow my tutorials by loading all the data if we're not going to use it?
Okay, so this is where my data gets stored, on my Google Drive. By the way, how did I get this path? If you're new to Google Colab, once you have your drive mounted, go to My Drive, then Colab Notebooks.
Here, for example, right click on data and copy the path, and that's exactly what the path is. Within that path, I want a subdirectory called dogs versus cats, and that's where all the data is saved. I have a couple of functions here. One checks whether the data is already downloaded at the path I provided, so I don't download it again. If the data is not there, we download it from TensorFlow Datasets, which I mentioned earlier. So, let's go ahead and run this.
It should tell us that it's skipping the download, because we already have the images.
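The "skip the download if the data is already there" check can be sketched as follows. This is not the notebook's actual function: `dataset_ready`, the `.jpg` extension, and the temporary directory standing in for the Google Drive path are all assumptions made for illustration. Only the train/cats and train/dogs layout comes from the tutorial.

```python
import tempfile
from pathlib import Path


def dataset_ready(data_dir: Path, min_images: int = 1) -> bool:
    """Return True if train/cats and train/dogs each hold enough .jpg files."""
    for label in ("cats", "dogs"):
        class_dir = data_dir / "train" / label
        if not class_dir.is_dir():
            return False
        if len(list(class_dir.glob("*.jpg"))) < min_images:
            return False
    return True


# A temporary directory stands in for the real Drive path (assumption).
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp) / "dogs_vs_cats"
    before = dataset_ready(data_dir)  # nothing has been created yet

    for label in ("cats", "dogs"):
        class_dir = data_dir / "train" / label
        class_dir.mkdir(parents=True)
        (class_dir / f"{label}_000.jpg").touch()  # empty placeholder file

    after = dataset_ready(data_dir)

print("Skipping download" if before else "Download needed")
print("Skipping download" if after else "Download needed")
```

In a real notebook the "Download needed" branch would call whatever download routine you use; the point is only that the check runs before any network access.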
Okay. Now, let's go ahead and use the OpenCV library to read the images. These are all color images, which means they have RGB channels. To use OpenCV, you import cv2.
Most of these libraries have a function called imread, short for image read, to read the images. All you need to do is give it the path. One thing I should mention when it comes to OpenCV: typically you think of images as red, green, blue. If you have color images, you have channels in RGB order.
OpenCV, cv2 here, reads images as BGR.
There are some historical reasons for that, but they use BGR. So, immediately after reading the image, I take that NumPy array and convert BGR to RGB, which just rearranges the channels. That's it. And since these images come in different sizes (one image can be 1024 by whatever, another can be 256, another something else), I am going to resize all the images to a specific size, because that is required if you want to do, let's say, convolutional neural network based training. So, that's what that function does, and all I'm doing is getting a few samples and plotting them so we can see what these images look like. Always, always use print statements, or plots if it's images, so you know that you're going in the right direction. The rest of this is just plotting code and formatting for the plots. For those of you coming from a non-Python background, when I say plot, I mean displaying the images: plotting the pixels so we can see them.
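Because the BGR to RGB conversion only rearranges channels, it can be illustrated with NumPy alone. The sketch below assumes a tiny synthetic array in place of the output of `cv2.imread`; with OpenCV available, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` is the usual call for the same reordering.

```python
import numpy as np

# Hypothetical 2 x 2 image in BGR order, as OpenCV's imread would return it.
# Every pixel is pure blue: B=255, G=0, R=0.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[:, :, 0] = 255

# Reversing the last axis swaps the first and third channels (BGR -> RGB).
rgb = bgr[:, :, ::-1]

print(bgr[0, 0])  # [255   0   0]
print(rgb[0, 0])  # [  0   0 255]
```

If the conversion is skipped, a plotting library that expects RGB will show blue objects as red and vice versa, which is an easy mistake to catch by displaying a sample image.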
So, here is dogs versus cats. We can see the cats in the top row and the dogs in the bottom row, and of course they look like cats and dogs. These are some of the images from our dataset. Now, going back to the basics, since I intend these tutorials to also work for someone who is starting completely from scratch: what you're going to learn now is that a digital image is basically a numerical array, as I already mentioned.
Let's get down to that. An image is just a grid of numbers. At every dot in the image, and the dot is called a pixel, you have some information, a number. Typically, the number ranges from 0 to 255 if it's an 8-bit image.
What is 2 to the power of 8? 256. That means the pixel value can be anywhere from 0 to 255, which is 256 numbers in total, 0 being black and 255 being white. So, that's a pixel. Pixel is short for picture element.
For a grayscale image, each pixel has just one number. That's it.
Like I mentioned earlier, it's just the brightness: 0 is black, 255 is white. For a color image, and I'm repeating myself here, each pixel holds three numbers because you have three different channels, one per color channel: red, green, and blue.
So, when you look at the image shape, if it's a grayscale image, meaning there's no color, you only have height and width associated with that image, along with the pixel intensity information. The shape of that image is only height and width. If it's a color or RGB image, the dimensions are height and width, and in addition you have three channels: red, green, and blue.
Each of these channels, red, green, or blue, has the same height and the same width. If it's a multi-channel scientific image, which happens in life sciences with microscopy images or in remote sensing, where you have multiple bands of information, you are not restricted to only three channels like RGB. You can have an arbitrary number of channels.
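The three shape cases described above can be sketched with empty arrays. The sizes and the choice of 8 channels stored as 16-bit values are assumptions for illustration only.

```python
import numpy as np

height, width = 256, 256

# Grayscale: one number per pixel, so the shape is (height, width).
gray = np.zeros((height, width), dtype=np.uint8)

# RGB: three numbers per pixel, so the shape is (height, width, 3).
rgb = np.zeros((height, width, 3), dtype=np.uint8)

# Multi-channel: for example 8 fluorescence channels (hypothetical count).
multi = np.zeros((height, width, 8), dtype=np.uint16)

for name, img in [("gray", gray), ("rgb", rgb), ("multi", multi)]:
    print(name, img.shape, "dimensions:", img.ndim)
```

Checking `img.ndim` and the size of the last axis is a quick way to tell which of these cases a loaded image falls into.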
It's quite common in microscopy to have eight, nine, ten, or even more channels. If you're doing high-content analysis, you apply multiple dyes to a tissue sample and use different laser wavelengths to image it, and each of those can be captured as a separate channel. You then have very rich information that you can start using for your analysis. That's a long explanation for image shape. As already mentioned, the most common data type is called uint8, which stands for unsigned 8-bit integer. Unsigned means there are no negative numbers: the numbers go from zero upward. And 8-bit means 2 to the power of eight possible values, so the values range from zero, black, to 255, white.
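A short sketch of what uint8 means in practice. The wraparound demonstration goes slightly beyond what the tutorial says, and is included as a common pitfall rather than as part of the original notebook.

```python
import numpy as np

# Range of an unsigned 8-bit integer.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255
print(2 ** 8)              # 256

# uint8 array arithmetic wraps around instead of exceeding 255.
a = np.array([250], dtype=np.uint8)
wrapped = a + np.uint8(10)  # 260 does not fit, so it wraps to 4

# One way to avoid that: widen the type, add, clip, then convert back.
safe = np.clip(a.astype(np.int16) + 10, 0, 255).astype(np.uint8)

print(wrapped[0], safe[0])  # 4 255
```

This is one reason brightness adjustments are often done in a wider integer type or in float before converting back to uint8.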
Float images, like float32 or float64, are by convention normalized to the range zero to one.
So, you can have float values and not just unsigned integers, and those values are something like 0.1568.
They're primarily used for numerical work. Let's say you apply multiple types of filters to your images: some of those filters convert your image to float32 or float64. You need to be aware of what's happening so you can convert back to unsigned 8-bit integers if it comes down to it. Okay, I feel like I'm going into a bit more depth than I intended, but hopefully it's not boring and, at the same time, not too difficult for newcomers to this field.
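The uint8 to float round trip mentioned above can be sketched with plain NumPy. This is a minimal illustration under the assumption that the float image uses the zero-to-one convention; scikit-image also provides `img_as_float` and `img_as_ubyte` for the same purpose.

```python
import numpy as np

img_u8 = np.array([[0, 40, 128, 255]], dtype=np.uint8)

# uint8 -> float32 in the range [0, 1].
img_f = img_u8.astype(np.float32) / 255.0
print(img_f)

# ... filtering would normally happen here on the float image ...

# float -> uint8: scale back, round, and clip in case a filter overshot.
back = np.clip(np.rint(img_f * 255.0), 0, 255).astype(np.uint8)
print(back)
```

Note that a pixel value of 40 becomes roughly 0.1569 after scaling, which is the kind of number you see when inspecting a float image.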
Okay, so now we load one image to inspect its properties. Again, I'm using OpenCV, cv2.imread, and loading one image. In fact, I'm loading the first image: look at the path right there. I'm looking at the 0th image, which is the first image, opening it, and immediately converting BGR to RGB. All I'm doing here is printing the shape. Once you have the image as a NumPy array, you can use .shape, which gives you height, width, and channels. I also print the dtype (the data type), the minimum pixel value, the maximum pixel value, the mean, and so on.
So, here is the file name, cats00.jpeg, and it's 256 by 256 by 3. Why is it 256 by 256? We resized the images when we read them. So that's height, width, and channels. This is a color image, and it is unsigned 8-bit integer, which is the most common type anyway.
The minimum value is zero, so apparently we have black pixels. The maximum value could go up to 255, but in this case it's 252.
The mean usually doesn't make much of a difference. I usually don't look at it, but if you care about it, it's 84.1.
Total pixels: 65,536. Where does that come from? It's what you get when you multiply 256 by 256.
And total values is 65,536 multiplied by three, because we have three channels (red, green, and blue), which gives 196,608 numbers in total. That's the full information about the image.
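The property inspection above can be reproduced on any NumPy image. The sketch below assumes a random synthetic image instead of the cat photo, so the minimum, maximum, and mean will differ from the values quoted in the video.

```python
import numpy as np

# Synthetic stand-in for the loaded 256 x 256 RGB photo (assumption).
rng = np.random.default_rng(seed=42)
img = rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

height, width, channels = img.shape
total_pixels = height * width  # 256 * 256
total_values = img.size        # pixels times channels

print("Shape:       ", img.shape)
print("Dtype:       ", img.dtype)
print("Min / max:   ", img.min(), "/", img.max())
print("Mean:        ", round(float(img.mean()), 1))
print("Total pixels:", total_pixels)  # 65536
print("Total values:", total_values)  # 196608
```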
Now, let's go ahead and zoom into the image and start to understand these pixels.
That's exactly what I'm trying to do.
We are reading an image at size 400 by 400 so we can see it nicely. Then I'm picking a spot somewhere around the middle of the image, a 10 by 10 block of pixels, and displaying that. Let's go ahead and run this. Instead of me walking through the code, let's look at the result. If you don't know what the code is doing, paste it into any large language model and ask it to explain. Here is the original image with a little box that I put right there, and these are the 10 by 10 pixels I'm zooming into. You see how you're going from a light brownish shade to dark brown.
If you look at the actual pixel values, these are for the red channel, by the way. This is a brown region, which means it's a mixture of red, green, and blue.
If you only look at the red, the value at the top left pixel is 123.
As the brown gets darker, the red contribution goes down, so the value at the bottom right pixel is 71.
I just want to make sure you look at this, and hopefully it now makes more sense what's happening at every pixel. Remember, this is only for red. If you plot green and blue, probably the blue contribution is going up here.
I'm pretty bad with what colors you get when you mix what colors, so let's just look at these numbers.
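Zooming into a 10 by 10 block is just array slicing. The sketch below assumes a synthetic 400 by 400 image whose red channel is a left-to-right gradient, so the printed numbers will not match the 123 and 71 seen in the video.

```python
import numpy as np

size = 400

# Synthetic image: red channel ramps from 0 (left) to 255 (right).
ramp = np.linspace(0, 255, size).astype(np.uint8)
img = np.zeros((size, size, 3), dtype=np.uint8)
img[:, :, 0] = ramp  # the 1D ramp is broadcast to every row

# Pick a 10 x 10 window near the center and keep only the red channel.
r0 = size // 2 - 5
c0 = size // 2 - 5
patch_red = img[r0:r0 + 10, c0:c0 + 10, 0]

print(patch_red.shape)  # (10, 10)
print(patch_red)
```

Printing the patch shows the raw numbers behind what the eye sees as a smooth change in shade.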
It's very common to plot histograms. I'm zoomed in, so it's tough to see everything at once, but let me show you this. You see how the pixel values run from zero all the way to about 250. In our case, we know the maximum is 252, and the most it could be is 255.
This is how the pixel values are distributed.
This is the histogram of the grayscale version of the whole image, meaning I converted the entire image to grayscale. Otherwise, you can plot this for red, green, and blue separately. You can see most of the pixels are centered around, let's say, 90 to 100, with a few pixels up here and a few down here. That tells us there are some black pixels in this image and some bright pixels, but overall it's somewhere in the middle.
Going back to the image, that makes sense. There is some white and some black, but most of the pixels fall in between.
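A histogram is only a count of how many pixels fall at each intensity. The sketch below assumes a random synthetic image and a simple unweighted channel average for the grayscale conversion; image libraries typically use a weighted luminance formula instead, so this is an approximation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
rgb = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Simplified grayscale: plain average of the three channels (assumption).
gray = rgb.mean(axis=2).astype(np.uint8)

# One bin per possible 8-bit value: 0, 1, ..., 255.
counts, edges = np.histogram(gray, bins=256, range=(0, 256))

print("Number of bins:", counts.size)
print("Most common intensity:", int(counts.argmax()))
print("Pixels counted:", int(counts.sum()))
```

Passing `counts` to a bar plot gives the kind of histogram shown in the video; doing the same for `rgb[:, :, 0]`, `rgb[:, :, 1]`, and `rgb[:, :, 2]` gives per-channel histograms.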
Okay. Now, what do different intensity values actually look like? I'm plotting this just so you can see.
A value of zero is black, a value of 255 is white, and other values fall somewhere in between as shades of gray. It's pretty straightforward, but it gives you a nice idea of what these pixel values actually mean.
Remember, when I say a pixel value of 64 on a color image for the red channel, that is how much red is contributing. Think of each of the red, green, and blue channels as a grayscale image.
Red is grayscale, green is grayscale, blue is grayscale. We are artificially mixing colors to make the image interpretable by humans, because that's how our eyes work. Otherwise, red means nothing in terms of our digital images.
Now, direct pixel manipulation. What I'm doing here is picking three little boxes on the image we already saw and overwriting the values there with values I provide. You can see this is the original image, and here I put a white box, a black box, and a red box.
For the red box, I'm overwriting whatever values we had with 255, 0, 0, which is red. Black is all zeros, and white is all 255s. So, I'm putting these three boxes onto the image. Why? So you can see how images can be manipulated, and this is the first step we are taking into manipulating them. This is literally manipulation of images. I'm not even processing.
In scientific imaging, you don't manipulate an image; you process the image to extract the information it actually contains. Please note that.
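Overwriting boxes of pixels, as described above, is plain slice assignment. The sketch assumes a uniform mid-gray canvas in place of the photo, and the box positions are arbitrary.

```python
import numpy as np

# Mid-gray canvas standing in for the original image (assumption).
img = np.full((100, 100, 3), 128, dtype=np.uint8)

# Each assignment overwrites a 20 x 20 block with one RGB triple.
img[10:30, 10:30] = [255, 255, 255]  # white box
img[40:60, 40:60] = [0, 0, 0]        # black box
img[70:90, 70:90] = [255, 0, 0]      # red box (RGB channel order)

print(img[20, 20], img[50, 50], img[80, 80])
```

The same assignment on an array in BGR order would draw a blue box instead of a red one, which is another reason to convert channel order right after reading with OpenCV.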
Now, the key takeaways. I'll let you read these. By the way, each tutorial has multiple chapters, and I don't want to break my videos into very small ones that cover one chapter at a time.
Please feel free to watch these at your own pace. Okay, now, let's look at what the common formats are, like JPEG, PNG, and TIFF, what the scientific formats are, and how they affect quality and downstream analysis. This is very important if you're working on scientific images, and by scientific I mean engineering, life sciences, or whatever, and you're using them for quantitative analysis.
Anytime you work with those, never, ever work with JPEG. PNG is acceptable, TIFF is preferable, and with the proprietary formats there's not much you can do other than work with them.
On compression: I say don't work with JPEG images for this work because they use lossy compression.
Whether pixel data is preserved matters, and bit depth is also important; we already know what that means. Metadata is embedded in some of these formats, and the channels are, of course, also very important. So, if I have to compare these: JPEG is typically used for photos. Even when I take photos on my camera, I never save them as JPEG until I have processed them. I always save them in raw format, meaning the literal pixel information with some embedded metadata. JPEG has some metadata included.
For example, when you take pictures with your iPhone and it saves them as JPEG, you have EXIF information that tells you the location. That's what I mean by metadata.
PNG uses lossless compression.
Lossy compression, as you can imagine, compresses the information in your images to the point where humans cannot tell the difference, but machines can. If you're saving your images to share on Instagram, who cares if the compression is slightly lossy? That's when JPEGs are okay.
When you're training a machine learning model, you should try to get data that's lossless. That's why most people work with PNGs when training machine learning models.
There is only minimal metadata that you can embed into a PNG file. If you really care about metadata, TIFF is very useful. TIFF typically uses lossless compression or no compression at all. You can work with 8-, 16-, or even 32-bit images in TIFF, and you can work with multi-dimensional images, like 3D images.
You can also have a whole bunch of TIFF tags. That's why TIFF is commonly used for scientific images. BMP is not used much anymore. BMP, again, has no compression.
Microsoft Windows brought that in a while ago. I don't use BMP anymore; in fact, I never really used it.
DICOM is another file format that you'll run into if you're working with medical images. In the metadata, you can embed the patient data. DICOM comes in many variants; it's a container in general. It holds the pixel information, but the metadata differs depending on the imaging modality, whether you're using CT, MRI, or something else.
CZI is something I work with on a daily basis because I work for Zeiss, and Zeiss microscopes store information in CZI format with very rich metadata: all the experimental parameters, when the image was taken, and much more. I put Zeiss fluorescence microscopy here, but it doesn't matter.
Some of the Zeiss electron microscopes and, in fact, all of the bright field microscopes use CZI, one common format. Other vendors, like Nikon and Olympus, all have their own proprietary formats.
Why do they have proprietary formats? Why can't they use TIFF? Because proprietary formats make the microscopes more efficient at reading, writing, and storing data. There is a reason why companies use their own formats. OME-TIFF comes from the microscopy community as a way to standardize image formats, and I think there is a newer one, OME-Zarr. You can convert your CZI image into OME-TIFF, for example, and work with that image in your image analysis pipelines. HDF5 is another format, a hierarchical one. It suits pyramidal data, where, like Google Maps, you load a higher resolution image as you keep zooming in. There are other formats too. As for NIfTI, I did work with it in the past, but not on a daily basis. It's the .nii file format, for when you're working with MRI or fMRI data. Okay, that's a lot of information. Hopefully you're working with one of these image formats.
I already mentioned the scientific and medical formats, so there's no point in walking you through these. Go ahead and read this; I added a bit more context on what each file type is. One thing I should mention again is why this is important. You can ask, hey Zeiss, why are you not using the TIFF file format? Why do you need your own format? Well, think of an experiment where you have a slightly thick tissue sample and you're collecting 3D data. A TIFF file can handle 3D data, so you have X, Y, and Z. What if your 3D data is multi-channel, not just one channel? Now you need width, height, and depth, all three, and on top of that the channel information.
Now, what if it is a time series? You're observing the sample over time, so now you have a time dimension. And what if you have multiple scenes: one tissue in one location, another tissue in another location, and so on?
What if you have all of those in one image?
It gets very complicated, and proprietary formats like CZI are designed to handle that.
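To see how quickly the dimensions pile up, here is a sketch of such a dataset as a single NumPy array. The axis order (scene, time, z, channel, y, x) and all the sizes are assumptions chosen for illustration; real files define their own axis order in the metadata.

```python
import numpy as np

# Hypothetical acquisition: 2 scenes, 3 time points, 4 z-slices, 2 channels.
scenes, timepoints, z_slices, n_channels = 2, 3, 4, 2
height, width = 32, 32

stack = np.zeros(
    (scenes, timepoints, z_slices, n_channels, height, width),
    dtype=np.uint16,
)
print(stack.ndim)  # 6

# A single 2D plane: scene 0, time point 1, z-slice 2, channel 1.
plane = stack[0, 1, 2, 1]
print(plane.shape)  # (32, 32)

# A 3D volume (z, y, x) for scene 0, time point 0, channel 0.
volume = stack[0, 0, :, 0]
print(volume.shape)  # (4, 32, 32)
```

Every analysis step eventually works on 2D planes or 3D volumes like these, so knowing which axis is which matters more than the file format itself.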
Moving down. Again, you don't need to install anything right now because Google Colab already has these libraries installed. We do need to install a package for CZI files, for example, since some of these proprietary ones are not included, but we'll save that for later. Right now, we are only looking at JPEG versus PNG and the compression trade-off. Let's go down here, and you can see the JPEG versions: these are different quality settings for JPEG, all the way up to here, and this is the PNG image, which uses lossless compression. This first one is heavy lossy compression, and its file size is very small, so you can store millions of images on, let's say, a flash drive.
As you increase the quality of the images, the file size goes up because you're storing more information. That means larger files and fewer files that you can store. That's basically the trade-off here.
It's not just a storage trade-off; it also affects reading and writing. If you're reading and writing millions of images, high-quality images with a lot of pixel information take time, and the lower-quality ones are quicker. One way of assessing quality is peak signal-to-noise ratio (PSNR): this one is 24 decibels, and these are 31, 63, and 49. I believe there is a cutoff of about 40. I thought I put that somewhere.
Oh, yeah. Above 40 dB is generally considered visually lossless. That means humans cannot tell the difference between 49 and 63, for example, since both are above 40.
And this is the PNG, which again is lossless.
Its peak signal-to-noise ratio is infinity, because the error is zero and you're effectively dividing by zero.
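PSNR is computed from the mean squared error (MSE) between two images as 10 * log10(MAX^2 / MSE), where MAX is 255 for 8-bit data. The sketch below is a plain NumPy version, not the notebook's code; it uses added random noise as a stand-in for JPEG artifacts, which is an assumption.

```python
import numpy as np


def psnr(reference: np.ndarray, test: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in decibels between two same-shape images."""
    ref = reference.astype(np.float64)
    tst = test.astype(np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images: zero error
    return float(10.0 * np.log10(max_value ** 2 / mse))


rng = np.random.default_rng(seed=0)
original = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Small random errors of up to +/-5 gray levels stand in for lossy artifacts.
noise = rng.integers(-5, 6, size=original.shape)
degraded = np.clip(original.astype(np.int16) + noise, 0, 255).astype(np.uint8)

print(psnr(original, original))  # inf
print(round(psnr(original, degraded), 1))
```

The explicit check for zero error is what produces the infinity reported for the lossless PNG; without it the division would fail or emit a warning.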
Okay. Now, hopefully that entire compression part makes sense to you. Now, let's go ahead and move on to the bit depth, 8-bit versus 16-bit depth. Again, I already mentioned 8-bit means 2 to the power of eight values between black to white. So, you have 0 to 255. 16-bit, of course, is 2 to the power of 16. So, black is zero, and white is 2 to the power of 16, which is 65,536, but since we start at zero, 65,535.
Is why do you why do you why do you care, you know? Well, if you have a lot of detail that you're trying to capture, like very nice range in your information, then you're cutting it, you know, you're binning it too much if you only store that information in 256 bins. At the end of the day, these are different bins that you're storing in, right? So, if you have 65,536 bins where the information is stored, then you have more room to adjust when you're trying to process your images.
That's why most of the time scientific images are uh captured at uh let's say 16-bit uh in general.
Okay? So, bit depth: let's run that cell. This is the red channel only, by the way, and I'm displaying it as grayscale. When I say red, the color is something you apply artificially; here we are extracting the first channel in our NumPy array, which corresponds to the red channel, and typically we would apply red color to it. That's the 8-bit grayscale and that's the 16-bit grayscale.
In this case, the histogram goes from 0 to 255.
In this case, it goes from 0 to 65,535. As you can see, a lot more information is stored. So when you process this histogram, which is nothing but processing the image, you have a lot more room to adjust without seeing any weird artifacts.
Okay? So try to work with 16-bit if you can, especially if you're processing images by adjusting the contrast, brightness, and so on.
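The "more bins" argument can be sketched with a synthetic signal. This is an illustration under assumptions, not the notebook's cell: the dim ramp below is made up, and it simply counts how many distinct levels survive when the same signal is stored at each bit depth.

```python
import numpy as np

# A smooth, dim signal occupying only the lowest 5% of the full range.
signal = np.linspace(0.0, 0.05, 10_000)

# Store the same signal at two bit depths.
as_8bit = np.round(signal * 255).astype(np.uint8)
as_16bit = np.round(signal * 65535).astype(np.uint16)

levels_8 = np.unique(as_8bit).size
levels_16 = np.unique(as_16bit).size
print("distinct levels at 8-bit: ", levels_8)
print("distinct levels at 16-bit:", levels_16)
```

If you then stretch the contrast of the dim region, the 8-bit version has only a handful of levels to spread across the display range, which is where banding artifacts come from, while the 16-bit version still has thousands.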
Now, let's go ahead and read the same image using three different libraries.
So far, we have only used OpenCV, which is cv2.
I am reading this image with OpenCV, and once the image is loaded, I always convert BGR to RGB so the channels are in the order we expect.
Then I'm using Pillow here. By the way, with OpenCV and scikit-image, as soon as you read an image it's already a NumPy array. With Pillow, it's not: Image.open returns a Pillow image object, and you have to convert that into a NumPy array. That's what I'm doing here. scikit-image, on the other hand, loads it directly as a NumPy array.
Let's load the same image using these three libraries. There's no reason why you should see anything different.
You should see exactly the same image.
So use whichever you want. Once OpenCV's channels are converted to RGB, all three libraries return the same NumPy array. There is no difference: same values at every pixel. Okay? I use scikit-image, but you can use whichever one you want.
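The BGR-to-RGB step is the one place where the libraries differ, and it can be illustrated without any image file. The sketch below uses only NumPy and a made-up two-pixel image; it simulates the channel order OpenCV returns rather than calling OpenCV itself.

```python
import numpy as np

# A tiny 1x2 "image" in RGB order: one pure red pixel, one pure blue pixel.
rgb = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Simulate the BGR channel order that cv2.imread returns for a color image.
bgr = rgb[..., ::-1]

# Reversing the last axis again restores RGB; cv2.cvtColor with
# cv2.COLOR_BGR2RGB performs the same channel swap.
restored = bgr[..., ::-1]

print(bgr[0, 0])                      # the red pixel as stored in BGR order
print(np.array_equal(restored, rgb))  # True once the order is fixed
```

For Pillow, the equivalent step is wrapping the opened image in `np.asarray(...)` (or `np.array(...)`) to get the array.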
Okay. Now, moving on to chapter three.
Again, I hope this tutorial doesn't go on too long, and I hope you're still engaged and still feeling like you're learning something.
Not just feeling like it, but actually learning something. Let's look at single-channel, RGB, and multi-channel images. Now, what is a channel? It's an independent layer of information. Again, you have this 2D array of numbers.
For a grayscale image, you only have one of those.
For a color image, you have three of those: red, green, and blue. So you have three such arrays, each of dimension H by W. There is also something called an RGBA image. A stands for the alpha channel, and it's a four-channel image: it has red, green, and blue, and on top of that it has transparency information in the fourth channel. And a 16-channel microscopy image, for example if you have 16 fluorescence wavelengths, is height and width, the two-dimensional pixel information, and 16 such copies.
I shouldn't say copies, because that implies you're copying. It's 16 such two-dimensional images stacked together that give us the 16-channel image.
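The channel counts described above map directly onto array shapes. Here is a small sketch with empty arrays; the sizes are arbitrary, and the channels-last layout (height, width, channels) is an assumption, since some file formats and readers put the channel axis first.

```python
import numpy as np

height, width = 4, 6  # arbitrary small size for illustration

gray = np.zeros((height, width), dtype=np.uint8)          # one channel
rgb = np.zeros((height, width, 3), dtype=np.uint8)        # red, green, blue
rgba = np.zeros((height, width, 4), dtype=np.uint8)       # plus alpha
fluor16 = np.zeros((height, width, 16), dtype=np.uint16)  # 16 wavelengths

for name, img in [("gray", gray), ("rgb", rgb), ("rgba", rgba), ("fluor16", fluor16)]:
    print(name, img.shape, img.dtype)

# Stacking 2D planes along a new last axis builds a multi-channel image.
planes = [np.full((height, width), i, dtype=np.uint16) for i in range(16)]
stacked = np.stack(planes, axis=-1)
print(stacked.shape)
```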
So let's go ahead and demonstrate that an RGB image is just three stacked 2D arrays.
What I'm doing is separating the channels from our array: I take the zeroth channel and call it red, then channel 1 and call it green, and channel 2 and call it blue, and I'm just plotting them. That's it. There's nothing fancy going on. Here is our input image, and this is the red channel, this is the green channel, and this is the blue channel.
You see how the intensity is slightly different. Apparently the cat itself has equal amounts of red, green, and blue, because it's equally dark in all three. Same with the bright regions, and in the other regions you will probably see some changes.
And instead of just showing grayscale here, I can say that to all these pixels I want to apply red color, to all these I want to apply green, and to all these I want to apply blue.
You know, just like I showed you here.
When you combine these, you get this image.
Yeah? And if you look down here, you see how this text area is white here, while the same area over here is dark.
That makes sense, because this is the red channel, and the red channel is brighter where the text is red, right here.
Okay? Very simple.
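The split, colorize, and recombine steps can be sketched in a few lines of NumPy. The image here is random stand-in data rather than the cat picture, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)  # stand-in RGB image

# Each channel is a plain 2D array.
red, green, blue = image[..., 0], image[..., 1], image[..., 2]

# "Applying red color" to the red channel: keep it in slot 0, zeros elsewhere.
red_only = np.zeros_like(image)
red_only[..., 0] = red
green_only = np.zeros_like(image)
green_only[..., 1] = green
blue_only = np.zeros_like(image)
blue_only[..., 2] = blue

# Adding the three colorized images gives back the original.
recombined = red_only + green_only + blue_only
print(np.array_equal(recombined, image))
```

The addition cannot overflow here because, at every position, two of the three colorized images contribute zero.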
Now that we've separated the red, green, and blue channels, let's look at the histograms. How is the information distributed in these histograms? So here is the cat.
There's a lot of green right here.
The dog image, again, has some green in the back and some red. The dog has some reddish shades.
And there's a lot of black. It's a black cat, so you can see a lot of pixels near zero, at the lower end, in all three channels.
You can also see the green histogram with a few more pixels showing up on the brighter side.
That's coming from this green area.
Same thing here. There is some green right here.
So I just want to make sure you see how the histograms can be split by color channel, and that you can get some information out of them.
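Per-channel histograms are just counts over each 2D channel array. Below is a rough sketch with `np.histogram` on a synthetic image: a mostly dark picture with a bright green band standing in for the green background. The data and the threshold of 128 are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in image: mostly dark in all three channels.
image = rng.integers(0, 40, size=(32, 32, 3)).astype(np.uint8)
image[:8, :, 1] = 200  # a bright band in the green channel only

bright_counts = {}
for index, name in enumerate(["red", "green", "blue"]):
    # 256 bins, one per possible 8-bit value.
    counts, _ = np.histogram(image[..., index], bins=256, range=(0, 256))
    bright_counts[name] = int(counts[128:].sum())
    print(f"{name}: {bright_counts[name]} pixels at or above 128")
```

Only the green channel reports pixels on the bright side, which mirrors the extra bump in the green histogram described above.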
Now, so far I've mentioned red, green, and blue.
In that color space you have red, green, and blue, and each one has a brightness. So the RGB image is nothing but the brightness for red, the brightness for green, and the brightness for blue, each going from zero to the maximum value.
You don't have to work in that color space. You can work in the hue, saturation, and value (HSV) color space.
Or in the LAB color space, where L is lightness, A is the green-to-red axis, and B is the blue-to-yellow axis. Those of you who work with Photoshop or any other image editing software probably already know what some of these are. Why do we need any of these? Sometimes, depending on the type of processing you're doing, it helps to work in the HSV space because you can adjust only the saturation.
Okay?
But you can't easily do that in RGB space. So if you want to do certain tricks, you can convert your images into different color spaces. That's all I wanted to show you there. You can see this is the original red, green, blue image. This is the H channel in HSV, the hue. You can see how it's showing you different colors.
This is the saturation. For example, let's look at this pinkish or reddish color.
You know that there is a reddish color right there, and you can see, for example, more brightness here.
Also a bit more saturation right there.
For that color.
In LAB, shown here in grayscale, L stands for lightness again. There you go.
And A stands for green to red, showing where the image goes from green to red, and this is B, blue to yellow. These are shown as heat maps. So again, I just want to make sure you're aware of these and how you can convert your images. I've done in-depth videos on almost every topic that I'm talking about, 4 or 5 years ago, and they still work. Please go back and check those videos to get more information. Okay?
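As a small illustration of adjusting only the saturation, here is a sketch for a single pixel using Python's standard `colorsys` module. The `scale_saturation` helper and the pinkish color are hypothetical; for whole images you would more likely convert with something like `cv2.cvtColor` or `skimage.color.rgb2hsv`, which is not shown here.

```python
import colorsys

def scale_saturation(rgb, factor):
    """Return an 8-bit RGB triple with its HSV saturation multiplied by factor."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    s = min(1.0, s * factor)  # saturation must stay within [0, 1]
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(c * 255) for c in (r2, g2, b2))

pinkish = (200, 120, 140)
print(scale_saturation(pinkish, 0.0))  # fully desaturated: a gray at the same value
print(scale_saturation(pinkish, 1.0))  # unchanged
```

Hue and value are left untouched, which is the point of working in HSV: the same change in RGB would require adjusting all three channels together.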
Now, finally, I would like to show you multi-channel images. This is one image that I don't think you have.
I'm installing the czifile library to handle a .czi file, which is the Zeiss format. Go ahead and search for .czi files and download one.
You'll find some free files out there, and then you can practice this.
In my case, I have a .czi image called osteosarcoma; it's a cancer-related image.
The shape of this image shows three channels, and this is not an RGB color image. These three channels are different: one is a nuclear marker, one is cytoplasm, and one is, I guess, mitochondria.
Okay? So, those are the three channels, and yes, the channel information is given right there.
Channel number zero is the mCherry channel, the organelle. Life scientists, you probably recognize this.
Channel one is GFP, which represents cytoplasm, and channel two is DAPI, to which we typically apply blue color, and it shows the nuclei. Now let's go down and actually look at this image. There you go. This is the original image, this is channel number zero, the organelle channel shown in red, this is the cytoplasm in the cell, and that's the nucleus in the cell.
Yeah? It's very similar to our TIFF or JPEG images.
You know, like RGB. The way we are processing this is very similar.
The only thing that's different is the way you load them. Instead of using scikit-image or cv2, we are using czifile. Once you read the file, separating the channels and all that is just NumPy array manipulation.
Okay, there you go. Individual channel information right there.
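Once the file is loaded, the channel separation can be sketched as below. The array is random stand-in data, not the osteosarcoma file, and the channels-first shape (channel, height, width) is an assumption: the array a CZI reader returns often carries extra dimensions that depend on the file, so check `.shape` first and drop singleton axes (for example with `np.squeeze`) if needed.

```python
import numpy as np

# Stand-in for the array a CZI reader returns. The real shape depends on the
# file; a channels-first (channel, height, width) layout is assumed here.
rng = np.random.default_rng(3)
stack = rng.integers(0, 65536, size=(3, 8, 8), dtype=np.uint16)

# Channel names as described for the tutorial's example file.
channel_names = ["mCherry (organelle)", "GFP (cytoplasm)", "DAPI (nuclei)"]

# Each channel is one 2D plane of the stack.
channels = {name: stack[i] for i, name in enumerate(channel_names)}
for name, plane in channels.items():
    print(name, plane.shape, plane.dtype, "max:", plane.max())
```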
I should also mention grayscale conversion.
When you load a color image and use scikit-image or OpenCV to convert it to grayscale, as we are doing here with this RGB image, how does the conversion work? There are many ways you could do it, right? You have red, green, and blue channels. Do you just add them up and divide by three?
That's one approach, but typically the standard formula is this one.
Okay. Now I'm displaying both. Here is the original image, and here is the image converted to grayscale by averaging the channels, (R + G + B) / 3. And here is the luminosity formula I just showed you, 0.299 R + 0.587 G + 0.114 B. I mean, who came up with this formula, right?
Apparently this formula... did I write that down? Yes, it gives a more natural-looking grayscale because human eyes are most sensitive to green, less to red, and least to blue. That's how the channels are weighted: red by almost 30%, green by almost 59%, and blue by 11%. You combine the three channels accordingly, so the result looks more natural to humans. And this is OpenCV. OpenCV uses exactly the same luminosity formula, which is why you shouldn't see any difference between these two. This one is the average, and I hope you can see on the video that these two are slightly different. This one is a richer white compared to this one, which looks like a duller white, almost a lighter shade of gray in this region.
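The two conversions can be compared side by side on three pure-color pixels. This is a minimal sketch, not the notebook's code; the function names are illustrative, and the results are left as floats rather than rounded back to 8-bit.

```python
import numpy as np

def gray_average(rgb):
    """Plain mean of the three channels."""
    return rgb.astype(np.float64).mean(axis=-1)

def gray_luminosity(rgb):
    """Weighted sum using the standard 0.299 / 0.587 / 0.114 weights."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb.astype(np.float64) @ weights

# Three pure-color pixels: red, green, blue.
pixels = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

print(gray_average(pixels).round(1))     # all three come out equal
print(gray_luminosity(pixels).round(1))  # green brightest, blue darkest
```

With plain averaging, pure red, green, and blue all map to the same gray; the weighted version keeps green brighter than red and red brighter than blue, which matches the sensitivity argument above.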
Okay, that's how you convert, and I think we should end there. Tutorial two builds on this: now that you know what images are, what these pixels are, and how they are stored, we'll try to manipulate them. What I mean by that is filtering these images, applying a Gaussian filter or a median filter, for example. I hope you found this tutorial useful, and please check the next video in this series.
I'll see you in the next tutorial. Thank you very much.