This lecture covers Java's memory management concepts including stack memory (allocated at program start, fixed size, cannot grow) and heap memory (dynamic, requested as needed, accessed via references). The instructor demonstrates how to trace code using memory diagrams, showing how variables declared inside code blocks are erased when blocks end, while variables declared before a block remain accessible. Key data structures introduced include ArrayList (sequential, indexed from 0, requires explicit import and type parameters like ArrayList<Integer>) and HashMap (key-value store, uses put() and get() methods, does not preserve insertion order). The lecture emphasizes understanding references (memory addresses like 0x1000) and how the dot operator accesses heap objects, with practical examples of file reading, CSV parsing, and exception handling using try-catch blocks.
CSE116 Lecture 2: Java Intro 2
Okay, let's do this.
So, welcome back to lecture two.
Nothing ever works the way I want it to, but we'll do. All right, so welcome back.
Lecture two, and let's pick up where we left off. I'll give, I guess, one reminder. I went through the whole syllabus last time, and I'm not going to read it all again. Just a reminder: don't fall behind. Falling behind is a quick way to lose track of the course. And just to put some emphasis on that, by the end of today, two and a half hours from now, we'll be 20% of the way done with the content of this course.
There are only 10 lectures. This is number two. That's 20%. So by the end of next week, we'll be 40% done with all of the material I'm going to present to you. Especially after today, it's really in your hands: doing these problem sets and making sure you're keeping up to date on everything you need to be doing, with problem sets and studying for the eventual exams in a couple of weeks. That midterm exam is June 23rd, and it's going to come up very, very quick. So make sure you're ready for that and getting the problem sets done. Okay, with that, let's jump right into some freaking content. All right. So, we left off with loops.
And I said I'd review the loops.
We kind of went through both loops last time, but I had to rip through them pretty quick, and it was just kind of tacked on and rushed at the end of a lecture. So, let's take a little bit more time and talk about these loops. We have two styles of loops in Java: the while loop and the for loop. The while loop that you see here has effectively the same syntax as an if statement. You have the keyword while and then a boolean expression. This is any expression that results in a boolean value. If that value is true, then the body of the loop executes. So as soon as we hit this loop, val is going to be 10.
10 is greater than one. That's true. So we're going to execute the body of the loop and print 10 to the screen. And this syntax here, /=, works the same way as the += that you should have seen in Python. There's +=, /=, *= for multiply, and -= for subtract. In this case it's the equivalent of saying val = val / 2. So this is effectively dividing val by two and reassigning that back to the variable named val. So we're dividing val by two, and val goes down to five. We hit the end of the body of the loop. And the big difference between a while loop and an if statement is that once we hit the end of the body of a while loop, we're going to check the conditional again.
Five greater than one in this case is still true. So we're going to execute the body again and we're going to keep doing that until the boolean expression resolves to false. Then once it's false, we skip to the end of the loop and continue on with the program. So, it's like an if statement that keeps checking the condition every time it executes.
And your loops might not execute at all.
It's a fairly common mistake to rely on your loops for some sort of variable initialization or for doing some processing with your data. But if your loop never executes, that processing never happens.
So make sure all your initialization, and any processing that absolutely has to happen, happens outside of your loop.
So here, if val is -5, then -5 greater than one is false, and the body of this loop, these two lines, executes exactly zero times. They never execute. That's something to always keep in mind when you're writing loops, especially when you're submitting to AutoLab and you can't see what the test cases are. We're usually going to throw one test case at you that doesn't execute your loops at all, like an empty data structure where your loop just doesn't do anything. So making sure that you can handle those edge cases is a pretty important part of programming.
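A minimal sketch of the while loop described above. The slide code is not reproduced in this transcript, so the class and method names (`WhileLoopDemo`, `countHalvings`) and the iteration counter are assumptions added for illustration; only the loop itself follows the lecture.

```java
class WhileLoopDemo {
    // Runs the halving loop from the lecture and reports how many
    // times the body executed. The counter is an illustrative addition.
    static int countHalvings(double val) {
        int iterations = 0;
        while (val > 1) {            // checked before every pass, including the first
            System.out.println(val);
            val /= 2;                // same as val = val / 2
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        // Starting at 10.0 the body runs for 10.0, 5.0, 2.5 and 1.25.
        System.out.println("body ran " + countHalvings(10.0) + " times");
        // Starting at -5.0 the condition is false immediately: zero passes.
        System.out.println("body ran " + countHalvings(-5.0) + " times");
    }
}
```

The second call is the edge case the lecture warns about: nothing inside the braces runs, so any setup that must happen has to live outside the loop.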
Just something to always be aware of while we're coding. And then there's our other flavor of loop, and this one I'll spend a bit more time on because you haven't seen this style if you're coming from 115 or from anything that taught Python. Python doesn't have this style of loop, which is what I call the traditional for loop. This is what I just recognize as a for loop. And then we have the for-each loop, which is Python style. That's a newer style of loop. Here I'm using the word newer pretty loosely; it's probably 20 or 30 years old by now. But this is the traditional for loop that's been around for, I don't know, 50 years or however long, probably since programming was a thing. I don't know the exact history. Anyway, this style of loop is written as a sequence of three statements separated by semicolons, and all three have very specific purposes for controlling the loop. First we have the initialization statement, which is executed exactly one time, when we reach the loop for the first time. This statement is executed within the scope of the loop, or within the block of the loop. I'll expand on that when we get to the memory diagram, but it is effectively executed inside the braces. Not literally, but effectively it's in the scope of the for loop. So when we're writing our memory diagrams, that variable x in this case is going to be declared inside of the code block for the for loop, and again, when I do the memory diagram that'll be explicitly clear.
The next statement is the boolean expression, or the condition. This expression is just like the boolean expression for the while loop. We get to the loop, we execute the initialization statement, and then we check the boolean expression. If this resolves to true, which it will in this case since 0 less than 5 is true, then we're going to execute the body of the loop. And every time we get to the end of executing the body, we run what's called the increment statement. This statement can be any code that you choose, but typically this is where we're incrementing a variable.
This is kind of the standard way to use this loop: start with x = 0, go until x = 4 in this case because it's strictly less than, and increment one at a time. So x is going to be 0, 1, 2, 3, 4. In Python, this is the same as saying for x in range(5).
It's going to do the same thing in this case. But you have a lot more flexibility with this style of loop because these three statements can be anything you want. So when you reach the end of the body of the loop, you execute the increment statement, then check the conditional again. If the condition is still true, execute the body again, execute the increment again. Then check the condition again. If the condition is still true, you just keep going. You keep doing that until the condition is false.
Oh, and x++ is the same as x += 1 or x = x + 1. The previous one, the /=, you might not have seen, but you would have seen x += 1 in Python to do the same thing, because Python doesn't have the increment operator. So x++ is the same as saying x = x + 1. It's going to increment x by exactly one and reassign that value back to the variable. Incrementing a variable, an int specifically, by one is something we do often, so we have a nice syntactical shortcut for it: just ++. In general: we reach the loop, we run the initialization statement once, then check the boolean expression. If it's true, we execute the loop body, then the increment statement, then check the expression again, and we do that until the expression is false. And just like our while loop, the boolean expression might be false the first time we get to the loop. That's less common with a for loop, because you're initializing the variable right away and then checking it, but you certainly can have the case where the loop never executes at all. And yeah, that's our for loop.
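One way to see the order of the three statements is to write the same loop twice, once as a for loop and once as a while loop with the pieces spelled out. This is a sketch for illustration; `ForLoopDemo`, `withFor`, and `withWhile` are made-up names, and collecting the values in a `StringBuilder` instead of printing them is an assumption made so the two versions can be compared.

```java
class ForLoopDemo {
    // Traditional for loop: initialization; condition; increment.
    static String withFor() {
        StringBuilder out = new StringBuilder();
        for (int x = 0; x < 5; x++) {
            out.append(x);
        }
        return out.toString();
    }

    // The same steps written out by hand with a while loop.
    static String withWhile() {
        StringBuilder out = new StringBuilder();
        {                       // extra braces keep x scoped to the loop, like the for version
            int x = 0;          // initialization: runs exactly once
            while (x < 5) {     // condition: checked before every pass
                out.append(x);  // body
                x++;            // increment: runs at the end of every pass
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(withFor());
        System.out.println(withWhile());
    }
}
```

Both methods visit x = 0, 1, 2, 3, 4 and stop when x reaches 5, without running the body at 5.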
So, let's see this in action in another memory diagram. Just like my other ones, I have the memory diagram in the slides.
If you want to go through my commentary in the slides when you're reviewing, I have that available for you, and that's perfectly fine.
But in lecture I'm going to do it in the tracing tool, because that's what you're going to be doing on your exams and I want to give you maximum practice on those. And as we discussed last time, either I or, probably, the TAs will give practice trace questions before the midterm.
I'll make sure we get that. So, loops. Let's grab this code and head back over to the tracing tool. We have our old trace. I'm going to trash-can this. Gone. Hide the toolbar.
Get our new code in here and trace through this code. Make sure I'm in Java. I can see because there's no global section that I'm in Java mode.
And let's trace through this thing.
There's one thing I was checking.
No, I just checked that. But the TAs will get their office hours posted. We do have two TAs now, Jack and Miam, and they'll get their hours on the office hour schedule. I was hoping they would have already. Let me check if they did just recently. Nope.
Actually, I haven't even added Jack to the Piazza, so Jack might have an excuse.
But we'll get that sorted. We'll get office hours on there. Though on Tuesday, I mean, it's pretty early in the semester. Tuesday, for an hour and a half, I didn't have any customers in office hours. There was one person who had technical issues with Zoom, so technically one person. I don't think there's too much demand right now for office hours, but we will get the schedule up, especially now that problem set zero is in your hands.
Next week I assume more of you will be coming for help on problem set zero. We'll make sure we get that schedule up and running. All right. So every Java program starts with a main stack frame.
So let me get my main stack frame there.
And then the first thing we do is create a variable named val with the value 10.0.
And then we hit our first loop.
Now a loop is a code block. So I would have to create a code block for this loop. Every time we have a code block, you absolutely can create your code block like this. Except we won't have any variables in this code block.
There's no variable declared anywhere I can see in the body of this loop. And since it's a while loop, there's no initialization statement.
There won't be any variables inside of this code block; it'll just be an empty code block. If you want to add that to your tracing diagrams, you are more than welcome to. This is absolutely correct. But when it's an empty code block, we do leave it as optional. You do not have to add that empty code block to your trace. I'm actually going to leave this one here. I usually don't have them in lecture because it clutters up the screen, and once the stack gets large enough that I have to keep scrolling up and down, it just makes lecture a little more disorienting, and I don't like to do that. For this one, I have a small enough program that I'm not going to fill up the stack. So I'll leave my empty code block here just for emphasis that there is a code block that is created. It's just empty, so it doesn't have any effect on our program. But I'll leave it there for this example, for this trace. In most cases, I won't put my empty code blocks on the stack. So, we check the condition: val.
And again, from last time, I'm not going to look at my code to resolve the symbol val. This variable val is an expression that I need to resolve to an actual value, and I'm not going to look at my code to figure out what val is.
If I do that, it's just always going to be 10, and then I have to keep track in my head and remember what it is. But that's exactly what we're doing with the memory diagram: tracing through the code and keeping track of what this variable is at all times. I'm going to use the trace to look up what val is. I'm in the main method, so I'm going to look at the main stack frame. val has the value 10.0. So it's 10.0 greater than one.
That's true. So, I'm going to execute the body of the loop. I do want to resize this a bit, since we do have a few things being printed to the screen. So, we're going to print val. And again, for val, I'm going to look at my stack. 10.0 hasn't changed, but this way I don't even have to remember anything. I don't want to think too much while I'm doing my traces, and you shouldn't think too much on your midterm and final either.
Remember the rules of how to trace through code. Remember what the computer does in each situation, and then just kind of blindly go through the code. That's the best way to do a trace. I see a lot of students do poorly on traces because they're trying to memorize every little thing, memorizing exactly what each example in lecture does, instead of understanding the underlying rules that I'm going through. Variable declaration, well, very few people make mistakes on that. But code blocks: remember the rules for code blocks, for stack frames, for variable reassignment, eventually object creation. If you remember all the rules, and there's honestly not that many in this course, I would say, I don't know, 10 total things to remember how they work.
Once you know those things, you can trace through any program I'm ever going to throw at you on the midterm or final.
So, we print it to the screen, and then val /= 2. This is the same as val = val / 2. Same exact thing.
So I'm going to take val / 2, and that's going to give me 5.0 as the value for val.
I reach the end of the body of the loop.
It's a while loop, so I'm going to check the condition again: val greater than one. Again, I'm never going to look at my code to resolve a variable.
I'm going to look at my stack. val is 5.0. 5.0 greater than 1, that's true.
So, we're going to print val. val is still 5.0.
Print to the screen. Divide it by two.
val gets to 2.5.
We reach the end of the body.
Check the condition again. 2.5 is greater than one. That's true. So, we're going to go around the loop again.
Print 2.5. Divide val by two.
reach the end of the body of the loop.
Check the condition again. val is now 1.25, and 1.25 greater than one is still true. So, we're going to go around the loop again. We're going to print val to the screen. We're going to divide val by two.
Oh my goodness.
Windows drives me absolutely insane.
I guess I do have PDF files open, but Adobe just had a popup that said, "You should make me your default option for PDF files." No, no. You don't do what I want you to do. That's why I use Okular. So val is 0.625, and then we reach the end of the body of the loop again. So we check the condition again: val, which is 0.625, greater than one. Now the condition is false.
So now we do not execute the body again.
We go to the end of the loop and continue on with our program to the next line. And there's a bit of a nuanced point here: the value of val did reach 0.625. It did reach a value that causes the condition to be false. We did get there, but the loop body was never executed at that value.
So the final value for val is 0.625, but 0.625 is never printed to the screen because the body of the loop never executed to be able to print it at 0.625.
And we reach the end of the loop, so we cross out our code block. There are no variables in that code block, so it doesn't have any effect on our program.
And since val was declared before we reached that code block, val is in the main stack frame directly, not inside a code block. So val does live on beyond the scope of that loop. Even though the loop's done, and val's purpose really was to run that while loop, since it was declared before the while loop it's still in memory after the while loop ends. And we get to our next loop, our for loop. So we have another code block. Whenever you see a pair of braces like this, you know you have a code block, and then the question is whether we have variables inside of it. This time we do: the initialization statement of a for loop is executed within the scope of the for loop. So that actually does go in the code block for this for loop.
So x getting the value zero is inside this for loop. We run the initialization statement exactly one time when we first reach the loop.
Then we're going to check the conditional: x less than five. x is currently zero. That's less than five.
That's true. So, we're going to execute the body of the loop. In this case, we're just printing out x to the screen.
x is zero. Print zero to the screen. We reach the end of the for loop body, so then we execute the increment statement, x++.
So, increment x. This is the same as x = x + 1 or x += 1. All three of those do the same thing. So, x is now one.
And we're going to check the condition again.
One less than five. That's true. So we're going to execute the body of the loop again. Print out one to the screen.
Reach the end of the loop. Increment.
Then check the boolean expression. Two less than five is true. Print two.
As loops do, this gets a little bit repetitive here, so we'll speedrun this a little bit. Increment x to three.
Three less than five is true.
Increment x to four.
Four less than five is true. So we'll print four. Get to the end of the loop again. And then we're going to increment x again to five. So the value of x is now five. And fours, we should make a slight change here: when you cross a four out, the line overlaps with the cross stroke of the four, so it doesn't look like it's crossed out, but that four is crossed out. And then check the condition again. Five less than five is false. So now we're done with that loop.
The code block is gone. x does not exist anymore in memory. And then we reach the end of the main stack frame. The program ends. The main stack frame is no longer in scope. It returns, it's out of memory, and the whole program ends. And this is the final output for our program. So again, notice that x did reach the value five. It gets to a value that causes the conditional to be false.
But the body of the loop does not execute at that value of five. We do not print five to the screen. Four is the last value that we print to the screen.
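The program being traced lives on the slides and is not reproduced in this transcript. Reconstructed from the trace above, it would look roughly like this; the class name `LoopTrace` is an assumption, and the exact slide code may differ in details.

```java
class LoopTrace {
    public static void main(String[] args) {
        double val = 10.0;              // lives directly in the main stack frame

        while (val > 1) {               // code block with no variables of its own
            System.out.println(val);
            val /= 2;
        }
        // val is 0.625 here, but 0.625 was never printed by the loop.

        for (int x = 0; x < 5; x++) {   // x lives inside the for loop's code block
            System.out.println(x);
        }
        // x reached 5 to end the loop, but 5 was never printed and x is gone here.
    }
}
```

Running it prints 10.0, 5.0, 2.5, 1.25 from the while loop and then 0 through 4 from the for loop, matching the output built up in the trace.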
Oh, I never went full screen. I guess it's too late now. But okay. I want to hop over to IntelliJ just to emphasize a few points in this program. So, first let's run the program to make sure I didn't make any mistakes in my code. It should print out the same exact values that we got in the trace.
But let's do a couple of things. I'm going to use the shortcut sout: I'm typing sout and Tab. And then print out the value of val again right here. We saw that at the end of the loop val was 0.625.
But the loop body never executed at that value. We never printed 0.625.
But two things here: val is still in scope after the loop ends. It still exists in memory, and its value should be 0.625.
So let's run it and make sure we get 0.625 after our 1.25 right here. We should see it. And there it is. So that is the final value of val. And let's do the same thing with x to see if we can see the final value of x. And this one, well, we have a little bit more of an issue. So, let's try to run this. And we're going to get this error: cannot find symbol, variable x. And whenever you have an error in your code, you're going to get a blue link that you can click on to take you exactly to the right line.
This file all fits on the screen at one time, and I have a single-file program.
So it doesn't really matter here, but I guess if I move my cursor, I can see it moving my cursor to exactly where the issue is. My cursor just happened to already be there. It'll take you right to where the bug is and tell you what the bug is.
Cannot resolve symbol x. Now, this can be a frustrating error when you're new to this, because you can see x right here. int x is being declared right here. You're using x, and just two lines before you're doing the same exact thing and printing x to the screen. But right here, you get an error.
It can be weird. Hopefully, it's not weird to any of you because we just went through it, but it can still feel weird. That's why I'm still talking about it now. The reason is that when we cross out this code block, when we hit this closing brace and the condition is false for this for loop, the code block for that for loop is erased from memory, including every single variable that was associated with that code block, which includes variables from the initialization statement of the for loop. Those variables are all erased, gone from memory. They don't exist anymore. So if I try to access x after the for loop ends, after that code block is erased from memory, I can't access it anymore because it doesn't exist. The symbol x no longer exists in memory.
It's not in memory. Java is not going to be able to find it because it was erased when we hit the end of that for loop. The lifespan of x was exactly the lifespan of that for loop. It got cleaned up as soon as the for loop ended. So we cannot access x anymore, and I can't see the value of x.
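A small sketch of the scoping rule just described. The names `ScopeDemo`, `finalVal`, and `finalX` are made up for illustration, and declaring x before the loop is shown only as one possible way to inspect its final value; it is not something the lecture does.

```java
class ScopeDemo {
    // val is declared before the loop, so it outlives the loop's code block.
    static double finalVal() {
        double val = 10.0;
        while (val > 1) {
            val /= 2;
        }
        return val;   // still in scope here
    }

    // If x is declared before the loop instead of in the initialization
    // statement, it belongs to the enclosing block and survives the loop.
    static int finalX() {
        int x;
        for (x = 0; x < 5; x++) {
            // body intentionally empty
        }
        return x;     // the value that made the condition false
    }

    public static void main(String[] args) {
        for (int x = 0; x < 5; x++) {
            System.out.println(x);
        }
        // System.out.println(x);  // would not compile: cannot find symbol x

        System.out.println(finalVal());
        System.out.println(finalX());
    }
}
```

The commented-out line is the one that produces the error in the lecture: by that point the for loop's code block, and the x inside it, no longer exist.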
The value of x did reach five, but we can't really see it. If I dare use the debugger in the first two lectures, we can run the debugger and still verify we have val in memory with the value 0.625. And if I step through this, I can see x going from 0 to 1, 2, 3, 4. And if we click again, oh, well, the debugger doesn't show it to us. It skipped some intermediate steps, but we can see that x is removed from memory. As the program is ending, val is still in memory. For what it's worth, args is still in memory, but we don't use that in 116, so we don't bother tracing it. If you want to trace args in your memory diagrams, by the way, go for it. It's more correct than what we do.
We just omit it because it's extra clutter on every single trace that we'll ever do. It's just unnecessary. But we can see that x is no longer in memory after the for loop ends, and then one more click and the program itself ends,
because it's the last line of code.
Okay, any questions? Chat's been pretty quiet ever since we finished going over the syllabus. Chat's been pretty quiet.
But if you do have questions, that is the purpose of doing these live is so you can ask, y'all can ask questions during. So if you have questions about any of that, I would love to hear some questions.
Uh, and I'll keep watching chat if there are questions that come in. Uh, I can jump back to whatever. I can hop around.
Doing this live really is so you can interact with me. If you don't interact with me, hey, that's fine. I'll still do my thing. I'll just go through the content. But you do get more value if you ask questions. All right. "Can you explain when to use for or while?" So, a for loop. Good question.
So, a for loop. Let me go to IntelliJ.
A for loop is usually for when you have some structured way you want to iterate over some data, usually over a data structure. We're about to introduce ArrayLists and HashMaps, so let me jump to that example. We can use the Python style of loops. We can use what Java calls an enhanced for loop, or a for-each loop, which is the more general term in most languages, where we're creating a variable and saying for String key in some data structure. We can iterate over our data structures like this, but sometimes we want more control over how we're iterating over them.
Specifically, we want to iterate over a data structure, but we need access to the indices. Say I ask you to find the index of the max value in this ArrayList. I have something similar, I forget exactly what it is, on problem set zero, where I'm making you iterate over the indices so you can have access to both the index and the value.
That's where you would use something like this, where this would be like my array size.
So this is going to iterate exactly over the indices of this array that doesn't exist in this program. You've got to use a little imagination here. But this is where I would use a for loop. This is probably the most common reason why we use for loops in programming: to iterate over the indices of a data structure. Or if you want to execute something exactly 100 times, then you would put 100 here.
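The index-based iteration described above might look like the following sketch. `IndexLoopDemo`, `indexOfMax`, and `maxValue` are hypothetical names, this is not the problem set's actual task, and returning -1 for an empty list is an assumption chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;

class IndexLoopDemo {
    // Traditional for loop: the loop variable is the index, so we can
    // remember where the largest value was found.
    // Returns -1 for an empty list (an assumption made for this sketch).
    static int indexOfMax(ArrayList<Integer> values) {
        if (values.isEmpty()) {
            return -1;
        }
        int best = 0;
        for (int i = 0; i < values.size(); i++) {
            if (values.get(i) > values.get(best)) {
                best = i;
            }
        }
        return best;
    }

    // For-each loop: we see each value but not its position.
    // Assumes the list is not empty.
    static int maxValue(ArrayList<Integer> values) {
        int max = values.get(0);
        for (int v : values) {
            if (v > max) {
                max = v;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        ArrayList<Integer> data = new ArrayList<>(Arrays.asList(3, 9, 2, 9));
        System.out.println(indexOfMax(data));
        System.out.println(maxValue(data));
    }
}
```

Because the comparison is strictly greater-than, ties keep the earliest index. The empty-list check is the same zero-iterations edge case discussed earlier in the lecture.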
That's when you have a fixed number of times that you want to iterate over something. A while loop is for when you can't quite fit what you're doing into this structure. Our biggest use case for a while loop is going to be checking while some data structure is not empty.
So later in the semester, we'll have a situation where we have a queue and we want to execute some code. We don't want to iterate over the queue.
We'll talk about it later, but we can't really iterate over a queue, and we want to keep going as long as the queue isn't empty. If there's something left to process, I'm going to keep processing.
So we'll say: as long as the queue is not empty, keep doing the thing.
That's especially true when the data structure is changing. In that application we'll be adding things into the queue in the body of the loop, and this is something that doesn't really fit in the for loop structure, either this traditional style of for loop or the newer Python style, the for-each loop. We can't quite do that when this is our application. We have a queue where we're adding things to it and removing things from it, and we just want to go until it's empty, but we can't just iterate over the initial values of the queue because it's constantly changing. This is where we're going to pull out our while loop. Or, well, the next one is a bad example, since we're going to see it later today and not do it this way.
But if you're reading from a file and you want to read until you reach the end of the file, initially you don't know the size of the file. So you don't know how many times you're going to iterate. You're just going to iterate while there's something left to read in the file. It's kind of a bad example because, the way I'm going to show you to read files, we're just going to use a library method that does it internally for us. But that's another application where I'd use a while loop.
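A sketch of the "keep going while the queue is not empty" pattern mentioned above. The course has not introduced its queue yet, so `java.util.ArrayDeque` is used here as a stand-in, and the processing rule (an even number greater than 1 puts its half back in the queue) is invented purely to make the queue change while the loop runs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;

class QueueLoopDemo {
    // Processes items until the queue is empty. The body may add new
    // items, so the number of passes is not known in advance.
    static ArrayList<Integer> drain(ArrayDeque<Integer> queue) {
        ArrayList<Integer> processed = new ArrayList<>();
        while (!queue.isEmpty()) {
            int n = queue.poll();          // remove from the front
            processed.add(n);
            if (n > 1 && n % 2 == 0) {     // made-up rule: evens enqueue their half
                queue.add(n / 2);          // add to the back
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(8);
        queue.add(3);
        System.out.println(drain(queue));  // [8, 3, 4, 2, 1]
    }
}
```

The queue starts with two items but five get processed, which is why a loop over the initial contents would not be enough here.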
All right, let's get rid of our errors here and head over to some data structures.
All right, ArrayList and HashMap.
But first, let's talk about memory. And if you have more questions, please let me know. First, let's talk about memory a little bit, because we talk about memory a lot. I would assume they talk about this at this level in 115 at some point, but at the very least I want to give a reminder, and if they didn't, I'll explain for the first time what we actually mean when we talk about memory. We talk about stack and heap memory, memory diagrams, these memory traces. We talk about memory all the time. What are we actually talking about? What we're actually talking about is RAM. Whatever computer you're looking at right now, at some point you specced it out and said, "How much memory do I want to pay for, or can I afford?" That's a spec that you probably all know about your machines. It's a very important spec for the performance of your machine. And that's exactly what we're talking about when we're talking about memory: how much of that RAM are we using for our program, and how are we using it? So when you buy your RAM, you get 8 or 16 or 32, however many gigs of RAM you have.
What you have is effectively an array of that size. If you have 16 gigs of RAM, you have an array of size roughly 16 billion. It works out to about 17 billion bytes.
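For reference, the exact byte count for 16 gigabytes can be worked out directly, assuming each gigabyte is counted in binary units (1024 × 1024 × 1024 bytes), which is how RAM sizes are normally quoted. `RamSize` and `bytesInGigabytes` are illustrative names.

```java
class RamSize {
    // Bytes in the given number of gigabytes, using binary units
    // (1 GB = 1024 * 1024 * 1024 bytes). long is needed because the
    // result does not fit in an int.
    static long bytesInGigabytes(long gigabytes) {
        return gigabytes * 1024L * 1024L * 1024L;
    }

    public static void main(String[] args) {
        System.out.println(bytesInGigabytes(16));  // 17179869184
    }
}
```

So a 16 GB machine has a little over 17 billion individually addressable bytes.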
That's because a gigabyte is not strictly a billion bytes, but it's billions of bytes. And it's one giant array that's indexed from zero to about 16 billion. That's what your RAM is. That's what we have to work with. That's our working memory for everything your computer does. It's like scratch space: where your hard drive is long-term storage, RAM is the short-term memory that each program uses to write down anything it needs to write down. That's the values of all of its variables, which is what we put in the stack, and also all the values in your data structures and objects, which will have their own variables where they store their values. So that's what we have. And when we start our program, what we're doing is asking the operating system for a slice of memory from that RAM. When we start our program, we say, "Hey, can I get some memory? I got this program I got to run." And the operating system is going to give us one section of RAM, and that's going to be our stack space.
That's going to be our stack memory. And that's fixed. That's what we get to use for our program, and that's it. You get your stack space when the program starts and you can't get any more of it. You can't give away any of it either, if for whatever reason you ever wanted to do that. This is the memory that we get to use, and we're going to use it as a stack data structure. So every time we use more memory in the stack, we use the next available location in memory. And when we remove things, when a code block ends or a stack frame ends because a method call returns, we erase things from the top of that stack and then reuse that actual space for the next variables, stack frames, and code blocks that we add to the stack. So that's one slice of memory. And when I say operating system, I am abstracting this out to the general case.
We don't actually talk to the operating system in Java. We talk to the JVM, which is an intermediary that talks to the operating system for us. So things get a little obfuscated there.
But in general, in every programming language you ever write in, one way or another you have to get this memory from the operating system. The operating system is what controls memory. It controls your RAM, allocates it to each program that's running on your machine, and is in charge of making sure that no program is using the memory that's been allocated to another program. So right before our stack space, the index before the first byte of our stack space might be used by another program. It might not be, but the operating system will not let us access that byte of memory. We're not allowed to access it. Same thing with the other end of our stack space. If we try to use the byte after what's been allocated to us, that may or may not be used by another program. But the operating system will yell at us and say, "You cannot access that memory.
We're not letting you access that. This is no good. Something's wrong with your program. We're not going to let your program run anymore."
That's what happens if you try to access memory outside of that range. So say you create a ton of variables, you run out of stack space, and you try to use another byte that hasn't been allocated to you. Your program is going to crash, and you're going to get what's called a stack overflow error.
The namesake of what used to be our favorite website. It's kind of fallen out of favor now, but it will give you a stack overflow error, meaning that you overflowed the stack. You used more memory than you were allocated for your stack space, and your program crashes. That isn't something we actively worry about in programming. We get tons of stack space. We get way more than we need for most programs that we're ever going to run. I mean, I've never overflowed the stack under normal operation of a program. You can have lots and lots of stack frames, lots and lots of variables, and be perfectly fine. But you will see stack overflow errors, and I have caused many of them, when your program goes infinite. Usually that's infinite recursion, where you're putting a literal infinite number of stack frames on the stack. It doesn't matter how big your stack is. If you're going infinite, you're going to run out of stack space. You're going to overflow the stack, get your stack overflow error, and the program's going to crash. But don't worry about, "Oh, I have a method call and a method call and a method call. I'm five method calls deep." You're going to be fine. You're never going to have to worry about this unless you go infinite.
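To make the infinite-recursion case concrete, here is a minimal sketch. In Java specifically, the JVM reports an exhausted stack as a `StackOverflowError`. The class and method names below are hypothetical and not from the lecture, and the exact number of frames reached depends on the JVM and its settings.

```java
// Hypothetical demo class; the names here are illustrative only.
class StackOverflowDemo {
    static int depth = 0;

    // No base case: every call pushes another stack frame
    // until the stack space runs out.
    static void recurse() {
        depth++;
        recurse();
    }

    // Runs the runaway recursion and reports whether the stack overflowed.
    static boolean overflowsStack() {
        depth = 0;
        try {
            recurse();
            return false; // not reached in practice: recurse() never returns normally
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("Overflowed: " + overflowsStack());
        System.out.println("Frames pushed before the error: " + depth);
    }
}
```

Catching the error here is only so the demo can print something; a real program with runaway recursion would normally just crash with the error and a long stack trace.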
But sometimes we need... oh, I got ahead of my slides a little bit. This is one block of memory. So, stack space is contiguous, meaning that if we get memory allocated to us, we might get memory addresses 10,000, 10,001, 10,002, 10,003, and so on. We get a consecutive chunk of memory addresses from our RAM. So it's a fixed size and a fixed location. It cannot move for the lifetime of our program. We can't just move our stack somewhere else.
And we can't grow the stack. It's very important that that's the case. And it is a literal stack of stack frames. So it's only the stack frame that's on the top of the stack that's active at any given time. That's the currently executing method. And then when that method returns, it goes back to the previous method call on the stack. That's how the program keeps track of which method is currently running and which method to return to once that method returns. That's how we have the call stack: this method returns, that method returns, that method returns. Which method do we go back to each time? Well, it's just the one that's underneath the one that just returned, whatever one's still in memory underneath it. Visually, in our traces, we go upside down. So our stack is upside down, and the lowest stack frame that hasn't been crossed out is the active stack frame. But conceptually it's the stack frame on top of the stack.
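The return order described above can be sketched with three nested method calls. This is only an illustration; the class and method names are made up for the example.

```java
// Hypothetical demo: each call pushes a frame, each return pops one.
class CallStackDemo {
    static StringBuilder log = new StringBuilder();

    static void first() {
        log.append("enter first; ");
        second();                       // second()'s frame goes on top of first()'s
        log.append("back in first; ");  // resumes here once second() returns
    }

    static void second() {
        log.append("enter second; ");
        third();                        // third()'s frame is now the active one
        log.append("back in second; ");
    }

    static void third() {
        log.append("in third; ");
    }

    // Runs the calls and returns the order in which things happened.
    static String trace() {
        log.setLength(0);
        first();
        return log.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

Each method resumes exactly where it left off because its frame is the one directly underneath the frame that just returned.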
But sometimes we do need more memory. So we have this fixed chunk of memory, which is pretty large in terms of stack frames and variables. But it's not large if we're talking about data structures. So once we have our data structures, which will be array lists and hashmaps in Java (they were lists and dictionaries in Python), we need something more dynamic.
So we can't use that stack structure. We can't put our data structures on the stack, because it's a lot of memory, but it's not a lot compared to what we need to store in our data structures. If we're opening a 100,000-line file, for example, we would overflow our stack real quick and things would just generally get messy.
It's not what the stack's made for. So, we have to go somewhere else, and this is where we're going to go to the heap. The heap is our dynamic memory. This is where we can ask the operating system for more memory after our program starts. We cannot ask for more stack space, but we can always ask for more heap space. So when we ask for heap space, we go to the operating system and we say, "Hey, I need x bytes of memory.
I'm creating a data structure. I need 10,000 bytes of memory." And the operating system is going to look somewhere on our RAM sticks and find a chunk of memory of 10,000 bytes. Did I say 10,000?
I'll stick with 10,000. 10,000 bytes.
However many bytes you asked for. And it'll say, "Okay, I found some space for you. Here is your new heap space. And here's the memory address of the first byte of that heap space that I allocated for you." And this memory can be anywhere. It might not be anywhere even close to our stack space, but it will exist. We will get space. Under normal circumstances, of course. You might run out of RAM and be in swap space and then run out of swap space. Besides, you know, crazy things like that. Well, not crazy. But aside from error cases like that, we're going to get the memory we asked for and we're going to get a reference to that location in memory. And that reference is going to be our road map to where to find that location in memory, so we can always access that location. Because nowhere in the stack do we have some information that says, hey, we have some heap space allocated to us over here.
Instead we have a reference, which will be stored in a variable, which tells us how to access the heap space that was allocated to us. So whenever we go to the heap and ask for this dynamic memory, we're going to have a reference to that location on the heap, which will be a memory address, which will be a hex index to that location in memory. I won't dwell on the hex part of that, but when we have these memory addresses, these references in our memory diagrams, it'll always say 0x and then some numbers. The 0x means that it's a hex value. If you don't want to worry about that in 116, that's fine. You don't have to dwell on it. When we get to 220 and later, then we do have to worry a little bit more about the hex values themselves and what hex is. 116, not so much. But we'll put the 0x's there just so you're aware that there's something different about these numbers, because they are hex. The brief version is that hex is a base-16 numbering system instead of the base 10 that we're used to. So it's the values 0 through 9, the normal digits that we're used to, and then a through f for the values 10 through 15. Then everything works in base 16 instead of base 10. So you have the ones digit, the 16s digit, the 256s digit, and you count like that.
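As a quick illustration of the base-16 idea, the sketch below uses Java's `0x` literal prefix and two standard `Integer` methods. The class name and the `toDecimal` helper are hypothetical; the numbers are just ordinary ints, not real memory addresses.

```java
// Hypothetical demo of reading hex values as base-10 numbers.
class HexDemo {
    // Converts hex digits (written without the 0x prefix) to a base-10 int.
    static int toDecimal(String hexDigits) {
        return Integer.parseInt(hexDigits, 16);
    }

    public static void main(String[] args) {
        int address = 0x1000;                          // hex literal: 1 * 16^3
        System.out.println(address);                   // 4096
        System.out.println(0x1F);                      // 1 * 16 + 15 = 31
        System.out.println(toDecimal("2000"));         // 2 * 16^3 = 8192
        System.out.println(Integer.toHexString(255));  // ff
    }
}
```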
But the important thing for us in 116 is that we only get references to our objects on the heap. An object on the heap is that location in memory that we get when we ask for heap space.
So now with that, we're ready to talk about array lists. And if you took 115 here, you're well aware that your lists in Python, your dictionaries in Python, were always references to objects on the heap for your data structures. Same thing in Java. But that's the breakdown of why it is that way: we can't store our data structures in stack space. We have to ask for heap space and then get a reference to the new location in memory, which could even be on a completely different RAM stick inside your machine than your stack. It could be a very, very different location. We don't know where it is. We also don't care where it is. As long as the operating system gives us the memory we asked for, we don't care where it is. Who cares? Why would you care? But the point is that we don't know or care where that location is, but we do care that we can access it. And that's what the reference is. The reference says, here's how you access that memory that you don't know or care where it is: through this reference.
So that's why when we switch to data structures, we will start talking about the heap and using the heap.
And we'll circle back to that when we get to the memory diagram. For now, let's go over syntax. And there is actually quite a bit of syntax to explain with data structures. We have some new concepts, mostly because we're strongly typed, but some of it is just Java stuff. So let's talk about all of it. It will take a while just to talk about array lists so we can start using these things. So array lists are our typical sequential data structure.
It's kind of like an array, but it's an array list. It implements the List interface if you want to be technical about it. And it is an object, which does add more functionality on top of a plain array. If you're coming from 115, you've never even used a plain array. So you don't even know what it's like, the pain of using an array. It's not that painful. In 220 you'll know all about what that means, where you won't have an array list or list equivalent.
You have to use plain arrays, but that's a topic for another day. Array list is going to be our sequential data structure, indexed from zero. So if I have an array list of size 10, I have indices 0 through 9 that store my 10 values, and I can access those values using those indices.
So that's what we're trying to get. But let's look at all the extra syntax to be able to get there. So first we do have to import the arraylist class. So all of our code in Java is contained within a class. And if we want to use code from a different class, we have to import it.
So a different class means that code lives in a different file. So by default, Java doesn't have any way of knowing what we mean if we just start using ArrayList. It's going to say, "I can look all over your code and I don't see any definition of an ArrayList." So we do have to tell Java, hey, I want to use the ArrayList class from the java.util package. So when I import java.util.ArrayList, what this means is I'm importing a class named ArrayList that lives in a subpackage named util inside the java package.
And then once I have this import statement, I can start using the ArrayList class throughout my code. Now, we've already used some classes and we haven't imported them. I always forget whether I have one in this. Technically, we have one right here. We used the String class. String had a capital S, and we used it and never imported it. There are certain classes, anything in the java.lang package, that are just automatically imported in every single Java program you'll ever write. Oh, I have it right here actually. String. So the String class is auto-imported. This Integer, we haven't talked about this yet, but there's an Integer with a capital I. So that's another class that lives in a different file. Its definition is in a different file. That is auto-imported. All of our primitives, int, double, boolean, are built into the language, so there's nothing to import for those either.
And the System class. System starts with a capital S. This is a class. That's also auto-imported. So we have a handful of classes that are just auto-imported in every Java program.
ArrayList is the first class that we're using that is not auto-imported. So it's the first one we do have to explicitly import. And these days, with modern tooling, the IDE, usually IntelliJ, will usually just add your imports for you, if I'm being honest. It used to be a bigger deal to remember your imports, but if you start typing this line, at some point IntelliJ will usually auto-import this for you. I forget exactly when it does and doesn't. If it doesn't, you've got to add in your imports. Just make sure you have them. If you get errors on your array list and it says "cannot find symbol" for ArrayList, that means that you forgot your import. So, make sure you get your imports in there.
And once we have our import, we can start using the ArrayList class. And we're seeing this keyword new, which is a new keyword to us. The keyword new means we're creating a new object on the heap. We'll crack that open a lot more in next Tuesday's lecture about classes and objects. For now, just know that when we see the keyword new, we're creating a new array list, in this case, on the heap. And internally, the JVM is going to ask the operating system for memory for our array list and go through all the memory stuff. But it is going to give us a location on the heap for the new array list that we're creating.
And then we have two parameter lists: a type parameter list and then the regular parameter list. They're both kind of empty in this case. In this situation the type is Integer. I'll talk about it when we go to the other side, the variable side, but it is Integer here. Since we're assigning it to a variable of type ArrayList of Integer, we can omit Integer. We can also type it here, that's fine, but we can omit it. Java will say, "Hey, look, you only have to tell us that that's Integer one time." But it is technically Integer here. And the second is a parameter list, or really an argument list. So we are calling a method here. We're calling a special method called a constructor whenever we use the keyword new. So new is always followed by a method call to a constructor, which is a special method used to create objects on the heap. And again, that's a topic I'll dive into much, much deeper next Tuesday. For now, just know that this is a method call, so it needs an argument list. And in this case, our argument list is empty. So we're giving the constructor no arguments. We're calling a parameterless constructor.
Much Oh, my site says week four, but it's lecture three. Lecture three. Much much more details will come on that.
Well, we'll crack open all that syntax.
Uh, next Thursday we'll crack open this uh this syntax I'm about to talk about as well with the angle brackets. Uh, we'll know all of this stuff by the end of next week by the 40% mark of the material. It's crazy how fast the summer goes.
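The import and the `new` expression discussed above might look like the sketch below. The slide's exact code is not in the transcript, so the class name, the variable names, and the `makeEmptyList` helper are assumptions made for illustration.

```java
import java.util.ArrayList; // without this line the compiler reports "cannot find symbol"

// Hypothetical demo class; only java.util.ArrayList comes from the lecture.
class ImportDemo {
    static ArrayList<Integer> makeEmptyList() {
        // Empty angle brackets: Java infers Integer from the declared type.
        // Empty parentheses: the constructor is called with no arguments.
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        ArrayList<Integer> a = new ArrayList<>();        // type argument omitted
        ArrayList<Integer> b = new ArrayList<Integer>(); // type argument written out
        String s = "no import needed";                   // String lives in java.lang
        System.out.println(a.size() + b.size());         // 0
        System.out.println(s);
    }
}
```

Both forms create the same kind of empty list; writing the type argument a second time is allowed but not required.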
So whenever we create a variable of type ArrayList, R1 is going to store a reference to an array list on the heap.
And whenever we have a variable of type ArrayList, we don't just say, "Give me an array list. R1 is a variable of type ArrayList." We just don't do that in Java.
Technically you can, but it's a good way to get a bunch of errors in your program. Instead, we say, give me an array list of a specific type. So, Java is not only strongly typed, meaning your variables all need a type, but some variable types need a subtype. So, in this case, an array list, you can't just create an array list.
You can, but you shouldn't. What you need to do is create an array list of a specific type. So this R1 isn't just a variable that can only ever store a reference to an array list. It can only ever store a reference to an array list of type Integer. And that array list can only ever store integers. So we have one type that we can store in our array list.
This array list can only store integers.
We could create an array list of doubles, an array list of strings, an array list of array lists of hashmaps of strings to ints. We can get as crazy as we want with it, but we can only ever store one specific type in this array list. In this case, this array list can only ever store integers. If we try to add a double into this array list, we will get a type error, because we just can't do that. So, we create array lists of specific types. And in this case, I'm creating an array list of integers.
So in the problem set descriptions if I say create a method that takes an array list of integer. This is your syntax.
This is a method that takes an array list of integer. The sum method takes an array list of integers. That means that the inside the angle brackets you'll have the type integer.
This is called a type parameter. And this is technically a type argument that we're giving. This is a type argument list with the argument integer which sets the type parameter of this array list to integer. Again, much more to come next Thursday. We'll actually write a class that takes type parameters, our linked list class. Uh for now, just know that you have to give your array list a type parameter. And that type parameter must be a class type. This must be a type that starts with a capital letter.
It needs to be a class type. It cannot be any of our primitives. As much as you'd like to create an array list of ints, you can't do that. It's not allowed. An array list of lowercase doubles, not allowed. Lowercase-b booleans, not allowed. Lowercase-i int, not allowed. Your type parameter has to be a class type, meaning a type that's defined by a class in Java that has its own file with all of its code and all of its definition. Now, lucky for us, Java does provide class equivalents for all of our primitive types. So instead of int, we can use capital-I Integer. Instead of double, we can use capital-D Double. Instead of boolean, we can use capital-B Boolean.
So we can still, kind of, create array lists of our primitive types by using their class equivalents.
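A short sketch of type arguments in practice follows. The class name, variable names, and sample values are hypothetical; the two commented-out lines show the kinds of statements the compiler rejects.

```java
import java.util.ArrayList;

// Hypothetical demo of class types used as type arguments.
class TypeParameterDemo {
    static ArrayList<Double> makePrices() {
        ArrayList<Double> prices = new ArrayList<>(); // capital-D Double, not double
        prices.add(2.5);
        prices.add(4.0);
        return prices;
    }

    public static void main(String[] args) {
        ArrayList<Integer> counts = new ArrayList<>();
        counts.add(3);
        // counts.add(2.5);                        // does not compile: 2.5 is not an Integer
        // ArrayList<int> bad = new ArrayList<>(); // does not compile: int is a primitive

        ArrayList<String> names = new ArrayList<>();
        names.add("Ada");

        // Type arguments can nest: a list whose elements are lists of Integer.
        ArrayList<ArrayList<Integer>> grid = new ArrayList<>();
        grid.add(counts);

        System.out.println(counts + " " + makePrices() + " " + names + " " + grid);
    }
}
```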
And Java is usually pretty good about converting between these. For example, in the sum method, I have an int out, and to out, of type int, I'm going to add a value from the data structure of type Integer. So I'm going to add an int and an Integer. That's perfectly fine. Java just says, "Oh, I know what you're doing here. Let me auto-convert your Integer type into an int type and then perform the addition." That direction is called unboxing; going the other way, from int to Integer, is called autoboxing. So in most cases, it's going to do that. Here I have ints. X is an int. I'm taking 10 minus x. That's going to be an int, and I'm adding it to an array list of type Integer. Java is going to say, "Oh, let me grab your int primitive and convert it into an Integer type." So, it's going to convert between them most of the time, except in certain situations with return types and parameter types. Specifically, when you submit to AutoLab, there will be times where the problem set asks you to write a method that returns an int and instead you return an Integer, or it takes a double and you have it take a capital-D Double instead of a lowercase double, and that will cause issues in AutoLab. It's something you've got to watch out for. The rule of thumb in 116 is that we will always, always use the primitive types unless it is a type parameter. That's the only time where we have to use the class types. In every other situation ever, we're always going to use the primitive types. So if you have a method that takes a double, it's got to be a lowercase double every single time. As long as you're always doing that, you don't really have to overthink this. You don't have to double-check the wording of a problem set and say, okay, is that a lowercase d or a capital D? It's always going to be lowercase unless it's a type parameter.
So unless we have to use the class type, we will always prefer and always use the primitive types.
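The conversions and the rule of thumb can be sketched as follows. The `total` method and all names here are made up for the example and are not one of the problem set methods; the point is only where the lowercase and capitalized types appear.

```java
import java.util.ArrayList;

// Hypothetical demo of boxing, unboxing, and primitive-typed signatures.
class BoxingDemo {
    // Return type is lowercase double; capital-D Double appears only
    // inside the angle brackets, as the type argument.
    static double total(ArrayList<Double> prices) {
        double out = 0.0;
        for (double price : prices) { // each Double is unboxed to a double
            out += price;
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<>();
        int x = 3;
        values.add(10 - x);      // the int 7 is boxed into an Integer
        int out = 0;
        out += values.get(0);    // the Integer is unboxed back to an int
        System.out.println(out); // 7

        ArrayList<Double> prices = new ArrayList<>();
        prices.add(1.5);
        prices.add(2.5);
        System.out.println(total(prices)); // 4.0
    }
}
```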
I dwell on this quite a bit because every semester it's like clockwork. It will happen. Some of you, probably somebody who's not watching lecture, let's be honest, is going to use the capital types. And then the code won't compile in AutoLab in that case, and the error message in AutoLab will say, make sure you check your types, for example, lowercase double does not equal capital Double. I have the hint right there. And then somebody will still lose credit on a problem set for having that. Don't be that person. Just always use the primitive types unless it's a type parameter and you'll never have that issue in AutoLab.
Okay. So, anybody listening, anybody in lecture right now or watching the VOD will be fine. You're not going to have that issue. You're never going to run into that.
Uh the people who are like, I'm not going to watch lecture. I got this. I'm just going to do the problem sets.
They're the ones who will typically have that issue. Uh that said, there are always some people, you know, it's just one part of a two and a half hour lecture. Some people, yeah, you miss it.
Come to office hours, I'll get you straightened out. PostP will get you straightened out. It's usually something where I'll just be like, hey, you've got capital-I Integer instead of int in there. I'll just point it out so you can get through it. But just know that that's a thing. You can save time, save a post, save an office hours visit by heeding my words right now. And then for actually using the array list, I don't have too much to talk about, incidentally. To insert into an array list, it's the add method. To get a value at a specific index, it's the get method. If you want to ask for its size, it's the size method. And the big thing is, notice that these are all methods.
So everything that we do with an object in Java, so anything that's on the heap, is going to be a method call.
Meaning that we can't just use the index notation that we used in Python, where you have your square brackets. You can't use that notation in Java for array lists. You have to use methods. So it's the add method. This will insert this value into the next available index. So when the array list is empty, it's going to add to index zero. When the array list has one value, it's going to add into index one, etc. It's just going to add to the end of the array list. Get is going to take an int and it's going to return the value at that index. And size is going to tell you how many values are in the array list. And this gets to the question of when I would use each loop. We got to this example here. Here I want to iterate over the indices of an array list. I could instead say for int val in the array list, and that'll do the same thing. It'll iterate over all the values of this array list. But if I need to get the index, I have to use this style of loop, which is the big reason why I show this style of loop. Sometimes you have to use this. You can't just iterate over your data structure's values. Here I'm iterating over the indices, effectively. I'm starting at zero. I'm going while strictly less than the size, so it's going to stop at size minus one, which is the last index. The indices go from 0 to size minus one, and I increment by one each time. So I'm going to get every single value in this array list. And I'm effectively summing all of the values.
I'm going to initialize out at zero, add to it the value at every index in this array list, and then return the total sum.
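The sum method described above is not reproduced in the transcript, so the following is a reconstruction under assumptions: the parameter name `values`, the class name, and the second method `sumForEach` are hypothetical. It shows the index-based loop next to the loop that iterates over values directly.

```java
import java.util.ArrayList;

// Hypothetical reconstruction of the sum example with both loop styles.
class SumDemo {
    // Index-based loop: x takes the values 0 .. size() - 1.
    static int sum(ArrayList<Integer> values) {
        int out = 0;
        for (int x = 0; x < values.size(); x++) {
            out += values.get(x); // value at index x, unboxed to an int
        }
        return out;
    }

    // Same result when the index itself is not needed.
    static int sumForEach(ArrayList<Integer> values) {
        int out = 0;
        for (int val : values) {
            out += val;
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<>();
        values.add(10);
        values.add(9);
        values.add(8);
        values.add(7);
        System.out.println(sum(values));        // 34
        System.out.println(sumForEach(values)); // 34
    }
}
```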
"Can you use get with an index in it?"
Yeah. So that is what I'm doing here. I'm storing the indices in x. So when x is zero, I'm going to do get of zero, and that's going to return, excuse me, the value at index zero of this array list. When x is one, I'm going to get the value at index one, etc. And you can hardcode it. So if you know you want to get the value at index two, you can, outside of a loop, without any variables, do get of two, and that's going to give you the value at index two.
Assuming that's what you mean. If that's not what you mean, let me know.
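A small sketch of the three methods, including a hardcoded index, is below. The class name, helper, and sample strings are made up for the example.

```java
import java.util.ArrayList;

// Hypothetical demo of add, get, and size.
class AccessDemo {
    static ArrayList<String> makeList() {
        ArrayList<String> letters = new ArrayList<>();
        letters.add("a"); // goes to index 0
        letters.add("b"); // goes to index 1
        letters.add("c"); // goes to index 2
        return letters;
    }

    public static void main(String[] args) {
        ArrayList<String> letters = makeList();
        System.out.println(letters.size()); // 3
        System.out.println(letters.get(2)); // c
        // System.out.println(letters[2]);  // does not compile: no square brackets on an ArrayList
    }
}
```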
Okay, that's that's a I yeah let's go to the memory diagram. Uh sometimes I like to go through all the slides and then hit the memory diagrams.
Let's do them one at a time today before we talk about hashmaps.
ArrayList example.
Let's Oops. I gotta bend this first.
Hide toolbar. Don't forget full screen this time.
I'm only printing one thing, two, well, three things to the screen. I don't need too much room here.
Yeah, it looks like I'll be able to fit everything on the screen. Once we get to um classes, objects, and inheritance specifically, uh the memory diagrams get so big I won't be able to fit them on the screen anymore. So, I'm always happy with these earlier ones where I can get everything on the screen in one go. It's a lot easier for you visually to see the whole thing without me having to scroll up and down and and all that. Uh, so every Java program always starts with the main method.
And then our first thing: R1 gets a reference to a new array list. So, we see the keyword new.
We're going to create an array list.
So this is... oh, let me slow down a little bit. For some of you, this might be the first object creation you've seen, if you didn't take 115 at UB. So, whenever I have the keyword new, this new keyword is your very strong hint. It's basically your cheat sheet to say we are creating an object on the heap. If you see the keyword new, you're clicking this button. If you don't see the keyword new, you're not clicking that button.
That's one of your rules when you're going through through memory diagrams.
If you remember that rule, there's a lot of mistakes that you just will never make because you you just remember that rule. Keyword new.
I'm creating a new object. When I create a new object, I have, you know, a lot of options here. The first thing that'll be uh immediately highlighted is the type of the object. I'm creating an array list.
So, I'm going to make the type arraylist. And with arraylists and hashmaps, we're not going to worry about this constructor uh just yet. Uh we'll do that next Tuesday. We'll start writing our own constructors and we'll map constructors and constructor stack frames. It's a whole thing. We'll get to it next week.
And then we don't have anything in this array list right away, but the names will be the indices and the values will be the value at each index when we start adding things to this array list. Right now, it's initially empty. And we're going to store a reference to the array list in the variable R1. So when I have R1, this variable of type ArrayList of Integer doesn't actually store an array list of integers. That would seem obvious, like, oh, it's a type of ArrayList of Integer, it's got to store an array list of integers. But Java doesn't work that way. A lot of languages work like Java here. Python did the same thing. I don't know if there's a language that doesn't do this. I think every language does this.
But R1 doesn't store an array list of integers. It stores a reference to an array list of integers, because the array list is somewhere on the heap. We don't know where that array list is. It's wherever the operating system decided to grab some memory and allocate it for us to be able to store this array list. We don't know where it is. Well, we kind of do. Right here in the corner, what I've got is this 0x value.
This is going to represent the memory address of where the array list lives in memory. And these values could be anything. It's wherever the operating system decided to allocate memory for us. This is going to be the location of that memory. In the tracing tool, each object will be 0x1000, then 0x2000, 0x3000. But these values can be anything. It's whatever the operating system does. Instead of generating arbitrary values like an operating system would (not literally random), we just do 0x2000, 0x3000. Like I said before, the 0x means that this is a hex value. So technically this could have letters in it. You could have something like this. That'd be perfectly fine. But sticking with the defaults that the tracing tool gives you is perfectly fine as well. 0x1000 is our location. And that location in memory is what's stored in our variable. So, R1 stores 0x1000, the location in memory where this array list lives in the heap.
The only thing that the variable R1 stores is this 0x1000 memory address of that location. Where in memory, where in RAM, does this array list live? That memory location is what we call a reference. So R1 stores a reference to an array list, which is your road map of how to find that array list in memory, how to locate this array list.
And then we have a loop. It's a for loop. So our initialization statement is going to be executed inside the code block of this loop. So x= 0 is inside of a code block for that loop.
Check the condition. X less than 4. X is zero. That's true. So we're going to execute the body of the loop. R1 dot.
This little guy right here, this little dot, the dot operator, is our way of accessing an object on the heap, because R1 is kind of useless to us. R1 stores a reference to an array list. So R1's value, being a reference, is meaningless. I don't care about the reference. I have never, ever in my career cared about the value of a reference. I really don't care where in memory this array list lives. Nowhere while you're programming will you have a situation where you're like, I would really like to know where this array list is in memory. You don't care. We make up some values for the memory diagram just to make it clear that it is somewhere in memory and give it, you know, just some value, 0x1000 in this case. But we don't actually care about that when we're programming, or when we're doing our memory diagrams for that matter. What we do care about is that these values match. But I don't really care what this value is. So I'm just going to leave the default 0x1000. It's what I think pretty much everybody does in the memory diagrams. You can change it if you want, but there's no reason to. But what I am deeply interested in is the object that that reference refers to. I want to know, hey R1, take me to the object to which you refer. That's what I really want to know. So R1 resolves to 0x1000 and dot says take me to that location.
Take me over to 0x1000. Take me to that memory address. Take me to that object that's located there. And then I want to muck around there. I want to do things there. So R1 dot means take me to your object. So when we call the add method, we're no longer on the stack. We've followed this reference over to this array list. And now we're at this array list on the heap. And now we're going to call the add method at this array list, or by this array list. This array list is calling the add method because that's where we are. That's where the dot operator took us: over to 0x1000.
So now we call the add method on this array list and we're giving it the argument 10 - x. This is a new expression 10 - x. So we start over at the stack frame.
We look for x. x is 0. 10 - x is 10.
So the add method at the array list gets the argument 10, which means we're adding 10 to this array list on the heap.
So let's oops add a variable at index zero. We're going to get the value 10.
Now, just looking at this code, even if you're brand new to Java, you could probably deduce that this is going to be the final outcome: you're going to add 10 to index zero in this array list for the first iteration of this loop. But knowing all of those mechanics is going to be critically important as we go forward and start creating our own classes that create our own objects and use inheritance and polymorphism.
Understanding all of those little mechanics is going to be critically important, especially how the references work and the dot operator. It's why I explained that quite a bit right here.
Um, knowing exactly what this dot operator does, not just kind of getting a half feel for what it does, which is fine in like 115 in your CS1 course is perfectly fine. Um, but we have to start understanding a little deeper what's going on with that dot operator.
So then we're going to increment x. We can kind of rip through the rest of this loop now that I explained it all at once. One less than four is true. So we're going to go R1.add.
So we're adding to this array list the value 10 minus 1. So 9. We're going to add at the next available index, which is 1, the value 9.
End of the loop. So we're going to increment x to two.
Two less than four is true. So we're going to execute again. R1.add 10 - 2, which is 8. So index 2, value 8.
End of the loop, increment x to three. 3 less than four is true. So we're going to call R1.add again, adding at index 3.
We're going to add the value 10 - x. X is currently three, so seven.
End of the loop, we're going to get x = 4 from the increment operator. Four less than four is false. So the loop ends and we move on to the next line of the program. So we never add six into this array list. We never execute the body of the loop when x is four. Once four breaks the condition, we get on out of there. We get out of that loop.
The loop ends, so x is no longer in memory. If we try to access x, we're going to get a "cannot find symbol" error because x does not exist in memory anymore.
Then we're going to print the array list. For what it's worth, I will never harp on the formatting like this. But when you print an array list, you're going to get comma-separated values in square brackets. So this will be 10 at index 0, comma 9, comma 8, comma 7. This is how your array list will print when you print it to the screen.
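The loop traced above can be sketched as a small runnable program. This is a reconstruction, not the slide code: the class name `ListFillTrace` and the helper `buildList` are made up for illustration, and the variable the transcript calls R1 is spelled `arr1` here as an assumption.

```java
import java.util.ArrayList;

class ListFillTrace {
    // Builds the list the way the traced loop does: 10 - x for x = 0, 1, 2, 3.
    static ArrayList<Integer> buildList() {
        ArrayList<Integer> arr1 = new ArrayList<>(); // "new" creates one object on the heap
        for (int x = 0; x < 4; x++) {
            // "arr1." follows the reference to the heap object; add is called there
            arr1.add(10 - x);
        }
        // x is gone once the loop's code block ends; uncommenting the next
        // line makes the program fail to compile (cannot find symbol)
        // System.out.println(x);
        return arr1;
    }

    public static void main(String[] args) {
        System.out.println(buildList()); // prints [10, 9, 8, 7]
    }
}
```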
I'm going to copy that into my clipboard for what's about to come. Then we create another variable R2.
So I'm going to do R2 in the main stack frame and it's going to be assigned the value R1.
The value of R1 is 0x1000. Whenever you have an assignment like this, you're assigning the value of a variable to another variable. You take the value of the variable and you just assign it. If these were ints and I said x = 7, then y = x, I'd have x = 7 and y = 7. Like, that's it. Same exact thing. That's all we do is assign the value of R1 to the variable R2.
Period. That's it. That's all we do. And the value is 0x1000. Done. We do not create a copy of the array list. We don't create a new object. We don't copy all the values over. Nothing. We're just copying the reference. This is an assignment by reference. We're just taking the reference here and assigning that reference here.
So we have two variables that refer to the same object on the heap, which is a critically important thing to understand how these references move around our code and how we can have multiple variables referring to the same object.
Very very important to to know. We'll see that a couple times today.
And then we're going to print R2. R2 refers to 0x1000, which is the same exact array list, and it hasn't changed.
So we're just printing the same exact thing again.
They refer to the same object. They refer to the same array list. Two variables referring to the same object.
That's what we have here.
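A minimal sketch of two variables referring to one object, assuming the names `arr1` and `arr2` for what the transcript calls R1 and R2. The class and method names are hypothetical. Comparing references with `==` checks whether both variables hold the same reference, which is exactly the "these values match" idea from the memory diagram.

```java
import java.util.ArrayList;

class AliasDemo {
    // Returns true when arr2 refers to the very same object as arr1
    // and a change made through arr2 is visible through arr1.
    static boolean sameObject() {
        ArrayList<Integer> arr1 = new ArrayList<>();
        arr1.add(10);
        ArrayList<Integer> arr2 = arr1; // copies the reference only, no new object
        arr2.add(9);                    // changes the one shared list on the heap
        return arr1 == arr2 && arr1.size() == 2;
    }

    public static void main(String[] args) {
        System.out.println(sameObject()); // prints true
    }
}
```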
Okay. Then we're going to hit the sum method.
Oh. Uh, and so earlier we saw the keyword new and I said that's your cheat sheet. That's your cheat code. New means you're pushing this button. If you don't see new, you're not pushing that button.
So when you get to this line right here, some students will create a new object, copy all the values, and put 0x2000 in R2. That's a huge mistake. You clearly don't understand references, and you've got to take the final exam if you do that on the midterm,
because you didn't complete the learning objective, assuming you do it on both traces. But when I look at this line, I don't see the keyword new. I don't see new R1 there or anything, or new ArrayList. If I don't see the keyword new, I am not clicking this button. Period.
That's it. If I don't see the keyword new, I'm not clicking that button. So, if I'm not clicking that button, well, what do I do? R2 is going to have to store a reference to an array list. I only have one array list. You know, even if you don't remember that R1's value just gets copied here, there's only one array list in this example. It's not something you can mess up if you know that the keyword new is the only time you're creating a new object on the heap.
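To make the "no new, no new object" rule concrete, here is a hedged sketch contrasting a plain assignment with an explicit copy. The copy line uses `new ArrayList<>(arr1)`, the standard copy constructor, which is not part of the lecture's trace and is shown only for contrast. All names are illustrative.

```java
import java.util.ArrayList;

class NewVersusAssign {
    // Returns {alias is same object, copy is same object, copy size after a later add}.
    static Object[] compare() {
        ArrayList<Integer> arr1 = new ArrayList<>();
        arr1.add(10);
        ArrayList<Integer> alias = arr1;                 // no "new": still one object
        ArrayList<Integer> copy = new ArrayList<>(arr1); // "new": a second object on the heap
        arr1.add(9);                                     // only the original object changes
        return new Object[] { alias == arr1, copy == arr1, copy.size() };
    }

    public static void main(String[] args) {
        Object[] result = compare();
        System.out.println(result[0]); // prints true
        System.out.println(result[1]); // prints false
        System.out.println(result[2]); // prints 1
    }
}
```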
Okay. Then we get to the sum method. We have a new method call. So we have a new stack frame for the sum method. Sum has a parameter RN. So our first thing when we create a stack frame is always to get the parameters in that stack frame. So RN equals R1.
The value of R1 is 0x1000. So we have ArrayList of Integer RN equals R1. This is the same exact thing that we did on this line. We have a variable being assigned the value of another variable. Same exact thing. The value of R1 is assigned to RN, and that's just 0x1000.
So we have a third variable in our program that refers to the same object.
So we have three variables referring all to the same array list in memory.
And let's uh let's go through this code here. So we have out equals 0.
out equals 0. And we have a for loop. So we need a code block. X is declared inside of that code block. X less than RN.size.
RN is 0x1000. Dot means let's head over to the heap and check out that object. And then size. So we're calling the size method on this array list, which is going to return the size of the array list, or the number of elements in it, which is four. So zero less than four is the boolean expression, and that's true. So we're going to execute the body of the loop. So RN.get.
So we're calling the get method on 0x1000.
X is zero. So we're returning the value at index zero, which is 10. So we're adding 10 to out and then reassigning out to that sum. So out, 0 + 10, it's going to be 10. Cross out the zero, give it 10. We reach the end of the loop, which means we're going to execute the increment statement. X is 1.
Check the condition again. Size is still four. One less than four is true. So, we're going to do it again.
RN.get 1 is going to give us 9. So, we have 10 + 9, 19.
That's the end of the body. So, we're going to increment x again.
Check the condition.
Two less than four is true. So, we're going to increment out by RN.get of x.
Again, RN.get.
So, we're calling get on this array list again. X is now two. So we're getting the value at index 2, which is 8. And we have 19 + 8 is 27.
Hit the end of the loop again.
x++.
3 less than 4 is still true. So we're going to execute again.
RN.get of x, which is three, adding seven. So we have 34.
Reach the end of the loop. Execute the increment statement again. We have four.
Four less than four is false. So we're going to break out of this loop. The code block ends for the loop. Then we're going to return out which is 34.
The method call ends. Oops, I always forget my return arrows.
So we need a new variable in the main stack frame total.
That's where we're returning to. So we return out follow the return arrow to total which is 34. And then the stack frame ends once it returns.
Then we're going to print to the screen.
34, which is the end of the main stack frame, end of the program.
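Putting the whole trace together, a sketch of the program might look like the following. It is reconstructed from the narration rather than copied from the slides, so the names `arr1`, `arr2`, `arrIn`, `out`, and `total` and the class name are assumptions based on how the transcript pronounces them.

```java
import java.util.ArrayList;

class SumTrace {
    // A new stack frame is created for each call; arrIn receives a copy of the reference.
    static int sum(ArrayList<Integer> arrIn) {
        int out = 0;
        for (int x = 0; x < arrIn.size(); x++) {
            out += arrIn.get(x); // follow the reference, call get on the heap object
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<Integer> arr1 = new ArrayList<>();
        for (int x = 0; x < 4; x++) {
            arr1.add(10 - x);
        }
        System.out.println(arr1);          // prints [10, 9, 8, 7]
        ArrayList<Integer> arr2 = arr1;    // second variable, same object
        System.out.println(arr2);          // prints [10, 9, 8, 7]
        int total = sum(arr1);             // third variable (arrIn), same object
        System.out.println(total);         // prints 34
    }
}
```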
So, there's quite a bit going on there.
We're combining just about everything. I guess we don't have an if statement here, but we're combining pretty much everything else that we've learned: loops, code blocks, method calls, array lists, objects on the heap, assign by reference, and, I forgot to call this by name, but this is call by reference. So we're taking a reference, RN, as a parameter in a method. So we're calling this by reference. We're giving it the reference R1 into RN, and then it's going over to the same object that we have access to in the main method and using that object. If the sum method were to make a change to this array list, that change is going to be seen by both R1 and R2, because it's making a change in the heap to that same object that they're all referring to.
Which we will see throughout the semester. We'll do that quite a bit.
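The point about a method's change being visible through every variable can be sketched like this. The method `appendZero` is hypothetical and is not something the lecture's sum method does; it exists only to show a change made through a parameter.

```java
import java.util.ArrayList;

class SharedMutation {
    // Changes the object that arrIn refers to; no new list is created.
    static void appendZero(ArrayList<Integer> arrIn) {
        arrIn.add(0);
    }

    // Returns the size seen through arr2 after the method changed the list via arr1.
    static int sizeSeenByAlias() {
        ArrayList<Integer> arr1 = new ArrayList<>();
        arr1.add(10);
        ArrayList<Integer> arr2 = arr1; // same object as arr1
        appendZero(arr1);               // parameter refers to that same object too
        return arr2.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenByAlias()); // prints 2
    }
}
```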
So any questions on this trace? Array lists, for loops, code blocks, method calls, assign by reference, call by reference.
If this is all new to you, you should be feeling how fast the summer goes at this point. We learned quite a bit and combined it all.
And if you are feeling good, a lot of you might be feeling good about this. Uh try to go through a trace on your own too to make sure you really understand.
Tracing is one of those things that I think it's a lot easier to follow.
Coding also, it's a lot easier to follow in a lecture than to do on your own. Uh which is why I emphasize so many points so many times because I want to hammer it into your head so then when you go down to do your own trace that you know exactly what to do.
Um, but I'll keep watching for questions.
Meanwhile, I'm going to head over to hashmaps. And how we doing on time? We got hour left. We're doing good. I think uh I think I can do files in less than 50 minutes. I don't think there's a a ton there.
Okay. So, our hashmap is our key value store.
This is the same as your dictionaries in Python. We used to do Python and JavaScript in 115, so this won't really apply to many of you, but it's objects in JavaScript. Our array lists were arrays in JavaScript. They call them arrays over there. And for what it's worth, hashmap is our map in C++ if you're coming from C++.
Um, if you're coming from a different language, let me know. I'll do the comparisons. But I think most of y'all will be coming from Python or Java. If you're coming right out of AP in high school, you might just be coming from Java and this might be pretty boring to you except for the memory diagrams. You wouldn't have seen that.
Okay. So hashmap is our key value store, meaning we're not indexing by indices.
That sounds weird, but we're not indexing from zero to size minus one anymore. We're going to store values at keys and then use those keys to look up values. This is our first nonsequential data structure, meaning that order doesn't matter. The hashmap will not preserve the order in which we add things to it. It just doesn't do that. So we're going to get our values in a seemingly random order in our hashmaps, which is important to know but comes up less often than you think, as long as you're careful about it.
So just like our array list, we do have to import HashMap. It lives in the same package, java.util. So import java.util.HashMap. If we don't import, then, well, we're not going to be able to use a hashmap. So if you get type not found or symbol not found for a hashmap, it probably means you forgot your import. It's not a big deal. You just slap your import in there, then you're good to go. And we'll see lots of import statements throughout the rest of the semester.
And the hashmap, a lot of it's the same as array list that we just went through.
So, the keyword new to create a new hashmap, and then our empty type arguments and empty argument list.
And the big difference here is that we have two type parameters. So with a hashmap, we don't just have a hashmap of some type. We have a hashmap which specifies the types of the keys and the types of the values. So the keys and values can be different types. In this example, I have a hashmap of strings to integers. So I'm mapping strings as keys to integers as the values. So whenever I add something to this hashmap, I have to add a string mapping to an integer, and they're comma-separated.
So when I add things to a hashmap, it's not the add method. I'm sure there's a good reason for this, but it's not add. It's put, where we put key value pairs into a hashmap. We add values to an array list. And I'm not exactly sure why it's different. I'm sure there's a good reason somewhere, but it's put to add a key value pair into a hashmap. So we're going to put key value pairs into a hashmap, and then it's get with some key, and that's going to give us the value that that key maps to.
So a little bit different syntax, and again it's all method calls. Everything we do with our objects in Java will be method calls, where we had the bracket notation in Python. You would do bills bracket Allen equals 17 in Python.
We don't get that syntax in Java. It's all method calls.
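As a small sketch of the syntax just described: the import, the two type parameters, `put`, and `get`. The variable name `bills` and the key "Allen" with value 17 come from the lecture; the second key is spelled "Kincaid" here as a best reading of the transcript, and the class name is made up.

```java
import java.util.HashMap;

class BillsMap {
    // Creates the map from the example: String keys mapping to Integer values.
    static HashMap<String, Integer> build() {
        HashMap<String, Integer> bills = new HashMap<>(); // "new" creates the object on the heap
        bills.put("Allen", 17);   // put, not add: a key value pair
        bills.put("Kincaid", 86);
        return bills;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> bills = build();
        System.out.println(bills.get("Allen")); // prints 17
        System.out.println(bills.size());       // prints 2
    }
}
```

Python's `bills["Allen"] = 17` corresponds to `bills.put("Allen", 17)` here, and `bills["Allen"]` corresponds to `bills.get("Allen")`.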
And then we have a lot of different ways to iterate over a hashmap. Uh the uh this is where we're going to use our third style of loop. This is the one if you're coming from 115. I feel like I say that way too much, but if you're coming from 115, uh this is your your typical loop. This is the loop used most often in Python. This is what we call a for each loop. In Java terms, they call it an enhanced for loop. Um, but the general term, the language independent term is a for each loop. U, so we're going to iterate over all the values of a data structure. And we can use this same loop for arraylist. If you want to use this loop for your array list, as long as you don't need the indices, that's fine. I show you the one with iterating over the indices because we will need that in our problem sets and you'll need that in programming in general. You'll need the indices at some point uh for some applications that you do. But we can use this for loop which is for declare some variable. This is like creating your int x equals zero.
Same thing. We're going to declare a variable. Then we have a colon, which is read as in. In Python, it's literally the characters i-n, for in. In Java, and in a lot of other languages, it's the colon symbol, but it's still read as in. And then any data structure, specifically anything that can be iterated over in Java, anything that implements the Iterable interface technically, but any of the data structures that we'll use, we can iterate over them. And what this will do is store the first value of this data structure in the variable key, execute the body of the loop, then go to the next value in this data structure, store it in the variable key, execute the loop, and keep doing that until you iterate over every single value in this data structure.
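For comparison, here is a sketch of the two loop styles over the same array list: the index-based loop and the for each loop. The class and method names are illustrative only. Both produce the same total; the for each version just has no index to work with.

```java
import java.util.ArrayList;

class LoopStyles {
    // Traditional for loop: iterate over the indices 0 to size - 1.
    static int sumByIndex(ArrayList<Integer> values) {
        int out = 0;
        for (int i = 0; i < values.size(); i++) {
            out += values.get(i);
        }
        return out;
    }

    // For each loop: read as "for each value in values".
    static int sumForEach(ArrayList<Integer> values) {
        int out = 0;
        for (int value : values) {
            out += value;
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<Integer> values = new ArrayList<>();
        values.add(10);
        values.add(9);
        System.out.println(sumByIndex(values)); // prints 19
        System.out.println(sumForEach(values)); // prints 19
    }
}
```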
So that's what we're doing here. And there are two different things that we can access, two different data structures we can access for the hashmap. we can iterate over all the keys or iterate over all the values. So if we get the key set, so we're asking this this hashmap, hey, give me a data structure with only your keys in it. We can iterate over those keys, which are strings for this hashmap. Or we can say, hey hashmap, give me a data structure containing only your values, and then iterate over all those values, which would be integers in this case.
And for my money, if I need both, what I like to do is iterate over the key set, and then on the first line of my loop, I'm going to grab the value at that key. So now I have access to both the key and the value, which is something we'll often need when you're programming. You often want the keys and the values. So this is a good, simple way to do that.
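The key set pattern described above can be sketched as follows, alongside a loop over `values()`. Method and class names are hypothetical. No expected order is shown for the printed lines because a HashMap does not promise one.

```java
import java.util.ArrayList;
import java.util.HashMap;

class KeySetLoop {
    // Iterate over the keys, then grab the value at each key on the first line of the loop.
    static ArrayList<String> describe(HashMap<String, Integer> bills) {
        ArrayList<String> lines = new ArrayList<>();
        for (String key : bills.keySet()) {
            int value = bills.get(key);
            lines.add(key + "'s number is: " + value);
        }
        return lines;
    }

    // Iterate over only the values; the keys are not available in this loop.
    static int sumValues(HashMap<String, Integer> bills) {
        int total = 0;
        for (int value : bills.values()) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> bills = new HashMap<>();
        bills.put("Allen", 17);
        bills.put("Kincaid", 86);
        for (String line : describe(bills)) {
            System.out.println(line); // one line per key, in whatever order the map yields
        }
        System.out.println(sumValues(bills)); // prints 103
    }
}
```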
There is something called an entry set where you can iterate over both the keys and the values simultaneously with the for loop syntax. Uh but that's just more syntax you have to remember and for whatever reason my brain just can't remember that syntax. So I just remember key set iterate over the key set and then grab the values. Simpler. There's enough things we have to learn and memorize and just know in this field. If I can take one less thing that I have to remember that isn't really helpful, I'm gonna do that. If you want to look up entry set, learn how to use it, and I want to emphasize, actually learn how to use it. If I have a meeting with you and I say, explain what this entry set is and how the syntax works, and you're like, I don't know, the internet just told me to do it. You know, that's that's bad. But, uh, if you want to study it and actually learn how it works and then use it, knock yourself out.
Have fun. Uh for me, I just like to keep things simple whenever I can. So I'm going to iterate over key set and grab the values. I recommend you do the same thing, too. You have enough you're learning this semester. You don't have to add one more thing to it.
Unless you want to. Unless you're like, you know what, this course isn't going fast enough for me, I want more. Then yeah, look up entry set. That describes very few students. But for those of you whom that does describe, I don't know, we should chat sometime. We should talk about computer science. You'll have a lot of fun in this department at UB.
Okay, so key set is going to iterate over the keys, values is going to iterate over the values, and like I said, the hashmap is not going to preserve order for you. So these loops might not iterate over these key value pairs in the order that I add them. And in fact, in this example, it will not. It will go Coleman then Allen. It happens to be reverse order. But if you add a lot of key value pairs to a hashmap, you'll iterate over them and get them in seemingly random order. The order is actually determined by the hash values of the keys, but there's no consistent, good way for us to compute those. You could, if you really dug deep, figure out the hash function for strings in Java, and we're not going down that road in 116. So when you're doing your memory diagrams, and when we do the memory diagram here, you can iterate over them in any order you'd like. There's no reasonable expectation that you'll know the hash functions for the keys and go over them in the proper order.
So when I do the memory diagram for this, I'm going to iterate over them with Allen first. But if you actually run this in Java, you will get them in the reverse order. So my memory diagram in this case won't 100% match what will happen when you run the code. And that's fine. If we ever have you iterating over hashmaps in a memory diagram, that's perfectly fine.
You don't have to match 100% what will actually happen. And the takeaway when you're programming is that your loops must not depend on the order. If you're writing loops, looping over a hashmap, and you're relying on a specific order, the order in which you inserted the key value pairs, you're going to have bugs in that code because you're not getting that order. So, always something to be aware of. The order does not matter in our key value stores, specifically HashMap in Java. This is one where the order will not be preserved. If you do need that behavior, there are other ways to do that in programming. We won't ever need that in 116. But there is, I think, LinkedHashMap in Java, which does preserve insertion order (TreeMap keeps its keys sorted rather than in insertion order). But we're not going to need that in 116.
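For anyone curious about the "other ways" mentioned, one standard option is `java.util.LinkedHashMap`, which iterates in insertion order. This goes beyond what the course requires, and the class name and helper below are illustrative only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;

class InsertionOrderDemo {
    // Collects the keys in the order the map iterates over them.
    static ArrayList<String> keysInOrder() {
        LinkedHashMap<String, Integer> bills = new LinkedHashMap<>();
        bills.put("Allen", 17);
        bills.put("Kincaid", 86);
        ArrayList<String> keys = new ArrayList<>();
        for (String key : bills.keySet()) {
            keys.add(key); // LinkedHashMap yields keys in insertion order
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(keysInOrder()); // prints [Allen, Kincaid]
    }
}
```

With a plain HashMap the same loop is still correct as long as the body does not care which key comes first.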
Okay. So let's memory diagram this thing. Again, the memory diagram is in the slides if you want to use that for reference. I'm not going to use that in lecture. What am I going to do? I'm going to grab the code right here and do it in the tracing tool. What would happen if you print the hashmap itself? Actually, well, I think that's in this code.
Uh yeah, right here. We will do that.
What it'll do is print the key value pairs using brace notation. And actually, let's jump to IntelliJ first.
I forget the exact notation. I'm pretty sure I got it, but I want to jump to IntelliJ anyway just to emphasize this ordering point. So when I run this, we have our three loops iterating over the data structure, and it's going to be Kincaid then Allen. Oh, this does not match our slides. We're going to do Kincaid instead of Coleman. But it's Kincaid, Kincaid's number, Kincaid then Allen. And to answer your question, it's going to be this brace notation, an equal sign, no quotes around the strings, which is the one that I was doubting a little bit in my head. Are there quotes around the strings or not? No quotes around the strings, and then comma separation for the key value pairs. And notice again here the order is Kincaid then Allen. So it is the reverse order of what we would expect if we expected order to be preserved.
All right. So let's show toolbar bin this hide toolbar full screen.
I am printing quite a few things to the screen. Let's just for the memory diagram get rid of our import and our package declaration.
Resize this a little. There we go. It should I I think we have enough room to to print. What do we have?
Uh quite a few lines. Nine lines. Let's do this. We'll resize if needed. So, every Java program starts with the main method. We have our data structure here. New means I'm creating a new object. The type is HashMap and we have nothing in it so far.
And then bills, the variable, stores just 0x1000. Then we're going to put two key value pairs in here. So I'm going to add, and then the name is going to be the key and the value is the value.
So it says name, but for our specific case of a hashmap, it's going to be the key. So key and then value. Then Kincaid. So it's bills dot, take me to your object, and then put, which we're calling from this hashmap, and then Kincaid, 86. And then print. I always get this question. It's a very nuanced point. Again, something I would never... I don't think I finished this thought, did I?
When I mentioned it last time: I'll never care about the formatting in your traces. So, for example, these strings will have quotes around them, but when we print to the screen, they won't have quotes around them. Here I'm using print, which means I'll print without a trailing newline character, and then println will add a newline character. I'm doing that just for the formatting, so it looks nice when you go to the code in the repo and you run the program. It prints a little nicer. In your memory diagrams, I don't care if you don't get this exactly right. But I will do it the right way in lecture at least. What is Allen's number? Space, with no newline character. And then we're going to print bills.get Allen. So bills.get, get me the value at the key Allen, which is going to be 17, and then that 17 will be on the same line right here, because print didn't have a trailing newline character, so the next thing we print is printed on the same line. And then this one did have the newline character with println, so now the next thing that's printed will be on the next line. When we print the entire data structure, I'm going to do Allen equals 17, space, Kincaid equals 86.
So that is on the next line and I'll stick to the order that I inserted in.
So we saw that it will be the reverse order when we actually run this program.
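The print versus println behavior and the brace notation can be tried with a sketch like this one. The prompt string is taken from the narration; the class and method names are assumptions. The comment on the last line deliberately does not promise an order for the pairs.

```java
import java.util.HashMap;

class PrintDemo {
    static void show(HashMap<String, Integer> bills) {
        System.out.print("What is Allen's number? "); // print: no trailing newline
        System.out.println(bills.get("Allen"));       // 17 lands on the same line, then a newline
        // Whole map: braces, key=value pairs, comma separated, no quotes around the strings.
        // The order of the pairs is up to the HashMap.
        System.out.println(bills);
    }

    public static void main(String[] args) {
        HashMap<String, Integer> bills = new HashMap<>();
        bills.put("Allen", 17);
        bills.put("Kincaid", 86);
        show(bills);
    }
}
```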
It's okay to have the ordering messed up in your traces. Again, there's no reasonable way for you to know the order, especially in a timed quiz environment. Like, I don't care. Same thing if this 17 was on the next line. I don't care when we're doing the traces, when you're doing the traces on the midterm and potentially the final exam if you need it.
What I'm looking for is do you understand the 116 concepts involved?
There's no learning outcome of this course where I'm like, students need to know the difference between print and println, or students need to know that when you print to the screen there are no quotes around strings. Like, those are not learning objectives of the course. Those are just some Java specific things that are, you know, just features of the language. By no means a learning outcome of a CS2 course like this. I don't care. If you have all of that wrong in your trace, but you demonstrated an understanding of the learning objective, which is, you know, the OOP concepts, the data structures concepts, if you demonstrate understanding of those, you're going to pass that trace and be fine. So don't waste time memorizing the difference between print and println unless you have extra time. And, you know, don't avoid it intentionally.
But if you already are struggling, you have got a lot to learn in this course.
It's a tough course. It moves fast.
Don't get hung up and waste time on superfluous things like that. I don't care about those in your traces. You don't have to be 100% correct on a trace, but you do have to demonstrate an understanding of the learning objective that's being assessed, completion of the learning objective being assessed.
Uh, and the learning objectives are in the the syllabus. It's can you trace code that uses classes, objects, inheritance, polymorphism. Those are the concepts I'm really concerned about.
which we haven't even gotten to yet, to be honest. Okay, so String key. So we have a code block, and just like our traditional for loops, our variable for iteration is going to be declared inside of the code block for that loop. So we have key iterating over the key set. So it's iterating over all the keys. And again, I'm going to do it in not the order that the program will actually run in. Allen. We're going to print Allen to the screen.
Get to the end of the loop. And now we move to the next value in the data structure, which is Kincaid.
Print Kincaid.
and then the loop ends.
So like with our traditional for loop, we had a boolean expression. We were iterating until the expression was false. We don't have the same thing here. We're iterating over all the values in the data structure. In this case, all of the values in the key set, which is all the keys, and then uh we get out of there. So the loop ends, we cross it out, meaning this var variable key is no longer in memory.
Then we move on to the next code in our program, which is another loop. We're going to have value, which is iterating over only the values in this data structure, bills. So bills.values, 17 and 86 are the values we're iterating over. So 17, print 17, update to 86, print 86, and then the loop ends.
So we're iterating over all the values all two values here uh to get those values.
We'll have to resize a little bit here.
It should be fine. And then we're going to iterate over the next loop. So we're creating a new code block. And this is where, well, let me just get it on screen first. We're iterating over the keys again. So we're going to do Allen, then create a new variable value. So we have int value is 17. We're getting the value at the key Allen, which is 17. And notice that I'm reusing variable names.
This is one thing that we one benefit or um one uh feature that we get because of the way code blocks work in Java. Since key and value from the first two loops are no longer in memory, we are free to reuse those variable names. Normally we would have a name conflict. You can't have two variables with the same name.
That's insane. that would cause all kinds of issues. But key and value from the first two loops don't exist anymore.
They don't exist. So we're welcome to reuse those variable names, which might seem like a small feature, but it's a really nice quality of life feature to have, especially if you have subsequent loops that are all looping over indices. We often like to use i, int i, as our iteration variable. And if you had to keep coming up with new names for that variable for each subsequent loop, it gets kind of annoying, gets kind of painful. So with this feature, we can just use int i, int i, int i, int i multiple times in the same method, in the same stack frame, and it's no big deal because each variable i is being destroyed at the end of its corresponding loop. So we can reuse these variable names. We don't have to get creative with, you know, key2, value2 or anything. You know, that's not creative, but we don't have to do anything like that. And then we're going to print Allen. Man, I did it again. We'll get there. Allen's number is, colon, space, and no newline character because it's print, and then println the value 17, and then a newline at the end of that.
Then we get to the the end of the loop.
So we update get to the next value of the key set.
Oh my goodness.
Next value of the key set is Kincaid, and then value equals bills.get Kincaid, which is 86, and then print Kincaid's number is, colon, 86.
That's the end of that loop and that's the end of our program.
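The name reuse that came up during this trace can be shown with a tiny sketch using `int i` in back to back loops. The class and method names are made up for illustration. Each `i` lives only inside its own loop's code block, so the second declaration does not conflict with the first.

```java
class ReuseNames {
    static int twoLoops() {
        int total = 0;
        for (int i = 0; i < 3; i++) {
            total += i;        // adds 0 + 1 + 2 = 3
        }
        // the first i no longer exists here, so the name is free again
        for (int i = 0; i < 3; i++) {
            total += i * 10;   // adds 0 + 10 + 20 = 30
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(twoLoops()); // prints 33
    }
}
```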
So any questions on any of that? We got our for each loops. We have hashmaps. We have key sets and values.
Those are our big concepts. We have a few little things, print versus println, which I just gave a whole speech about not caring about.
Yeah, I think those are things. Any questions on any of that?
I'll keep watching for questions and uh I can always jump back to this.
But in the meantime, we press on reading files. So sometimes we want to read files in Java. Let's learn how to do it.
And specifically for problem set zero, the last couple of questions do have you reading files and parsing CSV files. Let's talk about it.
All right. So, hopefully they talked about this in 115. I assume that they do, but as a refresher, why do we even use files? Why do we use files at all? And the big buzzword here is persistence, which, oh my goodness, I don't even have on the screen. Oh no, here it is. Persistent. I'm just blind. I don't have literally persistence, but persistent. So persistent storage. This is why we use files. So everything in memory, once we cross out a stack frame, all that memory is gone. So when we cross out the main stack frame, by the time we get there, the stack is completely empty. And then Java is going to clean up our heap as well. Once the program ends, Java goes and grabs all of that heap space and reclaims all of that memory, gives it back to the operating system, or manages it itself rather. It'll take back all of that space. So everything that we did when we run a program, once our program ends, it's all gone. There's no memory, no trace of our program ever existing, ever having run on the computer, unless we go to the hard drive. So when you go to disk, your hard drive, that's where we have our long-term storage. So if we make any changes to disk, then those changes will persist after our program ends. And there can be data sitting there in a file before our program starts as well. So if we want to load some initial data, we can store it in a file and then load that into our program, and then save to a file, and then that file will live after our program ends. So that's persistent storage, even through a restart. Obviously, when you restart your computer, you don't lose everything on your hard drive. But when you restart your computer, you do lose everything that every single program had stored in RAM. All of that is gone every time you restart your computer. RAM starts up fresh, empty, but your hard drive remains. So that's why we use files. This is why it's such an important concept.
And eventually, as you go throughout your career, you'll start using databases. Databases give the same benefit as file storage. It's persistent storage, except it handles the low-level work of actually creating files on the hard drive for you and lets you have a more structured way of storing data to disk. But databases are the more advanced way of doing persistent storage. I'm sure you've heard of databases plenty of times. It's a way of storing files so we don't have to mess around with things like we're about to do, like CSV files. If CSV was the only way we had to store tabular data, that'd be quite painful. A database will let us have a more structured way, a more organized way, to be able to store lots and lots of data on our disk. So let's look at this example. Let's break down exactly how we're going to read files in Java. There are multiple ways to do this. This is the way that I want to see in 116. This is the easiest way to read a file in Java. If you want to read all the lines of a file, which is our use case, this is the way to do it.
There are other ways. Again, if you use those ways, you'd better know how to explain everything that you're doing, or else I'm going to assume you didn't write the code, which is often the case.
When students don't do this, I ask: why wouldn't you do the very simple thing I told you to do? Let's be honest, often it was written by an LLM, and then I'll call the students in. Actually, I dug myself this hole, so I've got to finish explaining it. Some students have taken Java in high school.
They already have some Java background and they do it the way that they did it in high school. That happens sometimes, and it's fine as long as you can explain it.
I don't care which way you use, but you'd better be able to explain it. If you're using iterators and while loops to read a file, as long as you can explain it, it's fine.
Okay. We're going to use the Files.readAllLines method to open our file. But this method can't just take a file name and read it. In Python, this would be "with open(filename) as file", and then you're free to iterate over everything and do whatever you want after that. The syntax is quite a bit different in Java.
readAllLines can't just take a string and interpret it as a file name.
We have to explicitly convert a string to a path on our file system first, and that's where Paths.get comes in. Paths is a class that comes with Java, in the java.nio.file package. So we import it and call get, which is a static method in the Paths class. This is how we call a static method: class name, dot, then the name of the method. We give Paths.get our file name as a string and it interprets that as a path on our file system.
Then we feed that path into readAllLines, which is another static method, from the Files class in the same package. Files.readAllLines takes that path, reads every single line of the file, and returns a List of strings, one per line. The object behind it is typically an ArrayList, but the method only promises a List, which will make more sense when we talk about polymorphism. Then we stuff that into an ArrayList just so we don't have to think about another new data structure. If you want to use the List directly, that's fine. Go for it. But I'm going to stuff it in an ArrayList because we only know two data structures, and I want to stick to those two, at least for now.
Now, the reason you won't often see this readAllLines method is that this is unbuffered input. We are literally going to the file, reading the entire thing, and storing the contents of the entire file in heap space. So we need space for the entire file to fit in memory all at one time, which is inefficient if your files are huge. In 116 our files just aren't that huge. For the problem sets we won't really have the larger files, because we're not doing the game features. But even when we've had larger files in 116, the biggest one that I've had was about 130,000 lines. Let's say each line averages 100 characters. I'm making up that number, but the 130,000 is about accurate. At roughly one byte per character, that's 13 million characters, 13 million bytes, 13 megabytes. 13 megabytes is nothing. You measure your RAM in billions of bytes, so we're a few orders of magnitude away from what's actually considered a large amount of data. Computers handle very big numbers very efficiently. You also buy your machines in terms of how many billions of calculations they can do per second. So even if you have a fairly large file, it's actually tiny compared to what your computer is capable of. But if you have a gigabyte file, or even a 100 gigabyte file, you can't use readAllLines anymore, because 100 gigabytes isn't going to fit in heap space unless you have over 100 gigs of RAM. You'll run out of heap space. So then you have to do buffered input. It's a whole other thing where you're reading the file one chunk at a time, often one line at a time, but it depends on the structure of your data. If it's non-textual data, it doesn't have lines.
In that case, you'd have to read a certain number of bytes at a time. You buffer the input: read only a little bit into memory, process that, get rid of it, and then read the next part. That's buffered input.
We don't need to do that in 116, so we're just going to call readAllLines. But out there in the real world, if you're reading enormous files, make sure you look up how to read them in a buffered way if you're using Java. That's my spiel on that. In 116, we're just going to use readAllLines, and then stuff the result into an ArrayList if you don't want to think about what a List is.
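For anyone curious what the buffered approach can look like, here is a minimal sketch. It is not required for 116 and is not code from the course repo. The class name BufferedLineCount and the method countLines are made up for illustration. It uses Files.newBufferedReader from the standard library and keeps only one line in memory at a time. The try/catch around it is there because Java requires file-reading code to handle an IOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class BufferedLineCount {

    // Counts the lines of a file without loading the whole file into heap space.
    // Returns 0 if the file cannot be read (an illustrative choice, not a rule).
    static int countLines(String filename) {
        int count = 0;
        // try-with-resources: the reader is closed automatically when the block ends
        try (BufferedReader reader = Files.newBufferedReader(Paths.get(filename))) {
            // readLine returns null once the end of the file is reached
            while (reader.readLine() != null) {
                count++;
            }
        } catch (IOException e) {
            return 0;
        }
        return count;
    }

    public static void main(String[] args) {
        // The path is an assumption: a data folder in the project root
        System.out.println(countLines("data/cities.csv"));
    }
}
```

The trade-off is that each line has to be processed as it goes by, since nothing holds on to the earlier lines.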
And there's one big problem with this: the method as written doesn't run. Java will refuse to run this code, because this is dangerous code. It will crash and die if the file you try to open doesn't exist.
If you give it a file name that doesn't exist on your hard drive, what is this code supposed to do? It's not going to be able to read a file. There are no lines to read. You might think, well, just give me an empty data structure, it's fine. What it does instead, which is probably what we want, is throw an exception.
Throwing an exception means that something went wrong while Java was trying to do something. If you don't catch that exception, if you don't handle it, then your program crashes. If that exception bubbles up to your main method and nothing handles it, your program blows up.
This is when you see the stack trace in IntelliJ. When you see a bunch of red text and errors, assuming they're not compiler errors, if they happen during runtime, that means you have an uncaught exception, an unhandled exception. You can simulate this very easily by going outside the bounds of an ArrayList. If you create an ArrayList of size two and you call .get(5), you're out of bounds and you're going to get an IndexOutOfBoundsException thrown. If you don't catch that exception, your program crashes and you can see what it's like to have an uncaught exception.
When we get to linked lists, most or all of you will see NullPointerExceptions.
That's something we'll talk about when we get to next Thursday.
It means you tried to access a null pointer. Null means a reference to nothing. If you use the dot operator on it and say, "Hey, take me to your object," and it says, "I'm null. I don't refer to anything," you're going to get a NullPointerException. And then there's something you shouldn't plan on catching: a StackOverflowError. It's not an exception, it's an error, as the name implies. Java technically allows a catch block for it, but treat it as something that crashes your program, and fix the code that caused it.
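To see an uncaught exception for yourself, a small sketch like the following is enough. The class name OutOfBoundsDemo and the helper twoValues are made up for illustration.

```java
import java.util.ArrayList;

class OutOfBoundsDemo {

    // Builds an ArrayList with exactly two elements (valid indices: 0 and 1).
    static ArrayList<String> twoValues() {
        ArrayList<String> values = new ArrayList<>();
        values.add("first");
        values.add("second");
        return values;
    }

    public static void main(String[] args) {
        ArrayList<String> values = twoValues();
        System.out.println(values.get(1)); // fine: index 1 exists

        // Index 5 does not exist, so this call throws IndexOutOfBoundsException.
        // Nothing catches it, so the program ends here with a stack trace.
        System.out.println(values.get(5));

        System.out.println("never printed");
    }
}
```

Running main prints the second element and then stops with red stack-trace text in the console. The last print statement is never reached.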
But we can catch exceptions. Specifically, in this case, if the file doesn't exist, this code is going to throw an IOException. We can use the try/catch syntax to handle that exception. We say try: Java, try to run this risky code.
If something goes wrong while you're running this risky code and an exception is thrown, jump over to my catch block, catch that exception, and handle it in whatever way I tell you to handle it. So if this throws an IOException, everything stops immediately in the try block.
Then we jump to the catch block; control switches over to the catch. Specifically, we're catching an IOException. If this code throws an exception that's not an IOException, that's still going to be uncaught, and it's going to crash my program. An IOException is thrown for various input/output issues. In this case, if the file was not found, an IOException is thrown, so we catch it and handle it however we want. Here, in our catch block, we say that if the file doesn't exist, I'm just going to return an empty ArrayList of strings, because there are no lines in a nonexistent file. We can also catch multiple types of exceptions. If we have code that might throw an IOException, an ArithmeticException from dividing by zero, and an IndexOutOfBoundsException, maybe we want to catch all of those. For our purposes in 116, the only exception we have to catch is this IOException, and only when reading files. And a reminder, which I don't think I actually read in the first lecture, but it's in the syllabus: any code that you find in the examples repo or in any lecture slides is fair game for you to use in your programs. The examples repo should be a superset of all the code in the slides.
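Putting the pieces described so far together, a helper along these lines would match the description. This is a reconstruction from the lecture's wording, not a copy of the method in the examples repo, so the class name ReadFileSketch is an assumption and the repo's version may differ in its details.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;

class ReadFileSketch {

    // Reads every line of the file into an ArrayList of strings.
    // If the file cannot be read, returns an empty ArrayList instead of crashing.
    static ArrayList<String> readFile(String filename) {
        try {
            // Paths.get turns the String into a Path; readAllLines returns a List<String>,
            // which the ArrayList constructor copies into an ArrayList
            return new ArrayList<>(Files.readAllLines(Paths.get(filename)));
        } catch (IOException e) {
            // Reached when the file does not exist or another input/output problem occurs
            return new ArrayList<>();
        }
    }

    public static void main(String[] args) {
        // The path is an assumption: a data folder in the project root
        ArrayList<String> lines = readFile("data/cities.csv");
        System.out.println(lines.size());
    }
}
```

Returning an empty list keeps the caller simple, at the cost that a missing file and an empty file look the same to the caller.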
I'm showing you the readFile method right now. This method exists in the repo. Feel free to take it, paste it into problem set zero or wherever you need to read files, call it as a helper method, and then not think too hard about reading files. Reading files is more of a means to an end and less of a learning objective of 116, so I'm comfortable with that.
You should still be able to explain it, but I'd be less concerned about that one since it's something I'm giving you, because really I just want you to start processing files. But this is how to read a file, and you do have to use a try/catch block with files.
There's other risky code, too. Every time you access a value at an index of an ArrayList, you might get an IndexOutOfBoundsException. (Asking a HashMap for a key it doesn't have is different in Java: get returns null instead of throwing, and the trouble starts when you use that null.) That code can throw exceptions, but you don't have to catch them. Java is like, you know what? Whatever. If an exception is thrown, your program crashes. It's fine. With an IOException, it does make you catch the exception, or else Java will not run your program. So we need the try/catch block here to handle a file not found. If you try to use the code from the first few slides without the try/catch block, Java will refuse to run it. It is a necessity.
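One detail about standard Java is worth checking here: HashMap's get method does not throw when the key is missing. It returns null. The short sketch below shows that behavior. The class name MissingKeyDemo and the sample entry are made up for illustration.

```java
import java.util.HashMap;

class MissingKeyDemo {

    // Builds a small map with one made-up entry.
    static HashMap<String, Integer> sampleMap() {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("present", 1);
        return map;
    }

    public static void main(String[] args) {
        HashMap<String, Integer> map = sampleMap();

        System.out.println(map.get("present"));         // the stored value
        System.out.println(map.get("absent"));          // null: no exception is thrown
        System.out.println(map.containsKey("absent"));  // false

        // Storing the missing value in an int would fail, because null cannot be unboxed:
        // int value = map.get("absent");   // throws NullPointerException at runtime
    }
}
```

Checking containsKey first, or comparing the result of get against null, avoids the NullPointerException.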
Once we have our file read into memory, all the lines of that file, we can start parsing them. We're going to parse CSV files in this course.
In the problem set, all the file reading questions are of the form: parse this CSV file and sum up the values of a column, or give me the number of values in the entire file across every line. CSV stands for comma-separated values. Each line has a number of values, all separated by commas, and we're going to split the line on commas. Here's my next disclaimer of the course. I have two disclaimers in the same lecture, which I don't like, but it's what we need to do to simplify things: this is not the right way to parse CSV.
We're just going to straight up split on commas. That's not how you parse CSV. If you're out there in the wild, on an internship or at your full-time job, and you're parsing CSV files by splitting on commas, you are wrong. You will get the wrong data and you might get fired.
Don't do this. We're doing this to simplify things, so I don't have you writing full CSV parsers or pulling in external libraries.
Unfortunately, Java doesn't have a nice CSV parser built into the language, and the external libraries for CSV parsing are all pretty awkward to work with if you're new to Java and new to programming. I haven't found one that has a nice clean interface. So we're just going to split on commas and pretend that that's okay. The problem with this is that sometimes the values themselves contain commas. If a value contains a comma and you're splitting on commas, you're chopping a value in half and treating it as two values. That's bad. The CSV format has a way to handle this. If a value contains a comma, the value is surrounded in double quotes. For any value surrounded in double quotes, you don't treat the commas inside it as separators. You treat them as part of the value itself. The next issue is what happens when a value contains a comma and also a double quote. That's a valid value. In that case you escape the double quote. In the standard CSV convention (RFC 4180) you do that by doubling it: two double quotes in a row inside a quoted value mean one actual double quote character, not the end of the value. Some CSV dialects put a backslash before the quote instead, and then an actual backslash has to be escaped with a second backslash. So it's not that complicated. If I gave you a problem set question that said properly parse this CSV input, you'd all be able to do that. It might take a little work, but you'd be able to do it.
The problem is that it detracts from the actual learning objectives of the course. It would be good programming practice, but it's not related to OOP, data structures, or memory. It's just minutiae. Maybe someday we'll do that, but there's more value in having you do other things. So we're going to simplify CSV and have you do things that are more directly related to the learning outcomes of CSE 116. That's my big disclaimer. After all that, we're just going to split on commas. So here I'm simulating a line of a file: a, comma, b, comma, c. We have three values separated by commas.
We're going to split it on commas. The String class has a method named split that takes a string and separates your values based on whatever delimiter you give it, in this case a comma. So line.split(",") on "a,b,c" returns a data structure with the values a, b, and c, and the commas are completely removed from the output. That's what we get from line.split.
So when you're iterating over the lines of your CSV file, you split each line on commas and then process the data for whatever application you're doing at the time.
line.split returns an array. If you're comfortable with a plain array and you know what that is, go for it. That's perfectly fine. If you don't want to think about another data structure and its syntax, you can convert it into an ArrayList with one more step than we needed for readAllLines. First we convert the array into a List using Arrays.asList.
Then we convert that List into an ArrayList by stuffing it into the ArrayList constructor, similar to what we did when we were reading all the lines of a file.
Then we have an ArrayList of all the splits.
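As a sketch of the split-then-convert steps just described, the helper below wraps them in one method. The names SplitLine and splitOnCommas are made up for illustration. The second call in main shows why the earlier disclaimer matters: a quoted value that contains a comma gets chopped in two by a plain split.

```java
import java.util.ArrayList;
import java.util.Arrays;

class SplitLine {

    // Splits one line on commas and returns the pieces as an ArrayList.
    // split gives an array, Arrays.asList wraps it as a List,
    // and the ArrayList constructor copies that List into an ArrayList.
    static ArrayList<String> splitOnCommas(String line) {
        return new ArrayList<>(Arrays.asList(line.split(",")));
    }

    public static void main(String[] args) {
        // Clean data: three values, commas removed
        System.out.println(splitOnCommas("a,b,c")); // prints [a, b, c]

        // Quoted value containing a comma: intended as 3 values,
        // but the naive split produces 4 pieces
        ArrayList<String> pieces = splitOnCommas("x,\"Hello, world\",y");
        System.out.println(pieces.size()); // prints 4
    }
}
```

With the sanitized data used in 116, the second situation does not come up, which is why the simple split is acceptable there.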
There are a few situations where we're going to want to work with numbers, where we have CSV data in a file and some of the data is numeric.
Then we need to convert strings into numbers, and we have two methods that help us do that. If we're expecting an integer, Integer.parseInt, and if we're expecting a double, Double.parseDouble. These two methods convert from strings to integers and doubles respectively.
This only works if the number is well-formed. If it's not, we get a NumberFormatException thrown, which, now that you know about exception handling, you could catch. But in our case that usually means you have a bug in your code and you've got to fix the bug anyway. And by the way, for 116 purposes, all the CSV data is sanitized. We're going to make sure everything is nice and clean. There are no commas in your values and there are no ill-formed numbers. The data is going to be squeaky clean. That will almost never be the case in the real world, where you'll have to sanitize your own data. For 116, we've sanitized all the data beforehand.
So all your numbers will be well-formed. If they weren't, that's when you would catch the exception and handle it. You don't have to do that in 116, but you're welcome to if you want to do extra stuff. If you want to learn more, that'd be awesome, actually. As an example of a number that is not well-formed, take the string "four", spelled out. That is not a well-formed int. That's going to crash. We as humans recognize that as the number four. The parseInt method specifically is not going to recognize it. It's not that robust. It needs the numeric digit 4 and that's it, and then it parses it as a four.
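Here is a small sketch of both conversion methods, plus one optional way to catch the NumberFormatException for ill-formed input. The class name ParseNumbers and the helper parseIntOrDefault are made up for illustration. Nothing in 116 requires the fallback approach.

```java
class ParseNumbers {

    // Parses text as an int, or returns fallback if the text is not a well-formed int.
    static int parseIntOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        int count = Integer.parseInt("4");            // 4
        double ratio = Double.parseDouble("42.5");    // 42.5
        System.out.println(count + 1);                // prints 5
        System.out.println(ratio * 2);                // prints 85.0

        // "four" is not a well-formed int, so the fallback is used
        System.out.println(parseIntOrDefault("four", -1)); // prints -1
    }
}
```

Calling Integer.parseInt("four") directly, without the try/catch, would end the program with a NumberFormatException.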
Okay, let's jump over to IntelliJ and look at our files. Oh, we've got these.
No, it's part of Java, right? Am I blind?
Oh, the countries example. This is where our file reading examples are. I want to emphasize this point right here.
For your problem set, you just take a file name and read it. But when you read a file in Java, specifically when running in IntelliJ, the path is relative to the root directory of your project. I have all my data in a data subdirectory of the root directory, and all the data files are in there. So if I want to open a file in my program, I use data, slash, cities.csv.
That's going to be data/cities.csv.
Make sure all your paths are relative to the root directory, and don't add a leading slash. Sometimes people write /data/cities.csv.
That is a very different path, because a leading slash means go to the root directory of your file system, so you're going to be way outside of where you want to be. But data/cities.csv is going to give me this cities.csv file, which incidentally happens to be that 130,000 line file. Oh, dang it. No, it's not.
No, movies is the 130,000.
No. What?
Where am I getting that number from?
Oh, I got it from somewhere. Apparently from data that we don't use anymore.
Huh.
Countries, you know, it's not 130,000 countries. Well, anyway, this is about 48,000 lines of data, which is almost nothing to a computer. As long as you're careful about how you're iterating over your data, that's absolutely nothing.
So I'm going to read this file through this loadCountries method. I call readFile, which is the helper method that we just learned about. That converts my file into lines. Then I iterate over those lines, split each line on commas, and start accessing my values and doing whatever I need to do with them. If I don't know how many values to expect per line, or if there's a variable number of values, I might iterate over them using our for-each loop: for (String split : splits), and then process each split individually. I can do this over an ArrayList. In the slides, we only saw this style of loop for HashMaps, but you can use the same style for an ArrayList. It does the same thing as for (int i = 0; i < splits.size(); i++) with String split = splits.get(i) inside, which is a little more verbose. So if you don't need the indices, it can be a little cleaner to use the for-each style. But if you need the indices, you have to use the indexed style.
I say have to, but I don't mean it literally.
Splits, dot...
Oh, do we not have that? Apparently Java doesn't have that. So yeah, I guess you have to use that loop.
I don't know what language I'm thinking of, but in some languages you can ask your data structure for a data structure with all the indices.
Apparently Java doesn't have that.
You can tell that I never use it. I just use the indexed loop, but I thought Java did have that.
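The two loop styles can be compared side by side in a short sketch. The class name LoopStyles and both method names are made up for illustration. Each method builds a string so the difference is easy to see: only the indexed loop has the position i available.

```java
import java.util.ArrayList;
import java.util.Arrays;

class LoopStyles {

    // For-each style: visits every value, no index available.
    static String forEachStyle(ArrayList<String> splits) {
        String out = "";
        for (String split : splits) {
            out += split + ";";
        }
        return out;
    }

    // Indexed style: same values, but i gives the position of each one.
    static String indexedStyle(ArrayList<String> splits) {
        String out = "";
        for (int i = 0; i < splits.size(); i++) {
            String split = splits.get(i);
            out += i + "=" + split + ";";
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<String> splits = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(forEachStyle(splits)); // prints a;b;c;
        System.out.println(indexedStyle(splits)); // prints 0=a;1=b;2=c;
    }
}
```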
Okay. Anybody have questions? There was something I wanted to come back to IntelliJ and do, and I just remembered what it is. If you have questions, I'll be more than happy to take them. But what I wanted was to do some more debugging. The debugger is an extremely useful tool, and I get debugger examples in wherever I can. Right now is a perfect opportunity. Usually I delete all this code and then live code it.
We're definitely not going to have enough time to even get started on that.
We won't get very far unless I rip through it way faster than would be useful to anybody. This lecture is two and a half hours, well, two hours and 20 minutes now, of pretty rapid-fire content.
I don't want to pick up the pace right at the end. Unless that's what you all want, in which case hit me up in chat. But I do want to show a little more of the debugger. I'm going to set a breakpoint on the first line of the main method. When you use the debugger, it runs your code until it hits a breakpoint. So if you don't know where you really want to debug, the safest place to put a breakpoint is the first line of main, and then you step through your entire program.
So when I run this, instead of hitting the go button directly, I hit the debug button. It runs until it hits the breakpoint, which is immediately in this case because I put it on the first line of main. It shows me all of the same information that we see in our memory diagrams, with a few little quirks that I'll mention as they come up. For one, args is in the main stack frame. It is always in the main stack frame. We just don't trace it in 116 because it's extra clutter that you'd have to put on every memory diagram, and it would never be used in any of your programs. So why do it?
Why bother? We can also see that whenever a value has this @ and then some number, it's an object on the heap. That @ and number can be read the same way as the 0x numbers, the references, in our memory diagrams. So this variable args stores a reference to a String array on the heap. That's what we know from seeing the @ symbol there. And then we can use this button right here.
Step into advances our code by one line. So if I do step into, I'm going one line into my code and I can see all the changes in memory as my code advances. I can see that I now have a second stack frame on the stack, for this loadCountries method, and it has a parameter filename. This p tells me it's a parameter. It comes from filename here, which got the value data/cities.csv.
And I can jump between my stack frames.
Unlike our traces where we have every stack frame, we can see them all at the same time. Here, we do have to click on whatever stack frame we want to see at a given time. So, we can see we have the string literal data/cities.csv.
That was never stored in a variable. So, it never made it onto the stack, but we did pass that into our method call as an argument. So, that's what got stored in file name, the variable.
Then we're creating a new HashMap. When I get my HashMap countries, I can see that it's an object on the heap because of this @475. It has size zero. There's really nothing to see about it yet because it's empty, but it is a variable that stores a reference to an object on the heap. Then we have this readFile method call, so we get another stack frame for that call.
This is going to actually read this file, data/cities.csv.
And here I have a line that has three different method calls in it. So Java is asking me which method call I want to go into.
And I don't really want to go into any.
But I'm going to click step into because that's the only button I talked about so far.
And we get into the ArrayList constructor, which is where it took us. So we're inside Java's own code at this point. If I keep clicking step into, we're going to get buried and lost in Java code, and it's going to take a long time to click through all of it. We're going to see a lot of strange syntax, a lot of kind of crazy stuff going on. This code has been optimized over decades. Java came out in the mid-1990s, so that's about three decades by now. But I don't want to be in any of this code. I'm not debugging Java.
So the next button is the step out button. This gets us out of the current stack frame. It executes the program until the current stack frame returns. I'm in this copyOf stack frame, and step out gets me out of there. I'm still in a different copyOf, and step out gets me out of there.
toArray, get out of there. Then the ArrayList constructor, and step out gets me out of there. And I'm finally back to my code. Now I can start stepping into again, which gets me out of readFile. That method is done.
That gets me back in my code. If I step into one more time, I can see lines is a data structure containing every single line of that file. So the helper method did what it was supposed to do. It went through every single line of the file and gave me an ArrayList of strings where the values are the individual lines of the file. Lines at index zero is the header line. Lines at index one is my first city. I'm reading city data here. Line three, at index two, is the next city, and so on. The debugger doesn't always show you everything. It shows you the first hundred values, but if you want more, you can get more and keep going through them. I'm not going to go through all 47,000, but you have all of this data loaded into heap space at this point in time. When you have an object on the heap, you have this @589 in this case, and you get an arrow next to it if it's not empty. Countries doesn't have an arrow because it's empty. The arrow says, "Show me what's on the heap." So whenever you see @ some number and an arrow, the arrow shows everything that's stored on the heap for that object. We would never have you trace something with 47,000 elements, but if you were tracing this, this is everything that would be on the heap in your ArrayList object at 589.
And then remove(0). This removes that header line so we don't accidentally process it as data.
Then we iterate over the lines, so we have a variable line. Notice this is a String and it has an arrow next to it. Strings do technically live on the heap. Inside is an array of bytes that's also on the heap. So we're referring to a String object that refers to another object, of type byte array in this case. We're just not going to mess with that in 116. If you want to draw two objects for every string that exists in your program, I'm not going to stop you, but I'm not going to require it by any means. We're just going to put the value of the string directly in the variable, as though it were stored by value instead of by reference.
And then we're going to hit this big old line.
If I click step into again, it's going to give me the three options like we had before, and I'm going to get lost in the ArrayList class. Instead, I'm going to click this step over button. It runs the next line in its entirety and skips over any stack frames, any method calls, that were created during that line. So the ArrayList constructor, the asList, the split, I'm just going to skip it all.
It's all internal to Java. I'm not debugging Java. I want to debug my own code. So I skip over that entire line, including all the stack frames that would have been created. Then I can start getting the individual values.
AD is the country code.
So country AD, name, region. Population is going to be an int. I'm going to do step over so I don't get into the parseInt method like it wants me to. Lat and long are doubles.
So I have population, lat, and long. Those are numbers, not strings, because I converted them. And I think that hits everything that we talked about in lecture.
Everything I wanted to show. There was one other thing in the debugger that I wanted to show, and I lost my train of thought. Not just going through everything here, especially because the next thing is a constructor call, and we won't get to that material until next Tuesday. So I kind of can't go another line.
Yeah, I think that might have been it. I think it was the three buttons, and I already did it just naturally.
So, any questions, or anything else you want to see while I have the debugger running?
Oh, I think it was the splits. I didn't crack open splits. Splits is an ArrayList of size six. We can crack that open and see all the values. When we have a line and we split it on commas, we get one value at each index of the ArrayList, and our numeric values are stored as strings. You can only read a text file as strings, and everything we read in 116 will be a text file. That's why, when we grab population, lat, and long, we have to convert them to an int and two doubles. That gives us the numeric values instead of their string values.
And that's definitely everything I wanted to cover, which is perfect timing, too. Got there right at the end.
So, anybody have quick questions on that? On anything we've seen today, Thursday, anything about the course.
If not, I'll see everyone on Tuesday.
And you have everything you need for problem set zero.
At this point, you can clone the repo the same way we did the examples repo last time. Grab this link. Make sure you're on HTTP. Oh, I don't even have a different option here. Grab that link.
Go to Intelligj clone. Paste the link.
And that'll give you the starter code for problem set zero.
And then in the prom problem set zero, this is where you'll write all of your code for the uh uh for the problem set. And we also do provide tests for this. So right now I can run these tests. If you go to your test problem set zero, rightclick and run it, this will run all of these tests on your code. And I'm going to fail them because I haven't written any code, of course.
And there are tests for the other problem sets too. If you highlight these, all these tests, and dot control slash, it's going to comment them all in. And this is going to cause errors because I didn't write these methods yet. The sum of lines method doesn't exist at this point. Uh, so I'm going to get errors.
That's why they're commented out. But as you write each method of the problem set, comment in the test cases for that method. And then you can run the new tests and make sure you're still keeping um you're still doing good on each uh problem. And then you'll know without having to submit to AutoLab over and over. You'll know one to expect and two when a uh if a test fails, you can run the debugger. since I got to show it a few times already. Run the debugger, trace through your code, and see exactly what's going wrong with it.
And yeah, I don't see any questions, so I'm out of here. I'll see y'all on