This is a really great video, thanks for the explanation and code example! Side note: I think the first element of the resulting vector should be 10 rather than 9.
Hey, thanks a lot for the feedback :) Can you explain how you come up with 10? The first row (including the zeros) of the A matrix is [1, 0, 0, 2, 0] and the right-hand side is [1, 2, 3, 4, 5]. If we take the dot product of the two, this yields the first entry of the solution vector, which should then be 1*1 + 2*4 = 1 + 8 = 9.
Note: the row_ptr array always has i+1 elements (No. of rows +1 = 6 in this case). Always, irrespective of how many empty rows are there. Since the given example has matrix with all rows filled with at least 1 non-zero value, it looks like we stop counting row_ptr, at the last non-zero element + 1. However, that is not the case. If in above example, last two rows were completely empty, we would've still proceeded to calculate 6 row_ptr elements, except last 3 elements would have been all 6. (new row_ptr = {0,2,6,6,6}) After going over some other resources to solve my own confusion, I found that it's better to think of row_ptr as: the i-th element in row_ptr holds the number of non-zero elements before the i-th row (excluding the i-th row). This way even in case of empty rows the row_ptr will always have i+1 elements.
Hi, it's been a time since I uploaded the video, I just quickly looked at the material on the GitHub repo. github.com/Ceyron/machine-learning-and-simulation/tree/main/english%2Fsparse_matrices%2Fcsr And it should be as you say that the row pointer array has one more entry than there are rows. To which point in the video are you referring? (Note also that the video follows C/Python style looping, excluding the last element in a range)
Hey, you probably refer to the algorithm for the matrix-vector product at roughly 12:35, right? I think that the difference is due to how you define range-based looping in different languages. IP[i] points to the first element in that row, while IP[i+1] points to the first element in the next row. In C-like languages (i.e. C, C++, Python etc.) you would loop excluding the last element [¹] whereas in MATLAB-like languages (i.e. Julia, Octave, MATLAB) you loop including the last element. Hence, the former languages implicitly loop over the row and for the latter you have to loop to IP[i+1] - 1 as you correctly pointed out. I should have probably mentioned this and will do from now on once I write down pseudo-code :) (Personally, I prefer the C-like looping, although 0-based indexing can be nasty from time to time) Let me know if this answered your question :) [¹] For instance in Python, if you loop with range(5, 10) you get the elements [5, 6, 7, 8, 9] but not the last element.
Thanks a lot for your feedback. :) Regarding your question: You would have to change the creation of the CSC-Matrix and change the looping a bit, but they follow a mostly similar strategy. I have also planned to cover this in a future video, but it will probably take me a longer time until I can produce it. Until then, you could take a look at how SciPy does it: github.com/scipy/scipy/blob/v1.7.1/scipy/sparse/csc.py#L16-L232 However, this could be a bit overkill since it contains additional points in their class-based interface.
@@MachineLearningSimulation no problem, I can explain. Image we have a list of size N where we can add elements at a specific index k. When the list gets full, we start replacing oldest elements with the new one. For example, we have the list L with maximum size N = 3 and we insert elements from 0 to 10. The list will look like this: insert 0: L = [0] // still not full insert 1: L = [0, 1] // still not full insert 2: L = [0, 1, 2] // now it's full, start replacing oldest elements insert 3: L = [3, 1, 2] // 0 was the oldest element and was replaced with 3 insert 4: L = [3, 4, 2] // 1 was the oldest element and was replaced with 4 insert 5: L = [3, 4, 5] // 2 was the oldest element and was replaced with 5 and so on... Now imagine that instead of numbers we have rows. We want to have a matrix of size R x C (R rows, C columns) and we always want to replace the oldest row with a new one.
I can't understand one thing. Say we have matrix A with i-th row with all zeros, that is A[i] = (0, 0, ..., 0) . Then we build A' by swapping that i row with its successor, that is: A'[i] = A[i +1] and A'[i+1] = A[i] . Then A and A' would have the same CSR representation but A != A'. Any words? I think that maybe CSR can't represent matrixes with rows with all zeros.
Hi, thanks for the interesting question! :) It's been some time since I uploaded the video and worked with sparse matrices. I will try to answer the question as good as I can. If sth is unclear, or I made a mistake, please let me know. As you proposed: Let's have a matrix A with the i's row full of zeros and the (i+1)'s row with some non-zero elements. Hence, if you built an A' with the i and (i+1) rows swapped those are not identical, I agree. However, you said that the two had the same CSR representation. I think this is not true. As an example, let's say you want to swap row 7 and row 8. Row 7 is all zero, whereas row 8 has 3 non-zero elements. In the CSR representation, that would result in an array of row-pointers for which the 7th entry (corresponding to the 7th row, using 1-based indexing here) points to the same location in the columns array and the values array than the 8th row pointer. For a matrix-vector product, there would be no looping over the 7th row, because its row-pointer entry is the same as the next row-pointer entry. This is different from the 8th row, because the 9th row pointer is exactly pointing three entries further down. Hence, in a matrix-vector product, you would loop over 3 entries. If you swapped the two rows, then the 8th pointer would be pointing 3 entries further down than the 7th. Hence, you would have the looping for the 7th row whereas the 8th row had no looping as the 9th pointer is pointing to the same as the 8th. Consequentially, the CSR representation of A and A' is identical for the columns and values array, but it differs in the row-pointers array. Hence, to my understanding, the CSR format should be able to represent all zero rows. It does so by setting the row-pointer of the next row to the same location as the row-pointer of the all-zero row. Actually, this should allow you to also represent an all zero matrix, if you like :D. Hope that made sense :)
I will note it down, but can't promise a video in the near future. I am curious, what is your application doing sparse matrix-matrix operations? So far, I've only seen matrix-vector products being sufficient for most applications, mostly in the discretization of partial differential equations.
Thanks a lot for your insightful explanation and kind code example. I just subscribed to your channel, and will go through other videos as well!
Hi, thanks so much for the comment 😊 I'm happy that the videos are of great help.
Welcome to the channel and enjoy the content 🙂
THAT IS AWESOME! THANK YOU VERY MUCH! YOU HELPED ME A LOT WITH MY UNI
Great to hear! :) I'm happy that I could help
This is a really great video, thanks for the explanation and code example!
Side note: I think the first element of the resulting vector should be 10 rather than 9.
Hey, thanks a lot for the feedback :)
Can you explain how you come up with 10?
The first row (including the zeros) of the A matrix is [1, 0, 0, 2, 0] and the right-hand side is [1, 2, 3, 4, 5]. If we take the dot product of the two, this yields the first entry of the solution vector, which should then be 1*1 + 2*4 = 1 + 8 = 9.
Great lesson! Thank you!
You're welcome :) Glad you liked it.
Note: the row_ptr array always has i+1 elements (No. of rows +1 = 6 in this case). Always, irrespective of how many empty rows are there. Since the given example has matrix with all rows filled with at least 1 non-zero value, it looks like we stop counting row_ptr, at the last non-zero element + 1. However, that is not the case.
If in above example, last two rows were completely empty, we would've still proceeded to calculate 6 row_ptr elements, except last 3 elements would have been all 6. (new row_ptr = {0,2,6,6,6})
After going over some other resources to solve my own confusion, I found that it's better to think of row_ptr as: the i-th element in row_ptr holds the number of non-zero elements before the i-th row (excluding the i-th row). This way even in case of empty rows the row_ptr will always have i+1 elements.
Hi, it's been a time since I uploaded the video, I just quickly looked at the material on the GitHub repo. github.com/Ceyron/machine-learning-and-simulation/tree/main/english%2Fsparse_matrices%2Fcsr
And it should be as you say that the row pointer array has one more entry than there are rows. To which point in the video are you referring?
(Note also that the video follows C/Python style looping, excluding the last element in a range)
i didn't quite understand why we use the IP(i+1).
in some algorithms, there is IP(i+1) -1. Can you give an example, please?
Hey, you probably refer to the algorithm for the matrix-vector product at roughly 12:35, right?
I think that the difference is due to how you define range-based looping in different languages. IP[i] points to the first element in that row, while IP[i+1] points to the first element in the next row. In C-like languages (i.e. C, C++, Python etc.) you would loop excluding the last element [¹] whereas in MATLAB-like languages (i.e. Julia, Octave, MATLAB) you loop including the last element. Hence, the former languages implicitly loop over the row and for the latter you have to loop to IP[i+1] - 1 as you correctly pointed out.
I should have probably mentioned this and will do from now on once I write down pseudo-code :)
(Personally, I prefer the C-like looping, although 0-based indexing can be nasty from time to time)
Let me know if this answered your question :)
[¹] For instance in Python, if you loop with range(5, 10) you get the elements [5, 6, 7, 8, 9] but not the last element.
thank you, nice explaination. One question is , how can I modify code so that formatting csc?
Thanks a lot for your feedback. :)
Regarding your question: You would have to change the creation of the CSC-Matrix and change the looping a bit, but they follow a mostly similar strategy.
I have also planned to cover this in a future video, but it will probably take me a longer time until I can produce it. Until then, you could take a look at how SciPy does it: github.com/scipy/scipy/blob/v1.7.1/scipy/sparse/csc.py#L16-L232 However, this could be a bit overkill since it contains additional points in their class-based interface.
how would you create a CSR matrix which allows us to replace rows efficiently in order to obtain a ring buffer?
Hi, could you elaborate? I'm unsure what you mean by replacing rows and the connection to ring buffers? :)
I don't have a theoretical CS background
@@MachineLearningSimulation no problem, I can explain. Image we have a list of size N where we can add elements at a specific index k. When the list gets full, we start replacing oldest elements with the new one. For example, we have the list L with maximum size N = 3 and we insert elements from 0 to 10. The list will look like this:
insert 0: L = [0] // still not full
insert 1: L = [0, 1] // still not full
insert 2: L = [0, 1, 2] // now it's full, start replacing oldest elements
insert 3: L = [3, 1, 2] // 0 was the oldest element and was replaced with 3
insert 4: L = [3, 4, 2] // 1 was the oldest element and was replaced with 4
insert 5: L = [3, 4, 5] // 2 was the oldest element and was replaced with 5
and so on...
Now imagine that instead of numbers we have rows. We want to have a matrix of size R x C (R rows, C columns) and we always want to replace the oldest row with a new one.
I can't understand one thing. Say we have matrix A with i-th row with all zeros, that is A[i] = (0, 0, ..., 0) . Then we build A' by swapping that i row with its successor, that is: A'[i] = A[i +1] and A'[i+1] = A[i] .
Then A and A' would have the same CSR representation but A != A'. Any words? I think that maybe CSR can't represent matrixes with rows with all zeros.
Hi,
thanks for the interesting question! :)
It's been some time since I uploaded the video and worked with sparse matrices. I will try to answer the question as good as I can. If sth is unclear, or I made a mistake, please let me know.
As you proposed: Let's have a matrix A with the i's row full of zeros and the (i+1)'s row with some non-zero elements. Hence, if you built an A' with the i and (i+1) rows swapped those are not identical, I agree. However, you said that the two had the same CSR representation. I think this is not true. As an example, let's say you want to swap row 7 and row 8. Row 7 is all zero, whereas row 8 has 3 non-zero elements. In the CSR representation, that would result in an array of row-pointers for which the 7th entry (corresponding to the 7th row, using 1-based indexing here) points to the same location in the columns array and the values array than the 8th row pointer. For a matrix-vector product, there would be no looping over the 7th row, because its row-pointer entry is the same as the next row-pointer entry. This is different from the 8th row, because the 9th row pointer is exactly pointing three entries further down. Hence, in a matrix-vector product, you would loop over 3 entries.
If you swapped the two rows, then the 8th pointer would be pointing 3 entries further down than the 7th. Hence, you would have the looping for the 7th row whereas the 8th row had no looping as the 9th pointer is pointing to the same as the 8th.
Consequentially, the CSR representation of A and A' is identical for the columns and values array, but it differs in the row-pointers array. Hence, to my understanding, the CSR format should be able to represent all zero rows. It does so by setting the row-pointer of the next row to the same location as the row-pointer of the all-zero row. Actually, this should allow you to also represent an all zero matrix, if you like :D.
Hope that made sense :)
Nice video! 👍
Thanks for your feedback, Jenna :) Really appreciate it
better to shrink the screen so show complete view instead of scrolling up and down gaining some and losing some view
Hi,
thanks for the comment and the feedback :)
Are you referring to the handwritten (blackboard) part of the video or the coding part?
Can you please do CSR matrix-matrix product?
I will note it down, but can't promise a video in the near future.
I am curious, what is your application doing sparse matrix-matrix operations?
So far, I've only seen matrix-vector products being sufficient for most applications, mostly in the discretization of partial differential equations.
Thanks!
You're welcome 🤗
You've got a nice accent there mate. Are you Scottish?
Haha :D Haven't heard that before. No, I am German.
Понял, хотя я не английский не понимаю
That's great 😊 I'm happy the video is of help.
Hope the auto translate feature of the subtitles was helpful.
@@MachineLearningSimulation 👍