Homework 3 grades

by Anastasia Remizova

Hello everyone,

I have finished grading Homework 3, and you should now be able to see your grades. If not, let me know!

I left comments like last time. Here is a more detailed breakdown for each subquestion, since I didn't want to fragment the scoring excessively in the comments to the solution:

Problem 1: 11 pts
1) 8 pts:
  • 1 pt – general correct treatment of probability distribution
  • 1 pt – first moment
  • 2 pts – second moment
  • 4 pts – third moment
2) 3 pts

Problem 2: 21 pts
1) 3.5 pts:
  • 0.5 pts x 3 – each of the 2D tensors
  • 1 pt x 2 – each of the 3D tensors
2) 14.5 pts:
  • 0.5 pts x 3 – correct rank of each of the 2D tensors
  • 1 pt x 2 – correct rank of each of the 3D tensors
  • 3 pts – justification of the rank of G
  • 8 pts – justification of the rank of W
3) 3 pts

Problem 3: 16 pts
1) 3 pts
2) 9 pts:
  • 4 pts – justification of non-existence of rank R+1 limit for the sequence of rank R matrices
  • 4 pts – justification of the existence of a rank R-1 limit for the sequence of rank R matrices
  • 1 pt – idea for the construction of the rank R-1 limit for the sequence of rank R tensors
3) 4 pts

Problem 4: 8 pts

Problem 5: 7 pts
  • 1 pt – transpose of Kronecker product
  • 3 pts – scalar product of Kronecker products
  • 3 pts – dot product of Khatri-Rao products

Total: 63 points.
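
In case it helps with checking your Problem 5 solutions: the three identities in the breakdown can be verified numerically. A small sketch in NumPy (the matrix sizes and the khatri_rao helper are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((5, 4))
C = rng.standard_normal((3, 4))
D = rng.standard_normal((5, 4))

# Transpose of a Kronecker product: (A (x) B)^T = A^T (x) B^T
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))

# Frobenius scalar product of Kronecker products: <A(x)B, C(x)D> = <A,C> <B,D>
lhs = np.sum(np.kron(A, B) * np.kron(C, D))
rhs = np.sum(A * C) * np.sum(B * D)
assert np.isclose(lhs, rhs)

def khatri_rao(X, Y):
    # Column-wise Kronecker product: column j is kron(X[:, j], Y[:, j])
    return np.einsum('ir,jr->ijr', X, Y).reshape(X.shape[0] * Y.shape[0], -1)

# Dot product of Khatri-Rao products:
# (A ⊙ B)^T (C ⊙ D) = (A^T C) * (B^T D), with * the Hadamard product
lhs2 = khatri_rao(A, B).T @ khatri_rao(C, D)
rhs2 = (A.T @ C) * (B.T @ D)
assert np.allclose(lhs2, rhs2)
```

Of course, a numerical check is not a proof, but it is a quick sanity check for the signs and orderings in your derivations.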

If you have any questions about this or any previous homework, please feel free to email me. For course questions, you can also email me or ask on the forum. It’s best to do so before June 16, as I may not be able to respond promptly after that.

Have a smooth end of the semester, and good luck on your exams!

Re: Homework 3 grades

by Anastasia Remizova

A little note about what I meant by "general correct treatment of probability distribution". It's not really about the course content, more about classical probability theory, but I think it's something important to understand.

In the exercise, we define the PDF of the Gaussian mixture model, p(x). While it most often arises in the context of modelling data coming from multiple underlying sources, this function is mathematically just a PDF on \mathbb{R}^D, and it does not automatically come with latent variables representing the choice of component. We can indeed consider a joint distribution p(x, z) = p(x \vert z) p(z), where z plays the role of the mixture component, such that p(x) is its marginal distribution. And it's totally ok to introduce this model to find the moments of the distribution conveniently. But it was not specified in the problem statement, so you have to introduce it explicitly yourself. Besides, not every random variable X with the PDF p(x) is generated this way, though all of them are equal in distribution.
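
To illustrate why the latent-variable decomposition is convenient for the moments: by the law of total expectation, E[X^r] = \sum_k p(z = k) E[X^r \vert z = k]. A quick 1D sketch (the mixture parameters here are hypothetical, just for illustration) comparing this against direct numerical integration of p(x):

```python
import numpy as np

# Hypothetical 1D two-component Gaussian mixture, used purely as a PDF p(x)
pis  = np.array([0.3, 0.7])   # p(z = k)
mus  = np.array([-1.0, 2.0])  # component means
sigs = np.array([0.5, 1.5])   # component standard deviations

def p(x):
    # Mixture PDF: sum_k pi_k * N(x; mu_k, sig_k^2)
    x = np.asarray(x)[..., None]
    return np.sum(pis * np.exp(-0.5 * ((x - mus) / sigs) ** 2)
                  / (sigs * np.sqrt(2 * np.pi)), axis=-1)

# Moments computed directly from the PDF by numerical integration...
x = np.linspace(-20.0, 20.0, 400001)
dx = x[1] - x[0]
m1 = np.sum(x * p(x)) * dx
m2 = np.sum(x**2 * p(x)) * dx
m3 = np.sum(x**3 * p(x)) * dx

# ...agree with the latent-variable formulas E[X^r] = sum_k pi_k E[X^r | z=k],
# using the Gaussian moments E[X|k] = mu, E[X^2|k] = mu^2 + sig^2,
# E[X^3|k] = mu^3 + 3 mu sig^2
m1_lat = np.sum(pis * mus)
m2_lat = np.sum(pis * (mus**2 + sigs**2))
m3_lat = np.sum(pis * (mus**3 + 3 * mus * sigs**2))

assert np.isclose(m1, m1_lat)
assert np.isclose(m2, m2_lat)
assert np.isclose(m3, m3_lat)
```

The point of the remark below is that this decomposition is a modelling step you introduce, not something the PDF p(x) hands you for free.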

A more rigorous example where Z cannot be properly defined: take the probability triple with sample space \Omega = \mathbb{R}^D, event space the \sigma-algebra on \mathbb{R}^D, and the probability measure induced by p(x). Then X(\omega) = \omega is a valid random variable with probability density p(x).

Now, say Z is a random variable representing the component from which X came. Naturally, different components can produce the same value of X. But the space we introduced before is not rich enough for this: we cannot have (Z(\omega_1) = 1, X(\omega_1) = x) and (Z(\omega_2) = 2, X(\omega_2) = x), because by construction X(\omega) = \omega forces \omega_1 = \omega_2 = x, so Z(\omega_1) = Z(\omega_2) as well.
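
(For completeness, one standard way to make Z well-defined, not something required in the homework, is to enlarge the sample space: take \Omega' = \mathbb{R}^D \times \{1, \dots, K\}, define X(\omega, k) = \omega and Z(\omega, k) = k, and equip \Omega' with the measure P(A \times \{k\}) = p(z = k) \int_A p(x \vert z = k) \, dx. Then X again has density p(x), and Z is a genuine random variable on this richer space.)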