From: Robert Saalbach <saalbach.robert@outlook.de>
Date: Tue, 28 Oct 2025 22:52:11 +0000 (+0100)
Subject: more notes for ml
X-Git-Url: https://git.saalbach.dev/?a=commitdiff_plain;h=c05e99854780bfb596407380b955bc042b12c66a;p=research-obsidian.git

more notes for ml
---

diff --git a/.obsidian/workspace.json b/.obsidian/workspace.json
index 4db4229..b831289 100644
--- a/.obsidian/workspace.json
+++ b/.obsidian/workspace.json
@@ -4,16 +4,16 @@
     "type": "split",
     "children": [
       {
-        "id": "073fe34e944da9c4",
+        "id": "c6dea15c43e47f34",
         "type": "tabs",
         "children": [
           {
-            "id": "1799484043bce33c",
+            "id": "73f492393c1ef2cc",
             "type": "leaf",
             "state": {
               "type": "markdown",
               "state": {
-                "file": "Watchlist & Do List.md",
+                "file": "University/Machine Learning/Full Notes.md",
                 "mode": "source",
                 "source": false,
                 "backlinks": true,
@@ -28,7 +28,7 @@
                 }
               },
               "icon": "lucide-file",
-              "title": "Watchlist & Do List"
+              "title": "Full Notes"
             }
           }
         ]
@@ -88,7 +88,8 @@
       }
     ],
     "direction": "horizontal",
-    "width": 300
+    "width": 300,
+    "collapsed": true
   },
   "right": {
     "id": "52c8cd2985704b8e",
@@ -174,19 +175,20 @@
       "pdf-plus:PDF++: Toggle auto-paste": false
     }
   },
-  "active": "1799484043bce33c",
+  "active": "73f492393c1ef2cc",
   "lastOpenFiles": [
-    "Untitled.md",
+    "Pasted image 20251027175634.png",
+    "Pasted image 20251027173751.png",
+    "Pasted image 20251027160630.png",
+    "Pasted image 20251027160619.png",
+    "Pasted image 20251027160603.png",
+    "Pasted image 20251027160553.png",
+    "Pasted image 20251027160532.png",
+    "Pasted image 20251027160014.png",
+    "Pasted image 20251027155945.png",
+    "Pasted image 20251027155909.png",
+    "Pasted image 20251027154610.png",
     "University/Machine Learning/Full Notes.md",
-    "Pasted image 20251025164452.png",
-    "Pasted image 20251025163915.png",
-    "Pasted image 20251025162205.png",
-    "Pasted image 20251025161239.png",
-    "Pasted image 20251025131332.png",
-    "Pasted image 20251025122355.png",
-    "Pasted image 20251025122036.png",
-    "Pasted image 20251025121602.png",
-    "Pasted image 20251025120122.png",
     "Introduction, Software Project.md",
     "University/Machine Learning",
     "European Union/Child Sexual Abuse Act/cellar_13e33abf-d209-11ec-a95f-01aa75ed71a1.0001.02_DOC_1.pdf",
@@ -204,6 +206,7 @@
     "Blog/sis50.nl Experiences Writing that Software.md",
     "Blog/Engine-Light and Experiences writing that Software.md",
     "Blog/Saalbach.dev and experiences writing that software.md",
+    "Untitled.md",
     "Blog",
     "University/Software Project/Grading our Own Report Assignment.md",
     "Nebulous Command/Notes on Nebulous Command.md",
@@ -216,7 +219,6 @@
     "Thoughts on Politics and Researching, and finding out things that you think are right.md",
     "Quotes.md",
     "Poet List.md",
-    "Pasted image 20250207160807.png",
     "University/Software Project/General Assembly 10-1-2025.md",
     "University/Software Project",
     "University/Recontextualising Creativity/RC Research Project/References on RC Research Project.md",
diff --git a/Pasted image 20251027151936.png b/Pasted image 20251027151936.png
new file mode 100644
index 0000000..26c0e1b
Binary files /dev/null and b/Pasted image 20251027151936.png differ
diff --git a/Pasted image 20251027152814.png b/Pasted image 20251027152814.png
new file mode 100644
index 0000000..40cdac5
Binary files /dev/null and b/Pasted image 20251027152814.png differ
diff --git a/Pasted image 20251027154610.png b/Pasted image 20251027154610.png
new file mode 100644
index 0000000..c2d2ada
Binary files /dev/null and b/Pasted image 20251027154610.png differ
diff --git a/Pasted image 20251027155909.png b/Pasted image 20251027155909.png
new file mode 100644
index 0000000..443e8ba
Binary files /dev/null and b/Pasted image 20251027155909.png differ
diff --git a/Pasted image 20251027155945.png b/Pasted image 20251027155945.png
new file mode 100644
index 0000000..80ee46b
Binary files /dev/null and b/Pasted image 20251027155945.png differ
diff --git a/Pasted image 20251027160014.png b/Pasted image 20251027160014.png
new file mode 100644
index 0000000..01f3d91
Binary files /dev/null and b/Pasted image 20251027160014.png differ
diff --git a/Pasted image 20251027160532.png b/Pasted image 20251027160532.png
new file mode 100644
index 0000000..c1f7d67
Binary files /dev/null and b/Pasted image 20251027160532.png differ
diff --git a/Pasted image 20251027160553.png b/Pasted image 20251027160553.png
new file mode 100644
index 0000000..33787a1
Binary files /dev/null and b/Pasted image 20251027160553.png differ
diff --git a/Pasted image 20251027160603.png b/Pasted image 20251027160603.png
new file mode 100644
index 0000000..697b042
Binary files /dev/null and b/Pasted image 20251027160603.png differ
diff --git a/Pasted image 20251027160619.png b/Pasted image 20251027160619.png
new file mode 100644
index 0000000..aeb8e24
Binary files /dev/null and b/Pasted image 20251027160619.png differ
diff --git a/Pasted image 20251027160630.png b/Pasted image 20251027160630.png
new file mode 100644
index 0000000..46e2c71
Binary files /dev/null and b/Pasted image 20251027160630.png differ
diff --git a/Pasted image 20251027173751.png b/Pasted image 20251027173751.png
new file mode 100644
index 0000000..25193b3
Binary files /dev/null and b/Pasted image 20251027173751.png differ
diff --git a/Pasted image 20251027175634.png b/Pasted image 20251027175634.png
new file mode 100644
index 0000000..a25926e
Binary files /dev/null and b/Pasted image 20251027175634.png differ
diff --git a/University/Machine Learning/Full Notes.md b/University/Machine Learning/Full Notes.md
index 2651bf0..9967887 100644
--- a/University/Machine Learning/Full Notes.md	
+++ b/University/Machine Learning/Full Notes.md	
@@ -240,3 +240,145 @@ i.e.
 $\lambda_{21}\space p(y_2|x)$ and $\lambda_{12}\space p(y_1|x)$
 (missclassification error loss times the prob its in that class (posterior prob.))
 
+
+# Parametric Densities
+
+Ideally, a model outputs the posterior $p(y|x)$ for each object in the feature space.
+In practice we approximate it, $\hat{p}(y|x)$,
+or we fit a function directly.
+
+Difference between $p(x)$ and $P(x)$: the first is a probability density (continuous $x$), the second is a probability mass (discrete $x$).
+
+In Bayes' rule, how do you get $p(x)$?
+You can compute it by summing over the classes (assuming two classes here):
+$p(x)=p(x|y_1)p(y_1)+p(x|y_2)p(y_2)$
+![[Pasted image 20251027151936.png]]
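+
+A quick worked example with made-up numbers (my own, not from the lecture), assuming $p(y_1)=0.3$, $p(y_2)=0.7$, $p(x|y_1)=0.5$ and $p(x|y_2)=0.1$ at some point $x$:
+$$
+\begin{align}
+p(x) &= p(x|y_1)p(y_1)+p(x|y_2)p(y_2) = 0.5\cdot 0.3+0.1\cdot 0.7 = 0.22 \\
+p(y_1|x) &= \frac{p(x|y_1)p(y_1)}{p(x)} = \frac{0.15}{0.22}\approx 0.68
+\end{align}
+$$
+So even though $y_1$ has the smaller prior, it gets the larger posterior at this $x$.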
+
+Up to now we assumed we know $p(y|x)$, or $p(x|y)$ and $p(y)$,
+but realistically we only get a sample, so we have to approximate them.
+
+For this, models are split along two axes:
+- Discriminative and Generative
+- Parametric and Nonparametric
+
+## Generative Models
+$p(y|x)\propto p(y)p(x|y)$
+
+When we know the priors and the class-conditional densities, we know everything about the data that matters for classification.
+The densities have to be estimated; given examples from the different classes, 'standard' density estimation per class is sufficient.
+It is also possible to 'generate' (sample) new objects from the classes.
+
+## Discriminative Models
+$\hat{p}(y|x)$
+When we don't know the class-conditional densities and priors, can we directly estimate the posterior?
+- Hard problem: given measurements, e.g. height, how to estimate $p(\text{woman}|\text{height})$?
+- Needs strong assumptions or a sloppy approximation
+
+## Parametric Modeling & Estimation
+Density Estimation and related topics:
+- Simple Nonparametric approach
+- curse of dimensionality
+- parametric models
+- sphering
+- properties of the Gaussian
+- mixture modeling
+
+## Histogram based Density Estimation
+![[Pasted image 20251027152814.png]]
+
+The problem here is accuracy: you may need a lot of objects to get a good estimate, like in this case.
+
+For 1-dimensional data, apparently roughly 1000 objects are needed.
+For each bin we estimate one value: 50 bins, 50 parameters.
+
+For $M$-dimensional data, roughly $1000^M$ objects are needed. This becomes basically unworkable if $M>2$.
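+
+To get a feel for the growth, a rough count using the numbers above and assuming (my assumption) the same 50 bins along every axis:
+$$
+\begin{align}
+M=1:&\quad 50^1=50 \text{ bins},\quad 1000^1=10^3 \text{ objects} \\
+M=2:&\quad 50^2=2500 \text{ bins},\quad 1000^2=10^6 \text{ objects} \\
+M=3:&\quad 50^3=125000 \text{ bins},\quad 1000^3=10^9 \text{ objects}
+\end{align}
+$$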
+
+This is part of the ***curse of dimensionality***.
+Intuitively, using more features should give us more information and make prediction easier.
+But we have to estimate the densities, and the number of parameters increases with the number of features;
+to estimate these well you need more objects.
+Consequence: for a fixed number of objects, there is an optimal number of features to use.
+
+![[Pasted image 20251027154610.png]]
+
+> Parametric: needs only a few parameters and assumes a simple global model, e.g. Gaussian
+> Non-parametric: depends on the training data, uses a simple local model such as uniform or Gaussian
+
+Normal distribution = Gaussian distribution
+Standard normal distribution: $\mu=0, \sigma^2=1$
+About 95% of the data lies in $[\mu-2\sigma, \mu+2\sigma]$ (1-dimensional)
+
+1D formula:
+$$p(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
+$p(x)$ is a density, not a probability, don't forget that
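+
+One way to see this is the peak height at $x=\mu$, where the exponent is 0. The value $\sigma=0.1$ below is just a made-up illustration:
+$$
+\begin{align}
+\sigma=1:&\quad p(\mu)=\frac{1}{\sqrt{2\pi}}\approx 0.399 \\
+\sigma=0.1:&\quad p(\mu)=\frac{1}{\sqrt{2\pi\cdot 0.01}}\approx 3.99
+\end{align}
+$$
+A density can be larger than 1; only its integral over all $x$ has to equal 1.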
+
+Why Gauss?
+It is special because of the central limit theorem: the sum of a large number of i.i.d. random variables is approximately Gaussian,
+so it occurs (approximately) in real life,
+e.g. the sum of the eyes on 10,000 dice throws.
+It also has few parameters,
+and the parameters are easy to estimate using max. likelihood.
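+
+For the dice example, the Gaussian that approximates the sum can be worked out by hand (assuming fair six-sided dice, which the notes don't state explicitly). A single throw has mean $3.5$ and variance $\frac{35}{12}\approx 2.92$, so for the sum $S$ of $n=10000$ independent throws:
+$$
+\begin{align}
+\mu_S &= 10000\cdot 3.5 = 35000 \\
+\sigma_S^2 &= 10000\cdot\frac{35}{12}\approx 29167 \\
+\sigma_S &\approx 170.8
+\end{align}
+$$
+So about 95% of such sums should land in roughly $35000\pm 342$.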
+
+### Multivariate Gaussians
+$M$-dimensional density:
+$$p(x)=\frac{1}{\sqrt{(2\pi)^M\det(\Sigma)}}\exp\left(-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)\right)$$
+Which is also written as (crazy)
+$$N(x|\mu,\Sigma)$$
+![[Pasted image 20251027155909.png]]
+
+![[Pasted image 20251027155945.png]]
+
+![[Pasted image 20251027160014.png]]
+(from top-down view)
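+
+As a sanity check on the formula: for $M=1$ the covariance matrix is just the $1\times 1$ matrix $\Sigma=\sigma^2$, so $\det(\Sigma)=\sigma^2$ and $\Sigma^{-1}=\frac{1}{\sigma^2}$, and the multivariate density falls back to the 1D formula:
+$$
+N(x|\mu,\sigma^2)=\frac{1}{\sqrt{(2\pi)^1\sigma^2}}\exp\left(-\frac{1}{2}(x-\mu)\frac{1}{\sigma^2}(x-\mu)\right)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
+$$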
+
+## Max. likelihood estimates
+
+What are the max. likelihood estimators for the mean and the covariance matrix? (the parameters we want to estimate)
+$$
+\begin{align}
+\hat{\mu}=\frac{1}{n}\sum^n_{i=1}x_i \\
+\hat{\Sigma}=\frac{1}{n}\sum^n_{i=1}(x_i-\hat{\mu})(x_i-\hat{\mu})^T
+\end{align}
+$$
+Since the second formula plugs in the estimated mean $\hat{\mu}$ from the first (rather than the true $\mu$), the second estimator is biased.
+
+To make it unbiased, we make one simple change: divide by $n-1$ instead of $n$.
+$$
+\hat{\Sigma}=\frac{1}{n-1}\sum^n_{i=1}(x_i-\hat{\mu})(x_i-\hat{\mu})^T
+$$
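+A small worked example with three made-up 2D objects (my own numbers, not the ones from the slides), $x_1=(0,0)^T$, $x_2=(2,0)^T$, $x_3=(1,3)^T$:
+$$
+\begin{align}
+\hat{\mu} &= \frac{1}{3}\left[\begin{pmatrix}0\\0\end{pmatrix}+\begin{pmatrix}2\\0\end{pmatrix}+\begin{pmatrix}1\\3\end{pmatrix}\right]=\begin{pmatrix}1\\1\end{pmatrix} \\
+\sum^3_{i=1}(x_i-\hat{\mu})(x_i-\hat{\mu})^T &= \begin{pmatrix}1&1\\1&1\end{pmatrix}+\begin{pmatrix}1&-1\\-1&1\end{pmatrix}+\begin{pmatrix}0&0\\0&4\end{pmatrix}=\begin{pmatrix}2&0\\0&6\end{pmatrix}
+\end{align}
+$$
+Dividing by $n=3$ gives the max. likelihood estimate, dividing by $n-1=2$ the unbiased one:
+$$
+\hat{\Sigma}_{ML}=\begin{pmatrix}\frac{2}{3}&0\\0&2\end{pmatrix},\qquad \hat{\Sigma}_{unbiased}=\begin{pmatrix}1&0\\0&3\end{pmatrix}
+$$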
+#### example as on slides:
+![[Pasted image 20251027160553.png]]
+![[Pasted image 20251027160603.png]]
+![[Pasted image 20251027160619.png]]![[Pasted image 20251027160630.png]]
+(x is typically in bold on these slides, because it is a vector if there are multiple features, and I guess technically even if there is only one feature)
+
+REMIND YOURSELF: the "T" (transpose) operation, and matrix stuff in general such as finding the inverse
+![[Pasted image 20251027173751.png]]
+To solve that you would need to find some $c$ for which $0\times c = 1$, which doesn't work for obvious reasons.
+
+The number of objects is insufficient to find the inverse: two objects in a 2-dimensional feature space give a _degenerate_ Gaussian distribution.
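+
+A minimal illustration with two made-up objects (not necessarily the ones on the slide), $x_1=(0,0)^T$ and $x_2=(2,2)^T$, using the unbiased estimator with $n-1=1$:
+$$
+\begin{align}
+\hat{\mu} &= \begin{pmatrix}1\\1\end{pmatrix} \\
+\hat{\Sigma} &= \begin{pmatrix}-1\\-1\end{pmatrix}\begin{pmatrix}-1&-1\end{pmatrix}+\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}1&1\end{pmatrix}=\begin{pmatrix}2&2\\2&2\end{pmatrix} \\
+\det(\hat{\Sigma}) &= 2\cdot 2-2\cdot 2=0
+\end{align}
+$$
+With a zero determinant $\hat{\Sigma}^{-1}$ does not exist: all the estimated spread lies along the line through the two objects, and there is none perpendicular to it.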
+
+### Parametric Estimation
+Now for $M$-dimensional data:
+$\mu$ is a vector with $M$ elements
+$\Sigma$ is a symmetric $M\times M$ matrix with $\frac{1}{2}M(M+1)$ free elements
+
+The number of parameters increases quadratically with $M$, so you might still need a lot of data.
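+
+Counting the mean and the covariance together gives $M+\frac{1}{2}M(M+1)$ parameters. A few example values (the choices of $M$ are arbitrary):
+$$
+\begin{align}
+M=2:&\quad 2+3=5 \\
+M=10:&\quad 10+55=65 \\
+M=100:&\quad 100+5050=5150
+\end{align}
+$$
+Still far fewer than the $50^M$ bins of a histogram, but not free either.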
+
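+"Any projection of a high-dimensional Gaussian is itself again Gaussian": if you project onto fewer dimensions (e.g. keep only a subset of the features, or apply any linear map $Ax$), the result is still Gaussian, with mean $A\mu$ and covariance $A\Sigma A^T$. So reducing to one feature (or a lower feature count, rather) keeps things Gaussian.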
+
+### Estimating class priors
+Given a training set, how can you estimate $\hat{p}(y)$?
+The classes are discrete, so $\hat{p}(y)$ is a true probability. The priors are often known or assumed; if not, we need to learn them.
+
+The max. likelihood estimator for the priors turns out to be counting:
+$\hat{p}(y_1)=\frac{N_1}{N}$ and $\hat{p}(y_2)=\frac{N_2}{N}$, with $N_k$ the number of training objects in class $k$ and $N=N_1+N_2$
+(remember: you don't need the unconditional probability $p(x)$ to find which posterior is larger) ![[Pasted image 20251027151936.png]]
+
+### How to define the classifier based on the estimates:
+#### two-class case
+Discriminant: $f(x)=\log p(y_1|x)-\log p(y_2|x)$
+and then this wonderful jumble, which is simply plugging everything in:
+![[Pasted image 20251027175634.png]]
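+
+My own write-out of the plugging in, assuming Gaussian class-conditional densities $N(x|\hat{\mu}_k,\hat{\Sigma}_k)$ with a separate mean and covariance per class (I think this matches the slide, but the notation there may differ). First apply Bayes' rule; $p(x)$ appears in both terms and cancels:
+$$
+f(x)=\log\hat{p}(x|y_1)+\log\hat{p}(y_1)-\log\hat{p}(x|y_2)-\log\hat{p}(y_2)
+$$
+Then fill in the Gaussians; the $(2\pi)^M$ factors also cancel:
+$$
+\begin{align}
+f(x)=&-\frac{1}{2}(x-\hat{\mu}_1)^T\hat{\Sigma}_1^{-1}(x-\hat{\mu}_1)+\frac{1}{2}(x-\hat{\mu}_2)^T\hat{\Sigma}_2^{-1}(x-\hat{\mu}_2) \\
+&-\frac{1}{2}\log\det(\hat{\Sigma}_1)+\frac{1}{2}\log\det(\hat{\Sigma}_2)+\log\hat{p}(y_1)-\log\hat{p}(y_2)
+\end{align}
+$$
+Assign $x$ to $y_1$ if $f(x)>0$ and to $y_2$ otherwise.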
+
+