Muhammad Haris Rao
From here on, $R$ is a commutative, unital domain with $\text{char} \, R \ne 2$.
Definition: Let $n, m \in \mathbb{Z}_{>0}$ be positive integers. A matrix $M$ of dimension $n \times m$ over the ring $R$ is a function \begin{align*} M : \{ 1, 2, \cdots, n \} \times \{ 1, 2, \cdots, m \} \longrightarrow R \end{align*}
Usually, matrices are written as arrays of elements from $R$. That is, if $M$ is a matrix of dimension $n \times m$, we will usually write \begin{align*} M &= \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1m} \\ m_{21} & m_{22} & \cdots & m_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ m_{n1} & m_{n2} & \cdots & m_{nm} \\ \end{pmatrix} \end{align*} where $m_{ij}$ is the value of $M$ as a function at $(i, j)$ for $i \in \{ 1, \cdots, n \}$ and $j \in \{ 1, \cdots, m \}$. Accordingly, $m_{ij}$ is called the $(i, j)$ entry of $M$, or the entry in the $i$th row, $j$th column. The $(i, j)$ entry of a matrix $M$ will also sometimes be notated $[ M ]_{ij}$. The set of $n \times m$ matrices over the ring $R$ is denoted $M_{n, m} \left( R \right)$. If $n = m$, we will shorten this to $M_n\left( R \right)$.
Now let $A, B$ be matrices over the ring $R$ of dimensions $n_1 \times m_1$ and $n_2 \times m_2$ respectively. If $A, B$ have the same dimensions, then their sum is defined pointwise as functions. If $m_1 = n_2$, then a matrix product is defined with the entries being \begin{align*} \left[ AB \right]_{ij} &= \sum_{\ell = 1}^{m_1} \left[ A \right]_{i, \ell} \left[ B \right]_{\ell j} \end{align*} for $i \in \{ 1, 2, \cdots, n_1 \}$ and $j \in \{ 1, 2, \cdots, m_2 \}$. Thus, the matrix product is a map \begin{align*} \cdot : M_{n_1, m_1} \left( R \right) \times M_{n_2, m_2} \left( R \right) \longrightarrow M_{n_1, m_2} \left( R \right) \end{align*} whenever $m_1 = n_2$. Clearly, if $n_1 = m_1 = n_2 = m_2 = n$, then matrix addition and matrix multiplication are both defined as operations on $M_n(R)$ giving back elements of $M_n(R)$. In fact, we have:
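As an illustration (not part of the formal development), these definitions transcribe directly into code. Here is a minimal Python sketch, representing a matrix as a list of rows with entries from $\mathbb{Z}$; the helper names `mat_add` and `mat_mul` are ours.

```python
def mat_add(A, B):
    # pointwise sum; A and B must have the same dimensions
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    # [AB]_{ij} = sum over l of A[i][l] * B[l][j]; requires len(A[0]) == len(B)
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(mat_add(A, B))  # [[1, 3], [4, 4]]
print(mat_mul(A, B))  # [[2, 1], [4, 3]]
```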
Proposition: For any $n \ge 1$, $M_n(R)$ is a (generally non-commutative) ring under the addition and multiplication operations defined above.
Proof. The fact that $M_n(R)$ is an abelian group under addition is obvious. In fact, it is the $n^2$-fold product of the underlying additive group of $R$. We will prove now that multiplication is associative, and that there exists a multiplicative identity. After this, we will prove that addition and multiplication are compatible through the left and right distributive laws.
Let $A, B, C \in M_n(R)$ with their $(i, j)$ entry labelled by $a_{ij}, b_{ij}, c_{ij} \in R$ respectively. Then we have \begin{align*} \left[ \left( A B \right) C \right]_{ij} &= \sum_{\ell = 1}^n \left[ AB \right]_{i\ell} c_{\ell j} = \sum_{\ell = 1}^n \left( \sum_{k = 1}^n a_{i k} b_{k \ell} \right) c_{\ell j} = \sum_{k = 1}^n a_{i k} \left( \sum_{\ell = 1}^n b_{k \ell} c_{\ell j} \right) = \sum_{k = 1}^n a_{i k} \left[ BC \right]_{kj} = \left[ A \left( BC \right) \right]_{ij} \end{align*} Since all their entries are equal, we have $\left( A B \right) C = A \left( BC \right)$ which is the associative property.
Now to prove the existence of a multiplicative identity. Let $I \in M_n(R)$ be the matrix with $[I]_{ii} = 1_R$ for all $i \in \{ 1, \cdots, n \}$ and $[I]_{ij} = 0_R$ for $i \ne j$. Then we have for the arbitrary $A$ as above \begin{align*} \left[ A I \right]_{ij} &= \sum_{\ell = 1}^n a_{i\ell} \left[ I \right]_{\ell j} = \sum_{\ell = 1, \ell \ne j}^n a_{i\ell} \left[ I \right]_{\ell j} + a_{ij} \left[ I \right]_{jj} = \sum_{\ell = 1, \ell \ne j}^n a_{i\ell} \cdot 0_R + a_{ij} \cdot 1_R = a_{ij} = \left[ A \right]_{ij} \\ \left[ I A \right]_{ij} &= \sum_{\ell = 1}^n \left[ I \right]_{i \ell} a_{\ell j} = \sum_{\ell = 1, \ell \ne i}^n \left[ I \right]_{i \ell} a_{\ell j} + \left[ I \right]_{ii} a_{ij} = \sum_{\ell = 1, \ell \ne i}^n 0_R \cdot a_{\ell j} + 1_R \cdot a_{ij} = a_{ij} = \left[ A \right]_{ij} \end{align*} so $I$ acts as a multiplicative identity.
It remains to prove that the left and right distribution properties hold. With $A, B, C$ as before, we have \begin{align*} \left[ (A + B) C \right]_{ij} &= \sum_{\ell = 1}^n \left[ A + B \right]_{i \ell} c_{\ell j} = \sum_{\ell = 1}^n \left( a_{i \ell} + b_{i \ell} \right) c_{\ell j} = \sum_{\ell = 1}^n a_{i \ell} c_{\ell j} + \sum_{\ell = 1}^n b_{i \ell} c_{\ell j} = \left[ A C \right]_{ij} + \left[ B C \right]_{ij} \\ \left[ A (B + C) \right]_{ij} &= \sum_{\ell = 1}^n a_{i \ell} \left[ B + C \right]_{\ell j} = \sum_{\ell = 1}^n a_{i \ell} \left( b_{\ell j} + c_{\ell j} \right) = \sum_{\ell = 1}^n a_{i \ell} b_{\ell j} + \sum_{\ell = 1}^n a_{i \ell} c_{\ell j} = \left[ A B \right]_{ij} + \left[ A C \right]_{ij} \end{align*} This completes the proof. Thus, $M_n(R)$ is a ring under the addition and multiplication operations.$\blacksquare$
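As a sanity check, the ring axioms just proved can be verified numerically on random integer matrices; a small sketch, with the `mat_add`/`mat_mul` helpers from above restated so the block runs on its own.

```python
from random import randint

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 3
I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
A, B, C = ([[randint(-9, 9) for _ in range(n)] for _ in range(n)] for _ in range(3))

assert mat_mul(mat_mul(A, B), C) == mat_mul(A, mat_mul(B, C))              # associativity
assert mat_mul(A, I) == A == mat_mul(I, A)                                 # identity
assert mat_mul(mat_add(A, B), C) == mat_add(mat_mul(A, C), mat_mul(B, C))  # left distributivity
assert mat_mul(A, mat_add(B, C)) == mat_add(mat_mul(A, B), mat_mul(A, C))  # right distributivity
```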
We can also give $M_{n, m}(R)$ the structure of an $R$-module. It is an abelian group under addition, and we can define for $A \in M_{n, m}(R)$ and $r \in R$ the matrix $r \cdot A$ to have $(i, j)$ entry $[ r \cdot A ]_{ij} = r \cdot [ A ]_{ij}$. It is easy to verify that $M_{n, m}(R)$ becomes an $R$-module under this action of $R$. It is also not hard to show that for matrices $A, B$ over $R$ such that $AB$ is defined, we have for all $r \in R$ that $A ( rB) = (r A) B = r(AB)$.
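A one-line sketch of the scalar action, together with a numeric check of the compatibility $A(rB) = (rA)B = r(AB)$; the name `scale` is ours, and `mat_mul` is as in the earlier sketch.

```python
def scale(r, A):
    # (r . A)_{ij} = r * A_{ij}
    return [[r * a for a in row] for row in A]

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
r = 3
assert mat_mul(A, scale(r, B)) == mat_mul(scale(r, A), B) == scale(r, mat_mul(A, B))
```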
Given $A \in M_{n, m}(R)$, we define the transpose $A^\top \in M_{m, n}(R)$ with entries $[A^\top]_{ij} = [A]_{ji}$. The following property holds:
Proposition: For any $A \in M_{n, m}(R)$ and $B \in M_{m, k}(R)$, we have $(A B)^\top = B^\top A^\top$.
Proof. This is a simple computation: \begin{align*} \left[ (AB)^\top \right]_{ij} = \left[ AB \right]_{ji} = \sum_{\ell = 1}^m [A]_{j \ell} [B]_{\ell i} = \sum_{\ell = 1}^m \left[ B^\top \right]_{i \ell} \left[ A^\top \right]_{\ell j} = \left[ B^\top A^\top \right]_{ij} \end{align*} $\blacksquare$
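The identity is easy to confirm computationally; a short sketch with a hypothetical `transpose` helper (and `mat_mul` as before):

```python
def transpose(A):
    # [A^T]_{ij} = [A]_{ji}
    return [list(row) for row in zip(*A)]

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]    # 2 x 3
B = [[1, 0], [2, 1], [0, 3]]  # 3 x 2
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```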
The determinant map is defined as \begin{align*} \text{det} : M_n(R) \longrightarrow R, \quad A \longmapsto \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n \left[A\right]_{\ell,\sigma(\ell)} \end{align*} It is not hard to show that $\text{det}(A) = \text{det}(A^\top)$.
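The Leibniz formula above transcribes directly; a minimal sketch, where the hypothetical helper `sign` computes $\text{sgn}(\sigma)$ by counting inversions:

```python
from itertools import permutations
from math import prod

def sign(p):
    # sgn of a permutation p of {0, ..., n-1}, via the parity of its inversion count
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    # Leibniz formula: sum over sigma in S_n of sgn(sigma) * prod_l A[l][sigma(l)]
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

print(det([[1, 2], [3, 4]]))  # 1*4 - 2*3 = -2
```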
Proposition: The determinant is invariant under taking transpose.
Proof. For any $A \in M_n(R)$, we have \begin{align*} \text{det}(A) &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{\ell = 1}^n \left[ A \right]_{\ell, \sigma(\ell)} \\ &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{\ell = 1}^n \left[ A \right]_{\sigma^{-1} (\ell), \ell} \\ &= \sum_{\sigma \in S_n} \text{sgn}(\sigma^{-1}) \prod_{\ell = 1}^n \left[ A^\top \right]_{\ell, \sigma^{-1} (\ell)} \\ &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{\ell = 1}^n \left[ A^\top \right]_{\ell, \sigma (\ell)} \\ &= \text{det} \left( A^\top \right) \end{align*} $\blacksquare$
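The invariance can likewise be spot-checked on random integer matrices; a minimal sketch, with the `det` helper from above restated so the block runs on its own.

```python
from itertools import permutations
from math import prod
from random import randint

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

A = [[randint(-5, 5) for _ in range(4)] for _ in range(4)]
At = [list(row) for row in zip(*A)]  # transpose
assert det(A) == det(At)
```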
More remarkably, the determinant satisfies the following fundamental property:
Lemma: Suppose that the $i$th and $j$th rows of an $n \times n$ matrix $A$ over the ring $R$ are equal for some $i \ne j$ (recall that $\text{char} \, R \ne 2$). Then $\text{det}(A) = 0$.
Proof. Let $\tau \in S_n$ be the transposition exchanging $i, j$. In the $\ell = i$ and $\ell = j$ factors of the product in the determinant formula, we can use the fact that the $i$th and $j$th rows are equal and that $\tau(\ell) = \ell$ for $\ell \notin \{ i, j \}$ to get \begin{align*} \prod_{\ell = 1}^n a_{\ell\sigma(\ell)} &= \left( \prod_{\ell = 1, \ell \ne i, j}^n a_{\ell\sigma(\ell)} \right) a_{i\sigma(i)} a_{j\sigma(j)} = \left( \prod_{\ell = 1, \ell \ne i, j}^n a_{\ell, \left( \sigma \tau \right)(\ell)} \right) a_{j, \left( \sigma \tau \right)(j)} a_{i, \left( \sigma \tau \right)(i)} = \prod_{\ell = 1}^n a_{\ell, \left( \sigma \tau \right) \left( \ell \right)} \end{align*} where the middle step uses $a_{i\sigma(i)} = a_{j\sigma(i)} = a_{j, (\sigma \tau)(j)}$ and $a_{j\sigma(j)} = a_{i\sigma(j)} = a_{i, (\sigma \tau)(i)}$. Hence, because $\text{sgn}(\tau) = -1$, \begin{align*} \text{det}(A) &= \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n a_{\ell\sigma(\ell)} = - \sum_{\sigma \in S_n} \text{sgn}(\sigma \tau) \prod_{\ell = 1}^n a_{\ell\sigma(\ell)} = - \sum_{\sigma \in S_n} \text{sgn}(\sigma \tau) \prod_{\ell = 1}^n a_{\ell, \left( \sigma \tau \right) \left( \ell \right)} = - \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{\ell = 1}^n a_{\ell\sigma\left( \ell \right)} = - \text{det}(A) \end{align*} where the last step reindexes the sum by $\sigma \longmapsto \sigma \tau$, a bijection of $S_n$. Therefore $2 \, \text{det}(A) = 0$, which can only mean $\text{det}(A) = 0$ since we are working over the ring $R$ which is an integral domain of characteristic not equal to 2.$\blacksquare$
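For instance, with the `det` sketch from above (restated here), a matrix with two equal rows visibly has determinant zero:

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

A = [[1, 2, 3], [4, 5, 6], [1, 2, 3]]  # first and third rows equal
assert det(A) == 0
```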
Theorem: For all $A, B \in M_n(R)$, $\text{det} \left( AB \right) = \text{det} (A) \text{det} (B)$
Proof. The proof is really just a long algebraic manipulation. We begin with \begin{align*} \text{det} \left( AB \right) &= \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n \left[AB\right]_{\ell,\sigma(\ell)} = \sum_{\sigma \in S_n} \text{sgn}(\sigma) \prod_{\ell = 1}^n \sum_{k = 1}^n a_{\ell k} b_{k \sigma(\ell)} \end{align*} The product over the sum is \begin{align*} \left( a_{11} b_{1\sigma(1)} + a_{12} b_{2\sigma(1)} + \cdots + a_{1n} b_{n\sigma(1)} \right) \left( a_{21} b_{1\sigma(2)} + a_{22} b_{2\sigma(2)} + \cdots + a_{2n} b_{n\sigma(2)} \right) \cdots \left( a_{n1} b_{1\sigma(n)} + a_{n2} b_{2\sigma(n)} + \cdots + a_{nn} b_{n\sigma(n)} \right) \end{align*} Expanding everything gives $n^n$ terms, one for each way to choose a single term from each factor. Letting $k_\ell$ be the index chosen in the $\ell$th factor, we have \begin{align*} \prod_{\ell = 1}^n \sum_{k = 1}^n a_{\ell k} b_{k \sigma(\ell)} = \sum_{k_1, k_2, \cdots, k_n = 1}^n \prod_{\ell = 1}^n a_{\ell k_\ell} b_{k_\ell \sigma(\ell)} \end{align*} Thus, the determinant becomes \begin{align*} \text{det} \left( AB \right) &= \sum_{k_1, \cdots, k_n = 1}^n \left( \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n a_{\ell k_\ell} b_{k_\ell \sigma(\ell)} \right) = \sum_{k_1, \cdots, k_n = 1}^n \left( \prod_{\ell = 1}^n a_{\ell k_\ell} \right) \left( \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n b_{k_\ell \sigma(\ell)} \right) \end{align*} We will show that the summand vanishes whenever $k_i = k_j$ for some $i \ne j$. Let $B(k_1, k_2, \cdots, k_n)$ be the matrix whose $\ell$th row is the $k_\ell$th row of $B$, so that the inner sum is exactly $\text{det} \left( B \left( k_1, \cdots, k_n \right) \right)$ and \begin{align*} \text{det} \left( AB \right) &= \sum_{k_1, \cdots, k_n = 1}^n \left( \prod_{\ell = 1}^n a_{\ell k_\ell} \right) \text{det} \left( B \left(k_1, k_2, \cdots, k_n \right) \right) \end{align*} If $k_i = k_j$ for distinct $i, j$, then two distinct rows of $B \left(k_1, k_2, \cdots, k_n \right)$ are equal, so the summand vanishes by the previous lemma. Hence, we only need to analyse the summands where the $k_\ell$ are pairwise distinct. Such a choice of $k_1, \cdots, k_n$ corresponds exactly to a permutation $\tau \in S_n$ with $\tau(\ell) = k_\ell$, so we may write \begin{align*} \text{det} \left( AB \right) &= \sum_{\tau \in S_n} \left( \prod_{\ell = 1}^n a_{\ell, \tau(\ell)} \right) \left( \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n b_{\tau(\ell), \sigma(\ell)} \right) \end{align*} Reindexing the inner product by $\ell \mapsto \tau^{-1}(\ell)$ gives $\prod_{\ell = 1}^n b_{\tau(\ell), \sigma(\ell)} = \prod_{\ell = 1}^n b_{\ell, \left( \sigma \tau^{-1} \right)(\ell)}$, and $\text{sgn}(\sigma) = \text{sgn}(\tau) \, \text{sgn} \left( \sigma \tau^{-1} \right)$, so \begin{align*} \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \right) \prod_{\ell = 1}^n b_{\tau(\ell), \sigma(\ell)} &= \text{sgn}(\tau) \sum_{\sigma \in S_n} \text{sgn} \left( \sigma \tau^{-1} \right) \prod_{\ell = 1}^n b_{\ell, \left( \sigma \tau^{-1} \right) (\ell)} = \text{sgn}(\tau) \, \text{det}(B) \end{align*} where the last equality holds because $\sigma \longmapsto \sigma \tau^{-1}$ is a bijection of $S_n$. Therefore \begin{align*} \text{det} \left( AB \right) &= \left( \sum_{\tau \in S_n} \text{sgn}(\tau) \prod_{\ell = 1}^n a_{\ell, \tau(\ell)} \right) \text{det}(B) = \text{det} (A) \text{det} (B) \end{align*} which is the desired result.$\blacksquare$
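A quick numeric confirmation of multiplicativity, again with the sketch helpers restated:

```python
from itertools import permutations
from math import prod
from random import randint

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A, B = ([[randint(-5, 5) for _ in range(4)] for _ in range(4)] for _ in range(2))
assert det(mat_mul(A, B)) == det(A) * det(B)
```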
Definition: Let $A$ be an $n \times n$ matrix over $R$. The $(i, j)$ cofactor of $A$ is \begin{align*} \text{cof}(A)_{ij} &= \sum_{\sigma \in S_n^{i, j}} \text{sgn}(\sigma) \prod_{\ell = 1, \ell \ne i}^{n} a_{\ell, \sigma(\ell)} \end{align*} where $S_n^{i, j}$ is the set of permutations in $S_n$ which take $i$ to $j$. The cofactor matrix of $A$ is the $n \times n$ matrix whose $(i, j)$ entry is the $(i, j)$ cofactor, and is denoted $\text{cof}(A)$. The adjugate of $A$, denoted $\text{Adj}(A)$, is the transpose of the cofactor matrix.
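This definition also transcribes directly; a sketch (0-indexed, so $S_n^{i, j}$ becomes the permutations `p` with `p[i] == j`; the names `cof` and `adj` are ours):

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def cof(A, i, j):
    # sum over permutations sigma with sigma(i) = j of sgn(sigma) * prod_{l != i} A[l][sigma(l)]
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n) if l != i)
               for p in permutations(range(n)) if p[i] == j)

def adj(A):
    # adjugate: the transpose of the cofactor matrix
    n = len(A)
    return [[cof(A, j, i) for j in range(n)] for i in range(n)]

print(adj([[1, 2], [3, 4]]))  # [[4, -2], [-3, 1]]
```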
The cofactor matrix satisfies the following property:
Proposition: For any $A \in M_n(R)$, we have $\text{cof}(A)^\top = \text{cof}(A^\top)$
Proof. This is a simple computation: \begin{align*} \left[ \text{cof}(A)^\top \right]_{ij} &= \text{cof}(A)_{ji} \\ &= \sum_{\sigma \in S_n^{j, i}} \text{sgn}(\sigma) \prod_{\ell = 1, \ell \ne j}^n \left[ A \right]_{\ell, \sigma(\ell)} \\ &= \sum_{\sigma \in S_n^{j, i}} \text{sgn}(\sigma) \prod_{\ell = 1, \ell \ne j}^n \left[ A^\top \right]_{\sigma(\ell), \ell} \\ &= \sum_{\sigma \in S_n^{j, i}} \text{sgn}(\sigma) \prod_{\ell = 1, \ell \ne i}^n \left[ A^\top \right]_{\ell, \sigma^{-1}(\ell)} \\ &= \sum_{\sigma \in S_n^{i, j}} \text{sgn}(\sigma) \prod_{\ell = 1, \ell \ne i}^n \left[ A^\top \right]_{\ell, \sigma(\ell)} \\ &= \text{cof}(A^\top)_{ij} \end{align*} $\blacksquare$
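A one-assert check of this proposition with the sketch helpers restated:

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def cof(A, i, j):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n) if l != i)
               for p in permutations(range(n)) if p[i] == j)

def cof_matrix(A):
    n = len(A)
    return [[cof(A, i, j) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 2, 0], [3, 1, 4], [5, 0, 2]]
assert transpose(cof_matrix(A)) == cof_matrix(transpose(A))
```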
The importance of the adjugate matrix is the following result:
Theorem: Let $I$ be the multiplicative identity in $M_n(R)$, and $A \in M_n(R)$. Then, \begin{align*} A \cdot \text{Adj}(A) = \text{Adj}(A) \cdot A &= \text{det}(A) I \end{align*}
Proof. The $(i, j)$ entry of the product $A \cdot \text{Adj}(A)$ is \begin{align*} \sum_{\ell = 1}^n a_{i\ell} \left[ \text{Adj}(A) \right]_{\ell j} &= \sum_{\ell = 1}^n a_{i \ell} \text{cof}(A)_{j \ell} \\ &= \sum_{\ell = 1}^n \sum_{\sigma \in S_n^{j, \ell}} \text{sgn}(\sigma) a_{i \ell} \prod_{k = 1, k \ne j}^n a_{k, \sigma(k)} \\ &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) a_{i, \sigma(j)} \prod_{k = 1, k \ne j}^n a_{k, \sigma(k)} \end{align*} where the last step uses that $S_n$ is the disjoint union over $\ell$ of the sets $S_n^{j, \ell}$, and that $\ell = \sigma(j)$ for $\sigma \in S_n^{j, \ell}$. If $i = j$, then the above expression is the definition of $\text{det}(A)$. Now suppose $i \ne j$. Pulling the $k = i$ factor out of the product, \begin{align*} \sum_{\sigma \in S_n} \text{sgn}(\sigma) a_{i, \sigma(j)} \prod_{k = 1, k \ne j}^n a_{k, \sigma(k)} &= \sum_{\sigma \in S_n} \text{sgn}(\sigma) a_{i, \sigma(j)} a_{i, \sigma(i)} \prod_{k = 1, k \ne i, j}^n a_{k, \sigma(k)} \end{align*} This is the formula for the determinant of the matrix obtained from $A$ by replacing its $j$th row with its $i$th row; that matrix has two equal rows, so its determinant is zero by the lemma. Thus, the diagonal entries of $A \cdot \text{Adj}(A)$ are $\text{det}(A)$ while the rest are 0, and so $A \cdot \text{Adj}(A) = \text{det}(A) I$. Now to show that also $\text{Adj}(A) \cdot A = \text{det}(A) I$. We have just shown that $A^\top \text{Adj}(A^\top) = \text{det}(A^\top)I = \text{det}(A) I$, and by the previous proposition $\text{Adj}(A)^\top = \text{cof}(A) = \text{Adj}(A^\top)$. Hence, \begin{align*} \text{Adj}(A) \cdot A &= \left( A^\top \text{Adj}(A)^\top \right)^\top = \left( A^\top \text{Adj}(A^\top) \right)^\top = \left( \text{det}(A) I \right)^\top = \text{det}(A) I \end{align*} which is what we wanted to show.$\blacksquare$
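As a numeric check of the theorem, a minimal sketch with the earlier helpers restated so the block is self-contained:

```python
from itertools import permutations
from math import prod
from random import randint

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

def cof(A, i, j):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n) if l != i)
               for p in permutations(range(n)) if p[i] == j)

def adj(A):
    n = len(A)
    return [[cof(A, j, i) for j in range(n)] for i in range(n)]

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

n = 3
A = [[randint(-5, 5) for _ in range(n)] for _ in range(n)]
dI = [[det(A) if i == j else 0 for j in range(n)] for i in range(n)]  # det(A) * I
assert mat_mul(A, adj(A)) == dI == mat_mul(adj(A), A)
```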
This yields the following result:
Corollary: Let $A \in M_n(R)$. Then $A$ has a multiplicative inverse if and only if $\text{det}(A)$ is a unit. In particular, if $R$ is a field, then $A$ is invertible if and only if $\text{det}(A) \ne 0$.
Proof. If $\text{det}(A) \in R$ is a unit, then we can take $A^{-1} = \text{det}(A)^{-1} \text{Adj}(A)$. Then $A^{-1} A = A A^{-1} = I$ by the previous result. Conversely, if such an $A^{-1}$ exists, then \begin{align*} 1_R &= \text{det}(I) = \text{det}(A A^{-1}) = \text{det}(A) \text{det}(A^{-1}) \end{align*} and likewise $1_R = \text{det}(A^{-1}) \text{det}(A)$ which proves that $\text{det}(A)$ is a unit in the ring $R$.$\blacksquare$
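In particular, over $R = \mathbb{Z}$ the units are $\pm 1$, so an integer matrix has an integer inverse exactly when $\det(A) = \pm 1$. A small worked sketch along the lines of the corollary, with the earlier helpers restated:

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n)) for p in permutations(range(n)))

def cof(A, i, j):
    n = len(A)
    return sum(sign(p) * prod(A[l][p[l]] for l in range(n) if l != i)
               for p in permutations(range(n)) if p[i] == j)

def mat_mul(A, B):
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[2, 1], [5, 3]]
d = det(A)  # = 1, a unit in Z
# A^{-1} = det(A)^{-1} Adj(A); when det = +-1 we have det(A)^{-1} = det(A)
inv = [[cof(A, j, i) * d for j in range(2)] for i in range(2)]
assert mat_mul(A, inv) == [[1, 0], [0, 1]] == mat_mul(inv, A)
```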
Theorem: For any $n, m \in \mathbb{Z}_{> 0}$, we have an $R$-module isomorphism \begin{align*} \text{Hom}_R\left( R^m, R^n \right) \cong M_{n, m} \left( R \right) \end{align*}
Proof. Let $\mathcal{B}_1 = \{ r_1, \cdots, r_n \} \subseteq R^n$ and $\mathcal{B}_2 = \{ s_1, \cdots, s_m \} \subseteq R^m$ be bases of their respective $R$-modules. Define the map \begin{align*} \Phi : \text{Hom}_R\left( R^m, R^n \right) &\longrightarrow M_{n, m} \left( R \right), f \longmapsto \Phi(f) \end{align*} where $\Phi(f)$ is the matrix whose entries are defined by the relations \begin{align*} f \left( s_j \right) &= \sum_{i = 1}^n \left[ \Phi(f) \right]_{ij} r_i \end{align*} for $j \in \{ 1,2, \cdots, m \}$. These elements of $R$ are uniquely determined since $\mathcal{B}_1, \mathcal{B}_2$ are bases, and conversely any choice of these elements will yield a uniquely determined homomorphism. Hence, $\Phi$ is a bijection. Now to prove that the map $f \longmapsto \Phi(f)$ defines an $R$-module isomorphism. Let $g \in \text{Hom}_R\left( R^m, R^n \right)$ as well, and let $\lambda, \mu \in R$. Then, we have \begin{align*} \left( \lambda f + \mu g \right) \left( s_j \right) &= \lambda f(s_j) + \mu g(s_j) = \lambda \sum_{i = 1}^n \left[ \Phi(f) \right]_{ij} r_i + \mu \sum_{i = 1}^n \left[ \Phi(g) \right]_{ij} r_i = \sum_{i = 1}^n \left( \lambda \left[ \Phi(f) \right]_{ij} + \mu \left[ \Phi(g) \right]_{ij} \right) r_i \end{align*} Thus, we will have \begin{align*} \left[ \Phi \left( \lambda f + \mu g \right) \right]_{ij} &= \lambda \left[ \Phi(f) \right]_{ij} + \mu \left[ \Phi(g) \right]_{ij} = \left[ \lambda \Phi(f) + \mu \Phi(g) \right]_{ij} \end{align*} This proves that $\Phi(\lambda f + \mu g ) = \lambda \Phi(f) + \mu \Phi(g)$, so $\Phi$ is an $R$-module homomorphism. Since it is also a bijection, it gives the desired isomorphism.$\blacksquare$
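Concretely, when $\mathcal{B}_1, \mathcal{B}_2$ are the standard bases, the $j$th column of $\Phi(f)$ is just the coordinate vector $f(s_j)$. A minimal sketch of this correspondence for $R = \mathbb{Z}$; the helper `matrix_of` is hypothetical:

```python
def matrix_of(f, m):
    # Phi(f): the j-th column is f applied to the j-th standard basis vector of Z^m
    cols = [f([1 if k == j else 0 for k in range(m)]) for j in range(m)]
    return [list(row) for row in zip(*cols)]  # turn the list of columns into rows

# an R-module homomorphism Z^3 -> Z^2, given by its action on coordinates
f = lambda v: [2 * v[0] + v[1], v[1] - v[2]]
print(matrix_of(f, 3))  # [[2, 1, 0], [0, 1, -1]]
```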